Effort Estimation Template

After preliminary analysis of the requirements, we understand the scope of the project. Once the scope is finalized, we move to effort estimation. While finalizing the scope we identify the source and target systems, the number of interfaces, the design pattern, and the complexity of each interface. For integration projects, effort is calculated on the basis of these parameters.

We spend time on the following artifacts/tasks as per the SDLC:

  1. Analysis
  2. Functional Specification
  3. Detail Design Document
  4. Coding and unit testing
  5. Test case preparation
  6. QA deployment preparation
  7. User acceptance testing support
  8. Production deployment support
  9. Hypercare
  10. Closure activities/Transition to support team.

Here is a sample effort catalogue template you can use for effort/cost calculation in integration projects.

| Design Pattern | Type | Scenario | Analysis | FS | DD | Code | ITS | SIT | QA | UAT | Prd | Total Effort (PD) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Synchronous Process | Simple | One source, one destination; simple lookup for data enrichment (database or xref table); simple flow control (try/catch, decision, branching); simple transformation (string manipulation); simple error handling logic (reuse existing error handling process); simple security requirement (standard basic authentication) | 1 | 1 | 1 | 3 | 1 | 1 | 1 | 1 | 1 | 11 |
| Synchronous Process | Medium | One source, one destination; medium flow control (try/catch, decision, branching, a few subprocess calls); medium lookup complexity for data enrichment (database, xref table, or web services); simple scripts (Groovy, shell, bat, etc.); medium transformation logic (string conversion, date format conversion, simple math); medium process logic; medium error handling logic (reused or custom-built error handling processes); medium security (encryption/decryption, basic authentication) | 2 | 2 | 2 | 5 | 1 | 2 | 1 | 2 | 1 | 18 |
| Synchronous Process | Complex | One source, more than one destination; complex flow control (try/catch; multiple branch, route, and decision shapes); multiple lookups for data enrichment (database, xref table, web services); complex database queries; complex scripting requirement; multiple subprocesses; complex transformation logic (string conversion, mathematical functions, lookups, custom scripts); complex error handling logic (custom-built error handling process required); additional security requirements (encryption/decryption, MAuth, two-way SSL, message signing, URL signing); audit requirement | 3 | 3 | 3 | 8 | 2 | 3 | 1 | 3 | 1 | 27 |
| Asynchronous Publisher Process | Simple | Send messages to a single topic/queue; simple schema validation (only incoming message validated against predefined schema); simple flow control; no lookup for data enrichment; simple transformation; simple error handling logic; simple security requirement | 1 | 1 | 1 | 3 | 1 | 1 | 1 | 1 | 1 | 11 |
| Asynchronous Publisher Process | Medium | Send messages to a single queue; medium process logic; medium flow control; medium lookup complexity for data enrichment; medium transformation logic; medium error handling logic; medium security | 1 | 1 | 1 | 5 | 1 | 2 | 1 | 2 | 1 | 15 |
| Asynchronous Publisher Process | Complex | Send messages to multiple queues; complex lookup for data enrichment; complex flow control; complex transformation logic; complex error handling logic; additional security requirements | 2 | 2 | 2 | 7 | 2 | 3 | - | 3 | 2 | 23 |
| Asynchronous Subscriber Process | Simple | Subscribe to a single message queue; simple process logic; simple flow control; simple lookup for data enrichment; simple transformation; simple error handling logic; simple security | 1 | 1 | 1 | 3 | 1 | 1 | 1 | 1 | 1 | 11 |
| Asynchronous Subscriber Process | Medium | Subscribe to a single message queue; medium process logic; medium flow control; medium lookup complexity for data enrichment; medium transformation logic; medium error handling logic; medium security | 2 | 2 | 2 | 5 | 1 | 2 | 1 | 2 | 1 | 18 |
| Asynchronous Subscriber Process | Complex | Subscribe to a single message queue; commit messages to the database; sequencing requirement; complex flow control logic; complex lookup for data enrichment; complex transformation logic; complex error handling logic; additional security requirements | 3 | 3 | 3 | 8 | 2 | 3 | 1 | 3 | 2 | 28 |
| FTP Process | Simple | One source, one destination file transfer; simple backup and delete after file transfer; simple error notification call using a common error code | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 10 |
| FTP Process | Medium | One source, one destination; medium flow control; encryption/decryption of the file; custom-built error handling process | 2 | 2 | 2 | 4 | 1 | 2 | 1 | 2 | 1 | 17 |
| FTP Process | Complex | One source, multiple destinations; complex flow control; complex data transformation; intermediate SFTP servers; complex scripting requirement (shell/Groovy/bat); additional security requirements; custom-built error handling process; mail notification; auditing requirement | 3 | 3 | 3 | 8 | 2 | 3 | 1 | 3 | 2 | 28 |
| Bulk Data Process (file to file, file to DB, file to web service, DB to web service, batch retry) | Simple | One source, one destination; simple flow control; no lookup for data enrichment; simple transformation; simple error handling logic; simple security | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 10 |
| Bulk Data Process | Medium | One source, one destination; medium flow control logic; medium lookup complexity for data enrichment; single split-and-merge requirement; medium transformation logic; medium error handling logic; medium security requirement | 2 | 2 | 2 | 4 | 1 | 2 | 1 | 2 | 1 | 17 |
| Bulk Data Process | Complex | One source, multiple destinations; complex flow control logic; complex split-and-merge requirement; complex lookup calls for data enrichment; complex transformation logic; complex error handling logic; additional security requirements; audit requirement | 3 | 3 | 3 | 8 | 2 | 3 | 1 | 3 | 1 | 27 |

You categorize the interfaces while capturing requirements, as per the requirement capture template. Then you can use the effort catalogue to calculate the total cost of the project. The template above categorizes interfaces into five pattern types: synchronous, asynchronous publisher and subscriber (message queue based), FTP, and bulk processing. Each has three levels of complexity: simple, medium, and complex.

Suppose you have 5 simple synchronous interfaces and 2 complex FTP interfaces; the total effort would be 5 × 11 + 2 × 28 = 55 + 56 = 111 PD.

If the blended rate is $200 per PD, the total cost is 111 × 200 = $22,200.

You can add a project management overhead of 20%, so the total cost = 22,200 + 4,440 = $26,640.

Additionally, you can add a hypercare or production support overhead of 20%.

For change requests on existing interfaces, you can consider a re-usability factor while calculating effort. Suppose your total effort is 111 PD; if you apply a 60 percent re-usability factor, the total effort would be 111 × 60/100 = 66.6 PD.

Some tasks take the same amount of time irrespective of the number of interfaces to develop, such as production migration preparation. Such effort should be estimated separately, not simply multiplied by the number of interfaces.
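The worked example above can be sketched in a few lines of Python. This is a minimal sketch: the catalogue dictionary holds the per-interface totals (in person-days) from the effort catalogue table, and the rate and overhead values are the ones used in the example.

```python
# Per-interface totals (PD) taken from the effort catalogue table above.
# Only the two categories used in the worked example are listed here.
EFFORT_CATALOGUE_PD = {
    ("synchronous", "simple"): 11,
    ("ftp", "complex"): 28,
}

def project_effort_pd(interface_counts):
    """Sum catalogue effort over the categorized interfaces."""
    return sum(EFFORT_CATALOGUE_PD[key] * n for key, n in interface_counts.items())

# 5 simple synchronous interfaces and 2 complex FTP interfaces
interfaces = {("synchronous", "simple"): 5, ("ftp", "complex"): 2}
effort_pd = project_effort_pd(interfaces)      # 5*11 + 2*28 = 111 PD

BLENDED_RATE_USD = 200                         # USD per person-day
base_cost = effort_pd * BLENDED_RATE_USD       # 111 * 200 = 22,200
total_cost = base_cost * 1.20                  # +20% project management overhead

# Change request on existing interfaces: apply a re-usability factor.
reused_effort_pd = effort_pd * 0.60            # 60% factor -> 66.6 PD
```

Fixed tasks such as production migration preparation would be added as a separate constant, not multiplied by the interface count.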

 

Integration Authentication Protocol

Security is one of the most critical aspects when you are integrating applications. Service endpoints should be protected from unauthorized access. The source, destination, and middleware integration application or ESB support several authentication protocols, and we should select the most suitable one for the scenario. For example, cloud-to-cloud integration needs stronger security, so we may use two-way SSL, which authenticates clients by their certificates and is more secure than basic authentication. Similarly, we can use SAML for federated SSO, basic authentication to connect intranet applications, and NTLM for connecting to SQL Server.
Apart from authentication and authorization, access control to the server, auditing, and compliance requirements are critical for security.
I have come up with some guidelines on which authentication protocol to use, based on application capabilities. Please check below.

| Source | Destination | Scenario | Recommended Authentication Protocol |
|---|---|---|---|
| Cloud application | Middleware integration application | Partner platforms are already integrated with the same IDP for federated SSO | SAML |
| Cloud application | Middleware integration application | There is no common identity provider and both support two-way SSL | Two-way SSL |
| Cloud application | Middleware integration application | Middleware is exposing third-party resources | OAuth/OpenID Connect |
| On-premise application | Middleware integration application | Application platforms are already integrated with the same IDP for federated SSO | SAML |
| On-premise application | Middleware integration application | Applications are in the same Windows domain and Microsoft Windows Active Directory is used as identity provider | Kerberos |
| Middleware integration application | Cloud application | ESB is invoking an external application | Two-way SSL |
| Middleware integration application | Cloud application | ESB is accessing third-party resources | OAuth/OpenID Connect |
| Middleware integration application | On-premise application | ESB is invoking an on-premise application and an LDAP server is used for the technical account | LDAP-based basic authentication |
| Middleware integration application | Database | ESB is making database calls using JDBC | Basic authentication |
| Middleware integration application | SQL Server database | Integration application and database are in the same Windows domain | NTLM |
| Middleware integration application | File server | Connect with SFTP protocol | Basic authentication |
| Middleware integration application | External application | Mostly cloud applications where the REST request must be signed | Request signing |
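Two of the protocols from the table can be sketched with the Python standard library alone. This is a hedged illustration, not a production setup: the certificate and key file names are placeholders, and real deployments would use the middleware's own connector configuration.

```python
import base64
import ssl

# Basic authentication: credentials are base64-encoded into an HTTP header.
# Suitable for intranet/LDAP-backed scenarios per the table above.
def basic_auth_header(user, password):
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

# Two-way (mutual) SSL: the client also presents a certificate, so the
# server authenticates the caller by its certificate, not just a password.
def two_way_ssl_context(client_cert="client.crt", client_key="client.key"):
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # ctx.load_cert_chain(client_cert, client_key)  # requires real cert files
    return ctx
```

Note that basic authentication sends credentials in a reversible encoding, which is why it should only be used over TLS or inside a trusted network.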

Two Node Server Cluster Topology

Server Clusters:

The middleware application server should be set up as per a fault-tolerant architecture.

In the two-node separate-clusters topology, we have a two-node cluster in the primary site and a two-node cluster in the DR site.

The primary site nodes are active, carrying real traffic. The DR site nodes are passive and take no workload while the primary site is operational. In case of disaster, the DR site cluster becomes active and takes over the workload.

During production deployment, processes should be deployed in both the primary and DR site server clusters. Externalized variables/process properties should also be configured accordingly on both.

Fault-tolerant architecture:

This topology provides multilevel failure protection: server level, load balancer (LB) level, and storage level. If any server goes down, the load balancer forwards traffic to another server.

From the internet, web traffic arrives via the WAF. If the primary site LB is down, the WAF can forward traffic to the DR site LB. But unless it is absolutely necessary, do not channel traffic to the DR site, because you would need to sync data back from the DR site to the primary site during failback.

In case of storage or database failure at the primary site, this setup can leverage the DR site's data, as data is expected to be replicated to the DR site in real time using storage replication tools.

Unlike a Java web application, a web service call is stateless: the server does not need to maintain state for the call, and each call is independent. So session replication is not required, though the server may replicate other information, such as the cache, if distributed caching is used.

If the server stores operational data in a SAN or database, that data may be replicated to the DR site. But if the process does not need to store any information on disk or in a database, then nothing needs to be replicated.

Failover:

In case of disaster, the source application can point to the DR site URL directly. In that case, source code may need to be changed wherever URLs are hard-coded. Normally, URLs are externalized variables kept in process properties, which can be updated directly from the server console without redeployment. Another option is not to change the URL but to change the IP address behind it. In this case the requester application is not impacted by the disaster; we ask the DNS administrator to change the IP address of the URL. The only problem is that it may take some time for clients to resolve the new IP address, because a fresh DNS lookup is not initiated until the DNS record expires as per its TTL value. Typical TTL values for web service endpoints are set below one hour, so systems should pick up the correct IP address within roughly that window. We should maintain a checklist of failover activities: not only the URL but other variables may also need to change, and database/storage admin support may be required, as replication must now run in the opposite direction (from DR to primary site). Nowadays, modern middleware systems can provide an RPO of 0 and an RTO of less than one hour.
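The DNS-based failover option above relies on a low TTL on the service record. A zone-file sketch makes this concrete; the hostname and addresses here are hypothetical placeholders.

```
; BIND-style zone-file sketch (hypothetical names and RFC 5737 test addresses)
; Low TTL (300 s) so clients re-resolve quickly after failover.
api.example.com.   300   IN   A   203.0.113.10    ; primary site VIP
; On failover, the DNS admin repoints the record to the DR VIP:
; api.example.com. 300   IN   A   198.51.100.10   ; DR site VIP
```

With a 300-second TTL, requester applications would start resolving the DR address within about five minutes of the change, without any configuration update on their side.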

 Failback:

When the primary site comes back up, applications should use the primary site infrastructure again. All applications using the DR site URL/IP address should switch back to the primary site URL/IP address, and requester application configuration/properties should be changed again to point to primary site resources. The SAN and database of the DR site should be synced back to the primary site as required.

Hybrid Design Pattern

Synchronous Message Processing – Request-Response Pattern, Handling Failed Requests Asynchronously: a Hybrid Design Pattern.

This is mainly a synchronous design pattern. But sometimes, some messages cannot be posted to the destination in real time due to connection failure or other reasons, so we need to process those messages asynchronously. The pattern is partly synchronous and partly asynchronous, which is why you can call it a hybrid design pattern. Failed messages need to be stored temporarily in a message queue or database for retry. In such cases, the main ESB process sends an intermediate status code to the requester application, and a separate ESB process listens to the queue and tries to post the messages to the destination. The requester application needs to provide a callback web service.

The happy path is always synchronous. Only the few messages that could not be posted are processed asynchronously.

Messages should be published to the queue and posted in near real time. As fast message processing is the foremost concern in a synchronous design pattern, we should preferably use a message queue instead of a database. If messages are stored in a database, we should write a batch process that periodically tries to post them. A database can also be used to persist the messages when we need fine-grained control.

For the synchronous main process, every process component (duplicate checking, auditing, error handling) is applicable.

For the second process, which works asynchronously, we may not need a duplicate check, but we should have the other common sub-processes such as audit and error handling. We may store the enriched and transformed messages, or we may store the original messages sent from the source application; in the latter case, however, message enrichment and transformation must be done again, which may be avoidable.

We need two distinct processes to implement this design pattern.

          Main Process – Invoke the provider and send failed messages to the queue.

 The requester invokes the integration process web service and waits for a response.

  1. The main integration process validates the request; if the messages are OK, it enriches and transforms them and invokes the provider service.
  2. The process sends the response in real time to the requester.
  3. The process sends messages that could not be posted to the message queue.

         Process 2 – Process from the queue

  1. Process 2 can be a scheduled process or a queue listener process.
  2. Process 2 gets the requester's messages from the queue or database.
  3. Process 2 invokes the provider web service to pass the messages.
  4. Process 2 passes the response to the requester via the callback web service.
  5. If messages could not be delivered, they are placed in the error queue, or the process can keep retrying them. If messages are stored in the database, a message flag can be updated to record successful delivery or delivery failure.
  6. This process tries repeatedly to push failed messages to the target application.
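The two processes above can be sketched as follows. This is a minimal in-process illustration, with an in-memory `queue.Queue` standing in for a real message queue and `post_to_provider`/`callback` standing in for the real provider and callback web services.

```python
import queue

# In-memory stand-in for the message queue between the two processes.
retry_queue = queue.Queue()

def main_process(message, post_to_provider):
    """Synchronous happy path; failed messages are queued for retry."""
    try:
        return post_to_provider(message)              # real-time response
    except ConnectionError:
        retry_queue.put(message)                      # handle asynchronously
        # intermediate status code for the requester
        return {"status": "accepted", "note": "queued for retry"}

def process2(post_to_provider, callback):
    """Queue listener: retries failed messages, responds via callback."""
    while not retry_queue.empty():
        message = retry_queue.get()
        try:
            callback(post_to_provider(message))       # callback web service
        except ConnectionError:
            retry_queue.put(message)                  # or move to an error queue
            break                                     # wait for the next run
```

A real implementation would add scheduling or queue-listener wiring, a retry limit, and the common audit and error handling sub-processes described above.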

 

Use Case

 Use this pattern when the requester wants to pass messages to the provider preferably in real time, but requires the integration layer to push failed messages to the destination asynchronously. The requester should provide a callback web service so acknowledgements can be sent back to it.


Request Response Pattern

Synchronous Message Processing – Request Response Pattern

This is the simplest and most frequently used design pattern for application integration.

The source application invokes a web service and waits for the reply. The integration layer posts the messages to the destination and sends the reply back to the source application. Until the response is received, the source application thread is blocked.

Validation:

Incoming messages should be validated against a schema before being forwarded to the destination. The schema can be stored in the database or in the filesystem. If validation fails, an error message should be sent to the requester.
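As a minimal sketch of this step, the "schema" below is just a set of required fields standing in for a real XSD or JSON Schema check; the field names are illustrative, not from the source.

```python
# Illustrative required fields; a real process would validate against a
# full schema (XSD/JSON Schema) loaded from the database or filesystem.
REQUIRED_FIELDS = {"orderId", "customerId", "amount"}

def validate(message):
    """Return a validation result; on failure, the error goes back to the requester."""
    missing = REQUIRED_FIELDS - message.keys()
    if missing:
        return {"valid": False, "error": f"missing fields: {sorted(missing)}"}
    return {"valid": True}
```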

Enrichment:

Message enrichment may be required before posting the message to the destination application. External web services, databases, the filesystem, or other sources can be used to enrich the messages. For complex business logic, you might have to make multiple web service calls or perform service orchestration, but this should be avoided in a synchronous design pattern; no complex data enrichment or complex business rules should be implemented here.

The integration process should be very lightweight and able to send a response within a few seconds, as the requester is waiting for the reply and a connection timeout would occur if the response is not sent quickly.

Transformation:

Ideally, transformation should happen after data enrichment is completed. We should map fields from source to destination; middleware tools normally provide drag-and-drop facilities for field mapping.

Auditing:

Best practice is to keep audit logs of each service invocation.

For example, record when the request was sent by the source application and when the response was sent back. You can post audit messages to a queue or database. For faster response times, write the audit log asynchronously, such as by sending the audit messages to a queue; you can write audit messages directly to the database if the database response time is faster.

Audit information can contain the source application, destination application, timestamp, and payload. If the payload is big, you can extract the ID/key values or some significant fields from it and store those instead.
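A sketch of this asynchronous auditing, with an in-memory queue standing in for the real audit queue and an illustrative `key_field` parameter for the "store only key values" advice:

```python
import json
import queue
from datetime import datetime, timezone

audit_queue = queue.Queue()   # stand-in for a real message queue

def audit(source, destination, payload, key_field="id"):
    """Queue a compact audit record instead of writing it inline."""
    record = {
        "source": source,
        "destination": destination,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # store only the key value, not the whole payload, if the payload is big
        "key": payload.get(key_field),
    }
    audit_queue.put(json.dumps(record))
```

A separate consumer would drain this queue and persist the records, keeping the synchronous request path fast.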

Error Handling:

In case of error, you should notify the requester application. You can populate the HTTP error code, or provide error messages in the body as agreed upon. For business logic errors, you should normally pass a message in the body along with the HTTP error code.

The HTTP error code may be set to 401 (Unauthorized), 500 (Internal Server Error), or any other code depending on the cause, as per the standard. You can also use custom error codes, but these should be agreed upon with the source application's project team.

You should not retry connecting to the destination multiple times, because the source system thread is blocked waiting for the response.

Duplicate checking:

Duplicate checking is very important for non-idempotent provider web services. Duplicate messages mostly come from retry operations in the requester application: sometimes a TCP connection breaks due to network issues, and although the message was successfully delivered, the requester did not receive a response, so it retries, resulting in duplicates.

If duplicate messages are passed to the provider application, they may change the state of field values. For idempotent web services, duplicate messages are not a challenge, but for non-idempotent services they should be rejected.

We can do this simply by keeping message IDs or key values in a lookup table or in the server cache. The server cache should be the preferred option, as retrieving a value from the cache is faster.

For a single server instance, keeping the last processed message ID in the cache solves the issue. For a server cluster, distributed caching should be used; if the requester application is single-threaded, keeping the last message ID in the distributed cache and comparing it with the incoming message ID resolves the issue.

For multithreaded or clustered clients, you need to store the message IDs from the last few minutes or hours, which is enough for most integration requirements. For zero duplicate tolerance, it is better to use a lookup table storing all previous message IDs in the database, and validate incoming messages against it.

But the lookup table has a performance overhead. Remember to set the cache expiry to a reasonable time limit so that the cache does not grow too big.
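The cache-based approach with expiry can be sketched as follows. This is a single-process illustration: the plain dict stands in for what would be a distributed cache in a server cluster, and the TTL value is illustrative.

```python
import time

CACHE_TTL_SECONDS = 600          # keep ids for the last few minutes (illustrative)
seen_ids = {}                    # message id -> time first seen

def is_duplicate(message_id, now=None):
    """Reject duplicates of recently processed messages (non-idempotent provider)."""
    now = time.time() if now is None else now
    # evict expired entries so the cache does not grow unbounded
    for mid, seen_at in list(seen_ids.items()):
        if now - seen_at > CACHE_TTL_SECONDS:
            del seen_ids[mid]
    if message_id in seen_ids:
        return True               # duplicate within the TTL window: reject
    seen_ids[message_id] = now
    return False
```

A retry arriving within the TTL window is flagged as a duplicate; once the window passes, the ID is evicted, which is the trade-off against the zero-tolerance database lookup table.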

Usages:

This pattern can be used when the provider application can respond to the requester within a few seconds. When the requester needs to send messages to the provider in real time, this pattern should be applied. The requester application should take responsibility for error handling, such as communication faults, system faults, or business logic faults.

We need a single process to implement this design pattern in the ESB.

 The requester invokes the integration application web service and waits for a response.

  1. The integration application validates the request; if the messages are OK, it enriches and transforms them and invokes the provider service.
  2. The integration application sends the response in real time to the requester.
  3. Reusable sub-processes for audit and error handling are developed and called from the main process as needed.
  4. In case of error, the process calls the common error handling sub-process.

 

Integration Design Pattern

I would like to provide a practical guide on application integration, covering the patterns we commonly use when integrating applications.

The success of an integration project depends on selecting a suitable design pattern.

The design patterns described below may be used as a reference by developers and architects when building new integration processes. A suitable design pattern should be selected per customer requirement; overly complex patterns should be avoided to reduce development and maintenance cost.

Below is the list of standard patterns. This list reflects a commonly followed approach in EAI.

 

| Type | Pattern |
|---|---|
| Message Processing | Synchronous Message Processing – Request-Response Pattern |
| | Synchronous Message Processing – Request-Response Pattern; handle failed requests asynchronously |
| | Synchronous Message Processing – Request-Response Pattern; handle both failed requests and responses asynchronously |
| | Asynchronous Fire-and-Forget Pattern |
| | Asynchronous Message Processing – Delayed response |
| | Asynchronous Message Processing – Asynchronous response |
| | Asynchronous Message Processing – Queue as staging area for requester messages |
| | Asynchronous Message Processing – Queue as staging area for response messages |
| | Asynchronous Message Processing – Queue as staging area for request and response messages |
| | Asynchronous Message Processing – Publish-and-Subscribe Pattern |
| | Asynchronous Message Processing – Requester reads the status of messages from a database |
| FTP Processes | Standard FTP Process – One source to one destination |
| | Standard FTP Process – One source to multiple (N) destinations |
| | Standard FTP Process – Multiple (N) sources to multiple (N) destinations |
| Bulk Data Processing | Read database and call web services to process each record individually |
| | Read a file and call web services to process each record individually |
| | Read a file from a shared directory, process each record individually, merge, and write back to the shared directory/FTP server |
| | Read a file from an FTP server, process each record individually, and FTP the file |
| | Read multiple files from an FTP server, process each record individually, and FTP the files |
| | Single message containing multiple records |
| | Sending reports by mail – read the database or file and produce a report in a scheduled process |
| Database Polling | Poll databases for new records |
| Special Messaging Needs | Asynchronous Message Processing – Messages to be processed in order of receipt |
| | Asynchronous Message Processing – Correlated message set to be processed in order, irrespective of receiving order |
| Message Coordination | Message aggregation |
| | Message split |
| | Message orchestration |
| Message Security | Authentication |
| | Encryption |
| | Threat prevention in message content |
| | Transport layer data protection |
| Message Storage | Database vs message queue |
| Other Topics | Will come up later |

 

Follow the next posts for details of each design pattern.