
Single File to Multiple Targets – approach 3

In this approach, we copy the file to temporary storage and then archive and delete the file from the source folder. The single file is replicated into a separate folder for each target.

We will have a separate file transfer process for each target. Each process reads the file from its corresponding folder and sends it to its destination, in parallel with the others. After a successful transfer, each process deletes the file from its folder. If a particular process fails, only that process needs to be restarted.

Use Case: Use this approach if the file size is small. The temporary storage can be provided by the source system or can reside within the ESB server.

Benefits: In this case no file transfer tracking table is required. The process for each target can run in parallel. Each process is simpler, which gives higher maintainability. A new process for a new target can be added without impacting existing processes.

Disadvantage: The overhead of replicating the file for each target.
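A minimal Java sketch of the staging step is given below. It replicates the file into a folder per target and then archives and deletes it from the source folder; the folder layout and class name are assumptions for illustration, not a fixed implementation.

import java.io.IOException;
import java.nio.file.*;
import java.util.List;

// Sketch of the staging step in approach 3: copy the source file into a
// folder per target, archive it, then delete it from the source folder.
public class FileStager {

    public static void stage(Path sourceFile, List<Path> targetFolders, Path archiveFolder)
            throws IOException {
        // Replicate the single file into each target-specific folder.
        for (Path folder : targetFolders) {
            Files.createDirectories(folder);
            Files.copy(sourceFile, folder.resolve(sourceFile.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        // Archive the original, then remove it from the source folder.
        Files.createDirectories(archiveFolder);
        Files.copy(sourceFile, archiveFolder.resolve(sourceFile.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        Files.delete(sourceFile);
    }
}

Each per-target transfer process then simply reads its own folder, sends the file, and deletes it on success.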

Single File to Multiple Targets – approach 2

The first approach, as discussed, uses one process to transfer the file to multiple destinations. But as mentioned, that process is complex, and maintenance and scale-up are expensive. So it is better to have a separate process to transfer the file to each target. We can also run each process in parallel.

We will have a separate archival process which runs last, after the file has been transferred to every destination.

We can use a file transfer tracking table to record whether the file was transferred successfully or not. Each process inserts a record into the file transfer tracking table after successful completion.

The archival process needs to check whether the file has been transferred successfully to all the targets. We can do this by counting the number of successfully completed processes in the file transfer tracking table. This job should be scheduled after the main file transfer processes. Alternatively, each process can trigger the archival process on completion, in which case we don't need to schedule it separately.
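A minimal JDBC sketch of that archival check is given below. The table name, column names, status value, and target count are assumptions chosen to match the field list described in the multiple-target section.

import java.nio.file.*;
import java.sql.*;

// Sketch of the archival check in approach 2: count the successful transfers
// recorded in the tracking table and archive/delete the source file only when
// every target has been served.
public class ArchivalProcess {

    static final int NUMBER_OF_TARGETS = 3; // assumed number of destinations

    public static void archiveIfComplete(Connection conn, String fileName,
                                         Path sourceFile, Path archiveFolder)
            throws SQLException, java.io.IOException {
        String sql = "SELECT COUNT(*) FROM file_transfer_tracking "
                   + "WHERE filename = ? AND status = 'delivered'";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, fileName);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                if (rs.getInt(1) >= NUMBER_OF_TARGETS) {
                    Files.copy(sourceFile, archiveFolder.resolve(fileName),
                            StandardCopyOption.REPLACE_EXISTING);
                    Files.delete(sourceFile);
                }
            }
        }
    }
}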

Benefits: Each process can run in parallel. Each process is very simple, so maintainability is high. We can add a new process to transfer the file to a new target without changing existing processes.

Single File to Multiple Targets – approach 1

In this requirement, we need to send a single file to multiple targets.

In this design pattern, a single ESB process handles the file transfer to multiple destinations. The flowchart is given above. We use a file transfer tracking table to track the status of the file transfer for each target. The process should include checks so that the file is not sent to a target that has already received it; this is required when we rerun the process after an exception.

The file transfer tracking table can have the following fields. Some fields are optional. This generic table can be reused for all ESB file transfer processes.

Program name: Name of the program. This is applicable for a large initiative where there are multiple projects under the program.

Project name: Name of the project under the program.

Event name: The event name, message name, or entity name can be stored here.

Process name: Name of the process transferring the file.

Filename: The unique file name. The source generates a unique file name, normally by suffixing a timestamp or by some other method.

MessageId: The MessageId uniquely identifies the contents of a file. We can use this field to check whether the file's content is a duplicate; if we have already processed a file with that message id, we can ignore the file. This can be the primary key of the table.

File Source: The source system from which the file is sent.

File Target: The destination system to which the file is sent.

File Sent date: Timestamp of when the file was sent to the target.

Status: Delivery status of the file. You can store a value such as "delivered", as well as intermediate statuses such as "picked up" or "archived".

FileTransferId: This may be the primary key of the table. Alternatively, we can use the MessageId field as the primary key, in which case we don't need this attribute.

This table can be used in combination with the audit table.
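As an illustration, a minimal JDBC sketch of how the process can use this table is given below. The table and column names follow the field list above but are assumptions, and a surrogate FileTransferId is assumed to be generated by the database.

import java.sql.*;

// Sketch of how approach 1 can use the tracking table: skip a target that
// already received the file, and record a row after a successful send.
public class FileTransferTracker {

    // Returns true if the tracking table already shows the file delivered to this target.
    public static boolean alreadySent(Connection conn, String messageId, String target)
            throws SQLException {
        String sql = "SELECT COUNT(*) FROM file_transfer_tracking "
                   + "WHERE messageid = ? AND file_target = ? AND status = 'delivered'";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, messageId);
            ps.setString(2, target);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt(1) > 0;
            }
        }
    }

    // Insert a tracking record once the file is delivered to a target.
    public static void recordDelivery(Connection conn, String processName, String fileName,
                                      String messageId, String source, String target)
            throws SQLException {
        String sql = "INSERT INTO file_transfer_tracking "
                   + "(process_name, filename, messageid, file_source, file_target, file_sent_date, status) "
                   + "VALUES (?, ?, ?, ?, ?, CURRENT_TIMESTAMP, 'delivered')";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, processName);
            ps.setString(2, fileName);
            ps.setString(3, messageId);
            ps.setString(4, source);
            ps.setString(5, target);
            ps.executeUpdate();
        }
    }
}

The process calls alreadySent before attempting a target and recordDelivery after each successful transfer, so a rerun after an exception only sends to the targets that are still missing the file.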

Benefit: A single process can handle the file transfer to multiple targets.

Disadvantage: We can use this pattern only if the number of targets is small (perhaps two or three). More targets make the process more complex. If new targets are added, or the logic for a particular target changes, the existing process has to be modified, which requires regression testing. As the process is complex, maintenance will be expensive.

Standard File Transfer Process

We can use the file transfer capability of standard ESB tools to move files from source to target. We can also use standard MFT tools available in the market.

While designing a file transfer process, we need to consider the following:

  1. Successful transfer of the file to the destination.
  2. Archiving and deletion of the file from the source after successful transfer.
  3. Resubmission of the process after failure.
  4. Not sending a duplicate file to the target.

Single file to single target

Single file to single target is the easiest file transfer pattern.

We need a single process to implement this pattern. If the file is transferred successfully, we archive the file and delete it from the source. If the file transfer is not successful, we need to re-run the process.

If the process fails in the archive-and-delete stage after a successful file transfer, you should not rerun the process; in that case you need to archive and delete manually.

But if you need to handle this automatically on resubmission of the process, you need a file transfer tracking table. With the tracking table you can skip the steps that were already executed. The contents of the tracking table are described in the multiple-target file transfer section.
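A minimal control-flow sketch of such a rerun-safe process is shown below. The helper methods are hypothetical placeholders for the tracking-table queries and the actual file operations; only the skip logic is the point here.

// Sketch of a rerun-safe single-target transfer. Each completed step is
// recorded in the tracking table, so a resubmitted run skips the steps that
// already executed.
public class SingleTargetTransfer {

    public void run(String fileName) {
        String status = readStatus(fileName);            // e.g. null, "delivered", "archived"
        if (!"delivered".equals(status) && !"archived".equals(status)) {
            sendFile(fileName);                           // step 1: transfer the file to the target
            updateStatus(fileName, "delivered");
        }
        if (!"archived".equals(readStatus(fileName))) {
            archiveAndDelete(fileName);                   // step 2: archive and delete from source
            updateStatus(fileName, "archived");
        }
    }

    // Placeholders: a real process would query/update the tracking table and move
    // files; they are stubbed here only to keep the sketch self-contained.
    private String status;
    private String readStatus(String fileName) { return status; }
    private void updateStatus(String fileName, String newStatus) { status = newStatus; }
    private void sendFile(String fileName) { /* push the file to the target */ }
    private void archiveAndDelete(String fileName) { /* archive and remove from source */ }
}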

Common Error Table


When an error happens, exception handling processes notify the support team, raise tickets in Remedy, and sometimes use log4j to log error messages on the server. In addition to these, it is good to have a common error table to log the error. It helps us to analyze system health and can be integrated directly with other tools.

The following can be the fields of the common error table.

Error key: This is the primary key of the table. This can be a sequence.

Program Name: Name of the program. This is applicable where there are multiple projects under a single program.

Project Name: Name of the project under the program.

Process Name: Name of the process logging the error.

Event Name: Name of the event, message, or entity.

MessageId: The MessageId uniquely identifies the message. This can be used for reconciliation purposes. This field is optional.

Source: Source System name.

Target: Target System name.

Error Code: This field contains the error code.

Error Message: Detailed error message.

Error Summary: Brief description of the error.

Create-timestamp: Timestamp of when the error occurred.

Incident number: The error can be linked to the Remedy ticket number when a ticket is raised automatically in Remedy.

Connector name: Name of the connector causing the error. This field is optional.

Attribute1, Attribute2, Attribute3: You can store any other important information in these fields, such as ids or numbers, to help troubleshoot issues.
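A minimal JDBC sketch of writing to such a table from an exception handler is shown below. The table and column names follow the field list above and are assumptions; the error key is assumed to be generated from a database sequence or identity column.

import java.sql.*;

// Sketch of inserting a row into the common error table when an exception is handled.
public class ErrorLogger {

    public static void logError(Connection conn, String processName, String eventName,
                                String messageId, String source, String target,
                                String errorCode, Exception e) throws SQLException {
        String sql = "INSERT INTO common_error "
                   + "(process_name, event_name, messageid, source, target, "
                   + " error_code, error_summary, error_message, create_timestamp) "
                   + "VALUES (?, ?, ?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, processName);
            ps.setString(2, eventName);
            ps.setString(3, messageId);
            ps.setString(4, source);
            ps.setString(5, target);
            ps.setString(6, errorCode);
            ps.setString(7, e.getMessage());       // brief error summary
            ps.setString(8, stackTraceOf(e));      // detailed error message
            ps.executeUpdate();
        }
    }

    // Captures the full stack trace as the detailed error message.
    private static String stackTraceOf(Exception e) {
        java.io.StringWriter sw = new java.io.StringWriter();
        e.printStackTrace(new java.io.PrintWriter(sw));
        return sw.toString();
    }
}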

Storing the Messages for Retry

Sometimes we need to store messages in the database for retry when they cannot be pushed to the target on the first attempt.

Again, we need to think about which fields are required in the database table that stores the messages; otherwise you will keep adding fields later, which requires changes to code and documents. So you should decide up front which fields are required in addition to the message payload. Typically you should have the following fields.

Message ID: This is the primary key of the table. It is a unique id retrieved from the message payload. Each message should have a message id which uniquely identifies that particular message. If you receive a message with the same message id, it is a duplicate and you should reject it. Sometimes the payload does not carry a unique message id; in that case try to identify a field, or a combination of fields, that uniquely identifies a message and can act as the key.

Program: Name of the program. This is applicable for a large initiative where there are multiple projects under a single program. For a small project, the program name and project name can be the same.

Project: Name of the project under the program.

Process: Name of the process producing the message.

Message Type: Message or event type. You can use it to correlate the messages to be processed by a particular process. Although you can correlate using the process name, it is good to have this field.

Source System: The source system sending the message to the destination.

Destination System: The target system to which the message is to be sent.

Message-Payload: The message payload. This can be a text or CLOB type column.

Received-timestamp: Timestamp of when the message was received.

Last-retried timestamp: Timestamp of the last attempt to push the message to the target.

Retry-flag: This is a Boolean flag. It denotes whether the message should be retried or not. If the message is successfully pushed to the destination, the retry flag should be set to "N".

Retry count: Number of times the message has been retried. A retry may not succeed on the first attempt, so you should increment this counter every time you attempt to push the message to the target.

Message status: Whether the message was sent successfully or not. Store "success" if the message was successfully sent to the destination; otherwise store "failure".

Error Code: This is optional. You may store the error code to record the reason for the retry failure.

Error Description: This is optional. You can store the detailed error message.

Also remember that the retry process should call the common audit process. The retry process should be a common reusable module of the main process which tries to push the message to the destination.
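A minimal sketch of a retry sweep over such a table is given below. The table and column names follow the field list above, and pushToTarget stands in for the actual delivery call; both are assumptions.

import java.sql.*;

// Sketch of a retry sweep: pick up messages still flagged for retry, attempt to
// push them, then update the flag, status, retry count and retry timestamp.
public class MessageRetrier {

    public static void retryPending(Connection conn) throws SQLException {
        String select = "SELECT message_id, message_payload FROM message_store WHERE retry_flag = 'Y'";
        String update = "UPDATE message_store SET retry_flag = ?, message_status = ?, "
                      + "retry_count = retry_count + 1, last_retried_timestamp = CURRENT_TIMESTAMP "
                      + "WHERE message_id = ?";
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(select);
             PreparedStatement ps = conn.prepareStatement(update)) {
            while (rs.next()) {
                String id = rs.getString("message_id");
                boolean delivered = pushToTarget(rs.getString("message_payload")); // hypothetical send
                ps.setString(1, delivered ? "N" : "Y");       // stop retrying once delivered
                ps.setString(2, delivered ? "success" : "failure");
                ps.setString(3, id);
                ps.executeUpdate();
            }
        }
    }

    // Placeholder for the actual push to the destination system.
    private static boolean pushToTarget(String payload) {
        return false;
    }
}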

Common Audit Log Process


Keeping an audit log is very important in message-oriented middleware. Sometimes the standard audit log features available with ESB tools are not enough and we need to build a custom audit log. We should create a common reusable process which is called by the main process to create the audit log.

There should be a standard table defined to contain the audit log. However, it will be slower if every process accesses the table directly to create audit records. The best practice is to create an audit queue to which messages are published, with a subscriber that reads the audit queue and writes the audit log to the table.

The audit table should contain metadata about the messages. We can generate various reports from the audit log and get a complete insight into how the ESB is performing.

Sometimes we create a custom audit table without giving due consideration to its fields, and later on we keep adding new fields.

Typically the audit table should have the following attributes.

Audit Key: Primary key of the table. This should be incremented by one every time a record is inserted.

Program Name: Name of the program. This is applicable where under a single program there are multiple projects.

Project Name: Name of the project under the program.

Process Name: Name of the process producing the audit log.

Event Name: Name of the event, message, or entity.

MessageId: The MessageId uniquely identifies the message. This can be used for reconciliation purposes.

Payload: This is optional. We do not encourage storing the payload in the database; it increases the size of the audit table rapidly and may degrade performance. Also, if the payload contains sensitive data, you should not store it as plain text; in that case, you need to encrypt the payload.

Source: Source System name.

Target: Target System name.

Error Code: This field contains the error code.

Error Message: Detailed error message.

Status: The success or failure status of the message, in terms of its transfer from source to target.

Received Timestamp: Timestamp of when the message was received.

Sent Timestamp: Timestamp of when the message was sent/committed.

Log Timestamp: Timestamp of when the message was logged.

Server Name: In a clustered environment it is handy to store the name of the server that processed the message, so you can monitor each server's message-processing performance.

Attribute1, Attribute2, and Attribute3: You should have some optional attributes in the table, which you can use to store additional attributes of the messages.

This audit log can be used not only for message transfer but also for other patterns such as file transfer, bulk data transfer, etc.

It is better to accumulate all the data and call the audit log process at the end of the main process.

You can have a common sub-process, or you can post the data directly to the audit log queue.

We can implement the queue-based audit log with two processes.

First process

The main process calls the audit publisher process. This process validates the messages and writes them to the queue.

Second process.

This is a subscriber which reads the messages from the queue and writes them to a database table.

You may also send the messages directly to the audit queue, in which case you don't need the publisher process. You then need to validate the messages before sending them to the audit queue; otherwise the subscriber may reject them.
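A minimal sketch of the two processes, assuming a JMS 2.0 provider, is shown below. The queue, the audit record format, and the writeAuditRow method are assumptions; writeAuditRow stands in for the JDBC insert into the audit table.

import javax.jms.*;

// Sketch of the queue-based audit log: the publisher side writes an audit record
// to the audit queue, and the subscriber side listens on that queue and hands
// each record to writeAuditRow, which would insert it into the audit table.
public class AuditLog {

    // Called from the main process at the end of its run.
    public static void publish(ConnectionFactory factory, Queue auditQueue, String auditRecord) {
        try (JMSContext context = factory.createContext()) {
            context.createProducer().send(auditQueue, auditRecord);
        }
    }

    // Subscriber process: listens to the audit queue and writes each message to the table.
    public static void subscribe(ConnectionFactory factory, Queue auditQueue) {
        JMSContext context = factory.createContext();
        JMSConsumer consumer = context.createConsumer(auditQueue);
        consumer.setMessageListener(message -> {
            try {
                writeAuditRow(((TextMessage) message).getText());
            } catch (JMSException e) {
                e.printStackTrace();
            }
        });
        context.start();
    }

    // Placeholder for the JDBC insert into the audit table.
    private static void writeAuditRow(String auditRecord) {
        System.out.println("AUDIT: " + auditRecord);
    }
}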

Publish Subscribe Delayed Response

This is a publish-subscribe pattern variant.

In this case the integration process needs to send an acknowledgment to the source application once messages are delivered successfully to the target application. But successful delivery cannot be determined from the first response sent by the target application. The target application might have its own messaging queue: it stores the messages in that queue and sends a response to the caller before applying them to its database. This is the first response. So, to know whether the messages were actually applied and committed in the target application's database, you need to wait a couple of minutes for the final response.

You can have a batch process which periodically invokes a web service to get the final status of the messages from the target application.

The subscriber process reads the message queue and tries to push the messages to the target. It writes the success or failure of each message to the status table.

The batch process reads the status table (populated by the subscriber process), invokes the web service to get the status, and updates the status code in the table. The table can have key fields such as message id, transaction id, batch number, or other business keys, and the web service call can pass any of these keys as a parameter to get the status.

The source application can invoke a web service periodically to get the status. The integration layer should expose a web service which reads the table using the key values passed in the request and returns the status to the source application.

If the source application can provide a callback web service, then the batch process can send the acknowledgement directly to the source application using the web service callback.
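A minimal sketch of such a batch process is given below, assuming a message_status table and a hypothetical client that wraps the target application's status web service.

import java.sql.*;

// Sketch of the batch process for the delayed-response variant: read the pending
// rows from the status table, ask the target for the final status via the web
// service client, and update the table.
public class FinalStatusPoller {

    public static void poll(Connection conn, TargetStatusClient client) throws SQLException {
        String select = "SELECT message_id FROM message_status WHERE status = 'pending'";
        String update = "UPDATE message_status SET status = ? WHERE message_id = ?";
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(select);
             PreparedStatement ps = conn.prepareStatement(update)) {
            while (rs.next()) {
                String id = rs.getString("message_id");
                String finalStatus = client.getStatus(id);   // e.g. "committed" or "failed"
                ps.setString(1, finalStatus);
                ps.setString(2, id);
                ps.executeUpdate();
            }
        }
    }

    // Hypothetical client wrapping the target application's status web service.
    public interface TargetStatusClient {
        String getStatus(String messageId);
    }
}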

Publish Subscribe using Messaging as a Service


This is another variant of the publish-subscribe design pattern.

Sometimes an organization offers "messaging as a service". That means the source application can publish and subscribe to messages directly on the message queue or topic. The messaging service should support the HTTPS protocol, and external applications should connect to it over HTTPS. A JMS connector can be used if the source application is on the intranet.

The target application can listen to the queue directly. But sometimes the target application does not have the capability to listen to the queue or to access it to retrieve messages. In that case we need a subscriber process which listens to the queue and pushes the messages to the target.

To implement this design pattern we need a subscriber process and a batch process to handle failed messages. The integration layer does not need any publisher process.

The subscriber and batch processes are the same as described in the main publish-subscribe design pattern.

Additionally, the subscriber process can validate the message. If the message is not valid, it may invoke a callback web service provided by the source application to send the acknowledgment.

The acknowledgement can be sent to the source application from the subscriber process or the batch process, as required.

Publish Subscribe using selector process


 

This is another variant of the publish-subscribe design pattern.

We can implement this design pattern using the JMS message selector API. The publisher publishes the messages to a common queue. The selector process listens to that queue and sends the messages to destination-specific queues.

The selector process uses specific filter criteria to route the messages to the different destination queues.

 

We need at least four processes to implement this design pattern.

Publisher process: this is the same as described in the earlier section.

Selector process.

 

This process uses selection criteria to select messages from the common queue. It can be a batch process or a listener process.

The selector process selects messages from the common queue based on the filter criteria and moves them to the destination-specific queue.

We can have multiple selector processes depending on the use case, for example a separate selector process for each destination. In that case governance, error handling, and monitoring of the processes are easier. A sketch of such a selector process is given at the end of this section.

Subscriber process: as described in the earlier section. Normally you should have a separate subscriber process for each destination.

Batch process to handle failed messages.

You should have a separate batch process to handle the failed messages for each destination; this gives you more control and granularity. If you can parameterize your connection URL, you can run the same batch job with different connection parameters. Normally you should have a single table to hold all the error messages.
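As referenced in the selector process description above, here is a minimal sketch of one selector process, assuming a JMS 2.0 provider. It consumes only those messages on the common queue whose "destination" property matches its target (a JMS message selector) and republishes them to that target's queue. The property and queue names are assumptions, and the publisher is assumed to set the destination property on each message it publishes.

import javax.jms.*;

// Sketch of a selector process: a JMS message selector limits delivery to the
// messages tagged for this target, which are then forwarded to that target's queue.
public class SelectorProcess {

    public static void run(ConnectionFactory factory, Queue commonQueue,
                           Queue targetQueue, String targetName) {
        JMSContext context = factory.createContext();
        // JMS message selector: only messages tagged for this target are delivered here.
        String selector = "destination = '" + targetName + "'";
        JMSConsumer consumer = context.createConsumer(commonQueue, selector);
        JMSProducer producer = context.createProducer();
        consumer.setMessageListener(message -> producer.send(targetQueue, message));
        context.start();
    }
}

Running one instance of this process per destination keeps each route independent, which matches the per-destination governance and error handling discussed above.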