Bulk Processing

Bulk processing means that a large number of messages, documents, or records from a source are processed by the ESB and sent to the target in a single execution.

The source application sends a large number of messages to the ESB layer in one go. The ESB layer enriches and transforms each message separately and sends it to the destination.

The source application can push the messages to the ESB, or the ESB can pull them from the source application. The ESB can pull messages from a file, a database, a web service, or another interface provided by the source application.

The middleware can process the messages in different ways:

 Parallel Processing

[Figure: bulk processing, parallel]

The source application sends a number of messages to the ESB in one go, and the ESB enriches the messages and passes them to the destination in parallel.

The source application invokes a web service provided by the ESB and pushes a large number of messages to it. The ESB splits the batch into individual messages and processes each message in parallel.
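The split-and-process-in-parallel step can be sketched in Python with a thread pool. This is only an illustration, not how any particular ESB product implements it: enrich_and_transform, send_to_target, and the message layout are hypothetical stand-ins, and max_workers is an assumed knob that caps the number of concurrent connections to the target.

```python
from concurrent.futures import ThreadPoolExecutor


def enrich_and_transform(message):
    # Hypothetical enrichment: upper-case the payload and flag the message.
    return {**message, "payload": message["payload"].upper(), "enriched": True}


def send_to_target(message):
    # Stand-in for the real call to the destination system.
    return "delivered:{}".format(message["id"])


def process_one(message):
    return send_to_target(enrich_and_transform(message))


def process_parallel(batch, max_workers=4):
    # Each message in the batch is handled by a worker thread;
    # pool.map returns results in the same order as the input batch.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_one, batch))


batch = [{"id": i, "payload": "msg-{}".format(i)} for i in range(5)]
results = process_parallel(batch)
```

Raising max_workers speeds things up only as long as the target accepts that many parallel connections and the server has the memory and CPU to spare.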

Usage:

Use this pattern for faster processing when the target supports parallel connections and the number of messages is relatively small. Use it only when the server has enough available memory and CPU, because this pattern consumes a lot of server resources.

Sequential Processing

The source application sends a large number of messages in one go, and the ESB processes the messages sequentially.

Usage: Use this pattern when the target does not support parallel connections.
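A minimal Python sketch of the sequential pattern follows. TargetConnection is a hypothetical stand-in for a target that accepts only one connection at a time, and the enrichment step is an assumed placeholder.

```python
class TargetConnection:
    """Hypothetical target that accepts a single connection."""

    def __init__(self):
        self.received = []

    def send(self, message):
        self.received.append(message["id"])


def process_sequential(batch, connection):
    # One message at a time, in arrival order, over the same connection.
    for message in batch:
        enriched = {**message, "enriched": True}  # placeholder enrichment
        connection.send(enriched)
    return len(batch)


connection = TargetConnection()
sent = process_sequential([{"id": 3}, {"id": 1}, {"id": 2}], connection)
```

The target sees the messages in exactly the order they arrived in the batch, at the cost of total time growing with the batch size.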

Hybrid Processing

This is a combination of parallel and sequential processing.

Here a large batch of messages is split into smaller batches, and the batches are processed in parallel while the messages within each batch are processed in sequence. The reverse is also possible: process the batches in sequence and the messages within each batch in parallel.
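The first variant (batches in parallel, messages within a batch in sequence) might look like the following Python sketch. The function names, chunk_size, and max_workers are assumptions chosen for illustration, and process_one stands in for the real enrich, transform, and send steps.

```python
from concurrent.futures import ThreadPoolExecutor


def process_one(message):
    # Stand-in for enrich + transform + send to target.
    return "delivered:{}".format(message["id"])


def split_into_chunks(batch, chunk_size):
    # Split the large batch into smaller batches of at most chunk_size items.
    return [batch[i:i + chunk_size] for i in range(0, len(batch), chunk_size)]


def process_chunk_sequentially(chunk):
    # Messages inside one small batch are handled one after another.
    return [process_one(message) for message in chunk]


def process_hybrid(batch, chunk_size=3, max_workers=2):
    chunks = split_into_chunks(batch, chunk_size)
    # The small batches themselves run in parallel, one worker per batch.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_chunk = list(pool.map(process_chunk_sequentially, chunks))
    # Flatten the per-batch results back into a single list.
    return [result for chunk_result in per_chunk for result in chunk_result]


batch = [{"id": i} for i in range(7)]
results = process_hybrid(batch)
```

Here max_workers bounds the number of parallel connections to the target, and chunk_size controls how much work each connection does in sequence.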

Usage:

Parallel processing is faster but needs a lot of server resources, and the target application must support parallel processing. Sequential processing, on the other hand, uses fewer resources but can be slow. Use the hybrid pattern to strike a balance between the two.

Error Handling

Messages in a batch that cannot be posted to the destination should be retried later. They can be stored in an error table for retry. The layout of the table is given in another post. A message lookup should be done against the error table so that retrying an old message does not overwrite a newer one.
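One way to picture the retry and lookup logic is the Python sketch below, using an in-memory SQLite table. The table layout here (record_key, version, payload) is a hypothetical simplification and not the layout from the other post. The target dictionary and the fail flag are stand-ins for the real destination and for a failed delivery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified error table: one failed message per record key.
conn.execute(
    "CREATE TABLE error_table ("
    "record_key TEXT PRIMARY KEY, version INTEGER, payload TEXT)"
)

target = {}  # stand-in for the destination: record_key -> (version, payload)


def post_to_target(key, version, payload, fail=False):
    if fail:  # simulate an unavailable destination
        raise ConnectionError("target unavailable")
    target[key] = (version, payload)


def process(key, version, payload, fail=False):
    try:
        post_to_target(key, version, payload, fail)
    except ConnectionError:
        # Delivery failed: park the message in the error table for retry.
        conn.execute(
            "INSERT OR REPLACE INTO error_table VALUES (?, ?, ?)",
            (key, version, payload),
        )
    else:
        # Lookup against the error table: a newer message got through,
        # so any older failed copy must not be retried over it.
        conn.execute(
            "DELETE FROM error_table WHERE record_key = ? AND version <= ?",
            (key, version),
        )


def retry_failed():
    rows = conn.execute(
        "SELECT record_key, version, payload FROM error_table"
    ).fetchall()
    for key, version, payload in rows:
        try:
            post_to_target(key, version, payload)
        except ConnectionError:
            continue  # still failing: leave it in the table for the next retry
        conn.execute("DELETE FROM error_table WHERE record_key = ?", (key,))


process("A", 1, "old", fail=True)  # fails and is stored for retry
process("A", 2, "new")             # newer message succeeds, stale entry removed
process("B", 1, "b1", fail=True)   # fails and is stored for retry
retry_failed()
remaining = conn.execute("SELECT COUNT(*) FROM error_table").fetchone()[0]
```

After the retry, record A still holds the newer payload because its stale failed copy was removed from the error table before the retry ran.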

Scheduled Bulk Process

Here the ESB process pulls the data from the source system; the data may come from a file, a database, or messages returned by a web service. This is normally a scheduled process. Sometimes a firewall does not allow external systems to push connections to internal targets; when push is not allowed, the pull method can be used. The pull process can be scheduled to periodically pull data from the source and send it to the target. After pulling the messages from the source, the ESB enriches and transforms them and sends them to the target. It can use any of the design patterns described above: parallel, sequential, or hybrid processing.
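A scheduled pull can be sketched in Python as a simple polling loop. Everything here is an assumption for illustration: the source is modelled as a plain list, pull_from_source stands in for reading a file, database, or web service, and a real deployment would more likely use the scheduler of the ESB or the operating system than a sleep loop.

```python
import time


def pull_from_source(source):
    # Hypothetical pull: take everything currently waiting at the source.
    pulled = list(source)
    source.clear()
    return pulled


def run_scheduled_pull(source, deliver, interval_seconds, max_runs, sleep=time.sleep):
    delivered = 0
    for run in range(max_runs):
        for message in pull_from_source(source):
            # Enrich and transform (placeholder), then send to the target.
            deliver({**message, "enriched": True})
            delivered += 1
        if run < max_runs - 1:
            sleep(interval_seconds)  # wait until the next scheduled pull
    return delivered


source_queue = [{"id": 1}, {"id": 2}]
target_inbox = []
count = run_scheduled_pull(
    source_queue, target_inbox.append, interval_seconds=0, max_runs=2
)
```

The inner loop delivers messages sequentially; it could be swapped for the parallel or hybrid pattern without changing the scheduling part.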