Sandbox Server Topology

This is an uncontrolled server. Access is normally given to every developer to play with menus, interfaces, configuration and so on. It can also be used for patch testing and version upgrades. You don’t need this environment if you don’t want to invest more in infrastructure, but we recommend having it.

Preprod Server Topology

The preproduction environment should be an exact replica of the production environment, although you don’t need a DR setup for preprod or the corresponding WAF and LB setup. It should have the same infrastructure: the same hardware in terms of machines, CPU, memory and peripherals. This environment can be used as a staging area before code is moved to production. It can also be used for performance testing. Because preprod is a replica of production, performance testing there shows how the system should react in production under the same load, so you can determine how interfaces will behave under different load conditions in production.

A preprod environment is not mandatory. Many server setups in reputed organizations do not have preprod. But it is good to have, especially when performance is of paramount importance and performance testing must be done before interfaces are migrated to production.

You can simulate the normal production load using preprod. You can also simulate disaster-situation load by switching servers on and off.

For a two-node cluster topology, preprod can have 2 nodes in the primary site. You don’t need DR site servers.

For a four-node cluster topology, preprod should have 4 nodes: 2 nodes in the primary site and 2 nodes in the DR site.

QA Dev Server Topology

QA servers are used for user acceptance testing. They can also be used for integration testing before the actual user acceptance testing.

The Dev server is used for development and unit testing. It can also be used for dry runs.

It is better to have a similar configuration for the Prod, Preprod, QA and Dev servers. If prod has a server cluster, the lower environments should also have a server cluster. Very often this is skipped to save money on infrastructure, which creates inconvenience at a later stage.

Sometimes lower environments like QA and Dev have a single-server installation with no server cluster. This creates problems. You may get correct data in those environments but incorrect data in production, and processes can even fail. This is especially true when message sequencing is required: you will not notice the issue in a single-server environment. So multi-server and multi-threaded processing should be taken care of beforehand.
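The sequencing problem can be reproduced with a small simulation. The sketch below is a hypothetical model, not taken from any particular middleware product: messages arrive one per time unit, are dispatched round-robin to the nodes, and each node has an assumed fixed processing time. The function name `completion_order` and all timing values are illustrative assumptions.

```python
def completion_order(messages, node_delays):
    """Return message ids in the order they finish processing.

    messages    -- message ids in arrival order, one arriving per time unit
    node_delays -- assumed processing time per message for each node
    """
    node_free_at = [0.0] * len(node_delays)  # time at which each node becomes idle
    finished = []
    for arrival_time, msg in enumerate(messages):
        node = arrival_time % len(node_delays)         # round-robin dispatch
        start = max(arrival_time, node_free_at[node])  # wait if the node is busy
        done = start + node_delays[node]
        node_free_at[node] = done
        finished.append((done, arrival_time, msg))
    finished.sort()  # by completion time, ties broken by arrival order
    return [msg for _, _, msg in finished]


messages = ["create", "update", "delete"]

# Single server (typical QA/Dev setup): order is preserved.
single_node = completion_order(messages, node_delays=[0.5])

# Two-node cluster where one node is slower: "update" overtakes "create".
two_nodes = completion_order(messages, node_delays=[3.0, 0.5])

print(single_node)  # ['create', 'update', 'delete']
print(two_nodes)    # ['update', 'create', 'delete']
```

With one node the messages complete in arrival order, so a test in a single-server QA environment passes. With two nodes of different speed the same input completes out of order, which is the kind of defect that only shows up in a clustered environment.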

Also, the QA and Dev servers might be on the intranet while the prod server is in the cloud; this also creates issues.

If you have a public URL for the prod server, you should have a public URL for all the lower environments, and all testing (UT, SIT, UAT) should be done using the public URL. Your server may have both an intranet URL and an internet (public) URL. If you use the intranet URL in production, all testing in the lower environments should be done using the intranet URL. Do not mix intranet and internet URLs across environments. Likewise, if your prod server is in the cloud, the QA and Dev servers should all be in the cloud. Some installations have no intranet URL for production, or have the production server in the cloud while the test environment servers are on the intranet. Avoid this configuration and maintain uniformity across environments; otherwise you will face many unforeseen issues.
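One way to keep this uniformity visible is to record the key traits of each environment and compare them against production. The following is a minimal sketch under assumed names: the `Environment` fields, the `find_mismatches` helper and the sample values are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    name: str
    url_type: str    # "public" or "intranet"
    hosting: str     # "cloud" or "on_premises"
    clustered: bool  # True if the environment runs a server cluster


def find_mismatches(prod, lower_envs):
    """List (env, attribute, value, prod_value) where an environment differs from prod."""
    mismatches = []
    for env in lower_envs:
        for attr in ("url_type", "hosting", "clustered"):
            if getattr(env, attr) != getattr(prod, attr):
                mismatches.append(
                    (env.name, attr, getattr(env, attr), getattr(prod, attr))
                )
    return mismatches


# Hypothetical inventory: QA mirrors prod, Dev does not.
prod = Environment("prod", "public", "cloud", True)
qa = Environment("qa", "public", "cloud", True)
dev = Environment("dev", "intranet", "on_premises", False)

issues = find_mismatches(prod, [qa, dev])
for env_name, attr, value, expected in issues:
    print(f"{env_name}: {attr} is {value!r}, prod uses {expected!r}")
```

An empty result means the lower environments match production on the traits being tracked; any entry is a candidate source of the unforeseen issues described above.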

Four Nodes Server Cluster Topology

Four Nodes cluster:

In this topology, we have 2 nodes in the primary site and 2 nodes in the DR site. All the nodes are active and share the load equally. In the normal situation they are all connected to the primary site load balancer, and the DR site servers point to the primary site database and shared file system. If disaster strikes, traffic is routed through the DR site load balancer, and the DR site database and file system are used. So in the normal situation 4 nodes share the load, and during a disaster 2 nodes take the load. Likewise, when the DR site is not available, the primary site nodes take the entire load.
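The load figures above have a direct sizing consequence, shown in the short sketch below. It assumes traffic is spread evenly over the active nodes, as described; the function name `per_node_share` is an illustrative assumption.

```python
def per_node_share(active_nodes):
    """Fraction of total traffic each active node carries, assuming equal distribution."""
    if active_nodes <= 0:
        raise ValueError("at least one node must be active")
    return 1.0 / active_nodes


normal_share = per_node_share(4)    # both sites up: 4 active nodes
disaster_share = per_node_share(2)  # one site lost: 2 active nodes

print(f"normal: {normal_share:.0%} per node")      # normal: 25% per node
print(f"disaster: {disaster_share:.0%} per node")  # disaster: 50% per node
```

Each node carries a quarter of the traffic in the normal situation but half of it when one site is lost, so it is reasonable to size every node for half of the expected peak load rather than a quarter.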

Fault Tolerant architecture:

This is a very flexible and fault-tolerant server setup. At the same time it ensures equal utilization of resources. This architecture needs minimal human intervention in case of disaster.

Failover: In case of disaster, no special activities need to be performed from the server setup perspective. External application and database connection URLs may change if the external applications are also affected by the disaster; in that case the configuration needs to be changed accordingly. If the server uses an internal database, the database URL may also need to be changed, assuming the database is replicated using Oracle Data Guard or a similar technology. If the local DNS entry is changed instead, there is no need to change any connection URL.

Failback: No intervention is required unless the external URLs invoked by the servers also changed due to the disaster. The internal database connection URL may need to be changed. If the local DNS entry is changed for local or intranet applications, you may not need to change anything.

Disadvantage: This 4-node active cluster works well when the primary and DR sites are in close proximity and connected by a high-speed network. Don’t use this configuration if the sites are far apart or connected by a slow network, because some requests will then be served faster and some slower.
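The uneven response times follow from the DR site nodes using the primary site database in the normal situation. The sketch below is a rough model with assumed numbers: `processing_ms`, `db_round_trips` and the inter-site round-trip times are hypothetical values chosen only to show the effect.

```python
def response_times_ms(processing_ms, db_round_trips, inter_site_rtt_ms):
    """Estimate response time per site when every node uses the primary-site database."""
    primary = processing_ms  # database is local to the primary-site nodes
    dr = processing_ms + db_round_trips * inter_site_rtt_ms  # each DB call crosses sites
    return {"primary": primary, "dr": dr}


# Sites close together on a fast link (assumed 1 ms round trip).
nearby = response_times_ms(processing_ms=50, db_round_trips=5, inter_site_rtt_ms=1)

# Sites far apart or on a slow link (assumed 40 ms round trip).
distant = response_times_ms(processing_ms=50, db_round_trips=5, inter_site_rtt_ms=40)

print(nearby)   # {'primary': 50, 'dr': 55}
print(distant)  # {'primary': 50, 'dr': 250}
```

With nearby sites the difference between nodes is barely noticeable. With distant sites, requests that land on a DR node take several times longer than those served from the primary site, even though the load balancer treats all four nodes as equal.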

Two Node Server Cluster Topology

Server Clusters:

The middleware application server should be set up as per the fault-tolerant architecture.

In the two-node separate clusters topology, we have a two-node cluster in the primary site and a two-node cluster in the DR site.

The primary site nodes are active and carry the real traffic. The DR site nodes are passive; they do not take any workload while the primary site is operational. In case of disaster, the DR site cluster becomes active and takes the workload.

During production deployment, processes should be deployed in both the primary and DR site server clusters. Externalized variables / process properties should also be configured on both clusters accordingly.

Fault Tolerant architecture:

It has multilevel failure protection: server level, load balancer (LB) level and storage level. If any server is down, the load balancer forwards traffic to the other server.

From the internet, web traffic comes in via the WAF. If the primary site LB is down, the WAF can forward traffic to the DR site LB. But unless it is absolutely necessary, don’t channel traffic to the DR site, because you will need to sync data from the DR site back to the primary site during failback.

In case of storage/database failure at the primary site, this setup can leverage the DR site’s data, as the data is expected to be synced to the DR site in real time by storage replication tools.

Unlike a Java web application, a web service call is stateless: the server does not need to maintain state for the call, and each call is independent. So session replication is not required, though the server can replicate other information such as cache if distributed caching is used.

If the server stores operational data in a SAN or database, that data may be replicated to the DR site. But if a process does not need to store any information on disk or in a database, nothing needs to be replicated.
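The server-level and LB-level protection can be summarised as a routing decision. This is a minimal sketch of that decision, not the behaviour of any specific WAF or load balancer product. The `route` function and node names are hypothetical, and falling back to the DR site when every primary node is down is an added assumption.

```python
def route(primary_lb_up, primary_nodes_up, dr_nodes_up):
    """Pick the site and the nodes that should receive traffic.

    primary_nodes_up / dr_nodes_up -- dict of node name -> True if the node is healthy
    """
    healthy_primary = [name for name, up in primary_nodes_up.items() if up]
    if primary_lb_up and healthy_primary:
        # Server-level protection: the LB skips any node that is down.
        return "primary", healthy_primary

    healthy_dr = [name for name, up in dr_nodes_up.items() if up]
    if healthy_dr:
        # LB-level protection: the WAF forwards traffic to the DR site LB.
        return "dr", healthy_dr

    raise RuntimeError("no healthy node available in either site")


dr_nodes = {"dr1": True, "dr2": True}

one_node_down = route(True, {"p1": True, "p2": False}, dr_nodes)
primary_lb_down = route(False, {"p1": True, "p2": True}, dr_nodes)

print(one_node_down)    # ('primary', ['p1'])
print(primary_lb_down)  # ('dr', ['dr1', 'dr2'])
```

The primary site is always preferred while it can serve traffic, which matches the advice to avoid the DR site unless it is absolutely necessary.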

Failover:

In case of disaster, the source application can point to the DR site URL directly. Source code may need to be changed wherever URLs are hard coded. Normally, though, URLs are externalized variables kept in process properties, and those process properties can be updated directly from the server console without redeployment.

Another option is not to change the URL but to change the IP address the URL resolves to. In this case the requester application is not impacted by the disaster. The domain name server administrator should be asked to change the IP address of the URL. The only problem is that it may take some time for the server to point to the correct IP address, because the server will not initiate a new DNS lookup until the DNS record expires as per its TTL value. Typical TTL values in DNS records are set below 1 hour for web service calls, so the system can be expected to pick up the correct IP address within a couple of hours at most.

We should maintain a checklist of activities for failover. Not only the URL needs to change; other variables may need to change as well. Database/storage admin support may also be required, as replication needs to run in the opposite direction (from the DR site to the primary site). Nowadays modern middleware systems can provide an RPO of 0 and an RTO of less than 1 hour.
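The delay caused by the TTL when the IP address is changed in DNS can be illustrated with a toy resolver cache. This is a simplified sketch, not how any particular server or operating system resolver is implemented; the class `CachedResolver`, the hostname and the IP addresses are hypothetical.

```python
class CachedResolver:
    """Toy resolver that reuses a cached answer until its TTL expires."""

    def __init__(self, dns_records):
        self.dns_records = dns_records  # hostname -> (ip, ttl_seconds); hypothetical data
        self.cache = {}                 # hostname -> (ip, expires_at)

    def resolve(self, hostname, now):
        cached = self.cache.get(hostname)
        if cached and now < cached[1]:
            return cached[0]            # entry still valid: no new DNS lookup
        ip, ttl = self.dns_records[hostname]
        self.cache[hostname] = (ip, now + ttl)
        return ip


# The record points to the primary site with a 1 hour TTL.
records = {"service.example.com": ("10.0.0.10", 3600)}
resolver = CachedResolver(records)

before = resolver.resolve("service.example.com", now=0)

# Disaster: the DNS administrator repoints the name to the DR site.
records["service.example.com"] = ("10.1.0.10", 3600)

during_ttl = resolver.resolve("service.example.com", now=1800)  # cached entry still used
after_ttl = resolver.resolve("service.example.com", now=3600)   # cache expired, new lookup

print(before, during_ttl, after_ttl)  # 10.0.0.10 10.0.0.10 10.1.0.10
```

Until the cached record expires the client keeps calling the primary site address, which is why a shorter TTL shortens the failover window at the cost of more frequent lookups.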

Failback:

When the primary site comes back up, applications should use the primary site infrastructure again. All applications that are using the DR site URL/IP address should switch back to the primary site URL/IP address. Requester application configuration/properties should be changed again to point to primary site resources. The SAN and database of the DR site should be synced up with the primary site as required.