Failover enables a suspended Service Manager server to "shadow" a live Service Manager server. If the live server fails, the suspended server leaves suspended state and begins full operation. The servers can opt to share a configuration, allowing single-point management of the system. The following image illustrates this shared configuration.
The back-up Service Manager is started with the command line switch -b or the Java system property iway.backup set to true. The default is live operation. The back-up engine monitors the designated IP channel for activity initiated by the live system. If no activity is detected for a period of time, the back-up system enters live mode.
When a hot back-up system goes live, it behaves as a live system: it uses its own configured back-up location to begin attempting heartbeats.
Start the failover (back-up) system with the -b parameter (iwsrv -b) or with the Java system property iway.backup=true. Either option instructs the server to start as a back-up server. You can configure the failover properties using the Service Manager console.
The following table lists and describes the configuration properties.
Property | Description |
---|---|
Location of Backup (Note: On live system) | Location of the live system's failover partner. Each URL entry must carry an attribute of the name of the server to which it applies. Live systems send heartbeat signals to this location. The location is in the form host:port, for example, 1:8989. |
Heartbeat Port (Note: On backup system) | Port on which the back-up server listens for the live system heartbeat, for example, 8989. |
Threshold (Note: On backup system) | Period, in seconds, to wait for a live signal. |
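As an illustration of the host:port form used by the Location of Backup property, the following minimal Python sketch splits such a value into its two parts. The function name `parse_location` and the example value `iam1:1200` are chosen here for illustration; this is not how Service Manager itself parses the setting.

```python
from typing import Tuple


def parse_location(location: str) -> Tuple[str, int]:
    """Split a 'host:port' string into (host, port).

    Hypothetical helper for illustration only; not part of iWay Service Manager.
    """
    host, sep, port_text = location.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {location!r}")
    port = int(port_text)  # raises ValueError if the port is not numeric
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port


if __name__ == "__main__":
    print(parse_location("iam1:1200"))
```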
The listener monitor shows the back-up listeners in the state of WAITING.
The location is extracted on start up. Each live server can have only one back-up server.
Periodically, the server sends a heartbeat signal to the host:port identified as the failover for the server.
When a stop command is entered, the stop signal is sent to the back-up server to prevent it from becoming live.
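The live-side behavior described above (a periodic heartbeat, then a stop signal on shutdown) can be sketched as follows. This is a simplified model, not the Service Manager implementation: the payloads `HEARTBEAT` and `STOP`, the class name `LiveHeartbeat`, and the heartbeat interval are all assumptions, because the actual wire format and send interval are not documented here. The `send` callable stands in for whatever transport delivers the signal to the configured host:port.

```python
from typing import Callable

# Hypothetical payloads; the real signal format is not documented here.
HEARTBEAT = b"HEARTBEAT"
STOP = b"STOP"


class LiveHeartbeat:
    """Models a live server that emits heartbeats until it is stopped."""

    def __init__(self, send: Callable[[bytes], None], interval_s: float):
        self.send = send              # delivers a payload to the back-up location
        self.interval_s = interval_s  # assumed seconds between heartbeats
        self.next_due = 0.0
        self.stopped = False

    def tick(self, now: float) -> None:
        """Send a heartbeat if one is due at time `now` (seconds)."""
        if not self.stopped and now >= self.next_due:
            self.send(HEARTBEAT)
            self.next_due = now + self.interval_s

    def stop(self) -> None:
        """Send the stop signal once so the back up does not go live."""
        if not self.stopped:
            self.send(STOP)
            self.stopped = True


if __name__ == "__main__":
    sent = []
    live = LiveHeartbeat(sent.append, interval_s=1.0)
    for t in (0.0, 0.5, 1.0):
        live.tick(t)
    live.stop()
    print(sent)
```

Passing the clock value into `tick` rather than reading the system time keeps the sketch deterministic and easy to exercise without a network.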
Upon start up, the back-up system checks for the <hotbackup> configuration. It sets an accept operation on the back-up port and awaits signals. After the first signal arrives, it begins a countdown clock for the threshold time period. One heartbeat signal is required to place the back-up system in failover mode; this allows the back-up system to be started before the live system.
As signals arrive, a timeout clock is reset. Should the time expire, the failover system enters live mode.
As a best practice, the time-out threshold should never be less than three seconds, and preferably it should be at least five seconds.
As each heartbeat arrives, it is checked to determine whether this is a stop signal. If so, the failover returns to initial mode to await a heartbeat signal to restart the cycle.
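The back-up cycle described above can be modeled as a small state machine. The following Python sketch is an interpretation of that description, not the engine's actual code: the state names, the class `BackupMonitor`, and its method names are hypothetical. It shows the three behaviors in the text: no countdown starts until the first heartbeat arrives, each heartbeat resets the timeout clock, and a stop signal returns the back up to its initial mode.

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting"        # initial mode: no heartbeat seen yet
    MONITORING = "monitoring"  # countdown clock is running
    LIVE = "live"              # threshold expired; back up has taken over


class BackupMonitor:
    """Hypothetical model of the back-up heartbeat cycle."""

    def __init__(self, threshold_s: float):
        self.threshold_s = threshold_s
        self.state = State.WAITING
        self.deadline = None

    def on_signal(self, now: float, is_stop: bool = False) -> None:
        """Handle a heartbeat or stop signal received at time `now`."""
        if self.state is State.LIVE:
            return  # already live; signals are ignored in this sketch
        if is_stop:
            # Stop signal: return to initial mode and await a new heartbeat.
            self.state = State.WAITING
            self.deadline = None
        else:
            # Heartbeat: start or reset the timeout clock.
            self.state = State.MONITORING
            self.deadline = now + self.threshold_s

    def on_tick(self, now: float) -> State:
        """Check the clock; enter live mode if the threshold has expired."""
        if self.state is State.MONITORING and now >= self.deadline:
            self.state = State.LIVE
        return self.state


if __name__ == "__main__":
    monitor = BackupMonitor(threshold_s=5.0)
    monitor.on_signal(now=0.0)       # first heartbeat starts the countdown
    print(monitor.on_tick(now=4.0))  # still within the threshold
    print(monitor.on_tick(now=5.0))  # threshold expired
```

Because the countdown begins only after the first heartbeat, a back up started before its live partner stays in the initial state indefinitely instead of taking over.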
When the system becomes live, it begins normal operation. A common practice is to point a file listener at a start-up document that is emitted by e-mail. This informs an administrator that a hot back up occurred.
Caution: The engine alone cannot institute a complete hot back-up capability. Protocols that carry only virtual names can be switched over. Others, especially the TCP-based protocols, cannot be switched over. Unless the hot back up is on the same computer as the failing system it takes over (thus voiding some of the purpose of hot back up), the field client must "know" the host name. Accordingly, customers must use commercial TCP switches to alleviate this issue.
This issue is less critical in cases where iWay Software is on both sides of the interface, as is the case with MQSI nodes. In this case, a retry for a second host address can be made part of the recovery cycle.
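A retry against a second host address, as mentioned above, might look like the following Python sketch on the client side. It is a generic pattern under stated assumptions, not an iWay API: the function `connect_with_fallback`, the host names, and the `connect` callable (which stands in for whatever opens the real connection) are all hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def connect_with_fallback(hosts: Sequence[str], connect: Callable[[str], T]) -> T:
    """Try each host in order and return the first successful connection.

    `connect` is a caller-supplied function (hypothetical here) that raises
    OSError, or a subclass such as ConnectionRefusedError, on failure.
    """
    last_error = None
    for host in hosts:
        try:
            return connect(host)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next host
    if last_error is None:
        raise ValueError("no hosts supplied")
    raise last_error


if __name__ == "__main__":
    def fake_connect(host: str) -> str:
        # Simulates a failed live host and a reachable back-up host.
        if host == "live-host":
            raise ConnectionRefusedError(host)
        return f"connected:{host}"

    print(connect_with_fallback(["live-host", "backup-host"], fake_connect))
```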
Both systems must be on the same side of the firewall, and the back-up system must be reachable from the active system through TCP.
In the following example, the hot back-up machine is on server iam1, which is listening on port 1200. The live server uses this entry to determine where to send heartbeats. The back-up server uses the heartbeat port entry to determine the port on which it is listening and the threshold field to determine the number of seconds to tolerate a loss of heartbeat before attempting to take over.
Note: The backup utility is available only from the iWay Adapter Manager Configuration Console. To navigate to the Adapter Manager console, click the build version number (for example, smsp1.7105) in the upper-right corner of the iSM console.
The following image illustrates this example.
A single Service Manager cannot be both a live server and a back-up server. The -b parameter determines whether the server is a back up. Only the Location of Backup field applies to live servers; the Heartbeat Port and Threshold fields apply to the back-up server. Because a server can run as either live or back up depending on how it is started, all fields are available.
iWay Software does not recommend a hot back up on the same machine as the live server, because the hot back up is intended to compensate for the unexpected loss of a complete system. Usually, this is caused by loss of the computer itself; having the hot back up on the same machine would therefore result in the loss of both.
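To make the role-dependent fields concrete, the following Python sketch selects which settings take effect for a given start-up mode and classifies a threshold against the three-second and five-second guidance given earlier in this section. The function names, dictionary keys, and returned labels are hypothetical and do not correspond to actual Service Manager configuration names.

```python
def effective_settings(is_backup: bool, location: str,
                       heartbeat_port: int, threshold_s: int) -> dict:
    """Return only the fields that apply to the chosen role.

    Hypothetical illustration: a back up uses the heartbeat port and
    threshold, while a live server uses only the back-up location.
    """
    if is_backup:
        return {"heartbeat_port": heartbeat_port, "threshold_s": threshold_s}
    return {"location": location}


def classify_threshold(threshold_s: float) -> str:
    """Label a threshold using the best-practice guidance in this section."""
    if threshold_s < 3:
        return "below minimum"    # should never be less than three seconds
    if threshold_s < 5:
        return "acceptable"       # allowed, but five or more is preferred
    return "recommended"


if __name__ == "__main__":
    # The same stored values yield different effective settings per role.
    print(effective_settings(False, "iam1:1200", 1200, 5))
    print(effective_settings(True, "iam1:1200", 1200, 5))
    print(classify_threshold(5))
```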
The following configuration works but is not recommended (see Deploying iWay in a High Availability Environment). The following example shows two servers, serv_a and serv_b. Each uses the other as a back up, using the same port. Usually, serv_a is the live server, and serv_b is the back up. Both use port 1200. The serv_a configuration is shown in the following image.
The serv_b configuration is shown in the following image:
You can start either server as the back up using the -b parameter.
For more information on how to manage server failover and deploy iWay Service Manager in a high availability environment, see Deploying iWay in a High Availability Environment.