BAM Driver Recovery

BAM records information in a database while the application runs. When properly configured, multiple applications and configurations can share this database. BAM relies on being able to insert records into the database without errors, and most applications that use BAM depend on the BAM driver being able to update its database. The server allows the application to determine the action to take in the event of failure.

The BAM driver is designed to impose the least possible performance penalty on the application. The driver database update is asynchronous to the application itself. There is no mechanism for the application to check the status of the driver. For this reason, the driver and server must be in control of the recovery.

When the asynchronous updater detects a loss of connection to the database (SQL State 08xxx), it attempts to reconnect up to the number of times specified by the retryCount parameter, waiting between attempts for the period specified by the retryInterval parameter. If it cannot reconnect, the asynchronous updater shifts to recovery mode and reports the error to the driver itself. When all updaters have reported this condition, the driver begins recovery action.
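
The reconnect behavior described above can be sketched as follows. This is a minimal illustration, not the driver's implementation: the function names and the connect callable are hypothetical, the two arguments simply mirror the retryCount and retryInterval parameters, and treating the interval as seconds is an assumption.

```python
import time


def is_connection_loss(sqlstate):
    """Return True for SQL State values in class 08 (connection exceptions)."""
    return isinstance(sqlstate, str) and sqlstate.startswith("08")


def reconnect_with_retry(connect, retry_count, retry_interval, sleep=time.sleep):
    """Try to reconnect up to retry_count times.

    Waits retry_interval (assumed to be seconds) between attempts.
    Returns the new connection, or None when every attempt failed, in
    which case the caller would shift to recovery mode.
    """
    for attempt in range(1, retry_count + 1):
        try:
            return connect()  # hypothetical callable that opens a connection
        except ConnectionError:
            if attempt < retry_count:
                sleep(retry_interval)
    return None
```

Passing the sleep function as an argument keeps the sketch easy to exercise without real delays.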

The driver will attempt to run a process flow as configured in the Lost Connection Flow Name parameter of the driver configuration. You must create and publish this process flow to the system area of the configuration.

The process flow can take actions, such as notifying the user of the problem (for example, by sending an email). The post message (XDControlAgent) service might be used to stop accepting messages on other channels, effectively pausing the application. The process flow should end on one of the specifically named End nodes listed in the following table. This is the standard method by which a subflow reports its status back to a calling process flow. The designated action takes effect following the return from the process flow. For an application that does not depend upon BAM (for example, using it only for statistical and analysis purposes), Force or Finish can allow the application to continue without such statistics.

| End Name | Queue Action | Driver Action |
| --- | --- | --- |
| Force | Stop accepting and delete all in queue. | Shutdown |
| Persist (default) | Serialize any pending entries to a file to be retried on startup. | Shutdown |
| Continue | Continue accepting entries. | Continue the operation. The driver continues in recovery mode. |
| Offline | Continue accepting entries and store them in the file system. | Continue the operation. The driver continues in recovery mode. |
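
The End names and their effects listed above can be modeled as a simple lookup. The following is an illustrative sketch only: the class and function names are hypothetical, and falling back to Persist when no name is returned is an assumption based on Persist being marked as the default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryAction:
    queue_action: str
    driver_action: str


# One entry per End name that the process flow can return.
END_NODE_ACTIONS = {
    "Force": RecoveryAction("stop accepting; delete all in queue", "shutdown"),
    "Persist": RecoveryAction("serialize pending entries to a file", "shutdown"),
    "Continue": RecoveryAction("continue accepting entries",
                               "continue in recovery mode"),
    "Offline": RecoveryAction("continue accepting; store in file system",
                              "continue in recovery mode"),
}

DEFAULT_END_NODE = "Persist"


def action_for(end_name):
    """Look up the action for an End name.

    Assumption: a missing or unrecognized name is treated as the default.
    """
    return END_NODE_ACTIONS.get(end_name, END_NODE_ACTIONS[DEFAULT_END_NODE])
```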

If a serialized queue exists, the driver will deserialize the entries during startup. This will cause the serialized entries to be written to the BAM log.
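
As a rough illustration of serializing a queue and restoring it on startup, consider the sketch below. The file format (one JSON document per line), the function names, and the removal of the file after loading are all assumptions made for the example; the driver's actual on-disk format is not described here.

```python
import json
import os


def persist_queue(entries, path):
    """Write pending entries to a file, one JSON document per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for entry in entries:
            handle.write(json.dumps(entry) + "\n")


def restore_queue(path):
    """Load persisted entries in their original order.

    Assumption: the file is removed after loading so that the same
    entries are not replayed on a later startup.
    """
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as handle:
        entries = [json.loads(line) for line in handle if line.strip()]
    os.remove(path)
    return entries
```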

The xalog start <drivername> command can be used to restart the driver when the database condition is corrected. This command can be scheduled, such as in a run script, as needed.

Important: To fully recover from a lost RDBMS connection, you must add a validation query to the JDBC provider configuration. For Microsoft SQL Server, SELECT 1 is sufficient. This adds a small performance cost, but makes it possible for the pool to replace bad connections with good connections.
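
To show why a validation query lets a pool replace bad connections, here is a toy pool written as a sketch. It is not the JDBC provider's implementation: ValidatingPool is a hypothetical class, and Python's built-in sqlite3 module stands in for the RDBMS purely so that the example is self-contained.

```python
import sqlite3


class ValidatingPool:
    """Toy pool that validates an idle connection before lending it out."""

    def __init__(self, factory, validation_query="SELECT 1"):
        self._factory = factory          # callable that opens a connection
        self._query = validation_query
        self._idle = []

    def _is_valid(self, conn):
        # The small per-acquire cost: one trivial round trip to the database.
        try:
            conn.execute(self._query).fetchone()
            return True
        except sqlite3.Error:
            return False

    def acquire(self):
        while self._idle:
            conn = self._idle.pop()
            if self._is_valid(conn):
                return conn
            # Bad connection: drop it and try the next idle one.
        return self._factory()

    def release(self, conn):
        self._idle.append(conn)
```

Without the validation step, acquire would hand back the broken connection, and every caller would fail until the pool was restarted.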

The XDXalogControlAgent service is not appropriate for use in this process flow. Use the Continue return to retry the lost connection. A more appropriate use of the XDXalogControlAgent service is in a scheduled process flow. For example, the lost connection process flow might alert a DBA to problems and then schedule a recovery process flow to test the database connection and either restart the driver or reschedule itself. It would then return Persist, which instructs the server to continue caching events to disk.


Procedure: How to Modify the BAM Driver Recovery Properties

Modifying the BAM driver recovery properties is an advanced IT operation that requires direct driver configuration.

To modify the BAM driver recovery properties:

  1. In the iWay Service Manager Administration Console, click Activity Facility in the left pane, as shown in the following image.

    The Activity Facility pane opens.

  2. Select BAMSenderDriver.

    The Activity Facility pane opens for the preconfigured BAMSenderDriver, which lists the configuration parameters for the driver.

  3. Modify values for the required parameters in the Connection Management section, which are listed and described in the following table.

    | Parameter | Description |
    | --- | --- |
    | Retry Count | Number of attempts to reconnect to the underlying BAM database. |
    | Retry Interval | Wait interval between reconnect attempts. |
    | Lost Connection Behavior | Select one of the following values from the drop-down list: Force (terminate any logging activity and abandon any entries in flight); Persist (terminate any logging activity and serialize any entries in flight); Offline (continue to run the logger in offline mode, which logs the activity to a file store for later processing when the connection to the database is recovered). |
    | Lost Connection Flow Name | The process flow to execute upon connectivity loss and failure to reestablish a connection. The process flow must terminate with an End node whose name corresponds to the action to take (Force, Persist, or Offline), or with the additional Continue option, which continues operation and denotes that the connection has been restored. The process flow takes priority over the preconfigured Lost Connection Behavior parameter. |

  4. Click Update when you have finished modifying the parameters for the preconfigured BAMSenderDriver.
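
The precedence between the two Lost Connection parameters can be sketched as a small function. The names below are hypothetical, and raising an error for an unrecognized value is a choice made for the example rather than documented behavior.

```python
# Values offered by the Lost Connection Behavior drop-down list.
VALID_BEHAVIORS = {"Force", "Persist", "Offline"}
# A process flow may additionally end on Continue.
VALID_FLOW_RESULTS = VALID_BEHAVIORS | {"Continue"}


def resolve_lost_connection_action(configured_behavior, flow_result=None):
    """Pick the action to apply after a lost connection.

    The End name returned by the process flow, when there is one,
    takes priority over the configured behavior.
    """
    if flow_result is not None:
        if flow_result not in VALID_FLOW_RESULTS:
            raise ValueError(f"unknown End name: {flow_result!r}")
        return flow_result
    if configured_behavior not in VALID_BEHAVIORS:
        raise ValueError(f"unknown behavior: {configured_behavior!r}")
    return configured_behavior
```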

Understanding Recovery Mode

When an updater thread shifts to recovery mode, it returns the message that it was sending to the database to the work queue. It then sets itself to write all messages to the local database rather than to the RDBMS, and resumes accepting messages.

Depending on the settings and the return from the process flow, the driver starts an analysis thread (referred to below as the recovery thread). This thread periodically polls the RDBMS, awaiting its availability. When the RDBMS becomes available, the recovery thread sets the updater threads to resume standard mode. In this mode, the updaters accept messages from the queue and send them to the RDBMS. The recovery thread then begins reading, in order, the persisted messages written to the local disk. Each message is added to the work queue, so that an updater thread will send it to the RDBMS. The persisted messages are deleted as they are added to the queue.

If the system is terminated, on the next startup, the persisted messages will be loaded from disk to the work queue, so that normal operations can continue.
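
The recovery behavior described in this section can be modeled in a few lines. The sketch below is a single-threaded simplification with hypothetical names: in-memory collections stand in for the local disk and the RDBMS, and placing the returned in-flight message at the front of the work queue is an assumption.

```python
from collections import deque


class RecoveryModel:
    """Single-threaded model of updaters plus the recovery thread."""

    def __init__(self, db_available):
        self.db_available = db_available  # callable: is the RDBMS reachable?
        self.recovery_mode = False
        self.work_queue = deque()
        self.persisted = deque()  # stands in for messages on local disk
        self.rdbms = []           # stands in for rows written to the RDBMS

    def enter_recovery(self, in_flight):
        # Return the in-flight message to the queue, then divert writes.
        self.work_queue.appendleft(in_flight)
        self.recovery_mode = True

    def updater_step(self):
        # One updater iteration: take a message and write it somewhere.
        if not self.work_queue:
            return
        message = self.work_queue.popleft()
        if self.recovery_mode:
            self.persisted.append(message)
        else:
            self.rdbms.append(message)

    def recovery_poll(self):
        # One poll of the RDBMS by the recovery thread.
        if not self.recovery_mode or not self.db_available():
            return False
        self.recovery_mode = False  # updaters resume standard mode
        while self.persisted:
            # Re-queue in order; each message is deleted as it is queued.
            self.work_queue.append(self.persisted.popleft())
        return True
```

Because persisted messages are appended behind whatever is already queued, messages that arrived during the outage can reach the RDBMS after newer ones in this model.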


iWay Software