

Messaging engine recovery from exception conditions

In service integration, exception conditions fall into four categories: those that do not require the messaging engine to restart, those that require an automatic restart of the messaging engine, those that are detected by explicit health monitoring and handled by the HAManager, and those that require user intervention.


Recovery with the messaging engine running

A messaging engine can handle certain exception conditions without restarting or failing over. The exception condition is corrected automatically, and an entry that explains the exception and suggests any user actions is added to the system error log. The messaging engine continues to run and to honor the quality of service specified for the messages it is processing.


Recovery with automatic restart of the messaging engine (local exceptions)


Recovery from exceptions detected by explicit health monitoring


Recovery that requires user intervention (global exceptions)

A messaging engine cannot recover from a global exception by restarting or failing over. For example, if the data store for a messaging engine becomes corrupted, running the messaging engine on a different server does not resolve the problem, because the messaging engine encounters the same corrupted data store there. If a messaging engine in this situation were failed over, it would be failed over continually because it could not run on any server. The cluster would suffer unwanted disruption as servers attempted to run the messaging engine and were shut down.

To avoid such a situation, if a global exception occurs, the messaging engine logs an error, stops processing messages, and is not failed over. The messaging engine cannot be restarted until you correct the global exception condition and restart the server.
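
The reasoning above can be illustrated with a small Python model. This is only a sketch, not product code: the names Cluster, ExceptionScope, naive_failover, and guarded_failover are hypothetical, and the model assumes that a local exception affects only the first server in the list, which this topic does not state. It contrasts a policy that fails over on every exception with one that stops the messaging engine when the exception is global.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class ExceptionScope(Enum):
    LOCAL = auto()   # assumption: tied to one server; another server may succeed
    GLOBAL = auto()  # tied to shared state, e.g. a corrupted data store


@dataclass
class Cluster:
    servers: list
    log: list = field(default_factory=list)

    def try_start(self, server, scope):
        # A global exception follows the engine to every server.
        if scope is ExceptionScope.GLOBAL:
            return False
        # Assumption for this model: a local exception only affects
        # the first server in the list.
        return server != self.servers[0]


def naive_failover(cluster, scope):
    """Fail over on every exception, regardless of scope (one pass only)."""
    attempts = 0
    for server in cluster.servers:
        attempts += 1
        if cluster.try_start(server, scope):
            return "running on " + server, attempts
        cluster.log.append("engine failed on " + server)
    return "no server can run the engine", attempts


def guarded_failover(cluster, scope):
    """Stop the engine without failing over when the exception is global."""
    if scope is ExceptionScope.GLOBAL:
        cluster.log.append("global exception: engine stopped, not failed over")
        return "stopped; needs user intervention", 1
    return naive_failover(cluster, scope)


if __name__ == "__main__":
    cluster = Cluster(["server1", "server2", "server3"])
    print(naive_failover(cluster, ExceptionScope.GLOBAL))
    print(guarded_failover(cluster, ExceptionScope.GLOBAL))
    print(guarded_failover(cluster, ExceptionScope.LOCAL))
```

In this model the naive policy disturbs every server in the cluster before giving up, whereas the guarded policy logs one error and leaves the other servers untouched. A real cluster would keep repeating the naive cycle rather than stopping after one pass, which is the continual failover described above.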



Injecting failures into a high availability system

