Recovering a single queue manager at an alternative site
In the case of a total loss of a WebSphere MQ computing center, we can recover on another queue manager or queue-sharing group at a recovery site. (See Recovering a queue-sharing group at the alternative site for the alternative site recovery procedure for a queue-sharing group.)
To recover on another queue manager at a recovery site, regularly back up the page sets and the logs. As with all data recovery operations, the objectives of disaster recovery are to lose as little data, workload processing (updates), and time, as possible.
At the recovery site:
- The recovery queue managers must have the same names as the lost queue managers.
- The system parameter module (for example, CSQZPARM) used on each recovery queue manager must contain the same parameters as the corresponding lost queue manager.
When you have done this, reestablish all your queue managers as described in the following procedure. The procedure can be used to perform disaster recovery at the recovery site for a single queue manager. It assumes that only the following are available:
- Copies of the archive logs and BSDSs created by normal running at the primary site (the active logs will have been lost along with the queue manager at the primary site).
- Copies of the page sets from the queue manager at the primary site that are the same age as, or older than, the most recent archive log copies available.
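The age requirement on the page set copies can be checked before recovery starts. The following is a minimal sketch, assuming you record a timestamp for each page set copy and for the most recent archive log copy; the function name, the page set labels, and the way the timestamps are obtained are all hypothetical and site-specific.

```python
from datetime import datetime


def page_set_copies_too_new(page_set_copies, newest_archive_log_time):
    """Return the labels of page set copies taken AFTER the newest archive log copy.

    page_set_copies maps a label (hypothetical, for example "PSID00") to the
    time the copy was taken. A copy is acceptable when it is the same age as,
    or older than, the most recent archive log copy available.
    """
    return sorted(
        label
        for label, taken_at in page_set_copies.items()
        if taken_at > newest_archive_log_time
    )
```

An empty result means every page set copy meets the stated condition. Any label returned identifies a copy that is newer than the archive logs available at the recovery site.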
If we use dual logging for the active and archive logs, we need to apply the BSDS updates in the following procedure to both copies:
1. Define new page set data sets and load them with the data in the copies of the page sets from the primary site.
2. Define new active log data sets.
3. Define a new BSDS data set and use Access Method Services REPRO to copy the most recent archived BSDS into it.
4. Use the print log map utility, CSQJU004, to print information from this most recent BSDS. At the time this BSDS was archived, the most recent archived log you have had only just been truncated as an active log, so it does not appear as an archived log. Record the STARTRBA and ENDRBA of this log.
5. Use the change log inventory utility, CSQJU003, to register this latest archive log data set in the BSDS that you have just restored, using the STARTRBA and ENDRBA recorded in Step 4.
6. Use the DELETE option of CSQJU003 to remove all active log information from the BSDS.
7. Use the NEWLOG option of CSQJU003 to add active logs to the BSDS. Do not specify STARTRBA or ENDRBA.
8. Use CSQJU003 to add a restart control record to the BSDS. Specify CRESTART CREATE,ENDRBA=highrba, where highrba is the high RBA of the most recent archive log available (found in Step 4), plus 1.
   The BSDS now describes all active logs as being empty, all the archived logs you have available, and no checkpoints beyond the end of your logs.
9. Restart the queue manager with the usual START QMGR command. During initialization, an operator reply message such as the following is issued:
   CSQJ245D +CSQ1 RESTART CONTROL INDICATES TRUNCATION AT RBA highrba. REPLY Y TO CONTINUE, N TO CANCEL
   Type Y to start the queue manager. The queue manager starts, and recovers data up to the ENDRBA specified in the CRESTART statement.
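The ENDRBA value for the CRESTART statement is the high RBA of the most recent archive log plus 1, which is hexadecimal arithmetic and easy to get wrong by hand. The following is a minimal sketch of that calculation; the function names are hypothetical, and the sketch simply preserves the number of hexadecimal digits it is given rather than assuming a particular RBA width.

```python
def crestart_endrba(high_rba_hex):
    """Return the high RBA of the latest archive log plus 1, as upper-case hex.

    The result is zero-padded to the same number of digits as the input so
    that leading zeros recorded from the print log map output are kept.
    """
    width = len(high_rba_hex)
    next_rba = int(high_rba_hex, 16) + 1
    return format(next_rba, "X").zfill(width)


def crestart_statement(high_rba_hex):
    """Build the CRESTART control statement described in the procedure."""
    return "CRESTART CREATE,ENDRBA=" + crestart_endrba(high_rba_hex)
```

For example, if the archive log recorded in Step 4 ended at RBA 062FFF (an assumed value, used only for illustration), crestart_statement("062FFF") produces CRESTART CREATE,ENDRBA=063000.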
See Using the WebSphere MQ Utilities for information about using CSQJU003 and CSQJU004.
Figure 47 shows sample input statements for CSQJU003 for steps 6, 7, and 8:
Figure 47. Sample input statements for CSQJU003
* Step 6
DELETE DSNAME=MQM2.LOGCOPY1.DS01
DELETE DSNAME=MQM2.LOGCOPY1.DS02
DELETE DSNAME=MQM2.LOGCOPY1.DS03
DELETE DSNAME=MQM2.LOGCOPY1.DS04
DELETE DSNAME=MQM2.LOGCOPY2.DS01
DELETE DSNAME=MQM2.LOGCOPY2.DS02
DELETE DSNAME=MQM2.LOGCOPY2.DS03
DELETE DSNAME=MQM2.LOGCOPY2.DS04
* Step 7
NEWLOG DSNAME=MQM2.LOGCOPY1.DS01,COPY1
NEWLOG DSNAME=MQM2.LOGCOPY1.DS02,COPY1
NEWLOG DSNAME=MQM2.LOGCOPY1.DS03,COPY1
NEWLOG DSNAME=MQM2.LOGCOPY1.DS04,COPY1
NEWLOG DSNAME=MQM2.LOGCOPY2.DS01,COPY2
NEWLOG DSNAME=MQM2.LOGCOPY2.DS02,COPY2
NEWLOG DSNAME=MQM2.LOGCOPY2.DS03,COPY2
NEWLOG DSNAME=MQM2.LOGCOPY2.DS04,COPY2
* Step 8
CRESTART CREATE,ENDRBA=063000

The things we need to consider for restarting the channel initiator at the recovery site are similar to those faced when using ARM to restart the channel initiator on a different z/OS image. See Using ARM in a WebSphere MQ network for more information.
Your recovery strategy should also cover recovery of the WebSphere MQ product libraries and the application programming environments that use WebSphere MQ (CICS, for example).
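The input statements in Figure 47 follow a regular pattern, so they can be generated rather than typed. The following is a minimal sketch that reproduces the figure for a given high-level qualifier. It assumes the hlq.LOGCOPYn.DSnn naming shown in the figure, and that the active log names recorded in the restored BSDS are the same as the names of the new active logs; the function names and defaults are hypothetical. If your data set names differ, the DELETE statements must name the logs actually listed in the restored BSDS.

```python
def active_log_names(hlq, copies=2, per_copy=4):
    """Return (copy number, data set name) pairs in the order used by Figure 47."""
    return [
        (copy, f"{hlq}.LOGCOPY{copy}.DS{number:02d}")
        for copy in range(1, copies + 1)
        for number in range(1, per_copy + 1)
    ]


def build_csqju003_input(hlq, endrba, copies=2, per_copy=4):
    """Build the DELETE, NEWLOG, and CRESTART statements as a list of lines."""
    names = active_log_names(hlq, copies, per_copy)
    lines = ["* Step 6"]
    # Remove every active log entry from the restored BSDS.
    lines += [f"DELETE DSNAME={dsn}" for _, dsn in names]
    lines.append("* Step 7")
    # Add the new, empty active logs; no STARTRBA or ENDRBA is specified.
    lines += [f"NEWLOG DSNAME={dsn},COPY{copy}" for copy, dsn in names]
    lines.append("* Step 8")
    # endrba is the high RBA of the most recent archive log, plus 1.
    lines.append(f"CRESTART CREATE,ENDRBA={endrba}")
    return lines
```

Calling build_csqju003_input("MQM2", "063000") and joining the result with newlines gives the same statements as Figure 47. Generating the statements from one list of names keeps the DELETE and NEWLOG entries in step with each other when the number of active logs changes.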
Other functions of the change log inventory utility (CSQJU003) can also be used in disaster recovery scenarios. The HIGHRBA function allows the update of the highest RBA written and highest RBA off-loaded values within the bootstrap data set. The CHECKPT function allows the addition of new checkpoint queue records or the deletion of existing checkpoint queue records in the BSDS.