This chapter provides information about configuring the IRIS FailSafe DMF database option for use on the IRIS FailSafe system.
The required software for DMF fail over is as follows:
- DMF software as described in the Cray DMF Administrator's Guide for IRIX Systems
- Base IRIS FailSafe software (see the IRIS FailSafe Administrator's Guide for information on installing FailSafe)
- IRIS FailSafe DMF software (included in the FailSafe DMF package):
  - ha_dmf.books, which contains this guide
  - ha_dmf.man, which contains the relnotes file
  - ha_dmf.sw, which contains the following:
    - DMF daemon monitoring script
    - FailSafe DMF fail over functions (takeover, takeback, giveback, giveaway)
    - Template file ha.conf.dmf for ha.conf modifications
    - Tape dismount software for configurations attached to STK silos running the Automated Cartridge System Library Software (ACSLS)
DMF databases, log files, journal files, and user file systems must be XFS file systems or XLV logical volumes and must be located on a shared disk within the cluster. User file systems must be configured with the dmi mount option in the ha.conf file. The user file systems are normally exported with NFS so that they can be mounted and accessed remotely. Procedure 2-1 describes how to make these changes to the configuration file.
The only tape configuration that is currently supported with FailSafe DMF is a cluster connected to an STK silo running ACSLS library control software. Only the DMF tape autoloader service configuration is supported; FailSafe DMF does not support OpenVault or Tape Management Facility (TMF) tape system configurations. Consult the Cray DMF Administrator's Guide for IRIX Systems and the DMF online release files (Readme and News) for a description of how DMF is configured for each of these tape management systems.
Each host in the cluster is connected to a separate set of drives in the tape library. You must create a drive identification file on each host; the file defines the drives that DMF uses on the other host in the cluster. These files are required so that FailSafe DMF will know what drives were in use when the fail over occurred. FailSafe DMF will dismount any tapes that were in use at the time of a fail over. These files are created in /etc/config on each host. “Tape Drive Identification Files” describes how to create and name these files.
Tape drive identification files identify tape components that DMF uses on each host. There are two types of components identified in these files: a tape loader and tape drives. The information for each component is obtained from the DMF autoloader tape configuration file /etc/config/al_api.rc. The template for these files is as follows:
loader:loader_name
drive:drive_1[:drive_2][:drive_3]...
The loader name and drive names are those specified in /etc/config/al_api.rc.
Create the file /etc/config/ha_serv.drives on the backup node; it identifies the drives connected to the server node. Create the file /etc/config/ha_back.drives on the primary server node; it identifies the drives connected to the backup node.
Example 2-1. Creating Drive Identification Files
This example lists the /etc/config/ha_back.drives and /etc/config/ha_serv.drives files for a cluster containing a primary server machine cm1 and a backup node cm2.
File ha_back.drives on machine cm1 indicates that drives t2 and t3 are being used by DMF on node cm2 and are managed by the loader wolfy. It contains the following lines:
loader:wolfy
drive:t2:t3
File ha_serv.drives on machine cm2 indicates that drives t1 and t4 are being used by DMF on node cm1 and are managed by the loader wolfy. It contains the following lines:
loader:wolfy
drive:t1:t4
DMF must be installed on each host in the cluster; therefore, each host will have a dmf_config configuration file.
If DMF is migrating files using only the FTP MSP (that is, the media-specific process (MSP) that transfers files over the file transfer protocol (FTP)), the dmf_config files on the two hosts will be identical. If you will be migrating files to tape using the tape MSP, the configuration files will differ only in the tape drives they specify for the tape MSP: the configuration file on each host specifies the drives attached to that host, as defined in the /etc/config/al_api.rc file.
For example, suppose the dmf_config file for each host has the following information for the MSP named msptim:
define msptim
        TYPE                    msp
        COMMAND                 dmatmsp
        TAPE_TYPE               tim_drives
        CACHE_SPACE             800m
        CHILD_MAXIMUM           2
        DISK_IO_SIZE            1024k
        MAX_PUT_CHILDREN        2
enddef
In the dmf_config configuration file on the first host, the drives defined for tim_drives are as follows:
define tim_drives
        TYPE            device
        LOADER_NAME     wolfy
        TAPE_UNITS      t1 t4
enddef
On the second host, the drives defined for tim_drives are defined as follows:
define tim_drives
        TYPE            device
        LOADER_NAME     wolfy
        TAPE_UNITS      t3 t5
enddef
Note: The drives defined by TAPE_UNITS are the only difference between the two dmf_config files.
This section describes the procedure for creating the ha.conf configuration file that includes DMF configuration information. The procedure assumes that a configuration file that does not include DMF has been created, installed, and tested as described in the IRIS FailSafe Administrator's Guide. Using Procedure 2-1, add DMF information to the configuration file. Install the configuration file as /var/ha/ha.conf on both nodes as described in the IRIS FailSafe Administrator's Guide.
Procedure 2-1. Making Changes for DMF in the Configuration File
Complete the following steps:
1. Make a copy of the /var/ha/ha.conf file on one node.

2. Add all of the file systems and volumes that will be used for DMF to the copy of ha.conf. See the IRIS FailSafe Administrator's Guide and the Cray DMF Administrator's Guide for IRIX Systems for more information on volume and file system configuration.

Note: When you are setting up a file system block for a DMF user file system, the dmi mount option must be specified as one of the mount options.

For example, if the file system fs1 is a DMF user file system that contains files to migrate, the file system description block in ha.conf might look like the following:

filesystem fs1
{
        mount_point /fs1
        mount_info
        {
                fs_type = xfs
                volume_name = fs1
                mode = dmi,rw,noauto,wsync
        }
}
3. Make a copy of the /var/ha/templates/ha.conf.dmf file. In the copy's dmf block, modify the definitions of the server-node and backup-node fields. (The server node is the node that normally runs DMF; the backup node serves as a backup platform for DMF within the cluster.) For more information, see “DMF Application-Class Block” in Chapter 3.
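For illustration only, using the node names from Example 2-1 (cm1 as the server node and cm2 as the backup node), the modified block might look something like the following sketch. The block header shown here is an assumption; take the actual block header and field layout from the ha.conf.dmf template itself and from “DMF Application-Class Block” in Chapter 3:

application-class dmf
{
        server-node = cm1
        backup-node = cm2
}

Only the node names normally need to change; cm1 and cm2 are placeholders for the hostnames used in your cluster.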
4. Append the modified copy of ha.conf.dmf to the end of the ha.conf copy.
5. Define an NFS block in the copy of the ha.conf configuration file for each DMF user file system if the file systems will be accessed remotely.
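The exact NFS block syntax is defined in the IRIS FailSafe Administrator's Guide; the following is only a rough sketch for the fs1 file system from the earlier example, in which the field names and the interface address (190.0.2.3) are placeholders rather than confirmed syntax:

nfs fs1
{
        filesystem = fs1
        export-point = /fs1
        export-info = rw,wsync
        ip-address = 190.0.2.3
}

Consult that guide for the field names and values appropriate to your cluster before using such a block.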
6. Using the information in “DMF Action and Action-Timer Blocks” in Chapter 3, prepare the action dmf and action-timer dmf blocks.
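As a hedged illustration only, the blocks follow the same general form as other action and action-timer blocks in ha.conf. The monitor script path and the timer values shown below are assumptions; use the field names and recommended values given in “DMF Action and Action-Timer Blocks” in Chapter 3:

action dmf
{
        local-monitor = /var/ha/actions/ha_dmf_lmon
}

action-timer dmf
{
        start-monitor-time = 300
        lmon-probe-time = 60
        lmon-timeout = 120
        retry-count = 1
}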
7. Use the information in the section about creating the configuration file in the IRIS FailSafe Administrator's Guide to verify the ha.conf copy and then install it on each node. You can begin with the step involving the ha_cfgverify command.
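For example, if the modified copy is saved as /var/ha/ha.conf.dmf_new (a name used here only for illustration), the verification step would look something like the following:

# ha_cfgverify /var/ha/ha.conf.dmf_new

See the IRIS FailSafe Administrator's Guide for the remaining installation steps.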
The following procedure explains how to test the DMF configuration and fail over.
Procedure 2-2. Testing the Fail Over
Complete the following steps:
1. Install DMF on each node in the cluster.
2. Stop FailSafe if it is running by issuing the following command:

/etc/init.d/failsafe stop
3. On each node in the cluster, make a backup copy of the existing ha.conf file and then install the ha.conf file created in Procedure 2-1.
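For example, on each node (assuming the new file from Procedure 2-1 has been copied to /tmp/ha.conf.new, a name used here only for illustration):

# cp /var/ha/ha.conf /var/ha/ha.conf.bak
# cp /tmp/ha.conf.new /var/ha/ha.conf

The backup file name is arbitrary; any copy kept outside /var/ha/ha.conf will do.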
4. Bring up FailSafe by issuing the following command:

/etc/init.d/failsafe start
5. Verify that all the DMF file systems defined in ha.conf are mounted and that the DMF daemon is running by issuing the following command:

# /etc/dmf/dmbase/etc/dmdstat -v
Daemon status OK; '1' responses received.
6. Stop the DMF daemon by issuing the following command:

/etc/init.d/dmf stop
7. Verify that the following events occur:
An error message is sent to the system console indicating that DMF has stopped.
An error message is issued to the DMF monitor log in /var/ha/logs/ha_dmf_lmon.$HOST.log.
A mail message with the error information is sent to the fsafe_admin alias.
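One way to confirm that the error message reached the DMF monitor log is to look at the end of that log on the node; for example, on a node whose hostname is cm1 (the node name here is only a placeholder for the value of $HOST):

# tail /var/ha/logs/ha_dmf_lmon.cm1.log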
8. Bring DMF back up by issuing the following command:

/etc/init.d/dmf start
9. Issue the ha_admin -fs command to put the host running DMF into standby mode. Verify that the DMF file systems, the DMF daemon, and the DMF MSPs fail over to the other node in the cluster.
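For example, if cm1 (the server node name from Example 2-1, used here only for illustration) is the host currently running DMF, and assuming ha_admin takes the name of the node to move to standby as described in the IRIS FailSafe Administrator's Guide, the command might be entered as follows:

# ha_admin -fs cm1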
10. Issue the following command on the host that was put in standby mode in step 9 in order to make the node rejoin the cluster:

ha_admin -fr
If this step was successful, the following will be true:
DMF is running on the reactivated node
DMF user, log, and database file systems are remounted on the reactivated node
The host is running in normal mode. You can determine the status by issuing the following command:

ha_admin -a