This chapter explains how to test the FailSafe system configuration using the FailSafe Manager GUI and the cmgr command. For general information on using these tools, see Chapter 5, “Administration Tools”.
This section describes how to perform diagnostic tasks with the GUI.
This task requires root rsh access between nodes. To test connectivity, do the following from the FailSafe Manager:
Note: The Test Node Connectivity screen requires rsh access between hosts. The /.rhosts file must contain the hosts and local host between which you want to test connectivity.
Choose whether to test by network or serial connectivity by clicking on the appropriate button.
Choose a node to be tested from the pull-down list and add it to the test list by clicking on Add.
To delete a node from the list of nodes to be tested, click on the logical name to select it and then click Delete.
To start the tests, click on Start Tests. To stop the tests, click Stop Tests.
To run another test, click Clear Output to clear the status screen.
To exit from the window, click Close.
The Test Resources task lets you test the resources on the nodes in your cluster by entering the requested inputs. You can test resources by type and by group. You can test the resources of a resource type or in a resource group on all of the nodes in the cluster at one time, or you can specify an individual node to test. Resource tests are performed only on nodes in the resource group's application failover domain.
The following subsections describe how to perform diagnostic tasks on your system using the cmgr command.
You can use the cmgr command to test the serial connections between the FailSafe nodes. This test performs a ping on each specified node through the serial line and produces an error message if the ping is not successful.
Note: Do not execute this command while FailSafe is running.
Use the following command to test the serial connections for the machines in a cluster:
test serial in cluster Clustername [on node node1 node node2 ...]
For example, to test multiple nodes:
cmgr> test serial in cluster test-cluster on node blue node green
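The optional node list in the syntax above repeats the node keyword before each node name. The following Python sketch assembles the command text from a cluster name and a list of nodes. The function name build_serial_test is hypothetical and is not part of FailSafe; the sketch only builds the string that you would type at the cmgr prompt.

```python
def build_serial_test(cluster, nodes=()):
    """Return the text of a cmgr serial test command (illustrative helper)."""
    command = f"test serial in cluster {cluster}"
    if nodes:
        # Each node name is introduced by its own "node" keyword.
        command += " on " + " ".join(f"node {name}" for name in nodes)
    return command


print(build_serial_test("test-cluster", ["blue", "green"]))
# prints: test serial in cluster test-cluster on node blue node green
```

If no nodes are given, the sketch omits the on clause entirely, which corresponds to the bracketed (optional) part of the syntax.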
The serial test stops at the first error it encounters and reports the node that did not respond. If you receive an error message, check the serial cable connections from the indicated node's serial port to the remote power control unit or to the system controller port of the other nodes, then run the test again.
For example:
cmgr> test serial in cluster eagan on node cm1
Success: testing serial...
Success: Ensuring Node Can Get IP Addresses For All Specified Hosts
Success: Number of IP addresses obtained for <cm1> = 1
Success: The first IP address for <cm1> = 128.162.19.34
Success: Checking serial lines via crsd (crsd is running)
Success: Successfully checked serial line
Success: Serial Line OK
Success: overall exit status:success, tests failed:0, total tests executed:1
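The last line of the test output summarizes the run. If you capture the output in a file or a script, a sketch like the following could extract the summary fields. The function name parse_summary is hypothetical, and the pattern is based only on the successful output shown in this chapter; the status text that cmgr prints for a failed run is an assumption.

```python
import re

# Pattern taken from the summary line shown in the example output.
SUMMARY = re.compile(
    r"overall exit status:(\w+), tests failed:(\d+), total tests executed:(\d+)"
)


def parse_summary(output):
    """Return the summary fields as a dict, or None if no summary is present."""
    match = SUMMARY.search(output)
    if match is None:
        return None
    status, failed, total = match.groups()
    return {"status": status, "failed": int(failed), "total": int(total)}


sample = (
    "Success: Serial Line OK\n"
    "Success: overall exit status:success, tests failed:0, total tests executed:1\n"
)
print(parse_summary(sample))
```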
The following shows an example of an attempt to run the test serial command while FailSafe is running (causing the command to fail to execute):
cmgr> test serial in cluster eagan on node cm1
Error: Cannot run the serial tests, diagnostics has detected FailSafe (ha_cmsd) is running
Failed to execute FailSafe tests/diagnostics ha
test command failed
cmgr>
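When reviewing captured output, it can help to separate this condition (FailSafe is running) from a genuine serial line failure. The following sketch looks for the first line that begins with Error: and checks it against the message text shown above. Both function names are hypothetical, and the match relies on the exact wording in this example, which may differ between releases.

```python
def first_error(output):
    """Return the first output line that starts with 'Error:', or None."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Error:"):
            return line
    return None


def blocked_by_running_failsafe(output):
    """True if the first error reports that FailSafe (ha_cmsd) is running."""
    error = first_error(output)
    return error is not None and "FailSafe (ha_cmsd) is running" in error


sample = """cmgr> test serial in cluster eagan on node cm1
Error: Cannot run the serial tests, diagnostics has detected FailSafe (ha_cmsd) is running
Failed to execute FailSafe tests/diagnostics ha
test command failed"""
print(blocked_by_running_failsafe(sample))
```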
You can use the cmgr command to test the network connectivity in a cluster. This test checks if the specified nodes can communicate with each other through each configured interface in the nodes. This test will not run if FailSafe is running.
Use the following command to test the network connectivity for the machines in a cluster:
test connectivity in cluster Clustername [on node node1 node node2 ...]
The following shows an example of the test connectivity command:
cmgr> test connectivity in cluster eagan on node cm1
Success: testing connectivity...
Success: checking that the control IP_addresses are on the same networks
Success: pinging address cm1-priv interface ef0 from host cm1
Success: pinging address cm1 interface ef1 from host cm1
Success: overall exit status:success, tests failed:0, total tests executed:1
This test yields an error message when it encounters its first error, indicating the node that did not respond. If you receive an error message after executing this test, verify that the network interface has been configured up, using the ifconfig command, for example:
# /usr/etc/ifconfig ec3
ec3: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,FILTMULTI,MULTICAST>
        inet 190.0.3.1 netmask 0xffffff00 broadcast 190.0.3.255
The UP in the first line of output indicates that the interface is configured up.
If the network interface is configured up, verify that the network cables are connected properly and run the test again.
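The check for the UP flag can also be scripted. The following sketch parses the flag list between the angle brackets of captured ifconfig output and tests for an exact UP entry. The function names are hypothetical, and the sketch assumes the flags=...<...> layout shown in the example above.

```python
import re


def interface_flags(ifconfig_output):
    """Return the set of flag names listed inside flags=...<...>."""
    match = re.search(r"flags=\w+<([^>]*)>", ifconfig_output)
    if match is None:
        return set()
    return {flag for flag in match.group(1).split(",") if flag}


def is_configured_up(ifconfig_output):
    """True if UP appears as a separate flag in the flag list."""
    return "UP" in interface_flags(ifconfig_output)


sample = (
    "ec3: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,FILTMULTI,MULTICAST>\n"
    "        inet 190.0.3.1 netmask 0xffffff00 broadcast 190.0.3.255\n"
)
print(is_configured_up(sample))
```

Splitting the list into individual flags, rather than searching the whole line for the letters UP, avoids a false match on any flag name that merely contains those letters.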
You can use the cmgr command to test any configured resource by resource name or by resource type.
Use the following to test a resource by name:
test resource Resourcename of resource_type RTname in cluster Clustername [on node Node1 node Node2 ...]
For example:
cmgr> test resource /disk1 of resource_type filesystem in cluster eagan on machine cm1
Success: *** testing node resources on node cm1 ***
Success: *** testing all filesystem resources on node cm1 ***
Success: testing resource /disk1 of resource type filesystem on node cm1
Success: overall exit status:success, tests failed:0, total tests executed:1
Use the following to test a resource by resource type:
test resource_type RTname in cluster Clustername [on node Node1 node Node2 ...]
For example:
cmgr> test resource_type filesystem in cluster eagan on machine cm1
Success: *** testing node resources on node cm1 ***
Success: *** testing all filesystem resources on node cm1 ***
Success: testing resource /disk4 of resource type filesystem on node cm1
Success: testing resource /disk5 of resource type filesystem on node cm1
Success: testing resource /disk2 of resource type filesystem on node cm1
Success: testing resource /disk3 of resource type filesystem on node cm1
Success: testing resource /disk1 of resource type filesystem on node cm1
Success: overall exit status:success, tests failed:0, total tests executed:5
You can use cmgr to test volume and filesystem resources in destructive mode. This provides a more thorough test of filesystems and volumes. cmgr tests will not run in destructive mode if FailSafe is running.
Use the following to test resources in destructive mode:
test resource Resourcename of resource_type RTname in cluster Clustername [on node Node1 node Node2 ...] destructive
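Because destructive mode applies to volume and filesystem resources, a script that generates test commands might guard against requesting it for other resource types. The following is a minimal sketch under that assumption. The function name and the guard are hypothetical; how cmgr itself responds to a destructive request for another resource type is not described here.

```python
# Resource types that this chapter names as testable in destructive mode.
DESTRUCTIVE_TYPES = {"volume", "filesystem"}


def build_resource_test(name, rtype, cluster, nodes=(), destructive=False):
    """Return the text of a cmgr resource test command (illustrative helper)."""
    if destructive and rtype not in DESTRUCTIVE_TYPES:
        raise ValueError(f"destructive mode is not described for type {rtype!r}")
    parts = ["test resource", name, "of resource_type", rtype, "in cluster", cluster]
    if nodes:
        parts.append("on")
        parts.extend(f"node {node}" for node in nodes)
    if destructive:
        # The keyword goes last, after the optional node list.
        parts.append("destructive")
    return " ".join(parts)


print(build_resource_test("/disk4", "filesystem", "eagan", ["cm1"], destructive=True))
# prints: test resource /disk4 of resource_type filesystem in cluster eagan on node cm1 destructive
```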
The following sections describe the diagnostic tests available for resources.
You can use the cmgr command to test the logical volumes in a cluster. This test checks if the specified volume is configured correctly.
Use the following command to test a logical volume:
test resource Resourcename of resource_type volume on cluster Clustername [on node Node1 node Node2 ...]
For example:
cmgr> test resource alternate of resource_type volume on cluster eagan
Success: *** testing node resources on node cm1 ***
Success: *** testing all volume resources on node cm1 ***
Success: running resource type volume tests on node cm1
Success: *** testing node resources on node cm2 ***
Success: *** testing all volume resources on node cm2 ***
Success: running resource type volume tests on node cm2
Success: overall exit status:success, tests failed:0, total tests executed:2
The following example tests a logical volume in destructive mode:
cmgr> test resource alternate of resource_type volume on cluster eagan destructive
Warning: executing the tests in destructive mode
Success: *** testing node resources on node cm1 ***
Success: *** testing all volume resources on node cm1 ***
Success: running resource type volume tests on node cm1
Success: successfully assembled volume: alternate
Success: *** testing node resources on node cm2 ***
Success: *** testing all volume resources on node cm2 ***
Success: running resource type volume tests on node cm2
Success: successfully assembled volume: alternate
Success: overall exit status:success, tests failed:0, total tests executed:2
You can use cmgr to test the filesystems configured in a cluster. This test checks if the specified filesystem is configured correctly and if the volume the filesystem will reside on is configured correctly.
Use the following command to test a filesystem:
test resource Resourcename of resource_type filesystem on cluster Clustername [on node Node1 node Node2 ...]
The following example displays the filesystems that have been defined in a cluster and tests one of them:
cmgr> show resources of resource_type filesystem in cluster eagan
/disk4 type filesystem
/disk5 type filesystem
/disk2 type filesystem
/disk3 type filesystem
/disk1 type filesystem
cmgr> test resource /disk4 of resource_type filesystem in cluster eagan on node cm1
Success: *** testing node resources on node cm1 ***
Success: *** testing all filesystem resources on node cm1 ***
Success: successfully mounted filesystem: /disk4
Success: overall exit status:success, tests failed:0, total tests executed:1
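The listing produced by the show resources command can serve as the input for choosing which resources to test. The following sketch reads captured listing output and returns each resource name with its type. The function name is hypothetical, and the sketch assumes one "name type resource-type" entry per line, as in the example above.

```python
def parse_resource_listing(output):
    """Return (resource name, resource type) pairs from a show resources listing."""
    resources = []
    for line in output.splitlines():
        fields = line.split()
        # A listing entry has exactly three fields: <name> type <resource type>.
        if len(fields) == 3 and fields[1] == "type":
            resources.append((fields[0], fields[2]))
    return resources


sample = """cmgr> show resources of resource_type filesystem in cluster eagan
/disk4 type filesystem
/disk5 type filesystem"""
print(parse_resource_listing(sample))
```

The prompt line is skipped because it does not have the three-field shape of a listing entry.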
The following example tests a filesystem in destructive mode:
cmgr> test resource /disk4 of resource_type filesystem in cluster eagan on node cm1 destructive
Warning: executing the tests in destructive mode
Success: *** testing node resources on node cm1 ***
Success: *** testing all filesystem resources on node cm1 ***
Success: successfully mounted filesystem: /disk4
Success: overall exit status:success, tests failed:0, total tests executed:1
You can use cmgr to test a resource group. This test cycles through the resource tests for all of the resources defined for a resource group. Resource tests are performed only on nodes in the resource group's application failover domain.
Use the following to test resource groups:
test resource_group RGname in cluster Clustername [on node Node1 node Node2 ...]
The following example displays the resource groups that have been defined in a cluster and tests one of them:
cmgr> show resource_groups in cluster eagan
Resource Groups:
nfs2
informix
cmgr> test resource_group nfs2 in cluster eagan on machine cm1
Success: *** testing node resources on node cm1 ***
Success: testing resource /disk4 of resource type NFS on node cm1
Success: testing resource /disk3 of resource type NFS on node cm1
Success: testing resource /disk3/statmon of resource type statd_unlimited on node cm1
Success: testing resource 128.162.19.45 of resource type IP_address on node cm1
Success: testing resource /disk4 of resource type filesystem on node cm1
Success: testing resource /disk3 of resource type filesystem on node cm1
Success: testing resource dmf1 of resource type volume on node cm1
Success: testing resource dmfjournals of resource type volume on node cm1
Success: overall exit status:success, tests failed:0, total tests executed:16
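The rule that resource tests run only on nodes in the resource group's application failover domain can be pictured as a filter over the requested nodes. The following sketch illustrates the rule only; FailSafe applies it internally. The function name is hypothetical, and the sketch assumes that a requested node outside the domain is simply skipped, which this chapter does not specify.

```python
def nodes_to_test(failover_domain, requested=None):
    """Return the nodes on which resource tests would run.

    failover_domain: node names in the application failover domain.
    requested: node names given with "on node ...", or None for no restriction.
    """
    if requested is None:
        # No node list given: every node in the domain is a candidate.
        return list(failover_domain)
    wanted = set(requested)
    # Keep domain order; drop any requested node that is outside the domain.
    return [node for node in failover_domain if node in wanted]


print(nodes_to_test(["cm1", "cm2"], requested=["cm2", "cm3"]))
# prints: ['cm2']
```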
You can use cmgr to test whether a failover policy is defined correctly. This test checks the failover policy by validating the policy script, failover attributes, and whether the application failover domain consists of valid nodes from the cluster.
Use the following to test a failover policy:
test failover_policy FPname in cluster Clustername [on node Node1 node Node2 ...]
The following example uses a show command to display the failover policies that have been defined in a cluster and tests one of them:
cmgr> show failover_policies
Failover Policies:
reverse
ordered-in-order
cmgr> test failover_policy reverse in cluster eagan
Success: *** testing node resources on node cm1 ***
Success: testing policy reverse on node cm1
Success: *** testing node resources on node cm2 ***
Success: testing policy reverse on node cm2
Success: overall exit status:success, tests failed:0, total tests executed:2