Chapter 5. Administration Tools

You can perform FailSafe administration tasks using either the FailSafe Manager graphical user interface (GUI) or the cmgr command. Although both tools use the same underlying software command-line interface (CLI) to configure and monitor a FailSafe system, the GUI provides additional features that are particularly important in a production system; see “GUI Features”.

FailSafe Manager GUI

The FailSafe Manager GUI lets you configure, administer, and monitor a FailSafe cluster.

Starting the GUI and Connecting to a Node

There are several methods to start the GUI and connect to a node.

Starting the GUI

To start the GUI, use one of the following methods:

  • On an IRIX system where the GUI-client software (sysadm_failsafe2.sw.client) and desktop support software (sysadm_failsafe2.sw.desktop) are installed, do one of the following:


    Note: SGI does not recommend this method across a wide-area network (WAN) or virtual private network (VPN), or if the IRIX system has an R5000 or earlier CPU and less than 128 MB of memory.


    • Enter the following command line:

      # /usr/sbin/fsmgr

      (The fsdetail and fstask commands perform the same function as fsmgr; these command names are kept for historical purposes.)

    • Choose the following from the Toolchest:

      System -> FailSafe Manager

      You must restart the Toolchest after installing FailSafe in order to see the FailSafe entry on the Toolchest display. Enter the following commands to restart the Toolchest:

      # killall toolchest
      # /usr/bin/X11/toolchest &

If you are using a WAN or VPN, see “Starting the GUI on a PC”.

Starting the GUI on a PC

To start the GUI on a PC, or to perform administration from a remote location via a VPN or WAN, do the following:

  • Install a web server (such as Apache) and the following packages on one of the FailSafe nodes:

    sysadm_failsafe2.sw.web
    sysadm_xvm.sw.web

  • Install the Java2 v1.4.1 or v1.4.2 plug-in on your PC.

  • Close any existing Java windows and restart the Web browser on the PC.

  • Enter the URL http://server/FailSafeManager/ (where server is the name of a FailSafe node in the pool).

  • At the resulting webpage, click the FailSafe Manager icon.


Note: This method can be used on IRIX systems, but it is not the preferred method unless you are using WAN or VPN. If you load the GUI using Netscape on IRIX and then switch to another page in Netscape, the GUI will not operate correctly. To avoid this problem, leave the GUI web page up and open a new Netscape window if you want to view another web page.


Summary of GUI Platforms

Table 5-1 describes the platforms where the GUI may be started, connected to, and displayed.

Table 5-1. GUI Platforms

| GUI Mode | Where You Start the GUI | Where You Connect the GUI | Where the GUI Displays |
|----------|-------------------------|---------------------------|------------------------|
| fsmgr(1) or Toolchest | An IRIX system (such as an SGI 2000 series or SGI O2 workstation) with sysadm_failsafe2.sw.client and sysadm_failsafe2.sw.desktop software installed | The node in the pool that you want to use for cluster administration | The system where the GUI was invoked |
| Web | Any system with a web browser and Java2 1.4.1 or 1.4.2 plug-in installed and enabled | The FailSafe node in the pool that you want to use for cluster administration | The same system with the web browser |


Logging In

To ensure that the required GUI privileges are available for performing all of the tasks, you should log in to the GUI as root. However, some or all privileges can be granted to any other user using the GUI privilege tasks; see . (This functionality is also available with the Privilege Manager, part of the IRIX Interactive Desktop System Administration sysadmdesktop product. For more information, see the Personal System Administration Guide.)

A dialog box will appear prompting you to log in to a host. You can choose one of the following connection types:

  • Local runs the server-side process on the local host instead of going over the network

  • Direct creates a direct socket connection using the tcpmux TCP protocol (tcpmux must be enabled)

  • Remote Shell connects to the server via a user-specified command shell, such as rsh(1C) or ssh(1). For example:

    ssh -l root servername


    Note: For a secure connection, choose Remote Shell and type a secure connection command using a utility such as ssh(1). Otherwise, the GUI will not encrypt communication, and transferred passwords will be visible to users of the network.


  • Proxy connects to the server through a firewall via a proxy server

Making Changes Safely

Do not make configuration changes on two different administration nodes in the pool simultaneously, or use the GUI, cmgr(1M), and xvm(1M) commands simultaneously to make changes. You should run one instance of the cmgr(1M) command or the GUI on a single administration node in the pool when making changes at any given time. However, you can use any node in the pool when requesting status or configuration information. Multiple GUI windows accessed via the File menu are all part of the same application process; you can make changes from any of these windows.

The node to which you connect the GUI affects your view of the cluster. You should wait for a change to appear in the view area before making another change; the change is not guaranteed to be propagated across the cluster until it appears in the view area. (To see the location of the view area, see Figure 5-1.) The entire cluster status information is sent to every FailSafe node each time a change is made to the cluster database.

GUI Features

The FailSafe Manager GUI allows you to administer the entire cluster from a single point. It provides access to the tasks that help you set up and administer your FailSafe cluster:

  • Tasks let you set up and monitor individual components of a cluster, including XVM volumes. For details about XVM tasks, see the XVM Volume Manager Administrator's Guide.

  • Guided configuration tasks consist of a group of tasks collected together to accomplish a larger goal. For example, Set Up a New Cluster steps you through the process for creating a new cluster and allows you to launch the necessary individual tasks by simply clicking their titles.

GUI Window Layout

By default, the window is divided into two sections: the view area and the details area. The details area shows generic overview text if no item is selected in the view area. You can use the arrows in the middle of the window to shift the display.

File Menu

The File menu lets you display multiple windows for this instance of the GUI, the /var/adm/SYSLOG system log file, and the /var/sysadm/salog system administration log file. (The salog file shows the commands run directly by this instance of the GUI or run as a side effect of someone running cmgr commands on the system or some other instance of the GUI running commands on the system. Changes should not be made simultaneously by multiple instances of the GUI or the GUI and cmgr.) The File menu also lets you close the current window and exit the GUI completely.

Edit Menu

The Edit menu lets you expand and collapse the contents of the view area. You can choose to automatically expand the display to reflect new nodes added to the pool or cluster. You can also use this menu to select all items in the view area or clear the current selections.

Tasks Menu

The Tasks menu contains the following:

  • Guided Configuration, which contains the tasks to set up your cluster, define filesystems, create volumes, check status, and modify an existing cluster

  • Nodes, which contains tasks to define and manage the nodes

  • Cluster, which contains tasks to define and manage the cluster

  • Resource Types, which contains tasks to manage or modify existing resource types, or create new ones

  • Resources, which contains tasks to set up and configure individual resources

  • Failover Policies, which contains tasks to determine how FailSafe should keep resource groups highly available

  • Resource Groups, which contains tasks to define resource groups and manage them

  • FailSafe HA Services, which allows you to start and stop highly available (HA) services, set the FailSafe tiebreaker node, and set the log configuration

  • Diagnostics, which contains the tasks to test connectivity, resources, and failover policies

  • Privileges, which lets you grant or revoke access to a specific task for one or more users

  • Find Tasks, which lets you use keywords to search for a specific task

Help Menu

The Help menu provides an overview of the GUI and a key to the icons. You can also get help for certain items in blue text by clicking on them.

View Menu

Choose what you want to view from the View menu:

  • Resources in groups

  • Groups owned by nodes

  • Resources owned by nodes

  • Resources by type

  • Groups by failover policies

  • Groups

  • Nodes in the cluster

  • Nodes in the pool (that is, all defined nodes)

  • Users

  • Task privileges

Selecting Items to View or Modify

You can use the following methods to select items:

  • Click to select one item at a time

  • Shift+click to select a block of items

  • Ctrl+click to toggle the selection of any one item

Another way to select one or more items is to type a name into the Find text field and then press Enter or click the Find button.

Viewing Component Details

To view the details on any component, click its name in the view area; see “Selecting Items to View or Modify”.

The configuration and status details for the component will appear in the details area to the right. At the bottom of the details area will be the Applicable Tasks list, which displays tasks you may wish to launch after evaluating the component's configuration details. To launch a task, click the task name; based on the component selected, default values will appear in the task window.

To see more information about an item in the details area, select its name (which will appear in blue); details will appear in a new window. Terms with glossary definitions also appear in blue.

Performing Tasks

To perform an individual task, do the following:

  1. Select the task name from the Tasks menu or click the right mouse button within the view area. For example:

    Tasks -> Guided Configuration -> Set Up a New Cluster

    The task window appears.

    As a shortcut, you can right-click an item in the view area to bring up a list of tasks applicable to that item; information will also be displayed in the details area.


    Note: You can click any blue text to get more information about that concept or input field.


  2. Enter information in the appropriate fields and click OK to complete the task. (Some tasks consist of more than one page; in these cases, click Next to go to the next page, complete the information there, and then click OK.)


    Note: In every task, the cluster configuration will not update until you click OK.

    A dialog box appears, confirming the successful completion of the task.

  3. Continue launching tasks as needed.

Getting More Information

Clicking blue text displays one of the following:

  • Term definitions

  • Input instructions

  • Item details

  • The selected task window

Screens

Figure 5-1 shows a sample GUI window.

Figure 5-1. GUI Showing Details for a Resource

Figure 5-2 shows an example of the pop-up menu of applicable tasks that appears when you click the right mouse button on a selected item; in this example, clicking on the resource group name bartest-group displays a list of applicable resource-group tasks.

Figure 5-2. Pop-up Menu that Appears After Clicking the Right Mouse Button

cmgr Command

The cmgr command enables you to configure and administer a FailSafe system using a command-line interface on an IRIX system. It provides a minimum of help or formatted output and does not provide dynamic status except when queried. However, an experienced FailSafe administrator may find cmgr to be convenient when performing basic FailSafe configuration tasks, executing isolated single tasks in a production environment, or running scripts to automate some cluster administration tasks.

This section documents how to perform FailSafe administrative tasks by means of the cmgr command. You must be logged in as root.

The cmgr command uses the same underlying FailSafe commands as the GUI.

To use cmgr, enter the following:

# /usr/cluster/bin/cmgr

For more assistance, you can use the -p option on the command line; see “Using Prompt Mode”.

After you have entered this command, you will see the following:

Welcome to SGI Cluster Manager Command-Line Interface
cmgr>

Once the command prompt displays, you can enter the cluster manager commands.

At any time, you can enter ? or help to bring up the help display.

Getting Help

After the command prompt displays, you can enter subcommands. At any time, you can enter ? or help to bring up the cmgr help display.

Using Prompt Mode

The cmgr command provides a prompt mode that displays detailed prompts for the inputs required to define and modify FailSafe components. You can run in prompt mode in either of the following ways:

  • Specify a -p option when you enter the cmgr command, as in the following example:

    # cmgr -p

  • Execute a set prompting on command while in normal interactive mode, as in the following example:

    cmgr> set prompting on

    This method of entering prompt mode allows you to toggle in and out of prompt mode as you execute individual cmgr commands.

    To get out of prompt mode, enter the following command:

    cmgr> set prompting off

For example, if you are not in prompt mode and you enter the following command to define a node, you will see a single prompt, as indicated:

cmgr> define node cm1a
Enter commands, when finished enter either "done" or "cancel"

cm1a?

At the cm1a? prompt, enter the individual node definition commands (for full information on defining nodes, see “Define a Node with cmgr” in Chapter 6). For example:

cm1a? set hostname to hostname

A series of commands is required to define a node. If you are running cmgr in prompt mode, however, you are prompted for each required command, as shown in the following example:

cmgr> define node cm1a
Enter commands, you may enter "done" or "cancel" at any time to exit

Node Name [cm1a]? cm1a

Hostname[optional]? cm1a
Is this a FailSafe node <true|false> ? true
Is this a CXFS node <true|false> ? false
Node ID ? 1
Partition ID[optional] ? (0)
Reset type <powerCycle|reset|nmi> ? (powerCycle)
Do you wish to define system controller info[y/n]:y
Sysctrl Type <msc|mmsc|l2|l1>? (msc) msc
Sysctrl Password [optional]? ( )
Sysctrl Status <enabled|disabled>? enabled
Sysctrl Owner? cm2
Sysctrl Device? /dev/ttyd2
Sysctrl Owner Type <tty> [tty]? 
Number of Network interfaces [2]? 2
NIC 1 - IP Address? cm1
NIC 1 - Heartbeat HB (use network for heartbeats) <true|false>? true
NIC 1 - (use network for control messages) <true|false>? true
NIC 1 - Priority <1,2,...>? 1
NIC 2 - IP Address? cm2
NIC 2 Heartbeat HB (use network for heartbeats) <true|false>? true
NIC 2 - (use network for control messages) <true|false>? false
NIC 2 - Priority <1,2,...>? 2
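
The same kind of definition can also be prepared as a cmgr script instead of being typed at the prompts. The following is a minimal shell sketch of that idea, not a complete recipe: the function name emit_node_def and its argument order are hypothetical, and the cmgr commands it prints are limited to ones that appear in the generated-script example in “Creating a cmgr Script Automatically”. A real node definition may also need reset and system controller settings.

```shell
#!/bin/sh
# emit_node_def (hypothetical helper): print cmgr commands that define a
# node with a single network interface.
# Arguments: node name, hostname, node ID, NIC address or name.
emit_node_def() {
    name=$1
    host=$2
    id=$3
    nic=$4
    cat <<EOF
define node $name
        set hostname to $host
        set is_failsafe to true
        set nodeid to $id
        add nic $nic
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
done
EOF
}
```

For example, emit_node_def cm1a cm1a 1 cm1 prints a definition that uses the values entered in the prompt-mode session above. Add a quit line to the end of the output before passing it to cmgr -f.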

Completing Actions and Cancelling

When you are creating or modifying a component of a cluster, you can enter either of the following commands:

  • cancel, which aborts the current mode and discards any changes you have made

  • done, which commits the current definitions or modifications and returns to the cmgr> prompt

Command Line Editing within cmgr

The cmgr command supports the following command-line editing commands:

history [n] or h [n] 

Displays command line history. The optional n can be used to set the number of commands that will be remembered.

!! 

Refers to the previous command. By itself, this substitution repeats the previous command.

!n 

Refers to command line n.

!-n 

Refers to the current command line minus n.

!string 

Refers to the most recent command starting with string.

exit 

Exits from the shell.

Ctrl-W 

Deletes the previous word.

Ctrl-D 

Deletes the current character.

Ctrl-A 

Goes to the beginning of the line.

Ctrl-E 

Goes to the end of the line.

Ctrl-F  

Moves forward one character.

Ctrl-B 

Moves backward one character.

Ctrl-H  

Deletes the previous character.

Ctrl-N 

Moves down in the history.

Ctrl-K  

Erases to the end of the line from the cursor.

Ctrl-L 

Clears the screen and redisplays the prompt.

Ctrl-P 

Moves up in the history.

Ctrl-U 

Erases to the beginning of line from the cursor.

Ctrl-R  

Redraws the input line.

Esc-f  

Moves forward one word.

Esc-b  

Moves backward one word.

Esc-d  

Deletes the next word.

Esc-DEL  

Deletes the previous word.

Long-Running Tasks

The tasks to define the cluster and to stop HA services are long-running tasks that might take a few minutes to complete. The cmgr command will provide intermediate task status for such tasks. For example:

cmgr> stop ha_services in cluster nfs-cluster
Making resource groups offline
Stopping HA services on node node1
Stopping HA services on node node2

Startup Script

You can set the environment variable CMGR_STARTUP_FILE to point to a cmgr startup script. The script that this variable specifies is executed when cmgr is started (with or without the -p option). Only the cmgr set and show commands are allowed in the startup file.

The following is an example of a cmgr startup script file called cmgr_rc:

set cluster test-cluster
show status of resource_group oracle_rg

To specify this file as the startup script, execute the following command:

# setenv CMGR_STARTUP_FILE /cmgr_rc

Whenever cmgr is started, the cmgr_rc script is executed. The default cluster is set to test-cluster and the status of resource group oracle_rg in cluster test-cluster is displayed.
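
Because only set and show commands are allowed in the startup file, it can be useful to check a file before pointing CMGR_STARTUP_FILE at it. The following shell sketch shows one way to do that. The function name check_startup_file is hypothetical, and skipping blank lines and # comment lines is an assumption; this chapter does not state whether cmgr accepts them in a startup file.

```shell
#!/bin/sh
# check_startup_file (hypothetical helper): succeed only if every line of
# the given file is a set or show command. Blank lines and lines starting
# with # are skipped (an assumption, see the text above).
check_startup_file() {
    # An unreadable file cannot be checked.
    [ -r "$1" ] || return 2
    # grep -v selects lines that are NOT blank, comment, set, or show.
    # If it selects anything, the file contains a disallowed command.
    if grep -Ev '^[[:space:]]*(#.*|(set|show)([[:space:]].*)?)?$' "$1" >/dev/null; then
        return 1
    fi
    return 0
}
```

With the cmgr_rc example above, check_startup_file /cmgr_rc succeeds; it fails if the file contains a command such as define.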

Entering Subcommands on the Command Line

You can enter some cmgr subcommands directly from the command line using the following format:

cmgr -c "subcommand"

where subcommand can be any of the following with the appropriate operands:

  • admin, which allows you to perform certain actions such as resetting a node

  • delete, which deletes a cluster or a node

  • help, which displays help information

  • show, which displays information about the cluster or nodes

  • start, which starts HA services and sets the configuration so that HA services will be automatically restarted upon reboot

  • stop, which stops HA services and sets the configuration so that HA services are not restarted upon reboot

  • test, which tests connectivity

For example, to display information about the cluster, enter the following:

# cmgr -c "show clusters"
1 Cluster(s) defined
      eagan

See Chapter 6, “Configuration”, and the cmgr man page for more information.
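
Because cmgr -c writes its result to standard output, the result can be passed to other shell tools. The following sketch extracts the cluster names from show clusters output. The function name list_clusters is hypothetical, and the parsing assumes the output layout shown in the example above (a header line followed by indented names); other cmgr versions may format the output differently.

```shell
#!/bin/sh
# list_clusters (hypothetical helper): read "show clusters" output on
# standard input and print one cluster name per line. Assumes a
# "N Cluster(s) defined" header followed by indented cluster names.
list_clusters() {
    awk '/Cluster\(s\) defined/ { next } NF > 0 { print $1 }'
}
```

A typical use would be /usr/cluster/bin/cmgr -c "show clusters" | list_clusters.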

Using Script Files

You can execute a series of cmgr commands by using the -f option and specifying an input file, as follows:

cmgr -f input_file

Or you could include the following as the first line of the file and then execute it as a script:

#!/usr/cluster/bin/cmgr -f

Each line of the file must be a valid cmgr command line, comment line (starting with #), or a blank line. (You must include a done command line to finish a multilevel command and end the file with a quit command line.)

If any line of the input file fails, cmgr will exit. You can choose to ignore the failure and continue the process by using the -i option with the -f option, as follows:

cmgr -if input_file

Or include it in the first line for a script:

#!/usr/cluster/bin/cmgr -if


Note: If you include -i when using a cmgr command line as the first line of the script, you must use this exact syntax (that is, -if).


For example, suppose the file /tmp/showme contains the following:

fs6# more /tmp/showme
show clusters
show nodes in cluster fs6-8
quit

You can execute the following command, which will yield the indicated output:

fs6# /usr/cluster/bin/cmgr -if /tmp/showme

1 Cluster(s) defined
        fs6-8


Cluster fs6-8 has following 3 machine(s)
        fs6
        fs7
        fs8

Or you could include the cmgr command line as the first line of the script, give it execute permission, and execute showme itself:

fs6# more /tmp/showme
#!/usr/cluster/bin/cmgr -if
#
show clusters
show nodes in cluster fs6-8
quit

fs6# /tmp/showme

1 Cluster(s) defined
        fs6-8


Cluster fs6-8 has following 3 machine(s)
        fs6
        fs7
        fs8
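
An input file such as showme can also be written by a shell script, which is convenient when the cluster name varies. The following sketch follows the file rules described above: a comment line, valid cmgr command lines, and a final quit. The function name make_show_script is hypothetical.

```shell
#!/bin/sh
# make_show_script (hypothetical helper): write a cmgr input file that
# lists all clusters and the nodes of one cluster.
# Arguments: cluster name, output file.
make_show_script() {
    cluster=$1
    file=$2
    {
        echo "# generated query script"
        echo "show clusters"
        echo "show nodes in cluster $cluster"
        echo "quit"    # a script file must end with quit
    } > "$file"
}
```

For example, make_show_script fs6-8 /tmp/showme creates a file that can be run with /usr/cluster/bin/cmgr -f /tmp/showme.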

Creating a cmgr Script Automatically

After you have configured the cluster database, you can use the build_cmgr_script command to automatically create a cmgr script based on the contents of the cluster database. The generated script will contain the following:

  • Node definitions

  • Cluster definition

  • Resource definitions

  • Resource type definitions

  • Resource group definitions

  • Failover policy definitions

  • HA parameters settings

  • Any changes made using either the cmgr command or the GUI

  • CXFS information (only in a coexecution cluster)

When you use the -s option, the command also generates create_resource_type scripts for resource types.

As needed, you can then use the generated script to recreate the cluster database after performing a cdbreinit.

By default, the generated script is placed in the following location:

/var/cluster/ha/tmp/cmgr_create_cluster_clustername_processID

You can specify an alternative path name by using the -o option:

build_cmgr_script [-o script_pathname]

For more details, see the build_cmgr_script man page.

For example:

# /var/cluster/cmgr-scripts/build_cmgr_script -o /tmp/newcdb
Building cmgr script for cluster test-cluster ...
build_cmgr_script: Generated cmgr script is /tmp/newcdb

The example script file contents are as follows:

#!/usr/cluster/bin/cmgr -f

# Node node1 definition
define node node1
        set hostname to node1.dept.company.com
        set is_failsafe to true
        set nodeid to 32065
        set hierarchy to Reset,Shutdown
        set reset_type to powerCycle
        set sysctrl_type to msc
        set sysctrl_status to enabled
        set sysctrl_owner to node2
        set sysctrl_device to /dev/ttyd2
        set sysctrl_owner_type to tty
        add nic 192.0.2.58
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
        add nic 160.0.2.15
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 2
        done
done

# Node node2 definition
define node node2
        set hostname to node2.dept.company.com
        set is_failsafe to true
        set nodeid to 24140
        set hierarchy to Reset,Shutdown
        set reset_type to powerCycle
        set sysctrl_type to msc
        set sysctrl_status to enabled
        set sysctrl_owner to node1
        set sysctrl_device to /dev/ttyd2
        set sysctrl_owner_type to tty
        add nic 192.0.2.59
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
        add nic 160.0.2.16
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 2
        done
done

# Define cluster and add nodes to the cluster
define cluster test-cluster
        set is_failsafe to true
        set ha_mode to normal        
done

modify cluster test-cluster
        add node node1
        add node node2
done


set cluster test-cluster

quit
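
Before using a generated script to recreate the cluster database, you may want to confirm that it defines the nodes you expect. The following shell sketch lists the node names found in a cmgr script. The function name list_defined_nodes is hypothetical, and the sketch relies only on the define node lines visible in the example above.

```shell
#!/bin/sh
# list_defined_nodes (hypothetical helper): print the name of every node
# defined in the given cmgr script, one per line.
list_defined_nodes() {
    awk '$1 == "define" && $2 == "node" { print $3 }' "$1"
}
```

For the example script, list_defined_nodes /tmp/newcdb prints node1 and node2 on separate lines.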

Template Scripts

Template files of scripts that you can modify to configure the different components of your system are located in the /var/cluster/cmgr-templates directory.

Each template file contains a list of cmgr commands to create a particular object, as well as comments describing each field. The template also provides default values for optional fields.

Table 5-2 shows the template scripts for cmgr that are found in the /var/cluster/cmgr-templates directory.

Table 5-2. Template Scripts for cmgr

| File name | Description |
|-----------|-------------|
| cmgr-create-cluster | Creates a cluster |
| cmgr-create-failover_policy | Creates a failover policy |
| cmgr-create-node | Creates a node |
| cmgr-create-resource_group | Creates a resource group |
| cmgr-create-resource_type | Creates a resource type |
| cmgr-create-resource-ResourceType | Creates a resource of the specified resource type |

To create a FailSafe configuration, you can concatenate multiple templates into one file and execute the resulting script. If you concatenate information from multiple template scripts to prepare your cluster configuration, you must remove the quit at the end of each template script, except for the final quit. A cmgr script must have only one quit line.

For example, for a three-node configuration with an NFS resource group containing one volume, one filesystem, one IP_address, and one NFS resource, you would concatenate the following files, removing the quit at the end of each template script except the last one:

  • Three copies of the cmgr-create-node file

  • One copy of each of the following files:

    cmgr-create-cluster
    cmgr-create-failover_policy
    cmgr-create-resource_group
    cmgr-create-resource-volume
    cmgr-create-resource-filesystem
    cmgr-create-resource-IP_address
    cmgr-create-resource-NFS
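
The concatenation step can be scripted so that the single-quit rule is enforced automatically. The following shell sketch is one possible approach: it drops every quit line from the files it is given and appends one quit at the end. The function name join_cmgr_scripts is hypothetical, and the sketch assumes that each quit appears on a line by itself.

```shell
#!/bin/sh
# join_cmgr_scripts (hypothetical helper): concatenate the named files to
# standard output, removing every quit line and appending a single quit.
join_cmgr_scripts() {
    for f in "$@"; do
        # Delete lines that contain only "quit", allowing surrounding blanks.
        sed '/^[[:space:]]*quit[[:space:]]*$/d' "$f"
    done
    echo "quit"
}
```

You would pass it your edited copies of the templates in the order listed above and redirect the output to a new file, which you can then run with cmgr -f.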

Invoking a Shell from within cmgr

Enter the following command to invoke a shell from within cmgr:

cmgr> sh

To exit the shell and return to the cmgr prompt, enter exit at the shell prompt.