Appendix F. Reference to cmgr Tasks


Caution: This appendix is included for convenience, but has not been updated to support the current release. With the exception of a few administrative cmgr commands, the preferred CXFS configuration tools are cxfs_admin and the CXFS graphical user interface (GUI).



The following cmgr commands are still useful:
admin fence
admin nmi
admin ping
admin powerCycle
admin reset
start/stop cx_services
test connectivity
test serial



For an overview of the tasks that must be performed to configure a cluster, see “Initial Setup with the cmgr Command”.

Tasks must be performed using a certain hierarchy. For example, to modify a partition ID, you must first identify the node name.

You can also use the clconf_info tool to view status. See Chapter 14, “Monitoring Status”.


Note: CXFS requires a license key to be installed on each server-capable node. If you install the software without properly installing the license key, you cannot use the cmgr command. For more information about licensing, see Chapter 5, “CXFS License Keys”.

For information about using the preferred cxfs_admin command rather than cmgr, see Chapter 11, “cxfs_admin Command” and Appendix G, “Migration from cmgr to cxfs_admin”.

cmgr Overview

To use the cmgr command, you must be logged in as root on a CXFS administration node. Then enter either of the following:

# /usr/cluster/bin/cmgr

or

# /usr/cluster/bin/cluster_mgr

After you have entered this command, you will see the following message and the command prompt (cmgr>):

Welcome to SGI Cluster Manager Command-Line Interface

cmgr>

For more information, see the cmgr(1M) man page.

Making Changes Safely

Do not make configuration changes on two different administration nodes in the pool simultaneously, or use the CXFS GUI, cmgr, and xvm commands simultaneously to make changes. You should run one instance of the cmgr command or the CXFS GUI on a single administration node in the pool when making changes at any given time. However, you can use any node in the pool when requesting status or configuration information.

Getting Help

After the command prompt displays, you can enter subcommands. At any time, you can enter ? or help to bring up the cmgr help display.

Using Prompt Mode

The -p option to cmgr displays prompts for the required inputs of administration commands that define and modify CXFS components. You can run in prompt mode in either of the following ways:

  • Specify a -p option on the command line:

    # cmgr -p

  • Execute a set prompting on command after you have brought up cmgr, as in the following example:

    cmgr> set prompting on

    This method allows you to toggle in and out of prompt mode as you execute individual subcommands. To get out of prompt mode, enter the following:

    cmgr> set prompting off

The following shows an example of the questions that may be asked in prompting mode (the actual questions asked will vary depending upon your answers to previous questions):

cmgr> define node nodename
Enter commands, you may enter "done" or "cancel" at any time to exit

Hostname[optional] ? 
Is this a FailSafe node <true|false> ? 
Is this a CXFS node <true|false> ? 
Operating System <IRIX|Linux32|Linux64|AIX|HPUX|Solaris|MacOSX|Windows> ?
Node Function <server_admin|client_admin|client_only> ?
Node ID ?[optional] 
Partition ID ?[optional] (0)
Do you wish to define failure hierarchy[y/n]:
Reset type <powerCycle|reset|nmi> ? (powerCycle) 
Do you wish to define system controller info[y/n]:
Sysctrl Type <msc|mmsc|l2|l1|bmc>? (msc) 
Sysctrl Password[optional] ? ( )
Sysctrl Status <enabled|disabled> ? 
Sysctrl Owner ? 
Sysctrl Device ? 
Sysctrl Owner Type <tty|network|ipmi> ? (tty) 
Number of Network Interfaces ? (1)
NIC 1 - IP Address ?

For details about this task, see “Define a Node with cmgr”.

Completing Actions and Cancelling

When you are creating or modifying a component of a cluster, you can enter either of the following commands:

  • cancel, which aborts the current mode and discards any changes you have made

  • done, which executes the current definitions or modifications and returns to the cmgr> prompt

Using Script Files

You can execute a series of cmgr commands by using the -f option and specifying an input file:

cmgr -f input_file

Or, you could include the following as the first line of the file and then execute the file directly as a script:

#!/usr/cluster/bin/cmgr -f

Each line of the file must be a valid cmgr command line, comment line (starting with #), or a blank line.


Note: You must include a done command line to finish a multilevel command and end the file with a quit command line.


If any line of the input file fails, cmgr will exit. You can choose to ignore the failure and continue the process by using the -i option with the -f option, as follows:

cmgr -if input_file

Or include it in the first line for a script:

#!/usr/cluster/bin/cmgr -if


Note: If you include -i when using a cmgr command line as the first line of the script, you must use this exact syntax (that is, -if).


For example, suppose the file /tmp/showme contains the following:

cxfs6# more /tmp/showme
show clusters
show nodes in cluster cxfs6-8
quit

You can execute the following command, which will yield the indicated output:

cxfs6# /usr/cluster/bin/cmgr -if /tmp/showme

1 Cluster(s) defined
        cxfs6-8


Cluster cxfs6-8 has following 3 machine(s)
        cxfs6
        cxfs7
        cxfs8

Or you could include the cmgr command line as the first line of the script, give it execute permission, and execute showme itself:

cxfs6# more /tmp/showme
#!/usr/cluster/bin/cmgr -if
#
show clusters
show nodes in cluster cxfs6-8
quit

cxfs6# /tmp/showme

1 Cluster(s) defined
        cxfs6-8



Cluster cxfs6-8 has following 3 machine(s)
        cxfs6
        cxfs7
        cxfs8

For an example of defining a complete cluster, see “Script Example”.
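
As a pre-flight check before passing a file to cmgr -f, the quit rule from the note above can be tested with a small shell function. This is a minimal sketch under stated assumptions: check_cmgr_script is a hypothetical helper, not part of CXFS. It verifies only that the file has exactly one quit line (the rule given in “Template Scripts”) and that quit is the last command line; it does not check whether the cmgr commands themselves are valid.

```shell
# check_cmgr_script FILE   (hypothetical helper, not shipped with CXFS)
# Returns 0 if FILE has exactly one "quit" line and it is the last
# command line, ignoring blank lines and lines that start with #.
check_cmgr_script() {
    script=$1
    # Keep only command lines: drop comment lines and blank lines.
    commands=$(grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$script")
    # Count the lines that consist solely of the word quit.
    quit_count=$(printf '%s\n' "$commands" | grep -c '^[[:space:]]*quit[[:space:]]*$')
    # Take the final command line with all whitespace removed.
    last_command=$(printf '%s\n' "$commands" | tail -n 1 | tr -d '[:space:]')
    [ "$quit_count" -eq 1 ] && [ "$last_command" = "quit" ]
}
```

One way to use it would be to run check_cmgr_script /tmp/showme first and call /usr/cluster/bin/cmgr -f /tmp/showme only if the check succeeds.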

Invoking a Shell from within cmgr

To invoke a shell from within cmgr, enter the following:

cmgr> sh
cxfs6#

To exit the shell and to return to the cmgr> prompt, enter the following:

cxfs6# exit
cmgr>

Entering Subcommands on the Command Line

You can enter some cmgr subcommands directly from the command line using the following format:

cmgr -c "subcommand"

where subcommand can be any of the following with the appropriate operands:

  • admin, which allows you to perform certain actions such as resetting a node

  • delete, which deletes a cluster or a node

  • help, which displays help information

  • show, which displays information about the cluster or nodes

  • start, which starts CXFS services and sets the configuration so that CXFS services will be automatically restarted upon reboot

  • stop, which stops CXFS services and sets the configuration so that CXFS services are not restarted upon reboot

  • test, which tests connectivity

For example, to display information about the cluster, enter the following:

# cmgr -c "show clusters"
1 Cluster(s) defined
      eagan

See the cmgr man page for more information.
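
Because only some subcommands can be passed with -c, a wrapper can reject the others before cmgr is started. The following is a sketch only: cmgr_c_supported and run_cmgr_c are hypothetical names, and the list of accepted first words is taken directly from the bullets above rather than from cmgr itself.

```shell
# cmgr_c_supported "SUBCOMMAND ..."   (hypothetical helper)
# Returns 0 if the first word is one of the subcommands listed above
# as usable with cmgr -c; otherwise returns 1.
cmgr_c_supported() {
    first_word=${1%% *}     # text before the first space
    case $first_word in
        admin|delete|help|show|start|stop|test) return 0 ;;
        *) return 1 ;;
    esac
}

# run_cmgr_c "SUBCOMMAND ..."   (hypothetical wrapper)
# Runs the subcommand through cmgr -c only if it passes the check.
run_cmgr_c() {
    if cmgr_c_supported "$1"; then
        /usr/cluster/bin/cmgr -c "$1"
    else
        echo "not usable with cmgr -c: $1" >&2
        return 2
    fi
}
```

With these definitions, run_cmgr_c "show clusters" would invoke cmgr, while run_cmgr_c "define node cxfs6" would return 2 without starting it.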

Template Scripts

The /var/cluster/cmgr-templates directory contains template cmgr scripts that you can modify to configure the different components of your system.

Each template file contains lists of cmgr commands required to create a particular object, as well as comments describing each field. The template also provides default values for optional fields.

The /var/cluster/cmgr-templates directory contains the following templates to create a cluster and nodes:

  • cmgr-create-cluster

  • cmgr-create-node

To create a CXFS configuration, you can concatenate multiple templates into one file and execute the resulting script.


Note: If you concatenate information from multiple template scripts to prepare your cluster configuration, you must remove the quit at the end of each template script, except for the final quit. A cmgr script must have only one quit line.

For example, for a three-node configuration, you would concatenate three copies of the cmgr-create-node file and one copy of the cmgr-create-cluster file.
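
The concatenation and the single-quit rule can be combined in one step. This is a minimal sketch, assuming that each template ends its command list with a line containing only quit; concat_cmgr_templates is a hypothetical helper, not part of CXFS, and the resulting file must still be edited to supply your own values.

```shell
# concat_cmgr_templates FILE...   (hypothetical helper)
# Writes the files to standard output in the order given, dropping
# every "quit" line, and then appends one final "quit".
concat_cmgr_templates() {
    for template in "$@"; do
        grep -v '^[[:space:]]*quit[[:space:]]*$' "$template"
    done
    echo quit
}

# Example for a three-node configuration (the output path is arbitrary):
#   cd /var/cluster/cmgr-templates
#   concat_cmgr_templates cmgr-create-node cmgr-create-node \
#       cmgr-create-node cmgr-create-cluster > /tmp/mycluster.cmgr
```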

Initial Setup with the cmgr Command


Note: For the initial installation, SGI highly recommends that you use the GUI guided configuration tasks. See “Initial Setup with the CXFS GUI” in Chapter 9.

To initially configure the cluster with the cmgr command, do the following:

  1. Follow the directions in “Preliminary Cluster Configuration Steps” in Chapter 9.

  2. Define the nodes that are eligible to be part of the cluster. The hostname/IP-address pairings and priorities of the networks must be the same for each node in the cluster. See “Define a Node with cmgr”.

    For large clusters, SGI recommends that you define only the first three CXFS administration nodes and then continue on to the next step; add the remaining nodes after you have a successful small cluster.

    The following example sequence defines three nodes. (To use the default value for a prompt, press the Enter key. The Enter key is not shown in the examples in this guide.)

    To define the first node, named cxfs6, enter the following:


    Caution: It is critical that you enter the primary name for the first node defined in the pool.


    cxfs6 # /usr/cluster/bin/cmgr -p
    Welcome to SGI Cluster Manager Command-Line Interface
    
    cmgr> define node cxfs6
    Enter commands, you may enter "done" or "cancel" at any time to exit
    
    Hostname[optional] ? 
    Is this a FailSafe node <true|false> ? false
    Is this a CXFS node <true|false> ? true
    Operating System  <IRIX|Linux32|Linux64|AIX|HPUX|Solaris|MacOSX|Windows> ? irix
    Node Function <server_admin|client_admin|client_only> ? server_admin
    Node ID[optional]? 
    Partition ID[optional] ? (0)
    Do you wish to define failure hierarchy[y/n]:n
    Reset type <powerCycle|reset|nmi>  ? (powerCycle) 
    Do you wish to define system controller info[y/n]:y
    Sysctrl Type <msc|mmsc|l2|l1|bmc> ? (msc) 
    Sysctrl Password[optional] ? ( ) 
    Sysctrl Status <enabled|disabled> ? enabled
    Sysctrl Owner ? cxfs8
    Sysctrl Device ? /dev/ttyd2
    Sysctrl Owner Type <tty|network|ipmi> ? (tty) 
    Number of Network Interfaces ? (1)
    NIC 1 - IP Address ? cxfs6
    NIC 1 - Heartbeat HB (use network for heartbeats) <true|false> ? true
    NIC 1 - (use network for control messages) <true|false> ? true
    
    NIC 1 - Priority <1,2,...> 1
    
    
    Successfully defined node cxfs6

    To define the second node, named cxfs7, enter the following:

    cmgr> define node cxfs7 
    Enter commands, you may enter "done" or "cancel" at any time to exit
    
    Hostname[optional] ? 
    Is this a FailSafe node <true|false> ? false
    Is this a CXFS node <true|false> ? true
    Node Function <server_admin|client_admin|client_only> ? server_admin
    Operating System <IRIX|Linux32|Linux64|AIX|HPUX|Solaris|MacOSX|Windows> ? irix
    Node ID[optional] ? 
    Partition ID[optional] ? (0)
    Do you wish to define failure hierarchy[y/n]:n
    Reset type <powerCycle|reset|nmi>  ? (powerCycle) 
    Do you wish to define system controller info[y/n]:y
    Sysctrl Type <msc|mmsc|l2|l1|bmc> ? (msc) 
    Sysctrl Password[optional] ? ( ) 
    Sysctrl Status <enabled|disabled> ? enabled
    Sysctrl Owner ? cxfs6
    Sysctrl Device ? /dev/ttyd2
    Sysctrl Owner Type <tty|network|ipmi> ? (tty) 
    Number of Network Interfaces ? (1) 
    NIC 1 - IP Address ? cxfs7
    NIC 1 - Heartbeat HB (use network for heartbeats) <true|false> ? true
    NIC 1 - (use network for control messages) <true|false> ? true
    
    NIC 1 - Priority <1,2,...> 1
    
    Successfully defined node cxfs7

    To define the third node, named cxfs8, enter the following:

    cmgr> define node cxfs8 
    Enter commands, you may enter "done" or "cancel" at any time to exit
    
    Hostname[optional] ? 
    Is this a FailSafe node <true|false> ? false
    Is this a CXFS node <true|false> ? true
    Node Function <server_admin|client_admin|client_only> ? server_admin
    Operating System <IRIX|Linux32|Linux64|AIX|HPUX|Solaris|MacOSX|Windows> ? irix
    Node ID[optional] ?
    Partition ID[optional] ? (0)
    Do you wish to define failure hierarchy[y/n]:n
    Reset type <powerCycle|reset|nmi>  ? (powerCycle) 
    Do you wish to define system controller info[y/n]:y
    Sysctrl Type <msc|mmsc|l2|l1|bmc> ? (msc) 
    Sysctrl Password[optional] ? ( ) 
    Sysctrl Status <enabled|disabled> ? enabled
    Sysctrl Owner ? cxfs7
    Sysctrl Device ? /dev/ttyd2
    Sysctrl Owner Type <tty|network|ipmi> ? (tty) 
    Number of Network Interfaces ? (1) 
    NIC 1 - IP Address ? cxfs8
    NIC 1 - Heartbeat HB (use network for heartbeats) <true|false> ? true
    NIC 1 - (use network for control messages) <true|false> ? true
    
    NIC 1 - Priority <1,2,...> 1
    
    Successfully defined node cxfs8

    You now have three nodes defined in the pool. To verify this, enter the following:

    cmgr> show nodes in pool
    
    3 Machine(s) defined
            cxfs6
            cxfs7
            cxfs8

    To show the contents of node cxfs6, enter the following:

    cmgr> show node cxfs6
    
    Logical Machine Name: cxfs6
    Hostname: cxfs6.example.com
    Operating System: irix
    Node Is FailSafe: false
    Node Is CXFS: true
    Node Function: server_admin
    Nodeid: 13203
    Partition id: 0
    Reset type: powerCycle
    System Controller: msc
    System Controller status: enabled
    System Controller owner: cxfs8
    System Controller owner device: /dev/ttyd2
    System Controller owner type: tty
    ControlNet Ipaddr: cxfs6
    ControlNet HB: true
    ControlNet Control: true
    ControlNet Priority: 1

  3. Define the cluster and add the nodes to it. See “Define a Cluster with cmgr”.

    For example, to define a cluster named cxfs6-8 and add the nodes that are already defined, enter the following:

    cmgr> define cluster cxfs6-8
    Enter commands, you may enter "done" or "cancel" at any time to exit
    
    Is this a FailSafe cluster <true|false> false ? 
    Is this a CXFS cluster  <true|false> true ?
    Cluster Notify Cmd [optional] ? 
    Cluster Notify Address [optional] ? 
    Cluster mode <normal|experimental>[optional] 
    Cluster ID ? 22
    
    No nodes in cluster cxfs6-8
    
    Add nodes to or remove nodes from cluster cxfs6-8
    Enter "done" when completed or "cancel" to abort
    
    cxfs6-8 ? add node cxfs6
    cxfs6-8 ? add node cxfs7
    cxfs6-8 ? add node cxfs8
    cxfs6-8 ? done
    Successfully defined cluster cxfs6-8
    
    Added node <cxfs6> to cluster <cxfs6-8>
    Added node <cxfs7> to cluster <cxfs6-8>
    Added node <cxfs8> to cluster <cxfs6-8>

    The fail action hierarchy is the set of instructions that determines which method is used in case of failure. If you set a hierarchy including fencing, you could define the switch at this point. For more information, see “Switches and I/O Fencing Tasks with cmgr”.

    To define a list of private networks that can be used in case the highest priority network (consisting by default of the priority 1 NICs) fails, use the add net command; see “Define a Node with cmgr”.

    For more information, see “Define a Cluster with cmgr”.

    To verify the cluster and its contents, enter the following:

    cmgr> show clusters
    
    1 Cluster(s) defined
            cxfs6-8
    
    cmgr> show cluster cxfs6-8
    Cluster Name: cxfs6-8
    Cluster Is FailSafe: false
    Cluster Is CXFS: true
    Cluster ID: 22
    Cluster CX mode: normal
    
    Cluster cxfs6-8 has following 3 machine(s)
            cxfs6
            cxfs7
            cxfs8
    
    CXFS Failover Networks:
         default network 0.0.0.0, mask 0.0.0.0

    For an example of this step using a script, see “Script Example”.

  4. Start CXFS services for each node in the cluster by entering the following:

    start cx_services for cluster clustername

    For example:

    cmgr> start cx_services for cluster cxfs6-8
    
    CXFS services have been activated in cluster cxfs6-8

    This action starts CXFS services and sets the configuration so that CXFS services will be restarted automatically whenever a node reboots.


    Note: If you stop CXFS services using either the GUI or cmgr, the automatic restart capability is turned off. You must start CXFS services again to reinstate the automatic restart capability.


    To verify that CXFS services have been started in the cluster and there is a membership formed, you can use the following cmgr command:

    show status of cluster clustername

    For example:

    cmgr> show status of cluster cxfs6-8
    
    Cluster (cxfs6-8) is not configured for FailSafe
    
    
    CXFS cluster state is ACTIVE.

    You can also use the clconf_info command. For example:

    cxfs6 # /usr/cluster/bin/clconf_info
    
    Event at [2004-04-16 09:20:59]
    
    Membership since Fri Apr 16 09:20:56 2004
    
    ____________ ______ ________ ______ ______
    Node         NodeID Status   Age    CellID
    ____________ ______ ________ ______ ______
    cxfs7         12812 up        0          1
    cxfs6         13203 up        0          0
    cxfs8         14033 up        0          2
    ____________ ______ ________ ______ ______
    0 CXFS FileSystems

    For more information, see “Display a Cluster with cmgr”.

  5. Obtain a shell window for one of the CXFS administration nodes in the cluster and use the IRIX fx(1M) command or the Linux parted(8) command to create a volume header on the disk drive. For information, see the man pages, IRIX Admin: Disks and Filesystems, and Linux Configuration and Operations Guide.

  6. Create the XVM logical volumes. In the shell window, use the xvm command line interface. For information, see the XVM Volume Manager Administrator's Guide.

  7. Make the XFS filesystems. In the shell window, use the mkfs command. For information, see the XVM Volume Manager Administrator's Guide and IRIX Admin: Disks and Filesystems.

  8. Define the filesystems by using the define cxfs_filesystem subcommand to cmgr. See “CXFS Filesystem Tasks with cmgr”.

    The following example shows two potential metadata servers for the fs1 filesystem; if cxfs6 (the preferred server, with rank 0) is not up when the cluster starts or later fails or is removed from the cluster, then cxfs7 (rank 1) will be used. It also shows the filesystem being mounted by default on all nodes in the cluster (Default Local Status enabled) but explicitly not mounted on cxfs8.


    Note: Although the list of metadata servers for a given filesystem is ordered, it is impossible to predict which server will become the active metadata server during the boot-up cycle because of network latencies and other unpredictable delays.

    Do the following:

    cmgr> define cxfs_filesystem fs1 in cluster cxfs6-8
    
    (Enter "cancel" at any time to abort)
    
    Device ? /dev/cxvm/d76lun0s0
    Mount Point ? /mnts/fs1
    Mount Options[optional] ? 
    Use Forced Unmount ? <true|false> ? false
    Default Local Status <enabled|disabled> ? (enabled) 
    
    DEFINE CXFS FILESYSTEM OPTIONS
    
            0) Modify Server.
            1) Add Server.
            2) Remove Server.
            3) Add Enabled Node.
            4) Remove Enabled Node.
            5) Add Disabled Node.
            6) Remove Disabled Node.
            7) Show Current Information.
            8) Cancel. (Aborts command)
            9) Done. (Exits and runs command)
    
    Enter option:1
    
    No current servers
    
    Server Node ? cxfs6
    Server Rank ? 0
    
            0) Modify Server.
            1) Add Server.
            2) Remove Server.
            3) Add Enabled Node.
            4) Remove Enabled Node.
            5) Add Disabled Node.
            6) Remove Disabled Node.
            7) Show Current Information.
            8) Cancel. (Aborts command)
            9) Done. (Exits and runs command)
    
    Enter option:1
    Server Node ? cxfs7
    Server Rank ? 1
    
    
            0) Modify Server.
            1) Add Server.
            2) Remove Server.
            3) Add Enabled Node.
            4) Remove Enabled Node.
            5) Add Disabled Node.
            6) Remove Disabled Node.
            7) Show Current Information.
            8) Cancel. (Aborts command)
            9) Done. (Exits and runs command)
    
    Enter option:5
    
    No disabled clients
    
    Disabled Node ? cxfs8
    
            0) Modify Server.
            1) Add Server.
            2) Remove Server.
            3) Add Enabled Node.
            4) Remove Enabled Node.
            5) Add Disabled Node.
            6) Remove Disabled Node.
            7) Show Current Information.
            8) Cancel. (Aborts command)
            9) Done. (Exits and runs command)
    
    Enter option:7
    
    Current settings for filesystem (fs1)
    
    CXFS servers:
            Rank 0          Node cxfs6
            Rank 1          Node cxfs7
    
    Default local status: enabled
    
    No explicitly enabled clients
    
    Explicitly disabled clients:
            Disabled Node: cxfs8
    
            0) Modify Server.
            1) Add Server.
            2) Remove Server.
            3) Add Enabled Node.
            4) Remove Enabled Node.
            5) Add Disabled Node.
            6) Remove Disabled Node.
            7) Show Current Information.
            8) Cancel. (Aborts command)
            9) Done. (Exits and runs command)
    
    Enter option:9
    Successfully defined cxfs_filesystem fs1
    
    cmgr> define cxfs_filesystem fs2 in cluster cxfs6-8
    
    (Enter "cancel" at any time to abort)
    
    Device ? /dev/cxvm/d77lun0s0
    Mount Point ? /mnts/fs2
    Mount Options[optional] ? 
    Use Forced Unmount ? <true|false> ? false
    Default Local Status <enabled|disabled> ? (enabled)
    
    DEFINE CXFS FILESYSTEM OPTIONS
    
            0) Modify Server.
            1) Add Server.
            2) Remove Server.
            3) Add Enabled Node.
            4) Remove Enabled Node.
            5) Add Disabled Node.
            6) Remove Disabled Node.
            7) Show Current Information.
            8) Cancel. (Aborts command)
            9) Done. (Exits and runs command)
    
    Enter option:1
    
    Server Node ? cxfs8
    Server Rank ? 0
    
            0) Modify Server.
            1) Add Server.
            2) Remove Server.
            3) Add Enabled Node.
            4) Remove Enabled Node.
            5) Add Disabled Node.
            6) Remove Disabled Node.
            7) Show Current Information.
            8) Cancel. (Aborts command)
            9) Done. (Exits and runs command)
    
    Enter option:7
    
    Current settings for filesystem (fs2)
    
    CXFS servers:
            Rank 0          Node cxfs8
    
    Default local status: enabled
    
    No explicitly enabled clients
    
    No explicitly disabled clients
    
            0) Modify Server.
            1) Add Server.
            2) Remove Server.
            3) Add Enabled Node.
            4) Remove Enabled Node.
            5) Add Disabled Node.
            6) Remove Disabled Node.
            7) Show Current Information.
            8) Cancel. (Aborts command)
            9) Done. (Exits and runs command)
    
    Enter option:9
    Successfully defined cxfs_filesystem fs2

    To see the modified contents of cluster cxfs6-8, enter the following:

    cmgr> show cxfs_filesystems in cluster cxfs6-8
    
    fs1
    fs2
    

  9. Mount the filesystems on all nodes in the cluster by using the admin cxfs_mount cxfs_filesystem subcommand to cmgr. See “Mount a CXFS Filesystem with cmgr”. For example:

    cmgr> admin cxfs_mount cxfs_filesystem fs1 in cluster cxfs6-8
    cxfs_mount operation successful
    
    cmgr> admin cxfs_mount cxfs_filesystem fs2 in cluster cxfs6-8
    cxfs_mount operation successful

  10. To quit out of cmgr, enter the following:

    cmgr> quit
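
The verification in step 4 can also be scripted. The functions below are a sketch that reads command output on standard input; cluster_is_active and nodes_not_up are hypothetical names, and the line and column layouts they match are assumed from the sample output shown in step 4, so they may need adjusting for other releases.

```shell
# cluster_is_active   (hypothetical helper)
# Reads "show status of cluster" output on standard input and returns 0
# only if it contains the ACTIVE state line shown in step 4.
cluster_is_active() {
    grep -q 'CXFS cluster state is ACTIVE'
}

# nodes_not_up   (hypothetical helper)
# Reads clconf_info output on standard input and prints the name of
# each node whose Status column is not "up". A node row is assumed to
# have five fields with a numeric NodeID in the second field.
nodes_not_up() {
    awk 'NF == 5 && $2 ~ /^[0-9]+$/ && $3 != "up" { print $1 }'
}
```

For example, the output of /usr/cluster/bin/cmgr -c "show status of cluster cxfs6-8" could be piped to cluster_is_active, and the output of /usr/cluster/bin/clconf_info to nodes_not_up.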

Set Configuration Defaults with cmgr

You can set a default cluster and node to simplify the configuration process for the current session of cmgr. The default will then be used unless you explicitly specify a name. You can use the following commands to specify default values:

set cluster  clustername
set node hostname 

clustername and hostname are logical names. Logical names cannot begin with an underscore (_) or include any whitespace characters, and can be at most 255 characters.

To view the current defaults, use the following:

show set defaults

For example:

cmgr> set cluster cxfs6-8
cmgr> set node cxfs6
cmgr> show set defaults
Default cluster set to: cxfs6-8

Default node set to: cxfs6
Default cdb set to: /var/cluster/cdb/cdb.db
Default resource_type is not set
Extra prompting is set off
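
The naming rules for logical names can be checked before a name is passed to cmgr. This is a minimal sketch: valid_logical_name is a hypothetical helper, and rejecting an empty name is an added assumption that the text above does not state.

```shell
# valid_logical_name NAME   (hypothetical helper)
# Returns 0 if NAME does not begin with an underscore, contains no
# whitespace, and is at most 255 characters long.
valid_logical_name() {
    name=$1
    case $name in
        ''|_*) return 1 ;;           # empty (assumption) or leading underscore
        *[[:space:]]*) return 1 ;;   # contains whitespace
    esac
    [ "${#name}" -le 255 ]
}
```

With this definition, valid_logical_name cxfs6-8 succeeds, while valid_logical_name _cxfs6 and valid_logical_name "cxfs 6" fail.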

Node Tasks with cmgr

This section tells you how to define, modify, delete, display, and reset a node using cmgr.


Note: The entire cluster status information is sent to each CXFS administration node each time a change is made to the cluster database; therefore, the more CXFS administration nodes in a configuration, the longer each change will take to complete.


Define a Node with cmgr

To define a node, use the following commands:

define node logical_hostname
    set hostname to hostname
    set is_failsafe to true|false
    set is_cxfs to true|false
    set operating_system to irix|linux32|linux64|aix|solaris|macosx|windows
    set node_function to server_admin|client_admin|client_only
    set nodeid to nodeID
    set partition_id to partitionID
    set hierarchy to [system][fence][reset][fencereset][shutdown]
    set reset_type to powerCycle|reset|nmi (reset is recommended)
    set sysctrl_type to msc|mmsc|l2|l1|bmc (based on node hardware)
    set sysctrl_password to password
    set sysctrl_status to enabled|disabled
    set sysctrl_owner to node_sending_reset_command
    set sysctrl_device to port|IP_address_or_hostname_of_device
    set sysctrl_owner_type to tty|network|ipmi
    add nic IP_address_or_hostname (if DNS)
            set heartbeat to true|false
            set ctrl_msgs to true|false
            set priority to integer
    remove nic IP_address_or_hostname (if DNS)
    set weight to 0|1 (no longer needed)
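
The commands above can be written to a script file instead of being answered in prompt mode. The following sketch generates definitions equivalent to the three nodes in “Initial Setup with the cmgr Command”. The function names are hypothetical, and two details are assumptions rather than confirmed behavior: that add nic is a multilevel command that needs its own done line, and that cmgr accepts indented lines in a script.

```shell
# emit_node_definition NAME OWNER   (hypothetical helper)
# Prints cmgr script lines that define a CXFS-only IRIX server_admin
# node with one NIC, using the command names from the summary above.
emit_node_definition() {
    node=$1
    owner=$2
    cat <<EOF
define node $node
    set is_failsafe to false
    set is_cxfs to true
    set operating_system to irix
    set node_function to server_admin
    set reset_type to powerCycle
    set sysctrl_type to msc
    set sysctrl_status to enabled
    set sysctrl_owner to $owner
    set sysctrl_device to /dev/ttyd2
    set sysctrl_owner_type to tty
    add nic $node
        set heartbeat to true
        set ctrl_msgs to true
        set priority to 1
    done
done
EOF
}

# emit_example_script   (hypothetical helper)
# Prints a complete script for the three example nodes. The inner
# "done" closing each "add nic" block is an assumption (see above).
emit_example_script() {
    emit_node_definition cxfs6 cxfs8
    emit_node_definition cxfs7 cxfs6
    emit_node_definition cxfs8 cxfs7
    echo quit
}
```

The output could be redirected to a file, for example emit_example_script > /tmp/define-nodes.cmgr, reviewed, and then run with cmgr -f.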

Usage notes:

  • logical_hostname is a simple hostname (such as lilly) or a fully qualified domain name (such as lilly.example.com) or an entirely different name (such as nodeA). Logical names cannot begin with an underscore (_) or include any whitespace characters, and can be at most 255 characters.

  • hostname is the fully qualified hostname unless the simple hostname is resolved on all nodes. Use the ping command to display the fully qualified hostname. Do not enter an IP address. The default for hostname is the value for logical_hostname; therefore, you must supply a value for this command if you use a value other than the hostname or an abbreviation of it for logical_hostname.

  • If you are running just CXFS on this node, set is_cxfs to true and is_failsafe to false. If you are running both CXFS and FailSafe on this node in a coexecution cluster, set both values to true.

  • operating_system can be set to irix, linux32, linux64, aix, solaris, macosx, or windows. Choose windows for Windows 2000, Windows 2003, or Windows XP. Choose linux64 when defining an x86_64 or ia64 architecture node. (Use the uname -i command to determine the architecture type.)


    Note: For support details, see the CXFS 5 Client-Only Guide for SGI InfiniteStorage.


    If you specify aix, solaris, macosx or windows, the weight is assumed to be 0. If you try to specify incompatible values for operating_system and is_failsafe or weight, the define command will fail.

  • node_function specifies the function of the node. Enter one of the following:

    • server_admin is a node on which you will execute cluster administration commands and that you also want to be a CXFS metadata server. (You will use the Define a CXFS Filesystem task to define the specific filesystem for which this node can be a metadata server.) Use this node function only if the node will be a metadata server. You must have installed the cluster_admin product on this node.

    • client_admin is an IRIX node on which you will execute cluster administration commands but that you do not want to use as a CXFS metadata server. Use this node function only if the node will run FailSafe but you do not want it to be a metadata server. You must install the cluster_admin product on this node.

    • client_only is a node that shares CXFS filesystems but on which you will not execute cluster administration commands and that will not be a CXFS metadata server. Use this node function for all nodes other than those that will be metadata servers, or those that will run FailSafe without being a metadata server. You must install the cxfs_client product on this node. This node can run any supported OS.

    AIX, Solaris, Mac OS X, and Windows nodes are automatically specified as client-only. You should specify client-only with linux32.

  • nodeid is an integer in the range 1 through 32767 that is unique among the nodes in the pool. You must not change the node ID number after the node has been defined.

    • For administration nodes, this value is optional. If you do not specify a number for an administration node, CXFS will calculate an ID for you. The default ID is a 5-digit number based on the machine's serial number and other machine-specific information; it is not sequential.

    • For client-only nodes, you must specify a unique value.

  • partition_id uniquely defines a partition in a partitioned Origin 3000 system, Altix 3000 series system, or Altix 4700 system. The set partition_id command is optional; if you do not have a partitioned Origin 3000 system, you can skip this command or enter 0.


    Note: In an Origin 3000 system, use the mkpart command to determine this value:

    • The -n option lists the partition ID (which is 0 if the system is not partitioned).

    • The -l option lists the bricks in the various partitions (use rack#.slot# format in cmgr).

      For example (output truncated here for readability):

      # mkpart -n
      Partition id = 1
      # mkpart -l
      partition: 3 = brick: 003c10 003c13 003c16 003c21 003c24 003c29 ...
      partition: 1 = brick: 001c10 001c13 001c16 001c21 001c24 001c29 ...

      You could enter one of the following for the Partition ID field:

      1
      001.10


    To unset the partition ID, use a value of 0 or none.

    On an Altix 3000, you can find the partition ID by reading the proc file. For example:

    [root@linux root]# cat /proc/sgi_sn/partition_id 
    0

    The 0 indicates that the system is not partitioned. If the system is partitioned, the partition ID (such as 1 or 2) is displayed.

  • hierarchy defines the failpolicy hierarchy, which determines what happens to a failed node. You can specify up to three options. The second option will be completed only if the first option fails; the third option will be completed only if both the first and second options fail. Options must be separated by commas, with no whitespace.

    The option choices are as follows:

    • system deletes all hierarchy information about the node from the database, causing the system defaults to be used. The system defaults are the same as entering reset,shutdown. This means that a reset will be performed on a node with a system controller; if the reset fails or if the node does not have a system controller, CXFS services will be forcibly shut down.

    • fence disables access to the SAN from the problem node. Fencing provides faster recovery of the CXFS kernel membership than reset.

    • fencereset performs a fence and then, if the node is successfully fenced, also performs an asynchronous reset of the node via a system controller; recovery begins without waiting for reset acknowledgement.


      Note: SGI recommends that a server-capable node include reset in its hierarchy (unless it is the only server-capable node in the cluster). See “Data Integrity Protection” in Chapter 1.


    • reset performs a system reset via a system controller.

    • shutdown tells the other nodes in the cluster to wait for a period of time (long enough for the node to shut itself down) before reforming the CXFS kernel membership. (However, there is no notification that the node's shutdown has actually taken place.)


      Caution: Because there is no notification that a shutdown has occurred, if you have a cluster with no tiebreaker, you must not use the shutdown setting for any server-capable node in order to avoid multiple clusters being formed. See “Shutdown” in Chapter 2.

      You cannot use shutdown on client nodes if you choose dynamic monitoring.


    Note: If the failure hierarchy contains reset or fencereset, the reset might be performed before the system kernel core-dump can complete, resulting in an incomplete core-dump.

    For a list of valid fail action sets, see “Data Integrity Protection” in Chapter 1.

    To perform a reset only if a fencing action fails, specify the following:

    set hierarchy fence,reset


    Note: If shutdown is not specified and the other actions fail, the node attempting to deliver the CXFS kernel membership will stall delivering the membership until either the failed node attempts to re-enter the cluster or the system administrator intervenes using cms_intervene. Objects held by the failed nodes stall until membership finally transitions and initiates recovery.

    To perform a fence and an asynchronous reset, specify the following:

    set hierarchy fencereset

    To return to system defaults (reset,shutdown), specify the following:

    set hierarchy system

  • reset_type applies to SGI hardware and can be one of the following:

    • powerCycle shuts off power to the node and then restarts it

    • reset simulates the pressing of the reset button on the front of the machine (recommended)

    • nmi (nonmaskable interrupt) performs a core-dump of the operating system kernel, which may be useful when debugging a faulty machine


      Note: The nmi type depends upon kernel support, which may not be present on all SGI Altix ia64 systems; if the kernel support is not provided, the nmi setting will not work.

      nmi is not available on systems containing a baseboard management controller (BMC), such as SGI Altix XE x86_64 systems.


  • sysctrl_type is the system controller type based on the node hardware, as shown in Table 11-1.

  • sysctrl_password is the password for the system controller port, not the node's root password or PROM password. On some nodes, the system administrator may not have set this password. If you wish to set or change the system controller password, consult the hardware manual for your node.

  • sysctrl_status allows you to provide information about the system controller but temporarily disable reset by setting this value to disabled (meaning that CXFS cannot reset the node). To allow CXFS to reset the node, enter enabled. For nodes without system controllers, set this to disabled; see “Hardware and Software Requirements for Server-Capable Administration Nodes” in Chapter 1.

  • sysctrl_device is one of the following:

    • For systems with serial ports (reset_comms=tty), this is the name of the terminal port (TTY) on the owner node (the node issuing the reset). A serial cable connects the terminal port on the owner node to the system controller of the node being reset. /dev/ttyd2 is the most commonly used port, except on Origin 300 and Origin 350 systems (where /dev/ttyd4 is commonly used) and on Altix 350 systems (where /dev/ttyIOC0 is commonly used).


      Note: Check the owner node's specific hardware configuration to verify which device to use.


    • For systems with network-attached L2 system controllers (reset_comms=network), this is the IP address or hostname of the L2 controller on the node being reset. For example:

      reset_device=nodename-l2.company.com

    • For systems with network-attached baseboard management controller (BMC) system controllers (reset_comms=ipmi), this is the IP address or hostname of the BMC controller on the node being reset. For example:

      reset_device=nodename-bmc.company.com

  • sysctrl_owner is the name of the node that sends the reset command. If you use tty for sysctrl_owner_type, serial cables must physically connect the node being defined and the owner node through the system controller port. At run time, the node must be defined in the CXFS pool.

  • sysctrl_owner_type is tty for TTY serial devices, network for network reset (available for systems with L2 system controllers), or ipmi for systems with a baseboard management controller (BMC).


    Note: If you are running in coexecution with FailSafe, the network and ipmi selections are not supported.

    For example:

    • For an Origin 3000 system:

      Sysctrl Device? /dev/ttyd2
      Sysctrl Owner Type? <tty|network|ipmi> tty

    • For an Altix system with an integrated L2, such as a NUMAlink 4 R-brick, or SGI Altix 3000 Bx2 systems:

      Sysctrl Device? nodename-l2.company.com
      Sysctrl Owner Type? <tty|network|ipmi> network

    • For an Altix 350:

      Sysctrl Device? /dev/ttyIOC0
      Sysctrl Owner Type? <tty|network|ipmi> tty

    • For an Altix XE system with an integrated baseboard management controller (BMC) using the Intelligent Platform Management Interface (IPMI):

      Sysctrl Device? nodename-bmc.company.com
      Sysctrl Owner Type? <tty|network|ipmi> ipmi

  • nic is the IP address or hostname of the private network. (The hostname must be resolved in the /etc/hosts file.)

    There can be up to 8 network interfaces. The NIC number is not significant. Priority 1 is the highest priority. By default, only the priority 1 NICs will be used as the CXFS private network and they must be on the same subnet. However, you can use the add net command to configure the NICs of a given priority into a network; each network takes its priority from the set of NICs it contains. In this case, if the highest priority network fails, the second will be used, and so on; see “Define a Cluster with cmgr”.


    Note: You cannot add a NIC or a network grouping while CXFS services are active (that is, when start cx_services has been executed); doing so can lead to cluster malfunction. If services have been started, they should be stopped with stop cx_services.

    If you do not use the add net command to group the NICs into a set of networks, all NICs other than priority 1 are ignored.

    SGI requires that this network be private; see “Private Network” in Chapter 1.

    For more information about using the hostname, see “Hostname Resolution and Network Configuration Rules” in Chapter 6.

  • weight is automatically set internally to either 0 or 1 to specify how many votes a particular CXFS administration node has in CXFS kernel membership decisions. This information is now set by the Node Function field, and the set weight command is no longer needed.


    Note: Although it is possible to use the set weight command to set a weight other than 0 or 1, SGI recommends that you do not do so. There is no need for additional weight.


For more information, see “Define a Node with the GUI” in Chapter 10.
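
The hierarchy rules in the usage notes (at most three options, separated by commas with no whitespace, and system standing for reset,shutdown) can be checked before a value is entered in cmgr. The following Python sketch is illustrative only: the function names parse_hierarchy and effective_actions are hypothetical, they are not part of cmgr, and the sketch assumes options are written in lowercase.

```python
# Hypothetical helper, not part of cmgr: checks a failure hierarchy value.
VALID_OPTIONS = {"system", "fence", "fencereset", "reset", "shutdown"}
SYSTEM_DEFAULT = ["reset", "shutdown"]  # documented meaning of "system"


def parse_hierarchy(value):
    """Split a value such as 'fence,reset' and check it against the rules."""
    if any(ch.isspace() for ch in value):
        raise ValueError("options must be separated by commas with no whitespace")
    options = value.split(",")
    if len(options) > 3:
        raise ValueError("at most three options may be specified")
    for option in options:
        if option not in VALID_OPTIONS:
            raise ValueError("unknown hierarchy option: %r" % option)
    return options


def effective_actions(options):
    """Replace 'system' with the documented defaults (reset, then shutdown)."""
    actions = []
    for option in options:
        actions.extend(SYSTEM_DEFAULT if option == "system" else [option])
    return actions
```

For instance, parse_hierarchy("fence,reset") returns the two options in the order in which they would be attempted, while a value containing a space is rejected.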

In prompting mode, press the Enter key to use default information. (The Enter key is not shown in the examples.) For general information, see “Define a Node with the GUI” in Chapter 10. Following is a summary of the prompts.

cmgr> define node logical_hostname
Enter commands, you may enter "done" or "cancel" at any time to exit

Hostname[optional] ? hostname
Is this a FailSafe node <true|false> ? true|false
Is this a CXFS node <true|false> ? true
Operating System <IRIX|Linux32|Linux64|AIX|HPUX|Solaris|MacOSX|Windows> ? OS_type
Node Function <server_admin|client_admin|client_only> ? node_function
Node ID ?[optional] node_ID
Partition ID ?[optional] (0)partition_ID
Do you wish to define failure hierarchy[y/n]:y|n
Reset type <powerCycle|reset|nmi> ? (powerCycle) 
Do you wish to define system controller info[y/n]:y|n
Sysctrl Type <msc|mmsc|l2|l1|bmc>? (msc) type (based on node hardware)
Sysctrl Password[optional] ? ( )password
Sysctrl Status <enabled|disabled> ? enabled|disabled
Sysctrl Owner ? node_sending_reset_command
Sysctrl Device ? port|IP_address_or_hostname_of_device
Sysctrl Owner Type <tty|network|ipmi> ? (tty) tty|network|ipmi
Number of Network Interfaces ? (1) number
NIC 1 - IP Address ? IP_address_or_hostname (if DNS)

For example, in normal mode:

# /usr/cluster/bin/cmgr
Welcome to SGI Cluster Manager Command-Line Interface

cmgr> define node foo
Enter commands, you may enter "done" or "cancel" at any time to exit

? set is_failsafe to false
? set is_cxfs to true
? set operating_system to irix
? set node_function to server_admin
? set hierarchy to fencereset,reset
? add nic 111.11.11.111
Enter network interface commands, when finished enter "done" or "cancel"

NIC 1 - set heartbeat to true
NIC 1 - set ctrl_msgs to true
NIC 1 - set priority to 1
NIC 1 - done
? done
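
The normal-mode session above is a fixed sequence of set and add commands, so it can be generated from a few parameters. The following Python sketch only assembles the text of those commands for a CXFS-only node with one priority 1 NIC. The function name define_node_commands is hypothetical, and how the lines are delivered to cmgr is outside the scope of the sketch.

```python
# Hypothetical helper: assembles the command text used in the normal-mode example.
def define_node_commands(name, operating_system, node_function, hierarchy, nic_address):
    """Return the lines typed in normal mode to define a CXFS-only node."""
    return [
        "define node %s" % name,
        "set is_failsafe to false",
        "set is_cxfs to true",
        "set operating_system to %s" % operating_system,
        "set node_function to %s" % node_function,
        "set hierarchy to %s" % hierarchy,
        "add nic %s" % nic_address,
        "set heartbeat to true",
        "set ctrl_msgs to true",
        "set priority to 1",
        "done",  # closes the network interface commands
        "done",  # closes the node definition
    ]


if __name__ == "__main__":
    for line in define_node_commands(
        "foo", "irix", "server_admin", "fencereset,reset", "111.11.11.111"
    ):
        print(line)
```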

For example, in prompting mode:

# /usr/cluster/bin/cmgr -p
Welcome to SGI Cluster Manager Command-Line Interface

cmgr> define node foo
Enter commands, you may enter "done" or "cancel" at any time to exit

Hostname[optional] ? 
Is this a FailSafe node <true|false> ? false
Is this a CXFS node <true|false> ? true
Operating System <IRIX|Linux32|Linux64|AIX|HPUX|Solaris|MacOSX|Windows> ? irix
Node Function <server_admin|client_admin|client_only> server_admin
Node ID[optional]? 
Partition ID ? [optional] (0)
Do you wish to define failure hierarchy[y|n]:y
Hierarchy option 0 <System|FenceReset|Fence|Reset|Shutdown>[optional] ? fencereset
Hierarchy option 1 <System|FenceReset|Fence|Reset|Shutdown>[optional] ? reset
Hierarchy option 2 <System|FenceReset|Fence|Reset|Shutdown>[optional] ? 
Reset type <powerCycle|reset|nmi>  ? (powerCycle) 
Do you wish to define system controller info[y/n]:n
Number of Network Interfaces ? (1)
NIC 1 - IP Address ? 111.11.11.111
NIC 1 - Heartbeat HB (use network for heartbeats) <true|false> ? true
NIC 1 - (use network for control messages) <true|false> ? true

NIC 1 - Priority <1,2,...> 1

Following is an example of defining a Solaris node in prompting mode (because it is a Solaris node, no default ID is provided, and you are not asked to specify the node function because it must be client_only).

cmgr> define node solaris1
Enter commands, you may enter "done" or "cancel" at any time to exit

Hostname[optional] ?
Is this a FailSafe node <true|false> ? false
Is this a CXFS node <true|false> ? true
Operating System <IRIX|Linux32|Linux64|AIX|HPUX|Solaris|MacOSX|Windows> ? solaris
Node ID ? 7
Do you wish to define failure hierarchy[y/n]:y
Hierarchy option 0 <System|FenceReset|Fence|Reset|Shutdown>[optional] ? fence
Hierarchy option 1 <System|FenceReset|Fence|Reset|Shutdown>[optional] ? 
Number of Network Interfaces ? (1)
NIC 1 - IP Address ? 163.154.18.172

Modify a Node with cmgr

To modify an existing node, use the following commands:

modify node logical_hostname
    set hostname to hostname
    set partition_id to partitionID
    set reset_type to powerCycle|reset|nmi
    set sysctrl_type to msc|mmsc|l2|l1|bmc (based on node hardware)
    set sysctrl_password to password
    set sysctrl_status to enabled|disabled
    set sysctrl_owner to node_sending_reset_command
    set sysctrl_device to port|IP_address_or_hostname_of_device
    set sysctrl_owner_type to tty|network|ipmi
    set is_failsafe to true|false
    set is_cxfs to true|false
    set weight to 0|1
    add nic IP_address_or_hostname (if DNS)
            set heartbeat to true|false
            set ctrl_msgs to true|false
            set priority to integer
    remove nic IP_address_or_hostname (if DNS)
    set hierarchy to [system][fence][reset][fencereset][shutdown]

The commands are the same as those used to define a node. You can change any of the information you specified when defining a node except the node ID. For details about the commands, see “Define a Node with cmgr”.


Caution: Do not change the node ID number after the node has been defined.

You cannot add a NIC or a network grouping while CXFS services are active (that is, when start cx_services has been executed); doing so can lead to cluster malfunction. If services have been started, they should be stopped with stop cx_services.

You cannot modify the operating_system setting for a node; trying to do so will cause an error. If you have mistakenly specified the incorrect operating system, you must delete the node and define it again.

You cannot modify the node function. To change the node function, you must delete the node and redefine it (and reinstall software products, as needed); the node function for a Solaris or Windows node is always client_only.
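
The restrictions above (node ID, operating system, and node function cannot be changed with modify node) can be expressed as a simple pre-check. In the following Python sketch, the function name rejected_changes and the key node_id are hypothetical labels chosen for illustration; operating_system and node_function follow the setting names used in the define node example.

```python
# Hypothetical pre-check: settings that require deleting and redefining the node.
IMMUTABLE_SETTINGS = {"node_id", "operating_system", "node_function"}


def rejected_changes(requested):
    """Return, sorted, the requested settings that modify node cannot change."""
    return sorted(set(requested) & IMMUTABLE_SETTINGS)
```

An empty result means that every requested setting is one that the modify node command accepts; a nonempty result lists the settings that require the node to be deleted and defined again.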

Example of Partitioning

The following shows an example of partitioning an Origin 3000 system:

# cmgr
Welcome to SGI Cluster Manager Command-Line Interface

cmgr> modify node n_preston
Enter commands, when finished enter either "done" or "cancel"

n_preston ? set partition_id to 1
n_preston ? done

Successfully modified node n_preston

To perform this function with prompting, enter the following:

# cmgr -p
Welcome to SGI Cluster Manager Command-Line Interface

cmgr> modify node n_preston
Enter commands, you may enter "done" or "cancel" at any time to exit

Hostname[optional] ? (preston.example.com) 
Is this a FailSafe node <true|false> ? (true) 
Is this a CXFS node <true|false> ? (true) 
Node ID[optional] ? (606) 
Partition ID[optional] ? (0) 1
Do you wish to modify failure hierarchy[y/n]:n
Reset type <powerCycle|reset|nmi> ? (powerCycle) 
Do you wish to modify system controller info[y/n]:n
Number of Network Interfaces? (1)
NIC 1 - IP Address ? (preston) 
NIC 1 - Heartbeat HB (use network for heartbeats)  ? (true) 
NIC 1 - (use network for control messages)  ? (true) 
NIC 1 - Priority <1,2,...> ? (1) 

Successfully modified node n_preston

cmgr> show node n_preston
Logical Machine Name: n_preston
Hostname: preston.example.com
Operating System: IRIX
Node Is FailSafe: true
Node Is CXFS: true
Node Function: client_admin
Nodeid: 606
Partition id: 1
Reset type: powerCycle
ControlNet Ipaddr: preston
ControlNet HB: true
ControlNet Control: true
ControlNet Priority: 1

To unset the partition ID, use a value of 0 or none.

Changing Failure Hierarchy

The following shows an example of changing the failure hierarchy for the node perceval from the system defaults to fencereset,reset,shutdown and back to the system defaults.


Caution: If you have a cluster with an even number of server-capable nodes and no tiebreaker, you should not use the shutdown setting for any server-capable node, in order to avoid a split-brain scenario. For a more detailed explanation, see “Shutdown” in Chapter 2.


cmgr> modify node perceval
Enter commands, you may enter "done" or "cancel" at any time to exit

Hostname[optional] ? (perceval.example.com)
Is this a FailSafe node <true|false> ? (false)
Is this a CXFS node <true|false> ? (true)
Node ID[optional] ? (803)
Partition ID[optional] ? (0)
Do you wish to modify failure hierarchy[y/n]:y
Hierarchy option 0 <System|FenceReset|Fence|Reset|Shutdown>[optional] ?fencereset
Hierarchy option 1 <System|FenceReset|Fence|Reset|Shutdown>[optional] ?reset
Hierarchy option 2 <System|FenceReset|Fence|Reset|Shutdown>[optional] ?shutdown
Reset type <powerCycle|reset|nmi> ? (powerCycle)
Do you wish to modify system controller info[y/n]:n
Number of Network Interfaces ? (1)
NIC 1 - IP Address ? (163.154.18.173)

Successfully modified node perceval

cmgr> show node perceval
Logical Machine Name: perceval
Hostname: perceval.example.com
Operating System: IRIX
Node Is FailSafe: false
Node Is CXFS: true
Node Function: client_admin
Nodeid: 803
Node Failure Hierarchy is: FenceReset Reset Shutdown
Reset type: powerCycle
ControlNet Ipaddr: 163.154.18.173
ControlNet HB: true
ControlNet Control: true
ControlNet Priority: 1

To return to system defaults:

cmgr> modify node perceval

Enter commands, you may enter "done" or "cancel" at any time to exit

Hostname[optional] ? (perceval.example.com)
Is this a FailSafe node <true|false> ? (false)
Is this a CXFS node <true|false> ? (true)
Node ID[optional] ? (803)
Partition ID[optional] ? (0)
Do you wish to modify failure hierarchy[y/n]:y
Hierarchy option 0 <System|FenceReset|Fence|Reset|Shutdown>[optional] ? (FenceReset) system
Reset type <powerCycle|reset|nmi> ? (powerCycle)
Do you wish to modify system controller info[y/n]:n
Number of Network Interfaces ? (1)
NIC 1 - IP Address ? (163.154.18.173)
NIC 1 - Heartbeat HB (use network for heartbeats) <true|false> ? (true)
NIC 1 - (use network for control messages) <true|false> ? (true)
NIC 1 - Priority <1,2,...> ? (1)

cmgr> show node perceval
Logical Machine Name: perceval
Hostname: perceval.example.com
Operating System: IRIX
Node Is FailSafe: false
Node Is CXFS: true
Node Function: client_admin
Nodeid: 803
Reset type: powerCycle
ControlNet Ipaddr: 163.154.18.173
ControlNet HB: true
ControlNet Control: true
ControlNet Priority: 1


Note: When the system defaults are in place for failure hierarchy, no status is displayed with the show command.
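
Because show node prints one "label: value" pair per line and omits the failure hierarchy line when the system defaults are in place, the output can be checked mechanically. The following Python sketch is an assumption-laden illustration: the function names are hypothetical, and it relies only on the output layout shown in the examples above.

```python
# Hypothetical parser for the "label: value" lines printed by show node.
def parse_show_node(output):
    """Return a dict mapping each label to its value."""
    fields = {}
    for line in output.splitlines():
        key, separator, value = line.partition(":")
        if separator:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def uses_system_default_hierarchy(fields):
    """True when no failure hierarchy line was displayed."""
    return "Node Failure Hierarchy is" not in fields
```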


Perform an NMI on a Node with cmgr

When CXFS daemons are running, you can perform a nonmaskable interrupt (NMI) on a node with the following command:

admin nmi node nodename


Note: The nmi type depends upon kernel support, which may not be present on all SGI Altix ia64 systems; if the kernel support is not provided, the nmi setting will not work.

nmi is not available on systems containing a baseboard management controller (BMC), such as SGI Altix XE x86_64 systems. The nmi option is therefore not available for a dev_type of ipmi and sysctrl_type of bmc.

This command uses the CXFS daemons to perform an NMI on the specified node.

You can perform an NMI on a node in a cluster even when the CXFS daemons are not running by using the standalone option:

admin nmi standalone node nodename

This command does not go through the CXFS daemons.

If the node has not been defined in the cluster database, you can use the following command line:

admin nmi dev_name port|IP_address_or_hostname_of_device of dev_type tty|network|ipmi with sysctrl_type msc|mmsc|l2|l1|bmc

For example:

admin nmi dev_name /dev/ttyIOC0 of dev_type tty with sysctrl_type l2
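
The command line for a node that is not in the cluster database combines a device, a device type, and a system controller type, and the nmi option is not available for a dev_type of ipmi or a sysctrl_type of bmc. The following Python sketch builds the command text and applies that restriction; the function name admin_nmi_command is hypothetical and the sketch does not run cmgr.

```python
# Hypothetical helper: builds the text of the admin nmi command line.
DEV_TYPES = ("tty", "network", "ipmi")
SYSCTRL_TYPES = ("msc", "mmsc", "l2", "l1", "bmc")


def admin_nmi_command(dev_name, dev_type, sysctrl_type):
    """Return the admin nmi command for a node not defined in the database."""
    if dev_type not in DEV_TYPES:
        raise ValueError("unknown dev_type: %r" % dev_type)
    if sysctrl_type not in SYSCTRL_TYPES:
        raise ValueError("unknown sysctrl_type: %r" % sysctrl_type)
    if dev_type == "ipmi" or sysctrl_type == "bmc":
        raise ValueError("nmi is not available on systems with a BMC")
    return "admin nmi dev_name %s of dev_type %s with sysctrl_type %s" % (
        dev_name, dev_type, sysctrl_type)
```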

If crsd does not see the response it expects within a certain time, it will issue another NMI, which invalidates the dump. This is determined by the setting of the following values in the cluster database (defaults shown):

CrsResetInterval = "20000"
CrsRetryInterval = "1000"
CrsResendTimeout = "10000"
CrsResendRetries = "2"


Note: These values can only be changed using advanced tools. If you feel that these values need to be changed, please contact your local SGI support provider.


Convert a Node to CXFS or FailSafe with cmgr

To convert an existing FailSafe node so that it also applies to CXFS (or an existing CXFS node so that it also applies to FailSafe), use the modify command to change the is_cxfs or is_failsafe setting.


Note: You cannot turn off FailSafe or CXFS for a node if the respective high availability (HA) or CXFS services are active. You must first stop the services for the node.

For example, in normal mode:

cmgr> modify node cxfs6
Enter commands, when finished enter either "done" or "cancel"

cxfs6 ? set is_failsafe to true
cxfs6 ? done

Successfully modified node cxfs6

For example, in prompting mode:

cmgr> modify node cxfs6
Enter commands, you may enter "done" or "cancel" at any time to exit

Hostname[optional] ? (cxfs6.example.com) 
Is this a FailSafe node <true|false> ? (false) true
Is this a CXFS node <true|false> ? (true) 
Node ID[optional] ? (13203) 
Partition ID[optional] ? (0)
Do you wish to modify failure hierarchy[y/n]:n
Reset type <powerCycle|reset|nmi> ? (powerCycle) 
Do you wish to modify system controller info[y/n]:n
Number of Network Interfaces ? (1)
NIC 1 - IP Address ? (163.154.18.172) 
NIC 1 - Heartbeat HB (use network for heartbeats) <true|false> ? (true) 
NIC 1 - (use network for control messages) <true|false> ? (true) 
NIC 1 - Priority <1,2,...> ? (1) 

Successfully modified node cxfs6

Delete a Node with cmgr

To delete a node, use the following command:

delete node hostname

You can delete a node only if the node is not currently part of a cluster. If a cluster currently contains the node, you must first modify that cluster to remove the node from it.

For example, suppose you had a cluster named cxfs6-8 with the following configuration:

cmgr> show cluster cxfs6-8
Cluster Name: cxfs6-8
Cluster Is FailSafe: true
Cluster Is CXFS: true
Cluster ID: 20
Cluster HA mode: normal
Cluster CX mode: normal


Cluster cxfs6-8 has following 3 machine(s)
        cxfs6
        cxfs7
        cxfs8

To delete node cxfs8, you would do the following in prompting mode (assuming that CXFS services have been stopped on the node):

cmgr> modify cluster cxfs6-8
Enter commands, when finished enter either "done" or "cancel"

Is this a FailSafe cluster <true|false> ? (false) 
Is this a CXFS cluster <true|false> ? (true) 
Cluster Notify Cmd [optional] ? 
Cluster Notify Address [optional] ? 
Cluster CXFS mode <normal|experimental>[optional] ? (normal) 
Cluster ID ? (20) 

Current nodes in cluster cxfs6-8:
Node - 1: cxfs6
Node - 2: cxfs7
Node - 3: cxfs8

Add nodes to or remove nodes/networks from cluster 
Enter "done" when completed or "cancel" to abort


cxfs6-8 ? remove node cxfs8
cxfs6-8 ? done
Successfully modified cluster cxfs6-8

cmgr> show cluster cxfs6-8
Cluster Name: cxfs6-8
Cluster Is FailSafe: false
Cluster Is CXFS: true
Cluster ID: 20
Cluster CX mode: normal


Cluster cxfs6-8 has following 2 machine(s)
        cxfs6
        cxfs7

To delete cxfs8 from the pool, enter the following:

cmgr> delete node cxfs8

Deleted machine (cxfs8).
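
The ordering requirement in this procedure (remove the node from any cluster that contains it, then delete it from the pool) can be sketched as follows. The Python function delete_node_commands is hypothetical; it only lists the normal-mode commands in order, and it assumes that CXFS services have already been stopped on the node.

```python
# Hypothetical helper: orders the commands needed to delete a node.
def delete_node_commands(node, clusters):
    """clusters maps each cluster name to the list of its member nodes."""
    commands = []
    for cluster_name in sorted(clusters):
        if node in clusters[cluster_name]:
            # The node must leave the cluster before it can be deleted.
            commands += [
                "modify cluster %s" % cluster_name,
                "remove node %s" % node,
                "done",
            ]
    commands.append("delete node %s" % node)
    return commands
```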

Display a Node with cmgr

After you have defined a node, you can display the node's parameters with the following command:

show node hostname

For example:

cmgr> show node cxfs6
Logical Machine Name: cxfs6
Hostname: cxfs6.example.com
Operating System: IRIX
Node Is FailSafe: false
Node Is CXFS: true
Node Function: server_admin
Nodeid: 13203
Reset type: powerCycle
ControlNet Ipaddr: 163.154.18.172
ControlNet HB: true
ControlNet Control: true
ControlNet Priority: 1

You can see a list of all of the nodes that have been defined with the following command:

show nodes in pool

For example:

cmgr> show nodes in pool

3 Machine(s) defined
        cxfs8
        cxfs6
        cxfs7

You can see a list of all of the nodes that have been defined for a specified cluster with the following command:

show nodes [in cluster clustername]

For example:

cmgr> show nodes in cluster cxfs6-8

Cluster cxfs6-8 has following 3 machine(s)
        cxfs6
        cxfs7
        cxfs8

If you have specified a default cluster, you do not need to specify a cluster when you use this command. For example:

cmgr> set cluster cxfs6-8
cmgr> show nodes

Cluster cxfs6-8 has following 3 machine(s)
        cxfs6
        cxfs7
        cxfs8

Test Node Connectivity with cmgr

You can use cmgr to test the network connectivity in a cluster. This test checks if the specified nodes can communicate with each other through each configured interface in the nodes. This test will not run if CXFS is running. This test requires that the /etc/.rhosts file be configured properly; see “Modifications for CXFS Connectivity Diagnostics” in Chapter 7.

Use the following command to test the network connectivity for the nodes in a cluster:

test connectivity in cluster clustername [on node nodename1 node nodename2 ...]

For example:

cmgr> test connectivity in cluster cxfs6-8 on node cxfs7
Status: Testing connectivity...
Status: Checking that the control IP_addresses are on the same networks
Status: Pinging address cxfs7 interface ef0 from node cxfs7 [cxfs7]
Notice: overall exit status:success, tests failed:0, total tests executed:1

This test yields an error message when it encounters its first error, indicating the node that did not respond. If you receive an error message after executing this test, verify that the network interface has been configured up, using the ifconfig command. For example (line breaks added here for readability):

# /usr/etc/ifconfig ef0
ef0: flags=405c43 <UP,BROADCAST,RUNNING,FILTMULTI,MULTICAST,CKSUM,DRVRLOCK,IPALIAS>
inet 128.162.89.39 netmask 0xffff0000 broadcast 128.162.255.255

The UP in the first line of output indicates that the interface is configured up.

If the network interface is configured up, verify that the network cables are connected properly and run the test again.
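
The check for UP in the first line of ifconfig output can be automated. The following Python sketch assumes the output layout shown above, with the flag names listed between angle brackets on the first line; the function name interface_is_up is hypothetical.

```python
import re


def interface_is_up(ifconfig_output):
    """Return True if UP appears among the flags on the first output line."""
    lines = ifconfig_output.splitlines()
    if not lines:
        return False
    match = re.search(r"<([^>]*)>", lines[0])
    if match is None:
        return False
    # Compare whole flag names so that a flag merely containing "UP" does not match.
    return "UP" in match.group(1).split(",")
```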

Cluster Tasks with cmgr

This section tells you how to define, modify, delete, and display a cluster using cmgr. It also tells you how to start and stop CXFS services.

Define a Cluster with cmgr

When you define a cluster with cmgr, you define a cluster and add nodes to the cluster with the same command. For general information, see “Define a Cluster with the GUI” in Chapter 10.

Use the following commands to define a cluster:

define cluster clustername
    set is_failsafe to true|false
    set is_cxfs to true|false
    set clusterid to clusterID
    set notify_cmd to notify_command
    set notify_addr to email_address
    set ha_mode to normal|experimental
    set cx_mode to normal|experimental
    add node node1name
    add node node2name
    add net network network_address mask netmask

Usage notes:

  • cluster is the logical name of the cluster. Logical names cannot begin with an underscore (_) or include any whitespace characters, and can be at most 255 characters. Clusters must have unique names.

  • If you are running just CXFS, set is_cxfs to true and is_failsafe to false. If you are running a coexecution cluster, set both values to true.

  • clusterid is a unique number within your network in the range 1 through 255. The cluster ID is used by the operating system kernel to make sure that it does not accept cluster information from any other cluster that may be on the network. The kernel does not use the database for communication, so it requires the cluster ID in order to verify cluster communications. This information in the kernel cannot be changed after it has been initialized; therefore, you must not change a cluster ID after the cluster has been defined. Clusters must have unique IDs.

  • notify_cmd is the command to be run whenever the status changes for a node or cluster.

  • notify_addr is the address to be notified of cluster and node status changes. To specify multiple addresses, separate them with commas. CXFS will send e-mail to the addresses whenever the status changes for a node or cluster. If you do not specify an address, notification will not be sent. If you use the notify_addr command, you must specify the e-mail program (such as /usr/sbin/Mail) as the notify_cmd value.

  • The set ha_mode and set cx_mode commands should usually be set to normal. The set cx_mode command applies only to CXFS, and the set ha_mode command applies only to IRIS FailSafe.

  • net defines a set of NICs into a network. If the highest priority network (beginning with NIC priority 1) fails, the next highest will be used. All NICs within one network must be at the same priority. NICs of a given priority (such as priority 2) cannot be in two separate net networks. Although the primary network must be private, the backup network may be public.

    If you do not specify a net list, the set of priority 1 NICs are used by default as the CXFS heartbeat network and there will be no failover to any other set of NICs.

    The network parameter specifies an IP network address (such as 1.2.3.0) and the mask parameter specifies the subnet mask (such as 255.255.255.0) in decimal notation. The order in which you specify network or mask is not important.


    Note: You cannot add a NIC or a network grouping while CXFS services are active (that is, when start cx_services has been executed); doing so can lead to cluster malfunction. If services have been started, they should be stopped with stop cx_services.
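
The naming and numbering rules in the usage notes above (a logical name that does not begin with an underscore, contains no whitespace, and is at most 255 characters; a cluster ID from 1 through 255) can be checked before the cluster is defined. The following Python sketch is illustrative; the function name cluster_definition_problems is hypothetical, and uniqueness of the name and ID is not checked.

```python
# Hypothetical pre-check for the cluster name and cluster ID rules.
def cluster_definition_problems(name, cluster_id):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if name.startswith("_"):
        problems.append("name cannot begin with an underscore")
    if any(ch.isspace() for ch in name):
        problems.append("name cannot include whitespace")
    if len(name) > 255:
        problems.append("name can be at most 255 characters")
    if not 1 <= cluster_id <= 255:
        problems.append("cluster ID must be in the range 1 through 255")
    return problems
```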


The following shows the commands with prompting:

cmgr> define cluster clustername
Enter commands, you may enter "done" or "cancel" at any time to exit

Is this a FailSafe cluster <true|false> ? true|false
Is this a CXFS cluster  <true|false> ? true|false
Cluster Notify Cmd [optional] ? 
Cluster Notify Address [optional] ? 
Cluster CXFS mode <normal|experimental>[optional] use_default_of_normal
Cluster ID ? cluster_ID
No nodes in cluster clustername

No networks in cluster clustername

Add nodes to or remove nodes/networks from cluster clustername
Enter "done" when completed or "cancel" to abort

clustername ? add node node1name
clustername ? add node node2name
...
clustername ? done
Successfully defined cluster clustername

Added node <node1name> to cluster <clustername>
Added node <node2name> to cluster <clustername>

...

You should set the cluster to the default normal mode. Setting the mode to experimental turns off heartbeating in the CXFS kernel membership code so that you can debug the cluster without causing node failures. For example, this can be useful if you just want to disconnect the network for a short time (provided that there is no other cluster networking activity, which will also detect a failure even if there is no heartbeating). However, you should never use experimental mode on a production cluster and should only use it if directed to by SGI customer support. SGI does not support the use of experimental by customers.

For example:

cmgr> define cluster cxfs6-8
Enter commands, you may enter "done" or "cancel" at any time to exit

Is this a FailSafe cluster <true|false> ? false 
Is this a CXFS cluster  <true|false> ? true
Cluster Notify Cmd [optional] ? 
Cluster Notify Address [optional] ? 
Cluster CXFS mode <normal|experimental>[optional] 
Cluster ID ? 20

No nodes in cluster cxfs6-8

No networks in cluster cxfs6-8

Add nodes to or remove nodes/networks from cluster cxfs6-8
Enter "done" when completed or "cancel" to abort

cxfs6-8 ? add node cxfs6
cxfs6-8 ? add node cxfs7
cxfs6-8 ? add node cxfs8
cxfs6-8 ? done
Successfully defined cluster cxfs6-8

Added node <cxfs6> to cluster <cxfs6-8>
Added node <cxfs7> to cluster <cxfs6-8>
Added node <cxfs8> to cluster <cxfs6-8>

To do this without prompting, enter the following:

cmgr> define cluster cxfs6-8
Enter commands, you may enter "done" or "cancel" at any time to exit

cluster cxfs6-8? set is_cxfs to true
cluster cxfs6-8? set clusterid to 20
cluster cxfs6-8? add node cxfs6
cluster cxfs6-8? add node cxfs7
cluster cxfs6-8? add node cxfs8
cluster cxfs6-8? done
Successfully defined cluster cxfs6-8

Modify a Cluster with cmgr

The commands are as follows:

modify cluster clustername
    set is_failsafe to true|false
    set is_cxfs to true|false
    set clusterid to clusterID
    set notify_cmd to command
    set notify_addr to email_address
    set ha_mode to normal|experimental
    set cx_mode to normal|experimental
    add node node1name
    add node node2name
    remove node node1name
    remove node node2name
    add net network network_address mask netmask
    remove net network network_address 

These commands are the same as the define cluster commands. For more information, see “Define a Cluster with cmgr” and “Define a Cluster with the GUI” in Chapter 10.
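
The add net command takes a network address and a subnet mask in decimal notation. When planning which NICs to group, it can help to confirm that a NIC address falls within a given network. The following Python sketch uses the standard ipaddress module for that arithmetic; the function name nic_in_net is hypothetical, and the assumption that a NIC belongs to a network when its address lies inside network/mask is an illustration, not a statement of how cmgr itself matches NICs to networks.

```python
import ipaddress


def nic_in_net(nic_address, network_address, netmask):
    """Return True if the NIC address lies inside network_address/netmask."""
    network = ipaddress.ip_network("%s/%s" % (network_address, netmask))
    return ipaddress.ip_address(nic_address) in network
```

With the sample values from the usage notes for define cluster (network 1.2.3.0, mask 255.255.255.0), an address such as 1.2.3.17 is inside the network and 1.2.4.17 is not.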


Note: If you want to rename a cluster, you must delete it and then define a new cluster. If you have started CXFS services on the node, you must either reboot it or reuse the cluster ID number when renaming the cluster.

However, be aware that if you already have CXFS filesystems defined and then rename the cluster, CXFS will not be able to mount the filesystems. For more information, see “Cannot Mount Filesystems” in Chapter 15.


Convert a Cluster to CXFS or FailSafe with cmgr

To convert a cluster, use the following commands:

modify cluster clustername
  set is_failsafe to true|false
  set is_cxfs to true|false
  set clusterid to clusterID

  • cluster is the logical name of the cluster. Logical names cannot begin with an underscore (_) or include any whitespace characters, and can be at most 255 characters.

  • If you are running just CXFS, set is_cxfs to true and is_failsafe to false. If you are running a coexecution cluster, set both values to true.

  • clusterid is a unique number within your network in the range 1 through 255. The cluster ID is used by the operating system kernel to make sure that it does not accept cluster information from any other cluster that may be on the network. The kernel does not use the database for communication, so it requires the cluster ID in order to verify cluster communications. This information in the kernel cannot be changed after it has been initialized; therefore, you must not change a cluster ID after the cluster has been defined.
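The naming and numbering rules above can be checked before the values are entered in cmgr. The following Python sketch is illustrative only: the function names are hypothetical and are not part of CXFS, and rejecting an empty name is an assumption. It does not check that the cluster ID is unique within your network.

```python
def is_valid_logical_name(name: str) -> bool:
    """Apply the logical-name rules stated above: at most 255 characters,
    no leading underscore, and no whitespace characters.
    Rejecting an empty name is an assumption, not a documented rule."""
    if not name or len(name) > 255:
        return False
    if name.startswith("_"):
        return False
    return not any(ch.isspace() for ch in name)


def is_valid_cluster_id(cluster_id: int) -> bool:
    """Check that the cluster ID lies in the documented range 1 through 255."""
    return 1 <= cluster_id <= 255


if __name__ == "__main__":
    print(is_valid_logical_name("cxfs6-8"), is_valid_cluster_id(20))
```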

For example, to convert CXFS cluster cxfs6-8 so that it also applies to FailSafe, enter the following:

cmgr> modify cluster cxfs6-8
Enter commands, when finished enter either "done" or "cancel"

cxfs6-8 ? set is_failsafe to true

The cluster must support all of the functionalities (FailSafe and/or CXFS) that are turned on for its nodes; that is, if your cluster is of type CXFS, then you cannot modify a node that is part of the cluster so that it is of type FailSafe or of type CXFS and FailSafe. However, the nodes do not have to support all the functionalities of the cluster; that is, you can have a node of type CXFS in a cluster of type CXFS and FailSafe.
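The compatibility rule above amounts to a subset test: the functionalities turned on for a node must all be turned on for the cluster. A minimal Python sketch follows; the function name and the set representation are hypothetical and are used only for illustration.

```python
def node_type_allowed(cluster_types: set, node_types: set) -> bool:
    """Return True if every functionality turned on for the node
    is also turned on for the cluster (node types are a subset)."""
    return bool(node_types) and node_types <= cluster_types


if __name__ == "__main__":
    # A CXFS-only node fits in a coexecution (CXFS and FailSafe) cluster.
    print(node_type_allowed({"CXFS", "FailSafe"}, {"CXFS"}))
    # A FailSafe node does not fit in a CXFS-only cluster.
    print(node_type_allowed({"CXFS"}, {"FailSafe"}))
```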

Delete a Cluster with cmgr

To delete a cluster, use the following command:

delete cluster clustername 

However, you cannot delete a cluster that contains nodes; you must first stop CXFS services on the nodes and then redefine the cluster so that it no longer contains the nodes.

For example, in normal mode:

cmgr> modify cluster cxfs6-8
Enter commands, when finished enter either "done" or "cancel"

cxfs6-8 ? remove node cxfs6
cxfs6-8 ? remove node cxfs7
cxfs6-8 ? remove node cxfs8
cxfs6-8 ? done
Successfully modified cluster cxfs6-8

cmgr> delete cluster cxfs6-8

cmgr> show clusters

cmgr>

For example, in prompting mode:

cmgr> modify cluster cxfs6-8
Enter commands, you may enter "done" or "cancel" at any time to exit

Cluster Notify Cmd [optional] ? 
Cluster Notify Address [optional] ? 
Cluster mode <normal|experimental>[optional] ? (normal) 
Cluster ID ? (55) 

Current nodes in cluster cxfs6-8:
Node - 1: cxfs6
Node - 2: cxfs7
Node - 3: cxfs8

Add nodes to or remove nodes from cluster cxfs6-8
Enter "done" when completed or "cancel" to abort

cxfs6-8 ? remove node cxfs6
cxfs6-8 ? remove node cxfs7
cxfs6-8 ? remove node cxfs8
cxfs6-8 ? done
Successfully modified cluster cxfs6-8

cmgr> delete cluster cxfs6-8

cmgr> show clusters

cmgr>

Display a Cluster with cmgr

To display the clusters and their contents, use the following commands:

show clusters
show cluster clustername

For example, the following output shows that cluster mycluster has six nodes and two private networks, permitting network failover:

cmgr> show cluster mycluster
Cluster Name: mycluster
Cluster Is FailSafe: false
Cluster Is CXFS: true
Cluster ID: 1
Cluster CX mode: normal


Cluster mycluster has following 6 machine(s)
        nodeA
	nodeB
	nodeC
	nodeD
	nodeE
	nodeF


CXFS Failover Networks:
    network 192.168.0.0, mask 255.255.255.0
    network 134.14.54.0, mask 255.255.255.0

The multiple networks listed indicate that if the higher-priority network should fail, the next-priority network will be used. However, the order in which the networks are listed in this output is not an indication of priority. To determine the priority of the networks, you must look at the NIC priorities in the node definition.

Cluster Services Tasks with cmgr

The following tasks tell you how to start and stop CXFS services and set log levels.

Start CXFS Services with cmgr

To start CXFS services and set the configuration to automatically restart CXFS services whenever the system is rebooted, use the following command:

start cx_services [on node hostname] for cluster clustername

For example, to start CXFS services on all nodes in the cluster:

cmgr> start cx_services for cluster cxfs6-8

Stop CXFS Services with cmgr

When CXFS services are stopped on a node, filesystems are automatically unmounted from that node.

To stop CXFS services on a specified node or cluster, and prevent CXFS services from being restarted by a reboot, use the following command:

stop cx_services [on node hostname] for cluster clustername [force]


Note: This procedure is recommended only as needed for a CXFS administration node because it updates the cluster database and is therefore intrusive to other nodes. When shutting down a CXFS client-only node, do not administratively stop the CXFS services on the node. Rather, let the CXFS services stop by themselves when the client-only node is shut down.

For example:

cmgr> stop cx_services on node cxfs6 for cluster cxfs6-8

CXFS services have been deactivated on node cxfs6 (cluster cxfs6-8)

cmgr> stop cx_services for cluster cxfs6-8

After you have stopped CXFS services on a node, the node is no longer an active member of the cluster.


Caution: If you stop CXFS services, the node will be marked as INACTIVE and it will therefore not rejoin the cluster after a reboot. To allow a node to rejoin the cluster, you must restart CXFS services using cmgr or the GUI.


Set the Tiebreaker Node with cmgr

A CXFS tiebreaker node determines whether a CXFS kernel membership quorum is maintained when exactly half of the server-capable nodes can communicate with each other. There is no default CXFS tiebreaker.


Caution: If one of the server-capable nodes is the CXFS tiebreaker in a cluster with two server-capable nodes, failure of that node or stopping the CXFS services on that node will result in a cluster-wide forced shutdown. Therefore, SGI recommends that you use client-only nodes as tiebreakers so that either server could fail but the cluster would remain operational via the other server.

The reset capability or I/O fencing with switches is mandatory to ensure data integrity for all nodes. Clusters should have an odd number of server-capable nodes. If you have an even number of server-capable administration nodes, define a CXFS tiebreaker node. SGI recommends making a client-only node the tiebreaker.
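The role of the tiebreaker in an even split can be illustrated with a simplified model. The Python sketch below is not the CXFS membership algorithm. It assumes that more than half of the server-capable nodes always holds quorum and that fewer than half never does. It leaves the exactly-half case without a tiebreaker undecided rather than guessing at the real behavior. The function name and return strings are hypothetical.

```python
def membership_outcome(total_servers, reachable_servers, tiebreaker_reachable=None):
    """Simplified model of a membership decision for one group of nodes.

    total_servers        -- number of server-capable nodes in the cluster
    reachable_servers    -- server-capable nodes that can talk to each other
    tiebreaker_reachable -- True/False if a tiebreaker is defined and is
                            (or is not) reachable by this group; None if no
                            tiebreaker is defined
    """
    if 2 * reachable_servers > total_servers:
        return "quorum"        # assumption: a clear majority keeps quorum
    if 2 * reachable_servers < total_servers:
        return "no quorum"     # assumption: a clear minority loses quorum
    # Exactly half of the server-capable nodes: the tiebreaker decides.
    if tiebreaker_reachable is None:
        return "undecided (no tiebreaker defined)"
    return "quorum" if tiebreaker_reachable else "no quorum"


if __name__ == "__main__":
    # Four server-capable nodes split two and two; this side sees the tiebreaker.
    print(membership_outcome(4, 2, tiebreaker_reachable=True))
```

The model shows why an odd number of server-capable nodes avoids the tie entirely, and why an even number calls for a defined tiebreaker.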

To set the CXFS tiebreaker node, use the modify command as follows:

modify cx_parameters [on node nodename] in cluster clustername
    set tie_breaker to hostname

To unset the CXFS tiebreaker node, use the following command:

set tie_breaker to none

For example, in normal mode:

cmgr> modify cx_parameters in cluster cxfs6-8
Enter commands, when finished enter either "done" or "cancel"

cxfs6-8 ? set tie_breaker to cxfs8
cxfs6-8 ? done
Successfully modified cx_parameters

For example, in prompting mode:

cmgr> modify cx_parameters in cluster cxfs6-8

(Enter "cancel" at any time to abort)

Tie Breaker Node ? (cxfs7) cxfs8
Successfully modified cx_parameters

cmgr> show cx_parameters in cluster cxfs6-8

_CX_TIE_BREAKER=cxfs8

Set Log Configuration with cmgr

For general information about CXFS logs, see “Set Log Configuration with the GUI” in Chapter 10.

Display Log Group Definitions with cmgr

Use the following command to view the log group definitions:

show log_groups

This command shows all of the log groups currently defined, with the log group name, the logging levels, and the log files.

Use the following command to see messages logged by a specific daemon on a specific node:

show log_group LogGroupName [on node Nodename]

To exit from the message display, enter Ctrl-C.

Configure Log Groups with cmgr

You can configure a log group with the following command:

define log_group log_group on node adminhostname [in cluster clustername]
  set log_level to log_level
  add log_file log_file 
  remove log_file log_file

Usage notes:

  • log_group can be one of the following:

    clconfd
    cli
    crsd
    diags

  • log_level can have one of the following values:

    • 0 gives no logging

    • 1 logs notifications of critical errors and normal operation (these messages are also logged to the SYSLOG file)

    • 2 logs minimal notifications plus warnings

    • 5 through 7 log increasingly more detailed notifications

    • 10 through 19 log increasingly more debug information, including data structures

  • log_file


    Caution: Do not change the names of the log files. If you change the names, errors can occur.
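The log_level values listed above can be summarized as a lookup. The Python sketch below is illustrative: the function name is hypothetical, and treating values outside the listed ranges (for example, 3 or 8) as invalid is an assumption drawn from the list, not a statement about how cmgr responds.

```python
def describe_log_level(level: int) -> str:
    """Map a log_level value to the description given in the usage notes."""
    if level == 0:
        return "no logging"
    if level == 1:
        return "critical errors and normal operation (also logged to SYSLOG)"
    if level == 2:
        return "minimal notifications plus warnings"
    if 5 <= level <= 7:
        return "increasingly more detailed notifications"
    if 10 <= level <= 19:
        return "increasingly more debug information, including data structures"
    # Assumption: values not listed in the usage notes are treated as invalid.
    raise ValueError(f"log level {level} is not one of the listed values")


if __name__ == "__main__":
    for lvl in (0, 1, 2, 5, 10):
        print(lvl, "->", describe_log_level(lvl))
```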


For example, to define log group cli on node cxfs6 with a log level of 5:

cmgr> define log_group cli on node cxfs6 in cluster cxfs6-8

(Enter "cancel" at any time to abort)

Log Level ? (11) 5

CREATE LOG FILE OPTIONS

        1) Add Log File.
        2) Remove Log File.
        3) Show Current Log Files.
        4) Cancel. (Aborts command)
        5) Done. (Exits and runs command)

Enter option:5
Successfully defined log group cli

Modify Log Groups with cmgr

Use the following command to modify a log group:

modify log_group log_group_name on node hostname [in cluster clustername]

You modify a log group using the same commands you use to define a log group.

For example, to change the log level of cli to 10, enter the following:

cmgr> modify log_group cli on node cxfs6 in cluster cxfs6-8

(Enter "cancel" at any time to abort)

Log Level ? (2) 10

MODIFY LOG FILE OPTIONS

        1) Add Log File.
        2) Remove Log File.
        3) Show Current Log Files.
        4) Cancel. (Aborts command)
        5) Done. (Exits and runs command)

Enter option:5
Successfully modified log group cli

Revoke Membership of the Local Node with cmgr

To revoke CXFS kernel membership for the local node, such as before a forced CXFS shutdown, enter the following on the local node:

admin cxfs_stop

The rest of the cluster treats this command as a node failure. The rest of the cluster may then fail due to a loss of CXFS kernel membership quorum, or it may decide to reset the failed node. To avoid the reset, you can modify the node definition to disable the system controller status.

Allow Membership of the Local Node with cmgr

Allowing CXFS kernel membership for the local node permits the node to reapply for CXFS kernel membership. You must actively allow CXFS kernel membership for the local node in the following situations:

  • After a manual revocation as in “Revoke Membership of the Local Node with cmgr”.

  • When instructed to by an error message on the console or in /var/adm/SYSLOG.

  • After a kernel-triggered revocation. This situation is indicated by the following message in /var/adm/SYSLOG:

    Membership lost - withdrawing from cluster

To allow CXFS kernel membership for the local node, use the following command:

cmgr> admin cxfs_start

See also “Shutdown of the Database and CXFS” in Chapter 12.

CXFS Filesystem Tasks with cmgr

This section tells you how to define a filesystem, specify the nodes on which it may or may not be mounted (the enabled or disabled nodes), and perform mounts and unmounts.

A given filesystem can be mounted on a given node when the following things are true:

  • One of the following is true for the node:

    • The default local status is enabled and the node is not in the filesystem's list of explicitly disabled nodes

    • The default local status is disabled and the node is in the filesystem's list of explicitly enabled nodes

    See “Define a CXFS Filesystem with cmgr”.

  • The global status of the filesystem is enabled. See “Mount a CXFS Filesystem with cmgr”.
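The conditions above combine into a single test. The following Python sketch expresses that test. The function and parameter names are hypothetical, and the sketch models only the status rules listed here, not other requirements such as CXFS services being active.

```python
def can_mount(node, global_enabled, dflt_local_enabled,
              enabled_nodes=(), disabled_nodes=()):
    """Return True if the filesystem may be mounted on the given node.

    Local status: when the default local status is enabled, the node must not
    be explicitly disabled; when it is disabled, the node must be explicitly
    enabled. The global status must also be enabled.
    """
    if dflt_local_enabled:
        local_ok = node not in disabled_nodes
    else:
        local_ok = node in enabled_nodes
    return global_enabled and local_ok


if __name__ == "__main__":
    # Default local status enabled, cxfs8 explicitly disabled.
    print(can_mount("cxfs7", True, True, disabled_nodes={"cxfs8"}))
    print(can_mount("cxfs8", True, True, disabled_nodes={"cxfs8"}))
```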

Define a CXFS Filesystem with cmgr

Use the following commands to define a filesystem and the nodes on which it may be mounted:

define cxfs_filesystem logical_filesystem_name [in cluster clustername]
   set device_name to devicename
   set mount_point to mountpoint
   set mount_options to mount_options
   set force to true|false
   set dflt_local_status to enabled|disabled
   add cxfs_server admin_nodename
      set rank to 0|1|2|...
   add enabled_node nodename
   add disabled_node nodename
   remove cxfs_server admin_nodename
   remove enabled_node nodename
   remove disabled_node nodename

Usage notes:

  • Relocation is disabled by default. Recovery and relocation are supported only when using standby nodes. Therefore, you should only define multiple metadata servers for a given filesystem if you are using the standby node model. See “Relocation” in Chapter 1.

  • The potential metadata servers for any given filesystem must all run the same operating system type.

  • logical_filesystem_name can be any logical name. Logical names cannot begin with an underscore (_) or include any whitespace characters, and can be at most 255 characters.


    Note: Within the GUI, the default is to use the last portion of the device name; for example, for a device name of /dev/cxvm/d76lun0s0, the GUI will automatically supply a logical filesystem name of d76lun0s0. The GUI will accept other logical names defined with cmgr but the GUI will not allow you to modify a logical name; you must use cmgr to modify the logical name.


  • device_name is the device name of an XVM volume that will be shared among all nodes in the CXFS cluster. The name must begin with /dev/cxvm/. For more information, see XVM Volume Manager Administrator's Guide.

  • mount_point is a directory to which the specified XVM volume will be attached. This directory name must begin with a slash (/). For more information, see the mount man page.

  • mount_options are options that are passed to the mount command and are used to control access to the specified XVM volume. For a list of the available options, see the fstab man page.

  • force controls what action CXFS takes if there are processes that have open files or current directories in the filesystem(s) that are to be unmounted. If set to true, then the processes will be killed and the unmount will occur. If set to false, the processes will not be killed and the filesystem will not be unmounted. The force option is off (set to false) by default.

  • dflt_local_status defines whether the filesystem can be mounted on all unspecified nodes or cannot be mounted on any unspecified nodes. You can then use the add enabled_node or add disabled_node commands as necessary to explicitly specify the nodes that differ from the default. There are multiple combinations that can have the same result.

    For example, suppose you had a cluster with 10 nodes (cxfs1 through cxfs10). You could use the following methods:

    • If you want the filesystem to be mounted on all nodes, and want it to be mounted on any nodes that are later added to the cluster, you would specify:

      set dflt_local_status to enabled

    • If you want the filesystem to be mounted on all nodes except cxfs5, and want it to be mounted on any nodes that are later added to the cluster, you would specify:

      set dflt_local_status to enabled
      add disabled_node cxfs5

    • If you want the filesystem to be mounted on all nodes except cxfs5, and you also do not want it to be mounted on any nodes that are later added to the cluster, you would specify:

      set dflt_local_status to disabled
      add enabled_node cxfs1
      add enabled_node cxfs2
      add enabled_node cxfs3
      add enabled_node cxfs4
      add enabled_node cxfs6
      add enabled_node cxfs7
      add enabled_node cxfs8
      add enabled_node cxfs9
      add enabled_node cxfs10

    • If you want the filesystem to be mounted on cxfs5 through cxfs10 and on any future nodes, you could specify:

      set dflt_local_status to enabled
      add disabled_node cxfs1
      add disabled_node cxfs2
      add disabled_node cxfs3
      add disabled_node cxfs4

    To actually mount the filesystem on the enabled nodes, see “Mount a CXFS Filesystem with cmgr”.

  • add cxfs_server and remove cxfs_server add the specified CXFS administration node to, or remove it from, the list of potential metadata servers.
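Two of the usage notes above state prefix rules that are easy to check before a definition is entered. The Python sketch below is a hypothetical pre-check, not part of cmgr. It tests only the documented prefixes and does not verify that the XVM volume or the directory exists.

```python
def check_filesystem_definition(device_name: str, mount_point: str) -> list:
    """Return a list of problems found in a proposed filesystem definition.

    Only the two prefix rules from the usage notes are checked:
    device_name must begin with /dev/cxvm/ and mount_point must begin
    with a slash.
    """
    problems = []
    if not device_name.startswith("/dev/cxvm/"):
        problems.append("device_name must begin with /dev/cxvm/")
    if not mount_point.startswith("/"):
        problems.append("mount_point must begin with a slash (/)")
    return problems


if __name__ == "__main__":
    print(check_filesystem_definition("/dev/cxvm/d76lun0s0", "/mnts/fs1"))
    print(check_filesystem_definition("/dev/dsk/d76lun0s0", "mnts/fs1"))
```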


Note: After a filesystem has been defined in CXFS, running mkfs on it will cause errors to appear in the system log file. To avoid these errors, run mkfs before defining the filesystem in CXFS, or delete the CXFS filesystem before running mkfs. See “Delete a CXFS Filesystem with cmgr”.

The following examples show two potential metadata servers for the fs1 filesystem; if cxfs6 (the preferred server, with rank 0) is not up when the cluster starts or later fails or is removed from the cluster, then cxfs7 (rank 1) will be used. The filesystem is mounted on all nodes.


Note: Although the list of metadata servers for a given filesystem is ordered, it is impossible to predict which server will become the server during the boot-up cycle because of network latencies and other unpredictable delays.
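The rank ordering described above can be sketched as choosing the lowest-ranked potential metadata server that is currently up. The Python function below is a simplified illustration with hypothetical names. As the preceding note says, the server actually chosen during boot can differ because of timing. Breaking ties between equal ranks by node name is an assumption made only to keep the sketch deterministic.

```python
def preferred_metadata_server(ranks: dict, up_nodes: set):
    """Return the up node with the lowest rank, or None if no potential
    metadata server is up.

    ranks    -- mapping of node name to rank (0 is the most preferred)
    up_nodes -- names of the nodes that are currently up
    """
    candidates = [(rank, name) for name, rank in ranks.items() if name in up_nodes]
    if not candidates:
        return None
    # min() compares rank first, then name (the tie-break is an assumption).
    return min(candidates)[1]


if __name__ == "__main__":
    fs1_ranks = {"cxfs6": 0, "cxfs7": 1}
    print(preferred_metadata_server(fs1_ranks, {"cxfs6", "cxfs7"}))
    print(preferred_metadata_server(fs1_ranks, {"cxfs7"}))
```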

For example, in normal mode:

cmgr> define cxfs_filesystem fs1 in cluster cxfs6-8

cxfs_filesystem fs1 ? set device_name to /dev/cxvm/d76lun0s0
cxfs_filesystem fs1 ? set mount_point to /mnts/fs1
cxfs_filesystem fs1 ? set force to false
cxfs_filesystem fs1 ? add cxfs_server cxfs6 
Enter CXFS server parameters, when finished enter "done" or "cancel"

CXFS server - cxfs6 ? set rank to 0
CXFS server - cxfs6 ? done
cxfs_filesystem fs1 ? add cxfs_server cxfs7 
Enter CXFS server parameters, when finished enter "done" or "cancel"

CXFS server - cxfs7 ? set rank to 1
CXFS server - cxfs7 ? done
cxfs_filesystem fs1 ? set dflt_local_status to enabled
cxfs_filesystem fs1 ? done
Successfully defined cxfs_filesystem fs1

cmgr> define cxfs_filesystem fs2 in cluster cxfs6-8

cxfs_filesystem fs2 ? set device_name to /dev/cxvm/d76lun0s1
cxfs_filesystem fs2 ? set mount_point to /mnts/fs2
cxfs_filesystem fs2 ? set force to false
cxfs_filesystem fs2 ? add cxfs_server cxfs8 
Enter CXFS server parameters, when finished enter "done" or "cancel"

CXFS server - cxfs8 ? set rank to 0
CXFS server - cxfs8 ? done
cxfs_filesystem fs2 ? set dflt_local_status to enabled
cxfs_filesystem fs2 ? done
Successfully defined cxfs_filesystem fs2



For example, in prompting mode:

cmgr> define cxfs_filesystem fs1 in cluster cxfs6-8

(Enter "cancel" at any time to abort)

Device ? /dev/cxvm/d76lun0s0
Mount Point ? /mnts/fs1
Mount Options[optional] ? 
Use Forced Unmount ? <true|false> ? false
Default Local Status <enabled|disabled> ? (enabled) 

DEFINE CXFS FILESYSTEM OPTIONS

        0) Modify Server.
        1) Add Server.
        2) Remove Server.
        3) Add Enabled Node.
        4) Remove Enabled Node.
        5) Add Disabled Node.
        6) Remove Disabled Node.
        7) Show Current Information.
        8) Cancel. (Aborts command)
        9) Done. (Exits and runs command)

Enter option:1

No current servers

Server Node ? cxfs6
Server Rank ? 0

        0) Modify Server.
        1) Add Server.
        2) Remove Server.
        3) Add Enabled Node.
        4) Remove Enabled Node.
        5) Add Disabled Node.
        6) Remove Disabled Node.
        7) Show Current Information.
        8) Cancel. (Aborts command)
        9) Done. (Exits and runs command)

Enter option:1
Server Node ? cxfs7
Server Rank ? 1


        0) Modify Server.
        1) Add Server.
        2) Remove Server.
        3) Add Enabled Node.
        4) Remove Enabled Node.
        5) Add Disabled Node.
        6) Remove Disabled Node.
        7) Show Current Information.
        8) Cancel. (Aborts command)
        9) Done. (Exits and runs command)

Enter option:9
Successfully defined cxfs_filesystem fs1

cmgr> define cxfs_filesystem fs2 in cluster cxfs6-8

(Enter "cancel" at any time to abort)

Device ? /dev/cxvm/d76lun0s1
Mount Point ? /mnts/fs2
Mount Options[optional] ? 
Use Forced Unmount ? <true|false> ? false
Default Local Status <enabled|disabled> ? (enabled)

DEFINE CXFS FILESYSTEM OPTIONS

        0) Modify Server.
        1) Add Server.
        2) Remove Server.
        3) Add Enabled Node.
        4) Remove Enabled Node.
        5) Add Disabled Node.
        6) Remove Disabled Node.
        7) Show Current Information.
        8) Cancel. (Aborts command)
        9) Done. (Exits and runs command)

Enter option:1

Server Node ? cxfs8
Server Rank ? 0

        0) Modify Server.
        1) Add Server.
        2) Remove Server.
        3) Add Enabled Node.
        4) Remove Enabled Node.
        5) Add Disabled Node.
        6) Remove Disabled Node.
        7) Show Current Information.
        8) Cancel. (Aborts command)
        9) Done. (Exits and runs command)

Enter option:9
Successfully defined cxfs_filesystem fs2

Mount a CXFS Filesystem with cmgr

To mount a filesystem on the enabled nodes, enter the following:

admin cxfs_mount cxfs_filesystem logical_filesystem_name [on node nodename] [in cluster clustername]

This command enables the global status for a filesystem; if you specify the nodename, it enables the local status. (The global status is only affected if a node name is not specified.) For a filesystem to mount on a given node, both global and local status must be enabled; see “CXFS Filesystem Tasks with cmgr”.

Nodes must first be enabled by using the define cxfs_filesystem and modify cxfs_filesystem commands; see “Define a CXFS Filesystem with cmgr”, and “Modify a CXFS Filesystem with cmgr”.

For example, to activate the fs1 filesystem by setting the global status to enabled, enter the following:

cmgr> admin cxfs_mount cxfs_filesystem fs1 in cluster cxfs6-8

The filesystem will then be mounted on all the nodes that have a local status of enabled for this filesystem.

To change the local status to enabled, enter the following:

cmgr> admin cxfs_mount cxfs_filesystem fs1 on node cxfs7 in cluster cxfs6-8

If the filesystem's global status is disabled, nothing changes. If the filesystem's global status is enabled, the node will mount the filesystem as the result of the change of its local status.


Note: If CXFS services are not active, mounting a filesystem will not completely succeed. The filesystem will be marked as ready to be mounted but the filesystem will not actually be mounted until you have started CXFS services. For more information, see “Start CXFS Services with cmgr”.


Unmount a CXFS Filesystem with cmgr

To unmount a filesystem, enter the following:

admin cxfs_unmount cxfs_filesystem filesystemname [on node nodename] [in cluster clustername]

Unlike the modify cxfs_filesystem command, this command can be run on an active filesystem.

For example, to deactivate the fs1 filesystem by setting the global status to disabled, enter the following:

cmgr> admin cxfs_unmount cxfs_filesystem fs1 in cluster cxfs6-8

The filesystem will then be unmounted on all the nodes that have a local status of enabled for this filesystem.

To change the local status to disabled, enter the following:

cmgr> admin cxfs_unmount cxfs_filesystem fs1 on node cxfs7 in cluster cxfs6-8

If the filesystem's global status is disabled, nothing changes. If the filesystem's global status is enabled, the node will unmount the filesystem as the result of the change of its local status.
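The interaction between the mount and unmount commands and the global and local statuses can be modeled with a small state object. The Python sketch below is a hypothetical model, not CXFS code: the class and method names are invented for illustration, every node starts with a local status of enabled, and CXFS services are assumed to be active.

```python
class FilesystemStatus:
    """Toy model of the global and per-node local status of one filesystem."""

    def __init__(self, nodes):
        self.global_enabled = False                    # assumed starting state
        self.local_enabled = {n: True for n in nodes}  # assumed starting state

    def cxfs_mount(self, node=None):
        """Without a node, enable the global status; with a node, enable
        only that node's local status."""
        if node is None:
            self.global_enabled = True
        else:
            self.local_enabled[node] = True

    def cxfs_unmount(self, node=None):
        """Without a node, disable the global status; with a node, disable
        only that node's local status."""
        if node is None:
            self.global_enabled = False
        else:
            self.local_enabled[node] = False

    def mounted_on(self):
        """Nodes where both the global and the local status are enabled."""
        if not self.global_enabled:
            return set()
        return {n for n, ok in self.local_enabled.items() if ok}


if __name__ == "__main__":
    fs1 = FilesystemStatus(["cxfs6", "cxfs7", "cxfs8"])
    fs1.cxfs_unmount("cxfs7")   # local change while global is disabled
    print(fs1.mounted_on())     # nothing is mounted yet
    fs1.cxfs_mount()            # enable the global status
    print(sorted(fs1.mounted_on()))
```

In the model, a local status change while the global status is disabled has no visible effect, which matches the "nothing changes" behavior described above.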

Modify a CXFS Filesystem with cmgr


Note: You cannot modify a mounted filesystem.

Use the following commands to modify a filesystem:

modify cxfs_filesystem logical_filesystem_name [in cluster clustername]
   set device_name to devicename
   set mount_point to mountpoint
   set mount_options to options
   set force to true|false
   set dflt_local_status to enabled|disabled
   add cxfs_server servername
     set rank to 0|1|2|...
   modify cxfs_server servername
     set rank to 0|1|2|...
   add enabled_node nodename
   add disabled_node nodename
   remove cxfs_server nodename
   remove enabled_node nodename
   remove disabled_node nodename

These are the same commands used to define a filesystem; for more information, see “Define a CXFS Filesystem with cmgr”.

For example, in normal mode:

cmgr> show cxfs_filesystem fs1 in cluster cxfs6-8

Name: fs1
Device: /dev/cxvm/d76lun0s0
Mount Point: /mnts/fs1
Forced Unmount: false
Global Status: disabled
Default Local Status: enabled

Server Name: cxfs6
       Rank: 0
Server Name: cxfs7
       Rank: 1
Disabled Client: cxfs8

cmgr> modify cxfs_filesystem fs1 in cluster cxfs6-8
Enter commands, when finished enter either "done" or "cancel"

cxfs_filesystem fs1 ? modify cxfs_server cxfs6
Enter CXFS server parameters, when finished enter "done" or "cancel"

Current CXFS server cxfs6 parameters:
        rank : 0 
CXFS server - cxfs6 ? set rank to 2
CXFS server - cxfs6 ? done
cxfs_filesystem fs1 ? done

Successfully modified cxfs_filesystem fs1
cmgr> show cxfs_filesystem fs1 in cluster cxfs6-8

Name: fs1
Device: /dev/cxvm/d76lun0s0
Mount Point: /mnts/fs1
Forced Unmount: false
Global Status: disabled
Default Local Status: enabled

Server Name: cxfs6
       Rank: 2
Server Name: cxfs7
       Rank: 1
Disabled Client: cxfs8

In prompting mode:

cmgr> show cxfs_filesystem fs1 in cluster cxfs6-8

Name: fs1
Device: /dev/cxvm/d76lun0s0
Mount Point: /mnts/fs1
Forced Unmount: false
Global Status: disabled
Default Local Status: enabled

Server Name: cxfs6
       Rank: 0
Server Name: cxfs7
       Rank: 1
Disabled Client: cxfs8

cmgr> modify cxfs_filesystem fs1 in cluster cxfs6-8

(Enter "cancel" at any time to abort)

Device ? (/dev/cxvm/d76lun0s0) 
Mount Point ? (/mnts/fs1) 
Mount Options[optional] ? 
Use Forced Unmount ? <true|false>  ? (false) 
Default Local Status <enabled|disabled> ? (enabled) 

MODIFY CXFS FILESYSTEM OPTIONS

        0) Modify Server.
        1) Add Server.
        2) Remove Server.
        3) Add Enabled Node.
        4) Remove Enabled Node.
        5) Add Disabled Node.
        6) Remove Disabled Node.
        7) Show Current Information.
        8) Cancel. (Aborts command)
        9) Done. (Exits and runs command)

Enter option:0

Current servers:
CXFS Server 1 - Rank: 0         Node: cxfs6
CXFS Server 2 - Rank: 1         Node: cxfs7

Server Node ? cxfs6
Server Rank ? (0) 2

        0) Modify Server.
        1) Add Server.
        2) Remove Server.
        3) Add Enabled Node.
        4) Remove Enabled Node.
        5) Add Disabled Node.
        6) Remove Disabled Node.
        7) Show Current Information.
        8) Cancel. (Aborts command)
        9) Done. (Exits and runs command)

Enter option:7

Current settings for filesystem (fs1)

CXFS servers:
        Rank 2          Node cxfs6
        Rank 1          Node cxfs7

Default local status: enabled

No explicitly enabled clients

Explicitly disabled clients:
        Disabled Node: cxfs8

        0) Modify Server.
        1) Add Server.
        2) Remove Server.
        3) Add Enabled Node.
        4) Remove Enabled Node.
        5) Add Disabled Node.
        6) Remove Disabled Node.
        7) Show Current Information.
        8) Cancel. (Aborts command)
        9) Done. (Exits and runs command)

Enter option:9
Successfully modified cxfs_filesystem fs1

Relocate the Metadata Server for a Filesystem with cmgr

If relocation is explicitly enabled in the kernel with the cxfs_relocation_ok systune (see “Relocation” in Chapter 1), you can relocate a metadata server to another node. The filesystem must be mounted on the system that is running cmgr. Use the following command:

admin cxfs_relocate cxfs_filesystem filesystem_name to node nodename [in cluster clustername]


Note: This function is only available on a live system.

To relocate the metadata server from cxfs6 to cxfs7 for fs1 in cluster cxfs6-8, enter the following:

cmgr> admin cxfs_relocate cxfs_filesystem fs1 to node cxfs7 in cluster cxfs6-8

CXFS kernel membership is not affected by relocation. However, users may experience a degradation in filesystem performance while the metadata server is relocating.

For more details, see “Modify a CXFS Filesystem with cmgr”.

Delete a CXFS Filesystem with cmgr

Use the following command to delete a filesystem:

delete cxfs_filesystem filesystemname [in cluster clustername]

For example:

cmgr> delete cxfs_filesystem fs2 in cluster cxfs6-8

Switches and I/O Fencing Tasks with cmgr

The following tasks let you configure switches and I/O fencing. For general information, see “I/O Fencing” in Chapter 2.


Note: Nodes without system controllers require I/O fencing to protect data integrity. A switch is mandatory to support I/O fencing; therefore, multiOS CXFS clusters require a switch. See the release notes for supported switches.


Define a Switch with cmgr

This section describes how to use the cmgr command to define a new Brocade switch to support I/O fencing in a cluster.


Note: To define a switch other than a Brocade switch, such as a QLogic switch, you must use the GUI or the cxfs_admin or hafence(1M) commands. (You cannot use the cmgr command to completely define a switch other than Brocade.) See “Create a Switch with cxfs_admin” in Chapter 11 and “Switch Manipulation Using hafence” in Chapter 12.

To define a new Brocade switch, use the following command:

define switch switch_hostname username username password password [mask mask]

Usage notes:

  • switch_hostname specifies the hostname of the Fibre Channel switch; this is used to determine the IP address of the switch.

  • username specifies the user name to use when sending a telnet message to the switch.

  • password specifies the password for the specified username.

  • mask specifies one of the following:

    • A list of ports in the switch that will never be fenced. The list has the following form, beginning with the # symbol and separating each port number with a comma:

      #port,port,port...

      Each port is a decimal integer in the range 0 through 1023. For example, the following indicates that port numbers 2, 4, 5, 6, 7, and 23 will never be fenced:

      #2,4,5,6,7,23


      Note: For the bladed Brocade 48000 switch (where the port number is not unique), the value you should use for mask is the Index value that is displayed by the switchShow command. For example, the switchShow output below indicates that you would use a mask value of #16 for port 0 in slot 2:
      brocade48000:admin> switchShow
      Index Slot Port Address Media Speed State     Proto
      ===================================================
        0    1    0   010000   id    N4   Online    F-Port  10:00:00:00:c9:5f:9b:ea
        1    1    1   010100   id    N4   Online    F-Port  10:00:00:00:c9:5f:ab:d9
      ....
      142    1   30   018e00   id    N4   Online    F-Port  50:06:0e:80:04:5c:0b:46
      143    1   31   018f00   id    N4   Online    F-Port  50:06:0e:80:04:5c:0b:66
       16    2    0   011000   id    N4   Online    F-Port  10:00:00:00:c9:5f:a1:f5
       17    2    1   011100   id    N4   Online    F-Port  10:00:00:00:c9:5f:a1:72
      ...



    • A hexadecimal string that represents ports in the switch that will never be fenced. Ports are numbered from 0. If a given bit has a binary value of 0, the port that corresponds to that bit is eligible for fencing operations; if 1, then the port that corresponds to that bit will always be excluded from any fencing operations. For an example, see Figure 10-5.

    CXFS administration nodes automatically discover the available HBAs and, when fencing is triggered, fence off all of the Fibre Channel HBAs when the Fence or FenceReset fail action is selected. However, masked HBAs will not be fenced. Masking allows you to prevent the fencing of devices that are attached to the SAN but are not shared with the cluster, to ensure that they remain available regardless of CXFS status. You would want to mask HBAs used for access to tape storage, or HBAs that are only ever used to access local (nonclustered) devices.
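For the bladed-switch case described in the note above, the Index value for a given slot and port can be picked out of saved switchShow output. The Python sketch below is a hypothetical helper that assumes the first three columns are Index, Slot, and Port, as in the sample shown earlier. Column layout may differ on other switch models or firmware versions.

```python
SAMPLE = """\
Index Slot Port Address Media Speed State     Proto
===================================================
  0    1    0   010000   id    N4   Online    F-Port  10:00:00:00:c9:5f:9b:ea
 16    2    0   011000   id    N4   Online    F-Port  10:00:00:00:c9:5f:a1:f5
 17    2    1   011100   id    N4   Online    F-Port  10:00:00:00:c9:5f:a1:72
"""


def index_for_port(switchshow_output: str, slot: int, port: int):
    """Return the Index value for the given slot and port, or None.

    Assumes each data row starts with three integers: Index, Slot, Port.
    Header, separator, and other rows are skipped.
    """
    for line in switchshow_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and all(f.isdigit() for f in fields[:3]):
            index, row_slot, row_port = (int(f) for f in fields[:3])
            if (row_slot, row_port) == (slot, port):
                return index
    return None


if __name__ == "__main__":
    idx = index_for_port(SAMPLE, slot=2, port=0)
    print(f"#{idx}")  # mask value in port-list form for slot 2, port 0
```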

For example, using the direct port-number specification method:

cmgr> define switch ptg-brocade username admin password password mask #2,4,5,6,7,65

Or, using the hexadecimal bitmask method:

cmgr> define switch ptg-brocade username admin password password mask 200000000000000F4
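The two mask forms can be derived from the same list of ports. The Python sketch below assumes that bit N of the hexadecimal mask, counting from the least significant bit, corresponds to port N; confirm that assumption against Figure 10-5 before relying on it. The function names are hypothetical.

```python
def ports_to_hex_mask(ports) -> str:
    """Build the hexadecimal mask string for ports that must never be fenced.

    Assumption: bit N (least significant bit = 0) represents port N.
    """
    mask = 0
    for port in ports:
        if not 0 <= port <= 1023:
            raise ValueError(f"port {port} is outside the range 0 through 1023")
        mask |= 1 << port
    return format(mask, "X")


def ports_to_port_list(ports) -> str:
    """Build the port-list form: a # symbol followed by comma-separated ports."""
    return "#" + ",".join(str(p) for p in sorted(ports))


if __name__ == "__main__":
    never_fenced = [2, 4, 5, 6, 7, 23]
    print(ports_to_port_list(never_fenced))
    print(ports_to_hex_mask(never_fenced))
```

Under that assumption, ports 2, 4, 5, 6, and 7 give F4, and adding port 65 gives a 17-digit value that begins with 2 and ends with F4.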

Modify a Switch Definition with cmgr

To modify the user name, password, or mask for a Brocade switch, use the following command:

modify switch switch_hostname username username password password [mask mask]

The arguments are the same as for “Define a Switch with cmgr”.


Note: To modify the definition of another type of switch, such as QLogic, you must use the GUI, hafence(1M), or cxfs_admin(1M) commands. See “Switch Manipulation Using hafence” in Chapter 12.

For example, to change the mask for switch ptg-brocade from A4 to 0 (which means that all of the ports on the switch are eligible for fencing), enter the following:

cmgr> modify switch ptg-brocade username admin password password mask 0

Raise the I/O Fence for a Node with cmgr

Raising an I/O fence isolates the node from the SAN; CXFS sends a message via the telnet protocol to the switch and disables the port. After the node is isolated, it cannot corrupt data in the shared CXFS filesystem. Use the following command:

admin fence raise [node nodename]

nodename is the name of the node to be isolated.

For example, to isolate the default node, enter the following:

cmgr> admin fence raise

To isolate node Node3, enter the following:

cmgr> admin fence raise node Node3

Lower the I/O Fence for a Node with cmgr

To lower the I/O fence for a given node in order to reenable the port, allowing the node to connect to the SAN and access the shared CXFS filesystem, use the following command:

admin fence lower [node nodename]

nodename is the name of the node to be reconnected.

For example, to provide access for the default node, enter the following:

cmgr> admin fence lower

To provide access for node Node3, enter the following:

cmgr> admin fence lower node Node3

Update Switch Port Information with cmgr

To update the mappings in the cluster database between the host bus adapters (HBAs) and switch ports, use the following command:

admin fence update

You should run this command if you reconfigure any switch or add ports.

Delete a Switch Definition with cmgr

To delete a switch, use the following command:

delete switch switch_hostname

switch_hostname is the hostname of the Fibre Channel switch; this is used to determine the IP address of the switch.

For example:

cmgr> delete switch ptg-brocade
Successfully updated switch config.

Show Switches with cmgr

To display the switches in the system, use the following command:

show switches

To show the definition of a given switch, use the following command, where hostname is the hostname of the switch:

show switch hostname

For example:

cmgr> show switch ptg-brocade
  Switch[0]
      *Hostname ptg-brocade Username admin Password password Mask 0
      Vendor BROCADE Number of ports 8
            0 0000000000000000 Reset 
            1 210000e08b0102c6 Reset 
            2 210000e08b01fec5 Reset 
            3 210000e08b019dc5 Reset 
            4 210000e08b0113ce Reset 
            5 210000e08b027795 Reset thump
            6 210000e08b019ef0 Reset 
            7 210000e08b022242 Reset 

Query Switch Status with cmgr

To query the status of each port on the switch, use the following command:

admin fence query

For example:

cmgr> admin fence query
  Switch[0] "brocade04" has 16 ports
    Port 4 type=FABRIC status=enabled  hba=210000e08b0042d8 on host o200c
    Port 5 type=FABRIC status=enabled  hba=210000e08b00908e on host cxfs30
    Port 9 type=FABRIC status=enabled  hba=2000000173002d3e on host cxfssun3

For a more verbose display (which shows all ports on the switch, rather than only those attached to nodes in the default cluster), use the following command:

admin fence query verbose

For example:

cmgr> admin fence query verbose
  Switch[0] "brocade04" has 16 ports
    Port 0 type=FABRIC status=enabled  hba=2000000173003b5f on host UNKNOWN
    Port 1 type=FABRIC status=enabled  hba=2000000173003adf on host UNKNOWN
    Port 2 type=FABRIC status=enabled  hba=210000e08b023649 on host UNKNOWN
    Port 3 type=FABRIC status=enabled  hba=210000e08b021249 on host UNKNOWN
    Port 4 type=FABRIC status=enabled  hba=210000e08b0042d8 on host o200c
    Port 5 type=FABRIC status=enabled  hba=210000e08b00908e on host cxfs30
    Port 6 type=FABRIC status=enabled  hba=2000000173002d2a on host UNKNOWN
    Port 7 type=FABRIC status=enabled  hba=2000000173003376 on host UNKNOWN
    Port 8 type=FABRIC status=enabled  hba=2000000173002c0b on host UNKNOWN
    Port 9 type=FABRIC status=enabled  hba=2000000173002d3e on host cxfssun3
    Port 10 type=FABRIC status=enabled  hba=2000000173003430 on host UNKNOWN
    Port 11 type=FABRIC status=enabled  hba=200900a0b80c13c9 on host UNKNOWN
    Port 12 type=FABRIC status=disabled hba=0000000000000000 on host UNKNOWN
    Port 13 type=FABRIC status=enabled  hba=200d00a0b80c2476 on host UNKNOWN
    Port 14 type=FABRIC status=enabled  hba=1000006069201e5b on host UNKNOWN
    Port 15 type=FABRIC status=enabled  hba=1000006069201e5b on host UNKNOWN
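If you capture the query output to a file, it can be summarized with a short script. The following Python sketch is an illustration only: it assumes that each port line has exactly the layout shown in the examples above, and the function name parse_fence_query is hypothetical. It extracts one record per port and lists the disabled ports.

```python
import re

# Pattern for lines such as:
#   Port 4 type=FABRIC status=enabled  hba=210000e08b0042d8 on host o200c
PORT_LINE = re.compile(
    r"Port\s+(?P<port>\d+)\s+type=(?P<type>\S+)\s+status=(?P<status>\S+)\s+"
    r"hba=(?P<hba>[0-9a-fA-F]+)\s+on host\s+(?P<host>\S+)"
)


def parse_fence_query(text):
    """Return a list of dicts, one per port line found in the text."""
    rows = []
    for line in text.splitlines():
        match = PORT_LINE.search(line)
        if match:
            row = match.groupdict()
            row["port"] = int(row["port"])
            rows.append(row)
    return rows


sample = """\
  Switch[0] "brocade04" has 16 ports
    Port 4 type=FABRIC status=enabled  hba=210000e08b0042d8 on host o200c
    Port 12 type=FABRIC status=disabled hba=0000000000000000 on host UNKNOWN
"""

rows = parse_fence_query(sample)
disabled = [row["port"] for row in rows if row["status"] == "disabled"]
print(disabled)
```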

Script Example

The following script defines a three-node cluster of type CXFS. The nodes are of type CXFS.


Note: This example only defines one network interface. The hostname is used here for simplicity; however, you may wish to use the IP address instead to avoid confusion. This example does not address the system controller definitions.


#!/usr/cluster/bin/cmgr -if
#
#Script to define a three-node cluster


define node cxfs6
        set hostname to cxfs6
        set is_cxfs to true
        set operating_system to irix
        set node_function to server_admin
        add nic cxfs6
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
done

define node cxfs7
        set hostname to cxfs7
        set is_cxfs to true
        set operating_system to irix
        set node_function to server_admin
        add nic cxfs7
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
done

define node cxfs8
        set hostname to cxfs8
        set is_cxfs to true
        set operating_system to irix
        set node_function to server_admin
        add nic cxfs8
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
done

define cluster cxfs6-8
        set is_cxfs to true
        set is_failsafe to true
        set clusterid to 20
        add node cxfs6
        add node cxfs7
        add node cxfs8
        done
quit

After running this script, you would see the following output:

Successfully defined node cxfs6

Successfully defined node cxfs7

Successfully defined node cxfs8

Successfully defined cluster cxfs6-8

The following script defines two filesystems; fs1 is mounted on all but node cxfs8, and fs2 is mounted on all nodes:

#!/usr/cluster/bin/cmgr -if
# Script to define two filesystems
# Define fs1, do not mount on cxfs8
define cxfs_filesystem fs1 in cluster cxfs6-8
set device_name to /dev/cxvm/d76lun0s0
set mount_point to /mnts/fs1
set force to false
add cxfs_server cxfs6
  set rank to 0
  done
add cxfs_server cxfs7
  set rank to 1
  done
set dflt_local_status to enabled
add disabled_node cxfs8
done
#
# Define fs2, mount everywhere
define cxfs_filesystem fs2 in cluster cxfs6-8
set device_name to /dev/cxvm/d76lun0s1
set mount_point to /mnts/fs2
set force to false
add cxfs_server cxfs8
  set rank to 0
  done
set dflt_local_status to enabled
done

Creating a cmgr Script Automatically

After you have configured the cluster database, you can use the build_cmgr_script command to automatically create a cmgr script based on the contents of the cluster database. The generated script will contain the following:

  • Node definitions

  • Cluster definition

  • Switch definitions

  • CXFS filesystem definitions

  • Parameter settings

  • Any changes made using either the cmgr command or the GUI

  • FailSafe information (in a coexecution cluster only)

As needed, you can then use the generated script to recreate the cluster database after performing a cdbreinit.


Note: You must execute the generated script on the first node that is listed in the script. If you want to execute the generated script on a different node, you must modify the script so that the node is the first one listed.

By default, the generated script is named:

/var/cluster/ha/tmp/cmgr_create_cluster_clustername_processID

You can specify an alternative pathname by using the -o option:

build_cmgr_script [-o script_pathname]

For more details, see the build_cmgr_script man page.

For example:

# /var/cluster/cmgr-scripts/build_cmgr_script -o /tmp/newcdb
Building cmgr script for cluster clusterA ...
build_cmgr_script: Generated cmgr script is /tmp/newcdb

The example script file contents are as follows; note that because nodeE is the first node defined, you must execute the script on nodeE:

#!/usr/cluster/bin/cmgr -f

# Node nodeE definition
define node nodeE
        set hostname to nodeE.example.com
        set operating_system to IRIX
        set is_failsafe to false
        set is_cxfs to true
        set node_function to server_admin
        set nodeid to 5208
        set reset_type to powerCycle
        add nic nodeE
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
done

# Node nodeD definition
define node nodeD
        set hostname to nodeD.example.com
        set operating_system to IRIX
        set is_failsafe to false
        set is_cxfs to true
        set node_function to server_admin
        set nodeid to 5181
        set reset_type to powerCycle
        add nic nodeD
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
done

# Node nodeF definition
define node nodeF
        set hostname to nodeF.example.com
        set operating_system to IRIX
        set is_failsafe to false
        set is_cxfs to true
        set node_function to server_admin
        set nodeid to 5401
        set reset_type to powerCycle
        add nic nodeF
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
done

# Define cluster and add nodes to the cluster
define cluster clusterA
        set is_failsafe to false
        set is_cxfs to true
        set cx_mode to normal
        set clusterid to 35
done

modify cluster clusterA
        add node nodeD
        add node nodeF
        add node nodeE
done

set cluster clusterA

define cxfs_filesystem fs1
        set device_name to /dev/cxvm/fs1
        set mount_point to /fs1
        set force to false
        set dflt_local_status to enabled
        add cxfs_server nodeE
                set rank to 1
        done
        add cxfs_server nodeD
                set rank to 2
        done
        add cxfs_server nodeF
                set rank to 0
        done
done

define cxfs_filesystem fs2
        set device_name to /dev/cxvm/fs2
        set mount_point to /fs2
        set force to false
        set dflt_local_status to enabled
        add cxfs_server nodeE
                set rank to 1
        done
        add cxfs_server nodeD
                set rank to 2
        done
        add cxfs_server nodeF
                set rank to 0
        done
done

# Setting CXFS parameters
modify cx_parameters
        set tie_breaker to none
done

quit
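Because the generated script must be run on the first node that it defines, it can be useful to check which node that is before running the script. The following Python sketch is a hypothetical helper, not a CXFS tool; it assumes only that node definitions begin with a line of the form define node name, as in the script above.

```python
def first_defined_node(script_text):
    """Return the name in the first 'define node NAME' line, or None."""
    for line in script_text.splitlines():
        words = line.split()
        # Comment lines such as "# Node nodeE definition" start with "#"
        # and therefore do not match.
        if len(words) >= 3 and words[0] == "define" and words[1] == "node":
            return words[2]
    return None


sample = """\
#!/usr/cluster/bin/cmgr -f

# Node nodeE definition
define node nodeE
        set hostname to nodeE.example.com
done

# Node nodeD definition
define node nodeD
        set hostname to nodeD.example.com
done
"""

print(first_defined_node(sample))
```

For the sample text, the helper reports nodeE, which matches the node on which the example script must be executed.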

Troubleshooting cmgr

You should only use cmgr when you are logged in as root.

The following message may appear in /var/cluster/ha/log/cli_hostname if the underlying command line interface (CLI) was invoked by a login other than root:

CI_IPCERR_AGAIN, ipcclnt_connect(): file /var/cluster/ha/comm/clconfd-ipc_cxfs0 lock failed - Permission denied
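A wrapper script can guard against this situation by checking the effective user ID before it starts cmgr. The following Python sketch is an assumption about how such a guard might look; the function names are hypothetical and the sketch only reports the result rather than launching cmgr.

```python
import os

CMGR = "/usr/cluster/bin/cmgr"  # path given in "cmgr Overview"


def running_as_root():
    """Return True when the effective user ID is 0 (root)."""
    return os.geteuid() == 0


def check_before_cmgr():
    """Return a message saying whether it is appropriate to start cmgr."""
    if not running_as_root():
        return f"refusing to start {CMGR}: log in as root first"
    return f"ok to start {CMGR}"


print(check_before_cmgr())
```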

Additional cmgr Examples

Example of Normal CXFS Shutdown Using cmgr

To perform a normal CXFS shutdown, enter a cmgr command such as the following:

cmgr> stop cx_services on node nodename for cluster clustername

This action deactivates CXFS services on one node, forming a new CXFS kernel membership after deactivating the node. If you want to stop CXFS services on multiple nodes, you must enter this command multiple times or perform the task using the GUI.

After you stop CXFS services on a node, the node is marked as inactive and is no longer used when calculating the CXFS kernel membership.
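Because the command must be repeated for each node, you may prefer to generate the command lines rather than type them. The following Python sketch is a hypothetical helper; the node and cluster names are placeholders, and the only cmgr syntax it uses is the stop cx_services form shown above.

```python
def stop_commands(nodes, cluster):
    """Return one 'stop cx_services' command line per node."""
    return [
        f"stop cx_services on node {node} for cluster {cluster}"
        for node in nodes
    ]


# Placeholder names; substitute the nodes and cluster of your own site.
for command in stop_commands(["node1", "node2"], "mycluster"):
    print(command)
```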

Example of Forced CXFS Shutdown Using cmgr

To perform an administrative stop, enter the following cmgr command to revoke the CXFS kernel membership of the local node:

cmgr> admin cxfs_stop

This action can also be triggered automatically by the kernel after a loss of CXFS kernel membership quorum.

Example of Rejoining the Cluster after Stopping CXFS Services Using cmgr

The node will not rejoin the cluster after a reboot. The node will rejoin the cluster only when CXFS services are explicitly reactivated with the CXFS GUI (see “Start CXFS Services with the GUI” in Chapter 10) or the following command:

cmgr> start cx_services on node nodename for cluster clustername

In cxfs_admin, you can disable individual nodes with the disable command.

Example of Rejoining the Cluster after a Forced CXFS Shutdown Using cmgr

After a forced CXFS shutdown, the local node will not resume CXFS kernel membership until the node is rebooted or until you explicitly allow CXFS kernel membership for the local node, for example, by entering the following cmgr command:

cmgr> admin cxfs_start

Example of Configuring Private Network Failover Using cmgr

This section provides an example of modifying a cluster to provide private network failover by using the cmgr command.

Suppose your cluster has the following configuration:

irix# cxfs-config
Global:
    cluster: mycluster (id 1)
    cluster state: enabled
    tiebreaker: yellow

Networks:
    
Machines:
...
    node red: node 55    cell 4  enabled  IRIX    server_admin
        hostname: red.example.com
        fail policy: Fence, Shutdown
        nic 0: address: 192.168.0.1 priority: 1 

    node yellow: node 2     cell 3  enabled  IRIX    server_admin
        hostname: yellow.example.com
        fail policy: Fence, Shutdown
        nic 0: address: 192.168.0.2 priority: 1 

To change the configuration to support private network failover, you would do the following:

  1. Ensure that CXFS services are not active.


    Note: You cannot add a NIC or a network grouping while CXFS services are active (that is, when start cx_services has been executed); doing so can lead to cluster malfunction.

    If services have been started, stop them as follows:

    [root@linux root]# cmgr -p
    Welcome to SGI Cluster Manager Command-Line Interface
    
    cmgr> stop cx_services for cluster mycluster
    
    CXFS services have been deactivated in cluster mycluster

  2. Add another set of NICs to support a second CXFS network. (The second network will be used as the failover network and can be the public network and thus does not have to be a second CXFS private network.) For example:

    cmgr> modify node red
    Enter commands, you may enter "done" or "cancel" at any time to exit
    
    Hostname[optional] ? (red.example.com)
    Is this a FailSafe node <true|false> ? (false)
    Is this a CXFS node <true|false> ? (true)
    Partition ID[optional] ? (0)
    Do you wish to modify failure hierarchy[y/n]:n
    Reset type <powerCycle|reset|nmi> ? (powerCycle)
    Do you wish to modify system controller info[y/n]:n
    Number of Network Interfaces ? (1) 2
    NIC 1 - IP Address ? (192.168.0.1)
    NIC 1 - Heartbeat HB (use network for heartbeats) <true|false> ? (true)
    NIC 1 - (use network for control messages) <true|false> ? (true)
    NIC 1 - Priority <1,2,...> ? (1)
    NIC 2 - IP Address ? 192.168.1.1
    NIC 2 - Heartbeat HB (use network for heartbeats) <true|false> ? true
    NIC 2 - (use network for control messages) <true|false> ? true
    NIC 2 - Priority <1,2,...> ? 2
    
    Successfully modified node red
    
    cmgr> modify node yellow
    Enter commands, you may enter "done" or "cancel" at any time to exit
    
    Hostname[optional] ? (yellow.example.com) 
    Is this a FailSafe node <true|false> ? (false) 
    Is this a CXFS node <true|false> ? (true) 
    Partition ID[optional] ? (0) 
    Do you wish to modify failure hierarchy[y/n]:n
    Reset type <powerCycle|reset|nmi> ? (powerCycle) 
    Do you wish to modify system controller info[y/n]:n
    Number of Network Interfaces ? (1) 2
    NIC 1 - IP Address ? (192.168.0.2)
    NIC 1 - Heartbeat HB (use network for heartbeats) <true|false> ? (true)
    NIC 1 - (use network for control messages) <true|false> ? (true)
    NIC 1 - Priority <1,2,...> ? (1)
    NIC 2 - IP Address ? 192.168.1.2
    NIC 2 - Heartbeat HB (use network for heartbeats) <true|false> ? true
    NIC 2 - (use network for control messages) <true|false> ? true
    NIC 2 - Priority <1,2,...> ? 2
    
    Successfully modified node yellow

    Repeat this process for each node. You can use the cxfs-config command to display the defined NICs. For example:

    irix# cxfs-config
    ...
        node red: node 55    cell 4  enabled  IRIX    server_admin
            hostname: red.example.com
            fail policy: Fence, Shutdown
            nic 0: address: 192.168.0.1 priority: 1 
            nic 1: address: 192.168.1.1 priority: 2 
    
        node yellow: node 2     cell 3  enabled  IRIX    server_admin
            hostname: yellow.example.com
            fail policy: Fence, Shutdown
            nic 0: address: 192.168.0.2 priority: 1 
            nic 1: address: 192.168.1.2 priority: 2 

  3. Configure the NICs into networks. (CXFS will ignore NICs other than priority 1 unless you configure the NICs into networks.)

    1. Configure the primary network:

      cmgr> modify cluster mycluster
      Enter commands, you may enter "done" or "cancel" at any time to exit
      
      Is this a FailSafe cluster <true|false> ? (false) 
      Is this a CXFS cluster <true|false> ? (true) 
      Cluster Notify Cmd [optional] ? 
      Cluster Notify Address [optional] ? 
      Cluster CXFS mode <normal|experimental>[optional] ? (normal) 
      Cluster ID ? (1) 
      
      Current nodes in cluster mycluster:
      Node - 1: green
      Node - 2: orange
      Node - 3: red
      Node - 4: purple
      Node - 5: yellow
      Node - 6: blue
      
      
      No networks in cluster mycluster
      
      Add nodes to or remove nodes/networks from cluster mycluster
      Enter "done" when completed or "cancel" to abort
      
      mycluster ? add net network 192.168.0.0 mask 255.255.255.0
      mycluster ? done
      Successfully modified cluster mycluster

      At this point, cxfs-config will show the primary network (network 0):

      irix# cxfs-config
      ...
      Networks:
          net 0: type tcpip  192.168.0.0      255.255.255.0   
      
      Machines:
      ...
          node red: node 55    cell 4  enabled  IRIX    server_admin
              hostname: red.example.com
              fail policy: Fence, Shutdown
              nic 0: address: 192.168.0.1 priority: 1 network: 0
              nic 1: address: 192.168.1.1 priority: 2 network: none
      
          node yellow: node 2     cell 3  enabled  IRIX    server_admin
              hostname: yellow.example.com
              fail policy: Fence, Shutdown
              nic 0: address: 192.168.0.2 priority: 1 network: 0
              nic 1: address: 192.168.1.2 priority: 2 network: none
      ...
      

    2. Configure the secondary network:

      cmgr> modify cluster mycluster
      Enter commands, you may enter "done" or "cancel" at any time to exit
      
      Is this a FailSafe cluster <true|false> ? (false) 
      Is this a CXFS cluster <true|false> ? (true) 
      Cluster Notify Cmd [optional] ? 
      Cluster Notify Address [optional] ? 
      Cluster CXFS mode <normal|experimental>[optional] ? (normal) 
      Cluster ID ? (1) 
      
      Current nodes in cluster mycluster:
      Node - 1: green
      Node - 2: orange
      Node - 3: red
      Node - 4: purple
      Node - 5: yellow
      Node - 6: blue
      
      
      No networks in cluster mycluster
      
      Add nodes to or remove nodes/networks from cluster mycluster
      Enter "done" when completed or "cancel" to abort
      
      mycluster ? add net network 192.168.1.0 mask 255.255.255.0
      mycluster ? done
      Successfully modified cluster mycluster

      The cxfs-config command will now display the secondary network (network 1):

      irix# cxfs-config
      ...
      Networks:
          net 0: type tcpip  192.168.0.0      255.255.255.0   
          net 1: type tcpip  192.168.1.0      255.255.255.0   
      
      Machines:
      ...
          node red: node 55    cell 4  enabled  IRIX    server_admin
              hostname: red.example.com
              fail policy: Fence, Shutdown
              nic 0: address: 192.168.0.1 priority: 1 network: 0
              nic 1: address: 192.168.1.1 priority: 2 network: 1
      
          node yellow: node 2     cell 3  enabled  IRIX    server_admin
              hostname: yellow.example.com
              fail policy: Fence, Shutdown
              nic 0: address: 192.168.0.2 priority: 1 network: 0
              nic 1: address: 192.168.1.2 priority: 2 network: 1

      When you restart cx_services, the first membership delivered message will appear:

      NOTICE: Membership delivered for cells 0x14.
      Cell(age): 3(1) 4(1)
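The network column in the cxfs-config output above is consistent with each NIC being matched to the defined network whose address range contains it. The following Python sketch reproduces that matching with the standard ipaddress module. It is an illustration under that assumption, not a description of how CXFS itself assigns NICs, and the function name assign_networks is hypothetical.

```python
import ipaddress


def assign_networks(networks, nic_addresses):
    """Map each NIC address to the index of the first network containing it.

    networks is a list of (network_address, mask) pairs in definition order.
    A NIC that falls in no defined network maps to None.
    """
    nets = [ipaddress.ip_network(f"{addr}/{mask}") for addr, mask in networks]
    result = {}
    for nic in nic_addresses:
        ip = ipaddress.ip_address(nic)
        result[nic] = next(
            (index for index, net in enumerate(nets) if ip in net), None
        )
    return result


# State after only the primary network has been added.
primary_only = [("192.168.0.0", "255.255.255.0")]
print(assign_networks(primary_only, ["192.168.0.1", "192.168.1.1"]))

# State after the secondary network has also been added.
both = primary_only + [("192.168.1.0", "255.255.255.0")]
print(assign_networks(both, ["192.168.0.1", "192.168.1.1"]))
```

Running a check of this kind against your planned addresses before modifying the cluster can reveal a NIC that would be left with no network.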

To delete a network, do the following:

cmgr> modify cluster mycluster
Enter commands, you may enter "done" or "cancel" at any time to exit

Is this a FailSafe cluster <true|false> ? (false) 
Is this a CXFS cluster <true|false> ? (true) 
Cluster Notify Cmd [optional] ? 
Cluster Notify Address [optional] ? 
Cluster CXFS mode <normal|experimental>[optional] ? (normal) 
Cluster ID ? (1) 

Current nodes in cluster mycluster:
Node - 1: green
Node - 2: orange
Node - 3: red
Node - 4: purple
Node - 5: yellow
Node - 6: blue


Current networks in cluster mycluster:
Network 0 - network 192.168.0.0, mask 255.255.255.0
Network 1 - network 192.168.1.0, mask 255.255.255.0

cmgr> modify cluster mycluster
Enter commands, when finished enter either "done" or "cancel"

mycluster ? remove net network 192.168.1.0
mycluster ? done       
Successfully modified cluster mycluster

While networks are defined, the cluster will try to use the highest-priority network and will fail over to lower-priority networks as needed. Deleting all networks will return the cluster to the default mode, in which a network consisting only of the priority 1 NICs is used.

For more information, see “Define a Cluster with cmgr” and “Modify a Cluster with cmgr”.

Example of Configuring a Large Cluster Using cmgr

Following is an example cmgr script for configuring a one-node cluster that can be copied and repeated for the number of nodes required:

#!/usr/cluster/bin/cmgr -f
# Node nodename definition
define node nodename
        set hostname to nodename
        set operating_system to OS
        set node_function to server_admin|client_admin|client_only
        set is_failsafe to false
        set is_cxfs to true
        set nodeid to nodeID#
        set hierarchy to [system][fence][reset][fencereset][shutdown]
        set reset_type to powerCycle|reset|nmi
        add nic IP address  or nodename
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
done
# Define cluster and add nodes to the cluster
define cluster clustername
        set is_failsafe to false
        set is_cxfs to true
        set cx_mode to normal
        set clusterid to clusterID#
done
modify cluster clustername
        add node nodename
done
set cluster clustername
define cxfs_filesystem filesystemname
        set device_name to /dev/cxvm/volumename
        set mount_point to /mountpoint
        set force to false
        set dflt_local_status to enabled
        add cxfs_server server1, server2, etc
                set rank to 0
        done
done
# Setting CXFS parameters
modify cx_parameters
        set tie_breaker to none
done
start cx_services for cluster clustername
quit
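For a large cluster, the repeated node definitions can be generated rather than copied by hand. The following Python sketch is one possible approach, not a supplied tool. It emits only directives that appear in the scripts in this appendix; it omits nodeid, hierarchy, and reset_type, as the earlier three-node script does; and it uses each node name as both hostname and NIC. The names build_node_definitions and NODE_TEMPLATE are hypothetical.

```python
NODE_TEMPLATE = """\
define node {name}
        set hostname to {name}
        set operating_system to {os}
        set node_function to {function}
        set is_failsafe to false
        set is_cxfs to true
        add nic {name}
                set heartbeat to true
                set ctrl_msgs to true
                set priority to 1
        done
done
"""


def build_node_definitions(names, os="irix", function="server_admin"):
    """Return cmgr 'define node' stanzas for the given node names."""
    return "\n".join(
        NODE_TEMPLATE.format(name=name, os=os, function=function)
        for name in names
    )


# Placeholder node names; review the generated text before using it.
text = build_node_definitions(["n1", "n2", "n3"])
print(text)
```

The generated stanzas would still need the cluster, filesystem, and parameter sections from the template above, and should be reviewed before being passed to cmgr.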

Example of Performing a Forced CXFS Shutdown Using cmgr

# /usr/cluster/bin/cmgr -p
cmgr> admin cxfs_stop

Example of Relocation Error Using cmgr

If you try to relocate a filesystem and see an error similar to the following cmgr example, it means that relocation has not been enabled:

CMD(/bin/mount -n -o remount,set_server=node1 /lsan1): exited with status 32 (0x20)

Failed to admin:
        cxfs_relocate

admin command failed

To allow the relocation to occur, you must enable relocation as specified in “Relocation” in Chapter 1.

Example of Checking Cluster Status Using cmgr

To query node and cluster status, use the following cmgr command on a CXFS administration node:

cmgr> show status of cluster cluster_name 

Example of Querying Node Status Using cmgr

To query node status, use the following cmgr command:

cmgr> show status of node node_name 

Example of Pinging the System Controller Using cmgr

When CXFS is running, you can determine whether the system controller on a node is responding by using the following cmgr command:

cmgr> admin ping node node_name 


Note: This is not required when using cxfs_admin because it will attempt to verify that each node is connected as it is added to the cluster.

This command uses the CXFS daemons to test whether the system controller is responding.

You can verify reset connectivity on a node in a cluster even when the CXFS daemons are not running by using the standalone option of the admin ping command:

cmgr> admin ping standalone node node_name

This command calls the ping command directly to test whether the system controller on the indicated node is responding.

Example of Monitoring Reset Lines Using cmgr

You can use the cmgr command to ping the system controller at a node as follows (line break for readability):

cmgr> admin ping dev_name device_name of dev_type device_type
with sysctrl_type system_controller_type 


Note: This is not required when using cxfs_admin.


Example of I/O Fencing Status Using cmgr

To check the current fencing status, use the admin fence query command in cmgr.

To check current failure action settings, use the show node nodename command in cmgr.

Example of Using build_cmgr_script to Recreate the Cluster Database

You can use the build_cmgr_script command from one node in the cluster to create a cmgr script that will recreate the node, cluster, switch, and filesystem definitions for all nodes in the cluster database. You can then later run the resulting script to recreate a database with the same contents; this method can be used for missing or corrupted cluster databases.


Note: The build_cmgr_script script does not contain local logging information, so it cannot be used as a complete backup/restore tool.

To perform a database backup, use the build_cmgr_script script from one node in the cluster, as described in “Creating a cmgr Script Automatically”.


Caution: Do not make configuration changes while you are using the build_cmgr_script command.

By default, this creates a cmgr script in the following location:

/var/cluster/ha/tmp/cmgr_create_cluster_clustername_processID

You can specify another filename by using the -o option.

To perform a restore on all nodes in the pool, do the following:

  1. Stop CXFS services on all nodes in the cluster.

  2. Stop the cluster database daemons on each node.

  3. Remove all copies of the old database by using the cdbreinit command on each node.

  4. Restart cluster database daemons on each node.

  5. Execute the cmgr script (which was generated by the build_cmgr_script script) on the node that is defined first in the script. This will recreate the backed-up database on each node.


    Note: If you want to run the generated script on a different node, you must modify the generated script so that the node is the first one listed in the script.

For example, to back up the current database, clear the database, and restore the database to all administration nodes, do the following on administration nodes as directed:

On one node:
# /var/cluster/cmgr-scripts/build_cmgr_script -o /tmp/newcdb
Building cmgr script for cluster clusterA ...
build_cmgr_script: Generated cmgr script is /tmp/newcdb

On one node:
     cmgr> stop cx_services for cluster clusterA

On each node: 
# /etc/init.d/cxfs stop

On each node:
     IRIX:
     # /etc/init.d/cluster stop

     SGI Foundation Software: 
     # /etc/init.d/cxfs_cluster stop

On each node: 
# /usr/cluster/bin/cdbreinit

On each node:
     IRIX:
     # /etc/init.d/cluster start

     SGI Foundation Software:
     # /etc/init.d/cxfs_cluster start

On the *first* node listed in the /tmp/newcdb script:
# /tmp/newcdb
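The example sequence above can be captured as a checklist that records where each command is to be run. The following Python sketch only builds and prints the list; it does not execute anything. The commands and paths are those shown in the example, while the function name restore_plan and the platform keys are hypothetical.

```python
DAEMON_SCRIPTS = {
    "IRIX": "/etc/init.d/cluster",
    "SGI Foundation Software": "/etc/init.d/cxfs_cluster",
}


def restore_plan(cluster, script_path, platform="IRIX"):
    """Return ordered (where, command) pairs for a backup and restore."""
    daemons = DAEMON_SCRIPTS[platform]
    return [
        ("one node",
         f"/var/cluster/cmgr-scripts/build_cmgr_script -o {script_path}"),
        ("one node", f"cmgr> stop cx_services for cluster {cluster}"),
        ("each node", "/etc/init.d/cxfs stop"),
        ("each node", f"{daemons} stop"),
        ("each node", "/usr/cluster/bin/cdbreinit"),
        ("each node", f"{daemons} start"),
        ("first node listed in the script", script_path),
    ]


for where, command in restore_plan("clusterA", "/tmp/newcdb"):
    print(f"{where}: {command}")
```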



[2] This man page is also accessible by man cluster_mgr for historical purposes.