Access control list.
A server-capable administration node chosen from the list of potential metadata servers. There can be only one active metadata server for any one filesystem. See also metadata.
See server-capable administration node.
See forced CXFS shutdown.
The cxfs_admin complexity mode that provides a list of possible choices when using the <TAB> key, prompts for all possible fields, displays all attributes, and includes debugging information in output.
The number of membership transitions in which the node has participated. The age is 1 the first time a node joins the membership and will increment for each time the membership changes. This number is dynamically allocated by the CXFS software (the user does not define the age).
Asymmetric logical unit access, a feature of some RAID devices that permits automatic path failover.
Address resolution protocol.
The cxfs_admin complexity mode that only shows the common options and attributes in show output, provides a list of possible choices when using the <TAB> key, and uses prompting.
Maximum capacity for data transfer.
A node that is explicitly not permitted to be automatically configured into the cluster database.
Baseboard management controller.
A number associated with a node that is allocated when a node is added into the cluster definition with the GUI or when it is defined with cxfs_admin. The first node in the cluster has cell ID 0, and each subsequent node added gets the next available incremental cell ID. If a node is removed from the cluster, its cell ID becomes available. It differs from node ID.
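The allocation rule above can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and it assumes that "next available" means the lowest cell ID not currently in use, which the definition does not state explicitly.

```python
class CellIdAllocator:
    """Hypothetical model of cell ID assignment (not CXFS code)."""

    def __init__(self):
        self.in_use = set()

    def allocate(self):
        # Assumption: the lowest unused ID is handed out, starting at 0.
        cell_id = 0
        while cell_id in self.in_use:
            cell_id += 1
        self.in_use.add(cell_id)
        return cell_id

    def release(self, cell_id):
        # Removing a node makes its cell ID available again.
        self.in_use.discard(cell_id)


allocator = CellIdAllocator()
first_ids = [allocator.allocate() for _ in range(3)]
allocator.release(1)
reused_id = allocator.allocate()
```

Under that assumption, the first three nodes receive cell IDs 0, 1, and 2, and a node added after the node with cell ID 1 is removed receives cell ID 1.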
Underlying command-line interface commands used by CXFS Manager.
In CXFS, a node other than the active metadata server that mounts a CXFS filesystem. A server-capable administration node can function as either an active metadata server or as a CXFS client, depending upon how it is configured and whether it is chosen to be the active metadata server. A client-only node always functions as a client.
A node that is installed with the cxfs_client.sw.base software product; it does not run cluster administration daemons and is not capable of coordinating CXFS metadata. Any node can be a client-only node. See also server-capable administration node.
The process by which the active metadata server resolves the state of client nodes after a client is removed from the CXFS kernel membership due to an interruption in CXFS services. See also client recovery and metadata-server recovery.
A cluster is the set of systems (nodes) configured to work together as a single computing resource. A cluster is identified by a simple name and a cluster ID. A cluster running multiple operating systems is known as a multiOS cluster.
There is only one cluster that may be formed from a given pool of nodes.
Disks or logical units (LUNs) are assigned to clusters by recording the name of the cluster on the disk (or LUN). Thus, if any disk is accessible (via a SAN connection) from machines in multiple clusters, then those clusters must have unique names. When members of a cluster send messages to each other, they identify their cluster via the cluster ID. Cluster names must be unique.
Because of the above restrictions on cluster names and cluster IDs, and because cluster names and cluster IDs cannot be changed once the cluster is created (without deleting the cluster and recreating it), SGI advises that you choose unique names and cluster IDs for each of the clusters within your organization.
The set of daemons on a server-capable administration node that provide the cluster infrastructure: cad, cmond, fs2d, crsd. See “CXFS Control Daemons” in Chapter 1.
CXFS Manager and the cxfs_admin command-line tools that let you configure and administer a CXFS cluster, and other tools that let you monitor the state of the cluster. See “CXFS Tools” in Chapter 1.
The person responsible for managing and maintaining a cluster.
The database that contains configuration information about all nodes and the cluster. The database is managed by the cluster administration daemons.
The group of server-capable administration nodes in the pool that are accessible to cluster administration daemons and therefore are able to receive cluster database updates; this may be a subset of the nodes defined in the pool. The cluster administration daemons manage the distribution of the cluster database (CDB) across the server-capable administration nodes in the pool. (Also known as user-space membership and fs2d database membership.)
The XVM concept in which a filesystem applies to the entire cluster, not just to the local node. See also local domain.
A unique number within your network in the range 1 through 255. The cluster ID is used by the operating system kernel to make sure that it does not accept cluster information from any other cluster that may be on the network. The kernel does not use the database for communication, so it requires the cluster ID in order to verify cluster communications. This information in the kernel cannot be changed after it has been initialized; therefore, you must not change a cluster ID after the cluster has been defined. Cluster IDs must be unique.
One of two methods of CXFS cluster operation, Normal or Experimental. In Normal mode, CXFS monitors and acts upon CXFS kernel heartbeat or cluster database heartbeat failure; in Experimental mode, CXFS ignores heartbeat failure. Experimental mode allows you to use the kernel debugger (which stops heartbeat) without causing node failures. You should only use Experimental mode during debugging with approval from SGI support.
A node that is defined as part of the cluster. See also node.
See CXFS cluster services.
The manner in which cxfs_admin operates. See basic mode and advanced mode.
Messages that the cluster software sends between the cluster nodes to request operations on or distribute information about cluster nodes. Control messages, CXFS kernel heartbeat messages, CXFS metadata, and cluster database heartbeat messages are sent through a node's network interfaces that have been attached to a private network.
See private network.
Clustered XFS, a clustered filesystem for high-performance computing environments. See “What is CXFS?” in Chapter 1.
The daemon (cxfs_client) that controls CXFS cluster services on a client-only node.
The daemon (clconfd) that controls CXFS cluster services on a server-capable administration node. See “Cluster Administration Daemons” in Chapter 1.
See cluster database.
The group of CXFS nodes that can share filesystems in the cluster, which may be a subset of the nodes defined in a cluster. During the boot process, a node applies for CXFS kernel membership. Once accepted, the node can share the filesystems of the cluster. (Also known as kernel-space membership.) CXFS kernel membership differs from cluster database membership.
The enabling/disabling of a node, which changes a flag in the cluster database. This disabling/enabling does not affect the daemons involved. The daemons that control CXFS cluster services are clconfd on a server-capable administration node and cxfs_client on a client-only node. See “CXFS Services” in Chapter 1.
To enable a node, which changes a flag in the cluster database, by using an administrative task in the CXFS GUI or the cxfs_admin enable command.
To disable a node, which changes a flag in the cluster database, by using the CXFS GUI or the cxfs_admin disable command. See also forced CXFS shutdown.
See forced CXFS shutdown and shutdown.
A node identified as a tiebreaker for CXFS to use in the process of computing CXFS kernel membership for the cluster, when exactly half the nodes in the cluster are up and can communicate with each other. There is no default CXFS tiebreaker. SGI recommends that the tiebreaker node be a client-only node.
See cluster database.
See cluster database membership.
The portion of the GUI window that displays details about a selected component in the view area. See also view area.
Domain name service.
See cluster domain and local domain.
Starts monitoring CXFS kernel heartbeat only when an operation is pending. Once monitoring initiates, it monitors at 1-second intervals and declares a timeout after 5 consecutive missed seconds, just like static heartbeat monitoring.
Using the cxfs_admin command and the autoconf object to specify new client-only nodes that are allowed to be automatically configured into the cluster database.
See NFS edge-serving.
See fail policy.
The set of instructions that determine what happens to a failed node; the second instruction will be followed only if the first instruction fails; the third instruction will be followed only if the first and second fail. The available actions are: fence, fencereset, reset, and shutdown.
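The ordered fallback described above can be sketched as follows. This is a minimal illustration, not CXFS code: the function name and the handlers mapping are hypothetical, and each handler is assumed to return True when its action succeeds.

```python
VALID_ACTIONS = ("fence", "fencereset", "reset", "shutdown")


def apply_fail_policy(policy, handlers):
    """Try each action in order and return the first one that succeeds.

    policy   -- ordered list of action names taken from VALID_ACTIONS
    handlers -- hypothetical mapping of action name to a callable that
                returns True on success
    Returns None if every action in the policy fails.
    """
    for action in policy:
        if action not in VALID_ACTIONS:
            raise ValueError(f"unknown fail policy action: {action}")
    for action in policy:
        # A later instruction is followed only if all earlier ones failed.
        if handlers[action]():
            return action
    return None


calls = []


def make_handler(name, succeeds):
    def handler():
        calls.append(name)
        return succeeds
    return handler


handlers = {
    "fence": make_handler("fence", False),
    "reset": make_handler("reset", True),
    "shutdown": make_handler("shutdown", True),
}
result = apply_fail_policy(["fence", "reset", "shutdown"], handlers)
```

In this example the fence action fails, so reset is attempted and succeeds; shutdown is never tried.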
The failure policy method that isolates a problem node so that it cannot access I/O devices, and therefore cannot corrupt data in the shared CXFS filesystem. I/O fencing can be applied to any node in the cluster (CXFS clients and metadata servers). The rest of the cluster can begin immediate recovery.
The failure policy method that fences the node and then, if the node is successfully fenced, performs an asynchronous system reset; recovery begins without waiting for reset acknowledgment. If used, this fail policy method should be specified first. If the fencing action fails, the reset is not performed; therefore, reset alone is also highly recommended for all server-capable administration nodes (unless there is a single server-capable administration node in the cluster).
The process of recovery from fencing, in which the affected node automatically withdraws from the CXFS kernel membership, unmounts all filesystems that are using an I/O path via fenced HBA(s), and then rejoins the cluster.
The withdrawal of a node from the CXFS kernel membership, either because the node has failed or because an admin cxfs_stop command was issued. This disables filesystem and cluster volume access for the node. The node remains enabled in the cluster database. See also CXFS services stop and shutdown.
See cluster database membership.
ARP that broadcasts the MAC address to IP address mappings on a specified interface.
Guaranteed rate I/O.
The ggd2 daemon.
Graphical user interface. The CXFS Manager GUI lets you set up and administer CXFS filesystems and XVM logical volumes. It also provides icons representing status and structure. See Chapter 10, “CXFS GUI”.
GUID partition table.
Globally unique identifier.
High availability.
Host channel adapter.
Messages that cluster software sends between the nodes that indicate a node is up and running. CXFS kernel heartbeat messages, cluster database heartbeat messages, CXFS metadata, and control messages are sent through the node's network interfaces that have been attached to a private network.
If no CXFS kernel heartbeat or cluster database heartbeat is received from a node in this period of time, the node is considered to be dead. The heartbeat timeout value must be at least 5 seconds for proper CXFS operation.
See fence.
Intelligent Platform Management Interface.
SGI InfiniteStorage Software Platform, the distribution method for CXFS software.
See CXFS kernel membership.
Local area network.
XVM concept in which a filesystem applies only to the local node, not to the cluster. See also cluster domain.
A log configuration has two parts: a log level and a log file, both associated with a log group. The cluster administrator can customize the location and amount of log output, and can specify a log configuration for all nodes or for only one node. For example, the crsd log group can be configured to log detailed level-10 messages to the crsd-nodeA log only on the node nodeA and to write only minimal level-1 messages to the crsd log on all other nodes.
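The crsd example above can be modeled as a lookup in which a node-specific entry takes precedence over the entry for all nodes. This is only a sketch of that idea: the data layout and the function name are hypothetical and do not reflect how CXFS stores log configurations.

```python
def resolve_log_config(configs, group, node):
    """Return (log_level, log_file) for a log group on a given node.

    configs maps (group, node) to (log_level, log_file); a node of None
    stands for the configuration that applies to all nodes.
    """
    per_node = configs.get((group, node))
    if per_node is not None:
        return per_node
    return configs[(group, None)]


# Mirrors the example in the definition: detailed logging on nodeA only.
configs = {
    ("crsd", None): (1, "crsd"),
    ("crsd", "nodeA"): (10, "crsd-nodeA"),
}

on_node_a = resolve_log_config(configs, "crsd", "nodeA")
on_node_b = resolve_log_config(configs, "crsd", "nodeB")
```

Here nodeA resolves to level 10 in the crsd-nodeA log, and any other node (nodeB is a made-up name) resolves to level 1 in the crsd log.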
A file containing notifications for a particular log group. A log file is part of the log configuration for a log group.
A set of one or more CXFS processes that use the same log configuration. A log group usually corresponds to one daemon, such as gcd.
A number controlling the number of log messages that CXFS will write into an associated log group's log file. A log level is part of the log configuration for a log group.
A logical organization of disk storage in XVM that enables an administrator to combine underlying physical disk storage into a single unit. Logical volumes behave like standard disk partitions. A logical volume allows a filesystem or raw device to be larger than the size of a physical disk. Using logical volumes can also increase disk I/O performance because a volume can be striped across more than one disk. Logical volumes can also be used to mirror data on different disks. For more information, see the XVM Volume Manager Administrator Guide.
Logical unit. A logical disk provided by a RAID. A logical unit number (LUN) is a representation of disk space. In a RAID, the disks are not individually visible because they are behind the RAID controller. The RAID controller will divide up the total disk space into multiple LUNs. The operating system sees a LUN as a hard disk. A LUN is what XVM uses as its physical volume (physvol). For more information, see the XVM Volume Manager Administrator Guide.
See cluster database membership and CXFS kernel membership.
A number associated with a node's cell ID that indicates the number of times the CXFS kernel membership has changed since a node joined the membership.
Information that describes a file, such as the file's name, size, location, and permissions.
The server-capable administration node that coordinates the updating of metadata on behalf of all nodes in a cluster. There can be multiple potential metadata servers, but only one is chosen to be the active metadata server for any one filesystem.
The process by which the active metadata server moves from one node to another due to an interruption in CXFS services on the first node. See also client recovery and recovery.
A cluster that consists of clients running different operating systems, such as one client running Linux and another running Windows.
A device that provides four DB9 serial ports from a 36-pin connector.
A configuration in which CXFS client nodes can export data with NFS.
Network Information Service.
A node is an operating system (OS) image, usually an individual computer. (This is different from the NUMA definition for a brick/blade on the end of a NUMAlink cable.)
A given node can be a member of only one pool and only one cluster. See also client-only node and server-capable administration node.
An integer in the range 1 through 32767 that is unique among the nodes defined in the pool. You must not change the node ID number after the node has been defined. It differs from cell ID.
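The two constraints on a node ID (range and uniqueness within the pool) can be checked as in the following sketch. The function name and the way existing IDs are passed in are hypothetical; this is not part of any CXFS tool.

```python
NODE_ID_MIN = 1
NODE_ID_MAX = 32767


def validate_node_id(node_id, existing_ids):
    """Return node_id if it is a legal, unused node ID; raise otherwise.

    existing_ids -- node IDs already defined in the pool (assumed input)
    """
    if isinstance(node_id, bool) or not isinstance(node_id, int):
        raise TypeError("node ID must be an integer")
    if not NODE_ID_MIN <= node_id <= NODE_ID_MAX:
        raise ValueError("node ID must be in the range 1 through 32767")
    if node_id in existing_ids:
        raise ValueError("node ID is already used in this pool")
    return node_id


accepted = validate_node_id(42, existing_ids={1, 2, 3})
```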
The list of nodes that are active (have CXFS kernel membership) in a cluster.
The command used to notify the cluster administrator of changes or failures in the cluster and nodes. The command must exist on every node in the cluster.
OpenFabrics Alliance.
A system that can control a node remotely, such as power-cycling the node. At run time, the owner host must be defined as a node in the pool.
The device file name of the terminal port (TTY) on the owner host to which the system controller is connected. The other end of the cable connects to the node with the system controller port, so the node can be controlled remotely by the owner host.
A model of data access in which the shared files are treated as local files by all of the hosts in the cluster. Each host can read and write the disks at near-local disk speeds; the data passes directly from the disks to the host requesting the I/O, without passing through a data server or over a LAN. For the data path, each host is a peer on the SAN; each can have equally fast direct data paths to the shared disks.
Physical volume. A disk that has been labeled for use by XVM. For more information, see the XVM Volume Manager Administrator Guide.
The set of nodes from which a particular cluster may be formed. Only one cluster may be configured from a given pool, and it need not contain all of the available nodes. (Other pools may exist, but each is disjoint from the other. They share no node or cluster definitions.)
When you define a node using cxfs_admin, it is automatically added to both the pool and the specific cluster definition. If you define a node with the CXFS GUI, it is merely added to the pool and you must explicitly add it to the cluster.
The password for the system controller port, usually set once in firmware or by setting jumper wires. (This is not the same as the node's root password.)
The type of system controller port used for node reset.
A server-capable administration node that is listed in the metadata server list when defining a filesystem; only one node in the list will be chosen as the active metadata server.
A network that is dedicated to CXFS kernel heartbeat messages, cluster database heartbeat messages, CXFS metadata, and control messages. The private network is accessible by administrators but not by users. Also known as control network.
The number of nodes required to form a cluster, which differs according to membership:
For CXFS kernel membership:
A majority (>50%) of the server-capable administration nodes defined in the cluster plus the tiebreaker node (usually a client-only node) are required to form an initial membership.
Half (50%) of the server-capable administration nodes defined in the cluster are required to maintain an existing membership.
For cluster database membership, 50% of the server-capable administration nodes in the pool are required to form and maintain a cluster.
Note: When using the CXFS GUI, a newly defined node is added to the pool and must be explicitly added to the cluster definition; when using the cxfs_admin tool, a newly defined node is added automatically to both the pool and the cluster definition.
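The quorum thresholds above reduce to simple arithmetic on the number of server-capable administration nodes. The sketch below is illustrative only: the function names are hypothetical, and it deliberately ignores the tiebreaker node, so it shows just the majority and half thresholds.

```python
def _check(up, total):
    if total <= 0 or not 0 <= up <= total:
        raise ValueError("need 0 <= up <= total and total > 0")


def kernel_quorum_to_form(up, defined_in_cluster):
    """More than 50% of the server-capable nodes defined in the cluster."""
    _check(up, defined_in_cluster)
    return 2 * up > defined_in_cluster


def kernel_quorum_to_maintain(up, defined_in_cluster):
    """At least 50% of the server-capable nodes defined in the cluster."""
    _check(up, defined_in_cluster)
    return 2 * up >= defined_in_cluster


def database_quorum(up, defined_in_pool):
    """At least 50% of the server-capable nodes defined in the pool."""
    _check(up, defined_in_pool)
    return 2 * up >= defined_in_pool


# With 4 server-capable administration nodes defined:
can_form_with_2 = kernel_quorum_to_form(2, 4)
can_form_with_3 = kernel_quorum_to_form(3, 4)
can_maintain_with_2 = kernel_quorum_to_maintain(2, 4)
```

With four nodes defined, two nodes are enough to maintain an existing CXFS kernel membership but not to form an initial one; three nodes can form it. Note that the kernel membership checks count nodes defined in the cluster, while the cluster database check counts nodes in the pool.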
The node that is chosen to propagate the cluster database to the other server-capable administration nodes in the pool.
Redundant array of independent disks.
The process by which a node is removed from the CXFS kernel membership due to an interruption in CXFS services. It is during this process that the remaining nodes in the CXFS kernel membership resolve their state for cluster resources owned or shared with the removed node. See also client recovery and metadata-server recovery.
The process by which the metadata server moves from one node to another due to an administrative action; other services on the first node are not interrupted.
The failure policy method that performs a system reset via the system controller.
The communication interface used to send a system reset to a controller port on a remote node.
The action taken upon node failure.
Registered state change notification.
Storage area network. A high-speed, scalable network of servers and storage devices that provides storage resource consolidation, enhanced data access, and centralized storage management.
A node that is installed with the cluster_admin product and is also capable of coordinating CXFS metadata.
Licensing that uses license keys on the CXFS server-capable administration nodes; it does not require node-locked license keys on CXFS client-only nodes. The license keys are node-locked to each server-capable administration node and specify the number of client-only nodes that may join the cluster membership.
The fail policy that tells the other nodes in the cluster to wait before reforming the CXFS kernel membership. The surviving cluster delays the beginning of recovery to allow the node time to complete the shutdown. See also forced CXFS shutdown.
A situation in which cluster membership divides into two clusters due to an event (such as a network partition or an unresponsive server-capable administration node) and the lack of reset or CXFS tiebreaker capability. This results in multiple clusters, each claiming ownership of the same filesystems, which can result in filesystem data corruption. Also known as split-brain syndrome.
A security breach involving illicit viewing.
See split cluster.
A security breach in which one machine on the network masquerades as another.
SCSI RDMA Protocol.
Monitors CXFS kernel heartbeat constantly at 1-second intervals and declares a timeout after 5 consecutive missed seconds (default). See also dynamic heartbeat monitoring.
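The timeout rule above (check once per second, declare a timeout after 5 consecutive misses) can be sketched as a counter that resets whenever a heartbeat arrives. The function name and the list-of-booleans input are hypothetical stand-ins for real heartbeat traffic.

```python
def first_timeout(heartbeats, limit=5):
    """Return the 0-based second at which a timeout is declared, or None.

    heartbeats -- iterable of booleans, one per 1-second interval;
                  True means a heartbeat was received in that second
    limit      -- consecutive missed seconds that trigger a timeout
    """
    missed = 0
    for second, received in enumerate(heartbeats):
        if received:
            missed = 0  # any heartbeat resets the count
        else:
            missed += 1
            if missed >= limit:
                return second
    return None


# Two misses, a recovery, then five misses in a row.
trace = [True, False, False, True] + [False] * 5
timeout_at = first_timeout(trace)
```

Because the misses must be consecutive, the two missed seconds early in the trace do not count toward the timeout; it is declared only at the end of the final run of five.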
See SAN.
A port sitting on a node that provides a way to power-cycle the node remotely. Enabling or disabling a system controller port in the cluster database tells CXFS whether it can perform operations on the system controller port.
Log files in which system messages are stored.
See CXFS tiebreaker node.
I/O per second.
See cluster database membership.
The portion of the GUI window that displays components graphically. See also details area.
Virtual local area network.
A node that is explicitly allowed to be automatically configured into the cluster database.
A filesystem implementation type for the Linux operating system. It defines the format that is used to store data on disks managed by the filesystem.