A server-capable administration node chosen from the list of potential metadata servers. There can be only one active metadata server for any one filesystem. See also metadata.
Access control list.
See server-capable administration node.
See forced CXFS shutdown.
Address resolution protocol.
Maximum capacity for data transfer.
A node that is explicitly not permitted to be automatically configured into the cluster database.
Baseboard management controller.
A number associated with a node that is allocated when a node is added into the cluster definition with the GUI or cxfs_admin. The first node in the cluster has a cell ID of 0, and each subsequent node added gets the next available (incremental) cell ID. If a node is removed from the cluster definition, its cell ID becomes available. It is not the same thing as the node ID.
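The allocation behavior described above can be modeled as follows. This is an illustrative sketch only, not CXFS source code, and it assumes that the lowest available cell ID is the one reused after a node is removed:

```python
class CellIdAllocator:
    """Illustrative model of cell ID allocation: the first node gets
    cell ID 0, each subsequent node gets the next available ID, and a
    removed node's ID becomes available again. Not actual CXFS code."""

    def __init__(self):
        self.assigned = {}  # node name -> cell ID

    def add_node(self, name):
        used = set(self.assigned.values())
        # Lowest ID not currently in use (assumption for illustration).
        cell_id = next(i for i in range(len(used) + 1) if i not in used)
        self.assigned[name] = cell_id
        return cell_id

    def remove_node(self, name):
        # The removed node's cell ID becomes available for reuse.
        del self.assigned[name]


alloc = CellIdAllocator()
alloc.add_node("nodeA")    # first node: cell ID 0
alloc.add_node("nodeB")    # next available: cell ID 1
alloc.remove_node("nodeA")
alloc.add_node("nodeC")    # nodeA's cell ID is available again
```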
Underlying command-line interface commands used by the CXFS Manager graphical user interface (GUI).
In CXFS, a node other than the active metadata server that mounts a CXFS filesystem. A server-capable administration node can function as either an active metadata server or as a CXFS client, depending upon how it is configured and whether it is chosen to be the active metadata server. A client-only node always functions as a client.
A node that is installed with the cxfs_client.sw.base software product; it does not run cluster administration daemons and is not capable of coordinating CXFS metadata. Any node can be a client-only node. See also server-capable administration node.
A cluster is the set of systems (nodes) configured to work together as a single computing resource. A cluster is identified by a simple name and a cluster ID. A cluster running multiple operating systems is known as a multiOS cluster.
Only one cluster may be formed from a given pool of nodes.
Disks or logical units (LUNs) are assigned to clusters by recording the name of the cluster on the disk (or LUN). Thus, if any disk is accessible (via a Fibre Channel connection) from machines in multiple clusters, then those clusters must have unique names. When members of a cluster send messages to each other, they identify their cluster via the cluster ID. Cluster names must be unique.
Because of the above restrictions on cluster names and cluster IDs, and because cluster names and cluster IDs cannot be changed once the cluster is created (without deleting the cluster and recreating it), SGI advises that you choose unique names and cluster IDs for each of the clusters within your organization.
The set of daemons on a server-capable administration node that provide the cluster infrastructure: cad, cmond, fs2d, crsd.
The CXFS graphical interface (GUI) and the cxfs_admin command-line tools that let you configure and administer a CXFS cluster, and other tools that let you monitor the state of the cluster.
The person responsible for managing and maintaining a cluster.
Contains configuration information about all nodes and the cluster. The database is managed by the cluster administration daemons.
XVM concept in which a filesystem applies to the entire cluster, not just to the local node. See also local domain.
The group of server-capable administration nodes in the pool that are accessible to cluster administration daemons and therefore are able to receive cluster database updates; this may be a subset of the nodes defined in the pool. The cluster administration daemons manage the distribution of the cluster database (CDB) across the server-capable administration nodes in the pool. (Also known as user-space membership and fs2d database membership.)
A unique number within your network in the range 1 through 255. The cluster ID is used by the operating system kernel to make sure that it does not accept cluster information from any other cluster that may be on the network. The kernel does not use the database for communication, so it requires the cluster ID in order to verify cluster communications. This information in the kernel cannot be changed after it has been initialized; therefore, you must not change a cluster ID after the cluster has been defined. Cluster IDs must be unique.
One of two methods of CXFS cluster operation, Normal or Experimental. In Normal mode, CXFS monitors and acts upon CXFS kernel heartbeat or cluster database heartbeat failure; in Experimental mode, CXFS ignores heartbeat failure. Experimental mode allows you to use the kernel debugger (which stops heartbeat) without causing node failures. You should only use Experimental mode during debugging with approval from SGI support.
Messages that the cluster software sends between the cluster nodes to request operations on or distribute information about cluster nodes. Control messages, CXFS kernel heartbeat messages, CXFS metadata, and cluster database heartbeat messages are sent through a node's network interfaces that have been attached to a private network.
A node that is defined as part of the cluster. See also node.
See private network.
Clustered XFS, a clustered filesystem for high-performance computing environments.
The daemon (cxfs_client) that controls CXFS services on a client-only node.
The daemon (clconfd) that controls CXFS services on a server-capable administration node.
See cluster database.
The group of CXFS nodes that can share filesystems in the cluster, which may be a subset of the nodes defined in a cluster. During the boot process, a node applies for CXFS kernel membership. Once accepted, the node can share the filesystems of the cluster. (Also known as kernel-space membership.) CXFS kernel membership differs from cluster database membership.
The enabling/disabling of a node, which changes a flag in the cluster database. This disabling/enabling does not affect the daemons involved. The daemons that control CXFS services are clconfd on a server-capable administration node and cxfs_client on a client-only node.
To enable a node, which changes a flag in the cluster database, by using an administrative task in the CXFS GUI or the cxfs_admin enable command.
To disable a node, which changes a flag in the cluster database, by using the CXFS GUI or the cxfs_admin disable command. See also forced CXFS shutdown.
See forced CXFS shutdown and shutdown.
A node identified as a tiebreaker for CXFS to use in the process of computing CXFS kernel membership for the cluster, when exactly half the nodes in the cluster are up and can communicate with each other. There is no default CXFS tiebreaker. SGI recommends that the tiebreaker node be a client-only node.
See cluster database.
See cluster database membership.
The portion of the GUI window that displays details about a selected component in the view area. See also view area.
See cluster domain and local domain.
Starts monitoring CXFS kernel heartbeat only when an operation is pending. Once monitoring begins, it checks at 1-second intervals and declares a timeout after 5 consecutive missed seconds, just as in static heartbeat monitoring.
Disk volume header.
See fail policy.
The set of instructions that determine what happens to a failed node; the second instruction will be followed only if the first instruction fails; the third instruction will be followed only if the first and second fail. The available actions are: fence, fencereset, reset, and shutdown.
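The ordered semantics described above (each instruction is followed only if the previous one fails) can be sketched as a simple loop. This is an illustrative model with placeholder action functions, not actual CXFS code:

```python
def apply_fail_policy(policy, actions):
    """Try each failure action in order; stop at the first one that
    succeeds. `policy` is an ordered list of action names (fence,
    fencereset, reset, shutdown); `actions` maps each name to a
    function returning True on success. Illustrative model only."""
    for action in policy:
        if actions[action]():
            return action  # this action succeeded; stop here
    return None  # every instruction in the policy failed


# Example: fencing fails, so the next instruction (reset) is followed.
result = apply_fail_policy(
    ["fence", "reset"],
    {"fence": lambda: False, "reset": lambda: True},
)
```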
The failure policy method that isolates a problem node so that it cannot access I/O devices, and therefore cannot corrupt data in the shared CXFS filesystem. I/O fencing can be applied to any node in the cluster (CXFS clients and metadata servers). The rest of the cluster can begin immediate recovery.
The failure policy method that fences the node and then, if the node is successfully fenced, performs an asynchronous system reset; recovery begins without waiting for reset acknowledgment. If used, this fail policy method should be specified first. If the fencing action fails, the reset is not performed; therefore, reset alone is also highly recommended for all server-capable administration nodes (unless there is a single server-capable administration node in the cluster).
The process of recovery from fencing, in which the affected node automatically withdraws from the CXFS kernel membership, unmounts all filesystems that are using an I/O path via fenced HBA(s), and then rejoins the cluster.
The withdrawal of a node from the CXFS kernel membership, either because the node has failed in some way or because an admin cxfs_stop command was issued. This disables filesystem and cluster volume access for the node. The node remains enabled in the cluster database. See also CXFS services stop and shutdown.
See cluster database membership.
ARP that broadcasts the MAC address to IP address mappings on a specified interface.
Graphical user interface. The CXFS GUI lets you set up and administer CXFS filesystems and XVM logical volumes. It also provides icons representing status and structure.
GUID partition table.
Messages that cluster software sends between the nodes that indicate a node is up and running. CXFS kernel heartbeat messages, cluster database heartbeat messages, CXFS metadata, and control messages are sent through the node's network interfaces that have been attached to a private network.
If no CXFS kernel heartbeat or cluster database heartbeat is received from a node in this period of time, the node is considered to be dead. The heartbeat timeout value must be at least 5 seconds for proper CXFS operation.
Intelligent Platform Management Interface.
SGI Infinite Storage Software Platform, the distribution method for CXFS software.
See CXFS kernel membership.
Local area network.
XVM concept in which a filesystem applies only to the local node, not to the cluster. See also cluster domain.
A log configuration has two parts: a log level and a log file, both associated with a log group. The cluster administrator can customize the location and amount of log output, and can specify a log configuration for all nodes or for only one node. For example, the crsd log group can be configured to log detailed level-10 messages to the crsd-foo log only on the node foo and to write only minimal level-1 messages to the crsd log on all other nodes.
A file containing notifications for a particular log group. A log file is part of the log configuration for a log group.
A set of one or more CXFS processes that use the same log configuration. A log group usually corresponds to one daemon, such as gcd.
A number controlling the number of log messages that CXFS will write into an associated log group's log file. A log level is part of the log configuration for a log group.
A logical organization of disk storage in XVM that enables an administrator to combine underlying physical disk storage into a single unit. Logical volumes behave like standard disk partitions. A logical volume allows a filesystem or raw device to be larger than the size of a physical disk. Using logical volumes can also increase disk I/O performance because a volume can be striped across more than one disk. Logical volumes can also be used to mirror data on different disks. For more information, see the XVM Volume Manager Administrator's Guide.
Logical unit. A logical disk provided by a RAID. A logical unit number (LUN) is a representation of disk space. In a RAID, the disks are not individually visible because they are behind the RAID controller. The RAID controller will divide up the total disk space into multiple LUNs. The operating system sees a LUN as a hard disk. A LUN is what XVM uses as its physical volume (physvol). For more information, see the XVM Volume Manager Administrator's Guide.
See cluster database membership and CXFS kernel membership.
A number associated with a node's cell ID that indicates the number of times the CXFS kernel membership has changed since a node joined the membership.
Information that describes a file, such as the file's name, size, location, and permissions.
The server-capable administration node that coordinates updating of metadata on behalf of all nodes in a cluster. There can be multiple potential metadata servers, but only one is chosen to be the active metadata server for any one filesystem.
The process by which the metadata server moves from one node to another due to an interruption in CXFS services on the first node. See also recovery.
A cluster that is running multiple operating systems, such as SGI ProPack and Solaris.
A device that provides four DB9 serial ports from a 36-pin connector.
A node is an operating system (OS) image, usually an individual computer. (This use of the term node does not have the same meaning as a node in an SGI Origin 3000 or SGI 2000 system and is different from the NUMA definition for a brick/blade on the end of a NUMAlink cable.)
A given node can be a member of only one pool (and therefore only one cluster).
See also client-only node, server-capable administration node, and standby node.
An integer in the range 1 through 32767 that is unique among the nodes defined in the pool. You must not change the node ID number after the node has been defined. It is not the same thing as the cell ID.
The list of nodes that are active (have CXFS kernel membership) in a cluster.
The command used to notify the cluster administrator of changes or failures in the cluster and nodes. The command must exist on every node in the cluster.
A system that can control a node remotely, such as power-cycling the node. At run time, the owner host must be defined as a node in the pool.
The device file name of the terminal port (TTY) on the owner host to which the system controller is connected. The other end of the cable connects to the node with the system controller port, so the node can be controlled remotely by the owner host.
A model of data access in which the shared files are treated as local files by all of the hosts in the cluster. Each host can read and write the disks at near-local disk speeds; the data passes directly from the disks to the host requesting the I/O, without passing through a data server or over a LAN. For the data path, each host is a peer on the SAN; each can have equally fast direct data paths to the shared disks.
Physical volume. A disk that has been labeled for use by XVM. For more information, see the XVM Volume Manager Administrator's Guide.
The set of nodes from which a particular cluster may be formed. Only one cluster may be configured from a given pool, and it need not contain all of the available nodes. (Other pools may exist, but each is disjoint from the others; they share no node or cluster definitions.)
A pool is formed when you connect to a given node and define that node in the cluster database using the CXFS GUI. You can then add other nodes to the pool by defining them while still connected to the first node, or to any other node that is already in the pool. (If you were to connect to another node and then define it, you would be creating a second pool).
The password for the system controller port, usually set once in firmware or by setting jumper wires. (This is not the same as the node's root password.)
A server-capable administration node that is listed in the metadata server list when defining a filesystem; only one node in the list will be chosen as the active metadata server.
A network that is dedicated to CXFS kernel heartbeat messages, cluster database heartbeat messages, CXFS metadata, and control messages. The private network is accessible by administrators but not by users. Also known as control network.
The number of nodes required to form a cluster, which differs according to membership:
For CXFS kernel membership:
A majority (>50%) of the server-capable administration nodes in the cluster are required to form an initial membership.
Half (50%) of the server-capable administration nodes in the cluster are required to maintain an existing membership.
For cluster database membership, 50% of the nodes in the pool are required to form and maintain a cluster.
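The quorum rules above can be expressed as a small calculation. This is an illustrative sketch of the arithmetic only, not an actual CXFS interface:

```python
def kernel_membership_quorum(up_servers, total_servers, forming):
    """Quorum check for CXFS kernel membership, per the rules above:
    forming an initial membership requires a majority (>50%) of the
    server-capable administration nodes; maintaining an existing
    membership requires half (50%). Illustrative model only."""
    if forming:
        return up_servers > total_servers / 2
    return up_servers >= total_servers / 2


def database_membership_quorum(up_nodes, pool_size):
    """Cluster database membership requires 50% of the nodes in the
    pool, both to form and to maintain. Illustrative model only."""
    return up_nodes >= pool_size / 2


# With 4 server-capable administration nodes:
kernel_membership_quorum(2, 4, forming=True)   # not a majority: cannot form
kernel_membership_quorum(2, 4, forming=False)  # half: can maintain
```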
The node that is chosen to propagate the cluster database to the other server-capable administration nodes in the pool.
Redundant array of independent disks.
The process by which a node is removed from the CXFS kernel membership due to an interruption in CXFS services. It is during this process that the remaining nodes in the CXFS kernel membership resolve their state for cluster resources owned or shared with the removed node. See also metadata server recovery.
The process by which the metadata server moves from one node to another due to an administrative action; other services on the first node are not interrupted.
The failure policy method that performs a system reset via a serial line connected to the system controller. The reset may be a powercycle, serial reset, or NMI (nonmaskable interrupt).
Storage area network. A high-speed, scalable network of servers and storage devices that provides storage resource consolidation, enhanced data access, and centralized storage management.
A node that is installed with the cluster_admin product and is also capable of coordinating CXFS metadata.
Licensing that uses license keys on the CXFS server-capable administration nodes; it does not require node-locked license keys on CXFS client-only nodes. The license keys are node-locked to each server-capable administration node and specify the number and size of client-only nodes that may join the cluster membership. All nodes require server-side licensing.
The fail policy that tells the other nodes in the cluster to wait before reforming the CXFS kernel membership. The surviving cluster delays the beginning of recovery to allow the node time to complete the shutdown. See also forced CXFS shutdown.
A situation in which cluster membership divides into two clusters due to an event such as a network partition or an unresponsive server-capable administration node, combined with a lack of reset and/or CXFS tiebreaker capability. This results in multiple clusters, each claiming ownership of the same filesystems, which can result in filesystem data corruption. Also known as split-brain syndrome.
A security breach involving illicit viewing.
See split cluster.
A security breach in which one machine on the network masquerades as another.
A server-capable administration node that is configured as a potential metadata server for a given filesystem, but does not currently run any applications that will use that filesystem.
Monitors CXFS kernel heartbeat constantly at 1-second intervals and declares a timeout after 5 consecutive missed seconds (default). See also dynamic heartbeat monitoring.
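The 5-consecutive-missed-seconds rule can be modeled with a simple counter over per-second samples. This is an illustrative model of the monitoring logic only, not CXFS code:

```python
def declares_timeout(heartbeats, limit=5):
    """Model of static heartbeat monitoring: `heartbeats` is a
    per-second sequence of booleans (True = heartbeat received at that
    1-second interval). A timeout is declared after `limit` consecutive
    missed seconds (default 5). Illustrative model only."""
    missed = 0
    for received in heartbeats:
        missed = 0 if received else missed + 1
        if missed >= limit:
            return True  # 5 consecutive missed seconds: node timed out
    return False


declares_timeout([True, False, False, False, False, False])  # timeout
declares_timeout([False, False, True, False, False])         # no timeout
```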
A port sitting on a node that provides a way to power-cycle the node remotely. Enabling or disabling a system controller port in the cluster database tells CXFS whether it can perform operations on the system controller port.
Log files in which system messages are stored.
See CXFS tiebreaker node.
I/O operations per second.
See cluster database membership.
The portion of the GUI window that displays components graphically. See also details area.
Virtual local area network.
A node that is explicitly allowed to be automatically configured into the cluster database.
A filesystem implementation type for the Linux operating system. It defines the format that is used to store data on disks managed by the filesystem.