This appendix discusses the following:
The parameters described in this appendix apply to CXFS server-capable administration nodes and client-only nodes running Linux. For more information about system tunable parameters on client-only nodes, see CXFS 7 Client-Only Guide for SGI InfiniteStorage.
This section discusses the following:
SGI recommends that you use the same settings on all applicable nodes in the cluster.
| Note: Before changing any parameter, you should understand the ramifications
of doing so on your system. You should change debugging parameters only
at the recommendation of SGI Support.
The values of these parameters vary in different releases of the product. When upgrading the product, consult SGI Support to determine whether any changes made to the parameters in this chapter should be carried forward. Setting these parameters incorrectly may render the system unstable or otherwise unusable. |
See also “Set System-Tunable Kernel Parameters Appropriately” in Chapter 2.
A number of parameters have a value that is interpreted in the same manner: the least significant 4 bits are used in standard (nondebug) kernels, the next 4 bits are used in debug kernels. In each group of 4 bits, the most significant bit determines whether the system will panic if an error condition is detected. The next bit determines whether part of the code path doing error detection or handling is enabled or disabled. The last 2 bits are interpreted as a debug level:
0 = No messages are printed
1 = Debug level 1
2 = Debug level 2
3 = Debug level 3
A common default (0xf5) is to always set the enable bit, to print only some messages in the nondebug kernel case, and print all messages and panic in the debug kernel case.
Figure D-1 shows an example.
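The bit layout described above can be illustrated with a short decoding sketch. The function and field names here (decode_nibble, decode_tunable, "panic", "enabled", "debug_level") are hypothetical and chosen only for illustration; they are not part of CXFS.

```python
def decode_nibble(nibble):
    """Split one 4-bit group into the three fields described in the text."""
    return {
        "panic": bool(nibble & 0x8),    # most significant bit: panic on error
        "enabled": bool(nibble & 0x4),  # next bit: error detection/handling path enabled
        "debug_level": nibble & 0x3,    # last 2 bits: debug level 0 through 3
    }


def decode_tunable(value):
    """Decode an 8-bit value into its nondebug and debug kernel settings."""
    return {
        "nondebug": decode_nibble(value & 0xF),      # least significant 4 bits
        "debug": decode_nibble((value >> 4) & 0xF),  # next 4 bits
    }


if __name__ == "__main__":
    print(decode_tunable(0xF5))
```

For the common default 0xf5, the nondebug group decodes to no panic, enabled, debug level 1, and the debug group decodes to panic, enabled, debug level 3.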
You can set a parameter permanently across reboots on a SLES system by adding it to the /etc/modprobe.d/sgi-cxfs-xvm.conf file. Use the following format:
options module systune=value
where:
module is one of the following:
| sgi-cxfs |
| sgi-cell |
The section that describes a parameter lists the module name.
systune is the parameter name, such as mtcp_hb_watchdog
value is the value you want to set for the parameter, such as 2
| Note: Do not use spaces around the = character. |
There should be only one options line per module; if you want to specify multiple parameters, you must place them all on that single line.
For example, to permanently set the mtcp_hb_watchdog parameter (which is in the sgi-cell module) to 2, add the following line to /etc/modprobe.d/sgi-cxfs-xvm.conf :
options sgi-cell mtcp_hb_watchdog=2
The change will take effect upon reboot.
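Because each module may have only one options line, it can help to generate the lines from a per-module table of settings. The following is a minimal sketch; the helper name options_lines is hypothetical, and the parameter values shown are illustrative only, not recommendations.

```python
def options_lines(settings):
    """Return one 'options' line per module.

    settings maps a module name to a dict of {parameter: value}.
    All parameters for a module are joined onto a single line,
    with no spaces around the = character.
    """
    lines = []
    for module, params in settings.items():
        pairs = " ".join(f"{name}={value}" for name, value in params.items())
        lines.append(f"options {module} {pairs}")
    return lines


if __name__ == "__main__":
    example = {
        "sgi-cell": {"mtcp_hb_watchdog": 2},
        # Two sgi-cxfs parameters share a single options line.
        "sgi-cxfs": {"cxfs_client_push_period": 25, "cxfs_server_push_period": 25},
    }
    for line in options_lines(example):
        print(line)
```

The output is intended to be reviewed before it is placed in /etc/modprobe.d/sgi-cxfs-xvm.conf.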
For a temporary change to a dynamic parameter, use the Linux sysctl(8) command as follows:
# sysctl prefix.systune=value
where:
prefix is one of the following:
| fs.cxfs |
| kernel.cell |
systune is the parameter name, such as mtcp_hb_watchdog
value is the value you want to set for the parameter, such as 2
| Note: Do not use spaces around the = character. |
For example, to temporarily set the mtcp_hb_watchdog parameter (which has the kernel.cell prefix) to 2, enter the following:
# sysctl kernel.cell.mtcp_hb_watchdog=2
kernel.cell.mtcp_hb_watchdog = 2
To query the current setting of a parameter, use the Linux sysctl(8) command:
# sysctl prefix.systune
where:
prefix is one of the following:
| fs.cxfs |
| kernel.cell |
systune is the parameter name, such as mtcp_hb_watchdog
For example, to query the current setting of the mtcp_hb_watchdog parameter (which has the kernel.cell prefix):
# sysctl kernel.cell.mtcp_hb_watchdog
kernel.cell.mtcp_hb_watchdog = 2
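When collecting settings from several nodes, the "name = value" line that sysctl prints can be split back into its prefix, parameter name, and value. This is a minimal sketch; parse_sysctl_line is a hypothetical helper, and it assumes the parameter name itself contains no period.

```python
def parse_sysctl_line(line):
    """Split 'prefix.systune = value' into (prefix, systune, value)."""
    key, _, value = line.partition("=")
    # The parameter name is the text after the last period in the key.
    prefix, _, name = key.strip().rpartition(".")
    return prefix, name, value.strip()


if __name__ == "__main__":
    print(parse_sysctl_line("kernel.cell.mtcp_hb_watchdog = 2"))
```

The value is returned as a string, so hexadecimal settings are preserved exactly as printed.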
This section discusses site-configurable parameters:
Static parameters require a reboot to take effect.
Specifies how CXFS kernel heartbeat is generated for a Linux node. You should only change this value at the recommendation of SGI Support.
Range of values:
0x0 uses the standard heartbeat generation routine (default).
0x1 uses the interrupt timer list instead of a kernel thread.
0x3 uses a heartbeat generation routine that avoids some memory allocation problems that may occur on nodes with large CPU counts that run massively parallel jobs.
Prefix: kernel.cell
Module: sgi-cell
Specifies (in hundredths of a second) the length of time that CXFS waits for CXFS kernel heartbeat from other nodes before declaring node failure. SGI recommends a value of 500 (5 seconds). You should only change this value at the recommendation of SGI Support. The same value must be used on all nodes in the cluster.
Range of values:
Default: 500
Minimum: 100
Maximum: 12000
| Note: If your cluster includes large systems (greater than 64 processors), you may want to use a larger value, such as 6000 (60 seconds) or 12000 (120 seconds). However, the larger the timeout, the longer it takes the cluster to recognize a failed node and start recovery of the shared resources granted to that node. See “Avoid CXFS Kernel Heartbeat Issues on Large Systems” in Chapter 2. |
Prefix: kernel.cell
Module: sgi-cell
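Because this timeout is expressed in hundredths of a second, a small conversion and range check can help avoid unit mistakes. The sketch below uses the minimum and maximum listed above; the function name is hypothetical.

```python
HEARTBEAT_TIMEOUT_MIN = 100    # 1 second
HEARTBEAT_TIMEOUT_MAX = 12000  # 120 seconds


def heartbeat_timeout_seconds(value):
    """Convert a timeout in hundredths of a second to seconds.

    Raises ValueError if the value is outside the documented range.
    """
    if not HEARTBEAT_TIMEOUT_MIN <= value <= HEARTBEAT_TIMEOUT_MAX:
        raise ValueError(f"timeout {value} is outside 100 through 12000")
    return value / 100


if __name__ == "__main__":
    for setting in (500, 6000, 12000):
        print(setting, "->", heartbeat_timeout_seconds(setting), "seconds")
```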
Specifies the minimum interval between warnings about heartbeat size, in seconds. A setting of 0 suppresses the warnings. You should only modify this parameter setting if there is more than one cluster using the public network as a backup private network.
Range of values:
Default: 60
Minimum: 0
Maximum: 604800
Prefix: kernel.cell
Module: sgi-cell
Also see “alive Message Errors” in Chapter 15.
Controls the behavior of the CXFS kernel heartbeat monitor watchdog. This facility monitors the generation of heartbeats in the kernel.
Range of values:
0 disables the watchdog (default)
1 specifies that watchdog expiration causes CXFS shutdown
2 specifies that watchdog expiration causes panic
Prefix: kernel.cell
Module: sgi-cell
Specifies whether to enable or disable TCP_NODELAY on CXFS message channels.
Range of values:
0 disables
1 enables (default)
Prefix: kernel.cell
Module: sgi-cell
Specifies whether metadata messages are sent from a separate thread in order to save stack space.
Range of values:
0 disables (default for most nodes)
1 enables
Prefix: kernel.cell
Module: sgi-cell
Specifies the maximum number of auxiliary rhelpd threads to run. (The rhelpd threads help with recovery and relocation tasks. They can be used for activities such as asynchronous inode reconstruction and parallel recoveries. The rhelpd thread pool is global in nature and is created during module load time.)
The system automatically uses a calculated value that is four times the number of CPUs, as long as that value does not exceed 128. That is:
4 * number_of_CPUs = rhelpd_max_value
If: 0 <= rhelpd_max_value <= 128
Range of values:
Default: 0, which specifies an automatically calculated value (to disable automatic calculation, set rhelpd_aux to a non-zero value)
Minimum: 0
Maximum: 128
Prefix: fs.cxfs
Module: sgi-cxfs
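The automatic calculation described above can be sketched as follows. The text states only that the formula applies while the result is in the range 0 through 128; it does not say what the system does for larger CPU counts, so this sketch returns None in that case rather than guess. The function name is hypothetical.

```python
def rhelpd_auto_value(number_of_cpus):
    """Return 4 * number_of_cpus if it lies in 0..128, else None.

    None means the documented formula does not cover this CPU count.
    """
    value = 4 * number_of_cpus
    if 0 <= value <= 128:
        return value
    return None


if __name__ == "__main__":
    for cpus in (2, 16, 32, 64):
        print(cpus, "CPUs ->", rhelpd_auto_value(cpus))
```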
Specifies the maximum number of rhelpd threads to run. The system automatically uses a calculated value. For more information about rhelpd threads and the calculated value, see “rhelpd_aux”.
Range of values:
Default: 0, which specifies an automatically calculated value (to disable automatic calculation, set rhelpd_max to a non-zero value)
Minimum: 0
Maximum: 128
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies the minimum number of rhelpd threads to run. The system automatically uses a calculated value. For more information about rhelpd threads and the calculated value, see “rhelpd_aux”.
Range of values:
Default: 0, which specifies an automatically calculated value (to disable automatic calculation, set rhelpd_min to a non-zero value)
Minimum: 0
Maximum: 8 (when the value is set explicitly)
Prefix: fs.cxfs
Module: sgi-cxfs
Dynamic parameters take effect as soon as they are changed.
Specifies the action to take when a local node detects that it has failed:
Range of values:
0 withdraws from the cluster (default)
1 halts
2 reboots
Prefix: kernel.cell
Module: sgi-cell
Specifies (in hundredths of a second) the length of time a client may delay telling the metadata server that it has updated the atime timestamp of a file. The default for both cxfs_client_push_period and cxfs_server_push_period is 1/4 of a second, so atime updates are delayed by up to 1/2 second by default. See also “cxfs_server_push_period”.
Range of values:
Default: 25
Minimum: 0
Maximum: 1000
Prefix: fs.cxfs
Module: sgi-cxfs
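The worst-case atime delay mentioned above is the sum of the client and server push periods. A minimal sketch of that arithmetic, using a hypothetical function name and the documented defaults:

```python
def worst_case_atime_delay_seconds(client_push_period=25, server_push_period=25):
    """Return the maximum atime propagation delay in seconds.

    Both arguments are in hundredths of a second, matching the units of
    cxfs_client_push_period and cxfs_server_push_period.
    """
    return (client_push_period + server_push_period) / 100


if __name__ == "__main__":
    print(worst_case_atime_delay_seconds())  # defaults: 25 + 25 hundredths
```

With the defaults of 25 for each parameter, the result is 0.5 seconds, which matches the 1/2 second figure given in the text.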
Specifies the time-out (in seconds) of the dcvn idle period before returning tokens to the server.
Range of values:
Default: 60
Minimum: 5
Maximum: 3600
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies that CXFS should tolerate certain recoverable errors in the token subsystem. See “Interpretations of Bit Values for Standard and Debug Kernels”.
Range of values:
Default: 0xf5
Minimum: 0
Maximum: 0xff
Prefix: fs.cxfs
Module: sgi-cxfs
Verifies that a client has the existence token before trying to obtain additional tokens. See “Interpretations of Bit Values for Standard and Debug Kernels”.
Range of values:
Default: 0xf5
Minimum: 0
Maximum: 0xff
Prefix: fs.cxfs
Module: sgi-cxfs
On a server-capable administration node, checks that an objid received from a client corresponds to an object of the expected type. On a client-only node, verifies the level of reporting on receipt of an EBADOBJID error from the server. See “Interpretations of Bit Values for Standard and Debug Kernels”.
Range of values:
Default: 0xf5
Minimum: 0
Maximum: 0xff
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies whether or not to optimize the way extent lists are sent across the private network by sending a delta when possible.
Range of values:
0 does not optimize
1 optimizes (default)
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies whether or not to allow exported files to have their extents freed by DMAPI via dm_punch_hole().
Range of values:
0 allows extents to be freed (default)
1 does not allow extents to be freed
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies whether relocation is disabled or enabled (must be specified on the active metadata server):
Range of values:
0 disables relocation (default)
1 enables relocation
| Note: Relocation is disabled by default. See: |
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies (in hundredths of a second) how long a metadata server may delay broadcasting to the clients that it has updated the atime timestamp. The default for both cxfs_client_push_period and cxfs_server_push_period is 1/4 of a second, so atime updates are delayed by up to 1/2 second by default. See also “cxfs_client_push_period”.
Range of values:
Default: 25
Minimum: 0
Maximum: 1000
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies the maximum number of auxiliary cxfsd threads to run per CXFS filesystem.
Range of values:
Default: 0 (calculates value 2 * number_of_cpus, but with a minimum of 4 and a maximum of 64)
Minimum: 0
Maximum: 2048
Prefix: fs.cxfs
Module: sgi-cxfs
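The default calculation for this parameter (2 * number_of_cpus, with a minimum of 4 and a maximum of 64) amounts to a simple clamp. The sketch below uses a hypothetical function name.

```python
def cxfsd_aux_auto_value(number_of_cpus):
    """Return 2 * number_of_cpus, held between 4 and 64 inclusive."""
    return max(4, min(2 * number_of_cpus, 64))


if __name__ == "__main__":
    for cpus in (1, 8, 64):
        print(cpus, "CPUs ->", cxfsd_aux_auto_value(cpus))
```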
Specifies the maximum number of cxfsd threads to run per CXFS filesystem. (The cxfsd threads do the disk block allocation for delayed allocation buffers in CXFS and the flushing of buffered data for files that are being removed from the local cache by the metadata server.) The threads are allocated at filesystem mount time. The value of the cxfsd_max parameter at mount time remains in effect for a filesystem until it is unmounted.
Range of values:
Default: 0, which specifies the value of cxfsd_min + 2. (The value for cxfsd_max is always at least cxfsd_min + 2, even if that forces the kernel to increase the value beyond 2048.) To disable automatic cxfsd_max calculation, set cxfsd_max to a non-zero value.
Minimum: 16
Maximum: 2048
| Note: The value for cxfsd_max cannot be less than the value specified for cxfsd_min. |
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies the minimum number of cxfsd threads to run per CXFS filesystem. The value of the cxfsd_min parameter at mount time remains in effect for a filesystem until it is unmounted.
Range of values:
Default: 0, which specifies an automatically calculated value that will be 2 times the number of CPUs (the number of actual running cxfsd threads is dynamic), as long as it is in the range 16 through 2048. To disable automatic cxfsd_min calculation, set cxfsd_min to a non-zero value.
Minimum: 16
Maximum: 2048
Prefix: fs.cxfs
Module: sgi-cxfs
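The interaction between cxfsd_min and cxfsd_max can be sketched as below. This is an illustration under two stated assumptions, not a description of the kernel code: it assumes the automatic cxfsd_min value is clamped to the range 16 through 2048 (the text says only that the formula applies within that range), and it uses a hypothetical function name.

```python
def effective_cxfsd_limits(number_of_cpus, cxfsd_min=0, cxfsd_max=0):
    """Return the (minimum, maximum) cxfsd thread counts for a mount.

    A value of 0 for either argument requests automatic calculation.
    """
    if cxfsd_min == 0:
        # Assumption: 2 * CPUs, clamped to the documented 16..2048 range.
        cxfsd_min = max(16, min(2 * number_of_cpus, 2048))
    # cxfsd_max is always at least cxfsd_min + 2, even beyond 2048.
    cxfsd_max = max(cxfsd_max, cxfsd_min + 2)
    return cxfsd_min, cxfsd_max


if __name__ == "__main__":
    print(effective_cxfsd_limits(4))
    print(effective_cxfsd_limits(32))
    print(effective_cxfsd_limits(8, cxfsd_min=2048))
```

Because the values in effect at mount time persist until unmount, a check like this is most useful before mounting a filesystem.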
This section discusses parameters that should be changed only for debugging purposes and only at the recommendation of SGI:
| Caution: Debugging parameters are potentially dangerous. You should change them only at the recommendation of SGI Support. |
Specifies whether or not to disable the Linux splice implementation, which is a performance optimization for I/O-intensive applications that allows them to perform zero-copy reads and writes from the Linux page cache. The NFS server is an example of an application that is capable of using this interface. A value of 1 disables splice().
| Note: The splice implementation is required for applications using the splice(2), vmsplice(2), tee(2), and sendfile(2) system calls. On a CXFS filesystem, applications using these system calls will fail if cxfs_disable_splice is set to 1. |
Range of values:
0 enables splice
1 disables splice (default)
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies the number of changes to the extent list kept by the CXFS metadata server for generating extents deltas.
Range of values:
Default: 5
Minimum: 0
Maximum: 32
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies the minimum amount of time (in hundredths of a second) that a node is allowed to complete a withdrawal from the cluster on loss of membership. If the metadata server cannot confirm that the node has shut down, it will initiate the node's fail-policy processing to forcefully remove the node from the cluster.
Range of values:
Default: 50
Minimum: 0
Maximum: 6000
Prefix: kernel.cell
Module: sgi-cell
| Caution: Debugging parameters are potentially dangerous. You should change them only at the recommendation of SGI Support. |
Disables selected features of the token module by setting a hexadecimal flag bit:
0x1 disables speculative token acquisition
0x2 (unused)
0x4 disables token prefetching
0x8 uses multiple RPCs to obtain a token set if the rank and class conflict
0x10 disables token lending
0x20 disables the blocking of cached tokens
0x40 disables range tokens
| Note: CXFS token prefetch and range tokens are designed as optimizations for applications using CXFS filesystems on a CXFS client. However, under some workloads, they may cause stability issues and should be disabled, at the direction of SGI Support. |
Range of values:
Default: 0
Minimum: 0
Maximum: 0x7fff
Prefix: kernel.cell
Module: sgi-cell
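Because each feature is controlled by a separate bit, a value that disables several features is the bitwise OR of the individual flags. The sketch below shows this; the class and member names are hypothetical labels for the documented bits and are not identifiers from CXFS.

```python
from enum import IntFlag


class TokenDisable(IntFlag):
    """Hypothetical names for the documented flag bits."""
    SPECULATIVE_ACQUISITION = 0x1
    PREFETCH = 0x4
    MULTIPLE_RPCS = 0x8
    LENDING = 0x10
    CACHED_BLOCKING = 0x20
    RANGE_TOKENS = 0x40


def combine(*flags):
    """Return the integer value that sets all of the given flag bits."""
    value = 0
    for flag in flags:
        value |= int(flag)
    return value


if __name__ == "__main__":
    # Disable token prefetching and range tokens together.
    print(hex(combine(TokenDisable.PREFETCH, TokenDisable.RANGE_TOKENS)))
```

Combining the prefetch bit (0x4) with the range-token bit (0x40) gives 0x44.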
Specifies the number of seconds that the cms daemon will wait for a fence operation to complete. If a fence operation completion message is not received within that time, cms takes the action specified by cms_fence_timeout_action . A value of 0 is an infinite wait.
Range of values:
Default: 0
Minimum: 0
Maximum: 10000
Prefix: kernel.cell
Module: sgi-cell
Specifies the action that the cms daemon should take if the fence operation times out.
| Caution: This parameter is potentially dangerous. You should change it only at the recommendation of SGI Support. |
Range of values:
0 proceeds as if the fence operation returned an error
1 proceeds as if the fence operation succeeded
2 panics
Prefix: kernel.cell
Module: sgi-cell
Specifies whether or not to ignore reset errors.
| Caution: This parameter is potentially dangerous. You should change it only at the recommendation of SGI Support. |
Range of values:
0 does not ignore reset errors (default)
1 ignores reset errors
Prefix: kernel.cell
Module: sgi-cell
Enables or disables tracing for the cms subsystem.
Legal values:
0 enables
1 disables
Prefix: kernel.cell
Module: sgi-cell
Specifies the action that the cms daemon should take if reset times out.
| Caution: This parameter is potentially dangerous. You should change it only at the recommendation of SGI Support. |
Range of values:
0 proceeds as if the reset returned an error
1 proceeds as if the reset succeeded
2 panics
Prefix: kernel.cell
Module: sgi-cell
Specifies the maximum number of entries to store in the cred-to-credid cache.
Range of values:
Default: 1000
Minimum: 100
Maximum: 10000
Prefix: kernel.cell
Module: sgi-cell
Specifies the priority of the cred aging thread.
| Caution: This parameter is potentially dangerous. You should change it only at the recommendation of SGI Support. |
Range of values:
Default: 255
Minimum: 90
Maximum: 255
Prefix: kernel.cell
Module: sgi-cell
Specifies how long (in seconds) an entry stays in the cred-to-credid cache.
Range of values:
Default: 15
Minimum: 5
Maximum: 3600
Prefix: kernel.cell
Module: sgi-cell
Specifies the maximum age of a granted range, measured in generations, before a client will voluntarily return it.
Range of values:
Default: 10
Minimum: 0
Maximum: 1000
Prefix: kernel.cell
Module: sgi-cell
See also “cxfs_server_range_age_max”.
During CXFS filesystem relocation/recovery, the data structures for a server-capable administration node must be converted from those for the active metadata server to those for a CXFS client. This parameter specifies the delay (in seconds) before the conversion begins.
| Caution: This parameter is potentially dangerous. You should change it only at the recommendation of SGI Support. |
Range of values:
Default: 0
Minimum: 0
Maximum: 86400 (24 hours)
Prefix: kernel.cell
Module: sgi-cell
Slows down recovery by inserting delays (measured in milliseconds).
Range of values:
Default: 0
Minimum: 0
Maximum: 60000
Prefix: kernel.cell
Module: sgi-cell
Specifies the action taken when a node with stalled recovery is discovered.
Range of values:
0 shuts down a node with stalled recovery (default)
1 panics a node with stalled recovery
Prefix: kernel.cell
Module: sgi-cell
Specifies the time in seconds between recovery time-out polls.
Range of values:
Default: 60
Minimum: 0 (disables recovery polls)
Maximum: 3600
Prefix: kernel.cell
Module: sgi-cell
Specifies the time in seconds after which a node whose status is not changing is considered to have a stalled recovery.
Range of values:
Default: 600
Minimum: 0 (disables time-out; see “Prevent Stalled-Recovery Timeout in a Non-HA DMF Environment” in Chapter 2)
Maximum: 3600
Prefix: kernel.cell
Module: sgi-cell
Specifies the time in seconds following a recovery before the recovery time-out monitoring begins.
Range of values:
Default: 60
Minimum: 0
Maximum: 3600
Prefix: kernel.cell
Module: sgi-cell
Specifies the maximum age of a granted range, measured in generations, before the server will recall it.
Range of values:
Default: 10
Minimum: 0
Maximum: 1000
Prefix: kernel.cell
Module: sgi-cell
See also “cxfs_client_range_age_max”.
Verifies compliance with the token and locking hierarchy. The value is a bitmask as described in “Interpretations of Bit Values for Standard and Debug Kernels”.
Range of values:
Default: 0x0
Minimum: 0
Maximum: 0xff
Prefix: kernel.cell
Module: sgi-cell
Specifies the bitmask that indicates whether cxfsd tasks must be run synchronously or asynchronously by threads from a cxfsd thread pool. The bits correspond to the opcodes in cxfsd.h.
Range of values:
Minimum: 0 (all cxfsd operations asynchronous)
Maximum: 0x7fffffff (all cxfsd operations synchronous)
Prefix: fs.cxfs
Module: sgi-cxfs
Specifies the minimum number of threads to use for processing cluster events.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 100000
Prefix: kernel.cell
Module: sgi-cell
Specifies the maximum number of threads to use for processing cluster events.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 100000
Prefix: kernel.cell
Module: sgi-cell
Specifies the minimum number of threads to use for processing general (low-priority) message traffic within the local node.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 100000
Prefix: kernel.cell
Module: sgi-cell
Specifies the maximum number of threads to use for processing priority message traffic within the local node. This ensures that even if mlb_notify_min threads are tied up handling general traffic, additional threads can be created to handle priority traffic.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 100000
Prefix: kernel.cell
Module: sgi-cell
Specifies the maximum number of auxiliary mlb_notify threads used for processing messages for which processing can take a long time, for message traffic within the local node.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 100000
Prefix: kernel.cell
Module: sgi-cell
Specifies how long a thread within the local node will sleep when it finds no work, to help reduce the rate at which CXFS creates and destroys threads.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 3600
Prefix: kernel.cell
Module: sgi-cell
Enables checksumming. Normally, this is not needed and is only used if TCP data corruption is suspected.
Range of values:
0 performs no validation (default)
1 generates checksums, but does not perform validation
2 generates and validates checksums, warns (via a SYSLOG message) on validation failure
3 generates and validates checksums, warns and returns an error message on validation failure
4 generates and validates checksums, warns and panics on validation error
Prefix: kernel.cell
Module: sgi-cell
Specifies the maximum number of auxiliary mtcp_notify threads used for processing message traffic between nodes for those messages that can take a long time.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 100000
Prefix: kernel.cell
Module: sgi-cell
Specifies the maximum number of threads to use for processing priority message traffic between nodes. This ensures that even if mtcp_notify_min threads are tied up handling general traffic, additional threads can be created to handle priority traffic.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 100000
Prefix: kernel.cell
Module: sgi-cell
Specifies the minimum number of threads to use for processing general (low-priority) message traffic between nodes.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 100000
Prefix: kernel.cell
Module: sgi-cell
Specifies how long a thread that handles message traffic between nodes will sleep when it finds no work, to help reduce the rate at which CXFS creates and destroys threads.
Range of values:
Default: 0 (specifies that the system will pick a suitable value)
Minimum: 0
Maximum: 3600
Prefix: kernel.cell
Module: sgi-cell