Chapter 3. Administering an Array

An Array system is an aggregation of nodes, which are IRIX servers bound together with a high-speed network and Array 3.0 software. Array administrators are IRIX system administrators who must use additional tools to configure and manage the Array as an Array.

This chapter surveys the tools that you as an administrator use, with pointers to more detailed information. (Appendix B, “Array Documentation Quick Reference,” summarizes all pointers for quick access.) The main topics covered include:

Using the IRISconsole Workstation

The Silicon Graphics, Inc. O2 or Indy workstation running IRISconsole is your primary control point, from which you can manage the hardware and software of all nodes in the array, performing all operator actions on every node, even including power-cycling and rebooting. Figure 3-1 shows the IRISconsole main window.

Figure 3-1. IRISconsole Main Window

From the main window you can open the Systems window in which you find an icon for each node in the selected Array. This is shown in Figure 3-2.

Figure 3-2. IRISconsole Systems Window

In the Systems window you can select one system icon and apply any of the functions shown as buttons to that system. Several functions allow you to perform radical hardware operations: power-cycle, reset, or cause a nonmaskable interrupt (NMI).

The HW Status function opens a hardware status window for the selected system, as shown in Figure 3-3. From this window you can monitor power-supply voltages, internal temperature, and fan RPM. (This feature is available only for nodes that are Challenge or Onyx systems, not for Origin2000 systems.)

Figure 3-3. IRISconsole Hardware Status Window

The Get Console function opens an IRIX shell window that is the system console window for the selected system. Figure 3-4 shows a system console window in the middle of a reboot.

Figure 3-4. IRISconsole System Console Window

Using Other Tools With IRISconsole

The same workstation that runs IRISconsole can be used simultaneously to run other administrative and diagnostic software such as Performance Co-Pilot.

You or a system programmer can run Performance Co-Pilot in order to analyze the performance of the Array. Performance Co-Pilot works well with IRISconsole: from system console windows you can start and stop processes, and with PCP you can observe the performance effects.

Other products for monitoring and analysis can be executed at the console workstation so that all your display tools are on the same screen. For example, NetVisualyzer™ can be used to monitor network traffic concurrently with IRISconsole and Performance Co-Pilot.

You can also run the aview command at the console station, as another way of staying aware of the system status (see “Browsing With ArrayView”).

Table 3-1 lists information sources for the console management aids.

Table 3-1. Information Sources: Console Management

Topic                  Book or URL                                                       Book Number
IRISconsole            IRISconsole Administrator's Guide                                 007-2872-xxx
                       http://www.sgi.com/Products/hardware/challenge/IRISconsole.html
IRISconsole hardware   IRISconsole ST-1600 Installation Guide                            007-2839-xxx
                       Indy Workstation Owner's Guide                                    007-9804-xxx
Performance Co-Pilot   The Performance Co-Pilot User's and Administrator's Guide         007-2614-xxx
NetVisualyzer          NetVisualyzer User's Guide                                        007-0812-xxx
aview command          aview(1) and “Browsing With ArrayView”


About Array Configuration

The system administrator has to initialize the Array configuration database, a file that is used by the Array Services daemon in executing almost every ainfo and array command. For details about array configuration, see the reference pages cited in Table 3-2.

Table 3-2. Information Sources: Array Configuration

Topic                                Book, Reference Page, or URL                          Book Number
Array Services overview              array_services(5)
Array Services user commands         ainfo(1), array(1)
Array Services daemon overview       arrayd(1m)
Configuration file format            arrayd.conf(4), /usr/lib/array/arrayd.conf.template
Configuration file validator         ascheck(1)
Array Services simple configurator   arrayconfig(1m)


About the Uses of the Configuration File

The configuration files are read by the Array Services daemon when it starts. Normally it is started in each node during the system bootstrap. (You can also run the daemon from a command line in order to check the syntax of the configuration files.)

The configuration files inform the daemon of the data needed by ainfo and array:

  • The names of Array systems, including the current Array but also any other Arrays on which a user could run an Array Services command (reported by ainfo).

  • The names and types of the nodes in each named Array, especially the hostnames that would be used in an Array Services command (reported by ainfo).

  • The authentication keys, if any, that must be used with Array Services commands (required as -Kl and -Kr command options, see “Summary of Common Command Options”).

  • The commands that are valid with the array command.

About Configuration File Format and Contents

A configuration file is a readable text file. The file contains entries of the following four types, which are detailed in later topics.

Array definition

Describes this array and other known arrays, including array names and the node names and types.

Command definition

Specifies the usage and operation of a command that can be invoked through the array command.

Authentication

Specifies authentication numbers that must be used to access the Array.

Local option

Options that modify the operation of the other entries or arrayd.

Blank lines, white space, and comment lines beginning with “#” can be used freely for readability. Entries can be in any order in any of the files read by arrayd.

Apart from punctuation, entries are formed with a keyword-based syntax. Keyword recognition is not case-sensitive; however, keywords are shown in uppercase in this text and in the reference page. The entries are primarily formed from keywords, numbers, and quoted strings, as detailed in the reference page arrayd.conf(4).

Loading Configuration Data

The Array Services daemon, arrayd, takes one or more filenames as arguments. It reads them all and treats them as logical continuations of one another (in effect, it concatenates them). If no filenames are specified, it reads /usr/lib/array/arrayd.conf and /usr/lib/array/arrayd.auth. A different set of files, and any other arrayd command-line options, can be written into the file /etc/config/arrayd.options, which is read by the /etc/init.d/array script that launches arrayd at boot time.

Since configuration data can be stored in two or more files, you can combine different strategies, for example:

  • One file can have different access permissions than another. Typically, /usr/lib/array/arrayd.conf is world-readable and contains the available array commands, while /usr/lib/array/arrayd.auth is readable only by root and contains authentication codes.

  • One node can have different configuration data than another. For example, certain commands might be defined only in certain nodes; or only the nodes used for interactive logins might know the names of all other nodes.

  • You can use NFS-mounted configuration files. You could put a small configuration file on each machine to define the Array and authentication keys, but you could have a larger file defining array commands that is NFS-mounted from one node.

After you modify the configuration files, you can make arrayd reload them by killing the daemon and restarting it on each machine. The script /etc/init.d/array supports this operation: execute

/etc/init.d/array stop

to kill the daemon, and

/etc/init.d/array restart

to kill and restart it in one operation.
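Because the daemon must be restarted separately on each machine, a small loop can save typing. The following is a minimal sketch, not part of Array Services: the function name restart_arrayd_on and the RUN variable are hypothetical, and it assumes that root can reach each node with rsh. By default it only prints the commands it would run.

```shell
#!/bin/sh
# restart_arrayd_on NODE...
# Hypothetical helper: restart arrayd on each named node via rsh.
# RUN defaults to "echo", so the commands are only printed (a dry run).
# Set RUN to the empty string to execute them for real.
restart_arrayd_on() {
    for node in "$@"; do
        ${RUN-echo} rsh "$node" /etc/init.d/array restart
    done
}
```

For example, restart_arrayd_on congo niger nile prints one rsh command line per node; once the list looks right, the same call with RUN set to the empty string performs the restarts.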

The Array Services daemon in any node knows only the information in the configuration files available in that node. This can be an advantage, in that you can limit the use of particular nodes; but it does require that you take pains to keep common information synchronized. (An automated way to do this is sketched under “Designing New Array Commands”.)

About Substitution Syntax

The reference page arrayd.conf(4) details the syntax rules for forming entries in the configuration files. An important feature of this syntax is the use of several kinds of text substitution, by which variable text is substituted into entries when they are executed.

Most of the supported substitutions are used in Command entries. These substitutions are performed dynamically, each time the array command invokes a subcommand. At that time, substitutions insert values that are unique to the invocation of that subcommand. For example, the value %USER inserts the user ID of the user who is invoking the array command. Such a substitution has no meaning except during execution of a command.

Substitutions in other configuration entries are performed only once, at the time the configuration file is read by arrayd. Only environment variable substitution makes sense in these entries. The environment variable values that are substituted are the values inherited by arrayd from the script that invokes it, which is /etc/init.d/array.

Testing Configuration Changes

The configuration files contain many sections and options (detailed in the topics that follow this one). The Array Services command ascheck performs a basic sanity check of all configuration files in the Array.

After making a change, you can test an individual configuration file for correct syntax by executing arrayd as a command with the -c and -f options. For example, suppose you have just added a new command definition to /usr/lib/array/arrayd.local. You can check its syntax with the command

arrayd -c -f /usr/lib/array/arrayd.local
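When several files have changed, the same check can be repeated for each one. The sketch below wraps the arrayd -c -f command in a loop. The function name check_array_configs is hypothetical, and the sketch assumes that arrayd exits with a nonzero status when it finds an error, which this chapter does not confirm; consult arrayd(1m) before relying on it.

```shell
#!/bin/sh
# check_array_configs FILE...
# Run "arrayd -c -f" on each file and report the ones that fail.
# ASSUMPTION: arrayd returns a nonzero exit status when the check fails.
check_array_configs() {
    failed=0
    for file in "$@"; do
        if arrayd -c -f "$file"; then
            echo "ok:     $file"
        else
            echo "FAILED: $file" >&2
            failed=1
        fi
    done
    # Return 0 only if every file passed the check.
    return $failed
}
```

A call such as check_array_configs /usr/lib/array/arrayd.conf /usr/lib/array/arrayd.local checks both files and returns a failing status if either one is rejected.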

When testing new commands for correct operation, you need to see the warning and error messages produced by arrayd and processes that it may spawn. The stderr messages from a daemon are not normally visible. You can make them visible by the following procedure:

  1. On one node, kill the daemon.

  2. In one shell window on that node, start arrayd with the options -n -v. Instead of moving into the background, it remains attached to the shell terminal.


    Note: Although arrayd becomes functional in this mode, it does not refer to /etc/config/arrayd.options, so you need to specify explicitly all command-line options, such as the names of nonstandard configuration files.


  3. From another shell window on the same or other nodes, issue ainfo and array commands to test the new configuration data. Diagnostic output appears in the arrayd shell window.

  4. Terminate arrayd and restart it as a daemon (without -n).

During steps 1, 2, and 4, the test node may fail to respond to ainfo and array commands, so users should be warned that the Array is in test mode.

Configuring Arrays and Machines

Each ARRAY entry gives the name and composition of an Array system that users can access. At least one ARRAY, the Array in use, must be defined at every node.

Specifying Arrayname and Machine Names

A simple example of an ARRAY definition is as follows:

array simple
        machine congo
        machine niger
        machine nile

The arrayname simple is the value the user must specify in the -a option (see “Summary of Common Command Options”). One arrayname should be specified in a DESTINATION ARRAY local option as the default array (reported by ainfo dflt). Local options are listed under “Configuring Local Options”.

The MACHINE subentries of ARRAY define the nodenames that the user can specify with the -s option. These names are also reported by the command ainfo machines.
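Since every node needs a consistent ARRAY entry, it can be convenient to generate the entry from a list of node names rather than type it by hand on each machine. The following sketch prints an entry in the layout of the example above; the function name make_array_entry is hypothetical and is not an Array Services tool.

```shell
#!/bin/sh
# make_array_entry ARRAYNAME NODE...
# Hypothetical helper: print an ARRAY entry with one MACHINE subentry
# per node, using the same layout as the "simple" example.
make_array_entry() {
    if [ $# -lt 2 ]; then
        echo "usage: make_array_entry arrayname node..." >&2
        return 1
    fi
    name=$1
    shift
    printf 'array %s\n' "$name"
    for node in "$@"; do
        printf '        machine %s\n' "$node"
    done
}
```

Running make_array_entry simple congo niger nile reproduces the entry shown earlier; the output can be redirected into a file and checked before it is installed.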

Specifying IP Addresses and Ports

The simple MACHINE subentries shown in the example are based on the assumption that the hostname is the same as the machine's name to Domain Name Services (DNS). If a machine's IP address cannot be obtained from the given hostname, you must provide a HOSTNAME subentry to specify either a fully qualified domain name or an IP address, as follows:

array simple
        machine congo
            hostname congo.engr.hitech.com
            port 8820
        machine niger
            hostname niger.engr.hitech.com
        machine nile
            hostname "198.206.32.85"

The preceding example also shows how the PORT subentry can be used to specify that arrayd in a particular machine uses a different socket number than the default 5434.

Specifying Additional Attributes

Under both ARRAY and MACHINE you can insert “attributes,” which are named string values. These attributes are not used by Array Services, but they are displayed by ainfo and can be returned to programs using the Array Services library (“Array Services Library”). Some examples of attributes would be as follows:

array simple
        array_attribute config_date="04/03/96"
        machine a_node
            machine_attribute aka="congo"
            hostname congo.engr.hitech.com


Tip: You can write code that fetches any arrayname, machine name, or attribute string from any node in the array. See “Database Interrogation”.


Configuring Authentication Codes

In Array 3.0 only one type of authentication is provided: a simple numeric key that can be required with any Array Services command. You can specify a single authentication code number for each node. The user must specify the code with any command entered at that node, or addressed to that node using the -s option (see “Summary of Common Command Options”).

The arshell command is like rsh in that it runs a command on another machine under the userid of the invoking user. Use of authentication codes makes Array Services somewhat more secure than rsh.

Configuring Array Commands

The user can invoke arbitrary IRIX commands on single nodes using the arshell command (see “Using arshell”). The user can also launch MPI and PVM programs that automatically distribute over multiple nodes. However, the only way to launch coordinated IRIX programs on all nodes at once is to use the array command. This command does not accept arbitrary IRIX commands; it only permits execution of commands that the administrator has configured into the Array Services database.

You can define any set of commands that your users need. You have complete control over how any single Array node executes a command (the definition can be different in different nodes). A command can simply invoke a standard IRIX command, or, since you can define a command as invoking a script, you can make a command arbitrarily complex.

Operation of Array Commands

When a user invokes the array command, the subcommand and its arguments are processed by the destination node specified by -s. Unless the -l option was given, that daemon also distributes the subcommand and its arguments to all other array nodes that it knows about (the destination node might be configured with only a subset of nodes). At each node, arrayd searches the configuration database for a COMMAND entry with the same name as the array subcommand.

For example, when the user enters

array -s tokyo uptime

the subcommand uptime is processed by arrayd in node tokyo. When it finds the subcommand valid, it distributes it to every node that is configured in the default array at node tokyo.

The COMMAND entry for uptime is distributed in this form (you can read it in the file /usr/lib/array/arrayd.conf).

command uptime          # Display uptime/load of all nodes in array
        invoke /usr/lib/array/auptime %LOCAL

The INVOKE subentry tells arrayd how to execute this command. In this case, it executes a shell script, /usr/lib/array/auptime, passing it one argument, the name of the local node. This command is executed at every node, with %LOCAL replaced by that node's name.
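The contents of the distributed auptime script are not reproduced in this chapter. Purely as an illustration of how a script receives the %LOCAL value, a per-node script might be structured like the sketch below. The function name report_uptime is hypothetical, and this is not the shipped auptime.

```shell
#!/bin/sh
# Hypothetical stand-in for a per-node script invoked as
#     invoke /some/path/script %LOCAL
# arrayd replaces %LOCAL with the node's name, which arrives as $1.
report_uptime() {
    node=$1
    # Prefix the local uptime output with the node name, so that lines
    # from different nodes can be told apart once they are combined.
    printf '%s: %s\n' "$node" "$(uptime)"
}
```

Labeling each line with the node name is a design choice: the output streams of all nodes end up in one place, and unlabeled uptime lines would be indistinguishable.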

Summary of Command Definition Syntax

Look at the basic set of commands distributed with Array 3.0 (/usr/lib/array/arrayd.conf). Each COMMAND entry is defined using the subentries shown in Table 3-3. (These are described in great detail in reference page arrayd.conf(4).)

Table 3-3. Subentries of a COMMAND Definition

Keyword   Meaning of Following Values
COMMAND   The name of the command as the user gives it to array.
INVOKE    An IRIX command to be executed on every node. The argument values can be literals, or arguments given by the user, or other substitution values.
MERGE     An IRIX command to be executed only on the distributing node, to gather the streams of output from all nodes and combine them into a single stream.
USER      The userid under which the INVOKE and MERGE commands run. Usually given as USER %USER, so as to run as the user who invoked array.
GROUP     The groupname under which the INVOKE and MERGE commands run. Usually given as GROUP %GROUP, so as to run in the group of the user who invoked array (see reference page groups(1)).
PROJECT   The project under which the INVOKE and MERGE commands run. Usually given as PROJECT %PROJECT, so as to run in the project of the user who invoked array (see reference page projects(5)).
OPTIONS   A variety of options to modify this command; see Table 3-5.

The IRIX commands called by INVOKE and MERGE must be specified as full pathnames, because arrayd has no defined execution path. As with a shell script, these IRIX commands are often composed from a few literal values and many substitution strings. The substitutions that are supported (which are documented in detail in the arrayd.conf(4) reference page) are summarized in Table 3-4.

Table 3-4. Substitutions Used in a COMMAND Definition

Substitution                            Replacement Value
%1..%9; %ARG(n); %ALLARGS; %OPTARG(n)   Argument tokens from the user's subcommand. %OPTARG does not produce an error message if the specified argument is omitted.
%USER, %GROUP, %PROJECT                 The effective userid, effective groupid, and project of the user who invoked array.
%REALUSER, %REALGROUP                   The real userid and real groupid of the user who invoked array.
%ASH                                    The ASH under which the INVOKE or MERGE command is to run.
%PID(ash)                               List of PID values for a specified ASH. %PID(%ASH) is a common use.
%ARRAY                                  The arrayname, either default or as given in the -a option.
%LOCAL                                  The hostname of the executing node.
%ORIGIN                                 The full domain name of the node where the array command ran and the output is to be viewed.
%OUTFILE                                List of names of temporary files, each containing the output from one node's INVOKE command (valid only in the MERGE subentry).

The OPTIONS subentry permits a number of important modifications of the command execution; these are summarized in Table 3-5.

Table 3-5. Options of the COMMAND Definition

Keyword      Effect on Command
LOCAL        Do not distribute to other nodes (effectively forces the -l option).
NEWSESSION   Execute the INVOKE command under a newly minted ASH. %ASH in the INVOKE line is the new ASH. The MERGE command runs under the original ASH, and %ASH substitutes as the old ASH in that line.
SETRUID      Set both the real and effective user ID from the USER subentry (normally USER sets only the effective UID).
SETRGID      Set both the real and effective group ID from the GROUP subentry (normally GROUP sets only the effective GID).
QUIET        Discard the output of INVOKE. If a MERGE subentry is given, pass the INVOKE output to MERGE as usual and discard the MERGE output instead.
NOWAIT       Discard output and return as soon as the processes are invoked; do not wait for completion (a MERGE subentry is ineffective).
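To see how the subentries, substitutions, and options fit together, it can help to write out a complete entry. The sketch below prints a hypothetical COMMAND entry from a shell function so that it can be redirected into a test file and syntax-checked with arrayd -c -f before use. The command name diskfree, the script path /usr/lib/array/adf, and the function name emit_command_entry are all invented for illustration; only the keywords and substitutions come from the tables above.

```shell
#!/bin/sh
# emit_command_entry
# Print a hypothetical COMMAND entry on standard output.
# "diskfree" and /usr/lib/array/adf are illustrative names only.
emit_command_entry() {
    cat <<'EOF'
command diskfree        # hypothetical: report free space on every node
        invoke /usr/lib/array/adf %LOCAL %OPTARG(1)
        merge  /bin/cat %OUTFILE
        user   %USER
        group  %GROUP
        options setruid
EOF
}
```

In this sketch INVOKE passes the node name and an optional first user argument to the per-node script, MERGE concatenates the per-node output files, and SETRUID makes the script run with the invoking user's real as well as effective user ID.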


Configuring Local Options

The LOCAL entry specifies options to arrayd itself. The most important options are summarized in Table 3-6.

Table 3-6. Subentries of the LOCAL Entry

Subentry               Purpose
DIR                    Pathname for the arrayd working directory, which is the initial, current working directory of INVOKE and MERGE commands. The default is /usr/lib/array.
DESTINATION ARRAY      Name of the default array, used when the user omits the -a option. When only one ARRAY entry is given, it is the default destination.
USER, GROUP, PROJECT   Default values for COMMAND execution when USER, GROUP, or PROJECT are omitted from the COMMAND definition.
HOSTNAME               Value returned in this node by %LOCAL. Default is the hostname.
PORT                   Socket to be used by arrayd.

If you do not supply LOCAL USER, GROUP, and PROJECT values, the default values for USER and GROUP are “guest.”


Note: The HOSTNAME entry is needed whenever the IRIX hostname command does not return a node name as specified in the ARRAY MACHINE entry. In order to supply a LOCAL HOSTNAME entry unique to each node, each node needs an individualized copy of at least one configuration file.
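One way to manage the individualized file is to generate a small per-node file that contains only the LOCAL entry, and keep everything else in files that are identical on all nodes. The sketch below prints such an entry. The function name emit_local_entry is hypothetical, and the exact layout of the LOCAL entry is an assumption modeled on the ARRAY and COMMAND examples in this chapter; verify it against arrayd.conf(4) and with arrayd -c -f before installing it.

```shell
#!/bin/sh
# emit_local_entry ARRAYNAME NODENAME
# Hypothetical helper: print a LOCAL entry naming the default array and
# the node name that %LOCAL should return on this node.
# ASSUMPTION: subentries are written one per line under "local".
emit_local_entry() {
    if [ $# -ne 2 ]; then
        echo "usage: emit_local_entry arrayname nodename" >&2
        return 1
    fi
    cat <<EOF
local
        destination array $1
        hostname $2
EOF
}
```

The node name is a required argument rather than the output of the hostname command, because the entry is needed precisely when the two differ.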


Designing New Array Commands

A basic set of commands is distributed in the file /usr/lib/array/arrayd.conf.template. You should examine this file carefully before defining commands of your own. You can define new commands which then become available to the users of the Array system.

Typically, a new command will be defined with an INVOKE subentry that names a script written in sh, csh, or perl syntax. You use the substitution values to set up arguments to the script. You use the USER, GROUP, PROJECT, and OPTIONS subentries to establish the execution conditions of the script. For one example of a command definition using a simple script, see “About the Distributed Example”.

Within the invoked script you can write any amount of logic to verify and validate the arguments, and to execute any sequence of commands. For an example of a script in perl, see /usr/lib/array/aps, which is invoked by the array ps command.


Tip: perl is a particularly interesting choice for array commands, since perl has native support for socket I/O. In principle at least, you could build a distributed application in perl in which multiple instances are launched by array and coordinate and exchange data using sockets. Performance would not rival the highly tuned MPI and PVM libraries, but development would be simpler.

The administrator also needs distributed applications, since the configuration files are distributed over the Array. Here is an example of a distributed command to reinitialize the Array Services database on all nodes at once. The script to be executed at each node, called /usr/lib/array/arrayd-reinit, would read as follows:

#!/bin/sh
# Script to reinitialize arrayd with a new configuration file
# Usage:  arrayd-reinit <hostname:new-config-file>
sleep 10      # Let old arrayd finish distributing
# Copy the new file; leave the running daemon alone if the copy fails
rcp "$1" /usr/lib/array/ || exit 1
/etc/init.d/array restart
exit 0

The script uses rcp to copy a specified file (presumably a configuration file such as arrayd.conf) into /usr/lib/array (this will fail if %USER is not privileged). Then the script restarts arrayd (see /etc/init.d/array) to reread configuration files.

The command definition would be as follows:

command reinit
   invoke /usr/lib/array/arrayd-reinit %ORIGIN:%1
   user   %USER
   group  %GROUP
   options nowait   # Exit before restart occurs!

The INVOKE subentry calls the restart script shown above. The NOWAIT option prevents the daemon from waiting for the script to finish, since the script will kill the daemon.
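Because reinit replaces the configuration on every node, it is worth checking the file before distributing it. The following wrapper is a sketch only: the function name push_array_config is hypothetical, and it assumes that arrayd -c -f returns a nonzero exit status for a file with errors, which this chapter does not state. The file is given as an absolute pathname because each node's script fetches it with rcp from the originating node.

```shell
#!/bin/sh
# push_array_config FILE
# Hypothetical wrapper: syntax-check FILE locally, then distribute it
# with the reinit command defined above.
# ASSUMPTION: "arrayd -c -f" exits nonzero when the file has errors.
push_array_config() {
    conf=$1
    case "$conf" in
        /*) ;;                      # absolute pathname: acceptable
        *)  echo "usage: push_array_config /absolute/path/to/file" >&2
            return 2 ;;
    esac
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    arrayd -c -f "$conf" || { echo "syntax check failed: $conf" >&2; return 1; }
    # %1 in the COMMAND entry receives this pathname.
    array reinit "$conf"
}
```

Note that the local check covers only the originating node's view of the file; nodes that combine it with other, node-specific files may still reject the result.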