An Array system is an aggregation of nodes, which are IRIX servers bound together with a high-speed network and Array 3.0 software. Array users are IRIX users who enjoy the advantage of greater performance and additional services. Array users access the system with familiar commands for job control, login and password management, and remote execution.
Array 3.0 augments conventional IRIX facilities with additional services for array users and for array administrators. The extensions include support for global session management, array configuration management, batch processing, message passing, system administration, and performance visualization.
This chapter introduces the extensions for Array use, with pointers to more detailed information. (Appendix B, “Array Documentation Quick Reference,” summarizes all the pointers for quick access.) The principal topics are as follows:
“Using an Array System” summarizes what a user needs to know, and the main facilities a user has available.
“Managing Local Processes” reviews the conventional tools for listing and controlling processes within one node.
“Managing Batch Jobs with NQE” summarizes the use of the Network Queueing Environment.
“Using Array Services Commands” describes the common concepts, options, and environment variables used by the Array Services commands.
“Interrogating the Array” summarizes how to use Array Services commands to learn about the Array and its workload, with examples.
“Managing Distributed Processes” summarizes how to use Array Services commands to list and control processes in multiple nodes.
As an ordinary user of an Array system you are an IRIX (that is, UNIX) user, with the additional benefit of being able to run distributed sessions on multiple nodes of the Array. You access the Array from any of the following:
A workstation such as an SGI O2
An X-terminal
An ASCII terminal
In each case, you log in to one node of the Array in the way you would log in to any remote UNIX host. From a workstation or an X-terminal you can of course open more than one terminal window and log into more than one node.
In order to use an Array, you need the following items of information:
The name of the Array.
You use this arrayname in Array Services commands.
The login name and password you will use on the Array
You use these when logging in to the Array to use it.
The hostnames of the array nodes.
Typically these names follow a simple pattern, often arrayname1, arrayname2, etc.
Any special resource-distribution or accounting rules that may apply to you or your group under a job scheduling system.
You can learn the hostnames of the array nodes if you know the array name, using the ainfo command:
```
ainfo -a arrayname machines
```
Each node in an Array is a Silicon Graphics, Inc. multiprocessor system such as an Origin2000. Each node has an associated hostname and IP network address. Typically, you use an Array by logging in to one node directly, or by logging in remotely from another host (such as the Array console or a networked workstation). For example, from a workstation on the same network, this command would log you in to the node named hydra6:
```
rlogin hydra6
```
For details of the rlogin command, see the reference page rlogin(1).
The system administrators of your Array may choose to disallow direct node logins in order to schedule array resources. If your site is configured to disallow direct node logins, your administrators will be able to tell you how you are expected to submit work to the array—perhaps through remote execution software or batch queueing facilities.
Once you have access to an Array you can invoke programs of several classes:
Ordinary (sequential) applications
Parallel shared-memory applications within a node
Parallel message-passing applications within a node
Parallel message-passing applications distributed over multiple nodes (and possibly other servers on the same network running Array 3.0)
If you are allowed to do so, you can invoke programs explicitly from a logged-in shell command line; or you may use remote execution or a batch queueing system.
Programs that are X Windows clients must display on an X server, either an X-terminal or a workstation running X Windows.
Some application classes may require input in the form of command line options, environment variables, or support files upon execution. For example:
X client applications need the DISPLAY environment variable set to specify the X server (workstation or X-terminal) where their windows will display.
The DISPLAY variable is normally set automatically when you use rlogin from an SGI workstation.
A multithreaded program may require environment variables to be set describing the number of threads.
For example, C and Fortran programs that use parallel processing directives test the MP_SET_NUMTHREADS variable.
MPI and PVM message-passing programs may require support files to describe how many tasks to invoke on which nodes.
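A login script often sets such variables before any work is launched. The following sketch uses the variable names from the text above; the values themselves are only illustrative.

```shell
# Illustrative environment setup before launching jobs.
DISPLAY=myworkstation:0                    # where X clients display
export DISPLAY
MP_SET_NUMTHREADS=4                        # threads for MP directive programs
export MP_SET_NUMTHREADS
echo "DISPLAY=$DISPLAY, threads=$MP_SET_NUMTHREADS"
```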
Some information sources on program invocation are listed in Table 2-1.
Table 2-1. Information Sources: Invoking a Program
Topic | Book, Reference Page, or URL | Book Number
---|---|---
Remote login | rlogin(1) |
Setting environment variables | environ(5), env(1) |
Starting MPI and PVM jobs | MPI and PVM User's Guide | 007-3286-xxx
Each IRIX process has a process identifier (PID), a number that identifies that process within the node where it runs. It is important to realize that a PID is local to the node; so it is possible to have processes in different nodes using the same PID numbers.
Within a node, processes can be logically grouped in process groups. A process group is composed of a parent process together with all the processes that it creates. Each process group has a process group identifier (PGID). Like a PID, a PGID is defined locally to that node, and there is no guarantee of uniqueness across the Array.
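The fact that a child process joins its parent's process group can be demonstrated with standard UNIX tools. A minimal sketch:

```shell
#!/bin/sh
# A child process created by this shell inherits the shell's PGID.
sleep 2 &                           # create a child process
parent_pgid=$(ps -o pgid= -p $$)
child_pgid=$(ps -o pgid= -p $!)
echo "parent PGID: $parent_pgid, child PGID: $child_pgid"
wait                                # reap the child
```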
You query the status of processes using the IRIX command ps. To generate a full list of all processes on a local system, use a command such as
```
ps -elfj
```
You can monitor the activity of processes using the command top (an ASCII display in a terminal window) or gr_top (displays in a graphical window).
For a global picture of the state of one node you can use gr_osview. It displays a variety of resource use values as histograms or bar-graphs in a graphical window. The command gmemusage displays memory use by all applications in the node where you start it.
You can start a process at a reduced priority, so that it interferes less with other processes, using the nice command. If you use the csh shell, specify /usr/bin/nice to avoid the built-in shell command nice. To start a whole shell at low priority, use a command like
```
/usr/bin/nice /bin/sh
```
You can schedule commands to run at specific times using the at command. You can kill or stop processes using the kill command. To destroy the process with PID 13032, use a command such as
```
kill -KILL 13032
```
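A self-contained illustration of terminating a process by PID (using the politer SIGTERM rather than the unconditional SIGKILL):

```shell
#!/bin/sh
# Start a long-running process, then terminate it by PID.
sleep 60 &
pid=$!
kill -TERM "$pid"                 # polite termination; kill -KILL forces it
wait "$pid" 2>/dev/null || true   # reap the child; status reflects the signal
echo "process $pid terminated"
```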
Table 2-2 summarizes information about local process management.
Table 2-2. Information Sources: Local Process Management
Topic | Book, Reference Page, or URL | Book Number
---|---|---
Process ID and process group | intro(2) — scan to the section headed “Definitions” |
Listing and monitoring processes | ps(1), top(1), gr_top(1), gr_osview(1), gmemusage(1) |
Running programs at low priority | nice(1), batch(1) |
Running programs at a scheduled time | at(1) |
Terminating a process | kill(1) |
The Network Queueing Environment (NQE) is used to manage batch jobs. A batch job is a set of commands—a shell script. You submit batch job requests from a workstation to NQE, and NQE routes the jobs to an appropriate server. When a job completes, NQE returns the standard output and standard error files to the workstation. You can monitor the status of jobs, as well as delete or signal jobs.
NQE provides reliable file transfer with the File Transfer Agent (FTA), so that job scripts can transfer files to and from remote systems. If a file transfer fails for a transient reason such as a network link failing, FTA automatically requeues the transfer. This is useful in job requests because a job does not abort if the file transfer fails on the first attempt. If allowed by the site, a password is not required for the file transfer. This capability of FTA is called Network Peer-to-Peer Authorization (NPPA).
NQE is usually installed in /usr/craysoft/nqe/bin on IRIX workstations. If that directory does not exist, contact your system administrator to find out whether NQE is installed and where. If /usr/craysoft/nqe/bin does exist, add it to your PATH variable. For example:
```
% setenv PATH $PATH:/usr/craysoft/nqe/bin
```
or
```
$ export PATH=$PATH:/usr/craysoft/nqe/bin
```
The easiest way to start using NQE is through its graphical interface as implemented by the nqe command (see the nqe(1) reference page). If you run nqe on your workstation, you just start it. If you need to start nqe on an array node, with output to your workstation, you may need to set your DISPLAY variable first, as shown in the following example:
```
% setenv DISPLAY myworkstation:0
% nqe
```
Figure 2-1 shows the initial (top-level) NQE button bar window that should immediately appear.
To see the status of jobs running under NQE, click on the Status button. Figure 2-2 shows an example of the Status window.
The example Status Window displays two jobs. Both are executing on the server homegrown and both are running (or will run) under the user account guest.
To refresh the status display, use the Refresh button in the Status window. You may also have the display refreshed periodically by setting the refresh option in the NQE Configuration Information Window, shown in Figure 2-3. Access the NQE Configuration Information Window using the Config button on the NQE button bar.
Use the slide bar labeled “Status Refresh Rate” (shown in Figure 2-3) to set the refresh rate to a value other than 0. If the rate is set to 60, the NQE status display is refreshed every 60 seconds.
To submit a new batch job, display the Submit window (accessed using the Submit button in the NQE button bar). Figure 2-4 shows an example of the Submit window with a sample job script. To submit the job, click the Submit button.
A few details of the example job script shown in Figure 2-4 are of interest. The #QSUB string is an NQE directive, used to embed command line options within the script. (See the cqsub(1) or qsub(1) reference page for more information on embedded options.) The line
```
#QSUB -a 8:05
```
indicates to NQE that the job request should not be started until after 8:05. The line
```
#QSUB -A nqearray
```
indicates to NQE that the job should run using the project “nqearray”. (See the projects(5) reference page for more information on project names.)
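Putting these directives together, a minimal job script might look like the following sketch. Only the -a and -A options discussed above are used; the script body is illustrative.

```shell
#!/bin/sh
# Sketch of an NQE job script. Lines beginning with #QSUB are read by
# cqsub/qsub as embedded options; the shell treats them as comments.
#QSUB -a 8:05          # do not start before 8:05
#QSUB -A nqearray      # run under the "nqearray" project
echo "job running on node $(uname -n)"
```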
You can also operate NQE using a command-line interface. The NQE commands are summarized in Table 2-3. For details of the command-line interface, see the NQE User's Guide.
Table 2-3. NQE Command Line Interface Summary
Command Name | Purpose |
---|---|
cevent | Posts, reads, and deletes information on job-dependency events. |
cqdel | Signals or deletes a job request. |
cqstatl | Displays the status of job requests. |
cqsub | Submits a job script. |
ftua | File transfer utility, similar to FTP but with file transfer queuing and recovery (server command only). |
ilb | Executes commands interactively on a host chosen by NQE. |
qalter | Alters the attributes of a job request (server command only). |
qchkpnt | Checkpoints a job (may only be invoked within a job script). |
qdel | Signals or deletes a job request (server command only). |
qlimit | Displays the job limits that apply to an NQE server (server command only). |
qmsg | Writes messages to stderr, stdout, or the job log (server command only). |
qping | Determines if the local NQS daemon is running (server command only). |
qstat | Displays the status of job requests (server command only). |
qsub | Submits a job script (server command only). |
rft | File transfer command, suitable for use in job scripts (server command only). |
When an application starts processes on more than one node, the PID and PGID are no longer adequate to manage the application. The commands of the Array Services component of Array 3.0 give you the ability to view the entire Array, and to control the processes of multinode programs.
Tip: You can use Array Services commands from any workstation connected to an Array system; you do not have to be logged in to an Array node.
This topic introduces the terms, concepts, and command options that are common to all Array Services commands. For details about any of the commands, see one of the reference pages listed in Table 2-4.
Table 2-4. Information Sources: Array Services Commands
Topic | Book, Reference Page, or URL | Book Number
---|---|---
Array Services overview | array_services(5) |
ainfo command | ainfo(1) |
array command | use: array(1); configuration: arrayd.conf(4) |
arshell command | arshell(1) |
aview command | aview(1) |
newsess command | newsess(1) |
As noted under “Distributed Management Tools”, Array Services is composed of a daemon—a background process that is started at boot time in every node—and a set of commands such as ainfo. The commands call on the daemon process in each node to get the information they need.
One concept that is basic to Array Services is the array session, which is a term for all the processes of one application, wherever they may execute. Normally, your login shell, with the programs you start from it, constitutes an array session. A batch job is an array session; and you can create a new shell with a new array session identity.
Each session is identified by an array session handle (ASH), a number that identifies any process that is part of that session. You use the ASH to query and to control all the processes of a program, even when they are running in different nodes.
Each node is an IRIX server, and as such has a hostname. The hostname of a node is returned by the hostname command executed in that node:
```
% hostname
tokyo
```
The command is simple and is documented in the hostname(1) reference page. The more complicated issues of hostname syntax, and of how hostnames are resolved to network addresses, are covered in hostname(5).
An Array system as a whole has a name too. In most installations there is only a single Array, and you never need to specify which Array you mean. However, it is possible to have multiple Arrays available on a network, and you can direct Array Services commands to a specific Array.
It is possible for the Array administrator to establish an authentication code, which is a 64-bit number, for all or some of the nodes in an array (see “Configuring Authentication Codes”). When this is done, each use of an Array Services command must specify the appropriate authentication key, as a command option, for the nodes it uses. Your system administrator will tell you if this is necessary.
The commands of Array Services—ainfo, array, arshell, aview, and newsess—have a consistent set of command options. Table 2-5 is a summary of these options. Not all options are valid with all commands; and each command has unique options besides those shown. The default values of some options are set by environment variables listed in the next topic.
Table 2-5. Array Services Command Option Summary
Option | Used In | Meaning |
---|---|---|
-a array | ainfo, array, aview | Specify a particular Array when more than one is accessible. |
-D | ainfo, array, arshell, aview | Send commands to other nodes directly, rather than through array daemon. |
-F | ainfo, array, arshell, aview | Forward commands to other nodes through the array daemon. |
-Kl number | ainfo, array, aview | Authentication key (a 64-bit number) for the local node. |
-Kr number | ainfo, array, aview | Authentication key (a 64-bit number) for the remote node. |
-l (letter ell) | ainfo, array | Execute in context of the destination node, not the current node. |
-p port | ainfo, array, arshell, aview | Nonstandard port number of array daemon. |
-s hostname | ainfo, array, aview | Specify a destination node. |
The -l and -s options work together. The -l (letter ell for local) option restricts the scope of a command to the node where the command is executed. By default, that is the node where the command is entered. When -l is not used, the scope of a query command is all nodes of the array. The -s (server, or node, name) option directs the command to be executed on a specified node of the array. These options work together in query commands as follows:
To interrogate all nodes as seen by the local node, use neither option.
To interrogate only the local node, use only -l.
To interrogate all nodes as seen by a specified node, use only -s.
To interrogate only a particular node, use both -s and -l.
The Array Services commands depend on environment variables to define default values for the less-common command options. These variables are summarized in Table 2-6.
Table 2-6. Array Services Environment Variables
Variable Name | Use | Default When Undefined |
---|---|---|
ARRAYD_FORWARD | When defined with a string starting with the letter y, all commands default to forwarding through the array daemon (option -F). | Commands default to direct communication (option -D). |
ARRAYD_PORT | The port (socket) number monitored by the array daemon on the destination node. | The standard number of 5434, or the number given with option -p. |
ARRAYD_LOCALKEY | Authentication key for the local node (option -Kl). | No authentication unless -Kl option is used. |
ARRAYD_REMOTEKEY | Authentication key for the destination node (option -Kr). | No authentication unless -Kr option is used. |
ARRAYD | The destination node, when not specified by the -s option. | The local node, or the node given with -s. |
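For example, to make every Array Services command default to a particular destination node and to daemon forwarding, you could set the variables as follows (the node name and values are illustrative):

```shell
# Illustrative defaults for Array Services commands.
ARRAYD=tokyo;        export ARRAYD           # default -s destination node
ARRAYD_FORWARD=yes;  export ARRAYD_FORWARD   # default to -F forwarding
ARRAYD_PORT=5434;    export ARRAYD_PORT      # standard arrayd port
echo "defaults: node=$ARRAYD port=$ARRAYD_PORT forward=$ARRAYD_FORWARD"
```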
Any user of an Array system can use Array Services commands to check the hardware components and the software workload of the Array. The commands needed are ainfo, array, and aview.
If your network includes more than one Array system, you can use ainfo arrays at one array node to list all the Array names that are configured, as in the following example.
```
homegrown% ainfo arrays
Arrays known to array services daemon
    ARRAY DevArray
        IDENT 0x3381
    ARRAY BigDevArray
        IDENT 0x7456
    ARRAY test
        IDENT 0x655e
```
Array names are configured into the array database by the administrator. Different Arrays might know different sets of other Array names.
You can use ainfo machines to learn the names and some features of all nodes in the current Array, as in the following example.
```
homegrown 175% ainfo -b machines
machine homegrown homegrown 5434 192.48.165.36 0
machine disarray disarray 5434 192.48.165.62 0
machine datarray datarray 5434 192.48.165.64 0
machine tokyo tokyo 5434 150.166.39.39 0
```
In this example, the -b option of ainfo is used to get a concise display.
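Because the -b output is line oriented, it is easy to post-process with standard tools. The sketch below extracts just the hostnames from output in the format shown above; the sample text is pasted from that example.

```shell
# Extract hostnames from "ainfo -b machines"-style output.
sample='machine homegrown homegrown 5434 192.48.165.36 0
machine disarray disarray 5434 192.48.165.62 0
machine datarray datarray 5434 192.48.165.64 0
machine tokyo tokyo 5434 150.166.39.39 0'
nodes=$(echo "$sample" | awk '$1 == "machine" { print $2 }')
echo $nodes
```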
You can use ainfo nodeinfo to request detailed information about one or all nodes in the array. To get information about the local node, use ainfo -l nodeinfo. However, to get information about only a particular other node, for example node tokyo, use -l and -s, as in the following example. (The example has been edited for brevity.)
```
homegrown 181% ainfo -s tokyo -l nodeinfo
Node information for server on machine "tokyo"
MACHINE tokyo
  VERSION 1.2
  8 PROCESSOR BOARDS
    BOARD: TYPE 15 SPEED 190
      CPU: TYPE 9 REVISION 2.4
      FPU: TYPE 9 REVISION 0.0
  ...
  16 IP INTERFACES
    HOSTNAME tokyo
    HOSTID 0xc01a5035
    DEVICE et0 NETWORK 150.166.39.0 ADDRESS 150.166.39.39 UP
    DEVICE atm0 NETWORK 255.255.255.255 ADDRESS 0.0.0.0 UP
    DEVICE atm1 NETWORK 255.255.255.255 ADDRESS 0.0.0.0 UP
  ...
  0 GRAPHICS INTERFACES
  MEMORY
    512 MB MAIN MEMORY
    INTERLEAVE 4
```
If the -l option is omitted, the destination node will return information about every node that it knows.
The IRIX commands who, top, and uptime are commonly used to get information about users and workload on one server. The array command offers Array-wide equivalents to these commands.
To get the names of all users logged in to the whole array, use array who. To learn the names of users logged in to a particular node, for example tokyo, use -l and -s, as in the following example. (The example has been edited for brevity and security.)
```
homegrown 180% array -s tokyo -l who
joecd    tokyo  frummage.eng.sgi   -tcsh
joecd    tokyo  frummage.eng.sgi   -tcsh
benf     tokyo  einstein.ued.sgi.  /bin/tcsh
yohn     tokyo  rayleigh.eng.sg    vi +153 fs/procfs/prd
...
```
Two variants of the array command return workload information. The array-wide equivalent of uptime is array uptime, as follows:
```
homegrown 181% array uptime
homegrown: up 1 day, 7:40, 26 users, load average: 7.21, 6.35, 4.72
disarray:  up 2:53, 0 user, load average: 0.00, 0.00, 0.00
datarray:  up 5:34, 1 user, load average: 0.00, 0.00, 0.00
tokyo:     up 7 days, 9:11, 17 users, load average: 0.15, 0.31, 0.29
homegrown 182% array -l -s tokyo uptime
tokyo:     up 7 days, 9:11, 17 users, load average: 0.12, 0.30, 0.28
```
The command array top lists the processes that are currently using the most CPU time, with their ASH values, as in the following example.
```
homegrown 183% array top
               ASH       Host       PID    User   %CPU  Command
----------------------------------------------------------------
0x1111ffff00000000  homegrown      5  root    1.20  vfs_sync
0x1111ffff000001e9  homegrown   1327  guest   1.19  atop
0x1111ffff000001e9  tokyo      19816  guest   0.73  atop
0x1111ffff000001e9  disarray    1106  guest   0.47  atop
0x1111ffff000001e9  datarray    1423  guest   0.42  atop
0x1111ffff00000000  homegrown     20  root    0.41  ShareII
0x1111ffff000000c0  homegrown  29683  kchang  0.37  ld
0x1111ffff0000001e  homegrown   1324  root    0.17  arrayd
0x1111ffff00000000  homegrown    229  root    0.14  routed
0x1111ffff00000000  homegrown     19  root    0.09  pdflush
0x1111ffff000001e9  disarray    1105  guest   0.02  atopm
```
The -l and -s options can be used to select data about a single node, as usual.
The ArrayView, or aview, command is a graphical window on the status of an Array. You start it with the command aview, and it displays a window similar to the one shown in Figure 2-5. The top window shows one summary line per node; in addition, there is a window for each node, headed by the node name and its hardware configuration, containing a snapshot of the busiest processes in that node.
Using commands from the Array Services component of Array 3.0, you create and manage processes that are distributed across multiple nodes of the Array system.
In an Array system you can start a program whose processes are in more than one node. In order to name such collections of processes, Array 3.0 software assigns each process to an array session handle (ASH).
An ASH is a number that is unique across the entire array (unlike a PID or PGID). An ASH is the same for every process that is part of a single array session—no matter which node the process runs in. You display and use ASH values with Array Services commands. Each time you log in to an Array node, your shell is given an ASH, which is used by all the processes you start from that shell.
The command ainfo ash returns the ASH of the current process on the local node, which is simply the ASH of the ainfo command itself.
```
homegrown 178% ainfo ash
Array session handle of process 10068: 0x1111ffff000002c1
homegrown 179% ainfo ash
Array session handle of process 10069: 0x1111ffff000002c1
```
In the preceding example, each instance of the ainfo command was a new process: first PID 10068, then PID 10069. However, the ASH is the same in both cases. This illustrates a very important rule: every process inherits its parent's ASH. In this case, each instance of ainfo was forked by the command shell, and the ASH value shown is that of the shell, inherited by the child process.
You can create a new global ASH with the command ainfo newash, as follows:
```
homegrown 175% ainfo newash
Allocating new global ASH
0x11110000308b2f7c
```
This feature has little use at present. There is no existing command that can change its ASH, so you cannot assign the new ASH to another command. It is possible to write a program that takes an ASH from a command-line option and uses the Array Services function setash() to change to that ASH (however such a program must be privileged). No such program is distributed with Array 3.0 (but see “Managing Array Service Handles”).
The command array ps returns a summary of all processes running on all nodes in an array. The display shows the ASH, the node, the PID, the associated username, the accumulated CPU time, and the command string.
To list all the processes on a particular node, use the -l and -s options. To list processes associated with a particular ASH, or a particular username, pipe the returned values through grep, as in the following example. (The display has been edited to save space.)
```
homegrown 182% array -l -s tokyo ps | fgrep wombat
0x261cffff0000054c tokyo 19007 wombat 0:00 -csh
0x261cffff0000054a tokyo 17940 wombat 0:00 csh -c (setenv...
0x261cffff0000054c tokyo 18941 wombat 0:00 csh -c (setenv...
0x261cffff0000054a tokyo 17957 wombat 0:44 xem -geometry 84x42
0x261cffff0000054a tokyo 17938 wombat 0:00 rshd
0x261cffff0000054a tokyo 18022 wombat 0:00 /bin/csh -i
0x261cffff0000054a tokyo 17980 wombat 0:03 /usr/gnu/lib/ema...
0x261cffff0000054c tokyo 18928 wombat 0:00 rshd
```
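The same kind of post-processing works for any field. For instance, this sketch counts the processes under each ASH in output of the form shown above; the sample lines are pasted from that example.

```shell
# Count processes per ASH (first field) in "array ps"-style output.
sample='0x261cffff0000054c tokyo 19007 wombat 0:00 -csh
0x261cffff0000054a tokyo 17940 wombat 0:00 csh -c (setenv...
0x261cffff0000054c tokyo 18941 wombat 0:00 csh -c (setenv...
0x261cffff0000054a tokyo 17957 wombat 0:44 xem -geometry 84x42'
counts=$(echo "$sample" | awk '{ n[$1]++ } END { for (a in n) print a, n[a] }' | sort)
echo "$counts"
```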
When you have Performance Co-Pilot installed (see “Performance Co-Pilot”) you have two additional commands for listing processes: ashtop displays a continuously updated list of the processes that are executing under a specified ASH (see the ashtop(1) reference page, if installed). The arraytop command produces a similar display for the entire array (see the arraytop(1) reference page, if installed). Both of these, and additional features of Performance Co-Pilot, are described in the pcp_array(5) reference page.
The arshell command lets you start an arbitrary program on a single other node. The array command gives you the ability to suspend, resume, or kill all processes associated with a specified ASH.
The arshell command is an Array Services extension of the familiar rsh command; it executes a single IRIX command on a specified Array node. The difference from rsh is that the remote shell executes under the same ASH as the invoking shell (this is not true of simple rsh). The following example demonstrates the difference.
```
homegrown 179% ainfo ash
Array session handle of process 8506: 0x1111ffff00000425
homegrown 180% rsh guest@tokyo ainfo ash
Array session handle of process 13113: 0x261cffff0000145e
homegrown 181% arshell guest@tokyo ainfo ash
Array session handle of process 13119: 0x1111ffff00000425
```
You can use arshell to start a collection of unrelated programs in multiple nodes under a single ASH; then you can use the commands described under “Managing Session Processes” to stop, resume, or kill them.
Both MPI and PVM use arshell to start up distributed processes.
Tip: The shell is a process under its own ASH. If you use the array command to stop or kill all processes started from a shell, you will stop or kill the shell also. In order to create a group of programs under a single ASH that can be killed safely, proceed as follows:
Create a nested shell with a new ASH using newsess. Note the ASH value.
Within the new shell, start one or more programs using arshell.
Exit the nested shell.
Now you are back to the original shell. You know the ASH of all programs started from the nested shell. You can safely kill all jobs that have that ASH because the current shell is not affected.
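Outside an Array you can approximate this pattern with ordinary process groups: start the jobs in a new session (the analogue of newsess), then signal the whole group without touching the current shell. A sketch, assuming a system that provides the setsid utility:

```shell
#!/bin/sh
# Start two background jobs in their own session, then kill the whole
# group; the current shell is unaffected.
setsid sh -c 'sleep 60 & sleep 60 & wait' &
leader=$!                       # the new group's PGID equals the leader's PID
sleep 1                         # let the jobs start
kill -TERM -- "-$leader"        # signal every process in the group
wait "$leader" 2>/dev/null || true
echo "killed process group $leader"
```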
The programs launched with arshell are not coordinated (they could of course be written to communicate with each other, for example using sockets), and you must start each program individually.
The array command is designed to permit the simultaneous launch of programs on all nodes with a single command. However, array can only launch programs that have been configured into it, in the Array Services configuration file. (The creation and management of this file is discussed under “About Array Configuration”.)
In order to demonstrate process management in a simple way from the command line, the following command was inserted into the configuration file /usr/lib/array/arrayd.conf:
```
#
# Local commands
#
command spin    # Do nothing on multiple machines
    invoke /usr/lib/array/spin
    user    %USER
    group   %GROUP
    options nowait
```
The invoked command, /usr/lib/array/spin, is a shell script that does nothing in a loop, as follows:
```
#!/bin/sh
# Go into a tight loop
#
interrupted()
{
    echo "spin has been interrupted - goodbye"
    exit 0
}
trap interrupted 1 2
while [ ! -f /tmp/spin.stop ]; do
    sleep 5
done
echo "spin has been stopped - goodbye"
exit 1
```
With this preparation, the command array spin starts a process executing that script on every node in the array. Alternatively, array -l -s nodename spin would start a process on one specific node.
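You can watch the stop-file mechanism of the spin script work without an Array at all. The sketch below writes an adapted copy of the script (polling every second instead of every five) to a temporary file, runs it in the background, and stops it by creating /tmp/spin.stop.

```shell
#!/bin/sh
# Demonstrate the spin script's stop-file mechanism locally.
rm -f /tmp/spin.stop
cat > /tmp/spin.demo <<'EOF'
#!/bin/sh
trap 'echo "spin has been interrupted - goodbye"; exit 0' 1 2
while [ ! -f /tmp/spin.stop ]; do
    sleep 1
done
echo "spin has been stopped - goodbye"
exit 1
EOF
sh /tmp/spin.demo > /tmp/spin.out &
sleep 2
touch /tmp/spin.stop            # tells the loop to exit
wait $! || true                 # the script exits with status 1
cat /tmp/spin.out
rm -f /tmp/spin.demo /tmp/spin.stop
```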
The following command sequence creates and then kills a spin process in every node. The first step creates a new session with its own ASH. This is so that later, array kill can be used without killing the interactive shell.
```
homegrown 175% ainfo ash
Array session handle of process 8912: 0x1111ffff0000032d
homegrown 176% newsess
homegrown 175% ainfo ash
Array session handle of process 8941: 0x11110000308b2fa6
```
In the new session with ASH 0x11110000308b2fa6, the command array spin starts the /usr/lib/array/spin script on every node. In this test array, there were only two nodes on this day, homegrown and tokyo.
```
homegrown 176% array spin
```
After exiting back to the original shell, the command array ps is used to search for all processes that have the ASH 0x11110000308b2fa6.
```
homegrown 177% exit
homegrown 178%
homegrown 177% ainfo ash
Array session handle of process 9257: 0x1111ffff0000032d
homegrown 179% array ps | fgrep 0x11110000308b2fa6
0x11110000308b2fa6 homegrown  9033 guest 0:00 /bin/sh /usr/lib/array/spin
0x11110000308b2fa6 homegrown  9618 guest 0:00 sleep 5
0x11110000308b2fa6 tokyo     26021 guest 0:00 /bin/sh /usr/lib/array/spin
0x11110000308b2fa6 tokyo     26072 guest 0:00 sleep 5
0x1111ffff0000032d homegrown  9642 guest 0:00 fgrep 0x11110000308b2fa6
```
There are two processes related to the spin script on each node. The next command kills them all.
```
homegrown 180% array kill 0x11110000308b2fa6
homegrown 181% array ps | fgrep 0x11110000308b2fa6
0x1111ffff0000032d homegrown 10030 guest 0:00 fgrep 0x11110000308b2fa6
```
The command array suspend 0x11110000308b2fa6 would suspend the processes instead (however, it is hard to demonstrate that a sleep command has been suspended).