Chapter 1. Array System Components

An Array system is a complex assembly of hardware and software layers. This chapter orients you to these components, working from the bottom up.

Each section contains a table of information sources—online and printed books, reference pages, and WWW sites—related to the topic of that section. All such pointers are reproduced in Appendix B, “Array Documentation Quick Reference.”

Array Components

The performance and power of an Array system result from linking several symmetric multiprocessor (SMP) computers through a high-performance interconnect, and managing the combination with customized system software plus bundled application and administrative software.

Array Hardware Components

An Array comprises the following hardware:

  • From two to eight nodes, each of which is a Silicon Graphics, Inc. computer, typically a multiprocessor such as:

    • Origin2000 or Origin200

    • Challenge®, POWER Challenge, or POWER Challenge R10000

    • Onyx2, Onyx, or POWER Onyx

  • An interconnecting network: typically one, and as many as six, bidirectional HIPPI network interfaces per node, plus a HIPPI crossbar switch.

  • One IRISconsole as an administration console.

    An IRISconsole is an O2 or Indy workstation augmented with an IRISconsole serial port multiplexer.

A complete Array system is shown schematically in Figure 1-1.

Figure 1-1. Array System Schematic


Array Software Components

The Array 3.0 software binds the Array system hardware into a supercomputer that can be programmed and administered as one system. An Array system using Array 3.0 software is based on the following major components:

Array diagnostics

Diagnostics used by Silicon Graphics, Inc. system engineers to verify installation and isolate faults.

IRIX 6.2 and IRIX 6.4

Multiprocessor operating systems including NFS™ version 3 network support.

XFS filesystem

High-performance, high-capacity journaled filesystem that manages large RAID arrays and disk farms.

HIPPI software

Support for high-performance network link, including an SGI-proprietary fast path for minimum overhead on short messages.

Array Services

Integrated administration tools.

IRISconsole

Permits centralized administration of all nodes in the Array.

MPI (Message-Passing Interface) 3.0 and XMPI

Distributed programming environment with support for HIPPI bypass, plus the XMPI visual monitor.

PVM (Parallel Virtual Machine) 1.2 and XPVM

Popular distributed programming environment, plus the XPVM visual monitor.

Many optional software packages are available from Silicon Graphics, Inc. to extend Array 3.0, including:

Network Queuing Environment (NQE)

Load-balancing and scheduling facility that lets users submit, monitor, and control work across machines in a network.

Performance Co-Pilot (PCP)

Performance visualization facility.

ProDev WorkShop

Suite of graphical tools for developing parallel programs.

A variety of software packages from third parties also are available, including:

SHARE II

Resource-centric Fair Share scheduler from Softway (systems using IRIX 6.2 only).

PerfAcct

Accounting software by Instrumental, Inc.

Codine

Batch-scheduling facility by GENIAS Software.

LSF (Load Sharing Facility)

Batch scheduling facility by Platform Computing.

High Performance Fortran (HPF)

Compilers available from the Portland Group (PGI) and Applied Parallel Research (APR).

Most of these components are described at more length in following topics.

Array Architecture

An Array system is a distributed-memory multiprocessor, scalable to several hundred individual MIPS processors in as many as eight nodes, yielding a peak aggregate computing capacity of many GFLOPS. The aggregation of nodes is connected by an industry-standard, 1.0 Gbit per second HIPPI network.

This section examines the components of an Array system in detail.

Array Nodes

The basic computational building block of an Array is a Silicon Graphics, Inc. multiprocessor. Any system running IRIX 6.2 can participate as a node in Array 3.0, but normally a node is a multiprocessor system. Depending on the type of Array and customer's choices, a node can be any of the systems listed in Table 1-1.

Table 1-1. Array Node System Selection

System               Processor Complement    Graphics
Origin2000           2-128 R10000
Origin200            2 R10000
Onyx2                2-16 R10000             InfiniteReality or RealityMonster
CHALLENGE            2-32 R4400
CHALLENGE 10000      2-36 R10000             Extreme Visualization Console
POWER Challenge      2-18 R8000              Extreme Visualization Console
POWER Challenge GR   2-24 R10000             Extreme Visualization Console, InfiniteReality, or RealityEngine2
POWER Onyx           1-12 R8000              1-3 RealityEngine2
Onyx 10000           1-24 R10000             1-3 InfiniteReality

Table 1-2 lists information sources for the different types of systems.

Table 1-2. Information Sources: Array Component Systems

Topic                                Book or URL                                                          Book Number
All SGI Servers                      http://www.sgi.com/Products/index.html?hardware
Origin2000 and Origin200             http://www.sgi.com/Products/hardware/servers/index.html
Onyx2 and RealityMonster             http://www.sgi.com/Products/hardware/graphics/products/index.html
POWER CHALLENGE                      POWER CHALLENGE XL Rackmount Owner's Guide                           007-1735-xxx
POWER Onyx                           POWER Onyx and Onyx Rackmount Owner's Guide                          007-1736-xxx
CHALLENGE and CHALLENGE 10000        POWER CHALLENGE XL Rackmount Owner's Guide                           007-1735-xxx
RealityEngine2 and InfiniteReality   http://www.sgi.com/Products/hardware/Onyx/Tech/
Extreme Visualization Console        POWER CHALLENGE XL Rackmount Owner's Guide                           007-1735-xxx


Hybrid Array

An Array that includes both Origin2000/Onyx2 systems and Challenge/Onyx systems is called a hybrid array. Previous versions of Array software supported only uniform Arrays composed of Challenge and Onyx systems. Array 3.0 software supports uniform arrays of Origin2000/Onyx2 systems, uniform arrays of Challenge/Onyx systems, and hybrid arrays.

The HIPPI Interconnect

Array nodes are normally connected by a high-performance, dual-channel HIPPI network. Each node is equipped with one or more bidirectional HIPPI interfaces. Each interface provides 100 MB per second of data bandwidth in either direction.

The HIPPI interfaces are connected via a high-performance HIPPI crossbar switch (optional in a two-node Array). The HIPPI switch is nonblocking, with sub-microsecond connection delays. The network appears to be fully connected and contention occurs only when two sources send data to the same destination at the same time.

IRIX 6.2 or IRIX 6.4 and the Array 3.0 software provide protocol layers and APIs to access the HIPPI network, including direct physical layer, HIPPI framing protocol, and TCP/IP. The HIPPI support includes special bypass capabilities to expedite transmission of short messages. The bypass capability is transparent to the applications using it.
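
Because TCP/IP is among the supported protocol layers, ordinary BSD sockets can carry data over the HIPPI interconnect whenever the route between two nodes goes through a HIPPI interface. The following C fragment is a generic TCP client sketch, not a HIPPI-specific API; the node name array-node2 and port number 9000 are placeholders, and error handling is minimal.

    /* Generic TCP client: connect to a service on another Array node.
     * If the route to that node runs over a HIPPI interface, the
     * traffic uses the HIPPI interconnect transparently.
     */
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netdb.h>
    #include <unistd.h>

    int main(void)
    {
        const char *host = "array-node2";      /* hypothetical node name */
        struct hostent *hp = gethostbyname(host);
        struct sockaddr_in addr;
        int fd;

        if (hp == NULL) {
            fprintf(stderr, "unknown host %s\n", host);
            return 1;
        }

        fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) {
            perror("socket");
            return 1;
        }

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(9000);           /* hypothetical service port */
        memcpy(&addr.sin_addr, hp->h_addr_list[0], hp->h_length);

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }

        write(fd, "hello\n", 6);               /* application data */
        close(fd);
        return 0;
    }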

Table 1-3 lists information sources on HIPPI and the HIPPI crossbar (which is produced by a third party, Essential Communications, Inc.).

Table 1-3. Information Sources: HIPPI Interconnect

Topic                   Book or URL                          Book Number
HIPPI interface         IRIS HIPPI Administrator's Guide     007-2229-xxx
                        IRIS HIPPI API Programmer's Guide    007-2227-xxx
HIPPI Crossbar Switch   EPS-1 User's Guide                   09-9010
                        http://www.esscom.com


Visualization and Interactive Supercomputing

Array nodes can be configured with hardware graphics support, to provide two and three-dimensional visualization performance commensurate with the available compute power. The available graphics options are listed in Table 1-1. Complex supercomputing visualization architectures can be built by aggregating compute and graphics nodes, as illustrated in Figure 1-2.

Figure 1-2. Advanced Visualization With Arrays


Centralized Console Management

An IRISconsole serves as a single, centralized administrative console for Array administration and maintenance. The IRISconsole consists of an O2 or Indy workstation, an IRISconsole multiplexer box, and the IRISconsole graphical cluster management software. From the IRISconsole, administrators can control, configure, monitor, and maintain the individual Array nodes.

The O2 or Indy workstation serves as a virtual console for each node. The workstation is connected to the multiplexer via a SCSI interface. The multiplexer in turn connects to the Remote System Control port of each node. Commands from the console workstation are routed to the appropriate node, and results from the nodes are routed back.

The IRISconsole graphical user interface provides a convenient graphic representation of the array. Sets of nodes can be selected and operated upon. You can open a command window directly to any node. You can use the IRISconsole graphical interface to

  • Dynamically add and remove nodes in the array

  • Display console messages or enter console commands to any node

  • Interrupt, reset, or power-cycle any node

  • Display and record real-time graphs of hardware operating statistics, including voltage, temperature, and cooling status

  • Enable monitors and alarms for conditions such as excessive temperature

  • View activity logs and other system reports

For more about the features of IRISconsole and illustrations of its use, see “Using the IRISconsole Workstation”. Table 1-4 lists other information sources for IRISconsole and its hardware.

Table 1-4. Information Sources: IRISconsole

Topic                  Book or URL                                                        Book Number
IRISconsole            IRISconsole Administrator's Guide                                  007-2872-xxx
                       http://www.sgi.com/Products/hardware/challenge/IRISconsole.html
IRISconsole hardware   IRISconsole Installation Guide                                     007-2839-xxx
                       Indy Workstation Owner's Guide                                     007-9804-xxx


Distributed Management Tools

Array 3.0 makes an Array manageable by providing support for process execution, program development, performance instrumentation, and system administration.

This section introduces many of these bundled and third-party tools.

Array Services

Array Services includes administrator commands, libraries, daemons and kernel extensions that support the execution of programs across an Array.

A central concept in Array Services is the array session handle (ASH), a number that is used to logically group related processes that may be distributed across multiple systems. The ASH creates a global process namespace across the Array, facilitating accounting and administration.

Array Services also provides an array configuration database, listing the nodes comprising an array. Array inventory inquiry functions provide a centralized, canonical view of the configuration of each node. Other array utilities let the administrator query and manipulate distributed array applications.

The Array Services package comprises the following primary components:

array daemon

These daemon processes, one in each node, cooperate to allocate ASH values and maintain information about node configuration and the relation of process IDs to ASHes.

array configuration database

One copy at each node, this file describes the Array configuration for use by array daemons and user programs.

ainfo command

Lets the user or administrator query the Array configuration database and information about ASH values and processes.

array command

Executes a specified command on one or more nodes. Commands are predefined by the administrator in the configuration database.

arshell command

Starts an IRIX command remotely on a different node using the current ASH value.

aview command

Displays a multiwindow, graphical display of each node's status.

libarray library

Library of functions that allow user programs to call on the services of array daemons and the array configuration database.

The use of the ainfo, array, arshell, and aview commands is covered in Chapter 2, “Using an Array.” The use of the libarray library is covered in Chapter 4, “Performance-Driven Programming in Array 3.0.”

Performance Co-Pilot

Performance Co-Pilot (PCP) is a Silicon Graphics product for monitoring, visualizing, and managing systems performance.

PCP has a distributed client-server architecture, with performance data collected from a set of servers and displayed on visualization clients. Performance data can be obtained from multiple sources, including the IRIX kernel and user applications. With support for low-intrusion performance data collection, reduction, and analysis, PCP permits a variety of metrics to be captured, correlated, reduced, recorded, and rendered.

PCP has been customized for Array systems to provide visualization of system-level and job-level statistics across the array. An array user can view a variety of relevant performance metrics on the array via the following utilities:

mpvis

Visualize CPU utilization of any node.

dkvis

Visualize disk I/O rates on any node.

nfsvis

Visualize NFS statistics on any node.

pmchart

Plot performance metrics versus time for any node.

procvis

Visualize CPU utilization across an array for tasks belonging to a particular array session handle.

arrayvis

Visualize aggregate Array performance.

ashtop

List of top CPU-using processes under a given ASH.

arraytop

List of top CPU-using processes in the array.

For more information about Performance Co-Pilot, see The Performance Co-Pilot User's and Administrator's Guide (007-2614-xxx).

SHARE II (Fair Share) Scheduling

SHARE II, a “Fair Share” scheduler, allows an organization to create its own resource allocation policy based on its assessment of how resource usage should be fairly distributed to individuals or arbitrarily grouped users. SHARE II is available only for Arrays that use IRIX 6.2; it is not available for IRIX 6.4.

With SHARE II, users are grouped into a system-wide resource allocation and charging hierarchy. The hierarchy can represent projects, divisions, or arbitrary sets of users. Within this hierarchy, resource usage policy can be varied or delegated at any level according to organizational priorities.

Users can be limited in consumption of renewable resources (such as printer pages) and fixed resources (such as instantaneous memory use). Other limits are imposed during periods of scarcity (for example, CPU run time during periods of contention). Thus, SHARE II provides a fair share of the system resources during high-load periods without overcommitment, wasteful static reservations, or expensive administrator intervention.

Accounting With PerfAcct

PerfAcct, a third-party software product, gathers system accounting data from all nodes in a central location, where it is summarized and used to generate usage reports or billing. PerfAcct exploits IRIX extended-session accounting data to provide true job accounting. Job and project accounting permits usage tracking and billing by external or internal contracts, departments, tasks, and projects.

PerfAcct features low-overhead data collection on the nodes being monitored. To minimize system load on the monitored systems, archiving and summarization can be put on a remote low-cost workstation. PerfAcct also includes aggregate accounting statistics, as well as graphical user interface tools for measuring dynamic system load.

Supporting Documentation

Table 1-5 lists information sources for the management tool products.

Table 1-5. Information Sources: Management Tools

Topic                             Book, Cross-Reference, or URL                                          Book Number
Array Services                    Chapter 2, “Using an Array”
                                  array_services(5)
Performance Co-Pilot data sheet   http://www.sgi.com/Products/hardware/challenge/CoPilot/CoPilot.html
Performance Co-Pilot              The Performance Co-Pilot User's and Administrator's Guide              007-2614-xxx
                                  Performance Co-Pilot for Informix-7 User's Guide                       007-3007-xxx
PerfAcct                          http://www.instrumental.com
SHARE II                          Share II for IRIX Administrator's Guide                                007-2622-xxx


Job Execution Facilities

An Array system can be used as an interactive system for real-time experimentation, as a coupled multiprocessor for grand-challenge class applications, and as a throughput compute engine for high-efficiency batch execution. This section introduces the job scheduling features.

Interactive Processing

Users can log in to a node to execute jobs interactively using normal IRIX job-control facilities. Interactive jobs can be command-line based, or can be X Window System applications that execute on the node but display on the user's workstation.

Jobs started interactively can be sequential programs, or multi-threaded programs executing within a node, or distributed-memory parallel applications executing across several nodes. Distributed programs using MPI or PVM can be started and monitored using the graphical monitors XMPI and XPVM; these display job status graphically on the user's workstation screen. Table 1-6 lists information sources on interactive processing.

Table 1-6. Information Sources: Interactive Processing

Topic                  Book, Cross-Reference, or URL   Book Number
Logging in to a node   Chapter 2, “Using an Array”
XMPI and XPVM          MPI and PVM User's Guide        007-3286-xxx
                       mpirun(1)


Batch Processing

Batch processing allows off-line job scheduling. Batch processing is appropriate for production environments, high job-load environments, and situations where program results are not required immediately.

When an Array system is used for batch scheduling, users submit jobs to batch queues, which contain ordered sets of waiting jobs. When sufficient compute resources become available, and subject to tunable scheduling constraints, jobs are extracted from the batch queues and scheduled on the nodes. Job results and termination status are recorded in files or are electronically mailed to the user. See Figure 1-3.

Figure 1-3. Batch Processing on an Array System


Several popular batch facilities are compatible with Array 3.0, including the Network Queuing Environment (NQE) from Silicon Graphics, Inc.; the Codine Job-Management System from Genias Software, Inc.; and Load Sharing Facility (LSF) from Platform Computing, Inc.

NQE consists of the following components that provide a seamless environment for users of the Array:

  • The NQE graphical interface allows users to submit batch requests to a central database, and to monitor and control each request.

  • The Network Load Balancer (NLB) routes jobs to available nodes according to their current workload.

  • The NQE scheduler determines when and on which node each request is to run.

  • The File Transfer Agent (FTA) provides synchronous and asynchronous transfer of files, including automatic retry when a network link fails.

The IRIX Checkpoint and Restart (CPR) facility allows you to save the state of long-running jobs and restart them easily.

Table 1-7 lists information sources on these products.

Table 1-7. Information Sources: Batch Scheduling Products

Topic                               Book or URL                                                        Book Number
IRIX Checkpoint and Restart (CPR)   IRIX Checkpoint and Restart Operation Guide                        007-3236-xxx
Network Queuing Environment (NQE)   http://wwwsdiv.cray.com/~nqe/nqe_external/index.html (pointers to technical papers)
                                    http://www.cray.com/PUBLIC/product-info/sw/nqe/nqe30.html (illustrated overview)
                                    NQE User's Guide                                                   SG-2148 3.2
                                    NQE Administrator's Guide                                          SG-2150 3.2
Load Sharing Facility (LSF)         http://www.platform.com
Codine                              http://www.instrumental.com


Compilation, Development, and Execution Facilities

Array 3.0 is complemented by development tools from Silicon Graphics, Inc. and other companies to simplify creation of parallel applications using both shared-memory and distributed-memory models. This section summarizes these tools. Additional discussion of software development appears in Chapter 4, “Performance-Driven Programming in Array 3.0.”

Optimizing and Parallelizing Compilers

The MIPSpro compilers are the third-generation family of optimizing and parallelizing compilers from Silicon Graphics, offering comprehensive support for parallel application development.

Exploiting aggressive dependency analysis, the compilers perform automatic program restructuring, software pipelining, and parallelization. The compilers also provide a comprehensive set of comment directives that enable users to assist the compiler in the parallelization process.

Silicon Graphics, Inc. offers MIPSpro compilers for Fortran 77, Fortran 90, and C, as well as compilers for Ada 95, C++, assembly language, and Pascal. For detailed information about each compiler, see the sources listed in Table 1-8.

Table 1-8. Information Sources: Compilers from SGI

Topic                                             Book or URL                                    Book Number
MIPSpro compiler features and use                 MIPS Compiling and Performance Tuning Guide    007-2479-xxx
C language                                        C Language Reference Manual                    007-0701-xxx
MIPSpro Fortran 77                                MIPSpro Fortran 77 Programmer's Guide          007-2361-xxx
                                                  MIPSpro Fortran 77 Language Reference Manual   007-2362-xxx
MIPSpro Fortran 90                                MIPSpro Fortran 90 Programmer's Guide          007-2761-xxx
Automatic parallelization of C and Fortran code   MIPSpro Power Fortran 77 Programmer's Guide    007-2363-xxx
                                                  MIPSpro Power Fortran 90 Programmer's Guide    007-2760-xxx
                                                  IRIS Power C User's Guide                      007-0702-xxx
C++ language                                      C++ Programmer's Guide                         007-0704-xxx
Assembly language                                 MIPSpro Assembly Language Programmer's Guide   007-2418-xxx
Ada 95 (GNU Ada Translator, GNAT)                 GNAT User's Guide                              007-2624-xxx
Pascal                                            Pascal Programming Guide                       007-0740-xxx


High Performance Fortran

High Performance Fortran (HPF) is an extended version of Fortran 90 that is emerging as a standard for programming of shared- and distributed-memory systems in the data-parallel style. HPF incorporates a data-mapping model and associated directives that allow a programmer to specify how data is logically distributed in an application. An HPF compiler interprets these directives to generate code that minimizes interprocessor communication in distributed systems and maximizes data reuse in all types of systems.

HPF compilers are available for Array systems from the Portland Group, Inc. and Applied Parallel Research, Inc. Table 1-9 lists information sources for these products.

Table 1-9. Information Sources: High Performance Fortran

Topic                               Book or URL                                                  Book Number
High Performance Fortran textbook   The High Performance Fortran Handbook, Koelbel, Loveman,     ISBN 0-262-61094-9
                                    Schreiber, Steele Jr., and Zosel; MIT Press, 1994
                                    (http://www-mitpress.mit.edu/)
High Performance Fortran forum      http://www.crpc.rice.edu/HPFF/home.html
Portland Group, Inc.                http://www.pgroup.com
Applied Parallel Research           http://www.infomall.org/apri


Numerical Libraries

The compilers are complemented by CHALLENGEcomplib, a comprehensive, optimized collection of scientific and math subroutine libraries popular in scientific computing. The library consists of two subcomponents: SGIMATH and SLATEC.

SGIMATH is hand-tuned, optimized, and parallelized, providing high-performance, portable implementations of the following popular numerical facilities:

  • Basic Linear Algebra Subprograms (BLAS), levels 1, 2, and 3

  • 1D, 2D, and 3D Fast Fourier Transforms (FFT)

  • Convolution and correlation routines

  • LAPACK, LINPACK, and EISPACK

  • SCIPORT (portable version of SCILIB)

  • Solvers: preconditioned conjugate gradient (PCG) sparse solvers, direct sparse solvers, symmetric iterative solvers, and solvers for special linear systems
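
As a brief illustration of how such routines are called, the following C sketch invokes the Level 3 BLAS routine DGEMM through its standard Fortran interface. The trailing underscore in the routine name, the omission of the hidden character-length arguments, and the link line (for example, cc example.c -lcomplib.sgimath) follow common conventions and are assumptions here rather than a statement of this library's exact binding.

    /* Call the Fortran BLAS routine DGEMM from C to compute C = A*B
     * for 2x2 matrices stored in column-major order.
     */
    #include <stdio.h>

    /* Fortran binding: every argument is passed by reference. */
    extern void dgemm_(const char *transa, const char *transb,
                       const int *m, const int *n, const int *k,
                       const double *alpha, const double *a, const int *lda,
                       const double *b, const int *ldb,
                       const double *beta, double *c, const int *ldc);

    int main(void)
    {
        int n = 2;
        double alpha = 1.0, beta = 0.0;
        double a[4] = { 1.0, 3.0, 2.0, 4.0 };   /* A = [1 2; 3 4], column-major */
        double b[4] = { 5.0, 7.0, 6.0, 8.0 };   /* B = [5 6; 7 8], column-major */
        double c[4];

        dgemm_("N", "N", &n, &n, &n, &alpha, a, &n, b, &n, &beta, c, &n);

        /* Expect C = [19 22; 43 50]. */
        printf("C = [%g %g; %g %g]\n", c[0], c[2], c[1], c[3]);
        return 0;
    }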

A source for a more detailed overview of CHALLENGEcomplib is listed in Table 1-10. Most of the functions within the library are documented in reference pages that install with the product.

Table 1-10. Information Sources: CHALLENGEcomplib

Topic                       Book or URL                                                   Book Number
CHALLENGEcomplib overview   http://www.sgi.com/Products/hardware/Power/ch_complib.html

IRIX 6.2 and 6.4

The primary process control services of the Array are provided by the IRIX operating system, a symmetric multiprocessing operating system based on UNIX SVR4 with compatibility for BSD.

IRIX 6.2 is required for Array 3.0 on Challenge/Onyx systems, and IRIX 6.4 on Origin systems. These versions provide fast, flexible support for shared-memory interprocess communication, high-performance I/O, and performance-centric scheduling. Within a node, related processes are gang-scheduled to prevent one process from wasting time by spinning on locks held by blocked peers. Process placement decisions incorporate cache affinity heuristics, which reduce multiprogramming-induced cache thrashing by tending to keep a given process on the same processor.

Real-time processing can be supported with the REACT facilities, including nondegrading priorities, deadline scheduling, and reliably bounded kernel latencies. Hooks are also provided for the optional SHARE II Fair Share scheduler and for checkpoint-restart facilities.

IRIX supports a variety of system functions to allow shared memory interprocess communication (IPC) between processes within one node. SVR4-compatible library functions for semaphores, message queues, and shared memory are supported. High-performance IRIX-unique facilities for shared memory, semaphores, and mutex locks are included. POSIX-compatible library functions for semaphores, message queues, and shared memory are integrated into IRIX 6.4 (available as a patch set for IRIX 6.2).
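
The following C sketch illustrates the SVR4-compatible shared memory interface mentioned above. It is a generic SVR4 example rather than IRIX-specific code: a process creates a private segment, attaches it, stores a value that a cooperating process (for example, a child created with fork()) could read, then detaches and removes the segment.

    /* SVR4 shared-memory sketch: create a segment, attach it,
     * write to it, then detach and remove it.
     */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        int shmid = shmget(IPC_PRIVATE, sizeof(long), IPC_CREAT | 0600);
        long *counter;

        if (shmid < 0) {
            perror("shmget");
            return 1;
        }

        counter = (long *)shmat(shmid, NULL, 0);   /* map the segment */
        if (counter == (long *)-1) {
            perror("shmat");
            return 1;
        }

        *counter = 1;       /* visible to any process attached to the segment */
        printf("counter = %ld\n", *counter);

        shmdt((void *)counter);                    /* detach ...            */
        shmctl(shmid, IPC_RMID, NULL);             /* ... and remove the ID */
        return 0;
    }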

Overview sources on IRIX and on the REACT real-time programming extensions are listed in Table 1-11.

Table 1-11. Information Sources: IRIX and REACT

Topic                                 Book or URL                                                     Book Number
IRIX 6.2 Data Sheet                   http://www.sgi.com/Products/software/IRIX6.2/IRIX62DS.html
IRIX 6.2 Specifications               http://www.sgi.com/Products/software/IRIX6.2/IRIX62specs.html
REACT/pro and real-time programming   http://www.sgi.com/real-time/
IRIX IPC facilities                   Topics In IRIX Programming                                      007-2478-xxx


Performance and Debugging Tools

Silicon Graphics includes a powerful set of parallel debugging, profiling, and visualization tools as part of the Developer Magic application development suite. System-wide performance visualization is provided by the Performance Co-Pilot facility and its Array extensions.

In addition to these, IRIX 6.2 contains the interactive debugger dbx and profiling tools pixie and prof. Information sources on developer tools are listed in Table 1-12.

Table 1-12. Information Sources: Performance and Debugging Tools

Topic                             Book or URL                                                            Book Number
Developer Magic overview          http://www.sgi.com/Products/DevMagic/
Developer Magic                   Developer Magic: ProDev WorkShop Overview                              007-2582-xxx
                                  http://www.sgi.com/Products/WorkShop.html
Performance Co-Pilot data sheet   http://www.sgi.com/Products/hardware/challenge/CoPilot/CoPilot.html
Performance Co-Pilot              The Performance Co-Pilot User's and Administrator's Guide              007-2614-xxx
                                  Performance Co-Pilot for Informix-7 User's Guide                       007-3007-xxx
dbx, prof, pixie                  dbx User's Guide                                                       007-0906-xxx
                                  MIPS Compiling and Performance Tuning Guide                            007-2479-xxx


Message-Passing Protocols

Parallel applications using IPC facilities execute within a single node. However, you can create parallel applications that distribute across one or more nodes using a different model of parallel computation, the message-passing model.

In the message-passing model, processes communicate by exchanging “messages” of application data. The supporting library code chooses the fastest available means to pass the messages—through shared memory IPC within a node, across the HIPPI interconnect between nodes when available, or via TCP/IP.

Array 3.0 supports multiple message-passing protocols, which are bundled in a separate product, the Message Passing Toolkit (MPT). This single product contains implementations of three protocols: two well-known, standardized message-passing libraries, the Message Passing Interface (MPI) and Parallel Virtual Machine (PVM), and the Cray-designed SHMEM protocol.

MPI is the favored message-passing facility under Array 3.0. The MPI library exploits low-overhead, shared-memory transfers whenever possible. Messages sent between processes residing on different nodes use the HIPPI network; but the MPI library is aware of, and uses, the proprietary HIPPI bypass in Array 3.0 to get higher bandwidth when possible.
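
The following minimal C program illustrates the message-passing model with MPI. It is a generic MPI example rather than Array-specific code; the library decides at run time whether a given transfer uses shared memory, the HIPPI interconnect, or TCP/IP. Compile and launch details, such as linking with -lmpi and starting the job with mpirun (see mpirun(1)), vary by installation and are assumptions here.

    /* Minimal MPI example: rank 0 sends one value to rank 1. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        double value = 0.0;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0 && size > 1) {
            value = 3.14159;
            MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
            printf("rank 1 received %f\n", value);
        }

        MPI_Finalize();
        return 0;
    }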

While Array 3.0 supports MPI as its native message-passing model, it also supports PVM and SHMEM for portability. The PVM library support has been optimized to exploit shared-memory transfers within a single node, but it does not take advantage of HIPPI bypass, and thus may not achieve the inter-node bandwidth of MPI.
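
For comparison, a PVM task exchanges data through explicit pack and send calls. The sketch below uses the standard PVM 3 C interface declared in pvm3.h; the message tag and the assumption that the task was spawned by a parent are illustrative only.

    /* PVM sketch: if this task was spawned by a parent task,
     * pack one integer and send it back to the parent.
     */
    #include <stdio.h>
    #include <pvm3.h>

    int main(void)
    {
        int mytid = pvm_mytid();      /* enroll in PVM and get our task ID */
        int parent = pvm_parent();    /* task ID of the spawning parent, if any */
        int data = 42;

        if (parent > 0) {
            pvm_initsend(PvmDataDefault);   /* portable XDR encoding */
            pvm_pkint(&data, 1, 1);
            pvm_send(parent, 1);            /* message tag 1 */
        } else {
            printf("not spawned by a parent; my tid is t%x\n", mytid);
        }

        pvm_exit();
        return 0;
    }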

Table 1-13 lists information sources about parallel and distributed programming. This subject is also explored in more detail in Chapter 4, “Performance-Driven Programming in Array 3.0.”

Table 1-13. Information Sources: Parallel and Distributed Programming

Topic                                      Book or URL                                                     Book Number
Parallel Programming Models Compared       Topics In IRIX Programming                                      007-2478-xxx
Message Passing Toolkit (MPT) in general   http://www.cray.com/PUBLIC/product-info/sw/
MPI Overview                               mpi(5)
MPI References                             Using MPI, Gropp, Lusk, and Skjellum, MIT Press 1995            ISBN 0-262-69184-1
                                           (http://www-mitpress.mit.edu/)
                                           MPI: The Complete Reference, Snir, Otto, Huss-Lederman,         ISBN 0-262-57104-8
                                           Walker, and Dongarra, MIT Press 1995
                                           Using MPI (in IRIX Insight library)                             007-2855-001
MPI Standard                               http://www.mcs.anl.gov/mpi
PVM Overview                               pvm(1PVM)
PVM Reference                              PVM: Parallel Virtual Machine, Geist, Beguelin, Dongarra,       ISBN 0-262-57108-0
                                           Weicheng Jiang, Manchek, and Sunderam, MIT Press 1994
                                           http://www.netlib.org/pvm3/book/pvm-book.html
PVM Home Page                              http://www.epm.ornl.gov/pvm/pvm_home.html
Porting PVM to MPI                         Topics In IRIX Programming                                      007-2478-xxx