This chapter provides an overview of the physical and architectural aspects of your SGI Altix 350, SGI Altix 450, or SGI Altix 4700 system. This chapter includes the following sections:
|Note: This chapter provides an overview of the SGI Altix 350 system. For more information on this system, see the SGI Altix 350 System User's Guide, available on the SGI Technical Publications Library; it provides a detailed overview of the SGI Altix 350 system components and describes how to set up and operate the system. For an overview of SGI ProPack software and installation and upgrade information, see the SGI ProPack 5 for Linux Start Here.|
The Altix 350 system advances the SGI NUMAflex approach to mid-range modular computing. It is designed to deliver maximum sustained performance in a compact system footprint. Independent scaling of computational power, I/O bandwidth, and in-rack storage lets you configure a system to meet your unique computational needs. The small footprint and highly modular design of the Altix 350 system makes it ideal for high computational throughput, media streaming, or complex data management.
The Altix 350 system can be expanded from a standalone single-module system with 2 GB of memory and 4 PCI/PCI-X slots to a high-performance system that contains 32 processors, one or two routers, up to 192 GB of memory, and 64 PCI/PCI-X slots. For most configurations, the Altix 350 system is housed in one 17U rack or one 39U rack, as shown in Figure 2-1; however, small system configurations can be placed on a table top.
Systems that are housed in 17U racks have a maximum weight of approximately 610 lb (277 kg). The maximum weight of systems that are housed in 39U racks is approximately 1,366 lb (620 kg). The racks have casters that enable you to remove the system from the shipping container and roll it to its placement at your site.
See Chapter 1, “Installation and Operation,” in the SGI Altix 350 System User's Guide for more information about installing your system. Check with your SGI service representative for additional physical planning documentation that may be available.
The Altix 350 system is based on the SGI NUMAflex architecture, which is a shared-memory system architecture that is the basis of SGI HPC servers and supercomputers. The NUMAflex architecture is specifically engineered to provide technical professionals with superior performance and scalability in a design that is easy to deploy, program, and manage. It has the following features:
Shared access of processors, memory, and I/O. The Super Hub (SHub) ASICs and the NUMAlink-4 interconnect functions of the NUMAflex architecture enable applications to share processors, memory, and I/O devices.
Each SHub ASIC in the system acts as a memory controller between processors and memory for both local and remote memory references.
The NUMAlink interconnect channels information between all the modules in the system to create a single contiguous memory space of up to 384 GB, and it gives every processor in the system direct access to every I/O slot in the system.
Together, the SHub ASICs and the NUMAlink interconnect enable efficient access to processors, local and remote memory, and I/O devices without the bottlenecks associated with switches, backplanes, and other commodity interconnect technologies.
System scalability. The NUMAflex architecture incorporates a low-latency, high-bandwidth interconnect that is designed to maintain performance as you scale system computing, I/O, and storage functions. For example, the computing dimension in some system configurations can range from 1 to 32 processors in a single system image (SSI).
Efficient resource management. The NUMAflex architecture is designed to run complex models and, because the entire memory space is shared, large models can fit into memory with no programming restrictions. Rather than waiting for all of the processors to complete their assigned tasks, the system dynamically reallocates memory, resulting in faster time to solution.
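On a Linux-based operating system such as SGI ProPack, the single shared memory described above is presented to software as a set of NUMA nodes. The following is a minimal, generic Linux sketch (not an Altix-specific tool) that lists the NUMA node numbers the kernel exposes through sysfs:

```python
import os

def list_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node numbers the kernel exposes.

    On a single-node (or non-NUMA) machine this returns [0]; on a
    large DSM system, each processor/memory module appears as its
    own node.
    """
    if not os.path.isdir(sysfs_root):
        return [0]  # kernel built without NUMA support
    nodes = []
    for entry in os.listdir(sysfs_root):
        # Node directories are named node0, node1, node2, ...
        if entry.startswith("node") and entry[4:].isdigit():
            nodes.append(int(entry[4:]))
    return sorted(nodes) or [0]

print(list_numa_nodes())
```

Each node directory also contains `meminfo` and `cpulist` files that show how memory and processors are distributed across the nodes.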
|Note: This chapter provides a brief overview of the SGI Altix 4700 series system. For more information on this system, see the SGI Altix 4700 System User's Guide, available on the SGI Technical Publications Library; it provides a detailed overview of the SGI Altix 4700 system components and describes how to set up and operate the system. For an overview of the SGI Altix 450 system, see Chapter 3, “System Overview” in the SGI Altix 450 System User's Guide.|
The Altix 4700 series is a family of multiprocessor distributed shared memory (DSM) computer systems that currently scales from 8 to 512 CPU sockets (up to 1,024 processor cores) and can accommodate up to 6 TB of globally shared memory in a single system while delivering a teraflop of performance in a small-footprint rack.
The SGI Altix 450 currently scales from 2 to 76 cores as a cache-coherent single system image (SSI). For an overview of the SGI Altix 450 system, see Chapter 3, “System Overview” in the SGI Altix 450 System User's Guide.
Future releases will scale to larger processor counts for single system image (SSI) applications. Contact your SGI sales or service representative for the most current information on this topic.
In a DSM system, each processor board contains memory that it shares with the other processors in the system. Because the DSM system is modular, it combines the advantages of low entry-level cost with global scalability in processors, memory, and I/O. You can install and operate the Altix 4700 series system in a rack in your lab or server room. Each 42U SGI rack holds from one to four enclosures known as individual rack units (IRUs); each IRU is 10U in height (see Figure 2-2) and supports up to ten processor and I/O sub-modules known as "blades." These blades are single printed circuit boards (PCBs) with ASICs, processors, and memory components mounted on a mechanical carrier. The blades slide directly in and out of the Altix 4700 IRU enclosures.
|Note: An Altix 4700 system can support up to eight RASC blades per single system image. Current configuration rules require two compute blades for every RASC blade in your system.|
The Altix 4700 computer system is based on a distributed shared memory (DSM) architecture. The system uses a global-address-space, cache-coherent multiprocessor that scales up to 64 Intel 64-bit processors in a single rack. Because it is modular, the DSM combines the advantages of lower entry cost with the ability to scale processors, memory, and I/O independently to a maximum of 512 processors in a single-system image (SSI). Larger SSI configurations may be offered in the future; contact your SGI sales or service representative for information.
The system architecture for the Altix 4700 system is a fourth-generation NUMAflex DSM architecture known as NUMAlink-4. In the NUMAlink-4 architecture, all processors and memory are tied together into a single logical system with special crossbar switches (routers). This combination of processors, memory, and crossbar switches constitutes the interconnect fabric called NUMAlink. There are four router switches in each 10U IRU enclosure.
The basic expansion building block for the NUMAlink interconnect is the processor node; each processor node consists of a Super-Hub (SHub) ASIC and one or two 64-bit processors with three levels of on-chip secondary caches. The Intel 64-bit processors are connected to the SHub ASIC via a single high-speed front-side bus.
The SHub ASIC is the heart of the processor and memory node blade technology. This specialized ASIC acts as a crossbar between the processors, local SDRAM memory, the network interface, and the I/O interface. The SHub ASIC memory interface enables any processor in the system to access the memory of all processors in the system. Its I/O interface connects processors to system I/O, which allows every processor in a system direct access to every I/O slot in the system.
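Although the SHub ASIC lets any processor reach any memory in the system, software often performs best when a job stays on the processors closest to its memory. As a hedged illustration (this uses the generic Linux scheduler-affinity interface, not an Altix-specific API), a process can be confined to a chosen set of CPUs so that most of its memory references remain local:

```python
import os

# On a cc-NUMA system, pinning a process to the CPUs of one node
# keeps most of its memory references local to that node's memory.
# os.sched_setaffinity() is the generic Linux interface for this;
# CPU 0 is used here only because it exists on every system.
os.sched_setaffinity(0, {0})          # pid 0 means "this process"
print(sorted(os.sched_getaffinity(0)))  # -> [0]
```

Tools such as `numactl` build on the same kernel interfaces to bind both CPU placement and memory allocation policy for a whole command.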
Another component of the NUMAlink-4 architecture is the router ASIC, a custom-designed 8-port crossbar ASIC. Using the router ASICs with a highly specialized backplane or with NUMAlink-4 cables provides a high-bandwidth, extremely low-latency interconnect between all processor, I/O, and other option blades within the system.