Chapter 2. Storage System Configurations

This chapter explains the various Challenge RAID configurations. Use it to plan your storage system or whenever you contemplate changes in your storage system or physical disk configuration.

A Challenge RAID storage system is configured on two levels: the storage system configuration and the disk configuration.

Before you can plan your disk configuration, you must understand storage system configuration. Several storage system configurations are available for Challenge RAID storage systems. Table 2-1 lists the hardware components making up each configuration and summarizes the features of each.

This chapter discusses these configurations in separate sections. Each section explains the error recovery features of the configuration.

Table 2-1. Challenge RAID Configurations

| Configuration | Hosts | SCSI-2 Interfaces | SCSI-2 Buses | SPs | Features |
| --- | --- | --- | --- | --- | --- |
| Basic | 1 | 1 | 1 | 1 | Applications can continue after failure of any disk module, but cannot continue after failure of SCSI-2 interface or SP. |
| Dual-interface/dual-processor | 1 | 2 | 2 | 2 | Provides highest availability and best storage system performance for single-host configurations. Applications can continue after any disk module fails. |
| Split-bus | 2 | 2 (1 per server) | 2 (1 per server) | 2 | Resembles two basic configurations side by side. Each host and its applications can continue after any disk module fails. The host using a failed SCSI-2 interface or SP cannot continue after failure, but the other host can. If one host, SCSI-2 adapter, or SP fails, the other host can take over the failed host's disks with system operator intervention. |
| Dual-bus/dual-initiator | 2 | 4 (2 per server) | 2 (1 per server) | 2 | Provides highest availability and best storage-system performance for dual-host configurations. With RAID of any level other than 0, applications can continue after failure of any disk module. If one host, SCSI-2 adapter, or SP fails, the other host can take over the failed host's disk units with system operator intervention. This configuration is required for Silicon Graphics IRIS FailSafe and Oracle Parallel Server (OPS). |
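As a planning aid, the component counts in Table 2-1 can be written down as data and filtered by requirement. The following is a minimal sketch, not part of any Challenge RAID software: the class name, field names, and the `candidates` helper are assumptions made for illustration, and only the counts come from the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Configuration:
    """One row of Table 2-1 (field names are illustrative, not an official API)."""
    name: str
    hosts: int
    scsi_interfaces: int  # total across all hosts
    scsi_buses: int
    sps: int


# Component counts copied from Table 2-1.
CONFIGURATIONS = [
    Configuration("Basic", 1, 1, 1, 1),
    Configuration("Dual-interface/dual-processor", 1, 2, 2, 2),
    Configuration("Split-bus", 2, 2, 2, 2),
    Configuration("Dual-bus/dual-initiator", 2, 4, 2, 2),
]


def candidates(hosts: int, two_interfaces_per_host: bool) -> List[str]:
    """Return the names of configurations that match the host count and,
    optionally, give every host two SCSI-2 interfaces."""
    matches = []
    for config in CONFIGURATIONS:
        if config.hosts != hosts:
            continue
        interfaces_per_host = config.scsi_interfaces // config.hosts
        if two_interfaces_per_host and interfaces_per_host < 2:
            continue
        matches.append(config.name)
    return matches


if __name__ == "__main__":
    print(candidates(hosts=1, two_interfaces_per_host=False))
    print(candidates(hosts=2, two_interfaces_per_host=True))
```

The filter considers hardware counts only; the feature column of Table 2-1 and the per-configuration sections that follow still determine which configuration suits a site.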


Basic Configuration

The basic configuration has one host with one SCSI-2 interface connected by a SCSI-2 bus to the SP in the storage system.

The system can survive failure of a disk module within a redundant RAID group, but it cannot continue after failure of a SCSI-2 interface or SP. Table 2-2 lists the error recovery features of the basic configuration.

Table 2-2. Error Recovery: Basic Configuration

| Failing Component | Continue After Failure? | Recovery |
| --- | --- | --- |
| Disk module | Yes | Applications continue running. System operator replaces module. |
| Storage-control processor | No | Storage system fails. System operator replaces SP and restarts operating system. |
| Fan module | Yes | Applications continue running. System operator replaces module. |
| Power supply | Yes | If redundant power supply module is present, applications continue running; otherwise, storage system fails. Service provider replaces power supply. |
| SCSI-2 interface | No | I/O operations to storage system disk units fail. Authorized service provider replaces interface, and system operator restarts operating system and applications. |
| SCSI-2 cable | No | I/O operations to storage system fail. System operator replaces cable, and restarts operating system and applications. |


Dual-Interface/Dual-Processor Configuration

The dual-interface/dual-processor configuration has one host with two SCSI-2 interfaces, each connected by a SCSI-2 bus to a different SP in the storage system.

For better performance with this configuration, you can bind some physical disk units on one SP and some other physical disk units on the other SP. The SP that binds a physical disk unit is the default owner of that physical disk unit.

The storage system can continue running after failure of a disk module within a redundant RAID group. It cannot continue after a SCSI-2 interface or an SP fails unless you manually transfer disk ownership. Table 2-3 lists these features.

Table 2-3. Error Recovery: Dual-Interface/Dual-Processor Configuration

| Failing Component | Continue After Failure? | Recovery |
| --- | --- | --- |
| Disk module | Yes | Applications continue running. System operator replaces module. |
| Storage-control processor | Yes | I/O operations fail to the disk units owned by a failing SP. System operator can transfer control of the failed SP's disk units to the working SP, shut down the host, power off and on the storage system, reboot the host, and, when convenient, replace the SP and transfer control of disk units to the replacement SP. |
| Fan module | Yes | Applications continue running. Silicon Graphics SSE or other authorized service provider replaces module. |
| Power supply | Yes | If redundant power supply module is present, applications continue running; otherwise, storage system fails. Service provider replaces power supply. |
| SCSI-2 interface | Yes | I/O operations fail to storage system disk units owned by the SP attached to the failed interface. System operator can transfer control of the failed SP's disk units to the SP on the surviving interface, shut down the host, power off and on the storage system, and reboot the host. When convenient, the Silicon Graphics SSE or other authorized service provider can replace the interface and the system operator can transfer control of disk units to the replacement SP. |
| SCSI-2 cable | Yes | I/O operations fail to storage-system disk units owned by the SP attached to the failed cable. System operator can transfer control of these disk units to the other SP, shut down the host, power off and on the storage system, reboot the host, replace the cable, and transfer control of disk units to the replacement SP. |

In the example diagrammed in Figure 2-1, one group of five disk modules is bound by storage-control processor A (SP A) and another group of five disk modules is bound by SP B.

Figure 2-1. Dual-Interface/Dual-Processor Configuration Example


In this example, if one SP or SCSI-2 interface fails, stored data in either LUN is available through the alternate path. Automatic path switching in the event of an SP or SCSI-2 interface failure is possible if XLV volumes and applicable patches are used. For information on XLV volumes, see Getting Started With XFS Filesystems.
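The ownership rule described in this section (the SP that binds a disk unit is its default owner, and the operator can transfer a failed SP's units to the working SP) can be sketched as simple bookkeeping. This is an illustration only: the function names and the unit labels `LUN 0` and `LUN 1` are hypothetical, and the sketch does not represent the storage system's actual management commands.

```python
from typing import Dict, List


def bind(owners: Dict[str, str], unit: str, sp: str) -> None:
    """Record that `sp` bound `unit`; the binding SP is the default owner."""
    owners[unit] = sp


def transfer_ownership(owners: Dict[str, str], failed_sp: str,
                       working_sp: str) -> List[str]:
    """Move every disk unit owned by `failed_sp` to `working_sp`.

    Returns the units that changed owner. This models only the ownership
    table; Table 2-3 also calls for shutting down the host, power cycling
    the storage system, and rebooting the host.
    """
    moved = [unit for unit, sp in owners.items() if sp == failed_sp]
    for unit in moved:
        owners[unit] = working_sp
    return moved


if __name__ == "__main__":
    owners: Dict[str, str] = {}
    bind(owners, "LUN 0", "SP A")  # first group of five disk modules
    bind(owners, "LUN 1", "SP B")  # second group of five disk modules
    moved = transfer_ownership(owners, failed_sp="SP A", working_sp="SP B")
    print(moved, owners)
```

After the failed SP is replaced, the same bookkeeping applies in the other direction when control of the disk units is transferred to the replacement SP.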


Note: Only qualified Silicon Graphics System Service Engineers can replace SPs or SCSI-2 interfaces.


Split-Bus Configuration

The split-bus configuration has two hosts, each with a SCSI-2 interface connected by a SCSI-2 bus to a storage-control processor in the storage system. Each host uses its own disks in the storage system independently.

The split-bus configuration resembles two basic configurations side by side. This configuration can be used for sites requiring high availability because either host can continue after failure of any disk module within a disk array, and a host can take over a failed host's disks. A host cannot continue after a SCSI-2 interface or an SP fails unless you manually transfer disk ownership.

Table 2-4 lists the error recovery features for this configuration.

Table 2-4. Error Recovery: Split-Bus Configuration

| Failing Component | Continue After Failure? | Recovery |
| --- | --- | --- |
| Disk module | Yes | Applications continue running. System operator replaces module. |
| Storage-control processor | Yes | I/O operations fail to the disk units owned by a failing SP. System operator can transfer control of failed SP's disk units to the working SP, shut down the host, power off and on the storage system, and reboot the host. Silicon Graphics SSE or other authorized service provider replaces the SP and transfers control of disk units to the replacement SP. |
| Fan module | Yes | Applications continue running. Silicon Graphics SSE or other authorized service provider replaces the module. |
| Power supply | Yes | If redundant power supply module is present, applications continue running; otherwise, storage system fails. Service provider replaces power supply. |
| SCSI-2 interface | Yes | I/O operations fail to storage-system disk units owned by the SP attached to the failed interface. System operator can transfer control of the failed SP's disk units to the SP on the interface in the other host, shut down the other host, power off and on the storage system, and reboot the other host. Silicon Graphics SSE or other authorized service provider replaces the interface. |
| SCSI-2 cable | Yes | I/O operations fail to storage-system disk units owned by the SP attached to the failed cable. System operator can transfer control of these disk units to the other SP, shut down the host, power off and on the storage system, reboot the host, replace the cable, and transfer control of disk units to the replacement SP. |

In the example diagrammed in Figure 2-2, one group of five disk modules is bound by storage-control processor A (SP A), which is connected via a SCSI-2 bus to one Challenge server; another group of five disk modules is bound by SP B, which is connected by a different SCSI-2 bus to the second Challenge server.

Figure 2-2. Split-Bus Configuration Example



Caution: This configuration does not afford failover capability.

If one SP fails or if the SCSI-2 connection from one host is broken, that host does not have access to the Challenge RAID storage system until the SP is replaced or the SCSI-2 connection is repaired. The host using the remaining SCSI-2 connection and remaining operational SP still has full access to its own data.

The storage-control processor that binds a disk module is the default owner of the disk module. The route through the SP that owns a disk module is the primary route to the disk module. The route through the other SP is the secondary route to the disk module.

In a dual-interface system, either Challenge server can use any of the disk modules in the storage system, but only one Challenge server at a time can use a disk module.

Dual-Bus/Dual-Initiator Configuration

The dual-bus/dual-initiator configuration provides the highest availability. Each host has two SCSI-2 adapters, each of which connects by a separate SCSI-2 bus to a separate SP in the storage system. Since this configuration protects against a SCSI-bus cable failure, it provides higher availability than the dual-initiator configuration. It is intended for sites requiring the highest level of availability, such as those running Oracle Parallel Server or FailSafe.

For better performance with this configuration, you can bind some physical disk units on one SP and the other physical disk units on the other SP. The SP that binds a physical disk unit is its default owner. The route through the SP that owns a physical disk unit is the primary route to the physical disk unit. The route through the other SP is the secondary route to the physical disk unit, and is available if a component in the primary route fails. Table 2-5 lists the error recovery features of the dual-bus/dual-initiator configuration.
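The primary and secondary routes can be illustrated with a short sketch that checks which route from a host to a disk unit is free of failed components. All names here (`route_components`, `select_route`, and the component labels) are hypothetical and chosen for the example; the sketch assumes two SPs labeled A and B and one adapter and bus per SP, as described above.

```python
from typing import Optional, Set, Tuple


def route_components(host: str, sp: str) -> Set[str]:
    """Components on the route from `host` to SP `sp`: the host's adapter
    for that SP, the bus to that SP, and the SP itself (labels are
    illustrative)."""
    return {f"{host}:adapter-{sp}", f"bus-{sp}", f"SP-{sp}"}


def select_route(host: str, owner_sp: str,
                 failed: Set[str]) -> Optional[Tuple[str, str]]:
    """Return ("primary", sp) if the route through the owning SP has no
    failed component, ("secondary", sp) if only the other route is intact,
    or None if both routes contain a failed component."""
    other_sp = "B" if owner_sp == "A" else "A"
    for label, sp in (("primary", owner_sp), ("secondary", other_sp)):
        if not (route_components(host, sp) & failed):
            return label, sp
    return None


if __name__ == "__main__":
    print(select_route("host1", "A", failed=set()))
    print(select_route("host1", "A", failed={"SP-A"}))
    print(select_route("host1", "A", failed={"bus-A", "SP-B"}))
```

The sketch only identifies which route remains intact. Using the secondary route still requires the operator intervention listed in Table 2-5, such as transferring control of the disk units to the other SP.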


Caution: Because both hosts can access the same disk modules simultaneously, the danger exists that one host can overwrite data stored by the other. This configuration requires specific hardware and software (such as a database lock manager) to protect the integrity of the stored data.


Table 2-5. Error Recovery: Dual-Bus/Dual-Initiator Configuration

| Failing Component | Continue After Failure? | Recovery |
| --- | --- | --- |
| Disk module | Yes | With any RAID level other than 0, applications continue running. System operator replaces module. |
| Storage-control processor | Yes | I/O operations fail to the disk units owned by the failing SP. System operator can transfer control of the failed SP's disk units to the surviving SP, shut down both hosts, power the storage system off and on, and reboot both hosts. When convenient, Silicon Graphics SSE or other authorized service provider replaces the SP, and the system operator can transfer control of the disk units to the replacement SP. |
| Fan module | Yes | Applications continue running. System operator replaces module. |
| Power supply | Yes | If redundant power supply module is present, applications continue running; otherwise, storage system fails. Service provider replaces power supply. |
| SCSI-2 interface | Yes | The host with the failed adapter cannot access the disks owned by the SP connected to the failed adapter. System operator can transfer control of these disks to the SP connected to the working adapter, shut down both hosts, power off and on the storage system, and reboot both hosts. When convenient, the Silicon Graphics SSE or other authorized service provider can replace the interface, and the system operator can transfer control of disk units to the replacement SP. |
| SCSI-2 cable | Yes | I/O operations fail to storage-system disk units owned by the SP connected to the failed cable. System operator can transfer control of these disk units to the other SP, shut down the host, power off and on the storage system, reboot the host, replace the cable, and transfer control of the disk units to the replacement SP. |
| Host | Yes | Operations continue on the surviving host. |
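For a side-by-side comparison, the "Continue After Failure?" columns of Tables 2-2 through 2-5 can be collected into one lookup. This is a hedged sketch: the dictionary layout and the `single_points_of_failure` helper are assumptions made for illustration, while the Yes/No values are taken from the tables. The "Yes" for the power supply holds only when a redundant power supply module is present, and most other "Yes" entries depend on the operator recovery steps listed in each table.

```python
from typing import Dict, List

# Components common to Tables 2-2 through 2-5.
COMPONENTS = [
    "disk module",
    "storage-control processor",
    "fan module",
    "power supply",  # "Yes" assumes a redundant power supply module
    "SCSI-2 interface",
    "SCSI-2 cable",
]


def _all_yes() -> Dict[str, bool]:
    """Row set in which every listed component is marked "Yes"."""
    return {component: True for component in COMPONENTS}


CONTINUES_AFTER_FAILURE: Dict[str, Dict[str, bool]] = {
    # Table 2-2: SP, interface, and cable failures stop the system.
    "basic": {
        **_all_yes(),
        "storage-control processor": False,
        "SCSI-2 interface": False,
        "SCSI-2 cable": False,
    },
    "dual-interface/dual-processor": _all_yes(),   # Table 2-3
    "split-bus": _all_yes(),                       # Table 2-4
    # Table 2-5 adds a row for failure of a host.
    "dual-bus/dual-initiator": {**_all_yes(), "host": True},
}


def single_points_of_failure(configuration: str) -> List[str]:
    """List components whose failure the configuration cannot survive."""
    table = CONTINUES_AFTER_FAILURE[configuration]
    return [name for name, survives in table.items() if not survives]


if __name__ == "__main__":
    for name in CONTINUES_AFTER_FAILURE:
        print(name, single_points_of_failure(name))
```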

In the example diagrammed in Figure 2-3, some modules are bound to one SP, which is their primary owner, and the other disk modules are bound to the other SP, which is their primary owner. Either host can use any of the physical disk units in the storage system, but only one host at a time can use a physical disk unit.

Figure 2-3. Dual-Bus/Dual-Initiator Configuration Example



Caution: Because both hosts have access to all disk modules and their data in this configuration, one host can overwrite the other's data unless appropriate filesystem configuration and failsafe software are in place.