This chapter discusses the following:
The SGI InfiniteStorage Gateway provides a single-enclosure server and disk archiving solution that you can attach to appropriate secondary-storage media (see “Secondary-Storage Requirements for DMF Simplified Configuration”). It integrates the following major components:
SGI Modular InfiniteStorage (MIS) server, which houses the primary cache for user filesystems with either a 40-drive capacity (expandable in 8-drive increments) or a 72-drive capacity. Each set of 8 drives is configured as a single 6+2 RAID 6 LUN. A total of 8 drives are reserved for administrative use and spares. A full system has a raw capacity of 288 TB, of which 192 TB is usable filesystem space. On a full system, you can create filesystems ranging in size from one 192-TB filesystem to eight 24-TB filesystems; a given filesystem can span from 1 to 8 LUNs.
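The capacity figures above can be checked with a little arithmetic. The following is a minimal sketch, not a sizing tool: it assumes 4-TB drives, a value that the text does not state but that follows from 288 TB of raw capacity across 72 drives. The constant and function names are illustrative only.

```python
# Capacity check for a full 72-drive system.
# ASSUMPTION: 4 TB per drive, inferred from 288 TB raw / 72 drives.
TOTAL_DRIVES = 72
RESERVED_DRIVES = 8        # administrative use and spares
DRIVES_PER_LUN = 8         # one 6+2 RAID 6 set
DATA_DRIVES_PER_LUN = 6    # 2 of every 8 drives hold parity
DRIVE_TB = 4               # assumed drive size


def capacity(total_drives: int = TOTAL_DRIVES) -> dict:
    """Return raw TB, LUN count, TB per LUN, and usable TB."""
    luns = (total_drives - RESERVED_DRIVES) // DRIVES_PER_LUN
    tb_per_lun = DATA_DRIVES_PER_LUN * DRIVE_TB
    return {
        "raw_tb": total_drives * DRIVE_TB,
        "luns": luns,
        "tb_per_lun": tb_per_lun,
        "usable_tb": luns * tb_per_lun,
    }


full = capacity()
print(full)
```

Under that assumption, the result reproduces the figures in the text: 8 LUNs of 24 TB each, for 192 TB of usable space out of 288 TB raw.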
This provides the following theoretical large-block bandwidth per LUN:
A base two-port 10-Gbit Ethernet card (for either a 40-drive or a 72-drive system) with an option to add a two-port or a four-port 10-Gbit Ethernet card (for a maximum total of six ports).
Four 8-Gbit Fibre Channel ports provide connection to the secondary-storage media. (The ports also run at 4 Gbit and 2 Gbit.)
The Data Migration Facility (DMF), which monitors the capacity of online disk resources and transparently moves file data from the DMF-managed filesystem on online disk to secondary storage. Periodically or when the managed filesystems reach a certain level of fullness, DMF automatically migrates file data to two migration targets on secondary storage (creating two copies of the data in order to prevent file data loss in the event that a migrated copy is lost). Migrated files appear as normal files to users and are always easily accessible via high-performance network connections. By transparently managing the migration of file data, the SGI InfiniteStorage Gateway lets you cost-effectively maintain a seemingly infinite amount of data without sacrificing accessibility.
SGI InfiniteStorage Gateway Management Center interface, which lets you do the following:
Configure DMF (if using the appropriate media; see “Secondary-Storage Requirements for DMF Simplified Configuration”)
Configure Common Internet File System (CIFS) shares
Perform basic monitoring and operational tasks
Access the DMF Manager interface, which lets you perform detailed monitoring and management of DMF when necessary
(Optional future offering) LiveArc™ Archive Edition (AE) digital asset management interface, which includes the following features:
At the factory, SGI preinstalls the required software and preconfigures the required administrative filesystems. When you purchase standard installation services, SGI personnel will install the hardware, connect the appropriate storage, and (if using secondary storage that is appropriate for the DMF simplified configuration as described in “Secondary-Storage Requirements for DMF Simplified Configuration”) create your user filesystems and configure DMF using information that you provide; see Chapter 2, “Initial Configuration”.
DMF requires that you attach secondary-storage media to the SGI InfiniteStorage Gateway system. DMF supports simplified configuration for the following secondary-storage devices:
One or more of the following physical or logical tape libraries attached via Fibre Channel:
Spectra Logic® T50, T120, T-Series (T200, T380, T680), T950, and Tfinity libraries
Oracle® StorageTek SL150, L180, L700, L700e, SL500, and SL3000 libraries using the SCSI interface
Under the simplified configuration, DMF makes two copies of file data onto two separate migration targets on secondary storage. The configuration process will divide the tapes in each library into two DMF volume groups (VGs), and each VG will be a migration target for DMF.
You can have up to four physical libraries.
Unformatted SGI COPAN 400 native massive array of idle disks (MAID) SGI ZeroWatt™ disk. The COPAN MAID cabinet has up to 8 shelves and each shelf will be a VG (migration target) for DMF; therefore, you must use shelves in pairs so that there are two migration targets.
Note the following requirements:
Physical/logical libraries have the following requirements:
The cartridges in the physical/logical libraries that you select for DMF must not contain data that you want to preserve.
| Caution: Any existing data will be destroyed. |
You cannot share volumes within one physical/logical library between DMF and any other application. If you split a physical library into two logical libraries via Shared Library Services (SLS) partitioning, you can use the first logical library for DMF and the second for another application; however, both libraries will appear on the DMF Configuration page in the Management Center, so you must ensure that you select only the library you want to use for DMF when completing the configuration.
All of the cartridges within a given physical/logical library must be of the same cartridge type.
All of the drives within a given physical/logical library must be of the same drive type, such as Ultrium5 (although different manufacturers of the same drive type may be used, such as a mix of both HP Ultrium5 and IBM Ultrium5).
The library must have at least two drives. If you use the library for backups, it must have at least three drives.
The MIS provides four available Fibre Channel ports. If you require more connections for your storage media, you must add a Fibre Channel switch to the configuration.
The DMF simplified configuration process expects each COPAN MAID shelf to be in an unformatted state. If you have an existing COPAN MAID system that you want to use with the SGI InfiniteStorage Gateway, you must first return it to unformatted (factory) state. See “Returning a Shelf to Factory State” in Chapter 7.
The DMF configuration requires that you specify the MIS hostname and the management IP address.
After the DMF configuration completes, all of the secondary storage configured for DMF will be owned by DMF only and cannot be shared with another application.
The name you assign to each physical/logical library and shelf must be unique.
DMF uses volume serial numbers (VSNs) to identify data location in specific cartridges; therefore, every VSN in the DMF environment must be unique:
For physical/logical libraries, the tape barcodes will determine the VSNs.
For shelves, the shelf name you choose will determine the VSNs on that shelf. You should name the shelves according to their physical location in the cabinet so that a shelf can be easily identified if service is required. You must follow the naming conventions specified in “Shelves Available for Configuration” in Chapter 2.
The secondary storage must contain at least 14 volumes.
If you use physical/logical tape libraries for backup, at least one library must contain 30 additional backup volumes; this is not needed if you choose to use COPAN MAID as the backup target.
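Several of the tape-library requirements above lend themselves to a pre-configuration check. The following is a minimal sketch under stated assumptions: the `Library` class, its fields, and `check_libraries` are hypothetical and are not part of DMF or the Management Center. The sketch covers only the rules on unique names, unique VSNs, minimum drive count, uniform drive and cartridge types, and the 14-volume minimum; it deliberately omits the backup-volume and COPAN MAID shelf rules.

```python
from dataclasses import dataclass


@dataclass
class Library:
    """Hypothetical description of one physical/logical tape library."""
    name: str
    drive_types: list[str]       # one entry per drive, e.g. "Ultrium5"
    cartridge_types: list[str]   # one entry per cartridge (labels are illustrative)
    vsns: list[str]              # taken from the tape barcodes
    used_for_backup: bool = False


def check_libraries(libs: list[Library]) -> list[str]:
    """Return a list of requirement violations (empty if none are found)."""
    problems = []
    names = [lib.name for lib in libs]
    if len(set(names)) != len(names):
        problems.append("library names must be unique")
    all_vsns = [v for lib in libs for v in lib.vsns]
    if len(set(all_vsns)) != len(all_vsns):
        problems.append("every VSN must be unique")
    if len(all_vsns) < 14:
        problems.append("secondary storage needs at least 14 volumes")
    for lib in libs:
        # Two drives minimum; three if the library is also used for backups.
        min_drives = 3 if lib.used_for_backup else 2
        if len(lib.drive_types) < min_drives:
            problems.append(f"{lib.name}: needs at least {min_drives} drives")
        if len(set(lib.drive_types)) > 1:
            problems.append(f"{lib.name}: mixed drive types")
        if len(set(lib.cartridge_types)) > 1:
            problems.append(f"{lib.name}: mixed cartridge types")
    return problems
```

A check like this could be run against an inventory of the libraries before starting the configuration, so that violations surface before any cartridge is claimed by DMF.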
This section discusses more details about DMF:
DMF transparently moves file data from high-performance but expensive disk to levels of decreased-performance but inexpensive media known as secondary storage. This lets you cost-effectively maintain a seemingly infinite amount of data without sacrificing accessibility for users.
As the filesystems fill, DMF frees the data blocks of the least-recently accessed files on MIS disk, thereby always keeping space free for new files and recalled files. During the time that file data resides on both the MIS disk and the secondary storage, it will be returned to the user immediately from MIS disk; after its disk data blocks are freed, it will be recalled from secondary storage after a delay (depending upon the secondary-storage characteristics). Regardless of its actual location, all of the data is always available to users via normal access methods.
Figure 1-1 illustrates the concept of the DMF migration cycle between the DMF-managed filesystem on MIS disk and the secondary storage.
In a full system, there can be up to eight user filesystems that will be monitored by DMF, one per MIS LUN. (A given filesystem can also consist of multiple entire LUNs.) You will determine this configuration when you create the filesystem; see “Create and Mount the User Filesystems” in Chapter 2.
All of the filesystems beneath the /dmf directory are required for administration of DMF and (if installed) LiveArc AE. You should not manually modify any files in this directory, and you cannot unmount these filesystems or export them via NFS or CIFS. These filesystems are allotted a total of 8 TB on the MIS disk.
In general, only the most timely data resides on the higher-performance MIS disk; DMF automatically migrates less timely data to secondary storage. However, all of the data is always available to users and applications using normal access methods, regardless of the data's actual location.
Although DMF moves file data, it leaves file metadata in place on the MIS disk so that users can access files without knowing the actual location of the data. Metadata consists of items such as index nodes (inodes) and directory structure. Migrated files appear as normal files to users and are always easily accessible via high-performance network connections.
Because migrated files remain cataloged in their original directories, users and applications never need to know where the data actually resides; they can access any migrated file using normal processes. In fact, when drilling into directories or listing their contents using standard POSIX-compliant commands, a user cannot determine the location of file data within the storage tier; determining the data's actual residence requires special commands or command options.
A file whose data blocks have been freed is considered from the DMF perspective to be offline and its data blocks are therefore available for new active data, either new files or recalled files. However, from the user perspective, the file always appears to be online because the inodes and directories remain in the DMF-managed filesystem, allowing users to access the file by normal means.
The only difference users might notice when accessing a file whose data blocks have been freed is a delay in response time, because the data must be retrieved from secondary storage. From the user's perspective, all data always appears to be available online, regardless of its actual location.
DMF continuously monitors the DMF-managed filesystems on MIS disk so that it can maintain a certain amount of free space in those filesystems. This free space permits the creation of new files and the recall of previously migrated files. DMF maintains free space by freeing data blocks on the MIS disk for files that have already been migrated. DMF releases the data blocks of the least-recently accessed files until the free space is well above the free-space minimum threshold. From a user's perspective, all content is accessible all of the time.
Figure 1-2 shows the concept of the free-space minimum threshold, where DMF will free the data blocks of less-recently accessed files (such as represented by the letter A) to empty the MIS disk well below the threshold as new files are added or as previously migrated files (such as represented by the letters B and E) are recalled.
| Note: For simplicity, this figure does not show the second copy of file data. Data will be recalled from a second copy only if necessary. |
The SGI InfiniteStorage Gateway migration policy does the following:
Makes two copies of migrated data. DMF places those copies on separate secondary-storage targets (in two separate volumes on the available physical/logical libraries or COPAN MAID shelves). Creating two copies prevents file data loss in the event that one copy is damaged.
Migrates the data for most files to the secondary storage, other than recently accessed files (to allow file content to stabilize). This occurs every 4 hours or when the free-space minimum threshold is reached.
Keeps a small amount of data in the DMF-managed filesystem on MIS disk for each file even after migration (for use by file managers, in order to avoid unnecessary recall of a file due to directory browsing).
Maintains at least 5% of the DMF-managed filesystem free for new data. When the filesystem reaches this threshold, DMF will free the already-migrated data blocks from the filesystem until 10% of the filesystem is free, selecting the least-recently accessed files first. (When LiveArc AE is installed, the numbers are 10% and 20%, respectively.)
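The free-space rule in the last item can be sketched as follows. This is an illustration of the policy as described, not DMF's actual implementation: the `ManagedFile` class and `select_files_to_free` function are hypothetical, the 5% and 10% values are the defaults given above (without LiveArc AE), and the sketch ignores the small amount of data that DMF keeps on MIS disk for each migrated file.

```python
from dataclasses import dataclass

FREE_MIN_PCT = 5.0      # start freeing when free space drops to this level
FREE_TARGET_PCT = 10.0  # stop freeing once this much space is free


@dataclass
class ManagedFile:
    name: str
    size: int            # bytes held in data blocks on MIS disk
    last_access: float   # timestamp; smaller means accessed longer ago
    migrated: bool       # True once the data has been copied to secondary storage


def select_files_to_free(files, capacity, used):
    """Pick migrated files, least-recently accessed first, until the target is met."""
    free_pct = 100.0 * (capacity - used) / capacity
    if free_pct > FREE_MIN_PCT:
        return []  # threshold not reached; nothing to do
    chosen = []
    # Only files that already have copies on secondary storage are candidates.
    candidates = sorted((f for f in files if f.migrated), key=lambda f: f.last_access)
    for f in candidates:
        if 100.0 * (capacity - used) / capacity >= FREE_TARGET_PCT:
            break
        chosen.append(f.name)
        used -= f.size
    return chosen
```

Using two separate values (a trigger and a target) means that each round of freeing opens up a margin of space rather than stopping the moment the trigger is cleared, so the filesystem does not hover at the threshold.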
During the period when the data has been copied to the secondary storage but remains in the data blocks on the MIS disk, the file is considered to be dual-state. If a user recalls a dual-state file, DMF retrieves it directly from the DMF-managed filesystem on MIS disk for fast access, rather than from one of the copies on secondary storage.
Only the most timely data resides on the higher-performance MIS disk; less timely data is automatically migrated to secondary storage, thereby leaving free space on the MIS disk that can be used for new files and newly recalled files. However, all of the data is always available to users and applications, regardless of its actual location.
When the filesystem on MIS disk reaches the threshold, DMF frees the data blocks for the least-recently accessed files that have copies on secondary storage.
| Note: A file's data blocks can only be freed on the MIS disk after the data has been copied to secondary storage. |
The following figures show concepts of DMF data migration:
Figure 1-3 shows the concept of DMF data migration in which file data is copied from the MIS disk to the secondary storage, but the inode remains in place in the DMF-managed filesystem.
Figure 1-4 shows the concepts of freeing filesystem space while retaining the filesystem metadata and recalling file data from secondary storage.
| Note: For simplicity, these figures do not show the second copy of file data. Data will be recalled from a second copy only if necessary. |
Figure 1-5 shows an example of the path that file data might take from an NFS client to a mount point on the MIS disk (for example, the /mnt0 mount point). The data is stored in the DMF-managed filesystem on the MIS disk and then migrated to the secondary storage (in this case, COPAN MAID shelves 0 and 1).
A user retrieves a file simply by accessing it normally through NFS or CIFS; DMF automatically recalls the file's data from the secondary storage, caching it on the MIS disk as shown in Figure 1-4. After the data is restored to the MIS disk, the file becomes dual-state; if the user changes it, it once again becomes a regular file.
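The file states described in this chapter (regular, dual-state, and offline) form a small state machine. The sketch below models only the transitions that the text describes; the event names (`migrate`, `free_blocks`, `recall`, `modify`) are illustrative labels, not DMF commands, and real DMF may track additional intermediate states that are not covered here.

```python
from enum import Enum


class State(Enum):
    REGULAR = "regular"        # data only on MIS disk
    DUAL_STATE = "dual-state"  # data on MIS disk and on secondary storage
    OFFLINE = "offline"        # data blocks freed; data only on secondary storage


# (current state, event) -> next state; event names are illustrative
TRANSITIONS = {
    (State.REGULAR, "migrate"): State.DUAL_STATE,
    (State.DUAL_STATE, "free_blocks"): State.OFFLINE,
    (State.OFFLINE, "recall"): State.DUAL_STATE,
    (State.DUAL_STATE, "modify"): State.REGULAR,
}


def next_state(state: State, event: str) -> State:
    """Return the state after an event, or raise if no such transition is modeled."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition for {event!r} from {state.value}") from None
```

Note that the table has no `free_blocks` entry for a regular file, which mirrors the rule that data blocks can be freed only after the data has been copied to secondary storage.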
If you recall more files than the DMF-managed filesystem can currently contain, DMF migrates files as needed and frees the data blocks of the least-recently accessed files until the free space is once again well above the free-space minimum threshold.
Each day, the SGI InfiniteStorage Gateway will migrate all data to the secondary storage and back up the inodes and directories in the DMF-managed filesystems. For more details about backups, see the SGI InfiniteStorage Gateway release notes; to access the release notes, see “Getting Help” in Chapter 2.
You should monitor DMF on a daily basis to ensure that it is operating properly. After the SGI InfiniteStorage Gateway configuration is complete, several DMF automated tasks will periodically generate reports about activity, status, and errors that you should monitor. Additionally, some serious error conditions generate messages that you should investigate. See Chapter 4, “Basic DMF Monitoring and Management”.
| Note: It is important that you examine these reports and messages regularly so that you can detect problems early enough to retrieve information that can help diagnose them. |
You should periodically monitor the system for available software updates. See “Create a Software Update Repository” in Chapter 3.
If you run into problems, see Chapter 7, “Troubleshooting”, to determine if the issue is something you can easily solve yourself or if you should contact SGI Support. If you suspect a serious problem, you should contact SGI Support promptly and then collect information about DMF to ensure that the problem can be fixed efficiently; see “Gathering DMF Data” in Chapter 4.