Chapter 1. Introduction

This chapter provides an overview of the Data Migration Facility (DMF) and its administration.

What Is DMF?

DMF is a hierarchical storage management system for Silicon Graphics environments. Its primary purpose is to preserve the economic value of storage media and stored data. The high I/O bandwidth of these machine environments is sufficient to overrun online disk resources. Consequently, capacity scheduling, in the form of native file system migration, has become an integral part of many computing environments and is a requirement for effective use of Silicon Graphics systems.

In addition to ensuring that adequate disk space is always available, capacity scheduling allows you to maintain a data space that is larger than your online disk resource. Oversubscription requires that the value of stored data be recognized as the same or higher than that of online data; DMF provides this capability. Figure 1-1 provides a conceptual overview of the data flow between applications and storage media.

Figure 1-1. Application Data Flow


DMF supports a range of storage management applications. In some environments, DMF is used strictly to manage highly stressed online disk resources. In other environments, it is also used as an organizational tool for safely managing large volumes of offline data. In all environments, DMF scales to the storage application and to the characteristics of the available storage devices.

DMF interoperates with standard data export services such as Network File System (NFS) and File Transfer Protocol (FTP). By combining these services with DMF, as shown in Figure 1-2, you can configure a Silicon Graphics system as a high-performance file server.

Figure 1-2. DMF Network Environment


DMF transports large volumes of data on behalf of many users. Because system interrupts and occasional storage device failures cannot be avoided, it is essential that the safety and integrity of data be verifiable. Therefore, DMF also provides tools necessary to validate your storage environment.

DMF has evolved around these customer requirements for scalability and the safety of data. As a file system migrator, DMF manages the capacity of online disk resources by transparently moving file data from disk to offline media. Most commonly, the offline medium is tape, managed by OpenVault or the Tape Management Facility (TMF). However, the offline medium can be any bulk-storage device that is accessible locally or through NFS or FTP.

DMF accomplishes this data migration transparently; this means that a user cannot determine, by using POSIX-compliant commands for file system enquiry, whether a file is online or offline. Only when special commands or command options are used can a file's actual residence be determined. This transparent migration is possible because DMF leaves inodes and directories intact within the native file system.

How DMF Works

As a DMF administrator, you determine how disk space capacity is handled by selecting which file systems DMF will manage and by specifying the volume of free space that will be maintained on each file system. Space management begins with a list of user files that are ranked according to criteria you define. File size and file age are among the most common ranking criteria.
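The ranking step can be pictured with a short sketch. It is illustrative only: the FileInfo record, the weights, and the scoring formula are assumptions made for this example and are not DMF's actual ranking policy, which you define in the configuration file.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    # Hypothetical record; DMF's real candidate list format is not shown here.
    path: str
    size_bytes: int
    age_days: float


def rank_candidates(files, size_weight=1.0, age_weight=1.0):
    """Order files from most to least attractive migration candidate."""
    def score(f):
        # Assumed formula: size in MiB and age in days, each scaled by a weight.
        return size_weight * f.size_bytes / 2**20 + age_weight * f.age_days

    return sorted(files, key=score, reverse=True)
```

Setting one weight to zero ranks purely by the other criterion, which is one simple way to express "largest first" or "oldest first".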

File migration occurs in two stages. First, a file is migrated to an offline medium. Once the offline copy is secure, the file is eligible to have its data blocks released (this usually occurs after a minimum space threshold is reached). A file with all offline copies completed is called fully backed up. A file that is fully backed up but whose data blocks have not yet been released is called a dual-state file; its data exists both online and offline, simultaneously. After a file's data blocks have been released, the file is called an offline file.

You choose both the percentage of file system volume to migrate and the volume of free space. You can trigger file migration, or file owners can issue manual migration requests.
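As a rough sketch of how a free-space threshold might drive the release of data blocks, the function below walks a ranked candidate list until a target is reached. The parameter names (minimum_free_pct, target_free_pct) are hypothetical and do not correspond to actual DMF configuration parameters.

```python
def select_files_to_free(capacity, used, candidates, minimum_free_pct, target_free_pct):
    """Pick files to release once free space drops below a minimum.

    candidates is a list of (path, size) pairs in rank order, and every
    candidate is assumed to have a secure offline copy already.
    """
    free = capacity - used
    # Nothing to do while free space is at or above the minimum threshold.
    if free * 100 >= minimum_free_pct * capacity:
        return []
    chosen = []
    for path, size in candidates:
        # Stop as soon as the free-space target has been reached.
        if free * 100 >= target_free_pct * capacity:
            break
        chosen.append(path)
        free += size
    return chosen
```

Using a target above the minimum means that each pass frees more than the bare shortfall, so the threshold is not crossed again immediately.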

Offline media is the destination of all migrated data; offline media management is handled by a daemon-like DMF component called the media-specific process (MSP). The MSP manages a pool of media volumes and moves file system data to and from offline media in response to migration requests. This component is designed to make full use of high-capacity, compressible media and to handle a large volume of transactions. The data-recording format uses blocking and checksumming to ensure the accuracy of the data and to facilitate recovery in the event of media failure.
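The idea of blocking and checksumming can be sketched as follows. This is not the MSP's actual recording format, which is not described here. The header layout (sequence number, length, CRC-32) and the block size are assumptions chosen for illustration.

```python
import struct
import zlib

# Assumed header: block sequence number, payload length, CRC-32 of the payload.
HEADER = struct.Struct(">III")


def pack_blocks(data, block_size=4096):
    """Split data into blocks, each carrying its own header and checksum."""
    blocks = []
    for seq, start in enumerate(range(0, len(data), block_size)):
        payload = data[start:start + block_size]
        blocks.append(HEADER.pack(seq, len(payload), zlib.crc32(payload)) + payload)
    return blocks


def unpack_blocks(blocks):
    """Return (recovered_data, bad_sequence_numbers), skipping damaged blocks."""
    good, bad = [], []
    for block in blocks:
        seq, length, crc = HEADER.unpack_from(block)
        payload = block[HEADER.size:HEADER.size + length]
        if len(payload) == length and zlib.crc32(payload) == crc:
            good.append(payload)
        else:
            bad.append(seq)
    return b"".join(good), bad
```

Because every block is verified independently, a damaged region costs only the blocks it touches, and the reader can still recover the rest of the volume.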

Media transports and robotic automounters are also key components of all DMF installations. Generally, DMF can be used with any transport and automounter that is supported by either OpenVault or TMF. The most commonly used devices on IRIX systems are DLT 4000/7000, SCSI versions of IBM 3590, and STK TimberLine and RedWood drives. All STK robots, Grau, and IBM 3494 are supported. Additionally, DMF supports absolute block positioning, a media transport capability that allows rapid positioning to an absolute block address on the tape volume. When this capability is provided by the transport, positioning speed is often three times faster than that obtained when reading the volume to the specified position.

Ensuring Data Integrity

DMF provides several capabilities that enhance the safety of its operations and ensure the integrity of offline data. For example, you can configure multiple instances of the MSP, with each managing its own pool of media volumes. Therefore, DMF can be configured so that file system data is migrated to multiple offline locations.

DMF stores data that originates in an XFS file system (you can also convert other file servers to IRIX file servers running DMF). Each object stored corresponds to a file in the native file system. When a user deletes a file, the inode for that file is removed from the file system. Deleting a file that has been migrated begins the process of invalidating the offline image of that file. In the tape MSP this eventually creates a gap in the migration medium. To ensure effective use of media, the MSP provides a mechanism for reclaiming space lost to invalid data. This process is called volume merging.
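A simplified model of volume merging is shown below. The data layout, the function name, and the 50% sparseness threshold are all assumptions for illustration. The tape MSP's real selection logic is governed by its own configuration.

```python
def merge_sparse_volumes(volumes, threshold=0.5):
    """Collect still-valid entries from sparse volumes.

    volumes maps a volume name to a list of (bfid, size, is_valid) entries.
    Returns (entries_to_rewrite, reclaimable_volume_names).
    """
    merged, reclaimed = [], []
    for name, entries in sorted(volumes.items()):
        total = sum(size for _, size, _ in entries)
        valid = sum(size for _, size, ok in entries if ok)
        # A volume is treated as sparse when too little of its data is still valid.
        if total and valid / total < threshold:
            merged.extend((bfid, size, True) for bfid, size, ok in entries if ok)
            reclaimed.append(name)
    return merged, reclaimed
```

The valid entries returned would be rewritten to another volume, after which the sparse volume can be reused in full.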

Much of the work done by DMF involves transaction processing that is recorded in databases. DMF uses the Raima Data Manager (RDM) as its database engine. This package provides for full transaction journaling and employs two-phase commit technology. The combination of these two features ensures that DMF applies only whole transactions to its database. Additionally, in the event of an unscheduled system interrupt, it is always possible to replay the database journals in order to restore consistency between the DMF databases and the file system. DMF utilities also allow you to verify the general integrity of the DMF databases themselves.
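The principle that only whole transactions are applied can be illustrated with a toy journal replay. The record layout below is hypothetical and is not the RDM journal format. The sketch models journaling only, not two-phase commit.

```python
def replay_journal(database, journal):
    """Apply journaled updates to database, skipping incomplete transactions.

    Each journal record is a tuple: ("begin", txid), ("update", txid, key, value),
    or ("commit", txid). Updates take effect only when the commit is seen.
    """
    pending = {}
    for record in journal:
        kind, txid = record[0], record[1]
        if kind == "begin":
            pending[txid] = []
        elif kind == "update":
            pending.setdefault(txid, []).append((record[2], record[3]))
        elif kind == "commit":
            for key, value in pending.pop(txid, []):
                database[key] = value
    # Transactions still pending were interrupted before commit and are discarded.
    return database
```

A transaction that was cut short by a system interrupt never reaches its commit record, so none of its updates reach the database.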

DMF Architecture

DMF consists of the DMF daemon and one or more MSPs. The DMF daemon accepts requests from the DMF administrator or from users to migrate file system data, and communicates with the operating system kernel to maintain a file's migration state in that file's inode.

The DMF daemon is responsible for dispensing a unique identifier (called a bit file identifier, or bfid) for each file that is migrated. The daemon also determines the destination of migration data and forms requests to the appropriate MSP to make offline copies.

The MSP accepts requests from the DMF daemon. For outbound data, the MSP accrues requests until the volume of data justifies a volume mount. Requests for data retrieval are satisfied as they arrive. When multiple retrieval requests involve the same volume, all file data is retrieved in a single pass across the volume.
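The two request-handling behaviors described above can be sketched as follows. The class, the function, and the byte threshold are hypothetical, and the position field stands in for whatever ordering the MSP actually uses on a volume.

```python
from collections import defaultdict


class OutboundQueue:
    """Accrue migration requests until the data justifies a volume mount."""

    def __init__(self, mount_threshold_bytes):
        self.mount_threshold_bytes = mount_threshold_bytes
        self.pending = []

    def add(self, bfid, size):
        """Queue a request; return the batch to write once the threshold is met."""
        self.pending.append((bfid, size))
        if sum(s for _, s in self.pending) >= self.mount_threshold_bytes:
            batch, self.pending = self.pending, []
            return batch
        return None


def plan_recalls(requests):
    """Group (bfid, volume, position) requests so each volume is read in one pass."""
    by_volume = defaultdict(list)
    for bfid, volume, position in requests:
        by_volume[volume].append((position, bfid))
    # Sorting by position lets a single forward pass satisfy every request.
    return {vol: [bfid for _, bfid in sorted(items)] for vol, items in by_volume.items()}
```

Batching outbound writes trades a short delay for fewer mounts, while recalls are not held back and are only ordered when they happen to share a volume.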

When running in the IRIX environment, DMF uses the Data Management API (DMAPI) kernel interface defined by the Data Management Interface Group (DMIG). The DMIG interface is also supported by X/Open, where it is evolving as the XDSM standard.

Figure 1-3 illustrates the DMF architecture.

Figure 1-3. DMF Architecture


Capacity and Overhead

DMF has evolved in production-oriented, customer environments. It is designed to make full use of parallel and asynchronous operations, and to consume minimal system overhead while it executes, even in busy environments in which files are constantly moving online or offline. Exceptions to this rule will occasionally occur during infrequent maintenance operations when a full scan of file systems or databases is performed.

The capacity of DMF is measured in several ways, as follows:

  • Total number of files. File identifiers used within DMF are 64-bit, thus providing a capacity of 2**64 files. DMF has been tested on file systems with 20 million inodes. The largest customer installation, on an inode basis, is approximately 5 million. The average DMF database size is approximately 1 million entries.

  • Total volume of data. Capacity in data volume is limited only by the physical environment and the density of media. The largest customer installation, on the basis of data volume stored, is approximately 300 Tbytes. The average customer is storing 5 to 10 Tbytes.

  • Total volume of data moved between online and offline media. The number of tape drives configured for DMF, the number of tape channels, and the number of disk channels all figure highly in the effective bandwidth. In general, DMF provides full-channel performance to both tape and disk. The largest data-velocity customer is moving approximately 2.5 Tbytes per day.

  • Storage capacity. On IRIX XFS, the largest file is 9 Tbytes.

DMF Administration

DMF can be configured for a variety of environments including dedicated file servers, lights-out operations and, most frequently, for support of batch and interactive processing in a general-purpose environment with limited disk space.

DMF manages two primary resources: pools of offline media and free space on native file systems.

As a DMF administrator, you first need to characterize and determine the size of the environment in which DMF will run. You will want to plan for a certain capacity, both in the number of files and in the volume of data. You will also want to estimate the rate at which you will be moving data between the DMF store and the native file system. You will select autoloaders and media transports that are suitable for the data volume and delivery rates you anticipate.

Beyond initial planning and setup, DMF requires that you perform recurring administrative duties. DMF allows you to configure tasks that automate these duties. A task is a cron-like process initiated on a time schedule you determine. Tasks are defined with configuration file parameters and are described in detail in “Configuring Daemon Maintenance Tasks” and “Configuring Tape MSP Maintenance Tasks”, both in Chapter 2.

DMF requires administrative duties to be performed in the following areas:

  • File ranking. You must decide which files are most important as migration candidates. When DMF migrates and frees files, it chooses files based on criteria you define. The ordered list of files is called the DMF candidate list. Whenever DMF responds to a critical space threshold, it builds a new migration candidate list for the file system that reached the threshold. “Generating the Candidate List” in Chapter 3 describes candidate list generation.

  • Automated space management. You must decide how much free space to maintain on each managed file system. DMF has the ability to monitor file system capacity and to initiate file migration and the freeing of space when free space falls below the prescribed thresholds. Chapter 3, “Automated Space Management”, provides details about automated space management.

  • Offline data management. DMF offers the ability to migrate data to multiple offline locations. Each location is managed by a separate MSP and is usually constrained to a specific type of medium.

    Complex strategies are possible when using multiple MSPs. For example, short files can be migrated to a device with rapid mount times, while long files can be routed to a device with extremely high density.

    You can describe criteria for MSP selection. When setting up a tape MSP, you assign a pool of tapes for use by that MSP. The dmvoladm(8) utility provides management of the tape MSP media pools.

    You can configure DMF to automatically merge tapes that are becoming sparse, that is, full of data that has been deleted by the owner. With this configuration (the run_merge_tapes.sh task), the media pool is merged on a regular basis in order to reclaim unusable space.

    Recording media eventually becomes unreliable. Sometimes, media transports become misaligned so that a volume written on one cannot be read from another. Two utilities are provided that support management of failing media. The dmatsnf(8) utility is used to scan a DMF volume for flaws, and dmatread(8) is used for recovering data. Additionally, the volume merge process built into the MSP is capable of effectively recovering data from failed media.

    Chapter 6, “Media Specific Processes (MSPs)”, provides more information on MSP administration.

  • Integrity and reliability. Integrity of data is a central concern to the DMF administrator. You will have to understand and monitor processes in order to achieve the highest levels of data integrity, as described below:

    • Even though you are running DMF, you will still have to run backups because DMF moves only the data associated with files, not the file inodes or directories. You can configure DMF to automatically run backups of your DMF-managed file systems.

      The dump utility for your file system (xfsdump and xfsrestore on IRIX systems) works in concert with DMF in that it understands when a file is fully backed up. The dump utilities have an option that allows for dumping only files that are not fully backed up.

      You can establish a policy of migrating 100% of DMF-managed file systems, thereby leaving only a small volume of data that the dump utility must record. This practice can greatly increase the availability of the machine on which DMF is running because, generally, dump commands must be executed in a quiet environment.

      You can configure the run_full_dump.sh and run_partial_dump.sh tasks to ensure that all files have been migrated. These tasks can be scheduled to run when the environment is quiet.

    • DMF databases record all information about stored data. The DMF databases must be synchronized with the file systems DMF manages. Much of the work done by DMF ensures that the DMF databases remain aligned with the file systems.

      You can configure DMF to automatically examine the consistency and integrity of the DMF daemon and MSP databases. You can configure DMF to periodically copy the databases to other devices on the system to protect them from loss (using the run_copy_databases.sh task). This task also uses the dmdbcheck utility to ensure the integrity of the databases before saving them.

      DMF uses journal files to record database transactions. Journals can be replayed in the event of an unscheduled system interrupt. You must ensure that journals are retained in a safe place until a full backup of the DMF databases can be performed.

      You can configure the run_remove_logs.sh and run_remove_journals.sh tasks to automatically remove old logs and journals, which will prevent the DMF SPOOL_DIR directory from overflowing.

    You can configure the run_hard_delete.sh task to automatically perform hard-deletes, which are described in “Recalling a Migrated File”.

The User's View of DMF

While the administrator has access to a wide variety of commands for controlling DMF, the end user sees very little. Migrated files remain cataloged in their original directories and are accessed as if they were still on disk. The only difference users might notice is a delay in access time.

Commands are provided for file owners to effect the manual storing and retrieval of data. Users can do the following:

  • Explicitly migrate files by using the dmput(1) command

  • Explicitly recall files by using the dmget(1) command

  • Copy all or part of the data from a migrated file to an online file by using the dmcopy(1) command

  • Determine whether a file is migrated by using the dmfind(1) and/or dmls(1) commands

  • Test in shell scripts whether a file is online or offline by using the dmattr(1) command

DMF File Concepts and Terms

DMF regards files as being one of the following:

  • Regular files are user files residing only on disk

  • Migrating files are files whose offline copies are in progress

  • Migrated files can be either of the following:

    • Dual-state files are files whose data resides both online and offline

    • Offline files are files whose data is no longer on disk

DMF does not migrate pipes, directories, or UNIX special files.

Like a regular file, a migrated file has an inode. Only an offline file requires the intervention of the DMF daemon to access its data.

The operating system informs the DMF daemon when a migrated file is modified. If anything is written to a migrated file, the offline copy is no longer valid, and the file becomes a regular file until it is migrated again.

Migrating a File

A file is migrated when the automated space management controller dmfsmon(8) selects the file or when an owner requests that the file be migrated by using the dmput(1) command.

The DMF daemon keeps a record of all migrated files in its database. The key to each file is its bfid. For each migrated file, the daemon assigns a bfid that is stored in the file's inode.

When the daemon receives a request to migrate a file, it adjusts the state of the file, ensures that the necessary MSP(s) are active, and sends a request to the MSP(s). MSPs copy data to the offline storage media.

When the MSP(s) have completed the offline copy (or copies), the daemon marks the file as fully backed up in its database and changes the file to dual-state. If the user specified the dmput -r option, or if dmfsmon requested that the file's space be released, the daemon releases the data blocks and changes the user file state to offline.
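The file states and transitions described in this chapter can be summarized as a small state machine. The state names follow the text. The event names are hypothetical labels invented for this sketch, and the recall that must precede a write to an offline file is not modeled.

```python
from enum import Enum


class State(Enum):
    REGULAR = "regular"
    MIGRATING = "migrating"
    DUAL_STATE = "dual-state"
    OFFLINE = "offline"


# Event names are illustrative labels, not DMF terminology.
TRANSITIONS = {
    (State.REGULAR, "migrate_request"): State.MIGRATING,
    (State.MIGRATING, "offline_copies_complete"): State.DUAL_STATE,
    (State.DUAL_STATE, "release_blocks"): State.OFFLINE,
    # Writing to a migrated file invalidates its offline copy.
    (State.DUAL_STATE, "write"): State.REGULAR,
}


def next_state(state, event):
    """Return the state that follows event, or raise ValueError if not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid for a {state.value} file") from None
```

Note that there is no direct path from regular to offline: data blocks can be released only after the offline copies are complete.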

Recalling a Migrated File

When a migrated file must be recalled, a request is made to the DMF daemon. The daemon selects an MSP from its internal list of MSPs and sends that MSP a request to recall a copy of the file. If more than one MSP has a copy, the first MSP in the list is used. (The list of MSPs is created from the configuration file.)
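The selection rule amounts to a first-match search over the configured list, as in the sketch below. The function name and the MSP names in the usage example are hypothetical.

```python
def select_recall_msp(configured_msps, msps_with_copy):
    """Return the first MSP, in configuration order, that holds a copy of the file."""
    for name in configured_msps:
        if name in msps_with_copy:
            return name
    return None  # no MSP holds a copy, so the file cannot be recalled


# Hypothetical MSP names; the configured order decides which copy is used.
chosen = select_recall_msp(["tape1", "tape2", "ftp1"], {"ftp1", "tape2"})
```

Because the order comes from the configuration file, listing the MSP with the fastest media first would make it the preferred source for recalls.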

After a user has modified or removed a migrated file, its bfid is soft-deleted. A file is soft-deleted when it is logically deleted from the daemon database. This is accomplished by setting the delete date field in the database to the current date and time for each entry referring to the modified or removed file.

A file is hard-deleted when its bfid is physically removed from the DMF database. You can configure DMF to automatically perform hard-deletes. This is done using the run_hard_delete.sh task, which uses the dmhdelete(8) utility.

The soft-delete state allows for the possibility that the file system might be restored after the user has removed a file. When a file system is reloaded from a dump image, it is restored to a state at an earlier point in time. A file that had been migrated and then removed might become migrated again due to the restore operation. This can create serious problems if the database entries for the file have been physically deleted (hard-deleted). In this case, the user would receive an error when trying to open the file because the file cannot be retrieved.

Do not hard-delete a database entry until after you are sure that the corresponding files will never be restored. Hard-delete requests are sent to the relevant MSPs so that copies of the file can be removed from media. For a tape MSP this involves compression (or merging).
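The difference between the two kinds of deletion can be modeled as follows. The class and the retention period are hypothetical. In practice, the hard-delete itself is performed by the dmhdelete(8) utility, and the right retention period depends on how long your dump images remain restorable.

```python
from datetime import datetime, timedelta


class DaemonDatabase:
    """Toy model of daemon database entries keyed by bfid."""

    def __init__(self):
        self.entries = {}

    def add(self, bfid, path):
        self.entries[bfid] = {"path": path, "delete_date": None}

    def soft_delete(self, bfid, now):
        # Logical deletion: the entry stays, only its delete date is set.
        self.entries[bfid]["delete_date"] = now

    def hard_delete(self, now, retention):
        """Physically remove entries that were soft-deleted more than retention ago."""
        expired = [
            bfid
            for bfid, entry in self.entries.items()
            if entry["delete_date"] is not None and now - entry["delete_date"] > retention
        ]
        for bfid in expired:
            del self.entries[bfid]
        return expired
```

If a file system is restored from a dump taken before a file was removed, a soft-deleted entry can still be matched to the restored file, whereas a hard-deleted one cannot.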

Command Overview

The following sections provide definitions for administrator commands grouped by function.

Configuration Commands

The configuration file, /etc/dmf/dmbase/host/hostname/dmf_config, contains configuration objects and associated configuration parameters that control the way DMF operates. The hostname is the name of the host on which you installed DMF. By changing the values associated with these objects and parameters, you can modify the behavior of DMF.

For information about editing the configuration file, see Chapter 2, “Configuring DMF”. The following man pages are related to the configuration file:

Man page            Description
dmf_config(5)       Describes the DMF configuration objects and parameters in detail
dmconfig(8)         Prints DMF configuration parameters to standard output

DMF Daemon and Related Commands

The DMF daemon, dmdaemon(8), communicates with the kernel through a device driver and receives backup and recall requests from users through a socket. The daemon activates the appropriate MSPs for file migration and recall, maintaining communication with them through unnamed pipes. It also changes the state of inodes as they pass through each phase of the migration and recall process. In addition, dmdaemon maintains a database containing entries for every migrated file on the system. Updates to database entries are logged in a journal file for recovery. See Chapter 4, “The DMF Daemon”, for a detailed description of the DMF daemon.


Caution: If used improperly, commands that make changes to the DMF database can cause data to be lost.

The following administrator commands are related to dmdaemon and the daemon database:

Command             Description
dmaudit(8)          Reports discrepancies between file systems and the daemon database. This command is executed automatically if you configure the run_audit.sh task.
dmcheck(8)          Checks the DMF installation and configuration and reports any problems.
dmdadm(8)           Performs daemon database administrative functions, such as viewing individual database records.
dmdaemon(8)         Starts the DMF daemon.
dmdbcheck(8)        Checks the consistency of a database by validating the location and key values associated with each record and key in the data and key files (also an MSP command). If you configure the run_copy_databases.sh task, this command is executed automatically as part of the task. The consistency check is completed before the DMF databases are saved.
dmdbrecover(8)      Updates the daemon and tape MSP databases with journal entries.
dmdidle(8)          Causes files not yet copied to tape to be flushed to tape, even if this means forcing only a small amount of data to a volume.
dmdstat(8)          Indicates to the caller the current status of dmdaemon.
dmdstop(8)          Causes dmdaemon to shut down.
dmhdelete(8)        Deletes unused daemon database entries and releases corresponding MSP space. This command is executed automatically if you configure the run_hard_delete.sh task.
dmmigrate(8)        Migrates regular files that match specified criteria in the specified file systems, leaving them as dual-state. This utility is often used to migrate files before running backups of a file system, hence minimizing the size of the dump image.
dmsnap(8)           Copies the DMF daemon and the MSP databases to a specified location. If you configure the run_copy_databases.sh task, this command is executed automatically as part of the task.
dmversion(8)        Reports the version of DMF that is currently executing.

Space Management Commands

The following commands are associated with automated space management, which allows DMF to maintain a specified level of free space on a file system through automatic file migration:

Command             Description
dmfsfree(8)         Attempts to bring the free space and migrated space of a file system into compliance with configured values.
dmfsmon(8)          Monitors the free space levels in file systems configured as auto (that is, automated space management is enabled) and lets you maintain a specified level of free space.
dmscanfs(8)         Scans DMF file systems and prints status information to stdout.

See Chapter 3, “Automated Space Management”, for a detailed description of automated space management.

MSP Commands

The DMF tape MSP maintains a database that contains volume (VOL) records and catalog (CAT) records. VOL records contain information about tape volumes, and CAT records contain information about offline copies of migrated files.

The disk and FTP MSPs allow the use of local or remote disk storage for storing migrated data. They use no special commands, utilities, or databases. For more information, see “Disk MSP” and “FTP MSP” in Chapter 6.

Two commands manage the CAT and VOL records for the tape MSP:

Command             Description
dmcatadm(8)         Provides maintenance and recovery services for the CAT database.
dmvoladm(8)         Provides maintenance and recovery services for the VOL database, including the selection of volumes for tape merge operations.

Most data transfers to and from tape media are performed by components internal to the MSP. However, there are also two utilities that can read tape MSP volumes directly:

Command             Description
dmatread(8)         Copies data directly from MSP volumes to disk.
dmatsnf(8)          Audits and verifies the format of MSP volumes.

There are also tools that check for MSP database inconsistencies:

Command             Description
dmatvfy(8)          Verifies the MSP database contents against the dmdaemon(8) database. This command is executed automatically if you configure the run_audit.sh task.
dmdbcheck(8)        Checks the consistency of a database by validating the location and key values associated with each record and key in the data and key files.

Commands for Other Utilities

The following utilities are also available:

Command             Description
dmclripc(8)         Frees system interprocess communication (IPC) resources and token files used by dmlockmgr and its clients when abnormal termination prevents orderly exit processing.
dmdate(8)           Performs calculations on dates for administrative support scripts.
dmdump(8)           Creates a text copy of an inactive database file or a text copy of an inactive complete DMF daemon database.
dmdumpj(8)          Creates a text copy of DMF journal transactions.
dmfill(8)           Recalls migrated files to fill a percentage of a file system. This command is mainly used in conjunction with dump and restore commands to return a corrupted file system to a previously known valid state.
dmlockmgr(8)        Invokes the database lock manager. The lock manager is an independent process that communicates with all applications that use the DMF database, mediates record lock requests, and facilitates the automatic transaction recovery mechanism.
dmmove(8)           Moves copies of a migrated file's data to the specified MSPs.
dmmaint(8)          Calls the dmmaint utility, which performs DMF version maintenance and provides interfaces for licensing and initial configuration.
dmov_keyfile(8)     Creates the file of DMF OpenVault keys, ensuring that the contents of the file are semantically correct and have the correct file permissions. This command removes any DMF keys in the file for the OpenVault server system and adds new keys at the front of the file.
dmov_loadtapes(8)   Scans a tape library for volumes not imported into the OpenVault database and allows the user to select a portion of them to be used by an MSP. The selected tapes are imported into the OpenVault database, assigned to the DMF application, and added to the MSP's database.
dmov_makecarts(8)   Makes the tapes in one or more MSP databases accessible through OpenVault by importing into the OpenVault database any tapes unknown to it and by registering all volumes to the DMF application not yet so assigned.
dmselect(8)         Selects migrated files based on given criteria. The output of this command can be used as input to dmmove(8).
dmsort(8)           Sorts files of blocked records.
dmxfsrestore(8)     Calls the xfsrestore(1M) command to restore files dumped to tape volumes that were produced by DMF administrative maintenance scripts.