Chapter 1. Introduction

This manual describes how to use the dmaudit(8) command to detect and report every known type of discrepancy between your file systems and the DMF daemon database, including the following:

The dmaudit command is intended primarily for interactive use. It uses a series of scrolling menus to display information and to solicit option selections. However, after you complete the initial configuration, dmaudit can also accept the snapshot and report operations from the command line. This allows dmaudit to be run as a background process or in batch mode.

Some of the advantages of using dmaudit are as follows:

If you do want to determine why a discrepancy occurred, there is information available in some of the dmaudit menus to help you narrow down exactly when an error occurred. Examination of daemon logs and journal files may fill in the remaining blanks. Some tips are given in later sections of this manual to help you determine why certain discrepancies occurred.

The Need for dmaudit

During normal system operation, the daemon database and the file systems stay synchronized with each other. Each migrated user file in your file systems has a unique bfid. Each bfid has one or more active daemon database entries. If an entry in the daemon database is not in use, it is soft-deleted. (A database entry is soft-deleted when the MSP copy of the data is no longer current. Data remains on the alternate media until the database entry is deleted.)

However, things can get out of synchronization. Examples of some of the discrepancies that might occur are as follows:

  • Migrated files that have bfids for which no database entries exist

  • Active database entries for which no migrated files exist

  • Multiple user files that have the same bfid
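The three discrepancy types above amount to comparing the set of bfids found in the file systems with the set of bfids that have active database entries. The following Python fragment is a minimal sketch of that comparison only. It is not how dmaudit is implemented, and the function name, the record layout (path and bfid pairs), and the sample values are all hypothetical.

```python
from collections import defaultdict


def find_discrepancies(migrated_files, active_db_bfids):
    """Classify mismatches between file bfids and active database bfids.

    migrated_files:  iterable of (path, bfid) pairs (hypothetical layout)
    active_db_bfids: iterable of bfids that have active database entries
    """
    active = set(active_db_bfids)

    # Group file paths by the bfid stored in each file's inode.
    paths_by_bfid = defaultdict(list)
    for path, bfid in migrated_files:
        paths_by_bfid[bfid].append(path)
    file_bfids = set(paths_by_bfid)

    return {
        # Migrated files whose bfid has no database entry.
        "files_without_db_entry": sorted(
            p for b in file_bfids - active for p in paths_by_bfid[b]
        ),
        # Active database entries that no migrated file refers to.
        "db_entries_without_file": sorted(active - file_bfids),
        # bfids shared by more than one user file.
        "shared_bfids": {
            b: sorted(ps) for b, ps in paths_by_bfid.items() if len(ps) > 1
        },
    }


# Hypothetical sample data: /fs/b and /fs/c share bfid "b2",
# /fs/d has no database entry, and entry "b3" has no file.
files = [("/fs/a", "b1"), ("/fs/b", "b2"), ("/fs/c", "b2"), ("/fs/d", "b4")]
entries = {"b1", "b2", "b3"}
result = find_discrepancies(files, entries)
```

With this sample data, the result lists /fs/d as a file without a database entry, b3 as a database entry without a file, and b2 as a bfid shared by /fs/b and /fs/c.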

System crashes are a major source of discrepancies because I/O operations in progress at the time of the crash are not guaranteed to complete successfully. For example, a migrating file might receive a new bfid, but the rewrite of its inode to disk might not succeed. Or perhaps the inode update does complete, but the corresponding database entries are not successfully made.

Inconsistencies also arise if users are allowed to modify or remove migrated files during periods when the daemon is not running, because the kernel is then unable to tell the daemon to soft-delete the corresponding database entries. The unused, or orphan, database entries then accumulate in the database, wasting space on the alternate media.

Use of the xfsdump(1m) and xfsrestore(1m) commands can also create inconsistencies. For example, files that have been removed or modified can be restored to their previous state. If the same migrated file is restored multiple times, there will be more than one inode containing the same bfid.

Sometimes the inconsistencies are harmless, or only result in wasted space on the alternate media. In other cases, the discrepancies can prevent a user from accessing one or more files, or can result in the loss of files. dmaudit allows the administrator to quickly detect and correct such inconsistencies when they occur, possibly before any data loss becomes permanent.

When to Use dmaudit

After dmaudit has been initially configured, most sites run it in batch mode periodically, perhaps once a week. The easiest way to do this is with a cron script. Sites may also want to run dmaudit after a known failure, such as a major system crash. Instructions on how to generate a report both interactively and through a cron script are provided in Chapter 6, “Detecting Discrepancies”.

When errors are detected, you should interactively examine and correct them.

Running dmaudit on a periodic basis will help you determine why discrepancies have appeared. For example, if your system crashes and a subsequent dmaudit run shows discrepancies that were not previously present, you can be reasonably sure that they occurred because of events at the time of the crash.
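The reasoning above is a set difference between two successive runs: anything present in the later run but absent from the earlier one appeared in the interval between them. The following Python fragment is a minimal sketch of that idea. The record format (a kind and an identifier) is hypothetical and does not represent the actual dmaudit report format.

```python
def new_discrepancies(previous_run, current_run):
    """Return the discrepancies in current_run that were not in previous_run.

    Each discrepancy is a (kind, identifier) tuple; this layout is a
    hypothetical stand-in, not the real dmaudit report format.
    """
    return sorted(set(current_run) - set(previous_run))


# Hypothetical records from a run before a crash and a run after it.
before_crash = [("orphan_db_entry", "b3")]
after_crash = [("orphan_db_entry", "b3"), ("missing_db_entry", "/fs/d")]

introduced = new_discrepancies(before_crash, after_crash)
```

Here only the missing database entry for /fs/d is new, so it is the one most likely related to the crash. The orphan entry b3 was already present in the earlier run.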