Chapter 8. Archive Application

This chapter provides information specific to configuring and operating NetWorker with the Archive Application.  It also includes explanations that compare and contrast the different methods used for protecting data across a network.

The NetWorker archive server provides file archiving and retrieval services to a range of client machines.  It is packaged as an add-on extension to existing NetWorker backup servers, and uses the same license mechanism as NetWorker.

The NetWorker archive client can be any machine on a network that employs archive services provided by the archive server.  Clients may be enabled for backups, for archives, or for both.

Data archiving is the process of taking a snapshot of files or directories as they reside on primary media (usually disk) at a given point in time.  The snapshot image is typically stored on removable media, such as tape or optical disc.  Once the snapshot is safely stored on removable media, related files may be deleted to conserve space on disk.

Navigating the Windows

The following sections explain the features and use of the various Archive windows:

  • Clients window—to enable archive services for a client.

  • Archive Requests window—to schedule the archiving of a client.

  • Archive Request Control window—to monitor scheduled archive requests.

  • Archive Request Details window—to see the progress of a recent archive.

Clients Window

Before archiving can occur, you must configure NetWorker to recognize archive clients.  Choose Client Setup from the Clients menu to open the Clients window (see Figure 8-1).

Figure 8-1. Enable Archive in Clients Window

The Archive services choices enable or disable archives for the currently selected client in the Clients list.  To allow archives for the client, click the Enabled choice for Archive services.  If archive services remain Disabled, the client will not be able to perform an archive.  When you enable archive services for one client, you also enable other clients of the same name on that server.

Find the Archive users list at the bottom of the scrolling section of the Clients window.  To allow users on the client to perform manual archives, enter their user names into the Archive users list.  To schedule an archive request of an entire workstation, root (or equivalent) must be on the Archive users list for that client, or [email protected] must be in the Administrator list for the server.

For a complete description of the Clients window, see “Navigating the Clients Window”.

Archive Requests Window

To schedule an archive on a one-time basis, choose Archive Requests from the Customize menu.  The Archive Requests window appears, as shown in Figure 8-2.

Figure 8-2. Archive Requests Window

The Archive Requests scrolling list displays archives previously requested on this server.  You can add new archive requests and delete old ones using the Create and Delete buttons.

The scrolling panel displays the following information:

  • Name—displays the archive name.

  • Annotation—a mixed-case comment string that you provide for every archive request (limited to 1024 characters).

  • Status—indicates when to begin the archive.  Start now means archiving begins when you click the Apply button.  Start later means archiving begins as indicated in the Start time field.

  • Start time—gives the next time to begin, based on a 24-hour clock.

  • Client—displays the hostname for the archive client.  To request an archive for the server, enter its name in the Client field.

  • Save set—specifies pathnames of directories or files to archive.

  • Directive—specifies the backup method, usually Unix standard or Unix with compression.

  • Archive pool—specifies the volume pool to which archives should be sorted.  The default volume pool is Archive.

  • Verify—indicates whether or not to automatically check the integrity of data archived on tape.

  • Clone—indicates whether or not to automatically clone the archive volumes for extra security.

  • Archive clone pool—specifies the volume pool for archive clones.  The default archive clone pool is Archive Clone.

  • Grooming—indicates whether or not client files and directories should be removed after archiving is complete.

    Caution: If you use directives that include instructions to skip files, do not enable the Grooming option.  Grooming occurs after a file has been archived.  If a file is skipped, it cannot be groomed and will cause the archive to fail.

  • Archive completion—contains an optional command to execute after archiving is complete, for example /usr/ucb/mail.

Archive Request Control Window

To see the results of an archive request, or to start or disable a scheduled archive, choose Archive Request Control from the Server menu. The Archive Request Control window appears, as shown in Figure 8-3.

Figure 8-3. Archive Request Control Window

The most recently run archive request is initially highlighted in the Archive list.  Usually it is Disabled, since scheduled archives run only once.  To see details about when this archive completed and how successfully it ran, click the Details button.

If you want details about, or control over, a different archive request, select the one you want from the Archive list.

To initiate an archive request immediately, click the Start button.  This has the same effect as selecting a Status of Start now for that archive request in the Archive Requests window.

To reschedule an archive request, click the Schedule button and enter a new starting time.  This has the same effect as selecting Start later and specifying a new Start time in the Archive Requests window.

To disable an archive request that you scheduled here or in the Archive Requests window, click the Disable button.  To halt an archive in progress, click the Stop button.

Archive Request Details

To see the progress of a recent archive (the one currently selected), click the Details button inside the Archive Request Control window.  The Archive Request Details window appears, as shown in Figure 8-4.

Figure 8-4. Archive Request Details Window

This window provides information about the completion of an archive request.  Some of the information in this window, such as annotation, archive clone pool, archive pool, client, clone, name, save set, start time, status, and verify, also appears in the Archive Requests window.

The completion time field displays when the archive finished.  The duration of the archive is the difference between the completion time and the start time.  The log field shows messages generated by the archive.  The run status field shows the outcome of the archive request: completed, failed, or partial.
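As a rough illustration of the duration calculation, the following Python sketch subtracts a start time from a completion time.  The timestamp layout and the sample dates are assumptions made for this example; the Archive Request Details window may display times in a different format.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout for this sketch only; not taken from NetWorker.
TIME_FORMAT = "%Y-%m-%d %H:%M"


def archive_duration(start_time: str, completion_time: str) -> timedelta:
    """Return completion time minus start time."""
    start = datetime.strptime(start_time, TIME_FORMAT)
    done = datetime.strptime(completion_time, TIME_FORMAT)
    if done < start:
        raise ValueError("completion time precedes start time")
    return done - start


# Hypothetical values: an archive that started at 3:33 and ran for 90 minutes.
print(archive_duration("1996-05-14 03:33", "1996-05-14 05:03"))  # 1:30:00
```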

Tip: For more information about failed archives, see the log file, usually /nsr/logs/daemon.log on the server.

Archive Example

Suppose you must shut down and remove the workstation (and hostname) of someone who has left the company.  It would be wise to archive the system data first, in case its filesystems contain essential files you need to access later.

This section gives an example of how to schedule and run an archive request.

Creating an Archive Client

Only registered archive clients can use the archive facility.  To create an archive client, follow these steps:

  1. Choose Client Setup from the Clients menu.  The Clients window appears, as shown in Figure 8-1.

  2. Click the Create button; the Clients window changes.

  3. Enter the hostname of the workstation in the client Name field.

  4. Click Enabled after Archive services to allow archives for this client.

  5. If you want to permit users on the archive client machine to use archive and retrieve, scroll to the bottom of the Clients window and add their user names to the Archive users field.

    (It is unlikely that you would want to allow manual archives and retrieves on a workstation about to be shut down.)

Machine hostname is now a registered archive client.  However, an archive will not take place until you request one.

Tip: If you want to allow archives on the server, make sure that archives are enabled for the server as a client of itself.

Making an Archive Request

Valid archive users may request archives manually using the nwarchive command.  However, a manual archive often takes a long time.  To avoid overloading a busy network with an archive request during the day, schedule it late at night when the network has less traffic.

To make an archive request, follow these steps:

  1. Choose Archive Requests from the Customize menu.

    The Archive Requests window appears, as shown in Figure 8-2.

  2. Click the Create button; the Archive Requests window changes.

  3. Enter the Name you want to assign to this archive request, and a brief Annotation to remind you of the purpose for the archive.

  4. Click Start later for the Status choice to schedule this archive for that night.  Enter the Start time you want, or accept the default starting time of 3:33 a.m.

  5. Enter the archive client machine hostname in the Client field, and the pathname(s) you want to archive in the Save set field.

  6. Specify a custom Archive pool, or accept the default Archive volume pool.

  7. Click Yes for the Verify choice to check that the archived data was saved correctly.

  8. If you want to make a duplicate copy of this archive volume, click Yes for the Clone choice.  Accept the default Archive Clone pool.

  9. To remove files and directories from disk after archiving them to tape, click remove for the Grooming choice.

  10. To be notified when the archive completes, enter a command into the Archive completion field, for example:

    /usr/ucb/mail -s archive_request [email protected] 

  11. Click the Apply button to activate your changes.

You have now requested an archive of a client machine hostname to begin that night at 3:33 a.m.
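To see how a Start time on the 24-hour clock maps to the next run, consider the following Python sketch.  It is only an illustration, not NetWorker's scheduler: the function name next_start and the sample dates are hypothetical, and the sketch assumes the request runs at the first occurrence of the given time after the moment you apply it.

```python
from datetime import datetime, timedelta


def next_start(now: datetime, start_time: str) -> datetime:
    """Return the first occurrence of HH:MM (24-hour clock) after `now`."""
    hour, minute = (int(part) for part in start_time.split(":"))
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("start time must be between 0:00 and 23:59")
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # The time has already passed today, so the archive runs tomorrow.
        candidate += timedelta(days=1)
    return candidate


# Applied at 5 p.m., a 3:33 request runs early the following morning.
print(next_start(datetime(1996, 5, 13, 17, 0), "3:33"))  # 1996-05-14 03:33:00
```

Under this assumption the request always falls within the next 24 hours.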

Checking the Archive Request

The next morning, check the outcome of the archive.  If you set up an Archive completion notice, look for an e-mail message containing a log of the archive request.

To check details of the archive, follow these steps:

  1. Choose Archive Request Control from the Server menu.

    The Archive Request Control window appears, as shown in Figure 8-3.

  2. Click the Details button.

    The Archive Request Details window appears, as shown in Figure 8-4.

  3. If this window shows the archive completed successfully, you can safely reconfigure the ex-employee's machine.

    If the archive failed, you can reschedule it (see the following section).

You may use the Archive Request Control window to start, schedule, disable, or stop another archive.

Rescheduling the Archive Request

Suppose that the archive of hostname did not complete last night because, for example, an energy-conscious employee turned off the computer.  You decide to reschedule for the next night.

To reschedule the archive, follow these steps:

  1. Choose Archive Request Control from the Server menu.  The Archive Request Control window appears, as shown in Figure 8-3.  Make sure that the archive request you want is highlighted.

  2. Click the Schedule button.  The Archive Request Schedule window appears, as shown in Figure 8-5.

    Figure 8-5. Archive Request Schedule Window

  3. Enter a new starting time in the Schedule Archive field, using the 24-hour clock, and click Ok.

The archive request executes again that night, at the time you specified.  If you change your mind and want to discontinue the archive, click the Disable button in the Archive Request Control window.

Clone and Verify

NetWorker contains two preconfigured volume pools for use with archiving:  Archive and Archive Clone.  If you want to create a new volume pool for archives, follow these steps:

  1. Choose Pools from the Media menu.

  2. Click the Create button in the Pools window.

  3. Fill out all the fields according to your needs.

    Tip: Make sure the Pool type is set to Archive, and the Store index entries field is set to No.  These two traits distinguish archive pools from backup pools.

  4. Click the Apply button to activate the new volume pool.

Now when you schedule a new archive request, you may use the new archive volume pool you created.  If you choose to clone archive data, also create a new archive clone pool.  NetWorker will write archive data only to an archive volume and archive clone data only to an archive clone volume.

If you have already made an archive and want to make a clone of it, follow these steps:

  1. Choose Clone from the Save set menu.

  2. Enter criteria for locating save sets in the Save Set Clone window.

    Tip: Click the More button and type Archive in the Pool field.

  3. Click the Query button to see save sets matching your criteria.

  4. Select the save set you wish to clone and click the Clone button.

  5. Click the Start button in the Save Set Status Clone window to activate the save set clone.

To verify data already archived on an archive volume, you have two alternatives:

  • Clone the archive data by choosing Clone from the Save set menu.  During cloning, the original archive data will be verified as the save set is copied from one volume to another.  When you are done, you may reuse the cloned volume.

  • Determine the save set ID of the archived data, for example by searching the NetWorker Archive Retrieve window.  Then run the following command at the system prompt, substituting the save set ID of the archived data for ssid (this is how NetWorker verifies archived data):

    # nsrretrieve -n -S ssid 

Both alternatives verify the integrity of the data, but do not actually compare archived data with data on disk.  Chapters 6 and 7 in this manual provide more information about volume pools and save set cloning.
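If you need to verify several archive save sets, you could generate one nsrretrieve command line per save set ID.  The Python sketch below only builds the command shown above (nsrretrieve -n -S ssid) and prints it; it does not run anything.  The helper name verify_command and the sample save set IDs are hypothetical.

```python
from typing import List


def verify_command(ssid: str) -> List[str]:
    """Build the argument list for verifying one archive save set."""
    ssid = ssid.strip()
    if not ssid:
        raise ValueError("a save set ID is required")
    # Same options as the command shown in this section: -n and -S ssid.
    return ["nsrretrieve", "-n", "-S", ssid]


# Hypothetical save set IDs, for illustration only.
for ssid in ["16234", "16235"]:
    print(" ".join(verify_command(ssid)))
```

To execute the commands rather than print them, you could pass each argument list to subprocess.run on a machine where nsrretrieve is installed.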

Archiving Shortcut

Tip: To schedule and run an archive request, follow this general procedure:

  1. Create and enable an archive client using the Clients window.

  2. Using the Archive Requests window, fill in all or most of the fields with your preferences.

    Name and Client are mandatory, as is the Start time if you choose Start later.  Save set defaults to empty, Directive and Archive pool to the default, Verify and Clone to No, and Grooming to none.

  3. Check the archive status in the Archive Request Details window.

  4. To reschedule an archive, bring up the Archive Request Schedule window.  To discontinue a scheduled archive request, click the Disable button.  To start and stop an archive request (for example, to test it), click the Start and Stop buttons.
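The mandatory fields and defaults listed in step 2 can be summarized as a small data structure.  The following Python sketch is an illustration only: the class ArchiveRequest, its field names, and the problems method are hypothetical and are not part of NetWorker.  The check on Grooming reflects the recommendation in this chapter to select Verify or Clone when you groom files.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArchiveRequest:
    name: str                                # mandatory
    client: str                              # mandatory
    status: str                              # "Start now" or "Start later"
    start_time: Optional[str] = None         # mandatory for "Start later"
    save_set: List[str] = field(default_factory=list)  # defaults to empty
    directive: Optional[str] = None          # None stands for the default
    archive_pool: str = "Archive"
    verify: bool = False                     # No
    clone: bool = False                      # No
    archive_clone_pool: str = "Archive Clone"
    grooming: str = "none"

    def problems(self) -> List[str]:
        """Return a list of missing or risky settings (empty if none)."""
        found = []
        if not self.name:
            found.append("Name is mandatory")
        if not self.client:
            found.append("Client is mandatory")
        if self.status == "Start later" and not self.start_time:
            found.append("Start time is mandatory when Status is Start later")
        if self.grooming == "remove" and not (self.verify or self.clone):
            found.append("Grooming without Verify or Clone is not recommended")
        return found


request = ArchiveRequest(name="old-workstation", client="hostname",
                         status="Start later", start_time="3:33")
print(request.problems())  # []
```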

Understanding the Archive Feature

Archive save sets are similar to backup save sets.  The principal difference is that archive save sets have no expiration date.  Also, archives are always full: there are no differential or incremental levels.

Note: Archives are not recorded in the online file index, so they are not affected by the browse policy.  This feature helps conserve disk space.

Retrieve is similar to recover, except that it works with archive save sets instead of backup save sets.  Since archived files are not recorded in the file index, the user interface for retrieve is based on save sets, rather than on a directory hierarchy.

Users on the NetWorker administrator list have permission to configure archive services.  These users, and users registered on the Archive users list in the Clients window, have permission to use the archiving and retrieval facilities.  Registered users may archive any file for which they have read permission.

Anyone can browse archive save sets—that is, look at information in the media index.  However, you may retrieve only those files that you own, unless you are superuser, in which case you may retrieve any file.
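The permission rules in the two preceding paragraphs can be expressed as two simple checks.  This Python sketch is a reading of those rules, not NetWorker's implementation; the function names and the sample user names are hypothetical.

```python
from typing import Iterable


def may_use_archive(user: str, administrators: Iterable[str],
                    archive_users: Iterable[str]) -> bool:
    """Administrators and registered archive users may archive and retrieve."""
    return user in set(administrators) or user in set(archive_users)


def may_retrieve_file(user: str, file_owner: str,
                      is_superuser: bool = False) -> bool:
    """You may retrieve only files you own, unless you are superuser."""
    return is_superuser or user == file_owner


# Hypothetical users, for illustration only.
print(may_use_archive("pat", administrators=["root"], archive_users=["pat"]))  # True
print(may_retrieve_file("pat", file_owner="lee"))                              # False
print(may_retrieve_file("root", file_owner="lee", is_superuser=True))          # True
```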

Tip: If you want to overwrite an archive tape, first make sure nobody will ever need the data again.  Then relabel the volume, as you would a backup volume.

Archive Functions

This section describes three archive functions:  archiving, verification, and retrieval.

Data Archiving

Data archiving can be performed by end users or by the system administrator.  Registered clients can perform manual archives that start right away.  System administrators can perform manual archives or schedule an archive to take place anytime during the next 24 hours.  For example, the best time to perform a large archive might be in the middle of the night.

Both users and administrators can request an extra copy of their archive save set, called a clone.  NetWorker Archive employs the volume pools feature to separate backup volumes from archive volumes and archive volumes from archive clone volumes.

After archiving is complete, users and administrators are given the option of deleting archived files and directories or leaving them in place.  This option is called grooming.  Grooming helps conserve disk space after a project is finished.

This chapter describes the Archive Application's graphical user interface.  For details about the command-line interface, see the nsrarchive(1M) reference page.

Data Verification

Since archived files are often deleted from the system, NetWorker provides an extra measure of security to ensure archived data are correct.  NetWorker verifies archives in two ways:

  • media verification—NetWorker checks the archive volume to ensure that it is writable and contains no bad spots.

  • data verification—NetWorker reads data from the archive volume as if doing a retrieve, but does not actually write any archived data back to disk.

If a volume is suspect, or if there are problems with the data on the tape, NetWorker issues a warning and suspends grooming.
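The safeguard described above, where a verification problem suspends grooming, can be sketched as a single decision function.  This Python example is an assumption-laden illustration: the function should_groom is hypothetical, and treating a failed clone the same way as a failed verification is an assumption of this sketch rather than documented behavior.

```python
def should_groom(grooming_requested: bool,
                 archive_ok: bool,
                 verify_requested: bool = False,
                 verify_ok: bool = False,
                 clone_requested: bool = False,
                 clone_ok: bool = False) -> bool:
    """Decide whether archived files may be removed from disk."""
    if not grooming_requested or not archive_ok:
        return False
    if verify_requested and not verify_ok:
        # A suspect volume or bad data suspends grooming.
        return False
    if clone_requested and not clone_ok:
        # Assumption: a failed clone also blocks grooming.
        return False
    return True


print(should_groom(True, archive_ok=True, verify_requested=True, verify_ok=True))   # True
print(should_groom(True, archive_ok=True, verify_requested=True, verify_ok=False))  # False
```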

If you decide to groom files, we recommend you also select Verify or Clone to avoid deleting improperly archived files.  Figure 8-6 shows the ordering of archive functions.

Figure 8-6. Archive Process Flow Chart

Data Retrieval

When you use NetWorker Retrieve, the NetWorker Retrieve window displays archived save sets for the selected server, listed by client name.  You can retrieve a save set only if you have administrator or archive user privileges for that server, are the owner of files in the save set, or are root.

It is possible to search for specific archives and to alter the sort order of archive save sets in the viewing list.  Refer to Chapter 5, “Archiving and Retrieving Files” in the IRIX NetWorker User's Guide for more details on retrieving archived files.

Retrieval can begin once the user picks an archive save set to retrieve and the administrator ensures that the relevant archive volume (or a clone of that volume) is mounted on the NetWorker server.

Retrieved save sets can be relocated, renamed, or allowed to overwrite existing files of the same name, as with the NetWorker recover feature.

Methods for Protecting Data

Data backup is the process of storing copies of files and directories from local disk onto removable media, usually tapes.  These copies can be recovered in case the original files are lost or damaged.  The system administrator usually schedules daily backups.  Any new files, or files that changed since the last backup, are copied to tape so they can be restored on disk if necessary.

Archiving is normally performed on data associated with specific projects, rather than on an entire system.  Unlike data backup, end users usually archive their files as needed, so a network-wide archiving policy is not required.  Archives, unlike backups, are not associated with a level (full, differential, or incremental).

When users archive project data, they can choose to delete the files from the system disk automatically to conserve space.  In this case, archived files need to be placed on long-lasting archive media.

Hierarchical storage management (HSM) is a data management strategy where data is automatically migrated from one storage medium to another, based on a set of rules.  The rule most often employed is access rate—the longer a file is inactive, the more likely it is to migrate.

Storage hierarchy is usually governed by the cost of storage for each medium.  The benefit of HSM is that it provides users with a seemingly infinite storage capacity, at the lowest possible cost.
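As a minimal illustration of an access-rate rule, the following Python sketch selects files that have been inactive for at least a given number of days and lists the longest-idle files first.  It is a generic example, not the behavior of any particular HSM product; the function name, the threshold, and the sample paths are hypothetical.

```python
from typing import Dict, List

SECONDS_PER_DAY = 86400


def migration_candidates(last_access: Dict[str, float], now: float,
                         min_idle_days: float) -> List[str]:
    """Return paths idle for at least min_idle_days, longest idle first."""
    cutoff = now - min_idle_days * SECONDS_PER_DAY
    idle = [(accessed, path) for path, accessed in last_access.items()
            if accessed <= cutoff]
    # The oldest access time sorts first, so the most inactive file leads.
    return [path for accessed, path in sorted(idle)]


# Hypothetical access times, expressed in seconds since an arbitrary origin.
now = 100 * SECONDS_PER_DAY
files = {
    "/projects/old/report": 10 * SECONDS_PER_DAY,
    "/projects/new/draft": 95 * SECONDS_PER_DAY,
    "/projects/mid/data": 40 * SECONDS_PER_DAY,
}
print(migration_candidates(files, now, min_idle_days=30))
# ['/projects/old/report', '/projects/mid/data']
```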

The principal goals of backup, archiving, and HSM are as follows:

  • The goal of backup is to protect data against accidental loss or damage.  Backups should be reliable and efficient.

  • The goal of archiving is to conserve online storage space.  Storage media must be durable, safe, and reliable.

  • The goal of HSM is to conserve network storage resources.  Migration and recall must be automatic and reliable.