Chapter 10. Configuring and Scheduling Archive Requests

NetWorker includes an optional Archive product for long-term data storage. Whereas the goal of backup is to protect data against accidental loss or damage, the goal of archiving is to maintain a snapshot of data in perpetuity (or for as long as desired), and to conserve online storage space.


Note: If you are not sure whether your site includes the NetWorker Archive product, read the release notes and check the /var/netls/nodelock file.

This chapter describes how to configure, schedule, and control archive requests using the NetWorker Archive option. It ends with a shortcut procedure for scheduling and running archive requests.


Note: For information on installing and enabling the NetWorker Archive programs, see “Installing and Enabling NetWorker Software on Servers” and “Installing NetWorker Software on Clients” in Chapter 2. For information on using the NetWorker archive and retrieval programs, see Chapter 5, “Archiving and Retrieving,” in the IRIX NetWorker User's Guide.


Backup, Archiving, and Hierarchical Storage Management

Data backup is the process of storing redundant copies of files and directories residing on a local disk onto removable media, usually tapes. Redundant copies can be recovered in case the original files are lost or damaged. The system administrator usually schedules backups daily. Any new files, or files that changed since the last backup, are copied to tape so they can be restored on disk if necessary.

Archiving is normally performed on data associated with specific projects, rather than on an entire system. Unlike data backup, data archiving is typically initiated by end users whenever they need to, so a network-wide archiving policy is often impractical or unwise. Backups are always associated with a level (full, differential, or incremental), whereas archives are not.

When users archive project data, they sometimes want to delete those files to conserve space. As a consequence, they require that data be placed on long-lasting archive media that are durable, safe, and reliable. These media are usually removable, such as tape or optical disk.

Hierarchical Storage Management (HSM) is a policy-driven data management strategy in which data is automatically migrated from one storage medium to another, based on a set of policies. The policy often employed is access rate: the longer a file is inactive, the more likely it is to migrate. After a file migrates from primary to secondary storage, it can later be staged to tertiary storage based on another set of policies. Storage hierarchy is usually governed by the cost of storage for each medium. The benefit of HSM is that it provides users with a seemingly infinite storage capacity, at the lowest possible cost.
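The access-rate policy described above can be illustrated with a minimal Python sketch. This is not NetWorker code: the FileInfo record, the select_for_migration function, and the 60-day threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical record describing one file on primary storage."""
    path: str
    days_inactive: int  # days since the file was last accessed


def select_for_migration(files, threshold_days):
    """Return files idle for at least threshold_days, longest-idle first."""
    candidates = [f for f in files if f.days_inactive >= threshold_days]
    return sorted(candidates, key=lambda f: f.days_inactive, reverse=True)


files = [
    FileInfo("/proj/a.dat", 400),
    FileInfo("/proj/b.dat", 3),
    FileInfo("/proj/c.dat", 90),
]
for info in select_for_migration(files, threshold_days=60):
    print(info.path, info.days_inactive)
```

A real HSM product would typically weigh additional factors, such as file size and the cost of each storage tier, when deciding what to migrate.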

The principal goals of backup, archiving, and HSM are as follows:

  • The goal of backup is to protect data against accidental loss or damage. Backups should be reliable and efficient.

  • The goal of archiving is to conserve online storage space. Storage media must be durable, safe, and reliable.

  • The goal of HSM is to decrease the overall cost of storage. Migration and demigration must be automatic and reliable.

Features of the NetWorker Archive Product

The NetWorker Archive option consists of software for clients and a server:

  • The NetWorker Archive server provides file archiving and retrieval services to a range of client machines. It is packaged as an add-on extension to existing NetWorker backup servers, and uses the same license mechanism as NetWorker.

  • The NetWorker Archive client can be any machine on a network that employs archive services provided by the archive server. Clients can be enabled for backups, for archives, or for both.

The three categories of archive functions are

  • data archiving

  • data verification

  • data retrieval

Data Archiving

Data archiving is the process of taking a snapshot of files or directories as they reside on primary media (usually disk) at a given point in time for long-term storage on media called archive volumes. Archive volumes are similar to backup volumes, except that their retention policy has an unlimited expiration date.

Data archiving can be performed by end users or by the system administrator. Users archive data manually when they need to, whereas administrators can schedule an archive to take place anytime during the next 24 hours. For example, the best time to perform a large archiving operation might be in the middle of the night.
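Because a scheduled archive runs at a given time of day within the next 24 hours, the actual run time depends on when the request is made. The following Python sketch shows one way to work that out; the next_start function is a hypothetical helper, not part of NetWorker, and the dates are arbitrary.

```python
from datetime import datetime, timedelta


def next_start(now, start_time):
    """Return the next moment matching start_time ("HH:MM", 24-hour clock)."""
    hour, minute = (int(part) for part in start_time.split(":"))
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # The time has already passed today, so the archive runs tomorrow.
        candidate += timedelta(days=1)
    return candidate


# A request made at 5 p.m. for a 3:33 start runs early the next morning.
print(next_start(datetime(1996, 1, 15, 17, 0), "3:33"))
```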

Both users and administrators can request an extra copy of their archive save set, called a clone. NetWorker Archive allows you to use the volume pools feature to separate backup from archive volumes, and archive volumes from archive clone volumes. See “Using Volume Pools” in Chapter 7 for information.

After archiving is complete, users and administrators are given the option of deleting archived files and directories, or leaving them in place. This option, called grooming, helps conserve disk space after a project is finished.

This chapter documents the Motif-based graphical interface for the archive and retrieval programs. A command-line interface is also available through the nsrarchive and nsrretrieve commands.

Data Verification

Because archived data might be deleted from the system immediately or eventually, NetWorker provides an extra measure of security to make sure that archived data is correct. NetWorker verifies data in two ways:

  • media verification: NetWorker reads the archive volume and ensures it is writable and contains no bad spots.

  • data verification: NetWorker reads data from the archive volume as if doing a retrieve, but does not actually write any archived data back to disk.

If the archive volume is suspect, or if there are any problems with the data on tape, NetWorker issues a warning.
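The idea of reading archived data without writing it back to disk can be sketched as follows. The source does not say how NetWorker checks the data it reads, so the checksum comparison here is an assumption made for illustration, and verify_stream is a hypothetical function.

```python
import hashlib
import io


def verify_stream(stream, expected_sha256, chunk_size=64 * 1024):
    """Read the whole stream, discard the data, and compare digests.

    Nothing is written to disk; each chunk is only fed to the hash.
    """
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest() == expected_sha256


payload = b"archived project data"
expected = hashlib.sha256(payload).hexdigest()
print(verify_stream(io.BytesIO(payload), expected))
print(verify_stream(io.BytesIO(b"damaged data"), expected))
```

In this sketch an in-memory stream stands in for the archive volume; the first call reports a match and the second a mismatch.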

Data Retrieval

When you run the NetWorker retrieve program, NetWorker displays the names of the archive save sets on that server, listed by client name. You can retrieve a particular save set only if you have administrator or archive user privileges for that server.

You can search for specific archives or alter the sort order of archive save sets in the viewing list.

When the user picks an archive save set to retrieve, and the administrator ensures that the relevant archive volume (or a clone of that volume) is mounted on the NetWorker server, retrieval can begin.

Retrieved save sets can be relocated, renamed, or allowed to overwrite existing files of the same name, as with the NetWorker recovery feature.

Because archives theoretically last forever, NetWorker does not track them in the online index. Consequently, when a user retrieves archived data, the interface does not show the structure of files and directories. It shows only the archives that a server has made during a selected period, or for a certain user. Thus the best way for a user or administrator to remember archives is to keep a record of the date, or create an annotation that remains meaningful over the years.
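Since archives are located by period, user, and annotation rather than by directory structure, a personal record of archives can be searched in the same way. The sketch below is a hypothetical record-keeping aid, not a NetWorker interface; the ArchiveRecord fields and the sample entries are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchiveRecord:
    """Hypothetical note kept by a user or administrator for one archive."""
    client: str
    user: str
    archived_on: date
    annotation: str


def find_archives(records, start=None, end=None, user=None):
    """Return records in the given period and/or for the given user, oldest first."""
    matches = [
        r for r in records
        if (start is None or r.archived_on >= start)
        and (end is None or r.archived_on <= end)
        and (user is None or r.user == user)
    ]
    return sorted(matches, key=lambda r: r.archived_on)


records = [
    ArchiveRecord("hostname", "pat", date(1996, 3, 1), "Q1 design review files"),
    ArchiveRecord("hostname", "lee", date(1995, 11, 20), "prototype test data"),
    ArchiveRecord("hostname", "pat", date(1995, 6, 5), "proposal drafts"),
]
for record in find_archives(records, start=date(1995, 1, 1), end=date(1995, 12, 31)):
    print(record.archived_on, record.user, record.annotation)
```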

Configuring NetWorker for Archiving

If your site has purchased the NetWorker Archive option, you can begin archive administration by invoking the nwadmin command:

    # nwadmin [ -s servername ] &

Replace servername with the name of the NetWorker server where you wish to perform archive administration, or omit the -s option altogether.

If you have already started the NetWorker Administrator program, double-click on its icon, which is shown in Figure 10-1.

Figure 10-1. NetWorker Administrator Icon


The NetWorker Administrator window appears, as shown in Figure 10-2.

Figure 10-2. NetWorker Administrator Window


At the top level, this window looks like version 4.1.1 of NetWorker Administrator. The differences are as follows:

  • Archive services field and Archive users list in the Clients setup window

  • Archive Request Control menu item in the Server menu

  • Archive Request menu item in the Customize menu

Use of these features is explained in the next sections.


Tip: If you have a jukebox, it is a good idea to insert an archive volume into one of the slots.

Before any archiving can occur, you must configure NetWorker to recognize one or more archive clients and set up save sets and volume pools as needed. This section explains

  • configuring archive services for a client

  • using archive save sets

  • creating new volume pools for archiving

Configuring Archive Services for a Client

The arrow in Figure 10-3 points to the Archive services field in the Clients window. This field governs whether archives are enabled for the system selected in the Clients list. To allow archives for that client, click the Enabled button at Archive services. If archive services remain Disabled, that client will not be able to participate in a scheduled archive.

Figure 10-3. Enabling Archive Services for a Client



Note: When you enable archive services for one client, archive services are enabled for all clients of that server.

The Archive users list is at the bottom of the scrolling section of the Clients window. To allow users on the client system to perform manual archives and file grooming, enter their user names into the Archive users list. If no user names appear in the Archive users list, the client machine will be able to participate in scheduled archives, but individual users will not be able to archive files.

Using Archive Save Sets

Archive save sets are similar to backup save sets. The principal difference is that archive save sets have no expiration date. Also, archives are always complete: there are no backup levels or incremental saves.


Note: Archives are not recorded in the online file index, so they are not affected by the browse policy. The lack of associated indexes helps conserve disk space.

The retrieve program, nsrretrieve, is similar to the NetWorker nwrecover program, except that it works with archive save sets instead of backup save sets. Since archived files and directories are not recorded in the online index, the graphical user interface for retrieve is based on save sets, rather than on a directory hierarchy.

Users on the NetWorker administrator list have permission to configure archives. These users, and users registered on the Archive users list in the Clients window, have permission to use the archiving and retrieval facilities.

Anyone can browse archive save sets—that is, look at information in the media database. However, you can retrieve only files that you own, unless you are superuser, in which case you can retrieve anyone's files.
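The permission rules in this section can be summarized in a short Python sketch. The function names are hypothetical, and treating the superuser as the user named root is an assumption; NetWorker's actual checks are not shown in this guide.

```python
def can_configure(user, administrators):
    """Only users on the administrator list may configure archives."""
    return user in administrators


def can_browse(user):
    """Anyone may look at archive save set information in the media database."""
    return True


def can_retrieve(user, file_owner, administrators, archive_users):
    """Retrieval needs archive privileges plus ownership, unless superuser."""
    if user not in administrators and user not in archive_users:
        return False
    # Assumption: the superuser is the account named "root".
    return user == "root" or user == file_owner


administrators = {"root"}
archive_users = {"pat", "lee"}
print(can_retrieve("pat", "pat", administrators, archive_users))
print(can_retrieve("pat", "lee", administrators, archive_users))
print(can_retrieve("root", "lee", administrators, archive_users))
```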


Tip: To overwrite an archive tape, first make sure nobody will ever need the data again. Then relabel the volume as you would a backup volume; see “Labeling Backup Volumes” in Chapter 7.


Creating New Volume Pools for Archiving

NetWorker contains two preconfigured volume pools for use with archiving: Archive and Archive Clone. If you want to create a new volume pool for archives, follow these steps:

  1. Choose “Pools” from the Media menu.

  2. Click the Create button in the Pools window.

  3. Fill out all the fields according to your needs.


    Note: Make sure the Store index entries field is set to No. This feature distinguishes archive pools from backup pools.


  4. Click the Apply button to activate the new volume pool.

When you schedule a new archive request, you can use the new archive volume pool you have created. If you choose to clone archive data (as explained in “Cloning an Archive,” later in this chapter), you should also create a new archive clone pool.

Scheduling and Monitoring Archive Requests

This section explains

  • scheduling an archive request

  • starting or disabling an archive request

  • viewing the progress of the current archive request

  • verifying data on an archive volume

  • cloning an archive

Scheduling an Archive Request

To schedule an archive on a one-time basis, choose “Archive Request” from the Customize menu.

The Archive Request window appears, as shown in Figure 10-4.

Figure 10-4. Archive Request Window, Expanded View


Use the features of the Archive Request window as follows:

Archive Request list  


View a list of archives that someone has previously requested on this server. You can add new archive requests and delete old ones using the Create and Delete buttons.

scrolling panel  


View the name of the selected archive. Use fields in this panel as follows:

  • Annotation: Enter a comment of 1024 characters or fewer describing this archive request.

  • Status: “Start now” means archiving begins when you click the Apply button; “Start later” means it begins as indicated in the Start time field. After the archive runs, neither is selected.

  • Start time: Enter the time to begin, using a 24-hour clock.

  • Client field: Enter the archive client's hostname. For the server to archive itself, the client should be the same as the server.

  • Save set: Enter pathnames of directories or files to archive.

  • Directive: As in the Clients window, select the directive, usually Unix standard or Unix with compression.

  • Archive pool: Enter the volume pool to which archives should be sorted. You should usually set this to Archive.

  • Verify buttons: Indicate whether or not to automatically check the integrity of data archived on tape.

  • Clone buttons: Indicate whether or not to automatically clone the archive volumes for extra security.


    Note: If you enable cloning, you should also set the Archive clone pool, usually to Archive Clone.


  • Grooming buttons: Indicate whether or not client files and directories should be removed after archiving is complete.

  • Archive completion field: If desired, enter a command to execute after archiving is complete.
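Several of the field rules above lend themselves to a simple pre-flight check. The following Python sketch models a few Archive Request fields and reports likely problems; the ArchiveRequest class, parse_start_time, and find_problems are hypothetical names, and the checks reflect only the rules stated in this section, not NetWorker's own validation.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_ANNOTATION_CHARS = 1024  # limit stated for the Annotation field


@dataclass
class ArchiveRequest:
    """Hypothetical model of a few Archive Request window fields."""
    name: str
    client: str
    annotation: str = ""
    start_time: Optional[str] = None  # "HH:MM" on a 24-hour clock
    clone: bool = False
    archive_clone_pool: Optional[str] = None


def parse_start_time(text):
    """Return (hour, minute), or raise ValueError for an invalid time."""
    hour, minute = (int(part) for part in text.split(":"))
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("time out of range: " + text)
    return hour, minute


def find_problems(request) -> List[str]:
    """Return human-readable problems found in the request."""
    problems = []
    if len(request.annotation) > MAX_ANNOTATION_CHARS:
        problems.append("annotation exceeds 1024 characters")
    if request.start_time is not None:
        try:
            parse_start_time(request.start_time)
        except ValueError:
            problems.append("start time is not a valid 24-hour time")
    if request.clone and not request.archive_clone_pool:
        problems.append("cloning is enabled but no archive clone pool is set")
    return problems


request = ArchiveRequest(name="decommission", client="hostname",
                         start_time="3:33", clone=True)
print(find_problems(request))
```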

Starting or Disabling an Archive Request

To start an archive immediately, or disable a scheduled archive, choose “Archive Request Control” from the Server menu. The Archive Request Control window appears, as shown in Figure 10-5.

Figure 10-5. Archive Request Control Window


Use features of this window as follows:

Archive list 

View archive requests; the most recently run archive request is initially highlighted. To see details about when this archive completed and how successfully it ran, click the Details button.

If you want details about, or control over, a different archive request, select the one you want from the Archive list.

Start button 

Click to initiate an archive request immediately. This action has the same effect as selecting a status of Start now for that archive request in the Archive Request window.

Schedule button 

Click to reschedule an archive request; enter a new starting time. This action has the same effect as selecting Start later and specifying a new Start time in the Archive Request window.

Disable button 

Click to disable an archive request that you scheduled here or in the Archive Request window.

Stop button 

Click to halt an archive in progress.

Viewing the Progress of the Current Archive Request

To see the progress of a recent archive (the one currently selected), click the Details button in the Archive Request Control window. The Archive Request Details window appears, as shown in Figure 10-6.

Figure 10-6. Archive Request Details Window



Note: Entries appear in the Archive Request Details window only after an archive request runs.

This window provides information about the completion of an archive request. Many fields in this window, such as annotation, archive clone pool, archive pool, client, clone, name, save set, start time, status, and verify, appear in the Archive Request window.

The remaining fields of the Archive Request Details window have the following functions:

completion time field  


Indicates when the archive finished. Its duration is the difference between this and start time.

log field  

Displays a message generated by the archive.

run status field  

Displays the outcome of the archive request: completed, failed, or partial.


Note: For more information about an archive, see the log file /usr/lib/nsrarchive.
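The duration rule given for the completion time field is a simple subtraction, sketched here in Python. The archive_duration function and the sample times are hypothetical.

```python
from datetime import datetime


def archive_duration(start_time, completion_time):
    """Return how long the archive ran, as a timedelta."""
    return completion_time - start_time


start = datetime(1996, 1, 16, 3, 33)
done = datetime(1996, 1, 16, 5, 3)
print(archive_duration(start, done))  # prints 1:30:00
```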


Verifying Data on an Archive Volume

To verify data already archived on an archive volume after the fact, you have two alternatives:

  • Clone the archive data by choosing “Clone” from the Save set menu. During cloning, the original archive data is verified as the save set is copied from one volume to another. When you are done, you can reuse the cloned volume.

  • Find out the save set ID of the archived data; for example, by searching the NetWorker Archive Retrieve window. Then enter the following command at the system prompt, substituting that save set ID for ssid (this is how NetWorker verifies archived data):

    nsrretrieve -n -S ssid
    

Cloning an Archive

If you have already made an archive, and want to make a clone of it, follow these steps:

  1. Choose “Clone” from the Save set menu.

  2. Enter criteria for locating save sets in the Save Set Clone window.


    Tip: Click the More button and type Archive in the Pool field.


  3. Click the Query button to see save sets matching your criteria.

  4. Select the save set you wish to clone, and click the Clone button.

  5. Click the Start button in the Save Set Status Clone dialog box to activate the save set clone.

For more information on volume pools and save set cloning, see “Using Volume Pools” in Chapter 7 and “Cloning Save Sets” in Chapter 9, respectively.

Archive Example

Suppose you must decommission the workstation (and hostname) of someone who has left the company. It would be wise to archive the system first, in case its filesystems contain some essential files that will not be recognized as such until later.

This section gives an example of how to schedule and run an archive request:

  • creating an archive client

  • making an archive request

  • checking the archive request

  • rescheduling the archive request

Creating an Archive Client

Only registered archive clients can use the archive facility. To create an archive client, follow these steps:

  1. Choose “Client Setup” from the Clients menu. The Clients window appears, as shown in Figure 10-3.

  2. Click the Create button; the Clients window changes.

  3. Enter the hostname of the decommissioned workstation into the client Name field.

  4. Click Enabled after Archive services to allow archives for this client.

  5. To permit employees on other machines to use archive and retrieval services, scroll to the bottom of the Clients window and add their user names to the Archive users field.

    If you do not supply archive users, manual archives and retrieves are disabled on the client machine.

The system hostname is now a registered archive client. However, an archive does not take place until you request one.


Tip: To allow archives on the server, make sure that archives are enabled for the server as a client of itself.


Making an Archive Request

Valid archive users may request archives manually using the nwarchive command. However, a manual archive often takes too long, especially during normal working hours when the network is busy. It is therefore usually more efficient to schedule an archive request. Follow these steps:

  1. Choose “Archive Request” from the Customize menu. The Archive Request window appears, as shown in Figure 10-4.

  2. Click the Create button; the Archive Request window changes.

  3. Type the Name you want to assign to this archive request, and a brief Annotation to remind you what the archive is for.

  4. After Status, click on Start later to schedule this archive for tonight. Type the Start time you want, or accept the default starting time of 3:33 a.m.

  5. Type the archive client machine name into the Client field, and the Save set pathname(s) you want to archive.

  6. To speed the archive, select a Directive of either Unix standard or Unix with compression.

  7. To separate this archive from normal backups, select the Archive pool named Archive, or be sure to use a different tape.

  8. To check that archived data was accurately saved, click Yes in the Verify field.

  9. To make a duplicate copy of this archive volume, click Yes in the Clone field. Accept the default Archive Clone pool.

  10. To remove files and directories from disk after archiving them to tape, click remove in the Grooming field.

  11. To be notified by e-mail when the archive completes, type the following into the Archive completion field:

    /usr/sbin/Mail -s "archive request complete" root
    

    This command mails the archive completion notice to root.

  12. Click the Apply button to activate your changes.

You have now requested an archive of the system hostname to begin tonight at 3:33 a.m.

Checking the Archive Request

The next morning, check the outcome of the archive. If you set an Archive completion notice, you might have a mail message containing a log of the archive request.

To check details of the archive, follow these steps:

  1. Choose “Archive Request Control” from the Server menu. The Archive Request Control window appears, as shown in Figure 10-5.

  2. Click the Details button. The Archive Request Details window appears, as shown in Figure 10-6.

If the archive completed successfully, you are done, and you can begin dismantling the ex-employee's system. If the archive failed, you can reschedule it, as explained in the next section.

You can use the Archive Request Control window to start, schedule, disable, or stop another archive.

Rescheduling the Archive Request

Suppose that the archive of hostname did not complete last night because the Archive volume had not been mounted. You decide to reschedule for the next night. Follow these steps:

  1. Choose “Archive Request Control” from the Server menu. The Archive Request Control window appears, as shown in Figure 10-5. Make sure that the archive request you want is highlighted.

  2. Click the Schedule button. The Archive Request Schedule dialog box appears, as shown in Figure 10-7.

    Figure 10-7. Archive Request Schedule Dialog Box


  3. Type a new starting time into the Schedule Archive field, using the 24-hour clock, and click Ok.

The archive request is executed again that night, at the time you specified. If you change your mind and want to discontinue the archive, click the Disable button.

Shortcut

Use this shortcut to schedule and run an archive request if you are an experienced NetWorker user or after you have reviewed the information in this chapter. Follow this general procedure:

  1. Create and enable an archive client using the Clients window.

  2. Using the Archive Request window, fill in all or most of the fields with your preferences.

    Name and Client are mandatory, as is the Start time if you Start later. Save set defaults to empty, Directive and Archive pool to none, Verify and Clone to No, and Grooming to none.

  3. Check the archive status in the Archive Request Details window.

  4. To reschedule an archive, use the Archive Request Schedule dialog box. To discontinue a scheduled archive request, click the Disable button. To start and stop an archive request (for example, to test it), click the Start and Stop buttons.
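The mandatory fields and defaults listed in step 2 can be expressed as a small Python sketch. The dictionary keys and the complete_request function are hypothetical; only the mandatory fields and default values themselves come from this chapter.

```python
# Defaults as stated in step 2; None stands for "none" on Directive and Archive pool.
DEFAULTS = {
    "save set": "",
    "directive": None,
    "archive pool": None,
    "verify": "No",
    "clone": "No",
    "grooming": "none",
}


def complete_request(fields):
    """Check mandatory fields, then fill in defaults for the rest."""
    missing = [key for key in ("name", "client") if not fields.get(key)]
    if fields.get("status") == "Start later" and not fields.get("start time"):
        missing.append("start time")
    if missing:
        raise ValueError("missing mandatory fields: " + ", ".join(missing))
    # Values supplied by the user override the defaults.
    return {**DEFAULTS, **fields}


request = complete_request({
    "name": "decommission",
    "client": "hostname",
    "status": "Start later",
    "start time": "3:33",
    "verify": "Yes",
})
print(request)
```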