NetWorker includes an optional Archive product for long-term data storage. Whereas the goal of backup is to protect data against accidental loss or damage, the goal of archiving is to maintain a snapshot of data in perpetuity (or for as long as desired), and to conserve online storage space.
Note: If you are not sure whether your site includes the NetWorker Archive product, read the release notes and the /var/netls/nodelock file.
This chapter describes how to configure, schedule, and control archive requests using the NetWorker Archive option. This chapter explains
backup, archiving, and hierarchical storage management
features of the NetWorker Archive product
configuring NetWorker for archiving
scheduling and monitoring archive requests
example
This chapter ends with a shortcut procedure for scheduling and running archive requests.
Note: For information on installing and enabling the NetWorker Archive programs, see “Installing and Enabling NetWorker Software on Servers” and “Installing NetWorker Software on Clients” in Chapter 2. For information on using the NetWorker archive and retrieval programs, see Chapter 5, “Archiving and Retrieving,” in the IRIX NetWorker User's Guide.
Data backup is the process of storing redundant copies of files and directories residing on a local disk onto removable media, usually tapes. Redundant copies can be recovered in case the original files are lost or damaged. The system administrator usually schedules backups daily. Any new files, or files that changed since the last backup, are copied to tape so they can be restored on disk if necessary.
Archiving is normally performed on data associated with specific projects, rather than on an entire system. Unlike data backup, end users typically initiate data archiving whenever they need to, so a network-wide archiving policy is often impractical or unwise. Backups are always associated with a level (full, differential, or incremental), whereas archives are not.
When users archive project data, they sometimes want to delete those files to conserve space. As a consequence, they require that data be placed on long-lasting archive media that are durable, safe, and reliable. These media are usually removable, such as tape or optical disk.
Hierarchical Storage Management (HSM) is a policy-driven data management strategy in which data is automatically migrated from one storage medium to another, based on a set of policies. The policy often employed is access rate: the longer a file is inactive, the more likely it is to migrate. After a file migrates from primary to secondary storage, it can later be staged to tertiary storage based on another set of policies. The storage hierarchy is usually governed by the cost of storage for each medium. The benefit of HSM is that it provides users with a seemingly infinite storage capacity, at the lowest possible cost.
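The access-rate policy described above can be illustrated with a short shell sketch. It is not how NetWorker or any HSM product selects files; it only lists files that have not been read for a given number of days, which is the kind of test an access-rate policy applies. The function name migration_candidates and its two arguments are assumptions made for this illustration.

```sh
#!/bin/sh
# Illustration only: list files that an access-rate policy might
# consider for migration. migration_candidates is a hypothetical name.
#   $1 = directory to scan
#   $2 = age threshold in days
migration_candidates() {
    # -atime +N selects files last accessed more than N whole days ago
    # (find discards fractional days before comparing).
    find "$1" -type f -atime +"$2" -print
}
```

For example, running migration_candidates on a project directory with a threshold of 180 would print the files that have gone unread for roughly half a year. A real HSM product would then migrate such files automatically rather than just list them.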
The principal goals of backup, archiving, and HSM are as follows:
The goal of backup is to protect data against accidental loss or damage. Backups should be reliable and efficient.
The goal of archiving is to conserve online storage space. Storage media must be durable, safe, and reliable.
The goal of HSM is to decrease the overall cost of storage. Migration and demigration must be automatic and reliable.
The NetWorker Archive option consists of software for clients and a server:
The NetWorker Archive server provides file archiving and retrieval services to a range of client machines. It is packaged as an add-on extension to existing NetWorker backup servers, and uses the same license mechanism as NetWorker.
The NetWorker Archive client can be any machine on a network that employs archive services provided by the archive server. Clients can be enabled for backups, for archives, or for both.
The three categories of archive functions are
data archiving
data verification
data retrieval
Data archiving is the process of taking a snapshot of files or directories as they reside on primary media (usually disk) at a given point in time for long-term storage on media called archive volumes. Archive volumes are similar to backup volumes, except that their retention policy never expires.
Data archiving can be performed by end users or by the system administrator. Users archive data manually when they need to, whereas administrators can schedule an archive to take place anytime during the next 24 hours. For example, the best time to perform a large archiving operation might be in the middle of the night.
Both users and administrators can request an extra copy of their archive save set, called a clone. NetWorker Archive allows you to use the volume pools feature to separate backup from archive volumes, and archive volumes from archive clone volumes. See “Using Volume Pools” in Chapter 7 for information.
After archiving is complete, users and administrators are given the option of deleting archived files and directories, or leaving them in place. This option, called grooming, helps conserve disk space after a project is finished.
This chapter documents the Motif-based graphical interface for the archive and retrieval programs, although a command-line interface is available as well: the commands nsrarchive and nsrretrieve.
Because archived data might be deleted from the system immediately or eventually, NetWorker provides an extra measure of security to make sure that archived data is correct. NetWorker verifies data in two ways:
If the archive volume is suspect, or if there are any problems with the data on tape, NetWorker issues a warning.
When you run the NetWorker retrieve program, NetWorker displays the names of the archive save sets on that server, listed by client name. You can retrieve a particular save set only if you have administrator or archive user privileges for that server.
You can search for specific archives or alter the sort order of archive save sets in the viewing list.
Retrieval can begin once the user has picked an archive save set to retrieve and the administrator has made sure that the relevant archive volume (or a clone of that volume) is mounted on the NetWorker server.
Retrieved save sets can be relocated, renamed, or allowed to overwrite existing files of the same name, as with the NetWorker recovery feature.
Because archives theoretically last forever, NetWorker does not track them in the online index. Consequently, when a user retrieves archived data, the interface does not show the structure of files and directories. It shows only the archives that a server has made during a selected period, or for a certain user. Thus the best way for a user or administrator to remember archives is to keep a record of the date, or create an annotation that remains meaningful over the years.
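One way to keep annotations meaningful over the years is to build them from the same few facts every time. The sketch below is only a suggestion: the make_annotation function, the field order, and the sample project and user names are all hypothetical, and NetWorker does not require any particular annotation format.

```sh
#!/bin/sh
# Hypothetical helper: compose a consistent annotation string.
#   $1 = date of the archive (YYYY-MM-DD)
#   $2 = project name
#   $3 = owner of the data
#   $4 = free-text reason for archiving
make_annotation() {
    printf '%s %s (owner: %s): %s\n' "$1" "$2" "$3" "$4"
}
```

Calling make_annotation 1996-03-14 apollo-sim jsmith "final release sources" prints `1996-03-14 apollo-sim (owner: jsmith): final release sources`, which can be pasted into the Annotation field and also kept in a written log.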
If your site has purchased the NetWorker Archive option, you can begin archive administration by invoking the nwadmin command:
# nwadmin [ -s servername ] &
Replace servername with the name of the NetWorker server where you wish to perform archive administration, or omit the -s option altogether.
If you have already started the NetWorker Administrator program, double-click on its icon, which is shown in Figure 10-1.
The NetWorker Administrator window appears, as shown in Figure 10-2.
At the top level, this window looks like version 4.1.1 of NetWorker Administrator. The differences are as follows:
the Clients setup window lets you enable Archive services and add names to the Archive users list
the Server menu contains an Archive Request Control menu item
the Customize menu contains an Archive Request menu item
Use of these features is explained in the next sections.
Tip: If you have a jukebox, it is a good idea to insert an archive volume into one of the slots.
Before any archiving can occur, you must configure NetWorker to recognize one or more archive clients and set up save sets and volume pools as needed. This section explains
configuring archive services for a client
using archive save sets
creating new volume pools for archiving
The arrow in Figure 10-3 points to the Archive services field in the Clients window. This field governs whether archives are enabled for the system selected in the Clients list. To allow archives for that client, click the Enabled button at Archive services. If archive services remain Disabled, that client will not be able to participate in a scheduled archive.
Note: When you enable archive services for one client, archive services are enabled for all clients of that server.
The Archive users list is at the bottom of the scrolling section of the Clients window. To allow users on the client system to perform manual archives and file grooming, enter their user names into the Archive users list. If no user names appear in the Archive users list, the client machine will be able to participate in scheduled archives, but individual users will not be able to archive files.
Archive save sets are similar to backup save sets. The principal difference is that archive save sets have no expiration date. Also, archives are always complete: there are no backup levels or incremental saves.
Note: Archives are not recorded in the online file index, so they are not affected by the browse policy. The lack of associated indexes helps conserve disk space.
nsrretrieve (the retrieve program) is similar to the NetWorker nwrecover program, except that it works with archive save sets instead of backup save sets. Since archived files and directories are not recorded in the online index, the graphical user interface for retrieve is based on save sets, rather than on a directory hierarchy.
Users on the NetWorker administrator list have permission to configure archives. These users, and users registered on the Archive users list in the Clients window, have permission to use the archiving and retrieval facilities.
Anyone can browse archive save sets—that is, look at information in the media database. However, you can retrieve only files that you own, unless you are superuser, in which case you can retrieve anyone's files.
Tip: To overwrite an archive tape, first make sure nobody will ever need the data again. Then relabel the volume as you would a backup volume; see “Labeling Backup Volumes” in Chapter 7.
NetWorker contains two preconfigured volume pools for use with archiving: Archive and Archive Clone. If you want to create a new volume pool for archives, follow these steps:
Choose “Pools” from the Media menu.
Click the Create button in the Pools window.
Fill out all the fields according to your needs.
Note: Make sure the Store index entries field is set to No. This feature distinguishes archive pools from backup pools.
Click the Apply button to activate the new volume pool.
When you schedule a new archive request, you can use the new archive volume pool you have created. If you choose to clone archive data (as explained in “Cloning an Archive,” later in this chapter), you should also create a new archive clone pool.
This section explains
scheduling an archive request
starting or disabling an archive request
viewing the progress of the current archive request
verifying data on an archive volume
cloning an archive
To schedule an archive on a one-time basis, choose “Archive Request” from the Customize menu.
The Archive Request window appears, as shown in Figure 10-4.
Use the features of the Archive Request window as follows:
To start an archive immediately, or disable a scheduled archive, choose “Archive Request Control” from the Server menu. The Archive Request Control window appears, as shown in Figure 10-5.
Use features of this window as follows:
Archive list | View archive requests; the most recently run archive request is initially highlighted. To see details about when this archive completed and how successfully it ran, click the Details button. If you want details about, or control over, a different archive request, select the one you want from the Archive list.
Start button | Click to initiate an archive request immediately. This action has the same effect as selecting a status of Start now for that archive request in the Archive List window.
Schedule button | Click to reschedule an archive request; enter a new starting time. This action has the same effect as selecting Start later and specifying a new Start time in the Archive List window.
Disable button | Click to disable an archive request that you scheduled here or in the Archive List window.
Stop button | Click to halt an archive in progress.
To see the progress of a recent archive (the one currently selected), click the Details button in the Archive Request Control window. The Archive Request Details window appears, as shown in Figure 10-6.
Note: Entries appear in the Archive Request Details window only after an archive request runs.
This window provides information about the completion of an archive request. Many fields in this window, such as annotation, archive clone pool, archive pool, client, clone, name, save set, start time, status, and verify, appear in the Archive Request window.
The fields of the Archive Request Details window have the following functions:
completion time field |
log field | Displays a message generated by the archive.
run status field | Displays the outcome of the archive request: completed, failed, or partial.
Note: For more information about an archive, see the log file /usr/lib/nsrarchive.
To verify data already archived on an archive volume after the fact, you have two alternatives:
Clone the archive data by choosing “Clone” from the Save set menu. During cloning, the original archive data is verified as the save set is copied from one volume to another. When you are done, you can reuse the cloned volume.
Find out the save set ID of the archived data; for example, by searching the NetWorker Archive Retrieve window. Then enter the following command at the system prompt, substituting that save set ID for ssid (this is how NetWorker verifies archived data):
retrieve -n -S ssid
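When several archive save sets need checking, the command above can be wrapped in a loop. This is a minimal sketch, not a NetWorker facility: the verify_archives function and the RETRIEVE override variable are hypothetical, and the sketch assumes that retrieve exits with a nonzero status when verification fails, which you should confirm on your system.

```sh
#!/bin/sh
# Hypothetical wrapper: run the verification command from this chapter
# once per save set ID given on the command line.
# RETRIEVE can be set to another program to rehearse the loop on a
# machine without NetWorker; it defaults to the retrieve command.
verify_archives() {
    rc=0
    for ssid in "$@"; do
        # Assumption: a nonzero exit status means verification failed.
        if ${RETRIEVE:-retrieve} -n -S "$ssid"; then
            echo "ssid $ssid: verified"
        else
            echo "ssid $ssid: verification FAILED" >&2
            rc=1
        fi
    done
    return $rc
}
```

The function returns a nonzero status if any save set fails, so it can be used in a larger script that stops before grooming or relabeling anything.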
If you have already made an archive, and want to make a clone of it, follow these steps:
Choose “Clone” from the Save set menu.
Enter criteria for locating save sets in the Save Set Clone window.
Tip: Click the More button and type Archive in the Pool field.
Click the Query button to see save sets matching your criteria.
Select the save set you wish to clone, and click the Clone button.
Click the Start button in the Save Set Status Clone dialog box to activate the save set clone.
For more information on volume pools and save set cloning, see “Using Volume Pools” in Chapter 7 and “Cloning Save Sets” in Chapter 9, respectively.
Suppose you must decommission the workstation (and hostname) of someone who has left the company. It would be wise to archive the system first, in case its filesystems contain some essential files that will not be recognized as such until later.
This section gives an example of how to schedule and run an archive request:
creating an archive client
making an archive request
checking the archive request
rescheduling the archive request
Only registered archive clients can use the archive facility. To create an archive client, follow these steps:
Choose “Client Setup” from the Clients menu. The Clients window appears, as shown in Figure 10-3.
Click the Create button; the Clients window changes.
Enter the hostname of the decommissioned workstation into the client Name field.
Click Enabled after Archive services to allow archives for this client.
To permit employees on other machines to use archive and retrieval services, scroll to the bottom of the Clients window and add their user names to the Archive users field.
If you do not supply archive users, manual archives and retrieves are disabled on the client machine.
The system hostname is now a registered archive client. However, an archive does not take place until you request one.
Tip: To allow archives on the server, make sure that archives are enabled for the server as a client of itself.
Valid archive users may request archives manually using the nwarchive command. However, a manual archive often takes too long, especially during normal working hours when the network is busy, so it is usually more efficient to schedule an archive request. Follow these steps:
Choose “Archive Request” from the Customize menu. The Archive Request window appears, as shown in Figure 10-4.
Click the Create button; the Archive Request window changes.
Type the Name you want to assign to this archive request, and a brief Annotation to remind you what the archive is for.
After Status, click on Start later to schedule this archive for tonight. Type the Start time you want, or accept the default starting time of 3:33 a.m.
Type the archive client machine name into the Client field, and the Save set pathname(s) you want to archive.
To speed the archive, select a Directive of either Unix standard or Unix with compression.
To separate this archive from normal backups, select the Archive pool named Archive, or be sure to use a different tape.
To check that archived data was accurately saved, click Yes in the Verify field.
To make a duplicate copy of this archive volume, click Yes in the Clone field. Accept the default Archive Clone pool.
To remove files and directories from disk after archiving them to tape, click remove in the Grooming field.
To be notified by e-mail when the archive completes, type the following into the Archive completion field:
/usr/sbin/Mail -s "archive request complete" root
This instruction sends the notice to root.
Click the Apply button to activate your changes.
You have now requested an archive of the system gallo to begin tonight at 3:33 a.m.
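Before relying on the e-mail notice, you may want to rehearse the notification command from the procedure above at a shell prompt. The sketch below assumes that the Archive completion command receives the archive log on standard input, which this chapter does not state outright; the notify_test function, its sample message, and the MAILER override are hypothetical.

```sh
#!/bin/sh
# Rehearse the Archive completion command by hand.
# MAILER defaults to the mail program named in this chapter; set it to
# another program to try the function without sending real mail.
#   $1 = recipient (defaults to root)
notify_test() {
    recipient=${1:-root}
    printf 'Test message standing in for an archive log.\n' |
        ${MAILER:-/usr/sbin/Mail} -s "archive request complete" "$recipient"
}
```

If the test message arrives with the expected subject line, the same command should be safe to enter in the Archive completion field.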
The next morning, check the outcome of the archive. If you set an Archive completion notice, you might have a mail message containing a log of the archive request.
To check details of the archive, follow these steps:
Choose “Archive Request Control” from the Server menu. The Archive Request Control window appears, as shown in Figure 10-5.
Click the Details button. The Archive Request Details window appears, as shown in Figure 10-6.
If the archive completed successfully, you are done, and you can begin dismantling the ex-employee's system. If the archive failed, you can reschedule it, as explained in the next section.
You can use the Archive Request Control window to start, schedule, disable, or stop another archive.
Suppose that the archive of hostname did not complete last night because the Archive volume had not been mounted. You decide to reschedule for the next night. Follow these steps:
Choose “Archive Request Control” from the Server menu. The Archive Request Control window appears, as shown in Figure 10-5. Make sure that the archive request you want is highlighted.
Click the Schedule button. The Archive Request Schedule dialog box appears, as shown in Figure 10-7.
Type a new starting time into the Schedule Archive field, using the 24-hour clock, and click Ok.
The archive request is executed again that night, at the time you specified. If you change your mind and want to discontinue the archive, click the Disable button.
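Because the Schedule Archive field expects 24-hour time, it can help to check a value before typing it in. The following sketch assumes the time is written as hours and minutes separated by a colon (for example 03:33 or 23:15); the exact format the dialog box accepts is not specified here, and is_24h_time is a hypothetical helper.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if $1 looks like a 24-hour time.
# Accepts H:MM or HH:MM with hours 0-23 and minutes 00-59.
is_24h_time() {
    case $1 in
        [01][0-9]:[0-5][0-9] | 2[0-3]:[0-5][0-9] | [0-9]:[0-5][0-9])
            return 0 ;;
        *)
            return 1 ;;
    esac
}
```

With this check, 03:33 and 23:59 are accepted, while 24:00, 12:60, and 12-hour forms such as "3:33 a.m." are rejected.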
Use this shortcut to schedule and run an archive request if you are an experienced NetWorker user or after you have reviewed the information in this chapter. Follow this general procedure:
Create and enable an archive client using the Clients window.
Using the Archive Request window, fill in all or most of the fields with your preferences.
Name and Client are mandatory, as is the Start time if you select Start later. Save set defaults to empty, Directive and Archive pool to none, Verify and Clone to No, and Grooming to none.
Check the archive status in the Archive Request Details window.
To reschedule an archive, use the Archive Request Schedule dialog box. To discontinue a scheduled archive request, click the Disable button. To start and stop an archive request (for example, to test it), click the Start and Stop buttons.
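The mandatory fields and defaults listed in the shortcut procedure can be summarized as a small checklist script. It does not talk to NetWorker; check_request and the environment variable names are hypothetical stand-ins for the fields of the Archive Request window.

```sh
#!/bin/sh
# Hypothetical pre-flight checklist for an archive request.
#   $1 = Name, $2 = Client, $3 = Status, $4 = Start time
# Optional fields come from environment variables and fall back to the
# defaults given in this chapter.
check_request() {
    [ -n "$1" ] || { echo "Name is mandatory" >&2; return 1; }
    [ -n "$2" ] || { echo "Client is mandatory" >&2; return 1; }
    if [ "$3" = "Start later" ] && [ -z "$4" ]; then
        echo "Start time is mandatory with Start later" >&2
        return 1
    fi
    printf '%s\n' "name=$1 client=$2 status=$3 start=$4 save_set=${SAVE_SET:-} directive=${DIRECTIVE:-none} pool=${ARCHIVE_POOL:-none} verify=${VERIFY:-No} clone=${CLONE:-No} grooming=${GROOMING:-none}"
}
```

Printing the effective values in one line makes it easy to see which fields were left at their defaults before you fill in the window.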