Appendix B. Managing the NetWorker Environment

This appendix provides examples and suggestions to consider when you set up your NetWorker environment.  It also offers background information to help you understand the logic behind the NetWorker backup schedule and index policy features.

For your convenience, NetWorker is shipped with preconfigured backup schedules, index policies, pools, label templates, directives, and notifications.  A brief description of each preconfigured setting is presented in this appendix.

Guidelines for Choosing a Configuration

There are several factors that can help determine the NetWorker server configuration that best suits your backup and recover needs.  The configuration consists of the hardware and software, including tape drives, jukeboxes, client systems, and network connections.

This section provides a few simple rules that you can use to guide your choices.  The information focuses on backup, since backup requires far more server capacity than recovery.  Keep in mind that these are guidelines; actual performance may vary.

The goal in selecting a configuration is to balance the different hardware and software limitations to achieve the overall data handling capabilities you require.  Start by looking at the limits of the major NetWorker configuration components:  tape drives, clients, network connection, jukeboxes, and the NetWorker server itself.

Tape Drives

Tape drives have a fixed maximum data transfer rate that they can handle.  Since NetWorker automatically spans multiple tapes, the total tape capacity is not as important as the data rate.  Refer to your hardware documentation to find out what the data transfer rate is for your drive.

NetWorker cannot back up faster than the data rate of your tape drive, but multiple tape drives can decrease backup time.

Clients

Different clients can generate data at different rates and, even within a single client, different types of files can generate different data rates.  For example, symbolic links require as much processing as large data files, but produce no data.  Consequently, the data rate produced by a backup of a single client can vary quite a bit.  It is a good idea to run several clients simultaneously to help smooth out fluctuations in the data transfer rate for each client.

Network

Ethernet has an upper limit on bandwidth of about 1 megabyte (MB) per second, but in practice, most networks can handle only about 500 KB per second between a set of clients and a single server.  Token ring has a lower maximum bandwidth (8 Mb per second) but a higher utilization ratio, so data transfer rates are approximately the same.  FDDI is faster; see Table B-1.

Table B-1. Network Speed Comparison

Network       Rate
Ethernet      500 KB/s
Token Ring    500 KB/s
FDDI          5 MB/s


Server

The server must be able to handle the load of network packets, data movement, and tape drives in order to achieve the rates listed above.  Most of the work on the server side is in data movement, context switching, and interrupt handling.  The performance of all of these functions improves as the CPU speed increases.  It takes approximately 20 MIPS to handle 500 KB/s of data, although this tapers off at high CPU speeds because bus bandwidth and other bottlenecks begin to affect the data movement.  Approximately 16 MB of memory is required per 500 KB/s of data rate handled by the server.

Table B-2 shows the relationship of server power to backup throughput.

Table B-2. Server Throughput Comparison

Server CPU/Memory    Rate
20 MIPS/16 MB        500 KB/s
50 MIPS/32 MB        1000 KB/s
100 MIPS/64 MB       2000 KB/s


Jukeboxes

Jukeboxes (autochangers) provide automatic loading and unloading of tapes or optical disks.  This assists the administrator in two different ways.  During nightly backups, NetWorker uses the jukebox to automatically switch to new media as the data is backed up.  During recovers, NetWorker uses the jukebox to load all of the media needed for the recovery without operator intervention.  Refer to the jukebox documentation for data transfer rates and maximum capacity.

To determine the capacity requirements of a jukebox for a scheduled unattended backup, pick a jukebox with a capacity large enough to handle the largest possible amount of backup data.  For example, a full backup of 60 GB requires an automatic tape loader (ATL) or Lago® jukebox.

To determine how much disk space the online indexes will require for quick recovers, do a rough calculation of the amount of data backed up in a single schedule period (for example, week, month, or quarter).  Employ the guidelines in Table B-3 to determine how much data is backed up with different levels of backup.

Table B-3. Amount of Backup Data at Different Levels

Level          % of Data Backed Up
Full           100%
Level 1-9      25%
Incremental    10%

For example, a monthly schedule that has one full on the first Sunday, a level 5 on the other Sundays, and incrementals on all remaining days will look like Table B-4.

Table B-4. Sample Monthly Data Backup

Level             % of Data Backed Up
1 Full            100%
26 Incremental    260%
4 Level 5         100%
Total             460%

This illustrates that over the course of a month, 460% of the total amount of data will be backed up.  For example, a total of 10 GB of client data backed up using this schedule would result in about 46 GB of data on tape per month.
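The arithmetic behind Table B-4 can be reproduced with a short shell sketch.  The function names monthly_percent and tape_gb are hypothetical helpers introduced here for illustration only; the percentages are the estimates from Table B-3, and awk is assumed to be available for the decimal arithmetic.

```shell
#!/bin/sh
# Sketch only: monthly_percent and tape_gb are illustrative helpers,
# not NetWorker commands.

# monthly_percent FULLS LEVELS INCREMENTALS
# Percentage of the client data written during one schedule period, using
# the Table B-3 estimates (full = 100%, level 1-9 = 25%, incremental = 10%).
monthly_percent() {
    echo $(( $1 * 100 + $2 * 25 + $3 * 10 ))
}

# tape_gb CLIENT_GB PERCENT
# Amount of data written to tape, in GB, for CLIENT_GB of client data.
tape_gb() {
    awk -v gb="$1" -v pct="$2" 'BEGIN { printf "%.1f\n", gb * pct / 100 }'
}

pct=$(monthly_percent 1 4 26)    # 1 full, 4 level 5, 26 incrementals
echo "Data backed up per month: ${pct}% of client data"
echo "Tape used for 10 GB of client data: $(tape_gb 10 "$pct") GB"
```

Changing the three counts passed to monthly_percent lets you compare other schedules before committing to one.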

In this example, different jukeboxes would provide the amount of online data indicated in Table B-5.

Table B-5. Capacity of Different Autochangers

Jukebox       Capacity    Months of Data
EXB-10e       50 GB       1
ATL/Lago      270 GB      5
EXB-120CHS    580 GB      12
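The "Months of Data" column appears to be the jukebox capacity divided by the 46 GB written per month in the example, with any partial month dropped.  The sketch below checks that reading; months_online is a hypothetical helper name, and the rounding rule is an assumption inferred from the table values.

```shell
#!/bin/sh
# months_online CAPACITY_GB MONTHLY_GB  (hypothetical helper)
# Whole months of backups that fit in the jukebox; shell integer division
# drops any partial month.
months_online() {
    echo $(( $1 / $2 ))
}

for entry in "EXB-10e 50" "ATL/Lago 270" "EXB-120CHS 580"; do
    set -- $entry                 # split the entry into name and capacity
    echo "$1: $(months_online "$2" 46) months"
done
```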


Guidelines

Using the previous data transfer rate and capacity calculations, you may consider the following guidelines for configuring your network of servers and clients:

  1. Assign 4 simultaneous clients per network, by setting the parallelism to 4 (if the clients are PCs, you may assign more).

  2. Use one Exabyte 8500 or Sony 5200 per network, or two Exabyte 8200s.

  3. Assign a NetWorker server with approximately 20-30 MIPS and 16 MB of memory per network.

Configuration Examples

This section includes two examples of NetWorker configuration to illustrate the reasoning behind selecting the components.

Example 1

Site A has approximately 30 GB of data on two networks of 50 clients and wants to complete a full backup of all its data in one night (12 hours).  The following equation calculates the required data transfer rate to achieve this goal:

30000 MB ÷ 12 hours = 2500 MB/hour = 694 KB/second
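The same calculation can be scripted so that you can try other data sizes and backup windows.  This is a minimal sketch: required_rate_kbs is a hypothetical helper name, and it assumes 1 MB = 1000 KB, the convention the equation above uses.

```shell
#!/bin/sh
# required_rate_kbs DATA_MB HOURS  (hypothetical helper)
# Sustained rate in KB/s needed to move DATA_MB within HOURS.
# Assumes 1 MB = 1000 KB, as in the Site A equation.
required_rate_kbs() {
    awk -v mb="$1" -v hours="$2" 'BEGIN { printf "%.0f\n", mb * 1000 / (hours * 3600) }'
}

echo "Site A needs about $(required_rate_kbs 30000 12) KB/s"
```

Because the result exceeds the 500 KB/s that one network typically sustains, the data has to flow over both networks, which is why each suggested configuration below provides two network interfaces in total.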

To back up the data with a single NetWorker server, the following configuration is suggested:

  • NetWorker on a 30-40 MIPS CPU system with 24-32 MB of memory

  • two Exabyte 8500 tape drives connected to the NetWorker server

  • an ATL (or Lago) jukebox with two Exabyte 8500 tape drives connected to the NetWorker server

  • two network interfaces

  • one NetWorker Autochanger Software Module for the ATL (or Lago)

To back up the data with two NetWorker servers, the following configuration is suggested:

  • NetWorker on two 20 MIPS CPU systems with 16 MB of memory

  • an EXB-10e jukebox with an Exabyte 8500 tape drive connected to each server

  • one network interface for each server

  • two NetWorker Autochanger Software Modules for the EXB-10e

Example 2

Site B has 50 GB of data on a single network with 80 clients and wants to complete its backups in a single night (12 hours).  The full backups for the clients must be staggered because the network limits the data transfer rate to 500 KB/sec.  First, calculate the maximum amount of data the network can carry in one night:

500 KB/sec × 12 hours = 21.6 GB/night

By staggering their full backups into three nights instead of one, using three different backup schedules, they reduce the load of the nightly backup data to about 20 GB each night, as shown in Table B-6.

Table B-6. Staggered Backup Schedules Reduce Load

Full:     50 GB ÷ 3 = 16.7 GB
Incr:     (50 GB - 16.7 GB) × 0.1 = 3.3 GB
Total:    20 GB ÷ 12 hrs = 463 KB/s
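The staggering calculation generalizes to any number of nights.  The sketch below is illustrative: nightly_load is a hypothetical helper name, the 10% incremental estimate comes from Table B-3, and 1 GB is taken as 1,000,000 KB.

```shell
#!/bin/sh
# nightly_load TOTAL_GB NIGHTS HOURS  (hypothetical helper)
# Prints the nightly full and incremental volumes and the rate needed
# to finish within HOURS.
nightly_load() {
    awk -v total="$1" -v nights="$2" -v hours="$3" 'BEGIN {
        full = total / nights                  # fulls run on any one night
        incr = (total - full) * 0.10           # other clients run incrementals
        sum  = full + incr
        rate = sum * 1000000 / (hours * 3600)  # 1 GB taken as 1,000,000 KB
        printf "full=%.1f GB incr=%.1f GB total=%.1f GB rate=%.0f KB/s\n", full, incr, sum, rate
    }'
}

nightly_load 50 1 12    # everything in one night
nightly_load 50 3 12    # fulls staggered over three nights
```

Comparing the two calls shows why Site B staggers its fulls: the one-night rate is far above the 500 KB/sec network limit, while the three-night rate falls just under it.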

To back up the data with a single NetWorker server, the following configuration is suggested:

  • NetWorker on one 20 MIPS CPU system with 16 MB of memory

  • an EXB-10e jukebox with an Exabyte 8500 tape drive connected to the server

  • one network interface

  • one NetWorker Autochanger Software Module for the EXB-10e

Measuring Performance

To measure the performance of your NetWorker environment, you must consider both the server system and the client system.

The factors to consider for the server system are the speed of the tape drive, the network speed, and the CPU speed.  Factors to consider for the client system are filesystem traversing, generation of data, data on multiple disks, and CPU speed.

Server Performance

This section provides examples of how to measure the performance of the server.

Tape Drive Speed

Most tape drives have a step function in their data rate.  With most devices, NetWorker uses 32 KB per record.

To measure tape speed, follow these steps:

  1. Create a large file (at least 20 MB) with non-zero data.  For example:

    # cat /vmunix /vmunix /vmunix /vmunix /vmunix > big 
    

  2. Use the dd command to write the large file to tape four times and time the results:

    # /bin/time dd if=big of=/dev/nrst8 bs=32k conv=sync
    # /bin/time dd if=big of=/dev/nrst8 bs=32k conv=sync
    # /bin/time dd if=big of=/dev/nrst8 bs=32k conv=sync
    # /bin/time dd if=big of=/dev/nrst8 bs=32k conv=sync
    

  3. Divide the file's size by the average of the last three real times.  The resulting number gives you a rate for the tape speed.  For example:

    20190 -rw-rw-r-- 1 root 20675420 Jan 7 11:04 big
    95.2 real 13.0 user 11.9 sys
    78.2 real 12.9 user 12.7 sys
    78.0 real 12.8 user 12.5 sys
    76.8 real 13.0 user 12.4 sys
    

    Average: 77.67 (last three);  Rate: 20190 KB ÷ 77.67 sec = 260 KB/sec
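If you repeat the measurement often, the averaging step can be scripted.  The following is a minimal sketch, not part of NetWorker: tape_rate is a hypothetical helper that takes the file size in KB followed by the real times you recorded, discards the first timing as the procedure above does, and divides the size by the average of the rest.

```shell
#!/bin/sh
# tape_rate SIZE_KB TIME1 TIME2 ...  (hypothetical helper)
# Discards the first timing, averages the remaining ones, and prints
# SIZE_KB divided by that average, in KB/s.
tape_rate() {
    size_kb=$1
    shift 2                      # drop SIZE_KB and the first timing
    echo "$@" | awk -v kb="$size_kb" '{
        for (i = 1; i <= NF; i++) sum += $i
        printf "%.0f\n", kb / (sum / NF)
    }'
}

# Values from the example above
tape_rate 20190 95.2 78.2 78.0 76.8
```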

Network Speed

NetWorker for UNIX uses TCP and RPC/XDR as network communication protocols.  To measure the network speed, follow these steps:

  1. Create a large file (as in the tape speed measurement example) on a fast client.

  2. Use the rcp command to copy the file from the client to the server and time the result:

    # /bin/time rcp big server:/dev/null 
    

  3. To find the network speed, divide the size of the file in KB by the real time.  For example:

    20190 -rw-rw-r-- 1 root 20675420 Jan 7 11:04 big
    38.2 real 0.2 user 30.7 sys
    

    Rate: 20190 KB ÷ 38.2 sec = 529 KB/sec

The most important factor affecting network speed is network errors.  To determine the input error rate, the output error rate, and the collision rate, use the netstat -i command.  If the input or output error rate is above 0.5%, or the collision rate is above 5%, network errors are slowing down the network speed.
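The thresholds above can be checked with a small script once you have read the counters from the netstat -i output.  This sketch makes several assumptions: net_health is a hypothetical helper, the counters are passed by hand because the column layout of netstat -i differs between systems, the error rates are taken as errors divided by packets, and the collision rate is taken as collisions divided by output packets.  The sample counters are made up.

```shell
#!/bin/sh
# net_health IPKTS IERRS OPKTS OERRS COLLIS  (hypothetical helper)
# Arguments are the cumulative counters reported for one interface.
# Assumption: error rates are errors / packets and the collision rate is
# collisions / output packets, all expressed as percentages.
net_health() {
    awk -v ipkts="$1" -v ierrs="$2" -v opkts="$3" -v oerrs="$4" -v collis="$5" 'BEGIN {
        in_pct   = 100 * ierrs / ipkts
        out_pct  = 100 * oerrs / opkts
        coll_pct = 100 * collis / opkts
        printf "input errors %.2f%%, output errors %.2f%%, collisions %.2f%%\n", in_pct, out_pct, coll_pct
        if (in_pct > 0.5 || out_pct > 0.5 || coll_pct > 5) {
            print "network errors are likely slowing backups"
        } else {
            print "error and collision rates are within the guidelines"
        }
    }'
}

# Sample counters (made up for illustration)
net_health 1000000 200 800000 80 56000
```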

CPU Speed

The CPU of a server limits the following:

  • the total data throughput to tape

  • the interrupts per second for network data

  • context switches per second between processes

The best measure is the MIPS rating for the server (a larger MIPS or SPEC rating means a faster machine).

Memory

The memory on the server limits the amount of data buffered between the NetWorker save command, agent daemon, and media management daemon.

Client Performance

This section provides examples of measuring the performance of a NetWorker client.

Filesystem Traversing

To measure the filesystem traversing speed, follow these steps:

  1. Time the uasm command with the -bi option.  For example:

    # /bin/time uasm -bi /usr 
    13931 records 2667396 header bytes 350849148 data bytes
    124.9 real 10.8 user 34.0 sys
    

  2. Divide the number of records by real time for rate per file.  For example:

    13931 records ÷ 124.9 sec = 111.5 files/sec

Data Generation Rate

To measure the rate at which a client generates data for a backup, follow these steps:

  1. Time the uasm command with the -si option and redirect the output to /dev/null.  For example:

    # /bin/time uasm -si /usr > /dev/null 
    

  2. Divide the number of data bytes reported by the uasm -bi command (from the filesystem traversing test), converted to KB, by the real time reported for the uasm -si command.  For example:

    342626 KB ÷ 1199 sec = 286 KB/sec
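Both client rates can be derived in one step from the numbers the two uasm runs report.  The sketch below is illustrative only: client_rates is a hypothetical helper, and it assumes 1 KB = 1024 bytes, which matches the 342626 KB figure used above.

```shell
#!/bin/sh
# client_rates RECORDS DATA_BYTES BI_REAL_SEC SI_REAL_SEC  (hypothetical helper)
# RECORDS, DATA_BYTES, and BI_REAL_SEC come from the "uasm -bi" run;
# SI_REAL_SEC is the real time of the "uasm -si" run.
client_rates() {
    awk -v rec="$1" -v bytes="$2" -v bi="$3" -v si="$4" 'BEGIN {
        printf "%.1f files/sec\n", rec / bi           # filesystem traversing rate
        printf "%.0f KB/sec\n", bytes / 1024 / si     # data generation rate
    }'
}

# Values from the examples above
client_rates 13931 350849148 124.9 1199
```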

Data on Multiple Disks

NetWorker automatically backs up multiple disks in parallel.

To measure parallel disk speeds, follow these steps:

  1. Use the df or du command to find two directories of approximately the same size located on different disks.

  2. Run the same uasm speed tests for filesystem traversing and data generation rate as for one disk, but run the tests simultaneously on the two directories.

  3. Add the data from each test (files/sec and KB/sec) to obtain a combined rate.

This rate reflects the performance of NetWorker backing up data on multiple disks.

CPU Speed

The CPU of a client limits the following:

  • the total data throughput to the network interface

  • interrupts per second for network data

  • context switches per second between processes

The best measure is the MIPS rating for the client (a larger rating means a faster machine).

NetWorker Backup Schedules

The NetWorker server backs up each client system across your network according to a backup schedule.  Create schedules in the Schedules window and assign individual clients to schedules in the Clients window.  Schedules can be very simple or very sophisticated, depending on the needs of your environment.  All clients can share the same schedule, or each client can have its own unique schedule.

This section discusses some of the considerations to keep in mind while determining which schedule best fits your situation.

Backup Levels

The schedule specifies the backup level at which NetWorker performs backups for a client on each day of a weekly or monthly period.  NetWorker offers eleven different backup levels:

  • Full—backs up all files, regardless of whether or not they have changed since the previous backup.  A full backup is equal to a level 0 backup.

  • Level 1 through level 9—back up files that have been modified since a previous full backup or a backup of a lower numbered level.  For example, a level 3 backs up all the files that have changed since the previous level 2, level 1, or full backup.

  • Incremental—backs up all files that have changed since the previous backup of any level.  An incremental backup is equal to a level 10.

You can also skip a backup on a given day.  You may want to schedule a “skip” backup on weekends or holidays when no one is available to load backup media.  If you have a jukebox, this option becomes less important, since a jukebox automatically loads and unloads backup media.

NetWorker's on-screen calendars present an easy method for setting up backups for each day of the month.  You can designate a schedule and repeat it over a weekly or monthly period.  For example, if you set up a full backup for one Friday, NetWorker automatically sets up a full backup for every Friday.  Or, you can override the regularly scheduled backup level for a specific day.

There is no “correct” way to set up a backup schedule for a particular client or network of clients.  The clients you need to back up probably vary considerably—some have a lot of critical data to back up, others may have a small amount of data that does not change very often.  Consider the situation for each client, weigh the benefits of different backup schedules, and then select the best schedule for each client.

Full Backups Versus Incremental Backups

If your site has a small number of files, you may choose to perform a full backup every day, or perhaps once a week.  This is a simple schedule to set up and execute, and it makes recovering from a disk crash easy—you simply need the last full backup volume.

However, full backups have drawbacks to consider:

  • Full backups take more time to complete than incremental backups.

  • If the full backup does not fit on a single piece of media, someone has to monitor the backup and change the media (unless you have a jukebox).

  • Full backups cause the online indexes to grow more rapidly than incremental or level backups.

You may decide to schedule a full backup at the beginning of the period and schedule incremental backups the rest of the period.  This schedule minimizes the amount of time that backups take, the size of the backups, and disk space required for the online indexes.

However, if you need to recover from a disk crash, you may need all the tapes used during the schedule, because the most current version of your files may be scattered across several different tapes.  Although NetWorker asks for each tape that it needs for the recovery by name, loading and unloading them can be time-consuming (unless you have a jukebox, or all the incremental backups fit on one tape).

Using Level Backups

You can use level 1 through level 9 backups to moderate between the two extremes described above.  Level 1 through level 9 backups allow you to set up a schedule for each client that balances your need for small, fast backups that do not take up too much index space and the need to recover quickly and easily from a disk crash.

A level backup serves as a checkpoint in your schedule since it collects into a single backup session all the files that have changed over many days or even weeks.  Without a level backup, these files would be spread across tapes from many different backup sessions.  As a result, a level backup can simplify and speed file recovery.

To illustrate the effect of level 1 to level 9 backups, consider two examples.  In the first example, a full backup takes place on the first day, followed by a level 9, level 8, level 7, and so on down to a level 1 backup over time.

A full backup followed by level 9 to level 1 is illustrated in Figure B-1.

Figure B-1. Full Backup Followed by Levels 9 to 1

The advantage of this schedule is that to recover from a disk crash, you only need two tapes:  the one with the full backup, and the one with the last level backup.  The disadvantage is that with each day, there are more changed files to back up, so the backups take longer to complete.

In the second example, the backup schedule also starts out with a full, but the level backups that follow are in reverse order:  starting with a level 1 on the first day following the full, and continuing up to a level 9 backup.  Each day, the backup backs up only the files that have changed on that day.

A full backup followed by level 1 to level 9 is illustrated in Figure B-2.

Figure B-2. Full Backup Followed by Levels 1 to 9

The advantage of this schedule is that each day's backup is small and completes quickly.  The disadvantage is that recovering from a disk crash would require the full backup tape and all of the level backup tapes up until the day of the disk crash.

Neither of these backup schedules is practical.  They simply illustrate how level backups work.  The real power of level backups comes into play when you combine multiple levels along with fulls and incrementals.

Typical Monthly Backup Schedule

Sites with even a few gigabytes of files to back up often choose a monthly schedule based on full, incremental, and level backups.  The example described in this section performs a full backup on the first day of each month, a level 5 backup on the 10th and 20th of the month, and incremental backups on all other days.

A monthly backup schedule minimizes the size of daily backups while also making it relatively easy to recover after a disk crash.  This schedule offers several advantages.

First, the level 5 backups simplify recovery.  Assume that a disaster strikes on the 24th of the month.  All the files that you need to recover an entire client system are located on tapes from just five backup sessions:

  • the incrementals from the 21st, 22nd, and 23rd

  • the level 5 backup from the 20th

  • the full backup at the beginning of the month
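The list of sessions above follows mechanically from the level rules, and the reasoning can be sketched in shell.  restore_chain is a hypothetical helper written for this illustration; it is not a NetWorker command and only models the level rules described in this appendix (full = level 0, incremental = level 10).

```shell
#!/bin/sh
# restore_chain LEVEL...  (hypothetical helper)
# LEVEL is full, incr, skip, or 1-9, listed oldest first, one entry per day.
# Prints the day numbers of the sessions needed to recover as of the last day.
restore_chain() {
    echo "$@" | awk '{
        need = 11                          # the newest backup is always needed
        out = ""
        for (i = NF; i >= 1; i--) {
            if ($i == "skip") continue
            if ($i == "full") lvl = 0
            else if ($i == "incr") lvl = 10
            else lvl = $i + 0
            # A session is needed if its level is lower than that of the
            # session depending on it; an incremental also depends on the
            # incremental that precedes it.
            if (lvl < need || (need == 10 && lvl == 10)) {
                out = i " " out
                need = lvl
                if (need == 0) break       # reached the full backup
            }
        }
        sub(/ $/, "", out)
        print out
    }'
}

# Build the schedule up to the 23rd: full on day 1, level 5 on days 10
# and 20, incrementals on every other day.
sched=""
day=1
while [ "$day" -le 23 ]; do
    case $day in
        1)     lvl=full ;;
        10|20) lvl=5 ;;
        *)     lvl=incr ;;
    esac
    sched="$sched $lvl"
    day=$((day + 1))
done
restore_chain $sched
```

For the schedule in this section the sketch prints days 1, 20, 21, 22, and 23.  Passing "full 9 8 7" or "full 1 2 3" instead reproduces the two-tape and all-tapes outcomes of the earlier level examples.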

Second, the incremental backups are relatively small and quick to execute, even for large network environments.  Several days of incrementals fit on a single tape.  This further simplifies recovery and also avoids the need to have someone change tapes each day.  Figure B-3 illustrates level 5 and incremental backups after a full.

Figure B-3. Full Backup Plus Incrementals and Level 5

Backups Take Time

The amount of time you have to complete a backup on any given day also influences the schedule that you decide to use. Because of flextime and around-the-world operations, many networks must be up and running for users from early in the morning until very late in the evening.  While NetWorker is able to back up live filesystems, most administrators want 100% of their network and systems capacity ready for users during work hours.

How many files can NetWorker back up in, for example, a four-hour backup window?  The answer is “it depends.”  The following suggestions help you make the most of the time available:

  • Select a backup server with enough CPU power, memory, and bus bandwidth so the backup server is not a bottleneck.

  • Leave NetWorker's parallelism feature turned on.  This feature causes multiple client systems to send their files to the backup server in parallel.  This keeps a stream of files ready for the tape drive, so that it does not start and stop.  Find a parallelism that keeps the tape drive streaming without overloading the CPU.

  • Experiment with compressing files on the client systems to reduce the size of the data sent across the network and written to tape.  Compression may speed your backup as long as the client systems are still able to supply files to the backup server fast enough to keep the tape drive streaming.

  • Take advantage of NetWorker's ability to skip specified files during the backup.  For example, you could choose to skip core files.

  • Add a second backup device to your server.  With NetWorker Server Edition and Network Edition, you can simultaneously back up to more than one device.

If your backup server can drive a single 8 mm tape drive at an average of 400 KB/second (the drive's maximum speed is 500 KB/second, but some time is invariably lost to loading and rewinding tapes), you can back up a maximum of 5.76 GB in four hours.  If you have more data than this to back up, you may need to limit full backups to weekends and holidays, when users will not be affected.
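To size a backup window for your own drive, the same multiplication can be scripted.  This is a minimal sketch; window_capacity_gb is a hypothetical helper name, and 1 GB is taken as 1,000,000 KB, which matches the figures in this appendix.

```shell
#!/bin/sh
# window_capacity_gb RATE_KBS HOURS  (hypothetical helper)
# Maximum data, in GB, that can be written at RATE_KBS during HOURS.
# 1 GB is taken as 1,000,000 KB.
window_capacity_gb() {
    awk -v rate="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", rate * hours * 3600 / 1000000 }'
}

echo "Four-hour window at 400 KB/s: $(window_capacity_gb 400 4) GB"
echo "Twelve-hour window at 500 KB/s: $(window_capacity_gb 500 12) GB"
```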

Unless you have a jukebox and NetWorker's optional Autochanger Software Module, you also have to schedule backups based on someone being available to load and unload media.  Many administrators find that an incremental backup of their network fits onto a single 4 mm or 8 mm tape, but they must schedule multi-tape full and level backups for specific nights or weekends when an operator is on duty to load additional tapes.

If an operator will not be available over a holiday weekend, you can set an override in the schedule to skip the backup on that day.  You may also want to override the schedule just before a holiday with a full backup, for added peace of mind.

Using Compression During Backup

Software compression involves significant CPU usage.  If you have a fast client, and performance monitors such as gr_osview indicate that CPU bandwidth remains, use compressasm to compress data before it goes over the wire to the server.

To use compressasm, open the Clients window and select Unix with compression directives in the Directive field.  This is the default for IRIX NetWorker.

Software compression reduces network traffic, because compressasm typically achieves 2:1 compression (actual results vary with the data).  There is no harm in using compressasm in conjunction with a compressing tape drive, but the tape drive will probably not achieve further compression on the data.

If you are deciding between compressasm and a compressing drive solely to increase the amount of data on a tape, obtain the compressing drive—the hardware on the drive generally compresses faster than NetWorker and places no load on the client CPU for data compression.

Follow these guidelines when compressing data during backup:

  • Use compressasm to minimize network bandwidth if you have the available CPU power.

  • Use compressing drives to get more data on a tape.

  • Any generic compressing algorithm typically achieves 2:1 compression, and tape drives are no exception.  Sometimes you get more, sometimes less.

  • Compressing already compressed data, such as GIF or JPEG images, has no effect (and might even expand the data).

  • Do not use compressasm with a compressing drive if there are no networked clients.  With no network traffic to reduce, software compression only adds CPU load.

Staggering the Backup Schedules

Networks with a large number of files can take a very long time and require a lot of loading and unloading of tapes to complete a full backup.  There may not even be time in a night or an entire weekend to complete a full backup of all the systems across a very large network.  An easy way to handle this problem is to stagger the clients' backup schedules.  Rather than have every client system perform a full backup on Monday and incrementals the rest of the week, for example, you can schedule some clients to perform a full backup on Tuesday and others on Wednesday.

NetWorker goes one step further to smooth the backup load for very large client systems.  With NetWorker you can assign a separate backup schedule to each filesystem.  Each filesystem, in essence, is treated as if it were a completely separate client.

Convenience Versus Security

You may leave the same backup volume mounted in the server's backup device throughout a week or month, and when it becomes full, replace it with a new labeled backup volume.  NetWorker tracks all the backups, no matter what day of the week or month, or what part of the backup schedule cycle is in effect.  The same backup volume may contain full, level [1-9], or incremental backups, and it makes no difference to NetWorker.  For you, the benefits are fewer backup volumes to manage and the ability to recover from a disk crash with a minimum number of backup volumes.

Some sites prefer to segregate the full backups from the level [1-9] and incremental ones.  The full backups protect the network from a catastrophic disk loss, and you want to guarantee their integrity.  There is always a very small risk that if you leave the backup volume with the full backup sitting in the backup device, something could happen to it.

If a backup volume with incremental backups is ruined, users may lose one day of work.  If the backup volume with the full backup is destroyed, users may lose all the work done since the last full backup.  Therefore, some administrators prefer to remove the backup volume used for a full backup, put it in a safe place, and mount another backup volume for the following level [1-9] and incremental backups.  The trade-off is that you may need a few more backup volumes to recover from a disk crash—the one with the last full, and the other volumes that contain the most recent level [1-9] and incremental backups.

NetWorker Browse and Retention Policies

NetWorker maintains online indexes of all the files backed up for each client and an index of the files stored on each piece of media.  NetWorker lets you set policies that automatically control how long the information is retained in these online indexes.  This section explains NetWorker's browse and retention policies and the trade-off between providing faster, easier recovery for your users and conserving disk space.

Browse Policy

One of NetWorker's popular features is the ability to browse many versions of a file that have been backed up over time and to choose which one to recover.  However, each version of a file that NetWorker tracks takes up space in the client online index (about 220 bytes each).  Since disk space is limited, you need to establish a policy of how far back in time you will keep information about backed-up files in the indexes.
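A rough index size estimate follows from the 220 bytes per entry mentioned above.  The sketch below is an approximation under stated assumptions: index_mb is a hypothetical helper, the file count of 100,000 is made up, and it assumes that the number of file versions recorded during the browse period tracks the data percentages of Table B-4 (460% per month), which may not hold if a few large files account for most of the changes.

```shell
#!/bin/sh
# index_mb FILES PERCENT  (hypothetical helper)
# Rough client file index size in MB, at one 220-byte entry per backed-up
# file version.  PERCENT is the share of files backed up during the browse
# period (for example, 460 for one month of the Table B-4 schedule).
index_mb() {
    awk -v files="$1" -v pct="$2" 'BEGIN { printf "%.1f\n", files * pct / 100 * 220 / 1000000 }'
}

# Hypothetical client with 100,000 files and a one-month browse policy
echo "Estimated index size: $(index_mb 100000 460) MB"
```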

The browse policy that you select specifies how long the entries for your files remain in the file indexes.  A browse policy can be any number of days, weeks, months, or years.  NetWorker automatically deletes entries older than the browse policy time and frees up disk space.  The browse policy, like the backup schedule, can be different for each client.

How Browse Policies Work

To recover a complete directory or filesystem, you often need to recover some files from incremental and level backups as well as from a full.  The incremental backup is dependent on the level backups and, in turn, on the full.  NetWorker will not delete the entries from any backups on which other backups depend.  As a result, you may find that entries are deleted later than you expect.

In Figure B-4 the browse policy is set to one week, which happens to equal one complete backup cycle.

Figure B-4. Backups and the Browse Period

NetWorker will not remove the first full backup from the online file index until all the incremental and level-5 backups that depend on it have expired.  As a result, the full backup actually stays in the online index for a period of time equal to the browse policy plus one full backup cycle.

The first full backup will not be removed from the online index in exactly one week, however, because incrementals and a level 5 backup that depend on the full have not yet expired.  Each incremental backup will be removed from the online index one week from the time it was completed.  The level 5 backup will be removed when the last incremental that depends on it is removed, and the full backup will be removed at that same time.

The rule to remember is that a full backup actually remains in the online index for a period of time equal to the browse policy plus one complete backup cycle.  A backup cycle is measured from one full backup to the next full backup.  Also note that the browse policy is set for an entire client (or filesystem, if the filesystems are separately scheduled).  Consequently, whatever policy you have for keeping full backups online and browsable in the file index you must also use for all incremental and level backups.  With NetWorker you manage backup cycles (the period from one full backup to the next); you do not independently manage different levels of backups.

Reclaiming Disk Space

NetWorker automatically reclaims disk space that is freed up when entries are deleted from the online file indexes.  However, the space is not returned immediately to your system.  Reclaiming the space takes time, processing power, and swap space, so doing it constantly on your backup server would be inefficient.  Instead, NetWorker first reuses this space to store information about new files that are backed up.  When the file index for a client reaches a point where less than 50% of its space is being used by files that have not reached the end of their browse period, NetWorker automatically invokes a process that returns the space to your system.

You may also reclaim disk space at any time by using the Reclaim space button in the Indexes window.

Recovering Files Removed From the Index

You can recover files whose entries have been removed from the online index because they have passed the browse policy period, as long as the files are still stored on a backup volume.  However, the recover process is not as convenient as when the entries are still in the online index.

If you do not want to rebuild the index, you can still recover the data, because the save sets you need are still listed in the media index.  If you know which save set contains the file you want, you can use the save set recover feature to recover the entire save set or selected directories and files.  The save set recover feature is most useful for recovering from full backups and is limited to root and users belonging to the group operator.

If you want to rebuild the file index so that you can browse for the file you lost, here is the basic procedure to follow:

  1. Use the Volume Management window to find the name of the backup volume that contains the save set.

  2. Use the mminfo command to determine the save set ID.  Use this syntax:

    mminfo -v -s server -c client -N saveset volume_name 
    

  3. Rebuild the file index entries for the save set using the scanner -i -s save_set_id# command at the system prompt.  Enter the save set ID number determined above for save_set_id#.  Rebuilding the file index using the scanner command may take some time.

  4. Use the NetWorker Recover window to identify the needed file(s) and initiate the recovery.
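If you script steps 2 and 3, a small helper can assemble the two command lines.  The sketch below only builds the argument lists using the syntax exactly as it appears in the steps above; it does not run anything.  The server, client, save set, and volume names and the save set ID are placeholders, and your version of scanner may need additional arguments, so check its reference page before use.

```python
import shlex

def mminfo_command(server, client, saveset, volume_name):
    """Step 2: list save set details (including the save set ID) for a volume."""
    return ["mminfo", "-v", "-s", server, "-c", client, "-N", saveset, volume_name]

def scanner_command(save_set_id):
    """Step 3: rebuild the file index entries for one save set."""
    return ["scanner", "-i", "-s", str(save_set_id)]

# Placeholder values for illustration only.
print(shlex.join(mminfo_command("space", "atlas", "/usr", "space.001")))
print(shlex.join(scanner_command(12345)))
```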

Recovery is considerably easier if the file information is still in the NetWorker online index.  That is why you want to set a browse policy long enough to cover most recover requests.

Media Retention Policy

Your need to conserve disk space may lead you to establish a short browse period.  NetWorker's media retention policies complement the browse policy by letting you specify a longer period of time during which files can still be recovered, although with more difficulty.  NetWorker uses the retention policy to automatically recycle backup volumes.

NetWorker maintains a file index for each client system and a much smaller media index that tracks the save sets stored on each backup volume.  When NetWorker removes entries that are older than the specified browse time from a file index, it leaves the corresponding save set information in the media index.  The retention policy controls how long this information is kept and, as a result, how long a backup volume is kept before it can be overwritten with new backups.

As with the backup schedule and browse policy, you set the retention policy for each NetWorker client.  Different clients can have different policies.  The retention period can be any number of days, weeks, months, or years as long as the retention period is equal to or longer than the browse policy.

A NetWorker backup volume can contain save sets for many different clients over many days.  As the retention period is reached for each save set, information about that save set is removed from the media index.  When the retention period for every save set on a backup volume is reached, NetWorker marks the volume “recyclable.”  This volume can then be reused for backups.  At the time that the volume is actually reused, the old files are overwritten and can no longer be recovered.
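The recycling rule can be expressed as a single test over the save sets on a volume.  The sketch below assumes you already know the date on which each save set's retention period ends; the function name and the dates are hypothetical.

```python
from datetime import date

def is_recyclable(retention_end_dates, today):
    """A volume is recyclable only when every save set on it has
    reached the end of its retention period."""
    return all(end <= today for end in retention_end_dates)

volume = [date(2024, 1, 10), date(2024, 1, 17), date(2024, 2, 1)]
print(is_recyclable(volume, today=date(2024, 1, 20)))  # False: one save set is still retained
print(is_recyclable(volume, today=date(2024, 2, 1)))   # True
```

A single long-lived save set keeps the whole volume from being reused, which is the source of the "unavailable" tape discussed later in this appendix.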

The NetWorker browse and retention policies combine to give you a hierarchy of recovery capability while keeping the disk space needed for the online indexes to a minimum.  Recovering a file is quick and easy using the NetWorker Recover window until the browse policy time is reached and the file information is removed from the file index.  Then you can use save set recover or the more tedious index rebuilding process described above to recover your files until the retention policy time is reached and the backup volume is recycled.
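The hierarchy can be summarized as a lookup on the age of the backup.  This sketch is a simplification: it ignores the extra backup cycle that full backups remain in the index, and the function name and return strings are illustrative only.

```python
def recovery_method(age_days, browse_days, retention_days):
    """Pick the recovery route for a backup of the given age."""
    if retention_days < browse_days:
        raise ValueError("retention period must be at least as long as the browse period")
    if age_days <= browse_days:
        return "Recover window"
    if age_days <= retention_days:
        return "save set recover, or rebuild the index with scanner"
    return "volume may be recycled; recovery not assured"

# Assumed policies: browse = 7 days, retention = 14 days.
for age in (3, 10, 30):
    print(age, recovery_method(age, browse_days=7, retention_days=14))
```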

Setting Policies When Using a Jukebox

The NetWorker Autochanger Software Module automates your backup and recover activity.  The capacity of the jukebox, the backup schedule you select, and the browse and retention policies you use determine whether you can walk away from backups for a week, a month, or even longer.

Jukebox Capacity

A jukebox is most useful if it has at least enough capacity to complete one entire backup cycle without intervention.  This allows backups to run while you are out ill, on vacation, or busy with a user emergency, and helps minimize the time that you spend on backup (particularly if the backup server and jukebox are located some distance away).  At the end of the cycle, you can move the used backup volumes offsite and load fresh tapes into the jukebox.

A jukebox with the capacity for one entire backup cycle also speeds file recovery.  If a user accidentally deletes a file, there is at least one version (more if the user has recently edited the file) in the jukebox.  With NetWorker, the user can quickly identify the lost file and initiate the recovery.  The jukebox loads the needed tape and NetWorker completes the recovery without your help.  Depending on the speed of the jukebox and the device used, the file should be recovered very quickly.

You need to design a schedule that fits the capacity of your jukebox.  Start with your ideal schedule and then consider these suggestions to reduce the size of a complete backup cycle:

  • Use more incremental backups and fewer level 1-9 backups.

  • Back up systems with less critical files less often—perhaps only once a week.

  • Use NetWorker's directives to skip files during the backup, for example, core files.

  • Shorten the length of the backup cycle.

Although your jukebox may have enough capacity for only one backup cycle, you can still set the browse and retention policies for a longer period.  If a user tries to recover a file stored on a volume that is not in the jukebox, NetWorker will prompt you to load that volume.  You can use the Location field in the Volumes window to keep track of volumes.  Users can refer to this information when deciding which version of a file to recover and choose the one stored on a tape that is located in the jukebox.

Choosing the Jukebox Capacity

With just enough capacity in the jukebox for a single backup cycle, you must reload tapes at the end of each cycle.  With more capacity you can set the schedule and the browse and retention policies so that the jukebox runs unattended for a long period of time.  The jukebox automatically recycles tapes containing save sets that have passed their browse and retention times to continue backups virtually indefinitely.

Suppose you established a backup schedule for your network of systems that takes one week to complete (for example, you schedule a full backup once a week) and consumes a total of 12 GB of tape during the week.  Assume that you are using a 50 GB Exabyte-10e jukebox.  Each of the following combinations of browse and retention times will allow the jukebox to operate without intervention for an extended period of time:

  • browse policy = 1 week, retention policy = 1 week

  • browse policy = 1 week, retention policy = 2 weeks

  • browse policy = 2 weeks, retention policy = 2 weeks

Each of these sets of policies has its advantages.  With a browse and retention policy of just one week, your online indexes will be kept small.  With a browse and retention policy of two weeks, your indexes will be larger but your users will have more versions to select from when they need to recover a file.  A browse policy of one week and a retention policy of two weeks keeps your indexes small and allows you to recover older files, although with a great deal more effort than if those files were still browsable in the index.

It might seem that with a browse policy of four weeks, 4 × 12 GB = 48 GB of backups would fit in the 50 GB jukebox.  In practice this does not work, for two reasons.  First, a full backup actually remains in the online index for a period of time equal to the browse policy plus one complete backup cycle.  Thus with a browse policy of four weeks, essentially five weeks of backups (5 × 12 GB = 60 GB) would need to fit into the jukebox.

Second, since NetWorker cannot recycle a tape until all the save sets on that tape have expired, there is often some amount of “unavailable” tape in the jukebox.

Suppose that one year later the number of files that you have has grown so that the one-week backup cycle needs 18 GB of tape capacity.  A browse policy of one week and a retention policy of one week still allow the jukebox to run continuously unattended.
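You can check combinations like these with a few lines of arithmetic.  The sketch below assumes that the tape held at any one time is the longer of the two policies plus one extra backup cycle; the text states this rule for the browse policy, and applying it to the retention policy as well is an assumption.  It does not account for "unavailable" tape.

```python
def tape_needed_gb(cycle_gb, browse_cycles, retention_cycles):
    """Tape in use at once: the longer policy, plus one complete backup cycle."""
    return cycle_gb * (max(browse_cycles, retention_cycles) + 1)

def fits(cycle_gb, browse_cycles, retention_cycles, capacity_gb):
    return tape_needed_gb(cycle_gb, browse_cycles, retention_cycles) <= capacity_gb

CAPACITY_GB = 50  # the 50 GB jukebox from the example
print(fits(12, 1, 1, CAPACITY_GB))  # True  (24 GB)
print(fits(12, 1, 2, CAPACITY_GB))  # True  (36 GB)
print(fits(12, 4, 4, CAPACITY_GB))  # False (60 GB)
print(fits(18, 1, 1, CAPACITY_GB))  # True  (36 GB)
```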

If you want to keep files online in the jukebox longer, then you can use the methods listed earlier to reduce the size of the backup cycle.  As an alternative, you can stretch out the backup cycle.  For example, you can perform full backups every other week rather than every week.  This should not greatly increase the size of a backup cycle and gives you more versions of files online in the jukebox.

Choosing a Jukebox

In the ideal situation, you first design the best backup schedule and set of policies for your environment and then determine the jukebox size that you need to purchase.

Assume that you have a network of systems with a total of 25 GB of files to back up and that you have selected a schedule that includes a full backup at the beginning of each month, a level 5 on the 10th and 20th of each month, and incrementals on all other days.  The calculations in Table B-7 illustrate that one complete backup cycle should be about 64 GB in size.

Table B-7. Storage Requirements for One Backup Cycle

    Level          Size          Frequency            Total
    Full           25 GB         × 1 time/month       = 25 GB
    Level 5        2.5 GB[a]     × 2 times/month      = 5 GB
    Incremental    1.25 GB       × 27 times/month     = 34 GB
                                 Grand Total          = 64 GB

[a] These size percentages are based on Legato experience over the past three years.
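The figures in Table B-7 can be reproduced with a short calculation, which is handy when you substitute your own sizes.  The sizes and frequencies below are the ones in the table; the variable and function names are illustrative.

```python
# (level, size of one backup in GB, backups per monthly cycle)
SCHEDULE = [
    ("Full", 25.0, 1),
    ("Level 5", 2.5, 2),
    ("Incremental", 1.25, 27),
]

def cycle_size_gb(schedule):
    """Total tape consumed by one complete backup cycle."""
    return sum(size_gb * count for _, size_gb, count in schedule)

total = cycle_size_gb(SCHEDULE)
print(total)         # 63.75
print(round(total))  # 64
```

The table rounds the incremental subtotal (33.75 GB) up to 34 GB, which gives the same grand total of about 64 GB.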

To determine the size of the jukebox that you need, start by estimating the size of a complete backup cycle.  Now assume that you have decided on a browse policy of two months for all the client systems and a retention policy of six months.  These policies let your users quickly recover any file and any version of a file that they had during the past two months, and with some effort, you can recover files that they owned at any time during the past six months.  With one 64 GB backup cycle per month, you will need 6 months × 64 GB = 384 GB of capacity.

In practice you need a little extra jukebox capacity since there will be a small number of “unavailable” volumes; NetWorker must wait to recycle a tape until after all the save sets on that tape have expired.

Finally, remember to plan for growth in the number of your files. While sites differ in the rate at which their files are growing, a rule of thumb is to purchase a jukebox with about 50% more capacity than your current requirement.
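Putting the sizing steps together gives a rough purchase target.  The sketch below uses the 64 GB cycle, the six-month retention policy, and the 50% growth rule of thumb from this section; it does not model the extra capacity needed for "unavailable" volumes, and the function name is hypothetical.

```python
def jukebox_capacity_gb(cycle_gb, retention_cycles, growth=0.5):
    """Return (current requirement, requirement with growth allowance) in GB."""
    current = cycle_gb * retention_cycles
    return current, current * (1 + growth)

current, with_growth = jukebox_capacity_gb(cycle_gb=64, retention_cycles=6)
print(current)      # 384
print(with_growth)  # 576.0
```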

Preconfigured Selections

NetWorker provides preconfigured settings for you to use so you can immediately start backing up your systems.  This section offers an explanation of the different preconfigured settings.

Preconfigured Backup Schedules

For your convenience, NetWorker is shipped with several preconfigured backup schedules.  If these schedules fit your backup requirements, you can use them “out of the box,” or you can create new ones to accommodate your site-specific needs.

This section explains the logic behind each schedule.  After understanding how they work, you may want to use them as examples to set up your own schedules.

The most efficient way to protect the systems from file loss and maintain control over the number of backup volumes is to follow full backups with level [1-9] and incremental backups.

Each time you use the Schedules window to create a new weekly backup schedule, the preconfigured default schedule shown in Figure B-5 appears in the calendar as your starting point.

You are not allowed to change the name of an existing schedule.  For example, if you want to change the schedule “Full Every Friday” to “Full Every Monday,” you must delete the “Full Every Friday” schedule and create a “Full Every Monday” schedule.  You cannot change the existing schedule to complete full backups on Mondays instead of Fridays and then edit its name.

Figure B-5. Schedules Window

  • Default—this is the only schedule you may not delete.  It is a weekly schedule and completes a full backup every Sunday, followed by incremental backups all other days of the week.

    This schedule is convenient if you want to premount the backup volume Friday night before you go home for the weekend.  On Monday mornings, check your messages from NetWorker to make sure the backup completed.  If you want to separate the full backups from the incrementals, remove the backup volume with the full backup and mount another one for the incremental backups.

  • Full Every Friday -– this weekly schedule completes a full backup every Friday, followed by incremental backups the other days of the week. 

    This schedule is identical to the Default schedule, except that instead of completing a full backup on Sundays, the full backup takes place on Fridays.  Depending upon how much data changes on the network, the daily incremental backups might all fit onto one backup volume.  In that case, if you had to recover from a disk crash, you would need only two backup volumes: the one with the last full backup, and the one with the incremental backups.

  • Full on 1st Friday of Month—this monthly schedule completes a full backup on the first Friday of the month (not the first calendar day of the month).  Incremental backups take place on all the other days.

    The advantage of this schedule is that you complete a full backup only once a month.  If you use this schedule, it would be a good idea to store the backup volume with the full backup in a safe place, and use other backup volumes for the incremental backups.  It would also be a good idea to change backup volumes every few days for the incremental backups.  If you allow all the incremental backups to be stored on one backup volume, and it is destroyed near the end of the month, you may not be able to fully recover from a disk crash.

    Whenever you create a monthly schedule for a full backup on a weekday instead of a calendar day (like Friday, in this example), you must set the overrides in each month.  (Notice the “f*” in the first Friday of each month.)  This is because the first occurrence of a given weekday (the first Friday, for example) may fall on any calendar day from 1 to 7.


    Note: The Overrides you select for individual days do not carry over from one year to the next.  Preconfigured schedules, however, do maintain the overrides for years into the future.


  • Full on 1st of Month -– this monthly schedule completes a full backup on the first calendar day of the month.  On the other days of the month, an incremental backup takes place.  This schedule has the same advantages and disadvantages as the “Full on 1st Friday of Month” schedule.  This schedule is easier to create because you do not have to set any overrides manually.

  • Quarterly—the quarterly schedule completes a full backup on the first day of the quarter.  A level 5 backup takes place on the first day of the other months in the quarter.  Every seven days, a level 7 backup takes place.  The other days of the month, an incremental backup takes place.

    This schedule is convenient because a full backup takes place only once a quarter.  On the first day of the month, a level 5 backs up everything that has changed since the first day of the quarter.  Every seven days, the level 7 backup protects all the data that has changed since the first day of the month.  The daily changes are protected by incremental backups.

    If you use this schedule, it is a good idea to segregate the backup volumes and store them in a safe place.  Use one volume for the full backup, one for the level 5 backups, one for the level 7 backups, and another one for the incremental backups.  If you have a disk crash or a disaster, you risk losing only a few days' work (the backup(s) on the mounted volume).  If you change the backup volume every day for the incremental backups, you risk losing only one day's work; however, you must use more tapes to recover from a disaster.

    When you create a quarterly schedule like this one, use the Month period to set the level backups, then set each quarterly full backup on the calendar with an override.

    To recover from a disk crash, you would need the backup volume with the full backup, the latest level 5, the latest level 7, and the incremental backups for the week.
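When you set the overrides for a schedule such as "Full on 1st Friday of Month," it helps to list the calendar dates in advance.  The following sketch computes the first occurrence of a weekday in each month; it is a planning aid only and does not interact with NetWorker.  The year is an arbitrary example.

```python
import calendar
from datetime import date

def first_weekday_of_month(year, month, weekday=calendar.FRIDAY):
    """Return the date of the first `weekday` (Monday=0 ... Sunday=6) in a month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7  # days from the 1st to that weekday
    return date(year, month, 1 + offset)

# Dates on which to set the full backup override for each month of one year.
for month in range(1, 13):
    print(first_weekday_of_month(2024, month).isoformat())
```

The result always falls on a calendar day from 1 to 7, which is why a fixed day of the month cannot stand in for a weekday-based schedule.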

Preconfigured Policies

NetWorker is shipped with five preconfigured policies:  Decade, Month, Quarter, Week, and Year.  Use these policies to choose the length of time to retain the entries in both the file index and media index.  Remember, the retention policy you select affects the size of the media index and controls the length of time NetWorker tracks the backup volumes and the data on each volume.

The browse policy affects the size of the file index and the length of time NetWorker retains entries for every file backed up and visible in the Recover window.  You must always choose a retention policy that is greater than or equal to the browse policy.

For example, if you choose Quarter for the retention policy for a client, and Month as the browse policy, the client will be able to browse all the file entries for backed-up files dating back a month.  Each month the oldest entries for the client's files are automatically removed from the server's file index.  However, the backup volumes that contain the files are still tracked by NetWorker in the media index.
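The requirement that the retention policy be at least as long as the browse policy can be checked by ordering the five preconfigured policies by length.  This is a sketch for planning purposes; the function name is hypothetical and the check is not a NetWorker interface.

```python
# Preconfigured policies, shortest to longest.
POLICY_ORDER = ["Week", "Month", "Quarter", "Year", "Decade"]

def is_valid_pair(browse, retention):
    """True when the retention policy is greater than or equal to the browse policy."""
    return POLICY_ORDER.index(retention) >= POLICY_ORDER.index(browse)

print(is_valid_pair(browse="Month", retention="Quarter"))  # True
print(is_valid_pair(browse="Year", retention="Month"))     # False
```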

The Policies window and the five preconfigured policies are shown in Figure B-6.

Figure B-6. Policies Window (Details)

See “Manually Managing the Online Indexes” for an illustration of how browse and retention policies work.

  • Week —this policy maintains the file index entries or the media index entries for one week after the last full backup.  If you use this browse policy, the users will be able to view and mark for recovery only those files that go back in time for a week.  It is a useful browse policy when you have a limited amount of disk space and users do not expect to be able to recover versions of their data older than one week.

    As a retention policy, Week means that your backup volumes will turn over quickly, and NetWorker will recycle through the tapes at a faster rate.  Use this policy if you schedule weekly full backups and need to keep backup data for only one backup cycle plus a week.

  • Month—This browse policy allows users to view and recover versions of files dating back at least a month.  The Recover window displays versions of files backed up for one full month plus a number of weeks.  As a retention policy, Month means that NetWorker maintains and tracks the backup volumes for one full backup cycle plus a month.

  • Quarter—Use this policy if you need to keep backed-up data longer than a month.  With this browse policy, the client can view and recover files for at least three months into the past.  The retention policy tracks the backup volumes for at least three months plus one full backup cycle.

  • Year—If you need to keep backed-up data online for several months, use the Year policy.  For example, if your company requires ready access to information going back in time for at least three quarters, this is a good browse and retention policy.  Realize, however, that NetWorker requires more disk space to maintain all the information online.

  • Decade  - This policy retains the entries in the server's indexes for ten years.  It is useful for organizations that are required to retrieve individual files for very long periods.

    Your NetWorker server will require lots of disk space for the online indexes if you choose Decade for your browse policy.  Depending upon how much data you are backing up, ten years of file index entries could take up gigabytes of disk space.

    It would make more sense to use Decade as the retention policy and use Quarter or Year as the browse policy.  NetWorker can then track the backup volumes and the data on each one.  You would always be able to retrieve data from an old backup volume using the save set recover feature if you needed to do so.  NetWorker would still require disk space to maintain the media index, but it would be a much smaller amount of space using the Quarter or Year browse policies.

Preconfigured Pools

NetWorker is shipped with preconfigured pools and matching label templates.  Each preconfigured volume pool has a set of unique preselected choices.  If you do not choose a pool for your backups, they are automatically assigned to the preconfigured Default pool and are labeled using the Default label template.

The preconfigured pools have been included for your convenience and provide a variety of ways for organizing your data.

The preconfigured volume pools have matching label templates.  The Two Sided label template is for labeling optical media and is the only template that does not have a matching volume pool.

You can use the Default, Default Clone, Archive, and Archive Clone pools without making any additional selections in the Pools window.  To use the other preconfigured pools, you must first complete the selections and choose Yes from the Enabled choices.  A pool must be enabled in order for NetWorker to sort data to that pool.

The preconfigured pools are described below:

  • Archive—for archiving client data only.  This pool cannot be modified or deleted.  The preconfigured settings are Enabled—Yes, Label template—Archive, Pool type—Archive, Store Index entries—Yes.  There are no selections for you to make for this pool.

  • Archive Clone—for cloning archive data only.  This pool cannot be modified or deleted.  The preconfigured settings are Enabled—Yes, Label template—Archive Clone, Pool type—Archive Clone, Store Index entries—No.  There are no selections for you to make for this pool.

  • Default—automatically used if you do not choose a pool.  If you decide not to use the pools feature, NetWorker automatically places all of your backup volumes in this pool.  The Default pool cannot be deleted or modified.  The preconfigured settings are Enabled—Yes, Label template—Default, Pool type—Default, Store Index entries—Yes.  There are no selections for you to make for this pool.

  • Default Clone—automatically used if you do not choose a pool for cloned data.  If you decide not to use the pools feature, NetWorker automatically places all of your cloned backup volumes in this pool.  The Default Clone pool cannot be deleted or modified.  The preconfigured settings are Enabled—Yes, Label template—Default Clone, Pool type—Backup Clone, Store Index entries—No.  There are no selections for you to make for this pool.

The Full, NonFull, and Offsite pools are intended for sorting data by levels.

  • Full—use this pool for full backups only.  This pool separates all of your full backups from the incremental and level backups.  Using the Full pool allows you to easily track and separate your full backups from the incremental and level backups.  Typically, you use this pool in conjunction with the NonFull pool.  The preconfigured settings are Enabled—No, Label template—Full, Levels—full, Pool type—Backup, Store Index entries—Yes.

  • NonFull—use for any backups other than full backups.  This pool includes all incremental and level backups.  Use the NonFull pool to easily keep your incremental and level backups separate from the fulls.  Typically you use this pool in conjunction with the Full pool.  The preconfigured settings are Enabled—No, Label template—NonFull, Levels—all level and incremental backups, Pool type—Backup, Store Index entries—Yes.

  • Offsite—for volumes being stored offsite.  The Offsite pool allows you to easily create a set of volumes to be stored offsite.  If your onsite backup volumes are destroyed, you can still recover your valuable data with the volumes you have stored offsite.  If you are also using the Full pool, you must disable it while you are sending data to the Offsite pool to ensure that all of the full backups will go only to the Offsite pool.  The preconfigured settings are Enabled—No, Label template—Offsite, Levels—full, Pool type—Backup, Store Index entries—No.


    Tip: Remember to enable pools you wish to have in effect during scheduled backups by selecting Yes from the Enabled choices.
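The effect of enabling the Full and NonFull pools can be pictured as a sort by backup level.  The sketch below is a deliberately simplified model, not NetWorker's actual pool selection logic: it considers only the backup level and which pools are enabled, and it leaves out the Offsite pool because the text above says to disable Full while sending data there.  The function name and level strings are assumptions.

```python
def choose_pool(level, enabled_pools):
    """Pick a pool for a backup.

    level: "full", "incr", or a level number "1".."9" (as a string).
    enabled_pools: set of pool names whose Enabled choice is Yes.
    """
    if level == "full":
        if "Full" in enabled_pools:
            return "Full"
    elif "NonFull" in enabled_pools:
        return "NonFull"
    return "Default"  # used whenever no other enabled pool matches

enabled = {"Full", "NonFull"}
print(choose_pool("full", enabled))  # Full
print(choose_pool("5", enabled))     # NonFull
print(choose_pool("incr", set()))    # Default
```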


Preconfigured Label Templates

The preconfigured label templates shipped with NetWorker are Archive, Archive Clone, Default, Default Clone, Full, NonFull, Offsite, and Two Sided.  These are provided so that you can easily start labeling your backup volumes.  There are also preconfigured volume pools with corresponding names (except Two Sided).  The preconfigured volume pools automatically use the preconfigured label template with the same name.

The number range for all of the preconfigured label templates starts at 001 and ends with 999 to allow for expansion of the volume pools.

The Archive label template is used only for clients that need to archive data.  It has three fields each separated with a period.  The first field contains the NetWorker server name, the second field is “archive,” and the third field contains a number.

For example:

server.archive.number
space.archive.001
space.archive.099
atlas.archive.325

The Archive Clone template has three fields separated by periods.  NetWorker uses this template for backup volumes belonging to the Archive Clone pool.  The first field contains the name of the NetWorker server and the letter “c.”  The second field is “archive,” and the third field contains a number.

For example:

moon_c.archive.001

The Default template has two fields separated by a period.  The first field contains the name of the NetWorker server and the second field contains a number.

For example:

server.number
space.675
space.800
atlas.054

The Default Clone template has two fields separated by a period.  NetWorker uses this template for backup volumes belonging to the Default Clone pool.  The first field contains the server name and the letter “c.”  The second field contains a number. 

For example:

moon_c.002

The three preconfigured label templates Full, NonFull, and Offsite use the same labeling conventions.  The name of the label template appears in the first field, and the second field contains a number.

For example:

label name.number
Full.076
NonFull.003
Offsite.120

The Two Sided template is for use with two-sided media such as optical media.  When labeling two-sided media, you need to be able to label both sides of the media.  The first field contains the name of the server, the second a number, and the third either an “a” or “b” to differentiate between the two sides of the media.

For example:

server.number.side
phoenix.001.a
phoenix.001.b
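The labeling conventions above are regular enough to generate programmatically, for example when you print labels ahead of time.  The sketch below reproduces the formats shown in the examples; it is not a NetWorker interface, and the function name is hypothetical.

```python
def make_label(template, server, number, side=None):
    """Build a volume label following the preconfigured template formats."""
    if not 1 <= number <= 999:
        raise ValueError("preconfigured templates number volumes from 001 to 999")
    num = f"{number:03d}"
    if template == "Archive":
        return f"{server}.archive.{num}"
    if template == "Archive Clone":
        return f"{server}_c.archive.{num}"
    if template == "Default":
        return f"{server}.{num}"
    if template == "Default Clone":
        return f"{server}_c.{num}"
    if template in ("Full", "NonFull", "Offsite"):
        return f"{template}.{num}"  # the template name is the first field
    if template == "Two Sided":
        if side not in ("a", "b"):
            raise ValueError("Two Sided labels need side 'a' or 'b'")
        return f"{server}.{num}.{side}"
    raise ValueError(f"unknown template: {template}")

print(make_label("Archive", "space", 1))            # space.archive.001
print(make_label("Default Clone", "moon", 2))       # moon_c.002
print(make_label("Two Sided", "phoenix", 1, "b"))   # phoenix.001.b
```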

Preconfigured Directives

NetWorker is shipped with preconfigured directives.  Each directive covers a set of the most important and most useful backup instructions.

The preconfigured directives are listed below.

  • Unix standard directives—use for most of your backups and when you do not need one of the other specialized directives.  This selection

    • applies the directive “+skip: core” to the root directory (/), thus skipping the backup of all core files

    • contains a swapasm directive to back up the relevant information about all NFS based and local swap files, but not the data in them

    • contains a mailasm directive to ensure that your mail files are backed up, yet not marked as read, and logasm for directories containing log files

  • Unix with compression directives—use when you want to compress your backup data.  Compressing client files saves you media space and network bandwidth but takes more time and CPU cycles on the client.  Overall, the entire network may back up faster if all the clients compress their files, and parallelism is set appropriately.

  • DOS standard directives—use to back up your DOS clients.

  • NetWare standard directives—use to back up your NetWare clients.

  • NT standard directives—use to back up your NT® clients.

  • NT with compression directives—use to compress data on an NT client during backup.

  • Index directives—use to back up the online file index.  This option is usually used only by the savegrp command.

Preconfigured Notifications

NetWorker is shipped with several notifications:  Bootstrap, Cleaning cartridge expired, Cleaning cartridge required, Device cleaned, Device cleaning required, Index size, Log default, Migration attention, Migration completion, Registration, Savegroup completion, Tape mount request 1, Tape mount request 2, and Tape mount request 3.

Most of these notices alert you to important NetWorker events.  For example, if a group of clients did not complete a nightly backup, NetWorker sends you a savegroup completion notice by electronic mail.

Registration

The registration notification sends a message to root notifying you that your NetWorker products are not properly registered.  You will receive the registration notification once a day or each time you start NetWorker.  The notification message includes related information about each of the NetWorker products that are not registered correctly.  A default registration action is shown in Figure B-7.

Figure B-7. Registration Notification

Log Default

The log default notification uses a UNIX facility called syslog to log and distribute notifications about all NetWorker events.  These events include requests for backup volume mounts, index size notices, and savegroup completion notices.  How this information is distributed depends on how you have configured syslog.  When NetWorker was installed, it created entries for logging and contacting operators.  You can customize these entries.  Refer to the syslogd(1M) reference page for information about configuring the distribution of log information.  A default log action is shown in Figure B-8.

Figure B-8. Log Default Notification

Index Size

NetWorker checks the size of its online indexes and sends a notification when it looks as if the indexes may run out of disk space.  NetWorker automatically sends the electronic mail message to root.  A default index size action is shown in Figure B-9.

Figure B-9. Index Size Notification

The example above notifies you when the index for the client atlas is getting large.  If you want the message to be mailed to someone other than root, you can edit Action and substitute root with a different user login name or mailing list.

Savegroup Completion

When NetWorker finishes backing up a group of clients, it sends a completion message via electronic mail to root.  A savegroup completion action is shown in Figure B-10.

Figure B-10. Savegroup Completion Notification

Backup Media Request Notices

When NetWorker needs backup media mounted for a backup, or a specific backup volume mounted to fill a recovery request, it displays a media request message in the NetWorker Administrator window.  If no one fills the request, NetWorker sends another request after fifteen minutes.  NetWorker sends a third request after another thirty-seven minutes, if no one fills the request.

The first mount request has a blank Action field, so the request appears only in the Pending display of the NetWorker Administrator window.  The second mount request sends an alert to the logger, and the third request sends electronic mail messages to root.  A tape mount request action is shown in Figure B-11.

Figure B-11. Tape Mount Request Notification
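The escalation of the three mount requests can be laid out as a timeline.  The sketch below uses the intervals and actions described in this section (the second request after fifteen minutes, the third after another thirty-seven); the data structure and function name are illustrative only.

```python
from datetime import timedelta

# (notification, time after the first request, what happens)
MOUNT_REQUESTS = [
    ("Tape mount request 1", timedelta(minutes=0), "shown in the Pending display only"),
    ("Tape mount request 2", timedelta(minutes=15), "alert sent to the logger"),
    ("Tape mount request 3", timedelta(minutes=15 + 37), "electronic mail sent to root"),
]

def requests_sent(elapsed):
    """Names of the mount requests issued once `elapsed` time has passed unanswered."""
    return [name for name, at, _ in MOUNT_REQUESTS if elapsed >= at]

print(requests_sent(timedelta(minutes=20)))
# ['Tape mount request 1', 'Tape mount request 2']
```

By this timeline, mail reaches root 52 minutes after the first unanswered request.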

Summary

There are no rules for configuring NetWorker.  The challenge is to understand how to best take advantage of the power and flexibility that NetWorker offers for your specific environment.  Start using NetWorker with the preconfigured schedules and policies and then undertake small experiments.  As your network of systems grows larger, as there are more and more files to back up, and as users see the advantages of NetWorker's fast file recovery, you will need to continue making adjustments.