Chapter 12. NetWorker Performance

The performance of a server system is affected by the speed of the backup device, the network speed, the amount of main memory, the disk speed, the CPU speed, and the number of CPUs. The factors affecting the performance of a client system are filesystem traversing, generation of data, data on multiple disks, CPU speed, and number of CPUs.

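This chapter explains how to choose a NetWorker configuration and how to measure the performance of a server and of a client.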

Guidelines for Choosing a Configuration

Several factors determine which NetWorker server configuration best suits your backup and recover needs. The configuration consists of the hardware and software, which includes tape drives, jukeboxes, client systems, and network connection.

This section provides a few simple rules that you can use to guide your choices, and focuses on backup, since backup requires far more server capacity than recover and occurs so much more often.


Note: Please keep in mind that these are guidelines, and actual performance may vary.

The goal in selecting a configuration is to balance the different hardware and software limitations to achieve the overall data handling capabilities you require. Start by looking at the limits of the major NetWorker configuration components: tape drives, network connection, jukeboxes, clients, and the NetWorker server itself.

Tape Drives

Tape drives have a fixed maximum data transfer rate that they can handle. Since NetWorker automatically spans multiple tapes, the total tape capacity is not as important as the data rate. Table 12-1 shows the data transfer rate for several tape drives.

Table 12-1. Tape Drives and Data Transfer Rates

Drive                   Data Transfer Rate
DAT                     200 KB/s
EXABYTE 8200            250 KB/s
EXABYTE 8500            500 KB/s
Digital Linear Tape     1.25 MB/s


Clients

Different clients can generate data at different rates and, even within a single client, different types of files can generate different data rates. For example, symbolic links require as much processing as large data files, but produce no data. Consequently, the data rate produced by a backup of a single client can vary quite a bit. The numbers listed below are considered average transfer rates for each client system. However, it is a good idea to run several clients simultaneously to help smooth out fluctuations in each client's data transfer rate. Table 12-2 shows the data transfer rates for several types of clients.

Table 12-2. Client Data Transfer Rates

Client: 1 Backup        Data Transfer Rate
Silicon Graphics        600 KB/s
PC/DOS/SPX              80 KB/s
PC/DOS/TCP              150 KB/s
Sun™ SS2                200 KB/s
IBM® RS/6000            300 KB/s


Network

Ethernet has an upper limit on bandwidth of about 1 MB per second, but in practice, most Ethernet networks are limited to about 800 KB per second between a set of clients and a single server. For higher performance, use FDDI. Table 12-3 shows the data transfer rates for several networks.

Table 12-3. Data Transfer Rates for Networks

Network                 Data Transfer Rate
Ethernet                800 KB/s
FDDI                    8 MB/s


Server

The server must be able to handle the load of network packets, data movement, and tape drives in order to achieve the rates listed above. Most of the work on the server side is in data movement, context switching, and interrupt handling. The performance of all of these functions improves as the number of CPUs increases.
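One way to reason about these limits is to treat the configuration as a pipeline whose throughput is bounded by its slowest stage. The sketch below is only an illustration under that assumption: it adds client rates together, adds tape drive rates together, and ignores server CPU limits. The function name `bottleneck` is hypothetical, and the rates come from Tables 12-1 to 12-3, with 1.25 MB/s treated as 1280 KB/s.

```python
def bottleneck(client_rates_kbs, network_kbs, tape_rates_kbs):
    """Return (stage, rate in KB/s) for the slowest stage of the backup path.

    Assumption: clients running simultaneously add their rates, and so do
    tape drives. Server CPU and memory limits are not modeled.
    """
    stages = {
        "clients": sum(client_rates_kbs),
        "network": network_kbs,
        "tape": sum(tape_rates_kbs),
    }
    slowest = min(stages, key=stages.get)
    return slowest, stages[slowest]


# Three Silicon Graphics clients, one Ethernet network, one Digital Linear Tape drive.
stage, rate = bottleneck([600, 600, 600], 800, [1280])
print(f"Limiting stage: {stage} at {rate} KB/s")
```

In this illustration the network is the limiting stage, so adding a second tape drive would not raise the overall rate.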

Jukeboxes

Jukeboxes provide automatic loading and unloading of tapes or optical disks. This capability assists the administrator in two different ways. During nightly backups, NetWorker uses the jukebox to automatically switch to the next tape when a tape fills up. During recovers, NetWorker uses the jukebox to load all of the tapes needed for the recovery without operator intervention. Table 12-4 shows data transfer rates and capacities for several jukeboxes.

Table 12-4. Jukebox Data Transfer Rates and Capacities

Jukebox     Maximum Data Transfer Rate     Capacity     Average Data Transfer Rate
EXB-10i     480 KB/sec                     50 GB        480 KB/sec
DLT2700     1.25 MB/sec                    70 GB        1.25 MB/sec

To determine the capacity requirements of a jukebox for a scheduled unattended backup, select the jukebox with a capacity large enough to handle the largest possible amount of backup data. For example, a full backup of 60 GB requires a DLT2700 jukebox or two EXB-10i autochangers.
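The 60 GB example can be checked with a short calculation. This is a minimal sketch that uses only the capacities from Table 12-4; the function name `jukeboxes_needed` is hypothetical, and the calculation ignores any extra capacity you may want in reserve.

```python
import math

# Capacities in GB, taken from Table 12-4.
JUKEBOX_CAPACITY_GB = {"EXB-10i": 50, "DLT2700": 70}


def jukeboxes_needed(backup_gb, model):
    """Return how many jukeboxes of the given model hold backup_gb of data."""
    return math.ceil(backup_gb / JUKEBOX_CAPACITY_GB[model])


for model in JUKEBOX_CAPACITY_GB:
    print(model, jukeboxes_needed(60, model))
```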

To determine how much disk space you need for the online indexes for quick recovers, first do a rough calculation of the amount of data backed up in a single schedule period (for example, week, month, or quarter). Use the guidelines in Table 12-5 showing how much data is backed up with different levels of backup.

Table 12-5. Percent of Data Backed Up for Each Type of Backup

Level           Percent of Data Backed Up
Full            100%
Level 1-9       25%
Incremental     10%

For example, a monthly schedule that has one full backup on the first Sunday, a level 5 backup on other Sundays, and incrementals every other day looks like Table 12-6.

Table 12-6. Example of the Percent of Data Backed Up

Level              % of data backed up
1 Full             100%
26 Incremental     260%
4 Level 5          100%
Total              460%

This table illustrates that 460% of the total amount of data is backed up over the course of a month. For example, a total of 10 GB of client data backed up using this schedule would result in about 46 GB of data on tape per month.
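The same arithmetic can be expressed as a short calculation, which is convenient when you want to compare schedules. This is a sketch only: the percentages are the guideline values from Table 12-5, and the names `PERCENT_BY_LEVEL`, `monthly_percent`, and `data_on_tape_gb` are hypothetical.

```python
# Guideline percentages from Table 12-5; actual sites will differ.
PERCENT_BY_LEVEL = {"full": 100, "level": 25, "incremental": 10}


def monthly_percent(schedule):
    """Sum the percent of data backed up over one schedule period.

    schedule maps a backup type to the number of times it runs.
    """
    return sum(count * PERCENT_BY_LEVEL[kind] for kind, count in schedule.items())


def data_on_tape_gb(client_data_gb, schedule):
    """Estimate the GB written to tape in one schedule period."""
    return client_data_gb * monthly_percent(schedule) / 100


# One full, four level 5, and 26 incremental backups in a month.
schedule = {"full": 1, "level": 4, "incremental": 26}
print(monthly_percent(schedule), "percent")
print(data_on_tape_gb(10, schedule), "GB on tape per month")
```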

Now assume that you have decided on a browse policy of two months for all the client systems and a retention policy of six months. These policies let your users quickly recover any file, and any version of a file, that they had during the past two months. With some additional effort, you can also recover files that they had at any time during the past six months. You therefore need six months times 46 GB, which yields 276 GB of capacity.

In practice, you need a little extra jukebox capacity, because a small number of volumes are “unavailable”: NetWorker must wait to recycle a tape until all the save sets on that tape have expired.

Finally, remember to plan for growth in the number of your files. While sites differ in the rate at which their files are growing, a rule of thumb is that you should purchase a jukebox, or set of jukeboxes, with about 50% more capacity than your current requirement.
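The retention and growth guidelines can be combined in one estimate. The sketch below assumes the 46 GB per month and six-month retention period from the example, and applies the 50% rule of thumb as a multiplier of 1.5. The function name `jukebox_capacity_gb` is hypothetical, and the small allowance for unavailable volumes is not modeled.

```python
def jukebox_capacity_gb(monthly_tape_gb, retention_months, growth_factor=1.5):
    """Return (current requirement, requirement with growth allowance) in GB."""
    current = monthly_tape_gb * retention_months
    return current, current * growth_factor


current, with_growth = jukebox_capacity_gb(46, 6)
print(f"Current requirement: {current} GB")
print(f"With 50% growth allowance: {with_growth} GB")
```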

NetWorker Configuration: Example 1

Site A has approximately 70 GB of data on two networks of 50 clients and wants to schedule full backups for all of their data in one night (12 hours). This equation calculates the required data transfer rate to achieve this goal.

70000 MB / 12 hours = 5833 MB / hour = 1660 KB / second
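The calculation can be repeated for other data sizes and backup windows with a few lines of code. This is a sketch that assumes 1 MB = 1024 KB, which is the conversion that reproduces a figure of about 1660 KB per second; the function name `required_rate` is hypothetical.

```python
def required_rate(total_mb, window_hours):
    """Return (MB per hour, KB per second) needed to finish within the window.

    Assumption: 1 MB = 1024 KB.
    """
    mb_per_hour = total_mb / window_hours
    kb_per_second = mb_per_hour * 1024 / 3600
    return mb_per_hour, kb_per_second


mb_per_hour, kb_per_second = required_rate(70000, 12)
print(f"{mb_per_hour:.0f} MB/hour, {kb_per_second:.0f} KB/second")
```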

To back up the data with a single NetWorker server, this configuration is suggested:

  • IRIX NetWorker with the TurboPak option

  • two DLT2700 jukeboxes containing DLT tape drives connected to the NetWorker server, with licenses for each

  • two Ethernet network interfaces

  • at least 1600 MB of free disk space on the server for the client index files

  • 50 client upgrade licenses

NetWorker Configuration: Example 2

Site B has 50 GB of data on a single network with 80 clients; the administrators want to be able to schedule backups in a single night (eight hours). The full backups for the clients must be staggered due to the limit of 800 KB/sec data transfer rate per Ethernet network. Calculate the backup capacity required to complete the backups in one night:

800 KB / second * 8 hours = 23 GB / night

By using three different backup schedules to stagger their full backups into three nights instead of one, the administrators reduce the load of the nightly backup data from 50 GB per night to about 20 GB per night.

Full: 50 GB / 3 = 16.7 GB
Incr: (50 GB - 16.7 GB) * .1 = 3.3 GB
Total: (16.7 GB + 3.3 GB) / 8 hours = 694 KB / second
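The staggering calculation generalizes to other data sizes and numbers of nights. The sketch below is an illustration under two assumptions: incrementals back up 10% of the data that is not receiving a full backup that night, and 1 GB = 1,000,000 KB, which is the conversion that reproduces the figures above. All function names are hypothetical.

```python
KB_PER_GB = 1_000_000  # decimal units, matching the figures in this example


def nightly_capacity_gb(rate_kbs, hours):
    """GB that can be moved in one backup window at the given rate."""
    return rate_kbs * hours * 3600 / KB_PER_GB


def staggered_nightly_load(total_gb, nights, incremental_fraction=0.1):
    """Return (full GB, incremental GB, total GB) backed up per night."""
    full_gb = total_gb / nights
    incr_gb = (total_gb - full_gb) * incremental_fraction
    return full_gb, incr_gb, full_gb + incr_gb


def rate_kb_per_second(gb, hours):
    """Average rate needed to move gb of data in the given number of hours."""
    return gb * KB_PER_GB / (hours * 3600)


full_gb, incr_gb, total_gb = staggered_nightly_load(50, 3)
print(f"Capacity per night: {nightly_capacity_gb(800, 8):.1f} GB")
print(f"Full {full_gb:.1f} GB + incremental {incr_gb:.1f} GB = {total_gb:.1f} GB")
print(f"Required rate: {rate_kb_per_second(total_gb, 8):.0f} KB/second")
```

Because the required rate stays below the 800 KB per second Ethernet limit, the staggered schedule fits in a single eight-hour window.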

To back up the data with a single NetWorker server, this configuration is suggested:

  • IRIX NetWorker with the TurboPak option

  • one EXB-440 jukebox with two EXABYTE 8505 tape drives connected to the server, with license

  • one network interface

  • approximately 1 GB of free disk space on the server for client index files

  • 50 client upgrade licenses

Measuring Server Performance

This section provides examples of how to measure the performance of a server.

Backup Device Speed

Most tapes have a step function in their data rate. NetWorker uses 32 KB per record. To measure tape speed, follow these steps:

  1. Create a large file (at least 20 MB) with non-zero data and list its size. For example:

    	# cat /unix /unix /unix /unix /unix /unix /unix /unix /unix /unix > big
    	# ls -l big
    	-rw-rw-r--    1 root     sys      20675420 Mar  5 16:11 big
    

  2. Use the dd(1M) command to write the large file to tape four times and measure the time results:

    	# time dd if=big of=/dev/rmt/tps1d6nrnsv bs=32k conv=sync
    	95.2 real 13.0 user 11.9 sys
    	# time dd if=big of=/dev/rmt/tps1d6nrnsv bs=32k conv=sync
    	78.2 real 12.9 user 12.7 sys
    	# time dd if=big of=/dev/rmt/tps1d6nrnsv bs=32k conv=sync
    	78.0 real 12.8 user 12.5 sys
    	# time dd if=big of=/dev/rmt/tps1d6nrnsv bs=32k conv=sync
    	76.8 real 13.0 user 12.4 sys
    

  3. Divide the file's size by the average of the last three real times. For example:

    Rate: 20675 KB / 77.67 seconds = 266 KB / second
    

    This number is the data transfer rate of the tape drive.
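The calculation in step 3 can be sketched as follows. The example uses the file size and timings shown above, discards the first run as the procedure does, and treats 1 KB as 1000 bytes to match the figure of 20675 KB. The function name `tape_rate_kb_per_second` is hypothetical.

```python
from statistics import mean


def tape_rate_kb_per_second(file_bytes, real_times, skip=1):
    """Average tape rate in KB/s, ignoring the first `skip` timing runs.

    Assumption: 1 KB = 1000 bytes, as in the worked example.
    """
    steady_runs = real_times[skip:]
    return (file_bytes / 1000) / mean(steady_runs)


rate = tape_rate_kb_per_second(20675420, [95.2, 78.2, 78.0, 76.8])
print(f"{rate:.0f} KB/second")
```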

Network Speed

NetWorker uses TCP and RPC/XDR as network communication protocols. To measure the network speed, follow these steps:

  1. Create a large file (as in the tape speed measurement example) on a fast client.

    	# cat /unix /unix /unix /unix /unix /unix /unix /unix /unix /unix > big
    	# ls -l big
    	-rw-rw-r--    1 root     sys      20675420 Mar  5 16:11 big
    

  2. Use the rcp(1C) command to copy the file from the client to the server and time the result:

    # time rcp big server:/dev/null
    38.2 real 0.2 user 30.7 sys
    

  3. To find the network speed, divide the number of bytes in the file by the real time. For example:

    Rate: 20675 KB / 38.2 seconds = 541 KB / second
    

The most important factor affecting network speed is network errors. To determine the input error rate, the output error rate, and the collision rate, use the netstat -i command. If the input or output error rate is above 0.5%, or the collision rate is above 5%, network errors are reducing the network throughput.
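The thresholds above can be applied to interface counters with a small helper. This is a sketch under stated assumptions: the five arguments correspond to the packet, error, and collision counters that netstat -i typically reports (often labeled Ipkts, Ierrs, Opkts, Oerrs, and Coll), and the collision rate is computed relative to output packets. The function name `network_health` is hypothetical, and the sketch does not parse netstat output.

```python
ERROR_LIMIT_PERCENT = 0.5      # input or output error rate threshold
COLLISION_LIMIT_PERCENT = 5.0  # collision rate threshold


def percent(part, whole):
    """Return part as a percentage of whole, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def network_health(ipkts, ierrs, opkts, oerrs, coll):
    """Return the three rates and whether any exceeds its threshold.

    Assumption: the collision rate is collisions divided by output packets.
    """
    rates = {
        "input_error": percent(ierrs, ipkts),
        "output_error": percent(oerrs, opkts),
        "collision": percent(coll, opkts),
    }
    degraded = (
        rates["input_error"] > ERROR_LIMIT_PERCENT
        or rates["output_error"] > ERROR_LIMIT_PERCENT
        or rates["collision"] > COLLISION_LIMIT_PERCENT
    )
    return rates, degraded


# Illustrative counter values, not measurements from a real system.
rates, degraded = network_health(100000, 100, 100000, 50, 2000)
print(rates, "degraded" if degraded else "ok")
```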

Server CPU Speed

The speed and number of CPUs in a server limit the following:

  • total data throughput to tape

  • interrupts per second for network data

  • context switches per second between processes

The best measure is the number of CPUs for the server. More CPUs means a faster system.

Memory

The memory on the server limits the amount of data buffered between the NetWorker save(1M) command, agent daemon, and media management daemon. In general, the more memory the server has, the better the performance.

Measuring Client Performance

This section provides examples of how to measure the performance of a NetWorker client.

Filesystem Traversing

To measure the filesystem traversing speed, follow the steps below:

  1. Time the uasm(1M) command with the -bi option. For example:

    # time /usr/etc/uasm -bi /usr
    33848 records 6961176 header bytes 644814472 data bytes
    
    real 51.86
    user 7.73
    sys  27.95
    7.7u/27.9s (68% of 0:51)  0k+0k+0k  0pf+0sw  2993i+16o
    

  2. Divide the number of records by the real time to obtain the rate in files per second. For example:

    33848 records / 51.86 seconds = 652.7 files / second
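If you run this measurement on many clients, you may want to extract the numbers from the summary line automatically. The sketch below assumes the summary line has exactly the form shown in the example output above; other versions of uasm may print it differently. The names `parse_summary` and `files_per_second` are hypothetical.

```python
import re

# Pattern based only on the example output shown in step 1.
SUMMARY = re.compile(r"(\d+) records (\d+) header bytes (\d+) data bytes")


def parse_summary(line):
    """Return (records, header_bytes, data_bytes) from a summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    records, header_bytes, data_bytes = (int(g) for g in match.groups())
    return records, header_bytes, data_bytes


def files_per_second(records, real_seconds):
    """Filesystem traversal rate in files per second."""
    return records / real_seconds


records, _, data_bytes = parse_summary(
    "33848 records 6961176 header bytes 644814472 data bytes"
)
print(f"{files_per_second(records, 51.86):.1f} files/second")
```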
    

Data Generation Rate

To measure the rate at which a client generates data for a backup, follow the steps below:

  1. Time the uasm command with the -si option and redirect the output to /dev/null. For example:

    # time /usr/etc/uasm -si /usr > /dev/null
    
    real 6:39.49
    user 12.62
    sys  1:27.19
    12.6u/87.2s (24% of 6:39)  0k+0k+0k  0pf+0sw  45302i+27o
    

  2. Divide the number of data bytes reported by the uasm -bi command (from the filesystem traversing test) by the real time reported for the uasm -si command. For example:

    629701 KB / 399.49 seconds = 1576 KB / second
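The division in step 2 requires converting the real time from minutes and seconds to seconds. The following sketch shows one way to do that, assuming a real time written as seconds ("51.86") or minutes and seconds ("6:39.49") and 1 KB = 1024 bytes, which is the conversion that yields 629701 KB from the data bytes in the example. The function names are hypothetical.

```python
def elapsed_seconds(text):
    """Convert "SS.ss", "M:SS.ss", or "H:MM:SS.ss" to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def data_rate_kb_per_second(data_bytes, real_time):
    """Client data generation rate in KB/s (1 KB = 1024 bytes assumed)."""
    return (data_bytes / 1024) / elapsed_seconds(real_time)


rate = data_rate_kb_per_second(644814472, "6:39.49")
print(f"{rate:.0f} KB/second")
```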
    

Data on Multiple Disks

NetWorker automatically backs up multiple disks in parallel.

To measure parallel disk speeds, follow these steps:

  1. Use the df(1) or du(1M) command to find two directories of approximately the same size on different disks.

  2. Run the same uasm speed tests for filesystem traversing and data generation rate as for one disk, but run the tests simultaneously on the two directories.

  3. Add the data from each test (files/sec and KB/sec) to obtain a combined rate.

This rate reflects the performance of NetWorker backing up data on multiple disks.

Client CPU Speed

The CPU of a client limits the following:

  • the total data throughput to tape

  • the interrupts per second for network data

  • the context switches per second between processes

The best measure is the MIPS® rating for the client. A larger MIPS rating means a faster system.