Appendix C. Troubleshooting

This appendix contains troubleshooting information that addresses common questions concerning operating and configuring NetWorker.

Checking the NetWorker Daemons

If you have trouble starting NetWorker, the daemons may not be running properly.  To check the daemons, enter the following command:

# ps -ef | grep nsr 

The system displays output similar to the following, which shows the five NetWorker daemons running.

111 ?  IW    0:10 /usr/etc/nsrexecd -s localhost
116 ?  S   176:15 /usr/etc/nsrd
158 ?  IW    2:48 /usr/etc/nsrmmdbd
159 ?  S    23:45 /usr/etc/nsrindexd
160 ?  IW<  16:07 /usr/etc/nsrmmd -n 1

If you discover that you need to start the NetWorker daemons, enter these commands:

# cd / 
# nsrd 
# nsrexecd 
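The check above can be automated with a small filter. The sketch below is only an illustration: missing_daemons is a hypothetical helper name, and it assumes that each daemon appears in the process listing under the names shown in the sample output.

```shell
# Hypothetical helper: print each expected NetWorker daemon that is absent
# from the process listing supplied on standard input.
missing_daemons() {
    listing=$(cat)
    for daemon in nsrexecd nsrd nsrmmdbd nsrindexd nsrmmd; do
        # Require "/" or a space before the name and a space or end of line
        # after it, so that "nsrmmd" is not satisfied by "nsrmmdbd".
        if ! printf '%s\n' "$listing" | grep -Eq "(^|[/ ])$daemon( |\$)"; then
            echo "$daemon"
        fi
    done
}
```

For example, ps -ef | missing_daemons prints nothing when all five daemons are present, and prints one daemon name per line otherwise.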

Displaying NetWorker

If you enter the nwadmin command and the NetWorker Administrator window does not appear, the DISPLAY variable on your system may not be set correctly.

To set the DISPLAY variable correctly, follow these steps.

  1. Enter the following command at the system prompt for C shell or tcsh:

    # setenv DISPLAY hostname:0.0 
    

    For a Korn shell or a Bourne shell, enter these commands:

    # DISPLAY=hostname:0.0 
    # export DISPLAY 
    

    Replace hostname with the name of the machine where the user initially logged in.

  2. Enter one of the following at the system prompt:

    # xhost machineName 
    

    or

    # xhost + 
    

    Replace machineName with the name of the machine where you are currently logged in, or the machine where you will log in.

  3. Restart nwadmin.
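
Before restarting nwadmin, it can help to confirm that DISPLAY has the expected shape. The following is a minimal sketch; valid_display is a hypothetical helper, and it checks only the form of the value (host:display or host:display.screen), not whether the X server accepts the connection.

```shell
# Hypothetical helper: succeed if the argument has the form
# host:display or host:display.screen (the host part may be empty).
valid_display() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9._-]*:[0-9]+(\.[0-9]+)?$'
}
```

A possible use is valid_display "$DISPLAY" || echo "DISPLAY is not in host:display.screen form".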

Renaming a Client

NetWorker maintains an index for every client it backs up.  If you change the name of the client, the index for that client is no longer associated with the client, and the client will not be able to recover any files backed up under its old name.

To change the name of a NetWorker client, create a client with the new name, delete the old client, and then rename the directory that contains the corresponding index.  Follow these steps:

  1. In /etc/hosts or the NIS hosts map, make the old client name an alias of the new name:

    nnn.nn.nn.nn    newClientName oldClientName 
    

  2. Using the Clients window, create a client with the new hostname.  Configure the new client to mimic the configuration of the old client.

  3. Using the Clients window, delete the old client.

  4. As root on the NetWorker server, shut down the NetWorker daemons:

    # /etc/init.d/networker stop 
    

  5. Change to the directory containing the client index directory, by default /nsr/index:

    # cd /nsr/index 
    

  6. Delete the new client index directory (which is empty):

    # rmdir newClientName 
    

  7. Use the mv command to rename the client index directory.  For example:

    # mv oldClientName newClientName 
    

    If you did not remove the new client index directory first, mv moves the old client index directory into the new client directory as a subdirectory that keeps its old name.

  8. Restart the NetWorker daemons:

    # /etc/init.d/networker start 
    

    The media management database daemon nsrmmdbd renames all the instances of save sets under the old client name to the new client name.

  9. In /etc/hosts or the NIS hosts map, optionally delete the old client name alias.

As soon as possible, complete a full backup of the renamed client's files.
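
Steps 5 through 7 are easy to get wrong, because mv silently nests the old directory inside the new one if the new one still exists. The sketch below guards against that. It is an illustration only: rename_index_dir is a hypothetical helper, it must be run while the NetWorker daemons are stopped, and the index root is passed as an argument rather than assumed to be /nsr/index.

```shell
# Hypothetical helper: rename a client index directory under index_root,
# refusing to proceed unless the new directory is absent or empty.
# The body runs in a subshell so the caller's working directory is unchanged.
rename_index_dir() (
    index_root=$1
    old=$2
    new=$3
    cd "$index_root" || exit 1
    if [ ! -d "$old" ]; then
        echo "no index directory named $old in $index_root" >&2
        exit 1
    fi
    if [ -d "$new" ]; then
        # rmdir removes only empty directories, so a populated index
        # directory is never discarded by mistake.
        if ! rmdir "$new"; then
            echo "$new is not empty; leaving both directories untouched" >&2
            exit 1
        fi
    fi
    mv "$old" "$new"
)
```

For example, rename_index_dir /nsr/index oldClientName newClientName performs the same work as steps 5 through 7.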

Recover Access Issues

System administrators can control client recover access by configuring the client.  The Recover access list in the Clients window displays the names of machines that can recover client files. The following users have the ability to recover any files on any client:

  • root

  • operator

  • a member of the operator group

Other users can recover only those files for which they have read permission, relative to the file mode and ownership at the time the file was backed up.

Files recovered by a user other than root, operator, or a member of the operator group are owned by that user.

Previewing a Backup

Every time you add a new client to NetWorker, it is a good idea to check if NetWorker can successfully back up the files for the new client.  Use the Preview button in the Group Control window to see a “preview” of a group backup without actually backing up any files.  You can also use the savegrp -p command at the system prompt to see a preview.

This command previews a backup of clients assigned to the groupName backup group:

# savegrp -p groupName 

If NetWorker cannot access a client in the backup group, check the following items:

  • Make sure nsrexecd is running on the client machine and that it lists the hostname of the server in the command line.   To make sure that nsrexecd is running, use the UNIX command ps on the client.  See “Installing the Client Software” in the appropriate chapter for your platform for more information on nsrexecd.

  • Using nsrexecd is the best method for backing up clients over the network.  If you choose not to use nsrexecd and clients cannot find the NetWorker binaries, add the path to the NetWorker binaries in the Executable path field of the Clients window for each client.  In other words, if the default PATH setup for root or Remote user does not include the appropriate path to the NetWorker binaries, add them to the client configuration.  Display these attributes by choosing Details from the View menu.

Halting a Network Backup

To stop a network-wide backup that is running, click the Stop button in the Group Control window, shown in Figure C-1.

Figure C-1. Group Control Stop Button

The next network-wide backup will start as scheduled in the Start time field of the Groups window, or you may restart by clicking the Restart button in the Group Control window.

Backup Media Capacity

Occasionally you will find that NetWorker marks backup volumes as “full” when they are not really full.  (The Volumes window and the output from the mminfo -m command display the details of the backup volumes.)

NetWorker marks a magnetic tape “full” when it reaches the end of the tape or when it encounters a bad spot on the tape.  For example, a backup tape that is reported as only “13% used” and marked “full” has a bad spot about 13% of the way from the beginning of the tape.  This tape can still be used for recoveries, but cannot be used for any more backups.

If you see this “bad spot” behavior on many of the backup volumes, it may indicate the device needs to be cleaned or repaired.

Tapes are also marked “full” when they are recovered after being deleted from the media index.

Determining Jukebox Capacity

To find out how much space is available in the jukebox (autochanger), use either the Jukebox Mounting window or the nsrjb command.  The Jukebox Mounting window displays all media in the jukebox and the percent used of each tape, as shown in Figure C-2.

Figure C-2. Jukebox Mounting Window

If you prefer to use the nsrjb command, follow the steps below:

  1. Switch user to root.

  2. Enter the nsrjb -v command at the system prompt:

    # nsrjb -v 
    

    NetWorker displays information about the backup volumes in the jukebox that looks similar to this:

    Jukebox arc-db:
    slot    volume     used    pool     mode
    1:      moon.010           Default 
    2:      moon.011   full    Default 
    3:      moon.012           Default 
    4:      moon.013   full    Default 
    4 volumes, 2 less than 80% full.
    2305 MB total capacity, 2200 MB remaining (5% full)
    drive 1 (/dev/rmt/0hbn) slot 3: moon.012 
    

Notice the information about the registered volumes, total capacity, and remaining capacity.  This information tells you how much space is still available in the jukebox.
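
If you check jukebox capacity regularly, the relevant figures can be pulled out of the listing with awk. The two filters below are a sketch that assumes the output has the same layout as the sample above. The function names are hypothetical, and real nsrjb output may differ between releases.

```shell
# Hypothetical helpers that read "nsrjb -v" output on standard input.
# Both assume the layout of the sample listing in this section.

# Print the number of megabytes reported as remaining.
jukebox_remaining_mb() {
    awk '{ for (i = 3; i <= NF; i++) if ($i == "remaining") print $(i - 2) }'
}

# Print the label of each slot whose "used" column is not "full".
volumes_not_full() {
    awk '$1 ~ /^[0-9]+:$/ && $3 != "full" { print $2 }'
}
```

For example, nsrjb -v | jukebox_remaining_mb prints only the remaining capacity, and nsrjb -v | volumes_not_full lists the volumes that are not yet marked full.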

Savegroup Completion Messages

NetWorker mails event notifications about your savegroups as configured in the Notifications window.  By default, the Notifications window is preconfigured to mail the savegroup completion messages to root.  This section describes error messages that may appear in the savegroup completion mail and suggests possible solutions.

Binding to Server Errors

NetWorker is designed to follow the client/server model.  In a client/server model, servers provide services to clients through Remote Procedure Calls (RPC).  These services live inside long-lived UNIX processes, known as daemons.

For clients to find these services, the services must be registered with a registration service.  When daemons start up they register themselves with the registration service.  In UNIX, the portmapper provides the registration service.

NetWorker servers provide a backup and recover service:  they receive data from clients, store the data on backup media, and retrieve it on demand.  If the NetWorker daemons are not running and a service is requested by nwbackup, nwrecover, or mminfo, for example, the following messages may appear in your savegroup completion mail:

Server not available
RPC error, remote program is not registered

These messages indicate that the NetWorker daemons nsrd, nsrindexd, nsrmmd, and nsrmmdbd might not be running.

To restart the NetWorker daemons, enter nsrd at the system prompt:

# nsrd 

Saving Remote Filesystems

You may receive the following error message in your Savegroup completion notification when backing up a remote filesystem:

All: host hostname cannot request command execution

This means the nsrexecd on the client was not configured to allow the server hostname to back up its files.

You may also see this message:

All: sh: permission denied

This means nsrexecd is not running at all on the client.

Make sure nsrexecd is running on the client machine and that it lists the server's hostname in the command line.  To make sure that nsrexecd is running, use the UNIX command ps on the client.  See “Installing the Client Software” in the appropriate chapter for more information on nsrexecd.

Using nsrexecd is the best method for backing up clients over the network.  If you choose not to use nsrexecd, and the clients cannot find the NetWorker binaries, add the location of the NetWorker binaries to the Executable path field in the Clients window for each client.  In other words, if the default PATH setup for root or Remote user does not include the appropriate path to the NetWorker binaries, add the path to the client configuration.  Display these hidden attributes by choosing Details from the View menu.
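
The two conditions described above (nsrexecd not running, or running without the server's hostname) can be told apart from the client's process listing. The sketch below is an illustration under assumptions: nsrexecd_status is a hypothetical helper, and it assumes the server is named with the -s option, as in the sample ps output earlier in this appendix.

```shell
# Hypothetical helper: read a process listing on standard input and report
# whether nsrexecd is running and whether it names the given server with -s.
nsrexecd_status() {
    awk -v server="$1" '
        $0 ~ "(^|[/ ])nsrexecd( |$)" {
            running = 1
            # Look for "-s" immediately followed by the server name.
            for (i = 1; i < NF; i++)
                if ($i == "-s" && $(i + 1) == server)
                    listed = 1
        }
        END {
            if (listed)       print "server-listed"
            else if (running) print "server-not-listed"
            else              print "not-running"
        }
    '
}
```

On the client, ps -ef | nsrexecd_status serverName prints one of the three states.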

File Changed During Backup

NetWorker backs up the image that is in the filesystem at the time it comes across the file.  NetWorker will notify you that the file was changed during the backup in the Backup Status window and the savegroup completion mail.  You can back up the file manually after it has been closed, or wait until the next incremental backup.

Cannot Print Bootstrap Information

If your bootstraps are not being printed, you may need to enter the printer name as a hidden attribute using the following steps:

  1. Open the Groups window and choose Details from the View menu.

  2. Enter the name of the printer you are using to print the bootstrap in the Printer field.

  3. Click Apply to save your changes.

Copy Violation

If you installed NetWorker on more than one server using the same NetWorker enabler code, you will receive the following messages in your savegroup completion mail:

--- Unsuccessful Save Sets ---
* quattro:/var save: error, copy violation - servers `quattro' and
 `spim' have the same software enabler code, `12345' (13)
* quattro:/var save: cannot start a save for /var with NSR server
 `quattro'
* quattro:index save: error, copy violation - servers `quattro' and
 `spim' have the same software enabler code, `12345'
* quattro:index save: cannot start a save for /usr/nsr/index/quattro
 with NSR server `quattro'
* quattro:index save: cannot start a save for bootstrap with NSR server
 `quattro'
* quattro:index /usr/etc/savegrp: bootstrap save of server's index and
 volume databases failed

To complete a backup, you must kill the NetWorker daemons on both servers, de-install NetWorker from the extra server(s), and restart the NetWorker daemons on one server.

  1. To kill the NetWorker daemons, log in to the NetWorker servers as root and enter the following command on all servers that have NetWorker installed:

    spim# /etc/init.d/networker stop 
    ...
    quattro# /etc/init.d/networker stop 
    

  2. Use the following command to de-install NetWorker on the server(s) that you will not be using as a NetWorker server:

    spim# versions remove networker4 
    

  3. Finally, restart the NetWorker daemons on one server with these commands:

    quattro# nsrd & 
    quattro# nsrexecd & 
    

Maximum Filename Length

NetWorker supports a maximum filename length of 1024 characters.  This is the same as the limit in the UNIX System V Interface Definition (SVID).
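
To find names that exceed the limit before a backup runs, a one-line filter is enough. This sketch assumes the 1024-character limit applies to the full pathname as printed by find, which the text above does not state explicitly. paths_over_limit is a hypothetical helper name.

```shell
# Hypothetical helper: read pathnames on standard input, one per line, and
# print those longer than the limit (default 1024 characters).
paths_over_limit() {
    awk -v limit="${1:-1024}" 'length($0) > limit'
}
```

For example, find /usr -print | paths_over_limit lists any pathname under /usr that is longer than 1024 characters.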

Savegroup Completion Warning Messages

Occasionally the savegroup completion mail includes one or more warning messages.  These messages contain information that helps the administrator understand why NetWorker performs certain tasks.

Below is one of the messages you might see:

quattro:/usr no cycles found in media db; doing full save 

In this example, the filesystem, /usr, on the client quattro has no full saves listed in the media database.  Therefore, despite the backup level pre-selected for that client's schedule, NetWorker will perform a full backup.  This feature is important because it allows you to perform disaster recoveries for that client.

This message may also appear if the server and client clocks are not synchronized.  To avoid this, make sure the NetWorker server and client

  • are in the same time zone

  • have their clocks synchronized

The following savegroup message may also appear:

NetWorker_server:index Saving server index because server is not in an active group

If your server belongs to a group that is not enabled, NetWorker saves the server bootstrap information along with the current group, so that a later recovery does not become a lengthy process.  As soon as possible, enable the group to which your NetWorker server belongs.
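
The messages described in this section can be recognized mechanically when you have many completion mails to review. The following sketch maps one line of mail text to a short label. The function name and the labels are hypothetical, and the patterns are taken only from the message texts quoted in this appendix.

```shell
# Hypothetical helper: classify one line of savegroup completion mail
# according to the messages described in this appendix.
classify_savegroup_line() {
    case $1 in
        *"Server not available"*|*"remote program is not registered"*)
            echo "daemons-not-running" ;;
        *"cannot request command execution"*)
            echo "nsrexecd-server-not-listed" ;;
        *"sh: permission denied"*)
            echo "nsrexecd-not-running" ;;
        *"copy violation"*)
            echo "duplicate-enabler-code" ;;
        *"no cycles found in media db"*)
            echo "forced-full-save" ;;
        *"server is not in an active group"*)
            echo "server-group-not-enabled" ;;
        *)
            echo "unclassified" ;;
    esac
}
```

To label every line of a saved mail message, read it in a loop, for example: while IFS= read -r line; do classify_savegroup_line "$line"; done < mailFile.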

X11 Errors

The following error message may appear when the nwadmin & command is executed:

Xlib: connection to "client:0.0" refused by server
Xlib: Client is not authorized to connect to Server
X error: Cannot open display on window server: client:0.0 (Server pkg)

This indicates that the client is not authorized to display NetWorker.

To correct this situation do the following at the client machine:

client% xhost NetWorkerServer 

Remotely log in to the NetWorker server and run the following command at the server prompt:

% setenv DISPLAY client:0.0 

For the Korn shell or the Bourne shell, use the following commands:

# DISPLAY=client:0.0 
# export DISPLAY 

Moving Indexes

Because the index databases are holey (sparse) files, cp fills in the holes and creates a copy that consumes more disk space than the original file.  To move indexes, execute the following command in the /nsr/index directory:

# uasm -si clientIndexDirectoryName | (cd targetDir; uasm -rv) 

Recovering Files From an Interrupted Backup

You cannot recover files from a backup terminated by killing the NetWorker daemons because the media index was not updated before the daemons exited.  Consequently, NetWorker does not know on which volume the requested file is located.

Determining the NetWorker Server

If you start NetWorker from a remotely mounted directory, you may receive the following message:

Using server serverName as server for clientName.

NetWorker looks for the system that is the fileserver of a remotely mounted directory and uses the NetWorker server assigned to that system as the backup server.  To bypass this message, start NetWorker from a local filesystem.

Using nsrexecd

The nsrexecd daemon runs on NetWorker client machines.  This daemon provides a secure and restrictive way for NetWorker to start automatic backups on clients.  The nsrexecd daemon allows you to restrict access to a select set of NetWorker servers.  When you install NetWorker on a client, chkconfig automatically turns NetWorker on, so nsrexecd will be started each time the client reboots.  Security is increased by the use of a challenge/response scheme to ensure that only the NetWorker server is initiating connections, and not another program.

The boot-time file for each client type is listed in the table below.  If you ever need to reconfigure nsrexecd (for example, to allow a different NetWorker server to back up the client), edit the appropriate file on the client, change the nsrexecd startup command, and restart nsrexecd.  See the nsrexecd(1M) reference page for a description of the command-line configuration options.

Make sure you enter the nsrexecd command in exactly the same way as it is listed in the boot-time file, complete with all command-line options.  Alternatively, on non-IRIX systems you can use nsr_ize -c -u to deinstall the client software, entering no whenever it asks you questions.  Then, use nsr_ize -c -i and follow the instructions as if you were performing an NFS client install.

Table C-1 shows the location of boot-time files on different operating systems.

Table C-1. Where to Start nsrexecd

Operating System Type    Boot-Time File
AIX                      /etc/rc.nsr
HP-UX                    /etc/rc or /sbin/init.d/networker
IRIX                     with chkconfig(1M)
SCO                      /etc/rc2.d/S95networker
Solaris 2.x              /etc/rc2.d/S95networker
SunOS 4.1.x              /etc/rc.local
Ultrix                   /etc/rc.local
others                   /etc/rc2.d/S95networker
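
For administration scripts that handle several client types, Table C-1 can be expressed as a lookup. This is a sketch only: nsrexecd_boot_file is a hypothetical helper, and it is keyed by the operating system labels used in the table, not by the output of uname.

```shell
# Hypothetical helper: print the Table C-1 entry for an operating system label.
nsrexecd_boot_file() {
    case $1 in
        AIX)                  echo "/etc/rc.nsr" ;;
        HP-UX)                echo "/etc/rc or /sbin/init.d/networker" ;;
        IRIX)                 echo "chkconfig(1M)" ;;
        SCO|"Solaris 2.x")    echo "/etc/rc2.d/S95networker" ;;
        "SunOS 4.1.x"|Ultrix) echo "/etc/rc.local" ;;
        # Table C-1 lists this file for all other systems.
        *)                    echo "/etc/rc2.d/S95networker" ;;
    esac
}
```

For example, nsrexecd_boot_file AIX prints /etc/rc.nsr.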