Chapter 1. Understanding ONC3/NFS

This chapter introduces the Silicon Graphics implementation of the Sun Microsystems Open Network Computing Plus (ONC+) distributed services, which was previously referred to as Network File System (NFS). In this guide, NFS refers to the distributed file system in ONC3/NFS.

The information in this chapter is prerequisite to successful ONC3/NFS administration. It defines ONC3/NFS and its relationship to other network software, introduces the ONC3/NFS vocabulary, and identifies the software elements that support ONC3/NFS operation. It also explains special utilities and implementation features of ONC3/NFS. You should be familiar with the information in this chapter before setting up or modifying the ONC3/NFS environment.

The components of ONC3/NFS are described below.

NFS 

The distributed file system in ONC3/NFS. It contains the automounter and lock manager. ONC3/NFS includes a new version of the NFS protocol, NFS3, which is optimized and designed to be transparent to users. NFS is multithreaded to take advantage of multiprocessor performance.

CacheFS 

The Cache File System (CacheFS) is a new file system type in IRIX 5.3 that provides client-side caching for NFS and other file system types. Using CacheFS on NFS clients with local disk space can significantly increase the number of clients a server can support and reduce the data access time for clients using read-only file systems.

NIS 

The network information service (NIS) is a database of network entity location information that can be used by NFS. Information about NIS is published in a separate volume called the NIS Administration Guide.

This chapter contains these sections:

“What Is NFS?”
“NFS and Diskless Workstations”
“The Cache File System”
“NFS and the Network Information Service”
“Client-Server Fundamentals”
“Stateless Protocol”
“Input/Output Management”
“NFS File Locking Service”

What Is NFS?

NFS is a network service that allows users to access file hierarchies across a network and treat them as if they were local. File hierarchies can be entire file systems or individual directories. Systems participating in the NFS service can be heterogeneous. They may be manufactured by different vendors, use different operating systems, and be connected to networks with different architectures. These differences are transparent to the NFS application.

NFS is an application layer service that can be used on any network running the Transmission Control Protocol (TCP) or User Datagram Protocol (UDP). It relies on remote procedure calls (RPC) for session layer services and external data representation (XDR) for presentation layer services.

XDR is a library of routines that translate data formats between processes.

Figure 1-1 illustrates the NFS software implementation in the context of the Open Systems Interconnect (OSI) model.

Figure 1-1. NFS Software Implementation


NFS and Diskless Workstations

It is possible to set up a system so that all the required software, including the operating system, is supplied from remote systems by means of the NFS service. Workstations operating in this manner are considered diskless workstations, even though they may be equipped with a local disk.

Instructions for implementing diskless workstations are given in the Diskless Workstation Administration Guide. However, it is important to acquire a working knowledge of NFS before setting up a diskless system.

The Cache File System

A cache is a temporary storage area for data. The Cache File System (CacheFS) enables you to use local disk drives on workstations to store frequently used data from a remote file system or CD-ROM. The data stored on the local disk is the cache.

When a file system is cached, the data is read from the original file system and stored on the local disk. Subsequent requests for that data are satisfied from the local cache rather than sent across the network; the resulting reduction in network traffic improves performance. If the remote file system is on a storage medium with slower response time than the local disk (such as a CD-ROM), caching provides an additional performance gain.

CacheFS can use all or part of a local disk to store data from one or more remote file systems. A user accessing a file does not need to know whether the file is stored in a cache or is being read from the original file system. The user opens, reads, and writes files as usual.
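
For illustration, a client administrator might create a cache on a local disk and then mount a remote NFS file system through it. The commands below are a sketch only; the cache directory (/cache), server (redwood), and mount point (/n/demos) are hypothetical names, and the cfsadmin(1M) command and CacheFS mount options are assumed to behave as described in the CacheFS documentation.

     # Create a cache directory on the local disk (hypothetical location).
     cfsadmin -c /cache

     # Mount an NFS file system through the cache; data read from
     # redwood:/usr/demos is kept in /cache for later reads.
     mount -t cachefs -o backfstype=nfs,cachedir=/cache redwood:/usr/demos /n/demos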

NFS and the Network Information Service

The Network Information Service (NIS) is a database service that provides location information about network entities to other network servers and applications, such as NFS. NFS and NIS are independent services that may or may not be operating together on a given network. On networks running NIS, NFS may use the NIS databases to locate systems when NIS queries are specified.

Client-Server Fundamentals

In an NFS transaction, the workstation requesting access to remote directories is known as the client. The workstation providing access to its local directories is known as the server. A workstation can function as a client and a server simultaneously. It can allow remote access to its local file systems while accessing remote directories with NFS. The client-server relationship is established by two complementary processes, exporting and mounting.

Exporting

Exporting is the process by which an NFS server makes its file resources available to remote clients. Individual directories, as well as file systems, can be exported, but exported entities are usually referred to as file systems. Exporting is done either during the server's boot sequence or from a command line as superuser while the server is running.

Once a file system is exported, any authorized client can use it. A list of exported file systems, client authorizations, and other export options are specified in the /etc/exports file (see “/etc/exports and Other Export Files” in Chapter 2 for details). Exported file systems are removed from NFS service by a process known as unexporting.
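
For example, a server administrator might export /usr/demos read-only to two clients with an entry like the one below in /etc/exports, then export it with exportfs(1M). The pathname, client names, and options shown here are hypothetical; the complete syntax appears in Chapter 2.

     # /etc/exports entry (hypothetical): read-only, limited to two clients
     /usr/demos  -ro,access=client1:client2

     # Export all entries listed in /etc/exports (run as superuser)
     exportfs -av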

A server can export any file system or directory that is local. However, it cannot export both a parent and child directory within the same file system; to do so is redundant.

For example, assume that the file system /usr contains the directory /usr/demos. As the child of /usr, /usr/demos is automatically exported with /usr. For this reason, attempting to export both /usr and /usr/demos generates an error message that the parent directory is already exported. If /usr and /usr/demos were separate file systems, this example would be valid.

Mounting

Mounting is the process by which file systems, including NFS file systems, are made available to the IRIX operating system and consequently, the user. When NFS file systems or directories are mounted, they are made available to the client over the network by a series of remote procedure calls that enable the client to access the file system transparently from the server's disk. Mounted NFS directories or file systems are not physically present on the client system, but the mount looks like a local mount and users enter commands as if the file systems were local.

NFS clients can have directories mounted from several servers simultaneously. Mounting can be done as part of the client's boot sequence; automatically, at file system access, with the help of a user-level daemon; or with a superuser command after the client is running. When mounted directories are no longer needed, they can be relinquished in a process known as unmounting.

Like locally mounted file systems, NFS mounted file systems and directories can be specified in the /etc/fstab file (see “/etc/fstab and Other Mount Files” in Chapter 2 for details). Since NFS file systems are located on remote systems, specifications for NFS mounted resources must include the name of the system where they reside.
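
For example, an NFS mount can be specified in the client's /etc/fstab with an entry such as the one below. The server name (redwood), pathnames, and options are hypothetical; see Chapter 2 for the options that apply to NFS entries.

     # /etc/fstab entry (hypothetical): mount redwood's /usr/demos read-only at /n/demos
     redwood:/usr/demos  /n/demos  nfs  ro,bg  0 0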

Mount Points

The access point in the client file system where an NFS directory is attached is known as a mount point. A mount point is specified by a conventional IRIX pathname.

Figure 1-2 illustrates the effect of mounting directories onto mount points on an NFS client.

Figure 1-2. Sample Mounted Directory


The pathname of a file system on a server can be different from its mount point on the client. For example, in Figure 1-2 the file system /usr/demos is mounted in the client's file system at mount point /n/demos. Users on the client gain access to the mounted directory with a conventional cd(1) command to /n/demos, as if the directory were local.
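
To set up this arrangement manually, the superuser on the client creates the mount point and mounts the remote directory. The server name redwood is an assumption for illustration; the pathnames follow Figure 1-2.

     mkdir -p /n/demos                      # create the mount point
     mount redwood:/usr/demos /n/demos      # attach the remote directory at /n/demos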

Mount Restrictions

NFS does not permit multihopping, mounting a directory that is itself NFS mounted on the server. For example, if host1 mounts /usr/demos from host2, host3 cannot mount /usr/demos from host1. This would constitute a multihop.

NFS also does not permit loopback mounting, mounting a directory that is local to the client via NFS. For example, the local file system /usr on host1 cannot be NFS mounted to host1; this would constitute a loopback mount.
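
The hypothetical commands below illustrate both restrictions, using the host names from the preceding examples; each mount attempt fails.

     host3# mount host1:/usr/demos /n/demos   # multihop: /usr/demos on host1 is itself NFS mounted from host2
     host1# mount host1:/usr /usr.nfs         # loopback: /usr is local to host1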

Automatic Mounting

As an alternative to standard mounting via /etc/fstab or the mount command, NFS provides an automatic mounting feature called the automounter, or automount. The automounter dynamically mounts file systems when they are referenced by any user on the client system, then unmounts them after a specified time interval. Unlike standard mounting, automount(1M), once set up, does not require superuser privileges to mount a remote directory. It also creates the mount points needed to access the mounted resource. NFS servers cannot distinguish directories mounted by the automounter from those mounted by conventional mount procedures.

Unlike the standard mount process, automount does not read the /etc/fstab file for mount specifications. Instead, it reads alternative files (either local or through NIS) known as maps for mounting information (see “automount Files and Maps” in Chapter 2 for details). It also provides special maps for accessing remote systems and automatically reflecting changes in the /etc/hosts file and any changes to the remote server's /etc/exports file.

Default configuration information for automounting is contained in the file /etc/config/automount.options. This file can be modified to use different options and more sophisticated maps.
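
As a sketch, an indirect automount map might associate a key with a server and directory as shown below. The map name, key, and server are assumptions for illustration; actual map syntax and the use of /etc/config/automount.options are covered in “automount Files and Maps” in Chapter 2.

     # /etc/auto.demos (hypothetical indirect map)
     # key     mount options   server:directory
     demos     -ro             redwood:/usr/demos

When a map of this form is associated with a parent directory such as /n, referencing /n/demos causes automount to mount redwood:/usr/demos there automatically.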

automount Restrictions

CacheFS file systems cannot be automatically mounted with automount.

Stateless Protocol

NFS implements a stateless protocol in which the server maintains almost no information on NFS processes. This stateless protocol insulates clients and servers from the effects of failures. If a server fails, the only effect on clients is that NFS data on the server is unavailable to them. If a client fails, server performance is not affected.

Clients are independently responsible for completing NFS transactions if the server or network fails. By default, when a failure occurs, NFS clients continue attempting to complete the NFS operation until the server or network recovers. To the client, the failure can appear to be slow performance on the part of the server. Client applications continue retransmitting until service is restored and their NFS operations can be completed. If a client fails, no action is needed by the server or its administrator in order for the server to continue operation.

The major advantage of a stateless server is robustness in the face of client, server, or network failures. This robustness is especially important in a complex network of heterogeneous systems, many of which are not under the control of a centralized operations staff, and some of which are systems that are often rebooted without warning.

Input/Output Management

In NFS transactions, unless otherwise specified, data input and output are asynchronous, using read-ahead and write-behind. As the server receives data, it notifies the client that the data was successfully written. The client responds by freeing the blocks of NFS data successfully transmitted to the server. In reality, however, the server might not write the data to disk before notifying the client, a technique called delayed writes. Writes are done when they are convenient for the server, but at least every 30 seconds.

Although delayed write is the default method of operation for NFS, synchronous writes are also an option (see “/etc/exports Options” in Chapter 2 for more details about NFS options). With synchronous writes, the server writes the data to disk before notifying the client that it has been written. Synchronous writes may slow NFS performance due to the time required for disk access, but increase data integrity in the event of system or network failure.

NFS File Locking Service

To help manage file access conflicts and protect NFS sessions during failures, NFS offers a file and record locking service called the network lock manager. The network lock manager is not an integral part of NFS. It is a separate service that NFS makes available to user applications equipped to use it. To use the locking service, applications must make calls to the standard IRIX lock routines (fcntl(2), flock(3B), and lockf(3C)). For NFS files, these calls are received by the network lock manager process (lockd(1M)).

The network lock manager processes must run on both the client and the server to function properly. Communication between the two processes is by means of RPC. Calls for service issued to the client process are handed to the server process, which uses its local IRIX locking utilities to handle the call. If the file is in use, the lock manager issues an advisory to the calling application, but it does not prevent the application from accessing a busy file. The application must determine how to respond to the advisory, using its own facilities.

There are four basic kernel-to-lock manager requests:

KLM_LOCK 

Lock the specified record.

KLM_UNLOCK 

Unlock the specified record.

KLM_TEST 

Test if the specified record is locked.

KLM_CANCEL 

Cancel an outstanding lock request.

Although the network lock manager adheres to lockf/fcntl semantics, its operating characteristics are influenced by the nature of the network, particularly during crashes.

Locking and Crash Recovery

As part of the file locking service, the network lock manager assists with crash recovery by maintaining state information on locked files. It uses this information to reconstruct locks in the event of a server or client failure.

When an NFS client goes down, the lock managers on all of its servers are notified by their status monitors, and they simply release their locks, on the assumption that the client will request them again when it wants them. When a server crashes, however, matters are different. The client status monitors notify their respective lock managers when the server recovers. When it comes back up, the server's lock manager gives the client lock managers a grace period in which to submit lock reclaim requests, and during this period it accepts only reclaim requests. The default grace period is 45 seconds.

After a server crash, a client may not be able to recover a lock that it had on a file on that server, because another process may have beaten the recovering application process to the lock. In this case the SIGLOST signal is sent to the process (the default action for this signal is to kill the application).

The local lock manager does not reply to the kernel lock request until the server lock manager has responded to it. Further, if the lock request is on a server new to the local lock manager, the lock manager registers its interest in that server with the local status monitor and waits for its reply. Thus, if either the status monitor or the server's lock manager is unavailable, the reply to a lock request for remote data is delayed until it becomes available.

Locking and the Network Status Monitor

To handle crash recoveries, the network lock manager relies on information provided by the network status monitor. The network status monitor is a general service that provides information about network systems to network services and applications. The network status monitor notifies the network lock manager when a network system recovers from a failure, and by implication, that the system failed. This notification alerts the network lock manager to retransmit lock recovery information to the server.

To use the network status monitor, the network lock manager registers with the status monitor process (statd(1M)) the names of clients and servers for which it needs information. The network status monitor then tracks the status of those systems and notifies the network lock manager when one of them recovers from a failure.