Chapter 2. ClearCase Data Storage

This chapter describes the on-disk data structures that implement ClearCase VOBs and views. File system objects stored in these structures are termed MVFS objects, because client programs access them through the ClearCase multiversion file system.

Versioned Object Bases (VOBs)

The ClearCase data repository for a network is implemented as a set of versioned object bases (VOBs). Each VOB is implemented as a UNIX directory tree, whose top-level directory is termed the VOB storage directory. The main components of this directory tree are:

  • VOB database—The db subdirectory contains the binary files managed by ClearCase's embedded DBMS. Each VOB has its own database; there is no central database that encompasses all VOBs. The database stores several kinds of data:

    • version-control information: elements, their branch structures, and their versions

    • meta-data associated with the file system objects: version labels, attributes, and so on

    • event records and configuration records, which document ClearCase development activities

    • type objects, which are involved in the implementation of both the version-control structures and the meta-data

Actual file system data (for example, the contents of version 3 of file msg.c) is not stored in the VOB database.

  • VOB storage pools—The c, d, and s subdirectories contain the VOB's storage pools, each of which is a standard UNIX directory. The storage pools hold data container files, which store the VOB's file system data: versions of elements and shared binaries. Depending on the element type, the versions of an element might be stored in separate data container files, or might be combined into a single structured file that contains deltas (version-to-version differences).

  • Identity directory—The .identity subdirectory contains files that establish the VOB's owner, its principal group, and its group list.

The vob manual page provides a detailed description of the contents of a VOB. (The .identity directory is not discussed further in this chapter—see “VOBs and Views: Owner and Groups” for more on this topic.)
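The layout described above can be illustrated with a short Python sketch. It is only an illustration: the subdirectory names (db, c, d, s, .identity) come from this chapter, but the helper functions and the directory name demo.vbs are hypothetical, and the mock tree stands in for a VOB that mkvob would really create.

```python
import os
import tempfile

# Subdirectory names are taken from the text above; the roles are paraphrased.
VOB_COMPONENTS = {
    "db": "VOB database",
    "c": "cleartext storage pools",
    "d": "derived object storage pools",
    "s": "source storage pools",
    ".identity": "owner and group identity files",
}


def describe_vob_storage(vob_storage_dir):
    """Return {subdirectory: (role, present?)} for a VOB storage directory."""
    report = {}
    for name, role in VOB_COMPONENTS.items():
        path = os.path.join(vob_storage_dir, name)
        # isdir() follows symbolic links
        report[name] = (role, os.path.isdir(path))
    return report


def make_mock_vob(parent):
    """Create an empty stand-in tree (hypothetical; a real VOB is created by mkvob)."""
    root = os.path.join(parent, "demo.vbs")
    for name in VOB_COMPONENTS:
        os.makedirs(os.path.join(root, name))
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        report = describe_vob_storage(make_mock_vob(tmp))
        for name, (role, present) in report.items():
            print(f"{name:10} {role:32} {'present' if present else 'missing'}")
```

The sketch only checks that the expected subdirectories exist; it does not read or interpret their contents.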


Note: Users do not directly access a VOB storage directory. Rather, they access a VOB through its VOB-tag, which specifies a location at which the VOB is activated (mounted) as a file system of type MVFS. Chapter 3 discusses this in detail.


VOB Database

Each VOB has its own database, implemented as a set of files in the db subdirectory of the VOB storage directory. ClearCase server programs are invoked automatically, as needed, to access a VOB's database:

  • A db_server process handles requests from a single ClearCase client program.

  • A vobrpc_server process handles requests from one or more ClearCase view_server processes.

These server processes run on the host where the VOB storage directory physically resides. The db subdirectory must also be on that same host. (As explained below, storage pools can be remote.)


Caution: You cannot simply move the VOB database directory (db) to another host. See Chapter 11, “Occasional VOB Maintenance”, for steps you can take if a VOB database threatens to fill up its disk partition.

For the most part, ClearCase servers and crontab(1) scripts manage VOB databases automatically. See Chapter 10, “Periodic Maintenance of the Data Repository” and Chapter 11 for more on VOB maintenance.

VOB Storage Pools

Each VOB has a set of storage pools, which hold several different kinds of data containers. Each storage pool holds data containers of one kind (Figure 2-1).

Figure 2-1. VOB Database and VOB Storage Pools


Source Storage Pools

Each source storage pool holds all the source data containers for a set of file elements. A source data container holds the contents of one or more of a file element's versions. For example, a single source data container holds all the versions of an element of type text_file. The type manager program for this element type handles the task of reconstructing individual versions from deltas in the data container. Likewise, the type manager updates the data container when a new version is checked in.
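To make the idea of a single delta-based data container concrete, here is a minimal Python sketch. It does not reproduce the actual container format or the real type manager, neither of which is described in this chapter. It assumes a simple scheme in which the first version is stored in full and each later version is stored as a line-based delta against its predecessor. The class and function names are hypothetical, and versions are numbered from 1 for simplicity.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Record how to turn old_lines into new_lines as a list of operations."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))              # reuse lines of the old version
        else:
            delta.append(("insert", new_lines[j1:j2]))  # empty list for a pure deletion
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the newer version from the older one plus its delta."""
    new_lines = []
    for op in delta:
        if op[0] == "copy":
            new_lines.extend(old_lines[op[1]:op[2]])
        else:
            new_lines.extend(op[1])
    return new_lines


class SourceContainer:
    """One container: version 1 in full, plus one delta per later version."""

    def __init__(self, first_version):
        self.base = list(first_version)
        self.deltas = []
        self._latest = list(first_version)

    def checkin(self, new_version):
        """Update the container with a new version; return its version number."""
        self.deltas.append(make_delta(self._latest, new_version))
        self._latest = list(new_version)
        return len(self.deltas) + 1

    def reconstruct(self, version):
        """Replay deltas from the base until the requested version is reached."""
        lines = list(self.base)
        for delta in self.deltas[: version - 1]:
            lines = apply_delta(lines, delta)
        return lines
```

The point of the sketch is the trade-off: storage is compact because unchanged lines are never duplicated, but reading an old version costs a reconstruction step.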

Source pools are accessed by checkout and checkin commands, and by development operations (for example, cat(1), lp(1), cc(1)) that read the contents of elements that are not checked out. In many cases, however, a cleartext pool is accessed instead of the source pool.

Cleartext Storage Pools

Each cleartext storage pool holds all the cleartext data containers for a set of file elements. A cleartext data container holds the contents of one version of an element. These pools are caches that accelerate access to elements for which all versions are stored in a single data container: compressed files and text files.

For example, the first time a version of a text_file element is required, the text_file_delta type manager reconstructs the version from the element's source data container. The version is cached as a cleartext data container—an ordinary text file—located in a cleartext storage pool. On subsequent accesses, ClearCase looks first in the cleartext pool. A “cache hit” eliminates the need to access a source pool, thus reducing the load on that pool; it also eliminates the need for the type manager to reconstruct the requested version.

Cache hits are not guaranteed, since cleartext storage pools are periodically scrubbed. (See Chapter 10, “Periodic Maintenance of the Data Repository”.) A miss simply means that the type manager must be invoked to reconstruct the version again.
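The lookup order described above (cleartext pool first, type manager only on a miss) can be sketched in Python as follows. This is a simplified model, not ClearCase code: the pool is an in-memory dictionary, the reconstruct argument is a stand-in for the type manager, and scrub() discards everything, whereas the real scrubber deletes only unneeded containers.

```python
class CleartextPool:
    """In-memory stand-in for a cleartext storage pool (a cache of whole versions)."""

    def __init__(self):
        self._containers = {}  # (element, version) -> reconstructed text

    def lookup(self, element, version):
        return self._containers.get((element, version))

    def store(self, element, version, text):
        self._containers[(element, version)] = text

    def scrub(self):
        # Simplification: drop every cached container.
        self._containers.clear()


def read_version(element, version, pool, reconstruct):
    """Return (text, "hit" or "miss"), filling the cache on a miss."""
    text = pool.lookup(element, version)
    if text is not None:
        return text, "hit"
    text = reconstruct(element, version)  # stand-in for the type manager
    pool.store(element, version, text)
    return text, "miss"


if __name__ == "__main__":
    pool = CleartextPool()

    def fake_type_manager(element, version):
        return f"contents of {element}, version {version}\n"

    print(read_version("msg.c", 3, pool, fake_type_manager)[1])  # first access
    print(read_version("msg.c", 3, pool, fake_type_manager)[1])  # repeated access
    pool.scrub()
    print(read_version("msg.c", 3, pool, fake_type_manager)[1])  # after scrubbing
```

A miss after scrubbing costs only time: the version is reconstructed and cached again, so no data is lost.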

Derived Object Storage Pools

Each derived object storage pool holds a collection of derived object data containers. A derived object data container holds the file system data (typically, binary data) of one DO, created during clearmake or clearaudit execution.

DO storage pools contain data containers only for the derived objects that have been shared by two or more views, through ClearCase's wink-in feature. Each directory element is assigned to a particular DO storage pool; the first time a DO that was created within that directory is winked-in to some view, its data container is copied to the corresponding DO storage pool. The data containers for never-shared derived objects reside in view-private storage.
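As a rough illustration of how the DO pool is chosen and populated, consider the following Python sketch. The pathnames and the pool name do_lib are hypothetical. The per-directory lookup is deliberately flat: it falls back to the default pool ddft rather than modeling how directories acquire their assignments. The pools themselves are plain dictionaries, not directories on disk.

```python
import posixpath

DEFAULT_DO_POOL = "ddft"  # default derived object pool created by mkvob


def do_pool_for(do_path, dir_pool_assignments):
    """Pick the DO pool from the directory in which the DO was created."""
    directory = posixpath.dirname(do_path)
    return dir_pool_assignments.get(directory, DEFAULT_DO_POOL)


def wink_in(do_path, view_private_data, vob_do_pools, dir_pool_assignments):
    """Copy the data container into the VOB pool on the first wink-in only.

    Returns True if a copy was made, False if the DO was already shared.
    """
    pool_name = do_pool_for(do_path, dir_pool_assignments)
    pool = vob_do_pools.setdefault(pool_name, {})
    if do_path in pool:
        return False
    pool[do_path] = bytes(view_private_data)  # copy; the view keeps its own container
    return True
```

A derived object that is never winked-in never reaches wink_in(), so its data stays in view-private storage only.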

Derived object pools are periodically scrubbed, as described in Chapter 10.

The vob_server Process

Most access to VOB storage pools goes through a ClearCase server program, the vob_server. This process handles data-access requests from clients, forwarded to it by the VOB's db_server and vobrpc_server processes. As with these other servers, a vob_server runs on the host where the corresponding VOB storage directory resides. Each VOB on a host has its own dedicated vob_server process.

Default, Local, and Remote Storage Pools

As part of creating a new VOB, the mkvob command creates three subdirectories for storage pools, with a single default storage pool within each one:

  c         directory for all cleartext pools
  c/cdft    default cleartext pool
  d         directory for all derived object pools
  d/ddft    default derived object pool
  s         directory for all source pools
  s/sdft    default source pool

You can create as many additional storage pools as desired (with mkpool), and can adjust the assignment of elements to these pools (with chpool).

By default, the mkpool command creates new storage pools within the VOB storage directory itself. Such pools are termed local. For example, mkpool -source srcpl2 creates a local pool as subdirectory s/srcpl2 under the VOB storage directory.

You can use mkpool -ln to create a remote storage pool, leaving behind a standard UNIX-level symbolic link that points to the remote location:

# cleartool mkpool -source -ln /net/ccsvr04/ccase_pools/srcpl3 srcpl3

In this example, a storage pool directory is created at the remote location /net/ccsvr04/ccase_pools/srcpl3. Within the VOB storage directory, a symbolic link is created instead of a subdirectory; the text of the link is /net/ccsvr04/ccase_pools/srcpl3.
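The on-disk result can be mimicked with ordinary file system calls, as in the Python sketch below. It does not invoke cleartool and is not how mkpool is implemented. It only reproduces the structure just described (a real directory at the remote location, a symbolic link in the VOB storage directory) inside a temporary directory. The function names and the directory name demo.vbs are hypothetical.

```python
import os
import tempfile


def make_remote_pool(vob_storage_dir, kind_dir, pool_name, remote_path):
    """Create the pool directory remotely and leave a symbolic link behind."""
    os.makedirs(remote_path)
    link_path = os.path.join(vob_storage_dir, kind_dir, pool_name)
    os.symlink(remote_path, link_path)  # the text of the link is remote_path
    return link_path


def classify_pool(vob_storage_dir, kind_dir, pool_name):
    """Return ("remote", link text) for a symbolic link, else ("local", path)."""
    path = os.path.join(vob_storage_dir, kind_dir, pool_name)
    if os.path.islink(path):
        return "remote", os.readlink(path)
    return "local", path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        vob = os.path.join(tmp, "demo.vbs")
        os.makedirs(os.path.join(vob, "s", "sdft"))  # local default source pool
        make_remote_pool(vob, "s", "srcpl3", os.path.join(tmp, "ccase_pools", "srcpl3"))
        for pool in ("sdft", "srcpl3"):
            print(pool, *classify_pool(vob, "s", pool))
```

A check like classify_pool() can help when planning backups, because it shows which pools live outside the VOB storage directory.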

The remote location can be on another host (even a non-ClearCase host) or in another disk partition on the local host (Figure 2-2). Either way, this capability lets a VOB circumvent the UNIX limitation that requires a directory tree to be wholly contained within a single disk partition. It also allows you to use high-capacity or high-speed file servers on which ClearCase is not installed.

Figure 2-2. Local and Remote VOB Storage Pools


This is a powerful feature, enabling a single logical entity to be distributed physically. But there are some provisos:

  • The important task of data backup is considerably harder for a distributed VOB than for a VOB wholly contained in a single disk partition. See “Backing Up a VOB with Remote Storage Pools”.

  • You must be careful in devising the pathname of the remote location. This pathname must be valid on all client hosts that will access the VOB. In particular, you cannot use the network region facility to handle network idiosyncrasies, such as hosts with multiple network interfaces.

Elements' Source Pool Assignments

The mkvob command creates a single directory element, the VOB's root directory. Users access this directory at the VOB-tag location (the VOB's mount point). This top-level directory is assigned to the three default storage pools; and by default, all newly-created elements inherit the pool assignments of their parent directories. Thus, all elements in a VOB will use the default storage pools, unless you create new pools and reassign elements to them.
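The inheritance rule can be modeled in a few lines of Python. This is a conceptual sketch only: the Element class and the reassign_pool function are hypothetical and do not correspond to any ClearCase interface. Pool assignments are held in a dictionary that each new element copies from its parent at creation time.

```python
DEFAULT_POOLS = {"source": "sdft", "cleartext": "cdft", "derived_object": "ddft"}


class Element:
    """An element that records its pool assignments when it is created."""

    def __init__(self, name, parent=None):
        self.name = name
        # A new element copies its parent's assignments; the VOB root
        # directory (no parent) starts with the three default pools.
        source = parent.pools if parent is not None else DEFAULT_POOLS
        self.pools = dict(source)


def reassign_pool(element, kind, pool_name):
    """Change one pool assignment; elements that already exist are unaffected."""
    if kind not in element.pools:
        raise KeyError(f"unknown pool kind: {kind}")
    element.pools[kind] = pool_name


if __name__ == "__main__":
    root = Element("/")
    before = Element("old.c", parent=root)
    reassign_pool(root, "source", "srcpl2")
    after = Element("new.c", parent=root)
    print(before.pools["source"], after.pools["source"])
```

Because assignments are copied at creation time, reassigning a directory affects only the elements created in it afterward.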

You can use the chpool command to change the source and/or cleartext pool associated with an element. Changing the source pool of a file moves all its data containers; for a directory element, this changes the source pool to which new elements created within it will be assigned. (See also “Creating Additional VOB Storage Pools”, and the mkvob manual page.)

Commands for Working with Storage Pools

The following commands are your basic tools for working with VOB storage pools. Each has its own manual page, which provides complete details on its usage.

cleartool subcommands:

  mkpool    Creates a new storage pool; with -update, adjusts an existing pool's scrubbing parameters.

  lspool    Lists basic information about one or more storage pools. (The describe -pool command lists the same information.)

  rnpool    Renames a storage pool.

  rmpool    Deletes a storage pool.

  chpool    Reassigns elements to a different pool.

utility commands:

  scrubber         Deletes unneeded data containers from derived object and cleartext pools. (See also “Scrubbing VOB Storage Pools”.)

  view_scrubber    With -p option, transfers data containers from view-private storage to a VOB's derived object storage pool. (See also “Scrubbing View-Private Storage”.)

Views

A ClearCase development environment can include any number of views. A typical view is “private” to a single user, or perhaps to a small group of users tackling a particular task as a team.

Each view implements a virtual workspace, which presents its user(s) with an extended file system that superficially appears to be a standard UNIX file system hierarchy. This workspace combines:

  • Selected versions of elements (actually stored in VOB storage pools)

  • Files that are being modified (checked-out file elements, stored in the view's private storage area)

  • Directories that are being modified (checked-out directory elements, maintained in the VOB database)

  • Derived objects built by users working in this view (stored in the view's private storage area); configuration records that correspond to these derived objects

  • Derived objects originally built in another view, but then winked-in to this view (stored in VOB storage pools)

  • View-private objects: miscellaneous files, directories, and links that appear only in this view (stored in the view's private storage area)

Each view is a UNIX directory tree, whose top-level directory is termed the view storage directory. The main components of this directory tree are:

  • View database—The db subdirectory contains the binary files managed by ClearCase's embedded DBMS. The database tracks the correspondence between VOB objects and view-private objects. For example, a checkout of a file element creates a “checked-out-version” object in the VOB database, and a corresponding data file in the view's data storage area. The view database records the relationship between these two objects.

  • Private storage area—The .s subdirectory is the top-level of a directory tree in which all view-private objects are stored: checked-out versions of file elements, unshared derived objects, text-editor backup files, and so on. Each view-private file, which appears to be located in some directory within some VOB, is actually stored in a data container in the view's private storage area.

  • Identity directory—The .identity subdirectory contains files that establish the view's owner, its principal group, and its group list.

The view manual page provides a detailed description of the contents of a view. (For more on the .identity directory, see “VOBs and Views: Owner and Groups”.)

View Database

Each view has its own database, implemented as a set of files in the db subdirectory of the view storage directory. On-disk overhead for the database is quite small, usually less than 1 MB.

A single ClearCase server program, the view_server, is invoked when the view is activated (for example, with a startview or setview command). This enables ClearCase client programs and standard UNIX programs to use the view—both to access VOB data and to access view-private data. The view_server process runs on the host where the view storage directory resides.


Caution: Never try to move the view database directory (db) to another host. But see “Moving a View (Same Architecture)”.


View's Private Storage Area

A view's private storage area is a subtree in the view storage directory. It provides disk storage for view-private files (including checked-out versions of file elements) and for derived objects actually built in that view. clearmake also caches configuration records of recently-built derived objects in this area, to speed configuration lookup (build avoidance).

Typically, unshared derived objects make the greatest storage demand on a view's private storage area. When a derived object is first created, both its data container file and its configuration record are stored in the view. The first time the derived object is winked-in to another view:

  • The configuration record is moved to the appropriate VOB database, or databases. If the build script creates derived objects in several VOBs, each VOB database gets a copy of the same configuration record.

  • The data container is copied (not moved) to a VOB storage pool. The original data container remains in view storage, to avoid “pulling the rug out from under” user processes that are currently accessing the data container. From time to time, you (or whichever user “owns” the view) may find it worthwhile to eliminate the redundant storage containers from views with the view_scrubber utility. (See Chapter 10.)
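The two steps above, and the redundancy they leave behind, can be summarized in a small Python model. It is a sketch under simplifying assumptions: views, VOB databases, and pools are plain dictionaries, and scrub_view() only illustrates the idea of reclaiming redundant copies. It is not a description of how view_scrubber works internally. All names are hypothetical.

```python
class View:
    """View-private storage, reduced to two dictionaries keyed by DO name."""

    def __init__(self):
        self.data_containers = {}  # DO name -> file data
        self.config_records = {}   # DO name -> configuration record


def first_wink_in(do_name, view, vob_databases, vob_do_pool):
    """Model the first wink-in of a derived object built in 'view'."""
    config_record = view.config_records.pop(do_name)      # moved out of the view
    for database in vob_databases:
        database[do_name] = config_record                  # each VOB database gets a copy
    vob_do_pool[do_name] = view.data_containers[do_name]   # copied; view copy remains


def redundant_containers(view, vob_do_pool):
    """Names of DOs whose data exists both in the view and in the VOB pool."""
    return sorted(name for name in view.data_containers if name in vob_do_pool)


def scrub_view(view, vob_do_pool):
    """Drop the view's redundant copies and return the names removed."""
    removed = redundant_containers(view, vob_do_pool)
    for name in removed:
        del view.data_containers[name]
    return removed
```

In this model the view holds two copies of nothing but can still hold one redundant container per shared DO until it is scrubbed, which is why unshared and recently shared derived objects dominate view storage.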

A view's private storage area can be located remotely from the view storage directory, and accessed through a standard UNIX-level symbolic link. This resembles the remote VOB storage pool facility, discussed in “Default, Local, and Remote Storage Pools”, but the facility for views is less elaborate:

  • A VOB can have any number of storage pools, any of which can be remote.

  • A view has a single private storage area: the directory tree with view-storage-dir/.s as its root. By default, the mkview command creates a view storage directory with .s as an actual subdirectory; the mkview -ln command creates a view with .s as a symbolic link to another location.

The same restriction as for remote VOB storage pools applies: a remote private storage area must be NFS-accessible at the same pathname from all ClearCase hosts (for example, /net/cccvr02/view_storage/drp).

If a view storage directory threatens to fill up its disk partition, you can move its .s directory to a larger partition. See Chapter 12, “Occasional View Maintenance” for details.