If you plan to use a dual-server configuration so that a secondary server can take over if the primary server fails, you can use IRIS FailSafe version 1.2. Alternatively, you can set up one or more servers for clip mirroring, with clip caches that mirror the clip cache on a designated primary server. This chapter describes the following:
IRIS FailSafe 1.2 is an SGI software product that allows a pair of servers to be used in a redundant configuration so that access to a storage device, or set of storage devices, can be transferred from the primary server to the backup server should the primary server fail.
Storage devices, such as RAID arrays that can store clips and index filesystems, are physically attached to the two servers in the system, but are owned and accessed by one server at a time. The two servers and the storage form an IRIS FailSafe 1.2 cluster.
Note: Setting up servers and shared storage for IRIS FailSafe 1.2 is performed by qualified SGI System Support Engineers (SSEs) only.
The servers also share a public IP alias or a name users can use to connect to the servers. If a server fails, the other server takes over the shared storage as well as this IP alias, which users still use to connect to the servers. To the user, a failover looks the same as a server that was unavailable for a number of seconds but returned to service thereafter.
The servers communicate via a private serial connection. When the standby server detects a problem with the active server, IRIS FailSafe 1.2 unmounts the clip filesystem from the active server and mounts it on the standby server. The standby VST server detects this action, adds the clips to its internal tables, and becomes the active server, ready to play or record.
Note: VST does not automatically begin playing or recording the material that was being played or recorded on the active server when it failed. The application(s) or automation system must restart the operations.
This section breaks the instructions for installing and configuring IRIS FailSafe 1.2 for use with VST into the following subsections:
Follow these steps:
Have ready the following:
Install the IRIS FailSafe 1.2 software from the distribution CD:
# inst -f /CDROM/dist
Inst> keep *
Inst> install ha ha_www ha_fsconf
Inst> go
Inst> quit
Install the IRIS FailSafe 1.2 subsystem vst_eoe.sw.failsafe from the main VST images on the VST CD, if this subsystem was not selected when VST was originally installed.
# inst -f /CDROM/dist
Inst> install vst_eoe.sw.failsafe
Inst> go
Inst> quit
Install the latest IRIS FailSafe 1.2 patch from the patches CD:
# inst -f /CDROM/patches/patch3582/dist
Inst> install *
Inst> go
Inst> quit
Set the system variables on the primary server:
# nvram AutoLoad Yes
# nvram scsihostid 0
# chkconfig failsafe off
Set the system variables on the secondary server; the recommended value for the SCSI host ID is 8.
# nvram AutoLoad Yes
# nvram scsihostid 8
# chkconfig failsafe off
Follow these steps:
From your network administrator, obtain for each server a private hostname, a public hostname, and an IP address. The two private IP addresses and the public IP address should be on the same subnet.
Configure the network interfaces on the primary server by editing the /etc/hosts file to include the following:
a private network between the servers (not broadcast)
hostname and address for each server
a single hostname and address for the entire IRIS FailSafe 1.2 configuration (also called the public IP address)
Example 5-1 shows example entries in the /etc/hosts file.
# This entry must be present or the system will not work.
127.0.0.1 localhost
# Primary FailSafe Host
192.70.0.100 fsprimary.engr.sgi.com fsprimary
192.0.3.1 priv-fsprimary.engr.sgi.com priv-fsprimary
# Secondary FailSafe Host
192.70.0.101 fssecondary.engr.sgi.com fssecondary
192.0.3.2 priv-fssecondary.engr.sgi.com priv-fssecondary
# FailSafe Alias for single system image
192.70.0.109 fssystem.engr.sgi.com fssystem
Place the same network information in the /etc/hosts file on the secondary server.
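Because the same entries must be present on both servers, a small check can help catch a name that was missed. The following is a minimal sketch, not part of IRIS FailSafe: the function names hosts_has_name and missing_hosts are hypothetical, and the hostnames you pass in are whatever names your site uses (the names in Example 5-1 are only examples).

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of IRIS FailSafe or VST.

# hosts_has_name FILE NAME
# Succeed if NAME appears as a hostname or alias (any field after the
# address) on a non-comment line of a hosts-format FILE.
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# missing_hosts FILE NAME...
# Print each NAME that FILE does not define, one per line.
missing_hosts() {
    file=$1
    shift
    for name in "$@"; do
        hosts_has_name "$file" "$name" || echo "$name"
    done
}
```

For the example configuration, running missing_hosts /etc/hosts fsprimary priv-fsprimary fssecondary priv-fssecondary fssystem on each server should print nothing if every entry is in place.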
Add the following entries to the primary server's /etc/config/netif.options file:
if1name=ef0
if1addr=$HOSTNAME
if2name=ef1
if2addr=priv-$HOSTNAME
Place the same information in the /etc/config/netif.options files on the secondary server.
Turn routing off on the primary server by modifying the /etc/config/routed.options file to contain the following:
-h -q
Place the same entry in the /etc/config/routed.options file on the secondary server.
Reboot each server to enable the network changes.
Use ping from the secondary server to check the connection. For example:
# ping fsprimary.engr.sgi.com
Use ping from the primary server to check the connection. For example:
# ping fssecondary.engr.sgi.com
Open /etc/nsswitch.conf. If necessary, change the resolution order for hosts to use files before either nis or dns. For example:
hosts: files nis
Set resolution order the same way on the secondary server.
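The resolution order can also be checked with a short script on each server. This is a hedged sketch: the function name hosts_files_first is hypothetical, and it assumes the hosts entry is a single line of the form shown in the example above.

```shell
#!/bin/sh
# Sketch only: hosts_files_first is an illustrative name.

# hosts_files_first FILE
# Succeed if the "hosts:" line of an nsswitch.conf-style FILE names
# "files" before any source that begins with "nis" or "dns".
hosts_files_first() {
    awk '
        $1 == "hosts:" {
            for (i = 2; i <= NF; i++) {
                if ($i == "files") { ok = 1; break }
                if ($i ~ /^(nis|dns)/) break
            }
        }
        END { exit(ok ? 0 : 1) }
    ' "$1"
}
```

For example, hosts_files_first /etc/nsswitch.conf || echo "files is not first for hosts" reports a server whose order still needs to be changed.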
Place the following alias in the /etc/aliases file to handle administrative messaging from IRIS FailSafe 1.2 on the primary server. Use the email address of the person who administers the system. For example:
fsafe_admin:postmaster
postmaster:[email protected]_system
Use the newaliases command to enable the changes:
# newaliases
Place the same alias in the /etc/aliases file on the secondary server and use newaliases to enable the changes.
Follow these steps:
Edit the /etc/inittab file on the primary server to reserve the private IRIS FailSafe 1.2 serial connection between the servers. Make certain that the t2 entry is as follows:
t2:23:off:/sbin/getty -N ttyd2 co_9600 # port 2
Use the init q command to enable the serial port changes:
# init q
Change the /etc/inittab file t2 entry on the secondary server to match that of the primary server and use init q to enable the changes.
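To confirm that the t2 entry is disabled on both servers, the action field of that entry can be read directly. A minimal sketch follows; the function name t2_state is hypothetical.

```shell
#!/bin/sh
# Sketch only: t2_state is an illustrative name.

# t2_state FILE
# Print the action field (third colon-separated field) of the first t2
# entry in an inittab-format FILE. Commented-out entries are not matched
# because their first field is not exactly "t2".
t2_state() {
    awk -F: '$1 == "t2" { print $3; exit }' "$1"
}
```

On a correctly configured server, t2_state /etc/inittab prints off.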
Follow these steps:
Create the XLV logical volumes and XFS filesystems to be shared by following the instructions in IRIX Admin: Disks and Filesystems .
Remember that each XLV logical volume must be owned by the node that is the same as the primary node for running VST. To simplify the management of the node names (or owners) of volumes on shared disks, follow these recommendations:
Work with the volumes on the shared storage from the primary node.
If you did not use the primary to create the volumes, you can change the node name to the primary node using xlv_mgr.
Copy /usr/vtr/failsafe/ha.conf to /var/ha/ha.conf on both the primary and secondary servers. This file is included in the vst_eoe.sw.failsafe subsystem.
Copy /usr/vtr/failsafe/chkvtr to /var/ha/actions/chkvtr on both the primary and secondary servers. This file is also included in the vst_eoe.sw.failsafe subsystem.
If necessary, change the default mount point for the shared filesystems in /var/ha/ha.conf on the primary and secondary servers.
By default, the shared filesystem is mounted at /usr/vtr/clips. If you want to install a separate shared filesystem that stores the indices for the clips, do the following:
Add a new filesystem block like the one in the sample file for filesystem fcraid1 in /var/ha/ha.conf.
Add a new volume block, as shown for volume fcraid1.
See the IRIS FailSafe Administrator's Guide for more details.
Follow these steps on the primary and secondary servers:
Change the fsprimary.engr.sgi.com entry (default hostname) in /var/ha/ha.conf to the name of your primary server.
Change the fssecondary.engr.sgi.com entry in /var/ha/ha.conf to the name of your secondary server.
Change the default IP address, 198.29.66.18, in /var/ha/ha.conf to point to the IP address of your primary server.
Change the default IP address 198.29.66.90 in /var/ha/ha.conf to the IP address of your secondary server.
Change the default netmask values and the broadcast-addr variable to the correct values for your network on the primary and secondary servers.
The broadcast-addr variable should point to the broadcast address of the IP subnet on which the servers are present. Both servers must be on the same IP subnet.
Follow these steps:
Perform the following steps as a superuser on the primary and secondary servers:
# chown root.sys /var/ha/ha.conf
# chmod 500 /var/ha/ha.conf
# /usr/etc/ha_cfgchksum
Make sure the output of ha_cfgchksum is the same on both servers.
Make sure no errors are reported when you enter the following on the primary and secondary servers:
# /usr/etc/ha_cfgverify
Make sure the shared filesystems are not shown on the primary and secondary servers in the file /etc/fstab, or when you enter df.
Make sure the mount points are the same on both servers for the shared filesystems.
Execute the following command first on the primary server and then on the secondary server:
# chkconfig failsafe on
# /etc/init.d/failsafe start
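Before FailSafe is enabled, the /etc/fstab check described in the steps above can be scripted. The sketch below is illustrative only: the function name fstab_lists_mount is hypothetical, and the mount point you test is whatever you configured in /var/ha/ha.conf (/usr/vtr/clips by default).

```shell
#!/bin/sh
# Sketch only: fstab_lists_mount is an illustrative name.

# fstab_lists_mount FILE MOUNTPOINT
# Succeed if a non-comment line of an fstab-format FILE has MOUNTPOINT
# as its second field (the mount directory).
fstab_lists_mount() {
    awk -v mp="$2" '
        /^[ \t]*#/ { next }
        $2 == mp { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, fstab_lists_mount /etc/fstab /usr/vtr/clips && echo "remove the shared filesystem from /etc/fstab" flags a server on which the shared filesystem is still listed.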
You can use the VST clipmirror subsystem to implement a fully redundant cluster of two or more VST servers, each with its own private clip storage.
One server is designated the primary server and handles normal clip playback and recording, as well as file transfers to and from other non-VST servers or workstations. The VST clipmirror subsystem on each redundant (clip-mirroring) server synchronizes the contents of the redundant server's clip cache with the contents of the primary server's clip cache. That is, the clip caches on the redundant servers are maintained as mirrors of the primary server's clip cache.
This section consists of the following subsections:
You set the name of the primary server that a redundant server is configured to mirror in the file /usr/vtr/config/system-defaults/clipmirror by specifying the value of the control vtr.clipmirror.primary_server.hostname. For example:
vtr.clipmirror.primary_server.hostname newprimary
The application can dynamically control the clip-mirroring feature by setting this control with the MVCP SSET command. For example:
SSET clipmirror vtr.clipmirror.primary_server.hostname newprimary
During the initial synchronization phase that occurs when VST starts on a redundant server, the clipmirror subsystem on the redundant server opens an MVCP connection to the primary server. Using this connection, the redundant server retrieves the list of clips (including modification times) from the primary server. The redundant server compares the primary server's clips with the clips in the local clip cache, checking for clips that are missing, out of date, or extraneous.
If the file on a redundant server is newer than the file on the primary server, the clip is assumed to be up to date.
If the clip file on a redundant server is of the same size as the corresponding file on the primary server and also has a later modification time, then the clip is assumed to be up to date.
If a clip exists on the primary server only, the clip is copied via FTP to the redundant server.
If a file exists on the redundant server only, a message is added to the VST log file (default /var/adm/vtr/logs/vtrlog), but no other action is taken. The clip can be removed from the redundant server manually or by an application.
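The comparison rules above can be summarized as a small decision function. This is a sketch of the rules as stated here, not the clipmirror implementation: the function name clip_action and its output words are hypothetical, and treating every remaining case (a clip present on both servers whose redundant copy is not newer) as a copy is an assumption based on the statement that the redundant server checks for out-of-date clips.

```shell
#!/bin/sh
# Sketch only: clip_action and its result words are illustrative.

# clip_action P_SIZE P_MTIME R_SIZE R_MTIME
# Sizes are in bytes and modification times in seconds since the epoch.
# Pass "-" for both values on a server where the clip does not exist.
clip_action() {
    p_size=$1
    p_mtime=$2
    r_size=$3
    r_mtime=$4
    if [ "$p_size" = "-" ] && [ "$r_size" = "-" ]; then
        echo "none"
    elif [ "$r_size" = "-" ]; then
        echo "copy"          # clip exists on the primary server only
    elif [ "$p_size" = "-" ]; then
        echo "log"           # clip exists on the redundant server only
    elif [ "$r_mtime" -gt "$p_mtime" ]; then
        echo "up-to-date"    # redundant copy is newer (covers the same-size case)
    else
        echo "copy"          # assumption: anything else is out of date
    fi
}
```

For example, clip_action 1000 500 - - prints copy, because the clip exists only on the primary server.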
When a clip is recorded on or transferred via a network to the primary server, it is automatically copied to the clip-mirroring server. The speed of the copy operation depends on the bit rate of the clip, the network media connecting the servers, and whether other clips are being copied or are waiting to be copied to the redundant server.
The maximum number of FTP transfers spawned is set by the control vtr.clipmirror.max_threads. The value of this control should be based on the bandwidth of the network connection; the default is 20. The use of this control and other clip mirror controls is explained in “Setting Up Primary and Redundant Servers”; clip mirror controls are summarized in “System Controls for Clip Mirroring” in Appendix A.
Once the clip caches on the primary and redundant servers are synchronized, the redundant server monitors the primary server for changes to the contents of its clip cache.
Clips added to the primary server's clip cache are copied via FTP to the redundant server. A clip is copied to the redundant server again if the clip media data is modified on the primary server.
If a clip is removed, renamed, or has its protection changed, the redundant server performs the same operation on its copy of the clip.
If the primary server becomes unavailable, each redundant server tries to reconnect to the primary server at an interval determined by the vtr.clipmirror.reconnect_interval control.
In a dual-server environment, if the primary server becomes unavailable for an extended period of time, you can reconfigure the redundant server to be the primary server. When the original primary server is available again, you can reconfigure it as the new redundant server. All new clips on the new primary server are then automatically copied to the new redundant server.
Note: The VST clip-mirroring feature does not work correctly with clip formats that require index files (such as vframe and stream).
To set up primary and redundant servers, follow these steps:
To make sure that the filesystem holding the clips has the same real-time extent size on the primary server and on each redundant server, enter the following as root on each server:
# xfs_growfs -n /usr/vtr/clips
Output should resemble the following, which has been slightly reformatted:
meta-data=/usr/vtr/clips/ isize=256    agcount=8, agsize=8192 blks
data     =                bsize=16384  blocks=65536, imaxpct=25
         =                sunit=0      swidth=0 blks, unwritten=1
naming   =version 1       bsize=16384
log      =internal        bsize=16384  blocks=1000
realtime =external        extsz=2097152 blocks=4445586, rtextents=34731
In the last line, extsz is the extent size of the filesystem.
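To compare the extent size across servers without reading the full output, the extsz value can be extracted from the realtime line. A minimal sketch follows; the function name rt_extsz is hypothetical, and it assumes the output has the layout shown above.

```shell
#!/bin/sh
# Sketch only: rt_extsz is an illustrative name.

# rt_extsz
# Read xfs_growfs -n output on standard input and print the extsz value
# found on the line that begins with "realtime".
rt_extsz() {
    sed -n '/^realtime/s/.*extsz=\([0-9][0-9]*\).*/\1/p'
}
```

Running xfs_growfs -n /usr/vtr/clips | rt_extsz as root on each server prints one number per server; the numbers should match.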
Using the versions command, make sure that the following subsystems are installed on the redundant servers:
vst_eoe.sw.clipmirror
vst_eoe.sw.tools
vst_eoe.sw.fsmon
vst_eoe.sw.ftpd
On the primary server, make sure the subsystems vst_eoe.sw.fsmon and vst_eoe.sw.ftpd are installed.
On each redundant server (not the primary server), open the configuration file /usr/vtr/config/system-defaults/clipmirror. This file contains clip-mirroring controls on separate lines, commented out with pound signs (for example, #vtr.clipmirror.primary_server.hostname ""). Delete the pound sign and change the default primary server hostname (the null string) to the hostname of the primary server. For example, for a primary server named vst-1:
vtr.clipmirror.primary_server.hostname vst-1
If you do not want to set the name of the primary server as the default, the application can set the value of this and all other clip mirror controls with the MVCP SSET command; for example:
SSET clipmirror vtr.clipmirror.primary_server.hostname vst-2
If desired, include a line in the configuration file /usr/vtr/config/system-defaults/clipmirror on each redundant server to determine how often (in seconds) the redundant server tries to reconnect to a failed primary server. The minimum interval is 1 second; the default is 30. This example sets the interval to 60 seconds:
vtr.clipmirror.reconnect_interval 60
If desired, set the value for the control vtr.clipmirror.max_threads on the redundant server(s); this control determines the maximum number of concurrent clip transfers from the primary server to the redundant server. The default limit is 20; the range is 1 to 100. For example:
vtr.clipmirror.max_threads 2
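The limits stated above for the reconnect interval (at least 1 second) and the thread count (1 to 100) can be checked before VST is restarted. The following is a hedged sketch: the function name check_clipmirror is hypothetical, it assumes each control is written as a name followed by a value on one line, and it does not reflect how VST itself validates the file.

```shell
#!/bin/sh
# Sketch only: check_clipmirror is an illustrative name.

# check_clipmirror FILE
# Print a message for each uncommented control in FILE whose value is
# outside the documented range; fail if any message was printed.
check_clipmirror() {
    awk '
        /^[ \t]*#/ { next }    # skip commented-out controls
        $1 == "vtr.clipmirror.reconnect_interval" && $2 + 0 < 1 {
            print "reconnect_interval below 1 second: " $2
            bad = 1
        }
        $1 == "vtr.clipmirror.max_threads" && ($2 + 0 < 1 || $2 + 0 > 100) {
            print "max_threads outside 1 to 100: " $2
            bad = 1
        }
        END { exit(bad ? 1 : 0) }
    ' "$1"
}
```

For example, check_clipmirror /usr/vtr/config/system-defaults/clipmirror prints nothing and succeeds when both controls are in range or left commented out.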
On the primary and redundant servers, create a new user account named vtrsync:
/usr/sysadm/privbin/addUserAccount -l vtrsync -u idnumber -g 0 -H /usr/vtr/clips -S /bin/csh -P
The user ID (the variable idnumber in the example above) must be unique in the system; it can differ for different servers. To check user IDs already in the system, open /etc/passwd and view the third field. In the following example, the user ID is 994.
tutor::994:997:Tutorial User:/usr/tutor:/bin/csh
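Rather than scanning /etc/passwd by eye, you can test whether a candidate user ID is already taken. A minimal sketch, with the hypothetical function name uid_in_use:

```shell
#!/bin/sh
# Sketch only: uid_in_use is an illustrative name.

# uid_in_use FILE ID
# Succeed if ID appears in the third colon-separated field of a
# passwd-format FILE.
uid_in_use() {
    awk -F: -v uid="$2" '
        $3 == uid { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, uid_in_use /etc/passwd 994 succeeds on a system that contains the tutor entry shown above, so 994 could not be used as idnumber there.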
For the new user account vtrsync (on each server), set the password to vtrsync:
# passwd vtrsync
New Password:
Re-enter new password:
At each prompt shown above, enter vtrsync.
To stop the automatic backup, set the name of the primary server to an empty string via an MVCP connection to the redundant server(s):
SSET clipmirror vtr.clipmirror.primary_server.hostname ""
Note: Clips are always transferred in full between the primary and the redundant servers, regardless of any in and out points that have been set.
If you are redesignating a redundant server as the primary server, follow these steps:
Turn off clip mirroring in the redundant server by unsetting the name of the primary server with the following command in an MVCP connection:
SSET clipmirror vtr.clipmirror.primary_server.hostname ""
On the server that is to be the new primary server, open the configuration file /usr/vtr/config/system-defaults/clipmirror and make sure that the name of the primary is unset:
vtr.clipmirror.primary_server.hostname ""
Unset the primary if necessary.
On each redundant server, set the value of vtr.clipmirror.primary_server.hostname to the name of the new primary server.
An application can use the MVCP SSET command to change the values of the clip-mirroring controls while VST is running. For example, if vst-1 is the current primary server and vst-2 is the current redundant server, the following MVCP commands swap their roles:
On vst-2, use
SSET clipmirror vtr.clipmirror.primary_server.hostname ""
On vst-1, use
SSET clipmirror vtr.clipmirror.primary_server.hostname vst-2
The new redundant server, vst-1, starts the initial synchronization phase as described in “Using Clip-Mirroring Servers”.
By default, the spawned FTP transfers move clips between the primary network interfaces of the primary and redundant servers. To route these transfers over a different interface, for example one with higher bandwidth, use the controls vtr.clipmirror.local_server.hostname and vtr.clipmirror.primary_server.hostname together. FTP uses the server names specified in these controls.
For example, if the redundant server has a 100-Base-T Ethernet interface, vst1-enet, and a fibre channel connection, vst1-fc, you might wish to transfer clips on the higher-bandwidth fibre channel connection. To do so, set vtr.clipmirror.local_server.hostname to vst1-fc. Similarly, if you want to use an interface on the primary server called p-vst-fc, set the control vtr.clipmirror.primary_server.hostname to that name.