Appendix A. CXFS Software Architecture

This appendix discusses the following for administration nodes:

    Kernel Threads
    Communication Paths
    Flow of Metadata for Reads and Writes


Kernel Threads

Table A-1 discusses kernel threads. CXFS shares with XFS the Linux xfsbufd and xfsdatad kernel threads to push buffered writes to disk.


Note: The thread names begin with a * (such as [*mtcp_notify]).


Table A-1. Kernel Threads

Kernel Thread     Software Product       Description
-------------     ----------------       -----------
cmsd              sgi-cxfs-server-kmp    Manages CXFS kernel membership and CXFS kernel heartbeating.
Recovery          sgi-cxfs-server-kmp    Manages recovery protocol for node.
corpse leader     sgi-cxfs-server-kmp    Coordinates recovery between nodes.
dcshake           sgi-cxfs-server-kmp    Purges idle CXFS vnodes on the CXFS client.
cxfsd             sgi-cxfs-server-kmp    Manages sending extent and size updates from the client to the server. This daemon (which runs on the CXFS client) takes modified inodes on the client and ships back any size and unwritten extent changes to the server.
mtcp_recv         sgi-cxfs-server-kmp    Reads messages (one per open message channel).
mtcp_notify       sgi-cxfs-server-kmp    Accepts new connections.
mtcp_discovery    sgi-cxfs-server-kmp    Monitors and discovers other nodes.
mtcp_xmit         sgi-cxfs-server-kmp    Supplies CXFS kernel heartbeat.

The fs2d, clconfd, and crsd daemons run at real-time priority. However, the mount and umount commands and scripts executed by clconfd are run at normal, time-shared priority.

Communication Paths

The following figures show communication paths in CXFS.


Note: The following figures do not represent the cmond cluster manager daemon. The purpose of this daemon is to keep the other daemons running.


Figure A-1. Communication within One Server-Capable Administration Node


Figure A-2. Daemon Communication within One Server-Capable Administration Node


Figure A-3. Communication between Nodes in the Pool


Figure A-4. Communication for a Server-Capable Administration Node Not in a Cluster


One of the server-capable administration nodes running the fs2d daemon is chosen to periodically multicast its IP address and the generation number of the cluster database to each of the client-only nodes. Each time the database is changed, a new generation number is formed and multicast. The following figure describes the communication among nodes, using a Solaris client-only node as an example.

Figure A-5. Communication Among Nodes
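
The generation-number announcement described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the fs2d implementation: the class names (Announcement, ClusterDatabase, ClientOnlyNode) are hypothetical, the generation number is assumed to be an incrementing integer, and the way a client-only node reacts to an announcement is assumed for illustration. No real networking is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    """Hypothetical multicast payload: sender IP plus database generation."""
    server_ip: str
    generation: int


class ClusterDatabase:
    """Toy stand-in for the cluster database on the chosen administration node."""

    def __init__(self, server_ip):
        self.server_ip = server_ip
        self.generation = 0  # assumption: a simple incrementing counter
        self.entries = {}

    def announce(self):
        # Periodic announcement: IP address plus current generation number.
        return Announcement(self.server_ip, self.generation)

    def change(self, key, value):
        # Each database change forms a new generation number,
        # which is then announced.
        self.entries[key] = value
        self.generation += 1
        return self.announce()


class ClientOnlyNode:
    """Toy client-only node that tracks the last generation it has seen."""

    def __init__(self):
        self.server_ip = None
        self.known_generation = None

    def receive(self, ann):
        # Return True when the announcement carries a generation
        # this node has not seen before (assumed client behavior).
        is_new = ann.generation != self.known_generation
        self.server_ip = ann.server_ip
        self.known_generation = ann.generation
        return is_new
```

In this model, a repeated announcement with an unchanged generation number is recognized as nothing new, while any database change produces an announcement the client treats as new.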


Flow of Metadata for Reads and Writes

The following figures show examples of metadata flow.


Note: A token protects a file. There can be multiple read tokens for a file at any given time, but only one write token.
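
The token rule in the note above can be sketched as a small table of token holders for one file. This is a minimal illustration, not CXFS code: the FileTokens class and its method names are hypothetical, and the sketch enforces only the rule stated in the note (any number of read tokens, at most one write token). How read and write tokens interact, such as revocation, is not modeled.

```python
class FileTokens:
    """Toy token table for one file (hypothetical, for illustration only)."""

    def __init__(self):
        self.read_holders = set()  # any number of clients may hold a read token
        self.write_holder = None   # at most one client may hold the write token

    def grant_read(self, client):
        # Multiple read tokens may exist at any given time.
        self.read_holders.add(client)
        return True

    def grant_write(self, client):
        # Grant the write token only if it is free or already held by this client.
        if self.write_holder is None or self.write_holder == client:
            self.write_holder = client
            return True
        return False

    def release_write(self, client):
        # Only the current holder can give the write token back.
        if self.write_holder == client:
            self.write_holder = None
```

For example, clients A and B can both obtain read tokens, but once A holds the write token, a write request from B is refused until A releases it.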


Figure A-6. Metadata Flow on a Write


Figure A-7. Metadata Flow on a Read on Client B Following a Write on Client A


Figure A-8. Metadata Flow on a Read on Client B Following a Read on Client A
