Appendix A. CXFS Software Architecture

This appendix discusses the following for administration nodes:

  - Kernel Threads
  - Communication Paths
  - Flow of Metadata for Reads and Writes

Kernel Threads

Table A-1 describes the CXFS kernel threads. CXFS shares the Linux xfsbufd and xfsdatad kernel threads with XFS to push buffered writes to disk.


Note: In the ps command output, the thread names begin with a * character (for example, [*mtcp_notify]).
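
As an illustration of the naming convention in the note above, the following minimal Python sketch picks out thread names that appear in brackets with a leading * character. The function name starred_threads and the sample ps listing (PIDs, columns, and the kworker entry) are hypothetical and only serve the example; real ps output will differ by platform and options.

```python
import re

# Matches a bracketed kernel-thread entry whose name starts with '*',
# for example "[*mtcp_notify]". The capture group drops the '*'.
STAR_THREAD = re.compile(r"\[\*([^\]]+)\]")


def starred_threads(ps_output: str) -> list:
    """Return thread names (without the leading '*') found in ps output."""
    return [m.group(1) for m in STAR_THREAD.finditer(ps_output)]


# Hypothetical sample; real ps columns, PIDs, and threads will differ.
sample = """\
  PID TTY          TIME CMD
 1201 ?        00:00:00 [*mtcp_notify]
 1202 ?        00:00:01 [*mtcp_discovery]
 1300 ?        00:00:00 [kworker/0:1]
"""

print(starred_threads(sample))  # ['mtcp_notify', 'mtcp_discovery']
```

Entries without the * prefix, such as the kworker line in the sample, are ignored.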


Table A-1. Kernel Threads

Kernel Thread     Description
--------------    ------------------------------------------------------------
cmsd              Manages CXFS kernel membership and CXFS kernel heartbeating.
Recovery          Manages recovery protocol for node.
corpse leader     Coordinates recovery between nodes.
dcshake           Purges idle CXFS vnodes on the CXFS client.
cxfsd             Manages sending extent and size updates from the client to
                  the server. This daemon (which runs on the CXFS client)
                  takes modified inodes on the client and ships back any size
                  and unwritten extent changes to the server.
mtcp_recv         Reads messages (one per open message channel).
mtcp_notify       Accepts new connections.
mtcp_discovery    Monitors and discovers other nodes.
mtcp_xmit         Supplies CXFS kernel heartbeat.

The fs2d, clconfd, and crsd daemons run at real-time priority. However, the mount and umount commands and scripts executed by clconfd are run at normal, time-shared priority.

Communication Paths

The following figures show communication paths in CXFS.


Note: The following figures do not represent the cmond cluster manager daemon. The purpose of this daemon is to keep the other daemons running.


Figure A-1. Communication Within One Server-Capable Administration Node


Figure A-2. Daemon Communication Within One Server-Capable Administration Node


Figure A-3. Communication Among Nodes in the Pool


Figure A-4. Communication for a Server-Capable Administration Node Not in a Cluster


One of the server-capable administration nodes running the fs2d daemon is chosen to periodically multicast its IP address and the generation number of the cluster database to each of the client-only nodes. Each time the database is changed, a new generation number is formed and multicast. Figure A-5 describes the communication among nodes, showing just a single client-only node as an example.
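
The generation-number announcement described above can be modeled with a small Python sketch. It is a toy model, not the fs2d protocol: the class names (Announcement, DatabaseServer, ClientOnlyNode), the IP address 10.0.0.1, and the choice to increment the generation by one on each change are all assumptions made for illustration. The text only states that a new generation number is formed when the database changes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Announcement:
    """One multicast message: sender IP and cluster database generation."""
    server_ip: str
    generation: int


class DatabaseServer:
    """Stands in for the chosen server-capable administration node."""

    def __init__(self, ip: str) -> None:
        self.ip = ip
        self.generation = 0

    def change_database(self) -> None:
        # Every database change forms a new generation number.
        # Incrementing by one is an assumption of this sketch.
        self.generation += 1

    def announce(self) -> Announcement:
        return Announcement(self.ip, self.generation)


class ClientOnlyNode:
    """Remembers the most recent announcement it has seen."""

    def __init__(self) -> None:
        self.last: Optional[Announcement] = None

    def receive(self, msg: Announcement) -> bool:
        """Record msg; return True if its generation differs from the last one seen."""
        changed = self.last is None or msg.generation != self.last.generation
        self.last = msg
        return changed


server = DatabaseServer("10.0.0.1")  # hypothetical address
client = ClientOnlyNode()
print(client.receive(server.announce()))  # True: first announcement
print(client.receive(server.announce()))  # False: periodic repeat, same generation
server.change_database()
print(client.receive(server.announce()))  # True: database changed
```

What a client-only node does after noticing a new generation is not covered here, so the sketch stops at detecting the change.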

Figure A-5. Communication Among Nodes


Flow of Metadata for Reads and Writes

The following figures show examples of metadata flow.


Note: A token protects a file. There can be multiple read tokens for a file at any given time, but only one write token.
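
The rule in the note above (many read tokens, at most one write token per file) can be expressed as a minimal Python sketch. The class FileTokens and the client names are hypothetical. Two simplifications are assumptions of the sketch rather than CXFS behavior: a conflicting write request is simply refused, and the sketch does not model how read and write tokens interact with each other.

```python
class FileTokens:
    """Tracks token holders for one file (illustrative model only)."""

    def __init__(self) -> None:
        self.readers = set()  # any number of read-token holders
        self.writer = None    # at most one write-token holder

    def grant_read(self, client: str) -> None:
        # Multiple clients may hold a read token at the same time.
        self.readers.add(client)

    def grant_write(self, client: str) -> bool:
        """Grant the write token unless another client already holds it."""
        if self.writer is not None and self.writer != client:
            return False  # assumption: refuse instead of recalling the token
        self.writer = client
        return True

    def release_write(self, client: str) -> None:
        # Only the current holder can give the write token back.
        if self.writer == client:
            self.writer = None


tokens = FileTokens()
tokens.grant_read("client_a")
tokens.grant_read("client_b")
print(sorted(tokens.readers))          # ['client_a', 'client_b']
print(tokens.grant_write("client_a"))  # True
print(tokens.grant_write("client_b"))  # False: write token already held
tokens.release_write("client_a")
print(tokens.grant_write("client_b"))  # True
```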


Figure A-6. Metadata Flow on a Write


Figure A-7. Metadata Flow on a Read on Client B Following a Write on Client A


Figure A-8. Metadata Flow on a Read on Client B Following a Read on Client A
