MDSes and their states

CephFS requires an additional component to coordinate client access and metadata: the Metadata Server, or MDS for short. Although the MDS serves metadata requests from clients, the actual data being read and written still goes directly to and from the OSDs. This approach minimizes the impact of the MDS on the filesystem's performance for bulk data transfers, although workloads made up of many small I/O operations can become limited by MDS performance. The MDS currently runs as a single-threaded process, so it is recommended that the MDS run on hardware with the highest-clocked CPU possible.

The MDS has a local cache that holds hot portions of the CephFS metadata, reducing the amount of I/O going to the metadata pool. This cache is kept in local memory for performance, and its size can be controlled by adjusting the mds_cache_memory_limit configuration option, which defaults to 1 GB.
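As a rough illustration of adjusting the cache limit, the following Python sketch builds the command line that would raise it. It assumes a Ceph release with the centralized configuration database (ceph config set) and that the option is named mds_cache_memory_limit and takes a value in bytes; the helper name cache_limit_command is hypothetical and exists only for this example. The sketch only constructs the command and does not run it.

```python
GIB = 1024 ** 3  # the limit is assumed to be expressed in bytes


def cache_limit_command(limit_gib):
    """Return the argv list for setting the MDS cache memory limit.

    Hypothetical helper: it only builds the command, it does not execute it.
    """
    if limit_gib <= 0:
        raise ValueError("cache limit must be positive")
    limit_bytes = int(limit_gib * GIB)
    return [
        "ceph", "config", "set", "mds",
        "mds_cache_memory_limit", str(limit_bytes),
    ]


# Example: raise the limit from the 1 GB default to 4 GB.
print(" ".join(cache_limit_command(4)))
```

The resulting list could be passed to subprocess.run on a node with admin credentials. Size the value against the memory actually available on the MDS host, since the cache lives entirely in RAM.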

CephFS utilizes a journal stored in RADOS, mainly for consistency reasons. The journal records the stream of metadata updates from clients, which are then flushed into the CephFS metadata store. If an MDS is terminated, the MDS that takes over the active role can replay the metadata events stored in the journal. Replaying the journal is an essential part of an MDS becoming active, so the MDS will block until the replay has completed. The process can be sped up by having a standby-replay MDS, which constantly replays the journal and is therefore ready to take over the active role in a much shorter amount of time. If you have multiple active MDSes, note that a pure standby MDS can act as a standby for any active MDS, whereas a standby-replay MDS has to be assigned to a specific MDS rank.
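The journal-and-replay idea can be sketched with a toy model in Python. This is not how Ceph implements its journal; the class ToyJournal, the replay function, and the dictionary standing in for the metadata store are all hypothetical names chosen for illustration. The point is only that events not yet flushed must be replayed on top of the store before a new active MDS has a consistent view.

```python
class ToyJournal:
    """In-memory stand-in for the RADOS-backed MDS journal (illustrative only)."""

    def __init__(self):
        self.events = []  # metadata updates that have not been flushed yet

    def append(self, key, value):
        """Record a metadata update from a client."""
        self.events.append((key, value))

    def flush(self, store):
        """Apply all pending events to the metadata store and clear them."""
        for key, value in self.events:
            store[key] = value
        self.events.clear()


def replay(journal, store):
    """Build the view a newly active MDS needs: the store plus unflushed events."""
    view = dict(store)
    for key, value in journal.events:
        view[key] = value
    return view


journal, store = ToyJournal(), {}
journal.append("/a", 1)
journal.flush(store)        # "/a" reaches the metadata store
journal.append("/a", 2)     # these two updates exist only in the journal
journal.append("/b", 3)
recovered = replay(journal, store)  # what a takeover MDS would reconstruct
```

In this model, a standby-replay MDS corresponds to calling replay continuously while the active MDS is still running, so that very little work remains at takeover time.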

As well as the active and replaying states, an MDS can be in several other states; the ones you are likely to see in the ceph status output are listed here for reference when operating a Ceph cluster with a CephFS filesystem. Each state is split into two parts: the part to the left of the colon shows whether the MDS is up or down, and the part to the right of the colon represents the current operational state:

Although there are other states an MDS can be in, they are unlikely to be seen during normal operations and so have not been included here. Please consult the official Ceph documentation for more details on all available states.
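The two-part state format described above, with up or down to the left of the colon and the operational state to the right, is straightforward to split in a monitoring script. The following Python sketch shows one way to do it; the function name parse_mds_state is hypothetical, and the sketch assumes the state has already been extracted as a single string such as up:active.

```python
def parse_mds_state(state):
    """Split an MDS state string into (availability, operational_state).

    Hypothetical helper: expects input of the form "up:active".
    """
    availability, separator, operational = state.partition(":")
    if not separator or availability not in ("up", "down") or not operational:
        raise ValueError(f"unexpected MDS state string: {state!r}")
    return availability, operational


availability, operational = parse_mds_state("up:standby-replay")
# availability is "up"; operational is "standby-replay"
```

Rejecting strings that do not match the expected shape, rather than guessing, keeps a monitoring check from silently misreporting an MDS whose state it does not understand.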