Components of GPFS
The following components comprise GPFS:
- A kernel extension
- The GPFS daemon
- The RSCT daemons
- A portability module
The GPFS kernel extension (mmfs)
The kernel extension provides the interfaces to the operating system's virtual file system (VFS) layer for file system access.
Applications make file system calls to the operating system, which routes them to the GPFS kernel extension; in this way GPFS appears to applications as just another file system. The kernel extension either satisfies these requests using resources already available in the system or sends a message to the GPFS daemon to complete the request.
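Because GPFS plugs into the VFS layer, an application needs no GPFS-specific code; ordinary POSIX I/O works unchanged. A minimal sketch (the temporary-file path here is a stand-in for a path on a GPFS mount, such as /gpfs/fs1):

```python
import os
import tempfile

# GPFS registers with the kernel's VFS layer, so ordinary POSIX calls work
# unchanged on a GPFS mount. The temp path below is purely illustrative;
# on a cluster it could be a path such as /gpfs/fs1/example.txt.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)  # open(2) -> VFS -> file system
os.write(fd, b"hello from a POSIX application\n")    # write(2)
os.close(fd)

with open(path, "rb") as f:                          # read back through the same VFS path
    data = f.read()
print(data.decode().strip())
```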
The GPFS daemon (mmfsd)
The GPFS daemon performs all GPFS I/O and buffer management, including read-ahead for sequential reads and write-behind for all writes that are not specified as synchronous. All I/O is protected by token management, which ensures data consistency across the cluster.
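The write-behind versus synchronous distinction is visible through the standard POSIX interface: opening a file with O_SYNC asks the file system to commit each write to stable storage before write() returns, while fsync() flushes already-buffered data on demand. A sketch (a local temp file stands in for a file on a GPFS mount):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sync.dat")

# By default a write is "write-behind": write(2) returns once the data is
# buffered. Opening with O_SYNC forces each write to be committed to stable
# storage before write(2) returns.
fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_SYNC, 0o644)
os.write(fd, b"committed synchronously")
os.close(fd)

# Buffered (write-behind) data can also be flushed explicitly with fsync(2).
fd = os.open(path, os.O_WRONLY | os.O_APPEND)
os.write(fd, b" ... and appended")
os.fsync(fd)  # force buffered data to disk now
os.close(fd)

print(os.path.getsize(path))
```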
The GPFS daemon is a multi-threaded process, with some threads dedicated to specific tasks. This ensures that services that require immediate attention are not hampered because other threads are preoccupied with routine tasks.
The daemon also communicates with the daemon instances on other nodes to coordinate configuration changes, recovery, and parallel updates of the same data structures.
The daemon performs the following functions:
- Allocating disk space to new and newly extended files.
- Managing directories, including creating new directories, inserting and removing entries from existing directories, and performing directory searches that require I/O.
- Assigning locks to protect the integrity of data and metadata. Locks on data that can be accessed from multiple nodes require interaction with the token management function.
- Initiating disk I/O on daemon threads.
- Managing security and quotas in collaboration with the file system manager.
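The locking the daemon coordinates is surfaced to applications through standard POSIX byte-range locks; on a cluster, GPFS token management makes these effective across nodes. A sketch using the ordinary fcntl interface, run here against a local temp file:

```python
import fcntl
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "locked.dat")
with open(path, "wb") as f:
    f.write(b"0" * 100)

# GPFS implements POSIX byte-range locks; on a cluster the daemon's token
# management makes them effective across nodes. The API is the standard one.
f = open(path, "r+b")
fcntl.lockf(f, fcntl.LOCK_EX, 50, 0)  # exclusive lock on bytes 0-49
f.seek(0)
f.write(b"X" * 50)                    # update the protected range
f.flush()
fcntl.lockf(f, fcntl.LOCK_UN, 50, 0)  # release the lock
f.close()

with open(path, "rb") as f:
    data = f.read()
print(data[:5], data[50:55])
```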
Daemons for RSCT
GPFS uses two RSCT daemons, hagsd and hatsd, to provide group and topology services.
The hagsd daemon belongs to the Group Services subsystem, which provides distributed coordination, messaging, and synchronization to other subsystems.
The hatsd daemon belongs to the Topology Services subsystem, which provides network adapter status, node connectivity information, and reliable messaging services to other subsystems.
Both daemons are installed as part of the rsct.basic package.
Overview of GPFS architecture
Nodes are classified into three types: file system nodes, storage nodes, and manager nodes. A single node can perform more than one of these functions.
File system nodes handle administrative tasks.
Each file system has one manager node, which hosts management functions such as the global lock manager, the local lock manager, and the allocation manager.
Storage nodes implement shared file access, cooperate with the manager node during recovery, and allow file data and metadata to be striped across multiple storage nodes.
Other auxiliary nodes are as follows:
Metanode: For centralized file metadata management, a node is dynamically selected as a metanode. The token server facilitates metanode selection.
Token server: A token server keeps track of all tokens distributed to the cluster nodes. It uses a token-granting algorithm that minimizes the cost of token management.
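As an illustration of the idea, not of GPFS's actual algorithm, a token table can be sketched as follows: a node must hold a token before caching or modifying an object, and conflicting holders must be revoked before an exclusive token is granted. In GPFS the requester contacts the conflicting holders directly, which spreads the revocation cost away from the token server.

```python
# A toy token table illustrating the idea of a token server. This sketch is
# illustrative only; it is not GPFS's actual token-granting algorithm.

class TokenServer:
    def __init__(self):
        self.tokens = {}  # object id -> (mode, set of holder nodes)

    def acquire(self, node, obj, mode):
        """Grant a 'shared' or 'exclusive' token; return nodes to revoke first."""
        mode_held, holders = self.tokens.get(obj, ("none", set()))
        if mode == "shared" and mode_held in ("none", "shared"):
            holders.add(node)
            self.tokens[obj] = ("shared", holders)
            return []                 # granted, nothing to revoke
        if holders <= {node}:         # sole (or first) holder may take exclusive
            self.tokens[obj] = (mode, {node})
            return []
        # Conflicting holders must give up their tokens before the grant;
        # the requester contacts them directly.
        return sorted(holders - {node})

srv = TokenServer()
print(srv.acquire("node1", "inode42", "shared"))     # granted
print(srv.acquire("node2", "inode42", "shared"))     # granted
print(srv.acquire("node3", "inode42", "exclusive"))  # conflicts to revoke
```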
Special management responsibilities
In general, GPFS performs the same functions on all nodes. It handles application requests on the node where the application resides, keeping the data as close to the application as possible.
Use of disk storage and file structure within a GPFS file system
A file system (or stripe group) is a collection of disks that hold file data, file metadata, and supporting entities such as quota files and recovery logs.
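Striping can be sketched as a simple round-robin mapping from file block to disk, so that large sequential I/O engages all disks of the stripe group in parallel. The block size and disk names below are illustrative values, not GPFS defaults:

```python
# Sketch of round-robin striping across the disks of a stripe group.
# Block size and disk names are illustrative, not GPFS defaults.

BLOCK_SIZE = 256 * 1024           # bytes per file system block (example)
disks = ["nsd1", "nsd2", "nsd3"]  # disks in the stripe group (example names)

def block_location(offset):
    """Map a byte offset within a file to (disk, block number on that disk)."""
    block = offset // BLOCK_SIZE
    return disks[block % len(disks)], block // len(disks)

for off in (0, 300_000, 600_000, 900_000):
    print(off, block_location(off))
```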
Memory and GPFS
GPFS uses three types of memory: kernel heap memory, daemon segment memory, and shared memory accessed by both the daemon and the kernel.
GPFS and network communication
You can specify different networks inside the GPFS cluster for GPFS daemon communication and GPFS command usage.
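With mmcrcluster, this separation is expressed in the node descriptors: the node name field identifies the interface the GPFS daemons use for communication, and the optional admin node name field selects a different interface for administration commands. A sketch of a node descriptor file (the hostnames are illustrative):

```
# NodeName:NodeDesignations:AdminNodeName   (hostnames are illustrative)
node1-data.example.com:quorum-manager:node1-admin.example.com
node2-data.example.com::node2-admin.example.com
```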
GPFS application and user interaction
A GPFS file system can be accessed in four ways: through operating system commands, operating system calls, GPFS commands, and GPFS programming interfaces.
Disk discovery with NSD
When the GPFS daemon starts on a node, it discovers the disks defined as NSDs by reading a disk descriptor written on each disk that GPFS operates. This allows the NSDs to be found regardless of the disk's current operating system device name.
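The descriptor-based discovery can be sketched with a toy model: each disk carries a small label naming its NSD, so identification depends on the disk's contents rather than on whatever device name the local OS assigns. The magic string and layout below are illustrative, not the real NSD on-disk format, and ordinary temp files stand in for block devices:

```python
import os
import tempfile

# Toy model of NSD discovery. The magic string and layout are illustrative,
# not the real NSD on-disk format.
MAGIC = b"NSDDESC1"

def label_disk(device, nsd_name):
    """Write a small descriptor naming the NSD onto the 'disk'."""
    with open(device, "wb") as d:
        d.write(MAGIC + nsd_name.encode().ljust(32, b"\0"))

def discover(devices):
    """Return {nsd_name: device} for every device bearing a descriptor."""
    found = {}
    for dev in devices:
        with open(dev, "rb") as d:
            header = d.read(8 + 32)
        if header[:8] == MAGIC:
            found[header[8:].rstrip(b"\0").decode()] = dev
    return found

tmp = tempfile.mkdtemp()
devs = [os.path.join(tmp, n) for n in ("sdb", "sdc")]  # stand-ins for block devices
label_disk(devs[0], "nsd_data1")
label_disk(devs[1], "nsd_meta1")
print(sorted(discover(devs)))
```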
Failure recovery processing
GPFS failure recovery is handled automatically, so while it is not required, some familiarity with its internal workings is useful when failures are observed.
Cluster configuration data files
GPFS commands store configuration and file system information in one or more files, collectively known as the GPFS cluster configuration data files. These files are not intended to be modified manually.
Backup data for GPFS
During command execution, the GPFS mmbackup command creates several files. Some are temporary and are deleted at the end of the backup process; others remain in the root directory of the fileset or file system and should not be deleted.
Clustered configuration repository
The Clustered Configuration Repository (CCR) is used by GPFS and many other IBM Spectrum Scale components, such as the GUI, the CES services, and the monitoring service, to store and retrieve files and values across the cluster.