Concept Guide
5 Dell EMC Ready Solutions for HPC BeeGFS High Performance Storage | ID 460
2 BeeGFS File System
BeeGFS [3] is an open source parallel cluster file system. The software can be downloaded from
www.beegfs.io. The file system software also includes enterprise features such as High Availability, Quota
enforcement, and Access Control Lists. BeeGFS distributes user data across multiple storage nodes.
Because it maps data across many servers and drives, clients can access that data in parallel. It also
provides a global namespace, a single directory tree that all nodes can see. It is easy to deploy BeeGFS and
integrate it with existing systems. The BeeGFS server components are user space daemons. The client is a
native kernel module that does not require any patches to the kernel itself. All BeeGFS components can be
installed and updated without even rebooting the machine. So, clients and servers can be added to existing
systems without any downtime. BeeGFS is a highly scalable file system. By increasing the number of servers
and drives, the performance and capacity can be increased to the required level from small clusters up to
enterprise-class systems with thousands of nodes. Because BeeGFS is software-defined storage that decouples the
storage software from its hardware, it offers flexibility and choice when making hardware purchasing
decisions. In BeeGFS, the roles and hardware are not tightly integrated. The BeeGFS clients and servers can
even run on the same machine. BeeGFS supports a wide range of Linux distributions, including RHEL, CentOS,
and SUSE. BeeGFS is designed to be independent of the local file system used. The local storage can be
formatted with any of the standard Linux file systems, such as xfs or ext4.
The BeeGFS architecture consists of four main services:
• Management Service
• Metadata Service
• Storage Service
• Client Service
Each BeeGFS file system or namespace has only one management service. The management service must be
set up first, because all other services register with it when they are configured.

The Metadata Service is a scale-out service, which means that there can be many metadata services in a
BeeGFS file system. However, each metadata service has exactly one metadata target to store metadata. On
the metadata target, BeeGFS creates one metadata file per user-created file. BeeGFS metadata is distributed
on a per-directory basis. The metadata service provides the data striping information to the clients and is
not involved in data access between file open and close.

The Storage Service stores the user data. A BeeGFS file system can consist of multiple storage servers, and
each storage service can manage multiple storage targets. On those targets, the striped user data is stored
in chunk files for parallel access from the client.

Except for the client service, which is a kernel module, the management, metadata, and storage services are
user space processes. Figure 1 illustrates the general architecture of the BeeGFS file system.
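
To make the idea of striped chunk files more concrete, the following is a minimal sketch of how a client could turn a byte range of a file into per-target requests once it holds the striping information. It assumes a simple round-robin (RAID-0 style) layout and is not BeeGFS's actual implementation. The names StripePattern, locate, and split_read, the 512 KiB chunk size, and the target IDs are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripePattern:
    """Hypothetical stripe description, as a client might receive it at file open."""
    chunk_size: int   # bytes per chunk (illustrative value, not a stated default)
    targets: tuple    # IDs of the storage targets that hold this file's chunk files


def locate(pattern, offset):
    """Map a byte offset in the file to (target ID, offset inside that target's chunk file)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    chunk_index = offset // pattern.chunk_size
    n = len(pattern.targets)
    target = pattern.targets[chunk_index % n]
    # In this model each target stores every n-th chunk back to back.
    local_offset = (chunk_index // n) * pattern.chunk_size + offset % pattern.chunk_size
    return target, local_offset


def split_read(pattern, offset, length):
    """Split one read into per-target pieces that could be issued in parallel."""
    pieces = []
    end = offset + length
    while offset < end:
        # A piece never crosses a chunk boundary, so it lives on exactly one target.
        chunk_end = (offset // pattern.chunk_size + 1) * pattern.chunk_size
        size = min(end, chunk_end) - offset
        target, local_offset = locate(pattern, offset)
        pieces.append((target, local_offset, size))
        offset += size
    return pieces


# Assumed example: four targets, 512 KiB chunks, a 1 MiB read starting at 256 KiB.
pattern = StripePattern(chunk_size=512 * 1024, targets=(101, 102, 201, 202))
for piece in split_read(pattern, offset=256 * 1024, length=1024 * 1024):
    print(piece)
```

In this model the 1 MiB read touches three different targets, so the three pieces can be fetched concurrently from the storage services. The metadata service is not contacted for any of them, which matches the description above of metadata staying out of the data path between open and close.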