
Pools: A Ceph storage cluster stores data objects in logical dynamic partitions called pools. Pools can be
created for particular data types, such as for block devices, object gateways, or simply to separate user
groups. The Ceph pool configuration dictates the number of object replicas and the number of placement
groups (PGs) in the pool. Ceph storage pools can be either replicated or erasure-coded, as appropriate for
the application and cost model. Pools can also "take root" at any position in the CRUSH hierarchy (see below), allowing placement on groups of servers with differing performance characteristics so that storage can be optimized for different workloads.
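Basic pool operations are exposed directly by the python-rados bindings that ship with Ceph. The following is a minimal sketch only: it assumes a reachable cluster, admin credentials available through the default configuration file, and a hypothetical pool name, and it leaves replica count, placement-group count, and erasure-code profiles to the `ceph osd pool create` and `ceph osd pool set` command line.

```python
import rados

# Connect using the standard configuration file (assumes admin credentials).
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

pool_name = 'rbd-fast'  # hypothetical pool name
if not cluster.pool_exists(pool_name):
    # Creates a replicated pool with the cluster's default settings;
    # replica count, PG count, and erasure coding are normally tuned
    # with `ceph osd pool create` / `ceph osd pool set`.
    cluster.create_pool(pool_name)

print(cluster.list_pools())
cluster.shutdown()
```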
Placement groups: Ceph maps objects to placement groups (PGs). PGs are shards or fragments of a
logical object pool that are composed of a group of Ceph OSD daemons that are in a peering relationship.
Placement groups provide a way of creating replication or erasure-coding groups at a coarser granularity than on a per-object basis. A larger number of placement groups (for example, 200 per OSD or more) leads to better balancing.
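As a rough illustration of this sizing guidance, the following Python sketch applies the common rule of thumb of targeting a fixed number of PG replicas per OSD and rounding up to a power of two. It is not a substitute for the official Ceph PG calculator, and the node and OSD counts shown are examples only.

```python
def recommended_pg_num(num_osds, replica_count, target_pgs_per_osd=200):
    """Suggest pg_num for a replicated pool: spread roughly
    target_pgs_per_osd PG replicas across each OSD, rounded up
    to the next power of two."""
    raw = num_osds * target_pgs_per_osd / replica_count
    pg_num = 1
    while pg_num < raw:
        pg_num *= 2
    return pg_num

# Example: 36 OSDs (three hypothetical nodes with 12 OSDs each), 3x replication.
print(recommended_pg_num(36, 3))  # -> 4096
```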
CRUSH ruleset: CRUSH is an algorithm that provides controlled, scalable, and decentralized placement of
replicated or erasure-coded data within Ceph and determines how to store and retrieve data by
computing data storage locations. CRUSH empowers Ceph clients to communicate with OSDs directly,
rather than through a centralized server or broker. Because data locations are computed algorithmically instead of being looked up in a central table, Ceph avoids a single point of failure, a performance bottleneck, and a physical limit to scalability.
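The placement principle can be shown with a deliberately simplified sketch: hash an object name into a PG, then deterministically rank OSDs for that PG so that every client computes the same locations without consulting a lookup service. Real CRUSH uses the rjenkins hash, a hierarchy of buckets, and straw2 selection; the OSD list, PG count, and hashing below are illustrative only.

```python
import hashlib

# Illustrative stand-in for CRUSH, not the actual Ceph implementation.
OSDS = ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4", "osd.5"]
PG_NUM = 8
REPLICAS = 3

def object_to_pg(object_name):
    # Hash the object name into one of the pool's placement groups.
    digest = hashlib.sha1(object_name.encode()).digest()
    return int.from_bytes(digest[:4], "little") % PG_NUM

def pg_to_osds(pg_id):
    # Deterministically rank OSDs for this PG; every client derives
    # the same ordered set, so no central broker is needed.
    ranked = sorted(OSDS,
                    key=lambda osd: hashlib.sha1(f"{pg_id}:{osd}".encode()).hexdigest())
    return ranked[:REPLICAS]

pg = object_to_pg("my-image.chunk-0001")
print(pg, pg_to_osds(pg))
```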
Ceph monitors (MONs): Before Ceph clients can read or write data, they must contact a Ceph MON to
obtain the current cluster map. A Ceph storage cluster can operate with a single monitor, but this
introduces a single point of failure. For added reliability and fault tolerance, Ceph supports an odd number
of monitors in a quorum (typically three or five for small to mid-sized clusters). Consensus among various
monitor instances ensures consistent knowledge about the state of the cluster.
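A client library can ask the monitors about their current quorum through the same command interface used by the `ceph` CLI. The sketch below uses the python-rados `mon_command()` call and assumes the default configuration file and admin credentials; the JSON fields mirror `ceph quorum_status` output and may vary slightly between releases.

```python
import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# Ask the monitors for their current quorum (equivalent to `ceph quorum_status`).
ret, outbuf, errs = cluster.mon_command(
    json.dumps({"prefix": "quorum_status", "format": "json"}), b'')
if ret == 0:
    status = json.loads(outbuf)
    print("monitors in quorum:", status.get("quorum_names"))

cluster.shutdown()
```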
Ceph OSD daemons: In a Ceph cluster, Ceph OSD daemons store data and handle data replication,
recovery, backfilling, and rebalancing. They also provide some cluster state information to Ceph monitors
by checking other Ceph OSD daemons with a heartbeat mechanism. A Ceph storage cluster configured to
keep three replicas of every object requires a minimum of three Ceph OSD daemons, two of which need
to be operational to successfully process write requests. Each Ceph OSD daemon roughly corresponds to a file system on a single hard disk drive.
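The "three replicas, two required for writes" behavior described above corresponds to a pool's size and min_size settings. A minimal sketch follows, again using python-rados `mon_command()` with a hypothetical pool name; the JSON argument names mirror the `ceph osd pool set` CLI and may differ between releases.

```python
import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

def pool_set(pool, var, val):
    # Equivalent to `ceph osd pool set <pool> <var> <val>`.
    ret, out, errs = cluster.mon_command(
        json.dumps({"prefix": "osd pool set", "pool": pool,
                    "var": var, "val": str(val)}), b'')
    if ret != 0:
        raise RuntimeError(errs)

pool_set("mypool", "size", 3)      # keep three copies of every object
pool_set("mypool", "min_size", 2)  # accept writes while two copies are available

cluster.shutdown()
```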
2.2 Selecting a Storage Access Method
Choosing a storage access method is an important design consideration. As discussed, all data in Ceph is
stored in pools, regardless of data type. The data itself is stored in the form of objects by the Reliable Autonomic Distributed Object Store (RADOS) layer, which:
Avoids a single point of failure
Provides data consistency and reliability
Enables data replication and migration
Offers automatic fault-detection and recovery
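Whichever access method is chosen, every object ultimately lands in a pool through RADOS, and the librados Python bindings make that layer visible directly. The sketch below assumes the default configuration file, admin credentials, and an existing pool name.

```python
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# Open an I/O context on an existing pool and store an object through RADOS.
ioctx = cluster.open_ioctx('mypool')        # pool name is an assumption
ioctx.write_full('greeting', b'Hello, RADOS!')
print(ioctx.read('greeting'))               # -> b'Hello, RADOS!'

ioctx.close()
cluster.shutdown()
```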