Open Source Object Storage for Unstructured Data: Ceph on HP ProLiant SL4540 Gen8 Servers
Table Of Contents
- Executive summary
- Introduction
- Overview
- Solution components
- Workload testing
- Configuration guidance
- Bill of materials
- Summary
- Appendix A: Sample Reference Ceph Configuration File
- Appendix B: Sample Reference Pool Configuration
- Appendix C: Syntactical Conventions for command samples
- Appendix D: Server Preparation
- Appendix E: Cluster Installation
- Naming Conventions
- Ceph Deploy Setup
- Ceph Node Setup
- Create a Cluster
- Add Object Gateways
- Apache/FastCGI W/100-Continue
- Configure Apache/FastCGI
- Enable SSL
- Install Ceph Object Gateway
- Add gateway configuration to Ceph
- Redeploy Ceph Configuration
- Create Data Directory
- Create Gateway Configuration
- Enable the Configuration
- Add Ceph Object Gateway Script
- Generate Keyring and Key for the Gateway
- Restart Services and Start the Gateway
- Create a Gateway User
- Appendix F: Newer Ceph Features
- Appendix G: Helpful Commands
- Appendix H: Workload Tool Detail
- Glossary
- For more information
Reference Architecture | Ceph on HP ProLiant SL4540 Gen8 Servers
Introduction
This reference architecture describes a Ceph cluster deployed on HP hardware. It explains why and how to build such a
cluster to solve unstructured, cloud, and backup/archival storage problems. Two points make this relevant to the
reader:
• Object storage is a better solution for unstructured data than traditional storage alone
• The right solution needs the right platform—‘white box’ hardware doesn’t meet enterprise needs at scale
Object storage is architected around the characteristics and usage patterns of Big Data, and it removes the scaling
limitations of traditional storage. As implemented by Ceph, object storage is a software-defined storage (SDS) layer that
federates traditional file and block storage on industry-standard Linux servers. This provides a way to scale out massively
for Big Data needs at lower cost than business-critical SAN/NAS storage targets.
HP hardware is the right platform for a large-scale object storage cluster because it provides better TCO for operating and
maintaining the hardware than ‘white box’ servers. HP provides:
• Platform management tools that scale across data centers
• Server components and form factors that are optimized for enterprise use cases
• Hardware platforms where component parts have been qualified together
• A proven support infrastructure
Clusters built with ‘white box’ servers work for business at small scales, but as they grow, the complexity and cost make
them less compelling than enterprise-focused hardware. With ‘white box’ solutions, IT has to standardize and integrate
platforms and supported components themselves. Support escalation becomes more complicated. Without standardized
toolsets to manage the hardware at scale, IT must chart their own way with platform management and automation. Power
consumption and space inefficiencies of generic platform design also limit scale and increase cost over time.
The result is IT staff working harder and the business spending more to support the quantity and complexity of a ‘white box’
hardware infrastructure. The lowest upfront cost does not deliver the lowest total cost or easiest solution to maintain.
Reference architecture guidance
It’s important to set expectations about what this reference architecture attempts to accomplish, and what that means for
the reader. This paper shows how to implement a Ceph cluster on HP hardware, and why doing so is compelling in both
business and technical terms.
It does not show how to build an entire application solution stack using Ceph.
The distinction is important because the ‘whole solution’ picture that’s present for classic server applications using
traditional block and file storage is only beginning to appear for object storage. Placing an object storage cluster in the data
center is an important step, but an object storage interface requires integration effort to use. It’s a new storage interface.
There are currently no standardized benchmarks, use cases, or typical data patterns for object storage, nor is there
clear guidance on which applications can connect to it. Connecting a Ceph cluster to enterprise applications therefore
takes additional work to identify the pieces and players involved.
This paper also doesn’t provide an exhaustive picture of cluster configuration options.
Scaling storage on industry standard servers is different from standardizing on classic NAS and SAN targets within a single
datacenter. Based on business requirements, a variety of servers and infrastructure options can be selected for cluster
architecture across multiple sites. It all adds up to more variables than this paper can reasonably cover while staying
focused.
Sample reference configuration summary
This Ceph cluster is shown at a high level to give the reader context; it’s based around storage on the HP ProLiant SL4540
Server, which is purpose-built for Big Data. The single rack sample cluster contains:
• Five 2x25 HP ProLiant SL4540 Gen8 Server chassis, with 3TB drives and SSD journals
• Three HP ProLiant DL360p Gen8 Server chassis
• Ubuntu 12.04.3 LTS. Ubuntu is the OS best supported by Ceph software today, and the long-term support release is
the most appropriate for an enterprise environment.
• Ceph running the Dumpling (v0.67) release, which was the most current stable Ceph LTS release at the time of this
testing
• 10GbE networking running on HP 5900AF switches, carrying object data traffic
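To put the drive counts above in perspective, the raw capacity of the sample rack can be estimated with simple arithmetic. The sketch below is illustrative only. It assumes that "2x25" means two server nodes per chassis with 25 drive bays each, and that every bay holds a 3TB data drive (this summary does not say where the SSD journals are housed). The function names and the replica count are hypothetical and are not taken from the tested configuration.

```python
def raw_capacity_tb(chassis=5, nodes_per_chassis=2, drives_per_node=25, drive_tb=3):
    """Raw capacity in TB, before replication or filesystem overhead.

    Defaults mirror the sample rack: five 2x25 chassis with 3TB drives.
    Assumption: every one of the 25 bays per node holds a data drive.
    """
    return chassis * nodes_per_chassis * drives_per_node * drive_tb


def usable_capacity_tb(raw_tb, replicas):
    """Approximate usable capacity when each object is stored `replicas` times.

    `replicas` is a hypothetical input; the value actually used depends on
    the pool configuration, which this summary does not specify.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_tb / replicas


if __name__ == "__main__":
    raw = raw_capacity_tb()
    print(f"Raw capacity: {raw} TB")
    print(f"Usable at 3 replicas (hypothetical): {usable_capacity_tb(raw, 3)} TB")
```

Under these assumptions the rack holds 750 TB raw, and a hypothetical three-way replicated pool would leave roughly 250 TB for data before any filesystem or journal overhead.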