• VMware vStorage APIs for Array Integration (VAAI): This storage offload API allows vSphere to request advanced file system operations such as snapshots and cloning. The controller implements these operations through manipulation of metadata rather than actual data copying, providing rapid response and thus rapid deployment of new application environments.
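The internal mechanics of the HX controller are not published; the following Python sketch only illustrates the general idea of metadata-only cloning, where a clone copies block pointers rather than block data. The Block and Volume classes are hypothetical stand-ins, not HX data structures.

```python
# Hypothetical sketch of metadata-only cloning (not the HX implementation).
# A volume is a map from logical block number to a shared data block;
# cloning copies the map (pointers), never the data itself.

class Block:
    def __init__(self, data: bytes):
        self.data = data
        self.refcount = 1          # how many volume maps reference this block

class Volume:
    def __init__(self):
        self.block_map: dict[int, Block] = {}  # logical block -> shared block

    def write(self, lbn: int, data: bytes) -> None:
        old = self.block_map.get(lbn)
        if old is not None:
            old.refcount -= 1      # copy-on-write: detach from the shared block
        self.block_map[lbn] = Block(data)

    def clone(self) -> "Volume":
        """O(metadata) clone: share every data block by reference."""
        c = Volume()
        for lbn, blk in self.block_map.items():
            blk.refcount += 1
            c.block_map[lbn] = blk
        return c

v = Volume()
v.write(0, b"base image")
c = v.clone()                      # near-instant: only pointers are copied
```

Because only pointers are duplicated, the cost of a clone is proportional to the metadata size rather than the volume size, which is what makes rapid deployment of new environments possible.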
Data Distribution
The Cisco HyperFlex HX Data Platform controller handles all read and write
requests for volumes that the hypervisor accesses and thus intermediates all I/O
from the virtual machines. Recognizing the importance of data distribution, the
Cisco HyperFlex HX Data Platform is designed to exploit low network latencies and
parallelism, in contrast to other approaches that emphasize data affinity.
With data distribution, the data platform stripes data evenly across all nodes, with the number of data replicas determined by the policies you set. This approach avoids both network and storage hot spots and makes I/O performance the same regardless of virtual machine location. This feature gives you more flexibility in workload placement and contrasts with other architectures in which a locality approach does not fully utilize all available networking and I/O resources.
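Cisco does not publish the HX placement algorithm; as a rough illustration of even, replica-aware striping, the sketch below hashes each stripe unit across a set of nodes and places the configured number of replicas on distinct nodes. All names here are illustrative assumptions.

```python
# Hypothetical placement sketch (not the HX algorithm): stripe units are
# hashed across all nodes, and each unit gets replica_count copies on
# distinct nodes, so no single node becomes a hot spot.

import hashlib

def place_replicas(volume_id: str, stripe_unit: int,
                   nodes: list[str], replica_count: int) -> list[str]:
    key = f"{volume_id}:{stripe_unit}".encode()
    start = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(nodes)
    # Walk the node list from the hashed position; nodes stay distinct as
    # long as replica_count <= len(nodes).
    return [nodes[(start + i) % len(nodes)] for i in range(replica_count)]

nodes = ["node1", "node2", "node3", "node4"]
print(place_replicas("vmdk-42", 7, nodes, replica_count=3))
```

Because placement depends only on the hash of the data's identity, not on which node hosts the virtual machine, performance stays uniform wherever the virtual machine runs.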
• Data write operations: For write operations, data is written to the local SSD
cache and the replicas are written to the remote SSD drives in parallel before the
write is acknowledged.
• Data read operations: For read operations, data that happens to be local is usually read directly from the local SSD drive. If the data is not local, it is retrieved from an SSD drive on a remote node. This approach allows the platform to use all SSD drives for reads, eliminating bottlenecks and delivering superior performance (see the sketch following this list).
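As a combined illustration of both bullets, the following sketch writes a block to a local cache and to remote replicas in parallel, acknowledging only when every copy has landed, and reads from the local cache first with a remote-replica fallback. The CacheNode class and function names are assumptions for the sketch, not HX interfaces.

```python
# Hypothetical I/O-path sketch: a write lands in the local SSD cache and is
# replicated to remote SSD caches in parallel; the acknowledgment waits for
# all replicas. Reads prefer local data but can use any SSD replica.

from concurrent.futures import ThreadPoolExecutor

class CacheNode:
    """Toy stand-in for a node's SSD cache."""
    def __init__(self):
        self.store: dict[int, bytes] = {}
    def put(self, lbn: int, data: bytes) -> None:
        self.store[lbn] = data
    def get(self, lbn: int) -> bytes | None:
        return self.store.get(lbn)

def write_block(local: CacheNode, remotes: list[CacheNode],
                lbn: int, data: bytes) -> None:
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(local.put, lbn, data)]
        futures += [pool.submit(n.put, lbn, data) for n in remotes]
        for f in futures:
            f.result()             # acknowledge only after every replica lands

def read_block(local: CacheNode, remotes: list[CacheNode], lbn: int) -> bytes | None:
    data = local.get(lbn)          # fast path when the data happens to be local
    if data is None:
        for node in remotes:       # otherwise read from any SSD replica
            data = node.get(lbn)
            if data is not None:
                break
    return data

local, remotes = CacheNode(), [CacheNode(), CacheNode()]
write_block(local, remotes, lbn=7, data=b"payload")
assert read_block(CacheNode(), remotes, 7) == b"payload"  # non-local read
```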
In addition, when a virtual machine is moved to a new location, such as through VMware Distributed Resource Scheduler (DRS), the data platform does not require data movement, so there is no impact or cost to moving virtual machines.
Data Operations
The data platform implements a log-structured file system that uses a caching layer
in SSD drives to accelerate read requests and write responses, and it implements a
capacity layer with HDDs.
Incoming data is striped across the number of nodes that you define to meet your data availability requirements. The log-structured file system assembles blocks to be written to a configurable cache until the buffer is full or workload conditions dictate that it be destaged to a spinning disk. When existing data is (logically) overwritten, the log-structured approach simply appends a new block and updates the metadata. When data is destaged, the write operation consists of a single seek with a large amount of data written. This approach significantly improves performance compared with the traditional read-modify-write model, which is characterized by numerous seek operations and small amounts of data written at a time.
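The sketch below illustrates the log-structured pattern just described, under stated assumptions: logical overwrites become appends plus a metadata update, and a full buffer is destaged to the capacity tier in one large write. The class, field names, and threshold are hypothetical, not HX internals.

```python
# Hypothetical log-structured cache sketch: overwrites append to the SSD log
# and update metadata; a full buffer is destaged to HDD as one large
# sequential write, avoiding the read-modify-write pattern.

class LogStructuredCache:
    def __init__(self, destage_threshold: int = 4):
        self.log: list[tuple[int, bytes]] = []   # append-only (lbn, data) log
        self.metadata: dict[int, int] = {}       # lbn -> index of newest copy
        self.destage_threshold = destage_threshold
        self.hdd: dict[int, bytes] = {}          # stand-in for the capacity tier

    def write(self, lbn: int, data: bytes) -> None:
        self.log.append((lbn, data))             # logical overwrite = append
        self.metadata[lbn] = len(self.log) - 1   # metadata points at newest copy
        if len(self.log) >= self.destage_threshold:
            self.destage()

    def destage(self) -> None:
        # One large sequential write of only the live (newest) blocks.
        for lbn, idx in self.metadata.items():
            self.hdd[lbn] = self.log[idx][1]
        self.log.clear()
        self.metadata.clear()

cache = LogStructuredCache(destage_threshold=2)
cache.write(1, b"v1")
cache.write(1, b"v2")    # logical overwrite; destage keeps only the newest copy
assert cache.hdd[1] == b"v2"
```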
When data is destaged to disk in each node, the data is deduplicated and compressed. This process occurs after the write operation is acknowledged, so no performance penalty is incurred for these operations.
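As a minimal sketch of that post-acknowledgment step, assuming content-hash deduplication and a generic compressor (the HX document does not specify either), already-acknowledged blocks can be fingerprinted, deduplicated, and compressed on their way to the capacity tier without sitting in the write-latency path:

```python
# Hypothetical post-acknowledgment destage sketch: blocks already acked from
# the cache are deduplicated by content hash and compressed before landing
# on the capacity tier, keeping both steps out of the write-latency path.

import hashlib
import zlib

def destage(blocks: list[bytes], capacity_tier: dict[str, bytes]) -> None:
    for data in blocks:
        fingerprint = hashlib.sha256(data).hexdigest()
        if fingerprint not in capacity_tier:   # identical blocks stored once
            capacity_tier[fingerprint] = zlib.compress(data)

tier: dict[str, bytes] = {}
destage([b"aaaa", b"aaaa", b"bbbb"], tier)
assert len(tier) == 2                          # duplicate block was elided
```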