Data management paper

How BigMemory Max Works
Traditional servers typically rely on the
venerable concept of hierarchical storage.
Operating systems and applications, which
need to run as fast as possible, load from
slower disk-based storage into a server’s
high-speed RAM. On the other hand, bulk
data (such as documents, media files,
or raw data captured from transactional
systems) is stored on slower, often less
expensive hard-disk-based storage, and
only loaded into RAM when required by
an application. Loading data from disk
into RAM is time-consuming, and can
signicantly slow analysis tasks.
Now, servers can utilize hundreds of
gigabytes, or even terabytes, of cost-
effective RAM to store data. Operating
systems and applications often only
require gigabytes or tens of gigabytes
of memory, which can leave large
amounts of RAM unused. BigMemory Max
congures unused memory as storage
space that applications can use to store,
retrieve, and analyze bulk data.
By combining multiple servers into arrays,
BigMemory Max can create large pools
of high-speed memory that applications
can use to store, retrieve, and analyze
extremely large datasets. In a single-
server environment, the BigMemory
Max client manages data in the server’s
local RAM. In a distributed environment,
the BigMemory Max client manages
data within the server’s RAM and also
maintains a network connection between
the members of the BigMemory Max
server array, moving data as needed
among the member servers’ RAM.
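The idea of a client spreading data over the RAM of several servers can be illustrated with a minimal sketch. The source does not describe how BigMemory Max decides which server holds which entry, so the hash-based placement below is an assumption, and the class name ServerArraySketch and its methods are hypothetical rather than part of any Terracotta API. Each in-process map stands in for one member server's RAM.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a client that spreads entries over several servers' RAM.
// Not the BigMemory Max client; hash-based placement is an assumption for illustration.
class ServerArraySketch {
    // Each map models the RAM-resident store of one member server.
    private final List<Map<String, String>> servers = new ArrayList<>();

    ServerArraySketch(int serverCount) {
        if (serverCount < 1) {
            throw new IllegalArgumentException("need at least one server");
        }
        for (int i = 0; i < serverCount; i++) {
            servers.add(new HashMap<>());
        }
    }

    // Choose the owning server from the key's hash; floorMod keeps the index non-negative.
    int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), servers.size());
    }

    // Store the entry on the server that owns the key.
    void put(String key, String value) {
        servers.get(ownerOf(key)).put(key, value);
    }

    // Look the entry up on the same server; returns null if the key is absent.
    String get(String key) {
        return servers.get(ownerOf(key)).get(key);
    }

    // Total number of entries held across all member servers.
    int totalEntries() {
        int total = 0;
        for (Map<String, String> server : servers) {
            total += server.size();
        }
        return total;
    }
}
```

Because every key maps to exactly one member, the application sees a single logical store whose capacity grows with the number of servers in the array.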
BigMemory Max also protects data and
can deliver 99.999% uptime by mirroring
the data across multiple servers. RAM is
volatile and does not persist data when
power is removed. If a server outage
occurs, any data in that server’s RAM is
lost. BigMemory Max pairs two servers in a
mirrored conguration where each server
keeps a complete copy of the data. If either
server experiences a power failure or
some other fault, the data is not lost.
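The mirrored pairing described above can be sketched as follows. This is a simplified, single-process model and not BigMemory Max code: the class MirroredPairSketch and its methods are hypothetical, and the real product's replication and failover mechanics are not described in the source. The sketch only shows why a complete second copy lets reads and writes continue after one server loses the contents of its RAM.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of two servers that each keep a complete copy of the data.
class MirroredPairSketch {
    private final Map<String, String> primary = new HashMap<>();
    private final Map<String, String> mirror = new HashMap<>();
    private boolean primaryUp = true;
    private boolean mirrorUp = true;

    // Write to every server that is still running so each keeps a full copy.
    void put(String key, String value) {
        if (!primaryUp && !mirrorUp) {
            throw new IllegalStateException("both servers are down");
        }
        if (primaryUp) {
            primary.put(key, value);
        }
        if (mirrorUp) {
            mirror.put(key, value);
        }
    }

    // Read from the primary when it is up, otherwise fall back to the mirror.
    String get(String key) {
        if (primaryUp) {
            return primary.get(key);
        }
        if (mirrorUp) {
            return mirror.get(key);
        }
        throw new IllegalStateException("both servers are down");
    }

    // Simulate a power failure: RAM is volatile, so the server's copy is gone.
    void failPrimary() {
        primaryUp = false;
        primary.clear();
    }

    void failMirror() {
        mirrorUp = false;
        mirror.clear();
    }
}
```

In this model the data is lost only if both members of the pair fail, which is the property the mirrored configuration relies on.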
How Applications Access BigMemory Max
Applications can access BigMemory Max
storage using several methods. A common
method is the industry-standard Java*
Ehcache API that provides common methods
for storing and retrieving data, in addition to
advanced search and analysis capabilities.
Other common APIs, such as those found
in C# and C++ libraries, can also access and
manage data in BigMemory Max.
BigMemory Max provides an option
for storing data as Java objects, which
simplies application programming and
reduces the amount of data transformation
overhead that relational databases require.
Software engineers can use common
protocols such as MOM, HTTP, REST, and
SOAP to access BigMemory Max.
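To make the access pattern concrete, the following minimal sketch shows storing Java objects by key, retrieving them without any conversion step, and searching by attribute. It deliberately uses only the Java standard library: ObjectStoreSketch and Customer are hypothetical names introduced for illustration, and none of this is the Ehcache or BigMemory Max API, whose actual classes and method signatures should be taken from the product documentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical domain object stored as-is, with no mapping to rows and columns.
class Customer {
    final String id;
    final String city;
    final double balance;

    Customer(String id, String city, double balance) {
        this.id = id;
        this.city = city;
        this.balance = balance;
    }
}

// Hypothetical in-memory key-value store; a stand-in, not the Ehcache API.
class ObjectStoreSketch<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    // Store the object under its key.
    void put(K key, V value) {
        entries.put(key, value);
    }

    // Return the stored object, or null if the key is absent.
    V get(K key) {
        return entries.get(key);
    }

    // Return every stored object that satisfies the given criteria (linear scan).
    List<V> search(Predicate<? super V> criteria) {
        List<V> hits = new ArrayList<>();
        for (V value : entries.values()) {
            if (criteria.test(value)) {
                hits.add(value);
            }
        }
        return hits;
    }
}
```

A caller would write, for example, store.put(customer.id, customer) and later store.search(c -> c.city.equals("London")). The application works with the same Customer objects throughout, which is the reduction in transformation overhead the text refers to.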
Figure 1. Terracotta BigMemory Max* creates a data store from unused RAM, which is
accessible by applications in microseconds
[Diagram labels: RAM Utilization; Application; 2 GB; Unused Memory, 6+ TB; BigMemory Max, 6+ TB]
Terracotta and Intel: Powering the
New Generation of Data Analysis
Built on the Intel 22 nanometer
architecture, the Intel Xeon processor
E7 v2 family provides key scalability,
performance, and reliability enhancements
that increase the performance and
flexibility of BigMemory Max. These
features can help enterprises reduce the
complexity and total cost of ownership of
their analytical engines while enjoying the
benets of higher performance and uptime.
Scalability
Servers equipped with the Intel Xeon
processor E7 v2 family provide higher
memory capacities than previous-
generation Intel Xeon processors.
Terracotta and Intel: Breaking Down Barriers to In-memory Big Data Management