
ability to behave predictably under load at any scale. Minor performance fluctuations on individual servers can become magnified as multiple servers are pooled into larger server arrays.
In recent tests, Intel and Terracotta
engineers created an environment to
determine the performance and latency
predictability of servers equipped with
the Intel Xeon processor E7 v2 family and
BigMemory Max. The benchmark evaluated performance with large, in-memory data sets by measuring the total number of transactions per second and transaction latency, the round-trip time in milliseconds to complete each transaction.
The test conguration consisted of a
single server running BigMemory Max
4, and was equipped with four of the
Intel Xeon processor E7-4890 v2 and
6 terabytes of RAM. The tests were
repeated against multiple BigMemory Max
data store sizes of 2 TB, 4 TB, and 5.5
terabytes.
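For context, a BigMemory Max data store of this kind is exposed to applications as an Ehcache cache with an off-heap tier. The snippet below is a minimal sketch using the Ehcache 2.x programmatic configuration classes; the cache name, on-heap entry count, and 64 GB off-heap size are illustrative assumptions, not the settings used in these tests.

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.config.CacheConfiguration;
import net.sf.ehcache.config.Configuration;
import net.sf.ehcache.config.MemoryUnit;

public class OffHeapCacheSetup {
    public static void main(String[] args) {
        // Hypothetical cache manager; the benchmark's actual data store sizes
        // were 2 TB, 4 TB, and 5.5 TB on the BigMemory Max server.
        Configuration managerConfig = new Configuration().name("benchmarkManager");

        CacheConfiguration cacheConfig = new CacheConfiguration()
                .name("benchmarkCache")
                .maxEntriesLocalHeap(10_000)                      // small on-heap hot set
                .overflowToOffHeap(true)                          // keep the bulk of the data off-heap (BigMemory)
                .maxBytesLocalOffHeap(64, MemoryUnit.GIGABYTES);  // illustrative off-heap tier size

        CacheManager manager = CacheManager.create(managerConfig);
        manager.addCache(new Cache(cacheConfig));

        Cache cache = manager.getCache("benchmarkCache");
        System.out.println("Off-heap cache ready: " + cache.getName());

        manager.shutdown();
    }
}
```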
Seven client machines generated the workload, each equipped with two Intel Xeon processor E5-2697 v2 CPUs. The server and client machines communicated over a 1 gigabit Ethernet network.
The client machines used the Ehcache API
to read from and write to the BigMemory
Max data store on the server. A client machine bulk loaded data into the data store up to its maximum capacity of 5.5 TB. Once the data load completed,
each of the seven clients executed up to
128 threads concurrently for a maximum
of 896 threads, with each thread
representing a transaction between the
client and server. To simulate real-world conditions, the tests consisted of a ratio of 90 percent reads from the data store to 10 percent writes to the data store. Each transaction carried an average payload of 2 kilobytes stored as a binary array.

Figure 3: Terracotta BigMemory Max* provides predictable throughput as the number of concurrent client threads increases. (Chart: throughput in transactions per second versus concurrent client threads; series: Bulk Load TPS and Read/Write TPS.)

Figure 4: Terracotta BigMemory Max* provides predictable latency and throughput as data volume increases. (Chart: throughput in transactions per second and latency in milliseconds for data stores of 2 TB / 1 billion elements, 4 TB / 2 billion elements, and 5.5 TB / 3 billion elements.)
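For illustration, the sketch below approximates the kind of client driver described above: a fixed pool of threads issuing a 90 percent read / 10 percent write mix through the Ehcache API with roughly 2 KB binary payloads, reporting throughput and average round-trip latency. It is a simplified, hypothetical harness, not the one used in the tests; the cache name, key space, and operation counts are assumptions.

```java
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class ReadWriteWorkload {
    private static final int THREADS = 128;            // threads per client machine in the test
    private static final int OPS_PER_THREAD = 100_000; // illustrative operation count
    private static final int KEY_SPACE = 1_000_000;    // illustrative key range
    private static final int PAYLOAD_BYTES = 2 * 1024; // ~2 KB binary payload per transaction

    public static void main(String[] args) throws InterruptedException {
        CacheManager manager = CacheManager.newInstance(); // assumes an ehcache.xml on the classpath
        Cache cache = manager.getCache("benchmarkCache");  // hypothetical cache name

        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        AtomicLong operations = new AtomicLong();
        AtomicLong latencyNanos = new AtomicLong();
        long wallStart = System.nanoTime();

        for (int t = 0; t < THREADS; t++) {
            pool.submit(() -> {
                Random random = new Random();
                byte[] payload = new byte[PAYLOAD_BYTES];
                for (int i = 0; i < OPS_PER_THREAD; i++) {
                    Integer key = random.nextInt(KEY_SPACE);
                    long start = System.nanoTime();
                    if (random.nextInt(100) < 90) {
                        cache.get(key);                        // 90 percent reads
                    } else {
                        cache.put(new Element(key, payload));  // 10 percent writes
                    }
                    latencyNanos.addAndGet(System.nanoTime() - start);
                    operations.incrementAndGet();
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);

        double wallSeconds = (System.nanoTime() - wallStart) / 1_000_000_000.0;
        double avgLatencyMs = latencyNanos.get() / 1_000_000.0 / operations.get();
        System.out.printf("throughput: %.0f TPS, average round-trip latency: %.2f ms%n",
                operations.get() / wallSeconds, avgLatencyMs);
        manager.shutdown();
    }
}
```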
The results in Figure 3 demonstrate how
BigMemory Max provides predictable
throughput under load as the number
of concurrent threads increases. As
the number of threads increased, the
transactions per second (TPS) increased in
a predictable manner. At 224 concurrent
threads, the server produced 46,506
TPS, and increased to 63,492 TPS at 896
concurrent threads. The CPU utilization
peaked at only 15 percent with seven
clients running a total of 896 concurrent
threads.
Figure 4 illustrates the scalability of
BigMemory Max. The server memory capacity and BigMemory Max data store were increased incrementally to 2 TB, 4 TB, and 5.5 TB. The performance tests were run with each memory configuration. The
volume of data used in each test increased in relation to the size of the BigMemory Max data store.