
Big Data Integration for Massive Scalability
Many businesses already need to store and analyze terabytes, and in some cases petabytes, of data. It is possible to build a real-time, in-memory business platform with petabyte scalability (see the sidebar, "In-Memory Computing at Petabyte Scale"). However, this approach is neither financially practical nor necessary for most businesses. Apache Hadoop* offers a more cost-effective solution for integrating very large data volumes with in-memory computing environments.
Hadoop provides massively scalable storage and analytics on a distributed architecture based on affordable servers and commodity disk drives. It offers a cost-effective way to ingest, prepare, and store warm data for inclusion in the real-time analytics environment. Petabytes of data, including all data types, can be stored at a cost per terabyte that is not only much lower than an in-memory database, but also much lower than a traditional, disk-based storage system.
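As a minimal sketch of what landing such warm data in Hadoop might look like, the example below assumes a Hadoop cluster reachable over WebHDFS and uses the open-source hdfs (HdfsCLI) Python client; the host name, path, and record layout are hypothetical illustrations, not details from this paper.

# Illustrative sketch: land aged operational records in Hadoop as warm data.
# Assumes the open-source "hdfs" (HdfsCLI) Python client and a WebHDFS endpoint;
# the host, port, path, and record layout are hypothetical examples.
import csv
import io
from hdfs import InsecureClient

client = InsecureClient("http://hadoop-namenode:50070", user="etl")

# Records that have aged out of the hot, in-memory tier (made-up sample data).
warm_records = [
    ("2013-04-01", "ORD-10001", 412.50),
    ("2013-04-01", "ORD-10002", 87.25),
]

buffer = io.StringIO()
csv.writer(buffer).writerows(warm_records)

# Write the batch into a dated directory so downstream tools (for example,
# a Hive external table) can expose it for analysis alongside the hot data.
client.write("/warehouse/sales_archive/2013-04-01/part-0000.csv",
             data=buffer.getvalue(), encoding="utf-8", overwrite=True)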
Intel and SAP offer an integrated solution today based on SAP HANA and the Intel® Distribution for Apache Hadoop (IDH) software. Business users and data analysts see the data in Hadoop as an extension of the SAP HANA data set, and queries are automatically federated across both platforms. IDH also provides enterprise-class tools and capabilities for management and data security. The combined solution supports real-time analytics acting on petabytes of data.
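To make the federation concrete, here is a minimal sketch of what a single federated query might look like from the analyst's side, using SAP's hdbcli Python client. The table names SALES_HOT and SALES_ARCHIVE, the host, and the credentials are hypothetical, and the sketch assumes an administrator has already exposed the Hadoop-resident table to SAP HANA as a virtual table.

# Illustrative sketch of a federated query against SAP HANA.
# Assumes SAP's hdbcli Python client. SALES_HOT (in-memory) and SALES_ARCHIVE
# (a virtual table backed by data stored in Hadoop) are hypothetical names;
# the remote source is assumed to be configured separately.
from hdbcli import dbapi

conn = dbapi.connect(address="hana-host", port=30015,
                     user="ANALYST", password="example-only")
cursor = conn.cursor()

# One query spans both tiers; the platform federates the Hadoop portion,
# so the analyst works with a single result set.
cursor.execute("""
    SELECT region, SUM(amount) AS total_sales
    FROM (
        SELECT region, amount FROM SALES_HOT      -- recent, in-memory data
        UNION ALL
        SELECT region, amount FROM SALES_ARCHIVE  -- warm data held in Hadoop
    ) combined
    GROUP BY region
    ORDER BY total_sales DESC
""")

for region, total_sales in cursor.fetchall():
    print(region, total_sales)

cursor.close()
conn.close()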
Other database vendors are following suit. Hadoop is becoming the de facto standard for storing and analyzing massive, unstructured data sets. There may come a time when it is economical to store all data in main memory. Until then, integrating Hadoop and other massively scalable solutions with in-memory computing platforms will be key to optimizing capability versus cost across all enterprise data and all business requirements.
Where to Start
In time, all business computing will be done in memory. Today, businesses have to move forward intelligently, by balancing cost, risk, and value. Companies with high-value use cases that cannot be solved using traditional tools should consider implementing in-memory computing sooner rather than later. For others, it may make more sense to wait. As vendors continue to integrate in-memory capability into their core products, costs will go down and integration will become simpler.
Regardless of your current needs and goals,
now is the time to:
• Evaluate the potential of in-memory computing in your specific business and industry. Work with business units to identify potential, high-value use cases. Consider what you could do better, faster, or differently if you could analyze large data sets, including fresh operational data, almost instantly.
• Explore current in-memory solutions and track progress as new solutions emerge. Given the proven business value and the maturity of the enabling technologies, in-memory computing can be expected to advance rapidly.
• Quantify the potential business value and consider the cost and risk of implementation. There is no doubt that a time will come when the benefits of in-memory computing exceed the cost and risk of implementation for your business. The potential benefits are huge. Be prepared so you can make the right move at the right time.
Conclusion
In-memory computing represents a paradigm shift in the way businesses manage and use data. The unprecedented speed and scale of in-memory databases allow companies to host transactional and analytical applications on the same database. Operational data is available for analysis as soon as it is generated, and complex queries can be completed in seconds rather than hours or days. Infrastructure and operational requirements are also greatly reduced, which can lead to dramatic savings in total cost of ownership.
In-memory solutions are available today from dozens of vendors, and all major database vendors, including SAP, IBM, Oracle, and Microsoft, offer or will soon offer in-memory options. The Intel Xeon processor E7 v2 family powers a new generation of servers that are specifically optimized for in-memory computing, delivering up to 2x higher performance than previous-generation servers[11] and providing up to 3x higher memory scalability. They are ideal for the data-intensive, mission-critical demands of in-memory computing.
In-Memory Computing at Petabyte Scale
In May 2012, Intel and SAP launched the SAP HANA* petascale cloud, a 100 TB in-memory system consisting of 100 servers based on the Intel® Xeon® processor E7 family. They have since expanded this cloud infrastructure to include more than 250 servers, 8,000 threads, 4,000 cores, and 250 TB of RAM, all capable of running a single instance of SAP HANA.
Used for customer proof-of-concept projects and as a laboratory for Intel and SAP research teams, this petascale cloud environment has clearly demonstrated that in-memory computing can scale to deliver real-time performance while acting on massive data volumes.
For more information, see the Intel and SAP solution brief, "Scaling Real-Time Analytics across the Enterprise—and into the Cloud."[7]