
The “Big Data” Challenge
The term big data refers to the growing
flood of data in today’s hyperconnected
world, and to the enormous challenges
and opportunities this data presents. In a
very real sense, however, data has been
too big for decades. Since the first Intel
processor was launched more than forty
years ago, the data access speeds of bulk
storage systems have not kept pace with
processor capability.
Processor speed and functionality are
delivered in silicon, and have benefitted
from the rapid transition toward smaller
and faster silicon process technologies
predicted by Moore’s Law.
Bulk storage has long been based on mechanical technologies, from punch cards to magnetic tape to the spinning magnetic disks of today's hard disk drives (HDDs). Although the cost per gigabyte (GB) of storage has declined rapidly over the years, storage performance has been limited by the mechanical nature of the data access process.
Faster, silicon-based data storage technologies have existed for many years, but have been too costly for bulk storage. Instead, such technologies have been used for the main memory of computing systems and for the even faster cache subsystems that reside directly on the processor die. Although these high-speed memory subsystems ameliorate the data access problem to some degree, their limited capacity has been a performance bottleneck. Getting the right data out of bulk storage and into the right processor registers at the right time has been a tough challenge for decades.
Database vendors have done much over the years to work around and mask this performance gap, but the resulting cost and complexity have been significant. As illustrated in Figure 1, traditional HDD-dependent information infrastructure requires:
• Separate databases for transactional and analytical applications, along with separate infrastructure. In each case, hardware and software must be optimized to achieve acceptable performance.
• Multiple data marts to address specialized business intelligence needs without overloading data warehouses.
• Constant tuning and optimization of databases and storage systems to deliver acceptable performance, especially for analytical workloads.
Despite all the cost and effort, customers still experience long delays between the time that data is generated and the time it is available for analysis. Data must be extracted from transactional systems, transformed into required formats, and loaded into the analytics environment. In many cases, the data models in the warehouse or data mart must then be re-optimized for performance. Even with all this preparatory work, complex queries can still take many hours to complete.
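To make the extract-transform-load (ETL) pattern described above concrete, the following is a minimal sketch in Python. The table names, schema, and transformation rule are illustrative assumptions, not features of any particular product; the sketch simply shows why analytics data lags the transactional system: rows must be pulled out in bulk, reshaped, and reloaded in periodic batches.

import sqlite3

# Hypothetical example: a transactional (OLTP) database and a separate
# analytics (warehouse) database, connected only by a periodic batch job.
oltp = sqlite3.connect(":memory:")       # stands in for the transactional system
warehouse = sqlite3.connect(":memory:")  # stands in for the data warehouse

oltp.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
oltp.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "EMEA", 120.0), (2, "APAC", 75.5), (3, "EMEA", 210.0)])
oltp.commit()

warehouse.execute("CREATE TABLE sales_by_region (region TEXT, total REAL)")

def run_etl_batch():
    # Extract: pull raw rows from the transactional system.
    rows = oltp.execute("SELECT region, amount FROM orders").fetchall()
    # Transform: aggregate into the shape the analytics model expects.
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    # Load: replace the warehouse table with the freshly prepared data.
    warehouse.execute("DELETE FROM sales_by_region")
    warehouse.executemany("INSERT INTO sales_by_region VALUES (?, ?)",
                          totals.items())
    warehouse.commit()

# Analysts query the warehouse copy, which is only as fresh as the last batch.
run_etl_batch()
print(warehouse.execute("SELECT * FROM sales_by_region ORDER BY region").fetchall())

Because each run copies and reshapes the data in bulk, the warehouse lags the transactional system by the length of the batch window, which is the delay described above.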
Harnessing Big Data with
In-Memory Computing
In-memory computing changes the computing paradigm fundamentally (see Figure 2 on the next page). All relevant data is kept in the main memory of the computing system, rather than in a separate bulk storage system. Data can be accessed orders of magnitude faster, so fast that transactional and analytical applications can run simultaneously on the same database running on the same infrastructure. The need for separate data warehouses and data marts is eliminated, along with the associated costs.
Data is available for analysis as soon as it is generated. Even if it is generated on a separate system, it can be replicated almost instantly into an in-memory database.
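As a rough illustration of this idea, and not a depiction of any specific in-memory product, the sketch below keeps a single SQLite database entirely in memory and runs a transactional insert and an analytical aggregation against the same table, with no intermediate copy. The table and column names are assumptions made up for the example.

import sqlite3

# Hypothetical example: one in-memory database serves both workloads.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# Transactional work: new orders are written as they arrive.
db.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("EMEA", 120.0))
db.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("APAC", 75.5))
db.commit()

# Analytical work: the same data is immediately visible to aggregate queries,
# with no extract-transform-load step and no separate warehouse copy.
for region, total in db.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"):
    print(region, total)

The point of the sketch is only the data path: because the working copy lives in memory and is shared by both kinds of queries, analysis sees new records as soon as they are committed.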
[Figure 1 diagram: transactional infrastructure (OLTP, ERP, CRM) feeds a data warehouse and data marts through batch extraction, transformation, and loading (ETL), which in turn supply the analytics infrastructure (OLAP, reporting, data mining). Challenges of traditional business intelligence: slow business response; high infrastructure and operational costs; long time to insight (hours to days).]
Figure 1. Traditional, HDD-dependent information infrastructure requires separate transactional and analytical infrastructure, resulting in high costs and long delays between data generation and insight.