1.1

Table Of Contents

• In an online auction application, you may need to stream incoming auction bids to hundreds of clients with

very low latency. To do so, you must manage selected "hot" auctions on a single partition so that they receive

sufﬁcient processing power. As the processing demand increases, add more partitions and route the application

logic that matches bids to clients to the data store itself.

• In a ﬁnancial trading engine that constantly matches bid prices to asking prices for thousands of securities,

partition data using ID of the security. When market data for a security changes, all of the related reference

data is co-located with the matching algorithm to ensure low-latency execution.

Life beyond Distributed Transactions provides additional background information about entity groups and

distributed systems.

Creating Partitioned Tables on page 62 describes other ways to partition a table in SQLFire.

Use Entity Groups Instead of "Sharding"

Sharding is a common strategy in traditional databases where database rows are horizontally partitioned into

multiple "database shards."

There are a number of differences between application-assisted or system-assisted sharding and SQLFire

partition-aware database design:

• Rebalancing: SQLFire rebalances the data automatically by making sure that related rows are migrated together

and without any integrity loss. This enables you to add capacity as needed.

• Distributed transactions: SQLFire transaction boundaries are not limited to a single partition. Atomicity and

isolation guarantees are provided across the entire distributed system.

• Parallel scatter-gather: Queries that cannot be pruned to a single partition are automatically executed in

parallel on data stores. Joins can be performed between tables with the restriction that the joined rows are in

fact collocated.

• Subqueries on remote partitions: Even when a query is pruned to a single partition, the query can execute

subqueries that operate on data that is stored on remote partitions.

Replicate Code Tables

The "star" schema is the most common relational design pattern. In a star schema, large "fact" tables have foreign

key relationships to one or more "code" or "dimension" tables. With SQLFire, you should denormalize code

tables into fact tables when possible. When this is not possible, determine whether code tables can be replicated

in the SQLFire cluster.

The main goal is to partition the entities in fact tables, but to replicate the rows in slow-changing code tables on

each member where the partitions are managed. In this way, a join between the fact table and any number of its

code tables can be executed concurrently on each partition, without requiring network hops to other members.

Replicating Tables on page 69 provides more information.

Dealing with Many-to-Many Relationships

Where tables have many-to-many relationships, you have a choice of strategies for handling queries that need

to join non-colocated data.

Note: Joins are permitted only on data that is colocated. Query execution can be distributed and executed

in parallel, but the joined rows in each partition member have to be restricted to other rows in the same

partition.

For tables have many-to-many relationships, choose one of the following strategies to handle queries that need

to join non-colocated data:

vFabric SQLFire User's Guide50

Managing Your Data in vFabric SQLFire