1.1

Table Of Contents
In an online auction application, you may need to stream incoming auction bids to hundreds of clients with
very low latency. To do so, you must manage selected "hot" auctions on a single partition so that they receive
sufcient processing power. As the processing demand increases, add more partitions and route the application
logic that matches bids to clients to the data store itself.
In a nancial trading engine that constantly matches bid prices to asking prices for thousands of securities,
partition data using ID of the security. When market data for a security changes, all of the related reference
data is co-located with the matching algorithm to ensure low-latency execution.
Life beyond Distributed Transactions provides additional background information about entity groups and
distributed systems.
Creating Partitioned Tables on page 62 describes other ways to partition a table in SQLFire.
Use Entity Groups Instead of "Sharding"
Sharding is a common strategy in traditional databases where database rows are horizontally partitioned into
multiple "database shards."
There are a number of differences between application-assisted or system-assisted sharding and SQLFire
partition-aware database design:
Rebalancing: SQLFire rebalances the data automatically by making sure that related rows are migrated together
and without any integrity loss. This enables you to add capacity as needed.
Distributed transactions: SQLFire transaction boundaries are not limited to a single partition. Atomicity and
isolation guarantees are provided across the entire distributed system.
Parallel scatter-gather: Queries that cannot be pruned to a single partition are automatically executed in
parallel on data stores. Joins can be performed between tables with the restriction that the joined rows are in
fact collocated.
Subqueries on remote partitions: Even when a query is pruned to a single partition, the query can execute
subqueries that operate on data that is stored on remote partitions.
Replicate Code Tables
The "star" schema is the most common relational design pattern. In a star schema, large "fact" tables have foreign
key relationships to one or more "code" or "dimension" tables. With SQLFire, you should denormalize code
tables into fact tables when possible. When this is not possible, determine whether code tables can be replicated
in the SQLFire cluster.
The main goal is to partition the entities in fact tables, but to replicate the rows in slow-changing code tables on
each member where the partitions are managed. In this way, a join between the fact table and any number of its
code tables can be executed concurrently on each partition, without requiring network hops to other members.
Replicating Tables on page 69 provides more information.
Dealing with Many-to-Many Relationships
Where tables have many-to-many relationships, you have a choice of strategies for handling queries that need
to join non-colocated data.
Note: Joins are permitted only on data that is colocated. Query execution can be distributed and executed
in parallel, but the joined rows in each partition member have to be restricted to other rows in the same
partition.
For tables have many-to-many relationships, choose one of the following strategies to handle queries that need
to join non-colocated data:
vFabric SQLFire User's Guide50
Managing Your Data in vFabric SQLFire