Specifications
118 CHAPTER 6 Scalable Data Warehousing
CREATE DATABASE
The CREATE DATABASE statement has a set of options for supporting distributed and replicated
tables. You determine how much space you need in total for the database for replicated
tables, distributed tables, and logs. Parallel Data Warehouse manages the database according
to your specifications.
Here is an example of the statement you use in Parallel Data Warehouse to create a new
database:
CREATE DATABASE DW
WITH (
    AUTOGROW = ON,
    REPLICATED_SIZE = 50,
    DISTRIBUTED_SIZE = 10000,
    LOG_SIZE = 25
);
This statement uses the following options:
■ AUTOGROW This option specifies whether to enable or disable the automatic
growth feature. This feature allows Parallel Data Warehouse to manage the growth of
data and log files as needed over time.
■ REPLICATED_SIZE This specifies the total space in gigabytes allocated to replicated
tables (and associated data) on each compute node. Parallel Data Warehouse stores
replicated tables in a SQL Server filegroup on each compute node.
■ DISTRIBUTED_SIZE This specifies the total space in gigabytes allocated to distributed
tables on the appliance. Parallel Data Warehouse divides the space among all
distributions on the compute nodes and stores each distribution in a separate SQL Server
filegroup. In the SN architecture of Parallel Data Warehouse, each distribution has its
own set of disks for storage. This set of disks is configured as a logical unit number
(LUN).
■ LOG_SIZE This option specifies the total space in gigabytes allocated to the transaction
log on the appliance. You should plan for the log file size to be large enough to
accommodate the largest data load that you expect. The automatic growth feature
adjusts the log size as needed if you underestimate the required log file size.
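The size options differ in scope: REPLICATED_SIZE applies to each compute node, whereas DISTRIBUTED_SIZE and LOG_SIZE are totals for the appliance. The following Python sketch works through the arithmetic for the example statement. The function name, the count of 10 compute nodes, and the count of 8 distributions per node are hypothetical values chosen for illustration, and the sketch assumes the distributed space is divided evenly among distributions; none of these come from the text.

```python
# Hypothetical sizing helper. The node count and distributions-per-node
# values passed in below are illustrative assumptions, not product facts.

def appliance_footprint(replicated_gb, distributed_gb, log_gb,
                        compute_nodes, distributions_per_node):
    """Return the space figures (in GB) implied by the CREATE DATABASE sizes."""
    total_distributions = compute_nodes * distributions_per_node
    # REPLICATED_SIZE is allocated on each compute node, so it multiplies.
    replicated_total = replicated_gb * compute_nodes
    return {
        "replicated_total_gb": replicated_total,
        # DISTRIBUTED_SIZE is an appliance-wide total; an even split among
        # all distributions is assumed here.
        "per_distribution_gb": distributed_gb / total_distributions,
        # LOG_SIZE is also an appliance-wide total, so it is added once.
        "appliance_total_gb": replicated_total + distributed_gb + log_gb,
    }


# Sizes from the example statement: REPLICATED_SIZE = 50,
# DISTRIBUTED_SIZE = 10000, LOG_SIZE = 25.
footprint = appliance_footprint(50, 10000, 25,
                                compute_nodes=10, distributions_per_node=8)
print(footprint)
```

Under these assumed counts, the 50 GB replicated allocation occupies 500 GB across the appliance, while each distribution receives 125 GB of the 10,000 GB distributed allocation.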
CREATE TABLE
The CREATE TABLE statement syntax varies slightly from its syntax in standard Transact-SQL.
For Parallel Data Warehouse, the statement includes options for specifying whether the table
uses a replicated or a distributed strategy and whether to store the table with a clustered
index or with a heap. You can also use this syntax to create partitions by specifying the partition
boundary values.