Specifications
114 CHAPTER 6 Scalable Data Warehousing
The Landing Zone Node
The Landing Zone is a high-capacity data storage node in the control rack that contains
terabytes of disk space for temporary storage of user data before loading it into the appliance.
Using your ETL processes to move data to the Landing Zone, you can either copy data to the
Landing Zone and then load it into the appliance, or you can load data directly without first
storing it on the Landing Zone. With either approach, the Landing Zone uses the appliance’s
high-speed fabric to copy that data in parallel into the data rack. To perform parallel data
loading, you can use SQL Server Integration Services or a command-line tool.
The Backup Node
Another node in the control rack is the backup node, which, as the name implies, is dedicated
to the backup process and can perform it at very high speed. The backup node uses SQL
Server’s native database-level backup and restore functionality and coordinates the backup
across nodes. You can create full backups or differential backups of user databases, or
backups of the system database that contains information about user accounts, passwords,
and permissions. The initial full backup takes the longest because it contains all the data in a
database; subsequent differential backups run much faster because they contain only
the changes made since the last full backup. The backup process also runs in parallel
across nodes, which improves performance.
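Because each differential backup holds every change made since the last full backup, restoring to a given point needs only two pieces: the most recent full backup and the most recent differential taken after it. The following Python sketch illustrates that selection logic. It is a conceptual model only, not Parallel Data Warehouse code; the Backup class, the restore_set function, and the integer timestamps are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    name: str
    kind: str      # "full" or "differential"
    taken_at: int  # hypothetical logical timestamp, not a real PDW field


def restore_set(backups: List[Backup], target: int) -> List[Backup]:
    """Return the backups needed to restore the database as of `target`."""
    # Consider only backups taken at or before the target time, oldest first.
    candidates = sorted(
        (b for b in backups if b.taken_at <= target),
        key=lambda b: b.taken_at,
    )
    fulls = [b for b in candidates if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the target time")
    base = fulls[-1]
    # A differential is cumulative since its base full backup, so only
    # the latest one taken after that base is required.
    diffs = [
        b for b in candidates
        if b.kind == "differential" and b.taken_at > base.taken_at
    ]
    return [base] + diffs[-1:]


history = [
    Backup("full_1", "full", 1),
    Backup("diff_1", "differential", 2),
    Backup("diff_2", "differential", 3),
    Backup("full_2", "full", 4),
    Backup("diff_3", "differential", 5),
]

print([b.name for b in restore_set(history, 3)])
print([b.name for b in restore_set(history, 5)])
```

In this model, restoring to time 3 skips diff_1 entirely, because diff_2 already includes every change that diff_1 recorded.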
TIP To restore a backup, the destination appliance must have at least as many
compute nodes as the appliance where the backup was created.
The Management Node
The final node in the control rack is the management node, which operates as the hub for
software deployment, servicing, and system health and performance monitoring. This node
also runs a Windows domain controller to manage authentication within the appliance. It
performs functions related to the management of hardware and software in the appliance
and is not visible to users. Like the control node, the management node is a two-node active/
passive cluster.
NOTE Parallel Data Warehouse does not use the domain controller on the management
node for user authentication.
The Compute Node
Each compute node is the host for a single SQL Server instance and runs the DMS to
communicate with and transfer data to other appliance nodes. Each compute node stores a subset of
each user database. Before parallel query processing begins, Parallel Data Warehouse copies