Specifications
The Control Rack
The control rack is a separate rack that houses the servers, storage, and networking com-
ponents for the nodes that provide control, management, or interface functions. It contains
several types of nodes that Parallel Data Warehouse uses to process user queries, to load
and back up data, and to manage the appliance. Some of the nodes serve as intermediaries
between the corporate network and the private network that connects the nodes in both the
control rack and data rack. You never interact directly with the data rack; you submit a data
load or a query to the control rack, which then coordinates the processes between nodes to
complete your request.
Most Parallel Data Warehouse activity involves coordination with the control node. To sup-
port high availability, the control node is a two-node active/passive cluster. If the active node
fails for any reason, the passive node takes over. The redundancy between the two nodes
ensures the appliance can recover quickly from a failure.
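The active/passive behavior described above can be illustrated with a small toy model. This is only a sketch of the general failover idea, not how the appliance implements clustering; the class names, node names, and the `healthy` flag are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str             # hypothetical node name
    healthy: bool = True  # hypothetical health flag


@dataclass
class ActivePassiveCluster:
    """Toy model of a two-node active/passive cluster (illustrative only)."""
    active: Node
    passive: Node

    def fail_over(self) -> None:
        if not self.passive.healthy:
            raise RuntimeError("both nodes are down")
        # Swap roles: the former passive node becomes the active node.
        self.active, self.passive = self.passive, self.active

    def handle(self, request: str) -> str:
        # If the active node has failed, promote the passive node first.
        if not self.active.healthy:
            self.fail_over()
        return f"{self.active.name} handled {request}"


cluster = ActivePassiveCluster(Node("control-a"), Node("control-b"))
first = cluster.handle("query 1")
cluster.active.healthy = False  # simulate a failure of the active node
second = cluster.handle("query 2")
print(first)
print(second)
```

In this sketch the client keeps calling the same cluster object before and after the failure; only the node that services the request changes.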
Parallel Data Warehouse uses multiple networking technologies. The control rack servers
connect to the corporate network by using the corporate Ethernet. The compute node servers
connect to their dedicated database storage by using a Fibre Channel network. A high-speed
InfiniBand network internally connects all the servers in the appliance to one another.
Because InfiniBand is much faster than a Gigabit Ethernet network, it is better suited for the
Parallel Data Warehouse nodes, which must transfer high volumes of data as quickly as
possible. For high availability, the switching fabric of each network includes redundancy.
The Control Node
The control node is in the control rack and manages client authentication; accepts client con-
nections to Parallel Data Warehouse; manages the query execution process, which it distrib-
utes across the compute nodes; and serves as the central point for all hardware monitoring.
To support high availability, the control node is a two-node active/passive cluster in which the
passive node instantly takes over if the active node fails for any reason. The control node also
contains a SQL Server instance.
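The way a control node distributes query execution across compute nodes can be pictured as a scatter-gather pattern. The following is a minimal sketch under stated assumptions, not the product's actual query engine: the node names, the sample rows, and both function names are hypothetical, and threads stand in for separate servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each compute node holds one slice of a distributed table.
COMPUTE_NODES = {
    "compute-1": [120, 75, 300],
    "compute-2": [50, 410],
    "compute-3": [95, 5, 60, 40],
}


def run_on_compute_node(rows):
    """Compute a partial aggregate (sum, row count) over one node's slice."""
    return sum(rows), len(rows)


def control_node_total_and_count(nodes):
    """Fan the work out to every compute node, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(run_on_compute_node, nodes.values()))
    total = sum(partial[0] for partial in partials)
    count = sum(partial[1] for partial in partials)
    return total, count


total, count = control_node_total_and_count(COMPUTE_NODES)
print(total, count)
```

Each compute node works only on its own slice, and the coordinating function merges the partial sums and counts into one answer for the client.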
To support the distributed architecture of Parallel Data Warehouse, the control node
contains the MPP Engine, the Data Movement Service (DMS), and Windows Internet Information
Services (IIS), as shown in Figure 6-2. The MPP Engine coordinates parallel query processing,
storage of appliance-wide metadata and configuration data, and authentication and
authorization for the appliance and databases. The DMS, which runs on most appliance nodes, is the
communication interface for copying data between appliance nodes. IIS hosts a Web
application, called the Admin Console, that you access by using Windows Internet Explorer and use
to manage and monitor the appliance status and query performance.
You can connect to the Parallel Data Warehouse control node by using a variety of client
access tools. Parallel Data Warehouse integrates with SQL Server 2008 R2 Business Intelligence