Datasheet
Chapter 1: SSIS Solution Architecture
22
Development and Test Servers
For mission - critical ETL processes, you must have a test environment that mirrors your production
hardware. It will be costly, but consider the cost if you deploy a change that you couldn ’ t test and your
server goes down.
If your budget is restricted, and your ETL process is not mission - critical to your business, then your test
environment can be a scaled - down version of your production servers or a virtual server. One option to
save on cost is to use the same server for both development and testing, and then you may be able to use
equivalent hardware. Use different database instances and packages for your testing.
One aspect of setting up test servers that is critical is that the number and naming of your volumes
(drive letters) must match between all your environments. In other words, if you have
G: , H: , and I: as
logical drives on your production server, ensure that your development and test machines have the same
logical drives, and the folder structures and databases are the same as well. You don ’ t necessarily need to
have the same number of physical drives, but you should partition what you do have into the same
logical volumes. Doing so can alleviate a lot of deployment pain.
ETL Collocation
Sharing the source or destination server with your ETL process (called collocation ) is an option if your
source or destination is not impacted by your ETL process. What you must watch out for is the database
engine taking all the available RAM on the server, and starving your SSIS process of memory, which can
be very debilitating to the ETL.
You can configure the SQL Server memory through the
sp_configure TSQL statement. You can begin
your SSIS processes by limiting the RAM that SQL uses, then run your ETL, and at the end of the ETL
process, reset the memory usage on the SQL Server.
SSIS collocation is best for smaller to medium solutions where the ETL only runs nightly or weekly. The
next section clarifies the execution location of your environment.
Storage Hardware
As mentioned in the “ Problem ” section earlier in this chapter, the storage hardware is important because
you are performing bulk operations. The more throughputs you can generate with your disk subsystem,
the better.
How do you estimate hardware needs? Doing so is very difficult at the start of a solution, but if you can
get a gut sense of the record volume and growth, then you can probably do it. A good DBA will be able
to help estimate the table sizes by taking the estimated row width, multiplying that number by the
expected rows, and then adding some overhead for indexing and growth. (You must consider a lot of
factors such as, the SQL Server data page width and free space.)
c01.indd 22c01.indd 22 9/24/09 11:26:38 AM9/24/09 11:26:38 AM