MATLAB® Distributed Computing Server™ 5 Installation Guide
How to Contact MathWorks

www.mathworks.com                    Web
comp.soft-sys.matlab                 Newsgroup
www.mathworks.com/contact_TS.html    Technical Support
suggest@mathworks.com                Product enhancement suggestions
bugs@mathworks.com                   Bug reports
doc@mathworks.com                    Documentation error reports
service@mathworks.com                Order status, license renewals, passcodes
info@mathworks.com                   Sales, pricing, and general information

508-647-7000 (Phone)
508-647-7001 (Fax)

The MathWorks, Inc.
Contents

1 Product Installation
    Cluster Description
    Installing Products
        On the Cluster Nodes
        On the Client Nodes

3 Configuring Parallel Computing Products for HPC Server
    Configure Cluster for Microsoft Windows HPC Server
    Configure Client Computer for HPC Server 2008
    Validate Installation Using Microsoft Windows HPC Server
        Step 1: Define a User Configuration
        Step 2: Validate the Configuration

5 Configuring Parallel Computing Products for a Generic Scheduler
    Interfacing with Generic Schedulers
        Support Scripts
        Submission Mode
    Configure Generic Scheduler on Windows Cluster
        Without Delegation
        Using Passwordless Delegation
1 Product Installation

• “Cluster Description” on page 1-2
• “Installing Products” on page 1-3
• “Configuring Your Cluster” on page 1-4

For help, contact the MathWorks install support team at 508-647-7000 or http://www.mathworks.
Cluster Description

To set up a cluster, you first install MATLAB® Distributed Computing Server™ (MDCS) on a node called the head node. You can also install the license manager on the head node. After performing this installation, you can then optionally install MDCS on the individual cluster nodes, called worker nodes. You do not need to install the license manager on worker nodes.

This figure shows the installations that you perform on your MDCS cluster nodes.
Installing Products

On the Cluster Nodes

Install the MathWorks products on your cluster as a network installation according to the instructions found at

http://www.mathworks.com/help/base/install/

These instructions include steps for installing, licensing, and activating your installation. You can install in a central location, or individually on each cluster node.

Note MathWorks highly recommends installing all MathWorks products on the cluster.
Configuring Your Cluster

When the cluster and client installations are complete, you can proceed to configure the products for the job scheduler of your choice.
2 Configuring Parallel Computing Products for a Job Manager

• “Configure Cluster to Use a Job Manager” on page 2-2
• “Configure Windows Firewalls on Client” on page 2-22
• “Validate Installation with Job Manager” on page 2-23
Configure Cluster to Use a Job Manager

The mdce service must be running on all machines being used for job managers or workers. This service manages the job manager and worker processes. One of the major tasks of the mdce service is to recover job manager and worker sessions after a system crash, so that jobs and tasks are not lost as a result of such accidents.
Configure Windows Firewalls

If you are using Windows® firewalls on your cluster nodes,

1 Log in as a user with administrator privileges.
2 Execute the following in a DOS command window.

matlabroot\toolbox\distcomp\bin\addMatlabToWindowsFirewall.bat

This command adds MATLAB as an allowed program. If you are using other firewalls, you must configure them to make similar accommodation.
Step 2: Stop mdce Services of Old Installation

If you have an older version of the distributed computing products running on your cluster nodes, you should stop the mdce services before starting the services for the new installation.

• “Stop mdce on Windows” on page 2-4
• “Stop mdce on UNIX” on page 2-5

Stop mdce on Windows

If this is your first installation of the distributed computing products, proceed to Step 3.
5 Repeat the instructions of this step on all worker nodes.

Stop mdce on UNIX

1 Log in as root. (If you cannot log in as root, you must alter the following parameters in the matlabroot/toolbox/distcomp/bin/mdce_def.sh file to point to a folder for which you have write privileges: CHECKPOINTBASE, LOGBASE, PIDBASE, and LOCKBASE if applicable.)
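If you cannot log in as root, the parameter change described above might look like the following. This is only a minimal sketch: it assumes that mdce_def.sh accepts plain shell assignments of the form NAME="value", and the helper name mdce_base_overrides and the subfolder names are hypothetical, not part of the product.

```shell
# Print replacement assignments for the four mdce_def.sh parameters named above,
# all pointing under one base folder that the current user can write to.
# Assumption: mdce_def.sh takes plain shell assignments (NAME="value").
# The subfolder names (checkpoint, log, pid, lock) are hypothetical.
mdce_base_overrides() {
    base=$1
    # Refuse to print anything unless the folder exists and is writable.
    if [ ! -d "$base" ] || [ ! -w "$base" ]; then
        echo "error: $base is not a writable folder" >&2
        return 1
    fi
    printf 'CHECKPOINTBASE="%s/checkpoint"\n' "$base"
    printf 'LOGBASE="%s/log"\n' "$base"
    printf 'PIDBASE="%s/pid"\n' "$base"
    printf 'LOCKBASE="%s/lock"\n' "$base"   # only if applicable on your platform
}
```

For example, calling mdce_base_overrides "$HOME/mdce" prints four assignment lines that you could compare against the existing entries in mdce_def.sh before editing that file by hand.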
Identify Hosts and Start the mdce Service

1 To open Admin Center, navigate to the folder:

matlabroot\toolbox\distcomp\bin (on Windows)
matlabroot/toolbox/distcomp/bin (on UNIX)

Then execute the file:

admincenter.bat (on Windows)
admincenter (on UNIX)

If there are no past sessions of Admin Center saved for you, the GUI opens with a blank listing and a welcome dialog box on top of it, which provides information on how to get started.
2 Click Add or Find. The Add or Find Hosts dialog box opens.
3 Select Enter Hostnames, then list your hosts in the text box. You can use short host names, fully qualified domain names, or individual IP addresses. The following figure shows an example using hosts node1, node2, node3, and node4. In your case, use your own host names. Keep the option to start the mdce service checked.
4 Click OK to open the Start mdce service dialog box.
It might take a moment for Admin Center to communicate with all the nodes, start the services, and acquire the status of all of them. When Admin Center completes the update, the listing should look something like the following figure.
5 At this point, you should test the connectivity between the nodes. This ensures that your cluster can perform the necessary communications for running other MDCS processes. In the Hosts module, click Test Connectivity.
6 When the Connectivity Testing dialog box opens, it shows the results of the last test, if there are any. Click Run to run the tests and generate new data.
If any of the connectivity tests fail, contact the MathWorks install support team.

7 If your tests pass, click Close to return to the Admin Center dialog box.

Start the Job Manager

1 To start a job manager, click Start in the Job Manager module. (This is one of several ways to open the New Job Manager dialog box.) In the New Job Manager dialog box, specify a name and host for your job manager.
2 Click OK to start the job manager and return to the Admin Center dialog box.

Start the Workers

1 To start workers, click Start in the Workers module. (This is one of several ways to open the Start Workers dialog box.)
a In the Start Workers dialog box, specify the number of workers to start on each host. The number is up to you, but you cannot exceed the total number of licenses you have.
d Click OK to start the workers and return to the Admin Center dialog box.

It might take a moment for Admin Center to initialize all the workers and acquire the status of all of them. When all the workers are started, Admin Center looks something like the following figure.

If your workers are all idle and connected, your cluster is ready for use.
If you encounter any problems or failures, contact the MathWorks install support team. For more information about Admin Center functionality, such as stopping processes or saving sessions, see the “Admin Center” chapter in the MATLAB Distributed Computing Server System Administrator's Guide.

Using the Command-Line Interface (Windows)

Start the mdce Service

You must install the mdce service on all nodes (head node and worker nodes). Begin on the head node.
cmd

If you are using a version of Windows other than Windows XP, you must run the command window with administrator privileges. To do this, click Start > Programs > Accessories; right-click Command Window, and select Run as Administrator. This option is available only if you are running User Account Control (UAC).

3 Navigate to the folder with the control scripts.
Start the Job Manager

To start the job manager, enter the following commands in a DOS command window. You do not have to be at the machine on which the job manager will run, as long as you have access to the MDCS installation.

1 Navigate to the folder with the startup scripts.

cd matlabroot\toolbox\distcomp\bin

2 Start the job manager, using any unique text you want for the name. Enter this text on a single line.
2 Start the workers on each node, using text that identifies the name of the job manager you want this worker registered with. Enter this text on a single line.
where hostA,hostB,hostC refers to a list of your host names. Note that there are no spaces between host names, only a comma. If you need to indicate protocol, platform (such as in a mixed environment), or other information, see the help for remotemdce by typing

./remotemdce -help

Start the Job Manager

To start the job manager, enter the following commands.
For each computer used as a worker, enter the following commands. You do not have to be at the machines where the MATLAB workers will run, as long as you have access to the MDCS installation.

1 Go to the folder with the startup scripts.

cd matlabroot/toolbox/distcomp/bin

2 Start the workers on each node, using text that identifies the name of the job manager you want this worker registered with.
Debian Platform

On each cluster node, register the mdce service as a known service and configure it to start automatically at system boot time by following these steps:

1 Create the following link, if it does not already exist:

ln -s matlabroot/toolbox/distcomp/bin/mdce /etc/mdce

2 Create the following link to the boot script file:

ln -s matlabroot/toolbox/distcomp/bin/mdce /etc/init.d/mdce

3 Set the boot script file permissions:

chmod 555 /etc/init.
4 Look in /etc/inittab for the default run level. Create a link in the rc folder associated with that run level. For example, if the run level is 5, execute these commands:

cd /etc/init.d/rc5.d; ln -s ..
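The run-level lookup in step 4 can be sketched as a small shell helper. It assumes a traditional inittab in which the default entry has the form id:N:initdefault:, and the function name default_runlevel is illustrative only.

```shell
# Print the default run level recorded in an inittab-style file.
# The initdefault entry has the form  id:5:initdefault:  (fields separated by ':').
# The file defaults to /etc/inittab; pass another path to check a copy instead.
default_runlevel() {
    # Skip commented-out lines, then print the second field of the first
    # entry whose action (third field) is initdefault.
    awk -F: '$0 !~ /^#/ && $3 == "initdefault" { print $2; exit }' "${1:-/etc/inittab}"
}
```

On a system whose inittab contains the line id:5:initdefault:, default_runlevel prints 5, which is the run level used in the example above.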
sudo ./mdce stop

2 Create the following link if it does not already exist:

sudo ln -s matlabroot/toolbox/distcomp/bin/mdce /usr/sbin/mdce

3 Copy the launchd .plist file for mdce to /Library/LaunchDaemons:

sudo cp ./util/com.mathworks.mdce.plist /Library/LaunchDaemons

4 Start mdce and observe that it starts inside launchd:

sudo ./mdce start

The command output should read:

Starting the MATLAB Distributed Computing Server using launchctl.
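To check the result of step 4 from a script rather than by eye, one option is to test the start-up output for the message quoted above. This sketch assumes that mdce writes the message to standard output; the function name started_via_launchctl is hypothetical.

```shell
# Succeed when the text on standard input contains the start-up message
# quoted in step 4, that is, when mdce reported starting through launchctl.
started_via_launchctl() {
    grep -q "Starting the MATLAB Distributed Computing Server using launchctl"
}

# Possible use, from the folder that contains the mdce script:
#   sudo ./mdce start | started_via_launchctl && echo "mdce started under launchd"
```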
Configure Windows Firewalls on Client

If you are using Windows firewalls on your client node,

1 Log in as a user with administrative privileges.
2 Execute the following in a DOS command window.

matlabroot\toolbox\distcomp\bin\addMatlabToWindowsFirewall.bat

This command adds MATLAB as an allowed program. If you are using other firewalls, you must configure them to make similar accommodation.
Validate Installation with Job Manager

This procedure verifies that your parallel computing products are installed and configured correctly.

Step 1: Verify the Network Connection

To verify the network connection from the client computer to the job manager computer, follow these instructions.

Note In these instructions, matlabroot refers to the folder where MATLAB is installed on the client computer. Do not confuse this with the installation on the MDCS cluster computers.
3 In the Job Manager Configuration Properties dialog box, provide text for the following fields:
a Set the Configuration name field to JobManagerTest.
b Set the Description field to For testing installation with job manager.
c Set the Job manager hostname field to the name of the host on which your job manager is running. Depending on your network, this might be only a host name, or it might have to be a fully qualified domain name.
e Click the Jobs tab.
f For the Maximum number of workers, enter the number of workers for which you want to test your configuration.
g For the Minimum number of workers, enter a value of 1.

4 Click OK to save your configuration.
Step 3: Validate the Configuration

In this step you verify your user configuration, and thereby your installation.

1 If it is not already open, start the Configurations Manager from the MATLAB desktop by selecting Parallel > Manage Configurations.
2 Select your configuration in the dialog box listing.
3 Click Start Validation.

The validation results appear in the dialog box.
select File > Export, and save your file in a convenient location. Later, when running the Configurations Manager from a MATLAB client session, other users can import your configuration by selecting File > Import.
3 Configuring Parallel Computing Products for HPC Server

• “Configure Cluster for Microsoft Windows HPC Server” on page 3-2
• “Configure Client Computer for HPC Server 2008” on page 3-3
• “Validate Installation Using Microsoft Windows HPC Server” on page 3-4
Configure Cluster for Microsoft Windows HPC Server

Follow these instructions to configure your MDCS installation to work with Windows HPC Server or Compute Cluster Server (CCS). In the following instructions, matlabroot refers to the MATLAB installation location.

Note If using HPC Server 2008 in a network share installation, the network share location must be in the “Intranet” zone.
Configure Client Computer for HPC Server 2008

This configuration applies to all versions of HPC Server 2008, including HPC Server 2008 R2.

Note If using HPC Server 2008 in a network share installation, the network share location must be in the “Intranet” zone. You might need to adjust the Internet Options for your cluster nodes and add the network share location to the list of Intranet sites.
Validate Installation Using Microsoft Windows HPC Server

This procedure verifies that your parallel computing products are installed and configured correctly for using Microsoft® Windows HPC Server or Compute Cluster Server (CCS).

Step 1: Define a User Configuration

In this step you define a user configuration to use in subsequent steps.

1 Start the Configurations Manager from the MATLAB desktop by selecting Parallel > Manage Configurations.
g Click the Jobs tab.
h For the Maximum number of workers, enter the number of workers for which you want to test your configuration.
i For the Minimum number of workers, enter a value of 1.

4 Click OK to save your configuration.

Step 2: Validate the Configuration

In this step you verify your user configuration, and thereby your installation.
2 Select your configuration in the dialog box listing.
3 Click Start Validation.

The validation results appear in the dialog box. The following figure shows a configuration that passed all validation tests.

Note If your validation does not pass, contact the MathWorks install support team.

If your validation passed, you now have a valid configuration to use in other parallel applications.
4 Configuring Parallel Computing Products for Supported Third-Party Schedulers (PBS Pro, Platform LSF, TORQUE)

• “Configure Platform LSF Scheduler on Windows Cluster” on page 4-2
• “Configure Windows Firewalls on Client” on page 4-5
• “Validate Installation Using an LSF, PBS Pro, or TORQUE Scheduler” on page 4-6

Note You must use the generic scheduler interface for any of the following:

• Any third-party scheduler not listed above (e.g., Sun Grid Engine, GridMP, etc.
Configure Platform LSF Scheduler on Windows Cluster

If your cluster is already set up to use mpiexec and smpd, you can use Parallel Computing Toolbox™ software with your existing configuration if you are using a compatible MPI implementation library (as defined in matlabroot\toolbox\distcomp\mpi\mpiLibConf.m).
matlabroot\bin\win32\smpd -install

or

matlabroot\bin\win64\smpd -install

This command installs the service and starts it. As long as the service remains installed, it will start each time the node boots.

3 If this is a worker machine and you did not run the installer on it to install MDCS software (for example, if you're running MDCS software from a shared installation), execute the following command in a DOS command window.

matlabroot\bin\matlab.
Using Passwordless Delegation

1 Log in as a user with administrator privileges.
2 Start smpd by typing in a DOS command window one of the following, as appropriate:

matlabroot\bin\win32\smpd -register_spn

or

matlabroot\bin\win64\smpd -register_spn

This command installs the service and starts it. As long as the service remains installed, it will start each time the node boots.
Configure Windows Firewalls on Client

If you are using Windows firewalls on your client node,

1 Log in as a user with administrative privileges.
2 Execute the following in a DOS command window.

matlabroot\toolbox\distcomp\bin\addMatlabToWindowsFirewall.bat

This command adds MATLAB as an allowed program. If you are using other firewalls, you must configure them to make similar accommodation.
Validate Installation Using an LSF, PBS Pro, or TORQUE Scheduler

This procedure verifies that your parallel computing products are installed and configured correctly.

Step 1: Define a User Configuration

In this step you define a user configuration to use in subsequent steps.

1 Start the Configurations Manager from the MATLAB desktop by selecting Parallel > Manage Configurations.
h Click the Jobs tab.
i For the Maximum number of workers, enter the number of workers for which you want to test your configuration.
j For the Minimum number of workers, enter a value of 1.

4 Click OK to save your configuration.

Step 2: Validate the Configuration

In this step you verify your user configuration, and thereby your installation.
2 Select your configuration in the dialog box listing.
3 Click Start Validation.

The validation results appear in the dialog box. The following figure shows a configuration that passed all validation tests.

Note If your validation does not pass, contact the MathWorks install support team.

If your validation passed, you now have a valid configuration you can use in other parallel applications.
5 Configuring Parallel Computing Products for a Generic Scheduler

Note You must use the generic scheduler interface for any of the following:

• Any third-party scheduler not listed in previous chapters (e.g., Sun Grid Engine, GridMP, etc.)
• PBS other than PBS Pro
• A nonshared file system when the client cannot directly submit to the scheduler (e.g., TORQUE on Windows)

This chapter includes the following sections.
Interfacing with Generic Schedulers

In this section...
submit directly to the cluster (for example, if the scheduler's client utilities are not installed).
• Nonshared: When there is not a shared file system between client and cluster machines.

Before using the support scripts, decide which submission mode describes your particular network setup.
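The decision can be sketched as a small shell helper that takes two yes/no answers about your network. Treat it as an illustration only: the function name submission_mode is hypothetical, and the labels "shared" and "remote submission" are assumed names for the two modes other than Nonshared.

```shell
# Pick a submission mode from two yes/no facts about the network.
#   $1: is there a shared file system between client and cluster?  (yes/no)
#   $2: can the client submit directly to the scheduler?           (yes/no)
# The labels "shared" and "remote submission" are assumed names; only
# "nonshared" is taken from the text above.
submission_mode() {
    case "$1:$2" in
        yes:yes) echo "shared" ;;
        yes:no)  echo "remote submission" ;;
        no:*)    echo "nonshared" ;;
        *)       echo "usage: submission_mode yes|no yes|no" >&2; return 1 ;;
    esac
}
```

For example, submission_mode no no prints nonshared, which matches a client that has neither a shared file system nor direct access to the scheduler.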
Configure Generic Scheduler on Windows Cluster

If your cluster is already set up to use mpiexec and smpd, you can use Parallel Computing Toolbox™ software with your existing configuration if you are using a compatible MPI implementation library (as defined in matlabroot\toolbox\distcomp\mpi\mpiLibConf.m).
matlabroot\bin\win32\smpd -install

or

matlabroot\bin\win64\smpd -install

This command installs the service and starts it. As long as the service remains installed, it will start each time the node boots.

3 If this is a worker machine and you did not run the installer on it to install MDCS software (for example, if you're running MDCS software from a shared installation), execute the following command in a DOS command window.

matlabroot\bin\matlab.
Using Passwordless Delegation

1 Log in as a user with administrator privileges.
2 Start smpd by typing in a DOS command window one of the following, as appropriate:

matlabroot\bin\win32\smpd -register_spn

or

matlabroot\bin\win64\smpd -register_spn

This command installs the service and starts it. As long as the service remains installed, it will start each time the node boots.
Configure Sun Grid Engine on Linux Cluster

To run parallel jobs with MATLAB Distributed Computing Server and Sun™ Grid Engine (SGE), you need to establish a “matlab” parallel environment for SGE. The “matlab” parallel environment described in these instructions is based on the “MPI” example shipped with SGE. To use this parallel environment, you must use the matlabpe.
qconf -mq all.q

This will bring up a text editor for you to make changes: search for the line pe_list, and add matlab.

5 Ensure you can submit a trivial job to the PE:

$ echo "hostname" | qsub -pe matlab 1

6 Use qstat to check that the job runs correctly, and check that the output file contains the name of the host that ran the job. The default filename for the output file is ~/STDIN.o###, where ### is the SGE job number.
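Steps 5 and 6 can be tied together with two small helpers that recover the job number and build the output file name. This is a sketch under one assumption: that qsub reports a submission with a line of the form Your job 42 ("STDIN") has been submitted. The function names sge_job_id and sge_output_file are hypothetical.

```shell
# Extract the job number from the line that qsub prints on submission.
# Assumption: the line looks like  Your job 42 ("STDIN") has been submitted
sge_job_id() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Build the default output file name for a job submitted from standard input,
# following the ~/STDIN.o### pattern described in step 6.
sge_output_file() {
    printf '%s/STDIN.o%s\n' "$HOME" "$1"
}

# Possible use on a host where the SGE client tools are installed:
#   id=$(echo "hostname" | qsub -pe matlab 1 | sge_job_id)
#   qstat
#   cat "$(sge_output_file "$id")"
```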
Configure Windows Firewalls on Client

If you are using Windows firewalls on your client node,

1 Log in as a user with administrative privileges.
2 Execute the following in a DOS command window.

matlabroot\toolbox\distcomp\bin\addMatlabToWindowsFirewall.bat

This command adds MATLAB as an allowed program. If you are using other firewalls, you must configure them to make similar accommodation.
Validate Installation Using a Generic Scheduler

Testing the installation of the parallel computing products with a generic scheduler requires familiarity with your network configuration, with your scheduler interface, and with the generic scheduler interface of Parallel Computing Toolbox software.

Note The remainder of this chapter illustrates only the case of using LSF in a nonshared file system.
[Figure: User's desktop (MATLAB client; local drive with read/write access, local data location, e.g., C:\Temp\jobdata) and cluster (login node cluster-hostname, scheduler, MATLAB workers; shared drive with read/write access, cluster data location). Labeled connections: Run command (ssh), Copy (sFTP), Submit job (qsub/bsub).]
Step 1: Set Up Windows Client Host

On the Client Host

1 You need the necessary scripts on the path of the MATLAB client. You can do this by copying them to a folder already on the path.
f Set Function called when submitting parallel jobs to the following text:

{@parallelSubmitFcn, 'cluster-host-name', '/network/share/jobdata'}

where cluster-host-name is the name of the cluster host (identified in Step 2) from which the job will be submitted to the scheduler, and /network/share/jobdata is the location on the cluster where the scheduler can access job data. This location must be accessible from all cluster nodes.
5 Click OK to save your configuration.

Step 3: Validate Configuration

In this step you verify your user configuration, and thereby your installation.
1 If it is not already open, start the Configurations Manager from the MATLAB desktop by selecting Parallel > Manage Configurations.
2 Select your configuration in the dialog box listing.
3 Click Start Validation.

The validation results appear in the dialog box. The following figure shows a configuration that passed all validation tests.

Note If your validation fails any stage, contact the MathWorks install support team.