RAPIDS Multi Node Set Up
1. Run as Docker container on each node
On each node, start the RAPIDS Docker container and then follow the multi-node configuration
steps described below. An example command to start the container:
docker run --runtime=nvidia \
  --rm -it --net=host \
  -p 8888:8888 \
  -p 8787:8787 \
  -p 8786:8786 \
  -v /home/rapids/notebooks-contrib/:/rapids/notebooks/contrib/ \
  -v /home/rapids/data/:/home/dell/rapids/data/ \
  nvcr.io/nvidia/rapidsai/rapidsai:0.10-cuda10.1-runtime-ubuntu18.04
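Optionally, once inside the container, a quick sanity check can confirm that RAPIDS is importable and the GPUs are visible. The snippet below is a minimal sketch; it assumes the pynvml package (a dask-cuda dependency) is present in the container:
# Sanity check inside the RAPIDS container: import cuDF and count visible GPUs.
# Assumes pynvml is available in this image (it ships as a dask-cuda dependency).
import cudf
import pynvml

pynvml.nvmlInit()
print("cuDF version:", cudf.__version__)
print("Visible GPUs:", pynvml.nvmlDeviceGetCount())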
2. Launch the dask-scheduler on the primary compute node
$ dask-scheduler --port=8888 --bokeh-port 8786
output:
distributed.scheduler - INFO - Receive client connection: Client-9ad22140-83bd-11e9-823c-246e96b3e316
distributed.core - INFO - Starting established connection
3. Launch dask-cuda-worker on the primary compute node
This step starts workers on the same primary compute node where the scheduler was started (dask-cuda-worker launches one worker per GPU by default).
$ dask-cuda-worker tcp://<ip_primary_node>:8888
output: ... messages indicating that the workers connected to the scheduler successfully
4. Launch dask-cuda-worker on the secondary compute node
This step starts additional workers on the secondary compute node, pointing them at the scheduler running on the primary node.
$ dask-cuda-worker tcp://<ip_primary_node>:8888
output: ... messages indicating that the workers connected to the scheduler successfully
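If the workers on the secondary node fail to connect, a quick way to verify that the scheduler port on the primary node is reachable over the network is a short test with Python's standard socket module. This is a minimal sketch; replace the <ip_primary_node> placeholder with the actual IP:
# Connectivity check from the secondary node: confirm the scheduler port is reachable.
import socket

sock = socket.create_connection(('<ip_primary_node>', 8888), timeout=5)
print("Scheduler port 8888 is reachable")
sock.close()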
5. Start Jupyter and run the notebook (client python API) on the primary compute node
In this case, the NYC-Taxi notebook is the client Python API. It attaches to the scheduler running
on the primary compute node, so the workload runs on the GPUs of all compute nodes in distributed
mode. To do so, modify the notebook to start the client with the primary node IP and the port the
scheduler is listening on, as shown below:
from dask.distributed import Client

client = Client('tcp://<ip_primary_node>:8888')  # connect to the cluster
output:
Client
Scheduler: tcp://<ip_primary_node>:8888
Dashboard: http://<ip_primary_node>:8786/status
Cluster
Workers: 8 # total workers in distributed mode
Cores: 8
Memory: 67.47 GB
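As a minimal illustration of driving the cluster from the notebook, the sketch below recaps the client connection, confirms how many workers registered, and runs a small dask_cudf aggregation across all GPUs. This is not the actual NYC-Taxi notebook code: the CSV glob and the column names (passenger_count, fare_amount) are placeholders, and the path refers to the data volume mounted in step 1.
# Minimal sketch: connect to the scheduler, check the workers, and run a small
# distributed aggregation with dask_cudf. Paths and column names are placeholders.
from dask.distributed import Client
import dask_cudf

client = Client('tcp://<ip_primary_node>:8888')
print("Workers registered:", len(client.scheduler_info()['workers']))

# Read CSV data that is reachable from every worker (e.g. the mounted data volume).
df = dask_cudf.read_csv('/home/dell/rapids/data/*.csv')

# Example aggregation executed on all GPUs in the cluster.
result = df.groupby('passenger_count').fare_amount.mean().compute()
print(result)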