User's Guide

Transfer Data to or from a Cloud Cluster
In this section...
“Transfer Data from Amazon S3 Account”
“Transfer Data with Job Methods and Properties”
“Download SSH Key Identity File”
“Transfer Data with Standard Utilities”
“Transfer Data with the remotecopy Utility”
“Retrieve Data from Persisted Storage Without Starting a Cluster”
Transfer Data from Amazon S3 Account
When you create your cluster, the advanced options provide access to your Amazon S3
account files. Click Add Files to specify which files you want to make available to your
cluster nodes. (This option is not available after you have created a cluster.) When the
cluster starts up, before the mdce process starts, the specified S3 files are copied into the
folder /shared/imported on the cluster’s shared file system. If any of the files have the
extension .gz, .gzip, .tar, or .zip, they are automatically expanded.
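
To confirm that the S3 files arrived, one option is to list /shared/imported from a worker. The following is a minimal sketch, not a prescribed procedure: it assumes a cluster profile named MyCloudCluster (a hypothetical name) and uses standard Parallel Computing Toolbox job functions.

```matlab
% Sketch: list the files imported from S3, as seen by a cluster worker.
% 'MyCloudCluster' is a hypothetical profile name; substitute your own.
c = parcluster('MyCloudCluster');

job = createJob(c);
% The task runs on a worker, so the path refers to the cluster file system.
createTask(job, @() dir('/shared/imported'), 1, {});
submit(job);
wait(job);

out = fetchOutputs(job);     % cell array with one row per task
listing = out{1};            % struct array returned by dir
disp({listing.name}');       % names of the imported (and expanded) files

delete(job);                 % remove the job once its results are no longer needed
```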
Note: Transferring a large amount of data from your Amazon S3 account can cause the
cluster to time out during its startup. If your data size exceeds approximately 5 GB, start
your cluster without the S3 data transfer, then upload the necessary data to the cluster’s
/shared/persisted folder from a local drive as described in either “Transfer Data with
Standard Utilities” or “Transfer Data with the remotecopy Utility”.
Transfer Data with Job Methods and Properties
To transfer data to the cloud cluster, you can use the AttachedFiles or JobData
property, in the same way you use these for other clusters. For example:
1. Place all required executable and data files in the same folder.
2. Specify that folder in the AttachedFiles property of the job.
   When you submit your job, the files are transferred to the cloud and made available
   to the workers running on the cloud cluster.
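
The steps above can be sketched in MATLAB as follows. This is an illustration under assumptions, not a definitive workflow: the profile name MyCloudCluster, the folder myproject, the function processData, and the threshold field are all hypothetical placeholders.

```matlab
% 'MyCloudCluster' is a hypothetical cluster profile name.
c = parcluster('MyCloudCluster');
job = createJob(c);

% Folder holding the task function and its data files (hypothetical path).
job.AttachedFiles = {fullfile(pwd, 'myproject')};

% Small parameters can travel with the job through the JobData property.
job.JobData = struct('threshold', 0.5);   % hypothetical field

% processData is a hypothetical function stored in the attached folder.
createTask(job, @processData, 1, {});
submit(job);       % the attached files are transferred at submission
wait(job);
results = fetchOutputs(job);
```

Inside the task function, a worker can read the JobData value by calling getCurrentJob and accessing the JobData property of the returned job object.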