User's Guide
Run mapreduce on a Parallel Pool
readall(meanDelay)
           Key              Value
    __________________    ________
    'MeanArrivalDelay'    [7.1201]
Then, run the calculation on the current parallel pool. Note that the output text indicates
a parallel mapreduce.
meanDelay = mapreduce(ds,@meanArrivalDelayMapper,@meanArrivalDelayReducer,inPool);
Parallel mapreduce execution on the parallel pool:
********************************
* MAPREDUCE PROGRESS *
********************************
Map 0% Reduce 0%
Map 100% Reduce 50%
Map 100% Reduce 100%
readall(meanDelay)
           Key              Value
    __________________    ________
    'MeanArrivalDelay'    [7.1201]
With this relatively small data set, the parallel pool is unlikely to improve
performance. The purpose of this example is to show the mechanism for running
mapreduce on a parallel pool. As the data set grows, or as the map and reduce
functions themselves become more computationally intensive, you might expect to see
improved performance with the parallel pool compared to running mapreduce in the
MATLAB client session.
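One way to check whether the pool helps for a particular workload is to time both runs. The following is a minimal sketch, not part of the original example. It assumes that ds, meanArrivalDelayMapper, and meanArrivalDelayReducer from this example already exist, and that inPool was created with mapreducer(gcp). The names inMatlab, tClient, and tPool are illustrative.

```matlab
% Assumption: ds, meanArrivalDelayMapper, and meanArrivalDelayReducer
% are already defined as in this example.
inMatlab = mapreducer(0);     % run mapreduce in the MATLAB client session
inPool   = mapreducer(gcp);   % run mapreduce on the current parallel pool

% Time the run in the client session (progress display suppressed).
tic
mapreduce(ds,@meanArrivalDelayMapper,@meanArrivalDelayReducer,inMatlab, ...
    'Display','off');
tClient = toc;

% Time the same calculation on the parallel pool.
tic
mapreduce(ds,@meanArrivalDelayMapper,@meanArrivalDelayReducer,inPool, ...
    'Display','off');
tPool = toc;

fprintf('Client session: %.2f s, parallel pool: %.2f s\n',tClient,tPool);
```

A single timing can be noisy, so repeating each run a few times gives a more reliable comparison.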
Note When running parallel mapreduce on a cluster, the order of the key-value pairs in
the output can differ from the order you get when running mapreduce in MATLAB. If your
application depends on the arrangement of data in the output, you must sort the data
according to your own requirements.
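As a minimal sketch of the sorting step mentioned in the note, the table returned by readall can be ordered by key with sortrows. To stay self-contained, the sketch uses a small hand-built table with made-up keys and values in place of real readall output. It assumes character-vector keys, as in the output shown in this example.

```matlab
% Stand-in for the table returned by readall(result).
% The keys and values here are made up for illustration only.
out = table({'b';'a';'c'},{2;1;3},'VariableNames',{'Key','Value'});

% Sort the rows alphabetically by key so that the arrangement no longer
% depends on the order in which the workers produced the pairs.
sorted = sortrows(out,'Key');

disp(sorted)
```

With real results, replace the hand-built table with the output of readall, for example out = readall(meanDelay).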
See Also
Functions
datastore | mapreduce | mapreducer