User's Guide
6 Programming Overview
outputFolder = '/home/user/logs/hadooplog';
Note: The specified outputFolder must not already exist. The mapreduce output from
a Hadoop cluster cannot overwrite an existing folder.
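One way to avoid colliding with a folder left over from an earlier run is to build a fresh folder name for each run. The following is only a sketch: the base path is the example path used above, and the timestamp suffix is an assumption introduced here for illustration, not part of the original example.

```matlab
% Sketch: append a timestamp so that each run writes to a new folder.
% The suffix format is an illustrative choice, not a requirement.
baseFolder = '/home/user/logs/hadooplog';
stamp = char(datetime('now','Format','yyyyMMdd_HHmmss'));
outputFolder = sprintf('%s_%s', baseFolder, stamp);
```

The resulting outputFolder can then be passed to mapreduce through the 'OutputFolder' option in the same way as a fixed name.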
Create a MapReducer object to specify that mapreduce should use your Hadoop cluster.
mr = mapreducer(cluster);
Create and preview the datastore. The data set is available in
matlabroot/toolbox/matlab/demos.
ds = datastore('airlinesmall.csv','TreatAsMissing','NA',...
'SelectedVariableNames','ArrDelay','ReadSize',1000);
preview(ds)
ArrDelay
________
8
8
21
13
4
59
3
11
Call mapreduce to execute on the Hadoop cluster specified by mr. The map and reduce
functions are available in matlabroot/toolbox/matlab/demos.
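For orientation, the following is a minimal sketch of what a map function and a reduce function for a mean computation of this kind might look like. The names sketchMeanDelayMapper and sketchMeanDelayReducer and the key strings are hypothetical, chosen so that they do not shadow the shipped files. The shipped meanArrivalDelayMapper and meanArrivalDelayReducer may differ in detail. Each function is assumed to be saved in its own file on the MATLAB path.

```matlab
% --- File: sketchMeanDelayMapper.m (hypothetical name) ---
function sketchMeanDelayMapper(data, info, intermKVStore)
% data is a table holding one block of the ArrDelay variable.
% info is supplied by mapreduce and is unused in this sketch.
delays = data.ArrDelay;
delays = delays(~isnan(delays));   % drop missing values
% Emit the partial count and partial sum for this block under one key.
add(intermKVStore, 'PartialCountSum', [numel(delays), sum(delays)]);
end

% --- File: sketchMeanDelayReducer.m (hypothetical name) ---
function sketchMeanDelayReducer(intermKey, intermValIter, outKVStore)
% Accumulate the partial counts and sums emitted by all map calls.
totalCount = 0;
totalDelay = 0;
while hasnext(intermValIter)
    countSum = getnext(intermValIter);
    totalCount = totalCount + countSum(1);
    totalDelay = totalDelay + countSum(2);
end
% Write a single key-value pair containing the overall mean.
add(outKVStore, 'MeanDelay', totalDelay / totalCount);
end
```

Emitting a count and a sum from each block, rather than a per-block mean, lets the reducer combine blocks of different sizes correctly. Functions with these signatures are passed to mapreduce as function handles, as in the call that follows.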
meanDelay = mapreduce(ds,@meanArrivalDelayMapper,@meanArrivalDelayReducer,mr,...
'OutputFolder',outputFolder)
Parallel mapreduce execution on the Hadoop cluster:
********************************
* MAPREDUCE PROGRESS *
********************************
Map 0% Reduce 0%
Map 66% Reduce 0%
Map 100% Reduce 66%
Map 100% Reduce 100%