Getting output from parallel jobs that would regularly go to a diary

Using the Parallel Computing Toolbox, is it possible to get each individual task created by createTask to write its output to a diary, so that one can get feedback on a job's progress? For example, in the following silly example, I'd like to write the output from fsolve's displays to separate files as it is generated, so that I can see where things are going bad. It appears that error messages are reported only at the end of a job, which is not at all helpful if your job is scheduled to run for a week. Here's an example that illustrates the first point, but not the second (because the job runs quickly).
parallel.defaultClusterProfile('local');
c = parcluster();
j = createJob(c);
numCores = 4;
pauseTime = 2;
for ii=1:numCores
    param = {1/rand,pauseTime*(ii-1)};
    t = createTask(j,@dumbProblem,3,param);
end
disp(['Submitting ' num2str(numCores) ' copies of dumbProblem']);
submit(j);
disp(['job submitted at ' datestr(now) ', now waiting']);
wait(j,'finished');
disp(['job finishing at ' datestr(now) ]);
OutputArgs = fetchOutputs(j);
errmsgs = get(j.Tasks, {'ErrorMessage'});
nonempty = ~cellfun(@isempty, errmsgs);
celldisp(errmsgs(nonempty));
disp(OutputArgs)
keyboard;
function [soln,fval,exitflag] = dumbProblem(param,pauseTime)
pause(pauseTime);
[soln,fval,exitflag,output] = fsolve(@(x) sin(param*x),[-4, 4],optimoptions('fsolve','Display','iter'))
sprintf('soln');
disp(output);
OutputArgs gives me the solutions if they are generated, but if the solver fails to solve the problem, I don't see any way of getting feedback while the program is still running.
Thanks! leo

Answers (1)

Thomas Ibbotson on 21 Oct 2014
Remove the line with:
wait(j, 'finished')
Then this allows you to get the tasks' diary output while the job is still running with:
j.Tasks.Diary
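A minimal sketch of checking the diaries repeatedly rather than once. It assumes j is the job object created and submitted as in the question; the 5-second interval is an arbitrary illustrative choice. Whether Diary is refreshed before a task finishes may depend on the cluster type and MATLAB release, so treat this as something to try, not a guarantee.

```matlab
% Sketch only: assumes j was created with createJob and submitted with submit(j).
pollInterval = 5;   % seconds between checks (arbitrary illustrative value)

% Keep polling until the job reaches a terminal state.
while ~any(strcmp(j.State, {'finished', 'failed', 'deleted'}))
    for k = 1:numel(j.Tasks)
        tk = j.Tasks(k);
        % Diary holds whatever command-window output has been captured so far;
        % it may remain empty until the task completes on some cluster types.
        fprintf('--- Task %d (%s) ---\n%s\n', k, tk.State, tk.Diary);
    end
    pause(pollInterval);
end
```

Indexing j.Tasks(k) one task at a time avoids the comma-separated list that j.Tasks.Diary produces when the job has several tasks.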
  1 Comment
Leo Simon on 21 Oct 2014
Thanks very much for the response, Thomas. I modified the script as suggested and converted dumbProblem into an endless loop, as follows:
parallel.defaultClusterProfile('local');
c = parcluster();
j = createJob(c);
numCores = 4;
pauseTime = 2;
for ii=1:numCores
    param = {1/rand,pauseTime*(ii-1)};
    t = createTask(j,@dumbProblem,3,param);
end
disp(['Submitting ' num2str(numCores) ' copies of dumbProblem']);
submit(j);
disp(['job submitted at ' datestr(now) ', now waiting']);
%wait(j,'finished');
disp(['job finishing at ' datestr(now) ]);
keyboard
and dumbProblem is now
function [soln,fval,exitflag] = dumbProblem(param);
alpha = param{1};
pauseTime = param{2};
pause(pauseTime);
while 1 > 0
    [soln,fval,exitflag,output] = fsolve(@(x) sin(alpha*x),[-4, 4],optimoptions('fsolve','Display','iter'))
    sprintf('soln');
    disp(output);
end
However, when I proceed as suggested, j.Tasks.Diary returns what looks like an empty cell array:
>> main_program
Submitting 4 copies of dumbProblem
job submitted at 21-Oct-2014 06:36:49, now waiting
job finishing at 21-Oct-2014 06:36:49
K>> j.Tasks.Diary
ans =
''
ans =
''
ans =
''
ans =
''
Can you advise further please?
