MATLAB Distributed Computing Server

Eamorr on 11 Jun 2014
Edited: Edric Ellis on 12 Jun 2014
Hi,
I have a PBS TORQUE/MAUI cluster set up.
Here's the MATLAB I'm trying to run:
%Initialise cluster:
cluster = parallel.cluster.Generic()
set(cluster,'ClusterMatlabRoot','/gaia/software/src/matlab-2014a/build/')
set(cluster,'JobStorageLocation','/tmp/generic')
set(cluster,'NumWorkers',64)
set(cluster, 'OperatingSystem', 'unix');
set(cluster, 'IndependentSubmitFcn', {@independentSubmitFcn, 'gaia.ucd.ie', '/gaia/scratch'});
set(cluster, 'CommunicatingSubmitFcn', {@communicatingSubmitFcn, 'gaia.ucd.ie', '/gaia/scratch'});
set(cluster, 'GetJobStateFcn', @getJobStateFcn);
set(cluster, 'DeleteJobFcn', @deleteJobFcn);
set(cluster,'HasSharedFilesystem',false)
%create a job:
job = createJob(cluster)
%do some work:
createTask(job, @rand, 1, {3,3});
createTask(job, @rand, 1, {3,3});
createTask(job, @rand, 1, {3,3});
createTask(job, @rand, 1, {3,3});
createTask(job, @rand, 1, {3,3});
%submit job:
submit(job)
%Get results:
results = fetchOutputs(job); %This fails!
Here is the error message:
Error using parallel.Job/fetchOutputs (line 841) Outputs can only be fetched if the job is in State 'finished'.
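Since submit returns immediately, this same error also appears whenever fetchOutputs is called before the job has had time to finish. A minimal sketch that rules this out, assuming `job` is the job object from the code above; the 120-second timeout is an arbitrary illustrative value, not something from the original setup:

```matlab
% Assumes `job` is the job object created and submitted in the code above.
timeoutSeconds = 120;  % arbitrary illustrative timeout (an assumption)

% wait blocks until the job reaches 'finished' or the timeout expires.
% It returns true only if the state was reached in time.
finishedInTime = wait(job, 'finished', timeoutSeconds);

if finishedInTime
    results = fetchOutputs(job);
    disp(results);
else
    % Still not finished: report the state MATLAB currently sees.
    fprintf('Job %d is still in state ''%s'' after %d s.\n', ...
        job.ID, job.State, timeoutSeconds);
end
```

If the job never leaves its initial state on the MATLAB side, the wait simply times out, which points to a problem with how job state or files get back to the client rather than with the timing of the fetch.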
In MATLAB, the job stays in the "queued" state indefinitely, with all five tasks pending:
job
job =
 Job
    Properties:
                   ID: 46
                 Type: independent
             Username: ehynes
                State: queued
           SubmitTime: Wed Jun 11 18:34:33 IST 2014
            StartTime:
     Running Duration: 0 days 0h 0m 0s
      AutoAttachFiles: true
  Auto Attached Files: List files
        AttachedFiles: {}
      AdditionalPaths: {}
    Associated Tasks:
       Number Pending: 5
       Number Running: 0
      Number Finished: 0
    Task ID of Errors: []
When I run `qstat` on the cluster's head node, I see the submitted job move through states Q, R, E and C (queued, running, exiting, completed).
  1 Comment
Edric Ellis on 12 Jun 2014
Edited: Edric Ellis on 12 Jun 2014
Which integration scripts are you using? A job getting stuck in state 'pending' usually means that the workers on the cluster didn't have access to the JobStorageLocation. You might also want to check the JobStorageLocation to see whether any log files made it back there; they might have more information.
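One way to do that check from the MATLAB client is sketched below. It looks for `*.log` files in the JobStorageLocation from the question and in its immediate subfolders. The `.log` extension and the one-level folder layout are assumptions; the actual file names depend on the integration scripts, which the thread does not show.

```matlab
% JobStorageLocation as set in the question; adjust for your own setup.
storage = '/tmp/generic';

logFiles = {};
if exist(storage, 'dir')
    top = dir(storage);

    % Immediate subfolders, excluding the '.' and '..' entries.
    subNames = {top([top.isdir]).name};
    subNames = subNames(~ismember(subNames, {'.', '..'}));
    subPaths = cellfun(@(s) fullfile(storage, s), subNames, ...
        'UniformOutput', false);

    % Search the storage folder itself plus each immediate subfolder.
    folders = [{storage}, subPaths];
    for k = 1:numel(folders)
        % Assumption: log files, if present, end in '.log'.
        found = dir(fullfile(folders{k}, '*.log'));
        for f = 1:numel(found)
            logFiles{end+1} = fullfile(folders{k}, found(f).name); %#ok<AGROW>
        end
    end
end

if isempty(logFiles)
    fprintf('No .log files found under %s\n', storage);
else
    fprintf('%s\n', logFiles{:});
end
```

Any file listed can then be read with `type` or opened in an editor. If nothing is listed, that is consistent with the workers never writing back to the storage location.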


Answers (0)