Data Limit for Parallelisation and detailed questions about parallel scheduling

Dear all,
I've run into a strange problem with a parallelisation task and have not found a solution yet. I use an spmd block and codistribute a large array inside it. I don't know beforehand how large the solution array will be, so I over-estimate its size. I also create the solution array inside the spmd block so that I can use the results as a Composite once spmd has finished.
What I don't understand is this: I can pass the data to the workers, codistribute the input array and create the solution array on each worker without any problems. But as soon as I add the code that calculates the values for the solution array, MATLAB throws an "error in distcompserialize, Error during serialisation". I find this strange because the amount of data passed into the workers does not change. For some time I thought the 2GB limit might also apply to data created on the workers, but that would not explain why I can pass the data to the workers and create the solution array.
Code that works (spmd-block only):
spmd
    disp('Starting Lab...');
    d_dist = codistributed(EoraCoDistCalc, codistributor1d(1));
    d_local = getLocalPart(d_dist);
    disp(['local size is ' num2str(size(d_local))]);
    transferG = zeros(100000000,5); % This array holds the results
    line = 1;
    % Calculation part is commented out.
    %for i=1:size(d_local,1) %Loop over S1 elements
    %
    % % store Eora row and col locally to avoid multiple look-ups of those
    % % coordinates in the pre and post conc matrices
    %
    % EoraRow = d_local(i,2); EoraCol = d_local(i,3);
    %
    % % find destination coordinates in CREEA for current elements in Eora
    %
    % CREEArowsubs = find(PreConc(EoraRow,:)); % find those values that are non-zero in that row (i.e. the ones that the element corresponds to)
    % CREEAcolsubs = find(PostConc(EoraCol,:));
    %
    % CREEArowvals = PreConc(EoraRow,CREEArowsubs);
    % CREEAcolvals = PostConc(EoraCol,CREEAcolsubs);
    %
    % Nrow = length(CREEArowsubs); Ncol = length(CREEAcolsubs);
    %
    % NValues = Nrow*Ncol;
    %
    % transferG(line:line+NValues-1,:) = [repmat(d_local(i,1),NValues,1) repmat(d_local(i,4),NValues,1) repmat(CREEArowsubs',Ncol,1) reshape(repmat(CREEAcolsubs, Nrow, 1),NValues,1) reshape(CREEArowvals'*CREEAcolvals,NValues,1)];
    % line = line+NValues;
    %
    % end
    %
    % transferG = transferG(1:line-1,:);
end
Code that does not work:
spmd
    disp('Starting Lab...');
    d_dist = codistributed(EoraCoDistCalc, codistributor1d(1));
    d_local = getLocalPart(d_dist);
    disp(['local size is ' num2str(size(d_local))]);
    transferG = zeros(100000000,5); % over-sized buffer that holds the results
    line = 1; % next free row in transferG
    for i=1:size(d_local,1) % Loop over S1 elements
        % store Eora row and col locally to avoid multiple look-ups of those
        % coordinates in the pre and post conc matrices
        EoraRow = d_local(i,2); EoraCol = d_local(i,3);
        % find destination coordinates in CREEA for current elements in Eora
        CREEArowsubs = find(PreConc(EoraRow,:)); % find those values that are non-zero in that row (i.e. the ones that the element corresponds to)
        CREEAcolsubs = find(PostConc(EoraCol,:));
        CREEArowvals = PreConc(EoraRow,CREEArowsubs);
        CREEAcolvals = PostConc(EoraCol,CREEAcolsubs);
        Nrow = length(CREEArowsubs); Ncol = length(CREEAcolsubs);
        NValues = Nrow*Ncol;
        transferG(line:line+NValues-1,:) = [repmat(d_local(i,1),NValues,1) ...
            repmat(d_local(i,4),NValues,1) ...
            repmat(CREEArowsubs',Ncol,1) ...
            reshape(repmat(CREEAcolsubs, Nrow, 1),NValues,1) ...
            reshape(CREEArowvals'*CREEAcolvals,NValues,1)];
        line = line+NValues;
    end
    transferG = transferG(1:line-1,:); % line points at the next free row, so the last filled row is line-1
end
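To make the fill pattern in the loop easier to follow, here is a minimal serial sketch on toy data. It needs no parallel pool and does not reproduce the serialisation error. The small matrices, the column meanings of d_local and the counter name nextRow are all made up for illustration; only the block construction mirrors the loop body above.

```matlab
% Toy stand-ins for the arrays in the question (all values are hypothetical).
PreConc  = [0.4 0 0.6; 0 1 0];      % 2 source rows mapped onto 3 target rows
PostConc = [0 1 0; 0.5 0 0.5];      % 2 source cols mapped onto 3 target cols
d_local  = [10 1 2 7; 20 2 1 9];    % assumed columns: value, source row, source col, tag

transferG = zeros(100, 5);          % deliberately over-sized result buffer
nextRow = 1;                        % next free row in transferG

for i = 1:size(d_local, 1)
    rowSubs = find(PreConc(d_local(i, 2), :));   % non-zero targets for this row
    colSubs = find(PostConc(d_local(i, 3), :));  % non-zero targets for this col
    rowVals = PreConc(d_local(i, 2), rowSubs);
    colVals = PostConc(d_local(i, 3), colSubs);
    nRow = numel(rowSubs);
    nCol = numel(colSubs);
    nValues = nRow * nCol;

    % One output row per (rowSub, colSub) pair, row index varying fastest.
    block = [repmat(d_local(i, 1), nValues, 1), ...
             repmat(d_local(i, 4), nValues, 1), ...
             repmat(rowSubs', nCol, 1), ...
             reshape(repmat(colSubs, nRow, 1), nValues, 1), ...
             reshape(rowVals' * colVals, nValues, 1)];
    transferG(nextRow:nextRow + nValues - 1, :) = block;
    nextRow = nextRow + nValues;
end

% nextRow points one past the last filled row, so trim to nextRow - 1.
transferG = transferG(1:nextRow - 1, :);
disp(transferG);
```

With this toy input the first element expands into four rows (two target rows times two target columns) and the second into one, so the trimmed result has five rows.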
I have asked a few people who have worked with parallel MATLAB environments before, and they said they don't know what's going on. My feeling is that there are details of memory usage in parallel code sections that I am not aware of. It would be great if anybody could point me in the right direction.
Thank you,
Arne
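One difference visible between the two code blocks is that the second one references PreConc and PostConc inside spmd, while in the first those references are commented out. Whether that is related to the error is not established in this thread. As a minimal diagnostic sketch, the in-memory size of each client variable used in the spmd body can be listed with whos. The small random arrays below are hypothetical stand-ins so the snippet runs on its own; in a real session the variables would already exist and those three assignments would be dropped. In-memory size is only a rough proxy for serialised size.

```matlab
% Hypothetical stand-ins; replace with the real workspace variables.
EoraCoDistCalc = rand(1000, 4);
PreConc  = rand(200, 300);
PostConc = rand(200, 300);

names = {'EoraCoDistCalc', 'PreConc', 'PostConc'};
sizesInBytes = zeros(1, numel(names));
for k = 1:numel(names)
    info = whos(names{k});            % struct describing the variable
    sizesInBytes(k) = info.bytes;     % in-memory size in bytes
    fprintf('%-16s %12d bytes\n', names{k}, info.bytes);
end
fprintf('%-16s %12d bytes\n', 'total', sum(sizesInBytes));
```

Comparing these numbers between the working and the failing variant would show whether the data referenced by the two blocks really is the same size.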
  1 Comment
Edric Ellis on 5 Feb 2014
It would be very helpful if you could reduce your problem to a simple, correct, self-contained example so that we can run it and see exactly what the problem is.
Also, what version of MATLAB/PCT are you using? What OS are you using?
Note that the client/worker transfer limit was increased beyond 2GB in R2013a.
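For a sense of scale, the over-estimated result buffer from the question can be compared with the 2GB figure by plain arithmetic, without allocating anything. This is only a sketch: it treats 2GB as 2 GiB, which is an assumption, and it does not show that this buffer is what gets serialised.

```matlab
nRows = 100000000;            % from zeros(100000000,5) in the question
nCols = 5;
bytesPerDouble = 8;           % zeros() returns double precision by default
bufferBytes = nRows * nCols * bytesPerDouble;
limitBytes = 2 * 2^30;        % 2GB read as 2 GiB (assumption)

fprintf('Result buffer per worker: %.2f GiB\n', bufferBytes / 2^30);
fprintf('Larger than 2 GiB: %d\n', bufferBytes > limitBytes);
```

The untrimmed buffer is 4e9 bytes on each worker, so any step that moved it between client and worker in one piece would be above the older limit.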

Answers (0)
