Parfor and data copy to workers

Giovanni De Luca on 11 Dec 2013
Edited: Matt J on 11 Dec 2013
Hello,
I read this brief explanation about when MATLAB does a copy of data when working with the PCT in
but I would need a clearer answer, if possible. Assume I have 2 available parallel workers and want to apply a time-expensive function fnc to a large sparse matrix A of dimension [10^5x10^5] and a matrix B of dimension [10^5x10^2], given a vector vec of scalar values with dimension [1x1000]. The function returns a positive scalar scal:
function scal=fnc(A,B,elem)
X=(elem*A)\B; % most expensive routine
% do something on X
scal=...
end
Then,
parfor i=1:size(vec,2) % 1000 cycles
scal(1,i)=fnc(A,B,vec(1,i));
end
s_max=max(scal(1,:));
My question is: how many times is the data copied, given 2 available workers and 1000 loop iterations? The point is that if I split vec into 2 parts (one per available worker), i.e.
vec_new=[vec(1,1:500);vec(1,501:1000)]; % [2x500] matrix
and I slightly modify the function fnc, creating a new function fnc_new:
function scal=fnc_new(A,B,vec_elem)
scal=0;
for i=1:size(vec_elem,2)
X=(vec_elem(1,i)*A)\B; % most expensive routine
% do something on X
scal_temp=...
scal=max(scal,scal_temp); % keep the running maximum over this chunk
end
end
and
parfor i=1:size(vec_new,1) % 2 cycles
scal(1,i)=fnc_new(A,vec_new(i,:));
end
s_max=max(scal(1,:));
The two approaches (fnc plus the first parfor, and fnc_new plus the second parfor) give the same final result s_max, but I observed a further speedup with the second one, and I wonder whether it is a data-copy issue. I hope I was clear. Thank you in advance.
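For reference, here is a minimal, self-contained sketch of the two loop variants on a small stand-in problem. The sizes, the random test data, and the post-processing step max(abs(X(:))) are assumptions made only for illustration; the real "do something on X" step is not given in the post. In both variants A and B are only read inside the loop, so MATLAB should classify them as broadcast variables, while vec and vecNew are indexed by the loop variable and should be sliced.

```matlab
% Small stand-in problem (illustrative sizes, not the 10^5 case).
rng(1);                                   % reproducible random data
n = 300;
A = sprandn(n, n, 0.03) + 10*speye(n);    % sparse, shifted to be well conditioned
B = randn(n, 4);
vec = linspace(0.5, 2, 12);               % nonzero scalars
nChunks = 2;                              % one chunk per worker, as in the question

% Variant 1: one solve per parfor iteration.
scal1 = zeros(1, numel(vec));
parfor i = 1:numel(vec)
    X = (vec(i)*A) \ B;                   % A, B are read-only here; vec is sliced
    scal1(i) = max(abs(X(:)));            % hypothetical stand-in for "do something on X"
end
sMax1 = max(scal1);

% Variant 2: one parfor iteration per chunk, serial loop inside.
vecNew = reshape(vec, [], nChunks).';     % rows hold contiguous halves of vec
scal2 = zeros(1, nChunks);
parfor i = 1:nChunks
    chunk = vecNew(i, :);                 % sliced row for this iteration
    best = 0;
    for k = 1:numel(chunk)
        X = (chunk(k)*A) \ B;
        best = max(best, max(abs(X(:)))); % running maximum inside the chunk
    end
    scal2(i) = best;
end
sMax2 = max(scal2);

fprintf('variant 1: %g, variant 2: %g\n', sMax1, sMax2);
```

Timing both variants with tic/toc on the real matrices would show whether the difference comes from data transfer or from per-iteration scheduling overhead.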
  1 Comment
Matt J on 11 Dec 2013
Edited: Matt J on 11 Dec 2013
It looks quite sub-optimal to be doing
X=(elem*A)\B
repeatedly in a loop for the same A and B. You should really be pre-computing X=A\B once. Since elem is a scalar, (elem*A)\B equals (A\B)/elem, so inside fnc() you only need to rescale:
function scal=fnc(X,elem)
X=X/elem; % cheap rescaling replaces the expensive solve
% do something on X
scal=...
end
Once you do this, the rest of your question might be irrelevant, since you noted that X=(elem*A)\B is your bottleneck anyway. I can't see why the second version would be faster, though, not from the detail that your code provides.
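The precomputation idea can be checked numerically on a small example. The sketch below assumes elem is a nonzero scalar, in which case (elem*A)\B and (A\B)/elem should agree up to rounding. The matrix sizes and random data are illustrative assumptions, not the matrices from the question.

```matlab
% Small stand-in problem (illustrative sizes and data).
rng(0);                                   % reproducible random data
n = 200;
A = sprandn(n, n, 0.05) + 10*speye(n);    % sparse, shifted to be well conditioned
B = randn(n, 5);
vec = linspace(0.5, 2, 8);                % nonzero scalars

X0 = A \ B;                               % solve once, outside the loop

maxRelErr = 0;
for k = 1:numel(vec)
    elem = vec(k);
    Xdirect = (elem*A) \ B;               % full solve at every iteration
    Xscaled = X0 / elem;                  % rescaled precomputed solution
    relErr = norm(Xdirect - Xscaled, 'fro') / norm(Xdirect, 'fro');
    maxRelErr = max(maxRelErr, relErr);   % track the worst disagreement
end
fprintf('max relative difference: %g\n', maxRelErr);
```

With this change only X0 (size [10^5x10^2] in the question) would need to reach the workers, and A would not be needed inside the loop at all.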
Other than that, you appear to have a typo where you call
scal(1,i)=fnc_new(A,vec_new(i,:));
with only two arguments. Shouldn't it be
scal(1,i)=fnc_new(A,B,vec_new(i,:));
