Can't get a speedup!

john on 1 Dec 2012
Hi,
I've just learned about the MATLAB Parallel Computing Toolbox and am studying it. To begin with, and for motivation, I tried some simple codes to get a speedup, but all of my parallel results were worse than the serial ones. I know about the overhead of data communication between cores, so I wrote code with as little data communication as possible, but as before the parallel execution time was longer than the serial one. I run the parallel code on two workers. My CPU is an Intel Core 2 Duo at 2.26 GHz. CPU usage is 100% while running the parallel code and 50% while running the serial code.
I also tried some code that I found on the net. The writer claimed a speedup of 1.92 for it using 2 workers, but I got 0.96!
I'm so disturbed!
  2 Comments
Walter Roberson on 1 Dec 2012
If you have hyperthreading enabled, turn it off.
john on 2 Dec 2012
I checked the BIOS and didn't find any CPU settings or a hyperthreading option. I think this CPU (Core 2 Duo) doesn't have the hyperthreading feature.
Any other suggestions? Please help me.

Answers (2)

john on 2 Dec 2012
This is my serial code:
clear
A = zeros(2,4000000);
tic
for j = 1:2
    for k = 1:4000000
        A(j,k) = sin(j + k);
    end
end
toc
And the parallel one:
clear
A = zeros(2,4000000);
tic
parfor j = 1:2
    for k = 1:4000000
        A(j,k) = sin(j + k);
    end
end
toc
Very simple! The parfor has just two iterations, and I expected each iteration to be executed by one core, giving a speedup of about 2. But the run time of the first is 4 seconds and that of the second is 14!
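For reference, the speedup implied by those two timings, and one way to time the parfor loop without counting pool startup, could be sketched as follows. This is only a sketch: it assumes a release that provides `gcp` and `parpool` (older releases used `matlabpool`), and the names `tSerial`, `tParallel`, `N`, and `tPar` are illustrative.

```matlab
% Speedup = serial time / parallel time, using the timings reported above.
tSerial   = 4;                    % seconds, serial loop
tParallel = 14;                   % seconds, parfor loop
speedup   = tSerial / tParallel;  % about 0.29, i.e. a slowdown

% Assumption: open the pool before timing so its startup cost is excluded.
if isempty(gcp('nocreate'))
    parpool(2);
end

N = 4000000;
A = zeros(2, N);
tic
parfor j = 1:2
    for k = 1:N
        A(j, k) = sin(j + k);
    end
end
tPar = toc;   % elapsed time of the parfor loop only
```

Running the timed section a second time is also worth trying, since the first parfor call on a fresh pool can include one-time setup work.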
  2 Comments
Walter Roberson on 2 Dec 2012
Try reversing the order of the subscripts, producing a 4 million by 2 output, so that there would not be any cache-line contention. Also, try vectorizing, e.g.,
A(j, :) = sin(j + (1:4000000));
with no "for k" loop.
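Both suggestions combined might look like the sketch below. It assumes the transposed 4000000-by-2 layout is acceptable for whatever uses `A` afterwards; `N` is just an illustrative name for the row count.

```matlab
N = 4000000;
A = zeros(N, 2);                  % transposed layout: each j owns one contiguous column
parfor j = 1:2
    A(:, j) = sin(j + (1:N)');    % vectorized; no inner "for k" loop
end
```

Each parfor iteration now fills a whole column in a single assignment.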
Jan on 29 Dec 2012
Slightly faster: sin(j + 1:j + 4000000)
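This works because the colon operator has lower precedence than `+` in MATLAB, so the range is parsed as `(j + 1):(j + 4000000)` and the addition is applied to the two endpoints only, not to every element. A quick check on a short range (the values of `j` and `n` are only for illustration):

```matlab
j = 2;
n = 5;
r1 = j + 1:j + n;         % parsed as (j + 1):(j + n)
r2 = j + (1:n);           % adds j to every element of 1:n
same = isequal(r1, r2);   % both ranges hold the same values
```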


john on 2 Dec 2012
Edited: john on 2 Dec 2012
Thanks! I tried this code:
A(j, :) = sin(j + (1:4000000));
and got a speedup of about 1! This wasn't as disappointing as the earlier examples. Then I tried making each iteration heavier:
A(j, :) = sin(j + (1:4000000)) .* sin(j - (1:4000000)) ...
.* cos(j - (1:4000000)) .* cos(j + (1:4000000));
and the speedup was about 1.3!
Can you please explain what effect the code you suggested (A(j, :) = ...) has? Why was my first code bad?
Regards
  6 Comments
Walter Roberson on 29 Dec 2012
Those machines haven't been sold for a number of years. And MATLAB has not been supported on them for a fair number of releases.
Bradley Stiritz on 22 Jan 2013
Regarding Walter's 12/2/2012 comment, I requested a documentation reference from MathWorks support and received this reply:
------------------------------------------------
"While working with MATLAB in general, and also with PCT, accessing consecutive rows within the same column is faster than accessing consecutive columns within the same row. This is because, in MATLAB, matrix elements belonging to the same column are located in consecutive locations of memory, while elements belonging to the same row of the matrix are located in non-consecutive locations of memory.
The following link from the Mathworks website highlights the above mentioned fact and also provides additional information on "Speeding up MATLAB Applications"
Also, please see the following MATLAB Documentation link for additional details on profiling and improving parallel code:
Unfortunately, we cannot recommend any books. However, it would be highly recommended to go through the webinars mentioned in the link below:
------------------------------------------------
Hope this helps.
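The column-major layout described in the reply can be seen with linear indexing. A small sketch, where the 2-by-3 matrix is only an example:

```matlab
A = [1 3 5;
     2 4 6];
v = A(:)';                            % memory order: [1 2 3 4 5 6]
idxDown   = sub2ind(size(A), 2, 1);   % next row, same column: linear index 2
idxAcross = sub2ind(size(A), 1, 2);   % same row, next column: linear index 3
```

Starting from A(1,1), moving down a column advances one element in memory, while moving along a row skips a whole column's worth of elements (here 2, the number of rows).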
