Can LU decomposition use more than 12 cores on a desktop?

Hi,
I have to use LU decomposition on a very large matrix. LU automatically uses multiple cores when they are present. Will it use more than 12 cores if they are available?
Thanks

Answers (2)

Shashank Prasanna on 28 Aug 2013
LU takes advantage of built-in multithreading, which extends beyond 12 cores.
The 12 core limit you are probably referring to is the number of independent workers that Parallel Computing Toolbox allows you to launch on your local machine.
  8 Comments
Shashank Prasanna on 28 Aug 2013
Edited: Shashank Prasanna on 28 Aug 2013
Salvador, you have to take into account that there will be data transfer overhead if you want to scale computation by parallelizing. This overhead is far more significant when it comes to GPUs.
What is the end goal? Is this for academic purposes or for a real-world application? If this is indeed for an actual application, there are, as Walter already mentioned, several other factors that need to be considered, including your infrastructure. If you can elaborate on your use case, we will be able to tell you which tools are available to best attack the problem.
Walter Roberson on 28 Aug 2013
Because the transfer times are longer when it comes to GPUs, you have to give the GPUs more work to make it worthwhile compared to the alternatives. As you increase the number of GPUs, you eventually reach a point where loading all of them takes longer than twice the time each GPU needs to execute its task. At that point it would have been more efficient to give tasks twice as long to half as many GPUs, and so avoid about half of the loading and about half of the unloading.
Besides, you cannot attach more than one GPU per worker process (but a GPU can sometimes be shared between workers). With heat budgets and the like, you are pushing it to put more than 2 GPUs in a single standard case.
Perhaps your question was about switching to a single GPU instead of parallel cores?
Earlier I wrote about simultaneous transfers for distributing instructions and workers. What I omitted to mention was that some cluster systems designed for high performance can send the same data to multiple core-groups simultaneously without increased time, provided the same information is going to each. The only such systems I can name at the moment are the blade servers built by SGI; I don't think Parallel Computing Toolbox is designed to take advantage of those facilities (but you might be able to make message passing calls using the standard message passing libraries to get it to work). There are some kinds of problems for which that kind of facility is critical for high performance, but the majority of parallel tasks can instead be implemented in terms of tasks that can be run independently.
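
The loading trade-off described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not a measurement: it assumes the devices are loaded one after another over a shared link, that each load costs the same fixed time, and that the compute work splits evenly. The function names and the numbers are hypothetical, and unloading is ignored for simplicity.

```python
def makespan(total_compute_s, load_s, n_devices):
    """Finish time of the last device under the assumed serial-loading model.

    Assumption: devices are loaded one at a time, so the last device only
    starts after n_devices * load_s seconds, then computes its even share.
    """
    return n_devices * load_s + total_compute_s / n_devices


def best_device_count(total_compute_s, load_s, max_devices):
    """Device count (1..max_devices) with the smallest modeled finish time."""
    return min(range(1, max_devices + 1),
               key=lambda n: makespan(total_compute_s, load_s, n))


if __name__ == "__main__":
    # Hypothetical numbers: 100 s of total compute, 1 s to load each device.
    for n in (1, 5, 10, 20, 40):
        print(n, "devices:", round(makespan(100.0, 1.0, n), 2), "s")
    print("best count:", best_device_count(100.0, 1.0, 64))
```

In this model, adding devices helps only while the compute time saved outweighs the extra loading time; past that point, fewer devices with longer tasks finish sooner, which is the effect described above.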


Salvador Sanchez on 29 Aug 2013
Thanks for the answers. The final goal is solving an inverse problem in medical imaging. What would be the best solution for these two types of problems, matrix multiplication and solving linear equations (by LU), assuming a maximum budget of $20,000? The matrices are very large (10 GB) and not sparse.
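
To get a feel for the problem size, here is a rough back-of-the-envelope sketch. It assumes the stated size means 10 gigabytes (10^10 bytes), that the matrix is square, and that it is stored in double precision (8 bytes per element); none of these is confirmed in the thread. The operation counts are the standard leading-order estimates for dense matrices: about (2/3)n^3 for LU and about 2n^3 for a matrix-matrix product. The sustained rate used at the end is a purely hypothetical figure.

```python
import math

BYTES_PER_DOUBLE = 8  # assumption: double-precision storage


def square_dimension(total_bytes):
    """Largest n such that an n-by-n double matrix fits in total_bytes."""
    return math.isqrt(total_bytes // BYTES_PER_DOUBLE)


def lu_flops(n):
    """Leading-order operation count for dense LU: (2/3) * n^3."""
    return 2 * n**3 / 3


def matmul_flops(n):
    """Leading-order operation count for an n-by-n matrix product: 2 * n^3."""
    return 2 * n**3


if __name__ == "__main__":
    n = square_dimension(10 * 10**9)  # assumption: 10 GB = 10^10 bytes
    print("n =", n)
    print("LU flops     ~ %.2e" % lu_flops(n))
    print("matmul flops ~ %.2e" % matmul_flops(n))
    # Hypothetical sustained rate of 100 GFLOP/s, for illustration only.
    print("LU time at 100 GFLOP/s ~ %.0f s" % (lu_flops(n) / 100e9))
```

Under these assumptions the matrix is roughly 35,000 by 35,000, and one factorization costs on the order of 3e13 floating-point operations, so memory capacity and transfer time are likely to matter as much as the core count.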
