How many MATLAB Distributed Computing Engine worker sessions should I run on my cluster?

I would like to know the optimal number of workers and job managers to run on my cluster. I would also like to know how many Distributed Computing Toolbox client sessions this configuration can support.

Accepted Answer

MathWorks Support Team on 27 Jun 2009
You can run as many MDCE worker sessions on a machine as you want, but the real question is whether running more workers improves performance. That depends on many factors, such as:
- How fast is each machine's processor, and how much memory does it have?
- Are the tasks computationally intensive, or do they involve a lot of idle time?
- Is there a lot of data transferred across the network for each task?
- How does the execution time of the tasks compare to the time to transfer data?
Without this specific information, there is no clear formula for the number of worker sessions to run on your machines. As a general guideline, and as a baseline for comparison, run one MDCE worker session per processor or core: on single-processor/single-core machines, start with one worker session; on dual-processor/dual-core machines, run two workers; and so on.
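The factors above can be explored with a rough model. The Python sketch below is an illustration only, not a MathWorks formula: the function name `estimated_wall_time` and all of its parameters are hypothetical, and it assumes identical tasks, compute that runs in parallel on at most `cores` workers, and data transfers that are serialized over one shared network link.

```python
import math


def estimated_wall_time(n_tasks, compute_s, transfer_s, workers, cores):
    """Toy estimate of total wall-clock time in seconds for one machine.

    Assumptions (illustrative, not from MathWorks): tasks are identical,
    compute runs in parallel on at most `cores` workers, and all data
    transfers share one network link, so they are serialized.
    """
    # Workers beyond the core count add no compute capacity in this model.
    effective = max(1, min(workers, cores))
    compute_total = math.ceil(n_tasks / effective) * compute_s
    transfer_total = n_tasks * transfer_s
    return compute_total + transfer_total


if __name__ == "__main__":
    # 8 tasks, 10 s of compute and 1 s of transfer each, on a dual-core machine.
    for w in (1, 2, 4):
        print(w, estimated_wall_time(8, 10.0, 1.0, w, cores=2))
```

With these made-up numbers, going from one worker to two cuts the estimate from 88 to 48 seconds, while running four workers on two cores gives no further gain. Real workloads with idle time or uneven tasks can behave differently, which is why measuring against the one-worker-per-core baseline is still the recommended approach.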
Starting with MATLAB 7.4 (R2007a), MATLAB can be configured to use multiple threads for computations. This behavior is disabled in MATLAB Distributed Computing Engine, so a worker session does not spread its computations across several cores. As a result, the general rule of one worker per processor/core still applies.
How many schedulers you should run, and how many workers you should register with each, depends on how many workers your application's jobs need. You must register at least the minimum number of workers a job requires with a single job manager or third-party scheduler. Beyond that, you can configure your workers and schedulers in whatever ratio works best for your cluster. Generally, you minimize the number of idle workers by minimizing the number of schedulers: if all your jobs are submitted to the same scheduler, that scheduler can make the best use of all available workers.
The number of client sessions that connect to a job manager does not matter. It makes no difference to the job manager whether it receives many jobs from each of a few client sessions or a few jobs from each of many client sessions. The same applies to schedulers other than the job manager.
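To illustrate why fewer schedulers tend to leave fewer workers idle, here is a small Python sketch. It is a simplified illustration under stated assumptions, not a description of how the job manager actually schedules work: the function names are hypothetical, each task is assumed to occupy exactly one worker, and a scheduler is assumed to use only the workers registered with it.

```python
def idle_workers_split(schedulers):
    """Idle workers when each scheduler can only use its own workers.

    `schedulers` is a list of (registered_workers, pending_tasks) pairs.
    """
    return sum(max(0, workers - tasks) for workers, tasks in schedulers)


def idle_workers_pooled(schedulers):
    """Idle workers if all workers and tasks sat behind one scheduler."""
    total_workers = sum(workers for workers, _ in schedulers)
    total_tasks = sum(tasks for _, tasks in schedulers)
    return max(0, total_workers - total_tasks)


if __name__ == "__main__":
    # Two schedulers with 4 workers each: one has 1 pending task, the other 8.
    cluster = [(4, 1), (4, 8)]
    print(idle_workers_split(cluster), idle_workers_pooled(cluster))
```

In this example, the split configuration leaves three workers idle on the lightly loaded scheduler while tasks wait on the other one. With a single scheduler, all eight workers would be busy.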
