Highly variable execution time of mex file

1 view (last 30 days)
Jason
Jason on 19 Dec 2013
Commented: Jason on 20 Dec 2013
Hello,
I am running an EM type minimization where the E step is essentially a Kalman filter+smoother with a few minor additional details. I have a matlab version and a mex C version. The C version uses BLAS/LAPACK for the multiplications and divisions. I am running both at each EM step to compare timing and accuracy etc. but I have noticed that the mex timing is highly variable, ranging from 0.5 to 1.0 seconds for a pass while the Matlab version is more stable at 1.53 to 1.75. Below are some example tic-tocs. Each pass is doing exactly the same thing (same memory management, same lapack function calls etc.). The question is, why is the mex version so much more variable in its timing? Is this known behavior for some reason? Its not the end of the world, I'm just curious as it probably means I have more optimization to do.
For the record, the mex file is using persistent variables a few mxCalloc's and mxFree's, and the usual math suspects (dsymm, dgemv, dgemm, dposv, dsyr, dger, and memcpy) and the gcc compiler.
Thanks in advance.
Jason
(First is matlab then mex then the next matlab etc.):
Elapsed time is 1.559690 seconds.
Elapsed time is 0.768989 seconds.
Elapsed time is 1.607263 seconds.
Elapsed time is 0.799670 seconds.
Elapsed time is 1.745036 seconds.
Elapsed time is 0.871407 seconds.
Elapsed time is 1.603699 seconds.
Elapsed time is 0.842408 seconds.
Elapsed time is 1.542582 seconds.
Elapsed time is 0.827200 seconds.
Elapsed time is 1.570363 seconds.
Elapsed time is 0.516239 seconds.
Elapsed time is 1.566661 seconds.
Elapsed time is 1.048197 seconds.
Elapsed time is 1.720891 seconds.
Elapsed time is 0.852531 seconds.
Elapsed time is 1.577776 seconds.
Elapsed time is 0.763253 seconds.
Elapsed time is 1.557377 seconds.
Elapsed time is 0.675226 seconds.
Elapsed time is 1.558302 seconds.
Elapsed time is 0.607389 seconds.
Elapsed time is 1.554754 seconds.
Elapsed time is 0.849879 seconds.
Elapsed time is 1.559411 seconds.
Elapsed time is 0.863618 seconds.
Elapsed time is 1.588187 seconds.
Elapsed time is 0.824819 seconds.
Elapsed time is 1.616166 seconds.
Elapsed time is 0.829500 seconds.
Elapsed time is 1.617957 seconds.
Elapsed time is 0.909654 seconds.
Elapsed time is 1.595710 seconds.
Elapsed time is 0.523104 seconds.
Elapsed time is 1.532362 seconds.
Elapsed time is 0.819856 seconds.

Answers (2)

James Tursa
James Tursa on 20 Dec 2013
I have no real explanation for this behavior (underlying cache system?), but I have noticed it on many occasions for some of my mex routines as well (32-bit Windows XP). My experience is just like above, seeming to cycle between two different values, which is why I have gotten into the habit of always timing functions with a multiple of 2 iterations and then averaging.
  1 Comment
Jason
Jason on 20 Dec 2013
Thak you for the response. I guess I wasn't clear. The alternation from fast to slow is matlab only versus mex. The matlab times are all in the range of 1.6s +/-. The mex versions are all .5 - 1.05. My concern was that the mex execution times (0.5s-1.05s) were quite variable themselves thow always less than the matlab-only version. With only minimal extraneous processes running (RedHat EL 6.4) I would not expect C timings to vary that much. After playing with it I found a naming conflict. I had a static pointer to a struct defined at the top of the file. The memory of this struc was mxCalloc'd and made persistent. In one function I erroneously passed this static pointer to another function and used the same variable name in that second function. Since it was static at the top of the file there was no need to pass it. With block scope though it should have been fine and not really caused a problem.
Once I removed that pointer from the argument list and just used the static pointer with program scope, the timing dramatically changed:
These are mex only timings after that one change:
Elapsed time is 0.228381 seconds.
Elapsed time is 0.228564 seconds.
Elapsed time is 0.225762 seconds.
Elapsed time is 0.227973 seconds.
Elapsed time is 0.227439 seconds.
Elapsed time is 0.228576 seconds.
Elapsed time is 0.227462 seconds.
Elapsed time is 0.228526 seconds.
Elapsed time is 0.227594 seconds.
Elapsed time is 0.228506 seconds.
Passing the pointer was not necessary, but is perfectly legal in C (as far as I know, I'm not a programmer) but the matlab memory management doesn't seem to be dealing with correctly.
I'm satisfied that the problem is "fixed" but still confused as to why there was a problem in the first place.
Thank you again.
Jason

Sign in to comment.


Jan
Jan on 20 Dec 2013
There seems to be a pattern in the runtimes: Slow and fast runs are alternating.
I've seen this behavior, when an array of doubles is stored either matching in the 128 bit cache-line-size or not. All of my tests seems to imply, that mxMalloc reserves memory aligned with the cacheline size, but I did not find this reliably documented.
mxCalloc bahve differently depending on the availability of free zeroed memory, which is controlled by the operating system. You could try to allocate the memory persistently to reduce this effect.

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!