GPU computation freezes randomly on Windows 10

I'm experiencing a strange problem using GPU computation on a Windows 10 machine.
The function that causes the problem is a simple random walk, called with arrayfun() so that it runs on the GPU. So nothing fancy there. It only adds a random step to the position for a fixed number of time steps, so in theory it cannot get stuck.
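For readers who want to reproduce the setup, here is a minimal sketch of the kind of function described above. It is not the code from the post: the names randomWalkDemo and randomWalk, the walker and step counts, and the uniform step in [-1, 1] are all assumptions made for illustration. It requires Parallel Computing Toolbox and a supported GPU.

```matlab
function finalPos = randomWalkDemo()
% Hypothetical reconstruction of a GPU random walk; not the poster's code.
    nWalkers = 1e5;    % assumed number of independent walkers
    nSteps   = 1000;   % assumed number of time steps per walker

    % Start every walker at position 0, directly in GPU memory.
    x0 = zeros(nWalkers, 1, 'gpuArray');

    % arrayfun compiles randomWalk into a single GPU kernel and runs it
    % once per element of x0; the scalar nSteps is expanded to every element.
    finalPos = arrayfun(@randomWalk, x0, nSteps);

    % Copy the result back to host memory.
    finalPos = gather(finalPos);
end

function x = randomWalk(x0, nSteps)
% Scalar code only: add one uniform random step in [-1, 1] per time step.
    x = x0;
    for k = 1:nSteps
        x = x + (2*rand - 1);
    end
end
```

Save this as randomWalkDemo.m and call finalPos = randomWalkDemo(); the loop has a fixed trip count, which is why a kernel like this should always terminate.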
The exact same code runs perfectly fine under Windows 7 and Windows 8.1 on the same machine, which has a GTX 1070 and the TdrLevel registry entry set to 0. On Windows 10 I tried several different driver versions, but after some random time the computation freezes. The GPU load remains at 100%, but the power consumption drops from 45% to 25% and stays there indefinitely. No monitor is connected to this GPU.
Sometimes I can trigger the freeze by opening Task Manager or GPU-Z, so it seems that the computation freezes when something else tries to get information from the GPU.
How can I debug the cause of this freeze when using arrayfun? Once it freezes I cannot stop the computation with Ctrl+C in MATLAB; I have to kill the MATLAB process. There is also no error in the Command Window.
Many thanks in advance, Dominik
  8 Comments
Joss Knight on 5 Nov 2017
In answer to your original question, you can't debug an arrayfun kernel in MATLAB, because it's not MATLAB code that's executing but a GPU kernel compiled from that code. But you can try attaching a CUDA debugger or analysing behaviour in one of the CUDA tools, like the Visual Profiler. The profiler can tell you quite a lot about running kernels.
Dominik Ludwig on 8 Nov 2017
Using the Visual Profiler did indeed help. By looking at the timeline I was able to narrow down the problem to the end of the computation, shortly before or during the gather() call. Looking at the Resource Monitor in Windows, I also noticed an increase in RAM errors, so I decided to set my RAM frequency to 2133 MHz. Since then there have been no freezes over several days and different workloads. What leaves me a bit puzzled is that it worked, and still works, fine with the other setting under Windows 7, 8.1 and Linux.
I have to conclude that Windows 10 is a mystery :)
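One way to narrow a freeze down to "kernel" versus "gather" from within MATLAB, without a profiler, is to force a synchronisation between the two stages and write a log line after each. The sketch below is only an illustration under assumptions: stagedRun, logStage and the log file are hypothetical names, not something used in this thread. It relies on wait(gpuDevice) blocking until all queued GPU work has finished.

```matlab
function result = stagedRun(fun, input, logFile)
% Hypothetical helper: run a GPU arrayfun call in logged stages so that,
% after a freeze, the log shows the last stage that completed.
    dev = gpuDevice;                      % currently selected GPU

    logStage(logFile, 'launching kernel');
    out = arrayfun(fun, input);           % may return before the kernel is done

    wait(dev);                            % block until the GPU has finished
    logStage(logFile, 'kernel finished');

    result = gather(out);                 % device-to-host transfer
    logStage(logFile, 'gather finished');
end

function logStage(logFile, msg)
% Open, append and close each time so the line is on disk before a hang.
    fid = fopen(logFile, 'a');
    fprintf(fid, '%s  %s\n', datestr(now, 'yyyy-mm-dd HH:MM:SS'), msg);
    fclose(fid);
end
```

For example, result = stagedRun(@(x) x.^2, gpuArray(1:10), 'gpu_stages.log'); if the last line in the log after a hang is "launching kernel", the kernel itself never completed; if it is "kernel finished", the stall happened during the transfer back to the host.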
