What is the best way to get a kD tree rangesearch with gpuArray?
So my question is rather broad, but I am hoping someone can offer me some guidance here.
Essentially, I need an equivalent of rangesearch() that is gpuArray compatible. I use this function in particle tracking code for reactive transport, so the input data is a large number of points (1e4-1e6) in three or fewer dimensions. It is my understanding that in the non-GPU context, rangesearch() uses a kD tree algorithm to first organize the data so that it can subsequently be searched efficiently.
I have been using the gpuArray features in the transport portion of my code and have gotten significant speedups. However, in order to simulate reactions, I need a fixed-radius nearest neighbor list, which, in a CPU context, I used rangesearch() to generate. But as rangesearch() is not gpuArray compatible, the process of gathering to the CPU to complete the rangesearch and then sending back to a gpuArray seems to be causing significant slowdowns.
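To make the round trip concrete, here is a minimal sketch of the pattern described above. The variable names, particle count, and radius are placeholders assumed for illustration, not taken from the actual code, and the flattening step is only one possible way to get the neighbor list back onto the GPU (rangesearch() returns a cell array, which cannot itself be a gpuArray).

```matlab
% Sketch of the gather -> rangesearch -> gpuArray round trip.
% N, r, and all variable names are hypothetical placeholders.
N = 1e5;                            % number of particles (assumed)
r = 0.01;                           % fixed search radius (assumed)
Xgpu = rand(N, 3, 'gpuArray');      % particle positions stored on the GPU

Xcpu = gather(Xgpu);                % device -> host copy
idx  = rangesearch(Xcpu, Xcpu, r);  % N-by-1 cell array of neighbor indices (CPU)

% Flatten the cell array into two index vectors before sending it back,
% since a cell array cannot be stored as a gpuArray.
counts  = cellfun(@numel, idx);     % number of neighbors per particle
cols    = [idx{:}].';               % concatenated neighbor indices
rows    = repelem((1:N).', counts); % query index repeated once per neighbor
rowsGpu = gpuArray(rows);           % host -> device copies
colsGpu = gpuArray(cols);
```

Both copies, plus the flattening on the host, happen once per reaction step, which is where the slowdown seems to come from.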
So, I suppose my first question is whether there is some efficient way of doing this with the currently available options.
I have also written my own version of rangesearch() in MATLAB. The script that builds the kD tree is essentially the same as MATLAB's KDTreeSearcher.m, and the fixed-radius search script is my own, done in the manner described by Friedman et al., referenced in the KDTreeSearcher.m script. Unsurprisingly, my version runs quite a bit slower than MATLAB's for a large number of points, given that mine is not multi-threaded or compiled in a faster language like C or Fortran. I was hoping that with some tweaking I could get my code working with gpuArray features and hence running much faster.
However, after doing this, my code is orders of magnitude slower than the same CPU version, and I am not quite sure why. I can post my code here if it would be helpful, but it is around 500 lines, so I will avoid that if someone can offer an answer on a theoretical level.
I am also curious about one last thing. I have written some very fast (actually beats rangesearch() on 8 cores!!) Fortran/OpenMP code that does the fixed-radius search with the kD tree described above. However, it is my understanding that the OS X version of MATLAB does not support OpenMP. Also, as I understand it, a Fortran mex file cannot access gpuArrays, and this can only be done if the mex file is C code. Am I correct in both of these understandings?
So, my overall question is what would be the best option to achieve what I am trying to do?
Thank you!