I am considering using GPU processing for rearranging my data, which is acquired in real time via PCI-Express cards. I would like the community's help to evaluate whether a new GPU would really help me, or whether this kind of operation is simply not suited to GPU processing.
The code below sends data of various sizes to the GPU, calls the reshape command on it, and compares the time to that of the CPU. I can choose any buffer chunk size I need in order to optimize the GPU transfer/calculation ratio (which is why the test runs over various sizes). I am looking for people with CPUs nearly equivalent to mine, but with much better GPUs.
The code just clears the command window and dumps the results there, so copying and pasting all of the output would be sufficient.
My system: Core i7-2700K CPU @ 3.4 GHz (4 cores, 8 threads). GPU: Nvidia 560, 2 GB 256-bit GDDR5, 336 CUDA cores, 850 MHz core clock. Ideally, I am hoping that someone with a similar CPU and many more CUDA cores can send me their results. Using Matlab 2012a.
clc
for a = 0:4:128
    b = uint16(1:(512*4096*2*a));
    CPU_Time = 0;
    GPU_Time = 0;
    for repeat = 1:10   % Average over 10 runs.
        tStart = tic;
        RawData = reshape(b, [2,4,512,512,2,a]);
        CPU_Time = CPU_Time + toc(tStart);   % accumulate, so that /10 below is a mean

        tStartB = tic;
        RawDataGPU = gpuArray(reshape(b, [2,4,512,512,2,a]));
        GPU_Time = GPU_Time + toc(tStartB);
    end   % end repeat loop
    a
    GPU_CPU_Ratio = ( (GPU_Time/10) / (CPU_Time/10) )
end   % end main loop selecting data size
RESHAPE does not perform any operations on the data. You can check this by measuring the times for reshaping a 10x10 and a 1000x1000 matrix to, e.g., row vectors on the CPU: both take the same time, because in the first case Matlab writes a UINT64[1, 100] to the header of the variable, and in the second case a UINT64[1, 1e6]. Consequently, it does not matter whether this is done on the CPU or on the GPU, because writing 16 bytes takes practically no time.
I'm not sure whether the header of the variable is copied to the GPU at all, or whether only the data are sent.
In contrast to RESHAPE, the commands TRANSPOSE and PERMUTE do touch the data.
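The check described above could be sketched roughly as follows. The variable names and the repeat count are only illustrative assumptions, and tic/toc timings of such short operations are noisy, so the printed numbers should be read as a rough indication rather than a precise measurement.

```matlab
% Rough sketch: compare RESHAPE timings for a small and a large matrix.
small = rand(10, 10);
large = rand(1000, 1000);
nRep  = 1000;                      % illustrative repeat count

tic;
for k = 1:nRep
    smallRow = reshape(small, 1, []);
end
tSmall = toc / nRep;

tic;
for k = 1:nRep
    largeRow = reshape(large, 1, []);
end
tLarge = toc / nRep;

fprintf('reshape 10x10:     %g s per call\n', tSmall);
fprintf('reshape 1000x1000: %g s per call\n', tLarge);

% RESHAPE keeps the elements in the same linear (column-major) order.
sameOrder = isequal(largeRow(:), large(:));
```

If the two per-call times come out in the same range, that supports the statement that only the size information changes.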
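To illustrate the difference, here is a small sketch on a toy array (the array and the dimension order are chosen only for illustration): after RESHAPE the linear order of the elements is unchanged, while after PERMUTE the elements have actually been moved.

```matlab
% Toy example: RESHAPE keeps the linear element order, PERMUTE changes it.
A = reshape(uint16(1:24), [2, 3, 4]);

R = reshape(A, [4, 3, 2]);      % new size, same linear order
P = permute(A, [3, 2, 1]);      % size [4 3 2], elements rearranged

reshapeKeepsOrder = isequal(R(:), A(:));
permuteKeepsOrder = isequal(P(:), A(:));

% PERMUTE maps P(k, j, i) to A(i, j, k):
sameElement = (P(2, 1, 1) == A(1, 1, 2));
```

Because PERMUTE has to move every element, its run time grows with the array size, so it is the kind of operation where a CPU versus GPU comparison would measure real work.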
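BTW: uint16(1:(512*4096*2*a)) creates a DOUBLE vector at first and converts it to a UINT16 afterwards; all values above 65535 saturate at 65535. If the element values do not matter for the benchmark, it is more efficient to create the UINT16 array directly, e.g. zeros(1, 512*4096*2*a, 'uint16'). Note that uint16(1):uint16(512*4096*2*a) avoids the DOUBLE vector, but the end value saturates at 65535, so the result has at most 65535 elements and is too short for these sizes.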
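The classes and lengths involved can be checked directly. The sketch below uses a hypothetical size n just above the UINT16 range, and zeros(..., 'uint16') is shown only as one possible way to allocate UINT16 data without a DOUBLE intermediate, assuming the element values are irrelevant.

```matlab
% Hypothetical size, slightly larger than intmax('uint16') = 65535.
n = 70000;

viaDouble = uint16(1:n);            % DOUBLE vector first, then converted
direct    = zeros(1, n, 'uint16');  % allocated as UINT16 directly
ranged    = uint16(1):uint16(n);    % uint16(n) saturates at 65535

fprintf('viaDouble: %s, %d elements, last value %d\n', ...
    class(viaDouble), numel(viaDouble), viaDouble(end));
fprintf('direct:    %s, %d elements\n', class(direct), numel(direct));
fprintf('ranged:    %s, %d elements\n', class(ranged), numel(ranged));
```

The colon expression with UINT16 end points stops at 65535 elements, so it only matches the other two as long as n fits into the UINT16 range.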
[EDITED, minor changes, Jan]