polyfit along a given dimension

The mean function allows the user to specify a dimension along which to calculate the mean. Can I do something similar with polyfit? I have a big 3D data set. It's essentially a stack of gridded values that have evolved through time, and I'd like to make a map of the linear trend without using a loop.

Accepted Answer

Chad Greene on 25 Apr 2014
Matt J's solution got me started on the right path. As it turns out, the bsxfun command was unnecessary. I've posted my solution on the file exchange for anyone else who wishes to use it. Thanks to all for your help.
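For anyone landing here, the gist of the vectorized fit, as a minimal sketch assuming time runs along the first dimension (this is not necessarily the exact File Exchange code):
t = (1:10)';                        % sample times (assumed)
data3D = rand(10,3,4);              % fake data, time along dim 1
[m,n,p] = size(data3D);
data2D = reshape(data3D,m,[]);      % one column per grid point
V = [t, ones(m,1)];                 % [x, 1] design matrix built directly
coefficients = V\data2D;            % least-squares fit, all columns at once
slopes = reshape(coefficients(1,:),n,p);   % map of linear trends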
  3 Comments
Chad Greene on 28 Apr 2014
Matt, that's brilliant, thank you! I've updated the code to make it useful for any ND case. Thanks again for all your suggestions and thoughtful feedback.
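The ND generalization might look something like this (a hypothetical sketch with a made-up function name, not the actual File Exchange code):
function slopes = trendsketch(A,t,dim)
% TRENDSKETCH  Hypothetical sketch of an ND linear-trend map along dim.
% Illustrative only; save as trendsketch.m to run.
    order = [dim, setdiff(1:ndims(A),dim)];      % move dim to the front
    B = reshape(permute(A,order),numel(t),[]);   % time x everything-else
    coefficients = [t(:), ones(numel(t),1)]\B;   % least-squares fit
    sz = size(A); sz(dim) = [];                  % remaining dimensions
    slopes = reshape(coefficients(1,:),[sz 1]);  % map of slopes
end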
Shakir Hussain on 28 Mar 2019
Thank you for this awesome code.


More Answers (3)

Matt J on 3 Apr 2014
Edited: Matt J on 3 Apr 2014
Assuming you only want the fit and don't want any of the error analysis output that polyfit normally gives, you can do things like
[m,n,p] = size(data3D);
data2D = reshape(data3D,m,[]);           % unfold pages into columns
V = bsxfun(@power,(0:m-1).'/m,[1,0]);    % [x, 1] Vandermonde matrix
fit = reshape(V*(V\data2D),m,n,p);       % linear fit along dim=1
And, of course, you can permute() the data3D array to achieve the same along other dims.
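For example, a sketch of the same fit along dimension 3, assuming the same variable names:
data3D = rand(3,4,10);                   % fake data, time along dim 3
B = permute(data3D,[3 1 2]);             % move time to dim 1
[m,n,p] = size(B);
V = bsxfun(@power,(0:m-1).'/m,[1,0]);    % [x, 1] Vandermonde matrix
fitB = reshape(V*(V\reshape(B,m,[])),m,n,p);
fit3 = ipermute(fitB,[3 1 2]);           % back to the original layout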
  4 Comments
Chad Greene on 22 Apr 2014
Hi Matt,
Thanks for helping me with this problem. I suspect this may be the right way to go, but I'm running into a reshape error at the slopes line. No matter how I permute, I get "Product of known dimensions not divisible into total number of elements."
I've read the bsxfun documentation, but it's a rather thin document that doesn't adequately convey what this function does or how it does it. Can you explain bsxfun in the context of this problem? (A short illustration of bsxfun follows the code below.) Here is some artificial data and the only method I know for solving this problem:
t = (1:10)'; % represents time
A = rand(3,4,length(t)); % some fake data
A(1,1,:) = 1*t + 5; % trend (slope) is one unit per unit time at location 1,1
A(3,1,:) = 7; % no trend at location 3,1
A(3,4,:) = t/2 - 1; % trend is one half unit per unit time at 3,4
A(1,4,:) = -3*t + 3; % trend is negative at location 1,4
% Here's how the field of data evolves through time:
for n = 1:length(t)
    imagesc(squeeze(A(:,:,n)))
    colorbar
    caxis([-30 15])
    pause(.5)
end
% Here's an inefficient way to create a map of trends:
trendmap = NaN(3,4);
for rown = 1:3
    for coln = 1:4
        P = polyfit(t,squeeze(A(rown,coln,:)),1);
        trendmap(rown,coln) = P(1);
    end
end
figure
imagesc(trendmap)
colorbar
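For readers puzzled by the bsxfun question above, a small illustration of how it builds the Vandermonde matrix in Matt's answer:
% bsxfun expands singleton dimensions: each element of the column (0:4)'
% is raised to each power in the row [1 0], giving an m-by-2 matrix.
V = bsxfun(@power,(0:4)',[1 0])
% V =
%      0     1
%      1     1
%      2     1
%      3     1
%      4     1
% Column 1 is x^1 and column 2 is x^0 = 1, i.e. the [x, 1] design matrix.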
Chad Greene on 22 Apr 2014
Aha, this seems to do it:
A = permute(A,[3 1 2]);                 % time becomes dimension 1
[m,n,p] = size(A);
data2D = reshape(A,m,[]);
V = bsxfun(@power,(0:m-1)',[1,0]);      % edited to not divide by m
coefficients = V\data2D;
slopes = reshape(coefficients(1,:),n,p)
The slopes matrix now matches the trendmap matrix obtained using the nested loops above. This is much faster, but I see that it comes at a cost: it assumes the data frames are taken at evenly spaced times. Is there a way to change this to allow for a day or two of missing data?
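One way to handle missing frames, sketched under the assumption that the actual sample times are available in a vector t, is to build the design matrix from t itself instead of the frame index:
t = [1 2 3 5 6 7 8 9 10 12]';      % e.g. days 4 and 11 are missing
A = rand(3,4,length(t));           % fake data
B = permute(A,[3 1 2]);            % time first
[m,n,p] = size(B);
V = [t, ones(m,1)];                % powers of t, not of the frame index
coefficients = V\reshape(B,m,[]);
slopes = reshape(coefficients(1,:),n,p)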



Image Analyst on 3 Apr 2014
You can make a visualization that is a movie where you go through the stack slice by slice. I'm not sure what your trend is. Is it the mean value of each slice? Then maybe you can take the mean of each slice with mean2(), fit it to a quadratic or whatever with polyfit(), and then plot it as a 1D curve with plot(). I'm not really sure what the input to polyfit would be. What are you regressing? The slice means? Can you explain more?
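If that's the idea, a rough sketch (all data sizes here are made up; mean2 is from the Image Processing Toolbox):
A = rand(180,360,100);                    % fake stack of frames
sliceMeans = zeros(size(A,3),1);
for k = 1:size(A,3)
    sliceMeans(k) = mean2(A(:,:,k));      % mean of each slice
end
P = polyfit((1:numel(sliceMeans))',sliceMeans,1);   % linear trend of means
plot(sliceMeans), hold on
plot(polyval(P,1:numel(sliceMeans)))      % overlay the fitted line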
  3 Comments
Image Analyst on 19 Apr 2014
What is windData? Is that the 3D array? So you have 180 rows by 360 columns by 12600 frames? OK, fine. But I still don't know what the linear trend is. I'm just going to have to repeat my questions. Trend of what? Is it the trend of something as a function of frame number? If so, of what? The mean? The variance? If so, get the mean and variance of each frame into an array and then run polyfit on that.
Chad Greene on 22 Apr 2014
Hi, I'm sorry for being unclear. My data set is gridded by latitude and longitude (m and n), and the data evolve through time. I would like to make a map of how the wind has changed at each location over time. A map of the linear least squares trend calculated for each grid box would highlight where winds have become stronger or weaker in recent decades. For each latitude and longitude, all I need is the linear least squares trend, i.e., the slope given by polyfit. Here is an inefficient way to get the trend map:
t = (1:10)'; % represents time
A = rand(3,4,length(t)); % some fake data
A(1,1,:) = 1*t + 5; % trend (slope) is one unit per unit time at location 1,1
A(3,1,:) = 7; % no trend at location 3,1
A(3,4,:) = t/2 - 1; % trend is one half unit per unit time at 3,4
A(1,4,:) = -3*t + 3; % trend is negative at location 1,4
% Here's how the field of data evolves through time:
for n = 1:length(t)
    imagesc(squeeze(A(:,:,n)))
    colorbar
    caxis([-30 15])
    pause(.5)
end
% Here's my inefficient way to create a map of trends:
trendmap = NaN(3,4); % preallocation
for rown = 1:3
    for coln = 1:4
        P = polyfit(t,squeeze(A(rown,coln,:)),1);
        trendmap(rown,coln) = P(1);
    end
end
figure
imagesc(trendmap)
colorbar
Is there a way to take out those nested loops?



Paul Shepherd on 18 Apr 2014
Hi Chad! I tried to send you an e-mail to make contact as well. Hopefully you got my contact info.
So, if you want a poor man's linear fit, you might take the diff() along the appropriate dimension and then consider the mean and variance of the diff. Would that be useful?
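That poor man's fit might look like the following sketch (assuming time runs along dimension 3; see John D'Errico's caveat below about its accuracy):
A = rand(3,4,10);                          % fake data, time along dim 3
slopeApprox = mean(diff(A,1,3),3);         % mean frame-to-frame change
spreadMap = var(diff(A,1,3),0,3);          % variance of those changes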
I don't know how often you are sampling, but it seems like the linear fit along the time axis (12600 samples) might not be the most interesting thing to look at. What if you down-sampled this to look at the average value over a longer sample period? It's been a while since I looked at decimation filters, but this might let you get the time dimension down quite a bit. Unfortunately, the filter() function works along a single dimension at a time, so you might need to code up an algorithm to scale things up to work with the MxN timeslice as a single "data point".
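A simple block-averaging sketch of that down-sampling idea (the block length blk and the array sizes are assumptions):
A = rand(18,36,120);                       % fake data, time along dim 3
blk = 10;                                  % frames per block (assumption)
p2 = floor(size(A,3)/blk)*blk;             % trim to a whole number of blocks
Ablocks = reshape(A(:,:,1:p2),size(A,1),size(A,2),blk,[]);
Adown = squeeze(mean(Ablocks,3));          % 18 x 36 x 12 block means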
Finally, have you looked at any parallel processing options to maybe make the brute-force method less problematic?
Hope this helps, Paul Shepherd
  2 Comments
Paul Shepherd on 18 Apr 2014
I think you can significantly improve the processing time of a full polyfit by reordering your indices. A little experiment:
x = [1, 2; 3, 4];
y = [5, 6; 7, 8];
z = zeros(2, 2, 2);
z(:,:,1) = x;
z(:,:,2) = y;
x(1:end)   % elements come out in column-major (memory) order
z(1:end)   % the same column-major order extends to the 3rd dimension
Doing this, you can see that MATLAB stores arrays column-wise, even for arrays with more than two dimensions. Try reordering your data so that the matrix is not MxNxP with the fit running across P, but PxMxN, so that the fit dimension comes first and is contiguous in memory; then let the outer for loop count N and the inner for loop count M. That should give optimal memory utilization in your program and might lead to a significant speedup in your algorithm.
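A sketch of the reordered loop, reusing the artificial data from earlier in the thread:
% Permute so the fit dimension is first (column-major friendly), then
% loop with the fastest-varying index innermost.
t = (1:10)';
A = rand(3,4,length(t));
B = permute(A,[3 1 2]);                     % time first: 10 x 3 x 4
trendmap = NaN(size(B,2),size(B,3));
for coln = 1:size(B,3)                      % outer loop over last dim
    for rown = 1:size(B,2)
        P = polyfit(t,B(:,rown,coln),1);    % contiguous column access
        trendmap(rown,coln) = P(1);
    end
end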
John D'Errico on 22 Apr 2014
Edited: John D'Errico on 22 Apr 2014
Paul, the mean of diff is a very poor man's solution when there are better answers available.
For example, consider this simple case:
y = [0:5, -3:1];
>> polyfit(1:length(y),y,1)
ans =
-0.22727 2.2727
>> mean(diff(y))
ans =
0.1
Note that the mean of the diffs gives a slope whose sign is not even the same as the slope given by polyfit.
Sorry, but I'd avoid this idea. Again, there are trivial solutions to make the polyfit no more difficult than a matrix multiply.
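For instance, on the same y, the fit reduces to a backslash solve (a sketch):
y = [0:5, -3:1]';
x = (1:numel(y))';
V = [x, ones(size(x))];        % [x, 1] design matrix
p = V\y                        % p(1), p(2) match polyfit(x,y,1)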

