mldivide versus least squares: X\(eye(m)) versus ( (X'X)\eye(m))*X'

Dear all,
I am fitting a polynomial to data. I construct a polynomial basis X. I use some algorithm to update Y. I could use the mldivide to obtain coefficients theta or use . But i don't know which one is more robust/accurate for my applicaiton. The system is normally overdetermined, but it might be exactly determined.
To obtain the coefficients of the polynomials I would normally do:
theta = X\Y;
However, since I have to do this repeatedly and X does not change I want to use:
%METHOD 1:
X_inv = X\eye(m);
%In each iteration:
theta = X_inv*Y;
where m is size(X,1).This should save computation time of the mldivide.
Now my questions is, for the Minimization of Squared Errors sometimes people also use . In that case I should define:
%METHOD 2:
X_regr = ( (X'*X)\eye(m) )*X';
%In each iteration:
theta = X_regr*Y;
Should one method be preferred to the other (when overdetermined or exactly determined)? Or is that another method that is even better?

 Accepted Answer

However, since I have to do this repeatedly and X does not change
If that's the case you should organize the different Y into the columns of a single matrix and do
X\[Y1,Y2,Y3,...,Yn]
Doing (X'*X)\ is not as numerically well-conditioned as X\, because the operation X'*X basically squares the condition number of X. Nevertheless, if X has many rows and few columns, X'*X\ will often run faster, and sometimes people will give priority to speed, especially if the cond(X) is known to be good.

3 Comments

Thanks Matt!! In my case X will in general have many more rows, but this step is not the most critical for speed so i'll just stick with X_inv= X\eye(m). It is an iterative process so your suggestion to stack Y in colums does not work in my case.
But I am still curious why you would prefer to stacks Y's in columns. Is that more accurate than to use X_inv= X\eye(m) repeatedly? (It would be slower, right?)
It depends in part on how many Y_i you have. If n is less then m, then the computation of X\eye(m) alone requires more inversions than X\[Y1,Y2,Y3,...,Yn].
N=2000;
Y=rand(N,N/2);
X=rand(N,N);
tic;
X_inv=(X\eye(N));
X_inv*Y;
toc
Elapsed time is 0.596263 seconds.
tic
X\Y;
toc
Elapsed time is 0.175316 seconds.
Wow, this helps a lot! I didn't expect this result, because I didn't completely understand. But it makes sense now, and it speeds up my code quite a bit!
Again, many thanks!

Sign in to comment.

More Answers (0)

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!