Why is there a difference in the output of PLSREGRESS when using the 'cv' option with an integer 'k' and using a cvpartition object of type kfold of the same integer 'k' in Statistics Toolbox 8.1 (R2012b)?

I am computing a partial least-squares regression using the PLSREGRESS function in two different ways:
1. Passing the 'cv' option directly with an integer k = 10:
k = 10;
[XL, YL, XS, YS, BETA, PCTVAR, MSE] = plsregress(X, Y, ncomp, 'cv', k);
2. Creating a cvpartition of type k-fold with the same integer 10:
k = 10;
C = cvpartition(size(X, 1), 'kfold', k);  % number of observations (rows of X); length(X) would return the largest dimension
[XL, YL, XS, YS, BETA, PCTVAR, MSE] = plsregress(X, Y, ncomp, 'cv', C);
However, the results I obtain in MSE are different. Why is that?

Accepted Answer

MathWorks Support Team on 14 Jan 2013
Using the 'cv' option with an integer 'k' and using a cvpartition object of type kfold with the same 'k' are conceptually the same. Both commands compute the mean-squared error, MSE, using k-fold cross-validation.
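However, the cross-validated MSE values from two runs of k-fold cross-validation may not be exactly the same, because the data will probably be partitioned into the k folds in different ways. When 'cv' is an integer, PLSREGRESS draws a new random partition on each call. The cvpartition object C in the question is itself a random draw, so its folds generally differ from the ones PLSREGRESS generates internally, which is why the two MSE results differ. In contrast, passing the same cvpartition object of type k-fold into PLSREGRESS reproduces the same MSE values every time, because the fold assignment is fixed when the object is created.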
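The point above can be checked with a minimal sketch. The data here are synthetic and purely illustrative (not from the original question), and the sketch assumes that PLSREGRESS draws its internal partition from the global random stream, so that resetting the seed with rng before each call reproduces the same folds:

```matlab
% Synthetic data for illustration only (assumed, not from the question).
rng(0);
X = randn(60, 8);
Y = X(:, 1:3) * [1; -2; 0.5] + 0.1 * randn(60, 1);
ncomp = 3;
k = 10;

% Integer k: a new random partition is drawn on every call, so reset the
% seed before each call to get the same folds (assumes the global stream).
rng(1);
[~, ~, ~, ~, ~, ~, mseA] = plsregress(X, Y, ncomp, 'cv', k);
rng(1);
[~, ~, ~, ~, ~, ~, mseB] = plsregress(X, Y, ncomp, 'cv', k);

% Explicit partition: the folds are fixed once and reused by both calls.
C = cvpartition(size(X, 1), 'kfold', k);
[~, ~, ~, ~, ~, ~, mseC1] = plsregress(X, Y, ncomp, 'cv', C);
[~, ~, ~, ~, ~, ~, mseC2] = plsregress(X, Y, ncomp, 'cv', C);

% Largest absolute differences within each pair of runs.
diffSeeded    = max(abs(mseA(:) - mseB(:)));
diffPartition = max(abs(mseC1(:) - mseC2(:)));
```

Both differences should be zero up to round-off. Comparing mseA with mseC1 will generally show a nonzero difference, since the two runs use different fold assignments.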
In the same way, using the 'cv' option with 'resubstitution' and using a cvpartition object created with 'resubstitution' give the same results. In both cases, PLSREGRESS uses X and Y both to fit the model and to estimate the mean-squared errors, without cross-validation.
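A short sketch of the resubstitution case, again on synthetic data that is assumed for illustration only. It follows the statement above that PLSREGRESS accepts a cvpartition object created with 'resubstitution'; the two results are compared with a small tolerance rather than for exact equality, since the two code paths may differ in round-off:

```matlab
% Synthetic data for illustration only (assumed, not from the question).
rng(0);
X = randn(60, 8);
Y = X(:, 1:3) * [1; -2; 0.5] + 0.1 * randn(60, 1);
ncomp = 3;

% Resubstitution requested by name.
[~, ~, ~, ~, ~, ~, mseR1] = plsregress(X, Y, ncomp, 'cv', 'resubstitution');

% Resubstitution requested through a cvpartition object.
Cr = cvpartition(size(X, 1), 'resubstitution');
[~, ~, ~, ~, ~, ~, mseR2] = plsregress(X, Y, ncomp, 'cv', Cr);

% No random partitioning is involved, so the two should agree.
diffResub = max(abs(mseR1(:) - mseR2(:)));
```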

Release

R2012b