How are important variables identified in the Partial Least Squares Regression function PLSREGRESS?

Question

MathWorks Support Team on 1 Feb 2019

Commented: Pat Williamson on 15 May 2023

Accepted Answer: MathWorks Support Team

I am using the PLSREGRESS function in one of my applications to identify important variables in my data sets.

For another program, I need to know how this function identifies important variables.

Answer 1

MathWorks Support Team on 24 Feb 2022

Edited: MathWorks Support Team on 3 Mar 2022

Within the Partial Least Squares (PLS) regression calculation, the PLS projection finds the components that maximize the covariance between X and Y. To extract NCOMP components, PLSREGRESS first computes the cross-covariance between the centered X and Y. It then decomposes that matrix and uses the resulting singular vectors to project X and Y.

Let the singular value decomposition of the cross-covariance matrix (called Cov here to avoid shadowing the MATLAB function cov) be

[U,S,V] = svd(Cov)

where U is the matrix of left singular vectors, S is the diagonal matrix of singular values, and V is the matrix of right singular vectors. PLSREGRESS iterates the following pseudocode:

for each of the NCOMP components
    compute the SVD of the current cross-covariance of X and Y
    project X onto the column of U with the largest singular value to get the X scores
    project Y onto the matching column of V to get the Y scores
    deflate the cross-covariance so the next component is orthogonal to the earlier ones
end
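The loop above can be sketched in MATLAB as follows. This is a minimal illustration under stated assumptions, not the shipped PLSREGRESS source: the data are synthetic, every variable name (X0, Y0, C, Vb, and so on) is chosen for this example, and the orthogonalization is a single Gram-Schmidt pass, which may differ from what the toolbox does.

```matlab
% Sketch of a SIMPLS-style iteration on synthetic data (illustrative only).
n = 50; p = 6; ncomp = 3;                     % hypothetical problem sizes
X = randn(n,p);
Y = X(:,1) - 2*X(:,3) + 0.1*randn(n,1);       % single synthetic response

% Center both blocks before fitting.
X0 = bsxfun(@minus, X, mean(X,1));
Y0 = bsxfun(@minus, Y, mean(Y,1));

XL = zeros(p,ncomp);                          % X loadings
YL = zeros(size(Y,2),ncomp);                  % Y loadings
XS = zeros(n,ncomp);                          % X scores
W  = zeros(p,ncomp);                          % weights, so that XS = X0*W
Vb = zeros(p,ncomp);                          % orthonormal basis of the X loadings

C = X0' * Y0;                                 % unnormalized cross-covariance
for a = 1:ncomp
    [U,~,~] = svd(C,'econ');
    r = U(:,1);                               % left singular vector, largest singular value
    t = X0 * r;                               % project X to get the scores
    normt = norm(t);
    t = t / normt;                            % scale the scores to unit length
    XS(:,a) = t;
    XL(:,a) = X0' * t;
    YL(:,a) = Y0' * t;
    W(:,a)  = r / normt;

    % Orthogonalize the new loading against the earlier ones, then
    % deflate C so the next weight vector is orthogonal to all loadings.
    v = XL(:,a);
    v = v - Vb(:,1:a-1) * (Vb(:,1:a-1)' * v);
    v = v / norm(v);
    Vb(:,a) = v;
    C = C - v * (v' * C);
end

beta = W * YL';                               % coefficients for the centered data
```

Because each new weight vector is orthogonal to the earlier loadings, the columns of XS come out orthonormal, which can be checked with norm(XS'*XS - eye(ncomp)).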

There are some additional steps for orthogonalization and centering, but the main algorithm is SIMPLS, which is cited in the References section of the PLSREGRESS documentation.

Please note that the implementation of the “simpls” function can be found in PLSREGRESS.m.

As for your other program, you might be looking for the calculation of the "Variable Importance in Projection" (VIP) scores, which estimate the importance of each variable. They can be easily obtained from the outputs of PLSREGRESS as this example illustrates:

% Load data on near infrared (NIR) spectral intensities of 60 samples of gasoline at 401 wavelengths, and their octane ratings.
load spectra
X = NIR;
Y = octane;
% Perform PLS regression with ten components.
NCOMP = 10;
[XL,YL,XS,YS,beta,pctvar,mse,stats] = plsregress(X,Y,NCOMP);
% Calculate normalized PLS weights
W0 = bsxfun(@rdivide,stats.W,sqrt(sum(stats.W.^2,1)));
% Calculate the product of summed squares of XS and YL
sumSq = sum(XS.^2,1).*sum(YL.^2,1);
% Calculate VIP scores for NCOMP components
vipScores = sqrt(size(XL,1) * sum(bsxfun(@times,sumSq,W0.^2),2) ./ sum(sumSq,2));
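As a hedged follow-up, the same VIP calculation can be exercised on stand-in inputs to show one property of the formula: because each column of W0 has unit norm, the squared VIP scores always average to 1 across predictors. The inputs below are random placeholders with the same shapes as the PLSREGRESS outputs, not real data, and the "VIP of at least 1" cutoff is a commonly used rule of thumb assumed here, not something PLSREGRESS applies.

```matlab
% Stand-in inputs shaped like PLSREGRESS outputs (random values, illustration only).
p = 8; n = 30; ncomp = 3;                     % hypothetical sizes
W  = randn(p,ncomp);                          % plays the role of stats.W
XS = randn(n,ncomp);                          % plays the role of XS
YL = randn(1,ncomp);                          % plays the role of YL

% Same three steps as in the example above.
W0    = bsxfun(@rdivide, W, sqrt(sum(W.^2,1)));   % unit-norm weight columns
sumSq = sum(XS.^2,1) .* sum(YL.^2,1);             % per-component weighting
vipScores = sqrt(p * sum(bsxfun(@times,sumSq,W0.^2),2) ./ sum(sumSq,2));

% The squared VIP scores average to 1 by construction.
meanSqVip = mean(vipScores.^2);

% Rule-of-thumb selection (an assumption, not part of PLSREGRESS):
% flag predictors whose VIP score is at least 1.
importantIdx = find(vipScores >= 1);
```

With real data, importantIdx would index the columns of X, for example the wavelengths in the NIR spectra, that contribute most to the fitted components.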
 

1 Comment

Pat Williamson on 15 May 2023

Hi @Reza Adhitama,

If you are still experiencing this issue, please consider submitting a Technical Support case. We will be happy to help you out. You can do so at the following location:

https://www.mathworks.com/support/contact_us.html
