How do I use the polyfitn functions? I am quite unclear about the usage.

The polyfitn toolbox is very well programmed. The demo file uses an example dataset with two continuously varying independent variables, but my data is arranged roughly as shown below. How should I provide this input to the toolbox, and can a dataset of this kind be regressed using polyfitn?
output input1 input2 input3
0 0.1 3 10
2 0.2 3 10
3 0.3 1 10
4 0.4 1 10
0 0.1 3 20
5 0.2 3 20
6 0.5 1 20
8 0.6 1 20
The output varies nonlinearly with input1, but it also depends on input2 and input3. How can we perform polynomial regression on such a dataset? Help of any kind would be greatly appreciated. Thank you all for the support.

 Accepted Answer

You provide the input pretty much the same way as for two variables. Assuming these vectors are column vectors, then:
mdl = polyfitn([input1,input2,input3],output,1);
If your data is in an array, then this will suffice:
mdl = polyfitn(data(:,2:4),data(:,1),1);
High order models will just get you in trouble, so don't go too high. I explicitly specified a linear model here for a good reason.
In fact, for the example data that you have shown, anything past a linear model in the second and third parameters will cause the fit to fail, since quadratic terms in those parameters are impossible to estimate: input2 takes only the values 1 and 3, and input3 only 10 and 20, so on this data the square of either variable is an exact linear function of the variable itself. You could have a quadratic, or even cubic, term in the first parameter though. You would then need to state explicitly the terms to be used in the model.
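The estimability claim above can be checked without the toolbox. The following is a minimal sketch in base MATLAB, not part of polyfitn: it builds the design matrix by hand (the names A_ok and A_bad are made up for illustration) and compares its rank with the number of columns.

```matlab
% Example data from the question: [output input1 input2 input3]
data = [0 0.1 3 10; 2 0.2 3 10; 3 0.3 1 10; 4 0.4 1 10; ...
        0 0.1 3 20; 5 0.2 3 20; 6 0.5 1 20; 8 0.6 1 20];
x1 = data(:,2); x2 = data(:,3); x3 = data(:,4);

% Columns: constant, x1, x1^2, x1^3, x2, x3
A_ok = [ones(8,1), x1, x1.^2, x1.^3, x2, x3];

% Same model with an added x2^2 column
A_bad = [A_ok, x2.^2];

% A term is estimable only if its column adds to the rank.
fprintf('rank(A_ok)  = %d of %d columns\n', rank(A_ok),  size(A_ok,2));
fprintf('rank(A_bad) = %d of %d columns\n', rank(A_bad), size(A_bad,2));

% x2 only takes the values 1 and 3, so x2^2 equals 4*x2 - 3 on every row.
collinearityError = norm(x2.^2 - (4*x2 - 3));
```

The second matrix has seven columns but the same rank as the first, so least squares has no unique solution once x2^2 is included. The same argument applies to x3^2.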
The following call specifies the highest-order set of terms that the example data above can support without causing an error:
mdl = polyfitn([input1,input2,input3],output, ...
{'constant','x1','x1^2','x1^3','x2','x3'});
mdl
mdl =
ModelTerms: [6x3 double]
Coefficients: [-10.536 79.957 -186.3 157.69 0.7055 0.14157]
ParameterVar: [18.256 1398.4 18428 18532 1.2745 0.0097654]
ParameterStd: [4.2727 37.395 135.75 136.13 1.1289 0.09882]
DoF: 2
p: [0.13254 0.16594 0.30359 0.36635 0.59581 0.28834]
R2: 0.95897
AdjustedR2: 0.85641
RMSE: 0.53589
VarNames: {'x1' 'x2' 'x3'}
The model produced is...
vpa(polyn2sym(mdl),5)
ans =
157.69*x1^3 - 186.3*x1^2 + 79.957*x1 + 0.7055*x2 + 0.14157*x3 - 10.536
Even here, note the large coefficients on some of the terms. That immediately suggests this model is a poor one. Of course, I know that your data was just made up. mdl.p also indicates that none of the terms in this model are terribly well estimated. Terms that are significant in the model should have mdl.p very near zero for those coefficients.
For example:
mdl = polyfitn(rand(1000,1),rand(1000,1),4)
mdl =
ModelTerms: [5x1 double]
Coefficients: [0.95506 -2.2363 1.6668 -0.45408 0.55207]
ParameterVar: [3.7645 15.461 6.9479 0.42831 0.0022732]
ParameterStd: [1.9402 3.9321 2.6359 0.65446 0.047678]
DoF: 995
p: [0.62266 0.56966 0.5273 0.48796 3.5389e-29]
R2: 0.0021445
AdjustedR2: -0.001867
RMSE: 0.2884
VarNames: {'X1'}
Here the correct model for the above process is a constant only! All terms except the constant term are statistically indistinguishable from zero, whereas the constant term (the last coefficient, with p = 3.5389e-29) is clearly significant.
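To make the p field less of a black box, here is a hedged sketch that recomputes it from the numbers printed for the first model. It assumes p is the two-sided p-value of a t-test on each coefficient with DoF degrees of freedom. That assumption reproduces the printed values, but it is inferred from the output rather than taken from the toolbox documentation.

```matlab
% Values copied from the first model display in the answer above.
coef = [-10.536 79.957 -186.3 157.69 0.7055 0.14157];   % Coefficients
sd   = [4.2727 37.395 135.75 136.13 1.1289 0.09882];    % ParameterStd
dof  = 2;                                               % DoF

% t-statistic of each coefficient against the hypothesis "coefficient is 0"
t = coef ./ sd;

% Two-sided p-value of Student's t, written with the regularized
% incomplete beta function so no Statistics Toolbox is required.
p = betainc(dof ./ (t.^2 + dof), dof/2, 1/2);

disp(p)
```

A coefficient that is small relative to its standard deviation gives a small |t| and therefore a p far from zero, which is the situation for every term of that model.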

4 Comments

Thank you very much for the reply and for sharing your insight, John, but I am still quite skeptical about how to interpret the coefficients and use them later for evaluation and prediction on new inputs. Yes, the data I gave earlier was made up; my real dataset is 8000x4.
So, if I have understood correctly, I should try this configuration, right?
mdl = polyfitn([input1,input2,input3],output, ...
{'constant','x1','x1^2','x1^3','x2','x3'})
where I vary the degree of input1 over 1:3 and hold the rest of the inputs at degree 1.
NO!!!!!! You clearly do not understand correctly. READ MY ANSWER MORE CAREFULLY.
The model that I used there was specific for the data that you gave as an example! I thought I made that quite clear.
output input1 input2 input3
0 0.1 3 10
2 0.2 3 10
3 0.3 1 10
4 0.4 1 10
0 0.1 3 20
5 0.2 3 20
6 0.5 1 20
8 0.6 1 20
This data as you have it cannot support a higher order model than that. (I suppose I might have added an interaction term and still survived.)
You can choose any model that your heart desires. Of course, the result might be useless dreck if you use a model that is not supported by the data, but I don't see your data posted. Choosing a good model is a task that people can spend years learning to do well.
So I have no idea wherein your skepticism lies. A polynomial model can never be better than the model that you choose, and the quality of the data that you have available.
Even then of course, polynomial models have serious limits. The point is, just because a Taylor series might be valid for some function, does not mean that you can estimate hundreds of terms for a Taylor series from your data. For example, to accurately predict the values for an exponential function, thus exp(x) for large x, you may need hundreds of terms. But you cannot estimate hundreds of terms from any polynomial fitting scheme. Nor can you rationally expect to evaluate a polynomial with hundreds of terms in it in double precision arithmetic.
Again, polynomials have limits. That is not the fault of the tool you use to estimate them.
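One way to see these limits numerically is to look at the conditioning of a plain monomial design matrix as the degree grows. This is a sketch in base MATLAB on an arbitrary grid, chosen only for illustration; it does not reflect how polyfitn scales or solves the problem internally.

```matlab
% 100 equally spaced points on [0,1], a single independent variable
x = linspace(0, 1, 100)';

degrees = [3 5 10 20];
condNum = zeros(size(degrees));
for k = 1:numel(degrees)
    n = degrees(k);
    % Design matrix with columns 1, x, x^2, ..., x^n
    V = bsxfun(@power, x, 0:n);
    condNum(k) = cond(V);
end

% One row per degree: [degree, condition number]
disp([degrees; condNum]')
```

The condition number climbs steeply with the degree. Once it approaches 1/eps (about 4.5e15 in double precision), the estimated coefficients carry essentially no reliable digits, regardless of which fitting tool produced them.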
Yes, I understood that it was for the given data you were fitting. In fact, my actual data is similar to the earlier example, only on a larger scale: 8000x4, where column 1 is the output and columns 2 to 4 are my inputs.
I have now attached the data to this comment, so hopefully I am giving you the right information this time. Thanks again for the help, John. I shall keep in mind the points you mentioned.
Hi,
After applying this to data with 5 independent variables and one dependent variable, how can we predict the value of the dependent variable for a new input?
Suppose,
p = polyfitn([x1,x2,x3,x4,x5], y, 3) % where 3 is the degree of the polynomial
Now I want to check whether this model predicts the right value.
How can I do this?
I tried using polyvaln(p, [a,b,c,d,e]), but the prediction is totally wrong.
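For reference, a fitted model of this kind can also be evaluated by hand, which gives an independent check on whatever polyvaln returns. The sketch below uses the coefficients printed in the accepted answer and assumes that each row of ModelTerms holds the exponents of x1, x2 and x3 for one term, in the same order as Coefficients. The terms array here is written out by hand under that assumption, not read from a real model structure.

```matlab
% Coefficients as printed in the accepted answer, in the order
% {'constant','x1','x1^2','x1^3','x2','x3'}
coef = [-10.536 79.957 -186.3 157.69 0.7055 0.14157];

% Assumed layout of ModelTerms: one row per term, one column per variable,
% each entry the exponent of that variable in that term.
terms = [0 0 0; 1 0 0; 2 0 0; 3 0 0; 0 1 0; 0 0 1];

% New inputs, one row per point, columns in the same order used for fitting
Xnew = [0.1 3 10; 0.1 3 20];

yhat = zeros(size(Xnew,1), 1);
for k = 1:numel(coef)
    % Value of term k at every row: product over variables of x_j^exponent_j
    yhat = yhat + coef(k) * prod(bsxfun(@power, Xnew, terms(k,:)), 2);
end

disp(yhat)
```

If a hand evaluation like this agrees with polyvaln but both are far from the measured values, the fitted model is the issue rather than the evaluation. Two things worth checking are that the new input is passed as a row with its columns in the same order as in the fit, and that it lies inside the range of the training data, since a cubic can behave badly when extrapolated.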
