How do I use polyfitn? I am quite unclear about its usage.

Question

0 votes

The polyfitn toolbox is very well programmed, and the demo file uses an example dataset with two continuously varying independent variables. My data, however, is arranged as shown below. How should the input be provided to the toolbox, and can a dataset of this kind be regressed using polyfitn?

output input1 input2 input3
     0.1    3       10
     0.2    3       10
     0.3    1       10
     0.4    1       10
     0.1    3       20
     0.2    3       20
     0.5    1       20
     0.6    1       20

The output varies nonlinearly with input1, but it also depends on input2 and input3. How can polynomial regression be done for such a dataset? Help of any kind would be greatly appreciated. Thank you all for the support.


Answer 1

John D'Errico on 27 Apr 2016

Edited: John D'Errico on 27 Apr 2016

3 votes

You provide the input pretty much the same way as for two variables. Assuming these vectors are column vectors, then:

mdl = polyfitn([input1,input2,input3],output,1);

If your data is in an array, then this will suffice:

mdl = polyfitn(data(:,2:4),data(:,1),1);

High order models will just get you in trouble, so don't go too high. I explicitly specified a linear model here for a good reason.

In fact, for the example data that you have shown, anything past a linear model in the second and third parameters will cause failure of the model, since quadratic terms in those parameters will be impossible to estimate. You could have a quadratic term in the first parameter though. You would need to explicitly state the terms to be used in the model then.
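The reason quadratic terms cannot be estimated for a two-level predictor can be seen in a small sketch. It uses only base MATLAB (no polyfitn), and the vector x is an illustrative stand-in for a column that takes just the values 3 and 1, as in the example data. With only two distinct values, the squared column is an exact linear combination of the constant and linear columns, so the design matrix loses rank.

```matlab
% A predictor with only two distinct levels (3 and 1), as in the example data
x = [3 3 1 1 3 3 1 1]';

% Design matrix for constant, linear, and quadratic terms in x
A = [ones(size(x)), x, x.^2];

% Three columns, but only two of them are linearly independent
r = rank(A);
fprintf('columns: %d, rank: %d\n', size(A, 2), r);

% The quadratic column is reproduced exactly by the other two:
% x.^2 = 4*x - 3 holds at both x = 1 and x = 3
resid = norm(x.^2 - (4*x - 3));
fprintf('residual of x.^2 = 4*x - 3: %g\n', resid);
```

A least-squares fit on such a matrix has no unique solution, which is why the quadratic term has to be left out for that predictor.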

The following is the highest-order set of terms that will not cause an error for the above data:

mdl = polyfitn([input1,input2,input3],output, ...
    {'constant','x1','x1^2','x1^3','x2','x3'});
mdl
mdl = 
    ModelTerms: [6x3 double]
  Coefficients: [-10.536 79.957 -186.3 157.69 0.7055 0.14157]
  ParameterVar: [18.256 1398.4 18428 18532 1.2745 0.0097654]
  ParameterStd: [4.2727 37.395 135.75 136.13 1.1289 0.09882]
           DoF: 2
             p: [0.13254 0.16594 0.30359 0.36635 0.59581 0.28834]
            R2: 0.95897
    AdjustedR2: 0.85641
          RMSE: 0.53589
      VarNames: {'x1'  'x2'  'x3'}

The model produced is...

vpa(polyn2sym(mdl),5)
ans =
157.69*x1^3 - 186.3*x1^2 + 79.957*x1 + 0.7055*x2 + 0.14157*x3 - 10.536

Even here, note the large coefficients on some of the terms. That immediately suggests this model is a poor one. Of course, I know that your data was just made up. mdl.p also indicates that none of the terms in this model are terribly well estimated. Terms that are significant in the model should have mdl.p very near zero for those coefficients.

For example:

mdl = polyfitn(rand(1000,1),rand(1000,1),4)
mdl = 
    ModelTerms: [5x1 double]
  Coefficients: [0.95506 -2.2363 1.6668 -0.45408 0.55207]
  ParameterVar: [3.7645 15.461 6.9479 0.42831 0.0022732]
  ParameterStd: [1.9402 3.9321 2.6359 0.65446 0.047678]
           DoF: 995
             p: [0.62266 0.56966 0.5273 0.48796 3.5389e-29]
            R2: 0.0021445
    AdjustedR2: -0.001867
          RMSE: 0.2884
      VarNames: {'X1'}

Here the correct model for the above process is a constant only! All terms except the constant (the last coefficient, with p = 3.5389e-29) are statistically indistinguishable from zero, whereas the constant term is clearly significant.
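As a rough illustration of reading mdl.p, the sketch below flags which coefficients pass a significance threshold. The numbers are copied from the random-data output above. The 0.05 cutoff and the variable names are assumptions for illustration, not something polyfitn prescribes.

```matlab
% Coefficients and p-values copied from the random-data fit shown above
coef = [0.95506 -2.2363 1.6668 -0.45408 0.55207];
p    = [0.62266 0.56966 0.5273 0.48796 3.5389e-29];

% Assumed significance threshold (a common convention, not a polyfitn setting)
alpha = 0.05;

% Logical mask of the terms whose p-value falls below the threshold
keep = p < alpha;

% List each coefficient with its p-value and whether it would be retained
for k = 1:numel(coef)
    fprintf('coef %d: %9.5f   p = %.4g   keep = %d\n', k, coef(k), p(k), keep(k));
end
```

Only the last entry, the constant term, is flagged, which matches the conclusion that a constant-only model is appropriate for this data.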

4 Comments

John D'Errico on 28 Apr 2016

Edited: John D'Errico on 28 Apr 2016

NO!!!!!! You clearly do not understand correctly. READ MY ANSWER MORE CAREFULLY.

The model that I used there was specific for the data that you gave as an example! I thought I made that quite clear.

output input1 input2 input3
     0.1    3       10
     0.2    3       10
     0.3    1       10
     0.4    1       10
     0.1    3       20
     0.2    3       20
     0.5    1       20
     0.6    1       20

This data as you have it cannot support a higher order model than that. (I suppose I might have added an interaction term and still survived.)

You can choose any model that your heart desires. Of course, the result might be useless dreck if you use a model that is not supported by the data, but I don't see your data posted. Choosing a good model is a task that people can spend years learning to do well.

So I have no idea wherein your skepticism lies. A polynomial fit can never be better than the model that you choose and the quality of the data that you have available.

Even then of course, polynomial models have serious limits. The point is, just because a Taylor series might be valid for some function, does not mean that you can estimate hundreds of terms for a Taylor series from your data. For example, to accurately predict the values for an exponential function, thus exp(x) for large x, you may need hundreds of terms. But you cannot estimate hundreds of terms from any polynomial fitting scheme. Nor can you rationally expect to evaluate a polynomial with hundreds of terms in it in double precision arithmetic.
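To get a feel for how quickly the term count grows, here is a small base MATLAB sketch that counts how many Taylor series terms of exp(x) are needed to reach a given relative accuracy. The choice of x = 100, the 1e-10 tolerance, and the iteration cap are arbitrary assumptions for illustration.

```matlab
% Count the Taylor series terms needed to approximate exp(x) at a large x
x      = 100;        % assumed evaluation point
tol    = 1e-10;      % assumed relative accuracy target
target = exp(x);

term = 1;            % current term, x^k / factorial(k), starting at k = 0
s    = 1;            % running partial sum
k    = 0;

% Add terms until the partial sum is close enough (capped for safety)
while abs(s - target) > tol*target && k < 1000
    k    = k + 1;
    term = term*x/k; % update the term without forming x^k or k! directly
    s    = s + term;
end

fprintf('Terms needed at x = %g: %d\n', x, k + 1);
```

No regression on noisy data can pin down that many coefficients, which is the point being made about Taylor series versus fitted polynomials.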

Again, polynomials have limits. That is not the fault of the tool you use to estimate them.

Srikar on 28 Apr 2016

dataset.mat

Yes, I understood that the model was for the example data you tried to fit. In fact, my actual data is similar to the earlier example, only on a larger scale: it is 8000x4, where column 1 is the output and columns 2 to 4 are my inputs.

I have now attached the data to this comment, so hopefully I am giving you the right information. Thanks again for the help, John. I shall keep in mind the points you mentioned.

gullnaz shahzadi on 10 Apr 2019

Hi,

After applying this to data with 5 independent variables and one dependent variable, how can we predict the value of the dependent variable for a new input?

Suppose,

p = polyfitn ([x1,x2,x3,x4,x5] , y ,3) %where 3 is degree of polynomial.

Now I want to check whether this model predicts the right value or not.

How can I do this?

I tried using polyvaln(p, [a,b,c,d,e]), but the prediction is totally wrong.
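One way to check predictions, sketched here under stated assumptions, is to hold out part of the data, fit on the rest, and compare polyvaln predictions against the held-out values. The data below is synthetic and purely illustrative, the variable names and the 150/50 split are arbitrary, and the code assumes polyfitn and polyvaln from the File Exchange toolbox are on the MATLAB path.

```matlab
% Assumes polyfitn and polyvaln (File Exchange) are on the path.
% Synthetic, illustrative data: 5 independent variables, 1 dependent variable
rng(0);
n = 200;
X = rand(n, 5);
y = 1 + 2*X(:,1) - X(:,2).*X(:,3) + 0.5*X(:,4).^2 + 0.01*randn(n, 1);

% Random split into training and hold-out rows
idx   = randperm(n);
train = idx(1:150);
test  = idx(151:end);

% Fit a full third-degree model on the training rows only
p = polyfitn(X(train,:), y(train), 3);

% Predict the held-out rows; each row of the second argument is one point
ypred = polyvaln(p, X(test,:));

% Root-mean-square error on data the fit never saw
rmse = sqrt(mean((ypred - y(test)).^2));
fprintf('Hold-out RMSE: %g\n', rmse);

% Predicting a single new point: a 1-by-5 row, in the same column order
xnew = [0.2 0.4 0.6 0.8 0.5];
ynew = polyvaln(p, xnew);
fprintf('Prediction at xnew: %g\n', ynew);
```

If the hold-out error is much larger than the training RMSE reported in the model, the chosen degree is probably too high for the data, or the new point lies outside the range of the training inputs.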
