How the CI of GLM parameters are calculated?

6 views (last 30 days)
TingTing
TingTing on 8 Aug 2014
Answered: Tom Lane on 12 Aug 2014
I need to have the distributions of the coefficients estimated by glm model. I tried to find them and tested if the ones I found are the right ones by calculating the 95% CI from them and comparing those to those given by “predict” command. And the result turned out disappointing.
For example, if I use glm with gamma as distribution for response variable. The model is called mdl and the dataset is called dataset. After fitting the model, I wrote:
shape=1/mdl.Dispersion; [newf newc] = predict(mdl,dataset); scale=newf*mdl.Dispersion;
I tried to see if this is correct by calculating the 95% CI bounds for the first data point.
>> gamcdf(newc(1,2),shape,scale(1))
ans =
0.6329
>> gamcdf(newc(1,1),shape,scale(1))
ans =
0.4629
I think the different between them should be 0.95, but here 0.6329-0.4629 is obviously not.
So my understanding is wrong?
I also tried with normal distribution
>> mu=mdl.Fitted; >> variance=mdl.Dispersion; >> [newf newc] = predict(mdl,dataset); >> sigma=sqrt(variance);
>> normcdf(newc(1,2),newf(1),sigma)
ans =
0.5910
>> normcdf(newc(1,1),newf(1),sigma)
ans =
0.4152
Shouldn't the difference between the last two be 0.95? Is my understanding wrong or...?

Answers (1)

Tom Lane
Tom Lane on 12 Aug 2014
I'm not exactly following what you are trying to do. It looks like you are using the estimated standard deviation of the observed response values as a measure of uncertainty in the predicted values. That's not exactly right. Imagine you had huge amounts of data. There would still be residual standard deviation, but the predicted values would be known much more accurately than that.
Here's an illustration of this using a regular linear model:
>> load fisheriris
>> x = meas(:,1:3);
>> y = meas(:,4);
>> lm = fitlm(x,y,'linear');
>> [newf,newc] = predict(lm,meas(1,1:3))
newf =
0.2163
newc =
0.1613 0.2712
Now here's how I can calculate those values myself:
>> newx = [1;meas(1,1:3)'];
>> C = lm.CoefficientCovariance;
>> b = lm.Coefficients.Estimate;
>> b'*newx
ans =
0.2163
>> newf + [1 -1] * tinv(.025,lm.DFE) * sqrt(newx'*C*newx)
ans =
0.1613 0.2712

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!