Thread Subject:
Confidence interval calculation (for multcompare)

Subject: Confidence interval calculation (for multcompare)

From: Russell

Date: 18 Apr, 2013 20:27:18

Message: 1 of 6

In statistics, confidence intervals are usually calculated from the standard deviation and the sample size. In the multiple comparison, however, the confidence intervals are always the same width, regardless of the standard deviation of each individual group. Example code is below:

A = [1 5 3; 1.1 500 4; 1.08 10000 3; 0.95 274 6; 0.99 2457 5; 1.05 54 5; 1.02 326 4]
[p6,anovatab6,stat6] = anova1(A);
multcompare(stat6,'alpha',0.05);

I would expect the confidence intervals displayed in this multcompare figure to depend on the standard deviation within each group, but they appear to follow the standard deviation of the data as a whole.

In other words, the confidence intervals do not reflect the per-group values given by

std(A)

but seem to be calculated from the entire population.

My question is: how are the confidence intervals in MATLAB's multcompare function calculated, and why are they calculated that way?

Russell

Subject: Confidence interval calculation (for multcompare)

From: Tyler

Date: 19 Apr, 2013 01:07:10

Message: 2 of 6

I'd just like to bump this question for visibility - it's an incredibly frustrating problem not to have any documentation on how these error bars are calculated. It means I absolutely could not publish these figures or even show them to my PI, seeing as I can't explain how they're generated.

The question of how they are generated is all over the web, but never answered. The error bars are not standard error, they aren't the standard deviation or variance, and they don't seem to come out of the confidence intervals from Tukey-Kramer. Any ideas anybody?



Subject: Confidence interval calculation (for multcompare)

From: Tyler

Date: 19 Apr, 2013 01:09:10

Message: 3 of 6

Hoping to merge this thread:

http://www.mathworks.com/matlabcentral/newsreader/view_thread/255742


Subject: Confidence interval calculation (for multcompare)

From: Tom Lane

Date: 19 Apr, 2013 13:59:28

Message: 4 of 6

> In statistics, usually confidence intervals are calculated by the standard
> deviation over the sample size. However, in the multiple comparison,
> regardless of what the standard deviation is for each of the individual
> groups, the confidence intervals are always the same.
...
> The confidence intervals displayed in this multcompare figure should be as
> a function of the standard deviation in each of the groups, but it appears
> that they follow the standard deviation of the whole.
> In other words the confidence intervals are not a reflection of these
> values for each of the groups

Russell, the calculations are based on the anova1 fit. This anova model
supposes that the variance is the same across groups, and tests whether the
means may differ. The variance is estimated by pooling the variances of the
different groups.
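
A minimal sketch of that pooling, reusing the matrix A from the original post. It assumes equal group sizes (true here, 7 rows per column), in which case the pooled variance is simply the average of the per-column variances. The variable names are illustrative, and the comparison relies on anova1 reporting the root mean squared error in the s field of its stats output.

```matlab
A = [1 5 3; 1.1 500 4; 1.08 10000 3; 0.95 274 6; 0.99 2457 5; 1.05 54 5; 1.02 326 4];

% Fit the one-way ANOVA without opening the table and boxplot figures
[~, ~, stats] = anova1(A, [], 'off');

groupVar  = var(A);           % per-group (column) sample variances
pooledVar = mean(groupVar);   % pooled variance; a plain mean only works for equal group sizes
pooledStd = sqrt(pooledVar);

% stats.s is the root mean squared error of the fit; it should match the
% pooled standard deviation up to rounding error
fprintf('pooled std = %g, stats.s = %g\n', pooledStd, stats.s);
```

With unequal group sizes the pooled variance would instead be a weighted average, with each group's variance weighted by its degrees of freedom (n - 1).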

A reference for the plotted intervals is given inside the file. I'll try to
make sure it's featured more prominently in the future.

Reference: Y. Hochberg and A.C. Tamhane, "Multiple Comparison Procedures,"
Wiley, New York, 1987.
  See 3.32, p. 98.

-- Tom

Subject: Confidence interval calculation (for multcompare)

From: Chen Xing

Date: 25 Jun, 2013 17:52:12

Message: 5 of 6

I looked up Hochberg and Tamhane (1987), as recommended by Tom Lane, and posted scanned images of the relevant pages here (the Adobe Reader plugin needs to be enabled in your browser):
http://www.scintillatingxing.com/think/matlab_mcp_references.html

I gather that the 'error bars' are not used to describe the variance associated with each data point (based on taking the mean across observations); rather, they represent an 'equal width interval' that is applied across the entire 'family' of comparisons and which is used solely for statistical analysis.

The interval size is common to all the observed data points and is used as a criterion when evaluating pairwise differences after the ANOVA: if the intervals for two observed data points fail to overlap in a particular pairwise comparison, then one concludes that a significant difference is present. This value, which remains fixed across comparisons, is what you see in the plots generated by the 'multcompare' function.

The actual calculation of interval width, as performed in MATLAB, is based on a modified ('improved') version of the Tukey-Kramer procedure (the TK-procedure is popular because of its simplicity and its 'nearly accurate' control of the FWE [familywise error rate]). I haven't burrowed deep enough to fully understand the math behind the equations. The qualitative difference is that the error bars in published images typically describe the variance associated with individual data points, whereas the error bars plotted by 'multcompare' are calculated for the sake of statistical analysis and are based on pooling of data across treatments and groups.
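
A rough way to see the common interval width numerically, again with the matrix A from the original post. This is only a sketch under the assumption of equal group sizes and the default Tukey-Kramer critical value; the variable names are illustrative. It reads the widths back from the outputs of multcompare rather than recomputing the critical value.

```matlab
A = [1 5 3; 1.1 500 4; 1.08 10000 3; 0.95 274 6; 0.99 2457 5; 1.05 54 5; 1.02 326 4];
[~, ~, stats] = anova1(A, [], 'off');

% c: one row per pair,  [group1 group2 lowerCI difference upperCI ...]
% m: one row per group, [mean standardError]
[c, m] = multcompare(stats, 'alpha', 0.05, 'display', 'off');

% Half-width of each confidence interval on a difference of two means
pairHalfWidth = (c(:,5) - c(:,3)) / 2;

% Two group intervals of half-width w are disjoint exactly when the means
% differ by more than 2*w, so with equal group sizes w is half of the
% pairwise half-width (assumption: equal group sizes)
groupHalfWidth = pairHalfWidth / 2;

disp(m(:,2)');           % standard errors: the same for every group
disp(groupHalfWidth');   % interval half-widths: the same for every pair
```

Both displayed vectors are constant because they are built from the single pooled standard deviation of the ANOVA fit, not from std(A) group by group.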

Subject: Confidence interval calculation (for multcompare)

From: Jeff

Date: 26 Jun, 2013 00:19:01

Message: 6 of 6

On Wednesday, June 26, 2013 5:52:12 AM UTC+12, Chen Xing wrote:

> if confidence intervals for two observed data points fail to overlap during a particular pairwise comparison, then one concludes that a significant difference is present.

That is true, but--just to avoid misunderstanding--the reverse is not. Specifically, there may be a statistically significant difference (in the usual p<.05 or p<.01 sense) even if the confidence intervals do overlap.

A good reference on this point is: Wolfe, R. & Hanley, J. If we're so different, why do we keep overlapping? When 1 plus 1 doesn't make 2. Canadian Medical Association Journal, 2002, 166, 65-66
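
A small arithmetic sketch of that point for ordinary per-group 95% intervals. It assumes two independent group means with the same known standard error and uses the normal approximation; the numbers are hypothetical and chosen only for illustration.

```matlab
z  = 1.96;   % approximate two-sided 5% critical value of the standard normal
se = 1;      % hypothetical standard error of each group mean
d  = 3.3;    % hypothetical difference between the two group means

% The individual 95% intervals (mean +/- z*se) overlap when d < 2*z*se
ciOverlap = d < 2*z*se;

% The standard error of the difference is sqrt(2)*se, so the difference
% is significant at the 5% level when d > z*sqrt(2)*se
significant = d > z*sqrt(2)*se;

fprintf('intervals overlap: %d, difference significant: %d\n', ciOverlap, significant);
```

Any difference between about 2.77 and 3.92 standard errors falls in this gap: the two intervals overlap, yet the test on the difference rejects at the 5% level.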
