How are the notches in a boxplot calculated when using the ANOVA1 command?

The Matlab help for the boxplot command says "Two medians are significantly different at the 5% significance level if their intervals do not overlap. Interval endpoints are the extremes of the notches or the centers of the triangular markers."
How are these notches calculated?

Answers (1)

the cyclist on 30 Mar 2011
If you type "edit boxplot", you can find the code that calculates the location of the notches:
% Compute notches around the median, based on the quantiles.
nhi = p50 + 1.57*(p75-p25)/sqrt(length(x));
nlo = p50 - 1.57*(p75-p25)/sqrt(length(x));
[I removed some infinity-checking stuff for simplicity.]
"p50" is the median, and "p25" and "p75" are the location of the 25th and 75th percentiles (the edges of the box itself). "x" is the vector of data. Those are all calculated earlier in the file.
The calculation is estimating the confidence interval of the median, based on the location of the quantiles. I don't know all the theory off the top of my head, but I assume that that is a formula for the standard error of the median. Also guessing that there is a normality assumption hidden in there, but I'm not really sure.
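For anyone who wants to reproduce the notch endpoints outside of boxplot, here is a minimal sketch of the same formula. The data vector is made up for illustration, and using prctile (Statistics Toolbox) for the quartiles is an assumption on my part; the percentile code inside boxplot may give slightly different values.

```matlab
% Sketch: notch endpoints for one sample, using the formula quoted above.
% Assumes prctile (Statistics Toolbox) is available; x is made-up example data.
x = [2.1 3.4 3.9 4.2 4.8 5.0 5.5 6.1 6.7 7.3 8.0 9.6];

p25 = prctile(x, 25);   % lower edge of the box
p50 = median(x);        % median line
p75 = prctile(x, 75);   % upper edge of the box
n   = numel(x);         % number of observations

% Half-width of the notch: 1.57 * IQR / sqrt(n)
halfWidth = 1.57 * (p75 - p25) / sqrt(n);
nlo = p50 - halfWidth;  % lower notch extreme
nhi = p50 + halfWidth;  % upper notch extreme

fprintf('median = %.3f, notch = [%.3f, %.3f]\n', p50, nlo, nhi);
```

Computing [nlo, nhi] this way for two groups and checking whether the intervals overlap mirrors the comparison described in the boxplot help text.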
  1 Comment
Jonas on 20 Sep 2020
The stated formula seems to be based on McGill, R., Tukey, J. W. and Larsen, W. A. (1978) Variations of box plots. The American Statistician 32, 12–16. The notches are based on asymptotic normality of the median and roughly equal sample sizes for the two medians being compared, and are said to be rather insensitive to the underlying distributions of the samples. The idea appears to be to give roughly a 95% confidence interval for the difference in two medians.
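As far as I can tell, the constant itself can be recovered from the notch expression in that paper, median ± 1.7*(1.25*IQR/(1.35*sqrt(N))). The short sketch below just does the arithmetic; the interpretation of each factor in the comments is my reading of the paper and should be treated as an assumption rather than something taken from the boxplot source.

```matlab
% Sketch: where the 1.57 factor appears to come from (McGill et al., 1978).
% Interpretations below are assumptions, not taken from the boxplot code.
seRatio  = 1.25;  % approx. sqrt(pi/2): SE of the median relative to SE of the mean, Gaussian data
iqrSigma = 1.35;  % approx. IQR of a Gaussian, in units of its standard deviation
zFactor  = 1.7;   % compromise between 1.96 and 1.96/sqrt(2) for comparing two medians

c = zFactor * seRatio / iqrSigma;
fprintf('notch constant = %.3f\n', c);   % prints 1.574
```

That value rounds to the 1.57 used in boxplot.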
