sampsizepwr for signrank and ranksum non-parametric hypothesis tests

Hi, because my data come from a non-normal distribution, I'm using "signrank" and "ranksum" as hypothesis tests to find out whether the two sample vectors X and Y have different medians. I want to know the power of both tests when I reject H0 and accept H1 if p < 0.05, and I found that "sampsizepwr" does this; but I don't know whether this function can be applied to non-parametric tests.
I saw that both "signrank" and "ranksum" use a z-statistic in their approximate method, but it is computed from the ranks of the data and not from the data themselves, so I do not know whether I can use the 'z' TESTTYPE in "sampsizepwr" when the data come from non-normal distributions.
Could anyone help me, please? Thanks, Esteban

 Accepted Answer

Your question literally sent me back to the books (Hollander et al., Nonparametric Statistical Methods, Wiley 2014, ISBN 978-0-470-38737-5 to be precise), since I’ve never done power calculations for nonparametric tests. For both the Wilcoxon Signed Rank Test (Eqn. 3.18, p. 52) and the Wilcoxon Rank Sum Test (Eqn. 4.25, p. 128), ‘the power can be approximated by the standard normal density’, so I infer that the 'z' testtype would be appropriate. This is a single-phrase condensation of a much longer and more detailed discussion, including the relevant equations, so I encourage you to find the book and read the entire section for both tests. I may have missed something that is important to your study.

6 Comments

Thanks! I'll look for this book.
My pleasure!
It’s relatively recent, but the best book on nonparametric statistics I’ve seen. If your university library doesn’t already have it, suggest it to them. Consider it for your own library as well, since it’s also relatively inexpensive, especially considering its clear and encyclopaedic coverage. (I have no financial interest in promoting it. I just like it.)
Thanks, I'll recommend it.
So, if I do the following:
[p, h] = signrank(X, Y)
power = sampsizepwr('z', [mean(X) std(X)], mean(Y), [], length(X));
do you think these statements are the right way to find the power of the test when I reject H0 if p < 0.05?
Best regards, Esteban
I would consult Nonparametric Statistical Methods before doing the power calculations. That the Z-statistic is appropriate means, to me, that the statistic in each test is normally distributed. The calculation of that statistic may be different from the calculation in sampsizepwr. There may also be other considerations, since the book specifically mentions a left-sided test. (I’ve looked at the MATLAB documentation but not the code for the function. I don’t know if it is appropriate for nonparametric tests.) You may have to write your own anonymous function to do that, but the code seems to me to be fairly straightforward.
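One way to check any approximate power figure without relying on a formula is to simulate it. The following is a minimal sketch, not the method from the book: it assumes (hypothetically) exponentially distributed data with a pure location shift, and the values of n, shift and nSim are illustrative only. It repeatedly draws samples under H1, runs signrank and ranksum, and counts how often p < 0.05.

```matlab
% Monte Carlo estimate of the power of signrank and ranksum at alpha = 0.05.
% Assumptions (hypothetical, not from the thread): X and Y are exponential
% with the same shape and differ only by a location shift.
rng(1);                  % reproducible random numbers
n     = 30;              % observations per sample (illustrative)
shift = 0.5;             % true location shift under H1 (illustrative)
nSim  = 2000;            % number of simulated experiments
alpha = 0.05;

rejSignrank = false(nSim, 1);
rejRanksum  = false(nSim, 1);
for k = 1:nSim
    X = exprnd(1, n, 1) + shift;   % sample under H1
    Y = exprnd(1, n, 1);           % reference sample
    rejSignrank(k) = signrank(X, Y) < alpha;   % paired test on X - Y
    rejRanksum(k)  = ranksum(X, Y)  < alpha;   % two independent samples
end

% Estimated power = fraction of simulated experiments that rejected H0
powerSignrank = mean(rejSignrank)
powerRanksum  = mean(rejRanksum)
```

To use this for a real study, replace the exprnd draws with a generator that resembles the data you expect (or resample from pilot data), and increase nSim until the estimates are stable.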
Thanks. I'm going to study the sampsizepwr and signrank code and check it against the definition of the power calculation. I'm also going to consult the book you recommended.
Best regards and I wish you a happy new year
My pleasure!
I have no idea what you are doing or how you want to design your experiment, so I cannot help beyond referring you to the best reference I know of.
I commend you for carefully considering the design of your study and the statistical methods you want to apply to it. Too often here on MATLAB Answers we get Questions of the ‘I just collected my data, now how do I analyse it’ variety.
I genuinely wish you good fortune in your research. If you need help later, especially with respect to the MATLAB code you write to analyse your data, we are here to help.
I wish you a Happy New Year as well!

More Answers (1)

Hi, I've looked deeper into this point and found the following:
Suppose you have two vectors, X and Y, with means mX, mY and standard deviations sX and sY. Whatever the distribution of the two variables, the following code gives the same minimum sample size.
sampsizepwr('z', [mX sX], mY, 0.8, [])
gives the same result as
[~, m0, s0] = zscore(X);
[~, m1, ~] = zscore(Y);
sampsizepwr('z', [m0 s0], m1, 0.8, [])
and also, for paired samples:
d = X - Y;
[~, md, sd] = zscore(d);   % mean and standard deviation of the paired differences
sampsizepwr('z', [0 sd], md, 0.8, [])
Again, it doesn't matter whether X and Y were generated by randn, rand, exprnd, etc., as long as they have the same means and standard deviations.
Because nothing is said about the distribution in sampsizepwr, I could suppose the function was developed for the normal distribution only; but sampsizepwr does not "see" the distribution, it only "sees" the mean and standard deviation when calculating size and power, so I cannot say whether or not sampsizepwr is meant only for normal distributions.
So, to be safe with my data, I will treat both size and power as "reference values" rather than "real values" when my distribution is not normal. The closer my data distribution is to a normal distribution, the more realistic the size and power values will be.
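The reason only the mean and standard deviation matter can be seen from the usual textbook normal-approximation formula for a two-sided one-sample z-test, n ≈ ((z_(1-alpha/2) + z_power) * sigma / (mu1 - mu0))^2. The sketch below evaluates that formula next to sampsizepwr. The numbers are illustrative, and I am not claiming this is how sampsizepwr is implemented internally; the two values should simply be close.

```matlab
% Textbook normal-approximation sample size for a two-sided one-sample z-test.
% All numeric values are illustrative assumptions, not data from this thread.
mu0   = 100;    % mean under H0
sigma = 15;     % standard deviation (assumed known)
mu1   = 105;    % mean under H1
alpha = 0.05;   % significance level
pwr   = 0.80;   % desired power

zA = norminv(1 - alpha/2);   % critical value for the two-sided test
zB = norminv(pwr);           % quantile corresponding to the desired power

% Only mu0, mu1 and sigma enter the formula; the shape of the data does not
nApprox  = ceil(((zA + zB) * sigma / (mu1 - mu0))^2)

% Same inputs through the toolbox function, for comparison
nToolbox = sampsizepwr('z', [mu0 sigma], mu1, pwr)
```

Nothing in this calculation checks that the data are normal, which is why the result is best read as a reference value for non-normal data.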
Finally, I looked into "Applied Nonparametric Statistical Methods", Sprent & Smeeton, ISBN: 978-1-4398-9401-9. This book treats size and power in 5.3 and 6.6, for the Wilcoxon signed-rank and Mann-Whitney U tests respectively. In both places there are two considerations: first, to know the non-normal distribution and obtain values for p0:H0 and p1:H1; second, to use the "Noether (1987a)" formulae to get an asymptotic normal approximation for size and power.
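For reference, the Noether sample-size approximations can be sketched in a few lines. The formulas below are written from my recollection of Noether's 1987 paper and have not been checked against the presentation in Sprent & Smeeton, so please verify them before relying on them. The probabilities pPrime and pDouble are hypothetical values that you would have to obtain from the assumed distribution under H1 or from pilot data.

```matlab
% Noether-type sample-size approximations (sketch; verify against the reference).
alpha = 0.05;
pwr   = 0.80;
zA = norminv(1 - alpha);   % one-sided test; use 1 - alpha/2 for a two-sided test
zB = norminv(pwr);

% Wilcoxon signed-rank test:
% pPrime = P(D_i + D_j > 0) for two independent paired differences under H1
% (equals 0.5 under H0). The value here is hypothetical.
pPrime    = 0.70;
nSignrank = ceil((zA + zB)^2 / (3 * (pPrime - 0.5)^2))

% Wilcoxon rank-sum / Mann-Whitney U test:
% pDouble = P(Y > X) under H1 (equals 0.5 under H0), c = fraction of the
% total sample allocated to the first group. Both values are hypothetical.
pDouble = 0.65;
c       = 0.5;
nTotal  = ceil((zA + zB)^2 / (12 * c * (1 - c) * (pDouble - 0.5)^2))
```

Here nSignrank is the number of pairs and nTotal is the combined size of both groups. Note that the inputs are probabilities rather than means and standard deviations, which is why the distribution under H1 has to be known or estimated first.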
Anyway, the deeper I look into non-parametric size and power theory, the more difficult it becomes to find a simple solution.
Best regards, Esteban

1 Comment

Hi Esteban.
I certainly agree that no simple solution exists. The non-parametric power calculations are different from the parametric power calculations, even though the statistics themselves are normally distributed. (The only consideration preventing me from copying and posting the several pages of Hollander was copyright infringement.) I suggest you look to it when you have the opportunity, since it has a discussion of the derivation of the power calculation as well as the result. I’ve only used nonparametric statistics a few times in my career, so I don’t have the experience with them that I would like.
I am going to vote your Question up, and suggest that others do the same. This is an important topic that needs to be incorporated into the Statistics Toolbox. Perhaps you can suggest (Enhancements and Bug Reports) that TMW expand the Statistics Toolbox to include them. The contact page is http://www.mathworks.com/support/contact_us/index.html.
Thank you. I learned from your Question.
