Choose Your Own Data-Analytic Adventure


Okay, we'll calculate sample size. To do that, we need:
  • Alpha Level (usually .05)
  • Statistical Power (or 1 - Beta Level)
  • Effect Size (i.e., the population correlation)

Alpha level is the probability that we will reject the null hypothesis WHEN the null hypothesis is true, in other words, WHEN the population effect size is zero. Since it's bad to reject the null WHEN it is true, we want a low alpha level. We'll go with the customarily low alpha level of .05.


Statistical power is the probability that we will reject the null hypothesis WHEN the specified (non-zero) effect size is true. Since it's good to reject the null WHEN it is false, we want a lot of statistical power. We'll go with the customary minimally high statistical power of .80.


What do you mean "specified (non-zero) effect size"?

That is the tricky question that we must tackle now. To calculate sample size, we've got to specify the effect size in the population.

But, if we knew the effect size in the population, why would we even conduct this research? That's exactly what we are trying to find out. Isn't it?

Therein lies the rub. We must guess, and then we use that guess (in conjunction with our alpha level and statistical power) to calculate sample size.

Guess?

Yes. Our guess should be fairly conservative, because if we guess too big of a population effect, then we'll collect too small of a sample.

How do I guess the effect size? Am I supposed to guess the difference in MYPAS scores between the intervention and control groups?

Actually, that's one way to do it. You can take that average difference and divide by the standard deviation (of either group, assuming homoscedasticity), and you'll get Cohen's d. You could use Cohen's d as your effect size in the sample size calculation. This measure of effect size works for regression of a continuous variable on a dichotomous variable (i.e., two-sample t-test). You can play around with the idea at this terrific site.

You seem surprised. Did you have something else in mind?


Cohen's d is a very specialized measure of effect size. I generally prefer to use the Pearson product-moment correlation, r. But, whatever works for you.

Whether we use the difference in means (divided by the standard deviation) or the correlation... I... um... I don't know. How do I even begin? Is one easier? How do I guess for either?

It's daunting. You have to use substantive knowledge and/or prior research. Think of the average control patient; how will she score on the MYPAS? Think of the average intervention patient; how will she score on the MYPAS?

Now that I'm thinking about it, there was a recent study on the effectiveness of family-centered preparation in reducing preoperative anxiety, and they found a difference of about 5 points on the MYPAS between the control group and the intervention group. I think that the full slate of child life services would be even more effective.

Great. There's your difference in means. Do you have a standard deviation?

They don't note the standard deviation in that study, but I have the standard deviation from the MYPAS validation study: sd = 8.

It would be ideal to have the standard deviation for your exact population, or at least the sample in which they found the 5-point difference, but we'll take whatever information we can get. If we divide the difference by the standard deviation (5/8) we get a Cohen's d of .625.
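To make that arithmetic concrete, here is the same calculation as a couple of lines of R (the variable names are mine, chosen for clarity):

```r
# Cohen's d = mean difference divided by the standard deviation
mean_difference <- 5  # MYPAS difference from the family-centered preparation study
sd_mypas <- 8         # SD from the MYPAS validation study
mean_difference / sd_mypas  # 0.625
```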

Jacob Cohen gives guidelines for effect sizes:

  • "small" d = .2
  • "medium" d = .5
  • "large" d = .8
These guidelines are truly garbage, but they sound great in research proposals. Never use these guidelines in your deep data analytic work, but feel free to reference them in your write-up. As data analysts, we should sometimes throw our audience a bone... as long as we ourselves don't mistake the bone for meat! So, you can write, "We determined sample size based on a medium to large effect size (d = .625), which made sense in light of the researchers' literature review and professional experience (Cohen, 1988)."

Okay. So, now we have everything we need to calculate sample size, right?

  • Alpha Level = .05
  • Statistical Power = .80
  • Effect Size = .625

All we need to do now is fire up R and enter three lines of code:
# great site: http://www.statmethods.net/stats/power.html
# install the add-on package for calculating statistical power
install.packages('pwr')
# load the add-on package
library(pwr)
# calculate your sample size (leave n out so pwr.t.test solves for it!)
pwr.t.test(d = .625, sig.level = .05, power = .80, type = "two.sample")

Your sample size (n) is 41.2 in each group.

I suggest that you collect more than 41.2 observations in each group. Definitely don't try to collect exactly 41.2 observations, because you'll never find that .2 child! Collecting more observations does a couple of things. First, it protects you in case you were over-optimistic about your population effect size (d = .625). Second, if your guessed effect size was dead-on accurate, then collecting more observations increases your statistical power, and that's never a bad thing.
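If it helps to see that padding logic in code, here's a minimal R sketch (the 50-per-group figure is just an illustrative choice, not a recommendation):

```r
# load the add-on package (assumes pwr is already installed)
library(pwr)

# round the fractional n up to whole children
ceiling(41.19)  # at least 42 per group

# power if you recruit 50 per group instead, with the same d and alpha
# (leave power out so pwr.t.test solves for it)
pwr.t.test(n = 50, d = .625, sig.level = .05, type = "two.sample")$power
# power climbs comfortably above .80
```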

You can play around here to see how sensitive your results are to different effect sizes:
http://www.cs.uiowa.edu/~rlenth/Power/


You decide to:

Let's call it "good" and submit this data analytic strategy.
or
Let's go at it from another angle and calculate the statistical power of your study.
or
Let's go at it from another angle and calculate the effect size for which your study is geared.