  • Background

  • Recall that Poloneicki, Sismanidis, Bland, and Jones (2004) found that 8 of the 10 most recent heart transplant operations at St. George's Hospital in London resulted in death within 30 days. This was convincing evidence that the probability of death at St. George's was larger than the national benchmark of 0.15.

    Research Question: From this sample, estimate the probability of death within 30 days of a heart transplantation at St. George's Hospital.

    Goals: In this investigation, you will

    • Evaluate alternative confidence interval methods for the probability of "success" in a binomial random process
  • Estimating the process probability

  • Using the 8 out of 10 result from the research study, use R or JMP to calculate the Clopper-Pearson binomial confidence interval:

    JMP: Use the Journal File, select Confidence Interval for One Proportion, select Summary Stats. Press OK.  Then select the Binomial radio button, enter 8 for the Number of Successes and 10 for the Sample Size

    R: Use iscambinomtest(8, 10, conf.level = .95) or binom.test(8, 10, conf.level = 0.95)

    Find another group with the other software - do your results match?
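
As an independent cross-check on the software output, the sketch below computes the Clopper-Pearson interval in Python (standard library only). It searches for the values of π at which each binomial tail probability equals half of 1 minus the confidence level. The function names `binom_cdf` and `clopper_pearson` are illustrative choices, not part of R, JMP, or the ISCAM workspace, and bisection is only one of several ways to solve for the endpoints.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target, iterations=100):
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, conf_level=0.95):
    """Exact binomial (Clopper-Pearson) interval for x successes in n trials."""
    alpha = 1 - conf_level
    # Lower endpoint: the p at which P(X >= x) equals alpha / 2.
    if x == 0:
        lower = 0.0
    else:
        lower = solve_increasing(lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper endpoint: the p at which P(X <= x) equals alpha / 2,
    # i.e. P(X > x) equals 1 - alpha / 2.
    if x == n:
        upper = 1.0
    else:
        upper = solve_increasing(lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


lower, upper = clopper_pearson(8, 10)
print(f"Clopper-Pearson 95% interval: ({lower:.4f}, {upper:.4f})")
```

The endpoints should agree with `binom.test(8, 10, conf.level = 0.95)` to several decimal places (roughly 0.444 to 0.975).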

  • Now use R or JMP to calculate the one-sample z-interval.

    JMP: Change the radio button to Normal Approximation.

    R: iscamonepropztest(8, 10, conf.level = .95)

    Ask a group using the other software - do your results match?  Can you explain what JMP is doing?
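
When comparing the two programs, it can help to have the raw formula on hand. The Python sketch below applies the unadjusted one-sample z-interval, p̂ ± z*·sqrt(p̂(1 − p̂)/n), literally. The name `wald_interval` is an illustrative choice, and it is an assumption that either package reports exactly these endpoints.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, conf_level=0.95):
    """One-sample z-interval (Wald) for x successes in n trials, with no adjustment."""
    p_hat = x / n
    # Critical value z* for the requested confidence level (about 1.96 for 95%).
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = wald_interval(8, 10)
print(f"z-interval: ({low:.3f}, {high:.3f})")
```

With 8 successes in 10 trials the raw upper endpoint is above 1, which is not a possible value for a probability, so it is worth checking how each package reports it.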

     

  • Because the sample size is so small, we don't trust the normal approximation.  But the Binomial interval might still be wider than we need it to be.  So let's try another method.  Focus less on where this method comes from and more on how to evaluate whether it is a valid method and, if so, whether it is a better method.

     

  • The "Plus Four" approach

  • The Clopper-Pearson (exact binomial) method will always have a coverage rate of at least 95%, but tends to be “conservative,” meaning the coverage rate is often higher than 95% so the intervals are longer than they need to be. The method is also fairly complicated (as opposed to: “go two standard deviations on each side”). An alternative method is often called the Plus Four procedure:

    Definition: Plus Four 95% confidence interval for π

    • Determine the number of successes in the sample (X) and the sample size (n)
    • Increase the number of successes by two and the sample size by four, and calculate a new sample proportion p̃ = (X + 2)/(n + 4)
    • Use the z-interval procedure as above, using p̃ and the augmented sample size (n + 4)

    In short, for 95% confidence, the idea is to pretend you had 2 more successes and 2 more failures than you really did.
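
The three steps of the definition can be sketched in a few lines of Python. The function name `plus_four_interval` is an illustrative choice. The sketch covers only the 95% case described here, using the exact normal critical value (about 1.96) for z*.

```python
from math import sqrt
from statistics import NormalDist


def plus_four_interval(x, n):
    """Plus Four 95% interval: add 2 successes and 2 failures, then use the z-interval."""
    z_star = NormalDist().inv_cdf(0.975)   # about 1.96
    n_tilde = n + 4                        # augmented sample size
    p_tilde = (x + 2) / n_tilde            # adjusted sample proportion
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - margin, p_tilde + margin


# 8 deaths in 10 operations, as in the St. George's sample
low, high = plus_four_interval(8, 10)
print(f"Plus Four 95% interval: ({low:.3f}, {high:.3f})")
```

Because the adjusted proportion p̃ = 10/14 is pulled toward 0.5, the interval is centered lower than the observed proportion of 0.8.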

    To investigate the reliability or "coverage" of the Plus Four procedure:

    • Use the Simulating Confidence Intervals applet (see below).
    • Set the value of the probability π to 0.15 and the Sample size to 10.
    • Use the Method pull-down menu to select the Plus Four (95%) method
    • Generate at least 1,000 intervals.
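
The same kind of experiment can be sketched in Python to see what the applet is estimating. This is a rough imitation under stated assumptions, not the applet's actual code: the function names, the fixed random seed, and the way samples are drawn are all illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def plus_four_interval(x, n, z_star):
    """Plus Four interval: z-interval on (x + 2) successes in (n + 4) trials."""
    p_tilde = (x + 2) / (n + 4)
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - margin, p_tilde + margin


def simulate_coverage(pi, n, num_intervals, seed=1):
    """Estimate the coverage rate and average width from simulated binomial samples."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(0.975)
    hits = 0
    total_width = 0.0
    for _ in range(num_intervals):
        # One sample: count successes in n independent trials with probability pi.
        x = sum(rng.random() < pi for _ in range(n))
        low, high = plus_four_interval(x, n, z_star)
        hits += low <= pi <= high          # does this interval capture pi?
        total_width += high - low
    return hits / num_intervals, total_width / num_intervals


coverage, avg_width = simulate_coverage(pi=0.15, n=10, num_intervals=1000)
print(f"coverage: {coverage:.3f}, average width: {avg_width:.3f}")
```

As with the applet, the estimated coverage rate changes from run to run (here, with the seed), so it should be read as an approximation to the long-run rate.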

     

  • (f) In the applet, change the number of intervals to 100. Press Reset. Use the pull-down menu to select the Blaker method (the Exact Binomial using the tail probabilities), and wait a bit. (If it's just 100 at a time, it's not too bad to hit Sample a few more times.) Report the percentage of intervals that capture the parameter AND the average width of these 100 intervals.
    Running Total:
    Average width:

  • Using the Plus Four Method

  • Key Idea: The Plus Four method is more likely to be a 95% confidence interval method than the one-sample z-interval method. It is never a bad idea to use the Plus Four method.

    “Fudging our data” may seem like cheating, but there is actually some nice theory behind this method. The plus four adjustment is not that different in principle from the continuity correction we suggested with p-values. Many statisticians now recommend this procedure over the one-proportion z-interval procedure because its coverage rate stays much closer to the stated confidence level, even for small sample sizes, and over the Clopper-Pearson binomial interval procedure because the intervals tend to be shorter (narrower). In fact, the more general version of the procedure (for confidence levels other than 95%) is becoming the default in many software packages, while others default to the Blaker method.
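
One way to check the coverage claim without simulation is to compute the coverage rate exactly: for each possible number of successes, build the interval, and add up the binomial probabilities of the outcomes whose intervals capture π. The Python sketch below does this for π = 0.15 and n = 10. The function names are illustrative choices, and the z-interval is applied literally with no truncation of endpoints.

```python
from math import comb, sqrt
from statistics import NormalDist

Z_STAR = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence


def z_interval(successes, trials):
    """Proportion plus or minus z* standard errors."""
    p = successes / trials
    margin = Z_STAR * sqrt(p * (1 - p) / trials)
    return p - margin, p + margin


def wald(x, n):
    return z_interval(x, n)


def plus_four(x, n):
    return z_interval(x + 2, n + 4)


def exact_coverage(method, pi, n):
    """Total binomial probability of the outcomes x whose interval contains pi."""
    total = 0.0
    for x in range(n + 1):
        low, high = method(x, n)
        if low <= pi <= high:
            total += comb(n, x) * pi**x * (1 - pi) ** (n - x)
    return total


wald_cov = exact_coverage(wald, 0.15, 10)
plus_four_cov = exact_coverage(plus_four, 0.15, 10)
print(f"z-interval coverage: {wald_cov:.3f}")
print(f"Plus Four coverage:  {plus_four_cov:.3f}")
```

For this one setting the z-interval's coverage rate is near 79%, while the Plus Four rate is near 95%. Other values of π and n would need to be checked separately.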

    To calculate a Plus Four confidence interval for our St. George's data, do one of the following:

    • Calculate by hand, first finding 𝑝̃ and z*
    • Use the Theory-Based Inference applet assuming 8+2 successes and 10+4 as the sample size
    • Use JMP by telling JMP there were 4 more operations consisting of two more deaths than in the actual sample.
    • Use R by telling R there were 4 more operations consisting of two more deaths than in the actual sample.

    IMPORTANT: You would NOT use the Simulating Confidence Intervals applet to answer this question.

  • Summary

  • As you can tell, there are several ways to obtain a confidence interval for a process probability. When the sample size is large, they will yield very similar results for the endpoints and the coverage rate. When the sample size is small, the Adjusted Wald method (the general form of Plus Four) is preferred. You can also use the Clopper-Pearson confidence interval method when the sample size is small; it is much more computationally efficient than the Blaker method, but its intervals tend to be wider and don’t have the estimate ± margin-of-error simplicity.

    Whatever procedure is used to determine the confidence interval, you interpret a (valid) interval the same way – as the interval of plausible values for the parameter. For example, we are 95% confident that the underlying probability of death within 30 days of a heart transplant operation at St. George’s Hospital is between 0.16 and 0.24; where by “95% confident,” we mean that if we were to use this procedure to construct intervals from thousands of representative samples, roughly 95% of those intervals would succeed in capturing the actual (but unknown) value of the parameter of interest.

    Keep in Mind: The main things you should focus on in your study of confidence intervals are:
    • What parameter is the interval estimating (in context)?
    • What are the effects of sample size, confidence level, and the sample statistic on the width and midpoint of the interval?
    • What do we mean by “confidence”?
    • What are the effects (if any!) of sample size, confidence level, and the actual parameter value on the coverage rate of the method?
    • Why might one confidence interval method be preferred over another?

    Also note how to use technology to perform these calculations. You should NOT use the Simulating Confidence Intervals applet to construct a confidence interval for a particular sample of data.
