• Background

  • Recall that Poloneicki, Sismanidis, Bland, and Jones (2004) found that 8 of the 10 most recent heart transplant operations at St. George's Hospital in London resulted in death within 30 days. This was convincing evidence that the probability of death at St. George's was larger than the national benchmark of 0.15.

    Research Question: What is the estimated probability of death at St. George's Hospital?

    Goals: In this lab, you will

    • Explore alternative confidence interval methods for the probability of "success" in a binomial random process
    • Explore how to interpret the phrase "95% confidence"
  • Estimating the process probability

  • What about a one-sample z-interval?

  • Recall the formula for the standard error of the sample proportion.

    $\mathrm{SE}(\hat{p}) = \sqrt{\hat{p}(1 - \hat{p})/n}$

  • Recall the formula for a one-sample z-confidence interval (aka the Wald interval).
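
    As a reminder of how the pieces fit together, here is a minimal Python sketch of the one-sample z (Wald) interval, p̂ ± z* × SE(p̂), applied to the St. George's data (8 deaths in 10 operations). The function name wald_interval is only an illustrative choice, not something from the lab materials.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.95):
    """One-sample z (Wald) interval: p-hat +/- z* x SE(p-hat)."""
    p_hat = successes / n
    # z* is the standard normal quantile leaving (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se


# St. George's data: 8 deaths in the 10 most recent operations.
lower, upper = wald_interval(8, 10)
print(f"Wald 95% interval: ({lower:.3f}, {upper:.3f})")
```

    With these data the upper endpoint comes out larger than 1, which is one sign that the z-interval may not behave well for a sample this small.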

  • What does it mean to be "95% confident"?

  • Key Idea: The confidence level of an interval procedure is supposed to indicate the long-run percentage of intervals that capture the actual parameter value, if repeated random samples were taken from the process. As such, the confidence level presents a measure of the reliability of the method. A confidence interval procedure is considered valid when the achieved long-run coverage rate matches (or comes close to) the stated (aka nominal) confidence level.

    Statistical confidence is a statement about the method, not a probability statement about an individual interval.

    To consider whether the one-sample z-interval is valid for this study, we can

    • pretend we know the process probability,
    • generate many random samples from that process,
    • calculate confidence intervals from the simulated samples, and
    • see what percentage of these confidence intervals capture the actual value of the process probability that we specified.

    In the Simulating Confidence Intervals applet below:

    • Suppose the actual process probability of death is 0.15.
    • Set the sample size to n = 10.
    • Press Sample.
    • Click on the interval to see the endpoint(s), midpoint, and width (verify).
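
    The same experiment can be sketched outside the applet. The Python code below is an illustrative stand-in, not the applet's actual implementation: it assumes a process probability of 0.15 and n = 10, generates 10,000 simulated samples, and reports the fraction of 95% z-intervals that capture 0.15. The function names and the fixed seed are assumptions made for this example.

```python
import random
from math import sqrt

Z_STAR = 1.96  # approximate critical value for 95% confidence


def wald_interval(successes, n):
    """One-sample z-interval for a proportion."""
    p_hat = successes / n
    margin = Z_STAR * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def simulate_coverage(pi, n, reps, seed=1):
    """Fraction of simulated z-intervals that capture the value pi."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    captured = 0
    for _ in range(reps):
        # One simulated sample: n operations, each a death with probability pi.
        deaths = sum(rng.random() < pi for _ in range(n))
        lower, upper = wald_interval(deaths, n)
        captured += lower <= pi <= upper
    return captured / reps


coverage = simulate_coverage(pi=0.15, n=10, reps=10_000)
print(f"Simulated coverage of the 95% z-interval: {coverage:.3f}")
```

    If the procedure were valid here, the printed rate would sit near 0.95; comparing it with that nominal level is the point of the exercise.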
  • Because the z-interval procedure relies on the sample size conditions of the Central Limit Theorem, there are scenarios where we should not use this z-procedure to determine the confidence interval. In these cases, one option is to return to calculating a confidence interval based on the Binomial distribution.

    Use the Method pull-down menu to select Exact Binomial and continue to press Sample until you generate at least 1,000 confidence intervals from a process with π = 0.15 and n = 10.
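
    For readers curious how an exact binomial (Clopper-Pearson) interval can be computed, here is a hedged Python sketch. It finds each endpoint by bisection on the binomial tail probabilities; this is one possible approach and not necessarily how the applet or JMP does it. All function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target, tol=1e-10):
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, confidence=0.95):
    """Exact binomial interval for the process probability."""
    alpha = 1 - confidence
    # Lower endpoint: the p at which P(X >= x) equals alpha/2 (0 if x == 0).
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper endpoint: the p at which P(X <= x) equals alpha/2 (1 if x == n).
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# St. George's data: 8 deaths in 10 operations.
lower, upper = clopper_pearson(8, 10)
print(f"Exact binomial 95% interval: ({lower:.4f}, {upper:.4f})")
```

    Unlike the z-interval, both endpoints always stay between 0 and 1, but there is no simple estimate ± margin-of-error form.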


  • The "Plus Four" approach

  • The Clopper-Pearson (exact binomial) method will have a coverage rate of at least 95%, but tends to be “conservative,”
    meaning the coverage rate is higher than 95% so the intervals are longer than they need to be. The
    method is also fairly complicated (as opposed to: “go two standard deviations on each side”). An
    alternative method that has been receiving much attention of late is often called the Plus Four procedure:

    Definition: Plus Four 95% confidence interval for π

    • Determine the number of successes in the sample (X) and the sample size (n)
    • Increase the number of successes by 2 and the sample size by 4, and calculate a new sample proportion p̃ = (X + 2)/(n + 4)
    • Use the z-interval procedure as above, with p̃ in place of the sample proportion and the augmented sample size (n + 4)

    In short, with 95% confidence, the idea is to pretend you have 2 more successes and 2 more failures than
    you really did. 

    To investigate the reliability or "coverage" of the Plus Four procedure:

    • Return to the applet and again set the values of 0.15 and 10.
    • Use the pull-down menu to select the Plus Four (95%) method.
    • Generate at least 1,000 intervals.
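
    As a cross-check on the simulation, the long-run coverage rate can also be computed exactly: for each possible number of successes x, decide whether the resulting interval captures π, and add up the binomial probabilities of the x values that do. The Python sketch below does this for the z-interval and the Plus Four interval with π = 0.15 and n = 10. It is an illustration under those assumed settings, and the function names are hypothetical.

```python
from math import comb, sqrt

Z_STAR = 1.96  # approximate critical value for 95% confidence


def wald_interval(x, n):
    """One-sample z-interval for a proportion."""
    p_hat = x / n
    margin = Z_STAR * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def plus_four_interval(x, n):
    """Add 2 successes and 2 failures, then apply the usual z-interval."""
    return wald_interval(x + 2, n + 4)


def exact_coverage(interval, pi, n):
    """Sum P(X = x) over every x whose interval captures pi."""
    total = 0.0
    for x in range(n + 1):
        lower, upper = interval(x, n)
        if lower <= pi <= upper:
            total += comb(n, x) * pi**x * (1 - pi) ** (n - x)
    return total


wald_cov = exact_coverage(wald_interval, 0.15, 10)
plus_four_cov = exact_coverage(plus_four_interval, 0.15, 10)
print(f"z-interval: {wald_cov:.3f}, Plus Four: {plus_four_cov:.3f}")
```

    Under these settings the z-interval's coverage works out to roughly 0.79, while the Plus Four coverage is roughly 0.95. Your simulated rates from the applet should land near these values.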

     

  • Using the Plus Four Method

  • Key Idea: The Plus Four method is more likely than the one-sample z-interval method to achieve a
    coverage rate close to the stated 95%. It is never a bad idea to use the Plus Four method.

    “Fudging our data” may seem like cheating, but there is actually some nice theory behind this method.
    The Plus Four adjustment is not that different in principle from the continuity correction we suggested
    with p-values. Many statisticians now recommend using this procedure over the one-proportion z-interval
    procedure because its coverage rate stays close to the stated confidence level even for small sample
    sizes, and over the Clopper-Pearson binomial interval procedure because the intervals tend to be
    shorter (narrower). In fact, the more general case (something other than 95%) is becoming the default
    in many software packages.

    To calculate a Plus Four confidence interval for our St. George's data, you can

    • Calculate the interval by hand, first finding p̃ and z*
    • Use the Theory-Based Inference applet assuming 8+2 successes and 10+4 as the sample size
    • Use JMP, telling it there were 4 more operations (2 more deaths and 2 more survivals) than in the actual sample.

    Note: You wouldn't use the Simulating Confidence Intervals applet to answer this question.
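
    The by-hand calculation can be checked with a few lines of Python. This is a minimal sketch assuming the Plus Four definition given earlier (add 2 successes and 2 failures, then use the z-interval formula); the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def plus_four_interval(successes, n):
    """Plus Four 95% interval: z-interval on (successes + 2) out of (n + 4)."""
    z_star = NormalDist().inv_cdf(0.975)  # about 1.96
    n_tilde = n + 4
    p_tilde = (successes + 2) / n_tilde
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - margin, p_tilde + margin


# St. George's data: 8 deaths in 10 operations, so p-tilde = 10/14.
lower, upper = plus_four_interval(8, 10)
print(f"Plus Four 95% interval: ({lower:.3f}, {upper:.3f})")
```

    The result should agree, up to rounding, with what the Theory-Based Inference applet or JMP reports for 10 successes in 14 trials.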

  • Summary

  • As you can tell, there are several ways to obtain a confidence interval for a process probability. When the sample size is large, they will yield very similar results for the endpoints and the coverage rate. When the sample size is small, the Plus Four (Adjusted Wald) method is preferred. You can also use the Clopper-Pearson confidence interval method when the sample size is small, but the Binomial intervals tend to be wider and don’t have the estimate ± margin-of-error simplicity.

    Whatever procedure is used to determine the confidence interval, you interpret a (valid) interval the same way – as the interval of plausible values for the parameter. For example, we are 95% confident that the underlying probability of death within 30 days of a heart transplant operation at St. George’s Hospital is between 0.16 and 0.24, where by “95% confident” we mean that if we were to use this procedure to construct intervals from thousands of representative samples, roughly 95% of those intervals would succeed in capturing the actual (but unknown) value of the parameter of interest.

    Keep in Mind: The main things you should focus on in your study of confidence intervals are:
    • What parameter is the interval estimating (in context)?
    • What are the effects of sample size, confidence level, and the sample statistic on the width and midpoint of the interval?
    • What do we mean by “confidence”?
    • What are the effects (if any!) of sample size, confidence level, and the actual parameter value on the coverage rate of the method?
    • Why might one confidence interval method be preferred over another?

    Also note how to use technology to perform these calculations. You should NOT use the Simulating Confidence Intervals applet to construct a confidence interval for a particular sample of data.
