Suppose you use data from a random sample to build a 95% CI around the sample’s mean. Next, suppose you put that sample back into the population. Finally, suppose you get ready to extract a 2nd random sample from the same population, with plans to use the new data to compute just the 2nd sample’s mean. How confident can you be that your 2nd sample’s mean will lie somewhere between the end points of the 1st sample’s 95% CI?
Did you say or think: “95% confident”?
If you did, you’re a bit more confident than you actually should be!
If your 1st sample’s mean were to match the population mean perfectly (and if the sample’s standard error were to match the true standard error of the mean), you could be 95% confident that the 2nd sample’s mean would turn out to be “inside” the 1st sample’s 95% CI. That’s because the end points of your CI would coincide with the 2 points in a sampling distribution of the mean that serve to bookend the middle 95% of that distribution’s means. Select a 2nd sample, and its mean would have a 95% chance of landing between those bookends.
Your 1st sample, however, is not likely to have a mean that matches up perfectly with μ, the population mean. This will cause the 1st sample’s 95% CI to be “off-center” in the sampling distribution of means. More than half of the CI will be located on the high (or low) side of that distribution’s midpoint, μ. Because the CI’s end points do not coincide with the points that bookend the middle 95% of the sampling distribution of means, the 95% CI captures less than 95% of those means.
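A quick simulation can make the “off-center” argument concrete. The sketch below is only an illustration under stated assumptions: the population is normal, its standard deviation is treated as known (so the CI is a z-interval), and the values for `mu`, `sigma`, `n`, `trials`, and `seed` are arbitrary choices, not figures from the discussion above. The function name `capture_rate` is also made up for this example.

```python
import random
from statistics import NormalDist, fmean


def capture_rate(mu=100.0, sigma=15.0, n=25, trials=20000, seed=1):
    """Estimate how often a 2nd sample's mean lands inside the
    95% CI built around a 1st sample's mean.

    Assumptions (illustrative only): normal population, known sigma,
    so the CI is mean +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)      # roughly 1.96
    half_width = z * sigma / n ** 0.5    # half the width of the 95% CI
    hits = 0
    for _ in range(trials):
        # 1st sample: its mean is the center of the CI
        mean_1 = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        # 2nd sample: drawn independently from the same population
        mean_2 = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if abs(mean_2 - mean_1) <= half_width:
            hits += 1
    return hits / trials


rate = capture_rate()
print(f"2nd mean fell inside the 1st sample's 95% CI in {rate:.1%} of trials")
```

Under these assumptions the capture rate should come out noticeably below 95%, in the neighborhood of 83%. The difference between two independent sample means varies more than a single sample mean does around μ, which is another way of seeing why the 1st sample’s CI “misses” more often than 5% of the time.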
To prove to yourself that a 95% CI based on one sample’s data does not predict, with 95% accuracy, what a 2nd sample’s mean will be like, answer these 2 little questions: (1) How much of a normal distribution lies between the z-score points of +1.96 and –1.96? (2) How much of a normal distribution lies between any other pair of z-scores that are that same distance (3.92) apart from each other?
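If you’d like to check your answers to those 2 questions numerically, here is a minimal sketch using Python’s standard library. The helper name `area_between` and the particular shifted pair of z-scores (–0.96 and +2.96) are just illustrative choices; any other pair that is 3.92 apart would make the same point.

```python
from statistics import NormalDist


def area_between(z_low, z_high):
    """Proportion of a standard normal distribution between two z-scores."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


# Question 1: the interval centered on the distribution's midpoint
centered = area_between(-1.96, 1.96)

# Question 2: an interval of the same width (3.92), shifted 1 unit to the right
shifted = area_between(-0.96, 2.96)

print(f"centered: {centered:.3f}")   # about 0.950
print(f"shifted:  {shifted:.3f}")    # about 0.830
```

The centered pair captures about 95% of the distribution. Any off-center pair with the same 3.92 spacing captures less, and the farther the pair is shifted, the smaller the captured area becomes.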