Tag Archives: Confidence Intervals



Is one’s health helped or hurt by being in a hot sauna? Researchers from Finland conducted a study to find out, with results recently published in a top-flight medical journal.

In the study, data came from 2,315 men who used saunas. And what were the study’s findings? Those who had saunas more often each week, and those who stayed longer in the sauna, had, over time, fewer fatal heart attacks.

Why did this study get published in a prestigious scientific journal? One of the reasons was this: the researchers used complex statistical procedures to examine the relationship between sauna use and heart attacks while controlling for things such as age, blood pressure, tobacco use, socioeconomic status, physical activity, etc.

Here is a list of the intermediate and advanced statistical procedures used by the researchers: (1) a chi-square test, (2) analysis of variance, (3) 95% confidence intervals, (4) a multivariate Cox model with several covariates (age, BMI, systolic blood pressure, cholesterol levels, smoking, alcohol consumption, previous myocardial infarction, diabetes, cardiorespiratory fitness, resting heart rate, physical activity, and socioeconomic status), (5) sensitivity analyses, (6) survival ratios using the Kaplan-Meier method, (7) hazard ratios and cumulative hazard curves, (8) plots of Schoenfeld residuals to check the proportional hazards assumption, and (9) martingale residuals to check the linearity assumption.

The research report was published on February 23, 2015, in JAMA Internal Medicine, a journal of the American Medical Association. It had this title: “Association Between Sauna Bathing and Fatal Cardiovascular and All-Cause Mortality Events.” The authors were Tanjaniina Laukkanen, Hassan Khan, Francesco Zaccardi, and Jari A. Laukkanen.




Filed under Applications


Suppose you use data from a random sample to build a 95% CI around the sample’s mean. Next, suppose you put that sample back into the population. Finally, suppose you get ready to extract a 2nd random sample from the same population, with plans to use the new data to compute just the 2nd sample’s mean. How confident can you be that your 2nd sample’s mean will lie somewhere between the end points of the 1st sample’s 95% CI?

Did you say or think: “95% confident”?

If you did, you’re a bit more confident than you actually should be!

If your 1st sample’s mean were to match perfectly the mean of the population, you could be 95% confident that the 2nd sample’s mean would turn out to be “inside” the 1st sample’s 95% CI. That’s because the end points of your CI would coincide with the 2 points in a sampling distribution of the mean that serve to bookend the middle 95% of that distribution’s means. Select a 2nd sample, and its mean would have a 95% chance of landing between those bookends.

Your 1st sample, however, is not likely to have a mean that matches up perfectly with μ. This will cause the 1st sample’s 95% CI to be “off-center” in the sampling distribution of means. More than half of the CI will be located on the high (or low) side of that distribution’s midpoint, the true population mean. Because the CI’s end points do not coincide with the points that bookend the middle 95% of the sampling distribution of means, the 95% CI captures less than 95% of those means.
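If you would like to watch this happen, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and not something taken from the scenario above: a normal population with μ = 100 and σ = 15, samples of 25, a CI built from the known σ (a z-interval), and the function name `capture_rate`. Under those assumptions, the share of 2nd-sample means that land inside the 1st sample’s 95% CI should come out noticeably below 95%, somewhere around 83%.

```python
import random
from statistics import NormalDist, fmean


def capture_rate(mu=100.0, sigma=15.0, n=25, trials=10_000, seed=1):
    """Estimate how often a 2nd sample's mean falls inside the 1st sample's 95% CI.

    Assumes a normal population and a CI built with the known sigma (z-interval).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)       # about 1.96
    half_width = z * sigma / n ** 0.5     # half the width of the 95% CI
    hits = 0
    for _ in range(trials):
        # Draw two independent samples from the same population.
        mean1 = fmean(rng.gauss(mu, sigma) for _ in range(n))
        mean2 = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The CI is mean1 +/- half_width; check whether mean2 lies inside it.
        if abs(mean2 - mean1) <= half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rate = capture_rate()
    print(f"Share of 2nd-sample means inside the 1st sample's 95% CI: {rate:.3f}")
```

Changing `n`, `mu`, or `sigma` should not move the result much, because the CI’s width and the spread of sample means shrink or grow together.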

To prove to yourself that a 95% CI based on one sample’s data does not predict, with 95% accuracy, what a 2nd sample’s mean will be like, answer these 2 little questions: (1) How much of a normal distribution lies between the z-score points of +1.96 and –1.96? (2) How much of a normal distribution lies between any other pair of z-scores that are that same distance (3.92) apart from each other?
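If you would rather let a computer answer those 2 questions, the short sketch below does so with Python’s standard library. The helper name `area_between` and the particular shifts are illustrative choices only. Each window is 3.92 z-units wide; only its center moves.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1
WIDTH = 3.92               # distance between +1.96 and -1.96


def area_between(z_low, z_high):
    """Proportion of a standard normal distribution between two z-scores."""
    return STD_NORMAL.cdf(z_high) - STD_NORMAL.cdf(z_low)


if __name__ == "__main__":
    # Slide a window of fixed width away from the center of the distribution.
    for shift in (0.0, 0.5, 1.0, 2.0):
        low = -WIDTH / 2 + shift
        high = WIDTH / 2 + shift
        print(f"z from {low:+.2f} to {high:+.2f}: area = {area_between(low, high):.4f}")
```

The centered window should show an area of about .95, and every shifted window should show less, which is the point of the 2 questions.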


Filed under Misconceptions