Is one’s health helped or hurt by being in a hot sauna? Researchers from Finland conducted a study to find out, with results recently published in a top-flight medical journal.

In the study, data came from 2,315 men who used saunas. And what were the study’s findings? Those who had saunas more often each week, and those who stayed longer in the sauna, had, over time, fewer fatal heart attacks.

Why did this study get published in a prestigious scientific journal? One of the reasons was this: the researchers used complex statistical procedures to examine the relationship between sauna use and heart attacks while controlling for things such as age, blood pressure, tobacco use, socioeconomic status, and physical activity.

Here is a list of the intermediate and advanced statistical procedures used by the researchers: (1) a chi square test, (2) analysis of variance, (3) 95% confidence intervals, (4) a multivariate Cox model with several covariates (age, BMI, systolic blood pressure, cholesterol levels, smoking, alcohol consumption, previous myocardial infarction, diabetes, cardiorespiratory fitness, resting heart rate, physical activity, and socioeconomic status), (5) sensitivity analyses, (6) survival ratios using the Kaplan-Meier method, (7) hazard ratios and cumulative hazard curves, (8) plots of Schoenfeld residuals to check the proportional hazards assumption, and (9) Martingale residuals to check the linearity assumption.

The research report was published on February 23, 2015, in the Journal of the American Medical Association: Internal Medicine. It had this title: “Association Between Sauna Bathing and Fatal Cardiovascular and All-Cause Mortality Events.” The author/researchers were Tanjaniina Laukkanen, Hassan Khan, Francesco Zaccardi, and Jari A. Laukkanen.



Filed under Applications


Newspaper Reports (Misleading)

Recently, a team of 3 researchers conducted an investigation to see if children’s post-surgical pain could be reduced by “audio therapy.”

In the study, the participating children were randomly assigned to two main groups. Those in the treatment group listened—after surgery—to a self-selected audio book or musical playlist. Those in the control group did not listen to anything. Each child in each group was asked to rate his/her post-surgical pain level on two occasions: prior to and immediately after the time the treatment group received the audio therapy.

A newspaper summary of the results stated:

“Patients listening to music and audiobooks reported feeling less pain after receiving the therapy, according to the study.”

This sentence makes it seem that audio therapy worked for everyone who got it. Not so.

Data in the published technical research report show that most children in the treatment group said their pain was lower after receiving the audio therapy. However, some of the children in that group had more pain after audio therapy than before. Also, some children in the control group had more pain reduction than some of the children who received audio therapy.

The quoted newspaper sentence should have begun with 2 important words: “On average.” By adding those two words, the misleading sentence from the newspaper report becomes a true and factual statement about the study’s results.
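To see how a group can improve "on average" while some of its members get worse, here is a minimal sketch in Python. The pain-change scores are invented purely for illustration and are not the study's data; the variable names are likewise hypothetical.

```python
from statistics import mean

# Hypothetical change in pain rating (after minus before); negative = less pain.
# These numbers are made up for illustration and are NOT from the study.
treatment_changes = [-3, -2, -2, -1, -1, 0, 1, 2]
control_changes = [-2, -1, 0, 0, 0, 1, 1, 1]

treatment_mean = mean(treatment_changes)
control_mean = mean(control_changes)

# Treated children whose pain went UP after audio therapy.
treated_worse = [c for c in treatment_changes if c > 0]

# Control children whose pain went DOWN, so they did better than
# the treated children whose pain rose.
control_improved = [c for c in control_changes if c < 0]

print(f"Mean change, treatment group: {treatment_mean:.2f}")
print(f"Mean change, control group:   {control_mean:.2f}")
print(f"Treated children with more pain afterward: {len(treated_worse)}")
print(f"Control children with less pain afterward: {len(control_improved)}")
```

With these made-up numbers, the treatment group does better on average, yet two treated children report more pain and two untreated children report less.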

(The published research report carried this title: “The effect of audio therapy to treat postoperative pain in children undergoing major surgery: a randomized controlled trial.” It was published in a peer-reviewed journal called Pediatric Surgery International.)

Filed under Mini-Lessons


The extrapolation in the accompanying cartoon (from Randall Munroe’s website) is ridiculous. No one would ever do that kind of silly data-based projection into the future. In many areas of our daily lives, however, people make unjustified predictions based on existing, accurate data. Consider these 2 examples, one dealing with the stock market and the other concerning survey research.

How do people tend to invest their money in the stock market? From controlled experiments as well as from observational studies, the findings are the same. When the stock market has been doing well, most people are “bullish” and want to invest more. In contrast, when there’s been a recent drop in stock values, the typical investor gets “bearish” and wants to sell. The term extrapolation bias has been coined to describe cases like these wherein people think that the future will be a continuation of the past. You, too, possess this bias if you make short-term predictions that fail to consider (1) the variability of data points used to form a “trend line” and (2) the possibility that a trend line can change its direction and, for example, begin to angle down even though it has been angling up.

If you receive a mailed or online survey, do you fill it out and send it in? If you do, you help to increase the survey’s response rate: the percentage of contacted people who complete and return the survey. In a recent research study, the response rate was only 8.41%. Despite receiving completed surveys from just 241 of the 2,865 people initially contacted, the researcher extrapolated the study’s findings to all of the individuals to whom the survey was sent. This is an unjustified thing to do because “nonrespondents” may well be different from those from whom data are collected. Most likely, we all are guilty of this kind of extrapolation-beyond-the-data. We hear opinions expressed by trusted friends, relatives, co-workers, neighbors, bloggers, or TV analysts, and then we presume that others have the same thoughts. ’Tis a risky thing to do!
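The response-rate arithmetic can be checked in a couple of lines of Python. The two counts come from the paragraph above; the variable names are just for illustration.

```python
# Counts quoted in the text above.
completed = 241
contacted = 2865

response_rate = completed / contacted
nonresponse_rate = 1 - response_rate

print(f"Response rate:    {response_rate:.2%}")
print(f"Nonresponse rate: {nonresponse_rate:.2%}")
```

Seen this way, the study's findings speak for fewer than 1 in 10 of the people contacted; the remaining nine-plus in ten are the "nonrespondents" about whom the data say nothing.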

Filed under Jokes & Humor, Mini-Lessons


Many People Get Fooled

The Motley Fool provides advice on money management and investing. However, its recommendations can and should be used by people in other fields. For example, the following 20-word tip, from the “Fool’s School,” should be memorized by everyone who encounters statistically-based claims or findings in politics, medicine, psychology, education, and all other arenas of our lives:

“Never blindly accept what you read. Think critically about not just words, but numbers. They’re not always what they seem.”

Here are 5 examples illustrating how numbers in statistics often do NOT mean what they seem to indicate:

Example A

If the 14 players on a basketball team have a median height of 6 feet 6 inches, it might seem that 7 of those athletes must be shorter than 6’6” whereas 7 must be taller than that. Wrong!
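A quick sketch of why this is wrong, using a hypothetical roster (heights in inches, invented for illustration). When several players stand exactly at the median, fewer than 7 are shorter and fewer than 7 are taller.

```python
from statistics import median

# Hypothetical heights in inches for 14 players; 78 inches = 6 feet 6 inches.
heights = [72, 74, 75, 76, 77, 78, 78, 78, 78, 79, 80, 81, 82, 84]

median_height = median(heights)

shorter = sum(1 for h in heights if h < median_height)
exactly = sum(1 for h in heights if h == median_height)
taller = sum(1 for h in heights if h > median_height)

print(f"Median height: {median_height} inches")
print(f"Shorter than the median: {shorter}")
print(f"Exactly at the median:   {exactly}")
print(f"Taller than the median:  {taller}")
```

In this made-up roster, the median is 6’6” but only 5 players are shorter and only 5 are taller; the other 4 are exactly 6’6”.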

Example B

If the data on 2 variables produce a correlation of +.50, it might seem that the strength of the measured relationship is exactly midway between being ultra weak and ultra strong. Not so!
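One common way to see why (though not the only one) is to square the correlation. The result, often called the coefficient of determination, is usually interpreted as the proportion of variance that the two variables share. The short sketch below assumes that interpretation.

```python
import math

r = 0.50

# Proportion of variance shared by the two variables.
shared_variance = r ** 2

# The correlation that would be needed for the variables to share half their variance.
r_for_half = math.sqrt(0.50)

print(f"r = {r:.2f} -> shared variance = {shared_variance:.0%}")
print(f"r needed for 50% shared variance = {r_for_half:.3f}")
```

By this yardstick, a correlation of +.50 sits only a quarter of the way, not halfway, between no relationship and a perfect one.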

Example C

If a carefully conducted scientific survey indicates that Candidate X currently has the support of 57% of likely voters with a margin of error of plus or minus 3 percentage points, it might seem that a duplicate survey conducted on the same day in the same way would show Candidate X’s support to be somewhere between 54% and 60%. Bad thought!
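One reason this is a bad thought: the first survey's 57% is itself an estimate with sampling error, so two surveys can differ from each other by more than either differs from the true value. The sketch below puts a rough number on this. It assumes simple random sampling, a normal approximation, a margin of error stated at the 95% confidence level, and equal sample sizes in the two surveys; none of those details is given in the example above.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()

# Critical value behind a 95% margin of error (about 1.96). Assumed, not stated.
z_crit = std_normal.inv_cdf(0.975)

# The difference between two independent estimates has a standard error
# sqrt(2) times as large as that of a single estimate, so the original
# margin of error spans fewer standard errors of the difference.
scaled = z_crit / math.sqrt(2)

# Chance that the duplicate survey lands within the first survey's margin.
prob_within = std_normal.cdf(scaled) - std_normal.cdf(-scaled)

print(f"Chance the second survey falls inside 54% to 60%: {prob_within:.1%}")
```

Under these assumptions, the chance is roughly 83%, noticeably lower than the 95% that the margin of error seems to promise.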

Example D

If a null hypothesis is tested and the data analysis indicates that p = .02, it might seem that there’s only a 2% chance that the null hypothesis is true. Nope!

Example E

If, in a multiple regression study, the correlation between a particular independent variable and the dependent variable is r = 0.00, it might seem that this independent variable is totally useless as a predictor. Not necessarily!
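A tiny constructed example shows how this can happen. Here the variable x2 has a correlation of exactly zero with y, yet it improves prediction because it cancels out the irrelevant part of x1 (statisticians often call such a predictor a "suppressor variable"). The data and variable names below are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: x1 mixes the thing we want to predict with unrelated "noise".
signal = [1, -1, 1, -1]
noise = [1, 1, -1, -1]

y = signal                                      # dependent variable
x1 = [s + n for s, n in zip(signal, noise)]     # signal contaminated by noise
x2 = noise                                      # the noise alone

r_x1_y = pearson_r(x1, y)
r_x2_y = pearson_r(x2, y)

# Using both predictors, y is recovered perfectly: y = x1 - x2.
predicted = [a - b for a, b in zip(x1, x2)]

print(f"r(x1, y) = {r_x1_y:.3f}, so x1 alone explains {r_x1_y ** 2:.0%} of y")
print(f"r(x2, y) = {r_x2_y:.3f}")
print(f"x1 - x2 reproduces y exactly: {predicted == y}")
```

With x1 alone, half of the variability in y is explained; adding the "useless" x2 raises that to 100%.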

The Motley Fool’s admonition, shown above in italics, contains 20 words. If you can’t commit to memory the entirety of this important warning, here’s a condensed version of it:

“Numbers. They’re not always what they seem.”

Filed under Mini-Lessons, Misconceptions




Year after year, the annual statistical convention is held in the same city. This metropolis has 26 pubs, each named by an alphabet letter: A, B, C, … , Y, Z. The statisticians who are nice, kind, & considerate people go to a wide variety of these “drinking holes.” The mean statisticians, however, patronize just one of them. Which one?


(This little effort at statistical humor comes from S. Huck)

*   *   *   *   *   *   *   *   *

Beyond the Joke:

In statistics, the concept or numerical value of the arithmetic mean can be symbolized in various ways.

Many people use the letter M (often capitalized and italicized) to represent the arithmetic mean. A second way to do this is with the lower-case Greek letter, mu. (In several textbooks, M is used to represent the sample mean whereas mu designates the population mean.)

A third way to symbolize the arithmetic mean is with the letter X accompanied by a short, horizontal line positioned directly above the X. This line is referred to as a “bar,” and the entire symbol is read as “X-bar.”

[Image: symbols for the mean]

In written statistical discussions, a bar can be positioned above letters (or symbols) other than the letter X. When this occurs, the bar indicates that the arithmetic mean has been (or should be) computed for the various numerical values of the variable represented by whatever letter or symbol has the bar above it. For example, if you see the lower-case letter r with a bar above it, you should refer to it as “r-bar” and guess that it represents the mean of a group of correlation coefficients.  

Filed under Jokes & Humor, Mini-Lessons


Normal Curve (Product Adoption)

When a new and good product hits the market, how fast or slow are you to buy it? Some people get it immediately. Others wait for varying lengths of time before making their purchase decision.

According to one Internet website, consumers can be classified into 5 categories based on how quickly they acquire new items. A picture of the famous bell-shaped curve, like the one shown here, indicated the descriptive labels and sizes of the 5 groups.

By considering the percentage of people in each of the 5 groups (as well as the position of the short, dark “notches” on the bell curve’s baseline), you should be able to discern that the statistical concepts of mean and standard deviation were used to “define” each group. For example, a person would be classified as an Early Adopter if he/she tends to purchase new products with a speed that’s between 1 and 2 SDs faster than average.

It is interesting to note that there are 3 sections on the left side of this bell curve but only 2 on the right. The pink area begins 1 SD from the mean and extends all the way to the right. Thus, the percentage of Laggards is equal to the combined percentages of Innovators and Early Adopters. Some people, if creating this picture anew, might split the pink area into 2 parts (thus forming a total of 6 sections rather than 5), with the percentage of Laggards equal to the percentage of Innovators.
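The group sizes can be reproduced from the normal curve itself. The sketch below assumes cut points at 2 SDs and 1 SD on the fast side, at the mean, and at 1 SD on the slow side. The names used for the two middle groups are assumptions, since the text above names only Innovators, Early Adopters, and Laggards, and the exact areas may differ slightly from rounded percentages shown in a picture.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Faster-than-average adopters sit on the left side of the curve (negative z).
innovators = std_normal.cdf(-2)
early_adopters = std_normal.cdf(-1) - std_normal.cdf(-2)
early_majority = std_normal.cdf(0) - std_normal.cdf(-1)   # assumed label
late_majority = std_normal.cdf(1) - std_normal.cdf(0)     # assumed label
laggards = 1 - std_normal.cdf(1)

for name, share in [
    ("Innovators", innovators),
    ("Early Adopters", early_adopters),
    ("Early Majority", early_majority),
    ("Late Majority", late_majority),
    ("Laggards", laggards),
]:
    print(f"{name:15s} {share:6.2%}")

# Splitting the Laggards at 2 SDs would give a 6-section picture in which
# the final section is the same size as the Innovators.
beyond_two_sd = 1 - std_normal.cdf(2)
print(f"Slowest section after a split: {beyond_two_sd:.2%}")
```

The output confirms the point made above: the Laggards' share (about 16%) matches the combined share of Innovators and Early Adopters, and the slowest piece after a split would be the same size as the Innovators' share (a bit over 2%).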

The original discussion referred to this picture as the “Product Diffusion Curve.”

Filed under Applications, Mini-Lessons


Mean Median Mode (Oldest)

According to one source, the word “mode” was first used (by Karl Pearson) in 1895. The concept of the “median” is a bit older; it was used for the first time by the Frenchman Antoine Cournot in 1843. The notion of the “arithmetic mean” is even older.

QUESTION: When do you think the concept of the “arithmetic mean” was born?

ANSWER: A long, long, LONG time ago. The Pythagoreans studied it in the 5th century B.C.

Filed under History of Statistical Terms