Is one’s health helped or hurt by being in a hot sauna? Researchers from Finland conducted a study to find out, with results recently published in a top-flight medical journal.

In the study, data came from 2,315 men who used saunas. And what were the study’s findings? Those who had saunas more often each week, and those who stayed longer in the sauna, had, over time, fewer fatal heart attacks.

Why did this study get published in a prestigious scientific journal? One reason was this: the researchers used complex statistical procedures to examine the relationship between sauna use and heart attacks while controlling for variables such as age, blood pressure, tobacco use, socioeconomic status, and physical activity.

Here is a list of the intermediate and advanced statistical procedures used by the researchers: (1) a chi-square test, (2) analysis of variance, (3) 95% confidence intervals, (4) a multivariate Cox model with several covariates (age, BMI, systolic blood pressure, cholesterol levels, smoking, alcohol consumption, previous myocardial infarction, diabetes, cardiorespiratory fitness, resting heart rate, physical activity, and socioeconomic status), (5) sensitivity analyses, (6) survival estimates using the Kaplan-Meier method, (7) hazard ratios and cumulative hazard curves, (8) plots of Schoenfeld residuals to check the proportional hazards assumption, and (9) Martingale residuals to check the linearity assumption.
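To get a feel for one of these procedures, here is a small sketch, in plain Python with made-up follow-up data (not the study's actual numbers or its full analysis), of how the Kaplan-Meier method turns survival times into survival probabilities:

```python
from collections import Counter

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates.

    times  : follow-up time for each person
    events : 1 if the event (e.g., a fatal heart attack) occurred, 0 if censored
    Returns (time, estimated probability of surviving past that time) pairs.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)          # everyone leaves the risk set at their time
    n_at_risk = len(times)
    survival, curve = 1.0, []
    for t in sorted(leaving):
        if deaths[t]:
            survival *= (n_at_risk - deaths[t]) / n_at_risk
            curve.append((t, round(survival, 3)))
        n_at_risk -= leaving[t]
    return curve

# Six imaginary participants followed for up to 8 years:
print(kaplan_meier(times=[2, 3, 3, 5, 8, 8], events=[1, 1, 0, 1, 0, 0]))
# -> [(2, 0.833), (3, 0.667), (5, 0.444)]
```

Each step multiplies the running survival probability by the fraction of at-risk people who survived that event time; censored people (event = 0) simply drop out of the risk set.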

The research report was published on February 23, 2015, in the *Journal of the American Medical Association: Internal Medicine*. It had this title: “Association Between Sauna Bathing and Fatal Cardiovascular and All-Cause Mortality Events.” The authors were Tanjaniina Laukkanen, Hassan Khan, Francesco Zaccardi, and Jari A. Laukkanen.


Recently, a team of 3 researchers conducted an investigation to see if children’s post-surgical pain could be reduced by “audio therapy.”

In the study, the participating children were randomly assigned to two main groups. Those in the treatment group listened—after surgery—to a self-selected audio book or musical playlist. Those in the control group did not listen to anything. Each child in each group was asked to rate his/her post-surgical pain level on two occasions: prior to and immediately after the time the treatment group received the audio therapy.

A newspaper summary of the results stated:

“*Patients listening to music and audiobooks reported feeling less pain after receiving the therapy, according to the study*.”

This sentence makes it seem that audio therapy worked for everyone who got it. Not so.

Data in the published technical research report show that most children in the treatment group said their pain was lower after receiving the audio therapy. However, some of the children in that group had more pain after audio therapy than before. Also, some children in the control group had more pain reduction than some of the children who received audio therapy.

The quoted newspaper sentence should have begun with 2 important words: “On average.” By adding those two words, the misleading sentence from the newspaper report becomes a true and factual statement about the study’s results.
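A tiny Python sketch, using invented pain ratings rather than the study's data, shows why those two words matter: the average change can indicate improvement even when some individuals get worse.

```python
from statistics import mean

# Invented 0-to-10 pain ratings for seven children, before and after therapy:
before = [7, 6, 8, 5, 9, 6, 7]
after  = [4, 5, 6, 6, 7, 3, 5]        # note: the fourth child reports MORE pain

changes = [a - b for b, a in zip(before, after)]
print(round(mean(changes), 2))        # -1.71 -> less pain, ON AVERAGE
print(any(c > 0 for c in changes))    # True  -> but not for every child
```

The group mean drops even though one child's pain went up, which is exactly the distinction the newspaper sentence blurred.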

(The published research report carried this title: “The effect of audio therapy to treat postoperative pain in children undergoing major surgery: a randomized controlled trial.” It was published in a peer-reviewed journal called *Pediatric Surgery*.)

The extrapolation in the accompanying cartoon (from Randall Munroe’s website, xkcd.com) is ridiculous. No one would ever do that kind of silly data-based projection into the future. In many areas of our daily lives, however, people make unjustified predictions based on existing, accurate data. Consider these 2 examples, one dealing with the stock market and the other concerning survey research.

How do people tend to invest their money in the stock market? From controlled experiments as well as from observational studies, the findings are the same. When the stock market has been doing well, most people are “bullish” and want to invest more. In contrast, when there’s been a recent drop in stock values, the typical investor gets “bearish” and wants to sell. The term *extrapolation bias* has been coined to describe cases like these wherein people think that the future will be a continuation of the past. You, too, possess this bias if you make short-term predictions that fail to consider (1) the variability of data points used to form a “trend line” and (2) the possibility that a trend line can change its direction and, for example, begin to angle down even though it has been angling up.
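To make points (1) and (2) concrete, here is a minimal Python sketch of naive straight-line extrapolation, using invented prices: the fitted trend line says nothing about whether the trend will continue.

```python
def trend_line(xs, ys):
    """Ordinary least-squares slope and intercept (pure Python)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Five days of invented, steadily rising prices:
days, prices = [0, 1, 2, 3, 4], [100, 104, 109, 113, 118]
slope, intercept = trend_line(days, prices)
forecast = round(intercept + slope * 6, 1)
print(round(slope, 2))   # 4.5 points per day
print(forecast)          # 126.8 -- the naive day-6 projection
# Nothing in the fit guarantees the trend continues; the actual day-6
# price could just as easily be 110.
```

The least-squares fit summarizes the past; treating its extension as a forecast is precisely the extrapolation bias described above.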

If you receive a mailed or online survey, do you fill it out and send it in? If you do, you help to increase the survey’s *response rate*: the percentage of contacted people who complete and return the survey. In a recent research study (http://gradworks.umi.com/33/91/3391754.html), the response rate was only 8.41%. Despite receiving completed surveys from just 241 of the 2,865 people initially contacted, the researcher extrapolated the study’s findings to all of the individuals to whom the survey was sent. This is an unjustified thing to do because “nonrespondents” may well be different from those from whom data are collected. Most likely, we all are guilty of this kind of extrapolation-beyond-the-data. We hear opinions expressed by trusted friends, relatives, co-workers, neighbors, bloggers, or TV analysts, and then we presume that others have the same thoughts. ’Tis a risky thing to do!
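Here is the response-rate arithmetic, followed by a deliberately exaggerated, entirely hypothetical illustration of why nonrespondents matter (the satisfaction counts below are invented, not from the cited study):

```python
contacted, returned = 2865, 241
print(round(100 * returned / contacted, 2))    # 8.41 -- the response rate

# Purely invented numbers: suppose satisfied people mail the survey back
# far more often than dissatisfied ones.
satisfied_all, satisfied_resp = 1000, 180      # hypothetical counts
true_share = satisfied_all / contacted         # satisfied share among ALL contacted
observed_share = satisfied_resp / returned     # satisfied share among RESPONDENTS
print(round(true_share, 2), round(observed_share, 2))   # 0.35 vs. 0.75
```

Under these made-up numbers, extrapolating from the 241 respondents would more than double the apparent satisfaction rate, which is the danger of generalizing to everyone who was sent the survey.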

The Motley Fool provides advice on money management and investing. However, its recommendations can and should be used by people in other fields. For example, the following 20-word tip, from the “Fool’s School,” should be memorized by everyone who encounters statistically-based claims or findings in politics, medicine, psychology, education, and all other arenas of our lives:

*“Never blindly accept what you read. Think critically about not just words, but numbers. They’re not always what they seem.”*

Here are 5 examples illustrating how numbers in statistics often do **NOT** mean what they seem to indicate:

**Example A**

If the 14 players on a basketball team have a median height of 6 feet 6 inches, it might seem that 7 of those athletes must be shorter than 6’6” whereas 7 must be taller than that. Wrong!
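With an even number of players and several athletes standing exactly 6'6", the median need not split the roster 7-and-7. A quick Python check, using an invented roster with heights in inches (6'6" = 78):

```python
from statistics import median

# Hypothetical 14-player roster, heights in inches:
heights = [74, 75, 76, 77, 77, 78, 78, 78, 78, 78, 79, 80, 81, 82]
print(median(heights))                  # 78.0 -> the median is 6'6"
print(sum(h < 78 for h in heights))     # 5 -> only 5 players are shorter
print(sum(h > 78 for h in heights))     # 4 -> only 4 players are taller
```

Ties at the median value are what break the intuitive 7-and-7 split.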

**Example B**

If the data on 2 variables produce a correlation of +.50, it might seem that the strength of the measured relationship is exactly midway between being ultra weak and ultra strong. Not so!

**Example C**

If a carefully conducted scientific survey indicates that Candidate X currently has the support of 57% of likely voters with a margin of error of plus or minus 3 percentage points, it might seem that a duplicate survey conducted on the same day in the same way would show Candidate X’s support to be somewhere between 54% and 60%. Bad thought!
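A back-of-the-envelope simulation in pure Python suggests why that thought is bad. The numbers are assumptions, not real poll data: a true support level of 57% and a sample size of 1,068, which gives roughly a 3-point margin of error.

```python
import random

random.seed(1)                  # fixed seed so the sketch is reproducible
p_true, n = 0.57, 1068          # assumed true support and sample size

def one_survey():
    """Simulate one poll: the observed share supporting Candidate X."""
    return sum(random.random() < p_true for _ in range(n)) / n

trials = 1000
inside = sum(abs(one_survey() - one_survey()) <= 0.03 for _ in range(trials))
print(inside / trials)          # in theory about .84 -- well below .95
```

Both surveys bounce around the true value, so the chance that the duplicate lands within 3 points of the first is noticeably less than 95%.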

**Example D**

If a null hypothesis is tested and the data analysis indicates that *p* = .02, it might seem that there’s only a 2% chance that the null hypothesis is true. Nope!

**Example E**

If, in a multiple regression study, the correlation between a particular independent variable and the dependent variable is *r* = 0.00, it might seem that this independent variable is totally useless as a predictor. Not necessarily!
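Example E is the classic suppressor-variable situation. A tiny, contrived Python illustration (the numbers are invented): a predictor with r = 0.00 against the outcome that is nonetheless indispensable, because it soaks up the noise in the other predictor.

```python
from math import sqrt

def corr(xs, ys):
    """Pearson correlation, pure Python."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs)
                      * sum((y - my) ** 2 for y in ys))

Y  = [1, 2, 3, 4, 5]                 # outcome (invented numbers)
E  = [1, -1, -1, 1, 0]               # the "useless" predictor: r(E, Y) = 0
X1 = [y + e for y, e in zip(Y, E)]   # a noisy predictor of Y

print(corr(E, Y))                    # 0.0 -- zero correlation with the outcome
# Yet X1 and E together predict Y PERFECTLY, because Y = X1 - E:
print(all(y == x - e for y, x, e in zip(Y, X1, E)))   # True
```

E carries no direct information about Y, but subtracting it from X1 recovers Y exactly, so dropping the "useless" predictor would wreck the regression.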

The Motley Fool’s admonition, shown above in italics, contains 20 words. If you can’t commit to memory the entirety of this important warning, here’s a condensed version of it:

*“Numbers. They’re not always what they seem.”*

BACKGROUND & QUESTION:

Year after year, the annual statistical convention is held in the same city. This metropolis has 26 pubs, each named by an alphabet letter: A, B, C, … , Y, Z. The statisticians who are nice, kind, & considerate people go to a wide variety of these “drinking holes.” The mean statisticians, however, patronize just one of them. Which one?

ANSWER: The X-Bar

(This little effort at statistical humor comes from S. Huck)

* * * * * * * * *

**Beyond the Joke**:

*In statistics, the concept or numerical value of the arithmetic mean can be symbolized in various ways.*

*Many people use the letter M (often capitalized and italicized) to represent the arithmetic mean. A second way to do this is with the lower-case Greek letter, mu. (In several textbooks, M is used to represent the sample mean whereas mu designates the population mean.)*

*A third way to symbolize the arithmetic mean is with the letter X accompanied by a short, horizontal line positioned directly above the X. This line is referred to as a “bar,” and the entire symbol is read as “X-bar.”*

*In written statistical discussions, a bar can be positioned above letters (or symbols) other than the letter X. When this occurs, the bar indicates that the arithmetic mean has been (or should be) computed for the various numerical values of the variable represented by whatever letter or symbol has the bar above it. For example, if you see the lower-case letter r with a bar above it, you should refer to it as “r-bar” and guess that it represents the mean of a group of correlation coefficients.*

When a new and good product hits the market, how fast or slow are you to buy it? Some people get it immediately. Others wait for varying lengths of time before making their purchase decision.

According to one Internet website (www.quickmba.com), consumers can be classified into 5 categories based on how quickly they acquire new items. A picture of the famous bell-shaped curve, like the one shown here, indicated the descriptive labels and sizes of the 5 groups.

By considering the percentage of people in each of the 5 groups (as well as the position of the short, dark “notches” on the bell curve’s baseline), you should be able to discern that the statistical concepts of mean and standard deviation were used to “define” each group. For example, a person would be classified as an Early Adopter if he/she tends to purchase new products with a speed that’s between 1 and 2 SDs faster than average.

It is interesting to note that there are 3 sections on the left side of this bell curve but only 2 on the right. The pink area begins 1 SD from the mean and extends all the way to the right. Thus, the percentage of Laggards is equal to the combined percentages of Innovators and Early Adopters. Some people, if creating this picture anew, might split the pink area into 2 parts (thus forming a total of 6 sections rather than 5), with the percentage of Laggards equal to the percentage of Innovators.
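The five percentages come straight from areas under the standard normal curve. A short Python check using the exact normal areas (marketers round these to 2.5%, 13.5%, 34%, 34%, and 16%; speeds are in SDs from the mean, with "faster than average" on the left):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

innovators     = phi(-2)             # more than 2 SDs faster     ~  2.3%
early_adopters = phi(-1) - phi(-2)   # between 1 and 2 SDs faster ~ 13.6%
early_majority = phi(0)  - phi(-1)   # within 1 SD, fast side     ~ 34.1%
late_majority  = phi(1)  - phi(0)    # within 1 SD, slow side     ~ 34.1%
laggards       = 1 - phi(1)          # more than 1 SD slower      ~ 15.9%

print(round(laggards, 3))                           # 0.159
print(round(innovators + early_adopters, 3))        # 0.159 -- the same share
```

Because the normal curve is symmetric, the single pink region beyond +1 SD matches the two left-hand regions beyond -1 SD combined, which is exactly the Laggards-equals-Innovators-plus-Early-Adopters point made above.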

To see the original discussion of what was called the “Product Diffusion Curve,” go to http://www.quickmba.com/marketing/product/diffusion/

According to one source, the word “mode” was first used (by Karl Pearson) in 1895. The concept of the “median” is a bit older; it was used for the first time by the Frenchman Antoine Cournot in 1843. The notion of the “arithmetic mean” is even older.

QUESTION: When do you think the concept of the “arithmetic mean” was born?

ANSWER: A long, long, LONG time ago. The Pythagoreans studied it in the 5th century B.C.

College and university buildings are usually named after the individual(s) who provide all or most of the money needed to design and build them. On rare occasions, however, a building is named in honor of a professor. That’s the case with the Lindquist Center at the University of Iowa. It is named after E. F. Lindquist, a teacher and scholar who made major contributions to the fields of statistics and testing.

In one of the books Lindquist authored, he provided some sage advice to those who analyze data with statistical tools **and** to those who read or hear the research-based claims made by those who have analyzed data statistically. Here is what Lindquist said:

“Sound statistical judgment involves a keen appreciation of the inherent LIMITATIONS of statistical techniques and of the original data to which they are applied. In the derivation of these techniques, assumptions are frequently made which cannot be satisfied completely in practical applications. The failure to satisfy these conditions necessitates many qualifications in the interpretations of the results obtained.”

In the middle sentence of this passage, notice that Lindquist points out that important assumptions (concerning data and analytic tools) frequently are **not** satisfied in studies conducted out in the “real world.” As a consequence of these assumptions being violated, Lindquist then asserts, research findings need to be qualified. Being aware of the LIMITATIONS of statistics, he argues, is necessary for sound statistical judgment.

Unfortunately, many applied researchers who publish research reports based on the statistical analysis of numerical data pay little or no attention to the limitations of their data and of the statistical tools they use. Theoretically, the review process used by good journals is supposed to prevent the publication of articles lacking the “sound statistical judgment” called for by Lindquist. In practice, however, not-so-good articles sometimes slip through the review process.

When reading or listening to the summary of a statistically-based research investigation, be vigilant and try to discern whether or not the researcher(s) who conducted the investigation used what Lindquist referred to as “sound statistical judgment.” If so, be more inclined to be influenced by the study’s finding(s). If not, resist the temptation to believe all you read or hear simply because it’s a summary of research.

Imagine that each of N=6 men has a hat. Also imagine that these hats are identical except that each man’s name is written inside his hat. Finally, imagine that the 6 hats are taken up and then later, because they look alike, randomly returned to the men.

As the 6 hats are returned to the 6 men, there’s a chance that no man will receive his own hat. The chance of this happening is a tad greater than 1 in 3. To be more precise, the probability (to 3 decimal places) of all 6 hats going to the *wrong* individuals is .368.

Now, let’s add a new wrinkle to this imaginary situation. Suppose the number of men (each with a hat) is greater than 6. What if there are 7 men? Or 8? Or more? As N increases, what happens to the probability that no hat will be returned to its proper owner? Some people guess that this probability goes up as N increases. Others guess that this probability goes down.

Both thoughts are wrong.

That’s because the likelihood of no correct “match” is virtually the same for any N > 5, whether N = 6 or N = 600 or N = 600,000!

The actual probability (p) of having no hat returned to its proper owner is given by this formula:

p = 1/(2!) – 1/(3!) + 1/(4!) – 1/(5!) + . . .

where there are N-1 terms on the right side of the equation. With the symbol “!” standing for “factorial,” we could rewrite the above formula as

p = 1/2 – 1/6 + 1/24 – 1/120 + . . .

As either of the above formulas shows, additional terms on the right side of the equation have a smaller and smaller impact on the value of p. Moreover, the drop-off of this impact is sharp, not gradual. This fact is made clear by the following chart showing the value of p, to 6 decimal places, for the case where N = 2, 3, 4, … , 10.

N = 2 p = .500000

N = 3 p = .333333

N = 4 p = .375000

N = 5 p = .366667

N = 6 p = .368056

N = 7 p = .367857

N = 8 p = .367882

N = 9 p = .367879

N = 10 p = .367879
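The chart's values can be reproduced directly from the alternating series, and a short Python sketch also shows why the probability freezes so quickly: the series races toward 1/e.

```python
from math import factorial, exp

def p_no_match(n):
    """P(no hat returns to its owner) = sum of the alternating series,
    i.e., derangements(n) / n!."""
    return sum((-1) ** k / factorial(k) for k in range(2, n + 1))

for n in (2, 6, 10, 600):
    print(n, round(p_no_match(n), 6))
print(round(1 / exp(1), 6))          # the limit: 1/e = 0.367879...
```

Whether N is 6, 600, or 600,000, the answer is indistinguishable from 1/e after a handful of terms, because each new term 1/N! is minuscule compared with the one before it.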

It should be noted that this puzzle question is sometimes referred to as “Montmort’s Problem.” Montmort was a Frenchman who studied the probability behind a game called “Treize.” (*Treize* is the French word for 13.) In its original form, the puzzle question dealt with a jar containing identical balls numbered 1, 2, 3, … , 13. If balls are randomly pulled out of the jar, one at a time, the puzzle question was stated like this: “What’s the probability that the 1st ball taken from the jar will *not* be the ball numbered 1, that the 2nd ball will *not* be the ball numbered 2, and so on, with the end result being that no number on any ball matches the order in which the ball is removed from the jar?”


(The following effort at statistical humor comes from S. Huck)

BACKGROUND AND BUBBA’S QUESTION:

True to the weather forecast, the college campus was being blasted by a heavy snowfall. Inside a small dining hall, students were eating, studying, talking, & texting. Suddenly, Bubba darted outside where he scooped up some of the white stuff, packed it together in his hands, and then quickly returned inside. Getting everyone’s attention, Bubba held up the cold, white sphere he had just made and said: “Hey, I learned about this in my stats course. Guess what it is?”

BUBBA’S ANSWER: “A snowball sample!”

* * * * * * * * *

**Beyond the Joke**: *Things worth knowing about snowball samples:*

**1. Definition**: *A snowball sample is formed while people are being recruited to serve as a study’s research participants. Through face-to-face contact or indirect methods (such as posted notices), the researcher solicits certain individuals to enter the study voluntarily. Those initial volunteers are then asked to recruit additional participants. This process, with existing volunteers recruiting new volunteers, continues until the desired sample size has been achieved.*

**2. Idea Behind the Name**: *Imagine a snowball rolling down a steep, snow-covered hill. At first, the snowball is small, but it gets larger and larger as it heads toward the bottom of the hill. In a similar fashion, a snowball sample grows in size as volunteer participants successfully recruit additional participants.*

**3. When Used**: *Snowball samples are used mainly in studies wherein (1) the researcher doesn’t know who the potential participants are or how to contact them, or (2) potential volunteers are more likely to agree to be in a study if they are recruited by a “peer” rather than by an unknown researcher.*

**4. Example**: *In a research report entitled “The ‘Staying Safe’ Intervention: Training People Who Inject Drugs in Strategies to Avoid Injection-Related HCV and HIV Infections” (from the journal AIDS EDUCATION AND PREVENTION), the researchers stated that “Snowball sampling of participants began with eight participants directly recruited from two sources…. These eight participants then recruited 60 eligible peers.”*

**5. Quality**: *Because of the way snowball samples are formed, it is difficult to generalize from them to larger populations. (Such generalizations are much easier to make with stratified random samples and other kinds of samples classified as “probability samples.”) Thus, snowball samples are most useful in studies wherein (1) the goal is to generate rather than confirm hypotheses or (2) the participants, collectively, are considered to be the target group of interest.*
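A toy Python sketch of the "rolling snowball" growth described in items 1 and 2. The recruitment numbers are invented and far more regular than any real study's waves would be:

```python
def snowball_size(seeds, recruits_per_person, target):
    """Toy model: each wave of new volunteers recruits a fixed number of peers."""
    total, wave, waves = seeds, seeds, [seeds]
    while total < target:
        wave *= recruits_per_person          # the snowball picks up speed
        total += wave
        waves.append(wave)
    return total, waves

# Hypothetical run: 8 seeds, each new volunteer recruits 2 peers, target n = 60
total, waves = snowball_size(8, 2, 60)
print(waves)   # [8, 16, 32, 64] -- each wave doubles, like a rolling snowball
print(total)   # 120
```

The accelerating wave sizes are the point of the metaphor: a small set of seeds can produce a large sample in just a few recruitment rounds.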