Tag Archives: Mean

A DRINKING HOLE

BACKGROUND & QUESTION:

Year after year, the annual statistical convention is held in the same city. This metropolis has 26 pubs, each named for a letter of the alphabet: A, B, C, … , Y, Z. The statisticians who are nice, kind, & considerate people go to a wide variety of these “drinking holes.” The mean statisticians, however, patronize just one of them. Which one?

(This little effort at statistical humor comes from S. Huck)

*   *   *   *   *   *   *   *   *

Beyond the Joke:

In statistics, the concept or numerical value of the arithmetic mean can be symbolized in various ways.

Many people use the letter M (often capitalized and italicized) to represent the arithmetic mean. A second way to do this is with the lower-case Greek letter mu (μ). (In several textbooks, M is used to represent the sample mean whereas mu designates the population mean.)

A third way to symbolize the arithmetic mean is with the letter X accompanied by a short, horizontal line positioned directly above the X. This line is referred to as a “bar,” and the entire symbol is read as “X-bar.”

In written statistical discussions, a bar can be positioned above letters (or symbols) other than the letter X. When this occurs, the bar indicates that the arithmetic mean has been (or should be) computed for the various numerical values of the variable represented by whatever letter or symbol has the bar above it. For example, if you see the lower-case letter r with a bar above it, you should refer to it as “r-bar” and guess that it represents the mean of a group of correlation coefficients.
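As a minimal sketch of the bar notation, the Python snippet below computes an X-bar and an r-bar. The scores and correlation coefficients are made-up values chosen only for illustration, and the variable names are hypothetical.

```python
from statistics import mean

# Hypothetical scores on some variable X (made-up values).
x_values = [4, 8, 6, 10, 2]
x_bar = mean(x_values)  # "X-bar": the arithmetic mean of the X scores

# Hypothetical correlation coefficients from several studies (made-up values).
r_values = [0.25, 0.5, 0.75]
r_bar = mean(r_values)  # "r-bar": the arithmetic mean of the r values

print(f"X-bar = {x_bar}")
print(f"r-bar = {r_bar}")
```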

Filed under Jokes & Humor, Mini-Lessons

HOW FAST DO YOU BUY THINGS?

When a new and good product hits the market, how fast or slow are you to buy it? Some people get it immediately. Others wait for varying lengths of time before making their purchase decision.

According to one Internet website (www.quickmba.com), consumers can be classified into 5 categories based on how quickly they acquire new items. A picture of the famous bell-shaped curve, like the one shown here, indicated the descriptive labels and sizes of the 5 groups.

By considering the percentage of people in each of the 5 groups (as well as the position of the short, dark “notches” on the bell curve’s baseline), you should be able to discern that the statistical concepts of mean and standard deviation were used to “define” each group. For example, a person would be classified as an Early Adopter if he/she tends to purchase new products with a speed that’s between 1 and 2 SDs faster than average.

It is interesting to note that there are 3 sections on the left side of this bell curve but only 2 on the right. The pink area begins 1 SD to the right of the mean and extends all the way to the right tail. Thus, the percentage of Laggards is equal to the combined percentages of Innovators and Early Adopters. Some people, if creating this picture anew, might split the pink area into 2 parts at the 2-SD point (thus forming a total of 6 sections rather than 5), with the percentage of Laggards then equal to the percentage of Innovators.
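The group sizes can be checked with a short Python sketch. It assumes that purchase speed is exactly normally distributed and that the notches sit at whole-SD points. The labels Early Majority and Late Majority for the two middle groups are the usual diffusion-model names and are an assumption here, since only three of the five labels are named in this post.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal curve: mean 0, SD 1

# Negative z-scores are on the left (fast) side of the curve.
sizes = {
    "Innovators": Z.cdf(-2),                  # more than 2 SDs faster
    "Early Adopters": Z.cdf(-1) - Z.cdf(-2),  # between 1 and 2 SDs faster
    "Early Majority": Z.cdf(0) - Z.cdf(-1),   # within 1 SD, fast side (assumed label)
    "Late Majority": Z.cdf(1) - Z.cdf(0),     # within 1 SD, slow side (assumed label)
    "Laggards": 1 - Z.cdf(1),                 # more than 1 SD slower (the pink area)
}

for label, share in sizes.items():
    print(f"{label:15s} {share:6.1%}")

# The two fastest groups together versus the single right-tail group.
fast_two = sizes["Innovators"] + sizes["Early Adopters"]
print(f"Innovators + Early Adopters: {fast_two:.1%}; Laggards: {sizes['Laggards']:.1%}")

# Splitting the pink area at +2 SDs would create a 6th group the size of Innovators.
beyond_two_sd = 1 - Z.cdf(2)
print(f"Beyond +2 SDs: {beyond_two_sd:.1%}")
```

The exact normal-curve areas are close to, but not necessarily identical to, the rounded percentages printed on charts of this kind.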

To see the original discussion of what was called the “Product Diffusion Curve,” go to http://www.quickmba.com/marketing/product/diffusion/

Filed under Applications, Mini-Lessons

HOW OLD ARE THE MEDIAN, MODE, & ARITHMETIC MEAN?

According to one source, the word “mode” was first used (by Karl Pearson) in 1895. The concept of the “median” is a bit older; it was used for the first time by the Frenchman Antoine Cournot in 1843. The notion of the “arithmetic mean” is even older.

QUESTION: When do you think the concept of the “arithmetic mean” was born?

ANSWER: A long, long, LONG time ago. The Pythagoreans studied it in the 5th century B.C.

Filed under History of Statistical Terms

DO YOU KNOW SOMEONE WITH CANCER?

Dr. Stephen Jay Gould, a world-famous scientist who taught at Harvard, was 40 when diagnosed with cancer. He discovered that people with his kind of cancer live for a median of 8 months. Gould’s down-to-earth essay, “The Median Isn’t the Message,” deals with his “survival expectancy.” It’s considered by some to be “the wisest, most humane thing ever written about cancer and statistics.”

In his thoughtful commentary, Gould offered some important advice to those who hear (for themselves or a loved one) grim diagnoses based on median survival times. As Gould so correctly pointed out,

What does “median mortality of eight months” signify in our vernacular? I suspect that most people, without training in statistics, would read such a statement as “I will probably be dead in eight months”––the very conclusion that must be avoided, since it isn’t so….

Dr. Gould’s essay contains important food for thought not just for those concerned about cancer or other life-ending diseases, but also for those who produce or receive statistically based research claims in any discipline. In a nutshell, his admonition says: Don’t focus so heavily on means and medians that the important underlying variability is totally overlooked. As Gould put it,

We still carry the historical baggage of a Platonic heritage that seeks sharp essences and definite boundaries. … This Platonic heritage, with its emphasis in clear distinctions and separated immutable entities, leads us to view statistical measures of central tendency wrongly, indeed opposite to the appropriate interpretation in our actual world of variation, shadings, and continua. In short, we view means and medians as the hard “realities,” and the variation that permits their calculation as a set of transient and imperfect measurements of this hidden essence.

If the median is the reality and variation around the median just a device for its calculation, the “I will probably be dead in eight months” may pass as a reasonable interpretation. … But all evolutionary biologists know that variation itself is nature’s only irreducible essence. Variation is the hard reality, not a set of imperfect measures for a central tendency. Means and medians are the abstractions.
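Gould’s point can be illustrated with a small Python sketch. Both lists of survival times below are invented for illustration only (they are not real patient data and are not taken from Gould’s essay). They share the same median yet describe very different situations.

```python
from statistics import median

# Two invented sets of survival times in months (illustrative only, not real data).
group_a = [6, 7, 8, 8, 8, 9, 10]
group_b = [1, 3, 8, 8, 8, 60, 240]

for name, months in (("A", group_a), ("B", group_b)):
    print(f"Group {name}: median = {median(months)}, "
          f"shortest = {min(months)}, longest = {max(months)}")
```

Reporting only the median would make the two groups look identical; the spread is what tells them apart.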

If you’d like to hear Gould’s essay read while you see a series of photos of him at work and play, click this link: http://www.youtube.com/watch?v=cH6XuiOBbkc

If you’d prefer to read Gould’s essay yourself, or print it, go here: http://cancerguide.org/median_not_msg.html

Filed under Applications, Mini-Lessons, Quotes

HOW CONFIDENT CAN YOU BE IN A CONFIDENCE INTERVAL?

Suppose you use data from a random sample to build a 95% CI around the sample’s mean. Next, suppose you put that sample back into the population. Finally, suppose you get ready to extract a 2nd random sample from the same population, with plans to use the new data to compute just the 2nd sample’s mean. How confident can you be that your 2nd sample’s mean will lie somewhere between the end points of the 1st sample’s 95% CI?

Did you say or think: “95% confident”?

If you did, you’re a bit more confident than you actually should be!

If your 1st sample’s mean were to match perfectly the mean of the population, you could be 95% confident that the 2nd sample’s mean would turn out to be “inside” the 1st sample’s 95% CI. That’s because the end points of your CI would coincide with the 2 points in a sampling distribution of the mean that serve to bookend the middle 95% of that distribution’s means. Select a 2nd sample, and its mean would have a 95% chance of landing between those bookends.

Your 1st sample, however, is not likely to have a mean that matches up perfectly with μ. This will cause the 1st sample’s 95% CI to be “off-center” in the sampling distribution of means. More than half of the CI will be located on the high (or low) side of that distribution’s midpoint, the true population mean. Because the CI’s end points do not coincide with the points that bookend the middle 95% of the sampling distribution of means, the 95% CI captures less than 95% of those means.

To prove to yourself that a 95% CI based on one sample’s data does not predict, with 95% accuracy, what a 2nd sample’s mean will be like, answer these 2 little questions: (1) How much of a normal distribution lies between the z-score points of +1.96 and –1.96? (2) How much of a normal distribution lies between any other pair of z-scores that are that same distance (3.92) apart from each other?
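One way to work through the two questions is the Python sketch below. It assumes a normal sampling distribution with a known standard error, so every 95% CI is exactly 3.92 standard errors wide, and it measures how far the 1st sample’s mean sits from μ in standard-error units. The function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # sampling distribution of the mean, in standard-error units


def capture_rate(shift, half_width=1.96):
    """Chance that a 2nd sample's mean lands inside a CI whose center
    (the 1st sample's mean) lies `shift` standard errors from mu."""
    return Z.cdf(shift + half_width) - Z.cdf(shift - half_width)


centered = capture_rate(0.0)    # question 1: between -1.96 and +1.96
off_by_one = capture_rate(1.0)  # question 2: between -0.96 and +2.96
off_by_two = capture_rate(2.0)  # question 2: between +0.04 and +3.96

# Averaged over every 1st sample that could have been drawn: the difference
# between two independent sample means has a standard error sqrt(2) times as
# large as that of a single mean, so the cut-off shrinks to 1.96 / sqrt(2).
average = 2 * Z.cdf(1.96 / sqrt(2)) - 1

for label, p in [("CI centered on mu", centered),
                 ("CI off by 1 SE", off_by_one),
                 ("CI off by 2 SEs", off_by_two),
                 ("long-run average", average)]:
    print(f"{label:20s} {p:.1%}")
```

Under these assumptions, only the perfectly centered interval captures 95% of the possible 2nd-sample means; the long-run average works out to roughly 83%.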

Filed under Misconceptions

STATISTICIANS ARE UNUSUAL YET QUITE AVERAGE

On the one hand, statisticians are a bit weird. On the other hand, they are altogether average. Here’s the proof:

1. They often break the law and drive their cars on the MEDIAN.
2. At dinner, they invariably want more desserts than anyone else, and they always want them à la MODE.
3. If you sum up their deviations, they lose their cool and get incredibly MEAN.

(This little effort at statistical humor comes from S. Huck)