How to Lie With Statistics

Ellipse

The Living Force
FOTCM Member
How to Lie With Statistics - Darrell Huff, Irving Geis - 1993

In the series of books about lies, this one is interesting because it explains the tactics commonly used to manipulate statistics, and so how to recognize those patterns.

"There is terror in numbers," writes Darrell Huff in How to Lie with Statistics. And nowhere does this terror translate to blind acceptance of authority more than in the slippery world of averages, correlations, graphs, and trends. Huff sought to break through "the daze that follows the collision of statistics with the human mind" with this slim volume, first published in 1954. The book remains relevant as a wake-up call for people unaccustomed to examining the endless flow of numbers pouring from Wall Street, Madison Avenue, and everywhere else someone has an axe to grind, a point to prove, or a product to sell. "The secret language of statistics, so appealing in a fact-minded culture, is employed to sensationalize, inflate, confuse, and oversimplify," warns Huff. Although many of the examples used in the book are charmingly dated, the cautions are timeless. Statistics are rife with opportunities for misuse, from "gee-whiz graphs" that add nonexistent drama to trends, to "results" detached from their method and meaning, to statistics' ultimate bugaboo - faulty cause-and-effect reasoning. Huff's tone is tolerant and amused, but no-nonsense. Like a lecturing father, he expects you to learn something useful from the book, and start applying it every day. Never be a sucker again, he cries! Even if you can't find a source of demonstrable bias, allow yourself some degree of skepticism about the results as long as there is a possibility of bias somewhere. There always is. Read How to Lie with Statistics. Whether you encounter statistics at work, at school, or in advertising, you'll remember its simple lessons. Don't be terrorized by numbers, Huff implores. "The fact is that, despite its mathematical base, statistics is as much an art as it is a science." - Therese Littleton

msante

The Living Force
FOTCM Member
You reminded me of what I read several years ago in Dr. William T. Whitby's book Smoking Is Good for You, in which he criticized the anti-smoking campaign (among other things) for resting on skewed statistics and being full of biases:

The anti-smokers' case rests solely on statistics. Intelligent people have come to look on statistics with suspicion, something by which "you can prove anything". The old saying is, "Lies, damned lies and statistics". At first sight many people are impressed by an imposing slab of graphs, but soon discover how useless they are in proving anything. Statisticians themselves are the first to admit this. Statistics in themselves are useful information if collected without bias, but as the great statistician Professor Yule once said, "You can't prove anything by them".
[...]
You can have great fun with graphs. I show some graphs that could be made.

In (a) we see that an increase in the use of electric shavers is closely associated with an increase in lung cancer, but does anyone believe it means anything?

In (b) the graph shows an association between an increase in smoking and an increase in illegitimate births. Is there any significance?

In (c) we see the same thing for imports of Japanese cars and lung cancer. Should we stop Japanese cars because of this?

This shows how ridiculous it is to say that a graph proves causation. Statistics can only be evidence, never proof in themselves.

(pages 45-48)

mkrnhr

SuperModerator
Moderator
FOTCM Member
From the graphs above, one can also clearly see that lung cancer is correlated with the calendar year. Therefore, as a fact-based policy, if we use the 1760 calendar in perpetuity, the problem is solved forever!
More seriously, statistics is something that should be taught at an early age. It only requires knowing fractions and sums (and, later, some calculus) plus some common-sense intuition. Children are taught how to solve quadratic equations, which are very unlikely to find applications in real life, while the simpler domain of statistics is ignored. Maybe it is because of its usefulness in real life that it is ignored.
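
For anyone who wants to play with the calendar-year point, here is a minimal Python sketch. The two series are made-up numbers (not real data on shavers, cars, or cancer), and the helper names `pearson` and `detrend` are my own. Each series is just an upward trend over the years plus a small, unrelated wiggle. The raw correlation between them comes out close to 1, but once the straight-line trend over time is subtracted from each, the correlation between what is left is close to 0.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def detrend(times, values):
    """Residuals left after removing a least-squares straight line over time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    slope = (
        sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
        / sum((t - mean_t) ** 2 for t in times)
    )
    return [v - (mean_v + slope * (t - mean_t)) for t, v in zip(times, values)]


# Hypothetical, made-up data: 20 "years", two unrelated quantities that both
# happen to grow over time. The wiggles are chosen so they do not follow each other.
years = list(range(1950, 1970))
series_a = [10 + 2 * i + 3 * (1 if i % 2 == 0 else -1) for i in range(20)]
series_b = [50 + 5 * i + 4 * (1 if i % 4 < 2 else -1) for i in range(20)]

# Correlation of the raw series: dominated by the shared upward trend.
raw_r = pearson(series_a, series_b)

# Correlation after removing each series' own linear trend over the years.
detrended_r = pearson(detrend(years, series_a), detrend(years, series_b))

print(f"raw correlation:       {raw_r:.3f}")
print(f"detrended correlation: {detrended_r:.3f}")
```

Subtracting a straight line is only one simple way to take the calendar out of the picture, and it assumes the trend really is roughly linear. The sketch only shows that two things growing over the same years will look related on a graph whether or not they have anything to do with each other.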