Willful Ignorance [Book Review]

I just finished reading Willful Ignorance: The Mismeasure of Uncertainty by Herbert Weisberg. I gave this book five stars (out of five) on Goodreads.

According to Weisberg, the text can be

“regarded as two books in one. On one hand it is a history of a big idea: how we have come to think about uncertainty. On the other, it is a prescription for change, especially with regard to how we perform research in the biomedical and social sciences” (p. xi).

Willful ignorance is the idea that to deal with uncertainty, statisticians simplify the situation by filtering out or ignoring much of what we know…we willfully ignore some information in order to quantify the amount of uncertainty.

The book gives a cogent account of the history and evolution of probability, tackling head-on the questions: What is probability? How did we come to our current understanding of it? And how did mathematical probability come to represent uncertainty and ambiguity?

Although Weisberg presents a nice historical perspective, the book is equally philosophical. In some ways it is a more leisurely read of the material found in Hacking, and in many ways more compelling.

I learned a great deal from this book. In many places I found myself re-reading sections and spiraling back to previously read sections to read them with some new understanding. I may even try to assign parts of it to the undergraduates I am teaching this summer.

This book would make a wonderful beach read for anyone interested in randomness or uncertainty, or for any academic hipster.

Here’s Looking At You!

What do we fear more?  Losing data privacy to our government, or to corporate entities?  On the one hand, we (still) have oversight over our government.  On the other hand, the government is (still) more powerful than most corporate entities, and so perhaps better situated to frighten.

In these times of Snowden and the NSA, the L.A. Times ran an interesting story about just how much tracking various internet companies perform.  And it’s alarming.  (“They’re watching your every move.”, July 10, 2013.  Interestingly, the story does not seem to appear on their website as of this posting.)  Like the government, most of these companies claim that (a) their ‘snooping’ is algorithmic, so no human sees the data, and (b) their data are anonymized.  And yet…

To my knowledge, businesses aren’t required to adhere to, or even acknowledge, any standards or practices for dealing with private data.  Thus, a human could snoop on particular data.  We are left to ponder what that human will do with the information.  In the best-case scenario, the human would be fired, which, according to the L.A. Times, is what happened when Google fired an engineer for snooping on the emails of some teenage girls.

But the data are anonymous, you say?  Well, there’s anonymous and then there’s anonymous.  As Latanya Sweeney taught us in the ’90s, knowing a person’s ZIP code, gender, and date of birth is sufficient to uniquely identify 87% of Americans.  And the L.A. Times reports a similar study where just four hours of anonymized tracking data was sufficient to identify 95% of all individuals examined.  So while your name might not be recorded, by merging enough data files, they will know it is you.
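
For readers who want to see how little it takes, here is a minimal sketch of the merging idea in Python. Everything in it is made up for illustration: the records, the names, and the helper functions `unique_fraction` and `reidentify` are hypothetical, and this is not the method used in Sweeney's work or in the study the L.A. Times describes. It simply joins a name-free file to a named file on ZIP code, gender, and date of birth.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, quasi-identifiers kept.
anonymized = [
    {"zip": "90024", "gender": "F", "dob": "1985-03-14", "search": "flu symptoms"},
    {"zip": "90024", "gender": "M", "dob": "1985-03-14", "search": "mortgage rates"},
    {"zip": "55455", "gender": "F", "dob": "1990-07-01", "search": "job listings"},
    {"zip": "55455", "gender": "F", "dob": "1990-07-01", "search": "concert tickets"},
]

# Hypothetical public file (think of a voter roll) that does carry names.
public = [
    {"name": "Alice Example", "zip": "90024", "gender": "F", "dob": "1985-03-14"},
    {"name": "Bob Example", "zip": "90024", "gender": "M", "dob": "1985-03-14"},
    {"name": "Carol Example", "zip": "55455", "gender": "F", "dob": "1990-07-01"},
    {"name": "Dana Example", "zip": "55455", "gender": "F", "dob": "1990-07-01"},
]

KEYS = ("zip", "gender", "dob")


def key(record):
    """Return the (zip, gender, dob) combination for one record."""
    return tuple(record[k] for k in KEYS)


def unique_fraction(records):
    """Fraction of records whose key combination appears exactly once."""
    counts = Counter(key(r) for r in records)
    return sum(1 for r in records if counts[key(r)] == 1) / len(records)


def reidentify(anonymized, public):
    """Map the index of each anonymized record to a name, but only when
    exactly one person in the public file shares its key combination."""
    names_by_key = {}
    for person in public:
        names_by_key.setdefault(key(person), []).append(person["name"])
    matches = {}
    for i, record in enumerate(anonymized):
        names = names_by_key.get(key(record), [])
        if len(names) == 1:
            matches[i] = names[0]
    return matches


print(unique_fraction(anonymized))
print(reidentify(anonymized, public))
```

In this toy data, the first two records are unique on the three fields and so get a name attached, while the last two share a key and stay ambiguous. The more fields (or files) you merge on, the fewer people remain hidden in a crowd like that.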

This article fits in really nicely with a fascinating, revelatory book I’m currently midway through:  Jaron Lanier’s Who Owns The Future? A basic theme of this book is that internet technology devalues products and goods (files) and values services (software).  One process through which this happens is that we humans accept the marvelous free stuff that the internet provides (free Google searches, free Amazon shipping, easily pirated music files) in exchange for allowing companies to snoop. The companies turn our aggregated data into dollars by selling to advertisers.

A side effect of this, Lanier explains, is that there is a loss of social freedom.  At some point, a service such as Facebook gets to be so large that failing to join means that you are losing out on possibly rich social interactions.  (Yes, I know there are those who walk among us who refuse to join Facebook.  But these people are probably not reading this blog, particularly since our tracking ‘bots tell us that most of our readers come from Facebook referrals.  Oops.  Was I allowed to reveal that?)  So perhaps you shouldn’t complain about being snooped on since you signed away your privacy rights. (You did read the entire user agreement, right?  Raise your hand if you did.  Thought so.)  On the other hand, if you don’t sign, you become a social pariah.  (Well, an exaggeration.  For now.)

Recently, I installed Ghostery, which tracks the automated snoopers that follow me during my browsing.  Not only “tracks”, but also blocks.  Go ahead and try it.  It’s surprising how many different sources are following your every on-line move.

I have mixed feelings about blocking this data flow. The data-snooping industry is big business, and is responsible, in part, for the boom of stats majors and, more importantly, the boom in stats employment.  And so indirectly, data-snooping is paying for my income.  Lanier has an interesting solution:  individuals should be paid for their data, particularly when it leads to value.  This means the era of ‘free’ is over; we might end up paying for searches and for reading Wikipedia.  But he makes a persuasive case that the benefits exceed the costs.  (Well, I’m only half-way through the book.  But so far, the case is persuasive.)

An Accidental Statistician

I just finished reading An Accidental Statistician: The Life and Memories of George E. P. Box. The book reads like he is recounting his memories (it is aptly named) rather than as a biography. I enjoyed the stories and vignettes of his work and his intersections with other statisticians. The book also included pictures of many famous statisticians (George’s friends and family—Fisher was his father-in-law for a bit) in social situations. My favorite was the picture of Dr. Frank Wilcoxon on his motorcycle (see below).

Wilcoxon

There were some very interesting and funny anecdotes. For example, when George recounted a trip to Israel, he was told to get to the airport very early because of the intense security measures. After standing in a non-moving line for several hours, he apparently quipped that he had never before physically seen a stationary process.

My favorite sections of the book were the stories he told of writing Statistics for Experimenters, his book—along with William (Bill) Hunter and Stu Hunter—on experimental design. He wrote about how the book evolved from mimeographed notes for a course he had taught to the published version. It took several years for them to finish the writing of the book, only to be met with horrible reviews. (Note: This makes me feel slightly better about the year it took to write our book.)

In a chapter written about Bill Hunter (who was one of George’s graduate students at the University of Wisconsin), George relates that Bill started his PhD in 1960. After he finished (in 1963!) he was hired almost immediately by Wisconsin as an assistant professor. Three years later he was made associate professor, and in 1969 (nine years after he started his PhD) he was made full professor. Unbelievable!

Box, Hunter, and Hunter

Thursday Next

From Jasper Fforde’s latest Thursday Next novel (The Woman Who Died a Lot):

The Office for Ultimate Risk is one of the many departments within the Ministry of National Statistics. Although it was originally an “experimental” department, the statisticians at Ultimate Risk proved their worth by predicting the entire results of three football World Cups in succession, a finding that led to the discontinuation of football as a game and the results being calculated instead.

Introducing Statistics: A Graphic Guide

Source: introducingbooks.com


Over the winter break I was travelling in the UK and I came across this little book called “Introducing Statistics: A Graphic Guide” by Eileen Magnello and Borin Van Loon at the gift shop in the Tate Modern museum in London. The book was published in 2009, and Significance magazine already reviewed it here, so I won’t repeat their comments. I hadn’t heard about the book before, so I picked it up, along with a copy of Introducing Post-Modernism (they were 2 for £10, so I had to get two, obviously).

I think the book would be more appropriately named “an illustrated guide”, since the images are mostly illustrations of statisticians with speech bubbles instead of graphics that help visualize the concepts being discussed. The most unexpected are the images of the author herself. The first time I came across one of those I was thinking “who is this lady in the pantsuit standing next to Karl Pearson?”. The illustrations sometimes distract from the text, but they’re fun and nicely drawn.

The book does a very good job of describing the differences between vital statistics and mathematical statistics, and what the terms “statistic” and “variability” mean. Therefore, while the audience of the book is not clear, it could be a perfect gift for parents of statisticians who still don’t quite understand what their offspring do. Or really anyone who is interested in statistics, but has no real formal experience with it.

While the book tells the early history of statistics well, the introduction of statistical concepts follows a strange order. It is useful for gaining familiarity with some terminology and simple statistical distributions and tests, but it would be quite difficult to acquire a thorough understanding of these concepts from the book’s introduction. However, I’m guessing this is not the intent of the book, anyway.

The book is part of a series called Introducing Books, which contains about 80 graphic guides, from Introducing Aesthetics to Marxism to Wittgenstein. The museum shop where I got the book carried only about 10 of these titles, and I was happy to see that Introducing Statistics was one of them.