With fewer than two weeks left until the US presidential election, motivating class discussion with data related to the candidates, elections, or politics in general is quite easy. So for yesterday’s lab we used data released by the Federal Election Commission on contributions made to 2012 presidential campaigns. I came across the data last week, via a post on the Guardian Datablog. The post has a nice interactive feature for analyzing data from all contributions.

Continue reading

Citizen Scientists in Space...

The L.A. Times had an interesting article about how a pair of ‘citizen scientists’ discovered a planet with four suns. I would say that a more accurate term for the pair would be ‘citizen data miners’, because essentially the astronomy community crowdsources data mining by providing reams of data for anyone to examine. The article seemed timely to me, following a seminar at the UCLA Center for Applied Statistics by Kiri Wagstaff on automated procedures for discovering interesting features in large data sets.

Continue reading

I was creating a dataset this last week in which I had to partition the observed responses to show how the ANOVA model partitions the variability. I had the observed _Y_ (in this case prices for 113 bottles of wine), and a categorical predictor _X_ (the region of France that each bottle of wine came from). I was going to add three columns to this data, the first showing the marginal mean, the second showing the effect, and the third showing the residual.
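To make the three columns concrete, here is a minimal sketch of that partition in Python. It is not the code from the post: the function name `partition_responses`, the toy prices, and the region labels are all made up for illustration, and it assumes "marginal mean" means the grand mean of all the responses, so that each observed value equals marginal mean + effect + residual.

```python
from collections import defaultdict


def partition_responses(y, groups):
    """Split each response into marginal (grand) mean + group effect + residual.

    Hypothetical helper for illustration; assumes a one-way layout with
    one categorical predictor.
    """
    grand_mean = sum(y) / len(y)

    # Collect the responses that belong to each level of the predictor.
    by_group = defaultdict(list)
    for value, group in zip(y, groups):
        by_group[group].append(value)
    group_means = {g: sum(v) / len(v) for g, v in by_group.items()}

    rows = []
    for value, group in zip(y, groups):
        rows.append({
            "y": value,
            "group": group,
            "marginal_mean": grand_mean,
            # Effect: how far the group mean sits from the grand mean.
            "effect": group_means[group] - grand_mean,
            # Residual: how far the observation sits from its group mean.
            "residual": value - group_means[group],
        })
    return rows


# Made-up prices and region labels, NOT the 113 bottles from the post.
prices = [10.0, 14.0, 20.0, 24.0, 30.0, 34.0]
regions = ["A", "A", "B", "B", "C", "C"]

rows = partition_responses(prices, regions)
for row in rows:
    print(row)
```

Adding the three new columns back together row by row recovers the observed prices, which is a quick check that the partition was done correctly.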

Continue reading

Nate Silver's New Book

I’ve been reading and greatly enjoying Nate Silver’s book, The Signal and the Noise: Why So Many Predictions Fail—and Some Don’t. I’d recommend the book based on the introduction and first chapter alone. (And, no, that’s not because that’s all I’ve read so far. It’s because they’re that good.) If you’re the sort who skips introductions, I strongly suggest you become a new sort and read this one. It’s a wonderful essay about the dangers of too much information, and the need to make sense of it.

Continue reading

Yesterday (October 14, 2012), Felix Baumgartner made history by becoming the first person to break the speed of sound during a free fall. He also set some other records (e.g., longest free fall) during the Red Bull Stratos Mission, which was broadcast live on the internet. Kind of cool, but imagine the conversation that took place daydreaming this one… **Red Bull Creative Person:** What if we got some idiot to float up into the stratosphere in a space capsule and then had him step out of it and free fall for four minutes, breaking the sound barrier?

Continue reading

The Current Population Survey (CPS) is a statistical survey conducted by the United States Census Bureau for the Bureau of Labor Statistics. The data collected are used to provide a monthly report on employment in the United States. Although the CPS data are available, to this point they have really only been easy to deal with for SPSS, Stata, or SAS users. A new blog is now making it easy for R users to obtain and analyze these data.

Continue reading

Citizen Statistician

Learning to swim in the data deluge