DataFest 2013

DataFest is growing larger and larger. This year, we hosted events at Duke (organized by Mine), with teams from NCSU and UNC, and at UCLA (organized by Rob), with teams from Pomona College, Cal State Long Beach, University of Southern California, and UC Riverside. We are very grateful to Vaclav Petricek at eHarmony for providing the data, which consisted of roughly one million “user-candidate” pairs and a couple of hundred variables, including “words friends would use to describe you”, ideal characteristics in a partner, the importance of those characteristics, and the all-important “did she email him” and “did he email her” variables.


Daniel Kaplan and Libby Shoop have developed a one-credit class called Data Computation Fundamentals, which was offered this semester at Macalester College. This course is part of a larger research and teaching effort funded by the Howard Hughes Medical Institute (HHMI) to help students understand the fundamentals and structures of data, especially big data. (Read more about the project in Macalester Magazine.) The course introduces students to R and covers topics such as merging data sources, data formatting and cleaning, clustering, and text mining.


I’m often on the hunt for datasets that will not only work well with the material we’re covering in class, but will (hopefully) pique students' interest. One sure choice is to use data collected from the students, as it is easy to engage them with data about themselves. However, I think it is also important to open their eyes to the vast amount of data collected and made available to the public.


Participatory Sensing

The Mobilize project, which I recently joined, centers a high school data-science curriculum around participatory sensing data. What is participatory sensing, you ask? I’ve recently been trying to answer this question, with mixed success. As the name suggests, PS data has to do with data collected from sensors, and so it has a streaming aspect to it. I like to think of it as observations on a living object. Like all living objects, whatever this thing is that’s being observed, it changes, sometimes slowly, sometimes rapidly.


Open Access Textbooks

In an effort to reduce costs for students, the College of Education and Human Development at the University of Minnesota has created this catalog of open textbooks. Open textbooks are complete textbooks released under a Creative Commons, or similar, license. Instructors can customize open textbooks to fit their course needs by remixing, editing, and adding their own content. Students can access free digital versions or purchase low-cost print copies of open textbooks.


NCTM Essential Understandings

NCTM has finally published books on statistics in its Essential Understandings (EU) series. This is a rather traditional approach to statistics, given the context of this blog. But since I’m a co-author (along with Roxy Peck and Stephen Miller), why not point you to it? http://www.nctm.org/catalog/product.aspx?ID=13804 And while the book is not computational in theme, it does address a central issue of this blog: universal statistical knowledge. A grades 6-9 version is due out any moment.


iNZight

We spend too much time musing about the Data Deluge, I fear, at the expense of talking about another component that has made citizen-statisticianship possible: accessible statistical software. “Accessible” in (at least) two senses: affordable and ready-to-use. This summer, Chris Wild demonstrated his group’s software iNZight at the Census@School workshop in San Diego. iNZight is produced at the University of Auckland and is intended for kids to use along with the Census@School data.



Citizen Statistician

Learning to swim in the data deluge