Paint and Patch

The other day I was painting the trim on our house and it got me reminiscing. The year was 2005. The conference was JSM. The location was Minneapolis. I had just finished my third year of graduate school and was slated to present in a Topic Contributed session at my first JSM. The topic was Implementing the GAISE Guidelines in College Statistics Courses. My presentation was titled Using GAISE to Create a Better Introductory Statistics Course.

We had just finished a complete revision of our undergraduate course based on the work we had been doing with our NSF-funded Adapting and Implementing Innovative Material in Statistics (AIMS) project. We had rewritten the entire curriculum, including all of our assessments and course activities.

The discussant for the session was Robin Lock. In his remarks about the presentations, Lock compared the restructuring of a statistics course to the remodeling of a house. He described how some teachers restructure their courses according to a plan, doing a complete teardown and rebuild. He brought the entire room to laughter as he described most teachers’ attempts, however, as “paint and patch”: fixing a few things that didn’t work quite so well, but mostly just sprucing things up.

The metaphor works. I have been thinking about this for the last eight years. Sometimes paint-and-patch is exactly what is needed. It is pretty easy and not very time consuming. On the other hand, if the structure underneath is rotten, no amount of paint-and-patch is going to work. There are times when it is better to tear down and rebuild.

As another academic year approaches, many of us are considering the changes to be made in courses we will soon be teaching. Is it time for a rebuild? Or will just a little touch-up do the trick?

Free Book—Statistical Thinking: A Simulation Approach to Modeling Uncertainty

Catalyst Press has just released the second edition of the book Statistical Thinking: A Simulation Approach to Modeling Uncertainty. The material in the book is based on work related to the NSF-funded CATALST Project (DUE-0814433). It makes exclusive use of simulation to carry out inferential analyses. The material also builds on best practices and materials developed in statistics education, research and theory from cognitive science, as well as materials and methods that are successfully achieving parallel goals in other disciplines (e.g., mathematics and engineering education).

The materials in the book help students:

  • Build a foundation for statistical thinking through immersion in real world problems and data
  • Develop an appreciation for the use of data as evidence
  • Use simulation to address questions involving statistical inference, including randomization tests and bootstrap intervals (see the sketch after this list)
  • Model and simulate data using TinkerPlots™ software
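
A quick aside for the R-inclined: the book itself does all of its simulation in TinkerPlots, but the two inferential ideas in the third bullet can be roughed out in a few lines of R. The snippet below is my own sketch, not material from the book, and the data vectors are completely made up.

    set.seed(42)

    ## Randomization test for a difference in two group means
    ## (the scores below are invented purely for illustration)
    a <- c(12, 15, 9, 22, 17, 14)
    b <- c(10, 8, 13, 11, 9, 12)
    obs_diff <- mean(a) - mean(b)

    pooled <- c(a, b)
    perm_diffs <- replicate(5000, {
      shuffled <- sample(pooled)                   # re-randomize the group labels
      mean(shuffled[1:6]) - mean(shuffled[7:12])
    })
    mean(abs(perm_diffs) >= abs(obs_diff))         # approximate two-sided p-value

    ## Percentile bootstrap interval for the mean of group a
    boot_means <- replicate(5000, mean(sample(a, replace = TRUE)))
    quantile(boot_means, probs = c(0.025, 0.975))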

Why is there a cook on the cover of a statistics book? It is symbolic of a metaphor introduced by Alan Schoenfeld (1998), which posits that many introductory (statistics) classes teach students how to follow “recipes,” but not how to really “cook.” That is, even if students leave a class able to perform routine procedures and tests, they do not have the big picture of the statistical process that will allow them to solve unfamiliar problems and to articulate and apply their understanding. Someone who knows how to cook knows the essential things to look for and focus on, and how to make adjustments on the fly. The materials in this book are intended to help teach students to “cook” (i.e., do statistics and think statistically).

The book is licensed under Creative Commons and is freely available on GitHub. If physical copies of the book are preferred, those are available in full color for $45 at CreateSpace (or Amazon). All royalties from the book are donated to the Educational Psychology department at the University of Minnesota.

Lifehacker and Statistical Misconceptions

The website Lifehacker recently had an article about some common statistical misconceptions. I thought they did a great job explaining things like the base-rate fallacy and Simpson’s Paradox for a lay audience. I also really liked the extrapolation cartoon they picked. [Read the whole article here.]

What is Rigor?

Two years ago, my department created a new two-course, doctoral-level sequence aimed primarily at our quantitative methods students. The sequence also attracts students from other departments (primarily in the social sciences) who plan to pursue more advanced methodological coursework (e.g., Hierarchical Linear Modeling).

One of the primary characteristics that differentiates this new sequence of courses from the other doctoral sequence of methodology courses that we teach is that it is “more rigorous”. This adjective, rigorous, bothers me. It bothers me because I don’t know what it means.

How do I know if a class is rigorous? When I ask my colleagues, the response is more often than not akin to Supreme Court Justice Potter Stewart’s “definition” of pornography (see Jacobellis v. Ohio)…I may not be able to define what a ‘rigorous course’ is, but you’ll know it when you take one.

In my experience, students seem to associate rigor with the amount (and maybe the complexity) of mathematics that appears in the course. Rigor also seems to be directly associated with the amount of homework and the difficulty level of the assessments.

I think that I relate rigor to the degree to which a student is pushed intellectually. Because of this, I have a hard time associating rigor with a particular course. In my mind, rigorousness is an interaction between the content, the assessment, and the student. The exact same course taught in different semesters (or different sections within a semester) has, in my mind, had differing levels of rigor, not because the content (or assessment) has changed, but because the student make-up has been different.

The experience in the classroom, as much as we try to standardize it in the curriculum, is very different from one class to the next. A single question or curiosity might change the tenor of a class (to the good or the bad). And, try as I might to recreate the thoughtful questions or digressions of learning in future iterations of the course, the academic result often never matches that of the original.

So maybe having students who are all interested in statistics in a single course leads to a more nuanced curiosity, and thereby rigor. On the other hand, there is much to be said for courses in which there are students with a variety of backgrounds and academic interests. I think rigor can exist in both types of courses. Or maybe I am completely wrong and rigor is something more tangible. Is there such a thing as a rigorous course?

Dance Your Ph.D.

You can win $1000 for turning your Ph.D. thesis into an interpretive dance. More importantly, you will also receive a call-out from Science and get to perform your dance at TEDx in Belgium. The contest is open not only to recent Ph.D.s, but to anyone who earned a Ph.D. (in the sciences), as well as to students currently working on a Ph.D.

Gonzolabs has tips and examples over on their website. So put on your dancing shoes, grab your Ph.D. advisor, and do-si-do. Now, if I can only get the Jabbawockeez and figure out what a mixed-effects model looks like as a dance…

TISE Network

Ever since we wrote an article analyzing the articles that had been published in the Statistics Education Research Journal (Zieffler et al., 2011), I have been thinking about the relationships within the network of literature published on statistics education. What are the pivotal articles? Which are foundational? How interconnected are the articles?

This spring I started documenting those relationships by putting together a social network of articles published in Technology Innovations in Statistics Education and the articles they referenced. I just finished that work and used Gephi to produce a couple network plots.

[Network graph: community structure (modularity)]

The first network graph (shown above) examines the community structure of the network by decomposing the network into sub-networks, or communities. I have made the nodes for the actual TISE articles larger for ease of interpretation. The node labels are the first author’s last name and year of publication. Currently (and not surprisingly), the sub-networks generally consist of the actual article published in TISE and the literature that was referenced therein. There are some commonalities between articles as well. For example, the two articles by McDaniel were identified as a single community. It will be interesting to see how these communities change as I add more literature into the network.

[Network graph: articles sized by in-degree]

The second network graph sizes the nodes and node labels by in-degree. In this case, in-degree is a measure of how often a particular article was referenced. The most cited literature in TISE is:

At some point, it would be nice to do this by author as well.
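
For anyone curious about the mechanics, this kind of analysis can also be roughed out in R with the igraph package (Gephi's modularity statistic is, as I understand it, based on the Louvain method). The edge list below is invented purely for illustration; it is not the actual TISE citation data.

    library(igraph)

    ## Invented citation edges: each row is a TISE article (A*) citing a reference (R*)
    edges <- data.frame(
      from = c("A1", "A1", "A2", "A2", "A3", "A3"),
      to   = c("R1", "R2", "R1", "R3", "R2", "R3")
    )
    g <- graph_from_data_frame(edges, directed = TRUE)

    ## In-degree: how often each piece of literature is referenced
    sort(degree(g, mode = "in"), decreasing = TRUE)

    ## Community detection (the Louvain method works on undirected graphs)
    comm <- cluster_louvain(as.undirected(g))
    membership(comm)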

An Accidental Statistician

I just finished reading An Accidental Statistician: The Life and Memories of George E. P. Box. The book reads like Box recounting his memories (it is aptly named) rather than like a formal biography. I enjoyed the stories and vignettes of his work and his intersections with other statisticians. The book also included pictures of many famous statisticians (George’s friends and family—Fisher was his father-in-law for a bit) in social situations. My favorite was the picture of Dr. Frank Wilcoxon on his motorcycle (see below).

[Photo: Frank Wilcoxon on his motorcycle]

There were some very interesting and funny anecdotes. For example, on a trip to Israel, George was told to get to the airport very early because of the intense security measures. After standing in a non-moving line for several hours, he apparently quipped that he had never before physically seen a stationary process.

My favorite sections of the book were the stories he told of writing Statistics for Experimenters, his book—along with William (Bill) Hunter and Stu Hunter—on experimental design. He wrote about how the book evolved from mimeographed notes for a course he had taught into the published version. It took them several years to finish writing the book, only for it to be met with horrible reviews. (Note: This makes me feel slightly better about the year it took to write our book.)

In a chapter written about Bill Hunter (who was one of George’s graduate students at the University of Wisconsin), George relates that Bill started his PhD in 1960. After he finished (in 1963!), he was hired almost immediately by Wisconsin as an assistant professor. Three years later he was made associate professor, and in 1969 (nine years after he started his PhD) he was made full professor. Unbelievable!

[Image: Box, Hunter, and Hunter]

Research Hack: Paper and Reference Management

Research Hacks are a series of blog posts about some of the tools, applications, and computer programs that I use in my workflow. Some of these I began using when I was a graduate student, and others I have picked up more recently. This is the second post in the series (see the first post, Feedreaders and Aggregators).

Electronically managing the absurdly large volume of articles, reports, book chapters, and other writings that academics procure is a huge way to save time and increase productivity. My initial way to manage these files (often PDFs) was to put them in a folder that corresponded to a particular project or paper. Because I could never find an article again (Spotlight was a long way from working well at this point), I often had multiple copies of the same paper residing on my computer. This also meant that my annotations were scattered across those copies.

One summer, when I realized that I had 11 copies of a paper on covariational reasoning (the topic of my dissertation) on my computer, I laughed at the absurdity of this system and vowed to fix it. That is when I found Papers.

Papers (now in its second version—Papers2) is a management system for a person’s “research library” (as they refer to it). It is sort of like iTunes for PDF files. You have a “library” of files (only one place on your computer) and these are displayed in the Papers application (just like iTunes). You can then have “playlists” in which you put these files, but without creating multiple copies! For example, you could have “playlists” containing the references for each paper you are currently writing.

Screenshot of the Papers2 application

The search feature is great. If you are an organization nut like me, you can also input all sorts of metadata (publication type, tag words, photos, links to supplementary material, etc.). Papers can also output references for BibTeX or EndNote and has integration with Scrivener and Word. There are limited annotation tools within Papers at this point (although more were added in v2.1), but rumor has it that annotation is a big part of the future. There are also several workarounds using Dropbox, Skim, etc. Lastly, there are iPhone and iPad apps for Papers that I think are beautiful. Reading articles on the iPad is one of the coolest things ever.

Papers on the iPad

Unfortunately Papers is not free (but there is a substantial discount for students). Also, as far as I know, it is only available for Mac users. There are several other management and reference systems available as well. Two of those are Zotero and Mendeley.

Each system has features that are really cool and some that aren’t as well developed. Why did I choose Papers? At the time, it was pretty much the only choice that existed in a state that actually worked. (I seem to recall Mendeley had just been released as a beta version.) Would I make the same choice now? I am not sure, but I think so. My second choice would be Zotero. (I am a little concerned about what will happen to Mendeley now that it has been purchased by Elsevier.)

No matter which system you choose, let me offer several suggestions.

  1. Begin using it immediately.
  2. Begin entering metadata for every paper you have right away. Don’t be chintzy here. Yes, I know it is time-consuming, but that only gets worse as you accumulate more and more articles. Some of this can be automated, depending on the recency of the paper, etc.
  3. Learn how to use it to input references into a paper.
  4. Figure out a workflow for paper annotation (taking notes, highlighting, etc.).

Summer is a wonderful time to learn a new software program or computing language. Happy computing!

Research Hack: Feedreaders and Aggregators

I have been thinking for several years that I should put together a series of blog posts about some of the tools, applications, and computer programs that I use in my workflow. Some of these I began using when I was a graduate student, and others I have picked up more recently.

I initially wanted to do this to share these tools with our graduate students at the University of Minnesota. It seems crazy if you say it to a graduate student, but a person in academia never has any more time than she does as a graduate student, which makes graduate school a perfect time to learn new skills and develop excellent habits. Sharing these ideas on this blog is even better. Perhaps others can weigh in and offer alternatives or (frankly) better ideas than what I posit here.

According to Wikipedia, a “life hack is any productivity trick, shortcut, skill, or novelty method to increase productivity and efficiency.” The tools I will write about have made my work and research life easier and more productive and thus I have dubbed them “research hacks”. I hope they do the same for you.

The first type of application that I wanted to write about is a feedreader. This may seem an odd choice for the initial research hack, especially given my love of R and RStudio, but I think it is apropos. Reading research and staying abreast of current work is the lifeline of academics and researchers. Feedreaders make this easier.

In a nutshell, feedreaders scan websites for new information and then present that new information in digestible chunks, often aggregating the feeds from many websites into one application. Imagine a single “journal” that showed you all of the abstracts from the journals that you read! Not only that, but it also could show you “abstracts” of any blog entries for the blogs that you read. And “abstracts” of the major news stories from the newspapers you read.
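
For the programmatically inclined, the scanning is not magic: most sites expose an RSS (or Atom) feed, which is just structured XML listing recent posts. As a toy illustration of what a feedreader does under the hood (the feed URL is only a placeholder, and this is a sketch rather than anything I actually use), you could pull the latest post titles yourself in R with the xml2 package:

    library(xml2)

    feed <- read_xml("https://example.com/feed.xml")          # placeholder feed URL
    titles <- xml_text(xml_find_all(feed, "//item/title"))    # post titles in an RSS feed
    head(titles)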

There are several options available for a feedreader depending on the device/computer system that you use. While I don’t endorse one over any other, I will tell you what I use. (If I don’t, there will be questions, and it may give you a place to start if this kind of tool is new to you.) On my Mac, I use the program Vienna. On my iPad, which is where I do most of my browsing and blog reading, I use Flipboard.

A screenshot of Vienna running on my Mac.

Once you have a feedreader that you like, you can enter in new subscriptions. These are just website or blog URLs. If the website has a feed, the reader will detect it and add the website to your subscriptions. Then, anytime you open the reader, it will alert you as to whether the website or blog has a new post, which you can then read.

Which websites or blogs should you subscribe to? This is a matter of personal taste and also, for researchers, coverage. Onlinemathdegrees.com published a list of 100 statistics sites that might be a good place to start. Below, I list a few subscriptions that I have:

I also subscribe to blogs written by students and friends, blogs that I find interesting (e.g., The Long and Short of it All: A Dachshund Dog News Magazine), blogs that I find funny or creative (McSweeney’s), aggregators about things I like (e.g., books, design, gardening), and pretty much anything else I want to keep tabs on. I wish more journals employed feeds so that I could keep up with them that way.