Warning: Mac OS 10.9 Mavericks and R Don’t Play Nicely

For some reason I was compelled to update my Mac’s OS and R on the same day. (I know…) It didn’t go well on several counts, and I mostly blame Apple. Here are the details.

  • I updated R to version 3.0.2 “Frisbee Sailing”
  • I updated my OS to 10.9 “Mavericks”

When I went to use R, things were going fine until I mistyped a command. Rather than giving some sort of syntax error, R responded with:

> *** caught segfault *** 
> address 0x7c0, cause 'memory not mapped' 
> 
> Possible actions: 
> 1: abort (with core dump, if enabled) 
> 2: normal R exit 
> 3: exit R without saving workspace 
> 4: exit R saving workspace 
> Selection:

Unlike most of my computing problems, this one I was able to replicate many times. After a day of panic and no luck on Google, I finally found a post on one of the Google Groups from Simon Urbanek responding to someone with a similar problem. He points out that there are a couple of solutions, one of which is to wait until Apple gets things stabilized. (Rolling back is not an attractive alternative: if you have ever tried to go back to a previous OS on a Mac, you will know that it might take several days of pain and swearing.)

The second solution he suggests is to install the nightly build or rebuild the GUI. To install the nightly build, visit the R for Mac OS X Developer’s page. To rebuild the GUI, issue the following commands in Terminal:

svn co https://svn.r-project.org/R-packages/trunk/Mac-GUI 
cd Mac-GUI 
xcodebuild -configuration Debug 
open build/Debug/R.app

I tried both and this worked fine…until I needed to load a package. Then I was given an error that the package couldn’t be found. Now I realize that you can download the packages you need from source and compile them yourself, but I was trying to figure out how to deal with students who were in a similar situation. (This is not an option for most social science students.)

The best solution, it turned out, is to use RStudio, which my students pretty much all use anyway. (My problem is that I am a Sublime Text 2 user.) This allowed the newest version of R to run on the new Mac OS. But, as is pointed out on the RStudio blog,

> As a result of a problem between Mavericks and the user interface toolkit underlying RStudio (Qt) the RStudio IDE is very slow in painting and user interactions when running under Mavericks.

I re-downloaded the latest stable release of the R GUI about an hour ago, and so far it seems to be working fine with Mavericks (no abort message yet), so this whole post may be moot.

Paint and Patch

The other day I was painting the trim on our house and it got me reminiscing. The year was 2005. The conference was JSM. The location was Minneapolis. I had just finished my third year of graduate school and was slotted to present in a Topic Contributed session at my first JSM. The topic was Implementing the GAISE Guidelines in College Statistics Courses. My presentation was entitled, Using GAISE to Create a Better Introductory Statistics Course.

We had just finished doing a complete course revision for our undergraduate course based on the work we had been doing with our NSF-funded Adapting and Implementing Innovative Material in Statistics (AIMS) project. We had rewritten the entire curriculum, including all of our assessments and course activities.

The discussant for the session was Robin Lock. In his remarks about the presentations, Lock compared the restructuring of a statistics course to the remodeling of a house. He described how some teachers restructure their courses according to a plan, doing a complete teardown and rebuild. He brought the entire room to laughter, however, as he described most teachers’ attempts as “paint and patch”: fixing a few things that didn’t work quite so well, but mostly just sprucing things up.

The metaphor works. I have been thinking about this for the last eight years. Sometimes paint-and-patch is exactly what is needed. It is pretty easy and not very time consuming. On the other hand, if the structure underneath is rotten, no amount of paint-and-patch is going to work. There are times when it is better to tear down and rebuild.

As another academic year approaches, many of us are considering the changes to be made in courses we will soon be teaching. Is it time for a rebuild? Or will just a little touch-up do the trick?

Free Book—Statistical Thinking: A Simulation Approach to Modeling Uncertainty

CATALST-Textbook-Cover-v2

Catalyst Press has just released the second edition of the book Statistical Thinking: A Simulation Approach to Modeling Uncertainty. The material in the book is based on work related to the NSF-funded CATALST Project (DUE-0814433). It makes exclusive use of simulation to carry out inferential analyses. The material also builds on best practices and materials developed in statistics education, research and theory from cognitive science, as well as materials and methods that are successfully achieving parallel goals in other disciplines (e.g., mathematics and engineering education).

The materials in the book help students:

  • Build a foundation for statistical thinking through immersion in real world problems and data
  • Develop an appreciation for the use of data as evidence
  • Use simulation to address questions involving statistical inference including randomization tests and bootstrap intervals
  • Model and simulate data using TinkerPlots™ software

Why a cook on a statistics book? It is symbolic of a metaphor introduced by Alan Schoenfeld (1998) that posits many introductory (statistics) classes teach students how to follow “recipes”, but not how to really “cook.” That is, even if students leave a class able to perform routine procedures and tests, they do not have the big picture of the statistical process that will allow them to solve unfamiliar problems and to articulate and apply their understanding. Someone who knows how to cook knows the essential things to look for and focus on, and how to make adjustments on the fly. The materials in this book were intended to help teach students to “cook” (i.e., do statistics and think statistically).

The book is licensed under Creative Commons and is freely available on GitHub. If physical copies of the book are preferred, those are available for $45 at CreateSpace (or Amazon) in full color. All royalties from the book are donated to the Educational Psychology department at the University of Minnesota.

Lifehacker and Statistical Misconceptions

The website Lifehacker recently had an article about some common statistical misconceptions. I thought they did a great job explaining things like the base-rate fallacy and Simpson’s Paradox for a lay audience. I also really liked the extrapolation cartoon they picked. [Read the whole article here.]

What is Rigor?

Two years ago, my department created a new two-course, doctoral-level sequence primarily aimed at our quantitative methods students. Aside from our own students, this sequence also attracts students from other departments (primarily in the social sciences) who plan to pursue more advanced methodological coursework (e.g., Hierarchical Linear Modeling).

One of the primary characteristics that differentiates this new sequence of courses from the other doctoral sequence of methodology courses that we teach is that it is “more rigorous”. This adjective, rigorous, bothers me. It bothers me because I don’t know what it means.

How do I know if a class is rigorous? When I ask my colleagues, the response is more often than not akin to Supreme Court Justice Potter Stewart’s “definition” of pornography (see Jacobellis v. Ohio)…I may not be able to define what a ‘rigorous course’ is, but you’ll know it when you take one.

It seems that students, in my experience, associate rigor with the amount (and maybe complexity) of mathematics that appears in the course. Rigor also seems to be directly associated with the amount of homework and the difficulty level of the assessments.

I think that I relate rigor to the degree to which a student is pushed intellectually. Because of this, I have a hard time associating rigor with a particular course. In my mind, rigorousness is an interaction between the content, the assessment and the student. The exact same course taught in different semesters (or different sections within a semester) has, in my mind, had differing levels of rigor, not because the content (or assessment) has changed, but because the student make-up has been different.

The experience in the classroom, as much as we try to standardize it in the curriculum, is very different from one class to the next. A single question or curiosity might change the tenor of a class (to the good or the bad). And, try as I might to recreate the thoughtful questions or digressions of learning in future iterations of the course, the academic result rarely matches that of the original.

So maybe having students who are all interested in statistics in a single course leads to a more nuanced curiosity and thereby rigor. But, on the other hand, there is much to be said for courses in which there are students with a variety of backgrounds and academic interests. I think rigor can exist in both types of courses. Or, maybe I am completely wrong and rigor is something more tangible. Is there such a thing as a rigorous course?

Dance Your Ph.D.

You can win $1000 for turning your Ph.D. thesis into an interpretive dance. More importantly, you will also receive a call-out from Science and get to perform your dance at TEDx in Belgium. This contest is open not only to recent Ph.D.s, but to anyone who has earned a Ph.D. (in the sciences) and to students working on a Ph.D.

Gonzolabs has tips and examples over on their website. So put on your dancing shoes, grab your Ph.D. advisor and do-si-do. Now, if I can only get the Jabbawockeez and figure out what a mixed-effects model looks like as a dance…

TISE Network

Ever since we wrote an article in which we analyzed the articles that had been published in the Statistics Education Research Journal (Zieffler et al., 2011), I have been thinking about the relationships within the network of literature published on statistics education. What are the pivotal articles? Which are foundational? How inter-connected are the articles?

This spring I started documenting those relationships by putting together a social network of articles published in Technology Innovations in Statistics Education and the articles they referenced. I just finished that work and used Gephi to produce a couple of network plots.

Modularity

The first network graph (shown above) examines the community structure of the network by decomposing the network into sub-networks, or communities. I have made the nodes for the actual TISE articles larger for visual ease of interpretation. The node labels are the first author’s last name and year of publication. Currently (and not surprisingly), the sub-networks generally consist of the actual article published in TISE and the literature that was referenced therein. There are some commonalities between articles as well. For example, the two articles by McDaniel were identified as a single community. It will be interesting to see how these communities change as I add more literature into the network.

InDegree

The second network graph sizes each node and its label by in-degree. In this case, in-degree is a measure of how often a particular article was referenced. The most cited literature in TISE is:

At some point, it would be nice to do this by author as well.
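For readers who want to see what the in-degree measure amounts to, here is a minimal sketch in Python. It is not the Gephi workflow I used, and the edge list is entirely made up (the node labels are hypothetical placeholders, not actual TISE articles or references). Each edge points from a citing article to the work it cites, and in-degree is simply the number of edges arriving at a node.

```python
from collections import Counter

# Hypothetical citation edges as (citing, cited) pairs.
# These labels are placeholders for illustration, not real TISE data.
edges = [
    ("ArticleA_2010", "Source1_1998"),
    ("ArticleA_2010", "Source2_2005"),
    ("ArticleB_2011", "Source1_1998"),
    ("ArticleC_2012", "Source1_1998"),
    ("ArticleC_2012", "Source2_2005"),
    ("ArticleC_2012", "ArticleA_2010"),
]


def in_degree(edge_list):
    """Count how many times each node appears as the target of an edge."""
    return Counter(cited for _citing, cited in edge_list)


degrees = in_degree(edges)

# List nodes from most cited to least cited.
for node, count in degrees.most_common():
    print(node, count)
```

Doing the same thing by author would only require replacing each node label with the author name before counting.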

An Accidental Statistician

I just finished reading An Accidental Statistician: The Life and Memories of George E. P. Box. The book reads like he is recounting his memories (it is aptly named) rather than as a biography. I enjoyed the stories and vignettes of his work and his intersections with other statisticians. The book also included pictures of many famous statisticians (George’s friends and family—Fisher was his father-in-law for a bit) in social situations. My favorite was the picture of Dr. Frank Wilcoxon on his motorcycle (see below).

Wilcoxon

There were some very interesting and funny anecdotes. For example, when George recounted a trip to Israel, he was told to get to the airport very early because of the intense security measures. After standing in a non-moving line for several hours, he apparently quipped that he had never before physically seen a stationary process.

My favorite sections of the book were the stories he told of writing Statistics for Experimenters, his book—along with William (Bill) Hunter and Stu Hunter—on experimental design. He wrote about how the book evolved from mimeographed notes for a course he had taught to the published version. It took several years for them to finish the writing of the book, only to be met with horrible reviews. (Note: This makes me feel slightly better about the year it took to write our book.)

In a chapter written about Bill Hunter (who was one of George’s graduate students at the University of Wisconsin), George relates that Bill started his PhD in 1960. After he finished (in 1963!) he was hired almost immediately by Wisconsin as an assistant professor. Three years later he was made associate professor, and in 1969 (nine years after he started his PhD) he was made full professor. Unbelievable!

Box, Hunter, and Hunter