My JSM 2016 itinerary


JSM 2016 is almost here. I just spent an hour going through the (very) lengthy program. I think that was time well spent, though some might argue I should have been working on my talk instead…

Here is what my itinerary looks like as of today. If you know of a session I missed that you think I might be interested in, please let me know! And if you go to any one of these sessions and don’t see me there, it means I got distracted by something else (or something close by).

Sunday, July 31

Unfortunately it looks like I’ll be in meetings all Sunday, but if there is an opportunity to sneak out I would love to see the following sessions:

4PM – 5:50pm

  • Making the Most of R Tools
    • Thinking with Data Using R and RStudio: Powerful Idioms for Analysts — Nicholas Jon Horton, Amherst College ; Randall Pruim, Calvin College ; Daniel Kaplan, Macalester College
    • Transform Your Workflow and Deliverables with Shiny and R Markdown — Garrett Grolemund, RStudio
    • Discussant: Hadley Wickham, Rice University
  • Media and Statistics
    • Causal Inferences from Observational Studies: Fracking, Earthquakes, and Oklahoma — Howard Wainer, NBME
    • It’s Not What We Say, It’s Not What They Hear, It’s What They Say They Heard — Barry Nussbaum, EPA
    • Bad Statistics, Bad Reporting, Bad Impact on Patients: The Story of the PACE Trial — Julie Rehmeyer, Discover Magazine
    • Can Statisticians Enlist the Media to Successfully Change Policy? — Donald A. Berry, MD Anderson Cancer Center
    • Discussant: Jessica Utts, University of California at Irvine

I’ll also be attending the ASA Awards Celebration (6:30 – 7:30pm) this evening.

Monday, August 1

On Monday there are a couple of ASA DataFest-related meetings. If you organized a DataFest in 2016, or would like to organize one in 2017 (especially if you will be doing so for the first time), please join us. Both meetings will be held at the Hilton Chicago Hotel, Room H-PDR3.

  • 10:30am – 2016 ASA DataFest Debrief Meeting
  • 1pm – 2017 ASA DataFest Planning Meeting

8:30AM – 10:20AM

  • Applied Data Visualization in Industry and Journalism
    • Linked Brushing in R — Hadley Wickham, Rice University
    • Creating Data Visualization Tools at Facebook — Andreas Gros, Facebook
    • Cocktail Party Horror Stories About Data Vis for Clients — Lynn Cherny, Ghostweather R&D
    • Visualizing the News at FiveThirtyEight — Andrei Scheinkman,
    • Teaching Data Visualization to 100k Data Scientists: Lessons from Evidence-Based Data Analysis — Jeffrey Leek, Johns Hopkins Bloomberg School of Public Health

If I could be in two places at once, I’d also love to see:

2PM – 3:50pm

I am planning on splitting my time between

4:45pm – 6:15pm

ASA President’s Invited Address – Science and News: A Marriage of Convenience — Joe Palca, NPR


I’ll be splitting my time between the Statistical Computing and Graphics Mixer (6 – 8pm) and the Duke StatSci Dinner.

Tuesday, August 2

8:30AM – 10:20am

  • Introductory Overview Lecture: Data Science
    • On Mining Big Data and Social Network Analysis — Philip S. Yu, University of Illinois at Chicago
    • On Computational Thinking and Inferential Thinking — Michael I. Jordan, University of California at Berkeley

10:30AM – 12:20pm

I’m organizing and chairing the following invited session. I think we have a fantastic lineup. Hoping to see many of you in the audience!

  • Doing More with Data in and Outside the Undergraduate Classroom
    • Computational Thinking and Statistical Thinking: Foundations of Data Science — Ani Adhikari, University of California at Berkeley ; Michael I. Jordan, University of California at Berkeley
    • Learning Communities: An Emerging Platform for Research in Statistics — Mark Daniel Ward, Purdue University
    • The ASA DataFest: Learning by Doing — Robert Gould, University of California at Los Angeles
    • Statistical Computing as an Introduction to Data Science — Colin Rundel, Duke University

If I could be in two places at once, I’d also love to see:

2PM – 3:50pm

  • Interactive Visualizations and Web Applications for Analytics
    • Radiant: A Platform-Independent Browser-Based Interface for Business Analytics in R — Vincent Nijs, Rady School of Management
    • Rbokeh: An R Interface to the Bokeh Plotting Library — Ryan Hafen, Hafen Consulting
    • Composable Linked Interactive Visualizations in R with Htmlwidgets and Shiny — Joseph Cheng, RStudio
    • Papayar: A Better Interactive Neuroimage Plotter in R — John Muschelli, The Johns Hopkins University
    • Interactive and Dynamic Web-Based Graphics for Data Analysis — Carson Sievert, Iowa State University
    • HTML Widgets: Interactive Visualizations from R Made Easy! — Yihui Xie, RStudio ; Ramnath Vaidyanathan, Alteryx

If I could be in two places at once, I’d also love to see:


I’ll be splitting my time between the UCLA Statistics/Biostatistics Mixer (5-7pm), Google Cruise, and maybe a peek at the Dance Party.

Sad to be missing the ASA President’s Address – Appreciating Statistics.

Wednesday, August 3

8:30AM – 10:20am

I’m speaking at the following session co-organized by Ben Baumer and myself. If you’re interested in reproducible data analysis, don’t miss it!

  • Reproducibility in Statistics and Data Science
    • Reproducibility for All and Our Love/Hate Relationship with Spreadsheets — Jennifer Bryan, University of British Columbia
    • Steps Toward Reproducible Research — Karl W. Broman, University of Wisconsin – Madison
    • Enough with Trickle-Down Reproducibility: Scientists, Open This Gate! Scientists, Tear Down This Wall! — Karthik Ram, University of California at Berkeley
    • Integrating Reproducibility into the Undergraduate Statistics Curriculum — Mine Cetinkaya-Rundel, Duke University
    • Discussant: Yihui Xie, RStudio

If I could be in two places at once, I’d also love to see:

10:30AM – 12:20pm

  • The 2016 Statistical Computing and Graphics Award Honors William S. Cleveland
    • Bill Cleveland: Il Maestro of Statistical Graphics — Nicholas Fisher, University of Sydney
    • Modern Crowd-Sourcing Validates Cleveland’s 1984 Hierarchy of Graphical Elements — Dianne Cook, Monash University
    • Some Reflections on Dynamic Graphics for Data Exploration — Luke-Jon Tierney, University of Iowa
    • Carpe Datum! Bill Cleveland’s Contributions to Data Science and Big Data Analysis — Steve Scott, Google Analytics
    • Scaling Up Statistical Models to Hadoop Using Tessera — Jim Harner, West Virginia University

If I could be in two places at once, I’d also love to see:

2PM – 3:50pm

If I could be in two places at once, I’d also see:

4:45PM – 6:15pm


I’m planning on attending the Section on Statistical Education Meeting / Mixer (6-7:30pm).

Thursday, August 4

8:30AM – 10:20am

I think I have to attend a meeting at this time, but if I get a chance I’d love to see:

  • Big Data and Data Science Education
    • Teaching Students to Work with Big Data Through Visualizations — Shonda Kuiper, Grinnell College
    • A Data Visualization Course for Undergraduate Data Science Students — Silas Bergen, Winona State University
    • Intro Stats for Future Data Scientists — Brianna Heggeseth, Williams College ; Richard De Veaux, Williams College
    • An Undergraduate Data Science Program — James Albert, Bowling Green State University ; Maria Rizzo, Bowling Green State University
    • Modernizing an Undergraduate Multivariate Statistics Class — David Hitchcock, University of South Carolina ; Xiaoyan Lin, University of South Carolina ; Brian Habing, University of South Carolina
    • Business Analytics and Implications for Applied Statistics Education — Samuel Woolford, Bentley University
    • DataSurfing on the World Wide Web: Part 2 — Robin Lock, St. Lawrence University

10:30AM – 12:20pm

  • Showcasing Statistics and Public Policy
    • The Twentieth-Century Reversal: How Did the Republican States Switch to the Democrats and Vice Versa? — Andrew Gelman, Columbia University
    • A Commentary on Statistical Assessment of Violence Recidivism Risk — Peter B. Imrey, Cleveland Clinic ; Philip Dawid, University of Cambridge
    • Using Student Test Scores for Teacher Evaluations: The Pros and Cons of Student Growth Percentiles — J.R. Lockwood, Educational Testing Service ; Katherine E. Castellano, Educational Testing Service ; Daniel F. McCaffrey, Educational Testing Service
    • Discussant: David Banks, Duke University

If I could be in two places, I’d also love to see:

That’s it folks! It’s an ambitious itinerary; let’s hope I get through it all.

I probably won’t get a chance to write daily digests like I’ve tried to do in previous years at JSM, but I’ll tweet from @minebocek about interesting things I hear. I’m sure there will be lots of JSM chatter at #JSM2016 as well.

Now, somebody give me something else to look forward to, and tell me Chicago is cooler than Durham!

Statistics with R on Coursera

I held off on posting about this until we had all the courses ready, and we still have a bit more work to do on the last component, but I’m proud to announce that the specialization called Statistics with R is now on Coursera!

Some of you might know that I’ve had a course on Coursera for a while now (whatever “a while” means in MOOC-land), but it was time to refresh things a bit to align the course with other Coursera offerings (shorter, modular, etc.). So I chopped up the old course into bite-size chunks and made some enhancements in each component, such as

  • integrating dplyr and ggplot2 syntax into the R labs,
  • restructuring the labs to be completed in R Markdown to provide better scaffolding for a data analysis project for each course,
  • adding Shiny apps to some of the labs to better demonstrate statistical concepts without burdening the learners with coding beyond the level of the course,
  • creating an R package that contains all the data, custom functions, etc. used in the course, and
  • cleaning things up a bit to make the weekly workload consistent across weeks.

The underlying code for the labs and the package can be found at Here you can also find the R code for reproducing some of the figures and analyses shown on the course slides (and we’ll keep adding to that repo in the next few weeks).

The biggest change between the old course and the new specialization, though, is a completely new course: Bayesian Statistics. I touched on Bayesian inference a bit in my old course, and this generated lots of discussion on the course forums from learners wanting more on this content. Being at Duke, I figured who better to offer this course than us! (If you know anything about the Statistical Science department at Duke, you probably know it’s pretty Bayesian.) Note, I didn’t say “me”, I said “us”. I was able to convince a few colleagues (David Banks, Merlise Clyde, and Colin Rundel) to join me in developing this course, and I’m glad I did! Figuring out exactly how to teach this content in an effective way without assuming too much mathematical background took lots of thinking (and re-thinking, and re-thinking). We have also managed to feature a few interviews with researchers in academia and industry, such as Jim Berger (Duke), David Dunson (Duke), Amy Herring (UNC), and Steve Scott (Google), to provide a bit more context for learners on where and why Bayesian statistics is relevant. This course launched today, and I’m looking forward to seeing the feedback from the learners.

If you’re interested in the specialization, you can find out more about it here. The courses in the specialization are:

  1. Introduction to Probability and Data
  2. Inferential Statistics
  3. Linear Regression and Modeling
  4. Bayesian Statistics
  5. Statistics Capstone Project

You can take the courses individually or sign up for the whole specialization, but to do the capstone you need to have completed the other four courses in the specialization. The landing page for the specialization outlines in further detail how to navigate everything, and relevant dates and deadlines.

Also note that while the graded components of the courses, which allow you to pursue a certificate, require payment, you can audit the courses for free and watch videos, complete practice quizzes, and work on the labs.