Ten years after Ioannidis alleged that most scientific findings are false, reproducibility – or lack thereof – has become a full-blown crisis in science. Flagship journals like Nature and Science have published hand-wringing editorials and revised their policies in the hope of raising standards of reproducibility. In the statistical and data sciences, the barriers to reproducibility are far lower, given that our analyses can usually be digitally encoded (e.g., scripts, algorithms, data files, etc.


My JSM 2016 itinerary

JSM 2016 is almost here. I just spent an hour going through the (very) lengthy program. I think that was time well spent, though some might argue I should have been working on my talk instead… Here is what my itinerary looks like as of today. If you know of a session I missed that you think I might be interested in, please let me know! And if you go to any one of these sessions and don't see me there, it means I got distracted by something else (or something close by).


Statistics with R on Coursera

I held off on posting about this until we had all the courses ready, and we still have a bit more work to do on the last component, but I’m proud to announce that the specialization called Statistics with R is now on Coursera! Some of you might know that I’ve had a course on Coursera for a while now (whatever “a while” means in MOOC-land), but it was time to refresh things a bit to align the course with other Coursera offerings – shorter, modular, etc.


Last year I was awarded a Project TIER (Teaching Integrity in Empirical Research) fellowship, and last week my work on the fellowship wrapped up with a meeting with the project leads, other fellows from last year, and the new fellows for next year. In a nutshell, Project TIER focuses on reproducibility. Here is a brief summary of the project’s focus from their website: For a number of years, we have been developing a protocol for comprehensively documenting all the steps of data management and analysis that go into an empirical research paper.


As a statistician who often needs to explain methods and results of analyses to non-statisticians, I have been receptive to the influx of literature related to the use of storytelling or a data narrative. (I am also aware of the backlash against the use of the word “storytelling” with regard to scientific analysis, although I am less concerned about this than, say, these scholars.) As a teacher of data analysis, I find the use of narrative especially poignant in that it ties the analyses performed intrinsically to the data context, or at the very least to a logical flow of methods used.


Tools for Managing Your Inbox

First of all, happy new year to all of our readers. As my first contribution in 2016, I thought I would share with you a couple of tools that have helped tame my email inbox. In my continued resolution to finally achieve Inbox Zero, I have made a major dent in the last month. This is thanks to two tools: Unroll Me and Google Mail’s “Save & Archive” button.

Unroll Me

The first tool I would like to share with you is an app called Unroll Me.


Check out my guest post on the Simulation-based statistical inference blog: “Teaching computation as an argument for simulation-based inference”. If you are interested in teaching simulation-based methods, or if you just want to find out more about why others are, I highly recommend the posts on this blog. The page also hosts many other useful resources, as well as information on upcoming workshops.



Citizen Statistician

Learning to swim in the data deluge