Witch hunters: discovering the missing


Witch hunting in Europe: a discovery of missingness. Data analysis isn’t particularly technical, but to do it well you need a full toolkit and an understanding of how to use it. One thing we don’t teach very explicitly is how to understand missing data, or the toolkit available for doing so. That’s what this post is about: it’s not technical, it’s not ‘fancy’, it’s the down-in-the-weeds basics that applied work needs in order to be done well.

Ggplot2 tutorial


Should you bother with ggplot? Switching to data visualisation through code is a huge ask. Is this how you feel about code? How I used to feel about code. This is a perfectly normal reaction. But..! Can you do this? You can do this. Then you’re already writing code. Maybe you don’t think about yourself as a programmer … yet!

The keys to the kingdom


I started a new job last week, and here I am working on transport and GIS! As it turns out, both transport and GIS are awesome. But how do you get up to speed really, really quickly? You ask for help. Asking for help In return, I was given the keys to the kingdom: people to follow, blog posts to explore, books to read, offers of help. This community is amazing.

CPA Emerging Leaders Conference


I had the opportunity to talk at the CPA Emerging Leaders Conference today in Melbourne, and I will speak again next week in Sydney. It was a great crowd with a really interesting mix of perspectives. My talk, Creating data driven value and insights, was aimed at showing an already very data-literate group of people (accountants!) how to get started with data-driven thinking. My thoughts on getting started come down to five things:

Data Driven: By Design


The data revolution is upon us! Or at least that’s what someone with a snappy grasp of copy-editing and a mandate for clicks would say. There’s a common misperception in the data world that only some of us know how to be data driven. Only some of us have our data driving licenses and can get behind the wheel. This is manifestly false. Even if you don’t know R or Python or any other part of the common data science toolkit (yet), there are things you can do and start to think about today as you begin to embrace the idea of data-driven decision making.

Announcing consultthat


Automating a consulting project workflow. There’s been a great deal of really useful work around data science workflows in the last couple of years, and if you’ve followed Jenny Bryan’s work at all, you’ll know exactly what I mean. In consulting, the data science workflow is also critical, but it’s wrapped up with some extra management challenges. In addition to code and data, there are documents, people and timing to manage.

New site


There are some big changes going on in my professional life, so it was time for a change. This is my new blog. My previous work is hosted over at Rex Analytics. Welcome!



If you’re a rural data scientist, then sooner or later you’ve had to move a big file to the cloud or back. That’s a world of pain right there. If you know anything about how information is transmitted via the internet, you know that files are broken into packets, sent, then reassembled on the other side. If packets go missing, the computer requests resends until the file is fully assembled, or until the system times out.
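
To make the resend loop concrete, here is a toy sketch in Python. It is not how any real network stack is implemented: the function name `send_with_resends`, the packet size, the loss rate and the round limit are all made up for illustration, and the "network" is just a random number generator that drops packets.

```python
import random


def send_with_resends(data: bytes, packet_size: int = 4, loss_rate: float = 0.3,
                      max_rounds: int = 50, seed: int = 1) -> bytes:
    """Toy model of a lossy transfer (illustrative only, not a real protocol)."""
    rng = random.Random(seed)

    # Break the file into numbered packets.
    packets = {
        index: data[start:start + packet_size]
        for index, start in enumerate(range(0, len(data), packet_size))
    }

    received = {}
    for _ in range(max_rounds):
        # Work out which packets have not arrived yet.
        missing = [index for index in packets if index not in received]
        if not missing:
            break
        # Request a resend of each missing packet; each one is lost
        # independently with probability loss_rate.
        for index in missing:
            if rng.random() >= loss_rate:
                received[index] = packets[index]

    # Give up if packets are still missing after the last round.
    if len(received) < len(packets):
        raise TimeoutError("transfer timed out before all packets arrived")

    # Reassemble the packets in their original order.
    return b"".join(received[index] for index in sorted(received))
```

On a bad connection the loss rate is high, so more rounds of resends are needed and the chance of hitting the timeout goes up. Setting `loss_rate=1.0` makes the toy transfer time out every time.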