“When Are Citi Bikes Faster Than Taxis in New York City?” is a wonderful showcase piece by Todd Schneider. It says NYC but of course it means the southern half of Manhattan, where you can hire one of the Citi bikes and hurl yourself to almost certain doom in the traffic. The ready availability of big datasets for both forms of transport has made all this possible: the power of open data.
First up is a small multiple choropleth: choose a starting neighbourhood and it shows you whether taxis or bikes win, by destination neighbourhood. This is nice and simple to use but carries a lot of detail. Much more fun than the table of stats that most of us would have reached for first. I particularly like the way that Todd has provided a natural, Bayesian metric in the % chance that a taxi would beat a bike. That is going to make a lot more sense to most readers than a pair of medians or whatever.
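To make the "% chance that a taxi would beat a bike" idea concrete, here is a minimal sketch of one way such a figure could be estimated for a single pair of neighbourhoods. It is not necessarily how Todd computed it: the function name `taxi_win_probability` and the trip durations are made up for illustration, and the approach (compare every observed taxi trip with every observed bike trip and count how often the taxi is quicker) is just one simple assumption.

```python
from itertools import product


def taxi_win_probability(taxi_minutes, bike_minutes):
    """Share of (taxi, bike) trip pairings in which the taxi was strictly faster.

    Hypothetical helper for illustration; not taken from the original analysis.
    """
    if not taxi_minutes or not bike_minutes:
        raise ValueError("need at least one trip of each kind")
    # Pair every taxi trip with every bike trip on the same route
    pairs = list(product(taxi_minutes, bike_minutes))
    wins = sum(1 for taxi, bike in pairs if taxi < bike)
    return wins / len(pairs)


# Made-up trip durations in minutes for one origin-destination pair
taxi = [12.0, 15.5, 22.0, 9.5]
bike = [14.0, 13.0, 16.5]

print(f"Chance the taxi wins: {taxi_win_probability(taxi, bike):.0%}")
```

A single number like this summarises the whole overlap between the two distributions, which is why it reads more naturally than two medians side by side.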
Next, you get some breakdowns by time of day, direction of travel, etc. We see line charts, bar charts and dot plots with connecting lines: basically the same encoding in different formats to keep it fresh.
Todd finds that taxis have got slower since 2009: 20% slower! But bikes have not. And while taxis, and traffic in general, can have really bad days, bikes whizz past. That makes sense.
It’s all a really nice piece of work with all code on GitHub. Thanks to Xiaodong Cai for pointing it out to me.