The City of Toronto has some excellent open data. Not only that, they’re constantly iterating on their portal and their strategy. This kind of agility is absolutely necessary if we want open data to succeed in the coming years. It’s critical to get the data out the door, sure, but governments at every level also need to be thinking about how they’re going to bridge the divide between data producers and data consumers, how they’ll optimize their data to make it business compatible, and how they’ll plan for the future.
As with any open data, though, there are a few data sets on Toronto’s portal that leave something to be desired. Take this data set, for example:
Despite proudly calling Toronto my home, I am not (yet) able to conjure up the geospatial coordinates of each of the city’s wards from memory. Lucky for me, the city publishes a data set on their portal that gives me, among other things, the name and coordinates of each Toronto ward.
The Solution: Joining and Enriching with Unity
Using Unity, I can join these two data sets on the one value they have in common: ward in one and lcode_name in the other. In Unity, the graph looks like this:
Since ward numbers 1–9 in the second graph are preceded by an unnecessary “0”, I transform the entire column into integers before matching it to the first graph. Then, using the Namara node, I specify the data set id of the other graph I want (“basement flooding by ward”) and pick each of the values I want to pull from one data set into the other. After I run it, I can see the data set plotted on a map.
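For anyone who wants to try the same join outside of Unity, here is a minimal sketch in plain Python. It is not what Unity does under the hood, just the same idea: cast the zero-padded ward codes to integers, then match on ward number. Only `ward` and `lcode_name` come from the data sets described above. I am assuming `lcode_name` is the zero-padded column, and every other field name and every number below is a made-up placeholder.

```python
# Illustrative rows only. `ward` and `lcode_name` are the join columns named
# in the post; all other field names and all values are placeholders.
flooding = [
    {"ward": 1, "flooding_reports": 12},
    {"ward": 2, "flooding_reports": 7},
    {"ward": 10, "flooding_reports": 30},
]
wards = [
    {"lcode_name": "01", "lat": 43.72, "lon": -79.58},
    {"lcode_name": "02", "lat": 43.68, "lon": -79.55},
    {"lcode_name": "10", "lat": 43.76, "lon": -79.45},
]


def join_on_ward(flooding_rows, ward_rows):
    """Attach coordinates to each flooding row by integer ward number."""
    # Casting to int drops the leading zero, so "01" matches 1.
    coords = {int(row["lcode_name"]): row for row in ward_rows}
    joined = []
    for row in flooding_rows:
        match = coords.get(int(row["ward"]))
        if match is None:
            continue  # no coordinates for this ward, so leave it off the map
        joined.append({**row, "lat": match["lat"], "lon": match["lon"]})
    return joined


enriched = join_on_ward(flooding, wards)
for row in enriched:
    print(row)
```

Each row in `enriched` now carries both the flooding count and a point that can be plotted on a map.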
Now I can query the data in a way that makes a bit more sense, and I can immediately click through to see which wards have a higher rate of basement flooding.
Once I was looking at the data set, though, I started wondering about the other factors that might contribute to a high rate of flooding. Looking through Toronto’s open data portal, I found a data set that shows watermain breaks in Toronto.
By layering that data set on top of the other one, I can start to see if there’s a correlation between the two.
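Eyeballing two layers on a map is a good start, but the hunch can also be checked with a number. The sketch below computes a Pearson correlation coefficient between two per-ward counts. It assumes the watermain breaks have already been counted up by ward, which the post does not describe, and the counts are placeholders rather than real Toronto figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder counts keyed by ward number, not real data.
flooding_by_ward = {1: 12, 2: 7, 10: 30}
breaks_by_ward = {1: 40, 2: 25, 10: 90}

# Compare only the wards that appear in both data sets.
shared = sorted(set(flooding_by_ward) & set(breaks_by_ward))
r = pearson(
    [flooding_by_ward[w] for w in shared],
    [breaks_by_ward[w] for w in shared],
)
print(round(r, 3))
```

A value near 1 would suggest that wards with more watermain breaks also report more basement flooding. It would not show that one causes the other.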
Now I’ve got something that can potentially help the city understand which areas are in need of an infrastructure overhaul. If I’m a homeowner, I can use this data to estimate the likelihood of my basement flooding. If I’m a guy selling flood insurance… well, you get the picture.