ON JUNE 15, 2017

The data every city should release, and why

After spending the majority of every day looking at thousands of open data sets from hundreds of municipalities across North America, I came to some conclusions about the data that everyone was releasing. My main realization was this:

No one is doing the same thing.

That’s great, on one hand, because it means that governments are trying things out, and it’s good to see a government try new things. On the other hand, it’s profoundly difficult to work on any project involving public data if you can’t reasonably predict what’s going to be available on any given data portal.

Want to track spending? Very difficult to do if you can’t get city financials from half of the jurisdictions you’re looking at. Want to build an app that tells you how the commute’s going to be? Better hope Peel and Mississauga are releasing the same traffic data as Toronto.

Up to this point, the open data success stories we’ve seen follow a similar pattern, which generally looks like this: Someone used their city’s data to build something cool for their city.

That’s great. Building a killer transit app is great. The problem is, it’s small potatoes.

Now what if someone built the same app with national coverage? What if city officials could easily check their performance against neighbouring communities? What if cancer researchers in Barcelona had access to the research being done in Chennai? The more we tear down the silos that separate one city’s data from another’s, the more problems we’ll be able to solve.

This is the sort of thing you start to think about when you’re looking at hundreds of thousands of open data sets. It’s a big opportunity. The whole thing has to start with the cities themselves, and it absolutely has to start with a baseline of common data.

We had been thinking about this for a while, so we decided to put together a list of data sets that we felt every city should release. The list is by no means definitive, but it’s a good place to start. These are the data sets that most cities already have access to, and they’re the ones that drive immediate value for both citizens and businesses.

After plotting these on a map we realized that they all seemed to emanate from six major themes, so we built a wheel that showed how the data sets might relate to each other. The more we looked at it, the clearer it became that these data sets were really the elemental ones: the ones that you could mix together to create totally new data, derive different information, and support different use cases. When we realized that, we started calling it the Open Data Big Bang.

An open data strategy doesn’t have to be written in stone. It should, in fact, be agile enough to transform as the uses of public data become better understood. But every open data strategy, whether it’s for New York City or Moose Jaw, can start with these elemental data sets. You’ll be amazed at what they’ll create.

Namara is the data management platform created and maintained by ThinkData Works. Sign up today to start using all the open data in North America.
