
Global open data

Recently I published a report into how PDFs could work better with data. The responses have been polarised.

Lots of people who rely on documents for their work have been positive. Lots of people who work hard and sacrifice a lot in the cause of open data were extremely angry. One very notable thing was how angry Americans were.

From Britain there was a fear that the report would be misinterpreted. From America, the response was more accusatory. In places it went too far, suggesting I was deliberately undermining the open data community and had sold my honour to Adobe.

I’ve watched enough Leeds United vs. Millwall games to deal with far worse accusations, but it did get me thinking. Why were Americans so much more upset than anyone else?

The world is big

The clue to the answer came from a fantastic comment by Mor Rubinstein at 360 Giving. I’d said in my report that the battle between tables and PDF was won, but she wondered where I’d looked.

The truth is that I work mostly with local open data in Leeds and Birmingham in the UK. I also work with UK national open data, and French national and local open data. I speak with people all around the world too, but I work in Europe.

It’s a decent spread of experience: two levels of government in two of the three Security Council members that are philosophically positive about open data. But my experience is certainly not the world.

So this morning I broadened my outlook. I visited national open data portals in Kenya, Morocco, Chile, Malaysia, the USA, the UK, and France, plus our local open data portal in Leeds. I counted the formats that data was being published in. You can see and add to the spreadsheet yourself, but here’s the summary in a picture.



I didn’t know.

The USA seems to be extremely unusual in the number of PDFs it publishes, and I didn’t know. Like the UK, it's been turning away from the world recently, and fewer and fewer people I work with visit. I haven't for years. I missed this, and I missed Japan.

My report is clear that releasing spreadsheets is much better than printing to PDFs. And for most of the world, I remain convinced that the battle between tables and PDFs is well on the way to being won.

But in the USA it clearly isn’t. I want my report to give countries where this battle is being won the tools that they need to extend open data’s reach into new areas, where documents and not tables remain king. Following the feedback from the USA I’ll make it clearer in my report that the priority there is still to stop printing spreadsheets as PDFs.


There are lots of caveats with this quick investigation. Here are just some:

  1. I’ve done this work in a morning. Other organisations like the Open Knowledge Foundation and the Open Data Barometer have worked for years. I’d welcome their opinions.
  2. There are lots of other sources of open data than national open data portals. I’ve not worked with sub-national and sub-regional government anywhere except in the UK and France. I’ve not looked at FOIs.
  3. Quantity is not a great way to judge release systems. If the most important document is locked up in a PDF, it doesn’t matter if there are a hundred tables of less useful information.
  4. There are PDFs in the French datasets, but they’re locked away in the zips. In the ten examples I looked at, all were visual representations of the included raw data. This wasn’t PDFs instead of data, it was PDFs in addition to data. I still think that can provide value.
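
If you’d like to repeat the format count for another portal, something like the Python below would do it. It’s only a rough sketch: I counted by hand in a spreadsheet, so the function names, the record layout and the portal labels here are all made up for illustration, and the sample records are placeholders rather than my results.

```python
from collections import Counter, defaultdict


def tally_formats(records):
    """Count normalised file formats per portal from (portal, format) pairs."""
    counts = defaultdict(Counter)
    for portal, fmt in records:
        # Treat "csv", ".CSV" and "CSV " as the same format.
        counts[portal][fmt.strip().lstrip(".").upper()] += 1
    return counts


def pdf_share(format_counts):
    """Return the fraction of a portal's files that are PDFs (0.0 if empty)."""
    total = sum(format_counts.values())
    return format_counts["PDF"] / total if total else 0.0


# Placeholder records for illustration only; these are NOT real survey results.
sample = [
    ("portal-a", "csv"),
    ("portal-a", ".PDF"),
    ("portal-a", "xls"),
    ("portal-a", "CSV "),
    ("portal-b", "pdf"),
    ("portal-b", "pdf"),
]

counts = tally_formats(sample)
for portal, formats in counts.items():
    print(portal, dict(formats), f"PDF share: {pdf_share(formats):.0%}")
```

As caveat 3 says, a share like this only measures quantity. It says nothing about whether the most important document on a portal is the one locked up in a PDF.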

