A cityscape of Manchester at night.

Building a National Data Library.

Tom Forth.

In early May 2024, Anastasia Bektimirova and Allan Nixon at think tank UK Onward published a blog post suggesting that “The Government should establish a British Library for Data”. This was part of a wider plan to secure a leading place for the UK in frontier technologies and the future global economy. Just six weeks later in mid-June 2024, the Labour party included a commitment to create a National Data Library in their manifesto. They won an election on that manifesto in early July.

Two months from posting to policy is a fantastic achievement.

But policies are not actions and the new UK government has a lot to do. Despite tax rises, it has very little spare money to do it with. The public would barely notice the government missing this commitment. They wouldn’t notice at all if the commitment were met by a token gesture such as requiring The British Library or the Alan Turing Institute to rebadge some existing work and staff as a National Data Library.

A winning manifesto commitment does not guarantee action.

The policy paper to policy paper amplification pipeline.

On the eve of Labour’s election win, the calls for that action began. Gavin Freeguard published a blog post with some suggestions on how to think about a data library. This included a useful link to a Tony Blair Institute for Global Change report published in late May. That report was a thorough proposal for a National Data Trust which, though focused on health, had very strong National Data Library vibes.

The policy paper to policy paper amplification pipeline was now in full flow.

UK Onward published a more thorough description of their National Data Library plan, though not before Administrative Data Research UK had got in with their blog post. James O’Malley argued convincingly that nobody (including the government) knows what the National Data Library is or should be. The Open Data Institute wrote a blog post on how to build a National Data Library shortly after their response to the UK government’s AI Action Plan consultation called for an AI-ready National Data Library.

Then came the PDFs. The ESRC and The Wellcome Trust put out a technical white paper challenge and five groups responded with their visions for a National Data Library.

Eleven things to read, plus the Tony Blair Institute’s closely related work, plus the UK’s AI Action Plan which had either morphed into or always been called the AI Opportunities Action Plan and which, in section 1.2, made its own suggestions for what a National Data Library should do.

It has taken a while, but I’ve now read them all. And the bonus 14th paper, Governing in the Age of AI: Building Britain’s National Data Library, which was released during my reading.

Because it’s 2025, I’ve put them all into Google’s fantastic NotebookLM product, listened to a podcast of two AI-Californians discussing the documents, asked lots of questions, including a few silly ones, and used another AI to check lots of what you’re about to read.

Let’s go.

What I read

From the first blog post to the most recent paper the single word that best summarises what I read was “London”. This might seem surprising since all fourteen documents carefully avoid suggesting a location for the new National Data Library they support, but I can explain.

The cover image and the first image of the blog post that first proposes a National Data Library is an AI generated image of a cathedral-like library full of servers with a large main window looking out over the recognisable skyline of the City of London.

To my best guess, a guess backed up by my preferred AI assistant Le Chat Pro by Mistral, 11 of the 14 documents were written and published in London by authors who mostly live and work in London. The exceptions are Oxford and Swindon (twice). The most recent piece, published by the Tony Blair Institute for Global Change in London, and written mostly by authors who live in London, is signed by 11 people. Of these 11, 10 live and work mostly in London. The other is in Cambridge.

I’ve made my point. You can read more of my thoughts on this issue in my extremely long blog post about Why North England is Poor. I’ll move on.

In reading I found it useful to get confirmations, especially via James O’Malley and Gavin Freeguard, that no-one in the UK government yet has a clear vision for what the National Data Library they’ve promised should be or do.

Beyond that, and as I digested the approximately 300 A5 pages of text I’d read over a few weeks, and discussed them with colleagues and AI assistants, the following positive and negative opinions emerged.

Positively,

Negatively,

So what do I think the National Data Library should be?

What should a National Data Library be?

My starting position is to oppose the creation of any new national institution in Britain. Our central government is the largest it has been since WW2 and we have the most centralised government in the world. This centralisation contributes to the UK having the weakest national economy and by far the poorest large non-capital cities in North Europe and North America. Further growing the UK central government and its institutions, no matter how arms-length they are claimed to be, is likely, on average and versus the counterfactual, to decrease the prosperity of Britain and of Britons.

I would oppose the creation of new national institutions in Britain even more strongly if they are likely to be headquartered in London. The disproportionately high spending on public research and development and national institutions in the capital is a subsidy to companies in South East England and a substantial dispreference to prosperity in the rest of the UK. For at least fifteen years the UK government has created and funded new national institutions focusing on data, tech, and AI almost exclusively in London and has thus incentivised companies in these sectors to relocate to and grow faster in South East England. This has made our country weaker and poorer and I could not support any further institution that deepened that damage.

But if a National Data Library was created somewhere else, I might support it.

It should,

I would argue strongly for that location to be Leeds for two big reasons,

  1. NHS data was highlighted as the most obvious big source of value to be unlocked by a National Data Library. Over 90% of the UK’s GP data is held by two companies based in Leeds, TPP and EMIS, which won that position in a competitive market. NHS England is based in Leeds and the functions that take over from it are likely to remain based in Leeds. Many of the NHS’s data sharing technologies and platforms are developed and maintained in Leeds by companies such as BJSS. These organisations, especially TPP and EMIS, were key to the success of OpenSAFELY, the widely celebrated example of the kind of excellent data use that a National Data Library is designed to encourage. If getting more value from health data is a core motivation for setting up a National Data Library, then picking any other location is a decision to ignore these market signals and miss an opportunity to prove that British national institutions back excellence wherever it emerges, not just near to where they overwhelmingly already exist.
  2. The long-standing promise by the British Library to establish a presence in Leeds remains unfulfilled, and their recently announced and enormous expansion plans in London suggest they lack interest. The British Library coming to Leeds has long been an anchor for large plans around innovation and regeneration and much of The Leeds Innovation Arc is already under construction. National government investment such as the National Data Library is key to achieving the potential of this plan for Britain’s fourth largest city and offers the last hope of saving the Grade-I listed Temple Works in the city centre, which is a perfect location for a King’s Cross style development.
Especially following the cancellation of HS2 and NPR, the Temple Works site, a Grade-I listed building in central Leeds, can probably only be saved with public money and investment of the type long promised by The British Library for a Northern site to rival its headquarters in St. Pancras.

But if someone else wants to make the case for Birmingham, Manchester, Liverpool, Newcastle or Glasgow, I’m keen to read it.

Thanks for reading. You may agree with me. You may not. As always, I really appreciate coherent disagreement, especially short blog posts or long comments below. I think that calling me silly or daft is a waste of everyone’s time.

If you have a view on what a National Data Library should be and some idea of the data you’d like it to help get released please complete the survey being run by The Tony Blair Institute for Global Change and The Entrepreneurs Network.

 
