Reflections from Bologna: CKAN and the Power of Community Data at csv,conf,v9
CKAN community members share insights from csv,conf,v9 in Bologna — exploring open data, community impact, and the power of digital public infrastructure.
➤ CKAN was born from a personal need: Rufus wanted data to answer serious questions about the world—but it was hard to find or locked away.
➤ Open data alone isn’t enough: We need tools and infrastructure that help us make sense of it.
➤ Inspired by open source and CPAN: CKAN aimed to do for data what Debian and Linux package managers did for software.
➤ Sensemaking is the next frontier: The future isn’t just about more data—it’s about helping people understand and act on it.
➤ AI is both a risk and an opportunity: It could centralize power—or it could help us clean, enrich, and contextualize data at scale.
➤ The real challenge is the wisdom gap: Can we learn to use our most powerful tools wisely, collectively, and for good?
Rufus Pollock is a leading thinker, technologist, and entrepreneur in the global open data movement. He is best known as the original creator of CKAN, the world’s leading open-source data management system, and as the founder of the Open Knowledge Foundation (OKF), an international nonprofit that has pioneered open data infrastructure and advocacy since 2004.
Over the past 25 years, Rufus has helped shape the global open data ecosystem. He served as an adviser on open data policy to the UK Government, the US White House, the World Bank, and the UN. He is also the creator of Frictionless Data — a framework of lightweight specifications and tooling to make working with data easier, faster, and more interoperable — and founder of DataHub.io, an open platform for discovering and sharing datasets.
Among his many contributions:
Currently, Rufus is:
At CKAN Monthly Live #33, Rufus returned to the community to reflect on CKAN’s origin story — how it began with a personal need to find usable, trustworthy data; how it evolved from a wiki to powering national open data portals; and why we need better tools for sensemaking in an age of information overload and AI disruption.
“CKAN was never meant to become a big open source project. It started because I needed something that didn’t exist — a CPAN for data.”
In the late 1990s, Rufus—then a curious teenager—was asking big questions:
“How many people can the Earth support? Are we going to run out of fossil fuels? What’s the population going to be in 2050?”
He found books filled with tables and graphs. But the raw data—the numbers behind the charts—was either impossible to find or locked behind expensive paywalls.
“You’d read a book with loads of data... but when you went looking for the datasets, they just weren’t available. You could go and find reports, but they might cost like $10,000 or $20,000.”
He realized that the problem wasn’t just knowledge—it was infrastructure. The data wasn’t open, and there were no tools to manage and share it efficiently.
While studying at Cambridge, he encountered open-source tools like Linux and Debian. The experience of using a community-powered, modular system blew his mind.
“Wouldn’t that be possible for data?”, he asked.
CKAN wasn’t initially planned as a product. It began as a tool to power a single site—ckan.net (now datahub.io). The idea was simple: what if data had the same collaborative infrastructure as software?
The origins of CKAN lie in a simple but powerful analogy: what CPAN did for software, CKAN could do for data.
“CKAN is named CKAN because of CPAN.”
➤ The name CKAN comes from CPAN — the Comprehensive Perl Archive Network.
➤ The first version was a wiki (built in MoinMoin), then rewritten in Python using Pylons.
➤ Official launch: Creative Commons Summit 2007 in Dubrovnik.
➤ Originally built to run the catalog site ckan.net (now reborn as datahub.io).
In the early 2000s, Rufus was inspired by the design of open-source ecosystems like Linux and Debian—especially the idea of reusable software components managed through a package registry. CPAN, the Comprehensive Perl Archive Network, stood out as a model: a centralized catalog where developers could publish, discover, and build upon each other’s code.
He envisioned something similar for datasets:
→ A registry where data could be published once, discovered easily, and reused globally.
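The registry idea can be made concrete with a small sketch. The code below is a toy, in-memory illustration of "publish once, discover easily, reuse". It is not CKAN's actual data model or API. The class names, field names, and the example URL are all hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Metadata record for one dataset (all field names are illustrative)."""
    name: str                      # unique identifier within the registry
    title: str
    url: str                       # where the data itself lives
    tags: list[str] = field(default_factory=list)


class Registry:
    """Toy in-memory catalog: publish once, then discover by keyword."""

    def __init__(self) -> None:
        self._datasets: dict[str, Dataset] = {}

    def publish(self, dataset: Dataset) -> None:
        # Names must be unique, much like package names in a software registry.
        if dataset.name in self._datasets:
            raise ValueError(f"dataset {dataset.name!r} is already registered")
        self._datasets[dataset.name] = dataset

    def get(self, name: str) -> Dataset:
        # Reuse: anyone who knows the name can look up where the data lives.
        return self._datasets[name]

    def search(self, query: str) -> list[Dataset]:
        # Discovery: case-insensitive substring match on title and tags.
        q = query.lower()
        return [
            d for d in self._datasets.values()
            if q in d.title.lower() or any(q in t.lower() for t in d.tags)
        ]


registry = Registry()
registry.publish(Dataset(
    name="world-population",
    title="World population estimates",
    url="https://example.org/world-population.csv",  # placeholder URL
    tags=["population", "demography"],
))
matches = registry.search("population")
print([d.name for d in matches])
```

The sketch separates metadata (the catalog entry) from the data itself (the URL), which is the same split a package index makes between a package listing and the package contents.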
CKAN began humbly. The first version was a wiki, built with MoinMoin, running the catalog site ckan.net (now reborn as DataHub.io). In 2007, it was rewritten in Python using Pylons, and officially launched at the Creative Commons Summit in Dubrovnik.
At the time, there was no grand roadmap. CKAN was not created to be a global open-source standard. It was simply a tool to solve a problem: how to make datasets findable, reusable, and trustworthy.
But the timing was right.
Around 2008–2010, the open data movement accelerated and gained political traction. Governments in the US, the UK, and elsewhere needed working tools, CKAN was ready, and they began reaching out:
“We need something like this now.”
CKAN was already mature, working, and open source—so they adopted it. From the UK and US to Australia and Canada, it quickly became the backbone of dozens of national open data portals.
From 2009 to 2014, CKAN rapidly evolved:
“It started as a tool for one site, and became a global infrastructure.”
People weren’t just publishing open government data anymore. They were using CKAN for:
Rufus credits the community—including early contributors like Adrià Mercader and Steven De Costa—for growing CKAN into what it is today.
“CKAN today is this mature, powerful, extensible platform. It’s the world’s leading open source data management system.”
At the beginning of the open data movement, many believed something simple: more data → more insight → better action.
“If we just had more data… we’d get knowledge. From knowledge would come insight. And from insight, we’d get action.”
But reality didn’t follow that path.
Why?
Data doesn’t speak for itself. It needs interpretation. Framing. Meaning.
“Information without meaning creates even more confusion. Openness requires open minds.”
A central theme of Rufus’ talk was sensemaking—how individuals and societies interpret information and decide what to do.
He cited the story of a firefighter who survived by lighting a counter-fire—while others ran and died. Why? They couldn’t make sense of what he was doing. Their mental models failed.
“We’re always making sense—especially in times of crisis or change. And our tools need to support that.”
This insight reshaped his own work:
From climate collapse to AI risk, we’re facing multiple interwoven crises.
Rufus calls this the meta-crisis—a breakdown in the systems we use to make sense of the world.
And data? It’s part of the solution—but only if we use it well.
Rufus addressed the emerging role of AI with nuance:
“The AI + open data combo is incredibly powerful—but it must be democratic, ethical, and human-centered.”
Open data and open source still suffer from one key challenge: unsustainable funding.
Rufus argued that current models (like crowdfunding or corporate donations) aren't enough. He proposed a radical solution:
This could scale to music, software, and even medicine.
The full argument is in his book: The Open Revolution (free to download).
CKAN has moved beyond open data. It's now used for:
The platform remains modular, extensible, and open by design.
And the vision? A universal “data fabric” for trustworthy, contextual, human-readable data.
“We don’t just need data. We need ways to tell stories with it, to make sense of it, and to act on it—together.”