Transport Data Commons: Open, Interoperable Transport Data Built on CKAN

Explore how the Transport Data Commons, powered by CKAN, is transforming transport data from siloed PDFs into a global, interoperable, and reusable knowledge base—especially for low-emission mobility in the Global South.

Transport Data Commons title graphic with train image symbolizing global mobility data sharing

Overview

The Transport Data Commons (TDC) is building a shared global infrastructure for sustainable mobility data — especially in the Global South, where transport emissions are rising fastest. At CKAN Monthly Live #34, Nicolas Becker of GIZ and the TDC Initiative gave a detailed walkthrough of the TDC’s vision, technical architecture, use cases, and what’s coming next.

See the TDC Portal: portal.transport-data.org

This recap captures the essential insights from his talk, including the problem TDC solves and why CKAN was chosen.

Transport Data Commons portal homepage

Why the Transport Sector Needs a Commons

Nicolas opened with context:

“Transport or mobility is a basic good we consume in our daily life… Most of us still commute to work, go grocery shopping, visit family. This is all consumption of mobility.” – Nicolas Becker

Yet transport is the only major sector where CO₂ emissions are still rising—especially in non-OECD (Global South) countries.

“A large percentage of this increase is caused by increased traffic in non-OECD countries.” – Nicolas Becker

To tackle this, data is critical—for modeling, policy, and accountability. But right now, valuable data is:

  • Locked in PDFs
  • Buried in final reports
  • Forgotten after projects end

“Ultimately, this data has limited lifetime… it usually ends somewhere in a drawer.” – Nicolas Becker

The Birth of the Transport Data Commons

In 2022, Nicolas and his team at GIZ Transport and Climate Change interviewed NGOs, academics, and development institutions. Everyone agreed: siloed, single-use data was a problem.

At the International Transport Forum (ITF) summit in Leipzig, stakeholders laid the foundation for the Transport Data Commons Initiative (TDCI).

“We’re more of a birds-of-a-feather working group than a formal organization—individuals and institutions collaborating on a shared goal.” – Nicolas Becker

Timeline and Milestones

“The Datopian team turned our Figma prototype into a fully functional CKAN platform.” – Nicolas Becker

Two Core Use Cases

1. Data Consumers (researchers, analysts, NGOs, etc.)

“This could be a student, a researcher, an NGO employee… someone working on a study who needs transport data but doesn’t know where to find it.” – Nicolas Becker

Portal features:

  • Full-text search with autocomplete
    Easily find datasets using natural language or keywords.
  • Advanced filtering options
    Narrow results by country, region, topic, organization, format, or time range.
  • Dataset previews
    View tables, metadata, and sample records before downloading.
  • API access
    Retrieve data programmatically via the built-in CKAN API.
  • Global and country-level data views
    Explore datasets by geography through interactive country dashboards.
  • Multi-format citation support
    Automatically generate citations in APA, BibTeX, and LaTeX formats.
  • Transparent metadata
    Access source, licensing, update frequency, and contributor info.
  • Data licensing visibility
    Quickly see reuse conditions and open licenses.
  • Download in standard formats
    CSV, XLSX, GeoJSON, and more—depending on the dataset.
  • Responsive interface
    Works on desktop and mobile browsers.
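
The search, filtering, and API features above map onto CKAN's Action API, whose `package_search` endpoint accepts a free-text `q`, a Solr-style `fq` filter string, and a `rows` limit. The following is a minimal sketch rather than the portal's documented client: the base URL is a placeholder (the talk does not give the API address), and the code only builds the request URL and reads titles from a response-shaped dictionary, so it runs without network access.

```python
from urllib.parse import urlencode

# Assumption: placeholder host, not the real TDC API address.
BASE_URL = "https://ckan.example.org"


def build_search_url(base_url, query, filters=None, rows=10):
    """Build a CKAN Action API package_search URL."""
    params = {"q": query, "rows": rows}
    if filters:
        # A leading "+" marks each clause as required in Lucene/Solr syntax.
        params["fq"] = " ".join(
            f'+{field}:"{value}"' for field, value in filters.items()
        )
    return f"{base_url}/api/3/action/package_search?{urlencode(params)}"


def extract_titles(payload):
    """Return dataset titles from a package_search response body."""
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [dataset["title"] for dataset in payload["result"]["results"]]


# res_format is CKAN's standard index field for resource file formats.
url = build_search_url(BASE_URL, "vehicle fleet", {"res_format": "CSV"}, rows=5)
print(url)

# Hand-written stand-in for a real response, shaped like CKAN's JSON envelope.
sample = {
    "success": True,
    "result": {"count": 1, "results": [{"name": "example-fleet", "title": "Example fleet dataset"}]},
}
print(extract_titles(sample))
```

In practice the URL would be fetched with an HTTP client and the JSON body passed to `extract_titles`; which filter fields are available depends on how the portal's CKAN schema is configured.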

Screenshots from the portal: advanced filtering options, transparent metadata, dataset preview, and multi-format citation support.

2. Data Providers (governments, NGOs, research labs)

Organizations or individuals with relevant transport data can contribute datasets through a guided and secure workflow.

“Users log in with GitHub and can submit datasets via a guided form.” – Nicolas Becker

Key features include:

  • GitHub-based login
    Authenticate via GitHub for secure access.
  • Step-by-step dataset upload wizard
    Add metadata, files, keywords, geographies, and licensing info.
  • Save as draft
    Prepare datasets incrementally before submission.
  • Public/private dataset toggle
    Choose to publish immediately or keep datasets private within your organization.
  • Custom approval workflow
    All submissions and edits are reviewed by administrators before going live.
  • Role-based access control
    Admins, editors, and contributors have different publishing permissions.
  • Metadata quality guidance
    Inline help and tooltips guide contributors to follow metadata standards.
  • Validation checks
    Automatic format and completeness checks ensure quality submissions.
  • Contributor dashboard
    View your submitted datasets, pending approvals, and publishing activity.
  • Secure file storage with access control
    Files are stored in Cloudflare R2 with signed URLs and access restrictions.

These tools enable collaborative publishing, quality control, and community-led curation of high-quality, standards-compliant transport data.
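
The validation step in the list above can be pictured as a completeness and format check that runs before a submission enters review. This is a minimal sketch under stated assumptions: the required fields and allowed formats are illustrative and are not taken from the portal's actual schema, which the talk does not describe.

```python
# Assumptions: these field names and formats are illustrative only.
REQUIRED_FIELDS = ("title", "description", "license", "geography")
ALLOWED_FORMATS = {"csv", "xlsx", "geojson"}


def validate_submission(metadata, filenames):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    # Completeness: every required field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing required field: {field}")
    if not filenames:
        problems.append("no files attached")
    # Format: each file must carry an allowed extension.
    for name in filenames:
        extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        if extension not in ALLOWED_FORMATS:
            problems.append(f"unsupported file format: {name}")
    return problems


print(validate_submission({"title": "Bus routes"}, ["routes.pdf"]))
```

Returning a list of problems instead of raising on the first one lets a form show the contributor everything that needs fixing in a single pass.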

Platform Infrastructure

  • Backend: CKAN
  • Frontend: PortalJS
  • Storage: Cloudflare R2 with S3-compatible buckets
  • Data privacy: Signed URLs, private dataset controls
  • Custom features:
    • Approval workflows
    • Role-based publishing
    • Embeddable citation snippets
    • Geographic data views

“We’re using a proxy with signed URLs so private files can’t be accessed or shared without permission.” – João Demenech, Datopian
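
The idea behind a signed URL is that the server attaches an expiry time and a keyed hash to the link, so a link that has been altered or used after expiry is rejected. The sketch below illustrates that general technique with an HMAC from Python's standard library. It is not the AWS Signature Version 4 scheme that S3-compatible stores use for presigned URLs, and the secret, host, and parameter names are hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"  # Assumption: placeholder key for illustration only.


def _signature(path, expires):
    """Keyed hash over the path and expiry, so neither can be changed."""
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(base, path, ttl_seconds, now=None):
    """Return a URL for `path` that stays valid for `ttl_seconds`."""
    now = int(time.time()) if now is None else now
    expires = now + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{base}{path}?{query}"


def verify_url(url, now=None):
    """Check both the signature and the expiry time of a signed URL."""
    now = int(time.time()) if now is None else now
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        provided = params["sig"][0]
    except (KeyError, ValueError):
        return False  # Missing or malformed parameters.
    expected = _signature(parts.path, expires)
    # compare_digest avoids leaking information through comparison timing.
    matches = hmac.compare_digest(provided.encode(), expected.encode())
    return matches and now <= expires


link = sign_url("https://files.example.org", "/private/report.csv", ttl_seconds=300, now=1_000)
print(verify_url(link, now=1_100))  # checked inside the validity window
print(verify_url(link, now=2_000))  # checked after the link has expired
```

Because the path is part of the signed message, a link issued for one file cannot be reused for another, which is the property the quote describes for private files.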

Current Status (Mid-2025)

“We’re in the transition from beta testing to full operation.” – Nicolas Becker

As of June 2025:

  • 400+ datasets
  • 17 organizations onboarded
  • 7 topic categories
  • Regional data from countries such as India, Ethiopia, and Malawi

Why CKAN?

“Basically because it really fulfills all of these prerequisites we defined in the initial ideation phase — how a solution for one place for all data could look like.” – Nicolas Becker

During the early design phase, the TDC team identified key needs: openness, extensibility, ease of access, and an active developer community. CKAN checked all the boxes.

“CKAN was this open-source solution out there — sort of the de facto standard. We included it in our research, alongside examples like HDX and EnergyData.info, which were already using CKAN. Both developed by Datopian.” – Nicolas Becker

CKAN’s plugin architecture and strong developer base also gave the team confidence:

“We chose CKAN because we can build on a strong community and scale with custom extensions as needs evolve.”

Looking Ahead

TDC isn’t about collecting more data — it’s about making existing data FAIR (Findable, Accessible, Interoperable, Reusable).

Upcoming Priorities

  • Harmonizing and cleaning datasets
  • Automating quality checks
  • Standardized pipelines for ingestion and validation
  • Contributor metadata enrichment
  • Visualization tooling for geographies and topics
  • Community engagement and co-design

“We envision a one-stop shop that not only hosts data but improves its quality through standards and collaboration.” – Nicolas Becker

The official public launch is planned for COP30 (UN Climate Change Conference) in November 2025, coinciding with the kickoff of the UN Decade of Sustainable Transport.

Q&A Highlights

How do you encourage contributions?

“We offer flexible levels of involvement and co-design opportunities during the portal’s development.” – Nicolas Becker

Are you focused only on public transport?

“No. All transport-related data—fleets, emissions, public transport—is welcome. The users decide what’s useful.” – Nicolas Becker

How does the approval workflow work?

“Data is reviewed before going public. Even edits trigger a new review. It’s all role-based and admin-controlled.” – Nicolas Becker
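
The review rule described here (nothing goes public without approval, and any edit sends a dataset back into review) can be sketched as a small state machine. The state names, role names, and class below are hypothetical and are not taken from the TDC codebase or from a CKAN extension. For simplicity the sketch hides a dataset while an edit is pending; whether the live portal keeps the previous version visible is not stated in the talk.

```python
from dataclasses import dataclass

# Hypothetical state names for illustration.
DRAFT, PENDING, PUBLISHED = "draft", "pending_review", "published"


@dataclass
class Dataset:
    title: str
    state: str = DRAFT

    def submit(self):
        """A contributor sends a draft for review."""
        if self.state != DRAFT:
            raise ValueError("only drafts can be submitted")
        self.state = PENDING

    def approve(self, role):
        """Only an admin can publish a dataset that is awaiting review."""
        if role != "admin":
            raise PermissionError("only admins can approve")
        if self.state != PENDING:
            raise ValueError("nothing to approve")
        self.state = PUBLISHED

    def edit(self, new_title):
        """Editing a published dataset sends it back into review."""
        self.title = new_title
        if self.state == PUBLISHED:
            self.state = PENDING

    @property
    def is_public(self):
        return self.state == PUBLISHED


dataset = Dataset("Bus routes")
dataset.submit()
dataset.approve("admin")
dataset.edit("Bus routes (revised)")
print(dataset.state)
```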

How large are datasets? Where are they stored?

“Mostly megabytes. Stored in Cloudflare R2 with S3 APIs. Access is managed by CKAN’s permission system with presigned URLs.” – Luccas Mateus & João Demenech, Datopian

🎥 Watch the Recording

Missed the live session? 👉 Watch the full session here.

FAQs

What is the Transport Data Commons?
A shared, global data platform focused on improving the usability and lifespan of transport datasets for climate, policy, and research purposes.

Who is involved?
Led by GIZ with support from UNESCAP, the European Commission (EC), the Asian Development Bank (ADB), and many others.

Can I contribute data?
Yes—if you have a GitHub account, you can submit transport-related datasets subject to approval.

What software powers the portal?
CKAN for the backend and PortalJS for the frontend, with files stored in Cloudflare R2.

When will it launch publicly?
Planned for COP30 in November 2025, aligned with the start of the UN Decade of Sustainable Transport.