Tutorials

Multi-Source Data Aggregation for Public Health Platforms: Reconciling Africa CDC, WHO, Johns Hopkins, and Our World in Data

Four credible sources, four different numbers for the same indicator on the same day. The discipline of multi-source aggregation is not picking the winner. It is publishing the harmonised series with the methodological caveats preserved, so users understand what they are looking at.

Written by

PANEOTECH Team

Published

July 12, 2021

Read time

8 min read

The four-sources problem

A continental health platform building on international data inherits a structural problem. Multiple credible sources publish epidemiological indicators for the same African countries on the same day, and the numbers do not always agree. The Africa Centres for Disease Control and Prevention publish official continental aggregations. The World Health Organisation Regional Office for Africa publishes its own series. Our World in Data harmonises further and applies its own methodology. Johns Hopkins University publishes a global series with its own collection cadence. Each source is methodologically defensible. None is wrong. They simply represent different snapshots of an evolving data flow.

The temptation is to pick a winner. Choose one source as canonical and ignore the others. The temptation is wrong. Each source has strengths the others do not. Africa CDC carries continental institutional weight. WHO Afro carries the global health authority frame. Our World in Data carries methodological transparency and revision discipline. Johns Hopkins carries the global comparative frame. Picking one and discarding the others reduces the platform's value to the analytical communities that need access to all four perspectives.

The harmonisation discipline

The architectural answer is harmonisation rather than selection. The platform ingests all credible sources, normalises country names to standardised codes at the ingestion boundary, aligns indicator definitions through documented mappings, and publishes both the harmonised series and the source-by-source breakdowns. Users see the consolidated continental view by default, and can drop down to the source-level series when they need to understand discrepancies, methodology differences, or revision patterns.
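As a concrete illustration of normalising at the ingestion boundary, the sketch below maps source-specific country names to ISO 3166-1 alpha-3 codes before any record enters the harmonised store. The alias table, column names, and source labels are illustrative assumptions, not the actual COVID Watch Africa schema; a production table would cover every spelling variant the upstream sources actually emit.

```python
# Minimal sketch: normalise source records to a shared schema at ingestion.
# Alias table and schema fields are hypothetical examples.
COUNTRY_ALIASES = {
    "tanzania": "TZA",
    "united republic of tanzania": "TZA",
    "cote d'ivoire": "CIV",
    "côte d'ivoire": "CIV",
    "ivory coast": "CIV",
    "democratic republic of the congo": "COD",
    "congo, dem. rep.": "COD",
}

def to_iso3(raw_name: str) -> str:
    """Map a source-specific country name to an ISO 3166-1 alpha-3 code."""
    key = raw_name.strip().lower()
    if key not in COUNTRY_ALIASES:
        # Fail loudly: an unmapped name is a pipeline defect, not a data point.
        raise ValueError(f"Unmapped country name: {raw_name!r}")
    return COUNTRY_ALIASES[key]

def normalise_record(source: str, record: dict) -> dict:
    """Rewrite one ingested row into the platform's harmonised schema,
    preserving the source attribution for the source-level breakdowns."""
    return {
        "source": source,
        "iso3": to_iso3(record["country"]),
        "date": record["date"],
        "indicator": record["indicator"],
        "value": record["value"],
    }
```

Rejecting unmapped names outright, rather than passing them through, is what keeps the country dimension trustworthy: every downstream join assumes the code space is closed.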

The discipline that makes harmonisation work is methodological transparency. Every harmonised indicator is documented with its source mappings, its conversion logic, its revision policy, and the date of its last upstream update. Discrepancies between sources are surfaced, not concealed. Where the sources disagree by more than a methodological margin, the platform shows both rather than averaging them into a single number that misrepresents the underlying flow. The user sees the data the sources actually publish, with the platform's harmonisation work visibly in service of comparability rather than concealment.
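The "show both rather than average" rule can be sketched as a consolidation step that publishes a single harmonised value only when the sources agree within a methodological tolerance, and otherwise flags the cell and exposes the source-by-source breakdown. The 5% relative-spread threshold and the field names here are illustrative assumptions, not the platform's actual policy.

```python
from statistics import median

def consolidate(values: dict[str, float], tolerance: float = 0.05) -> dict:
    """Consolidate per-source values for one (country, date, indicator) cell.

    If all sources agree within `tolerance` (relative spread around the
    median), publish the median as the harmonised value. Otherwise publish
    no single number: flag the discrepancy and surface the breakdown,
    rather than averaging sources into a figure none of them reported.
    """
    vals = list(values.values())
    mid = median(vals)
    spread = (max(vals) - min(vals)) / mid if mid else 0.0
    if spread <= tolerance:
        return {"harmonised": mid, "sources": values, "flag": None}
    return {
        "harmonised": None,
        "sources": values,
        "flag": f"sources disagree by {spread:.1%}; see source breakdown",
    }
```

The median (rather than the mean) keeps one outlier source from dragging the harmonised value, and the `None` in the disagreement case forces the presentation layer to render the breakdown instead of silently picking a number.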

What we built for COVID Watch Africa

PANEOTECH delivered the multi-source aggregation pipeline behind COVID Watch Africa for POLIWATCH AFRICA. Africa CDC, WHO Afro, Our World in Data, and Johns Hopkins University were ingested as separate streams, harmonised at the country and indicator level, and published as both consolidated continental series and source-attributed breakdowns. The methodology page on the platform documented the harmonisation logic, the source-by-source mappings, the revision policy, and the limitations users should account for in their interpretation.

The result was a platform that supported the analytical work of public health teams, journalists, and researchers across the continent, with the credibility that came from showing the data rather than improvising it. Decision-makers consulting the platform saw the harmonised continental view they needed at speed, with the source-level transparency they needed for the published record.

The institutional lesson

For public health platforms drawing on multiple international sources, the choice is not between picking a winner and presenting confused data. It is between honest harmonisation with methodological transparency and the false simplification that costs the platform its credibility the first time a user spot-checks a number against an upstream source. Aggregate honestly, document fully, and the platform earns the institutional trust that public health information demands.

About the author

PANEOTECH Team

Pan-African Digital Systems Engineering

PANEOTECH designs and delivers secure, scalable, and sustainable digital ecosystems for governments, multilateral institutions, and the private sector across Africa. Field notes, case studies, and analyses from our engagements appear in this publication.
