A new WHO discussion paper on artificial intelligence and evidence-informed policymaking reads, on its surface, like a practical guide.
It explains how to use AI to accelerate systematic reviews, integrate heterogeneous data, model scenarios, and enable "living evidence" workflows that continuously update as new studies emerge. It then maps these capabilities across the policy cycle and offers principles for responsible adoption.

But its deeper argument is more consequential and deserves attention. AI is reshaping what counts as evidence in public policy, and the institutions that govern evidence aren't yet shaping what comes next.
Improving the Evidence-to-Policy Pipeline
AI is rapidly changing how information pipelines are assembled and updated. Instead of relying on episodic reviews or overly simplistic research overviews, governments can bring in more information, more quickly, and revisit decisions more often.
This guidance lands at a moment when many public health systems are under tremendous pressure. Budgets have tightened, staffing shortages persist, and translating evidence into policy remains slow and resource-intensive.
My experience as a policy analyst specializing in the uptake of research and innovation taught me that the bottleneck is rarely a lack of knowledge. Rather, it is the capacity and effective presentation of information that can be processed and applied in time to impact decision-making.
We can use AI to ease that constraint. Smaller teams can synthesize evidence, revisit decisions, and manage information flows in ways that were previously difficult to sustain.
That is where the value becomes more than incremental. If this promise is realized, AI shifts from a source of marginal efficiency gains to a structural improvement in how evidence informs policy.
Given that the evidence-to-policy gap has been one of the most persistent barriers in public health in our lifetime, that shift is essential.
The Key Risks
The paper leans on a concept it calls epistemic injustice. This is the idea that AI systems privilege "quantifiable, data-rich evidence from dominant institutions while marginalizing other valuable knowledge sources such as lived experience, local expertise, Indigenous knowledge systems, and community-based insights."
Models built in data-rich settings may misrepresent realities in data-scarce ones.
Proxy variables, like using income as a stand-in for health outcomes, can flatten complex social relationships into clean numerical inputs. And generative AI tools can produce summaries that read as authoritative even when the underlying evidence is thin or fabricated.
The commercial concentration of AI development also raises real questions about whose tools, trained on whose data, end up shaping government decisions.
The cumulative effect of relying on LLM interpretation could be an unintended, and ironic, narrowing of the scope of research. And translating cultural and linguistic realities is a weakness that existing research and evaluation standards already struggle to correct.
Faster decisions can come at the expense of inclusiveness. More data does not mean better evidence. And the groups most likely to be missed are those whose knowledge is hardest for AI systems to ingest in the first place.
What is Augmentation?
The paper emphasizes augmentation over automation. This means treating AI-generated outputs as inputs to human judgment. Humans frame the questions, evaluate the quality of the evidence, interpret results in context, and weigh ethical considerations.
Tasks requiring normative judgment, like what trade-offs to accept, whose values to center, and how to weigh equity against efficiency, stay assigned to people. The paper specifically calls for multidisciplinary oversight teams that include expertise in sociology, anthropology, psychology, and ethics, not just data science.
Consider a ministry updating maternal health guidelines, an area where research is particularly scarce and the stakes are often life or death. Today, that process may take a year, two, or five. With AI, an initial evidence synthesis can be done in days. But the value comes from how that output is used.
The ministry can take those summaries and test them against what health workers are seeing on the ground, with AI helping translate those frontline observations into forms the evidence pipeline can absorb.
Are the recommendations feasible? What is missing? Where do they conflict with how care is actually delivered? Those differences can, in theory, be documented rapidly across a nation with the appropriate adoption of AI tools and used to adjust guidance before it is finalized.
Instead of waiting years for a full revision, the ministry can return to specific sections as new information comes in, whether from new studies or frontline experience.
As a result, AI can enable policymakers to gather, synthesize, and analyze research evidence and insights from affected communities more efficiently and more equitably.
A simple way forward begins to look like this:
- Use AI to summarize a defined set of evidence on a policy question
- Compare those outputs with input from practitioners or local partners, led by the communities affected
- Use AI to document and synthesize where they align and where they diverge
- Adjust guidance before finalizing decisions
However, for this to work in practice, government decision-makers must deliberately build institutional processes for testing how AI outputs hold up against frontline knowledge, and develop the capacity that makes broader AI adoption defensible.
Why This Matters
Health ministries are an important case, but not a special one.
The same dynamics are present in education, social protection, climate adaptation, tax administration, and virtually any other policy domain. Pressure to move faster. Constrained capacity. AI vendors offering tools that promise to close the gap.
The WHO paper's contribution is to model what serious engagement with these tools looks like when the goal is to strengthen evidence-informed policymaking rather than to skirt around it.
And most importantly, the paper is honest that this is a foundation for ongoing dialogue, not a settled framework. Concepts of evidence are evolving. Regulatory landscapes are fragmented.
The hardest question, which the paper raises but cannot yet answer, is whether the institutions responsible for evidence generation and governance can adapt quickly enough to shape AI integration on their own terms.
In my experience working with health ministries and the global health community, many local health workers and clinics still use paper-based data collection and have yet to adopt digital health systems, let alone AI.
Closing this gap will require massive prioritization and upskilling.
I asked a Syrian colleague running hospitals and clinics what he thought of the guidance. His reply?
“Personally, I have trust issues with AI. I use it for simple tasks like translation, basic data analysis, and proofreading, but I always review the results carefully. I would never trust it to make a decision for me.
For AI to be useful in this kind of work, we would need to understand the process: how it reaches its conclusions, what data was used to train the model, and whether that data could introduce bias.”
I followed up with, “Could this work with the government happen any time soon?”
“I don’t think so. I’m close to people in the Ministry of Health, and they’re struggling just to manage with the resources they have.”
If community health leaders cannot catch up, the technology and those who build it will define the practice. And the quiet erosion of whose evidence counts will rewrite reality, with or without a framework to make sense of it.