Matti Schneider has a trick for cutting through AI hype. When civil servants come to him buzzing about what artificial intelligence will do for their agency, he tells them to swap the words “AI” for “a dozen interns on cocaine” and see if the sentence still makes sense.
“Would you give a dozen interns on cocaine the job of mocking up a prototype for next week’s team meeting?” Sure, worth a shot. “Would you delegate your core service delivery — the one that determines next year’s budget — to a dozen interns on cocaine?” Obviously not.
What makes the analogy stick is not just the chaos. It's confidence. Like over-eager interns, these systems are built to produce answers, not to question them.
Schneider isn’t just being provocative. He’s offering a simple test of risk tolerance: where is approximation acceptable, and where is it not?
That distinction matters because the current wave of investment in artificial intelligence is blurring it. The problem is not that AI is useless in government. It’s that we are investing in the wrong kind of AI for the most important public functions.
Schneider has spent over a decade building something most AI evangelists don’t even know exists: computational legal tools. The idea is straightforward, even if the execution is hard. You take the rules that govern people’s lives—tax brackets, benefit eligibility thresholds, housing allowances—and translate them, provision by provision, into code that a computer can execute. This is not code that guesses or predicts. It computes.
You put in a family’s circumstances; you get back the taxes they owe and benefits they qualify for, with the same result every time. Every step is auditable. It is, in other words, the opposite of what a large language model does.
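To make that concrete, here is a minimal sketch of what a rule looks like once it has been translated into code. It is not OpenFisca's actual API, and the household fields and thresholds are invented for illustration; the point is the shape of the thing: a provision becomes a pure function that returns the same answer for the same household every time, with every parameter written out where an auditor can see it.

```python
from dataclasses import dataclass

@dataclass
class Household:
    monthly_income: float   # net income, in euros
    children: int
    monthly_rent: float

# Hypothetical parameters, spelled out explicitly so they can be audited
# and updated when the legislation changes. None of these are real thresholds.
INCOME_CEILING_BASE = 1_500.0
INCOME_CEILING_PER_CHILD = 400.0
HOUSING_AID_RATE = 0.30  # illustrative share of rent covered

def housing_aid(h: Household) -> float:
    """Deterministic rule: the same household in, the same amount out."""
    ceiling = INCOME_CEILING_BASE + INCOME_CEILING_PER_CHILD * h.children
    if h.monthly_income > ceiling:
        return 0.0
    return round(HOUSING_AID_RATE * h.monthly_rent, 2)

# A family's circumstances go in; an entitlement comes out.
family = Household(monthly_income=1_700.0, children=2, monthly_rent=800.0)
print(housing_aid(family))  # 240.0, reproducible and traceable step by step
```

A production engine handles thousands of such parameters and formulas, versioned over time so that past computations can be reproduced, but the principle is the same: nothing is guessed, everything is computed.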
Schneider, a French engineer who co-built the government startup incubator beta.gouv.fr, has worked on three continents doing exactly this. Since 2014, much of that work has centered on OpenFisca, arguably one of the most successful civic tech projects most people have never heard of. He now runs it as an independent nonprofit after the French government stopped funding the project.
He calls OpenFisca a “fancy calculator,” and he means that as a compliment. “It just calculates legislation,” he says. “You describe how much rent you pay, how many children you have, how many employees your company has, and then you press a few buttons, and in the end, you get the result of the computation." It won’t replace judges. It won’t automatically disburse funds based on a database. It’s a tool, not an oracle.
That simplicity is the point.
The project began in 2011 inside France Stratégie, the prime minister's strategic planning body, where two economists — Mahdi Ben Jelloul and Clément Schaff — were frustrated with the expensive, opaque microsimulation tools available to them. They built their own and open-sourced it. When the government's open data taskforce, Etalab, picked it up in 2014, things accelerated. Engineers (Schneider among them) showed up at a hackathon and started building citizen-facing applications on top of the engine.
One of those applications, Mes Aides, informed over 1.6 million French households of their eligibility for benefits in a single year. A household can find out in minutes what it is entitled to across dozens of programs, such as housing aid, family allowances, and disability support, without contacting a single agency.
Another, LexImpact, lets members of parliament simulate the effects of tax amendments during live debate. In 2021 alone, French parliamentarians ran 122 simulations with LexImpact, and simulations like these changed the legislation that was ultimately enacted. The French employment agency, meanwhile, used the OpenFisca engine to help jobseekers understand how returning to work would affect their benefits.
The model spread internationally from Europe to Africa to Oceania, powering tax and benefit simulations across jurisdictions. At a 2016 Open Government Partnership hackathon, a small team modeled Senegal's income tax code in 36 hours and shipped a working simulator. Later, Schneider moved to New Zealand and joined the government’s service innovation lab to create a rates rebate calculator for homeowners. In 2023, OpenFisca won the Edge of Government Innovation Award at the World Government Summit.
And then the money ran out.
The sustainability trap
In 2020, the French government announced it would stop funding OpenFisca. “We got too successful,” Schneider says. The logic was simple: if so many others were using it, why should one government pay?
A short-term COVID relief grant provided a lifeline, allowing Schneider to spin the project out into an independent nonprofit and experiment with a membership model. But the underlying problem remained: adopters around the world used the codebase without contributing to its maintenance. Some forked it into new projects, like PolicyEngine. Now a nonprofit backed by a partnership with the National Bureau of Economic Research, PolicyEngine models tax and benefit policy across U.S. states and the UK, proof that OpenFisca's DNA is thriving even as the original project struggles.
Schneider admits he assumed that because the project was embedded inside the government, it would have long-term stability. It didn't. "Even working within government, you are at the mercy of a change in direction, and in a way more so than in a commercial venture," he says. “A company's cash flow keeps its leadership tethered to reality. A new political appointee faces no such constraint.”
Anyone investing in public AI infrastructure should be paying attention. OpenFisca proved that open, collaboratively maintained legal infrastructure can work across borders and political systems. But it also showed that without deliberate investment in governance, funding, and independence, even successful systems can starve.
Where the calculator meets the chatbot
Now Schneider worries that large language models are accelerating that neglect.
He recalls a representative of an international institution asking why anyone would bother making law computable when you could simply feed hundreds of pages of legislation into a language model.
His answer is what he calls the 1% problem.
Even if you assume a language model is 99% accurate on legal questions — a generous assumption — a 1% error rate in tax collection or benefits eligibility is not a rounding error. It is systemic failure. At the scale of Mes Aides, which reached 1.6 million households in a single year, one percent means sixteen thousand wrong determinations. A 1% error rate in a chatbot is tolerable. A 1% error rate in determining people’s rights is not, and the resulting loss of public trust can't be measured at all.
But Schneider isn't a Luddite about this. He sees three ways in which AI and rules-as-code reinforce each other.
First, LLMs can help with the drudge work of updating parameters, such as reformatting new tax brackets from legalese into structured data. "LLMs are really good at manipulating formats," he says. "When you're not asking them to invent things, but simply to reformat, they really shine."
Second, they can take the precise but statistical output of OpenFisca's computations and translate it into plain-language summaries for policymakers. The economist runs the simulation; the LLM writes the four-page executive summary.
The third possibility, and the one Schneider is most cautious about, is using rules-as-code systems as guardrails for AI agents, checking LLM outputs against formally encoded law. "There's certainly something here," he says. But when asked what he'd want to test first, his answer isn't a demo. "I want to red-team the whole thing. We’re going to try and confuse it, and assess safety before potential utility."
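What such a guardrail could look like, very roughly: nothing the language model drafts reaches a citizen until its central claim has been recomputed against the encoded rules. The sketch below is hypothetical, not an existing integration, and it reuses the illustrative Household and housing_aid definitions from the earlier example rather than a real legal model.

```python
# Continues the earlier illustrative sketch: Household and housing_aid()
# are the hypothetical rule defined above, not OpenFisca's real API.

def guardrail_check(claimed_amount: float, household: Household,
                    tolerance: float = 0.01) -> bool:
    """Recompute the entitlement deterministically and compare it with the
    amount the LLM claims in its drafted reply."""
    return abs(claimed_amount - housing_aid(household)) <= tolerance

# Suppose the model confidently tells a family it is owed 310 euros a month.
family = Household(monthly_income=1_700.0, children=2, monthly_rent=800.0)
if not guardrail_check(310.0, family):
    print("Mismatch with the encoded rule: route to a human caseworker.")
```

Even in this toy form, the check only catches numbers the encoded rule can recompute; it says nothing about whether the question was framed correctly in the first place, which is exactly the kind of failure Schneider wants to probe before trusting the pattern.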
The limits of generative systems
Schneider draws from information theory to explain the difference between computational and generative AI systems. Large language models are good at expanding information, taking something compact and generating something rich and varied. They are far less reliable at taking something ambiguous and producing something precise.
Ask a language model to summarize a complex policy landscape, and it will produce a plausible narrative. Ask it to determine exactly who qualifies for a benefit, and small ambiguities can compound into real errors.
To drive home the point that sometimes the right tools are the simplest ones, he points to the French voting system: paper ballots, transparent ballot boxes, a maximum of 1,000 people per polling station, and an open invitation for anyone to stay and count. It is the very simplicity of the “technology” that makes the system so trustworthy. He worries that “when law is not understandable anymore, we lose the very fabric of democracy.”
The worst-case scenario, as Schneider frames it, isn't that AI fails. It's that we "yield to complexity," stop believing humans can manage the rules that govern their own societies, and outsource all the cognitive work to systems nobody can audit. Then, when "your friend has been brought to jail, or your children cannot go to university, because computers said so," he warns, "it's just a few years before you burn the computers down. And honestly, I will be with the ones who burn them down."
Fifteen years of OpenFisca offer a different path, and a warning. Governments can build shared, transparent, computable infrastructure for the rules that shape people's lives, and ensure that it works. Millions of households are informed of their rights. Lawmakers are simulating reforms before they vote. Jobseekers are making better decisions about returning to work. All of it deterministic, all of it auditable, all of it open.
The lesson is not to reject AI, but to be precise about where it belongs.
At the scale of government, prediction is not enough. For rights and entitlements, you need computation and transparency, not probability and plausibility.