South Africa’s AI Policy Failed Because the Evidence Trail Broke

South Africa’s withdrawn AI policy is easy to mock.

A national policy about artificial intelligence was itself undermined by what appear to have been AI-generated fake references. The irony is obvious. It is also the least useful part of the story.

The obvious question is: how did AI hallucinate sources?

The more serious question is: how did a public policy document move through review without anyone catching that parts of its evidence base could not be verified?

That is not only an AI problem. It is a workflow problem.

A white paper or policy process does not fail at the point where the final PDF is published. It usually fails much earlier, when source material is collected badly, evidence is not structured properly, claims are not linked back to sources, references are treated as decoration, and review becomes a reading exercise instead of an evidence-checking process.

AI can make that weakness worse. But AI did not invent the need for source traceability, citation checking, evidence review, and human judgement.

Those disciplines already belonged in the process.

Who this guide is for

This guide is for public-sector teams, policy teams, white paper teams, consultation leads, evidence-heavy reporting teams, and consultants using AI in public work.

What happened with South Africa’s AI policy

South Africa’s Draft National Artificial Intelligence Policy was published for public comment in April 2026. The draft was meant to guide the country’s approach to AI governance, innovation, ethics, digital infrastructure, and regulation.

Soon after publication, fake or unverifiable references were found in the document’s bibliography. Reuters reported that Communications Minister Solly Malatsi said the most plausible explanation was that AI-generated citations had been included without proper verification.

The draft was withdrawn.

That matters because the document was not a casual blog post, internal note, or first draft sitting in someone’s private folder. It was a public policy document dealing with a technology that already raises questions about accountability, transparency, public trust, economic power, data governance, labour, safety, and state capacity.

The failure also did not sit alone. South Africa’s Department of Home Affairs later dealt with a related problem in the Revised White Paper on Citizenship, Immigration and Refugee Protection. In that case, the department said apparent AI-generated references had been added after the fact, were not cited in the body, and appeared in a standalone reference list.

That second case is useful because it shows the same problem from another angle.

A reference list that is not tied to the body of the document is not evidence. It is decoration. If the sources do not exist, it is worse than decoration. It creates false authority.
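One way to make that check concrete is to compare the reference list against the citations that actually appear in the body. The sketch below is a minimal illustration, not a description of how either department works. It assumes author-year citations such as (Dlamini, 2021) and a reference list that has already been parsed into (author, year) keys. The function name, the citation pattern, and the sample data are all hypothetical. It only finds references that are never cited. It cannot tell whether a cited source exists, which still needs a person to open it.

```python
import re

# Assumed citation style: "(Author, 2021)" or "(Author and Other, 2021)".
# Real documents use many styles; this pattern is illustrative only.
CITATION = re.compile(r"\(([A-Z][A-Za-z\- ]+?),\s*(\d{4})\)")


def uncited_references(body: str, reference_keys: set[tuple[str, int]]) -> set[tuple[str, int]]:
    """Return reference-list entries that are never cited in the body text."""
    cited = {(author.strip(), int(year)) for author, year in CITATION.findall(body)}
    return reference_keys - cited


# Hypothetical sample data, not taken from any real policy document.
body_text = "Adoption is uneven across provinces (Dlamini, 2021)."
references = {("Dlamini", 2021), ("Nkosi and Smith", 2019)}

orphans = uncited_references(body_text, references)
print(sorted(orphans))  # prints [('Nkosi and Smith', 2019)]
```

Any entry this returns is either background material that should be labelled as such, or a reference that needs to be explained before the document goes out.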

A fake citation is not a formatting error

There is a temptation to treat fake citations as a technical mistake. Something went wrong with the tool. Someone should have checked the bibliography. The draft can be fixed and republished.

That is partly true. But it is too narrow.

In policy work, a citation is not only a line in a reference list. It is part of the evidence trail. It tells the reader where a claim comes from, what the claim depends on, and whether the argument can survive scrutiny.

If a source is fake, several things break at once.

The claim may not be supported.

The review process cannot be trusted.

The public cannot check the evidence.

Experts cannot respond properly.

The institution’s judgement is weakened.

In an evidence-heavy document, a fake source is not a typo. It is a broken evidence trail.

This is why the South African AI policy failure matters beyond the embarrassment. The document was about AI governance, but the process behind it appears to have failed on basic evidence governance.

The wrong lesson is “do not use AI”

The easy reaction is to say that AI should not be used in a white paper or policy process.

That is the wrong lesson.

AI can be useful in white paper work. It can be useful in public consultation analysis, policy review, donor reporting, research synthesis, and evidence-heavy reporting.

These processes often involve large volumes of material:

public submissions

expert comments

government documents

legal and policy frameworks

research reports

survey responses

workshop notes

transcripts

fieldwork records

previous drafts

review comments

Humans are not naturally good at repetitive extraction across hundreds or thousands of pages. People get tired. They miss patterns. They become inconsistent. They skim. They copy the wrong thing into the wrong table. They lose the link between a point and its source.

AI can help with that kind of work.

It can extract points from supplied documents. It can suggest tags. It can summarise long submissions. It can compare themes across stakeholder groups. It can pull out relevant quotes. It can prepare first-pass issue matrices. It can help a drafter find the source material behind a claim faster.

The issue is not whether AI is used.

The issue is where it is used, what it is allowed to do, what source material it is allowed to see, and how its outputs are checked before they influence the final document.

The better distinction is controlled AI versus uncontrolled AI

The debate should not be reduced to cheap AI versus expensive AI.

Tool quality matters. A basic consumer tool used casually is not the same as a more capable paid model used inside a structured workflow.

But even strong AI models can hallucinate. They can produce confident text that looks right but is wrong. They can invent citations when asked to support a claim. They can make weak evidence sound stronger than it is. They can write smoothly around gaps instead of flagging them.

The better distinction is controlled AI versus uncontrolled AI.

Uncontrolled AI works from broad prompts. It uses general model knowledge or open-ended browsing. It is asked to “make this more academic”, “support this with sources”, “write a literature review”, or “add references”. It creates polished output before the evidence base is ready. Humans then read the final prose and assume the references are real.

Controlled AI works differently.

It works only from approved source material. Each source has an ID. Each extracted point links back to a document, page, paragraph, row, or quote. AI outputs are stored in a table where people can check them. The system separates extraction from judgement. The final decision still belongs to people.

That is the real line.

AI should work inside the evidence system, not outside it.
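The controlled side of that line can be sketched as a simple gate on AI output. This is an illustration under assumptions, not a specific product or the only way to do it: the `ExtractedPoint` fields, the `split_by_traceability` function, and the source IDs are hypothetical names chosen for the example. The idea is that an extracted point is only accepted if it names an approved source, carries a locator, and includes the excerpt it is based on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedPoint:
    source_id: str  # ID from the approved source register
    locator: str    # page, paragraph, row, or similar
    quote: str      # exact excerpt the point is based on
    summary: str    # the AI's paraphrase of the point


def split_by_traceability(points, approved_ids):
    """Separate traceable points from ones that must be rejected or re-checked."""
    accepted, rejected = [], []
    for p in points:
        traceable = bool(
            p.source_id in approved_ids and p.locator.strip() and p.quote.strip()
        )
        (accepted if traceable else rejected).append(p)
    return accepted, rejected


# Hypothetical sample data.
approved_ids = {"SRC-0001", "SRC-0002"}
points = [
    ExtractedPoint("SRC-0001", "p. 4, para 2", "Funding rules are unclear.", "Funding clarity"),
    ExtractedPoint("SRC-0999", "p. 1", "Invented text.", "Point from an unknown source"),
    ExtractedPoint("SRC-0002", "", "No locator supplied.", "Point without a locator"),
]

accepted, rejected = split_by_traceability(points, approved_ids)
print(len(accepted), "accepted;", len(rejected), "sent back for checking")
```

Rejected points are not necessarily wrong. They are simply not allowed into the evidence base until someone can show where they came from.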

The truth layer needs the strongest controls

Every policy or white paper process has a “truth layer”.

This is the layer where facts, sources, quotes, legal references, evidence claims, findings, and recommendations are established.

AI can support this layer. It should not invent it.

AI can help find a relevant excerpt inside an approved source folder. It can extract a claim from a submission. It can compare one source with another. It can flag possible contradictions. It can suggest which sources might support a theme.

It should not fabricate a literature review. It should not create references from memory. It should not add legal cases that nobody has opened. It should not turn general internet material into policy evidence. It should not decide what is true because a sentence sounds plausible.

The moment AI is allowed to create the truth layer without verification, the process is already unsafe.

This is where the South African AI policy failure becomes a useful case study. The problem was not only that AI may have produced fake citations. The problem was that those citations could enter the public record without being stopped by the process.

That is a human and institutional failure, not only a model failure.

AI hallucinated. Humans published the hallucination.

A responsible white paper process starts before drafting

A white paper process should not start by asking AI to write.

It should start by asking a simpler question:

What information do we have, where did it come from, how will we structure it, and how will we know whether the final document is supported by it?

That process has three connected stages:

collect the information properly

process the evidence traceably

use the evidence responsibly in reports, decisions, and public communication

This is the route I use in my own work: data collection, traceable evidence, and usable outputs.

The AI tool is not the starting point. The workflow is.

Stage one: collect the information properly

A weak intake process creates weak evidence later.

This is especially true in white paper, public consultation, and policy review work. Submissions may arrive through forms, emails, PDFs, spreadsheets, Word documents, workshops, interviews, letters, and attachments. If those inputs are not captured into a consistent system, the analysis becomes harder before it has even started.

A responsible intake process should give each submission or source a clear record.

That record should include details such as:

source ID

submitter or organisation

stakeholder type

date received

document name

file link

topic or chapter

permission or publication status

sensitivity flag

review status

notes on missing information

links to uploaded files

This is why Data Collection & Intake Systems matter. The form is not the system. The system is what happens after the information arrives.

For a white paper process, this could include a public submission portal, structured upload process, source register, automated file naming, submission IDs, and review fields. It could also include rules for how sensitive material is handled and what can be used in public-facing outputs.

This is the first AI control.

Not the prompt. Not the model. Not the subscription tier.

The first AI control is making sure the source material enters a system that can be traced and reviewed.
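As a rough illustration, the source record described above could look like the following. This is a minimal sketch, assuming an in-memory register and an ID format of `SRC-0001`. The field names, the `next_source_id` and `missing_fields` helpers, and the sample submission are hypothetical. A real register would live in a database or spreadsheet with persistent IDs.

```python
from dataclasses import dataclass, field
from datetime import date
from itertools import count

_sequence = count(1)  # in-memory counter; a real register would persist this


def next_source_id(prefix: str = "SRC") -> str:
    """Issue the next sequential source ID, for example SRC-0001."""
    return f"{prefix}-{next(_sequence):04d}"


@dataclass
class SourceRecord:
    submitter: str
    stakeholder_type: str
    date_received: date
    document_name: str
    file_link: str
    topic: str = ""
    publication_status: str = "unknown"  # permission to quote or publish
    sensitive: bool = False
    review_status: str = "unreviewed"
    missing_info: str = ""
    source_id: str = field(default_factory=next_source_id)


def missing_fields(record: SourceRecord) -> list[str]:
    """List required text fields that were left blank at intake."""
    required = ("submitter", "stakeholder_type", "document_name", "file_link")
    return [name for name in required if not getattr(record, name).strip()]


# Hypothetical submission, for illustration only.
record = SourceRecord(
    submitter="Example Ratepayers Association",
    stakeholder_type="civil society",
    date_received=date(2025, 3, 14),
    document_name="submission.pdf",
    file_link="intake/submission.pdf",
)
print(record.source_id, record.review_status, missing_fields(record))
```

The useful part is not the code. It is that every later step, including any AI step, can refer to `source_id` instead of a file name or a memory of where something came from.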

Stage two: process the evidence traceably

The middle of the workflow is where most evidence-heavy projects succeed or fail.

This is the stage between raw source material and the final public document. It is where submissions, interviews, reports, comments, and notes become coded evidence, themes, findings, issue matrices, quote banks, and recommendation support tables.

It is also the stage where AI can help the most, if the structure is right.

A proper evidence workflow should include:

source register

document IDs

submission IDs

coding framework

extraction template

evidence database

quote bank

claim tracker

issue matrix

synthesis table

findings matrix

recommendation support table

AI prompt library

review tracker

QA checklist

handover notes

This is the work covered by Traceable Evidence Workflow Support.

AI can assist here by extracting claims from each submission, suggesting themes, pulling exact quotes, comparing stakeholder groups, flagging contradictions, and preparing review-ready tables.

But the AI output should not go straight into the white paper.

It should go into a reviewable structure.

A human reviewer should be able to see:

what the AI extracted

which source it came from

the exact quote or excerpt

why the AI tagged it in that way

whether the tag makes sense

whether the source actually supports the claim

whether the point is strong enough to use

whether the finding needs expert judgement before drafting

This is where people often misunderstand AI.

The goal is not to make AI the thinker.

The goal is to reduce repetitive handling so people can spend more time on judgement.

A policy expert should not have to manually search through hundreds of submissions every time they need to check a point. But they should be able to check the quote, source, locator, and reasoning behind any point that enters the evidence base.

That is the difference between speed and carelessness.
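A reviewable structure of that kind can be sketched in a few lines. The example below is an assumption-heavy illustration: the `EvidenceRow` fields, the status values, and the helper names are hypothetical, and the quote check is a plain text match after normalising case and spacing. It shows two gates before anything reaches drafting. The quote must really appear in the named source, and a person must have approved the row.

```python
from dataclasses import dataclass


@dataclass
class EvidenceRow:
    source_id: str
    locator: str
    quote: str
    tag: str
    tag_reason: str                 # why the AI chose this tag
    review_status: str = "pending"  # pending / approved / rejected


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so layout differences do not matter."""
    return " ".join(text.split()).lower()


def quote_found(row: EvidenceRow, source_texts: dict[str, str]) -> bool:
    """True if the quote appears in its source, ignoring case and spacing."""
    source = source_texts.get(row.source_id, "")
    return bool(row.quote.strip()) and normalise(row.quote) in normalise(source)


def ready_for_drafting(rows, source_texts):
    """Keep only rows a person approved and whose quote is really in the source."""
    return [
        r for r in rows
        if r.review_status == "approved" and quote_found(r, source_texts)
    ]


# Hypothetical sample data.
source_texts = {"SRC-0001": "Municipalities need   clearer funding rules."}
rows = [
    EvidenceRow("SRC-0001", "p. 2", "clearer funding rules", "finance", "mentions funding", "approved"),
    EvidenceRow("SRC-0001", "p. 2", "clearer funding rules", "finance", "mentions funding"),
    EvidenceRow("SRC-0001", "p. 3", "funding should be abolished", "finance", "mentions funding", "approved"),
]

usable = ready_for_drafting(rows, source_texts)
print(len(usable), "of", len(rows), "rows ready for drafting")  # prints 1 of 3 rows ready for drafting
```

The text match only proves the words exist in the source. Whether the excerpt supports the claim, and whether the tag makes sense, is still the reviewer's call.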

Stage three: use the evidence responsibly

The final white paper should not be where the evidence trail disappears.

Once the source material has been collected and processed, the next question is how to use it. The structured evidence may need to become a white paper, consultation report, policy brief, dashboard, public summary, internal tool, microsite, presentation, or decision note.

This is where Data Use, Reporting & Communication Systems become relevant.

The output layer has to stay connected to the evidence behind it.

A white paper may need clear public language, but it still needs to be defensible. A dashboard may need to simplify the data, but it should not hide the assumptions. A public summary may need to be accessible, but it should not disconnect from the source material. A recommendation may need to be concise, but it still needs a route back to the evidence.

This is where practical tools such as a source-linked evidence table, a findings-to-recommendations matrix, or a public consultation response matrix become useful. They keep the route from source material to findings, recommendations, and public response notes visible enough for review.

AI can support this stage. It can help draft summaries, prepare briefing notes, compare evidence across themes, and suggest ways to explain patterns clearly.

But the final public document needs human responsibility.

Humans need to decide what the evidence means. Humans need to judge what is relevant. Humans need to check whether claims are fair. Humans need to decide how uncertainty is handled. Humans need to sign off the final wording.

AI should help the team move faster from structured evidence to usable output. It should not become the author of public judgement.

A practical contrast: the Local Government White Paper workflow

This is not theoretical for me.

In a Policy Evidence Workflow for a Local Government White Paper, I built a system designed to keep evidence, drafting, review, and consultation connected through a traceable workflow.

At a high level, the process followed a simple cycle:

evidence intake
evidence processing
evidence distribution
review and feedback
evidence processing
revised outputs

Public submissions, research reports, stakeholder comments, and other source material were first captured into a structured system. The evidence was then processed through thematic analysis, claims coding, synthesis, and review workflows designed to keep every finding linked back to its original source.

The resulting evidence was distributed to specialist drafting teams in the form of structured evidence packs, thematic reports, chapter-specific findings, supporting quotations, and other review-ready materials. Those teams used the evidence, together with their own expertise and judgement, to prepare and revise White Paper chapters.

When the draft entered public consultation, the process repeated. New submissions, comments, and stakeholder feedback were captured, processed, analysed, and distributed back to the drafting teams through consultation reports, thematic analyses, and chapter review packs.

Throughout the workflow, AI was used to support retrieval, comparison, synthesis, and evidence handling tasks, but every stage included human review and sign-off. Findings were checked by people, drafting decisions were made by people, and recommendations were approved by people. The system also maintained traceable evidence flows so that claims, findings, and comments could be linked back to their original sources.

The point was not to ask AI to write a white paper from scratch.

The point was to keep the evidence, drafting, and review process connected.

That chain matters.

It means the process did not rely on AI confidence. It relied on a source-controlled evidence system where AI could assist with retrieval, comparison, synthesis support, and drafting support, while humans remained responsible for review, interpretation, drafting, and sign-off.

That is the contrast.

Not AI versus no AI.

AI inside a source-controlled workflow versus AI producing unsupported authority outside one.

Where the AI policy process appears to have failed

Based on the public reporting, the South African AI policy failure appears to show several workflow problems.

AI use was not controlled clearly enough.

The bibliography was not verified before publication.

AI appears to have been allowed to generate or support citations.

There appears to have been no effective source-to-claim register.

The reference list appears to have been treated as a final document layer rather than part of the evidence system.

Internal review appears to have checked the document as a document, not as an evidence chain.

Senior approval did not catch the citation problem.

That last point matters.

In serious policy work, review cannot only mean reading the draft and checking whether it sounds reasonable. It must include checking whether the evidence exists, whether claims are supported, whether citations are real, and whether the final document can survive scrutiny.

A polished paragraph is not evidence.

A plausible reference is not a source.

A confident AI answer is not a research process.

A responsible AI white paper checklist

Any team using AI in a white paper or public consultation process should be able to answer these questions before publication.

Has an approved source library been created?

Does every source have an ID?

Does every submitted document link to a database record?

Does every extracted point have a source locator?

Is AI blocked from creating citations?

If AI is allowed to search externally, are those sources separately verified before use?

Are AI outputs stored in reviewable tables?

Is human review status visible?

Do claims link to source excerpts?

Do findings link to evidence?

Do recommendations link to findings?

Are reference list items cited in the body, or clearly marked as background material?

Has every reference been opened and verified by a person?

Is AI use logged by task, tool, source base, and reviewer?

Has a final evidence audit been completed before publication?

If the answer to any of these questions is no, the team does not yet have a responsible AI workflow. It has AI activity inside a weak evidence process.

That is where the risk sits.

A simple starting point is to test the process with a source traceability risk check before the draft reaches final review. For teams facing large submission volumes, a submission analysis capacity check can also show whether the review process is realistic before deadlines become the problem.
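Several of the checklist questions describe one chain: recommendations link to findings, findings link to claims, and claims link to verified sources. A final evidence audit can walk that chain and report every break. The sketch below assumes the links are held as simple dictionaries keyed by ID. The `audit_chain` function, the ID formats, and the sample data are hypothetical.

```python
def audit_chain(recommendations, findings, claims, verified_sources):
    """Return human-readable problems; an empty list means every link resolved."""
    problems = []
    for rec_id, finding_ids in recommendations.items():
        if not finding_ids:
            problems.append(f"{rec_id}: no supporting findings")
        for f_id in finding_ids:
            claim_ids = findings.get(f_id)
            if not claim_ids:
                problems.append(f"{rec_id} -> {f_id}: finding has no claims")
                continue
            for c_id in claim_ids:
                source_id = claims.get(c_id)
                if source_id not in verified_sources:
                    problems.append(
                        f"{rec_id} -> {f_id} -> {c_id}: source not verified ({source_id})"
                    )
    return problems


# Hypothetical sample data.
recommendations = {"REC-1": ["FND-1"], "REC-2": ["FND-2"], "REC-3": []}
findings = {"FND-1": ["CLM-1"], "FND-2": ["CLM-2"]}
claims = {"CLM-1": "SRC-0001", "CLM-2": "SRC-0009"}
verified_sources = {"SRC-0001"}  # sources a person has opened and confirmed

problems = audit_chain(recommendations, findings, claims, verified_sources)
for line in problems:
    print(line)
```

An empty result does not prove the document is sound. It only shows that nothing in it is floating free of the evidence base, which is the minimum a reviewer should expect before sign-off.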

What public-sector AI adoption should learn from this

The lesson from South Africa’s AI policy withdrawal should not be that government departments, public-sector teams, or policy consultants must avoid AI entirely.

The lesson is that AI use in public work needs stronger process design.

Departments and project teams need clear rules for:

AI disclosure

source registers

citation audits

approved source folders

prompt and output logs

data sensitivity

human reviewer assignments

final evidence audits

no unchecked AI in the truth layer

They also need staff training that is not only about which button to click in an AI tool.

People need to understand what large language models are good at and where they fail. They need to know that AI can summarise a document without understanding whether the document is authoritative. They need to know that AI can produce a perfect-looking reference that does not exist. They need to know that the more polished the output looks, the more dangerous it can be if nobody checks it.

This is also why teams need to prepare documents properly for AI retrieval before they rely on AI search, summaries, or internal knowledge tools.

AI should not make weak processes look professional.

It should make strong processes faster to operate.

Structure before AI

The South African AI policy failure should make serious teams more careful. It should not make them less ambitious.

AI can help white paper processes. It can help teams process public submissions faster, compare evidence across stakeholder groups, retrieve quotes, prepare synthesis tables, and support drafting.

But it only works safely when the source material is collected properly, structured into traceable evidence, reviewed by people, and used responsibly in the final output.

A responsible white paper is not written from AI confidence.

It is built from a clear evidence process.

The real lesson is simple:

AI should help people review evidence faster. It should not be allowed to create evidence that people never reviewed.

And before any team asks which AI tool to use, it should ask a more basic question first.

What information do we have, where did it come from, how will we check it, and what does it need to become?

If your team is working on a white paper, public consultation, policy review, or evidence-heavy reporting process and wants to use AI without breaking the evidence trail, explore my work on Data Collection & Intake Systems, Traceable Evidence Workflow Support, and Data Use, Reporting & Communication Systems. The goal is not simply to add AI to the process, but to build a workflow where every claim can be traced, reviewed, and defended. You can also see how this approach works in practice in my Local Government White Paper case study.

Related guides

These adjacent guides are useful when this workflow needs a tighter next step: why AI gives weak answers, source traceability, source register, public consultation response matrix, AI retrieval.

FAQ

What did South Africa’s AI policy failure show about evidence workflows?

It showed that fake or unverifiable citations are not only an AI problem. They are also a review workflow problem. Policy teams need source registers, citation checks, evidence locators, and human review before a document reaches public release.

