How to Make a Local LLM Cite Only Sources It Actually Read
I asked a self-hosted model — a local 12B running on my own machine — for a cited research brief. It came back clean: organised findings, a tidy list of sources at the bottom.
Then I compared that list against the pages the model had actually opened.
A third of them were pages it never read.
It had pulled numbers from search-result snippets and listed the links as if it had read the pages behind them. The figures were plausible. The links were real and they loaded. But "the link loads" is not "I read this page and the number is on it." That gap is where quiet, confident, wrong citations come from.
Here is exactly how I closed it — and why the fix is not a smarter prompt.
The real failure mode: snippet ≠ source
When an agent researches, it usually does two things: it runs a web search (which returns short snippets plus links), then it opens some of those links and reads them.
The trap is that a snippet already contains tempting numbers. A model under any kind of budget pressure will lift the figure straight from the snippet and attach the link — without ever opening the page. The output looks identically well-cited either way. You cannot tell, from the finished note, which sources were read and which were merely seen in a search result.
This is not exotic hallucination. Every individual fact might even be true. The danger is the provenance: a citation that claims "I read this" when the model only glanced at a one-line preview.
Why "just tell it to behave" isn't enough
My first instinct was the obvious one. I added a rule, in plain language: a search snippet is a lead, not a citation — only cite a page you actually opened.
It got better. It did not get fixed.
On a small local model, a prompt rule is a strong suggestion, not a constraint. It holds right up until the moment the model is squeezed — say it has five things to price but only enough budget to open three pages. Then it quietly falls back to snippet numbers for the other two and cites them anyway. The instruction was right; the model just didn't obey it under load.
If your reliability depends on the model choosing to follow a rule, you don't have reliability. You have a hope.
The fix: a check the model can't talk its way past
The principle: don't grade the model on its own self-report. Grade it against what actually happened. "What actually happened" is observable, because the tool that opens pages knows exactly which URLs it fetched.
Three small pieces:
1. Log every page the tool genuinely opens
Wherever your agent's browser/fetch tool returns real page content, append that URL to a small per-day log file. One line per genuine read. Skip failures, blocks, and empty responses — only log a page you truly retrieved.
2026-06-19T11:42:07Z https://www.dgav.pt/.../registo/
2026-06-19T11:43:15Z https://files.dre.pt/.../67246729.pdf
This is the ground truth. It is written by the tool, not claimed by the model.
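A minimal Python sketch of this logging step, not the exact tool code from that session. It assumes your fetch tool can hand over the HTTP status and the response body. The function name `log_genuine_read`, the `reads-YYYY-MM-DD.log` file name, and the "status 200 plus non-empty body" test for a genuine read are illustrative assumptions; adapt them to whatever your tool treats as a real read.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_genuine_read(url: str, status: int, body: str, log_dir: Path) -> bool:
    """Append `url` to today's read log only if real content came back.

    Returns True when a line was written, False when the fetch was skipped.
    """
    # Skip failures, blocks, and empty responses: they are not reads.
    # (Assumption: a genuine read means HTTP 200 with a non-blank body.)
    if status != 200 or not body.strip():
        return False

    now = datetime.now(timezone.utc)
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / f"reads-{now:%Y-%m-%d}.log"  # hypothetical file name
    with log_file.open("a", encoding="utf-8") as fh:
        # One line per genuine read: "<UTC timestamp> <url>"
        fh.write(f"{now:%Y-%m-%dT%H:%M:%SZ} {url}\n")
    return True
```

The important design choice is where this lives: inside the tool, on the code path that returns page content, so the model has no way to write to the log itself.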
2. After the note is written, check every citation against the log
A short script pulls each URL from the note's sources section and confirms it appears in the log. If a cited URL was never opened, the note fails the check. That's it — roughly ten lines of shell. No model judges the result, so no model can argue with it.
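Here is the same idea sketched in Python rather than shell. It assumes the note has a line reading "Sources" with the source list running to the end of the note, and that the log uses the "timestamp, space, URL" line format shown above. The function names and the heading text are hypothetical.

```python
import re

# Rough URL matcher; stops at whitespace and common closing punctuation.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def opened_urls(log_text: str) -> set[str]:
    """Collect URLs from log lines of the form '<timestamp> <url>'."""
    urls = set()
    for line in log_text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            urls.add(parts[1].strip())
    return urls


def cited_urls(note_text: str, heading: str = "Sources") -> list[str]:
    """Return every URL found after the sources heading, in order."""
    lines = note_text.splitlines()
    for i, line in enumerate(lines):
        # Accept the heading with or without leading '#' marks.
        if line.strip().lstrip("#").strip().lower() == heading.lower():
            tail = "\n".join(lines[i + 1:])
            return [u.rstrip(".,;") for u in URL_RE.findall(tail)]
    return []  # no sources section found


def unopened_citations(note_text: str, log_text: str) -> list[str]:
    """Cited URLs that never appear in the read log. Empty list means pass."""
    opened = opened_urls(log_text)
    return [u for u in cited_urls(note_text) if u not in opened]
```

A wrapper script would read the note and the day's log, call `unopened_citations`, print the result, and exit non-zero if the list is not empty. The comparison is an exact string match on purpose: a URL that differs by a trailing slash or a redirect fails the check, which is stricter than necessary but never passes a page that was not fetched.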
3. Fail the note, and route the unverified facts honestly
When the check fails, the cited-but-unopened links don't just get deleted (that would leave the number stranded in the body, looking verified). The fact moves to an explicit "to verify" section, clearly marked as unconfirmed. The reader sees precisely what was read and what wasn't.
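One way to sketch the routing, assuming the note is held as structured claims before it is rendered to text. The `Claim` class, `route_claims`, `render`, and the section titles are hypothetical names for illustration, not the format used in the session.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str        # the fact as it would appear in the note
    source_url: str  # the link the model attached to it


def route_claims(
    claims: list[Claim], opened: set[str]
) -> tuple[list[Claim], list[Claim]]:
    """Split claims into (verified, to_verify) using the set of opened URLs."""
    verified = [c for c in claims if c.source_url in opened]
    to_verify = [c for c in claims if c.source_url not in opened]
    return verified, to_verify


def render(verified: list[Claim], to_verify: list[Claim]) -> str:
    """Render the note so unconfirmed facts are visibly separate."""
    lines = ["Findings"]
    lines += [f"- {c.text} [{c.source_url}]" for c in verified]
    if to_verify:
        lines.append("To verify (source was not opened)")
        # Keep the fact and its lead together; do not delete either.
        lines += [f"- UNCONFIRMED: {c.text} (lead: {c.source_url})" for c in to_verify]
    return "\n".join(lines)
```

The fact and its link move together, so the number never stays in the body looking verified while its citation quietly disappears.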
The receipt
The first time I ran this check on a brief I had already read and approved by eye, it failed 4 of 6 sources. Two of those I had personally reviewed and waved through — they looked fine, the links worked, and I missed that the model never opened them.
A ten-line deterministic check caught two fabricated-provenance citations that a human review had passed. That is the whole argument in one result.
After wiring the check into the workflow, a fresh run came back with every cited source genuinely opened — and the few numbers it couldn't confirm on a page were honestly parked as "to verify" instead of dressed up as facts.
Why this is easier when you own the model
This is the quiet advantage of running AI on your own infrastructure rather than renting it through a closed API: you can add the guarantees the vendor doesn't give you. The browser tool, the log, the check — they live in your stack, under your rules. You're not waiting for a provider to ship a "trust" feature. The verdict logic is yours.
It also keeps the data local. The whole research loop — search, read, verify — ran on one machine, on a model I control. For anything touching regulated or sensitive material, that combination (local + verifiable) is worth more than a few points of raw model quality.
The honest limit
Be clear about what this check does and does not prove. It proves that every cited page was genuinely fetched by the tool during the run. It does not prove the number on that page is correct, or that the page actually supports the claim. A page can be reachable, genuinely read, and still misquoted.
So this is one layer, not the whole answer. It eliminates the most common and most invisible failure — citing things you never read — and it makes the remaining risk explicit: a short list of specific figures worth a human spot-check. That is a dramatically smaller, more honest surface than "trust the tidy list at the bottom."
The takeaway
With small local models, the pattern repeats everywhere: a prompt rule is a suggestion; a deterministic check is a guarantee. The glamorous lever (a smarter prompt, a bigger model, more agents) rarely fixes a reliability problem. The boring lever — a check that observes what actually happened and refuses to be argued with — usually does.
If you're building research or agent workflows on local models, start there. Log what the tools really do, and verify the output against that log. It's the cheapest reliability you'll ever add.
All numbers in this post are from a documented ImparLabs session (19 June 2026): a local Gemma-class 12B model running on a single consumer GPU via llama.cpp. The "failed 4 of 6 sources" result is the real output of the verification check on that day's research note.
Frequently Asked Questions
Why does a local LLM cite sources it never read?
Because it does not distinguish a search-result snippet from a page it actually opened. A web search returns short snippets with links; the model lifts a number from the snippet and lists the link as a source. The link is real and reachable, but "reachable" is not "I opened this page and the figure is on it." In my experience small local models do this more often than large hosted ones, but the failure is not unique to them.
Isn't telling the model to 'only cite what you opened' enough?
It helps, but on a small model it is not enough. A prompt rule is a strong suggestion the model can break under pressure, for example when it has more items to cover than it has budget to open pages for. In my testing it still slipped. The reliable fix is not a better instruction; it is a deterministic check that does not depend on the model behaving.
How do you check it deterministically?
Have the tool that opens pages log every URL it genuinely fetched to a small file. After the model writes its note, a short script confirms every cited URL appears in that log. If a citation is not there, the note fails — no model involved in the verdict, so the model cannot talk its way past it. On my first run it failed 4 of 6 sources, two of which I had personally reviewed and approved.