Methodology
How we verify a citation
Hallucite checks whether a referenced work actually exists, whether its details are internally consistent, and whether the identifiers point where they claim to. Here is what we check and why — without the exact scoring rules, which we keep private so they can't be gamed.
Why this matters now
Large language models routinely invent plausible-looking references. A 2026 audit of 111 million references across 2.5 million papers estimated 146,932 fabricated citations in 2025 alone, and found that about 78.8% slipped past preprint moderation and roughly 85% persisted from preprint into the published record. Fabricated references are no longer a rare slip; they are a systemic contaminant entering the scholarly record at scale.
What we check against
We cross-check every reference against multiple independent, authoritative databases rather than trusting any single index:
- Crossref
- PubMed
- Semantic Scholar
- OpenAlex
- Google Books
- Retraction Watch (to flag retracted and corrected works)
Using several sources matters: a reference missing from one index may simply be unindexed there, not fake. Requiring corroboration across sources is how we separate genuinely non-existent works from merely obscure ones.
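The corroboration idea can be illustrated with a minimal sketch. This is not Hallucite's scoring logic, which is private; the `SourceResult` type, the `corroborate` function, and the `min_answered` threshold are all hypothetical names and values chosen for illustration. It assumes each database lookup has already been reduced to found, not found, or could not be queried.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceResult:
    """Outcome of looking up one reference in one database (hypothetical type)."""
    source: str             # e.g. "Crossref"
    found: Optional[bool]   # True = match, False = searched with no match,
                            # None = the source could not be queried


def corroborate(results: list[SourceResult], min_answered: int = 2) -> str:
    """Combine per-source lookups into a coarse existence signal.

    min_answered is an assumed threshold, not a documented Hallucite setting.
    """
    answered = [r for r in results if r.found is not None]
    if any(r.found for r in answered):
        # A single index missing the work does not outweigh a hit elsewhere.
        return "exists"
    if len(answered) < min_answered:
        # Too few sources responded to conclude anything.
        return "unchecked"
    # Every source that answered searched and came back empty.
    return "not_found"


lookups = [
    SourceResult("Crossref", False),
    SourceResult("OpenAlex", True),
    SourceResult("PubMed", None),
]
print(corroborate(lookups))  # exists
```

The point of the sketch is the asymmetry: one hit is enough to say a work exists, while absence only counts when several sources actually answered.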
The three verdicts
Verified — the work exists and its details are consistent across sources.
Discrepancy — the work appears real but something doesn't line up: a wrong year, altered authors, a journal that doesn't match, or an identifier that resolves to a different paper.
Not found — no authoritative source can corroborate the work; it is likely fabricated.
We also distinguish “we couldn't check this” (a source outage or an unparseable reference) from “this is fake.” A reference we failed to check is never reported as a fabrication.
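One way to keep "couldn't check" separate from "fake" is to give it its own outcome rather than folding it into one of the three verdicts. The sketch below is an assumption about how that could be modeled, not a description of Hallucite's internals; the `Verdict` enum and the `decide` function are hypothetical.

```python
from enum import Enum
from typing import Optional, Sequence


class Verdict(Enum):
    VERIFIED = "verified"
    DISCREPANCY = "discrepancy"
    NOT_FOUND = "not found"
    UNCHECKED = "unchecked"   # source outage or unparseable reference


def decide(exists: Optional[bool], mismatched_fields: Sequence[str] = ()) -> Verdict:
    """Map an existence signal and a list of mismatched fields to a verdict.

    exists is None when the reference could not be checked at all.
    """
    if exists is None:
        # A failed check is never reported as a fabrication.
        return Verdict.UNCHECKED
    if not exists:
        return Verdict.NOT_FOUND
    return Verdict.DISCREPANCY if mismatched_fields else Verdict.VERIFIED


print(decide(True).value)            # verified
print(decide(True, ["year"]).value)  # discrepancy
print(decide(None).value)            # unchecked
```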
The errors we localize
When something is off, we point to the specific field rather than just raising a flag:
- Identifier mismatch — a DOI or arXiv ID that resolves to a different, unrelated work.
- Author inconsistency — fabricated or swapped author lists.
- Title analysis — titles that don't correspond to any real work.
- Year, journal & cross-reference consistency — metadata that contradicts the real record.
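Field-level localization can be sketched as a comparison between the cited metadata and the record returned by a database. The example is illustrative only: the `localize` function, the dictionary keys, and the exact-match rules are assumptions, and the sample values are placeholders rather than real works. A production check would need fuzzier matching for name variants and journal abbreviations.

```python
import re


def _norm(text: str) -> str:
    """Lowercase and collapse punctuation so trivial differences don't count."""
    return re.sub(r"[\W_]+", " ", text.casefold()).strip()


def localize(cited: dict, record: dict) -> list[str]:
    """Return the names of fields where the citation contradicts the record."""
    problems = []
    if _norm(cited.get("title", "")) != _norm(record.get("title", "")):
        problems.append("title")
    cited_authors = {_norm(a) for a in cited.get("authors", [])}
    record_authors = {_norm(a) for a in record.get("authors", [])}
    # Flag any cited author who is absent from the real author list.
    if not cited_authors <= record_authors:
        problems.append("authors")
    if str(cited.get("year")) != str(record.get("year")):
        problems.append("year")
    if _norm(cited.get("journal", "")) != _norm(record.get("journal", "")):
        problems.append("journal")
    return problems


# Placeholder data, not a real reference.
cited = {"title": "An Example Title.", "authors": ["Alpha", "Gamma"],
         "year": 2021, "journal": "Journal of Examples"}
record = {"title": "An example title", "authors": ["Alpha", "Beta"],
          "year": 2020, "journal": "Journal of Examples"}
print(localize(cited, record))  # ['authors', 'year']
```

Returning field names rather than a single boolean is what lets a report say which part of the reference is wrong.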
What we catch that broad audits miss
The largest published audits check only one thing: whether a title exists. They explicitly set aside the case of a real title paired with wrong metadata — the most common way a hijacked reference hides. Hallucite is built around exactly that class: a DOI that resolves to a different paper, a real article credited to invented authors, or a genuine title relabeled with the wrong venue. These pass a does-the-title-exist check but fail ours.
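To make the identifier-mismatch case concrete, the sketch below compares the title a citation claims with the title its DOI actually resolves to. It is a simplified stand-in, not Hallucite's method: the function name and the 0.8 threshold are hypothetical, the resolved title is assumed to have been fetched already, and the similarity measure is the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def doi_points_to_cited_work(cited_title: str, resolved_title: str,
                             threshold: float = 0.8) -> bool:
    """Check whether a DOI's resolved title is close enough to the cited title.

    threshold is an assumed cutoff chosen for illustration.
    """
    a = " ".join(cited_title.casefold().split())
    b = " ".join(resolved_title.casefold().split())
    return SequenceMatcher(None, a, b).ratio() >= threshold


# Placeholder titles: the cited title may exist somewhere, but this DOI
# resolves to an unrelated work, so the reference is flagged.
print(doi_points_to_cited_work("An Example Title",
                               "Completely Different Subject Matter"))  # False
```

A title-existence check alone would accept the citation in this example, because it never asks what the identifier resolves to.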
How well it works
On our curated adversarial benchmark of 137 cases run against live databases, Hallucite caught 100% of fabricated citations and flagged all 9 retracted works. We hold ourselves equally accountable for false alarms on genuine references; see the research & benchmark page for the full numbers and the much larger synthetic benchmark we're building.
One honest limit: verifying that a cited work exists is different from verifying that it supports the claim it's attached to. Claim-support checking is a harder, still-open research problem, and we don't claim to solve it.