Methodology

How we verify a citation

Hallucite checks whether a referenced work actually exists, whether its details are internally consistent, and whether the identifiers point where they claim to. Here is what we check and why — without the exact scoring rules, which we keep private so they can't be gamed.

Why this matters now

Large language models routinely invent plausible-looking references. A 2026 audit of 111 million references across 2.5 million papers estimated 146,932 fabricated citations in 2025 alone and found that about 78.8% slipped past preprint moderation and that about 85% persisted from preprint into the published record. Fabricated references are no longer a rare slip; they are a systemic contaminant entering the scholarly record at scale.

What we check against

We cross-check every reference against multiple independent, authoritative databases rather than trusting any single index:

  • Crossref
  • PubMed
  • Semantic Scholar
  • OpenAlex
  • Google Books
  • Retraction Watch (to flag retracted and corrected works)

Using several sources matters: a reference missing from one index may simply be unindexed there, not fake. Requiring corroboration across sources is how we separate genuinely non-existent works from merely obscure ones.
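To make the corroboration idea concrete, here is a minimal sketch of how per-source lookup results could be combined. It is not our scoring logic, which stays private. The names Lookup and absent_everywhere and the min_responses threshold are hypothetical, chosen only for illustration.

```python
from enum import Enum


class Lookup(Enum):
    FOUND = "found"              # the source returned a matching record
    MISSING = "missing"          # the source answered but has no such record
    UNAVAILABLE = "unavailable"  # outage, timeout, or rate limit


def absent_everywhere(results: dict[str, Lookup], min_responses: int = 3) -> bool:
    """True only when enough sources answered and none of them has the work.

    min_responses is an illustrative threshold, not a real Hallucite rule.
    """
    # Sources that failed to answer say nothing about whether the work exists.
    answered = [r for r in results.values() if r is not Lookup.UNAVAILABLE]
    return len(answered) >= min_responses and all(
        r is Lookup.MISSING for r in answered
    )


# One index missing the work is not evidence of fabrication.
obscure = {
    "Crossref": Lookup.MISSING,
    "PubMed": Lookup.MISSING,
    "OpenAlex": Lookup.FOUND,
}
print(absent_everywhere(obscure))  # False
```

The point of the design is that a single "missing" never decides the outcome: absence only counts when several independent sources agree on it.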

The three verdicts

Verified — the work exists and its details are consistent across sources.

Discrepancy — the work appears real but something doesn't line up: a wrong year, altered authors, a journal that doesn't match, or an identifier that resolves to a different paper.

Not found — no authoritative source can corroborate the work; it is likely fabricated.

We also distinguish “we couldn't check this” (a source outage or an unparseable reference) from “this is fake.” A reference we failed to check is never reported as a fabrication.
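The separation between the three verdicts and the "couldn't check" state can be sketched as a small decision function. This is an illustration under simplifying assumptions, not the private scoring rules: the Verdict enum, the decide function, and its parameters are all hypothetical names.

```python
from enum import Enum


class Verdict(Enum):
    VERIFIED = "Verified"
    DISCREPANCY = "Discrepancy"
    NOT_FOUND = "Not found"
    UNCHECKED = "Could not check"  # never reported as a fabrication


def decide(
    parsed: bool,
    sources_answered: int,
    sources_matching: int,
    mismatched_fields: list[str],
) -> Verdict:
    """Map lookup outcomes to a verdict (hypothetical, simplified logic)."""
    # An unparseable reference or a total source outage is a failure to
    # check, not evidence that the work is fake.
    if not parsed or sources_answered == 0:
        return Verdict.UNCHECKED
    if sources_matching == 0:
        return Verdict.NOT_FOUND
    if mismatched_fields:
        return Verdict.DISCREPANCY
    return Verdict.VERIFIED


print(decide(True, 0, 0, []).value)        # Could not check
print(decide(True, 4, 2, ["year"]).value)  # Discrepancy
```

Checking the "couldn't check" condition first is what guarantees that an outage can never surface as "Not found".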

The errors we localize

When something is off, we point to the specific field rather than just raising a flag:

  • Identifier mismatch — a DOI or arXiv ID that resolves to a different, unrelated work.
  • Author inconsistency — fabricated or swapped author lists.
  • Title analysis — titles that don't correspond to any real work.
  • Year, journal & cross-reference consistency — metadata that contradicts the real record.
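Field-level localization can be pictured as a comparison between the cited metadata and the record a database returns. The sketch below is a simplified illustration, not our implementation: the dictionary keys, the normalization, and the function name mismatched_fields are assumptions, and a real comparison would need to tolerate abbreviations, transliteration, and author-name variants.

```python
import re


def _norm(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace for a lenient comparison."""
    return re.sub(r"[^a-z0-9]+", " ", text.casefold()).strip()


def mismatched_fields(cited: dict, record: dict) -> list[str]:
    """Return the names of the fields where the citation contradicts the record."""
    problems = []
    if _norm(cited["title"]) != _norm(record["title"]):
        problems.append("title")
    # Authors are compared as ordered lists of family names (an assumption).
    if [_norm(a) for a in cited["authors"]] != [_norm(a) for a in record["authors"]]:
        problems.append("authors")
    if cited["year"] != record["year"]:
        problems.append("year")
    if _norm(cited["journal"]) != _norm(record["journal"]):
        problems.append("journal")
    return problems


# Toy data, invented for this example.
cited = {"title": "An Example Study", "authors": ["Lee", "Okoro"],
         "year": 2021, "journal": "Journal of Examples"}
record = {"title": "An example study.", "authors": ["Lee", "Okoro"],
          "year": 2019, "journal": "Journal of Examples"}
print(mismatched_fields(cited, record))  # ['year']
```

Returning the list of fields, rather than a single pass/fail flag, is what lets a report say which part of the reference is wrong.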

What we catch that broad audits miss

The largest published audits check only one thing: whether a title exists. They explicitly set aside the case of a real title paired with wrong metadata — the most common way a hijacked reference hides. Hallucite is built around exactly that class: a DOI that resolves to a different paper, a real article credited to invented authors, or a genuine title relabeled with the wrong venue. These pass a does-the-title-exist check but fail ours.
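One way to picture the hijacked-identifier case: resolve the DOI, then compare the title it returns with the title in the citation. The sketch below uses Python's standard difflib for a fuzzy comparison. It is an assumption-laden illustration, not our method: the function name doi_points_elsewhere and the 0.85 threshold are hypothetical, and the resolved title is passed in directly rather than fetched from a live service.

```python
import re
from difflib import SequenceMatcher


def _norm(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace before comparing."""
    return re.sub(r"[^a-z0-9]+", " ", text.casefold()).strip()


def doi_points_elsewhere(
    cited_title: str, resolved_title: str, threshold: float = 0.85
) -> bool:
    """True when the DOI's resolved title is too unlike the cited title.

    threshold is an illustrative cut-off, not a tuned or official value.
    """
    similarity = SequenceMatcher(
        None, _norm(cited_title), _norm(resolved_title)
    ).ratio()
    return similarity < threshold


# Same work, differing only in case and punctuation: not a mismatch.
print(doi_points_elsewhere("Deep Learning for Protein Folding",
                           "Deep learning for protein folding."))  # False
```

A title-only existence check would accept both the cited title and the DOI here, since each is real on its own; the mismatch only shows up when the two are compared against each other.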

How well it works

On our curated adversarial benchmark of 137 cases run against live databases, Hallucite caught 100% of fabricated citations (none missed) and flagged 9/9 retracted works. We hold ourselves equally accountable for false alarms on genuine references — see the research & benchmark page for the full numbers and the much larger synthetic benchmark we're building.
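For readers who want the two quantities spelled out, the share of fabricated citations caught and the false-alarm rate on genuine references can be computed from labeled cases as follows. This is a generic sketch with invented toy data, not our benchmark harness or its results; the function name detection_rates is hypothetical.

```python
def detection_rates(cases: list[tuple[bool, bool]]) -> tuple[float, float]:
    """Compute (catch rate, false-alarm rate) from labeled cases.

    Each case is (is_fabricated, was_flagged).
    """
    fabricated = [flagged for is_fab, flagged in cases if is_fab]
    genuine = [flagged for is_fab, flagged in cases if not is_fab]
    # Share of fabricated citations that were flagged.
    catch_rate = sum(fabricated) / len(fabricated) if fabricated else float("nan")
    # Share of genuine references that were wrongly flagged.
    false_alarm = sum(genuine) / len(genuine) if genuine else float("nan")
    return catch_rate, false_alarm


# Toy labels, invented for illustration only.
toy = [(True, True), (True, True), (False, False), (False, True)]
print(detection_rates(toy))  # (1.0, 0.5)
```

The toy example shows why both numbers matter: a checker can catch every fabrication and still be unusable if it flags half of the genuine references.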

One honest limit: verifying that a cited work exists is different from verifying that it supports the claim it's attached to. Claim-support checking is a harder, still-open research problem, and we don't claim to solve it.