The Genesis and Persistence of Hallucination in Artificial Intelligence

1. Introduction: The Ubiquity of Plausible Error and the Crisis of Trust in Generative AI

The rise of generative artificial intelligence has ushered in an era of unprecedented technological advancements, but it has also exposed a fundamental vulnerability that threatens its safe integration into high-risk domains: the phenomenon of hallucination. Far from being a mere technical flaw, hallucination—the generation of false or misleading information presented with apparent confidence and plausibility—represents a systemic barrier to the reliable adoption of large language models (LLMs) in fields such as law, medicine, and finance, where factual accuracy is not an option, but an imperative.1 

The impact of this fallibility was dramatically illustrated by Mata v. Avianca, Inc., decided in 2023 by the United States District Court for the Southern District of New York, which became a paradigmatic example of the professional and ethical risks of generative AI. In the lawsuit, the plaintiff's lawyers submitted a legal brief citing a series of fictitious precedents, such as Varghese v. China Southern Airlines Co., Ltd., fabricated by ChatGPT. 

The investigation revealed that the lawyers had used the model to supplement their research, treating it as a "super search engine." The AI, however, operated under a different paradigm: that of a "good test taker,"3 optimized to produce the most plausible answer, regardless of its veracity. Judge P. Kevin Castel, describing the situation as an "unprecedented circumstance," imposed a $5,000 fine on the lawyers and the firm for acting in "subjective bad faith." 

The case was not just an anecdote about professional incompetence, but a microcosm of the fundamental misalignment between user expectation (truth) and the model's optimization objective (plausibility) that lies at the heart of the AI trust crisis. The persistence of this problem, documented in outlets such as The New York Times and CNET,4 indicates that a comprehensive understanding requires more than merely cataloging flaws; it requires a robust theoretical framework.

It is in this context that the recent and seminal article "Why Language Models Hallucinate," published on September 4, 2025, emerges as an indispensable theoretical contribution, whose credibility is amplified by its origin: its authors, Adam Tauman Kalai, Ofir Nachum, Santosh S. Vempala, and Edwin Zhang, include researchers at OpenAI itself, the organization behind ChatGPT.3 

This critical analysis, the first in Brazil to delve into the paper's findings, will use its framework to dissect the statistical genesis and sociotechnical persistence of hallucinations. Although the term "hallucination" is controversial because it anthropomorphizes what are, in essence, statistical errors,1 it will be used here by convention, while maintaining a critical distance anchored in computational learning theory.

2. The Genesis of Hallucination: A Statistical Consequence Inherent to Pre-Training

The most significant contribution of Kalai et al.'s work is the demystification of hallucination, removing it from the realm of mysterious emergent properties and placing it firmly within statistical learning theory. They argue that hallucinations are not a bizarre byproduct, but a natural and mathematically predictable consequence of the objectives optimized during the training of language models.3 The crux of their argument lies in an elegant theoretical reduction: they demonstrate that the complex problem of generating valid text (an unsupervised learning task) is, in a formal sense, harder than the problem of classifying whether a given text is valid or not (a supervised learning task).3 

To better explain where AI's so-called "hallucinations" come from, the authors introduce a thought experiment called the "Is-It-Valid" (IIV) problem. It works like this: instead of asking the machine to generate an entire answer (a difficult and error-prone task), you simply ask whether a pre-written answer is valid or not. The key point is that if even this simple binary (yes/no) test is difficult, then the more complex task of generating a correct answer will be even more error-prone. Mathematically, they show that the error rate in generation is at least roughly twice the error rate in IIV classification.3 
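In schematic form, and setting aside the correction terms in the paper's full statement (so the notation below is a simplification of the argument, not the authors' exact bound), the reduction can be written as:

```latex
% Schematic form of the generative-to-IIV reduction (correction terms omitted).
% err_gen : probability that the model generates an invalid (hallucinated) output
% err_IIV : misclassification rate on the binary "Is-It-Valid" task
\[
  \mathrm{err}_{\mathrm{gen}} \;\gtrsim\; 2 \cdot \mathrm{err}_{\mathrm{IIV}}
\]
```

Read this way, any difficulty in merely recognizing valid answers translates, at least doubled, into difficulty in producing them.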

With this relationship established, the analysis turns to the factors that cause classification errors and, consequently, hallucinations. The first is epistemic uncertainty, which occurs when there is no succinct pattern or general rule that can be learned from the data, as in the case of arbitrary facts (e.g., birthdays of non-public individuals). For these facts, the model must memorize, not generalize. 

From there, they analyze two major causes of hallucinations: (a) Epistemic uncertainty – this occurs when there's no general pattern to be learned from the data. Think of arbitrary facts, like a typical person's birthday. There's no rule; the model only gets it right if it has memorized this information. The problem is that this type of data often appears only once in the training texts (so-called singletons). In this case, the chance of error is almost inevitable; (b) Inadequate models – this occurs when the "tool" itself wasn't designed for the type of problem it needs to solve. An example is using a very simple model, like the old trigram model, to try to capture long-term relationships or asking a word-based model to count characters. The architecture itself already limits accuracy.3 
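A minimal sketch of the singleton intuition, assuming a toy corpus of (person, birthday) pairs; the function and the data below are illustrative inventions, not code from the paper:

```python
from collections import Counter

def singleton_rate(facts):
    """Fraction of distinct facts that appear exactly once in the training data.

    Kalai et al. argue that, for arbitrary facts with no learnable pattern, the
    post-pretraining hallucination rate is at least on the order of this fraction
    (a Good-Turing-style estimate of how much is effectively unseen).
    """
    counts = Counter(facts)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(counts)

# Toy corpus: each tuple is a (person, birthday) fact as it occurs in training text.
corpus = [
    ("A. Silva", "03-14"), ("A. Silva", "03-14"),  # repeated fact
    ("B. Costa", "07-02"),                         # singleton
    ("C. Rocha", "11-30"),                         # singleton
]
print(f"singleton rate: {singleton_rate(corpus):.2f}")  # 2 of 3 distinct facts -> 0.67
```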

This statistical framework is powerful because it helps us understand why hallucinations aren't isolated accidents or "magical glitches" of AI. They are predictable results of the limitations of the data and models used.3 

This framework also helps to situate a long-running debate among leading researchers. On one side, Yann LeCun, Meta's chief AI scientist, is a vocal critic of autoregressive models, arguing that they are "doomed" because their method of generating one token at a time, without high-level planning or an underlying world model, makes errors inevitable and cumulative.6 For LeCun, the probability of a long answer being entirely correct decreases exponentially with its length.8 

In direct contrast, Geoffrey Hinton, one of the "godfathers" of neural networks, postulates that to predict the next word with high accuracy a model must develop a genuine understanding of the underlying concepts; he compares hallucinations to human confabulation, a process of memory reconstruction that can produce plausible errors. Offering a third perspective, Emily M. Bender, Timnit Gebru, and collaborators proposed the influential metaphor of LLMs as "stochastic parrots," arguing that hallucinations are the main evidence that the models lack meaning or grounding in reality, merely recombining linguistic patterns.9 

Kalai et al.'s work acts as a mediating theory that refines and contextualizes these three views. It complements LeCun's critique by providing a theory of the origin of the initial error that the autoregressive architecture then amplifies. It anchors Hinton's cognitive-science analogy in computational learning theory, offering a precise statistical reason (the singleton rate) why even a system that "understands" would confabulate when its knowledge is sparse. Finally, it shifts the "stochastic parrots" debate from unprovable claims about internal states to observable facts about data and training objectives, demonstrating that the phenomenon can be explained entirely by external statistical pressures.

3. The Persistence of Hallucination: The "Good Test-Taker" Paradigm and the Ineffectiveness of Purely Technical Solutions

If pre-training explains the statistical genesis of hallucinations, the post-training phase and, more crucially, the AI evaluation ecosystem explain their persistence. Kalai et al.'s second major contribution is a sociotechnical argument that exposes how the very way the AI community measures success reinforces and rewards hallucinatory behavior. 

The paper presents the powerful analogy that LLMs are optimized to behave like students taking an exam where blank answers are worth zero, while a guess, even if incorrect, has a chance of being worth points.3 In this scenario, the optimal strategy is to never leave a question blank; guessing is always the rational choice. Language models, therefore, are perpetually in "test-taking mode," which encourages them to generate plausible falsehoods rather than express uncertainty with responses like "I don't know" (IDK).3 

This assertion is empirically substantiated by an analysis of the influential benchmarks that dominate the leaderboards and guide model development. The vast majority employ binary metrics (correct/incorrect), such as accuracy or pass rate, which inherently penalize abstention, creating what the authors call an "epidemic of uncertainty penalization."3 On most current metrics, a model that honestly signals its uncertainty and never hallucinates will underperform a model that always guesses when uncertain. 
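A back-of-the-envelope illustration of that incentive, with hypothetical numbers: under binary grading, even a low-confidence guess has positive expected score, while abstaining is worth exactly zero.

```python
def expected_score_binary(p_correct: float) -> float:
    """Expected score under 0/1 grading: +1 if the guess is right, 0 if wrong or if the model abstains."""
    return p_correct * 1.0

for p in (0.05, 0.2, 0.5):
    # Any confidence above zero beats the 0.00 earned by answering "I don't know".
    print(f"confidence {p:.2f}: guess = {expected_score_binary(p):.2f}, abstain = 0.00")
```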

In response to hallucinations, the technical community developed Retrieval-Augmented Generation (RAG), an architectural solution designed to anchor model responses in external, verifiable data sources.10 The RAG architecture works in two stages: a "retriever" component searches for relevant information in a knowledge base, and a "generator" formulates a response based on that data.10 
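A minimal sketch of that two-stage pattern; `knowledge_base` and `llm` are placeholder objects with assumed `search` and `complete` methods, not references to any real library's API:

```python
def rag_answer(question: str, knowledge_base, llm) -> str:
    """Two-stage Retrieval-Augmented Generation: retrieve, then generate grounded in the retrieved passages."""
    # Stage 1: retrieval -- fetch the passages most relevant to the question.
    passages = knowledge_base.search(question, top_k=5)

    # If nothing relevant is found, an honest system should abstain rather than
    # fall back on parametric knowledge -- exactly the choice discussed below.
    if not passages:
        return "I don't know: no supporting source was found."

    # Stage 2: generation -- instruct the model to answer only from the sources.
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using ONLY the sources below. "
        "If they are insufficient, say you don't know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return llm.complete(prompt)
```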

The effectiveness of RAG in mitigating hallucinations is undeniable, and systems like Thomson Reuters' AI-Assisted Research on Westlaw Precision use it to ensure that responses cite only content from their legal database, preventing the fabrication of cases like those in Mata v. Avianca. However, RAG is not a panacea, and Kalai et al.'s framework reveals its fundamental blind spot: it addresses a model's capacity to be factual, but not its incentive to be. Hallucinations can still occur at any point in the RAG pipeline: the data source may contain errors, the retriever may fail, and the generator may not faithfully adhere to the retrieved context. 

More importantly, when the question falls outside the scope of the knowledge base, the LLM faces the same fundamental choice: admit failure or generate an answer from its parametric knowledge, risking hallucination. Because the evaluation ecosystem penalizes the former and implicitly rewards the latter, the model is still systematically pushed toward guesswork. Mata v. Avianca is, again, the perfect illustration. The lawyers could have used reliable RAG-grounded legal research systems like Westlaw or LexisNexis.

Instead, they used a general-purpose LLM that, lacking a specific legal knowledge base and optimized to always provide a confident answer, did exactly what it was trained to do: bluff magnificently. The failure lay not just in the technology, but in a profound misalignment of expectations rooted in the core incentives of AI training and assessment.

4. Conclusion

Kalai, Nachum, Vempala, and Zhang's analysis teaches us that hallucinations in language models are not freak accidents or peripheral defects, but inevitable statistical consequences of how these systems are trained and evaluated. The key lesson is that AI reliability will not emerge as a natural byproduct of ever-larger models: it needs to be consciously designed, incentivized, and rewarded. The goal is not an omniscient machine, but an intelligence capable of recognizing the limits of its knowledge. 

To achieve this, the authors propose a decisive change: reforming the benchmarks that currently reward guesswork and punish honesty. Imagine a test in which risk-taking no longer pays: the student scores only when they are sufficiently confident, and loses heavily for a confident mistake. This is how models should be evaluated: severely penalizing confident errors and rewarding calibrated abstention. The authors call this behavioral calibration, which turns "I don't know" from a weakness into a virtue.
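One way to make that arithmetic concrete, as a sketch of the idea rather than the authors' exact protocol: if a wrong answer costs t/(1-t) points, a correct answer earns 1, and "I don't know" earns 0, then answering only pays off when the model's confidence exceeds the stated threshold t.

```python
def expected_score(p_correct: float, threshold: float) -> float:
    """Expected score of answering under a penalized rule:
    +1 if right, -threshold/(1-threshold) if wrong; abstaining ("IDK") scores 0."""
    penalty = threshold / (1.0 - threshold)
    return p_correct - (1.0 - p_correct) * penalty

t = 0.75  # target confidence stated in the evaluation instructions (hypothetical value)
for p in (0.5, 0.75, 0.9):
    print(f"confidence {p:.2f}: answer = {expected_score(p, t):+.2f}, abstain = +0.00")
```

Below the threshold, honest abstention strictly dominates guessing, which is exactly the behavior the proposed benchmark reform seeks to reward.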

The implications for law and regulation are profound. Professionals can no longer be content to simply verify the machine's final output; they must examine the incentives that shape its behavior. The motto "trust, but verify" gives way to a more demanding one: "understand the incentives, then verify." Regulators, in turn, should not limit themselves to demanding precision metrics; they must also impose transparency in evaluation criteria and promote training that rewards epistemic humility. 

The future of trustworthy AI may include not just source citations, but explicit statements of confidence: "This answer is 95% certain," offering the user a clear calculation of risk and reward. In this scenario, AI ceases to be an opaque oracle and becomes a true cognitive partner, trustworthy not because it is infallible, but because it is honest about its limitations.

This movement goes beyond the technical: it is also legal, philosophical, and cultural. In law, an AI that invents precedents behaves like a witness who lies with conviction, and it should be treated with the same caution, always requiring corroborating evidence. In healthcare, an AI that "confabulates" a diagnosis is equivalent to a doctor who prescribes without examining the patient: unacceptable without oversight and accountability mechanisms. 

In philosophy, we learn that epistemic trust requires calibrating language and certainty, and that the ultimate responsibility lies not with the machine, but with those who design, regulate, and use it. And from pedagogy and psychology, we inherit the notion that true cognitive maturity lies in admitting one's own ignorance—a lesson we now need to teach machines as well.

Thus, the work of Kalai and colleagues is not just an elegant diagnosis, but a call to action: if we want to reduce hallucinations, we must reengineer not only the models, but the entire incentive ecosystem surrounding them. This requires benchmarks that value honesty, regulations that demand transparency, professionals who understand the logic behind the machine, and a moral design that cultivates the virtue of truthfulness. 

It's not enough to wait for future advances; the challenge is present. The fate of trustworthy AI depends less on technical miracles and more on the courage to align statistics, ethics, and law around a single value: the pursuit of truth. If we succeed, we will transform language models not into fallible gods, but into lucid partners who know a lot—and especially, know when they don't know.

References cited

  1. JI, Ziwei et al. Survey of Hallucination in Natural Language Generation. ACM Computing Surveys, [S. l.], v. 55, n. 12, p. 1-38, Dec. 2023. Available at: https://dl.acm.org/doi/10.1145/3571730 . Accessed on: September 7, 2025.
  2. CAPITOL TECHNOLOGY UNIVERSITY. Combatting AI Hallucinations and Falsified Information. Capitol Technology University Blog, [S. l.], 10 Aug. 2023. Available at: https://www.captechu.edu/blog/combatting-ai-hallucinations-and-falsified-information . Accessed on: September 7, 2025.
  3. KALAI, Adam Tauman et al. Why Language Models Hallucinate. [S. l.]: OpenAI; Georgia Tech, 4 Sept. 2025. Available at: https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf . Accessed on: September 7, 2025.
  4. SHANKLAND, Stephen. What Are AI Hallucinations? Why Chatbots Make Things Up, and What You Need to Know. CNET, [S. l.], 23 Jun. 2023. Available at: https://www.cnet.com/tech/services-and-software/what-are-ai-hallucinations-why-chatbots-make-things-up-and-what-you-need-to-know/ . Accessed on: September 7, 2025.
  5. LECUN, Yann. Objective-Driven AI: Towards AI systems that can learn, reason, and plan. University of Washington ECE, Seattle, January 24, 2024. Presentation. Available at: https://www.ece.uw.edu/wp-content/uploads/2024/01/lecun-20240124-uw-lyttle.pdf . Accessed on: September 7, 2025.
  6. WONDERFALL. Some thoughts on autoregressive models. Wonder's Lab, [S. l.], May 3, 2023. Available at: https://wonderfall.dev/autoregressive/ . Accessed on: September 7, 2025.
  7. BENDER, Emily M. et al. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In: FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. [S. l.]: Association for Computing Machinery, Mar. 2021. p. 610–623. Available at: https://dl.acm.org/doi/10.1145/3442188.3445922 . Accessed on: September 7, 2025.
  8. LECUN, Yann. Objective-Driven AI. Harvard CMSA, Cambridge, MA, March 28, 2024. Presentation. Available at: https://cmsa.fas.harvard.edu/media/lecun-20240328-harvard_reduced.pdf . Accessed on: September 7, 2025.
  9. BENDER, Emily M. et al. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In: FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. [S. l.]: Association for Computing Machinery, Mar. 2021. p. 610–623. Available at: https://dl.acm.org/doi/10.1145/3442188.3445922 . Accessed on: September 7, 2025.
  10. ZERO GRAVITY MARKETING. The Science Behind RAG: How It Reduces AI Hallucinations. Zero Gravity Marketing Blog, [S. l.], 1 Feb. 2024. Available at: https://zerogravitymarketing.com/blog/the-science-behind-rag/ . Accessed on: September 7, 2025.
  11. RAWAT, Bhanu et al. Detect hallucinations for RAG-based systems. AWS Machine Learning Blog, [S. l.], May 22, 2024. Available at: https://aws.amazon.com/blogs/machine-learning/detect-hallucinations-for-rag-based-systems/ . Accessed on: September 7, 2025.
  12. CHEN, Jyun-Yu et al. Hallucination Mitigation for Retrieval-Augmented Large Language Models. Mathematics, [S. l.], v. 13, n. 5, p. 856, Feb. 2024. Available at: https://www.mdpi.com/2227-7390/13/5/856 . Accessed on: September 7, 2025.

Victor Habib Lantyer
lantyer.com.br

Lawyer, professor, author, and researcher specializing in Digital Law, AI, Intellectual Property, and the LGPD. He is the author of the books "LGPD and Its Impact on Labor Law" and "Digital Law and Innovation," among more than seven legal works. He is a member of the Permanent Technology and Innovation Committee of the Brazilian Bar Association (OAB/BA), coordinator of its Artificial Intelligence coordination team, and a member of its LGPD and Metaverse coordination teams, as well as a member of the National Association of Digital Law Attorneys. He is the creator of the Lantyer Educacional website (www.lantyer.com.br), which presents legal matters in a simple, accessible, and democratic way.
