Predicting AI Hallucinations – A Breakthrough by Oxford Researchers

Researchers have developed a new method to predict when AI models are likely to generate inaccurate responses.

A team of Oxford researchers has developed a new method for predicting when artificial intelligence (AI) text models are likely to generate “hallucinations”, confident but inaccurate responses. Such hallucinations, common in generative AI models like OpenAI’s GPT and Anthropic’s Claude, pose significant risks in fields such as medicine, journalism, and law. The new method measures the semantic entropy of a model’s outputs: the variability in the meanings of its responses rather than in the word sequences themselves. This lets it distinguish a model’s uncertainty about what to say from its uncertainty about how to phrase it.

Tested on six large language models, including GPT-4 and LLaMA 2, the method proved more effective than previous techniques at identifying questions likely to produce false answers. Although it requires more computational resources, the improved reliability could be invaluable in applications where accuracy is critical. The advance addresses one of the main criticisms of large language models, exemplified by Google’s recent decision to scale back its AI Overviews feature after it produced misleading answers.
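At a high level, semantic entropy can be sketched as follows (a minimal illustration of the general idea, not the researchers’ implementation; the sample answers and the `same_meaning` check below are hypothetical): sample several answers to the same question, cluster answers that express the same meaning, and compute entropy over the meaning clusters rather than over individual word sequences. Many differently worded answers that share one meaning give low semantic entropy; answers that disagree in meaning give high entropy, flagging a likely hallucination.

```python
import math

def semantic_entropy(answers, same_meaning):
    """Estimate semantic entropy from several sampled answers to one question.

    answers: list of answer strings sampled from the model.
    same_meaning: callable(a, b) -> bool deciding whether two answers express
        the same meaning (in practice, an entailment model would do this).
    """
    # Greedily cluster answers that share a meaning.
    clusters = []
    for ans in answers:
        for cluster in clusters:
            if same_meaning(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])

    # Entropy over meanings, where p(meaning) is the fraction of samples
    # expressing it. High entropy means the model keeps changing its story.
    total = len(answers)
    return -sum((len(c) / total) * math.log(len(c) / total) for c in clusters)

# Toy example: five sampled answers, three distinct meanings.
samples = [
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    "It's Paris.",
    "Lyon is the capital of France.",
    "I'm not sure.",
]

def toy_same_meaning(a, b):
    # Crude stand-in for an entailment model: compare which city is named.
    return ("Paris" in a, "Lyon" in a) == ("Paris" in b, "Lyon" in b)

print(round(semantic_entropy(samples, toy_same_meaning), 3))  # -> 0.95
```

In practice, deciding whether two answers mean the same thing would be delegated to an entailment or natural language inference model rather than the keyword heuristic used in this sketch; that extra model is a large part of the method’s added computational cost.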