A team of Oxford researchers has developed a method for predicting when artificial intelligence (AI) text models are likely to “hallucinate,” that is, to generate inaccurate responses. Hallucinations, often seen in generative AI models such as OpenAI’s GPT and Anthropic’s Claude, pose significant risks in fields such as medicine, journalism, and law. The new method measures the semantic entropy of a model’s outputs: how much the meanings of its responses vary, rather than just the sequences of words. This lets it distinguish a model’s uncertainty about content from its uncertainty about phrasing. Tested on six large language models, including GPT-4 and LLaMA 2, the method proved more effective than previous techniques at identifying questions likely to produce false answers. Although it requires more computational resources, the improved reliability could be invaluable in applications where accuracy is critical. The advance addresses one of the main criticisms of large language models, exemplified by Google’s recent disabling of its AI Overview feature due to misleading answers.
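
To make the idea of entropy over meanings concrete, here is a minimal sketch in Python. It is not the researchers' implementation. It assumes that several answers to the same question have already been sampled, and that some judge can say whether two answers mean the same thing. The function names (`cluster_by_meaning`, `semantic_entropy`, `toy_same_meaning`) are hypothetical, and the toy judge, which only compares the final word of each answer, is a stand-in for whatever meaning comparison a real system would use.

```python
import math
from typing import Callable, List


def cluster_by_meaning(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> List[List[str]]:
    """Group answers so that each cluster holds answers judged equivalent."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            # Compare against the first member as the cluster's representative.
            if same_meaning(answer, cluster[0]):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def semantic_entropy(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> float:
    """Shannon entropy (in nats) of the distribution over meaning clusters."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    clusters = cluster_by_meaning(answers, same_meaning)
    total = len(answers)
    # p is the share of samples that landed in a given meaning cluster.
    return sum(
        (len(c) / total) * math.log(total / len(c)) for c in clusters
    )


def toy_same_meaning(a: str, b: str) -> bool:
    """ASSUMPTION: crude stand-in that compares only the final word."""
    def key(s: str) -> str:
        return s.lower().strip(" .!?").split()[-1]
    return key(a) == key(b)


# Different phrasing, same meaning: one cluster, so entropy is 0.0.
rephrased = ["Paris", "It is Paris.", "The capital is Paris"]
print(semantic_entropy(rephrased, toy_same_meaning))

# Different meanings: three equal clusters, so entropy is ln(3), about 1.0986.
conflicting = ["Paris", "Lyon", "Marseille"]
print(semantic_entropy(conflicting, toy_same_meaning))
```

In this sketch, answers that differ only in wording collapse into one cluster and score zero, while answers that disagree in content spread across clusters and score high. That is the distinction between uncertainty about phrasing and uncertainty about content described above. Sampling several answers per question is also where the extra computational cost comes from.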

