The growing reliance on artificial intelligence (AI) for medical diagnosis is facing new scrutiny. A recent study published in the BMJ reports that AI tools, including large language models (LLMs) and chatbots, may show signs of cognitive decline as they “age,” much as humans do. This raises important concerns about the reliability of AI in medical settings.
Key Findings:
- Researchers tested popular LLMs, including ChatGPT, Claude (“Sonnet”), and Gemini, using the Montreal Cognitive Assessment (MoCA) to evaluate their cognitive skills.
- The models handled some tasks well but struggled significantly with visuospatial skills and executive function.
- The latest version of ChatGPT performed best, scoring 26 out of 30, while an older model, Gemini 1.0, scored only 16.
- The study emphasizes that these results are observational and cannot be directly compared to human cognitive performance.
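The 26/30 and 16/30 totals above follow from the MoCA's standard 30-point structure, which allocates fixed point caps to each cognitive domain. As a minimal sketch (in Python, which the article itself does not use), here is how domain scores roll up into a total; the per-domain scores for the example model are hypothetical, chosen only to illustrate a 26-point result, and are not the study's actual domain-level breakdown.

```python
# Standard MoCA point allocation per cognitive domain (sums to 30).
MOCA_MAX = {
    "visuospatial/executive": 5,
    "naming": 3,
    "attention": 6,
    "language": 3,
    "abstraction": 2,
    "delayed recall": 5,
    "orientation": 6,
}

def moca_total(scores: dict) -> int:
    """Sum per-domain scores, clamping each to its MoCA maximum."""
    return sum(min(scores.get(domain, 0), cap) for domain, cap in MOCA_MAX.items())

# HYPOTHETICAL domain scores for a strong model, for illustration only:
# weak on the visuospatial/executive and delayed-recall domains, as the
# study reports, but perfect elsewhere.
strong_model = {
    "visuospatial/executive": 3,
    "naming": 3,
    "attention": 6,
    "language": 3,
    "abstraction": 2,
    "delayed recall": 3,
    "orientation": 6,
}

print(sum(MOCA_MAX.values()))   # maximum possible score: 30
print(moca_total(strong_model))  # illustrative total: 26
```

The clamping in `moca_total` simply guards against a domain score exceeding its cap; it is a detail of this sketch, not of the study's methodology.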
Implications for the Future:
These findings point to a potential limitation in using AI for medical diagnostics, especially for tasks that demand complex cognitive functions. As AI tools continue to evolve, understanding their cognitive limitations is crucial to ensuring patient safety and maintaining trust in medical recommendations. The study also suggests, half in jest but with a serious undertone, that neurologists may soon find themselves evaluating AI systems for cognitive impairment, underscoring the need for a careful approach to integrating AI into healthcare.