
As artificial intelligence (AI) becomes more deeply integrated into our daily lives, particularly in fields as critical as healthcare, understanding its reliability and longevity becomes paramount. Recent research, however, has sparked a debate about whether AI, much like the human brain, could experience a form of cognitive decline as it ages. This notion challenges the prevailing optimism about AI’s potential to step seamlessly into roles traditionally held by humans, such as medical diagnostics.

Exploring Cognitive Decline in AI
A study published on December 20, 2024, in the British Medical Journal (BMJ) presents intriguing evidence suggesting that older versions of large language model (LLM) chatbots such as ChatGPT, Sonnet, and Gemini exhibit signs of cognitive wear akin to human aging. The researchers used the Montreal Cognitive Assessment (MoCA), a test normally given to detect cognitive impairment in humans, to evaluate the models’ memory, attention, language, and spatial skills.
Despite strong performance in areas like language and attention, the AI models struggled significantly with tasks requiring visual-spatial skills and executive function. Notably, while the latest iteration of ChatGPT scored a proficient 26 out of 30, the older Gemini 1.0 managed only 16, suggesting that older models perform worse than newer ones.
Critique and Controversy
The study’s approach has not been without its critics. Some experts argue that applying a human-centric test like MoCA to AI systems is not only inappropriate but also misleading. Aya Awwad, a research fellow at Mass General Hospital, critiqued the methodology in a recent letter, highlighting that the deficits noted in AI performance might be irrelevant to the technology’s application in clinical settings, which primarily require text processing capabilities.

Moreover, Aaron Sterling, CEO of EMR Data Cloud, alongside Stanford Assistant Professor Roxana Daneshjou, emphasized the need for repeated testing over time to truly ascertain if what is being observed can be classified as cognitive decline. Their argument points towards a more dynamic evaluation of AI capabilities post-updates and enhancements, rather than a one-time assessment.
Humor and Seriousness Intertwined
Adding another layer to the discourse, Roy Dayan, the lead author of the study and a doctor at Hadassah Medical Center in Jerusalem, clarified that the study, which appeared in the Christmas edition of BMJ, was partly humorous in tone. Whimsically titled “Age Against the Machine,” it was intended to use humor to highlight serious considerations about the evolving capabilities of AI in medical research and practice.
Dayan’s response to the critiques underscores a deeper intent to scrutinize how AI’s processing and response mechanisms differ fundamentally from human cognition, and why these differences matter in clinical applications.

Forward-Looking Perspectives
This debate serves as a critical reminder of the complexities involved in adopting AI technologies in sensitive fields like healthcare. It underscores the necessity for continuous research and adaptation to ensure AI tools remain effective and reliable as they evolve. As AI continues to grow older and more sophisticated, the medical community and AI developers must remain vigilant, ensuring these tools do not just mimic human intelligence but complement it effectively, particularly in scenarios where precision and reliability are non-negotiable.