Google’s Gemini AI Models Overhyped, Struggle with Large Data Sets
Recent studies reveal that Google’s Gemini AI models struggle with large datasets, contradicting the company’s claims.

Recent research challenges Google’s claims about its Gemini AI models’ ability to process and understand large volumes of data. Two independent studies evaluated the models on extensive inputs, including long documents and video content, and the results were underwhelming: Gemini 1.5 Pro and 1.5 Flash often failed to provide accurate answers, falling well short of the expectations set by Google’s marketing. When tested on lengthy fiction books, for instance, the models answered true/false questions correctly less than half the time, roughly what random guessing would achieve. Tasks involving video content revealed similar shortcomings, with the models struggling to interpret and extract information accurately. The findings suggest that while the Gemini models can technically ingest large amounts of data, their ability to genuinely understand and reason over that data is limited. These results arrive at a time when the industry is scrutinizing generative AI’s practical utility and accuracy, raising questions about its future in business applications. Improved benchmarks and third-party evaluations are recommended to provide a more realistic picture of AI capabilities.