Generative AI chatbots for reliable cancer information: Evaluating web-search, multilingual, and reference capabilities of emerging large language models
Journal Title
European Journal of Cancer
Publication Type
Online publication before print
Abstract
Recent advancements in large language models (LLMs) enable real-time web search, improved referencing, and multilingual support, yet ensuring they provide safe health information remains crucial. This perspective evaluates seven publicly accessible LLMs (ChatGPT, Co-Pilot, Gemini, MetaAI, Claude, Grok, and Perplexity) on three simple cancer-related queries across eight languages (336 responses: English, French, Chinese, Thai, Hindi, Nepali, Vietnamese, and Arabic). None of the 42 English responses contained clinically meaningful hallucinations, whereas 7 of 294 non-English responses did. Overall, 48% (162/336) of responses included valid references, but 39% of the English references were .com links, reflecting quality concerns. English responses frequently exceeded an eighth-grade reading level, and many non-English outputs were also complex. These findings reflect substantial progress over the past 2 years but reveal persistent gaps in multilingual accuracy, reliable reference inclusion, referral practices, and readability. Ongoing benchmarking is essential to ensure LLMs safely support global health information dichotomy and meet online information standards.
Keywords
Artificial intelligence; Cancer enquiries; English; Health enquiries; Language; Large language model
Department(s)
Medical Oncology
Open Access at Publisher's Site
https://doi.org/10.1016/j.ejca.2025.115274
Terms of Use/Rights Notice
Refer to copyright notice on published article.


Creation Date: 2025-02-11 06:48:54
Last Modified: 2025-02-11 06:50:53

© 2025 The Walter and Eliza Hall Institute of Medical Research.
