The utility of large language models in oncological multidisciplinary team meetings: A systematic review

Prabhakaran, S; Bell, S; Lee, JC; Kong, JCH

Author(s): Prabhakaran, S; Bell, S; Lee, JC; Kong, JCH
Details: Publication Year 2026-03-06, Volume 52, Issue #5, Page 111741
Journal Title: European Journal of Surgical Oncology
Publication Type: Review
Abstract: Large language models (LLMs) have emerged in recent years as innovative artificial intelligence systems with early potential in clinical decision-making. This is the first systematic review to evaluate LLMs' oncological decision-making and compare their treatment recommendations to "gold standard" oncological multi-disciplinary team (MDT) decision-making. PubMed, EMBASE and Medline databases were last searched on 20th January in line with PRISMA guidelines. All relevant peer-reviewed publications comparing LLM and MDT treatment recommendations in patients with cancer were included. Studies using fictional cases, case reports, and conference proceedings were excluded. A modified QUADAS-2 tool was used for bias assessment. The primary outcome was the concordance between LLM and MDT treatment recommendations. Thirty-four publications met the inclusion criteria, with a total of 3513 patient cases included in this review. Studies were highly heterogeneous with regard to study design, sample size, cancers studied, and LLMs evaluated, among other factors. Concordance rates ranged from 16% to 100% across all studies. The highest concordance rates were noted in prostate cancer cases, where the LLM was directed to incorporate established international guidelines in decision-making. One-third of studies exhibited a high level of bias. Limitations of LLM decision-making include overtreatment of frail patients, lack of reproducibility, insufficient niche knowledge, occasional life-threatening recommendations, and medico-legal issues including privacy and confidentiality. LLMs may be capable of generating appropriate oncological treatment recommendations, but early outcomes are inconsistent, and findings on safety conflict across studies. Robust prospective comparative studies are still needed to better determine their utility in this setting.
Publisher: Elsevier
Keywords: Artificial intelligence; Large language models; Multi-disciplinary meetings; Surgical oncology
Department(s): Surgical Oncology
Publisher's Version: https://doi.org/10.1016/j.ejso.2026.111741
Open Access at Publisher's Site: https://doi.org/10.1016/j.ejso.2026.111741
Terms of Use/Rights Notice: Refer to copyright notice on published article.

Creation Date: 2026-04-07 03:20:34

Last Modified: 2026-04-07 03:20:52