Quality and Usability of Prostate Cancer Information Generated by Artificial Intelligence Chatbots: A Comparative Analysis
Details
Publication Date: 2026-03-11; Volume 18, Issue 6, Page 906
Journal Title
Cancers
Publication Type
Research article
Abstract
BACKGROUND: Artificial intelligence chatbots are increasingly used by patients to obtain health information, including for prostate cancer. While these platforms offer accessible and conversational responses, concerns remain regarding the quality, usability, and clinical relevance of AI-generated content. This study comparatively evaluated patient-directed prostate cancer information generated by commonly used AI chatbots. METHODS: Standardised prostate cancer-related prompts were developed using Google Trends and authoritative healthcare resources. Identical queries were submitted to five publicly accessible AI chatbots: ChatGPT 5.2, Google Gemini, Claude AI, Microsoft Copilot, and Perplexity. Responses were independently assessed by two blinded reviewers using the DISCERN instrument for information quality and the Patient Education Materials Assessment Tool for printable materials (PEMAT-P) for understandability and actionability. Inter-rater reliability was assessed using intraclass correlation coefficients (ICCs). Readability was evaluated using the Flesch-Kincaid Reading Ease score. Descriptive statistics were used for comparative and pooled analyses. RESULTS: Overall information quality was moderate, with a pooled median (interquartile range [IQR]) DISCERN score of 56.5 (53.0-61.0). Higher mean DISCERN scores were observed for ChatGPT 5.2 and Microsoft Copilot, whereas lower scores were observed for Claude and Perplexity. PEMAT-P understandability was consistently high across platforms, with a pooled median (IQR) score of 91.7% (83.3-91.7%). In contrast, PEMAT-P actionability was uniformly poor, with a pooled median (IQR) score of 0% (0-0%). Readability analysis demonstrated moderate complexity, with a pooled median (IQR) Flesch-Kincaid Reading Ease score of 50.4 (49.2-52.5) and a median word count of 666 (657-1022). Inter-rater reliability was good for PEMAT understandability (ICC 0.841) and moderate for DISCERN (ICC 0.712). 
CONCLUSIONS: AI chatbots provide highly understandable but only moderate-quality patient-directed prostate cancer information, with a consistent lack of actionable guidance. Although content quality varied across platforms, significant limitations remain in evidence transparency and practical patient support. Future development should prioritise the integration of evidence-based resources and actionable decision-support tools to enhance the role of AI chatbots in prostate cancer education.
Publisher
MDPI
Keywords
artificial intelligence; chatbots; health information quality; patient education; prostate cancer
Department(s)
Surgical Oncology; Cancer Imaging; Radiation Oncology
Open Access at Publisher's Site
https://doi.org/10.3390/cancers18060906
Terms of Use/Rights Notice
Refer to copyright notice on published article.


Creation Date: 2026-04-02 12:29:58
Last Modified: 2026-04-02 12:30:07