Scoring PD-L1 Expression in Urothelial Carcinoma: An International Multi-Institutional Study on Comparison of Manual and Artificial Intelligence Measurement Model (AIM-PD-L1) Pathology Assessments

Author(s): Rüschoff, J; Kumar, G; Badve, S; Jasani, B; Krause, E; Rioux-Leclercq, N; Rojo, F; Martini, M; Cheng, L; Tretiakova, M; Mitchell, C; Anders, RA; Robert, ME; Fahy, D; Pyle, M; Le, Q; Yu, L; Glass, B; Baxi, V; Babadjanova, Z; Pratt, J; Brutus, S; Karasarides, M; Hartmann, A
Details: Publication Year 2024-04, Volume 484, Issue #4, Pages 597-608
Journal Title: Virchows Archiv
Publication Type: Research article
Abstract: Assessing programmed death ligand 1 (PD-L1) expression on tumor cells (TCs) using Food and Drug Administration-approved, validated immunoassays can guide the use of immune checkpoint inhibitor (ICI) therapy in cancer treatment. However, substantial interobserver variability has been reported using these immunoassays. Artificial intelligence (AI) has the potential to accurately measure biomarker expression in tissue samples, but its reliability and comparability to standard manual scoring remain to be evaluated. This multinational study sought to compare the %TC scoring of PD-L1 expression in advanced urothelial carcinoma, assessed by either an AI Measurement Model (AIM-PD-L1) or expert pathologists. The concordance among pathologists and between pathologists and AIM-PD-L1 was determined. The positivity rate of ≥ 1%TC PD-L1 was between 20% and 30% for 8/10 pathologists, and the degree of agreement and scoring distribution among pathologists and between pathologists and AIM-PD-L1 were similar, whether scored as a continuous variable or using the pre-defined cutoff. Numerically higher score variation was observed with the 22C3 assay than with the 28-8 assay. A 2-h training module on the 28-8 assay did not significantly impact manual assessment. Cases exhibiting significantly higher variability in the assessment of PD-L1 expression (mean absolute deviation > 10) were found to have patterns of PD-L1 staining that were more challenging to interpret. An improved understanding of sources of manual scoring variability can be applied to PD-L1 expression analysis in the clinical setting. In the future, the application of AI algorithms could serve as a valuable reference guide for pathologists while scoring PD-L1.
Publisher: Springer Nature
Keywords: Humans; *B7-H1 Antigen/analysis/metabolism; *Artificial Intelligence; *Biomarkers, Tumor/analysis/metabolism; *Observer Variation; Reproducibility of Results; Carcinoma, Transitional Cell/pathology/metabolism/diagnosis; Urinary Bladder Neoplasms/pathology/metabolism; Urologic Neoplasms/pathology/metabolism; Immunohistochemistry/methods; Pathologists; Urothelium/pathology/metabolism; Artificial intelligence; Bladder cancer; PD-L1; Pathology
Department(s): Pathology
Publisher's Version: https://doi.org/10.1007/s00428-024-03795-8
Terms of Use/Rights Notice: Refer to copyright notice on published article.

Creation Date: 2024-07-17 07:22:37

Last Modified: 2024-07-17 07:24:34