Development of a Machine Learning Model for Predicting Treatment-Related Amenorrhea in Young Women with Breast Cancer
- Author(s)
- Song, L; Edib, Z; Aickelin, U; Akbarzadeh Khorshidi, H; Hamy, AS; Jayasinghe, Y; Hickey, M; Anderson, RA; Lambertini, M; Condorelli, M; Demeestere, I; Ignatiadis, M; Pistilli, B; Su, HI; Chang, S; Pang, PC; Reyal, F; Nelson, SM; Sukumvanich, P; Minisini, A; Puglisi, F; Ruddy, KJ; Couch, FJ; Olson, JE; Stern, K; Agresta, F; Stafford, L; Chin-Lenn, L; Cui, W; Anazodo, A; Gorelik, A; Nguyen, TL; Partridge, A; Saunders, C; Sullivan, E; Macheras-Magias, M; Peate, M
- Details
- Publication Year 2025-10-28, Volume 12, Issue #11, Page 1171
- Journal Title
- Bioengineering
- Publication Type
- Research article
- Abstract
- Treatment-induced ovarian function loss is a significant concern for many young patients with breast cancer. Accurately predicting this risk is crucial for counselling young patients and informing their fertility-related decision-making. However, current risk prediction models for treatment-related ovarian function loss have limitations. To provide a broader representation of patient cohorts and improve feature selection, we combined retrospective data from six datasets within the FoRECAsT (Infertility after Cancer Predictor) databank, including 2679 pre-menopausal women diagnosed with breast cancer. This combined dataset presented notable missingness, prompting us to employ cross imputation using the k-nearest neighbours (KNN) machine learning (ML) algorithm. Employing Lasso regression, we developed an ML model to forecast the risk of treatment-related amenorrhea as a surrogate marker of ovarian function loss at 12 months after starting chemotherapy. Our model identified 20 variables significantly associated with risk of developing amenorrhea. Internal validation resulted in an area under the receiver operating characteristic curve (AUC) of 0.820 (95% CI: 0.817-0.823), while external validation with another dataset demonstrated an AUC of 0.743 (95% CI: 0.666-0.818). A cutoff of 0.20 was chosen to achieve higher sensitivity in validation, as false negatives (patients incorrectly classified as likely to regain menses) could miss timely opportunities for fertility preservation if desired. At this threshold, internal validation yielded sensitivity and precision rates of 91.3% and 61.7%, respectively, while external validation showed 92.9% and 60.0%. Leveraging ML methodologies, we not only devised a model for personalised risk prediction of amenorrhea, demonstrating substantial enhancements over existing models, but also showcased a robust framework for maximally harnessing available data sources.
- Publisher
- MDPI
- Keywords
- breast cancer; cross imputation; machine learning; risk prediction model; treatment-related amenorrhea
- Department(s)
- Medical Oncology
- Publisher's Version
- https://doi.org/10.3390/bioengineering12111171
- Open Access at Publisher's Site
- https://doi.org/10.3390/bioengineering12111171
- Terms of Use/Rights Notice
- Refer to copyright notice on published article.
Creation Date: 2026-01-15 11:50:42
Last Modified: 2026-01-15 11:51:01
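
The abstract reports sensitivity and precision at a probability cutoff of 0.20, chosen to limit false negatives. The sketch below illustrates, under stated assumptions, how those two rates follow from a cutoff applied to predicted risks. It is not the authors' code: the function name `sensitivity_and_precision`, the `>=` comparison at the cutoff, and the toy labels and risks are all hypothetical and invented for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_and_precision(
    y_true: Sequence[int], risk: Sequence[float], cutoff: float = 0.20
) -> Tuple[float, float]:
    """Return (sensitivity, precision) when risk >= cutoff counts as positive.

    y_true: 1 = amenorrhea at 12 months, 0 = menses present.
    risk:   predicted probability of amenorrhea from any fitted model.
    Assumption: a risk exactly equal to the cutoff is treated as positive.
    """
    if len(y_true) != len(risk):
        raise ValueError("y_true and risk must have the same length")
    tp = fp = fn = 0
    for label, p in zip(y_true, risk):
        predicted_positive = p >= cutoff
        if predicted_positive and label == 1:
            tp += 1  # correctly flagged as at risk
        elif predicted_positive and label == 0:
            fp += 1  # flagged, but menses present
        elif not predicted_positive and label == 1:
            fn += 1  # missed case: the error the low cutoff tries to avoid
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, precision


# Toy labels and probabilities, invented for illustration only.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_risk = [0.90, 0.60, 0.35, 0.15, 0.55, 0.25, 0.10, 0.05]

low = sensitivity_and_precision(toy_labels, toy_risk, cutoff=0.20)
high = sensitivity_and_precision(toy_labels, toy_risk, cutoff=0.50)
print("cutoff 0.20:", low)
print("cutoff 0.50:", high)
```

On this toy data, lowering the cutoff from 0.50 to 0.20 raises sensitivity and lowers precision, which is the same trade-off the abstract describes accepting in order to reduce missed cases.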