Machine learning and deep learning to improve overall survival prediction in cervical cancer patients

Nan Jiang; Xing Xiong; Xue Chen; Mengmeng Feng; Yan Guo; Chunhong Hu

doi:10.21037/tcr-2024-2304

Original Article

Nan Jiang¹, Xing Xiong¹, Xue Chen¹,², Mengmeng Feng¹, Yan Guo³, Chunhong Hu¹,⁴

¹Department of Radiology, The First Affiliated Hospital of Soochow University, Suzhou, China; ²Department of Radiology, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Suzhou, China; ³Department of Gynecology and Obstetrics, The First Affiliated Hospital of Soochow University, Suzhou, China; ⁴Institute of Medical Imaging, Soochow University, Suzhou, China

Contributions: (I) Conception and design: N Jiang, C Hu; (II) Administrative support: C Hu; (III) Provision of study materials or patients: X Chen; (IV) Collection and assembly of data: X Xiong, M Feng; (V) Data analysis and interpretation: N Jiang, Y Guo, C Hu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Chunhong Hu, MD. Department of Radiology, The First Affiliated Hospital of Soochow University, No. 188 Shizi Street, Suzhou 215006, China; Institute of Medical Imaging, Soochow University, No. 188 Shizi Street, Suzhou 215006, China. Email: sdhuchunhong@sina.com.

Background: Cervical cancer (CC) is one of the most common gynecological malignancies. Previous studies have shown that the prognosis of CC is affected by many factors. Our study aimed to identify key prognostic factors and use machine learning and deep learning algorithms to construct models to predict the overall survival (OS) of CC patients.

Methods: Data of CC patients diagnosed between 2007 and 2016 were collected from the Surveillance, Epidemiology, and End Results (SEER) database and randomly divided into a training set (1,743 patients) and a test set (747 patients). To enhance the practical application of the model, we conducted an X-tile analysis to categorize the patients into three strata each by age and by tumor size. Least absolute shrinkage and selection operator (LASSO) and multivariate Cox regression were performed to identify the independent prognostic factors for OS, which were then used to construct CoxBoost, RandomForest, SuperPC, XGBoost, and DeepSurv survival models to predict 1-, 3-, and 5-year OS.

Results: Age, marital status, grade, tumor size, surgery, radiation, race, American Joint Committee on Cancer (AJCC) stage (AJCC_stage), AJCC_T, and AJCC_M were associated with survival and were incorporated into the five models. The concordance index (C-index) values for CoxBoost, RandomForest, SuperPC, XGBoost, and DeepSurv were 0.858, 0.848, 0.849, 0.840, and 0.869, respectively, and the receiver operating characteristic (ROC) curves showed high predictive performance. DeepSurv performed best among the five models, with area under the curve (AUC) values of 0.936, 0.915, and 0.900 for 1-, 3-, and 5-year OS, respectively.

Conclusions: The prognostic model constructed with the DeepSurv algorithm, together with the independent prognostic factors, can potentially be applied in making personalized treatment plans and evaluating the prognosis of CC patients.

Keywords: Cervical cancer (CC); prognosis; overall survival (OS); Surveillance, Epidemiology, and End Results program (SEER program)

Submitted Nov 19, 2024. Accepted for publication Mar 19, 2025. Published online May 26, 2025.

Highlight box

Key findings

• We successfully constructed and validated five reliable machine learning and deep learning models to predict survival in cervical cancer (CC) patients, which showed superiority over the traditional staging system.

What is known and what is new?

• Prognosis assessment and the determination of therapeutic strategies for CC have received increasing attention. The traditional staging system ignores other potential variables, such as age, gender, race, and tumor differentiation.

• The prognostic models constructed in this study, grounded on a comprehensive large-sample analysis derived from the Surveillance, Epidemiology, and End Results database, offer enhanced accuracy in predicting overall survival for CC patients and are capable of providing more precisely targeted clinical treatment recommendations. X-tile software was used to calculate the best cutoff values of age and tumor size for stratification. Using SHapley Additive exPlanations as an explainer to identify feature contributions can enhance the model’s interpretability.

What is the implication, and what should change now?

• The novel explainable model presented in this study has the potential to assist clinicians in tailoring survival predictions for CC patients and supporting the process of treatment decision-making. Further studies with larger sample sizes and more comprehensive clinical information that could be used for external validation are warranted to enhance the predictive accuracy and applicability of our model in different clinical settings.

Introduction

Cervical cancer (CC) poses a significant public health challenge worldwide, with an estimated 604,127 new cases in 2020 (1) and 662,301 new cases in 2022 (2). It ranks eighth among cancer sites by incidence and contributes to a heavy global burden (3). Although the incidence has halved since the 1970s, the limited availability of medical facilities remains a significant obstacle in many countries (4). Moreover, while the incidence rates have stabilized recently (5), the number of global deaths and disability-adjusted life years for CC has risen since 1990 (6). The overall survival (OS) rate for CC continues to lag, with the 5-year survival figure for all patients remaining below 66.7%, highlighting the persistent challenge of enhancing OS (7). Currently, the standard first-line therapeutic approaches for CC encompass surgical intervention, surgical intervention followed by chemotherapy or radiotherapy, or a combination of both (8). Due to advancements in radiotherapy and chemotherapy, chemoradiotherapy has emerged as the preferred treatment modality for locally advanced CC (9). Accurate prediction of clinical outcomes can significantly aid physicians in developing personalized treatment plans for CC patients based on their risk profiles. It also enables timely intervention in high-risk patients, improving OS rates (10).

However, in clinical settings, patients without identifiable risk factors, those with a relatively favorable stage according to the International Federation of Gynecology and Obstetrics (FIGO) or the tumor-node-metastasis (TNM) staging system, and those who underwent timely and appropriate treatment may still have unfavorable survival outcomes. Therefore, these factors alone cannot be reliable predictors of patient survival. Additionally, the intrinsic heterogeneity of CC results in varying survival times even among patients at the same tumor stage. In light of the limitations of current clinical practice and the need for better survival prediction, it is imperative to develop a precise prognostic prediction tool to supplement existing clinical methodologies.

Many novel opportunities arise when applying artificial intelligence (AI) to clinical practice. Because machine learning (ML) algorithms can analyze complex medical data and create predictive models (11), they have become increasingly important in the field of healthcare. One of the most powerful and iconic approaches in AI is deep learning (DL), which can learn on its own and extract relevant features through multiple layers of processing (12).

A large number of studies have used the Surveillance, Epidemiology, and End Results (SEER) database to study CC (13-17), but very few have focused on predicting OS. Some studies (14-17) have used nomograms with concordance index (C-index) and area under the curve (AUC) values below 0.9 to predict CC prognosis. In the present study, we aimed to develop and validate an optimal prognostic model for survival prediction using DL and ML algorithms based on the SEER database to guide clinical treatment and predict the OS of CC patients. We also performed an X-tile analysis to classify the patients, thereby enhancing the practical utility of the models. Further, to provide interpretations of the predictions, we used the SHapley Additive exPlanations (SHAP) explainer to identify the feature attributions. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2304/rc).

Methods

Patients and data collection

We utilized SEER*Stat software (version 8.4.3) to estimate 1-, 3-, and 5-year OS for CC patients using the “Incidence-SEER Research Data, 8 Registries, Nov 2022 Sub[1975–2020]” database. This database contains detailed clinical, pathological, and prognostic data for individuals diagnosed with CC from the SEER database of the National Cancer Institute.

The inclusion criteria were as follows: (I) primary site codes C53.0, C53.1, C53.8, and C53.9 according to the third edition of the International Classification of Diseases for Oncology (ICD-O-3); (II) histological codes 8050–8089 and 8140–8389; (III) year of initial diagnosis between 2007 and 2016. Using these three criteria, 7,635 samples were initially obtained. The following exclusion criteria were applied: (I) missing critical clinical characteristics; (II) duplicate patient records; (III) OS of less than one month. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Feature selection

Clinical variables extracted from the SEER database included year of diagnosis, age, sex, race, marital status at diagnosis, histologic type ICD-O-3, primary site, the American Joint Committee on Cancer (AJCC) TNM stage, vital status, diagnostic confirmation, OS months, grade, chemotherapy, and radiation recodes.

Patients were randomly divided using R software. X-tile software was utilized to determine the best cutoff values for age and tumor size (18). Chi-square tests were employed to compare variables between the training and test groups to prevent excessive bias from random partitioning of the data. A P value of less than 0.05 was considered statistically significant.
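The balance check described above can be sketched in a few lines of Python. The example below is a minimal illustration, not the authors' R code: it applies a Pearson Chi-square test to the marital status counts reported in Table 1, and it assumes a Yates continuity correction, which the text does not specify. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(table, yates=True):
    """Pearson Chi-square test for a 2x2 contingency table (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Continuity correction: an assumption, not stated in the paper.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    # With 1 degree of freedom, the Chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Marital status counts from Table 1.
# Rows: training set, test set. Columns: married, single.
marital_status = [[1151, 592], [484, 263]]
statistic, p_value = chi_square_2x2(marital_status)
print(f"Chi-square = {statistic:.3f}, P = {p_value:.2f}")
```

Under the continuity-correction assumption, the P value rounds to the 0.58 listed for marital status in Table 1. Features with more than two categories would need the general Chi-square distribution rather than the one-degree-of-freedom shortcut used here.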

To identify independent prognostic factors for CC patients, least absolute shrinkage and selection operator (LASSO) and univariate Cox regression were used to screen the variables significantly associated with patient prognosis (P<0.05). Stepwise Cox regression was then conducted for the variables identified by the two algorithms, and the Akaike information criterion (AIC) value was calculated. The feature set corresponding to the minimum AIC value was selected as the final choice.
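The selection rule in the last step can be illustrated with a short Python sketch. The log-likelihood values and candidate feature sets below are made up purely to show how the minimum-AIC set is chosen; they are not results from this study. As a simplification, each feature is assumed to contribute one coefficient, whereas a categorical feature in a Cox model would contribute one coefficient per dummy variable.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * logL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidate feature sets with made-up partial log-likelihoods.
candidates = {
    ("AJCC_stage",): -1250.0,
    ("AJCC_stage", "age"): -1240.0,
    ("AJCC_stage", "age", "grade"): -1239.5,
}

scores = {features: aic(ll, len(features)) for features, ll in candidates.items()}
best_features = min(scores, key=scores.get)
print(best_features, scores[best_features])
```

In this toy case the third feature improves the log-likelihood too little to offset the penalty for the extra parameter, so the two-feature set has the lowest AIC.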

Constructing survival prognostic models

After finalizing the features, the XGBoost, RandomForest, CoxBoost, and SuperPC (Supervised Principal Components) survival models were constructed using the R packages xgboost (version 1.7.5.1; https://cran.r-project.org/web/packages/xgboost/index.html), randomForestSRC (version 3.2.2; https://cran.r-project.org/web/packages/randomForestSRC/index.html), CoxBoost (version 1.4; https://cran.r-project.org/src/contrib/Archive/CoxBoost/), and superpc (version 1.12; https://cran.r-project.org/web/packages/superpc/index.html), respectively. Meanwhile, Python was used to build the DeepSurv model (19), a DL survival neural network that combines a neural network with the Cox proportional hazards model.
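To make the link between a Cox-based neural network and the Cox proportional hazards model concrete, the sketch below computes the negative Cox partial log-likelihood for a set of predicted log-risk scores, which is the loss described in the DeepSurv paper (19). It is a minimal pure-Python illustration rather than the implementation used in this study: the function name and toy data are hypothetical, tied event times are assumed absent, and regularization terms are omitted.

```python
import math


def neg_log_partial_likelihood(times, events, log_risks):
    """Negative Cox partial log-likelihood for predicted log-risk scores.

    times: follow-up times; events: 1 if death was observed, 0 if censored;
    log_risks: model outputs, where a higher value means a higher hazard.
    Assumes there are no tied event times.
    """
    # Visit subjects from longest to shortest follow-up so that the running
    # sum always covers the risk set (everyone still under observation).
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    risk_set_sum = 0.0
    loss = 0.0
    for i in order:
        risk_set_sum += math.exp(log_risks[i])
        if events[i] == 1:
            loss -= log_risks[i] - math.log(risk_set_sum)
    return loss


# Toy example: three patients, all with observed events and equal risk scores.
toy_loss = neg_log_partial_likelihood([1, 2, 3], [1, 1, 1], [0.0, 0.0, 0.0])
print(toy_loss)
```

In a network such as DeepSurv, `log_risks` would be the output of the final layer, and the loss would be minimized by gradient descent. Censored patients contribute to the risk sets but not to the sum over events.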

All models underwent hyper-parameter tuning, and the optimal parameters were selected for model construction and training. Model training and hyper-parameter tuning were performed on the training set. We then used the R package timeROC (version 0.4; https://github.com/cran/timeROC) to estimate, on the test set, the AUC values of the model risk scores for predicting OS at 1, 3, and 5 years, and the receiver operating characteristic (ROC) curves were plotted.
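The idea behind a time-dependent AUC can be sketched as follows. At a chosen horizon, patients who died on or before the horizon are treated as cases and patients still alive after it as controls, and the AUC is the fraction of case-control pairs in which the case received the higher risk score. This is a deliberately naive sketch with hypothetical toy data: it simply drops patients censored before the horizon and omits the inverse-probability-of-censoring weighting that timeROC applies, so it should not be expected to reproduce the reported values.

```python
def naive_auc_at(horizon, times, events, scores):
    """Naive cumulative/dynamic AUC at a time horizon (no censoring weights)."""
    cases = [s for t, e, s in zip(times, events, scores) if e == 1 and t <= horizon]
    controls = [s for t, s in zip(times, scores) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")
    # Count pairs where the case has the higher score; ties count as half.
    concordant = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data: follow-up in months, event indicator, model risk score.
times_months = [6, 10, 20, 40, 50, 70]
events = [1, 1, 0, 1, 0, 0]
risk_scores = [0.9, 0.4, 0.5, 0.6, 0.3, 0.1]

for horizon in (12, 36, 60):
    print(horizon, round(naive_auc_at(horizon, times_months, events, risk_scores), 3))
```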

Key features extraction of the optimal model

Based on the final 1-, 3-, and 5-year ROC curves of the above models, the model with the highest AUC value was selected as the optimal model. Next, we used the Python package SHAP (version 0.32.0; https://github.com/shap/shap/tree/0.32.0) to assess the significance of the features in the best model.
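For readers unfamiliar with the quantity that SHAP reports, the sketch below computes exact Shapley values for a tiny hypothetical risk function by averaging each feature's marginal contribution over all feature orderings, with absent features replaced by a baseline value. The toy model, its coefficients, and the input values are illustrative assumptions and are unrelated to the fitted DeepSurv model. The SHAP package itself relies on approximations, because exact enumeration becomes infeasible as the number of features grows.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_value = model(current)
        for i in order:
            # Switch feature i from its baseline value to its actual value
            # and record how much the prediction changes.
            current[i] = x[i]
            value = model(current)
            phi[i] += value - previous_value
            previous_value = value
    return [total / len(orderings) for total in phi]


def toy_risk(features):
    """Hypothetical risk score with an interaction between stage and T category."""
    stage, t_category, grade = features
    return 0.8 * stage + 0.5 * t_category + 0.2 * grade + 0.3 * stage * t_category


patient = [3, 2, 1]    # hypothetical encoded stage, T category, grade
reference = [1, 1, 1]  # hypothetical baseline patient
phi = shapley_values(toy_risk, patient, reference)
print(dict(zip(["stage", "t_category", "grade"], phi)))
```

The attributions sum to the difference between the prediction for the patient and the prediction for the baseline, which is the property that makes a ranking of mean absolute values interpretable as feature importance.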

Statistical analyses

All statistical analyses in this study were conducted using R software version 4.3.1 (R Foundation for Statistical Computing, Vienna, Austria) and the Python programming language (version 3.8.19). LASSO and Cox regression analyses were used to determine the independent prognostic factors for OS in CC patients. P values of less than 0.05 were considered statistically significant.

Results

Data sources

Data on CC samples were obtained from the SEER database between 2007 and 2016. The histology categories covered by SEER were carefully analyzed by a highly qualified gynecology expert. To lower the possibility of misclassification, the data were reclassified when necessary. Samples lacking critical characteristics, such as TNM stage, prognostic survival information, and data on all other characteristics included in the subsequent analyses, were excluded. Grouping information on ethnicity was classified as black, white, and other ethnicities (American Indian/Alaska Native, Asian/Pacific Islander), and samples with unknown race and ethnicity were also excluded. The result was a final cohort of 2,490 patients. The flow chart is shown in Figure 1.

Figure 1 Flow chart of the patient screening process based on the SEER database and the construction of the training and test sets. C-index, concordance index; LASSO, least absolute shrinkage and selection operator; OS, overall survival; ROC, receiver operating characteristic; SEER, Surveillance, Epidemiology, and End Results; TNM, tumor-node-metastasis.

Feature selection

The “caret” package in R was used to randomly divide all patients into a training set (n=1,743) and a test set (n=747; labeled “Validation” in Table 1) at a ratio of 7:3. X-tile was then used to determine the optimal cutoff values for age and tumor size. Age was divided into three groups: <41, 41–59, and >59 years. Tumor size was also divided into three groups: <28, 28–47, and >47 mm.
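The partition and the X-tile strata can be expressed compactly in code. The sketch below is an illustration under stated assumptions: it uses Python's `random` module with an arbitrary seed rather than the caret package, so it reproduces the set sizes but not the authors' actual partition, and the function names are hypothetical. The cutoffs are those reported above, applied to whole-number ages and sizes.

```python
import random


def split_7_3(n_patients, seed=0):
    """Randomly split patient indices into training and test sets at 7:3."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    cut = round(0.7 * n_patients)
    return indices[:cut], indices[cut:]


def age_group(age_years):
    """Age strata from the X-tile cutoffs: <41, 41-59, >59 years."""
    if age_years < 41:
        return "<41"
    if age_years <= 59:
        return "41-59"
    return ">59"


def tumor_size_group(size_mm):
    """Tumor size strata from the X-tile cutoffs: <28, 28-47, >47 mm."""
    if size_mm < 28:
        return "<28"
    if size_mm <= 47:
        return "28-47"
    return ">47"


train_idx, test_idx = split_7_3(2490)
print(len(train_idx), len(test_idx))
print(age_group(45), tumor_size_group(50))
```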

To check whether the randomly divided groups differed significantly, the Chi-square test was used to compare each feature between the training group and the test group, and the resulting P values are shown in Table 1. Feature selection was then performed, and the LASSO regression model fitting procedure is shown in Figure 2. The LASSO regression reached the lowest model error when nine features were selected (Figure 3).

Table 1

Chi-square test results for training and test sets

Feature	Training (n=1,743)	Validation (n=747)	P value
Age (years), n (%)			0.81
<41	661 (37.9)	277 (37.1)
41–59	820 (47.0)	362 (48.5)
>59	262 (15.0)	108 (14.5)
Marital status, n (%)			0.58
Married	1,151 (66.0)	484 (64.8)
Single	592 (34.0)	263 (35.2)
Histological type, n (%)			>0.99
Adenomas and adenocarcinomas	559 (32.1)	239 (32.0)
Squamous cell neoplasms	1,184 (67.9)	508 (68.0)
Grade, n (%)			0.41
Grade I	265 (15.2)	131 (17.5)
Grade II	834 (47.8)	335 (44.8)
Grade III	568 (32.6)	249 (33.3)
Grade IV	76 (4.36)	32 (4.28)
Tumor size (mm), n (%)			0.86
<28	806 (46.2)	351 (47.0)
28–47	350 (20.1)	153 (20.5)
>47	587 (33.7)	243 (32.5)
Surgery, n (%)			0.58
No	485 (27.8)	199 (26.6)
Yes	1,258 (72.2)	548 (73.4)
Radiation, n (%)			0.16
No	11 (0.63)	7 (0.94)
Unknown	777 (44.6)	360 (48.2)
Yes	955 (54.8)	380 (50.9)
Chemotherapy, n (%)			0.13
No/unknown	904 (51.9)	413 (55.3)
Yes	839 (48.1)	334 (44.7)
Lymph node metastasis, n (%)			0.89
No	1,338 (76.8)	576 (77.1)
Yes	405 (23.2)	171 (22.9)
Race, n (%)			0.23
Black	148 (8.49)	68 (9.10)
Other	305 (17.5)	110 (14.7)
White	1,290 (74.0)	569 (76.2)
AJCC stage, n (%)			0.67
I	1,014 (58.2)	445 (59.6)
II	213 (12.2)	85 (11.4)
III	371 (21.3)	148 (19.8)
IV	145 (8.32)	69 (9.24)
AJCC T, n (%)			0.48
T1	1,178 (67.6)	517 (69.2)
T2	357 (20.5)	137 (18.3)
T3	177 (10.2)	75 (10.0)
T4	31 (1.78)	18 (2.41)
AJCC N, n (%)			0.89
N0	1,338 (76.8)	576 (77.1)
N1	405 (23.2)	171 (22.9)
AJCC M, n (%)			0.74
M0	1,616 (92.7)	689 (92.2)
M1	127 (7.29)	58 (7.76)

AJCC, American Joint Committee on Cancer; LASSO, least absolute shrinkage and selection operator.

The analysis revealed no significant differences in features between the training set and the test set (P>0.05), indicating that the data were evenly distributed and suitable for further analysis. Subsequently, feature selection was performed using LASSO and univariate Cox regression on the training set. The LASSO regression model fitting procedure is depicted in Figures 2 and 3. The lowest model error occurred when nine features were selected using the LASSO regression. Additionally, the univariate Cox regression results showed that all 14 candidate features were significantly associated with prognosis (P<0.001).

Figure 2 The coefficient profiles of the nine features obtained through the LASSO. LASSO, least absolute shrinkage and selection operator.

Figure 3 The selection of tuning parameters in the LASSO model through 10-fold cross-validation. LASSO, least absolute shrinkage and selection operator.

Construction of survival prediction models

As described in the Methods, hyperparameter tuning was carried out for each algorithm. The final features were age, marital status, grade, tumor size, surgery, radiation, race, AJCC_stage, AJCC_T, and AJCC_M. After each model was constructed with these features, the models were evaluated and compared using the C-index and the 1-, 3-, and 5-year AUC values (Table 2). Figure 4 displays the ROC curves for each algorithm’s OS at 1, 3, and 5 years. DeepSurv was selected as the final model because it had the highest C-index and AUC values among the five models.

Table 2

C-index and AUC comparison for the five algorithms

Model	C-index	1-year AUC	3-year AUC	5-year AUC
CoxBoost	0.858	0.927	0.900	0.884
RandomForest	0.848	0.902	0.886	0.874
SuperPC	0.849	0.902	0.889	0.873
XGBoost	0.840	0.916	0.886	0.865
DeepSurv	0.869	0.936	0.915	0.900

AUC, area under the curve; C-index, concordance index; SuperPC, Supervised Principal Components.

Figure 4 The ROC curves for predicting 1-, 3-, and 5-year overall survival of each algorithm. AUC, area under the curve; ROC, receiver operating characteristic.

Extraction of key features of optimal model

As mentioned in the Methods, the SHAP library was used to assess the feature importance of DeepSurv after it was chosen as the final OS model. Because the random forest (RF) algorithm also provides an importance ranking, the two rankings were compared. Figure 5 displays the importance ranking of the DeepSurv algorithm, and Figure 6 shows the RF results. The two rankings are similar, with both AJCC_stage and AJCC_T placing in the top 3.

Figure 5 The importance rankings of features in the DeepSurv algorithm. AJCC, American Joint Committee on Cancer; SHAP, SHapley Additive exPlanations.

Figure 6 Random forest error as a function of the number of decision trees, and feature importance ranking. The optimal random forest model was constructed by setting the “mtry” parameter to 3 and the “ntree” parameter to 120, giving an error rate of 0.1567. AJCC, American Joint Committee on Cancer.

Discussion

SEER is a database supported by the United States National Cancer Institute that compiles cancer statistics and makes the data available (20). It covers approximately 28 percent of the U.S. population (21) and is among the largest and most extensive databases of American cancer patients. It offers anonymized information on the following topics: patient demographics, tumor morphology, site, therapeutic regimen, stage at diagnosis, patient’s vital status, and follow-up data. An ongoing quality control program is used to reduce errors and ensure high-quality data. This study obtained data on CC patients diagnosed between 2007 and 2016 from SEER.

The ability to provide prognostic information in oncology has a big impact on clinical judgment. The AJCC “TNM” cancer staging system, which classifies malignancies according to tumor size (T), nodal involvement (N), and the existence of distant metastases (M), is one of the commonly used frameworks. While the TNM staging framework is still useful, efforts are being made to improve its accuracy by adding more variables, such as tumor grade and clinical parameters, to further improve prognostic estimations.

Recently, ML has been applied extensively in the medical field to diagnose illnesses and forecast outcomes accurately (22). Advances in data mining techniques combined with ML models are creating promising prediction approaches (11). Data mining converts raw healthcare data into meaningful information for predictions and decisions (23). Various techniques, such as LASSO, Cox regression, support vector machine (SVM), and RF (24-28), are currently used for data analysis and must be properly trained before being applied to unseen datasets. These methods have been successful in improving prediction accuracy, especially in predicting the survival of cancer patients. This research increased the accuracy of OS predictions by combining ML models, DL algorithms, and a wealth of real-world clinical data. Patient survival was assessed at 1, 3, and 5 years from the date of diagnosis. Data skewness is an inevitable problem with medical datasets, as it can lead to non-uniform target feature sampling and decrease the model’s generalizability. To address skewness in the dataset, we stratified patients by age and by tumor size so that the established model is more widely applicable. The data scale was standardized using the sixth edition of the AJCC standard. To forecast OS, we propose the DeepSurv DL model, which outperformed the other models, and we used the SHAP library to interpret it.

ML and DL models must be interpretable to be understood and trusted (29). Clinicians can effectively interpret model performance and data patterns by understanding the key features that influence outcomes, which enhances the quality of medical decisions and patient treatment. Interpretable ML and DL models can predict patient survival time in various medical specialties, aiding clinicians in informed decision-making. Using the Python SHAP library, we evaluated the influence of each feature on the model. SHAP analysis showed that AJCC_stage, AJCC_T, grade, and AJCC_M were the most important features in DeepSurv. The relevance of these features and their effect on the prognosis of patients with CC have previously been emphasized by other clinical studies (30-33), which is in line with our findings. For instance, our SHAP diagram indicated that a higher stage reduced survival time, while a lower stage had a positive impact on the number of survival months. This is consistent with other research that found that patients with CC who had a higher AJCC_stage, especially a higher AJCC_T stage, had a higher chance of dying. One study has also indicated that OS is comparable among three groups of patients with early-stage CC regardless of the nodal assessment technique used: pelvic lymphadenectomy (LND), sentinel node mapping with backup lymphadenectomy (SNM + LND), or sentinel node mapping alone (SNM). This suggests that conization and minimally invasive SNM are safe and effective (34). Furthermore, by using interpretable techniques like SHAP, we can better comprehend and explain the decisions and performance of the model. These methods also help ML models produce more understandable results compared to statistical models. The RF model, a popular model used in previous studies (35-37), prioritized variables such as AJCC_T, AJCC_stage, surgery, and AJCC_M, highlighting their importance in predicting the OS of patients with CC. The fact that both models placed AJCC_stage and AJCC_T in the top 3 indicates how crucial these two characteristics are in determining the survival of patients with CC. In the regression approach, the deep neural network (DNN) used by the DeepSurv model performed better than all the other models in every aspect. The CoxBoost model came in second by a small margin. By precisely estimating the number of years patients would live, the regressors can aid in precise patient treatment planning.

The C-index is a key metric in healthcare for evaluating a model’s predictive ability (38). It gauges the agreement between observed and predicted outcomes, providing a reliable performance measure. Using different feature sets, we compared the C-index of the five models. In ML, it is essential to measure overall performance. Therefore, in addition to the C-index, we computed the area under the ROC curve to evaluate survival predictions at 1, 3, and 5 years. Comparable results were observed, indicating a significant improvement in AUC of about 0.2 (39). Our findings align with Chen’s and Dong’s research, confirming the excellent performance of Cox analyses and RFs (15,35). Additionally, DNNs demonstrate promising OS prediction capabilities. The C-index value of the model constructed with our DeepSurv algorithm was 0.869, higher than that of the other four models. The estimated AUC values for 1-, 3-, and 5-year survival were 0.936, 0.915, and 0.900, respectively. These were the best values among all five models. In comparison, the nomogram model of CC developed by Zeng et al. (17) predicted patients’ survival at 3 and 5 years with an accuracy of 0.796 and 0.783, respectively, on the validation set. In the nomogram model of early CC constructed by Li et al. (39), the accuracy of predicting 1-, 2-, and 3-year survival on the test set was 0.768, 0.755, and 0.751, respectively. Previous studies used the traditional Cox model and nomogram to construct the prognostic survival model (14,15,40-42), while this study utilized ML and DL algorithms, resulting in a significantly improved AUC value compared to previous studies.
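The agreement that the C-index measures can be made explicit with a short sketch of Harrell’s C-index. A pair of patients is comparable when the one with the shorter follow-up had an observed event, and the pair is concordant when that patient also received the higher risk score. The function name and toy data below are hypothetical, and the R and Python packages used in this study may handle ties and censoring differently.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, event indicator, model risk score.
toy_times = [6, 10, 20, 40, 50, 70]
toy_events = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.6, 0.3, 0.1]
c_index = harrell_c_index(toy_times, toy_events, toy_scores)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the C-index values in Table 2 should be read.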

In a cohort of 5,566 CC patients, those receiving brachytherapy showed better OS (64.0% vs. 51.4%, P<0.001) (43). In the model we constructed, radiotherapy is also an important factor. Another study by Chen et al. identified marital status as an independent prognostic factor for CC, with marriage being associated with a more favorable prognosis (44). Marital status was also among the more important variables in our model, which is consistent with that report. In the DeepSurv model that we developed, age ranked fifth in importance, indicating its significance. Fan et al. considered age at diagnosis to be a significant risk factor influencing cancer prognosis (45), in line with our view. The studies conducted by Teshome et al. and Lin et al. reported that significant factors influencing the survival of advanced-stage CC patients included patient age, cancer stage, the presence of anemia, tumor size, platelet-to-albumin ratio (PAR), and waiting time for treatment (30,46). These findings underscore the importance of age and cancer stage for survival. In future work, we will continue to collect clinical data to examine the relationship of anemia, PAR, and treatment waiting time with survival.

Earlier statistical research did not use DL algorithms, whereas our study employed five algorithms, including DeepSurv, to predict CC patient survival. Most SEER-based studies have been conducted primarily using binary logistic regression methods. Furthermore, these studies have largely ignored the interpretability of the model. Neglecting these factors diminishes clinicians’ confidence in using the constructed models in practice. We developed a DL model and, by applying it, attained the best results in terms of C-index and AUC values. By predicting expected survival, this novel approach seeks to provide more accurate predictions, enabling precise and effective treatment planning for patients with CC.

Despite the encouraging results of our DL model, this study has a number of limitations. First, the data lack complete treatment information, such as the order in which treatments were administered, as well as patients’ genetic profiles. Incorporating such information, obtainable from hospital records or insurance databases, could enhance the dataset, leading to more accurate survival predictions and a more thorough evaluation of the model’s predictive capabilities. Second, given that the study is retrospective, patient selection might have been biased. Third, since no independent dataset was available, the models did not undergo external validation. To address this limitation, we aim to collect an external dataset similar to the current one in future work. Validating the best models against the new dataset would help confirm the robustness and reliability of the study and the generalizability of the models across different data contexts.

Conclusions

Our study utilized clinical data from the SEER database to develop five ML and DL models. These models are capable of predicting the 1-, 3-, and 5-year OS of patients with CC. The DeepSurv model developed in our research, made explainable with SHAP, incorporates clinical features and demographic information and demonstrated outstanding performance in terms of C-index and AUC values. This makes it a potentially valuable tool for clinical practice. Although the models demonstrated the ability to learn prognostic signals, further development and validation in larger datasets are needed to enhance predictive accuracy and improve their clinical utility. Utilizing this model can aid doctors in creating personalized treatment plans for CC patients, effectively allocating resources, and ultimately reducing patient anxiety and challenges.

Acknowledgments

None.

Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2304/rc

Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2304/prf

Funding: This work was supported by the Gusu Health Talent Project of Suzhou (No. GSWS2020003, to C.H.) and Jiangsu Provincial Medical Key Discipline Cultivation Unit (No. JSDW202242, to C.H.).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2304/coif). C.H. reports that this study was supported by the Gusu Health Talent Project of Suzhou (No. GSWS2020003), and Jiangsu Provincial Medical Key Discipline Cultivation Unit (No. JSDW202242). The other authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

Cite this article as: Jiang N, Xiong X, Chen X, Feng M, Guo Y, Hu C. Machine learning and deep learning to improve overall survival prediction in cervical cancer patients. Transl Cancer Res 2025;14(5):3057-3068. doi: 10.21037/tcr-2024-2304
