Cervical cancer molecular subtype identification and prognosis classification by a metabolism-related gene expression

Xiaohong Chen; Caixia Hong; Guohui Zhang; Bixian Lin; Jingyi Liu; Ronglong Wang; Chunbo Li

doi:10.21037/tcr-2024-2208

Original Article

Xiaohong Chen¹#, Caixia Hong¹#, Guohui Zhang¹, Bixian Lin¹, Jingyi Liu¹, Ronglong Wang¹, Chunbo Li²

¹Department of Obstetrics and Gynecology, Zhangzhou Affiliated Hospital of Fujian Medical University, Zhangzhou, China; ²Department of Obstetrics and Gynecology, Obstetrics & Gynecology Hospital of Fudan University, Shanghai Key Lab of Reproduction and Development, Shanghai Key Lab of Female Reproductive Endocrine Related Diseases, Shanghai, China

Contributions: (I) Conception and design: C Li, R Wang; (II) Administrative support: C Li, R Wang; (III) Provision of study materials or patients: None; (IV) Collection and assembly of data: X Chen, C Hong; (V) Data analysis and interpretation: X Chen, C Hong; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Ronglong Wang, MD. Department of Obstetrics and Gynecology, Zhangzhou Affiliated Hospital of Fujian Medical University, 9 Zhangma Road, Zhangzhou 363000, China. Email: earthfire1999@163.com; Chunbo Li, MD. Department of Obstetrics and Gynecology, Obstetrics & Gynecology Hospital of Fudan University, Shanghai Key Lab of Reproduction and Development, Shanghai Key Lab of Female Reproductive Endocrine Related Diseases, 419 Fangxie Road, Shanghai 200011, China. Email: lichunbo142@126.com.

Background: The high molecular heterogeneity of cervical cancer (CC) is a main focus of individualized therapy. Molecular classification may lead to personalized treatment and new drug discovery. We summarized the molecular features of CC by establishing a new classification based on metabolism-related gene expression profiles.

Methods: Clinical information and messenger ribonucleic acid (mRNA) expression data were downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO). The proportions of 22 immune cell types were estimated by the CIBERSORT method. A K-means clustering algorithm based on 258 metabolism-related genes was used for CC classification. Univariate and multivariate Cox regression analyses were carried out to identify the optimal metabolism-related genes. A predictive model was established to evaluate the overall survival (OS) of patients. Then, a nomogram was established to predict the OS of patients based on independent prognostic factors.

Results: Based on the expression profiles of 258 survival-related metabolic genes, we identified two metabolism-related subtypes of CC. The cluster_A subtype was characterized by prominent glucose metabolism and had a poor prognosis, whereas the cluster_B subtype exhibited high enrichment of lipid metabolism-related and immune-related signaling pathways. Then, seven metabolism-related genes (CYP4F12, NPL, CH25H, NOS2, SDR16C5, PGK1 and LYZ) were used to establish a metabolism-related risk signature. Patients in the high-risk group had a worse prognosis than those in the low-risk group. Multivariate Cox regression analysis indicated that the metabolism-related risk signature could predict OS as an independent prognostic factor.

Conclusions: Our study provides new insight into the metabolic heterogeneity of CC and its relationship with immune landscape. The novel metabolism-related gene signature is an effective potential prognostic signature in the individualized prognosis prediction of CC.

Keywords: Cervical cancer (CC); gene; metabolism; classification; prognosis

Submitted Nov 08, 2024. Accepted for publication Apr 29, 2025. Published online Oct 29, 2025.

Highlight box

Key findings

• We used seven significant metabolic-related genes to establish a novel independent prognostic signature, and found that the prognosis of patients in the high-risk group was generally poor.

What is known and what is new?

• Tumor cells often rely on altered metabolism (e.g., enhanced glucose uptake, lipid synthesis) to support proliferation, a phenomenon observed in various cancers.

• We establish a novel independent prognostic signature based on 258 metabolism-related genes, enabling risk stratification of cervical cancer patients.

What is the implication, and what should change now?

• The metabolic risk signature can potentially guide clinical decision-making by identifying patients at high risk of poor outcomes, warranting more aggressive intervention.

Introduction

Cervical cancer (CC) is the fourth most common gynecological malignancy worldwide (1). Although surgery and radiotherapy are recommended as the most effective treatments, about 30% of patients still develop progressive or recurrent tumors, ultimately leading to death (2). For clinical management and risk assessment, it is generally recommended to use the International Federation of Gynecology and Obstetrics (FIGO) guidelines (3). However, due to high molecular heterogeneity, patients with the same pathology may have different risks of recurrence and mortality. Therefore, there is an urgent need for a new CC classification that determines the prognosis of patients more specifically.

Cellular metabolism is an essential biochemical process for cell survival and proliferation. Emerging reports show that metabolic reprogramming promotes tumor growth and progression (4). Through crosstalk with stromal and immune cells in the tumor microenvironment, tumor cells typically outcompete adjacent normal cells for nutrients, produce immunosuppressive metabolites, and then induce immune dysfunction and tumor progression (5). Therefore, in recent years, research focusing on metabolic processes has become a very promising field of anti-tumor research. Further exploration has shown that changes in metabolic molecules may contribute to the development of cancer treatment. Several studies have classified hepatocellular carcinoma (HCC) and colorectal cancer into different subtypes based on metabolism-related gene expression (6,7). Each subtype has specific metabolic pathways and prognosis. Recently, Shang et al. reported a seven-gene signature for predicting the prognosis of CC and demonstrated the potential value of metabolic pathways associated with anti-tumor immunotherapy (8). However, a metabolism-based molecular classification of CC has not yet been proposed.

In this study, we propose a new metabolic molecular classification of CC based on the expression of metabolism-related genes. We evaluated the prognosis, transcriptome, metabolic genes, immune infiltration, clinical features, and gene mutations of the different subtypes. Then, we used seven significant metabolism-related genes to establish a new independent prognostic signature, and found that the prognosis of high-risk patients was generally poor. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2208/rc).

Methods

Microarray data

CC clinical and gene data (including mRNA expression and gene mutation data) were obtained from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/). Normal samples, repeated samples, and samples without critical clinical value were excluded. In addition, human CC mRNA expression data were downloaded from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) for validation. GSE29817 and GSE68339, together containing 272 CC samples, were selected to test the molecular classification of CC. Another dataset, GSE44001, with 300 CC samples, was used as the test set to verify the prognostic risk model. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Identification of CC subclusters

First, we obtained 2,752 metabolism-related genes from a previously published study by Possemato et al. (9). Before classification, a filtering step was carried out. Candidate genes with zero expression or a median absolute deviation (MAD) value less than 0.5 in the TCGA CC samples were excluded. A Cox proportional-hazards model was fitted with the “survival” R software package to screen for genes significantly associated with overall survival (OS). Then, metabolism-related genes with high variability in expression (MAD >0.5) and significant prognostic value (P<0.05) were selected for the following analysis. In the training and test datasets, K-means clustering was performed with R software (10). The value of k was determined by the cophenetic correlation coefficient, and the value indicated by the magnitude of this coefficient was selected as the most suitable number of clusters (11). Principal components analysis (PCA) was used to assess the difference between the two clusters.
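The filtering and clustering steps described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' R pipeline: the function names, the toy gene names and the deterministic initialization are hypothetical, and a plain Lloyd-style k-means is substituted for whatever R implementation was actually used.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of a sequence of numbers."""
    m = median(values)
    return median(abs(v - m) for v in values)


def filter_genes(expr, mad_cutoff=0.5):
    """Keep genes with non-zero expression and MAD above the cutoff.

    expr maps a gene name to its expression values, one value per sample.
    """
    return {
        gene: vals
        for gene, vals in expr.items()
        if any(v != 0 for v in vals) and mad(vals) > mad_cutoff
    }


def kmeans(points, k=2, max_iter=100):
    """Plain Lloyd-style k-means; each point is one sample's expression vector.

    Assumption for reproducibility: the first k points seed the centers.
    Returns one cluster label (0..k-1) per point.
    """
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign every sample to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy example with hypothetical gene names and values.
toy_expr = {
    "GENE_A": [1.0, 5.0, 2.0, 8.0],
    "GENE_B": [0.0, 0.0, 0.0, 0.0],
    "GENE_C": [3.0, 3.1, 3.0, 3.2],
}
kept = filter_genes(toy_expr)
```

In practice the retained genes would additionally be screened by univariate Cox regression (P<0.05) before clustering, a step this sketch omits.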

Functional and pathway enrichment analysis

Gene annotation enrichment analysis of metabolism-related genes [Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG)] was performed with the clusterProfiler R package or Metascape (https://metascape.org/) (12).

Gene set variation analysis (GSVA)

GSVA can determine specific pathways based on transcriptomic data (13). Using GSVA together with the “limma” package, each sample from the TCGA database was assigned a score for each signature. Differential analyses were then conducted on the signature scores. Signatures with a log₂ fold change (FC) >0.4 (adjusted P<0.05) were considered significantly different between groups.

Mutation differences of CC subclusters

The downloaded MAF files contain the mutation information of the training set, and the “maftools” package was used to evaluate the gene mutations in the CC subtypes (14).

Immune infiltration estimation of CC subclusters

CIBERSORT uses a support vector regression method to quantify cellular components from tissue gene expression profiles (15). It identifies 22 types of immune cells from gene expression data with high sensitivity and specificity. To evaluate the proportion of immune cells in CC samples, the CIBERSORT package was applied to estimate the abundance of the 22 immune cell types.

Construction and validation of the prognostic risk score signature

We established a prognostic model based on univariate and multivariate Cox regression analysis. According to the median risk score, CC samples were divided into a high-risk group and a low-risk group in the training set (TCGA) and the test set (GEO). The OS of the two groups was estimated by the Kaplan-Meier method with the survival and survminer R packages. In addition, receiver operating characteristic (ROC) analysis was performed, and the area under the curve (AUC) was calculated to assess model performance. Meanwhile, we collated the patients’ clinical information. Finally, the clinical data of all patients were analyzed by univariate and multivariate regression to determine whether the risk score was an independent predictor of CC prognosis.
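As an illustration of the Kaplan-Meier step, the product-limit estimator can be sketched in a few lines of Python. This is a minimal sketch, not the survival or survminer implementation used in the study; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    survival = 1.0
    curve = []
    # Only times at which at least one death occurred change the estimate.
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data: four patients, the second one censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is why the estimate is computed only at observed event times.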

Validation of the hub genes

To verify the difference in hub gene expression between CC and normal tissues, the GEPIA tool (http://gepia.cancer-pku.cn/) was applied using RNA-Seq data (16), with RNA expression measured in transcripts per million (TPM). Then, protein expression data were obtained from the Human Protein Atlas (HPA) database (17), which is the largest and most comprehensive database for evaluating the distribution of proteins in human tissues. Immunohistochemical staining images were used to determine the protein expression of the selected genes in normal and CC tissues. The effect of the seven metabolism-related genes on immune cell infiltration levels was evaluated with the Tumor Immune Estimation Resource (TIMER2, http://cistrome.shinyapps.io/timer/) database (18). GISTIC 2.0 data were utilized in TIMER.

Build a predictive nomogram

Based on the independent prognostic factors identified by multivariate Cox regression analysis, a nomogram was established with the RMS (https://cran.r-project.org/web/packages/rms/) package to predict OS probability at 1, 3 and 5 years. The performance of the nomogram was evaluated by discrimination and calibration.
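Discrimination is commonly summarized by a concordance index. The text does not state which discrimination measure was computed beyond the ROC analysis, so the following Python sketch of Harrell's C-index is an assumption offered only to make the concept concrete; the function name and toy inputs are hypothetical.

```python
def concordance_index(times, events, risk):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event and a
    shorter follow-up time than patient j. The pair is concordant when the
    patient who died earlier also has the higher predicted risk; ties in
    predicted risk count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data: higher risk scores paired with shorter survival.
c_index = concordance_index([1, 2, 3], [1, 1, 1], [3.0, 2.0, 1.0])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk. Calibration, by contrast, compares predicted with observed survival probabilities and is not covered by this sketch.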

Statistical analysis

Statistical analysis was performed using R software (version 3.2.3). To determine the relationship between risk score and pathological characteristics, the Chi-squared test or Fisher’s exact test was used for categorical variables. The correlation between two continuous variables was measured by Pearson’s correlation coefficient. A Cox regression model was used to estimate the hazard ratio (HR) with 95% confidence intervals (CIs). The Kaplan-Meier method was used for survival analysis, and the log-rank test was used to determine the statistical significance of differences. P<0.05 was considered statistically significant.
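For readers unfamiliar with the log-rank test, a two-group version can be sketched as follows. This is a minimal Python illustration rather than the R routine used in the study; the function name and the toy data are hypothetical, and the P value relies on the chi-squared distribution with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-squared statistic, P value).

    times_*  : follow-up times in each group
    events_* : 1 for an observed death, 0 for censoring
    Assumes at least one event time at which both groups are at risk.
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({t for t, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        # Observed deaths in group A minus those expected under the null.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data: group A dies early, group B dies late.
chi2, p_value = logrank_test([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 1])
```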

Results

Two metabolism-related subclusters were confirmed in CC

We collected 253 CC samples from TCGA, including RNA-sequencing expression information and clinical records, and 272 patients from GEO (GSE29817 and GSE68339). A total of 2,752 metabolism-related genes were selected as the basis for subsequent clustering analysis (available online: https://cdn.amegroups.cn/static/public/10.21037tcr-2024-2208-1.pdf). After preliminary filtering, 465 genes were discarded because their MAD was low or their expression was undetectable, leaving 2,287 genes for the following analysis. A univariate Cox proportional hazards model was used to identify metabolism-related genes with prognostic value. Our results indicated that 258 metabolism-related genes in TCGA were significantly associated with patient survival (available online: https://cdn.amegroups.cn/static/public/10.21037tcr-2024-2208-2.pdf). These genes were then used for k-means clustering (Figure 1A). The data showed that the cophenetic correlation coefficient decreased sharply when k=2 in the TCGA cohort (Figure S1A). In addition, when k=2, the consensus matrix heatmap kept clear boundaries (Figure S1B). PCA also supported this number of subclusters (Figure S1C). In the GEO cohort, we obtained consistent results (Figure S1D-S1F). Therefore, k=2 was designated as the most appropriate number of clusters in the TCGA cohort (Figure 1B) and the GEO cohort (Figure 1C). The 253 patients in TCGA and the 272 patients in GEO were each divided into two groups (cluster_A and cluster_B). GO analysis of the 258 significant genes showed enrichment of metabolic processes, localization, and biological regulation (Figure S2A,S2B). Specifically, cluster_A presented high expression levels of ABO, ATP1B3, CDIPT, CPT1A and DUOX1 (Figure 1D,1E), and GO analysis confirmed that these genes were related to monophosphate metabolic process, triphosphate metabolic process, ATP metabolic process and coenzyme metabolic process (Figure S2C). Survival analysis showed that patients in cluster_A had a poor prognosis (Figure 1F). Cluster_B showed high expression levels of A4GALT, CA9, COX4I1, ENO1 and ESD (Figure 1D,1E). GO analysis showed high enrichment of small molecule catabolic process, organic acid catabolic process, carboxylic acid catabolic process, fatty acid metabolic process and transition metal ion transport (Figure S2D). Survival analysis showed that the prognosis of patients in cluster_B was better (Figure 1F). Recent studies have reported a relationship between gene mutation and metabolism (8), so we further explored the differences in gene mutation between the two subclusters. The results showed several genes with high mutation frequencies in CC, such as MUC4, KMT2C, PIK3CA, and TTN (Figure 1G,1H), and the two clusters often had very different proportions of gene mutations. For example, cluster_A had a high number of MUC16 mutations, while cluster_B had a high number of PIK3CA and KMT2C mutations. However, the number of mutations did not differ significantly between the two groups (Figure S3).

Figure 1 Consensus clustering of CC and the identification of each cluster. (A) The flow chart of k-means clustering. (B) Consensus clustering (k=2) using 258 metabolism-related genes in 253 CC samples from TCGA as a training set. (C) Consensus clustering (k=2) using 258 metabolism-related genes in 272 CC samples from GEO as a test set. (D) Heatmap of the two clusters defined by metabolism-related gene expression in the training set. (E) Heatmap of the two clusters defined by metabolism-related gene expression in the test set. (F) Kaplan-Meier curves of the two clusters for survival time in 253 TCGA CC samples. Log-rank test presented an overall P<0.05. Oncoprint of the mutation status of genes in cluster_A (G) and cluster_B (H). CC, cervical cancer; CESC, cervical squamous cell carcinoma; GEO, Gene Expression Omnibus; M, metastasis; MAD, median absolute deviation; N, node; T, tumor; TCGA, The Cancer Genome Atlas.

Correlation of the CC clusters with metabolism-related signatures

In this study, CC classification was based on metabolism-related genes. Therefore, we further explored whether the subtypes have different metabolism-related signatures. First, all metabolism-related genes were listed and evaluated by GSVA. Each sample thereby obtained an individual score for each metabolism-related signaling pathway. Differential analysis was performed to find subtype-specific metabolic characteristics (Figure 2A,2B). Cluster_A showed enrichment of glycosaminoglycan biosynthesis, fructose and mannose metabolism, other glycan degradation, and one carbon pool by folate. The TGF-β signaling pathway was also enriched in cluster_A (Figure 2A,2B). Cluster_B was mainly characterized by high enrichment of lipid metabolism, such as fatty acid metabolism, glycerolipid metabolism, and ether lipid metabolism (Figure 2A,2B). Peroxisome proliferator-activated receptors (PPARs) are a group of nuclear receptor proteins that promote ligand-dependent transcription of target genes and play a pivotal role in nutrient sensing, metabolism, and lipid-related processes (19). Glycolysis is a major source of metabolic symbiosis, leading to the proliferation and metastasis of cancer (20). We then compared the enrichment of these pathways between the two clusters. Our results showed higher enrichment of the PPAR signaling pathway in cluster_B, while cluster_A presented a higher oxidative phosphorylation level (Figure 2C,2D). Meanwhile, cluster_B exhibited higher enrichment of the T cell receptor signaling pathway, indicating a potential immunological association (Figure 2C,2D).

Figure 2 The functional analysis of two clusters. (A) Heatmap of the two clusters defined by differential metabolic pathways in the training set. (B) Heatmap of the two clusters defined by differential metabolic pathways in the test set. (C) Violin plot showing the difference in PPAR signaling pathway (left), oxidative phosphorylation (middle) and T cell receptor signaling pathway (right) between cluster_A and cluster_B in the training set. (D) Violin plot showing the difference in PPAR signaling pathway (left), P53 signaling pathway (middle), and oxidative phosphorylation (right) between cluster_A and cluster_B in the test set. KEGG, Kyoto Encyclopedia of Genes and Genomes; M, metastasis; N, node; PPAR, peroxisome proliferator-activated receptor; T, tumor.

Then, we established a metabolic scoring system based on 256 significant metabolism-related genes. The results showed that cluster_A had a higher metabolic score than cluster_B (Figure S4A,S4B). Survival analysis showed that a high metabolic score was associated with poor prognosis (Figure S4C-S4E). Correlation analysis showed that the metabolic score was positively correlated with macrophages M0, activated mast cells, activated NK cells, resting NK cells, and activated dendritic cells. In addition, it was negatively correlated with activated memory CD4 T cells, CD8 T cells, gamma delta T cells, macrophages M1, resting mast cells and resting dendritic cells (Figure S5).

Correlation between metabolic subtypes and immune cells infiltration

CIBERSORT was used to evaluate tumor heterogeneity and calculate the scores of 22 immune cell types in the training set and test set. The network of these cells strongly suggested overall crosstalk among all immune cells (Figure 3A,3B). Then, we compared the immune cells of the two clusters. In the training set, cluster_A presented high infiltration of macrophages M0, activated mast cells and neutrophils, while cluster_B had high levels of resting mast cells, macrophages M1 and resting dendritic cells (Figure 3C). Similarly, in the test set, cluster_A also presented high infiltration of macrophages M0, neutrophils and activated mast cells, while cluster_B was enriched with resting mast cells and resting CD4⁺ memory T cells (Figure 3D). Immune checkpoint targeted therapy has entered the clinic to treat a variety of tumor types (21). We further investigated the association between subtypes and the expression of several potential immune checkpoint genes. In the training and test sets, cluster_B exhibited higher expression of common immune checkpoint genes than cluster_A, including PDCD1, IDO1, CTLA4, KLRG1, CD28, and TIGIT, suggesting a probable sensitivity to promising checkpoint inhibitors (Figure S6). Several studies have revealed that tumor-infiltrating mast cells can be found in a large number of cancer types and have tumor-promoting or anti-tumor effects (22). In our study, survival analysis showed that activated mast cells were significantly associated with poor prognosis, while resting mast cells were associated with a better prognosis (Figure 3E). These results were consistent with the classification by metabolism-related genes.

Figure 3 The characterization of immune cells and prognosis of CC between two clusters. (A) Cellular interaction of immune cells in training set. (B) Cellular interaction of immune cells in test set. (C) Boxplot of the abundance of stromal and immune cell population distinguished by the two clusters in training set. (D) Boxplot of the abundance of stromal and immune cell population distinguished by the two clusters in test set. (E) Kaplan-Meier curves of several immune cells (activated mast cells, macrophages M0, resting dendritic cells and resting mast cells) for survival time. CC, cervical cancer; NK, natural killer.

Construction of metabolism-related risk signature

To assess the prognostic value of the above metabolism-related genes, a univariate Cox regression analysis was performed. A total of 51 genes were associated with the OS of CC patients in the training set (Figure 4A). Then, we performed a LASSO analysis on the 51 significant genes, and 7 genes (CYP4F12, NPL, CH25H, NOS2, SDR16C5, PGK1 and LYZ) were identified as the metabolism signature (Figure 4B and Table S1). RNA sequencing data from the GEPIA tool showed that 5 genes (NOS2, NPL, SDR16C5, PGK1 and LYZ) were differentially expressed between cancer and normal tissues (Figure S7A). Meanwhile, immunohistochemistry from the HPA database also showed positive expression of CYP4F12, NOS2, SDR16C5, PGK1 and LYZ in tumor tissue (Figure S7B). We further found that the staining intensity and quantity of SDR16C5, PGK1 and LYZ in tumor tissue were higher than those in normal cervical tissue. Survival analysis confirmed that all seven genes were associated with the prognosis of patients (Figure S7C). To validate the correlation between the seven metabolism-related genes and the level of immune cell infiltration, we obtained the correlation coefficients between genes and immune cells using the TIMER database. The results showed that the majority of genes were associated with immune cells (Figure S8).

Figure 4 The identification of marker genes and construction of metabolism score. (A) Forest map showing the univariate Cox regression analysis of 51 significant genes (P<0.05). (B) Forest map showing the multivariate Cox regression analysis of 7 significant genes (P<0.05). (C) Information for 7 significant genes significantly related to overall survival. CI, confidence interval.

As previously described, the prediction model was determined by the linear combination of the seven genes, with their relative coefficients in multivariate Cox regression as follows: risk score = (−0.24589 × CYP4F12) + (−0.31893 × NPL) + (−0.26555 × CH25H) + (−0.53404 × NOS2) + (−0.35866 × PGK1) + (0.40006 × SDR16C5) + (−0.27741 × LYZ) (Figure 4C). Risk scores for each individual sample in the training set were calculated from the expression of the 7 genes. According to the median risk score, all patients in the training set were divided into a high-risk group (n=126) or a low-risk group (n=127). Then, we performed the same analysis on the GSE44001 data as the test set and divided it into a high-risk group (n=150) and a low-risk group (n=150). The heatmap revealed the expression pattern of the seven genes (Figure S9A,S9B). The expression levels of CH25H, CYP4F12, LYZ, NOS2, NPL and SDR16C5 were higher in the low-risk group, while the expression of PGK1 was higher in the high-risk group (Figure S9C,S9D). Further survival analysis showed that high-risk patients had poorer OS than low-risk patients in both the training set (Figure 5A) and the test set (Figure 5B). In addition, we ranked the risk scores and further analyzed the patients' survival status (Figure 5C-5F). The ROC curve was used to evaluate the capability of the seven genes to predict the prognosis of CC patients. In the training set, the AUC was 0.843 at 1 year and 0.785 at 3 years (Figure S10A). In the test set, the AUC was 0.632 at 1 year and 0.735 at 3 years (Figure S10B). The calibration plots (Figure S10C,S10D) showed good performance. Then, we examined the link between prognostic risk scores and immune cells. The high-risk group was enriched in activated mast cells, eosinophils and macrophages M0, while the low-risk group was enriched in CD8 T cells, macrophages M1, regulatory T cells and resting mast cells (Figure S11A,S11B). We also found that, as the risk score increased, the infiltration level of immune cells such as CD8 T cells and macrophages M1 decreased (Figure S11C,S11D).
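The risk score calculation and median split can be written out as a short Python sketch. The coefficients are copied as printed in the formula above; the function names, the expression values and the handling of scores exactly equal to the median (assigned here to the low-risk group) are assumptions made for illustration.

```python
from statistics import median

# Coefficients as printed in the risk score formula in the text.
COEFFICIENTS = {
    "CYP4F12": -0.24589,
    "NPL": -0.31893,
    "CH25H": -0.26555,
    "NOS2": -0.53404,
    "PGK1": -0.35866,
    "SDR16C5": 0.40006,
    "LYZ": -0.27741,
}


def risk_score(expression):
    """Linear combination of the seven genes' expression values.

    expression maps each gene symbol to that patient's expression level.
    """
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each score 'high' if above the cohort median, otherwise 'low'.

    Assumption: scores equal to the median go to the low-risk group.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical expression profiles for three patients.
patients = [
    {"CYP4F12": 2.1, "NPL": 3.0, "CH25H": 1.2, "NOS2": 0.4, "PGK1": 7.5, "SDR16C5": 2.2, "LYZ": 5.1},
    {"CYP4F12": 0.3, "NPL": 1.1, "CH25H": 0.2, "NOS2": 0.1, "PGK1": 9.0, "SDR16C5": 4.0, "LYZ": 1.0},
    {"CYP4F12": 1.0, "NPL": 2.0, "CH25H": 0.8, "NOS2": 0.2, "PGK1": 8.0, "SDR16C5": 3.0, "LYZ": 3.0},
]
scores = [risk_score(p) for p in patients]
groups = split_by_median(scores)
```

With an odd number of patients, this tie rule places the median patient in the low-risk group, which matches the 126/127 split reported for the 253 training samples.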

Figure 5 The prognosis of CC between high- and low-risk groups. Kaplan-Meier curves of OS for patients in the high- and low-risk groups in TCGA (A) and GSE44001 (B). The distribution of risk score, patient’s survival time and status in TCGA (C,E) and GSE44001 dataset (D,F). CC, cervical cancer; TCGA, The Cancer Genome Atlas.

Prognostic signature acts as an independent prognostic predictive factor

Finally, univariate and multivariate Cox regression analyses in TCGA further verified our prediction model with the addition of other common prognostic factors. Although univariate Cox analysis indicated that N stage had a prognostic effect (Figure 6A), only the seven-gene signature could be applied as an independent prognostic factor (Figure 6B). To quantitatively predict the prognosis of CC, we established a nomogram based on TCGA (Figure 6C), which integrated the potential prognostic variables from the Cox analysis. The results showed that the metabolism risk score was the leading factor in predicting OS. The calibration plots (Figure 6D) showed good performance. More importantly, the ROC curve showed satisfactory predictive sensitivity and specificity, with AUC =0.822 for 1 year and AUC =0.782 for 3 years (Figure 6E).

Figure 6 The construction of nomogram model. Forest map showing hazard ratios of risk score in univariate (A) and multivariate (B) Cox regression models combined with TCGA CC patient clinical characteristics. (C) The nomogram to predict survival possibilities. (D) The calibration plot of the nomogram to predict 1-, 3-, and 5-year survival rates. (E) The time-dependent ROC curve to predict 1-, 3-, and 5-year survival rates. AUC, area under the curve; CC, cervical cancer; M, metastasis; N, node; OS, overall survival; ROC, receiver operating characteristic; T, tumor; TCGA, The Cancer Genome Atlas.

Discussion

Several CC classifications based on gene expression signatures have been proposed in recent years, but consensus has not yet been reached at the molecular level. There is therefore an urgent need to stratify patients by prognosis. We propose a comprehensive CC classification based on 2,752 metabolism-related genes. Our results confirmed that CC could be divided into two different metabolism-related subtypes, and the reproducibility of this metabolic classification was validated in the GEO database. In addition, we constructed an independent prognostic signature based on seven significant metabolism-related genes. By linking tumor-node-metastasis (TNM) stage and risk score, we demonstrated that the novel metabolism-related prognostic signature can be used to predict the prognosis of CC.

In recent years, increasing evidence suggests a potential relationship between energy metabolism and tumor survival and growth (23). To understand the significance of tumor metabolism, we divided all CC patients into two subtypes. We observed that cluster_A is mainly involved in glucose-related metabolism, such as glycolysis/gluconeogenesis, glyoxylate and dicarboxylate metabolism, and glycosaminoglycan biosynthesis. Despite the lower efficiency of glycolysis, tumor cells remain highly active in glycolysis, which supplies energy and various metabolites to tumor cells in a short period of time. Some studies have reported that total lesion glycolysis seriously affects the recurrence-free survival and OS of CC patients (24). In addition, it is well known that glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is a classical glycolytic enzyme that is significantly increased in CC (25). In this study, we found that GAPDH was highly expressed in cluster_A, indicating high glycolytic activity. Then, we explored the relationship between immune cell infiltration and metabolism. Increased glycolysis can lead to an acidic microenvironment, influence immune cell infiltration, and contribute to the survival of cancer cells (26). Cascone et al. reported that patients with melanoma or non-small cell lung cancer who have very high glycolysis show relatively poor immune cell infiltration, with reductions in cytotoxic T cells, memory T cells, macrophages, T helper cells or NK cells (27). In breast cancer, Li et al. showed that high glycolysis was usually accompanied by high immune scores (28). However, their research did not fully elucidate the anti-tumor function of immune cells, and concluded only that high glycolysis is associated with immunosuppressive properties (28). In our study, we observed high levels of macrophage M0 and activated mast cell infiltration in cluster_A. It has been reported that activated mast cells might promote tumor invasion by releasing a series of matrix metalloproteinases (MMPs) (29). They can also induce tumor angiogenesis by releasing a broad range of pro-angiogenic factors, such as fibroblast growth factor 2 (FGF2), platelet-derived growth factor (PDGF), vascular endothelial growth factor (VEGF) and interleukin-6 (IL-6). The interactions between these cells may lead to poor prognosis (30).

Cluster_B is mainly involved in drug metabolism, and liquid metabolism (e.g., fatty acid metabolism, glycerolipid metabolism and linoleic acid metabolism). Aberrant lipid metabolism in cancer cell contributes to the formation of an immunosuppressive microenvironment that supports cancer progression (31). Targeted therapy for enzymes related to aberrant lipid metabolism is a brand-new therapeutic strategy that can benefit CC patients. In addition, we found that in immune related pathways in cluster_B were highly enriched, such as NK cells mediated cytotoxicity, T cell receptor signaling pathway, and Toll like receptor signaling pathway, which may explain the better prognosis of patients in cluster_B. Similarly, we observed a significant infiltration of resting mast cells, macrophages M1 and activated dendritic cells (DCs) in cluster_B. As we all know, the increased M1 macrophages secretes a series of pro-inflammatory factors to maintain the chronic inflammatory microenvironment and activate T cell adaptation immune response, exerting anti-tumor effects (32). Yan et al. reported that macrophages M1 not only plays an important role in the blocking effect of programmed cell death protein 1 (PD-1)/programmed death-ligand 1 (PD-L1), but also is also associated with improved tumor prognosis in gastric cancer (33). DCs are not only the strongest antigen presenting cells, but also key cells that regulate activation and tolerance (34). Recent experiments have shown that when DCs are activated, necessary and coordinated change in lipid metabolism occur, and the disruption of cancer lipid metabolism is associated with a decrease in immune stimulation ability (35). Meanwhile, we found that cluster_B is associated with the expression of immune checkpoint genes such as PDCD1, TIGIT, KLRG1 and CTLA4, which indicated that it is more effective to immunotherapy. 
These results indicated that cluster_A showed high enrichment of glycolysis metabolism, while cluster_B exhibited high enrichment of lipid metabolism and immune-related signaling pathways.

Then, we constructed a metabolism-related risk signature based on seven genes (CYP4F12, NPL, CH25H, NOS2, SDR16C5, PGK1 and LYZ) and found significant differences in OS between the high-risk and low-risk groups. Considering other clinical factors, we developed a nomogram to predict individual clinical outcomes. The nomogram is a statistical prediction model that assigns points to each factor, such as age, grade, and TNM stage, in the clinical setting. By summing the points for all factors, the model provides each individual with a numerical probability of a clinical outcome, and it has been proven to be a stable and reliable tool for tumor prognosis. These genes will help us better explore the mechanisms of metabolism in CC, and may become biomarkers for metabolism-targeted therapy in CC. Although the value of these genes in CC is not yet clear, some studies suggest a correlation between metabolism-related genes and various malignant tumors. In addition, some studies report that high expression of these genes is associated with tumor metabolism and immunity. CYP4F12 belongs to the cytochrome P450 4F subfamily and is involved in the biosynthesis and degradation of steroids, vitamins, fatty acids, arachidonic acid, prostaglandins, amines, pheromones, and plant metabolites (36). Eun et al. reported that higher expression of CYP4F12 is related to better OS in patients with hepatocellular carcinoma (HCC) (37). Zhao et al. demonstrated that lung cancer patients with high expression of CYP4F12 have a better prognosis (38). CH25H catalyzes the formation of 25-hydroxycholesterol and is traditionally regarded as an important regulator of cholesterol homeostasis through inhibition of SREBP and activation of LXR (39). Mittempergher et al. found that CH25H expression is associated with the prognosis of cancer patients, making it an independent predictor of prognosis, especially poor prognosis, which is consistent with our data in CC (40).
Inducible nitric oxide synthase (iNOS), encoded by NOS2, can be expressed in almost any cell or tissue and serves many physiological functions. When expressed in macrophages, it produces a large amount of NO, resulting in severe cytotoxic responses. Its expression is associated with inflammation and epithelial cell growth (41). Some studies suggest that iNOS is a harmful enzyme whose overexpression can lead to endothelial dysfunction and inflammation (42). Phosphoglycerate kinase 1 (PGK1) is a key rate-limiting enzyme in the process of aerobic glycolysis. PGK1 is associated with cancer metastasis, and its high expression often predicts poor survival outcomes (43). A recent study reported that high expression of PGK1 in renal cell cancer is often accompanied by an increase in glycolysis-related enzymes. PGK1 can exhibit pro-tumorigenic properties in vitro and in xenograft tumor models by accelerating glycolysis and inducing CXCR4-mediated AKT and ERK phosphorylation (44). Lysozyme (LYZ) is an antimicrobial protein found in the granules of neutrophils and macrophages and in a variety of biological fluids. A recent study reported that elevated plasma lysozyme levels in obesity were associated with hyperglycemia, insulin resistance, dyslipidemia and inflammatory parameters (45). Interestingly, another study demonstrated that LYZ is expressed in human subcutaneous and visceral adipose tissue (46). According to reports, SDR16C5 and NPL are mainly involved in hormone metabolism. One study showed that SDR16C5 can promote retinoic acid biosynthesis when expressed in mammalian cells (47). NPL can regulate the cellular concentration of sialic acid and prevent the recycling of sialic acid for further sialylation of glycoconjugates (48). Although these genes have not yet been reported in CC, their potential role as diagnostic markers in the development of CC warrants further exploration.
In addition, our study also found that these genes may be involved in immune cell infiltration, which needs further research to confirm.
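As a rough illustration of how a seven-gene risk signature of this kind is typically applied, the Python sketch below computes a risk score as a weighted sum of expression values and splits patients at the cohort median. The coefficient values, the median cutoff, and the names `risk_score` and `stratify` are assumptions made for this example only; they are not the fitted coefficients or the cutoff used in this study.

```python
from statistics import median

# The seven signature genes named in the text.
SIGNATURE_GENES = ["CYP4F12", "NPL", "CH25H", "NOS2", "SDR16C5", "PGK1", "LYZ"]

# Placeholder coefficients for illustration only (NOT the fitted values
# from the study; signs and magnitudes carry no biological meaning).
EXAMPLE_COEFS = {gene: 0.1 for gene in SIGNATURE_GENES}


def risk_score(expression, coefs):
    """Return the weighted sum of expression values over the signature genes."""
    missing = [gene for gene in coefs if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(scores):
    """Label each patient 'high' or 'low' risk using the cohort median as cutoff.

    The median split is an assumption; other cutoffs are possible.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}


if __name__ == "__main__":
    # Toy cohort with made-up expression values.
    cohort = {
        "patient_1": {gene: 1.0 for gene in SIGNATURE_GENES},
        "patient_2": {gene: 2.0 for gene in SIGNATURE_GENES},
        "patient_3": {gene: 3.0 for gene in SIGNATURE_GENES},
    }
    scores = {pid: risk_score(expr, EXAMPLE_COEFS) for pid, expr in cohort.items()}
    print(stratify(scores))
```

In practice the coefficients would come from the fitted regression model, and the resulting high-risk and low-risk groups would be compared with a survival analysis.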

Our preliminary results provide insights into the role of metabolism in CC. Some strengths of this study should be emphasized, such as the larger sample size, verification by multiple methods, and the application of logistic regression and several machine learning methods. Although our model links metabolism closely to the prognosis of CC patients, further studies are still needed to strengthen these findings. Our prediction model is based on online databases and lacks validation by biochemical experiments. In addition, some of these genes have not been reported in CC. Our work reflects only a few specific aspects of metabolic reprogramming. Thus, experimental data on most of these genes are needed to demonstrate their correlation. Meanwhile, to confirm the predictive value of the model, additional datasets and larger prospective studies are still needed to evaluate its clinical relevance.

Conclusions

We identified two metabolism-related subtypes of CC and further evaluated the differences in infiltrating immune cells and signaling pathways between the two subtypes, providing new insights for the combination of anti-metabolites and immunotherapy. We then developed a seven-gene metabolic signature that was associated with multiple types of infiltrating immune cells and effectively predicted OS in CC patients. Combining clinical characteristics with the risk score further improved the predictive performance for CC prognosis.

Acknowledgments

None.

Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2208/rc

Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2208/prf

Funding: None.

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2208/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2024;74:229-63.
2. Brücher BLDM. The Erosion of Healthcare and Scientific Integrity: A Growing Concern. J Healthc Leadersh 2025;17:23-43.
3. Wright JD, Matsuo K, Huang Y, et al. Prognostic Performance of the 2018 International Federation of Gynecology and Obstetrics Cervical Cancer Staging Guidelines. Obstet Gynecol 2019;134:49-57.
4. Martínez-Reyes I, Chandel NS. Cancer metabolism: looking forward. Nat Rev Cancer 2021;21:669-80.
5. Rabelo ILA, Arnaud-Sampaio VF, Adinolfi E, et al. Cancer Metabostemness and Metabolic Reprogramming via P2X7 Receptor. Cells 2021;10:1782.
6. Yang C, Huang X, Liu Z, et al. Metabolism-associated molecular classification of hepatocellular carcinoma. Mol Oncol 2020;14:896-913.
7. Zhang M, Wang HZ, Peng RY, et al. Metabolism-Associated Molecular Classification of Colorectal Cancer. Front Oncol 2020;10:602498.
8. Shang C, Huang J, Guo H. Identification of an Metabolic Related Risk Signature Predicts Prognosis in Cervical Cancer and Correlates With Immune Infiltration. Front Cell Dev Biol 2021;9:677831.
9. Possemato R, Marks KM, Shaul YD, et al. Functional genomics reveal that the serine synthesis pathway is essential in breast cancer. Nature 2011;476:346-50.
10. Brunet JP, Tamayo P, Golub TR, et al. Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci U S A 2004;101:4164-9.
11. Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 2010;26:1572-3.
12. Zhou Y, Zhou B, Pache L, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun 2019;10:1523.
13. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 2013;14:7.
14. Mayakonda A, Lin DC, Assenov Y, et al. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res 2018;28:1747-56.
15. Yoshihara K, Shahmoradgoli M, Martínez E, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun 2013;4:2612.
16. Tang Z, Li C, Kang B, et al. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res 2017;45:W98-W102.
17. Thul PJ, Lindskog C. The human protein atlas: A spatial map of the human proteome. Protein Sci 2018;27:233-44.
18. Li T, Fan J, Wang B, et al. TIMER: A Web Server for Comprehensive Analysis of Tumor-Infiltrating Immune Cells. Cancer Res 2017;77:e108-10.
19. Porcuna J, Mínguez-Martínez J, Ricote M. The PPARα and PPARγ Epigenetic Landscape in Cancer and Immune and Metabolic Disorders. Int J Mol Sci 2021;22:10573.
20. El Sayed R, Haibe Y, Amhaz G, et al. Metabolic Factors Affecting Tumor Immunogenicity: What Is Happening at the Cellular Level? Int J Mol Sci 2021;22:2142.
21. Dhar R, Seethy A, Singh S, et al. Cancer immunotherapy: Recent advances and challenges. J Cancer Res Ther 2021;17:834-44.
22. Sabit H, Arneth B, Abdel-Ghany S, et al. Beyond Cancer Cells: How the Tumor Microenvironment Drives Cancer Progression. Cells 2024;13:1666.
23. Bian X, Liu R, Meng Y, et al. Lipid metabolism and cancer. J Exp Med 2021;218:e20201606.
24. Lane AN, Higashi RM, Fan TW. Metabolic reprogramming in tumors: Contributions of the tumor microenvironment. Genes Dis 2020;7:185-98.
25. Sun X, Shu Y, Ye G, et al. Histone deacetylase inhibitors inhibit cervical cancer growth through Parkin acetylation-mediated mitophagy. Acta Pharm Sin B 2022;12:838-52.
26. Gore M, Kabekkodu SP, Chakrabarty S. Exploring the metabolic alterations in cervical cancer induced by HPV oncoproteins: From mechanisms to therapeutic targets. Biochim Biophys Acta Rev Cancer 2025;1880:189292.
27. Cascone T, McKenzie JA, Mbofung RM, et al. Increased Tumor Glycolysis Characterizes Immune Resistance to Adoptive T Cell Therapy. Cell Metab 2018;27:977-987.e4.
28. Li C, Chen T, Li Y, et al. Impact of diabetes and metformin on cuproptosis and ferroptosis in breast cancer patients: an immunohistochemical analysis. Discov Oncol 2025;16:634.
29. Gonzalez-Avila G, Sommer B, García-Hernández AA, et al. Matrix Metalloproteinases' Role in Tumor Microenvironment. Adv Exp Med Biol 2020;1245:97-131.
30. Zheng R, Li F, Li F, et al. Targeting tumor vascularization: promising strategies for vascular normalization. J Cancer Res Clin Oncol 2021;147:2489-505.
31. Cui MY, Yi X, Zhu DX, et al. Aberrant lipid metabolism reprogramming and immune microenvironment for gastric cancer: a literature review. Transl Cancer Res 2021;10:3829-42.
32. Cozac-Szőke AR, Cozac DA, Negovan A, et al. Immune Cell Interactions and Immune Checkpoints in the Tumor Microenvironment of Gastric Cancer. Int J Mol Sci 2025;26:1156.
33. Yan R, Yang X, Wang X, et al. Association Between Intra-Tumoral Immune Response and Programmed Death Ligand 1 (PD-L1) in Gastric Cancer. Med Sci Monit 2019;25:6916-21.
34. Herber DL, Cao W, Nefedova Y, et al. Lipid accumulation and dendritic cell dysfunction in cancer. Nat Med 2010;16:880-6.
35. Fernández LP, Gómez de Cedrón M, Ramírez de Molina A. Alterations of Lipid Metabolism in Cancer: Implications in Prognosis and Treatment. Front Oncol 2020;10:577420.
36. Murray GI. The role of cytochrome P450 in tumour development and progression and its potential in therapy. J Pathol 2000;192:419-26.
37. Eun HS, Cho SY, Lee BS, et al. Profiling cytochrome P450 family 4 gene expression in human hepatocellular carcinoma. Mol Med Rep 2018;18:4865-76.
38. Zhao M, Li M, Chen Z, et al. Identification of immune-related gene signature predicting survival in the tumor microenvironment of lung adenocarcinoma. Immunogenetics 2020;72:455-65.
39. Zhao J, Chen J, Li M, et al. Multifaceted Functions of CH25H and 25HC to Modulate the Lipid Metabolism, Immune Responses, and Broadly Antiviral Activities. Viruses 2020;12:727.
40. Mittempergher L, Saghatchian M, Wolf DM, et al. A gene signature for late distant metastasis in breast cancer identifies a potential mechanism of late recurrences. Mol Oncol 2013;7:987-99.
41. Ma S, Sun X, Yu Q, et al. Dihydropyridine-coumarin-based fluorescent probe for imaging nitric oxide in living cells. Photochem Photobiol Sci 2020;19:1230-5.
42. Król M, Kepinska M. Human Nitric Oxide Synthase-Its Functions, Polymorphisms, and Inhibitors in the Context of Inflammation, Diabetes and Cardiovascular Diseases. Int J Mol Sci 2020;22:56.
43. Archid R, Solass W, Tempfer C, et al. Cachexia Anorexia Syndrome and Associated Metabolic Dysfunction in Peritoneal Metastasis. Int J Mol Sci 2019;20:5444.
44. Li L, Bai Y, Gao Y, et al. Systematic Analysis Uncovers Associations of PGK1 with Prognosis and Immunological Characteristics in Breast Cancer. Dis Markers 2021;2021:7711151.
45. Latorre J, Lluch A, Ortega FJ, et al. Adipose tissue knockdown of lysozyme reduces local inflammation and improves adipogenesis in high-fat diet-fed mice. Pharmacol Res 2021;166:105486.
46. Taube JM, Young GD, McMiller TL, et al. Differential Expression of Immune-Regulatory Genes Associated with PD-L1 Display in Melanoma: Implications for PD-1 Pathway Blockade. Clin Cancer Res 2015;21:3969-76.
47. Adams MK, Lee SA, Belyaeva OV, et al. Characterization of human short chain dehydrogenase/reductase SDR16C family members related to retinol dehydrogenase 10. Chem Biol Interact 2017;276:88-94.
48. Arero GB, Ozmen O. Effects of heat stress on reproduction and gene expression in sheep. Anim Reprod 2025;22:e20240067.

Cite this article as: Chen X, Hong C, Zhang G, Lin B, Liu J, Wang R, Li C. Cervical cancer molecular subtype identification and prognosis classification by a metabolism-related gene expression. Transl Cancer Res 2025;14(10):6207-6221. doi: 10.21037/tcr-2024-2208
