Multi-algorithm machine learning combined with in silico gene knockout reveals the diagnostic value and functional regulatory networks of ferroptosis-related genes in gastric cancer
Highlight box
Key findings
• Five ferroptosis-related genes (AKR1C1, CTSB, EZH2, IDO1, TIMP1) were identified as key diagnostic markers for gastric cancer (GC), and a multi-algorithm machine learning model showed high and stable diagnostic performance across multiple cohorts.
What is known and what is new?
• Ferroptosis is involved in the development and progression of GC.
• This study integrates large-scale machine learning screening with in silico gene knockout and single-cell analysis to define a robust ferroptosis-related diagnostic signature.
What is the implication, and what should change now?
• The identified gene signature may support early and accurate diagnosis of GC. Ferroptosis-related pathways represent potential targets for future mechanistic and translational studies.
Introduction
Gastric cancer (GC) is one of the most common malignancies of the digestive system worldwide, with persistently high incidence and mortality rates and a particularly heavy public health burden in East Asia (1). Early diagnosis of GC is critical for improving patient prognosis and reducing mortality. At early stages, GC is often curable by surgical resection, with a 5-year survival rate approaching 90% (2). Despite recent advances in early screening, surgical techniques, chemotherapy, targeted therapy, and immunotherapy, overall treatment outcomes remain unsatisfactory (3). Most patients are diagnosed at advanced stages, and the marked biological heterogeneity of GC contributes to therapy resistance and disease recurrence, resulting in persistently low long-term survival rates (4,5). Therefore, systematically elucidating the molecular mechanisms underlying GC development and identifying prognostic biomarkers and therapeutic targets with clinical potential remain critical scientific challenges in GC research and clinical practice.
Ferroptosis is a form of programmed cell death first proposed in 2012, characterized by iron-dependent accumulation of lipid peroxides, involving iron homeostasis imbalance, enhanced lipid peroxidation, and impaired antioxidant defense systems (6,7). Key molecular events include GPX4 inactivation, glutathione depletion, and excessive reactive oxygen species generation (8). Recent studies have demonstrated that ferroptosis plays a crucial regulatory role in tumor initiation, progression, metastasis, and treatment response (9). In GC, ferroptosis-related genes (FRGs) are involved in cell proliferation, metabolic reprogramming, and regulation of sensitivity to chemotherapy and immunotherapy (10-12). Furthermore, classical oncogenic signaling pathways such as PI3K/AKT, p53, and NRF2 intersect with ferroptosis regulatory networks, influencing GC cell survival and malignant behaviors (8,13,14). Although prior studies have revealed functional roles for some FRGs in GC, their overall expression profiles, gene interaction networks, and clinical applicability remain incompletely characterized in a systematic and integrative manner.
Therefore, this study integrates bulk transcriptomic and single-cell datasets, evaluates more than 100 combinations of machine learning algorithms to construct a multi-gene diagnostic model, and uses in silico gene knockout analysis to systematically investigate the functional regulatory networks of FRGs. With this approach, we identify key FRGs and core hub nodes in GC and characterize their potential cell-type-specific expression patterns and functional interactions, providing a reference for molecular subtyping, risk assessment, and potential therapeutic targets in GC. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0035/rc).
Methods
Data sources and preprocessing
Transcriptome data of GC used in this study were obtained from the Gene Expression Omnibus database (https://www.ncbi.nlm.nih.gov/geo/). Specifically, GSE184336 included 231 GC tumor samples and 230 adjacent normal samples; GSE54129 contained 111 GC tumor samples and 21 normal samples; GSE13911 comprised 38 primary GC tumor samples and matched adjacent normal tissues. All datasets were background-corrected and normalized prior to downstream analyses. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Differential expression analysis
Differentially expressed genes (DEGs) between tumor and normal samples in GSE184336 were identified using the limma package in R. The screening criteria were |log2 fold change (FC)| >0.5 and adjusted P<0.05. Hierarchical clustering and heatmaps were applied to visualize expression patterns across samples, and volcano plots were generated to depict the distribution of DEGs.
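The screening rule can be illustrated with a minimal Python sketch. The study itself used limma in R; the function name `select_degs`, the record layout, and the toy values below are hypothetical and serve only to show how the two thresholds are applied together.

```python
def select_degs(results, lfc_cutoff=0.5, padj_cutoff=0.05):
    """Split genes passing |log2FC| > lfc_cutoff and adjusted P < padj_cutoff.

    `results` maps gene symbol -> (log2 fold change, adjusted P value).
    Returns (up-regulated genes, down-regulated genes), each sorted.
    """
    up, down = [], []
    for gene, (log2fc, padj) in results.items():
        if padj >= padj_cutoff or abs(log2fc) <= lfc_cutoff:
            continue  # fails at least one screening criterion
        (up if log2fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Toy values for illustration only; not taken from GSE184336.
toy_results = {
    "GENE_A": (1.20, 0.001),   # up-regulated and significant
    "GENE_B": (-0.80, 0.010),  # down-regulated and significant
    "GENE_C": (0.30, 0.001),   # effect size below the cutoff
    "GENE_D": (2.00, 0.200),   # not significant after adjustment
}
up_genes, down_genes = select_degs(toy_results)
print(up_genes, down_genes)
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant adjusted P value (GENE_D in the toy data) is excluded.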
Weighted gene co-expression network analysis (WGCNA)
A weighted gene co-expression network was constructed using all genes in GSE184336 to identify modules highly correlated with GC phenotype. The soft-thresholding power was set to 7 to satisfy the scale-free topology criterion. Subsequently, hierarchical clustering based on the topological overlap matrix (TOM) was performed to cluster similar genes into modules, and module eigengenes (ME) were correlated with clinical groups (tumor vs. normal). Key modules were defined as |R| >0.3 and P<0.05, and hub genes within modules were selected with module eigengene connectivity (KME) >0.6.
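The effect of the soft-thresholding power can be sketched in a few lines of Python. This assumes an unsigned network, in which adjacency is the absolute Pearson correlation raised to the chosen power; the source does not state the network type, and the helper names and toy expression vectors are hypothetical. The actual analysis was run with WGCNA in R.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_adjacency(expr, power=7):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** power for all gene pairs."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** power
        for g in genes
        for h in genes
        if g != h
    }


# Toy expression vectors (hypothetical): G1 and G2 are perfectly correlated,
# G1 and G3 have a correlation of 0.8.
toy_expr = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.0],
    "G3": [1.0, 3.0, 2.0, 4.0],
}
adjacency = soft_adjacency(toy_expr, power=7)
print(adjacency[("G1", "G2")], adjacency[("G1", "G3")])
```

Raising correlations to a power of 7 keeps strong co-expression close to 1 while shrinking a correlation of 0.8 to about 0.21, which is how soft thresholding down-weights weaker connections without imposing a hard cutoff.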
Ferroptosis-related hub gene identification
A list of 583 FRGs was compiled based on previously published literature (15). The intersection of these genes with WGCNA key module genes and DEGs was used to identify potential ferroptosis-related hub genes for subsequent analyses.
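The intersection step amounts to a three-way set operation, sketched here in Python. The function name and the placeholder gene symbols are hypothetical; the real inputs are the 583 FRGs, the WGCNA key module genes, and the DEGs described above.

```python
def intersect_candidates(frgs, module_genes, degs):
    """Return genes present in all three lists, sorted alphabetically."""
    return sorted(set(frgs) & set(module_genes) & set(degs))


# Placeholder symbols for illustration only.
toy_frgs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
toy_module_genes = ["GENE_B", "GENE_C", "GENE_E"]
toy_degs = ["GENE_C", "GENE_B", "GENE_F"]

candidates = intersect_candidates(toy_frgs, toy_module_genes, toy_degs)
print(candidates)
```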
Protein-protein interaction (PPI) network construction and functional enrichment analysis
The STRING database (https://string-db.org/) was employed to construct PPI networks for the selected hub genes, and network topological features (e.g., node degree) were calculated to identify core genes. Network visualization was performed using the igraph and ggraph packages in R. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted using the clusterProfiler package to investigate biological processes, cellular components, molecular functions, and potential pathways. Significance was set at P<0.05.
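Node degree, the topological feature used here to rank core genes, can be computed directly from an interaction edge list. The following Python sketch is illustrative only: the study used igraph and ggraph in R, and the function name, node labels, and edges below are hypothetical.

```python
def node_degrees(edges, nodes=()):
    """Count distinct interaction partners per node in an undirected network.

    Duplicate edges (in either orientation) and self-loops are ignored.
    Nodes listed in `nodes` but absent from `edges` get degree 0.
    """
    degree = {n: 0 for n in nodes}
    seen = set()
    for a, b in edges:
        pair = frozenset((a, b))
        if a == b or pair in seen:
            continue
        seen.add(pair)
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree


# Hypothetical edge list; P5 has no interactions.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P1"), ("P3", "P4")]
degrees = node_degrees(toy_edges, nodes=["P1", "P2", "P3", "P4", "P5"])

# Rank by descending degree, breaking ties alphabetically.
ranking = sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranking)
```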
Construction and validation of multi-gene diagnostic models
Multi-gene diagnostic models were constructed in the training set using various feature selection methods combined with 19 machine learning algorithms (including lasso, Stepglm, glmBoost, Enet, random forest, svmRadial, svmLinear, k-nearest neighbors, gradient boosting machine, plsRglm, lda, knn, neural network, naive_bayes, pls, stepQDA, glmStepAIC, stepLDA, and LogitBoost). Among these, Lasso, Stepglm, glmBoost, and Enet are known for their excellent feature selection capabilities. By combining these feature-selection algorithms with the others, a total of 120 algorithm combinations were generated. GSE184336 was used as the training set, while GSE54129 and GSE13911 served as external validation sets to evaluate model generalizability. Receiver operating characteristic (ROC) curves and area under the curve (AUC) values were used to evaluate performance. Nomograms were constructed for individualized risk prediction, and calibration curves and decision curve analysis (DCA) were applied to assess clinical utility. To assess the relative contribution of each model gene to predictive performance, an interpretability analysis was performed on the model using Shapley Additive exPlanations (SHAP). By calculating the Shapley value for each gene, the contribution of each gene to individual risk predictions was quantified, and a gene-contribution ranking plot was generated.
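For readers less familiar with the AUC metric, it equals the probability that a randomly chosen tumor sample receives a higher model score than a randomly chosen normal sample, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison; the function name and toy scores are hypothetical, and this is not the implementation used in the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    `labels` uses 1 for tumor and 0 for normal; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy predicted scores and true labels (hypothetical).
toy_scores = [0.90, 0.80, 0.40, 0.35, 0.10]
toy_labels = [1, 1, 0, 1, 0]
auc = roc_auc(toy_scores, toy_labels)
print(round(auc, 3))
```

In the toy data, five of the six tumor-normal pairs are ordered correctly, giving an AUC of 5/6.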
Gene set variation analysis (GSVA)
To explore the association between model genes and key tumor-related pathways, gene sets related to ferroptosis and epithelial-mesenchymal transition (EMT) were obtained from MSigDB (https://www.gsea-msigdb.org/gsea/msigdb). GSVA enrichment analysis was performed on the transcriptomic data of TCGA GC samples using the GSVA package in R, yielding enrichment scores for each pathway in every sample. Subsequently, the correlation between model gene expression and GSVA enrichment scores was calculated to assess the potential regulatory roles of the genes in ferroptosis and EMT pathways.
Single-cell transcriptome analysis
The single-cell dataset GSE163558 initially contained 10 samples, and 4 non-metastatic samples were included for analysis. Quality control criteria were set as nFeature_RNA >200 and <5,500, and mitochondrial gene proportion (percent.mt) <20%. A total of 14,499 high-quality cells underwent dimensionality reduction and unsupervised clustering, with t-distributed stochastic neighbor embedding (t-SNE) used for visualization of cell subpopulations. Cell type annotation was performed based on known marker genes, and the expression specificity of diagnostic model core genes across cell types was analyzed. Analysis was conducted using the Seurat package, with cell annotation assisted by singleR and the CellMarker2 database (http://117.50.127.228/CellMarker/).
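The quality control criteria translate into a simple per-cell filter. The Python sketch below mirrors the stated thresholds; the analysis itself was performed with Seurat in R, and the function name and toy cells are hypothetical.

```python
def passes_qc(n_feature_rna, percent_mt,
              min_features=200, max_features=5500, max_percent_mt=20.0):
    """True if a cell has >200 and <5,500 detected genes and <20% mitochondrial reads."""
    return (min_features < n_feature_rna < max_features) and (percent_mt < max_percent_mt)


# Toy cells: (number of detected genes, mitochondrial percentage). Hypothetical values.
toy_cells = {
    "cell_1": (1500, 5.0),   # passes all criteria
    "cell_2": (150, 3.0),    # too few detected genes
    "cell_3": (6000, 4.0),   # too many detected genes (possible doublet)
    "cell_4": (2500, 35.0),  # high mitochondrial fraction
}
kept = [cell for cell, (nf, mt) in toy_cells.items() if passes_qc(nf, mt)]
print(kept)
```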
In silico gene knockout analysis
Based on the single-cell transcriptome data, in silico knockout analysis was performed for the five model genes to evaluate their potential impact on cellular transcriptional profiles. The scTenifoldKnk package was used to extract epithelial cell populations for virtual gene knockout, allowing investigation of gene function and underlying mechanisms.
Cell culture
GC cell lines AGS (#CL-0022) and HGC-27 (#CL-0107), as well as the normal gastric epithelial cell line GES-1 (#CL-0563), were purchased from Wuhan Pricella Biotechnology Co., Ltd. Cells were cultured in RPMI-1640 medium containing 10% fetal bovine serum (FBS) at 37 ℃ with 5% CO2.
Cell transfection
HGC-27 cells were cultured to 70–80% confluence. AKR1C1 overexpression (oe-AKR1C1) was achieved via plasmid vector transfection using Lipofectamine 3000 according to the manufacturer’s instructions. Knockdown of CTSB, EZH2, IDO1, and TIMP1 was performed using specific siRNAs (si-CTSB, si-EZH2, si-IDO1, si-TIMP1), with a control siRNA (si-NC) included for each group. After 48 hours of transfection, cells were collected for quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) and Western blotting to validate the gene expression modulation.
Cell Counting Kit-8 (CCK-8)
HGC-27 cells after transfection were trypsinized, counted, and resuspended to a concentration of 1×10⁵ cells/mL. A total of 5,000 cells were seeded into each well of a 96-well plate. According to the CCK-8 kit instructions, 10 µL of CCK-8 reagent was added to each well at 24, 48, and 72 hours, followed by incubation at 37 ℃ for 2 hours. The absorbance was then measured at 450 nm using a microplate reader.
qRT-PCR
Total RNA was extracted from cells using TRIzol reagent (Beyotime, Shanghai, China). RNA concentration and purity were measured with a NanoDrop 2000. RNA was then reverse-transcribed into cDNA using the PrimeScript™ RT kit. mRNA expression levels were detected using SYBR Green on a qRT-PCR system, with glyceraldehyde-3-phosphate dehydrogenase (GAPDH) as the internal control. Relative expression levels were calculated using the 2^(−ΔΔCt) method.
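The relative quantification step can be written out as a short worked example in Python. The function name and the Ct values are hypothetical; GAPDH is the reference gene, as stated above.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference), computed separately per group
    ddCt = dCt(sample) - dCt(control)
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_sample - d_ct_control
    return 2 ** (-dd_ct)


# Hypothetical Ct values: target gene vs GAPDH in treated and control cells.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,    # dCt = 6
    ct_target_control=26.0, ct_ref_control=18.0,  # dCt = 8
)
print(fold_change)  # ddCt = -2, so expression is 4-fold higher than control
```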
Western blotting
Total protein was extracted from cells using radioimmunoprecipitation assay (RIPA) lysis buffer, and protein concentrations were measured using the bicinchoninic acid (BCA) assay (Beyotime). Equal amounts of protein (20 µg) were separated on 10% sodium dodecyl sulfate‑polyacrylamide gel electrophoresis (SDS-PAGE) gels and transferred onto polyvinylidene difluoride (PVDF) membranes. Membranes were blocked with 5% skim milk at room temperature for 1 hour, followed by overnight incubation at 4 ℃ with the following primary antibodies: AKR1C1 (1:5,000, #ab203834, Abcam, Cambridge, UK), CTSB (1:1,000, #ab214428, Abcam), EZH2 (1:1,000, #ab191250, Abcam), IDO1 (1:1,000, #ab211017, Abcam), TIMP1 (1:1,000, #ab211926, Abcam), GLS (1:1,000, #ab156876, Abcam), GLUL (1:1,000, #ab176562, Abcam), CPS1 (1:1,000, #ab129076, Abcam), p53 (1:1,000, #ab32049, Abcam), p21 (1:1,000, #ab109520, Abcam), Cyclin D1 (1:10,000, #ab134175, Abcam), MMP2 (1:1,000, #ab92536, Abcam), MMP9 (1:1,000, #ab76003, Abcam), N-cadherin (1:5,000, #ab76011, Abcam), Vimentin (1:1,000, #ab92547, Abcam), E-cadherin (1:1,000, #ab40772, Abcam), and GAPDH (1:10,000, #ab181602, Abcam). The following day, membranes were washed three times with tris-buffered saline with Tween-20 (TBST) and incubated with horseradish peroxidase (HRP)-conjugated secondary antibody Goat Anti-Rabbit IgG H&L (1:10,000, #ab6721, Abcam) at room temperature for 1 hour. Protein bands were visualized using enhanced chemiluminescence (ECL), and results were analyzed with ImageJ software. Protein expression levels were normalized to GAPDH by calculating the ratio of the target protein to GAPDH.
Statistical analysis
All statistical analyses were conducted in R (version 4.3.1) or GraphPad Prism (version 10.1.3). Results were presented as mean ± standard deviation from at least three independent experiments. Comparisons between two groups were performed using the Student’s t-test. For comparisons among multiple groups, one-way analysis of variance (ANOVA) followed by Tukey’s post hoc test was applied. Statistical significance was set at P<0.05. Specific analytical methods and packages used are described in detail in the corresponding subsections.
Results
Identification of ferroptosis-related key modules and genes in GC
To systematically characterize transcriptional differences between tumor and normal tissues and to further identify key gene modules closely associated with ferroptosis, this study employed an integrated multi-level analysis combining limma differential expression analysis and WGCNA. First, the GSE184336 dataset was normalized, and differential expression analysis between tumor and normal groups was conducted using the limma package, resulting in the identification of 7,670 DEGs. Hierarchical clustering heatmaps demonstrated that these DEGs could effectively distinguish tumor samples from normal samples, indicating a clear separation in global transcriptional profiles (Figure 1A). Volcano plots further illustrated the distribution of upregulated and downregulated DEGs (Figure 1B). To systematically identify co-expression gene modules highly correlated with the GC phenotype, WGCNA was performed on all genes in GSE184336. Soft-thresholding power selection indicated that a power of 7 satisfied the scale-free network criteria while maintaining a reasonable average connectivity, and this value was therefore chosen to construct the weighted co-expression network (Figure 1C). Subsequently, hierarchical clustering based on the TOM identified nine co-expression modules distinguished by different colors, with each module demonstrating good stability in terms of gene number and clustering structure (Figure 1D). The correlations between MEs and clinical grouping (tumor vs. normal) were then evaluated. Results revealed that multiple modules were significantly associated with tumor status: MEBrown and MEBlue modules showed significant positive correlations, whereas MEBlack and METurquoise modules were negatively correlated with tumor tissues (Figure 1E).
To further pinpoint genes closely linked to ferroptosis, a curated set of 583 FRGs was compiled from published literature (Table S1). Intersection analysis among WGCNA key module genes, DEGs, and the FRG set identified 53 candidate key genes (Figure 1F). These intersecting genes were considered potential core regulators of ferroptosis in GC.
PPI network features and functional pathways of ferroptosis-related key genes
To further explore potential synergistic relationships among ferroptosis-related key genes from a systems biology perspective, a PPI network was constructed based on the 53 intersecting genes, and their biological characteristics were assessed through functional enrichment analysis. The PPI network demonstrated a tightly connected and hierarchically organized structure, with several nodes exhibiting high connectivity (Figure 2). Quantitative analysis of network topology based on node degree (Table 1) revealed that HIF1A and TGFB1 had the highest connectivity (degree =20), followed by HMOX1 (degree =13), CD44 (degree =12), and CAV1 (degree =12), indicating strong network centrality. NQO1 and CYBB both had a degree of 10, whereas NOX4 (degree =9), TIMP1 (degree =8), and EZH2 (degree =8) also occupied important regulatory nodes. These top ten genes formed the core hub cluster of the PPI network, suggesting they may play key synergistic roles in ferroptosis regulation.
Table 1
| Node | Degree |
|---|---|
| HIF1A | 20 |
| TGFB1 | 20 |
| HMOX1 | 13 |
| CD44 | 12 |
| CAV1 | 12 |
| NQO1 | 10 |
| CYBB | 10 |
| NOX4 | 9 |
| TIMP1 | 8 |
| EZH2 | 8 |
| MUC1 | 7 |
| GJA1 | 7 |
| TGFBR1 | 7 |
| IDO1 | 7 |
| DUOX1 | 7 |
| AKR1C1 | 5 |
| HNF4A | 5 |
| NCF2 | 5 |
| GPT2 | 5 |
| SOCS1 | 5 |
| CTSB | 4 |
| HSPB1 | 4 |
| AKR1C3 | 4 |
| AKR1C2 | 4 |
| CBR1 | 4 |
| ALOX5 | 3 |
| NEDD4L | 3 |
| CA9 | 3 |
| CYGB | 2 |
| HIC1 | 2 |
| GOT1 | 1 |
| PANX1 | 1 |
| FZD7 | 1 |
| RGS4 | 0 |
GO and KEGG functional enrichment analyses were performed for the 53 FRGs. GO biological process analysis (Figure S1A) showed that these genes were primarily enriched in reactive oxygen species metabolism, oxidative stress response, and inflammation-related processes, including reactive oxygen species metabolic process, response to oxidative stress, superoxide metabolic process, inflammatory response to wounding, and wound healing, highlighting their role in oxidative damage regulation and inflammatory responses. Notably, terms related to tissue remodeling were also significantly enriched, reflecting potential functions in dynamic tissue structural changes. In terms of cellular components, enriched terms included apical plasma membrane, basolateral plasma membrane, microvillus membrane, nicotinamide adenine dinucleotide phosphate (NADPH) oxidase complex, and focal adhesion, suggesting key gene localization in membrane-associated redox reactions, cell adhesion, and signal transduction. Molecular function analysis revealed significant enrichment in oxidoreductase activity, aldo-keto reductase (NADPH) activity, antioxidant activity, and heme binding, further indicating their central role in iron homeostasis and lipid peroxidation regulation. KEGG pathway enrichment analysis (Figure S1B) indicated that the ferroptosis pathway was among the top significantly enriched pathways, directly confirming the strong association between the identified intersecting genes and ferroptosis. Additionally, multiple pathways related to hypoxia, oxidative stress, and inflammation were significantly enriched, including the HIF-1 signaling pathway, NOD-like receptor signaling pathway, and AGE-RAGE signaling pathway. Enrichment of metabolic pathways such as Arachidonic acid metabolism and Arginine biosynthesis suggested potential roles in lipid and amino acid metabolic reprogramming. 
Enrichment in tumor-related pathways, such as Proteoglycans in cancer and FoxO signaling pathway, further supports the potential biological significance of these ferroptosis-related key genes in GC development and progression.
Construction, validation, and clinical evaluation of a ferroptosis-related diagnostic model
Based on the previously identified ferroptosis-related key genes and network analysis, we further evaluated their diagnostic value for tumor identification and constructed a stable and reliable multi-gene diagnostic model. Using the training dataset, multiple feature selection strategies combined with various machine learning classification algorithms were applied to comprehensively compare the discriminative performance of different algorithm combinations. The results demonstrated notable differences in the performance of the models in distinguishing tumor from normal samples (Figure 3A). Considering both the AUC performance in the training and validation sets and model stability, the glmBoost + pls combination was selected as the optimal diagnostic model for subsequent analyses, achieving an average AUC of 0.965. In the training set, the model exhibited strong discriminative capability, with an ROC AUC of 0.934 (Figure 3B). To further validate the model’s generalizability and cross-cohort robustness, it was applied to two independent external validation cohorts, GSE54129 and GSE13911. The model achieved an AUC of 0.991 in GSE54129 and 0.969 in GSE13911 (Figure 3B), demonstrating excellent and stable tumor identification performance across different datasets and sample backgrounds, indicating strong robustness. Subsequently, the diagnostic performance of the five ferroptosis-related feature genes within the model (AKR1C1, CTSB, EZH2, IDO1, and TIMP1) was individually evaluated. These genes showed high discriminative ability in both the training set and external validation cohorts (Figure 3C). Among them, TIMP1 consistently achieved the highest single-gene AUC across multiple cohorts, followed by CTSB and EZH2, suggesting that certain FRGs also have potential standalone diagnostic value. 
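The reported average AUC is consistent with the three cohort-level values, assuming it is the unweighted mean across the training set and the two external validation sets (the source does not state how the average was computed). A short Python check:

```python
# Cohort-level AUCs as reported in the text for the glmBoost + pls model.
cohort_auc = {
    "GSE184336 (training)": 0.934,
    "GSE54129 (validation)": 0.991,
    "GSE13911 (validation)": 0.969,
}

# Unweighted mean across cohorts (an assumption about the averaging scheme).
mean_auc = sum(cohort_auc.values()) / len(cohort_auc)
rounded_mean_auc = round(mean_auc, 3)
print(rounded_mean_auc)
```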
For model performance assessment, calibration curve analysis indicated good agreement between predicted probabilities and observed outcomes, with the bias-corrected curve closely aligning with the ideal reference line, demonstrating strong calibration ability (Figure 3D). Additionally, DCA showed that the model provided higher standardized net benefits than either “intervene-all” or “intervene-none” strategies over a wide range of threshold probabilities (Figure 3E), further supporting its potential utility in clinical decision-making.
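The quantity plotted in a decision curve is the net benefit at a given threshold probability pt, conventionally defined as TP/N − (FP/N) × pt/(1 − pt); the standardized version divides this by disease prevalence. The Python sketch below computes it for a model and for the two reference strategies. The function name and toy data are hypothetical, and this is not the code used to produce Figure 3E.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit = TP/N - FP/N * pt / (1 - pt) at threshold probability pt.

    A sample is classified as positive when its predicted probability
    is at least `threshold`; `labels` uses 1 for tumor and 0 for normal.
    """
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy predicted probabilities and true labels (hypothetical).
toy_probs = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
pt = 0.5

nb_model = net_benefit(toy_probs, toy_labels, pt)
# "Intervene-all": every sample is treated as positive.
nb_all = net_benefit([1.0] * len(toy_labels), toy_labels, pt)
# "Intervene-none": no true or false positives, so net benefit is zero.
nb_none = 0.0

prevalence = sum(toy_labels) / len(toy_labels)
standardized_nb_model = nb_model / prevalence
print(nb_model, nb_all, nb_none, standardized_nb_model)
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies, which is the pattern described for Figure 3E.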
Diagnostic contribution and biological relevance of model genes
To enhance clinical applicability, a nomogram was constructed based on the multi-gene diagnostic model for individual-level tumor risk assessment (Figure 4A). Results indicated that, except for CTSB, the remaining four model genes contributed to the total risk score to varying degrees (P<0.05). Elevated expression of EZH2, IDO1, and TIMP1, as well as decreased expression of AKR1C1, corresponded to higher tumor risk. Further analysis of expression differences between tumor and normal tissues showed that TIMP1, CTSB, IDO1, and EZH2 were significantly upregulated in tumor tissues, whereas AKR1C1 was significantly downregulated (all P<0.0001, Figure 4B). Moreover, SHAP analysis demonstrated the relative contribution of each gene to the model’s predictive performance, highlighting TIMP1 and AKR1C1 as top contributors (Figure 4C). GSVA further revealed correlations between these model genes and key tumor-related pathways, including ferroptosis and EMT. As shown in Figure 4D, all five genes were positively associated with ferroptosis, whereas for EMT, AKR1C1 exhibited a significant negative correlation, while the remaining four genes showed significant positive association.
Cellular heterogeneity at the single-cell transcriptomic level and cell-type-specific expression of key genes
To further explore the cellular origin and potential functional basis of the model genes at single-cell resolution, a total of 14,499 high-quality cells were obtained from single-cell transcriptomic data. After dimensionality reduction and unsupervised clustering, 15 distinct cell subpopulations were clearly identified in the t-SNE space (Figure 5A). Based on the expression patterns of cluster-specific marker genes, these subpopulations exhibited clear and biologically meaningful transcriptional characteristics (Figure 5B). Using classical cell-type marker genes, these subpopulations were annotated into eight major cell types, including epithelial cells, stromal cells, T cells, B cells, NK cells, myeloid cells, fibroblasts, and mast cells (Figure 5C). Cross-validation with multiple known marker genes further confirmed the accuracy and reliability of the cell-type annotations (Figure 5D,5E). While the proportion of different cell types varied among samples, the overall distribution trend was consistent (Figure 5F), indicating good stability of cellular composition and providing a reliable basis for subsequent differential gene expression analysis. At the single-cell level, the cell-type-specific expression patterns of the five key model genes were systematically evaluated. t-SNE distribution and violin plot analyses (Figure 5G,5H) revealed that TIMP1, CTSB, and EZH2 displayed relatively broad expression across multiple cell types. Specifically, TIMP1 and CTSB were significantly enriched in myeloid cells and fibroblasts, whereas EZH2 expression was mainly concentrated in T cells and NK cells. In contrast, AKR1C1 and IDO1 showed generally low expression across all cell types, without evident cell-type-specific enrichment.
In silico knockout analysis reveals potential regulatory networks and functional pathways of model genes
To further investigate the potential intracellular regulatory mechanisms of the five model genes, in silico knockout (virtual gene deletion) analysis was performed using single-cell transcriptomic data. Virtual knockout of AKR1C1, CTSB, EZH2, IDO1, and TIMP1 resulted in varying degrees of transcriptional remodeling, indicating that these genes play critical roles in maintaining cellular homeostasis and associated functional networks (Figure 6A-6E, Table S2). Specifically, AKR1C1 knockout caused significant changes in the expression of multiple metabolism-related genes. GSVA analysis indicated that the downstream alterations were primarily enriched in the Nitrogen metabolism pathway (Figure 6F), suggesting that AKR1C1 may participate in disease-related biological processes by regulating metabolic reprogramming. In contrast, the virtual knockouts of CTSB, IDO1, and TIMP1 exhibited similar functional pathway alteration patterns. GSVA revealed that deletion of these three genes significantly affected Cornified envelope formation, extracellular matrix (ECM)-receptor interaction, and Focal adhesion pathways, indicating potential synergistic regulation in ECM remodeling, cell adhesion, and tissue structure maintenance (Figure 6F). Notably, EZH2 knockout displayed distinct functional characteristics compared to the other model genes. EZH2 deletion led to significant enrichment in cell cycle, progesterone-mediated oocyte maturation, and p53 signaling pathway, all classic pathways related to cell cycle and proliferation regulation (Figure 6F). This suggests that EZH2 primarily exerts its biological function by modulating cell cycle progression and genomic stability.
Expression and functional validation of model key genes in GC cell lines
To validate the expression patterns and functional roles of the model-selected key genes in GC, two GC cell lines (AGS and HGC-27) and one normal gastric epithelial cell line (GES-1) were selected for mRNA and protein-level detection. qRT-PCR results showed that, compared with GES-1, AKR1C1 expression was significantly downregulated in AGS and HGC-27 cells (P<0.01), whereas CTSB, EZH2, IDO1, and TIMP1 were markedly upregulated (P<0.01, Figure 7A). Protein-level detection revealed trends consistent with the mRNA results: AKR1C1 was decreased in GC cells, while CTSB, EZH2, IDO1, and TIMP1 were significantly increased (P<0.001, Figure 7B).
For further functional validation, gene overexpression or knockdown experiments were conducted in HGC-27 cells. AKR1C1 was overexpressed (oe-AKR1C1), while CTSB, EZH2, IDO1, and TIMP1 were knocked down using siRNAs (si-CTSB, si-EZH2, si-IDO1, si-TIMP1). qRT-PCR and Western blot analyses confirmed that the expression levels of each gene were significantly altered in the treated groups (P<0.01, Figure S2A). CCK-8 assays demonstrated that AKR1C1 overexpression or knockdown of CTSB, EZH2, IDO1, or TIMP1 significantly inhibited cell proliferation (P<0.001, Figure S2B). Validation of the virtual knockout analysis revealed that AKR1C1 overexpression significantly upregulated energy metabolism-related proteins GLS, GLUL, and CPS1 (P<0.01, Figure 8A), suggesting that it might suppress tumor proliferation by regulating energy metabolism. EZH2 knockdown led to upregulation of p53 and p21 and downregulation of Cyclin D1 (P<0.01, Figure 8B), indicating its involvement in cell cycle arrest. Knockdown of CTSB, IDO1, or TIMP1 resulted in decreased levels of MMP2, MMP9, N-cadherin, and Vimentin, while E-cadherin was increased (P<0.01, Figure 8C-8E), suggesting that these genes positively regulated EMT and migratory capacity in GC cells.
Discussion
GC is a highly complex disease driven by multi-level molecular events. Its molecular heterogeneity is not only intrinsic to tumor cells themselves but is also profoundly influenced by the tumor microenvironment (TME), metabolic reprogramming, and inflammatory-immune states (16). Although numerous studies have attempted to identify potential diagnostic or prognostic biomarkers from transcriptomic profiles or individual signaling pathways, such approaches often suffer from limited stability and reproducibility, thereby restricting their clinical translational value. In recent years, ferroptosis, a form of programmed cell death characterized by lipid peroxidation and iron-dependent oxidative damage, has been increasingly recognized as playing a critical role in GC (17). However, integrative research frameworks that systematically combine ferroptosis-related molecular features with clinical diagnostic models and mechanistic investigations remain relatively scarce. In this study, by integrating differential expression analysis, WGCNA, multi-algorithm machine learning modeling, and single-cell-level functional inference, we established a ferroptosis-related molecular diagnostic system with both strong biological plausibility and clinical interpretability, providing a novel perspective for understanding the molecular heterogeneity of GC.
Previous studies have revealed potential links between ferroptosis and GC progression from multiple perspectives. For instance, dysregulation of iron homeostasis, enhanced oxidative stress, and aberrant lipid metabolism have been reported to be closely associated with GC development, all of which constitute fundamental biological processes underlying ferroptosis (18-20). In the present study, we first employed co-expression network analysis to identify key modules that were highly correlated with GC phenotypes within a whole-transcriptome context. These modules were further intersected with DEGs and curated FRG sets, effectively reducing the noise introduced by conventional differential expression analysis alone. WGCNA enables the identification of functionally coherent gene modules that are stably associated with disease states within complex transcriptional networks (21). The ferroptosis-related intersecting genes identified by this strategy exhibited highly interconnected topological features in the PPI network, with core nodes predominantly involved in oxidative stress responses, inflammatory signaling, and ECM regulation. These findings are highly consistent with the biological foundations of ferroptosis and further validate the reliability of our gene-selection strategy from a network perspective.
In recent years, machine learning approaches have been widely applied to the construction of molecular diagnostic models in oncology. However, different algorithms vary substantially in their sensitivity to feature selection and data structure, and reliance on a single algorithm often leads to overfitting or limited generalizability (22). Comparative evaluation and integration of multiple algorithms can improve model robustness across heterogeneous datasets, yet such systematic strategies remain underexplored in ferroptosis-based diagnostic modeling. In this study, we performed a comprehensive multi-algorithm evaluation and selected the model with the highest diagnostic performance. Previous studies showed that the ferroptosis-related single gene LANCL2 achieved an AUC of only 0.648 for the diagnosis of GC (23). In contrast, our machine learning-based multi-gene diagnostic model achieved an AUC of 0.934 for GC, markedly outperforming the previously reported ferroptosis-related multi-gene diagnostic model (AUC =0.891) (7). Importantly, the diagnostic model was further validated across multiple independent external cohorts, demonstrating consistently high performance. These results indicate that the multi-gene diagnostic model based on ferroptosis-related molecular signatures exhibits strong cross-cohort robustness, maintaining stable discriminative power across different sample sources and technical platforms, thereby reinforcing its potential clinical applicability.
Notably, the five genes ultimately incorporated into the diagnostic model in this study were not only statistically robust but also exhibited clear disease relevance in terms of their expression patterns and biological functions. AKR1C1 is an NADPH-dependent reductase involved in steroid metabolism, maintenance of redox homeostasis, and cellular detoxification processes (24). In the present study, AKR1C1 was significantly downregulated in GC tissues, a finding consistent with the report by Zhou et al. (25), suggesting that reduced AKR1C1 expression may contribute to early metabolic dysregulation and aberrant oxidative stress during gastric carcinogenesis. In recent years, AKR1C1 has been repeatedly identified as an FRG, functioning to eliminate lipid peroxidation products and sustain NADPH-dependent antioxidant capacity, thereby conferring resistance to ferroptosis in tumor cells (26-28). Building upon these observations, we further found that high AKR1C1 expression was significantly associated with activation of oxidative phosphorylation pathways, while showing a negative correlation with ECM remodeling and related signaling pathways. These results suggest that AKR1C1 may enhance the metabolic adaptability and survival advantages of tumor cells by maintaining mitochondrial metabolic homeostasis and suppressing aberrant matrix remodeling.
CTSB is a lysosome-associated cysteine protease that plays a critical role in ECM degradation, inflammatory regulation, and remodeling of the TME. Previous studies have indicated that serum CTSB levels in women may serve as a potential predictive biomarker for GC (29), highlighting its clinical relevance at an early stage of gastric tumorigenesis. Further tissue-based investigations have demonstrated that CTSB is significantly upregulated in GC tissues, and its high expression is closely associated with poorer overall survival and enhanced immune cell infiltration (30), supporting its role as a molecular indicator of unfavorable prognosis in GC. Mechanistically, CTSB has recently been identified as a regulator of ferroptosis, with its tumor-promoting effects through immune microenvironment remodeling validated in prostate cancer (31). In addition, CTSB is involved in autophagy-dependent inflammasome activation (32), underscoring its central role at the intersection of lysosomal function, inflammatory responses, and cell death regulation. In our study, high CTSB expression was significantly associated with activation of cell cycle progression, DNA replication, and multiple inflammation-related signaling pathways. Moreover, single-cell transcriptomic analysis revealed that CTSB was predominantly enriched in myeloid cells and fibroblast subsets within the TME, suggesting that its functional impact may extend beyond tumor cells themselves. Rather than acting solely within tumor cells, CTSB may indirectly influence ferroptosis-related processes by modulating the inflammatory milieu and proliferative activity within the TME.
EZH2 is the core catalytic subunit of polycomb repressive complex 2 (PRC2) and mediates transcriptional repression of target genes at the epigenetic level through trimethylation of histone H3 at lysine 27 (H3K27me3). It is widely recognized as a classical oncogene in GC. Extensive evidence has demonstrated that EZH2 is aberrantly overexpressed in gastric tissues and GC, where it promotes tumor initiation and progression through multiple signaling pathways (33-35). Mechanistically, EZH2 can form complexes with various long non-coding RNAs (lncRNAs) and be recruited to the promoters of tumor suppressor genes, inducing transcriptional silencing via H3K27me3 modification. Through this mechanism, EZH2 regulates a broad spectrum of tumor-associated biological processes, including cell cycle progression, cell proliferation and growth, migration, invasion, metastasis, and chemoresistance (36). In addition, EZH2 functions as a critical node within miRNA regulatory networks; for example, miR-137 suppresses GC cell proliferation by directly targeting EZH2 (37). In recent years, the potential role of EZH2 in ferroptosis regulation has attracted increasing attention. Previous studies have shown that EZH2 cooperates with the PRC1 component CBX2 to enhance H3K27 trimethylation and suppress ferroptosis-related processes, thereby promoting resistance of GC cells to 5-fluorouracil (5-FU) (38). Meanwhile, combined treatment with pharmacological EZH2 inhibitors and ferroptosis inducers has been demonstrated to markedly inhibit tumor growth in both in vitro and in vivo models (39), suggesting that EZH2 may serve as an important therapeutic target linking epigenetic regulation to ferroptosis-based treatment strategies. 
Building on these findings, our study further revealed that high EZH2 expression was significantly associated with cell cycle progression and activation of the p53 signaling pathway, whereas virtual knockdown of EZH2 primarily affected pathways related to cell proliferation and DNA replication. These results indicate that, in GC, EZH2 may not function as a direct executor of ferroptosis; instead, it may indirectly modulate tumor cell sensitivity to ferroptotic stimuli by sustaining a highly proliferative state and active cell cycle dynamics.
IDO1 is a key metabolic enzyme that has attracted considerable attention in the field of tumor immunoregulation. Its role in mediating immunosuppression and promoting tumor immune evasion through tryptophan catabolism has been well established. In recent years, multiple studies have further highlighted the important clinical relevance of IDO1 in GC. Wang et al. reported that IDO1 may serve as a potential diagnostic biomarker for GC (40). Moreover, in patients receiving neoadjuvant chemotherapy, the combined assessment of IDO1 expression and CD8+ T-cell infiltration has been shown to provide prognostic value in preoperative GC specimens (41), underscoring its critical role in shaping the tumor immune microenvironment. Beyond its immunomodulatory functions, IDO1 has recently been implicated in metabolic regulation related to ferroptosis. In glioblastoma, IDO1 suppresses ferroptosis and promotes tumor progression by regulating FTO-mediated m6A modification and enhancing the stability of SLC7A11 mRNA (42). In addition, IDO1-driven activation of the aryl hydrocarbon receptor can upregulate NRF2 signaling and enhance pentose phosphate pathway activity, thereby exerting anti-ferroptotic effects in lung cancer (43). However, whether IDO1 participates in ferroptosis regulation in GC and the underlying mechanisms remain largely unexplored. In the present study, we found that IDO1 was significantly upregulated in GC tissues, and its high expression was closely associated with cell cycle progression, activation of DNA replication, and multiple inflammation-related signaling pathways. These findings suggest that IDO1 may not only exert its effects through remodeling the tumor immune microenvironment but may also be involved in regulating the intrinsic proliferative state and metabolic adaptability of tumor cells, thereby indirectly influencing ferroptosis-related processes.
TIMP1 is an important regulator of ECM dynamics and TME remodeling. In GC, previous studies have demonstrated that high TIMP1 expression is significantly associated with poor tumor differentiation and unfavorable prognosis (44), suggesting a pro-tumorigenic role during GC progression. Functional studies have shown that overexpression of TIMP1 markedly enhances resistance of colorectal cancer cells to RSL3-induced ferroptosis (45) and suppresses sorafenib-induced ferroptosis through activation of the PI3K/Akt signaling pathway (46), indicating that TIMP1 contributes to an anti-ferroptotic phenotype. In the present study, we further found that high TIMP1 expression was significantly associated with multiple inflammation-related signaling pathways and the ECM-receptor interaction pathway. Single-cell transcriptomic analysis revealed that TIMP1 was predominantly enriched in myeloid cells and fibroblast subsets within the TME, rather than in tumor epithelial cells themselves, suggesting that its functional effects are mainly mediated through remodeling of the TME architecture and regulation of inflammatory responses.
Taken together, the five genes incorporated into the diagnostic model collectively cover multiple core ferroptosis-related processes, including redox homeostasis, cell cycle regulation, inflammatory and immune responses, and ECM remodeling. Their combined contribution to the multi-gene diagnostic model reflects the multi-pathway and multi-level molecular features involved in GC development. By integrating single-cell transcriptomic analysis with virtual gene knockdown, this study not only supports the selection of the model genes at the expression level but also outlines their potential regulatory networks at the functional level, thereby providing new theoretical support for the systemic role of ferroptosis in GC. Future studies may build upon these findings by incorporating prospective clinical cohorts and experimental validation to further assess the utility of this model in early GC screening, risk stratification, and prediction of therapeutic responses. Moreover, in-depth investigations into the precise molecular mechanisms by which the model genes regulate ferroptosis may uncover novel molecular targets for ferroptosis-based therapeutic strategies.
Conclusions
In summary, by integrating multi-algorithm machine learning with systems-level network analysis and single-cell functional inference, this study established a ferroptosis-related molecular model with high diagnostic performance and strong biological interpretability. These findings provide a candidate tool for the early molecular diagnosis of GC and lay a theoretical foundation for future exploration of ferroptosis-targeted intervention strategies, highlighting the potential translational value of the model.
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0035/rc
Data Sharing Statement: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0035/dss
Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0035/prf
Funding: This work was supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0035/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Zhao H, Ao L. Ferroptosis and gastric cancer: from molecular mechanisms to clinical implications. Front Immunol 2025;16:1581928.
- Wang Z, Wu Q. Advancements in non-invasive diagnosis of gastric cancer. World J Gastroenterol 2025;31:101886.
- Peng J, Yin Y, Liu X, et al. CD51 promotes gastric cancer stemness via blocking Numb-mediated Notch1 degradation. Cancer Lett 2025;629:217886.
- Joshi SS, Badgwell BD. Current treatment and recent progress in gastric cancer. CA Cancer J Clin 2021;71:264-79.
- Wu QN, Qi J, Liu ZK, et al. HIPK3 maintains sensitivity to platinum drugs and prevents disease progression in gastric cancer. Cancer Lett 2024;584:216643.
- Wang S, Zhang S, Li X, et al. Development of oxidative stress- and ferroptosis-related prognostic signature in gastric cancer and identification of CDH19 as a novel biomarker. Hum Genomics 2024;18:121.
- Kuang Y, Yang K, Meng L, et al. Identification and validation of ferroptosis-related biomarkers and the related pathogenesis in precancerous lesions of gastric cancer. Sci Rep 2023;13:16074.
- Zhang J, Tian T, Li X, et al. p53 inhibits OTUD5 transcription to promote GPX4 degradation and induce ferroptosis in gastric cancer. Clin Transl Med 2025;15:e70271.
- Singh M, Arora HL, Naik R, et al. Ferroptosis in Cancer: Mechanism and Therapeutic Potential. Int J Mol Sci 2025;26:3852.
- Zhou Q, Liu T, Qian W, et al. HNF4A-BAP31-VDAC1 axis synchronously regulates cell proliferation and ferroptosis in gastric cancer. Cell Death Dis 2023;14:356.
- Yang H, Hu Y, Weng M, et al. Hypoxia inducible lncRNA-CBSLR modulates ferroptosis through m6A-YTHDF2-dependent modulation of CBS in gastric cancer. J Adv Res 2022;37:91-106.
- Fu D, Wang C, Yu L, et al. Induction of ferroptosis by ATF3 elevation alleviates cisplatin resistance in gastric cancer by restraining Nrf2/Keap1/xCT signaling. Cell Mol Biol Lett 2021;26:26.
- Nie J, Zhang H, Li X, et al. Pachymic acid promotes ferroptosis and inhibits gastric cancer progression by suppressing the PDGFRB-mediated PI3K/Akt pathway. Heliyon 2024;10:e38800.
- Yan N, Li G, Zhao L, et al. Crocin promotes ferroptosis in gastric cancer via the Nrf2/GGTLC2 pathway. Front Pharmacol 2025;16:1527481.
- Xie J, Deng X, Yang A, et al. Response to "A commentary on 'Leveraging diverse cell-death patterns to predict the prognosis and drug sensitivity of triple-negative breast cancer patients after surgery'". Int J Surg 2025;111:3138-9.
- Zhao L, Liu Y, Zhang S, et al. Impacts and mechanisms of metabolic reprogramming of tumor microenvironment for immunotherapy in gastric cancer. Cell Death Dis 2022;13:378.
- Yue Z, Yuan Y, Zhou Q, et al. Ferroptosis and its current progress in gastric cancer. Front Cell Dev Biol 2024;12:1289335.
- Liu Y, Yu Y, Luo Z, et al. Artesunate induces ferroptosis in gastric cancer by targeting the TFRC-HSPA9 axis for iron homeostasis regulation. Redox Biol 2025;87:103867.
- Ozdemir G, Kaplan HM. GPX4 Inhibition Enhances the Pro-Oxidant and ER Stress Effects of Tempol in Colon and Gastric Cancer Cell Lines. Curr Issues Mol Biol 2025;47:856.
- Wang LM, Zhang WW, Qiu YY, et al. Ferroptosis regulating lipid peroxidation metabolism in the occurrence and development of gastric cancer. World J Gastrointest Oncol 2024;16:2781-92.
- Yu T, Zhang J, Cao J, et al. Hub Gene Mining and Co-Expression Network Construction of Low-Temperature Response in Maize of Seedling by WGCNA. Genes (Basel) 2023;14:1598.
- Tang M, Jiang S, Huang X, et al. Integration of 3D bioprinting and multi-algorithm machine learning identified glioma susceptibilities and microenvironment characteristics. Cell Discov 2024;10:39.
- Fang X, Liu M, Ren Q, et al. Multi-omics analysis identifies LANCL2 as a potential biomarker for the diagnosis and prognosis of gastric cancer. Sci Rep 2025;15:18231.
- Zeng CM, Chang LL, Ying MD, et al. Aldo-Keto Reductase AKR1C1-AKR1C4: Functions, Regulation, and Intervention for Anti-cancer Therapy. Front Pharmacol 2017;8:119.
- Zhou Y, Lin Y, Li W, et al. Expression of AKRs superfamily and prognostic in human gastric cancer. Medicine (Baltimore) 2023;102:e33041.
- Zhen S, Jia Y, Zhao Y, et al. NEAT1_1 confers gefitinib resistance in lung adenocarcinoma through promoting AKR1C1-mediated ferroptosis defence. Cell Death Discov 2024;10:131.
- Jiang X, Chen X, Xia J, et al. MAFF drives pancreatic cancer progression through AKR1C1-mediated inhibition of ferroptosis. QJM 2026;119:271-81.
- Liu C, Zhang C, Wu H, et al. The AKR1C1-CYP1B1-cAMP signaling axis controls tumorigenicity and ferroptosis susceptibility of extrahepatic cholangiocarcinoma. Cell Death Differ 2025;32:506-20.
- Alarcón-Millán J, Lorenzo-Nazario SI, Jiménez-Wences H, et al. Women with chronic follicular gastritis positive for Helicobacter pylori express lower levels of GKN1. Gastric Cancer 2020;23:754-9.
- Yin Y, Wang B, Yang M, et al. Gastric cancer prognosis: unveiling autophagy-related signatures and immune infiltrates. Transl Cancer Res 2024;13:1479-92.
- Song J, Zhang Q, Ma M, et al. Multi-omics genetic study revealing ferroptosis regulator CTSB driving prostate cancer progression by modulating the immune microenvironment. Naunyn Schmiedebergs Arch Pharmacol 2026;399:3073-88.
- Li C, Sun S, Zhuang Y, et al. CTSB Nuclear Translocation Facilitates DNA Damage and Lysosomal Stress to Promote Retinoblastoma Cell Death. Mol Biotechnol 2024;66:2583-94.
- Yu W, Liu N, Song X, et al. EZH2: An Accomplice of Gastric Cancer. Cancers (Basel) 2023;15:425.
- Wang P, Zhao L, Rui Y, et al. SMYD3 regulates gastric cancer progression and macrophage polarization through EZH2 methylation. Cancer Gene Ther 2023;30:575-81.
- Zheng Y, Li P, Ma J, et al. Cancer-derived exosomal circ_0038138 enhances glycolysis, growth, and metastasis of gastric adenocarcinoma via the miR-198/EZH2 axis. Transl Oncol 2022;25:101479.
- Mohebbi H, Esbati R, Hamid RA, et al. EZH2-interacting lncRNAs contribute to gastric tumorigenesis; a review on the mechanisms of action. Mol Biol Rep 2024;51:334.
- Weng XQ, Wang W. miR-137 Modulates Human Gastric Cancer Cell Proliferation, Apoptosis, and Migration by Targeting EZH2. Crit Rev Eukaryot Gene Expr 2022;32:31-40.
- Zeng M, Li B, Guan Q, et al. CBX2 and EZH2 cooperatively contribute to 5-Fu resistance in gastric cancer by suppressing ferroptosis via trimethylation of H3k27. Cell Signal 2025;136:112078.
- Xiao H, Du X, Hou H, et al. ATOH8 confers the vulnerability of tumor cells to ferroptosis by repressing SCD expression. Cell Death Differ 2025;32:1397-412.
- Wang Y, Jin Y, Wang T, et al. IDO1 as a potential diagnostic biomarker for gastric cancer. Asian J Surg 2024;47:4351-3.
- Chen H, Zheng Q, Jiang Y, et al. IDO1 Expression and CD8+ T-Cell Levels Are Useful Prognostic Biomarkers in Preoperative Gastric Cancer Specimens Before Neoadjuvant Chemotherapy. Appl Immunohistochem Mol Morphol 2025;33:1-9.
- Tian Q, Dan G, Wang X, et al. IDO1 inhibits ferroptosis by regulating FTO-mediated m6A methylation and SLC7A11 mRNA stability during glioblastoma progression. Cell Death Discov 2025;11:22.
- Zhan J, Chen Y, Liu Y, et al. IDO1-mediated AhR activation up-regulates pentose phosphate pathway via NRF2 to inhibit ferroptosis in lung cancer. Biochem Pharmacol 2025;236:116913.
- Zheng M, Wang P, Wang Y, et al. Clinicopathological and prognostic significance of TIMP1 expression in gastric cancer: a systematic review and meta-analysis. Expert Rev Anticancer Ther 2024;24:1169-76.
- Li M, Ni QY, Yu SY. Integration of single-cell transcriptomics and epigenetic analysis reveals enhancer-controlled TIMP1 as a regulator of ferroptosis in colorectal cancer. Genes Genomics 2024;46:121-33.
- Wang L, Wang J, Chen L. TIMP1 represses sorafenib-triggered ferroptosis in colorectal cancer cells by activating the PI3K/Akt signaling pathway. Immunopharmacol Immunotoxicol 2023;45:419-25.


