Multi-omics clustering combined with multiple machine learning to identify epigenetic features in low-grade glioma patients
Highlight box
Key findings
• We identified three subtypes of low-grade glioma (LGG) patients based on messenger RNA (mRNA) omics, long non-coding RNA (lncRNA) omics, and microRNA omics. Simultaneously, we constructed a prognostic model using 101 machine learning combinations and screened out Jumonji Domain-Containing 8 (JMJD8) as a potential therapeutic target.
What is known and what is new?
• Epigenetic heterogeneity has been demonstrated in a variety of cancers including leukemia, colorectal cancer, and glioma, but its application to LGG has been rarely reported. We identified three epigenetic subtypes of LGGs. For the first time, JMJD8 has been identified as a molecular biomarker for LGG.
What is the implication, and what should change now?
• This study shows that the three epigenetic subtypes differ in tumor microenvironment, drug sensitivity, and prognosis. The association of JMJD8 with the M2 macrophage program and activation of the cGAS-STING signaling pathway in LGGs remains to be confirmed.
Introduction
Glioma is a tumor that originates in the neuroglial cells of the brain and is the most common primary intracranial tumor. According to the World Health Organization (WHO) classification, gliomas are classified as grades 1 to 4, where grades 1 and 2 are considered low-grade glioma (LGG), while grades 3 and 4 are high-grade glioma (including glioblastoma) (1). LGGs grow slowly but infiltratively, which makes them susceptible to postoperative recurrence and transformation to high-grade gliomas (2). Improvements in radiomics and the use of the first-line chemotherapeutic agent temozolomide (TMZ) have led to new advances in the treatment of patients with LGG, but molecular heterogeneity and epigenetic alterations continue to significantly affect prognosis (3).
Unlike classical genetics, epigenetics is defined as the regulatory code that determines whether a gene is expressed and that can be stably inherited while the genome sequence remains unchanged (4). Chromatin regulators (CRs) play a central role in epigenetics and, based on their regulatory effects, are classified into three categories: DNA methylation factors, histone modification factors, and chromatin remodeling factors (5). Aberrant DNA methylation is a common feature of cancer, and because of hypermethylation of CpG islands, gliomas, leukemias, and colorectal cancers (CRCs) often exhibit a hypermethylated phenotype (6). DNMT3A and DNMT3B of the DNA methyltransferase family add methyl groups to previously unmethylated CpG sites, whereas DNMT1 is primarily responsible for maintaining pre-existing methylation patterns (7). Excessive DNA methylation silences the promoters of tumor suppressor genes, thereby promoting tumor development. The amino acid sequences of histones are highly conserved across eukaryotes, and the most important post-translational histone modifications are acetylation, ubiquitination, phosphorylation, and methylation (8). SETD7 is an important lysine methyltransferase that participates in cell differentiation by modifying histones and influences tumor progression by affecting the cell cycle and protein expression (9). Chromatin remodeling refers to the ATP-dependent modification of chromatin structure, in which nucleosomes are moved, disrupted, or reorganized to regulate selective gene expression. Chromatin remodeling complexes can be classified into four major groups: the SWI/SNF family, the ISWI family, the CHD family, and the INO80 family (10). Mutations in SWI/SNF family genes are widely present in cancer and play a strong pathogenic role in pediatric tumors (11).
Single-cell sequencing has demonstrated a strong correlation between the SWI/SNF family and cancer cell evolution in kidney renal clear cell carcinoma (KIRC), and revealed its potential role as a therapeutic target for cancer (12). Significant prognostic differences have been observed between LGG patients with isocitrate dehydrogenase (IDH) mutations and those with wild-type IDH; mutant IDH produces high levels of 2-hydroxyglutarate, which inhibits DNA demethylase activity and results in a highly methylated phenotype. The above studies suggest that targeting epigenetic status is a new therapeutic strategy for cancer. However, the study of epigenetic markers for the altered immune microenvironment and metabolic reprogramming in LGG patients is incomplete. Our research is dedicated to identifying novel epigenetic molecular subtypes in LGG patients. By means of translational bioinformatics, we identified Jumonji Domain-Containing Protein 8 (JMJD8) as an important histone modifier and focused on its potential mechanism as a novel molecular marker for LGGs. The prognostic value of JMJD8 in a variety of cancers was identified by pan-cancer survival analysis, and single-cell transcriptome analysis showed that it strongly correlates with oncogenic signals such as cell differentiation and DNA damage repair. We suggest that JMJD8 may serve as a promising prognostic biomarker and therapeutic target for LGG patients. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2226/rc).
Methods
Data collection and organization
Survival information, clinicopathological characteristics, transcriptome expression, microRNA (miRNA) expression, long non-coding RNA (lncRNA) expression, and DNA methylation levels of LGG patients were obtained from The Cancer Genome Atlas (TCGA) website (https://portal.gdc.cancer.gov/) (13), and we integrated 476 LGG patients with multi-omics data for subsequent analysis. All messenger RNA (mRNA), lncRNA, and miRNA data were transformed as log[transcripts per million (TPM) + 1]. In addition, the CGGA-325 cohort and CGGA-693 cohort were downloaded from the Chinese Glioma Genome Atlas (CGGA) database (http://www.cgga.org.cn) as validation cohorts (14). The 870 CRs driving epigenetic features were collected through the FACER database (http://bio-bigdata.hrbmu.edu.cn/FACER/) (tables available at https://cdn.amegroups.cn/static/public/tcr-2025-aw-2226-1.xlsx), and all CR RNA-sequencing data were TPM normalized and log-transformed (15). Finally, we selected four single-cell RNA sequencing (scRNA-seq) samples of tumor cores (GSM4955731, GSM4955733, GSM4955735, and GSM4955737) from the GSE162631 cohort to validate the analysis at the cellular level (16). The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
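As a minimal illustration of the log(TPM + 1) transformation described above, the following Python sketch applies it to a list of TPM values. The function name `log_tpm` and the choice of base 2 are assumptions made for illustration; the text does not state the logarithm base.

```python
import math


def log_tpm(tpm_values, base=2.0):
    """Return log_base(TPM + 1) for each value.

    The base is an assumption (the source only states log[TPM + 1]).
    """
    if base <= 0 or base == 1:
        raise ValueError("base must be positive and different from 1")
    if any(v < 0 for v in tpm_values):
        raise ValueError("TPM values must be non-negative")
    scale = math.log2(base)
    # The +1 pseudocount keeps zero-expression features finite (log(1) = 0).
    return [math.log2(v + 1.0) / scale for v in tpm_values]


# Hypothetical TPM values for one gene across four samples.
transformed = log_tpm([0.0, 1.0, 3.0, 1023.0])
```

The pseudocount compresses the dynamic range while leaving unexpressed features at zero, which is why it is a common default before variance-based feature selection.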
Multi-omics consensus clustering identifies molecular subtypes
In this study, we analyzed mRNA, lncRNA, miRNA, and DNA methylation multi-omics sequencing data and used the getElites function in the MOVICS package to screen genes with high variability by setting method = "mad" (median absolute deviation) and combining survival information with Cox proportional hazards regression analysis. To minimize the technical noise that a single clustering algorithm may introduce, we further used the getMOIC function in the MOVICS package to integrate ten clustering algorithms: CIMLR, ConsensusClustering, SNF, iClusterBayes, PINSPlus, moCluster, NEMO, IntNMF, COCA, and LRA (17). This multi-algorithm consensus clustering strategy yields more robust and reliable clustering results. To further validate the stability and consistency of the clustering results, we used the independent CGGA validation cohorts and applied two methods, nearest template prediction (NTP) and partitioning around medoids (PAM). Specifically, the top-varying genes identified in the discovery cohort were used as classification templates, with centroids defined by the PAM medoids of each subtype. Subtype assignment in the external cohort was determined via the NTP algorithm using a nominal P value threshold of <0.05 to ensure high-confidence prediction. Samples that did not meet this significance threshold were categorized as ‘unclassified’ to minimize misclassification bias.
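The variability filter described above can be illustrated with a short Python sketch that ranks features by median absolute deviation (MAD) and keeps the most variable ones. This is not the MOVICS implementation; the function names `mad` and `top_mad_features` and the toy expression values are hypothetical, and the survival-based Cox filter is omitted.

```python
import statistics


def mad(values):
    """Median absolute deviation of a list of numbers (unscaled)."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)


def top_mad_features(matrix, n_top):
    """Return the n_top feature names with the largest MAD.

    matrix maps a feature name to its values across samples.
    """
    ranked = sorted(matrix, key=lambda name: mad(matrix[name]), reverse=True)
    return ranked[:n_top]


# Hypothetical log-transformed expression for three genes in four samples.
expr = {
    "geneA": [1.0, 1.1, 0.9, 1.0],   # nearly constant
    "geneB": [0.0, 4.0, 8.0, 12.0],  # highly variable
    "geneC": [2.0, 3.0, 2.0, 3.0],   # moderately variable
}
selected = top_mad_features(expr, 2)
```

MAD is less sensitive to a few extreme samples than the variance, which makes it a reasonable choice for ranking features in cohorts with outliers.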
Evaluating subtype immune profiles
To explore the tumor microenvironment (TME) characteristics of the newly defined molecular subtypes, we collected immune checkpoint-related genes from previous literature and systematically assessed the immune differences between subtypes. We used the ESTIMATE algorithm to evaluate the immune score and the stromal score. In addition, multiple algorithms in the IOBR package, including CIBERSORT, MCP-counter, EPIC, TIMER, and xCell, were used to characterize TME differences across subtypes (18). We applied the limma package for differential expression analysis to screen out up- and down-regulated genes. Gene set enrichment analysis (GSEA) was then performed on Gene Ontology (GO) terms to identify subtype-specific pathways (19). To evaluate drug sensitivity, we estimated IC50 values for cisplatin, paclitaxel, TMZ, and bexarotene using the oncoPredict package and GDSC2 dataset. This hypothesis-generating analysis identifies potential molecular vulnerabilities while acknowledging clinical constraints like blood-brain barrier permeability and the inherent limitations of in silico modeling (20).
Multiple machine learning to develop prognostic models
Based on the Mime1 package, we employed 10 machine learning algorithms to construct 101 unique algorithm combinations and optimize the predictive performance of the model (21). These algorithms are Support Vector Machine (SVM), Lasso, Gradient Boosting Machine (GBM), Random Survival Forest (RSF), Elastic Net (ENET), Stepwise Cox, Ridge, CoxBoost, SuperPC, and Partial Least Squares Regression Cox (PLSRCox). Based on the concordance index (C-index) and previous prognostic model construction strategies, we used the CoxBoost + SuperPC combination to construct a CR prognostic model (CRS). For a detailed description of each algorithm and specific implementations of the various combinations, refer to our previous study (22).
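Because the algorithm combinations are ranked by C-index, a brief sketch of how a concordance index can be computed may be helpful. The code below implements Harrell's C-index in plain Python under simplifying assumptions (pairs with tied survival times are ignored, tied risk scores count as half); it is an illustration, not the Mime1 implementation, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times: follow-up times; events: 1 = death observed, 0 = censored;
    risk_scores: higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: shorter survival paired with higher risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
reversed_model = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the C-index values reported for the training and validation cohorts.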
Clinicopathological characterization and model comparison
To strengthen the translational value of molecular typing and risk stratification, key clinicopathologic characteristics such as age, overall survival (OS), IDH gene mutation status and 1p/19q co-deletion status were included based on molecular subtypes and risk subgroups to assess the heterogeneity of their distribution among groups. In addition, we quantitatively characterized the predictive power of the model for survival outcomes of LGG patients by calculating the area under the curve (AUC) at 1-, 3- and 5-year survival nodes based on the “timeROC” R software package. We used univariate Cox regression combined with multivariate Cox regression to identify independent risk factors associated with the prognosis of LGG patients, and we collected the C-index of the prognostic models in the previous literature to compare with the CRS to test the predictive efficacy of the models.
Predicting tumor mutational load and intra-tumor heterogeneity
Based on the single nucleotide variation (SNV) data, we performed an in-depth analysis of the mutation profiles of the two risk subgroups using the “maftools” R package, showing the mutation frequencies of the top 20 mutated genes and their mutation types (23). In addition, we computed tumor mutational burden (TMB) to reveal the differences in mutational landscapes across subtypes and between risk subgroups. To quantify intra-tumor heterogeneity, we used a method based on the distribution of mutant alleles, the Mutant Allele Tumor Heterogeneity (MATH) score, and calculated the MATH score for each patient (24). On this basis, we innovatively combined the MATH score with the risk score and the TMB score with the risk score to construct a new prognostic stratification. By Kaplan-Meier survival curve analysis, we observed prognostic differences between different strata.
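The two mutation-derived scores can be sketched in Python as follows. The MATH score is taken here as 100 × MAD / median of the mutant-allele fractions, with the MAD scaled by 1.4826; TMB is taken as the mutation count divided by the captured exome size. The function names, the 50 Mb capture size, and the example values are assumptions for illustration and may differ from the settings used in the study.

```python
import statistics


def math_score(vafs):
    """Mutant Allele Tumor Heterogeneity score from variant allele fractions.

    MATH = 100 * (1.4826 * MAD) / median, computed over one tumor's VAFs.
    """
    med = statistics.median(vafs)
    if med == 0:
        raise ValueError("median VAF must be greater than zero")
    scaled_mad = 1.4826 * statistics.median(abs(v - med) for v in vafs)
    return 100.0 * scaled_mad / med


def tmb(n_mutations, capture_size_mb=50.0):
    """Tumor mutational burden in mutations per megabase.

    capture_size_mb is an assumed exome capture size, not a value from the text.
    """
    if capture_size_mb <= 0:
        raise ValueError("capture size must be positive")
    return n_mutations / capture_size_mb


# Hypothetical tumor with a wide spread of allele fractions.
heterogeneous = math_score([0.2, 0.3, 0.4, 0.5, 0.6])
# Hypothetical tumor in which every mutation has the same allele fraction.
homogeneous = math_score([0.4, 0.4, 0.4, 0.4])
burden = tmb(100)
```

A wider spread of allele fractions around the median yields a higher MATH score, which is why the score is used as a proxy for subclonal diversity.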
Functional enrichment analysis probes molecular mechanisms
To analyze the functional characteristics and biological significance of differentially expressed genes (DEGs) between the two risk subgroups, we used several enrichment strategies. Based on the “clusterProfiler” R package, we carried out GO functional annotation to reveal the potential functional characteristics of DEGs in three dimensions: biological process (BP), molecular function (MF), and cellular component (CC) (25). The key signaling pathways significantly associated with risk subgroups were further identified by pathway mapping analysis using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (26). To minimize false-positive risks associated with multiple testing, statistical significance for GO and KEGG enrichments was defined by an adjusted P value [Benjamini-Hochberg false discovery rate (FDR)] of <0.05. Pathways were identified as significantly enriched only if the FDR q-value was <0.1 and the P value was <0.05. GSEA was performed based on the “c2.cp.kegg.v7.0.symbols.gmt” gene set from the MSigDB database to identify differential activation patterns at the pathway level (27). Gene Set Variation Analysis (GSVA), an unsupervised method, was combined with the hallmark gene set (“h.all.v2023.1.Hs.symbols.gmt”) to quantify the differential activity of 50 human hallmark pathways (28). Finally, we calculated the correlations between model genes and oncogenic pathways to reveal the underlying molecular mechanisms of the model genes.
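The Benjamini-Hochberg adjustment used for the enrichment results can be sketched in a few lines of Python. This is a generic illustration of the procedure, not the clusterProfiler code; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four enrichment terms.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

In this toy example only the first term stays below the 0.05 threshold after adjustment, although three of the four raw P values were below 0.05.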
Statistical analysis
Statistical analysis and plotting were performed using R software (version 4.3.1). Differences between the two groups were compared using the Wilcoxon test. Kaplan-Meier curves and clinical characteristics were tested using the log-rank test. Differences were considered statistically significant at *P<0.05, **P<0.01, and ***P<0.001.
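For readers less familiar with survival estimation, the Kaplan-Meier estimator underlying the survival curves can be sketched as below. This is a minimal Python illustration with hypothetical follow-up data, not the R code used in the study; the log-rank comparison between groups is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up in months for five patients.
curve = kaplan_meier([6, 12, 12, 24, 36], [1, 1, 0, 1, 0])
median = median_survival(curve)
```

Censored patients contribute to the risk set until their last follow-up but never lower the curve, which is the key difference from a naive proportion of survivors.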
Results
Recognizing epigenetic molecular subtypes
The 870 CRs obtained from the screening, which are closely related to epigenetic regulation, were integrated with the multi-omics data (transcriptome, miRNAs, lncRNAs, and DNA methylation) of 476 patients with LGGs in the TCGA database, and multidimensional molecular typing was performed with ten complementary clustering algorithms (Figure 1A). The stability of the algorithms was systematically evaluated using the Clustering Prognostic Index (CPI), and three significantly different epigenetic subtypes (CS1, CS2, and CS3) were finally established (Figure 1B-1D). The CS1 subtype may inhibit tumor growth through an epigenetic silencing mechanism. The CS3 subtype was characterized by a widespread hypomethylation pattern and a highly active transcriptome. Specifically, the hypomethylation of cg23520075 within the regulatory region of the glycolytic rate-limiting enzyme phosphofructokinase, platelet (PFKP) suggests an epigenetic priming for enhanced metabolic requirements and the Warburg effect. This is accompanied by significant DNA methylation loss at cg02692964 (associated with the cell-cycle inhibitor CDKN2B) and cg18643762 (associated with the autophagy-related signaling adaptor SQSTM1), which, alongside the hypomethylation of the transcriptional co-activator ANKRD1 at cg12436377, collectively points toward a state of robust cellular proliferation and stress resistance. Notably, the significant upregulation of histone modifiers (e.g., ASF1B, TOP2A) in this subtype is consistent with a potentially more accessible chromatin landscape, which may be associated with its malignant phenotype. The CS2 subtype represented a transitional state between the other two subtypes (Figure 1C). Kaplan-Meier survival analysis revealed that the CS1 subtype had the best prognosis while the CS3 subtype had the worst prognosis (Figure 1E).
External cohort validation of epigenetic subtypes
In the TCGA training cohort, we used the NTP algorithm and the PAM algorithm to verify the accuracy of the molecular typing results (Figure 2A,2B). The clustering similarity of the three typing results was moderate, 0.72, 0.62, and 0.88, respectively (Figure 2C). In the CGGA-325 cohort, we validated the results of multi-omics typing and examined the consistency between the NTP algorithm and the PAM algorithm in subtype prediction results (Figure 2D,2E). Meanwhile, we demonstrated significant survival differences between subtypes, with a median survival of 84 months for CS1 patients versus only 12 months for CS3 patients (Figure 2F). In the CGGA-693 cohort, we found that the CS2 subtype was similar to CS1 in terms of survival outcomes, distinguishing it from the transitional state in the CGGA-325 cohort (Figure 2G-2I).
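The NTP-based subtype assignment used for external validation can be illustrated with a simplified Python sketch. Each subtype template is a binary vector over the marker genes, the sample is assigned to the template with the smallest cosine distance, and a nominal P value is estimated by permutation. Shuffling the sample's own values is a simplification of the resampling scheme in the published NTP method, and the function names, gene names, and values are hypothetical.

```python
import math
import random


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def ntp_predict(sample, templates, n_perm=1000, seed=0, alpha=0.05):
    """Assign a sample to the nearest subtype template.

    sample: gene -> scaled expression; templates: subtype -> set of marker genes.
    Returns (label, p_value); label is 'unclassified' when p_value >= alpha.
    """
    genes = sorted(set().union(*templates.values()) & set(sample))
    values = [sample[g] for g in genes]
    vectors = {
        label: [1.0 if g in markers else 0.0 for g in genes]
        for label, markers in templates.items()
    }

    def nearest(vals):
        dists = {lab: cosine_distance(vals, vec) for lab, vec in vectors.items()}
        best = min(dists, key=dists.get)
        return best, dists[best]

    label, observed = nearest(values)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = values[:]
        rng.shuffle(shuffled)  # simplified null: permute values across genes
        if nearest(shuffled)[1] <= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return (label if p_value < alpha else "unclassified"), p_value


# Hypothetical marker genes and one sample with high CS1-marker expression.
templates = {
    "CS1": {"g1", "g2", "g3", "g4", "g5"},
    "CS3": {"g6", "g7", "g8", "g9", "g10"},
}
sample = {f"g{i}": (2.0 if i <= 5 else -1.0) for i in range(1, 11)}
label, p_value = ntp_predict(sample, templates)
flat_label, flat_p = ntp_predict({f"g{i}": 0.0 for i in range(1, 11)}, templates)
```

The uninformative sample is returned as 'unclassified', mirroring the rule that samples failing the nominal P value threshold are not forced into a subtype.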
Immune signature recognition and therapeutic tools for epigenetic subtypes
To resolve the interplay between epigenetic reprogramming and the TME, we examined the expression levels of immune checkpoint molecules and found that the expression of molecules such as PD-1 and CTLA-4 was significantly higher in patients with the CS3 subtype than in those with the CS1 subtype, suggesting that there might be a dynamic regulation of their immunosuppressive microenvironment (Figure 3A). Immune cell infiltration assessed with multiple algorithms showed a significant increase in CD4+ and CD8+ T-cell infiltration accompanied by an increase in cancer-associated fibroblast (CAF) scores in the CS3 subtype (Figure 3B). This phenomenon may be closely related to global hypomethylation-mediated activation of tumor-associated antigens (TAAs) and enhanced chemokine transcriptional activity driven by the open chromatin state (29). Functional enrichment analysis further elucidated the biological characteristics of the CS3 subtype: its up-regulated genes were significantly enriched in the dynamic assembly of the microtubule cytoskeleton and the cell cycle checkpoint pathway (Figure 3C), suggesting that the tumor cells of this subtype may be in the proliferative quiescent phase, with a higher sensitivity to cell cycle non-specific agents (CCNSA). Notably, the down-regulated genes of the CS3 subtype showed a significant negative correlation with histone H3 acetylation (Figure 3D), supporting the involvement of histone modification in the CS3 subtype. Drug sensitivity analysis showed that the response scores to cisplatin, paclitaxel, and TMZ were significantly higher in patients with the CS3 subtype than in those with the CS1 subtype (Figure 3E-3H), suggesting that patients with the CS3 subtype may have a higher therapeutic response rate to specific chemotherapy regimens (30).
Multiple machine learning combinations to build chromatin regulatory factor prognostic models
We screened prognosis-related molecular markers by univariate Cox regression analysis based on significantly up-regulated DEGs in the CS3 subtype, and constructed prognostic models using integrated machine learning strategies. The 101 machine learning algorithm combinations were systematically evaluated by a comprehensive comparison of concordance indices (C-index) (Figure 4A), and the combination of CoxBoost feature selection with SuperPC modeling demonstrated the optimal prediction efficacy in the TCGA training cohort (C-index =0.816) and the CGGA-325 validation cohort (C-index =0.677). The model validation results showed that patients in the high-risk group in the TCGA training cohort had significantly shorter OS (log-rank P<0.001), that their risk scores were significantly and positively correlated with the expression levels of the model genes, and that the incidence of fatal events increased gradually with increasing risk score (Figure 4B,4C). This finding was validated in the external cohorts: prognostic outcomes differed between the high-risk and low-risk groups, and an elevated risk score was accompanied by a rise in the number of deaths and poorer patient survival (Figure 4D-4G).
Application and validation of CRS model in prognostic prediction of LGG
Receiver operating characteristic (ROC) curve analysis showed that the AUC values of the CRS model in the training cohort were 0.84, 0.85, 0.86, 0.83, and 0.83 at 1, 2, 3, 4, and 5 years, respectively (Figure 5A). The AUC values at all five time points were greater than 0.8, indicating that the CRS model has high accuracy in predicting survival outcomes in LGG patients. Further, we validated the CRS model in two independent validation cohorts, CGGA-325 and CGGA-693, and the results similarly confirmed its good generalizability (Figure 5B,5C). Among the three subtypes, G2 patients accounted for the highest proportion in the CS1 subtype, while G3 patients constituted the highest proportion in the CS3 subtype (Figure 5D). In addition, patients in the high-risk group exhibited clinical features of increased age, increased tumor grade, and worse survival outcomes (Figure 5E), which makes our risk grouping more relevant to practical clinical applications and supports clinical decision-making. Compared with previously published LGG prognostic models, the CRS model demonstrated better predictive performance (Figure 5F) (31). Univariate and multivariate Cox regression analyses showed that the risk score calculated from the expression of the model genes PADL1, APOBEC3F, APOBEC3C, APOBEC3D, TOP2A, JMJD8, BRCA2, RAD54B, and BUB1 was an independent risk factor for LGG patients and is expected to provide a new tool for prognostic assessment of LGG patients (Figure 5G,5H).
Intratumor heterogeneity and differences in tumor mutation status
IDH mutations are considered favorable mutations in patients with LGGs and are usually associated with better prognostic outcomes and longer survival times (32). In this study, we found that the frequency of IDH mutations in the low-risk group was up to 90%, which was significantly higher than that in the high-risk group. In addition, we observed an increased mutation frequency of the TP53, ATRX, and TTN genes in the high-risk group, while CIC mutations were more frequent in the low-risk group, suggesting that CIC may have the potential to serve as a prognostic target for LGG patients (Figure 6A). Further analysis showed that the high-risk group had a higher TMB score, while the low-risk group and CS1 subtype had a higher proportion of patients with IDH mutations, resulting in a significant increase in their MATH score (Figure 6B,6C) (33). By combining the TMB score or the MATH score with the risk score, we categorized the LGG patients in the training cohort into four groups. Survival analysis showed that patients with a low TMB score or low MATH score in the low-risk group had a significant survival advantage, whereas patients with a high TMB score or high MATH score in the high-risk group exhibited a significant survival disadvantage (Figure 6D,6E). We also found that IDH mutations co-occurred with TP53 mutations and were mutually exclusive with IDH2 mutations, which provided a new perspective for a deeper understanding of the molecular mechanisms in patients with IDH mutations (Figure 6F) (34). By Spearman correlation analysis, we further confirmed that the MATH score was negatively correlated with the risk score, and the TMB score was also negatively correlated with the risk score (Figure 6G). This is in contrast to previous studies, in which the MATH score showed some prognostic value for epigenetically typed patients due to the presence of favorable mutations such as IDH in LGG patients.
Differences in molecular mechanisms between risk subgroups
In this study, we revealed significant differences in MF, CC and BP among different risk subgroups by GO enrichment analysis. Specifically, we found that processes such as DNA-binding transcriptional activator activity, RNA polymerase II-specific DNA-binding transcriptional activator activity, cohesion chromosome and mitotic sister chromatid segregation were significantly different among different risk subgroups (Figure 7A). Further, we performed KEGG pathway enrichment analysis and found that pathways such as cell cycle, Wnt signaling pathway and extracellular matrix (ECM)-receptor interaction were significantly enriched in the high-risk subgroups, revealing potential molecular mechanisms of carcinogenesis in the high-risk subgroups (Figure 7B). In addition, GSEA was significantly enriched in immune-related pathways such as cytokine-cytokine receptor interactions and antigen processing and presentation in the low-risk group, suggesting that the immune response was activated in the low-risk group (Figure 7C). In the high-risk group, GSEA enrichment analysis revealed a metabolic reprogramming landscape in LGG patients, in which significant enrichment of oxidative phosphorylation pathways indicated active mitochondrial function, while enrichment of spliceosome pathways highlighted the potential role of RNA splicing in tumorigenesis (Figure 7D). These results provide new insights into the metabolic features and molecular mechanisms of the high-risk group. We also focused on the mechanism of action of model genes and found that APOBEC3 family genes play important roles in tumor suppression and viral defense (35), while the BRCA2 and RAD54B genes have critical roles in DNA double-strand break repair (Figure 7E) (36). These findings further deepened our understanding of the functions of these genes in tumor biology. 
Finally, unlike previous studies, we found that JMJD8 was significantly correlated with glycolysis and Wnt signaling pathways (Figure 7F,7G). This finding provides new evidence for the role of JMJD8 in tumor metabolism and signaling.
JMJD8 identified as a prognostic marker for LGG patients
First, we combined TCGA data with the GTEx database to compare JMJD8 expression between LGG tumor tissues and normal tissues (Figure 8A). The ROC curve indicated a high accuracy of JMJD8, with an AUC value of 0.843, in predicting the survival of LGG patients (Figure 8B). For OS, progression-free interval (PFI), and disease-specific survival (DSS) of LGG patients, high JMJD8 expression was associated with significant prognostic differences (Figure 8C). The results of pan-cancer analysis showed that JMJD8 expression was significantly increased in tumor tissues in breast invasive carcinoma (BRCA), cholangiocarcinoma (CHOL), and glioblastoma (Figure 8D). Meanwhile, JMJD8 was identified as a risk factor in LGG, with a hazard ratio (HR) >1.5 (Figure 8E), and its expression was associated with both OS and PFI in LGG (Figure 8F,8G). Immune infiltration analysis indicated that in LGG, JMJD8 may suppress the immune function of B cells and dendritic cells, thereby promoting TME dysregulation (Figure 8H). We also found that JMJD8 correlates with oncogenic phenotypes such as cell differentiation and DNA damage (Figure 8I), and that its expression is increased in patients aged >40 years, with G3 classification, with wild-type IDH, and with non-response to immunotherapy, which is of potential clinical value (Figure 8J-8M). In single-cell transcriptomics, JMJD8 is expressed at higher levels in glial cells than in other cell types (Figure S1).
Discussion
The clinical management of LGGs has always been challenged by both their biological complexity and molecular heterogeneity. Despite the slow growth rate of LGGs, their infiltrative growth pattern makes complete surgical resection difficult, and about 50–70% of cases eventually malignantly transform into high-grade gliomas. More and more evidence suggests that the molecular heterogeneity of LGG is an important factor affecting prognosis. The methylation status of the MGMT promoter is exceptionally important for TMZ-induced DNA damage repair (37). CDKN2A/B-mediated cell cycle regulation and tumor vascular survival are independent prognostic factors in IDH mutant patients (38). Although many LGG studies are available today, the heterogeneity of these molecular features makes it difficult to generalize a single treatment paradigm, and some patients face early relapse or drug resistance even after standardized treatment. Integrating multi-omics data to further explore the molecular subtypes of LGG patients and constructing accurate prognostic models, so as to develop targeted drugs against specific molecular pathways can help realize precision therapy.
There is growing evidence of crosstalk between epigenetic alterations and tumor malignant progression, and the epigenetic view of cancer has greatly complemented the gene-centric view. Single-cell sequencing has comprehensively revealed tumor heterogeneity and epigenetic features by profiling different epigenetic layers and cellular characteristics of individual cells (39). In breast cancer, silencing of Wnt antagonist genes (including SFRP and DKK) by methylation is the main cause of sustained activation of the Wnt signaling pathway (40). In CRC, the average methylome has hundreds to thousands of aberrantly methylated genes, and CRC patients with CpG island methylation phenotypes provide new ideas for clinical diagnosis and treatment (41). In acute myeloid leukemia (AML), alterations in epigenetic modifiers (DNMT3A, TET2, JAK2, ASXL1, SF3B1, PPM1D, CBL, BCOR, and TP53) induce myeloid malignant proliferation to some extent (42). In addition, there is crosstalk between epigenetics and tumor metabolic reprogramming. Succinyl-CoA is produced by the α-ketoglutarate dehydrogenase complex in the tricarboxylic acid (TCA) cycle and is involved in the succinylation of histone lysine residues. α-Ketoglutarate dehydrogenase exerts non-metabolic enzyme activity to regulate KAT2A (lysine acetyltransferase 2A), which in turn mediates the succinylation of histone H3 and promotes the proliferation of tumor cells (43). The ketone body β-hydroxybutyrate produced by lipid metabolism, in addition to its energy-donating role, is also involved in β-hydroxybutyrylation of histone lysine, which mediates downstream transcriptional regulation (44). Proteomic analyses demonstrated that p300 catalyzes lysine β-hydroxybutyrylation in cells, and that HDAC1 and HDAC2 reverse this modification (45). Thus, epigenetic regulation is a potential therapeutic target in cancer.
In this study, we combined multi-omics data and multiple machine learning algorithms for a comprehensive analysis. We identified three epigenetic subtypes based on transcriptomic, miRNA, lncRNA, and DNA methylation data: the CS1 epigenetically silent type, which exhibits a high degree of DNA methylation of oncogenic genes and thus has a good prognosis; the CS2 epigenetic transitional type; and the CS3 chromatin-open type, which exhibits an aberrant histone modification process that gives rise to malignant tumor features. The immune microenvironment and candidate drugs for the three subtypes were further identified, and cisplatin, paclitaxel, and TMZ may be potential drugs for patients with the CS3 subtype. Subsequently, a CRS prediction model containing nine model genes (PADL1, APOBEC3F, APOBEC3C, APOBEC3D, TOP2A, JMJD8, BRCA2, RAD54B, and BUB1) was constructed based on genes that characterize the CS3 subtype, and JMJD8 was ultimately identified as a potential prognostic target for LGG patients for subsequent analysis. JMJD8 is a novel endoplasmic reticulum protein containing the JmjC domain that functions as a positive regulator of tumor necrosis factor (TNF)-induced NF-kappa-B signaling (46). In addition, it has been shown that JMJD8 acts as a surface marker for M2 macrophages in a pan-cancer cohort while exhibiting novel AKT1 lysine demethylase activity (47). A recent study reported that in breast cancer, JMJD8 competes with TBK1 for binding to STING, blocking STING-TBK1 complex formation and limiting type I interferon (IFN) and IFN-stimulated gene expression and immune cell infiltration (48). We observed that in LGG, JMJD8 was strongly associated with clinical features and with patient prognosis.
This study has several limitations that should be addressed in subsequent work. First, retrospective analyses based on public databases carry an inherent risk of selection bias, and prospective large-sample clinical studies are needed to validate the clinical application of the CRS prognostic model. Second, the current modeling cohort has a limited sample size, and multicenter large-scale cohorts need to be included to improve the statistical validity of the model. Third, the key regulatory genes identified in this study have not yet been functionally validated, and in vitro and in vivo experiments are needed to systematically analyze their molecular mechanisms and biological functions.
Conclusions
In summary, we established three novel epigenetic subtypes of LGG, used bioinformatics to explore their immunological characteristics and potential biological properties, and constructed new prognostic models intended to support early diagnosis and prognostic prediction in the clinic. We also identified JMJD8 as a novel molecular marker for LGG, which may regulate the generation of malignant phenotypes through the DNA damage repair process. JMJD8 may be a potential prognostic biomarker and therapeutic target for future clinical treatments.
Acknowledgments
We thank the TCGA and CGGA databases for providing the data.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2226/rc
Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2226/prf
Funding: This study received grants from
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2226/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Li J, Wang Y, Meng X, et al. Modulation of transcriptional activity in brain lower grade glioma by alternative splicing. PeerJ 2018;6:e4686. [Crossref] [PubMed]
- Choi S, Yu Y, Grimmer MR, et al. Temozolomide-associated hypermutation in gliomas. Neuro Oncol 2018;20:1300-9. [Crossref] [PubMed]
- Pace A, Dirven L, Koekkoek JAF, et al. European Association for Neuro-Oncology (EANO) guidelines for palliative care in adults with glioma. Lancet Oncol 2017;18:e330-40. [Crossref] [PubMed]
- Feinberg AP, Levchenko A. Epigenetics as a mediator of plasticity in cancer. Science 2023;379:eaaw3835. [Crossref] [PubMed]
- Wang B, Feng Y, Li Z, et al. Identification and validation of chromatin regulator-related signatures as a novel prognostic model for low-grade gliomas using translational bioinformatics. Life Sci 2024;336:122312. [Crossref] [PubMed]
- Noushmehr H, Weisenberger DJ, Diefes K, et al. Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma. Cancer Cell 2010;17:510-22. [Crossref] [PubMed]
- Zhang ZM, Lu R, Wang P, et al. Structural basis for DNMT3A-mediated de novo DNA methylation. Nature 2018;554:387-91. [Crossref] [PubMed]
- Neganova ME, Klochkov SG, Aleksandrova YR, et al. Histone modifications in epigenetic regulation of cancer: Perspectives and achieved progress. Semin Cancer Biol 2022;83:452-71. [Crossref] [PubMed]
- Qi Y, Jin C, Qiu W, et al. The dual role of glioma exosomal microRNAs: glioma eliminates tumor suppressor miR-1298-5p via exosomes to promote immunosuppressive effects of MDSCs. Cell Death Dis 2022;13:426. [Crossref] [PubMed]
- Dawson MA, Kouzarides T. Cancer epigenetics: from mechanism to therapy. Cell 2012;150:12-27. [Crossref] [PubMed]
- Kadoch C. Diverse compositions and functions of chromatin remodeling machines in cancer. Sci Transl Med 2019;11:eaay1018. [Crossref] [PubMed]
- Zhuang K, Wang L, Lu C, et al. Assessment of SWI/SNF chromatin remodeling complex related genes as potential biomarkers and therapeutic targets in pan-cancer. Mol Cancer 2024;23:176. [Crossref] [PubMed]
- Tomczak K, Czerwińska P, Wiznerowicz M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol (Pozn) 2015;19:A68-77. [Crossref] [PubMed]
- Zhao Z, Zhang KN, Wang Q, et al. Chinese Glioma Genome Atlas (CGGA): A Comprehensive Resource with Functional Genomic Data from Chinese Glioma Patients. Genomics Proteomics Bioinformatics 2021;19:1-12. [Crossref] [PubMed]
- Lu J, Xu J, Li J, et al. FACER: comprehensive molecular and functional characterization of epigenetic chromatin regulators. Nucleic Acids Res 2018;46:10019-33. [Crossref] [PubMed]
- Xie Y, He L, Lugano R, et al. Key molecular alterations in endothelial cells in human glioblastoma uncovered through single-cell RNA sequencing. JCI Insight 2021;6:e150861. [Crossref] [PubMed]
- Lu X, Meng J, Zhou Y, et al. MOVICS: an R package for multi-omics integration and visualization in cancer subtyping. Bioinformatics 2021;36:5539-41. [Crossref] [PubMed]
- Zeng D, Ye Z, Shen R, et al. IOBR: Multi-Omics Immuno-Oncology Biological Research to Decode Tumor Microenvironment and Signatures. Front Immunol 2021;12:687975. [Crossref] [PubMed]
- Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47. [Crossref] [PubMed]
- Tomar MS, Kumar A, Srivastava C, et al. Elucidating the mechanisms of Temozolomide resistance in gliomas and the strategies to overcome the resistance. Biochim Biophys Acta Rev Cancer 2021;1876:188616. [Crossref] [PubMed]
- Liu H, Zhang W, Zhang Y, et al. Mime: A flexible machine-learning framework to construct and visualize models for clinical characteristics prediction and feature selection. Comput Struct Biotechnol J 2024;23:2798-810. [Crossref] [PubMed]
- Xie Y, Chen H, Tian M, et al. Integrating multi-omics and machine learning survival frameworks to build a prognostic model based on immune function and cell death patterns in a lung adenocarcinoma cohort. Front Immunol 2024;15:1460547. [Crossref] [PubMed]
- Mayakonda A, Lin DC, Assenov Y, et al. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res 2018;28:1747-56. [Crossref] [PubMed]
- Gavish A, Tyler M, Greenwald AC, et al. Hallmarks of transcriptional intratumour heterogeneity across a thousand tumours. Nature 2023;618:598-606. [Crossref] [PubMed]
- Yu G, Wang LG, Han Y, et al. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 2012;16:284-7. [Crossref] [PubMed]
- Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27-30. [Crossref] [PubMed]
- Jiline M, Matwin S, Turcotte M. Annotation concept synthesis and enrichment analysis: a logic-based approach to the interpretation of high-throughput experiments. Bioinformatics 2011;27:2391-8. [Crossref] [PubMed]
- Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 2013;14:7. [Crossref] [PubMed]
- Guo H, Vuille JA, Wittner BS, et al. DNA hypomethylation silences anti-tumor immune genes in early prostate cancer and CTCs. Cell 2023;186:2765-2782.e28. [Crossref] [PubMed]
- Tu Z, Li J, Long X, et al. Transcriptional Patterns of Lower-Grade Glioma Patients with Distinct Ferroptosis Levels, Immunotherapy Response, and Temozolomide Sensitivity. Oxid Med Cell Longev 2022;2022:9408886. [Crossref] [PubMed]
- Liu XP, Jin X, Seyed Ahmadian S, et al. Clinical significance and molecular annotation of cellular morphometric subtypes in lower-grade gliomas discovered by machine learning. Neuro Oncol 2023;25:68-81. [Crossref] [PubMed]
- Wen PY, van den Bent M, Youssef G, et al. RANO 2.0: Update to the Response Assessment in Neuro-Oncology Criteria for High- and Low-Grade Gliomas in Adults. J Clin Oncol 2023;41:5187-99. [Crossref] [PubMed]
- Chen K, Wang Q, Li M, et al. Single-cell RNA-seq reveals dynamic change in tumor microenvironment during pancreatic ductal adenocarcinoma malignant progression. EBioMedicine 2021;66:103315. [Crossref] [PubMed]
- Cancer Genome Atlas Research Network. Comprehensive and Integrative Genomic Characterization of Hepatocellular Carcinoma. Cell 2017;169:1327-1341.e23.
- Warren CJ, Santiago ML, Pyeon D. APOBEC3: Friend or Foe in Human Papillomavirus Infection and Oncogenesis? Annu Rev Virol 2022;9:375-95. [Crossref] [PubMed]
- Yoshida K, Miki Y. Role of BRCA1 and BRCA2 as regulators of DNA repair, transcription, and cell cycle in response to DNA damage. Cancer Sci 2004;95:866-71. [Crossref] [PubMed]
- Wu S, Li X, Gao F, et al. PARP-mediated PARylation of MGMT is critical to promote repair of temozolomide-induced O6-methylguanine DNA damage in glioblastoma. Neuro Oncol 2021;23:920-31. [Crossref] [PubMed]
- Fortin Ensign SP, Jenkins RB, Giannini C, et al. Translational significance of CDKN2A/B homozygous deletion in isocitrate dehydrogenase-mutant astrocytoma. Neuro Oncol 2023;25:28-36. [Crossref] [PubMed]
- Hu Y, Shen F, Yang X, et al. Single-cell sequencing technology applied to epigenetics for the study of tumor heterogeneity. Clin Epigenetics 2023;15:161. [Crossref] [PubMed]
- Garcia-Martinez L, Zhang Y, Nakata Y, et al. Epigenetic mechanisms in breast cancer therapy and resistance. Nat Commun 2021;12:1786. [Crossref] [PubMed]
- Lao VV, Grady WM. Epigenetics and colorectal cancer. Nat Rev Gastroenterol Hepatol 2011;8:686-700. [Crossref] [PubMed]
- Cai SF, Levine RL. Genetic and epigenetic determinants of AML pathogenesis. Semin Hematol 2019;56:84-9. [Crossref] [PubMed]
- Huang F, Luo X, Ou Y, et al. Control of histone demethylation by nuclear-localized α-ketoglutarate dehydrogenase. Science 2023;381:eadf8822. [Crossref] [PubMed]
- Gao Y, Sheng X, Tan D, et al. Identification of Histone Lysine Acetoacetylation as a Dynamic Post-Translational Modification Regulated by HBO1. Adv Sci (Weinh) 2023;10:e2300032. [Crossref] [PubMed]
- Huang H, Zhang D, Weng Y, et al. The regulatory enzymes and protein substrates for the lysine β-hydroxybutyrylation pathway. Sci Adv 2021;7:eabe2771. [Crossref] [PubMed]
- Yeo KS, Tan MC, Wong WY, et al. JMJD8 is a positive regulator of TNF-induced NF-κB signaling. Sci Rep 2016;6:34125. [Crossref] [PubMed]
- Liang X, Zhang H, Wang Z, et al. JMJD8 Is an M2 Macrophage Biomarker, and It Associates With DNA Damage Repair to Facilitate Stemness Maintenance, Chemoresistance, and Immunosuppression in Pan-Cancer. Front Immunol 2022;13:875786. [Crossref] [PubMed]
- Yi J, Wang L, Du J, et al. ER-localized JmjC domain-containing protein JMJD8 targets STING to promote immune evasion and tumor growth in breast cancer. Dev Cell 2023;58:760-778.e6. [Crossref] [PubMed]

