Prognostic value and therapeutic implications of HMGA2 in EGFR-mutant non-small cell lung cancer
Highlight box
Key findings
• HMGA2 is significantly upregulated in epidermal growth factor receptor (EGFR)-mutant lung adenocarcinoma and correlates with poor prognosis and altered tumor immune microenvironment.
What is known and what is new?
• HMGA2 is a chromatin-binding oncoprotein promoting tumor proliferation and metastasis.
• This study demonstrated that HMGA2 overexpression in EGFR-mutant (EGFRmut) non-small cell lung cancer (NSCLC) is linked to worse survival, distinct immune infiltration, and drug sensitivity profiles.
What is the implication, and what should change now?
• HMGA2 may serve as a prognostic biomarker and therapeutic target in EGFRmut NSCLC, warranting further exploration of HMGA2-directed combination therapies.
Introduction
Epidermal growth factor receptor (EGFR) mutations are prevalent oncogenic drivers in non-small cell lung cancer (NSCLC) (1). The introduction of EGFR tyrosine kinase inhibitors (TKIs) has significantly improved clinical outcomes for patients with EGFR-mutant (EGFRmut) NSCLC (2). However, resistance to TKIs, whether inherent or acquired, remains a significant challenge (3). Existing prognostic models primarily rely on clinicopathological factors [e.g., Tumor-Node-Metastasis (TNM) staging] and EGFR mutation status, but they do not adequately address the molecular diversity within EGFRmut subgroups or accurately predict treatment response (4,5). These limitations highlight the need for new biomarkers that can complement EGFR status to improve risk assessment and guide personalized treatment.
HMGA2, a chromatin-associated non-histone protein, plays a crucial role in tumorigenesis by regulating transcriptional programs related to cell proliferation, EMT, and metastasis (6,7). Elevated HMGA2 expression is commonly observed in various cancers, including lung, colorectal, and breast cancer (8-11). Despite its well-established oncogenic function, the molecular mechanisms connecting HMGA2 to the remodeling of the tumor microenvironment (TME) and the development of therapeutic resistance in specific molecular subtypes, such as EGFRmut NSCLC, remain poorly understood.
This study integrated multiomics data and experimental validation to address these knowledge gaps. HMGA2 expression profiles across normal and malignant tissues were delineated via public databases [Human Protein Atlas (HPA), Tumor Immune Estimation Resource (TIMER), and Gene Expression Profiling Interactive Analysis (GEPIA)]. The prognostic significance of HMGA2 in lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) was evaluated, along with its association with EGFR mutations. Transcriptomic and immune deconvolution analyses were used to investigate the role of HMGA2 in shaping the TME and predicting the response to chemotherapy. By combining bioinformatics with in vitro validation, this study aimed to elucidate HMGA2-driven mechanisms in EGFRmut NSCLC and identify potential therapeutic targets. We present this article in accordance with the REMARK reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2371/rc).
Methods
Database analysis
The HPA (https://www.proteinatlas.org/) database was used to examine HMGA2 RNA and protein expression in different tissues and organs. Immunofluorescence and immunohistochemistry images from the HPA were used to assess HMGA2 protein levels in the A-431 cell line and in normal lung and lung cancer tissue samples, including LUAD and LUSC samples. TIMER (http://timer.cistrome.org/) was used to analyse HMGA2 expression across multiple malignant tumors. GEPIA (http://gepia2.cancer-pku.cn/#index) was used to analyse HMGA2 expression in LUAD and LUSC.
Clinical and prognostic analysis
The Cancer Genome Atlas (TCGA) data
Raw data were downloaded from the TCGA database for a total of 515 patients, including 66 EGFRmut patients and 449 EGFR wild-type (EGFRwt) patients. In total, 77 tissue samples from EGFRmut patients and 517 tissue samples from EGFRwt patients were used. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
HMGA2 expression and the prognosis of LUAD and LUSC patients
HMGA2 expression was treated as a dichotomous variable. LUAD and LUSC were analysed separately via the Kaplan-Meier (KM) plotter database (https://kmplot.com/analysis/). The endpoints were overall survival (OS), progression-free survival (PFS), and first progression (FP). Patients were split into high- and low-expression groups with the ‘auto select best cut-off’ (percentile) option.
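The dichotomization and survival estimation described above can be illustrated with a minimal sketch. It assumes a fixed percentile cutoff rather than the cutoff scan that the KM plotter performs, and the function names (`dichotomize`, `kaplan_meier`) and all data values are hypothetical, not taken from the study.

```python
import math


def dichotomize(expression, percentile=50.0):
    """Split samples into 'high' and 'low' groups at a nearest-rank percentile cutoff."""
    ranked = sorted(expression)
    rank = max(1, math.ceil(percentile / 100.0 * len(ranked)))
    cutoff = ranked[rank - 1]
    labels = ["high" if value > cutoff else "low" for value in expression]
    return cutoff, labels


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed event.

    events[i] is 1 if the event was observed and 0 if the sample was censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored samples both leave the risk set
    return curve


# Hypothetical toy data: expression values, follow-up in months, event flags.
expression = [1.2, 3.4, 0.5, 2.8, 4.1, 0.9]
times = [30, 12, 45, 20, 8, 50]
events = [1, 1, 0, 1, 1, 0]

cutoff, labels = dichotomize(expression, percentile=50.0)
curves = {}
for group in ("high", "low"):
    group_times = [t for t, g in zip(times, labels) if g == group]
    group_events = [e for e, g in zip(events, labels) if g == group]
    curves[group] = kaplan_meier(group_times, group_events)
    print(group, curves[group])
```

A fixed median split is only one possible choice. An optimized cutoff of the kind used here can inflate apparent significance, which is why validation in an independent cohort is usually recommended.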
HMGA2 expression and EGFR mutation in LUAD
The TIMER database was used to analyse HMGA2 expression in relation to mutations in common driver genes associated with LUAD. The Assistant for Clinical Bioinformatics (ACLBI) database (www.aclbi.com) was used to compare HMGA2 expression in EGFRmut and EGFRwt LUAD tissues with that in normal lung tissues and to perform subgroup analyses on the basis of clinical characteristics.
Survival analysis of EGFRmut LUAD patients stratified by HMGA2 expression
LUAD data associated with EGFR mutations were obtained from the TCGA database. LUAD patients were classified into the EGFRmut and EGFRwt groups on the basis of EGFR mutation status. Survival analyses were performed with the R packages ‘limma’, ‘survival’, and ‘survminer’.
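Survival differences between two expression groups are commonly compared with a log-rank test. The following is a minimal sketch of a two-group log-rank test in plain Python, offered as an illustration rather than as the exact procedure run by the ‘survival’ and ‘survminer’ packages; the function name `logrank_test` and the toy data are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) with 1 degree of freedom."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_a = expected_a = variance = 0.0
    for t in sorted({ti for ti, ei, _ in pooled if ei == 1}):
        # Numbers at risk just before time t in each group.
        n_a = sum(1 for ti, _, g in pooled if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in pooled if ti >= t and g == 1)
        # Observed events at time t.
        d_a = sum(1 for ti, ei, g in pooled if ti == t and ei == 1 and g == 0)
        d_b = sum(1 for ti, ei, g in pooled if ti == t and ei == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical toy data: follow-up in months and event flags (1 = event, 0 = censored).
high_times, high_events = [8, 12, 20], [1, 1, 1]
low_times, low_events = [30, 45, 50], [1, 0, 0]

chi_square, p_value = logrank_test(high_times, high_events, low_times, low_events)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.4f}")
```

With groups this small the chi-square approximation is rough; the sketch only shows how observed and expected event counts are accumulated over the event times.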
Differential expression and pathway enrichment analysis
To gain further insight into the biological processes and signaling pathways associated with HMGA2 expression, we first assessed the differential expression of mRNAs between the HMGA2lo and HMGA2hi subgroups via the ‘limma’ software package. Heatmaps were generated using an adjusted p-value of less than 0.05 and a fold change of 1 as thresholds, displaying the top 50 genes that were either upregulated or downregulated. These heatmaps illustrate the expression trends observed in the EGFRmut LUAD samples. Subsequently, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis and Gene Ontology (GO) enrichment analysis were performed on the differentially expressed genes with the ‘clusterProfiler’ software package. These analyses aimed to identify the KEGG pathways and GO functions of genes that differed between the HMGA2lo and HMGA2hi subgroups. Enrichment was considered significant at P<0.05 or false discovery rate (FDR) <0.05.
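The thresholding step can be sketched as follows. This is an illustration under two stated assumptions: that the adjusted p-values are Benjamini-Hochberg values, and that the fold-change threshold is read as an absolute log2 fold change of at least 1. The gene names, statistics, and function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def select_degs(results, alpha=0.05, min_abs_log2fc=1.0):
    """Split genes into up- and downregulated lists.

    results: list of (gene, log2_fold_change, raw_p_value) tuples.
    The |log2FC| >= 1 cutoff is an assumed reading of the stated threshold.
    """
    adjusted = benjamini_hochberg([p for _, _, p in results])
    up, down = [], []
    for (gene, log2fc, _), q in zip(results, adjusted):
        if q < alpha and abs(log2fc) >= min_abs_log2fc:
            (up if log2fc > 0 else down).append(gene)
    return up, down


# Hypothetical differential expression results (HMGA2hi versus HMGA2lo).
results = [
    ("GENE_A", 2.0, 0.001),
    ("GENE_B", -1.5, 0.004),
    ("GENE_C", 0.3, 0.002),
    ("GENE_D", 1.8, 0.6),
]
up_genes, down_genes = select_degs(results)
print("up:", up_genes, "down:", down_genes)
```

In this toy input, GENE_C fails the fold-change filter and GENE_D fails the adjusted p-value filter, so only GENE_A and GENE_B are retained.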
Immune analysis
The HMGA2lo and HMGA2hi groups were compared via the ‘immunedeconv’ package, which uses the algorithms xCELL, TIMER, QUANTISEQ, MCPCOUNTER, EPIC, CIBERSORT, and CIBERSORT-ABS, to investigate the potential associations between HMGA2 expression and immunological status in the TME. Furthermore, the ‘ggplot2’ package was used to visualize the expression of immune checkpoint genes, including SIGLEC15, TIGIT, CD274, HAVCR2, PDCD1, CTLA4, LAG3, and PDCD1LG2, and their differential expression between the HMGA2lo and HMGA2hi subgroups.
Drug sensitivity analysis
The R package ‘pRRophetic’ was used to perform drug sensitivity analysis. Results with P<0.05 were retained.
Cell culture
HBE, A549, and PC9 cells were used as model cell lines and were cultured at 37 ℃ in a 5% CO2 incubator in 1640 F12K medium containing 5% serum. Cells in the logarithmic growth phase were used for subsequent experiments.
Western blot (WB)
Total cellular protein was extracted with RIPA buffer containing protease inhibitors and denatured in a 75 ℃ water bath. Electrophoresis was performed with a 7.5% stacking gel and a separating gel at a constant voltage of 90 V for 30 min, followed by 120 V for 1 h. Proteins were then transferred to the membrane at 300 V for 2 h. The membrane was incubated with the primary antibody at 4 ℃ overnight and with the secondary antibody for 2 h before the image was developed.
Statistical analysis
Data processing and statistical analysis were performed with GraphPad Prism 9.5 and R (version 4.2.1). Data from the TCGA database were analysed with R packages. The Wilcoxon test (two groups) or the Kruskal-Wallis test (more than two groups) was used to compare HMGA2 expression between groups. Continuous variables are presented as means ± standard deviations or medians with interquartile ranges. Statistical significance was assessed via t-tests or one-way analysis of variance (ANOVA). The WB images were analysed with ImageJ software. All statistical tests were two-sided, with P<0.05 considered statistically significant.
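For readers unfamiliar with the two-group rank-based comparison, the sketch below implements a two-sided Wilcoxon rank-sum (Mann-Whitney U) test. It assumes unpaired groups and a normal approximation with tie correction and no continuity correction, so its p-values may differ slightly from those reported by R. The function names and expression values are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group_x, group_y):
    """Return (U statistic for group_x, two-sided p-value by normal approximation)."""
    n_x, n_y = len(group_x), len(group_y)
    pooled = list(group_x) + list(group_y)
    ranks = average_ranks(pooled)
    u_x = sum(ranks[:n_x]) - n_x * (n_x + 1) / 2.0
    mean_u = n_x * n_y / 2.0
    n = n_x + n_y
    # Tie correction: each set of t tied values reduces the variance.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n_x * n_y / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0.0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value


# Hypothetical expression values for two groups.
group_low = [1.0, 2.0, 3.0]
group_high = [4.0, 5.0, 6.0]
u_stat, p_val = rank_sum_test(group_low, group_high)
print(f"U = {u_stat}, p = {p_val:.4f}")
```

For very small samples an exact test is generally preferable to the normal approximation used here.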
Results
HMGA2 mRNA and protein are differentially expressed in a variety of normal tissues
First, we analysed HMGA2 mRNA and protein expression in normal tissues via the HPA database. As shown in Figure 1A, HMGA2 mRNA is expressed at low levels in most organs, with slightly higher expression in the stomach, male gonads, and bone marrow. HMGA2 protein was expressed at the highest level in the male gonads, followed by the brain, thyroid, lung, digestive tract, female gonads, heart, connective and soft tissues, skin, and bone marrow.
Figure 1B provides an overview of HMGA2 RNA expression, incorporating data from the HPA and the Genotype-Tissue Expression (GTEx) project, reported in normalized transcripts per million (nTPM). These results indicate increased HMGA2 mRNA expression in the bone marrow, testis, and intestinal tissues. Figure 1C, derived from the FANTOM5 project (12), reports expression in scaled tags per million. The testis presented the highest HMGA2 mRNA expression, followed by the colon, adipose tissue, and small intestine. Figure 1D presents HMGA2 protein expression across 44 tissues, with tissues of similar function grouped by color. The results confirmed that the highest HMGA2 protein expression was in the testis, which was consistent with previous findings.
HMGA2 is highly expressed in various malignant tumors
As shown in Figure 2A, HMGA2 expression was upregulated in various malignant tumors. Its expression was significantly higher in tumor tissues than in normal tissues for bladder urothelial carcinoma, breast invasive carcinoma, cholangiocarcinoma, colon adenocarcinoma, esophageal carcinoma, glioblastoma, head and neck cancer, kidney chromophobe, kidney renal clear cell carcinoma, kidney renal papillary cell carcinoma, liver hepatocellular carcinoma, LUSC, prostate adenocarcinoma, rectum adenocarcinoma, stomach adenocarcinoma, thyroid carcinoma, and uterine corpus endometrial carcinoma (P<0.05) (Figure 2B,2C). Additionally, data from the GEPIA2 database revealed similar expression patterns in ovarian serous cystadenocarcinoma, skin cutaneous melanoma, and uterine carcinosarcoma.
High HMGA2 expression is associated with poor prognosis in LUAD
Figure 3A,3B show HMGA2 expression in LUAD and LUSC. HMGA2 expression was significantly higher in LUAD (Figure 3A) and LUSC (Figure 3B) tissues than in normal tissues (P<0.05). Immunohistochemistry and immunofluorescence data from the HPA database demonstrated that HMGA2 expression was lower in normal tissues (Figure 3C) but significantly upregulated in LUAD (Figure 3D) and LUSC (Figure 3E). In A-431 cells, the HMGA2 protein was highly expressed in the cytoplasm (Figure 3F). Survival analysis via the KM plotter database revealed that high HMGA2 expression is associated with poorer OS (Figure 3G), PFS (Figure 3H), and FP (Figure 3I) in LUAD patients but not in LUSC patients (OS, Figure 3J; PFS, Figure 3K; FP, Figure 3L).
HMGA2 expression is associated with EGFR mutations
Figure 4A shows the correlation between HMGA2 expression and common gene mutations in LUAD and LUSC. HMGA2 expression was significantly associated with EGFR mutation status only in LUAD patients (P=0.003). As shown in Figure 4B, HMGA2 expression was higher in the EGFRmut group than in the EGFRwt group, and both groups presented higher HMGA2 expression than normal tissues did. Survival analysis (Figure 4C,4D) revealed that high HMGA2 expression was associated with poorer OS in EGFRmut LUAD patients but not in EGFRwt LUAD patients. Targeted therapy is recommended for patients with advanced EGFRmut NSCLC. To further identify potential therapeutic options related to HMGA2, we performed drug sensitivity analysis via the R package ‘pRRophetic’, which revealed a significant association between HMGA2 expression and sensitivity to chemotherapeutic agents such as paclitaxel (Figure 4E-4G).
Subgroup analysis on the basis of clinical factors, including age, sex, TNM stage, and ethnicity, revealed that HMGA2 expression was significantly higher in stage N2 than in stage N0 but did not differ significantly between stage N2 and stage N1; no significant differences were found in the other subgroups (Figure S1). WB analysis confirmed the expression differences in cell lines (Figure 4H,4I): HMGA2 expression was higher in LUAD cell lines (A549 and PC9) than in normal lung epithelial cells [human bronchial epithelial (HBE)] and was higher in EGFRmut LUAD cells (PC9) than in EGFRwt LUAD cells (A549).
Functional and pathway enrichment analysis
Figure 5A presents the top 50 genes correlated with HMGA2 expression. A protein-protein interaction (PPI) network analysis via Cytoscape identified 18 hub genes (Figure 5B). GO analysis (Figure 5C) revealed that HMGA2 is associated with biological processes such as extracellular matrix (ECM) organization, extracellular structure organization, and embryonic organ development. In terms of cellular components, HMGA2 is linked to the collagen-containing ECM and endoplasmic reticulum lumen. Molecular function analysis revealed associations with ECM structural constituents and glycosaminoglycan binding. KEGG pathway analysis (Figure 5D) revealed that HMGA2 is involved in ECM-receptor interactions, nitrogen metabolism, arrhythmogenic right ventricular cardiomyopathy, and complement and coagulation cascades. Gene set enrichment analysis (GSEA) (Figure 5E) further revealed that genes upregulated in the HMGA2hi subgroup were enriched in pathways such as antigen processing and presentation, cytosolic DNA sensing, ECM-receptor interaction, and Toll-like receptor signalling.
Immune infiltration and immune checkpoint analysis
Various algorithms were used to assess the relationship between HMGA2 expression and immune cell infiltration in the EGFRmut and EGFRwt groups (Figure 6A). HMGA2 expression is positively correlated with M0 macrophages, cancer-associated fibroblasts (CAFs), natural killer (NK) cells, and myeloid dendritic cells, but negatively correlated with CD8+ T cells and B cells. Further analysis (Figure 6B) revealed that high HMGA2 expression was associated with increased infiltration of M0 macrophages and resting mast cells. Immune checkpoint analysis (Figure 6C) revealed that HMGA2 expression was correlated with lower human leukocyte antigen (HLA)-related protein levels. Myeloid-derived suppressor cell (MDSC) analysis (Figure 6D) revealed a positive correlation between HMGA2 and NOS2, as well as associations with OLR1 and STAT6. Exhausted T cell analysis (Figure 6E) revealed a negative correlation between HMGA2 and markers such as GZMK and HAVCR2. Chemokine analysis (Figure 6F) revealed a positive correlation between HMGA2 and CXCL5 and a negative correlation with CXCL16 and CXCL17. CAF analysis (Figure 6G) revealed that HMGA2 is positively correlated with PDGFRB and COL1A1.
Discussion
Clinical and prognostic significance of HMGA2 in LUAD
This study systematically analyzed the expression and clinical significance of HMGA2 in NSCLC, especially in EGFR-mutated LUAD. We integrated the transcriptome data, survival analysis, immune infiltration analysis, pathway enrichment analysis and protein validation results from multiple databases. The results showed that HMGA2 was significantly elevated in LUAD and was closely associated with poor prognosis. Patients with high expression of HMGA2 had shorter OS, PFS, and time to FP. Therefore, HMGA2 may be a stable prognostic biomarker.
Previous studies have shown that HMGA2 is a cancer-related protein that can promote tumor cell proliferation, invasion and metastasis (13). This study further expands these findings to LUAD and suggests that HMGA2 has specific prognostic value in this subtype.
EGFR mutation-specific association and molecular heterogeneity
One of the important findings of this study is that HMGA2 is closely associated with EGFR mutation status. We observed that HMGA2 is expressed at a higher level in EGFR-mutated LUAD and that its prognostic value is mainly evident in this group of patients, whereas it is not significant in EGFRwt patients. This indicates that LUAD exhibits significant molecular heterogeneity and suggests that HMGA2 can serve as an additional risk indicator beyond EGFR status (14).
EGFR mutant tumors show significant differences in their response to targeted therapy, and are prone to developing drug resistance. Therefore, the search for new prognostic markers holds significant clinical importance. Our results suggest that HMGA2 may provide additional information beyond the traditional clinical indicators (such as TNM staging).
Biological and immunological implications
This study did not conduct direct functional experiments, but pathway analysis and immune infiltration analysis provided some biological information. The genes co-expressed with HMGA2 are mainly enriched in ECM remodeling, ECM-receptor interactions, metabolic regulation, and immune regulation pathways. These processes are closely related to tumor invasion, changes in the stroma, and treatment resistance. Therefore, HMGA2 may be involved in the regulation of the TME.
The immune infiltration results further indicated that high HMGA2 expression was associated with changes in the proportions of various immune cells. Specifically, HMGA2 is negatively correlated with CD8+ T cells and B cells (15) but positively correlated with macrophages and CAFs (16). This pattern suggests that high HMGA2 expression may be associated with an immunosuppressive microenvironment and may contribute to tumor progression and a decreased response to treatment (17).
These results indicate that HMGA2 is not only a molecular marker but may also be involved in the interaction between tumors and the microenvironment.
Clinical relevance and therapeutic implications
From a clinical perspective, HMGA2 may have practical utility. As a prognostic marker, HMGA2 could help identify high-risk patients and guide more intensive follow-up or more aggressive treatment strategies. Furthermore, the drug sensitivity analysis revealed that HMGA2 expression is associated with the response to certain chemotherapy drugs. This suggests that HMGA2 may be helpful in individualized treatment selection.
It should be noted that the “treatment-related significance” proposed in this study is mainly based on correlation analysis and pathway speculation, and has not been directly verified through functional experiments.
Study limitations and future directions
This study serves as an initial integrative analysis focusing on the clinical and molecular associations of HMGA2 in EGFRmut NSCLC. Although functional experiments were not included in the current study, related mechanistic investigations are ongoing and will be reported separately. Therefore, the current proposed therapeutic significance of HMGA2 is mainly based on clinical correlation analysis and pathway enrichment results, and further experimental verification is still required.
Furthermore, this study has some limitations. First, it is mainly based on retrospective analysis of data from public databases. Second, independent external clinical cohorts are still lacking for verification. Future studies will include both in vitro and in vivo experiments to determine whether targeting HMGA2 can directly inhibit tumor growth or enhance therapeutic effects.
Conclusions
HMGA2 expression is positively correlated with the EGFR mutation status, suggesting its potential as a diagnostic and prognostic marker for LUAD.
Acknowledgments
Yichen Yang especially wishes to thank Yang Dong, Huilan Su, Wenwen Xia, and Qiqi Li for their strong support and company.
Footnote
Reporting Checklist: The authors have completed the REMARK reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2371/rc
Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2371/prf
Funding: This work was supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2371/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Wu YL, Zhou Q. Combination Therapy for EGFR-Mutated Lung Cancer. N Engl J Med 2023;389:2005-7.
2. Wang Q, Gao W, Gao F, et al. Efficacy and acquired resistance of EGFR-TKI combined with chemotherapy as first-line treatment for Chinese patients with advanced non-small cell lung cancer in a real-world setting. BMC Cancer 2021;21:602.
3. Yin X, Liu X, Ren F, et al. The later-line efficacy and safety of immune checkpoint inhibitors plus anlotinib in EGFR-mutant patients with EGFR-TKI-resistant NSCLC: a single-center retrospective study. Cancer Immunol Immunother 2024;73:134.
4. Yu D, Kane MJ, Koay EJ, et al. Machine learning identifies prognostic subtypes of the tumor microenvironment of NSCLC. Sci Rep 2024;14:15004.
5. She Y, Jin Z, Wu J, et al. Development and Validation of a Deep Learning Model for Non-Small Cell Lung Cancer Survival. JAMA Netw Open 2020;3:e205842.
6. Hammond SM, Sharpless NE. HMGA2, microRNAs, and stem cell aging. Cell 2008;135:1013-6.
7. Young AR, Narita M. Oncogenic HMGA2: short or small? Genes Dev 2007;21:1005-9.
8. Wang X, Wang J, Wu J. Emerging roles for HMGA2 in colorectal cancer. Transl Oncol 2021;14:100894.
9. Ma Q, Ye S, Liu H, et al. The emerging role and mechanism of HMGA2 in breast cancer. J Cancer Res Clin Oncol 2024;150:259.
10. Wang X, Wang J, Zhao J, et al. HMGA2 facilitates colorectal cancer progression via STAT3-mediated tumor-associated macrophage recruitment. Theranostics 2022;12:963-75.
11. Dai FQ, Li CR, Fan XQ, et al. miR-150-5p Inhibits Non-Small-Cell Lung Cancer Metastasis and Recurrence by Targeting HMGA2 and β-Catenin Signaling. Mol Ther Nucleic Acids 2019;16:675-85.
12. Lizio M, Harshbarger J, Shimoji H, et al. Gateways to the FANTOM5 promoter level mammalian expression atlas. Genome Biol 2015;16:22.
13. Khazem F, Zetoune AB. Decoding high mobility group A2 protein expression regulation and implications in human cancers. Discov Oncol 2024;15:322.
14. Gong Z, Du M, Li Y, et al. Machine learning identifies TIME subtypes linking EGFR mutations and immune states in lung adenocarcinoma. NPJ Digit Med 2025;8:796.
15. Zhao C, Cheng L, Li A, et al. EGFR-Mutant Lung Adenocarcinoma Cell-Derived Exosomal miR-651-5p Induces CD8+ T Cell Apoptosis via Downregulating BCL2 Expression. Biomedicines 2025;13:482.
16. Gu X, Zhu Y, Su J, et al. Lactate-induced activation of tumor-associated fibroblasts and IL-8-mediated macrophage recruitment promote lung cancer progression. Redox Biol 2024;74:103209.
17. Kim S, Koh J, Kim TM, et al. Remodeling of tumor microenvironments by EGFR tyrosine kinase inhibitors in EGFR-mutant non-small cell lung cancer. iScience 2025;28:111736.

