Analyses of single-cell and bulk RNA-sequencing datasets identify biomarker-based molecular subtypes and a prognostic signature in lung adenocarcinoma
Original Article

Analyses of single-cell and bulk RNA-sequencing datasets identify biomarker-based molecular subtypes and a prognostic signature in lung adenocarcinoma

Minshan Fang, Xiuxia Zheng

Oncology, Fujian Provincial Tuberculosis Prevention and Treatment Hospital, Fuzhou, China

Contributions: (I) Conception and design: Both authors; (II) Administrative support: X Zheng; (III) Provision of study materials or patients: M Fang; (IV) Collection and assembly of data: M Fang; (V) Data analysis and interpretation: M Fang; (VI) Manuscript writing: Both authors; (VII) Final approval of manuscript: Both authors.

Correspondence to: Xiuxia Zheng, Master’s Degree. Oncology, Fujian Provincial Tuberculosis Prevention and Treatment Hospital, No. 8, Beiyuan Road, Cangshan District, Fuzhou 350008, China. Email: 251505079@qq.com.

Background: Immune checkpoint blockade has improved outcomes for a subset of patients with lung adenocarcinoma (LUAD), yet primary and acquired resistance remain common. Given the heterogeneity of the tumor microenvironment (TME), biomarkers that reflect both tumor-intrinsic programs and immune context are needed for prognostic stratification and for generating clinically relevant hypotheses. The aim of this study was to identify immune-associated genes and construct a prognostic signature for LUAD by integrating single-cell and bulk transcriptomic data.

Methods: We integrated single-cell RNA sequencing (scRNA-seq; GSE131907) with bulk transcriptomic profiles from The Cancer Genome Atlas (TCGA)-LUAD to identify immune-associated genes and construct a prognostic signature. Major cell populations were delineated and annotated using reference-based methods, and immune-associated differentially expressed genes (DEGs) were derived from the scRNA-seq analysis. Genes consistently dysregulated in both scRNA-seq and bulk data were carried forward for functional enrichment, survival analyses, and molecular subtype discovery. A multigene risk score was built using Cox regression and validated in an independent Gene Expression Omnibus (GEO) cohort (GSE13213) and evaluated using Kaplan-Meier analysis, time-dependent discrimination metrics, and multivariable adjustment for clinical covariates. Immune infiltration and immune functional states were characterized using Estimation of STromal and Immune cells in MAlignant Tumor tissues (ESTIMATE), single-sample gene set enrichment analysis (ssGSEA), Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts (CIBERSORT), and Tumor Immune Dysfunction and Exclusion (TIDE). Protein-level evidence was examined using the Human Protein Atlas (HPA), and messenger RNA (mRNA) expression was assessed in paired clinical specimens by reverse transcription quantitative polymerase chain reaction (RT-qPCR) (n=10 pairs).

Results: Across datasets, we identified immune-associated genes consistently altered in LUAD and enriched for antigen presentation and immune response pathways. A four-gene signature (HLA-DRB5, LDHA, ENO1, and TIMP1) stratified patients into low- and high-risk groups with significantly different overall survival (OS) and remained informative after adjustment for major clinical variables. High-risk tumors displayed an immune-suppressive phenotype, including reduced immune infiltration and attenuated antigen-presentation programs, together with higher predicted immune evasion signals in computational immunotherapy-response inference.

Conclusions: Our integrative analysis nominates a four-gene immune-associated signature that captures prognostic heterogeneity in LUAD and is linked to distinct TME states. These findings provide candidate biomarkers and testable hypotheses for immunotherapy-related mechanisms, which require validation in independent immunotherapy-treated cohorts and functional experiments.

Keywords: Lung adenocarcinoma (LUAD); single-cell RNA sequencing (scRNA-seq); tumor microenvironment (TME); prognostic signature; immune infiltration; immunotherapy response inference


Submitted Jan 04, 2026. Accepted for publication Mar 19, 2026. Published online Apr 28, 2026.

doi: 10.21037/tcr-2026-1-0025


Highlight box

Key findings

• Integration of single-cell and bulk transcriptomics identified immune-associated genes consistently altered in lung adenocarcinoma (LUAD).

• A four-gene signature (HLA-DRB5, LDHA, ENO1, and TIMP1) stratified patients into low- and high-risk groups with significantly different overall survival.

• High-risk tumors displayed an immune-suppressive phenotype with reduced immune infiltration and attenuated antigen presentation.

What is known and what is new?

• Immune checkpoint blockade benefits only a subset of LUAD patients, and effective prognostic biomarkers reflecting tumor microenvironment (TME) heterogeneity are lacking.

• This study provides an integrative four-gene immune-associated signature that captures both prognostic heterogeneity and distinct TME immune states in LUAD.

What is the implication, and what should change now?

• The signature offers candidate biomarkers for prognostic stratification in LUAD patients.

• It generates testable hypotheses for immunotherapy-related mechanisms.

• Prospective validation in immunotherapy-treated cohorts and functional experiments are warranted before clinical application.


Introduction

Worldwide, lung cancer is the leading cause of death from cancer, presenting a high mortality rate in both genders (1). A considerable proportion of patients are diagnosed at an advanced stage (2). The primary objective in treating advanced lung cancer is to extend survival, alleviate symptoms, and enhance quality of life. Systemic therapies, including chemotherapy, targeted therapy, and immunotherapy, are central to the management of advanced disease (3). Histologically, non-small cell lung cancer (NSCLC) accounts for approximately 85% of lung cancer cases, with lung squamous cell carcinoma (LUSC) and lung adenocarcinoma (LUAD) being the most common subtypes (4,5). Over the past decade, significant strides have been made in targeted therapies and immunotherapy, demonstrating promising efficacy in numerous advanced NSCLC cases. Nonetheless, immune checkpoint inhibitor (ICI) treatment is only beneficial for a small percentage of LUAD patients (6). Therefore, identifying robust biomarkers and developing reliable predictive models are essential for evaluating prognosis and treatment response in LUAD.

The tumor microenvironment (TME) encompasses various cel both of which have shown promising efficacy in subsets of patients with advanced NSCLC lular and non-cellular components (CCs), with immune cells playing a crucial role in tumor initiation and progression (7,8). Multiple factors influence the tumor immune microenvironment (9,10), including the level of local immune cell infiltration, intracellular signaling pathway activation, and characteristic gene expression patterns. Cellular states are shaped by phenotypic and functional characteristics that are ultimately governed by gene expression programs (11). Specifically, cells respond to external signals by activating relevant signal transduction pathways, allowing specific transcription factors to regulate gene expression and determine cell fate. Dynamic changes in gene expression facilitate biological processes (BPs) during development, homeostasis, and responses to stress. Notably, transcriptional plasticity drives complex functional outputs in innate immune cells such as macrophages (12,13). However, the antitumor roles of immune-cell-associated differentially expressed genes (DEGs) in LUAD remain insufficiently understood. Further investigation of these genes and their associations with prognosis and treatment response is therefore warranted.

Single-cell sequencing technology is a high-throughput method used for sequencing and analyzing the genome, transcriptome, and epigenome at the individual cell level (14,15). Researchers may examine cellular heterogeneity and cell-to-cell interactions using this approach (16,17). Increasing attention has been paid to predicting immunotherapy response by characterizing interactions between tumor cells and immune cells. Such analyses may help identify novel therapeutic targets, clarify cellular heterogeneity, and reveal mechanisms underlying sensitivity or resistance to targeted therapy and immunotherapy. This approach may facilitate more individualized treatment strategies (18-20). Several prognostic gene signatures and prediction models for LUAD have been reported, however, many are based solely on bulk transcriptomic data, offering limited cell-type resolution and insufficient insight into the immune microenvironment. In addition, some models show reduced cross-platform generalizability. Therefore, we integrated single-cell RNA sequencing (scRNA-seq) and bulk RNA-seq data to derive immune-associated predictors and develop and externally validate a prognostic model for LUAD overall survival (OS).

scRNA-seq and RNA-seq data of LUAD were acquired from public databases for extensive analyses in this work, which sought to find genes that were differently expressed in immune cells and create a predictive model. Using the Gene Expression Omnibus (GEO) dataset, the model’s efficacy was further confirmed. Additionally, different subgroups’ immunological responses, immune cell infiltration, and immune checkpoint gene expression levels were investigated. These findings offer potential new strategies to enhance the clinical efficacy of immunotherapy. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0025/rc).


Methods

Collection of datasets

RNA-seq data for 59 healthy lung tissues, 509 LUAD specimens, and the clinical features for 91 The Cancer Genome Atlas (TCGA)-LUAD samples were obtained from the TCGA website (https://portal.gdc.cancer.gov/) (21). Dataset GSE131907, including 11 tumor and 11 distant normal lung samples, was retrieved from the GEO (22). For external validation, the gene-expression profile matrices of the LUAD cohort GEO: GSE13213 were acquired from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). TCGA-LUAD is an RNA-seq cohort, whereas GSE13213 is a microarray dataset. To improve cross-platform comparability, expression values from both datasets were log2-transformed and normalized before downstream analyses. When multiple entries corresponded to the same gene symbol, the average expression value was retained. R software version 4.3.3 (https://www.r-project.org/) was used, and the sva package (http://bioconductor.org/packages/release/bioc/html/sva.html) was utilized to reduce batch effects. Because all cohorts were obtained from public repositories (TCGA and GEO) and downloaded as de-identified datasets, accrual and follow-up end dates were not consistently available in the retrieved records; therefore, we did not report specific accrual/follow-up calendar dates. TCGA-LUAD served as the development cohort, and external validation was performed in a public GEO cohort (GSE13213). The scRNA-seq dataset (GSE131907) was used to identify immune-cell-associated DEGs. Eligibility criteria were: (I) available gene-expression profiles; and (II) available OS time and status. Samples/patients with missing survival information were excluded from survival modeling analyses. Missing data were handled using complete-case analysis: patients with missing values in OS or required predictors were excluded from the corresponding analyses. The number of excluded cases is reported in the “Results” (participant flow).

Single-cell data analysis

The scRNA-seq data underwent analysis using R’s Seurat package. Initially, the “NormalizeData” function was employed to normalize the scRNA-seq data, followed by the utilization of the “ScaleData” function to scale the data. Subsequently, the “FindVariableFeatures” function was utilized to identify the top 2,000 genes exhibiting high variability. Following this, the “RunPCA” function facilitated dimensionality reduction analysis, significant principal components were identified using the JackStraw procedure, which were subsequently utilized for further analysis. Cells were then clustered using the graph-based clustering algorithm via the “FindNeighbors” function, and a single-cell map was generated using “FindClusters” with the resolution set to 0.5. Moreover, the “FindAllMarkers” function was employed to detect gene expression markers in cells. These analyses were conducted using the Seurat package (version 5.0.3) (23) in R. Lastly, the R package SingleR (version 2.0.0) (24) and the CellMarker dataset (25) were utilized to annotate the cells in this study.

Differential gene expression analysis and functional enrichment analysis

Differential gene expression analysis was performed on the TCGA LUAD messenger RNA (mRNA) dataset using the DESeq2 R package. The Benjamini-Hochberg procedure was applied to control the false discovery rate (FDR) for multiple hypothesis testing. For GEO microarray datasets, differential expression analysis was conducted using the limma package (26). DEGs were identified with a threshold of adjusted P<0.05 and |log2fold change (FC)| >1 (27). Volcano plots were generated using the EnhancedVolcano package to visualize DEG distributions. To elucidate the biological functions of identified DEGs, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed using the clusterProfiler R package (28,29). Enriched terms with adjusted P<0.05 (Benjamini-Hochberg correction) were considered statistically significant.

Prognostic values of the prognostic gene signature and nomogram establishment

OS was defined as the time from the recorded baseline time point (diagnosis provided by TCGA) to death from any cause; patients alive at last follow-up were censored. To identify prognosis-related genes, univariate Cox proportional hazards regression analysis was performed on each DEG in the TCGA training cohort (30,31). Genes with P<0.05 were subsequently included in multivariate Cox regression analysis using the “coxph” function of the “survival” package to construct a prognostic model (32). The risk score for each patient was calculated as follows: Risk score = Σ(βi × Expi), where βi represents the regression coefficient of each selected gene and Expi denotes its expression level. Patients were stratified into high-risk and low-risk groups based on the median risk score in the TCGA cohort, and the same formula and cutoff were applied to the GEO validation cohort (GSE13213). Survival differences between risk groups were evaluated using the Kaplan-Meier method with log-rank tests implemented in the “survfit” function. Time-dependent receiver operating characteristic (ROC) curves were generated using the “survivalROC” package to assess the predictive accuracy of the risk score. The relationships between gene expression, patient survival time, and event status were visualized using heatmaps. To develop a clinically applicable tool, a nomogram integrating six variables [age, sex, pathological tumor (T) stage, node (N) stage, metastasis (M) stage, smoking and risk score] was constructed using the “rms” package in R. The nomogram’s predictive performance was evaluated using the concordance index (C-index) and calibration curves (33). Model performance was assessed through independent external validation in the GEO cohort (GSE13213); no internal validation was performed.

Immune responses of the prognostic gene signature

The prognostic gene signature’s immunological responses to immunotherapies may be predicted using the Tumor Immune Dysfunction and Exclusion (TIDE) database (http://tide.dfci.harvard.edu/) (34). Patients were stratified into high- and low-risk groups according to the median risk score, and TIDE score, T-cell dysfunction score, and T-cell exclusion score were compared between the two groups. Higher TIDE scores indicate a higher likelihood of immune evasion and a lower probability of benefiting from immunotherapy (35). This analysis provides valuable insights into the potential efficacy of immunotherapies based on the immune landscape associated with the identified prognostic gene signature.

Immune landscape exploration

To examine the immune landscape across high- and low-risk groups as well as across molecular subtypes, three immune-associated algorithms were used. The study employed single-sample gene set enrichment analysis (ssGSEA) to investigate the immune capabilities and activity of macrophages in individual samples. Based on the ratio of immune cells to stromal cells, the Estimation of STromal and Immune cells in MAlignant Tumor tissues (ESTIMATE) algorithm made it easier to calculate immune scores, stromal scores, ESTIMATE scores, and tumor purity. In addition, the immune cell population composition in each LUAD sample was determined using the Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts (CIBERSORT) method. Moreover, aggregation analysis and signatures were used to compare the expression levels of MHC molecules. Regarding immune checkpoints, common immunosuppressive molecules were preliminarily compared based on clustering and risk levels. These comprehensive analyses provide insights into the immune landscape and potential immunotherapeutic targets in LUAD.

Sangerbox tools and Human Protein Atlas (HPA) database

The Sangerbox version 2.0 website (http://www.sangerbox.com/tool) single-gene pan-cancer tools were used to assess the associations between tumor mutational burden (TMB) and the expression of prognostic genes based on TCGA-LUAD, purity, with the cutoff criterion of P<0.05 (36). The HPA database offers information on the patterns of protein expression in both healthy and sick tissues. In our study, we confirmed the expression of candidate genes (HLA-DRB5, LDHA, ENO1, and TIMP1) in LUAD and normal lung tissues by immunohistochemical data from the HPA database.

Reverse transcription quantitative polymerase chain reaction (RT-qPCR) validation

Ten matched pairs of LUAD tumor tissues and adjacent normal lung tissues were collected from patients who underwent surgical resection at the First Affiliated Hospital of Fujian Medical University. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Medical Research and Clinical Technology Application Subcommittee of the Ethics Committee of the Fujian Provincial Tuberculosis Prevention and Treatment Hospital (No. [2021]024), and all patients provided written informed consent. Tissue samples were fixed in formalin, embedded in paraffin, and sectioned for analysis. Total RNA was extracted using TRIzol reagent (TaKaRa, Kyoto, Japan) according to the manufacturer’s protocol. First-strand cDNA was synthesized from 1 µg of total RNA using the PrimeScript RT Reagent Kit with gDNA Eraser (TaKaRa). RT-qPCR was performed on an ABI 7500 system (Applied Biosystems, Waltham, MA, USA) using SYBR Green detection. The mRNA expression levels of HLA-DRB5, LDHA, ENO1, and TIMP1 were quantified, with all primer sequences provided in Table S1.

Statistical analysis

All statistical analyses were performed using R software (version 4.5.1). Differentially expressed genes were identified using the Wilcoxon rank-sum test with a false discovery rate (FDR) <0.05. Overall survival (OS) was estimated using the Kaplan-Meier method and compared between risk groups using the log-rank test. A multivariate Cox proportional hazards regression model was used to assess the independent prognostic value of the risk score after adjusting for clinical covariates (age, gender, stage). The risk score was calculated as a linear combination of gene expression values weighted by their Cox regression coefficients. Time-dependent receiver operating characteristic (ROC) curves were used to evaluate predictive accuracy. Correlations between the risk score and immune infiltration levels were assessed using Spearman’s rank correlation coefficient. All tests were two-sided, and a P value <0.05 was considered statistically significant.


Results

Participant flow and analysis cohorts: a total of 509 LUAD tumor specimens were available in TCGA; after excluding 10 cases with missing OS information, 499 patients were included for model development. The external validation cohort comprised 117 LUAD patients from GSE13213 with complete OS information (49 deaths; median observed survival time 67.0 months).

Identification and analysis of differential expression characteristics of single cells

Drawing on single cell profiles from GSE131907, we extracted the gene expression data from 88,145 cells that were derived from 22 samples (11 patient samples and 11 normal lung tissue samples) (Figure 1A-1C). Through principal component analysis (PCA) dimensionality reduction, we identified eight distinct cell populations—myeloid cells, T cells, endothelial cells, fibroblasts, natural killer (NK) cells, epithelial cells, B lymphocytes, and mast cells—each defined by their unique marker gene expression profiles (Figure 1D). The cell expression profiles of patients and controls were then distinguished, and the differential expression profiles were identified using Seurat’s FindMarkers module. A total of 1,735 LUAD immune cell differential genes were found (available online: https://cdn.amegroups.cn/static/public/tcr-2026-1-0025-1.xlsx).

Figure 1 Identification of the LUAD-associated cell subtypes. (A) Quality control indicators; (B) genes with a high degree of variability between cells; (C) UMAP plot classifying cell clusters based on scRNA-seq data; (D) UMAP plot identifying the various cell subtypes. LUAD, lung adenocarcinoma; scRNA-seq, single-cell RNA sequencing; UMAP, uniform manifold approximation and projection.

Identification of DEGs and functional analysis

The TCGA cohort comprised 499 LUAD patients with available survival statistics and clinicopathological information (available online: https://cdn.amegroups.cn/static/public/tcr-2026-1-0025-2.xlsx). A total of 6,486 DEGs were identified from the TCGA LUAD dataset, including 2,267 up-regulated and 590 down-regulated genes (Figure 2A). Among these, 101 DEGs were also detected in GEO single-cell cohorts, as illustrated in the volcano plot (Figure 2B). The overlap between genes identified from single-cell and bulk RNA sequencing data was visualized using a Venn diagram (Figure 2C). GO enrichment analysis revealed that the BPs were primarily associated with the adaptive immune response based on somatic recombination of immune receptors derived from immunoglobulin superfamily domains, leukocyte-mediated immunity, and the humoral immune response (Figure 2D). Regarding CCs, the genes were mainly enriched in the immunoglobulin complex (IgG), blood microparticles, and the external side of the plasma membrane (Figure 2E). In the molecular function (MF) category, significant enrichment was observed in antigen binding, immunoglobulin receptor binding, and fatty acid binding (Figure 2F). KEGG pathway analysis indicated that these genes were significantly involved in osteoclast differentiation, glycolysis/gluconeogenesis, and the interleukin (IL)-17 signaling pathway (Figure 2G). Collectively, these findings suggest that the overlapping gene set is closely associated with immune-related functions.

Figure 2 Identification of differentially expressed genes and functional enrichment analysis. (A) Volcano plot of DEGs from TCGA RNA-seq data. (B) Volcano plot of DEGs from GEO single-cell RNA sequencing data (GSE131907). (C) Venn diagram showing the overlap between DEGs identified from single-cell and bulk RNA sequencing datasets. (D-F) GO enrichment analysis of the overlapping DEGs: (D) BP, (E) CC, and (F) MF. (G) KEGG pathway enrichment analysis of the overlapping DEGs. BP, biological process; CC, cellular component; DEGs, differentially expressed genes; FC, fold change; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; MF, molecular function; RNA-seq, RNA sequencing; TCGA, The Cancer Genome Atlas.

Identification of prognostic gene signature in TCGA training cohorts and identification of molecular subtypes

Nineteen DEGs (red for risk genes and green for protective genes) were shown to be substantially correlated with OS using univariate Cox regression analysis (Figure 3A). Figure 3B shows how the expression of these genes differs in LUAD patients compared to normal tissues. Molecular subtype analysis based on these 19 prognostic genes clustered LUAD patients into two subgroups with good internal consistency and constraint of characteristics (Figure 3C-3E). In addition, cluster 1 showed a better prognosis than cluster 2 (Figure 3F). The heatmap showed that there were differences in gene expression between the two clusters, showed substantial correlations with clinicopathological variables including stage, N stage, and T stage, but no discernible variations were seen in terms of gender, age, or M stage (Figure 3G).

Figure 3 Identification of prognostic gene signature and molecular subtypes. (A) Forest plot of univariate Cox regression analysis for the candidate DEGs. (B) Expression levels of survival-related genes in tumor and adjacent normal tissues. (C) Consensus clustering matrix for k=2. (D) Consensus clustering CDF for k values ranging from 2 to 10. (E) Relative change in area under the CDF curve for k=2. (F) Kaplan-Meier survival curves showing significant differences in overall survival between cluster 1 and cluster 2. (G) Heatmap illustrating the expression patterns of the 19 prognostic genes across the two molecular subtypes, along with their associations with clinical factors. ****, P<0.0001. CDF, cumulative distribution function; DEG, differentially expressed gene; FPKM, fragments per kilobase of transcript per million mapped reads; M, metastasis; N, node; T, tumor.

Construction and validation of prognostic gene signature

Four genes were selected to construct the prognostic model using multivariable Cox regression analysis (Figure 4A). Patients were divided into high- and low-risk groups according to the median risk score, and Figure 4B shows the patients’ survival rates as well as the expression of the model’s genes. As the risk score increased, patient survival decreased, accompanied by corresponding changes in follow-up status and the expression of the four model genes. As anticipated, HLA-DRB5 gene was a protective factor and showed a downward trend in expression with increasing risk score, while LDHA, ENO1 and TIMP1 genes were risk factors and showed an upward trend in expression with increasing risk score. In addition, the Kaplan-Meier survival curve showed that the low-risk group had better survival (Figure 4C). The ROC curve for the probability of survival risk score from 1 to 5 years showed a maximum area under the curve (AUC) of 0.8 (Figure 4D), indicating good sensitivity and specificity of the model. The same evaluation method was applied to the independent validation set from GEO. Based on the risk score, patients were divided into high- and low-risk groups, with the low-risk group showing a significantly better prognosis (Figure 4E). The time-dependent ROC curves for 1‑, 3‑, and 5‑year survival predictions in the GEO cohort are shown in Figure 4F. Furthermore, comparisons of survival status and risk‑related gene expression between the high‑ and low‑risk groups revealed marked differences (Figure 4G).

Figure 4 Characteristics and validation of the prognostic gene signature. (A) Forest plot of the four genes selected for the signature using multivariate Cox regression analysis in the TCGA training cohort. (B) Distribution of risk scores, survival time, and patient status in the TCGA cohort. (C) Kaplan-Meier survival curves for high-risk and low-risk groups in the TCGA cohort. (D) Time-dependent ROC curves for 1-, 3-, and 5-year survival predictions in the TCGA cohort. (E) Kaplan-Meier survival curves for high- and low-risk groups in the GEO validation cohort (GSE13213). (F) Time-dependent ROC curves for 1-, 3-, and 5-year survival predictions in the GEO validation cohort. (G) Distribution of risk scores, survival time, and patient status in the GEO validation cohort. AUC, area under the curve; CI, confidence interval; GEO, Gene Expression Omnibus; HR, hazard ratio; ROC, receiver operating characteristic; TCGA, The Cancer Genome Atlas.

Univariate Cox regression analysis identified several clinicopathological factors significantly associated with overall survival in LUAD patients (Figure 5A). Stage III disease (HR =2.455, 95% CI: 1.311–4.598, P=0.005), T stage (HR =2.175, 95% CI: 1.200–3.941, P=0.01), and N stage (HR =0.524, 95% CI: 0.305–0.902, P=0.02) were significantly correlated with patient prognosis. Notably, among smoking status, patients who had quit smoking for more than 15 years exhibited a significantly lower risk of death compared to current smokers (HR =0.356, 95% CI: 0.141–0.899, P=0.03). Multivariate Cox regression analysis revealed that male sex was an independent prognostic factor for poor survival (Figure 5B; HR =1.857, 95% CI: 1.041–3.315, P=0.04). Importantly, long-term smoking cessation (>15 years) remained a significant protective factor after adjusting for other covariates, with a 66.5% reduction in mortality risk compared to current smokers (HR =0.335, 95% CI: 0.129–0.867, P=0.02). T stage showed a trend toward statistical significance (HR =2.441, 95%CI: 0.973–6.124, P=0.057). The distribution of risk scores across different age groups, tumor stages, and smoking status is shown in Figure 5C-5E, respectively. A nomogram integrating age, smoking status, and tumor stage was constructed to predict 1-, 3-, and 5-year survival probabilities (Figure 5F). The calibration plots demonstrated excellent agreement between the nomogram-predicted and actual survival outcomes at 1-, 3-, and 5-year time points, indicating robust predictive performance of the model (Figure 5G). To explore the underlying biological mechanisms, gene set enrichment analysis (GSEA) was performed to compare the high- and low-risk groups. Genes in the high-risk group were significantly enriched in pathways related to cell cycle, pyrimidine metabolism, spliceosome, mismatch repair, and p53 signaling pathway. In contrast, genes in the low-risk group were predominantly enriched in vascular smooth muscle contraction, primary bile acid biosynthesis, ABC transporters, and aldosterone-regulated sodium reabsorption pathways (Figure 5H).

Figure 5 Cox regression analysis and nomogram in TCGA LUAD cohort. (A,B) Forest plots of univariate (A) and multivariate (B) Cox regression analysis showing hazard ratios and 95% CIs for clinicopathological factors. (C-E) Distribution of risk scores across different age groups (C), tumor stages (D), and smoking status (E). (F) Nomogram integrating age, smoking status, and tumor stage for predicting 1-, 3-, and 5-year survival probabilities. (G) Calibration plots demonstrating agreement between nomogram-predicted and actual survival at 1, 3, and 5 years. (H) GSEA enrichment analysis showing significantly enriched pathways in high- vs. low-risk groups. **, P<0.01. CI, confidence interval; GSEA, gene set enrichment analysis; LUAD, lung adenocarcinoma; TCGA, The Cancer Genome Atlas.

Analysis of tumor immune dysfunction and rejection in high- and low-risk groups

Six molecular immune subtypes have been found by previous investigations; these subtypes are related to tumor molecular features and patient prognosis. They include wound-healing (C1), interferon-gamma (IFN-γ) dominant (C2), inflammation (C3), lymphocyte depletion (C4), immunoquiescent (C5), and TGF-β dominant (C6) (37). We analyzed the distribution of risk scores across these six subtypes. In our LUAD cohort, samples were classified into five subtypes (C1, C2, C3, C4, and C6), with no samples assigned to C5 (Figure 6A). The predictive value of the model and the response to immunotherapy were anticipated by using the TIDE database to produce the TIDE score, T cell dysfunction score, and immunological rejection score of various subgroups. As shown in Figure 6B-6E, the high-risk group had significantly higher TIDE score, exclusion score, and dysfunction score and lower microsatellite instability (MSI) score, indicating that patients in the high-risk group had increased immune escape potential and may have a poor response to ICI.

Figure 6 Immune response profiles associated with the prognostic gene signature. (A) Distribution of risk scores across six cancer-specific immune subtypes (C1–6). No samples were assigned to C5 in our cohort. (B-E) Comparison of immunotherapy response-related scores between high- and low-risk groups: (B) TIDE score, (C) T cell dysfunction score, (D) immune exclusion score, and (E) MSI score. ****, P<0.001. MSI, microsatellite instability; TIDE, Tumor Immune Dysfunction and Exclusion.

Tumor immune cell infiltration and evaluation of ICIs

The crucial role that the microenvironment plays in the development of cancer has been shown in earlier research. Therefore, we further investigated the association between the tumor immune microenvironment and risk stratification. Comparing the high-risk group to the low-risk cohort, immune infiltration analysis utilizing the ssGSEA algorithm and CIBERSORT tool revealed that the high-risk cohort had less immune cell infiltration and the presence of immune-related pathways (Figure 7A,7B). The high-risk group showed higher infiltration of M0 and M1 macrophages than the low-risk group (Figure 7C). Furthermore, the stromal/immune cell infiltration level in tumor tissue was predicted using the ESTIMATE algorithm. The analysis results indicated that, in the high-risk cohort, the tumor purity score increased while the stromal, immune, and ESTIMATE scores decreased (Figure 7D). By analyzing the expression patterns of MHC molecules and common immunotherapeutic targets in the treatment of LUAD in subgroups, we found that the high-risk population had significantly lower levels of MHC molecules than the low-risk cohort (Figure 7E), as well as significantly lower levels of immunotherapeutic targets (Figure 7F).

Figure 7 Immune landscape analysis of high- and low-risk groups. (A,B) Comparison of immune cell infiltration and immune-related functions between risk groups using ssGSEA. (C) Relative proportions of 22 immune cell subtypes estimated by the CIBERSORT algorithm. (D) Stromal scores, immune scores, ESTIMATE scores, and tumor purity between high-risk and low-risk groups. (E) Expression patterns of MHC molecules. (F) Expression levels of common immune checkpoint genes. *, P<0.05; **, P<0.01; ***, P<0.001. CIBERSORT, Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts; ESTIMATE, Estimation of STromal and Immune cells in MAlignant Tumor tissues; FPKM, fragments per kilobase of transcript per million mapped reads; ssGSEA, single-sample gene set enrichment analysis; MHC, major histocompatibility complex.

Analysis prognostic gene signature

The previous results showed that the prognostic model had good performance, and the immune infiltration and immune response showed significant differences in the different subgroups. We further investigated the performance of prognostic genes. In the single-cell results data, we found that HLA-DRB5 was mainly enriched in myeloid cells, epithelial cells, fibroblasts and endothelial cells. LDHA was mainly expressed in T lymphocytes, NK cells, myeloid cells, B lymphocytes, mast cells, fibroblasts, epithelial cells, and endothelial cells. TIMP1 showed enrichment in myeloid cells, B lymphocytes, mast cells, fibroblasts, epithelial cells, and endothelial cells. In contrast, ENO1 showed relatively low expression in immune cells compared with the other three genes (Figure 8A,8B). HLA-DRB5 was found to be differentially expressed in B lymphocytes, but not in tumor cells. LDHA expression was different in NK cells, myeloid cells, and epithelial cells, and was lost in distant normal lung. TIMP1 expression was different in Epithelial cells between groups. Expression was absent in distant normal (Figure 8C,8D).

Figure 8 Expression patterns of prognostic genes across different cell populations. (A,B) Distribution of the four prognostic genes (HLA-DRB5, LDHA, ENO1, and TIMP1) across all immune cell subtypes. (C,D) Comparative expression of prognostic genes between tumor-associated immune cells and normal tissue immune cells. UMAP, uniform manifold approximation and projection.

The SangerBox database was used to analyze the relationships between the four prognosis-related genes and genomic heterogeneity indicators, including TMB, mutant-allele tumor heterogeneity (MATH), MSI, neoantigen load (NEO), tumor purity, ploidy, homologous recombination deficiency (HRD), and loss of heterozygosity (LOH) (Figure 9A). HLA-DRB5 expression was significantly negatively correlated with TMB, tumor purity, MATH, ploidy, HRD, and LOH. LDHA expression showed a positive correlation with ploidy, while ENO1 expression was positively correlated with TMB, MATH, ploidy, HRD, and LOH. TIMP1 expression exhibited a significant negative correlation with tumor purity (Figure 9A). Analysis of genetic alterations revealed that all four genes had low mutation frequencies, with amplification being the most common alteration type (Figure 9B). Protein expression levels of HLA-DRB5, LDHA, ENO1, and TIMP1 in LUAD tumor tissues and adjacent normal tissues were further examined using the HPA database. Consistent with mRNA expression patterns, HLA-DRB5 protein levels were substantially lower in tumor tissues compared to normal tissues, whereas LDHA, ENO1, and TIMP1 protein levels were significantly elevated in tumor tissues (Figure 9C).

Figure 9 Genomic heterogeneity, mutation landscape, and protein expression of prognostic genes. (A) Correlation analysis between the expression of the four prognostic genes (HLA-DRB5, LDHA, ENO1, and TIMP1) and genomic heterogeneity indicators, including TMB, MATH, MSI, NEO, tumor purity, ploidy, HRD, and LOH. (B) Mutation frequencies and alteration types of the four genes in LUAD patients based on the cBioPortal database. (C) Immunohistochemistry (IHC) validation of HLA-DRB5, LDHA, ENO1, and TIMP1 protein expression in normal lung tissues and LUAD tumor tissues. All images were sourced from the HPA database. Specific sources and patient identifiers are as follows: HLA-DRB5: tumor (https://www.proteinatlas.org/ENSG00000198502-HLA-DRB5/cancer/lung+cancer#img); normal (https://www.proteinatlas.org/ENSG00000198502-HLA-DRB5/tissue/lung#img). LDHA: tumor (https://www.proteinatlas.org/ENSG00000134333-LDHA/cancer/lung+cancer#img); normal (https://www.proteinatlas.org/ENSG00000134333-LDHA/tissue/lung#img). ENO1: tumor (https://www.proteinatlas.org/ENSG00000074800-ENO1/cancer/lung+cancer#img); normal (https://www.proteinatlas.org/ENSG00000074800-ENO1/tissue/lung#img). TIMP1: tumor (https://www.proteinatlas.org/ENSG00000102265-TIMP1/cancer/lung+cancer#img); normal (https://www.proteinatlas.org/ENSG00000102265-TIMP1/tissue/lung#img). HPA, Human Protein Atlas; HRD, homologous recombination deficiency; IHC, immunohistochemistry; LOH, loss of heterozygosity; LUAD, lung adenocarcinoma; MATH, mutant-allele tumor heterogeneity; MSI, microsatellite instability; NEO, neoantigen load; TMB, tumor mutational burden.

Verification of the mRNA expression of prognostic genes in LUAD

To learn more about the expression patterns of the four candidate genes in lung cancer and normal tissues, the RT-qPCR test was employed. Ten sets of tissues, one for each lung cancer and the neighboring normal tissue in each pair, were collected. The expression levels of LDHA, ENO1, and TIMP1 were found to be highly expressed in LUAD tissues, which was found to be consistent with the TCGA database results, as shown in Figure 9C. In contrast, the expression level of HLA-DRB5 was poorly expressed in LUAD compared with normal lung samples (P<0.01), as shown in Figure 10A-10E. As with the mRNA expression of genes, the HPA043151 antibody in the HPA database showed that the HLA-DRB5 protein expression was low in LUAD tissues as compared to normal lung tissues, the tumor tissues had significantly higher levels of LDHA, ENO1, and TIMP1 protein (Figure 9C).

Figure 10 Validation of prognostic gene expression by RT-qPCR. (A-D) Relative mRNA expression levels of HLA-DRB5, LDHA, ENO1, and TIMP1 in 10 paired LUAD tumor tissues and adjacent normal lung tissues. (E) Summary comparison of the four genes’ expression levels between tumor and normal specimens. Expression data are presented as ΔCt values (ΔCt value = Cт value of prognostic gene − Ct value of reference gene), where lower ΔCt values indicate higher gene expression. Statistical significance was determined using paired t-tests. **, P<0.01; ***, P<0.001. LUAD, lung adenocarcinoma; RT-qPCR, reverse transcription quantitative polymerase chain reaction.

Discussion

In the past decade, various targeted therapies and immunotherapy have achieved rapid development and have shown good efficacy in many patients with advanced NSCLC. Numerous clinical trials have validated the effectiveness of ICIs, such as anti-CTLA-4, anti-programmed cell death protein 1 (PD-1), and anti-programmed death-ligand 1 (PD-L1), in the management of NSCLC (38). ICIs have greatly increased OS and are now the accepted standard of care for patients with NSCLC that has spread locally (39,40). Nonetheless, several challenges remain in the clinical application of immunotherapy, including variable efficacy, drug resistance, adverse effects, and the lack of reliable biomarkers (33-36). One major factor contributing to these constraints is the absence of trustworthy prognostic indicators. Single-cell sequencing provides a new perspective for cancer research and helps elucidate tumor pathogenesis, identify novel therapeutic targets and monitor therapeutic effect and prognosis (41,42). This work combines single-cell data from GEO with RNA data from TCGA to investigate the genes that are differentially expressed between immune cells and tumor cells in patients with LUAD. We discovered genes in immune cells that were differently expressed, and we created a 4-gene predictive signature that was confirmed in the GEO cohort. Marker gene enrichment in immune-related pathways was mostly observed, according to GO and KEGG enrichment analysis. In addition, among the risk subgroups, we observed significant differences in immune score, stromal score, immune cell infiltration, immune checkpoint expression pattern. These findings provide new insights for precise and personalized treatment of LUAD patients. The research comprised four prognostic genes: HLA-DRB5, LDHA, ENO1, and TIMP1. Previous studies have shown that HLA-DRB5 is a member of the HLA class II beta chain family and plays an essential role in antigen presentation and immune regulation. HLA-DRB5 has been reported to participate in the presentation of tumor antigens to CD4+ T cells (43). Lactate dehydrogenase A (LDHA), a key enzyme in glucose metabolism, is considered to be a potential target for cancer treatment. LDHA is responsible for the conversion of pyruvate to lactate in the last step of glycolysis. Considering the important biological roles of PDKs and LDHA in glucose metabolism, the authors suggested that simultaneous inhibition of both enzymes would more effectively reverse glucose metabolism from glycolysis to oxidative phosphorylation in the cytoplasm, thereby inducing cancer cell death (44). Enolase 1 (ENO1) is a glycolytic enzyme that plays an important role in various pathological activities, including cancer development. Studies have found that ENO1 promotes tumor growth through AA metabolism mediated by YAP1/PLCB1/HPGD axis in hepatocellular carcinoma (45). In the study of lung cancer, it has been found that targeting eno1 may be an effective new method to overcome drug resistance, and ENO1-targeting liposomes loaded with anticancer drugs have tremendous potential for application as a targeted drug delivery system for cancer therapy (46). TIMP1 belongs to the TIMPs family, and its members are considered to be natural inhibitors of cancer-promoting metalloproteinases. Clinical and functional studies have shown that some of them are associated with poor prognosis and contribute to cancer progression and metastasis (47,48). TIMP1 is confirmed to be a biomarker for early lung cancer detection and prognosis (49). In our prognostic model, HLA-DRB5 was identified as a protective factor, while LDHA, ENO1 and TIMP1 were considered as risk factors. In TCGA LUAD mRNA data, HLA-DRB5 was highly expressed in normal tissues, while LDHA, ENO1 and TIMP1 were highly expressed in tumors. Significantly, we further verified the mRNA expression levels of HLA-DRB5, LDHA, ENO1 and TIMP1 by examining clinical specimens. Meanwhile, the protein expression of these genes was confirmed by HPA database, which was consistent with the mRNA expression level. TMB is considered to be an important factor affecting the prognosis of cancer patients and can also be used as a biomarker for immunotherapy (50). Based on our results, HLA-DRB5 and ENO1 were significantly associated with TMB.

Furthermore, given the important role of tumor-infiltrating immune cells in the TME and their impact on prognosis, we used ssGSEA, ESTIMATE, and CIBERSORT to compare immune infiltration between risk groups. Tumors in the high-risk cohort often had decreased levels of immune cell infiltration and immunological function, suggesting that these tumors are “cold tumors” with minimal antitumor activity (51,52). Tumor cells may be able to avoid immune monitoring and advance tumor growth if there is a decrease in immune cell infiltration. The considerably poorer survival rate of individuals with high-risk LUAD may be explained by this. Immune checkpoint-related genes are highly expressed in low-risk LUAD patients, suggesting that immunotherapy may be more suitable for this population. An antigen-presenting molecule that controls the immune system’s reaction to LUAD is the HLA family (53). The majority of HLA family genes were widely expressed in the low-risk group, according to our analysis, which may indicate a more active local immune response. Lastly, we predicted the two subgroups’ immunological responses to immunotherapy using the TIDE database (54), and found that patients in the high-risk group had higher TIDE, T cell dysfunction, and immune rejection scores, suggesting that these patients were less likely to benefit from immunotherapy. In conclusion, the low-risk cohort’s patients showed increased immune responses and immune cell infiltration, indicating that they would respond better to immunotherapy.

This study integrated single-cell and bulk transcriptomic data to investigate the roles of prognosis-related genes in LUAD and may provide a new strategy for enhancing the clinical effect of immunotherapy. However, this study has many shortcomings, and the data of this study are all from public databases, which may affect the power of the study. Further in vivo experiments are needed to study the results. Second, experimental validation, including cell-based assays and drug-sensitivity analyses, is needed in future studies. it is necessary to explore the mechanism by which the DEGs in immune cells affect immune cells and the potential mechanism between this and the prognosis of LUAD, in order to deepen our knowledge of the illness and create more potent treatment plans.


Conclusions

In this study, we integrated single-cell RNA sequencing with bulk transcriptomic data to identify an immune-associated four-gene signature (HLA-DRB5, LDHA, ENO1, and TIMP1) for LUAD. This signature effectively stratified patients into low- and high-risk groups with significantly different overall survival. High-risk patients exhibited an immune-suppressive tumor microenvironment phenotype, characterized by reduced immune infiltration, attenuated antigen presentation, and higher TIDE scores, suggesting poor response to immune checkpoint inhibition.

Our findings provide candidate biomarkers for prognostic stratification and generate testable hypotheses for immunotherapy-related mechanisms. The signature requires prospective validation in larger cohorts and functional experiments to elucidate the mechanistic roles of these genes. In summary, this four-gene immune-associated signature captures prognostic heterogeneity in LUAD and is linked to distinct TME states, warranting further validation in immunotherapy-treated cohorts.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0025/rc

Data Sharing Statement: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0025/dss

Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0025/prf

Funding: This work was supported by the FuZhou Municipal Science and Technology Plan Project (No. 2022-S-029), under the category of FuZhou municipal scientific research funding for medical and health projects.

Conflicts of Interest: Both authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0025/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Medical Research and Clinical Technology Application Subcommittee of the Ethics Committee of the Fujian Provincial Tuberculosis Prevention and Treatment Hospital (No. [2021]024), and all patients provided written informed consent.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Nicholson AG, Tsao MS, Beasley MB, et al. The 2021 WHO Classification of Lung Tumors: Impact of Advances Since 2015. J Thorac Oncol 2022;17:362-387. [Crossref] [PubMed]
  2. Zhang J, IJzerman MJ, Oberoi J, et al. Time to diagnosis and treatment of lung cancer: A systematic overview of risk factors, interventions and impact on patient outcomes. Lung Cancer 2022;166:27-39. [Crossref] [PubMed]
  3. Esposito M, Ganesan S, Kang Y. Emerging strategies for treating metastasis. Nat Cancer 2021;2:258-70. [Crossref] [PubMed]
  4. Chen P, Liu Y, Wen Y, et al. Non-small cell lung cancer in China. Cancer Commun (Lond) 2022;42:937-70. [Crossref] [PubMed]
  5. Leiter A, Veluswamy RR, Wisnivesky JP. The global burden of lung cancer: current status and future trends. Nat Rev Clin Oncol 2023;20:624-39. [Crossref] [PubMed]
  6. Ettinger DS, Wood DE, Aisner DL, et al. NCCN Guidelines® Insights: Non-Small Cell Lung Cancer, Version 2.2023. J Natl Compr Canc Netw 2023;21:340-50. [Crossref] [PubMed]
  7. Gaggero S, Witt K, Carlsten M, et al. Cytokines Orchestrating the Natural Killer-Myeloid Cell Crosstalk in the Tumor Microenvironment: Implications for Natural Killer Cell-Based Cancer Immunotherapy. Front Immunol 2020;11:621225. [Crossref] [PubMed]
  8. Lei X, Lei Y, Li JK, et al. Immune cells within the tumor microenvironment: Biological functions and roles in cancer immunotherapy. Cancer Lett 2020;470:126-33. [Crossref] [PubMed]
  9. Wendler F, Stamp GW, Giamas G. Tumor-Stromal Cell Communication: Small Vesicles Signal Big Changes. Trends Cancer 2016;2:326-9. [Crossref] [PubMed]
  10. WARBURG O. On the origin of cancer cells. Science 1956;123:309-14. [Crossref] [PubMed]
  11. Stadhouders R, Filion GJ, Graf T. Transcription factors and 3D genome conformation in cell-fate decisions. Nature 2019;569:345-54. [Crossref] [PubMed]
  12. Rué P, Martinez Arias A. Cell dynamics and gene expression control in tissue homeostasis and development. Mol Syst Biol 2015;11:792. [Crossref] [PubMed]
  13. Shapouri-Moghaddam A, Mohammadian S, Vazini H, et al. Macrophage plasticity, polarization, and function in health and disease. J Cell Physiol 2018;233:6425-40. [Crossref] [PubMed]
  14. Kieffer Y, Hocine HR, Gentric G, et al. Single-Cell Analysis Reveals Fibroblast Clusters Linked to Immunotherapy Resistance in Cancer. Cancer Discov 2020;10:1330-51. [Crossref] [PubMed]
  15. Wu F, Fan J, He Y, et al. Single-cell profiling of tumor heterogeneity and the microenvironment in advanced non-small cell lung cancer. Nat Commun 2021;12:2540. [Crossref] [PubMed]
  16. Kinker GS, Greenwald AC, Tal R, et al. Pan-cancer single-cell RNA-seq identifies recurring programs of cellular heterogeneity. Nat Genet 2020;52:1208-18. [Crossref] [PubMed]
  17. Lei Y, Tang R, Xu J, et al. Applications of single-cell sequencing in cancer research: progress and perspectives. J Hematol Oncol 2021;14:91. [Crossref] [PubMed]
  18. Shakiba M, Zumbo P, Espinosa-Carrasco G, et al. TCR signal strength defines distinct mechanisms of T cell dysfunction and cancer evasion. J Exp Med 2022;219:e20201966. [Crossref] [PubMed]
  19. Wei J, Bu Z, Ji J. A commentary on: "A pan-cancer single-cell transcriptional atlas of tumor infiltrating myeloid cells" - tumor microenvironment: the Achilles heel of cancer. Med Rev (2021) 2021;1:126-8. [Crossref] [PubMed]
  20. Qian J, Olbrecht S, Boeckx B, et al. A pan-cancer blueprint of the heterogeneous tumor microenvironment revealed by single-cell profiling. Cell Res 2020;30:745-62. [Crossref] [PubMed]
  21. Liu J, Lichtenberg T, Hoadley KA, et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell 2018;173:400-416.e11. [Crossref] [PubMed]
  22. Clough E, Barrett T. The Gene Expression Omnibus Database. Methods Mol Biol 2016;1418:93-110. [Crossref] [PubMed]
  23. Hao Y, Stuart T, Kowalski MH, et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat Biotechnol 2024;42:293-304. [Crossref] [PubMed]
  24. Aran D, Looney AP, Liu L, et al. Bhattacharya M. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat Immunol 2019;20:163-172. [Crossref] [PubMed]
  25. Zhang X, Lan Y, Xu J, et al. CellMarker: a manually curated resource of cell markers in human and mouse. Nucleic Acids Res 2019;47:D721-8. [Crossref] [PubMed]
  26. Smyth GK. Limma: linear models for microarray data. In: Gentleman R, Carey V, Dudoit S, et al. editors. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. New York: Springer; 2005:397-420.
  27. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010;26:139-40. [Crossref] [PubMed]
  28. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000;25:25-9.
  29. Kanehisa M, Furumichi M, Tanabe M, et al. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res 2017;45:D353-61. [Crossref] [PubMed]
  30. Linden A, Yarnold PR. Modeling time-to-event (survival) data using classification tree analysis. J Eval Clin Pract 2017;23:1299-308. [Crossref] [PubMed]
  31. Nagashima K, Sato Y. Information criteria for Firth's penalized partial likelihood approach in Cox regression models. Stat Med 2017;36:3422-36. [Crossref] [PubMed]
  32. Tibshirani R. The lasso method for variable selection in the Cox model. Stat Med 1997;16:385-95. [Crossref] [PubMed]
  33. Huang Y, Li W, Macheret F, et al. A tutorial on calibration measurements and calibration models for clinical prediction models. J Am Med Inform Assoc 2020;27:621-33. [Crossref] [PubMed]
  34. Jiang P, Gu S, Pan D, et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat Med 2018;24:1550-8. [Crossref] [PubMed]
  35. Cheng Q, Duan W, He S, et al. Multi-Omics Data Integration Analysis of an Immune-Related Gene Signature in LGG Patients With Epilepsy. Front Cell Dev Biol 2021;9:686909. [Crossref] [PubMed]
  36. Liu S, Wang Y, Miao C, et al. High expression of CDCA7 predicts poor prognosis for clear cell renal cell carcinoma and explores its associations with immunity. Cancer Cell Int 2021;21:140. [Crossref] [PubMed]
  37. Thorsson V, Gibbs DL, Brown SD, et al. The Immune Landscape of Cancer. Immunity 2018;48:812-830.e14. [Crossref] [PubMed]
  38. Lemjabbar-Alaoui H, Hassan OU, Yang YW, et al. Lung cancer: Biology and treatment options. Biochim Biophys Acta 2015;1856:189-210. [Crossref] [PubMed]
  39. Ye Y, Wang J, Hu Q, et al. Synergistic Transcutaneous Immunotherapy Enhances Antitumor Immune Responses through Delivery of Checkpoint Inhibitors. ACS Nano 2016;10:8956-63. [Crossref] [PubMed]
  40. Mahoney KM, Freeman GJ, McDermott DF. The Next Immune-Checkpoint Inhibitors: PD-1/PD-L1 Blockade in Melanoma. Clin Ther 2015;37:764-82. [Crossref] [PubMed]
  41. Nam AS, Chaligne R, Landau DA. Integrating genetic and non-genetic determinants of cancer evolution by single-cell multi-omics. Nat Rev Genet 2021;22:3-18. [Crossref] [PubMed]
  42. Bejarano L, Jordāo MJC, Joyce JA. Therapeutic Targeting of the Tumor Microenvironment. Cancer Discov 2021;11:933-59. [Crossref] [PubMed]
  43. Brightman SE, Naradikian MS, Thota RR, et al. Tumor cells fail to present MHC-II-restricted epitopes derived from oncogenes to CD4+ T cells. JCI Insight 2023;8:e165570. [Crossref] [PubMed]
  44. Feng Y, Xiong Y, Qiao T, et al. Lactate dehydrogenase A: A key player in carcinogenesis and potential target in cancer therapy. Cancer Med 2018;7:6124-36. [Crossref] [PubMed]
  45. Sun L, Suo C, Zhang T, et al. ENO1 promotes liver carcinogenesis through YAP1-dependent arachidonic acid metabolism. Nat Chem Biol 2023;19:1492-503. [Crossref] [PubMed]
  46. Wu CH, Kuo YH, Hong RL, et al. α-Enolase-binding peptide enhances drug delivery efficiency and therapeutic efficacy against colorectal cancer. Sci Transl Med 2015;7:290ra91. [Crossref] [PubMed]
  47. Radisky ES, Raeeszadeh-Sarmazdeh M, Radisky DC. Therapeutic Potential of Matrix Metalloproteinase Inhibition in Breast Cancer. J Cell Biochem 2017;118:3531-48. [Crossref] [PubMed]
  48. Eckfeld C, Häußler D, Schoeps B, et al. Functional disparities within the TIMP family in cancer: hints from molecular divergence. Cancer Metastasis Rev 2019;38:469-81. [Crossref] [PubMed]
  49. Dantas E, Murthy A, Ahmed T, et al. TIMP1 is an early biomarker for detection and prognosis of lung cancer. Clin Transl Med 2023;13:e1391. [Crossref] [PubMed]
  50. Miao Y, Wang J, Li Q, et al. Prognostic value and immunological role of PDCD1 gene in pan-cancer. Int Immunopharmacol 2020;89:107080. [Crossref] [PubMed]
  51. Zhang J, Huang D, Saw PE, et al. Turning cold tumors hot: from molecular mechanisms to clinical applications. Trends Immunol 2022;43:523-45. [Crossref] [PubMed]
  52. Bonaventura P, Shekarian T, Alcazer V, et al. Cold Tumors: A Therapeutic Challenge for Immunotherapy. Front Immunol 2019;10:168. [Crossref] [PubMed]
  53. Li D, Bentley C, Yates J, et al. Engineering chimeric human and mouse major histocompatibility complex (MHC) class I tetramers for the production of T-cell receptor (TCR) mimic antibodies. PLoS One 2017;12:e0176642. [Crossref] [PubMed]
  54. Zhang B, Wu Q, Li B, et al. m(6)A regulator-mediated methylation modification patterns and tumor microenvironment infiltration characterization in gastric cancer. Mol Cancer 2020;19:53. [Crossref] [PubMed]
Cite this article as: Fang M, Zheng X. Analyses of single-cell and bulk RNA-sequencing datasets identify biomarker-based molecular subtypes and a prognostic signature in lung adenocarcinoma. Transl Cancer Res 2026;15(4):234. doi: 10.21037/tcr-2026-1-0025

Download Citation