Single-cell analysis of UNC13D-mediated immune and dedifferentiation heterogeneity in acute myeloid leukemia and development of a prognostic model
Highlight box
Key findings
• UNC13D is identified as a key regulator linking immune modulation and dedifferentiation in acute myeloid leukemia (AML). An eight-gene MYC-pathway-based prognostic model (PRDX4, KPNB1, DEK, ABCE1, ODC1, GLO1, MCM5, CCNA2) demonstrated robust survival prediction across cohorts.
What is known and what is new?
• While AML heterogeneity and dedifferentiation drive relapse, the molecular regulators remain elusive. UNC13D, a vesicle-fusion mediator essential for cytotoxic granule exocytosis in immune cells, is identified here as a novel marker linking immune signaling and differentiation plasticity through integrative single-cell and bulk analyses.
What is the implication, and what should change now?
• UNC13D may serve as a biomarker of leukemic plasticity and immune evasion. The derived prognostic model provides a potential tool for patient stratification and individualized therapy optimization in AML.
Introduction
Acute myeloid leukemia (AML) is a biologically complex and heterogeneous hematologic malignancy characterized by clonal expansion of hematopoietic stem/progenitor cells with impaired differentiation and uncontrolled proliferation. Although AML accounts for only 1% of all cancer cases, it represents the most common form of leukemia in adults (1). The median age at diagnosis is approximately 68 years, and the disease is associated with poor outcomes: the estimated 5-year overall survival (OS) is 32%, with younger patients faring better (up to 50%) compared to those over 60 years old (less than 10%) (2). The challenges in AML diagnosis, treatment, and prognosis are largely driven by its extensive biological heterogeneity. Intratumoral heterogeneity, defined by the coexistence of leukemic subpopulations with distinct molecular and functional properties, contributes significantly to drug resistance and relapse. Previous studies relying on bulk transcriptomic data have limited resolution and often obscure the specific characteristics of diverse leukemic subpopulations, thereby limiting mechanistic insights (3).
The UNC13D gene is predominantly expressed in hematopoietic cells and encodes a key regulatory protein, Munc13-4, which plays an essential role in vesicular trafficking and cytotoxic granule exocytosis in immune cells. Munc13-4 mediates the calcium-dependent fusion of cytolytic granules with the plasma membrane. Functional deficiency of this protein impairs the release of cytotoxic mediators such as perforin and granzymes, thereby disrupting the cytolytic activity of natural killer (NK) cells and cytotoxic T lymphocytes (CTLs). This impairment underlies the pathogenesis of familial hemophagocytic lymphohistiocytosis (FHL) type 3 (FHL3). Defective cytotoxicity in immune effector cells compromises immune surveillance and increases susceptibility to malignancies (4-6). While mutations in UNC13D have been reported in patients with FHL who later progressed to acute leukemia (7), the role of UNC13D in AML remains poorly understood, particularly with regard to its influence on immune cell differentiation and the heterogeneity of the tumor microenvironment. Addressing this gap could provide novel perspectives for understanding AML biology and improving prognostic assessment.
In AML, dedifferentiation refers to the process whereby leukemic cells that have already undergone differentiation revert to an immature state with stem cell-like properties, thereby regaining proliferative and tumorigenic capacity. This phenomenon highlights the high degree of differentiation plasticity in AML cells. The leukemia stem cell (LSC) model proposed by Bonnet et al. captures the intrinsic heterogeneity of AML: LSCs are positioned at the apex or the origin of the hierarchical differentiation trajectory, and are associated with drug resistance, relapse, and poor prognosis. They recapitulate the normal hematopoietic stem cell (HSC)-to-myeloid differentiation axis, exhibiting diverse maturation stages including HSCs, multipotent progenitors (MPPs), and granulocyte-macrophage progenitors (GMPs). Notably, AML cells that fail to achieve full terminal differentiation following induction therapy may retain the potential to dedifferentiate and reinitiate leukemogenesis upon treatment cessation (8-10). In addition, suppression or downregulation of differentiation-associated genes, such as the key myeloid transcription factor PU.1, may also induce the reversion of differentiated AML cells to a less mature, leukemogenic state (11,12). Consequently, monotherapy strategies aimed solely at inducing differentiation are often insufficient to eradicate AML, necessitating combinatorial approaches that target multiple differentiation stages to block dedifferentiation pathways and achieve durable remission.
In summary, the dedifferentiation characteristics of leukemic cells are closely linked to their proliferative potential, drug resistance, and malignant phenotype, with substantial interpatient heterogeneity. The coexistence of dedifferentiation and immune evasion traits may reflect the dynamic adaptation and evolution of tumor cells within the immune microenvironment. However, the regulatory mechanisms governing this process and its clinical significance remain poorly understood.
This study analyzed single-cell transcriptomic data from AML patients to characterize the differentiation landscape and intratumoral heterogeneity, identify key genes involved in dedifferentiation and immune regulation, and focus on the potential role of UNC13D in modulating immune escape and dedifferentiated states. By integrating bulk RNA sequencing data from The Cancer Genome Atlas (TCGA) and an external validation cohort, we further constructed and validated a prognostic risk model incorporating both immune and dedifferentiation signatures, and explored its biological underpinnings through cell-cell communication analysis and pathway enrichment (Figure 1). These findings may provide novel theoretical insights and therapeutic targets for the pathogenesis, early diagnosis, and individualized treatment of AML. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2307/rc).
Methods
Data acquisition and preprocessing
The human AML single-cell RNA sequencing (scRNA-seq) dataset GSE178910, generated using the 10× Genomics platform, was obtained from Gene Expression Omnibus (GEO) in matrix format. Expression matrices were imported into Seurat (v5.3.0) via Read10X() and initial Seurat objects were created. Features expressed in fewer than 3 cells or with fewer than 200 detected genes were excluded. Bulk RNA-seq data for AML were sourced from TCGA-LAML and the Oregon Health & Science University (OHSU) cohort (13). The training cohort comprised RNA-seq profiles from the TCGA-LAML dataset {Illumina HiSeqV2 platform, log2[RNA-Seq by Expectation-Maximization (RSEM) + 1] normalized counts} and corresponding survival data from the TCGA Pan-Cancer survival files. Expression and clinical information were merged by matched sample identifiers using an inner join, retaining only samples with both gene expression and OS data. Cases with missing OS time, missing survival status, or nonpositive OS time (≤0) were excluded. The expression matrix provided by TCGA was already normalized and contained no missing values after quality control; therefore, no single or multiple imputation was performed. The independent validation cohort was obtained from the OHSU Beat AML project, consisting of reads per kilobase of transcript per million mapped reads (RPKM)-normalized RNA-seq data and corresponding clinical annotations generated on the Illumina HiSeq 2500 platform. Expression and clinical data were processed following the same pipeline as in the training cohort. Survival outcome was defined as OS, with status coded as Alive =0 and Dead =1. Samples with missing or invalid OS data were excluded, and the expression matrix contained no missing values after preprocessing. Complete-case analysis was applied consistently across both cohorts for all subsequent modeling procedures. Both the TCGA-LAML and OHSU AML datasets were accessed on April 25, 2025, from their respective public repositories. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Quality control and cell filtering
To eliminate low-quality cells and technical noise, metrics including number of detected genes per cell (nFeature_RNA), total unique molecular identifier (UMI) counts (nCount_RNA), percentage of mitochondrial genes (percent.mt, genes prefixed with “MT-”), ribosomal gene percentage (percent.ribo, genes prefixed with “RPS” or “RPL”), and hemoglobin gene proportion (percent.hb, including HBB, HBA1, HBA2, HBG1, HBG2) were computed. FeatureScatter() and violin plots (VlnPlot()) guided threshold selection. Filtering criteria applied: nFeature_RNA: 200–5,000; nCount_RNA: 500–40000; percent.mt <7%; percent.ribo <50%; percent.hb <5%. Retained cells were normalized [NormalizeData()) and standardized (ScaleData()], resulting in a high-quality single-cell dataset.
Doublet detection and removal
Potential doublets were identified using scDblFinder (v1.20.2). Seurat objects were converted to SingleCellExperiment format, processed via scDblFinder(), and resulting singlet/doublet assignments along with scores reincorporated into the Seurat metadata. Only cells classified as “singlet” were retained for downstream analyses to ensure data integrity and analysis accuracy.
Multi-sample integration and clustering
Individual Seurat objects were merged using merge(), followed by normalization and identification of highly variable features. Data were scaled [ScaleData()], and principal component analysis (PCA) was performed for dimensionality reduction. Batch effects across samples were corrected using Harmony (RunHarmony) based on the sample origin metadata. The top 30 Harmony-adjusted principal components were used to construct a nearest-neighbor graph (FindNeighbors), followed by clustering (FindClusters) at a resolution of 0.5. Cell-cell communication analysis was performed using CellChat, and pseudotime trajectory inference was conducted with Monocle3.
Construction of a MYC pathway-based prognostic model and feature selection
To construct a MYC pathway-related prognostic model, genes from the HALLMARK_MYC_TARGETS_V1 gene set—the top-ranked pathway most strongly associated with high UNC13D expression—were selected. Univariate Cox proportional hazards regression was performed to identify candidate genes significantly correlated with OS. Based on the expression profiles of these survival-associated genes, consensus clustering analysis was conducted to stratify patients into subtypes with distinct prognostic outcomes, thereby identifying the dominant MYC-active subgroup. Differential gene expression analysis was subsequently performed between identified subtypes, yielding a subset of differentially expressed genes (DEGs). These DEGs were subjected to least absolute shrinkage and selection operator (LASSO)-Cox regression modeling using the glmnet package. The optimal penalization parameter (λ) was determined via 10-fold cross-validation to ensure model stability and generalizability. The final prognostic model incorporated eight feature genes: PRDX4, KPNB1, DEK, ABCE1, ODC1, GLO1, MCM5, and CCNA2. Each gene’s LASSO-derived coefficient was used to compute individual risk scores using the following formula:
Where βi represents the LASSO regression coefficient and Expri denotes the expression level of gene i. Patients were stratified into high- and low-risk groups based on the median risk score.
Model visualization and clustering stability assessment
PCA was applied to both training and validation cohorts to visualize dimensionality reduction and assess the separation of high- and low-risk groups in expression space. Risk heatmaps, risk score distribution plots, and survival status scatter plots were generated to provide intuitive visualizations of the differential expression patterns of model genes and associated clinical outcomes across risk groups.
Survival analysis and validation
Kaplan-Meier survival curves were constructed for both the training and validation cohorts. Statistical significance between survival curves of high- and low-risk groups was assessed using the log-rank test. The model demonstrated a significant ability to stratify patients by OS in both cohorts, supporting its robustness and generalizability.
Evaluation of predictive performance
Time-dependent receiver operating characteristic (ROC) curves were generated at 1-, 2-, and 3-year time points using the timeROC package. The corresponding area under the curve (AUC) values were calculated to quantify predictive accuracy. In parallel, Harrell’s concordance index (C-index) was computed to assess overall predictive consistency. AUC and C-index values were reported and compared between training and validation cohorts.
Dynamic AUC curve analysis
A sliding time window (ranging from 100 to 1,095 days, step size =30 days) was used to calculate dynamic AUC curves, providing a longitudinal view of the model’s predictive stability across the entire follow-up period. Both training and validation cohorts exhibited persistently high AUC values over time, indicating strong and sustained predictive performance.
Decision curve analysis (DCA)
DCA was carried out by means of the rmda package to evaluate the net clinical benefit of the prognostic model across a range of risk thresholds. DCA curves were plotted to compare the net benefit of the model against the “treat-all” and “treat-none” strategies. The results demonstrated the model’s clinical utility in decision-making scenarios across varying levels of predicted risk.
Calibration curve analysis
To assess the calibration performance of the prognostic model at a specific time point, a calibration analysis was performed at the 1-year survival time using the calibrate function from the rms package. Kaplan-Meier estimates were used to calculate observed survival probabilities (cmethod = “KM”), and internal validation was conducted with 1,000 bootstrap resampling iterations (B =1,000). Smoothing was applied by setting the group number to 50 (m =50). The resulting calibration plot illustrated the agreement between predicted probabilities and observed survival outcomes, supporting the model’s reliability in short-term survival prediction.
Statistical analysis
All statistical analyses were performed using R software (version 4.4.3). For single-cell analyses, statistical comparisons of gene expression and pathway activity were conducted using methods implemented in the respective R packages as described above. Correlations were assessed using Pearson correlation analysis. Survival analyses were performed using Kaplan-Meier methods with log-rank tests. Univariate and multivariate Cox proportional hazards regression models were applied to identify survival-associated genes. The LASSO Cox regression was used for feature selection and model construction. Time-dependent ROC curves were used to evaluate predictive performance, and the AUC and C-index were calculated. DCA was conducted to assess clinical net benefit. Unless otherwise specified, all statistical tests were two-sided, and a P value <0.05 was considered statistically significant.
Results
Identification of AML tumor subpopulations exhibiting heterogeneous differentiation features based on scRNA-seq data
This study utilized the publicly available AML scRNA-seq dataset GSE178910 from the GEO database, with each sample sequenced using the 10× Genomics platform. Details of the cell quality control workflow are provided in Figure 2. To further delineate the heterogeneity in dedifferentiation features in AML, 11 transcriptionally distinct cell clusters (Clusters 0–10) were pinpointed based on transcriptomic profiles (Figure 3).
To preliminarily explore the developmental trajectories of different cell clusters, a panel of marker genes representing key stages of hematopoietic differentiation was compiled: HSCs: CD34, AVP, PROM1, MEIS1; MPPs: KIT, CD38, SOX4; common myeloid progenitors (CMPs)/GMPs: CEBPA, MPO, ITGAM; terminally differentiated cells (Diffs): CD14, CD68, S100A9. Using FeaturePlot on uniform manifold approximation and projection (UMAP) coordinates (Figure 4), the expression of these markers across clusters was visualized. Notably, CD34 and PROM1 were enriched in specific clusters, indicating retention of stem-like properties. Genes such as ELANE, AZU1, and CEBPA showed variable expression across clusters, suggesting differences in developmental hierarchy or immune function. High expression of CD14 and S100A9 in certain populations indicated a trend toward terminal differentiation. These distribution characteristics indicate that there are multiple cell populations with different developmental stages and lineage tendencies in the AML samples, providing important reference basis for subsequent annotation.
To further assess the biological significance of each cluster, the module scores for five differentiation stages [HSC, MPP, CMP, GMP, and terminal differentiation (Diff)] were calculated using the AddModuleScore function. Each cell was assigned a developmental label based on the highest score among the five. For example, in Cluster 0, the Diff score was the highest (n=1,359), compared to GMP (n=55) and CMP (n=38), indicating that this population primarily comprised terminally differentiated cells (Figure 5A). Expression plots of stage-specific markers within Cluster 0 (Figure 5B-5F) confirmed the high expression of CD14, CD68, and S100A9, further validating its differentiated identity. The same approach was applied to the remaining 10 clusters [see online figure 2 (available at https://cdn.amegroups.cn/static/public/tcr-2025-aw-2307-2.pdf)], allowing for functional annotation of each subpopulation, which will be further validated through trajectory inference and enrichment analyses. These results collectively demonstrate that AML comprises multiple subpopulations at different developmental stages, with distinct dedifferentiation characteristics, underscoring its intratumoral heterogeneity.
Expression landscape and functional relevance of UNC13D across AML developmental subtypes
Following the annotation of cellular subpopulations, we categorized the 11 clusters into five major functional AML subtypes: HSC-like, HSC/MPP-like, CMP-like, GMP-like, and differentiated AML. Their spatial distribution in the UMAP embedding is presented in Figure 6A. UNC13D has been implicated in tumor occurrence and progression (14,15). However, its role in governing dedifferentiation processes and the intratumoral heterogeneity of AML has not yet been elucidated. Therefore, the association between UNC13D expression and dedifferentiation phenotypes in AML was systematically investigated. FeaturePlot analysis demonstrated that UNC13D expression was enriched in cell populations proximal to the intermediate differentiation stage (CMP-like) (Figure 6B). Violin plots further indicated elevated expression levels of UNC13D in Clusters 3, 5, and 8 (Figure 6C), with Clusters 3 and 5 previously annotated as CMP-like AML. Both DotPlot and violin plots showed that UNC13D expression was highest in the CMP-like AML subtype and markedly lower in early hematopoietic stem-like populations (Figure 6D,6E). Marker gene DotPlots (Figure 6F) and module score heatmaps (Figure 6G) further confirmed that these expression differences closely aligned with hematopoietic differentiation stages. To quantify the relationship between UNC13D expression and differentiation status, we calculated its mean expression level per cluster and performed Pearson correlation analyses against module scores corresponding to the five functional categories (HSC, MPP, CMP, GMP, and Diff) (Figure 6H). The results revealed the strongest positive correlation between UNC13D and CMP module scores (R=0.449, P=0.16), followed by MPP (R=0.427) and HSC (R=0.407). No significant correlation was observed with Diff scores (R=−0.089), suggesting that UNC13D may function primarily in intermediate stages of hematopoietic differentiation, particularly within CMP-like cells, rather than being restricted to the terminally differentiated compartment.
Cell-cell communication analysis reveals distinct interaction patterns of CMP-like AML subpopulation
To explore the intercellular communication mechanisms among AML subtypes, we constructed a CellChat model based on the five annotated functional categories: HSC-like, HSC/MPP-like, CMP-like, GMP-like, and differentiated AML. The global interaction network revealed extensive communication among these cell types (Figure 7A), with differentiated AML and CMP-like AML exhibiting the highest number of interactions. Analysis of interaction strength further indicated variable pathway activity levels despite widespread signaling across cell types (Figure 7B). Then the incoming (receptor) and outgoing (ligand) signal flux for each subtype was quantified and visualized using bar plots and heatmaps (Figure 7C,7D). HSC-like and HSC/MPP-like AML subtypes primarily acted as signal senders, while differentiated AML cells were more inclined toward signal reception, suggesting a reliance on extrinsic cues for microenvironmental modulation. Bubble plots illustrated the source-target relationships of all major signaling pathways (Figure 7E), highlighting active transmission of major histocompatibility complex class II (MHC-II), CD99, MIF, LGALS9 (Galectin-9), and PECAM1 signals across AML differentiation stages. While Figure 7C,7D summarize pathway-level aggregated incoming and outgoing signaling roles across AML subtypes, Figure 7E depicts significant ligand-receptor interactions at the communication pair level; therefore, weaker or non-significant interactions may be less apparent in the bubble plot representation. Immune-related pathways such as IL-16, MIF-CD74, and APP-CD74 were also predominantly active between terminally differentiated and GMP-like AML cells, implying a potential immunosuppressive regulatory axis in AML. Finally, cluster-specific outgoing and incoming communication maps (Figure 7F-7O) were generated, which further delineated the distinct signaling roles of AML subtypes within the communication network. For instance, HSC-like AML subtypes mainly functioned as unidirectional signal senders, whereas CMP-like and differentiated AML subtypes served as primary receivers, integrating extracellular signals into their cellular states.
CMP-like AML subpopulations exhibit pronounced biological heterogeneity
To evaluate signaling activity at the single-cell level, AUCell (AUC-based cell-level scoring) scores were computed for each cell based on the Hallmark gene sets from Molecular Signatures Database (MSigDB). The results revealed marked heterogeneity in pathway activation across AML subpopulations at different differentiation stages (Figure 8A). For instance, HSC-like AML cells were enriched in stemness-related pathways such as WNT/β-catenin signaling and Notch signaling, whereas differentiated AML populations exhibited pronounced activation of stress- and metabolism-related pathways, including tumor necrosis factor alpha/nuclear factor kappa B (TNFA/NF-κB), apoptosis, and glycolysis, suggesting a shift toward stress responses and metabolic reprogramming. Cell cycle analysis demonstrated significant differences in the distribution of cells across G1 (gap 1), S (synthesis), and G2M (gap 2/mitosis) phases among AML subgroups (Figure 8B). CMP-like AML cells displayed heightened proliferative activity, with a significantly higher proportion of cells in S and G2M phases compared to other subpopulations, indicating elevated proliferative potential. To exclude potential population size bias, the proportional distribution of cell cycle phases within each AML subtype was further examined, confirming the enrichment of S and G2M phase cells in the CMP-like AML population (Figure 8C). Trajectory inference indicated a continuous spectrum of AML cell differentiation, with UNC13D expression peaking in intermediate states. Pseudotime analysis using Monocle3 (Figure 8D,8E), anchored at the CMP-like AML population, delineated a dynamic transition from CMP/GMP-like to differentiated AML states, supporting the existence of multiple developmental stages at the single-cell level. Correlation analysis integrating AUCell pathway scores and UNC13D expression revealed positive associations between UNC13D and several intermediate metabolism and stress-response pathways, including MYC targets, MTORC1 signaling, reactive oxygen species (ROS) pathway, and the unfolded protein response (UPR). A heatmap was generated to illustrate the expression patterns of UNC13D and representative genes from these pathways across pseudotime stages (Figure 8F). UNC13D expression peaked during the intermediate (Mid) stage and co-expressed with key regulatory genes such as ATF4, RPS6, EIF4EBP1, SOD2, and XBP1, suggesting a potential role for UNC13D in bridging stress responses and transcriptional regulation during AML development.
UNC13D expression is strongly associated with AML dedifferentiation features
The pseudotemporal trajectory revealed co-upregulation of UNC13D and multiple pathway-related genes in intermediate AML states. To further explore the functional implications of UNC13D in AML progression, joint expression trends were analyzed along pseudotime for UNC13D and representative genes from 15 highly correlated Hallmark pathways (e.g., MYC targets, oxidative phosphorylation, UPR, and ROS pathways). A total of 46 key genes were selected for analysis, followed by Z-score normalization and three-layer visualization: dot plots of differentiation states, violin plots of expression distribution, and scatter plots with locally estimated scatterplot smoothing (LOESS) regression to capture pseudotemporal trends. As illustrated in Figure 9, UNC13D showed peak expression in State 2, alongside stress-response genes such as XBP1, HSPA5, and ATF4. Concurrent upregulation of proliferation-associated genes (e.g., MYC, MKI67, PCNA) was also observed in this intermediate state. Together, these observations indicate that UNC13D expression is closely coordinated with stress-response and proliferative programs during the intermediate phase of AML differentiation. Rather than implying a direct regulatory role, the co-expression patterns suggest that UNC13D marks a cellular state characterized by enhanced metabolic activity and stress adaptation.
Prognostic modeling based on MYC pathway reveals survival-associated driver genes
Several MYC pathway-related genes are significantly associated with OS in LAML patients, and clustering based on their expression effectively stratifies high- and low-risk populations. Integrating results from single-cell analysis, UNC13D was found to be most highly expressed in the CMP-like AML subpopulation, which also exhibited the highest MYC pathway activity. Based on this observation, 200 genes from the HALLMARK_MYC_TARGETS_V1 gene set were selected for univariate Cox regression analysis using TCGA-LAML bulk transcriptomic data. A total of 22 genes were identified as significantly associated with OS (P<0.05). Consensus clustering based on the expression profiles of these genes (K =3) classified patient samples into three distinct subgroups. Kaplan-Meier survival analysis revealed significant differences in OS among the clusters (P<0.001; Figure 10A). Differential expression analysis between Cluster 2 and Cluster 1 identified 13 key genes [|log2fold change (FC)| >0.5, adjusted P<0.05] (Figure 10B), which were subsequently used for risk model construction. Notably, Cluster 2 exhibited the highest MYC pathway activity (Figure 10C). Pairwise comparisons revealed a statistically significant difference between Clusters 1 and 2 (P=0.03), which supported their selection for downstream differential expression analysis. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of these genes indicated significant involvement in essential biological processes such as DNA replication, double-strand break repair, and cell cycle regulation, supporting their potential roles as drivers of AML proliferation and progression (Figure 10D,10E). A multigene prognostic model was then constructed using LASSO-Cox regression based on the above DEGs (Figure 10F-10L). Ten-fold cross-validation was used to determine the optimal λ value, ultimately identifying eight prognostically relevant genes. Among them, PRDX4, KPNB1, DEK, GLO1, and MCM5 were positively correlated with risk (risk factors), while ABCE1, ODC1, and CCNA2 were negatively correlated (protective factors) (see Table S1). Hazard ratios (HRs) and 95% confidence intervals (CIs) for each gene were visualized using a forest plot (Figure 10M). The minimal cross-validation error and coefficient trajectories further supported the robustness of the model (Figure 10N,10O).
Validation of the prognostic risk scoring model in training and validation cohorts
To further assess the clinical utility of the 8-gene risk scoring model, this study performed multidimensional validation analyses in both the training cohort (TCGA-LAML) and an independent validation cohort (OHSU). First, patients were stratified into high- and low-risk groups based on the median risk score. A heatmap of gene expression revealed that the high-risk group exhibited generally elevated expression levels of most model genes (Figure 11A). The corresponding risk score distribution plot showed a marked increase in risk scores among high-risk patients (Figure 11B), and the survival status distribution plot indicated that patients in the high-risk group tended to have shorter survival durations and higher mortality (Figure 11C). In the validation cohort, the same risk stratification showed a consistent direction of association, although the separation between risk groups was less pronounced compared with the training cohort (Figure 11D-11F). Kaplan-Meier survival analysis further confirmed the prognostic power of the risk score: in both the training and validation cohorts, patients in the high-risk group had significantly shorter OS compared to those in the low-risk group (training set: P<0.001; validation set: P=0.02; Figure 11G,11H), with a reduced effect size observed in the validation cohort. PCA revealed clear transcriptional separation between the high- and low-risk groups (Figure 11I,11J). To evaluate predictive performance, time-dependent ROC curve analysis was conducted. In the training set, the AUCs at 1, 2, and 3 years were 0.697, 0.742, and 0.743, respectively (Figure 11K). In contrast, the validation cohort showed lower AUC values (0.560, 0.578, and 0.640 at 1, 2, and 3 years, respectively; Figure 11L), indicating a more modest predictive performance while preserving the same risk direction. ROC curves at the 1-year time point allow for a clear comparison of AUC values between the training and validation sets (Figure 11M). The dynamic AUC curves further illustrate the temporal performance of the model in both datasets (Figure 11N). DCA indicated that across a wide range of threshold probabilities, the model provided greater net clinical benefit compared to either the “treat all” or “treat none” strategies (Figure 11O). Finally, calibration curves showed good agreement between predicted and observed survival probabilities, particularly for 1-year survival prediction (Figure 11P).
Functional enrichment analysis of high- and low-risk groups based on the LASSO model
To further explore the functional differences between high- and low-risk AML patients, this study conducted gene set enrichment analysis (GSEA) using the gene expression profiles from the training cohort. GO enrichment revealed that the high-risk group was significantly enriched for nucleotide metabolism processes (e.g., “ATP biosynthetic process” and “oxidative phosphorylation”) and antigen presentation pathways (e.g., “MHC class II complex assembly”). In contrast, the low-risk group was enriched for signal perception and regulatory pathways, such as “taste perception” and “kinetochore organization”. KEGG pathway enrichment analysis demonstrated that the high-risk group was associated with classic oncogenic or inflammation-related pathways, including “proteasome”, “ribosome”, and “viral infection”. Hallmark pathway analysis further showed that the high-risk group exhibited activation of pathways such as “MYC targets”, “TNF-α signaling”, and “oxidative phosphorylation”, whereas the low-risk group was enriched for pathways involved in cell cycle homeostasis, including “mitotic spindle” and “G2M checkpoint”. These findings suggest that the proposed risk scoring system not only stratifies patients prognostically but also reflects underlying biological differences relevant to AML progression and malignancy (Figure 12).
Discussion
Normal hematopoiesis is hierarchically organized, initiated at the apex by HSCs, which reside in the bone marrow and possess self-renewal capacity. These HSCs give rise to MPPs, which subsequently differentiate into lineage-committed hematopoietic and immune cells. For instance, MPPs differentiate into common myeloid progenitors (CMPs), which further give rise to GMPs. GMPs eventually produce fully differentiated and functional cells such as neutrophils and macrophages (16). AML is considered a malignant disruption of normal hematopoiesis at the stem cell level. AML clones demonstrate a hierarchical structure that parallels normal hematopoietic differentiation. Through the sequential acquisition of genetic mutations, HSCs or MPPs may give rise to pre-leukemic stem cells (pre-LSCs), wherein HSCs acquire enhanced proliferative potential and MPPs gain self-renewal capacity. Pre-LSCs further evolve into LSCs—also referred to as leukemic initiating cells (LICs) or leukemic propagating cells (LPCs)—characterized by CD34⁺CD38⁻ phenotype, unlimited self-renewal, leukemia-maintaining capacity, and the ability to recapitulate the disease upon transplantation into immunodeficient hosts (16,17). AML progression involves the massive proliferation and accumulation of undifferentiated leukemic blasts. Notably, even terminally differentiated cells derived from myeloid leukemic blasts may undergo dedifferentiation into an immature leukemic state and reinitiate disease propagation (8). This was demonstrated in a reversible PU.1-knockdown murine AML model, in which reactivation of PU.1—a key transcription factor promoting myeloid differentiation—induced leukemic differentiation and remission. Conversely, re-suppression of PU.1 restored clonogenic potential and tumorigenicity. Similarly, all-trans retinoic acid (ATRA) was shown to induce differentiation in human acute promyelocytic leukemia (APL) cell lines (NB4 and HT93); upon ATRA withdrawal, the differentiated cells regained clonogenicity. Primary human APL samples also exhibited induced differentiation after ATRA treatment, which regressed after drug discontinuation. Transcriptomic analysis revealed that PU.1 activation upregulated thousands of differentiation-related genes (e.g., Itgam/CD11b, Jak2, Stat3), with approximately 60% of them reverting to a pre-differentiated state within 4 days of PU.1 suppression. Chromatin accessibility profiling further confirmed the bidirectional regulatory capacity of this transcription factor (11). These findings suggest that AML cells exhibit phenotypic and functional plasticity, with reversible differentiation states. Therefore, characterizing AML subpopulations at distinct differentiation stages may provide deeper insights into leukemogenesis and support personalized therapeutic strategies.
Utilizing the publicly available scRNA-seq dataset GSE178910, single-cell transcriptomic profiling was conducted across four AML samples. A total of 11 distinct cellular subpopulations were identified. Given the phenotypic similarity between AML clonal architecture and normal hematopoietic differentiation hierarchies, functional annotation was performed by scoring the expression of canonical marker genes corresponding to various hematopoietic differentiation stages. This led to the identification of five functional AML subgroups: HSC-like, HSC/MPP-like, CMP-like, GMP-like, and differentiated clusters. Focusing on the gene of interest UNC13D, its expression was found to be highest and most specific in the CMP-like subpopulation. Cell-cell communication analysis revealed highly active intercellular signaling through key ligand-receptor axes, particularly CD99, PPIA-BSG, and MIF-(CD74 + CXCR4)/(CD74 + CD44) pathways, among AML subgroups at different differentiation stages. Previous studies (18) have reported significant overexpression of CD99 in AML cells and LSCs compared to normal CD34⁺CD38⁻ HSCs, correlating with leukemic stemness. Loss of CD99 expression was associated with a phenotype more reminiscent of normal HSCs and better functional integrity. Moreover, CD99 expression has been proposed as a potential predictor of FLT3-ITD mutations and therapeutic target in AML. Its isoform, CD99-L, has been implicated in promoting AML cell proliferation, aggregation, elevated ROS, and apoptosis (19). Overexpression of PPIA and BSG has been associated with poor prognosis in various cancers, including increased metastatic potential, reduced survival, and advanced tumor stages (20,21). These molecules have also been linked to cancer stem cell-like properties. MIF (macrophage migration inhibitory factor), a pro-inflammatory cytokine, plays a crucial role in modulating immune responses and has been found to be upregulated in multiple malignancies (22). Elevated MIF expression promotes tumor progression, angiogenesis, metastasis, and contributes to the formation of an immunosuppressive tumor microenvironment (23). The observed activation of these signaling axes in AML subpopulations with intermediate or relatively low differentiation levels in the present study may be functionally relevant to their stemness and pathological roles. Among the subpopulations, CMP-like cells exhibited the highest frequency of intercellular communication, predominantly as signal receivers (incoming interactions). This characteristic may reflect their central positioning within the leukemic differentiation hierarchy.
UNC13D is a key component of the vesicle priming and regulated secretion machinery and has been extensively studied in immune cells, where it facilitates cytotoxic granule exocytosis and extracellular vesicle (EV) release through interactions with Rab GTPases and SNARE complexes (4). Recent studies have summarized UNC13D interacting partners and highlighted its central role in vesicular trafficking and secretory pathways rather than direct transcriptional regulation. In this context, elevated UNC13D expression in intermediate AML states may reflect increased secretory demand and intercellular communication capacity during leukemic differentiation transitions (4,14,24-26).
UNC13D primarily orchestrates the secretion of lysosome-related vesicles, promoting fusion between cytotoxic granules and the plasma membrane, thereby facilitating the release of perforin and granzymes by CTLs and NK cells, which is integral to immune response (27). AUCell scores were computed for each cellular subpopulation using MSigDB Hallmark gene sets, revealing substantial pathway heterogeneity across AML differentiation stages. Correlation analysis between AUCell pathway activity and UNC13D expression produced a correlation matrix, from which 15 Hallmark pathways exhibiting the strongest positive correlations with UNC13D were selected. These included oxidative phosphorylation, endoplasmic reticulum stress response, ROS pathway, and glycolysis. Notably, pathways such as interferon-α (antiviral response), NF-κB activation, and ROS-related signaling showed direct or indirect functional linkages to UNC13D. The most strongly correlated pathway was HALLMARK_MYC_TARGETS_V1, a gene set regulated by the proto-oncogene MYC transcription factor and implicated in cell cycle control, protein translation, ribosome biogenesis, DNA replication/repair, and metabolic processes—features characteristic of rapidly proliferative and invasive tumor cells (28,29). This aligns with pseudotime analysis, which demonstrated that S-phase activity—indicative of DNA replication—is predominantly contributed by CMP-like subpopulations. Correspondingly, UNC13D expression peaked at intermediate pseudotime stages, suggesting a potential mechanistic association with AML differentiation and development.
Although this study systematically integrated single-cell and large-scale transcriptome data, and constructed an AML prognostic model based on UNC13D-related immune and dedifferentiation features, there are still several limitations that cannot be ignored. First, the construction and validation of all models mainly rely on retrospective cohorts in public databases (such as TCGA and OHSU), which have relatively limited sample sizes. Furthermore, there is no unified sequencing platform and clinical standards, which may have batch effects or potential biases. Second, although annotation and cell communication inference at the single-cell level reveal the relationship between AML cell lineages and their interactions with the immune microenvironment, no spatial transcriptome information is used, making it difficult to accurately depict the spatial localization and functional status of cells in the tissue microenvironment. Third, although UNC13D has been identified as a potential key gene, its function in regulating immune escape or cell dedifferentiation remains to be validated in experiments, and the relevant mechanism still needs to be confirmed by in vitro functional experiments or animal models. In addition, although the constructed risk score model shows good predictive ability in external cohorts, its generalization ability still needs to be further evaluated in larger, multi-center prospective clinical cohorts.
Conclusions
In summary, focusing on UNC13D and integrating scRNA-seq with bulk RNA-seq data revealed a robust association between UNC13D expression and AML dedifferentiation and immune evasion phenotypes. A MYC-pathway-based multigene prognostic model incorporating dedifferentiation and immune features exhibited strong risk stratification and prognostic prediction in the TCGA training cohort and was independently validated in the OHSU cohort. These findings deepen the understanding of UNC13D’s functions within the leukemic immune microenvironment and provide theoretical foundations for AML molecular classification, personalized therapy, and clinical management. Future investigations integrating spatial transcriptomics, single-cell multi-omics, and functional validation studies are warranted to mechanistically delineate UNC13D’s role in tumor cell dedifferentiation and assess its translational potential.
Acknowledgments
We acknowledge the use of BioRender for creating Figure 1 (Created in BioRender by Z.W., 2026. https://BioRender.com/ry5maip).
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2307/rc
Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2307/prf
Funding: This study was supported by
Conflicts of Interest: Both authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2307/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Wachter F, Pikman Y. Pathophysiology of Acute Myeloid Leukemia. Acta Haematol 2024;147:229-46. [Crossref] [PubMed]
- Shimony S, Stahl M, Stone RM. Acute Myeloid Leukemia: 2025 Update on Diagnosis, Risk-Stratification, and Management. Am J Hematol 2025;100:860-91. [Crossref] [PubMed]
- Zhou J, Chng WJ. Unveiling novel insights in acute myeloid leukemia through single-cell RNA sequencing. Front Oncol 2024;14:1365330. [Crossref] [PubMed]
- Duong VT, Lee D, Kim YH, et al. Functional role of UNC13D in immune diseases and its therapeutic applications. Front Immunol 2024;15:1460882. [Crossref] [PubMed]
- Cichocki F, Schlums H, Li H, et al. Transcriptional regulation of Munc13-4 expression in cytotoxic lymphocytes is disrupted by an intronic mutation associated with a primary immunodeficiency. J Exp Med 2014;211:1079-91. [Crossref] [PubMed]
- Galgano D, Soheili T, Voss M, et al. Alternative UNC13D Promoter Encodes a Functional Munc13-4 Isoform Predominantly Expressed in Lymphocytes and Platelets. Front Immunol 2020;11:1154. [Crossref] [PubMed]
- Chang TY, Jaffray J, Woda B, et al. Hemophagocytic lymphohistiocytosis with MUNC13-4 gene mutation or reduced natural killer cell function prior to onset of childhood leukemia. Pediatr Blood Cancer 2011;56:856-8. [Crossref] [PubMed]
- Abdel-Aziz AK. Advances in acute myeloid leukemia differentiation therapy: A critical review. Biochem Pharmacol 2023;215:115709. [Crossref] [PubMed]
- Takahashi K, Tanabe K, Ohnuki M, et al. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 2007;131:861-72. [Crossref] [PubMed]
- Bonnet D, Dick JE. Human acute myeloid leukemia is organized as a hierarchy that originates from a primitive hematopoietic cell. Nat Med 1997;3:730-7. [Crossref] [PubMed]
- McKenzie MD, Ghisi M, Oxley EP, et al. Interconversion between Tumorigenic and Differentiated States in Acute Myeloid Leukemia. Cell Stem Cell 2019;25:258-272.e9. [Crossref] [PubMed]
- Nerlov C, Graf T PU.. 1 induces myeloid lineage commitment in multipotent hematopoietic progenitors. Genes Dev 1998;12:2403-12. [Crossref] [PubMed]
- Tyner JW, Tognon CE, Bottomly D, et al. Functional genomic landscape of acute myeloid leukaemia. Nature 2018;562:526-31. [Crossref] [PubMed]
- Duong VT, Ha M, Kim J, et al. Recycling machinery of integrin coupled with focal adhesion turnover via RAB11-UNC13D-FAK axis for migration of pancreatic cancer cells. J Transl Med 2024;22:800. [Crossref] [PubMed]
- Wei C, Zhao D, Xue S, et al. Germline defects of familial hemophagocytic lymphohistiocytosis-related genes presenting as adult-onset peripheral T-cell lymphoma. Front Immunol 2024;15:1365975. [Crossref] [PubMed]
- Chopra M, Bohlander SK. The cell of origin and the leukemia stem cell in acute myeloid leukemia. Genes Chromosomes Cancer 2019;58:850-8. [Crossref] [PubMed]
- Long NA, Golla U, Sharma A, et al. Acute Myeloid Leukemia Stem Cells: Origin, Characteristics, and Clinical Implications. Stem Cell Rev Rep 2022;18:1211-26. [Crossref] [PubMed]
- Manara MC, Pasello M, Scotlandi K. CD99: A Cell Surface Protein with an Oncojanus Role in Tumors. Genes (Basel) 2018;9:159. [Crossref] [PubMed]
- Vaikari VP, Du Y, Wu S, et al. Clinical and preclinical characterization of CD99 isoforms in acute myeloid leukemia. Haematologica 2020;105:999-1012. [Crossref] [PubMed]
- Han JM, Jung HJ. Cyclophilin A/CD147 Interaction: A Promising Target for Anticancer Therapy. Int J Mol Sci 2022;23:9341. [Crossref] [PubMed]
- Liu L, Zhou Y, Ye Z, et al. Single-cell profiling uncovers the intricate pathological niche diversity in brain, lymph node, bone, and adrenal metastases of lung cancer. Discov Oncol 2025;16:512. [Crossref] [PubMed]
- Liang J, Lei K, Liang R, et al. Single-cell RNA sequencing reveals the MIF-ACKR3 receptor-ligand interaction between iCAFs and tumor cells in esophageal squamous cell carcinoma. Cell Signal 2024;117:111093. [Crossref] [PubMed]
- Zhao Z, Jia H, Sun Z, et al. A new perspective on macrophage-targeted drug research: the potential of KDELR2 in bladder cancer immunotherapy. Front Immunol 2024;15:1485109. [Crossref] [PubMed]
- Gu J, An N, Wang X, et al. UNC13D c.2588G>A Nucleotide Variant Impairs NK-Cell Cytotoxicity in Adult-Onset EBV-Associated Hemophagocytic Lymphohistiocytosis: A Pedigree Study. Int J Mol Sci 2025;26:8683. [Crossref] [PubMed]
- Liu C, Liu D, Zheng X, et al. Munc13-4 mediates tumor immune evasion by regulating the sorting and secretion of PD-L1 via exosomes. Nat Commun 2025;16:9080. [Crossref] [PubMed]
- Wei C, Zhang Y, Zhao D, et al. Germline defects of familial haemophagocytic lymphohistiocytosis-Related genes may represent a predisposing factor for mature T- and natural killer-cell lymphoma. Br J Haematol 2025;207:842-50. [Crossref] [PubMed]
- Ansari U, Chen V, Sedighi R, et al. Role of the UNC13 family in human diseases: A literature review. AIMS Neurosci 2023;10:388-400. [Crossref] [PubMed]
- Schulze A, Oshi M, Endo I, et al. MYC Targets Scores Are Associated with Cancer Aggressiveness and Poor Survival in ER-Positive Primary and Metastatic Breast Cancer. Int J Mol Sci 2020;21:8127. [Crossref] [PubMed]
- Oshi M, Gandhi S, Yan L, et al. Abundance of reactive oxygen species (ROS) is associated with tumor aggressiveness, immune response, and worse survival in breast cancer. Breast Cancer Res Treat 2022;194:231-41. [Crossref] [PubMed]

