Liquid-liquid phase separation-related gene signature characterizes prognostic subtypes and therapeutic sensitivities in gastric cancer
Original Article

Bing Yan, Qiying Lao, Lijuan Lin

Department of Anesthesiology, Central People’s Hospital of Zhanjiang, Zhanjiang, China

Contributions: (I) Conception and design: B Yan; (II) Administrative support: B Yan, Q Lao; (III) Provision of study materials or patients: All authors; (IV) Collection and assembly of data: All authors; (V) Data analysis and interpretation: All authors; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Bing Yan, Bachelor’s Degree. Department of Anesthesiology, Central People’s Hospital of Zhanjiang, No. 236 Yuanzhu Road, Chikan District, Zhanjiang 524500, China. Email: yb_919@163.com.

Background: Gastric cancer (GC) is a heterogeneous malignancy with variable outcomes. Liquid-liquid phase separation (LLPS) has emerged as a regulator of cancer-related processes. The study aims to identify GC subgroups associated with LLPS and to establish a prognostic model for patient outcomes.

Methods: The Cancer Genome Atlas (TCGA) GC transcriptomic and clinical data were analyzed. LLPS-related genes were obtained from the Phase Separation Database. Differential expression analysis and unsupervised consensus clustering were performed to define molecular subtypes. A prognostic RiskScore was constructed using least absolute shrinkage and selection operator (LASSO) Cox regression and validated in independent Gene Expression Omnibus cohorts (GSE84437, GSE66229, GSE28541). Intrinsically disordered regions (IDRs) were predicted using the Intrinsically Unstructured Protein Predictor. Model performance and clinical utility were assessed using time-dependent receiver operating characteristic (ROC) curve analysis and decision curve analysis (DCA). Immune cell infiltration, immunotherapy responsiveness, functional pathway enrichment, and drug sensitivity were analyzed using established computational frameworks.

Results: A three-gene LLPS-related prognostic signature (ZFYVE27, GNG11, and DOK7) was used to derive a RiskScore that defined high- and low-risk GC groups with significantly different overall survival across multiple cohorts. Intrinsic disorder analysis revealed that all three proteins contain substantial IDRs, supporting their LLPS-related properties. Time-dependent ROC and concordance index analyses indicated moderate discriminative performance. A nomogram integrating RiskScore and clinical variables demonstrated improved predictive accuracy and calibration. DCA showed that the combined model provided a higher net benefit than the clinical model alone. The risk groups exhibited distinct immune cell infiltration profiles and differences in predicted immunotherapy responsiveness. Gene set enrichment analysis demonstrated enrichment of cell cycle, oncogenic, and inflammatory pathways in the high-risk group, including G2M checkpoint, MYC targets, and IL6/JAK/STAT3 signaling. Drug sensitivity prediction suggested differential therapeutic vulnerabilities, with high-risk patients showing increased predicted sensitivity to JAK inhibitors, dasatinib, and nutlin-3a.

Conclusions: An LLPS-related three-gene RiskScore identifies GC subgroups with different prognosis, immune features, and therapeutic sensitivity.

Keywords: Gastric cancer (GC); immunotherapy; tumor-infiltrating lymphocytes; liquid-liquid phase separation (LLPS); prognosis


Submitted Dec 22, 2025. Accepted for publication Apr 03, 2026. Published online Apr 26, 2026.

doi: 10.21037/tcr-2025-1-2846


Highlight box

Key findings

• Dysregulated liquid-liquid phase separation (LLPS) genes define two gastric cancer (GC) subgroups. A three-gene prognostic model (ZFYVE27, GNG11, DOK7) showed moderate accuracy, linked risk to altered immunity and enriched pathways (e.g., MYC, STAT3), informing potential personalized treatment strategies.

What is known and what is new?

• LLPS is implicated in cancer, but its specific role in GC prognosis and tumor biology remains largely unexplored.

• This study systematically identifies two GC subtypes based on dysregulated LLPS-related genes. It establishes a novel three-gene (ZFYVE27, GNG11, DOK7) prognostic signature, linking LLPS dysregulation to distinct immune microenvironments, key oncogenic pathways (e.g., MYC, STAT3), and potential drug sensitivities, offering a framework for personalized GC management.

What is the implication, and what should change now?

• This work suggests that LLPS dysregulation is a key feature of GC, mechanistically linking it to altered tumor immunity and aggressive pathways. The model offers a potential tool for risk stratification.

• The prognostic signature requires rigorous prospective clinical validation. Future research should functionally validate the roles of ZFYVE27, GNG11, and DOK7 in GC LLPS and experimentally test the predicted drug sensitivities in high-risk patients.


Introduction

Gastric cancer (GC) is the fifth most prevalent cancer globally, affecting over 1 million people annually (1). The delayed detection of GC contributes significantly to its high fatality rate, making it one of the leading causes of cancer death globally (2). GC progression is influenced by factors including Helicobacter pylori infection, dietary habits, and environmental conditions (3). Systemic chemotherapy is the cornerstone of treatment, extending the overall survival (OS) to approximately 12 months (4). Reliance solely on pathological classifications is inadequate for accurate patient categorization and individualized therapy (5). Therefore, it is crucial to identify reliable predictors to improve clinical outcomes.

Beyond genetic mutations, a fundamental hallmark of cancer is the disruption of the spatiotemporal organization of intracellular processes. Liquid-liquid phase separation (LLPS), which drives the assembly of biomolecular condensates, has emerged as a key mechanism coordinating transcription, signaling, and stress responses (6,7). Aberrant LLPS has been implicated in tumor progression by sustaining oncogenic signaling, altering chromatin structure, and promoting genomic instability (8). For example, circSPECC1 can facilitate the LLPS of Autophagy Related 4B Cysteine Peptidase in GC, thereby enhancing its ubiquitination and degradation (9). Moreover, LLPS has been linked to tumor–immune interactions and therapeutic response (10). However, systematic investigation of how the global LLPS network contributes to GC heterogeneity and prognosis remains lacking.

This study sought to construct a prognostic model using differentially expressed LLPS-related genes (DeLRGs) for GC risk stratification and to investigate the potential interaction between LLPS and the GC immune landscape. The findings highlight the significance of LLPS-related genes (LRGs) in GC prognosis and the immune microenvironment, suggesting new therapeutic targets. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1-2846/rc).


Methods

Study cohorts and data sources

The GC cohort was obtained from The Cancer Genome Atlas Stomach Adenocarcinoma (TCGA-STAD) project via the University of California, Santa Cruz Xena Browser (https://xenabrowser.net/). Sample information is summarized in Table S1. Fragments Per Kilobase of transcript per million mapped reads (FPKM) values were transformed to transcripts per million (TPM), followed by batch effect removal using “ComBat” from “sva” (11). The TCGA cohort was used to identify DeLRGs and to develop a risk model. Gene Expression Omnibus (GEO) datasets GSE84437, GSE66229, and GSE28541 (12,13) were used for model validation. LRGs were obtained from the Phase Separation Database (PhaSepDB; http://db.phasep.pro/) (14). The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
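The FPKM-to-TPM transformation mentioned above can be illustrated with a minimal Python sketch. The study itself was conducted in R, so this is not the authors' code; the function name and the toy values are assumptions introduced for illustration. It relies on the standard identity that a gene's TPM equals its FPKM divided by the sum of FPKM values in the same sample, scaled to one million.

```python
def fpkm_to_tpm(fpkm_values):
    """Convert one sample's FPKM values to TPM.

    TPM_i = FPKM_i / sum(FPKM) * 1e6, computed within a single sample.
    """
    total = sum(fpkm_values)
    if total <= 0:
        raise ValueError("FPKM values must sum to a positive number")
    return [value / total * 1_000_000 for value in fpkm_values]


# Illustrative values only, not study data.
sample_fpkm = [10.0, 30.0, 60.0]
sample_tpm = fpkm_to_tpm(sample_fpkm)
```

After conversion, the values of each sample sum to one million, which makes expression levels comparable across samples before batch correction.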

Identification of DeLRGs

In the TCGA cohort, differentially expressed genes (DEGs) were identified using the R package “limma” (15) [adjusted P<0.05, |log2 fold change (FC)| >1] (16). The intersection of DEGs and LRGs, visualized with a Venn diagram, was designated as DeLRGs.
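The thresholding and intersection steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the differential expression statistics would come from “limma” in the actual analysis, and the gene names, values, and the helper `select_degs` are hypothetical.

```python
def select_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return genes with |log2 FC| > lfc_cutoff and adjusted P < padj_cutoff.

    `results` maps a gene symbol to a (log2 fold change, adjusted P) pair.
    """
    return {
        gene
        for gene, (log2_fc, adj_p) in results.items()
        if abs(log2_fc) > lfc_cutoff and adj_p < padj_cutoff
    }


# Hypothetical differential expression results and LRG list.
toy_results = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.6, 0.02),
    "GENE_C": (0.4, 0.001),   # fails the fold-change cutoff
    "GENE_D": (1.8, 0.20),    # fails the adjusted P cutoff
}
toy_lrgs = {"GENE_B", "GENE_C", "GENE_E"}

degs = select_degs(toy_results)
delrgs = degs & toy_lrgs  # set intersection, as depicted by the Venn diagram
```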

Enrichment analysis

DEG functional enrichment was assessed by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) using “clusterProfiler” in R (17). Enrichment results were reported with Benjamini-Hochberg-corrected P values.

Consensus clustering

GC subtypes were identified from DeLRG expression in TCGA using ConsensusClusterPlus (18) with Spearman distance, 500 iterations, and 80% resampling of items. The cumulative distribution function (CDF) curve, consistency score, and consistency matrix were analyzed to determine the optimal number of subtypes (k). DEGs between subtypes were identified using “limma” (|log2 FC| >1, adjusted P<0.05).
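The idea behind the consensus matrix can be illustrated with a short Python sketch. It is not the ConsensusClusterPlus implementation: the clustering of each resampled subset is assumed to have been done already, and the function name and toy labels are hypothetical. For every pair of samples, the consensus value is the fraction of resampling runs containing both samples in which they were assigned to the same cluster.

```python
from itertools import combinations


def consensus_matrix(items, runs):
    """Build a consensus matrix from resampled clustering runs.

    `runs` is a list of dicts, each mapping the items sampled in that run
    to the cluster label they received.
    """
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for labels in runs:
        for a, b in combinations(labels, 2):
            i, j = index[a], index[b]
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[a] == labels[b]:
                together[i][j] += 1
                together[j][i] += 1
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 1.0
            elif sampled[i][j]:
                matrix[i][j] = together[i][j] / sampled[i][j]
    return matrix


# Three hypothetical resampling runs over three samples.
toy_items = ["s1", "s2", "s3"]
toy_runs = [
    {"s1": 0, "s2": 0, "s3": 1},
    {"s1": 1, "s2": 1},
    {"s1": 0, "s2": 1, "s3": 0},
]
toy_consensus = consensus_matrix(toy_items, toy_runs)
```

Values near 0 or 1 indicate stable assignments; a CDF of these values that is close to a step function is what supports a given choice of k.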

Identification of feature genes

Univariate Cox analysis was conducted on TCGA survival data to identify prognosis-related genes (P<0.05) among the DEGs between GC subtypes. The TCGA cohort was split into training and test sets at a 7:3 ratio, and least absolute shrinkage and selection operator (LASSO) regression was fitted on the training set using “glmnet” in R (19). The optimal λ was selected by ten-fold cross-validation as the value that minimized the cross-validation error. Genes with non-zero coefficients were retained, and their weight coefficients were combined with expression levels for risk score calculation:

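$$\mathrm{RiskScore} = \sum_{i=1}^{n} \mathrm{coef}_i \times \mathrm{gene}_i$$

where coef_i is the LASSO weight coefficient of the i-th retained gene, gene_i is its expression level, and n is the number of retained genes.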

Patients were stratified into high- and low-risk groups based on the median RiskScore in each cohort.
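As a worked illustration of the score and the median split, the Python sketch below uses the three coefficients reported in the Results. The expression values and patient identifiers are invented, and assigning patients exactly at the median to the low-risk group is an assumption, since the tie rule is not specified in the text.

```python
from statistics import median

# Coefficients as reported in the Results for the three model genes.
COEFFICIENTS = {"ZFYVE27": -0.00193, "GNG11": 0.00287, "DOK7": -0.00234}


def risk_score(expression):
    """Weighted sum of expression levels over the model genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(scores):
    """Split patients at the cohort median (ties go to 'low': an assumption)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression values, not study data.
toy_expression = {
    "P1": {"ZFYVE27": 100.0, "GNG11": 400.0, "DOK7": 50.0},
    "P2": {"ZFYVE27": 300.0, "GNG11": 100.0, "DOK7": 200.0},
    "P3": {"ZFYVE27": 200.0, "GNG11": 300.0, "DOK7": 100.0},
    "P4": {"ZFYVE27": 250.0, "GNG11": 150.0, "DOK7": 150.0},
}
toy_risk = {pid: risk_score(expr) for pid, expr in toy_expression.items()}
toy_groups = stratify(toy_risk)
```

Because the median is recomputed within each cohort, the cutoff differs between datasets, so group labels are relative to the cohort rather than tied to a fixed score.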

Intrinsic disorder prediction

Protein sequences corresponding to the model genes were obtained from the Universal Protein Resource (UniProt) database (ZFYVE27, Q5T4F4; GNG11, P61952; DOK7, Q18PE1). Intrinsically disordered regions (IDRs) were predicted using the IUPred3 online tool (https://iupred.elte.hu/) under the “long disorder” mode. Residues with scores >0.5 were defined as disordered. Site-level disorder scores were extracted for downstream visualization and analysis.
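Given per-residue disorder scores, contiguous disordered regions can be extracted as in the Python sketch below. The scores would come from IUPred3 in practice; the function, its `min_length` parameter, and the toy score vector are assumptions made for illustration.

```python
def disordered_regions(scores, threshold=0.5, min_length=1):
    """Return 1-based inclusive (start, end) spans where score > threshold."""
    regions = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = position  # open a new disordered span
        elif start is not None:
            if position - start >= min_length:
                regions.append((start, position - 1))
            start = None
    # Close a span that runs to the end of the sequence.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


# Hypothetical per-residue scores for a seven-residue peptide.
toy_disorder = [0.2, 0.6, 0.7, 0.4, 0.5, 0.9, 0.8]
toy_regions = disordered_regions(toy_disorder)
toy_fraction = sum(end - begin + 1 for begin, end in toy_regions) / len(toy_disorder)
```

A residue scoring exactly 0.5 is treated as ordered here, matching the strict ">0.5" criterion stated above.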

Protein-protein interaction (PPI) network

GeneMANIA (20) was utilized for PPI network construction to identify proteins that potentially exhibit shared functions.

Nomogram construction and evaluation

Cox regression analysis was conducted on the risk score, along with demographic and disease-related factors. Forest plots were generated by “forestplot”. A nomogram was developed using “rms”. Calibration curves were produced by the “calibrate” function to assess the performance of the nomogram.

Decision curve analysis (DCA)

To evaluate the clinical applicability of the nomogram, DCA was performed in both the training and validation cohorts using the ggDCA package in R. Net benefit was calculated to determine whether nomogram-assisted decision making provided superior clinical value compared with treat-all, treat-none, or tumor-node-metastasis (TNM) stage-based strategies across a range of threshold probabilities. Decision curves were generated to assess 1-, 2-, and 3-year OS by integrating true-positive rates while penalizing false-positive classifications according to the relative clinical consequences of unnecessary intervention versus missed events.
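The net benefit underlying a decision curve can be sketched in Python for the simplified case of a binary outcome without censoring. This is an assumption made for illustration: the ggDCA analysis of survival outcomes accounts for censoring, and the function names and toy data are hypothetical. At threshold probability p_t, net benefit is TP/n − FP/n × p_t/(1 − p_t).

```python
def net_benefit(predicted_risk, event, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(event)
    tp = sum(1 for r, e in zip(predicted_risk, event) if r >= threshold and e == 1)
    fp = sum(1 for r, e in zip(predicted_risk, event) if r >= threshold and e == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(event, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(event) / len(event)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted risks and observed events (1 = event occurred).
toy_pred = [0.9, 0.8, 0.3, 0.2, 0.6]
toy_event = [1, 1, 0, 0, 0]
nb_model = net_benefit(toy_pred, toy_event, 0.5)
nb_all = net_benefit_treat_all(toy_event, 0.5)
```

The treat-none strategy has a net benefit of zero by definition, so a model is useful at a given threshold only if its curve lies above both zero and the treat-all line.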

Immune landscape analysis

The relative abundance of 22 tumor-infiltrating immune cell types in TCGA GC samples was estimated using the CIBERSORT algorithm (21) based on TPM-normalized expression data. Samples with CIBERSORT P<0.05 were retained. Immune cell types with zero abundance across all samples were excluded. Immune infiltration was additionally evaluated using xCell (22) and Estimating the Proportions of Immune and Cancer cells (EPIC) (23) algorithms implemented in the Immuno-Oncology Biological Research R package. Tumor immune and stromal components were quantified using the Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression (ESTIMATE) algorithm to derive ImmuneScore, StromalScore, and tumor purity. Associations between RiskScore and immune checkpoint genes, including PDCD1 [programmed death-1 (PD-1)], CD274 [programmed death ligand-1 (PD-L1)], PDCD1LG2 [programmed death ligand-2 (PD-L2)], and CTLA4, were assessed using Pearson correlation analysis. Tumor Immune Dysfunction and Exclusion (TIDE) analysis (http://tide.dfci.harvard.edu) was used to calculate TIDE, T-cell dysfunction, and T-cell exclusion scores as indicators of tumor immune evasion. Differences between high- and low-risk groups were evaluated using the Wilcoxon rank-sum test.
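For reference, the Pearson correlation coefficient used for the checkpoint associations can be computed as in this minimal Python sketch. The function name and the toy vectors are assumptions for illustration, and no significance test is included.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical RiskScores and checkpoint gene expression values.
toy_risk_scores = [0.1, 0.4, 0.5, 0.9]
toy_checkpoint = [1.0, 2.5, 3.0, 5.0]
toy_r = pearson_r(toy_risk_scores, toy_checkpoint)
```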

Association of RiskScore with GC molecular subtypes and human epidermal growth factor receptor 2 (HER2) status

Molecular subtype information [Epstein-Barr virus (EBV), microsatellite instability (MSI), chromosomal instability (CIN), and genomically stable (GS)] for TCGA-STAD samples was obtained from published annotations (24). Differences in RiskScore among subtypes were evaluated using the Kruskal-Wallis test. HER2 status was inferred based on Erb-B2 receptor tyrosine kinase 2 (ERBB2) messenger RNA (mRNA) expression levels. Patients were stratified into high and low expression groups according to the upper quantile threshold, and differences in RiskScore between groups were assessed using the Wilcoxon rank-sum test.

Gene set enrichment analysis (GSEA)

GSEA was conducted on Hallmark gene sets using the R package “clusterProfiler”. Pathway enrichment was specifically investigated in high-risk patients.

Drug sensitivity prediction

Data from the Cancer Therapeutics Response Portal version 2 and Genomics of Drug Sensitivity in Cancer databases were analyzed using “calcPhenotype” in “oncoPredict”. The expression matrix and drug treatment data from the training set were utilized to model the TCGA cohort. Half maximal inhibitory concentration (IC50) values for the TCGA cohort were computed, and their correlation with the risk score was assessed. The PreScore represents the predicted drug response score for patients. A lower PreScore indicates a higher sensitivity. Differential drug response between risk groups was assessed using log FC as effect size. P values were adjusted using the Benjamini-Hochberg method to control the false discovery rate (FDR).
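The Benjamini-Hochberg adjustment applied to the drug-response P values can be written compactly in Python. The sketch below is illustrative (the study used R), and the function name and toy P values are assumptions. Each P value is multiplied by m/rank, and monotonicity is enforced from the largest P value downward.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted P values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest, keeping a running minimum
    # so that adjusted values never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four drugs.
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_fdr = benjamini_hochberg(toy_p)
```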

Statistical analysis

The “survival” package was used for survival analysis, and the results were visualized with “survminer”. Receiver operating characteristic (ROC) analysis was conducted using “timeROC” and presented using “pROC”. Heatmaps were generated using “pheatmap”. Result diagrams were generated using “plot” or “ggplot2”. Pearson’s method was applied for correlation analysis. Group differences were assessed with the t-test. All analyses were conducted using R (4.3.1). Statistical significance was defined as P<0.05. Concordance index (C-index) was calculated to quantify the discriminative ability of the prognostic model using the “survival” and “survcomp” R packages, with 95% confidence intervals (CIs) estimated. Bootstrap resampling (1,000 iterations) was performed to evaluate model stability and obtain robust estimates.
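The C-index and its bootstrap interval can be illustrated with the Python sketch below. It implements Harrell's pairwise definition and a percentile bootstrap, which is an assumption: the interval method used by “survcomp” may differ, and the function names and toy data are hypothetical.

```python
import random


def harrell_c_index(times, events, scores):
    """Harrell's C: share of comparable pairs ranked correctly by the score.

    A pair (i, j) is comparable when i had an observed event before time j.
    Higher scores are expected to mean shorter survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_c_index(times, events, scores, n_boot=1000, seed=0):
    """Percentile 95% interval from resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(times)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(harrell_c_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [scores[k] for k in idx],
            ))
        except ValueError:
            continue  # skip resamples without comparable pairs
    estimates.sort()
    k = len(estimates)
    return estimates[int(0.025 * k)], estimates[min(k - 1, int(0.975 * k))]


# Hypothetical follow-up times, event indicators, and risk scores.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.3, 0.4, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
toy_ci = bootstrap_c_index(toy_times, toy_events, toy_scores, n_boot=200)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported C-index values should be read.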


Results

DeLRG identification

From PhaSepDB, 1,419 LRGs were identified. Differential expression analysis of the TCGA dataset identified 8,864 DEGs (Figure 1A, table available at https://cdn.amegroups.cn/static/public/tcr-2025-1-2846-1.xlsx), with the top 40 displayed in the heatmap (Figure 1B). The intersection of LRGs and DEGs identified 148 DeLRGs. GO functional enrichment analysis revealed 15 enriched biological processes, including “RNA splicing” and “RNA localization”. The enriched cellular components comprised 10 categories, such as “ribonucleoprotein granule”, “PcG protein complex”, and “nuclear speck”. The molecular functions included 6 categories, such as “molecular condensate scaffold activity”, “modification-dependent protein binding”, and “ribonucleoprotein complex binding” (Figure 1C, Table S2).

Figure 1 Identification and characterization of DeLRGs in GC. (A) A volcano plot illustrates DEGs in the TCGA GC cohort. Red dots represent upregulated genes, and blue dots represent downregulated genes. The top 20 significantly upregulated and downregulated genes are labeled. (B) A heatmap displays the top 40 DEGs. Each row represents a gene, and each column represents a sample. The color gradient indicates the level of gene expression, with red representing higher expression and blue representing lower expression. (C) GO functional enrichment analysis of the DeLRGs. The bar graph categorizes the enriched GO terms into biological processes (orange), cellular components (green), and molecular functions (blue). The length of each bar represents the number of genes associated with each GO term. DEGs, differentially expressed genes; DeLRGs, differentially expressed liquid-liquid phase separation-related genes; GC, gastric cancer; GO, Gene Ontology; TCGA, The Cancer Genome Atlas.

Identification of GC subtypes based on DeLRGs

Unsupervised consensus clustering based on the 148 DeLRGs stratified the TCGA cohort into two molecular subtypes, Cluster 1 and Cluster 2 (Figure 2A), supported by clustering stability analyses and further validated by dimensionality-reduction approaches (Figure 2B-2F). The heatmap illustrated the expression patterns of the 148 DeLRGs across different subtypes (Figure 2G). Eleven immune cell types showed statistically significant differences in abundance between subtypes (Figure 2H). One subtype was characterized by activation of cell cycle- and translation-associated biological processes, including cytoplasmic translation, DNA-templated DNA replication, and mitotic sister chromatid segregation, whereas suppressed processes were mainly related to muscle system processes and responses to toxic substances (Figure 2I, table available at https://cdn.amegroups.cn/static/public/tcr-2025-1-2846-2.xlsx). Consistently, KEGG pathway analysis showed enrichment of ribosome biogenesis, oxidative phosphorylation, and DNA replication in the activated pathways, while suppressed pathways were predominantly associated with fatty acid degradation and multiple metabolic and detoxification processes (Figure 2J, table available at https://cdn.amegroups.cn/static/public/tcr-2025-1-2846-3.xlsx). These findings suggest that the DeLRG-related subtypes differ in gene expression profiles, functional pathway enrichment, and immune cell composition.

Figure 2 Consensus clustering of GC patients based on DeLRGs and functional enrichment analysis. (A) Consensus matrix for k=2, showing distinct clustering into two groups with high consensus scores. (B) A consensus CDF plot depicts the cumulative distribution for different values of k ranging from 2 to 9. (C) A delta area plot shows the relative change in area under the CDF curve for different values of k, with a notable decrease indicating the stability of the clustering at k=2. (D) A cluster-consensus bar plot indicates the clustering consistency for each cluster. (E) A principal component analysis scatter plot shows the distribution of the two identified clusters. (F) A tSNE plot illustrates the clustering of the GC patients into two distinct subtypes. Each point represents a patient and is colored according to the assigned cluster. (G) A heatmap illustrates the expression patterns of 148 DeLRGs across the identified subtypes. (H) A box plot shows differences in the abundance of 11 immune cell types between the two identified clusters. ns, not significant; *, P<0.05; **, P<0.01; ***, P<0.001; ****, P<0.0001. GO (I) and KEGG (J) functional enrichment analyses in the identified clusters. The dot plot shows the top enriched GO terms or KEGG pathways for activated (left panel) and suppressed (right panel) biological processes or pathways. The size of the dots represents the gene ratio, and the color gradient indicates the adjusted P value for enrichment significance. CDF, cumulative distribution function; DeLRGs, differentially expressed liquid-liquid phase separation-related genes; GC, gastric cancer; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; tSNE, t-distributed Stochastic Neighbor Embedding.

Development of a prognostic risk model using DEGs from GC subtypes

To identify prognostic biomarkers for GC, DEGs between the DeLRG-related subtypes were analyzed. Compared to Cluster 2, Cluster 1 exhibited 5,962 upregulated DEGs and 4,950 downregulated DEGs (Figure 3A, table available at https://cdn.amegroups.cn/static/public/tcr-2025-1-2846-4.xlsx). Among these, 464 genes were filtered through univariate COX analysis (table available at https://cdn.amegroups.cn/static/public/tcr-2025-1-2846-5.xlsx). Figure 3B presents the LASSO coefficient profiles, while Figure 3C shows the optimal lambda selected by ten-fold cross-validation. LASSO COX analysis on the training set identified 3 genes (ZFYVE27, GNG11, and DOK7) with the minimum lambda. The weight coefficients (Figure 3D) were used to compute the risk score: (−0.00193) × ZFYVE27 + (0.00287) × GNG11 + (−0.00234) × DOK7. Subsequently, a multivariate COX analysis was conducted on these genes (Figure 3E), followed by the construction of a PPI network (Figure 3F). Compared to adjacent normal tissues, GC tissues showed significant differential expression of these genes (Figure 3G).

Figure 3 Identification and prognostic analysis of DEGs in GC subtypes. (A) A volcano plot illustrates the DEGs identified between GC subtypes with criteria of |log2 FC| >1 and adjusted P<0.05. (B) A trace plot for the LASSO COX regression shows the coefficient paths for each variable as a function of the regularization parameter lambda. (C) Ten-fold cross-validation curve for LASSO COX regression. Dashed lines indicate minimum lambda and optimal lambda. (D) Bar plot of the weight coefficients of the three selected genes (ZFYVE27, GNG11, and DOK7). (E) Forest plot of the multivariate COX regression analysis for the model genes. HR and 95% CI are shown. (F) Protein-protein interaction network of the model genes and other related proteins. Nodes represent proteins, and edges represent interactions. (G) Box plots comparing the expression levels of the model genes between GC tissues and normal tissues. *, P<0.05; ***, P<0.001; ****, P<0.0001. AIC, Akaike Information Criterion; CI, confidence interval; DEGs, differentially expressed genes; FC, fold change; GC, gastric cancer; HR, hazard ratio; LASSO, least absolute shrinkage and selection operator.

Sequence-based characterization of intrinsic disorder in model proteins

To characterize LLPS-related molecular features of the model genes, IDR analysis was performed, as IDRs facilitate multivalent interactions underlying phase separation (25). DOK7 exhibited a long C-terminal IDR (residues ~210–504), accounting for a substantial proportion of the protein length (Figure S1A). GNG11 displayed consistently elevated disorder scores across most of its sequence (Figure S1B). ZFYVE27 showed a prominent central IDR (residues ~220–300) with high disorder propensity (Figure S1C). These results provide structural evidence supporting the involvement of the model genes in LLPS-related mechanisms.

Prognostic evaluation

To assess the model’s prognostic value, survival analyses were performed on the TCGA cohort. In both training and test sets, high-risk patients showed a higher mortality rate (Figure 4A,4B). Kaplan-Meier survival curves demonstrated significantly poorer OS for high-risk patients in both datasets (P<0.0001 and P=0.02; Figure 4C,4D). Moderate predictive accuracy was observed in the training set, with area under the curve (AUC) values of 0.68, 0.71, and 0.69 for 1-, 2-, and 3-year survival (Figure 4E). In the test set, AUCs were 0.56, 0.61, and 0.56 (Figure 4F). Dimensionality-reduction analysis showed separation between the two clusters (Figure S2A). Cluster 2 exhibited significantly higher RiskScores than Cluster 1 (****, P<0.0001; Figure S2B), and Sankey analysis illustrated that Cluster 2 patients were predominantly classified into the high-risk group and experienced poorer survival outcomes (Figure S2C). These results demonstrate stable prognostic stratification by the RiskScore.

Figure 4 Prognostic evaluation of the risk model in The Cancer Genome Atlas stomach adenocarcinoma cohort. (A,B) Risk score distribution and survival status of patients in the training (A) and validation (B) cohorts. The top panel shows patients stratified into high-risk (red) and low-risk (gray) groups based on their risk scores. The bottom panel illustrates the survival status of patients, with red dots representing deceased patients and gray dots representing surviving patients. (C,D) Kaplan-Meier survival curves for the training (C) and validation (D) cohorts, comparing overall survival between high-risk (red line) and low-risk (gray line) patients. (E,F) Time-dependent receiver operating characteristic curves for the training (E) and validation (F) cohorts at 1 year, 2 years, and 3 years. AUC, area under the curve.

Validation in external datasets and clinical subgroups

The prognostic model was further tested in independent datasets: GSE84437, GSE66229, and GSE28541. Compared to patients at low risk, patients at high risk showed a higher death rate (Figure S3A-S3C) and worse OS (all P<0.05; Figure S3D-S3F). ROC analysis showed moderate accuracy, with 1-, 2-, and 3-year AUCs ranging from 0.5 to 0.67 across cohorts (Figure S3G-S3I). Moreover, compared to patients at low risk, patients at high risk consistently had poorer OS regardless of sex (males: P<0.001, females: P<0.001), age (≤65 years: P=0.001, >65 years: P<0.001), tumor stage (P=0.01 for stage I–II, P<0.001 for stage III–IV), and lymph node involvement (N0: P=0.01, N1–N3: P<0.001). Exceptions were observed regarding tumor size (T3–T4: P<0.001, T1–T2: P=0.20) and metastasis (M0: P<0.001, M1: P=0.84) (Figure S4A-S4L). These results support the generalizability of the RiskScore, with reduced performance in specific clinical subgroups.

RiskScore was further evaluated in relation to established molecular classifications in the TCGA-STAD cohort. The GS subtype had the highest RiskScore, whereas the EBV and MSI subtypes had lower RiskScores, consistent with their relatively favorable clinical outcomes (Figure S5A). Patients with lower HER2 expression had higher RiskScores than those with higher expression (Figure S5B). These patterns suggest that RiskScore is associated with molecular heterogeneity in GC.

C-index values were calculated to assess model discrimination across all cohorts. The C-index was 0.653 (95% CI: 0.594–0.711) in the TCGA training set and 0.562 (95% CI: 0.465–0.659) in the internal testing set, while external cohorts ranged from 0.462 to 0.573 (Table S3). Bootstrap resampling (1,000 iterations) showed relatively stable distributions, with lower values observed in more heterogeneous cohorts (Figure S6). These findings indicate modest discriminative performance that varies across datasets; the lowest external value (0.462) falls below 0.5, the level expected by chance.

Construction and evaluation of a nomogram

To provide individualized survival predictions, a nomogram was developed. Univariate Cox regression identified RiskScore, age, stage, and TNM stages as OS predictors (Figure 5A). Multivariate analysis confirmed RiskScore as an independent predictor for prognosis (Figure 5B). By integrating the RiskScore and clinicopathological factors, a nomogram was developed (Figure 5C). Calibration curves showed reasonable alignment between expected and actual outcomes (Figure 5D-5F). Moderate predictive accuracy was observed for 1- (AUC =0.69), 2- (AUC =0.73), and 3-year (AUC =0.66) survival (Figure 5G). These findings indicate that the RiskScore is an independent prognostic factor for OS in GC patients.

Figure 5 Nomogram construction and evaluation. (A,B) Forest plots of univariate and multivariate COX analyses for risk score and clinical factors. (C) A nomogram was developed to provide individualized survival predictions, integrating age, gender, TNM stages, and the risk score for 1-, 2-, and 3-year overall survival. (D-F) Calibration curves for 1-, 2-, and 3-year OS show agreement between predicted and observed outcomes. (G) The receiver operating characteristic curves for nomogram-predicted 1-, 2-, and 3-year survival. AUC, area under the curve; CI, confidence interval; OS, overall survival; prob., probability; TNM, tumor-node-metastasis.

Incremental clinical value of the RiskScore evaluated by DCA

To assess the clinical applicability of the RiskScore, DCA was performed in the training and validation cohorts by comparing the clinical model incorporating age, sex, and TNM stage, the RiskScore model, and a combined model integrating molecular and clinical variables. In the training cohort (Figure 6A), the combined model consistently demonstrated the highest net benefit across a broad range of threshold probabilities for 1-, 2-, and 3-year OS. As the threshold probability increased, the net benefit of the clinical model progressively declined and approached the reference line representing no-intervention strategies, whereas the combined model maintained a stable and positive net benefit. This pattern indicates that the combined model more effectively identifies patients who are likely to experience adverse outcomes and may benefit from intensified clinical management. In the validation cohort (Figure 6B), although the absolute net benefit was attenuated, the combined model remained comparable to or slightly better than the clinical model, particularly in the prediction of 3-year OS. Together, these findings indicate that integrating the LLPS-based RiskScore with clinicopathological factors improves clinical decision-making utility compared with the clinical model alone.

Figure 6 Decision curve analysis evaluating the clinical utility of the RiskScore. Decision curve analysis comparing net benefit among the clinical model (age, sex, and TNM stage), the RiskScore model, and the combined model integrating molecular and clinical variables. (A) Decision curves for 1-, 2-, and 3-year OS in the training cohort. (B) Decision curves for 1-, 2-, and 3-year OS in the validation cohort. The y-axis represents net benefit, and the x-axis indicates threshold probability. The dashed line denotes treating all patients, and the horizontal line represents treating none. Across multiple threshold probabilities, the combined model shows a higher net benefit than the clinical model alone, indicating improved clinical decision-making utility when the RiskScore is incorporated. OS, overall survival; TNM, tumor-node-metastasis.

Immune cell profiling

To further characterize biological differences associated with RiskScore-based stratification, immune cell composition was analyzed. CIBERSORT identified differences in the relative abundance of multiple immune cell subsets between high- and low-risk groups (Figure 7A). Immune infiltration was further assessed using EPIC and xCell. EPIC showed higher cancer-associated fibroblasts (CAFs), endothelial cells, macrophages, and B cells in the high-risk group (Figure S7A). xCell similarly showed increased fibroblasts, endothelial cells, macrophages, and CD4+ T cells, along with higher overall immune scores in the high-risk group (Figure S7B), consistent with the CIBERSORT results. A Chi-squared test revealed a lower therapy response rate in patients at high risk compared to those at low risk (Figure 7B). Patients at high risk had significantly increased StromalScore, ImmuneScore, and ESTIMATEScore (Figure 7C). Correlation analysis identified significant associations between model gene expression (ZFYVE27, GNG11, DOK7) and immune checkpoints (CTLA4, PDCD1LG2, PDCD1, CD274) (Figure 7D,7E). Additionally, TIDE, CD8, dysfunction, exclusion, and CAF scores were higher, whereas myeloid-derived suppressor cell levels were lower in the high-risk group (Figure 7F). These results indicate that RiskScore stratification is associated with differences in the immune microenvironment.

Figure 7 Correlation between the RiskScore and immune infiltration in The Cancer Genome Atlas stomach adenocarcinoma cohort. (A) The comparison of immune cell infiltration between high- and low-risk groups shows significant differences in the proportions of various immune cells. (B) The Chi-squared test reveals that high-risk patients have a lower response rate to therapy compared to low-risk patients. (C) Analysis of TME scores demonstrates that high-risk patients have significantly higher StromalScore, ImmuneScore, and ESTIMATEScore. (D) Correlation analysis identifies significant associations between the expression of model genes (ZFYVE27, GNG11, DOK7) and immune checkpoints (CTLA4, PDCD1LG2, PDCD1, CD274). (E) The bubble plot shows significant correlations between model genes. (F) Comparisons of various immune-related scores between high-risk and low-risk groups show significant differences. ns, not significant; *, P<0.05; **, P<0.01; ***, P<0.001; ****, P<0.0001. TME, tumor microenvironment.

GSEA enrichment analysis and drug sensitivity in patients with different risks

To characterize biological programs associated with RiskScore-based stratification, GSEA was performed. High-risk patients showed enrichment of cell cycle, oncogenic, and inflammatory pathways, including G2M checkpoint, hypoxia, IL6/JAK/STAT3, interferon-gamma, KRAS, MYC targets V1, p53, and TNFA (Figure 8A, table available at https://cdn.amegroups.cn/static/public/tcr-2025-1-2846-6.xlsx). Drug sensitivity was then predicted in silico. High-risk patients exhibited lower predicted IC50 values, indicating greater predicted sensitivity, for BMS-754807_2171, dasatinib, JAK inhibitors, and nutlin-3a, whereas reduced predicted sensitivity was observed for lapatinib, sapitinib, SCH772984, and paclitaxel (Figure 8B). Correlation analysis further identified significant associations between RiskScore and predicted drug response across multiple agents (Figure 8C). Notably, the drugs showing increased predicted sensitivity in the high-risk group predominantly target signaling programs aligned with the pathways enriched by GSEA, suggesting a potential association between pathway activation states and pharmacologic response patterns. Detailed results are provided in the table available at https://cdn.amegroups.cn/static/public/tcr-2025-1-2846-7.xlsx.
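As an illustration of how an association between RiskScore and predicted drug response could be quantified, the following minimal sketch computes a rank-based (Spearman) correlation in plain Python. The choice of Spearman correlation is an assumption, because the correlation method is not specified in this section. The values are hypothetical, and the helper names `average_ranks`, `pearson`, and `spearman` are introduced for illustration only.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: a negative coefficient means that a higher RiskScore
# goes with a lower predicted IC50, i.e., greater predicted sensitivity.
risk_scores = [0.2, 0.5, 0.9, 1.4, 1.8]
predicted_ic50 = [3.1, 2.9, 2.2, 2.4, 1.5]
print(f"Spearman rho = {spearman(risk_scores, predicted_ic50):.2f}")
```

A rank-based coefficient is used in this sketch because it does not assume a linear relationship between the score and the predicted IC50.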

Figure 8 Pathway enrichment and drug sensitivity associated with the RiskScore in The Cancer Genome Atlas stomach adenocarcinoma cohort. (A) GSEA showing hallmark pathways significantly enriched in high-risk compared with low-risk patients. Enrichment score was plotted against gene rank in the ordered dataset. (B) Comparison of predicted drug sensitivity scores (PreScores) between high-risk (yellow) and low-risk (blue) groups for selected therapeutic agents. (C) Correlation analysis between the RiskScore and predicted drug sensitivity. Circle size indicates correlation strength. Only drugs with P<0.001 are displayed. ****, P<0.001. GSEA, gene set enrichment analysis.

Discussion

This study developed and validated a three-gene LLPS-related prognostic model (ZFYVE27, GNG11, and DOK7) that stratified GC patients by OS and demonstrated incremental clinical utility over clinicopathological models in DCA. In addition to prognostic stratification, the RiskScore was associated with differences in tumor immune characteristics and drug sensitivity, suggesting that LLPS-related stratification corresponds to variation in tumor biology and treatment response.

The discriminative performance of the LLPS-related RiskScore, reflected by moderate time-dependent AUC values in external cohorts, is consistent with the biological and clinical heterogeneity of GC, in which survival outcomes are determined by multiple interacting molecular, pathological, and treatment-related factors that cannot be fully captured by transcriptomic features alone. Discrimination-based metrics primarily rank patients by risk but are insensitive to whether a model improves downstream clinical decisions. Notably, modest AUC values do not preclude clinical relevance when a model provides incremental risk stratification beyond conventional clinicopathological variables. DCA directly evaluated this incremental value by quantifying the net clinical consequences of model-guided decisions across clinically plausible threshold probabilities. The observed improvement in net benefit after incorporating the RiskScore indicates that the model meaningfully reduces unnecessary interventions in low-risk patients while enhancing the identification of high-risk individuals, thereby supporting its utility as a decision-support tool rather than a stand-alone discriminator.
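The net benefit quantity underlying DCA can be made concrete with a short sketch. At a threshold probability pt, net benefit is the proportion of true positives minus the proportion of false positives weighted by pt/(1 - pt). The example below treats the outcome as binary at a fixed time point and ignores censoring, which is a simplifying assumption; survival-based DCA typically replaces the raw counts with Kaplan-Meier estimates. All data and function names are hypothetical.

```python
def net_benefit(events, predicted_risk, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(events)
    true_pos = sum(1 for y, r in zip(events, predicted_risk) if r >= threshold and y == 1)
    false_pos = sum(1 for y, r in zip(events, predicted_risk) if r >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(events, threshold):
    """Net benefit of the 'treat all' reference strategy."""
    prevalence = sum(events) / len(events)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy cohort: 1 = event by the time point of interest, 0 = no event.
events = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
predicted_risk = [0.9, 0.8, 0.6, 0.7, 0.2, 0.1, 0.3, 0.4, 0.35, 0.05]

for pt in (0.2, 0.4, 0.5):
    model_nb = net_benefit(events, predicted_risk, pt)
    all_nb = net_benefit_treat_all(events, pt)
    # The 'treat none' strategy has a net benefit of 0 at every threshold.
    print(f"pt={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}  treat-none=0.000")
```

A model is considered useful at a given threshold when its net benefit exceeds both reference strategies, which is the comparison drawn in the decision curves.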

This decision-level utility is supported by the convergence of the three model genes on pathway activity, immune regulation, and treatment-related signaling in pathways previously linked to LLPS. In addition, intrinsic disorder analysis showed that ZFYVE27, GNG11, and DOK7 contain substantial IDRs, supporting their potential involvement in biomolecular condensation. DOK7 carried a negative coefficient in our model, consistent with a potential tumor-suppressive role. DOK7 has been reported to be hypermethylated in GC (26), leading to transcriptional silencing. DOK7 is an activator of MuSK that has been implicated in rapsyn-mediated condensate formation (27); its loss may therefore reflect impaired compartmentalization and the disruption of tumor-suppressive signaling, including the PI3K/AKT pathway (28). Conversely, GNG11 was identified as a risk factor in our model. Known as a senescence-associated prognostic gene in GC (29), GNG11 participates in GPCR-mediated signaling, which frequently operates through signalosomes that may involve phase separation (30), potentially sustaining proliferative and inflammatory signaling programs. Interestingly, our analysis revealed a protective role for ZFYVE27 (protrudin) in the context of LLPS-related stratification, which contrasts with a previous study linking it to CAF signatures and poor survival (31). This discrepancy highlights the complexity of ZFYVE27’s function in the tumor microenvironment versus the tumor cell interior. ZFYVE27 regulates endoplasmic reticulum (ER) morphology and vesicular trafficking (32), which are essential for establishing intracellular nucleation sites for biomolecular condensates (33). We speculate that while ZFYVE27 may promote aggressiveness when expressed in the stroma, its intracellular maintenance of ER-mediated phase separation in tumor cells might act as a constraint on genomic instability, consistent with the negative coefficient observed in our model.
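The coefficient signs discussed above can be illustrated with a minimal sketch of a linear RiskScore, computed as the sum of each gene's expression multiplied by its model coefficient. The coefficient magnitudes below are hypothetical placeholders, and only their signs follow the text (negative for ZFYVE27 and DOK7, positive for GNG11). The median cutoff and the expression values are likewise assumptions made for illustration.

```python
from statistics import median

# Hypothetical coefficients: magnitudes are placeholders, signs follow the text.
COEFFICIENTS = {"ZFYVE27": -0.30, "GNG11": 0.25, "DOK7": -0.20}


def risk_score(expression):
    """Linear predictor: sum of coefficient * expression over the model genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(scores):
    """Label each score 'high' or 'low' relative to the cohort median (assumed cutoff)."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


# Hypothetical expression values for four patients.
patients = [
    {"ZFYVE27": 5.0, "GNG11": 2.0, "DOK7": 3.0},
    {"ZFYVE27": 2.0, "GNG11": 6.0, "DOK7": 1.0},
    {"ZFYVE27": 4.0, "GNG11": 4.0, "DOK7": 2.0},
    {"ZFYVE27": 1.0, "GNG11": 7.0, "DOK7": 0.5},
]
scores = [risk_score(p) for p in patients]
groups = stratify(scores)
for score, group in zip(scores, groups):
    print(f"RiskScore = {score:+.2f} -> {group} risk")
```

In this toy cohort, patients with high GNG11 and low ZFYVE27 and DOK7 expression fall into the high-risk group, matching the direction of effect implied by the coefficient signs.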

Consistent with these gene-level functions, high-risk tumors defined by the RiskScore showed enrichment of proliferative, stress-response, and inflammatory pathways, including MYC targets, G2M checkpoint, DNA repair, hypoxia, and IL6/JAK/STAT3 and TNFA signaling. Many of these pathways involve phase-separated regulatory complexes, such as MYC-driven transcriptional condensates (34), DNA damage repair foci (35), and cytokine signaling assemblies mediated by STAT proteins and NF-κB (36). Furthermore, RiskScore distribution across TCGA molecular subtypes revealed that GS tumors exhibited higher risk levels, whereas EBV and MSI subtypes were associated with lower RiskScores, suggesting that LLPS-related dysregulation may preferentially characterize more aggressive, non-immunogenic tumors. The coordinated activation of these pathways may contribute to the observed immune microenvironmental differences and the predicted sensitivity of high-risk tumors to agents such as JAK inhibitors. Together, these findings suggest that the RiskScore reflects integrated tumor states in which pathway activity, immune behavior, and therapeutic susceptibility co-vary, which may underlie its added clinical value observed in DCA.

This study has several limitations. First, although the LLPS-related three-gene signature showed prognostic value, the decline in AUC values across training and validation cohorts indicates only moderate predictive performance, and its generalizability remains to be confirmed in larger prospective cohorts. Second, as a retrospective bioinformatics analysis based on public datasets, mechanistic inferences remain indirect. Although IDR prediction supports the structural feasibility of phase separation, it does not constitute direct evidence of LLPS. Accordingly, the proposed association of ZFYVE27, GNG11, and DOK7 with LLPS regulation requires experimental validation. Potential approaches include visualizing condensate dynamics of proteins such as MYC or DNA repair factors in GC cell lines with high versus low risk scores using immunofluorescence or super-resolution microscopy, and applying CRISPR-based modulation of these genes to examine their effects on LLPS formation and pathway activity. Third, drug sensitivity predictions are computational and require experimental validation before clinical interpretation. Fourth, immunotherapy response was inferred using TIDE-related metrics without validation in treated cohorts and should be interpreted as predictive. Lastly, the imbalance between tumor and normal samples in the TCGA (375 vs. 32) may introduce bias in differential expression analysis.


Conclusions

This study establishes a three-gene LLPS-associated prognostic model (ZFYVE27, GNG11, and DOK7) for GC stratification. Although discriminative performance is moderate, DCA demonstrates a consistent net clinical benefit when the RiskScore is integrated with clinicopathological factors, supporting its relevance for clinical decision support. The findings point to a potential role for dysregulated biomolecular compartmentalization in malignant progression and to possible links between LLPS-related processes and oncogenic pathways, immune features, and drug sensitivity, providing a rationale for future mechanistic and translational studies.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1-2846/rc

Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1-2846/prf

Funding: None.

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1-2846/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2024;74:229-63.
  2. Mamun TI, Younus S, Rahman MH. Gastric cancer-Epidemiology, modifiable and non-modifiable risk factors, challenges and opportunities: An updated review. Cancer Treat Res Commun 2024;41:100845.
  3. Shah SAR, Mumtaz M, Sharif S, et al. Helicobacter pylori and gastric cancer: current insights and nanoparticle-based interventions. RSC Adv 2025;15:5558-70.
  4. Ogata T, Narita Y, Oze I, et al. Chronological improvement of survival in patients with advanced gastric cancer over 15 years. Ther Adv Med Oncol 2024;16:17588359241229428.
  5. Ma Y, Jiang Z, Pan L, et al. Current development of molecular classifications of gastric cancer based on omics (Review). Int J Oncol 2024;65:89.
  6. Xie CC, Wang T, Liu XR, et al. Liquid-Liquid Phase Separation in Major Hallmarks of Cancer. Cell Prolif 2026;59:e70122.
  7. Thakur DK, Padole S, Sarkar T, et al. Liquid-Liquid Phase Separation: Mechanisms, Roles, and Implications in Cellular Function and Disease. FASEB Bioadv 2025;7:e70054.
  8. Jian Q, Xu Q, Xiang S, et al. Liquid-liquid phase separation: an emerging perspective on the tumorigenesis, progression, and treatment of tumors. Front Immunol 2025;16:1604015.
  9. Wu Y, Chen Y, Yan X, et al. Lopinavir enhances anoikis by remodeling autophagy in a circRNA-dependent manner. Autophagy 2024;20:1651-72.
  10. Pei Y, Liang H, Guo Y, et al. Liquid-liquid phase separation drives immune signaling transduction in cancer: a bibliometric and visualized study from 1992 to 2024. Front Oncol 2025;15:1509457.
  11. Stein CK, Qu P, Epstein J, et al. Removing batch effects from purified plasma cell gene expression microarrays with modified ComBat. BMC Bioinformatics 2015;16:63.
  12. Oh SC, Sohn BH, Cheong JH, et al. Clinical and genomic landscape of gastric cancer with a mesenchymal phenotype. Nat Commun 2018;9:1777.
  13. Yoon SJ, Park J, Shin Y, et al. Deconvolution of diffuse gastric cancer and the suppression of CD34 on the BALB/c nude mice model. BMC Cancer 2020;20:314.
  14. You K, Huang Q, Yu C, et al. PhaSepDB: a database of liquid-liquid phase separation related proteins. Nucleic Acids Res 2020;48:D354-9.
  15. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47.
  16. Alivand MR, Najafi S, Esmaeili S, et al. Integrative analysis of DNA methylation and gene expression profiles to identify biomarkers of glioblastoma. Cancer Genet 2021;258-259:135-50.
  17. Yu G, Wang LG, Han Y, et al. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 2012;16:284-7.
  18. Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 2010;26:1572-3.
  19. Engebretsen S, Bohlin J. Statistical predictions with glmnet. Clin Epigenetics 2019;11:123.
  20. Montojo J, Zuberi K, Rodriguez H, et al. GeneMANIA: Fast gene network construction and function prediction for Cytoscape. F1000Res 2014;3:153.
  21. Chen B, Khodadoust MS, Liu CL, et al. Profiling Tumor Infiltrating Immune Cells with CIBERSORT. Methods Mol Biol 2018;1711:243-59.
  22. Aran D, Hu Z, Butte AJ. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol 2017;18:220.
  23. Racle J, Gfeller D. EPIC: A Tool to Estimate the Proportions of Different Cell Types from Bulk Gene Expression Data. Methods Mol Biol 2020;2120:233-48.
  24. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature 2014;513:202-9.
  25. Xiao L, Xia K. Functions of Intrinsically Disordered Regions. Biology (Basel) 2025;14:810.
  26. Moradi A, Aleyasin SA, Mohammadian K, et al. Epigenteic Alteration of DOK7 Gene CpG Island in Blood Leukocyte of Patients with Gastric Cancer and Intestinal Methaplasia. Iran J Biotechnol 2022;20:e3050.
  27. Keritam O, Vincent A, Zimprich F, et al. A clinical perspective on muscle specific kinase antibody positive myasthenia gravis. Front Immunol 2024;15:1502480.
  28. Yue C, Bai Y, Piao Y, et al. DOK7 Inhibits Cell Proliferation, Migration, and Invasion of Breast Cancer via the PI3K/PTEN/AKT Pathway. J Oncol 2021;2021:4035257.
  29. Dai L, Wang X, Bai T, et al. Cellular Senescence-Related Genes: Predicting Prognosis in Gastric Cancer. Front Genet 2022;13:909546.
  30. Yadav R, Zaccolo M. GPCR signaling via cAMP nanodomains. Biochem J 2025;482:519-33.
  31. Zhou Z, Guo S, Lai S, et al. Integrated single-cell and bulk RNA sequencing analysis identifies a cancer-associated fibroblast-related gene signature for predicting survival and therapy in gastric cancer. BMC Cancer 2023;23:108.
  32. Kleniuk J, Nadadhur AG, Rodger C, et al. Protrudin acts at ER-endosome contacts to promote KIF5-mediated endosomal tubule fission. Neurobiol Dis 2026;218:107231.
  33. Ma S, Yang Z, Du C, et al. Lipid-Mediated Assembly of Biomolecular Condensates: Mechanisms, Regulation, and Therapeutic Implications. Biology (Basel) 2025;14:1232.
  34. Hu L, Huang Z, Liu Z, et al. Biomolecular phase separation in tumorigenesis: from aberrant condensates to therapeutic vulnerabilities. Mol Cancer 2025;24:220.
  35. Deng J, Du Z, Li L, et al. Phase separation in DNA repair: orchestrating the cellular response to genomic stability. PeerJ 2025;13:e19402.
  36. Huang Z, Liu Z, Chen L, et al. Liquid-liquid phase separation in cell physiology and cancer biology: recent advances and therapeutic implications. Front Oncol 2025;15:1540427.
Cite this article as: Yan B, Lao Q, Lin L. Liquid-liquid phase separation-related gene signature characterizes prognostic subtypes and therapeutic sensitivities in gastric cancer. Transl Cancer Res 2026;15(5):413. doi: 10.21037/tcr-2025-1-2846
