Tree-based model for diffuse-type gastric cancer prognostication: a population study based on the Surveillance, Epidemiology, and End Results database
Original Article

Lei Cai1, Yeqi Sun2, Beibei Lv1, Wenjing Su1, Jia Li1

1Department of Pathology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China; 2Department of Pathology, Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China

Contributions: (I) Conception and design: L Cai; (II) Administrative support: L Cai, Y Sun; (III) Provision of study materials or patients: Y Sun, B Lv, W Su; (IV) Collection and assembly of data: B Lv, J Li; (V) Data analysis and interpretation: L Cai, W Su; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Lei Cai, PhD. Department of Pathology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, No. 324, Jingwu Road, Jinan 250021, China. Email: cailei123666@163.com.

Background: Diffuse-type gastric cancer (DGC), a subtype of the Lauren classification, is characterized by poor differentiation and a poor prognosis, and effective therapeutic options remain limited. This study aimed to explore the prognostic risk stratification of DGC and to evaluate the effects of various clinical factors on prognosis.

Methods: The data of 1,310 DGC patients were collected from the Surveillance, Epidemiology, and End Results (SEER) database from 2004 to 2015. Cox proportional hazards regression, survival trees, and random survival forests were used for multivariable analyses. The relative importance of different clinical variables was determined by the random survival forest analysis for overall survival (OS) and cancer-specific survival (CSS) in DGC.

Results: The 1,310 DGC patients had a median survival time of 23 months. Most of the patients were White and presented with poorly differentiated tumors. Survival tree analyses identified three distinct subgroups based on OS and four based on CSS. For OS, the 5-year rates of the three groups were 79.4%, 37.2%, and 11.0%, while the 10-year rates were 71.4%, 24.3%, and 7.0%, respectively. For CSS, the 5-year rates of the four groups were 78.7%, 39.3%, 21.4%, and 7.8%, and the 10-year rates were 70.7%, 30.1%, 16.3%, and 6.0%, respectively. The random survival forest results highlighted the importance of stage, surgery, age, tumor size, and receipt of chemotherapy in determining both OS and CSS.

Conclusions: This study identified different prognostic groups and established a risk stratification framework for DGC. Tumor stage and surgery significantly influence risk stratification. The study findings may inform long-term surveillance strategies and guide treatment decision-making for patients with DGC.

Keywords: Diffuse-type gastric cancer (DGC); prognostic risk stratification; overall survival (OS); cancer-specific survival (CSS); tree-based model


Submitted Nov 08, 2025. Accepted for publication Mar 23, 2026. Published online Apr 28, 2026.

doi: 10.21037/tcr-2025-aw-2465


Highlight box

Key findings

• The survival tree-based model provides a novel prognostic stratification framework for patients with diffuse-type gastric cancer (DGC) based on the selected clinical variables.

What is known and what is new?

• Few studies have sought to predict the survival of patients with DGC. The tumor-node-metastasis gastric cancer stage classification alone cannot fully or accurately predict prognosis or guide treatment decision-making for these patients.

• By integrating Cox proportional hazards regression, survival trees, and random survival forests, this study established a reliable prognostic risk stratification framework for DGC.

What is the implication, and what should change now?

• The prognostic risk stratification framework may guide treatment decision-making and the management of patients with DGC.


Introduction

Gastric cancer (GC) is the fifth most common cancer and the fourth leading cause of cancer-related death worldwide (1). Proposed in 1965, the Lauren classification categorizes GC into intestinal-type GC (IGC) and diffuse-type GC (DGC). It has become the most widely used system, as it reflects the distinct histological and clinical features of these subtypes (2). DGC exhibits tumor heterogeneity due to its diverse etiologies and morphological phenotypes, which roughly correspond to poorly cohesive and signet-ring cell carcinomas under the World Health Organization classification (3-5). It is characterized histologically by poor differentiation, non-cohesive scattering, and stromal cell enrichment, which together enable cancer cell dissemination and contribute to high malignancy, poor prognosis, and chemoresistance (6,7). Despite a decline in the overall incidence of GC linked to increased H. pylori eradication, the incidence of DGC and signet-ring cell carcinoma continues to increase (8).

Numerous retrospective studies have summarized the differential clinical characteristics between IGC and DGC. DGC is more frequently observed in women and younger patients, often presenting with proximal location and advanced disease, and is associated with a poorer prognosis (9-11). Despite its clinical significance, few studies have sought to predict the survival of patients with DGC. Patients with DGC exhibit poorer responses to surgery, chemotherapy, targeted therapy, and immunotherapy compared to those with IGC, which can be partially attributed to the unique molecular profile of DGC. The current tumor-node-metastasis (TNM) staging system may be insufficient to accurately predict prognosis and guide the selection of optimal treatment strategies for DGC. Thus, a risk stratification model needs to be developed to individualize long-term surveillance and improve treatment decision-making.

To explore prognostic risk stratification based on clinical characteristics, we conducted a comprehensive analysis of 1,310 patients with DGC using Cox proportional hazards regression, survival trees, and random survival forests. The survival tree model and random survival forest approach—two types of machine learning algorithms—offer advantages in naturally selecting key variables for survival prediction, evaluating their interactions, and defining prognostic groups. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2465/rc).


Methods

Data source and study population

Patient data and clinical characteristics were obtained from the Surveillance, Epidemiology, and End Results (SEER) database and downloaded using SEER*Stat software (version 8.4.4, Research Data, 17 Registries, Nov 2023 Sub [2000–2021] database). Cases with the International Classification of Diseases for Oncology (ICD-O) pathology code 8145/3 (adenocarcinoma, diffuse-type; C16) were extracted. To ensure that essential variables such as tumor size were available, only patients diagnosed between 2004 and 2015 were selected. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Clinicopathological variables

In this study, the clinical variables included age, sex, race, tumor site, surgery, radiation, chemotherapy, marital status, county-level median household income, first malignant primary indicator, SEER historic stage A, tumor size, total number of in situ lesions, and patients’ county type. To ensure robust results, patients with missing data for any of these variables were excluded from the study. Age was divided into two groups: ≤65 and >65 years. Race was classified as White, Asian or Pacific Islander, American Indian/Alaska Native, and Black. The primary tumor site was categorized as follows: cardia (C16.0—cardia), fundus (C16.1—fundus of stomach), body (C16.2—body of stomach, C16.5—lesser curvature of stomach, and C16.6—greater curvature of stomach), antrum/pylorus (C16.3—gastric antrum and C16.4—pylorus), overlapping lesion (C16.8—overlapping lesion of stomach), and stomach, not otherwise specified (NOS) (C16.9—stomach, NOS). The treatment variables included surgery (performed vs. no/unknown), radiation (received vs. no/unknown), and chemotherapy (received vs. no/unknown). Marital status was categorized as married (including common-law) and single (including divorced, separated, single, unmarried, domestic partner, and widowed). County-level median household income was classified into four groups: ≤$64,999, $65,000–$79,999, $80,000–$94,999, and ≥$95,000. Tumor stage was categorized as localized, regional, or distant metastasis. Tumor size was coded according to the SEER guidelines. Codes 001–988 represent the exact tumor size in millimeters, which were converted to centimeters for analysis. Code 990 indicates microscopic focus only, while code 998 denotes the diffuse or linitis plastica type of GC without a measurable tumor size. Codes 999 or blank entries were treated as missing data. After excluding cases with missing data, tumor size was categorized as ≤2, 3–5, and >5 cm. 
Tumor grade was grouped into two categories: well- to moderately differentiated, and poorly to undifferentiated.
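The recoding rules for primary site and tumor size described above can be sketched as follows. This is an illustration only, not the authors' code: the function and label names are hypothetical, the text does not state how codes 990 and 998 were placed within the three size categories (they are returned here as separate labels), and the boundary between the "≤2" and "3–5" cm groups is assumed to fall at 2 cm.

```python
# Hypothetical helpers; the mappings follow the categories described in the text.
SITE_BY_CODE = {
    "C16.0": "cardia",
    "C16.1": "fundus",
    "C16.2": "body",
    "C16.5": "body",
    "C16.6": "body",
    "C16.3": "antrum/pylorus",
    "C16.4": "antrum/pylorus",
    "C16.8": "overlapping lesion",
    "C16.9": "stomach, NOS",
}


def recode_site(icd_o_code):
    """Map an ICD-O topography code to a site category (None if not listed)."""
    return SITE_BY_CODE.get(icd_o_code.strip().upper())


def recode_tumor_size(code):
    """Translate a SEER tumor-size code into a size category.

    Returns None for missing data (blank or 999) and for codes that the
    text does not describe.
    """
    if code is None or str(code).strip() == "":
        return None
    value = int(str(code).strip())
    if value == 999:
        return None                       # treated as missing
    if value == 990:
        return "microscopic focus"        # assumption: kept as its own label
    if value == 998:
        return "diffuse, not measurable"  # assumption: kept as its own label
    if 1 <= value <= 988:
        size_cm = value / 10.0            # millimeters to centimeters
        if size_cm <= 2:
            return "<=2 cm"
        if size_cm <= 5:                  # assumption: >2 cm up to 5 cm
            return "3-5 cm"
        return ">5 cm"
    return None                           # codes not described in the text


print(recode_site("C16.5"))       # body
print(recode_tumor_size("025"))   # 3-5 cm
```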

Overall survival (OS) and cancer-specific survival (CSS) were considered the primary endpoint and secondary endpoint of the study, respectively. OS was defined as the time from diagnosis to death from any cause. CSS was defined as the time from diagnosis to death attributed to DGC.

Statistical analysis

Cox proportional hazards regression was performed on all clinicopathological variables to assess OS and CSS in DGC. Hazard ratios (HRs) and corresponding 95% confidence intervals (CIs) were calculated. A P value <0.05 was considered statistically significant.
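For readers less familiar with how an HR and its CI follow from a fitted Cox coefficient, a minimal sketch is given below. It assumes Wald-type intervals, HR = exp(β) with limits exp(β ± 1.96·SE); the text does not state which interval type was used. The coefficient and standard error in the example are illustrative values back-calculated to resemble one row of Table 1, not figures reported by the authors.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hazard_ratio_ci(beta, se, z=Z_95):
    """Return (HR, lower, upper) from a log-hazard coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def wald_p_value(beta, se):
    """Two-sided P value of the Wald test for beta = 0."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


# Illustrative (hypothetical) inputs, chosen to resemble the age >65 row for OS.
hr, lower, upper = hazard_ratio_ci(beta=0.368, se=0.069)
print(round(hr, 3), round(lower, 3), round(upper, 3))
print(wald_p_value(0.368, 0.069) < 0.05)  # True
```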

A tree-based method was then used to generate an ensemble of trees, and construct diagnostic stratification and prognostication models (12). This approach provides a flexible and stable predictive tool through binary recursive partitioning. Predictive variables and outcomes were illustrated using a tree structure consisting of one root node and its splitting terminal nodes. During this process, all clinicopathological variables were considered candidate variables for splitting the samples. Through recursive partitioning, all samples were assigned to terminal nodes, resulting in subgroups with high internal homogeneity. Following the classification and regression tree paradigm, the tree was grown using a splitting criterion based on the exponential log-likelihood loss (method = “exp”) to maximize the survival difference between the nodes. The final tree was subsequently pruned to optimize the model and avoid overfitting by using the complexity parameter value that minimized the cross-validated error.
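The splitting step can be illustrated with a simplified sketch. It assumes a constant hazard within each node, so that a node's maximized exponential log-likelihood is d·log(d/T) − d for d events over total follow-up T, and it scores each candidate binary split by the gain in log-likelihood over the unsplit node. The rpart implementation additionally rescales follow-up time before fitting, which is omitted here; the data and function names are hypothetical.

```python
import math


def exp_loglik(times, events):
    """Maximized exponential log-likelihood of one node (0.0 if no events)."""
    d = sum(events)
    total = sum(times)
    if d == 0 or total <= 0:
        return 0.0
    return d * math.log(d / total) - d


def split_gain(indicator, times, events):
    """Log-likelihood gain from splitting a node on a 0/1 indicator."""
    left = [i for i, flag in enumerate(indicator) if flag]
    right = [i for i, flag in enumerate(indicator) if not flag]
    if not left or not right:
        return 0.0

    def node_ll(idx):
        return exp_loglik([times[i] for i in idx], [events[i] for i in idx])

    return node_ll(left) + node_ll(right) - exp_loglik(times, events)


def best_split(features, times, events):
    """Name of the binary feature with the largest gain."""
    return max(features, key=lambda name: split_gain(features[name], times, events))


# Toy data (hypothetical): follow-up in months, 1 = death observed.
times = [60, 55, 70, 8, 12, 5, 65, 10]
events = [0, 0, 1, 1, 1, 1, 0, 1]
features = {
    "localized": [1, 1, 1, 0, 0, 0, 1, 0],
    "male": [1, 0, 1, 0, 1, 0, 1, 0],
}
print(best_split(features, times, events))  # localized
```

Applying the same search recursively within each child node, and then pruning, gives the tree-growing procedure described above.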

In addition, we performed a random survival forest analysis on these DGC cases. This approach generated numerous decision trees through bootstrap resampling. The final predictions were determined by the average or mode of the individual predictions across all decision trees. For the survival data, we selected the number of trees corresponding to the minimum error rate and calculated variable importance (VIMP). We constructed 140 trees for OS and 150 trees for CSS. In both models, the node size was set to 15, and the mtry parameter was set to 5. In addition, VIMP was evaluated using two complementary metrics: minimal depth (MD), with lower values indicating greater influence, and permutation VIMP, with 95% CIs obtained via subsampling (100 replicates). The MD rankings for OS and CSS are shown in Table S1; the VIMP results with the CIs for OS and CSS are provided in Table S2 and Table S3, respectively.
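To make the minimal depth metric concrete, the sketch below computes, for each variable, the depth of its first split nearest the root (root = 0) and averages it over the trees of a forest. The tree representation (nested dictionaries with "var", "left", and "right" keys, None for a terminal node) and the two toy trees are hypothetical, and assigning the tree depth to variables that are never used is an assumption of this sketch rather than a documented detail of the authors' analysis.

```python
def first_split_depths(node, depth=0, found=None):
    """Map each variable to the shallowest depth at which it splits."""
    if found is None:
        found = {}
    if node is None:  # terminal node
        return found
    var = node["var"]
    if var not in found or depth < found[var]:
        found[var] = depth
    first_split_depths(node.get("left"), depth + 1, found)
    first_split_depths(node.get("right"), depth + 1, found)
    return found


def tree_depth(node):
    """Number of split levels on the longest root-to-terminal path."""
    if node is None:
        return 0
    return 1 + max(tree_depth(node.get("left")), tree_depth(node.get("right")))


def mean_minimal_depth(forest, variables):
    """Average minimal depth per variable; lower means more influential."""
    totals = {v: 0.0 for v in variables}
    for tree in forest:
        found = first_split_depths(tree)
        fallback = tree_depth(tree)  # assumption: unused variables get tree depth
        for v in variables:
            totals[v] += found.get(v, fallback)
    return {v: totals[v] / len(forest) for v in variables}


# Two hypothetical trees.
tree1 = {"var": "stage",
         "left": {"var": "age", "left": None, "right": None},
         "right": {"var": "chemo", "left": None, "right": None}}
tree2 = {"var": "stage",
         "left": None,
         "right": {"var": "surgery", "left": None, "right": None}}
md = mean_minimal_depth([tree1, tree2], ["stage", "age", "chemo", "surgery", "sex"])
print(md)  # {'stage': 0.0, 'age': 1.5, 'chemo': 1.5, 'surgery': 1.5, 'sex': 2.0}
```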

Finally, the prognostic performance of this stratification model was evaluated in the DGC cohort using the concordance index (C-index), calibration curves, and time-dependent receiver operating characteristic (ROC) curves. The C-index quantifies discrimination as the proportion of comparable patient pairs in which the patient predicted to be at higher risk had the shorter survival time. The calibration curves were used to assess the agreement between the predicted and observed survival probabilities. The ROC curve analysis was used to evaluate the discriminative ability of the model at 1, 3, and 5 years. All three validation approaches were performed separately for OS and CSS.
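As an illustration of the C-index, a minimal sketch of Harrell's estimator for right-censored data is shown below. It is not the "rms" implementation: pairs with tied survival times are simply skipped, tied risk scores count as one half, and the time-specific (1-, 3-, and 5-year) variants are not reproduced. The example data are hypothetical.

```python
def harrell_c_index(times, events, risk):
    """Harrell's C for right-censored data; a higher risk score means worse prognosis.

    A pair (i, j) is comparable when patient i had an observed event
    before the follow-up time of patient j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: months of follow-up, 1 = event, and a group-level risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
print(harrell_c_index(times, events, [4, 3, 2, 1]))  # 1.0
print(harrell_c_index(times, events, [3, 3, 1, 1]))  # 0.9
```

Because a stratification assigns the same risk score to every patient in a group, many pairs are tied on risk, which is one reason group-based models tend to yield moderate C-index values.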

All analyses were performed using R software (version 4.4.1). The “survival”, “tableone”, “dplyr”, “rpart”, “rpart.plot”, “randomForest”, and “randomForestSRC” R packages were used for the Cox proportional hazards regression, survival tree, and random survival forest analyses (13-17). The C-index, calibration curves, and ROC curve analyses were conducted using the “rms” and “timeROC” R packages.


Results

Patient characteristics

The final cohort comprised 1,310 DGC patients, with a median survival time of 23 months. The clinical characteristics of the patients and the results of Cox proportional hazards regression are presented in Table 1. Most patients were White (58.2%), had poorly differentiated to undifferentiated tumors (97.6%), presented with regional-stage disease (54.2%), and had primary tumors located in the body (30.6%) or antrum/pylorus (35.7%) of the stomach. Most patients resided in metropolitan areas (92.7%) and were identified as having a first malignant primary tumor (85.5%); the most common county-level median household income category was $65,000–$79,999 (43.1%). Of the patients, 48.5% were female and 51.5% were male, reflecting a nearly equal sex distribution. Most of the patients underwent surgery (90.5%), radiotherapy (65.6%), and chemotherapy (56.0%) (Table 1). Among all clinicopathological variables, not having surgery or unknown surgery status was associated with worse OS (HR 2.770, 95% CI: 2.215–3.463) and CSS (HR 2.859, 95% CI: 2.264–3.611). Patients with regional- or distant-stage disease experienced significantly worse OS and CSS than those with localized-stage disease. An increased risk was also associated with a primary tumor size >5 cm and an age >65 years. Conversely, Asian or Pacific Islander race, tumors located in the body of the stomach, and receipt of chemotherapy were associated with improved OS and CSS (Table 1).

Table 1

Cox proportional hazards regression for the association of clinical characteristics with OS and CSS based on SEER data

Patient characteristics Value, n (%) (n=1,310) OS, HR (95% CI) CSS, HR (95% CI)
Age (years)
   ≤65 703 (53.7) Reference Reference
   >65 607 (46.3) 1.445 (1.262–1.654) 1.208 (1.041–1.401)
Sex
   Female 635 (48.5) Reference Reference
   Male 675 (51.5) 0.991 (0.869–1.131) 0.997 (0.863–1.151)
Race
   White 762 (58.2) Reference Reference
   Asian or Pacific Islander 388 (29.6) 0.696 (0.594–0.816) 0.675 (0.566–0.805)
   American Indian/Alaska Native 13 (1.0) 1.261 (0.669–2.379) 1.293 (0.661–2.527)
   Black 147 (11.2) 1.075 (0.876–1.320) 1.017 (0.811–1.276)
Primary site
   Cardia 108 (8.2) Reference Reference
   Fundus 62 (4.7) 0.867 (0.606–1.241) 0.864 (0.586–1.276)
   Body 401 (30.6) 0.775 (0.609–0.988) 0.736 (0.567–0.956)
   Antrum/pylorus 468 (35.7) 0.838 (0.658–1.068) 0.810 (0.624–1.052)
   Overlapping lesion 163 (12.6) 0.902 (0.684–1.190) 0.882 (0.656–1.186)
   Stomach, NOS 108 (8.2) 0.978 (0.726–1.322) 0.948 (0.686–1.308)
Surgery
   No/unknown 124 (9.5) 2.770 (2.215–3.463) 2.859 (2.264–3.611)
   Performed 1,186 (90.5) Reference Reference
Radiation
   No/unknown 450 (34.4) Reference Reference
   Yes 860 (65.6) 1.110 (0.951–1.295) 1.156 (0.978–1.367)
Chemotherapy
   No/unknown 577 (44.0) Reference Reference
   Yes 733 (56.0) 0.620 (0.531–0.723) 0.609 (0.515–0.720)
Marital status
   Married 807 (61.6) Reference Reference
   Single 503 (38.4) 0.950 (0.828–1.091) 0.879 (0.755–1.023)
Income
   ≤$64,999 213 (16.3) Reference Reference
   $65,000–$79,999 565 (43.1) 0.932 (0.764–1.137) 0.965 (0.774–1.023)
   $80,000–$94,999 306 (23.4) 1.114 (0.885–1.401) 1.253 (0.976–1.608)
   ≥$95,000 226 (17.2) 1.103 (0.871–1.397) 1.094 (0.841–1.423)
First malignant primary indicator
   Yes 190 (14.5) Reference Reference
   No 1,120 (85.5) 0.662 (0.497–0.881) 0.653 (0.464–0.918)
Grade
   Well- to moderately differentiated 31 (2.4) Reference Reference
   Poorly to undifferentiated 1,279 (97.6) 1.515 (0.955–2.404) 1.759 (0.987–3.133)
Stage
   Localized 324 (24.7) Reference Reference
   Regional 710 (54.2) 3.005 (2.454–3.681) 3.944 (3.073–5.061)
   Distant 276 (21.1) 5.211 (4.104–6.615) 7.194 (5.428–9.536)
Tumor number
   1 1,028 (78.5) Reference Reference
   >1 282 (21.5) 0.696 (0.542–0.893) 0.589 (0.438–0.792)
Size
   ≤2 cm 203 (15.5) Reference Reference
   3–5 cm 555 (42.4) 1.139 (0.911–1.423) 1.197 (0.919–1.559)
   >5 cm 552 (42.1) 1.481 (1.177–1.864) 1.608 (1.230–2.103)
County type
   Metropolitan 1,214 (92.7) Reference Reference
   Non-metropolitan 96 (7.3) 1.270 (0.984–1.640) 1.213 (0.913–1.611)

Italicized values indicate that the variable differed significantly. CI, confidence interval; CSS, cancer-specific survival; HR, hazard ratio; NOS, not otherwise specified; OS, overall survival; SEER, Surveillance, Epidemiology, and End Results.

OS survival tree

The OS survival tree was constructed using the 1,310 DGC patients and split recursively into several levels. Each terminal node displayed the number of patients, as well as the 5- and 10-year OS rates. The first split separated localized-stage disease from the other stages, and the tree was then further partitioned by age, chemotherapy, and surgery, resulting in five terminal nodes. Based on 5- and 10-year survival rates, the terminal nodes were then combined into three prognostic groups. Group I comprised patients with localized-stage disease aged <71 years. Group II comprised patients with localized-stage disease aged ≥71 years and patients with regional-stage disease who received chemotherapy. Group III comprised patients with regional-stage disease and no/unknown chemotherapy status and patients with distant-stage disease (Figure 1A). Comparable survival rates were observed between patients with regional-stage disease who received chemotherapy and older patients with localized-stage disease. Among the patients with regional-stage disease, those who received chemotherapy had significantly longer survival than those who did not. The patients with distant-stage disease were further split by surgery status; however, both groups exhibited similarly poor survival outcomes.

Figure 1 Survival tree for OS and Kaplan-Meier analyses for different prognostic groups in DGC based on the SEER cohort. (A) Survival tree for OS in DGC. (B) Kaplan-Meier curves for three prognostic groups based on OS. 5yr, 5-year OS rate; 10yr, 10-year OS rate; DGC, diffuse-type gastric cancer; N, total number of patients in the terminal group; OS, overall survival; SEER, Surveillance, Epidemiology, and End Results.

The 5-year OS rates of Groups I, II, and III were 79.4% (95% CI: 73.7–85.7%), 37.2% (95% CI: 33.5–41.3%), and 11.0% (95% CI: 8.5–14.0%), respectively, while the corresponding 10-year OS rates were 71.4% (95% CI: 64.7–78.6%), 24.3% (95% CI: 20.9–28.4%), and 7.0% (95% CI: 5.1–9.7%) (Table 2). The patients in Group I had the best survival, while those in Group III had the worst prognosis. Consistently, the Kaplan-Meier analysis showed significantly different survival across the three prognostic groups (P<0.0001) (Figure 1B).

Table 2

Prognostic groups for OS

Group Description 5-year OS (95% CI) 10-year OS (95% CI)
1 Localized stage, age <71 years 0.794 (0.737–0.857) 0.714 (0.647–0.786)
2 Localized stage, age ≥71 years or regional stage, chemotherapy 0.372 (0.335–0.413) 0.243 (0.209–0.284)
3 Regional stage, no/unknown chemotherapy, or distant stage 0.110 (0.085–0.140) 0.070 (0.051–0.097)

CI, confidence interval; OS, overall survival.
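Group-level survival rates of the kind reported in Table 2 can be estimated with the Kaplan-Meier method. The sketch below is illustrative and rests on assumptions: the text does not state how the rates and intervals were computed, so a Kaplan-Meier estimate with a log-transformed Greenwood interval is assumed here, and the follow-up data are hypothetical.

```python
import math


def kaplan_meier(times, events, horizon, z=1.959964):
    """Return (S, lower, upper): Kaplan-Meier survival at `horizon`.

    The interval is S * exp(-/+ z * sqrt(sum d / (n * (n - d)))), i.e. a
    Greenwood variance on the log scale, truncated at 1.
    """
    n_at_risk = len(times)
    surv = 1.0
    greenwood = 0.0
    for t in sorted(set(times)):
        if t > horizon:
            break
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            if n_at_risk > deaths:
                greenwood += deaths / (n_at_risk * (n_at_risk - deaths))
        n_at_risk -= leaving  # deaths and censored patients leave the risk set
    if surv == 0.0:
        return 0.0, float("nan"), float("nan")  # interval undefined at S = 0
    half_width = z * math.sqrt(greenwood)
    return surv, surv * math.exp(-half_width), min(1.0, surv * math.exp(half_width))


# Hypothetical follow-up (months) for ten patients; 1 = death, 0 = censored.
times = [6, 12, 20, 30, 45, 60, 70, 80, 90, 100]
events = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
s60, lo60, hi60 = kaplan_meier(times, events, horizon=60)
print(round(s60, 3))  # 0.686
```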

CSS survival tree

Similarly, the CSS survival tree categorized the cases into six terminal nodes. The primary split in the tree was based on tumor stage (localized- vs. regional- or distant-stage), followed by subsequent splits based on treatment, including surgery or chemotherapy. Finally, the six terminal nodes were combined into four prognostic groups based on similar survival outcomes (Figure 2A). Group I comprised patients with localized disease who underwent surgery. Group II comprised patients with regional disease who received chemotherapy. Group III comprised patients with regional disease who did not receive chemotherapy or had unknown chemotherapy status. Group IV comprised patients with localized disease who did not undergo surgery and patients with distant disease.

Figure 2 Survival tree for CSS and Kaplan-Meier analyses for different prognostic groups in DGC based on the SEER cohort. (A) Survival tree for CSS in DGC. (B) Kaplan-Meier curves for four prognostic groups based on CSS. 5yr, 5-year CSS rate; 10yr, 10-year CSS rate; CSS, cancer-specific survival; DGC, diffuse-type gastric cancer; N, total number of patients in the terminal group; OS, overall survival; SEER, Surveillance, Epidemiology, and End Results.

For CSS, the 5-year rates of the four prognostic groups were 78.7% (95% CI: 74.1–83.5%), 39.3% (95% CI: 35.0–44.2%), 21.4% (95% CI: 16.5–27.8%), and 7.8% (95% CI: 5.1–11.9%), while the corresponding 10-year rates were 70.7% (95% CI: 65.3–76.6%), 30.1% (95% CI: 25.7–35.2%), 16.3% (95% CI: 11.8–22.5%), and 6.0% (95% CI: 3.6–9.8%) (Table 3). Survival differed significantly among the four groups, as illustrated by the Kaplan-Meier analysis (Figure 2B).

Table 3

Prognostic groups for CSS

Group Description 5-year CSS (95% CI) 10-year CSS (95% CI)
1 Localized stage, surgery performed 0.787 (0.741–0.835) 0.707 (0.653–0.766)
2 Regional stage, chemotherapy 0.393 (0.350–0.442) 0.301 (0.257–0.352)
3 Regional stage, no/unknown chemotherapy 0.214 (0.165–0.278) 0.163 (0.118–0.225)
4 Localized stage, surgery not performed, or distant stage 0.078 (0.051–0.119) 0.060 (0.036–0.098)

CI, confidence interval; CSS, cancer-specific survival.
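The group definitions in Tables 2 and 3 amount to simple decision rules, which can be written out as in the sketch below. The function names and argument encodings are hypothetical (treatment flags are True when the treatment was performed or received and False for no/unknown), and the age cutoff follows the <71 vs. ≥71 years split reported for the OS tree.

```python
def os_group(stage, age, chemotherapy):
    """Assign the OS prognostic group (I-III) following Table 2."""
    if stage == "localized":
        return "I" if age < 71 else "II"
    if stage == "regional":
        return "II" if chemotherapy else "III"
    if stage == "distant":
        return "III"
    raise ValueError("unknown stage: %r" % (stage,))


def css_group(stage, surgery, chemotherapy):
    """Assign the CSS prognostic group (I-IV) following Table 3."""
    if stage == "localized":
        return "I" if surgery else "IV"
    if stage == "regional":
        return "II" if chemotherapy else "III"
    if stage == "distant":
        return "IV"
    raise ValueError("unknown stage: %r" % (stage,))


# A 74-year-old patient with localized disease who underwent surgery:
print(os_group("localized", 74, chemotherapy=False))                 # II
print(css_group("localized", surgery=True, chemotherapy=False))      # I
```

The example shows how the same patient can fall into different groups for the two endpoints, because age enters the OS rules but not the CSS rules.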

Based on the random survival forest analysis, tumor stage was the most important clinicopathological variable for OS, followed by surgery, age, tumor size, race, and receipt of chemotherapy and radiotherapy, each with a relative importance greater than 0.003 (Figure 3A). Similarly, tumor stage was the strongest predictor of CSS, followed by surgery, tumor size, age, race, tumor site, and receipt of chemotherapy and radiotherapy (Figure 3B).

Figure 3 The relative importance of distinct clinicopathological variables based on the random survival forest analysis. (A) Relative importance of variables for OS. (B) Relative importance of variables for CSS. CSS, cancer-specific survival; OS, overall survival.

Evaluation and validation of the stratification

To assess the robustness of the risk stratification, C-index, ROC curve, and calibration curve analyses were performed. The C-index measures the concordance between predicted and actual results. For OS, the C-index at 1, 3, and 5 years was 0.704, 0.685, and 0.683, respectively, while for CSS, it was 0.725, 0.707, and 0.708. The calibration curves demonstrated relatively good agreement between the predicted and actual results (Figure 4A,4B). The ROC curve analysis showed that, for OS, the area under the curve at 1, 3, and 5 years was 0.730, 0.742, and 0.755, respectively, while for CSS, it was 0.761, 0.780, and 0.796 (Figure 4C,4D).

Figure 4 Calibration and ROC curves of 1-, 3- and 5-year OS and CSS. (A) Calibration plot of the stratification for predicting 1-, 3- and 5-year OS rates. (B) Calibration plot of the stratification for predicting 1-, 3- and 5-year CSS rates. (C) ROC curves for predicting OS rates. (D) ROC curves for predicting CSS rates. CSS, cancer-specific survival; OS, overall survival; ROC, receiver operating characteristic.

Discussion

Consistent with previous studies, most of the patients in this cohort were relatively young and had poorly differentiated or undifferentiated tumors. The median OS in this study (23 months) was comparable to that reported in previous retrospective research (17 months) (18). DGC is usually diagnosed at an advanced stage due to a lack of obvious symptoms in the early stages (19). In this study, most of the patients were diagnosed at the regional stage. A Cox proportional hazards model was used to identify the clinicopathological variables significantly associated with prognosis that exert a consistent, global effect on OS and CSS. As a traditional methodology, Cox regression has limited ability to capture complex interactions among different types of variables. Stage, age, surgery, tumor size, and initial treatment characteristics were identified as the most important variables for predicting survival across the three methods (Cox regression, survival trees, and random survival forests). Similarly, a recent study developed a prognostic nomogram model for patients with DGC and highlighted the importance of age, surgical status, and chemotherapy status (20). Compared with the White patients, the Asian or Pacific Islander patients exhibited better survival, despite several studies reporting a higher proportion of DGC in Asian, Hispanic White, and Native American populations. Although the gastric antrum is the most common site of involvement (21,22), the patients with tumors in the body of the stomach had a more favorable prognosis than those with tumors at other sites.

Previous systematic studies have reported that the mean or median age of patients with DGC ranges from 45 to 60 years (23). The median age of this cohort was 64 years (range: 19 to over 90 years). For OS, patients with localized-stage disease were split into two groups using an optimized cutoff age of 71 years. The 5- and 10-year survival rates decreased sharply from 79.4% and 71.4% to 48.3% and 27.6%, respectively, in patients aged ≥71 years. Evidence suggests that older patients with DGC have a worse prognosis. However, some studies have reported that in advanced GC, young adult patients also have a poor prognosis, independent of histology (24). Conversely, age appears to have little effect on CSS in patients with localized-stage disease. A previous study found no significant association between DGC and CSS in early-stage GC patients aged 75 years or older, suggesting that early-stage DGC patients may benefit from endoscopic resection regardless of age (25). This difference between OS and CSS may be partially due to factors such as organ dysfunction, hospitalization duration, and malabsorption in older patients (26,27).

Surgery played an important role in stratifying the risk groups for both OS and CSS. For patients with localized-stage disease, surgical intervention is a critical determinant of prognosis, although the SEER dataset does not specify the surgical techniques used. Consistent with previous findings, surgery remains the most effective therapy for GC. For early GC without metastases, curative endoscopic resection is recommended by several authoritative guidelines (28-31). Radical gastrectomy is essential for patients with poorly differentiated tumors, incomplete resection, or lymphovascular invasion, although a subset of cases may still experience recurrence. In our analysis, the 5-year survival rate of patients with localized-stage disease decreased sharply if surgery was not performed. Nevertheless, given the disappointing survival outcomes following hepatectomy for liver metastases, the role of surgery in metastatic or stage IV GC remains unclear (32-35). Similarly, we found that surgery had limited efficacy in altering the malignant progression of advanced DGC at the distant stage.

The analysis of prognostic risk stratifications for both OS and CSS highlighted the essential role of chemotherapy in patients with regional-stage disease. Patients who received chemotherapy had nearly twofold higher survival rates compared with those who did not. Although chemotherapy remains the standard of care for advanced-stage DGC, its effectiveness is significantly compromised by chemoresistance, driven by the enrichment of cancer-associated fibroblasts and epithelial-mesenchymal transition. The European Society for Medical Oncology guidelines for GC recommend a doublet combination of a platinum derivative and a fluoropyrimidine as the standard first-line chemotherapy regimen (31). In practice, the effects of chemotherapy in locally advanced DGC remain controversial. Several studies have reported that patients with locally advanced DGC could benefit from neoadjuvant and perioperative chemotherapy (36,37). However, other studies have reported that patients receiving preoperative chemotherapy followed by surgery showed no significant survival benefit or detriment compared with those undergoing primary surgery (38-40). This may be due to the biological complexity of DGC, which leads to inherent chemoresistance, particularly in cases of signet-ring cell GC (41,42). Advancements in molecular targeted therapy have led to several clinical trials in DGC, yielding encouraging results against targets including claudin 18.2 (CLDN18.2), fibroblast growth factor receptor 2 (FGFR2), focal adhesion kinase (FAK), and the Hippo signaling pathway (43-45). These clinical advancements may enable more precisely tailored treatments for patients with this aggressive subtype at various disease stages.

This study highlighted the importance of chemotherapy for patients with regional-stage disease in the context of risk stratification. The 5- and 10-year survival rates of Group II were nearly twice as high as those of Group III. Patients with distant-stage disease, corresponding to stage IV in the TNM staging system, had a worse prognosis even after undergoing surgery. Their 5- and 10-year survival rates were still under 10%. Thus, surgery and chemotherapy exerted differential effects on survival in DGC across distinct SEER stages. In recent years, with the development of genetic sequencing technologies, targeted therapy and immunotherapy have been widely applied in GC, guided by molecular profiling.

Previous studies have largely focused on the biological mechanisms of DGC in tumor initiation and progression. In contrast, our risk stratification, derived from a relatively large SEER DGC cohort, provides a novel prognostic model for predicting outcomes and guiding individualized treatment decisions. It may also support long-term surveillance planning and help reduce patient anxiety. However, this study has several limitations. First, the reported median age may not be accurate because patients older than 90 years were coded as 90 years. Second, selection bias may have occurred due to the exclusion of patients with missing data for certain variables. Compared to the included patients, the excluded patients had significantly more advanced disease and lower rates of surgery and chemotherapy. These differences suggest that the data were not missing completely at random, and that our final cohort may represent a healthier subset with a more favorable prognosis. Consequently, the observed survival rates could be overestimated, while the true effects of certain risk factors may be underestimated. Third, the study lacked external validation cohorts, which are needed to further enhance the reliability and accuracy of the stratification.


Conclusions

This study identified distinct prognostic groups for DGC based on clinical variables and highlighted the influence of age, chemotherapy, and surgery on OS and CSS. The proposed stratification can be used to predict prognosis and tailor therapeutic plans for DGC patients, enabling more intensive disease management of high-risk patients.


Acknowledgments

The authors would like to thank the participants included in the SEER dataset and the project leaders for making their clinical data publicly available.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2465/rc

Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2465/prf

Funding: This study was supported by the Shandong Provincial Natural Science Foundation (Nos. ZR2024QH365 and ZR2021MH382), the Incubation Foundation of Shandong Provincial Hospital (No. 2023FY001), and the National Natural Science Foundation of China (No. 82103490).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-aw-2465/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Thrift AP, Wenker TN, El-Serag HB. Global burden of gastric cancer: epidemiological trends, risk factors, screening and prevention. Nat Rev Clin Oncol 2023;20:338-49.
  2. Lauren P. The two histological main types of gastric carcinoma: diffuse and so-called intestinal-type carcinoma. An attempt at a histo-clinical classification. Acta Pathol Microbiol Scand 1965;64:31-49.
  3. Mariette C, Carneiro F, Grabsch HI, et al. Consensus on the pathological definition and classification of poorly cohesive gastric carcinoma. Gastric Cancer 2019;22:1-9.
  4. Garcia-Pelaez J, Barbosa-Matos R, Gullo I, et al. Histological and mutational profile of diffuse gastric cancer: current knowledge and future challenges. Mol Oncol 2021;15:2841-67.
  5. Tanaka H, Yoshii M, Imai T, et al. Clinical significance of coexisting histological diffuse type in stage II/III gastric cancer. Mol Clin Oncol 2021;15:234.
  6. Monster JL, Kemp LJS, Gloerich M, et al. Diffuse gastric cancer: Emerging mechanisms of tumor initiation and progression. Biochim Biophys Acta Rev Cancer 2022;1877:188719.
  7. Pattison S, Mitchell C, Lade S, et al. Early relapses after adjuvant chemotherapy suggests primary chemoresistance in diffuse gastric cancer. PLoS One 2017;12:e0183891.
  8. Iyer P, Moslim M, Farma JM, et al. Diffuse gastric cancer: histologic, molecular, and genetic basis of disease. Transl Gastroenterol Hepatol 2020;5:52.
  9. Kim KH, Chi CH, Lee SK, et al. Histologic types of gastric carcinoma among Koreans. Cancer 1972;29:1261-3.
  10. Muñoz N, Correa P, Cuello C, et al. Histologic types of gastric carcinoma in high- and low-risk areas. Int J Cancer 1968;3:809-18.
  11. Qiu MZ, Cai MY, Zhang DS, et al. Clinicopathological characteristics and prognostic analysis of Lauren classification in gastric adenocarcinoma in China. J Transl Med 2013;11:58.
  12. Banerjee M, Reynolds E, Andersson HB, et al. Tree-Based Analysis. Circ Cardiovasc Qual Outcomes 2019;12:e004879.
  13. Ishwaran H, Kogalur UB, Blackstone EH, et al. Random survival forests. Ann Appl Stat 2008;2:841-60.
  14. Ishwaran H, Kogalur UB. Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC). 3.4.4 ed, 2025. Available online: https://CRAN.R-project.org/package=randomForestSRC
  15. Therneau TM, Atkinson EJ. An Introduction to Recursive Partitioning Using the RPART Routines. 2023. Available online: http://cran.r-project.org/web/packages/rpart/vignettes/longintro.pdf
  16. Therneau T, Atkinson B, Ripley B. rpart: Recursive Partitioning and Regression Trees. 4.1-11 ed: R package [computer program]; 2017. Available online: https://CRAN.R-project.org/package=rpart
  17. Therneau T. A Package for Survival Analysis in R. 3.5-0 ed, 2023. Available online: https://cran.r-project.org/web/packages/survival/vignettes/survival.pdf
  18. Stiekema J, Cats A, Kuijpers A, et al. Surgical treatment results of intestinal and diffuse type gastric cancer. Implications for a differentiated therapeutic approach? Eur J Surg Oncol 2013;39:686-93.
  19. Kumar S, Long JM, Ginsberg GG, et al. The role of endoscopy in the management of hereditary diffuse gastric cancer syndrome. World J Gastroenterol 2019;25:2878-86.
  20. Huang T, Chan C, Zhou H, et al. Construction and validation of the prognostic nomogram model for patients with diffuse-type gastric cancer based on the SEER database. Discov Oncol 2024;15:305.
  21. Assumpção PP, Barra WF, Ishak G, et al. The diffuse-type gastric cancer epidemiology enigma. BMC Gastroenterol 2020;20:223.
  22. Ooki A, Yamaguchi K. The dawn of precision medicine in diffuse-type gastric cancer. Ther Adv Med Oncol 2022;14:17588359221083049.
  23. Wang JE, Kim SE, Lee BE, et al. The risk of diffuse-type gastric cancer following diagnosis with gastric precancerous lesions: a systematic review and meta-analysis. Cancer Causes Control 2022;33:183-91.
  24. Yamamoto S, Kanzaki H, Sakaguchi C, et al. Current prognostic factors of advanced gastric cancer patients treated with chemotherapy: real world data from a Japanese 12 institutions. Jpn J Clin Oncol 2023;53:928-35.
  25. Yin P, Cai R, Zhou X, et al. Comparable prognosis of early gastric cancer between intestinal type and diffuse type in patients of age 75 and older: a SEER-based cohort study. Transl Cancer Res 2024;13:888-99.
  26. Bakir B, Şahin H, Kaner G, et al. Nutritional status and frailty in elderly patients undergoing major abdominal surgery for upper gastrointestinal tumors: a single-center prospective observational study. Ir J Med Sci 2025;194:1773-86.
  27. Guo Y, Xu Q, Xu T, et al. Sarcopenia to frailty and malnutrition and their predictive role for postoperative outcomes in older gastric cancer. Innov Aging 2025;9:igaf122.2379.
  28. Bollschweiler E, Berlth F, Baltin C, et al. Treatment of early gastric cancer in the Western World. World J Gastroenterol 2014;20:5672-8.
  29. Japanese gastric cancer treatment guidelines 2018 (5th edition). Gastric Cancer 2021;24:1-21.
  30. Ajani JA, D'Amico TA, Bentrem DJ, et al. Gastric Cancer, Version 2.2022, NCCN Clinical Practice Guidelines in Oncology. J Natl Compr Canc Netw 2022;20:167-92.
  31. Lordick F, Carneiro F, Cascinu S, et al. Gastric cancer: ESMO Clinical Practice Guideline for diagnosis, treatment and follow-up. Ann Oncol 2022;33:1005-20.
  32. Shin HJ, Song JH, Kim SE, et al. Clinical impact of gastrectomy in surgically proven stage IV gastric cancers: retrospective analysis from Korean multicenter dataset (PASS-META). Gastric Cancer 2026;29:205-19.
  33. Grotz TE. Editorial: Bridging Biology and Surgery: The Next Frontier in Managing Cytology-Positive Gastric Cancer. Ann Surg Oncol 2026;33:2827-8.
  34. Orsini C, Aulicino M, D'Annibale G, et al. Evaluating the Long-Term Impact of Cytoreductive Surgery for Gastric Cancer with Peritoneal Metastasis: Are We on the Right Path? J Pers Med 2025;15:300.
  35. Takahashi K, Terashima M, Notsu A, et al. Surgical treatment for liver metastasis from gastric cancer: A systematic review and meta-analysis of long-term outcomes and prognostic factors. Eur J Surg Oncol 2024;50:108582.
  36. Gertsen EC, van der Veen A, Brenkman HJF, et al. Multimodal Therapy Versus Primary Surgery for Gastric and Gastroesophageal Junction Diffuse Type Carcinoma, with a Focus on Signet Ring Cell Carcinoma: A Nationwide Study. Ann Surg Oncol 2024;31:1760-72.
  37. Li ZF, Li Z, Zhang XJ, et al. Perioperative chemotherapy improves survival of patients with locally advanced diffuse gastric cancer. World J Gastrointest Surg 2024;16:2878-92.
  38. Elangovan A, Penumadu P, Dubashi B, et al. Perioperative versus adjuvant chemotherapy in carcinoma stomach-A retrospective propensity-matched analysis. Indian J Gastroenterol 2025; Epub ahead of print.
  39. Liu B, Shen C, Yin X, et al. Perioperative chemotherapy for gastric cancer patients with microsatellite instability or deficient mismatch repair: A systematic review and meta-analysis. Cancer 2025;131:e35831.
  40. Giampieri R, Baleani MG, Bittoni A, et al. Impact of Signet-Ring Cell Histology in the Management of Patients with Non-Metastatic Gastric Cancer: Results from a Retrospective Multicenter Analysis Comparing FLOT Perioperative Chemotherapy vs. Surgery Followed by Adjuvant Chemotherapy. Cancers (Basel) 2023;15:3342.
  41. Raju D, Prabhu S, Maria A, et al. Gastric Signet Ring Cell Carcinoma: Tumor Microenvironment Reprogramming and Novel Therapeutic Targets With Emphasis on GRIN2D. Clin Transl Sci 2025;18:e70424.
  42. Kemp LJS, Monster JL, Wood CS, et al. Tumour-intrinsic alterations and stromal matrix remodelling promote Wnt-niche independence during diffuse-type gastric cancer progression. Gut 2025;74:1219-29.
  43. Qi C, Gong J, Li J, et al. Claudin18.2-specific CAR T cells in gastrointestinal cancers: phase 1 trial interim results. Nat Med 2022;28:1189-98.
  44. Nakayama I, Qi C, Chen Y, et al. Claudin 18.2 as a novel therapeutic target. Nat Rev Clin Oncol 2024;21:354-69.
  45. Wu LW, Jang SJ, Shapiro C, et al. Diffuse Gastric Cancer: A Comprehensive Review of Molecular Features and Emerging Therapeutics. Target Oncol 2024;19:845-65.

(English Language Editor: L. Huleatt)

Cite this article as: Cai L, Sun Y, Lv B, Su W, Li J. Tree-based model for diffuse-type gastric cancer prognostication: a population study based on the Surveillance, Epidemiology, and End Results database. Transl Cancer Res 2026;15(4):328. doi: 10.21037/tcr-2025-aw-2465
