Investigation of immunohistochemical marker expression in breast cancer of varying severity and construction of a predictive model
Highlight box
Key findings
• Univariate analysis: age, body mass index (BMI), lymph node metastasis, human epidermal growth factor receptor 2 (HER2), Ki-67 antigen (Ki-67), and epidermal growth factor receptor (EGFR) were associated with severe disease; estrogen receptor (ER) positivity was associated with a lower risk.
• Primary model (nodal status excluded): age was independently associated with severe disease; BMI and EGFR showed borderline associations; ER was not independently significant.
• Sensitivity model (nodal status included): lymph node metastasis was the strongest predictor, and EGFR was independently significant.
• After bootstrap internal validation, the model showed moderate discrimination.
What is known and what is new?
• Established risk factors for advanced breast cancer at diagnosis include nodal involvement, aggressive molecular subtypes, and high Ki-67.
• This study shows that when nodal status is excluded from the primary model to avoid structural overlap with stage, age remains independently associated with severe disease, whereas EGFR shows a positive association that reaches significance only in the sensitivity analysis that includes nodal status.
What is the implication, and what should change now?
• These findings suggest that older patients and those with EGFR-positive tumors may be more likely to present with severe disease, while ER positivity is associated with lower crude odds of severity. Because the model was internally validated only and showed moderate corrected discrimination, further external validation is required before clinical implementation.
Introduction
Breast cancer is the most prevalent cancer globally and the leading malignancy in women (1,2). In 2022, the World Health Organization (WHO) data reported 2,296,840 new breast cancer cases and 666,103 deaths worldwide. Notably, China reported the highest incidence at 350,000 new cases (3,4). Despite advancements in diagnosis and treatment, breast cancer’s heterogeneity and complexity continue to pose challenges to clinical management (5-7). Immunohistochemical markers, including estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), Ki-67 antigen (Ki-67), and epidermal growth factor receptor (EGFR), are vital for diagnosis, prognostic evaluation, and guiding breast cancer therapy (8-11). These markers are essential for optimizing clinical decisions in breast cancer. Further research is needed to elucidate the variations in expression and clinical relevance of these markers at different breast cancer stages (12,13).
Recent studies increasingly highlight the role of immunohistochemical markers in breast cancer (14,15). For example, ER and PR expression correlate with breast cancer prognosis (11,16,17). HER2 overexpression is linked to aggressive disease and poor survival (8,18). Ki-67, a marker indicating cell proliferation, is associated with high recurrence and poor survival in breast cancer (10). EGFR overexpression, noted in multiple cancers, is also a focus in breast cancer research (9,19,20).
This study examines differences in the expression of immunohistochemical markers among breast cancer patients of varying severities and develops a predictive model based on these markers to inform personalized breast cancer treatment. Analyzing data from breast cancer patients at Qinghai University Affiliated Hospital between 2019 and 2024, we aim to identify the distribution and clinical relevance of these markers across different levels of breast cancer severity. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0116/rc).
Methods
Study population
The study, designed retrospectively, was carried out at a Qinghai University-affiliated tertiary medical center (Qinghai University Affiliated Hospital) and included 358 breast cancer patients diagnosed between January 1, 2019, and May 1, 2024. Inclusion criteria were pathologically confirmed breast cancer, complete clinicopathological data (including age, tumor size, lymph node status, and distant metastasis), and available immunohistochemical marker results. Patients with a HER2 score of 2+ without fluorescence in situ hybridization (FISH) testing were excluded. Among the 356 eligible breast cancer patients, two groups were defined according to the 2024 edition of the Chinese Anti-Cancer Association Guidelines for the Diagnosis and Treatment of Breast Cancer (21): stages I–II were classified as the mild group (n=280, 78.7%) and stages III–IV as the severe group (n=76, 21.3%). The study design is detailed in Figure 1. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Qinghai University Affiliated Hospital (No. QHDXYY-2023-IRB-056). The requirement for informed consent was waived by the Ethics Committee due to the retrospective design and anonymized nature of the data.
Data collection
Demographic, clinicopathological, and immunohistochemical data, including ER, PR, HER2, Ki-67, EGFR, tumor protein 63 (p63), and cytokeratin 5/6 (CK5/6), were collected from the electronic medical records of the Qinghai University Affiliated Hospital. All patients were pathologically diagnosed with breast cancer and underwent standardized immunohistochemical testing. Tumor specimens resected during surgery were fixed in 10% neutral buffered formalin within 1 hour after surgery for a duration of 6 to 72 hours. The fixed tissues were then processed through routine dehydration, paraffin embedding, and sectioned into 4-µm-thick slices. Sections were used for hematoxylin and eosin staining and immunohistochemical analysis.
Interpretive standards
The Ki-67 cut-off value of 20%, supported by evidence of its prognostic relevance, was adopted (22-24): Ki-67 expression was considered high when >20% of tumor cell nuclei were stained, which suggests aggressive tumor biology and may signal a poorer prognosis. ER and PR were considered positive with ≥1% nuclear-stained tumor cells (21). HER2 was scored positive at 3+ by immunohistochemistry (IHC) or 2+ with FISH confirmation (25). Tumors were considered EGFR-positive if at least 10% of cells exhibited membrane staining (26,27). p63 was considered positive when nuclear staining was observed in at least 10% of tumor cells (28,29). Tumors were classified as CK5/6-positive when ≥5% of cells showed staining in the membrane or cytoplasm (30).
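The cut-offs above can be written as a small set of rules. The following Python sketch is illustrative only and is not the authors' code: the function names, argument names, and dictionary keys are hypothetical, and the inputs are assumed to be the percentage of stained tumor cells for each marker and the HER2 IHC score (0 to 3).

```python
from typing import Optional


def her2_status(ihc_score: int, fish_amplified: Optional[bool] = None) -> Optional[bool]:
    """HER2 call from the IHC score (0-3) and an optional FISH result.

    Returns None for IHC 2+ without a FISH result, the situation that
    led to exclusion in this study.
    """
    if ihc_score == 3:
        return True
    if ihc_score == 2:
        return fish_amplified  # None when FISH was not performed
    return False


def binarize_markers(er_pct, pr_pct, ki67_pct, egfr_pct, p63_pct, ck56_pct):
    """Apply the percentage cut-offs stated in the text (hypothetical helper)."""
    return {
        "ER": er_pct >= 1,          # >=1% nuclear staining
        "PR": pr_pct >= 1,          # >=1% nuclear staining
        "Ki-67 high": ki67_pct > 20,  # strictly greater than 20%
        "EGFR": egfr_pct >= 10,     # >=10% membrane staining
        "p63": p63_pct >= 10,       # >=10% nuclear staining
        "CK5/6": ck56_pct >= 5,     # >=5% membrane/cytoplasmic staining
    }
```

Note that the Ki-67 rule is a strict inequality, so a tumor with exactly 20% stained nuclei is classified as low expression under this reading.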
Tumor phenotype classification
Based on immunohistochemical results for ER, PR, HER2, and Ki-67, as well as FISH results for HER2, breast cancer phenotypes were classified as follows: luminal A: ER and/or PR positive, HER2 negative, and low Ki-67 expression; luminal B (HER2-negative): ER and/or PR positive, HER2 negative, and high Ki-67 expression; luminal B (HER2-positive): ER and/or PR positive with HER2 overexpression and/or amplification; HER2-enriched: ER and PR negative with HER2 overexpression and/or amplification; triple-negative breast cancer (TNBC): ER, PR, and HER2 all negative.
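The phenotype definitions above amount to a short decision rule. A minimal Python sketch follows, assuming each marker has already been reduced to a positive/negative (or high/low) status; the function name `classify_subtype` and its arguments are hypothetical and not taken from the study's analysis code.

```python
def classify_subtype(er_pos: bool, pr_pos: bool, her2_pos: bool, ki67_high: bool) -> str:
    """Map binarized ER, PR, HER2, and Ki-67 status to the five phenotypes."""
    hormone_receptor_pos = er_pos or pr_pos  # "ER and/or PR positive"
    if hormone_receptor_pos and her2_pos:
        return "Luminal B (HER2-positive)"
    if hormone_receptor_pos:
        # HER2-negative luminal tumors are split by Ki-67 expression
        return "Luminal B (HER2-negative)" if ki67_high else "Luminal A"
    if her2_pos:
        return "HER2-enriched"
    return "TNBC"  # ER, PR, and HER2 all negative
```

Under these rules Ki-67 affects only the split between luminal A and luminal B (HER2-negative); it does not change the classification of HER2-positive or triple-negative tumors.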
Statistical analysis
Continuous variables are reported as mean ± standard deviation (SD) or median [interquartile range (IQR)]. Categorical variables are presented as n (%). For continuous variables, t-tests or Mann-Whitney U tests were applied based on data normality. Chi-squared tests or Fisher’s exact tests were used to compare categorical variables. Associations were quantified using odds ratios (ORs) and 95% confidence intervals (CIs). Variables for the primary multivariable model were prespecified based on clinical relevance and biological plausibility (age, BMI, ER, PR, HER2, Ki-67, EGFR, and CK5/6), rather than selected solely according to univariate P values. Because axillary lymph node status is part of the staging framework used to define the outcome, it was excluded from the primary model to avoid structural circularity and was evaluated only in a sensitivity analysis. Additionally, a nomogram was developed using identified risk factors to predict the probability of severe breast cancer at presentation. Model performance was assessed by discrimination [apparent area under the curve (AUC) and bootstrap optimism-corrected AUC], calibration (bootstrap-corrected calibration slope and calibration curve), and goodness-of-fit. Internal validation was performed using 1,000 bootstrap resamples. P values <0.05 were considered statistically significant. Data analysis was performed with R version 4.1.3 (31).
Results
A comparison of baseline characteristics among 356 patients with breast cancer
Of 358 patients screened, 2 with a HER2 score of 2+ without FISH testing were excluded, leaving 356 for analysis. Data collection spanned January 1, 2019, to May 1, 2024. Table 1 details the baseline characteristics. The mean age was 52.42±10.37 years; 1 patient (0.28%) was male. The cohort included 280 patients in the mild group and 76 in the severe group. No significant differences were noted in height, weight, marital status, hypertension, diabetes, coronary heart disease, tumor laterality, p63, or CK5/6; the difference in BMI was borderline (P=0.05). The mild group had higher proportions of cases without lymph node metastasis and of ER-positive, PR-positive, HER2-negative, EGFR-negative, and low Ki-67 tumors (P<0.05). The severe group had higher proportions of cases with lymph node metastasis and of ER-negative, PR-negative, HER2-positive, EGFR-positive, and high Ki-67 tumors (P<0.05). Overall, lymph node metastasis, ER and PR negativity, HER2 and EGFR positivity, high Ki-67 expression, and older age were associated with the severe group.
Table 1
| Variables | Total (n=356) | Mild group (n=280) | Severe group (n=76) | P value |
|---|---|---|---|---|
| Age (years) | 52.42±10.37 | 51.48±9.74 | 55.91±11.86 | 0.003 |
| Height (cm) | 160.83±5.29 | 160.98±5.05 | 160.29±6.12 | 0.37 |
| Weight (kg) | 61.60±8.81 | 61.23±8.51 | 62.97±9.77 | 0.16 |
| BMI (kg/m²) | 23.82±3.25 | 23.62±3.09 | 24.53±3.70 | 0.05 |
| Marital status | | | | >0.99 |
| Single | 8 (2.25) | 7 (2.50) | 1 (1.32) | |
| Married | 348 (97.75) | 273 (97.50) | 75 (98.68) | |
| Hypertension | | | | 0.09 |
| No | 313 (87.92) | 251 (89.64) | 62 (81.58) | |
| Yes | 43 (12.08) | 29 (10.36) | 14 (18.42) | |
| Diabetes | | | | 0.55 |
| No | 338 (94.94) | 267 (95.36) | 71 (93.42) | |
| Yes | 18 (5.06) | 13 (4.64) | 5 (6.58) | |
| Coronary heart disease | | | | 0.38 |
| No | 348 (97.75) | 275 (98.21) | 73 (96.05) | |
| Yes | 8 (2.25) | 5 (1.79) | 3 (3.95) | |
| Position | | | | 0.50 |
| Right | 155 (43.54) | 125 (44.64) | 30 (39.47) | |
| Left | 201 (56.46) | 155 (55.36) | 46 (60.53) | |
| Lymph node metastasis | | | | <0.001 |
| No | 200 (56.18) | 189 (67.50) | 11 (14.47) | |
| Yes | 156 (43.82) | 91 (32.50) | 65 (85.53) | |
| Tumor type | | | | <0.001 |
| Luminal A | 136 (38.20) | 118 (42.14) | 18 (23.68) | |
| Luminal B (HER2+) | 54 (15.17) | 43 (15.36) | 11 (14.47) | |
| Luminal B (HER2−) | 82 (23.03) | 70 (25.00) | 12 (15.79) | |
| HER2+ | 34 (9.55) | 18 (6.43) | 16 (21.05) | |
| TNBC | 50 (14.04) | 31 (11.07) | 19 (25.00) | |
| ER | | | | <0.001 |
| Positive | 269 (75.56) | 229 (81.79) | 40 (52.63) | |
| Negative | 87 (24.44) | 51 (18.21) | 36 (47.37) | |
| PR | | | | <0.001 |
| Positive | 229 (64.33) | 198 (70.71) | 31 (40.79) | |
| Negative | 127 (35.67) | 82 (29.29) | 45 (59.21) | |
| HER2 | | | | 0.02 |
| Positive | 88 (24.72) | 61 (21.79) | 27 (35.53) | |
| Negative | 268 (75.28) | 219 (78.21) | 49 (64.47) | |
| Ki-67 | | | | 0.01 |
| Low expression | 198 (55.62) | 166 (59.29) | 32 (42.11) | |
| High expression | 158 (44.38) | 114 (40.71) | 44 (57.89) | |
| p63 | | | | 0.32 |
| Positive | 103 (28.93) | 77 (27.50) | 26 (34.21) | |
| Negative | 253 (71.07) | 203 (72.50) | 50 (65.79) | |
| EGFR | | | | <0.001 |
| Positive | 99 (27.81) | 63 (22.50) | 36 (47.37) | |
| Negative | 257 (72.19) | 217 (77.50) | 40 (52.63) | |
| CK5/6 | | | | 0.06 |
| Positive | 116 (32.58) | 84 (30.00) | 32 (42.11) | |
| Negative | 240 (67.42) | 196 (70.00) | 44 (57.89) | |
BMI, body mass index; CK5/6, cytokeratin 5/6; EGFR, epidermal growth factor receptor; ER, estrogen receptor; HER2, human epidermal growth factor receptor 2; Ki-67, Ki-67 antigen; p63, tumor protein 63; PR, progesterone receptor; TNBC, triple-negative breast cancer.
Among the molecular subtypes, luminal A was the most common (38.20%), followed by luminal B (HER2-negative) at 23.03%, luminal B (HER2-positive) at 15.17%, TNBC at 14.04%, and HER2-enriched at 9.55%. The distribution of molecular subtypes differed significantly between the mild and severe groups (P<0.001). The proportions of HER2-enriched and TNBC subtypes were markedly higher in the severe group, suggesting that these subtypes may be more likely to be associated with advanced-stage breast cancer.
Univariate logistic regression analysis for severe breast cancer
Table 2 shows that, in univariate analysis, age, BMI, lymph node metastasis, ER, PR, HER2, Ki-67, and EGFR status were associated with breast cancer severity. Older age, higher BMI, and axillary lymph node metastasis were associated with higher odds of severe disease, with nodal status showing the largest effect (OR =12.27). ER and PR positivity were associated with lower odds of severe disease. HER2 positivity, high Ki-67 expression, and EGFR positivity were associated with higher odds of severe disease, and CK5/6 positivity showed a borderline association (P=0.047).
Table 2
| Variables | OR (95% CI) | P value |
|---|---|---|
| Age | 1.04 (1.02, 1.07) | 0.001 |
| BMI | 1.09 (1.01, 1.18) | 0.03 |
| ER_bin | 0.25 (0.14, 0.43) | <0.001 |
| PR_bin | 0.29 (0.17, 0.48) | <0.001 |
| HER2_bin | 1.98 (1.13, 3.41) | 0.01 |
| Ki-67_bin | 2.00 (1.20, 3.37) | 0.008 |
| EGFR_bin | 3.10 (1.82, 5.28) | <0.001 |
| CK5/6_bin | 1.70 (1.00, 2.86) | 0.047 |
| Node | 12.27 (6.41, 25.60) | <0.001 |
Bin, binarized status (positive/negative) of the immunohistochemical marker; BMI, body mass index; CI, confidence interval; CK5/6, cytokeratin 5/6; EGFR, epidermal growth factor receptor; ER, estrogen receptor; HER2, human epidermal growth factor receptor 2; Ki-67, Ki-67 antigen; OR, odds ratio; PR, progesterone receptor.
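For binary markers, the univariate ORs in Table 2 can be cross-checked against the counts in Table 1. The Python sketch below is an illustration under stated assumptions rather than the authors' procedure: it computes the cross-product OR with a Wald 95% CI, whereas the intervals in Table 2 were produced in R and may have been derived differently (for example by profile likelihood), so the CI bounds can differ slightly. The function name and the dictionary of counts are hypothetical; the counts themselves are copied from Table 1.

```python
import math


def crude_or(pos_severe, pos_mild, neg_severe, neg_mild, z=1.96):
    """Cross-product odds ratio with a Wald confidence interval."""
    odds_ratio = (pos_severe * neg_mild) / (pos_mild * neg_severe)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / pos_severe + 1 / pos_mild + 1 / neg_severe + 1 / neg_mild)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# (positive & severe, positive & mild, negative & severe, negative & mild), from Table 1
TABLE1_COUNTS = {
    "ER": (40, 229, 36, 51),
    "EGFR": (36, 63, 40, 217),
    "Node": (65, 91, 11, 189),
}

for marker, counts in TABLE1_COUNTS.items():
    or_, lo, hi = crude_or(*counts)
    print(f"{marker}: OR = {or_:.2f} (Wald 95% CI: {lo:.2f}, {hi:.2f})")
```

The point estimates obtained this way agree with Table 2 for ER (0.25), EGFR (3.10), and nodal status (12.27).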
Univariate analysis of molecular subtypes showed that, compared with the luminal A subtype (reference group), both the HER2-enriched and TNBC subtypes were strongly associated with increased disease severity. In contrast, luminal B (HER2-positive) and luminal B (HER2-negative) subtypes were not significantly associated with severity (P>0.05).
Multivariable logistic regression analysis for severe breast cancer
To refine the analysis, we utilized stepwise regression. Table 3 displays the results of the revised primary multivariable logistic regression model, which excluded nodal status. Age remained independently associated with severe disease (OR =1.03; 95% CI: 1.01–1.06; P=0.01). BMI showed a borderline positive association (OR =1.08; 95% CI: 0.99–1.18; P=0.08). ER was not independently significant after adjustment (OR =0.58; 95% CI: 0.24–1.33; P=0.21), and EGFR showed a positive but borderline association (OR =1.78; 95% CI: 0.92–3.39; P=0.08). PR, HER2, Ki-67, and CK5/6 were likewise not independently significant (all P>0.05).
Table 3
| Variables | OR (95% CI) | P value |
|---|---|---|
| Age | 1.03 (1.01, 1.06) | 0.01 |
| BMI | 1.08 (0.99, 1.18) | 0.08 |
| ER_bin | 0.58 (0.24, 1.33) | 0.21 |
| PR_bin | 0.74 (0.33, 1.71) | 0.47 |
| HER2_bin | 1.43 (0.76, 2.67) | 0.26 |
| Ki-67_bin | 1.49 (0.84, 2.65) | 0.17 |
| EGFR_bin | 1.78 (0.92, 3.39) | 0.08 |
| CK5/6_bin | 1.30 (0.71, 2.36) | 0.39 |
Bin, binarized status (positive/negative) of the immunohistochemical marker; BMI, body mass index; CI, confidence interval; CK5/6, cytokeratin 5/6; EGFR, epidermal growth factor receptor; ER, estrogen receptor; HER2, human epidermal growth factor receptor 2; Ki-67, Ki-67 antigen; OR, odds ratio; PR, progesterone receptor.
In the sensitivity model (Table 4) including axillary lymph node metastasis, nodal status remained the strongest predictor (OR =16.75; 95% CI: 7.91–39.15; P<0.001), and EGFR positivity remained independently associated with severe disease (OR =2.93; 95% CI: 1.33–6.59; P=0.008).
Table 4
| Term | Coef_shrunk | OR | OR_per5y | pOR_per5bmi |
|---|---|---|---|---|
| Intercept | −0.15317393 | 0.8579805 | 1.000000 | 1.000000 |
| Age (years) | 0.03240979 | 1.0329407 | 1.175918 | 1.000000 |
| BMI (kg/m²) | 0.05706644 | 1.0587262 | 1.000000 | 1.330204 |
| Hypertension | 0.22912295 | 1.2574966 | 1.000000 | 1.000000 |
| EGFR | −0.90485616 | 0.4046001 | 1.000000 | 1.000000 |
| CK5/6 | −0.32796359 | 0.7203892 | 1.000000 | 1.000000 |
BMI, body mass index; CK5/6, cytokeratin 5/6; coef, coefficient; EGFR, epidermal growth factor receptor; OR, odds ratio; per5bmi, per 5-kg/m² increase in body mass index; per5y, per 5-year increase; pOR, prevalence odds ratio.
Figure 2 presents the multivariate results after the manual inclusion of age and BMI. The model indicated that neither age nor BMI significantly influenced the severity of breast cancer in patients; however, an increase in age and BMI was associated with worse disease outcomes. Therefore, these variables were included due to their clinical significance (32-36).
The nomogram for predicting breast cancer severity
Using risk factors identified by multivariable logistic regression, we developed a nomogram predicting breast cancer severity (Figure 3). The revised primary model showed an apparent AUC of 0.713 (95% CI: 0.644–0.782), indicating moderate discrimination. After bootstrap internal validation with 1,000 resamples, the mean optimism was 0.034, yielding an optimism-corrected AUC of 0.679 (Figure 5). The AUC for the nomogram is shown in Figure S1. Calibration was assessed with a bootstrap-corrected calibration curve based on 1,000 resamples (Figure 4). The bootstrap-corrected calibration slope was 0.841, indicating mild overfitting, and the calibration curve showed acceptable overall agreement between predicted and observed probabilities, with Emax =0.072, a mean absolute error of 0.034, and a mean squared error of 0.00167. Furthermore, decision curve analysis (DCA) supported the nomogram’s potential clinical utility (Figure S2).
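The relation between the apparent AUC, the mean bootstrap optimism, and the optimism-corrected AUC can be made explicit. The Python sketch below is a simplified illustration, not the validation code used in this study: `auc` is a plain rank-based (Mann-Whitney) estimate, `optimism_corrected` subtracts the mean of the per-resample optimism values, and the toy scores and labels are hypothetical. In a full bootstrap validation, each optimism value is the AUC of a model refitted on a resample minus the AUC of that same model evaluated on the original data.

```python
def auc(scores, labels):
    """Rank-based AUC: probability that a case outranks a non-case (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def optimism_corrected(apparent, optimism_values):
    """Apparent performance minus the mean optimism across bootstrap resamples."""
    mean_optimism = sum(optimism_values) / len(optimism_values)
    return apparent - mean_optimism


# Hypothetical toy data: predicted risks and observed severe (1) / mild (0) status
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))

# Values reported in the text: apparent AUC 0.713, mean optimism 0.034
print(round(optimism_corrected(0.713, [0.034]), 3))
```

With the reported values, 0.713 − 0.034 gives the optimism-corrected AUC of 0.679.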
In the final shrunken model (Table 4), each 1-year increase in age and each 1-kg/m² increase in BMI corresponded to ORs of 1.033 and 1.059, respectively; EGFR level 2 and CK5/6 level 2 corresponded to ORs of 0.405 and 0.720, respectively, relative to reference level 1. These shrunken coefficients can be used for individualized risk calculation.
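The ORs in Table 4 follow from the shrunken coefficients by exponentiation, with per-5-unit ORs obtained by multiplying the coefficient by 5 first. The Python sketch below reproduces that arithmetic; the dictionary and function names are hypothetical, and the coefficients are copied from Table 4. It deliberately stops short of computing an individual predicted probability, because the source does not state how the covariates were coded or scaled when the intercept was estimated.

```python
import math

# Shrunken coefficients copied from Table 4 (intercept omitted)
SHRUNKEN_COEFS = {
    "Age (years)": 0.03240979,
    "BMI (kg/m2)": 0.05706644,
    "Hypertension": 0.22912295,
    "EGFR": -0.90485616,
    "CK5/6": -0.32796359,
}


def odds_ratio(coef, units=1.0):
    """OR for a change of `units` in the predictor: exp(coef * units)."""
    return math.exp(coef * units)


for term, coef in SHRUNKEN_COEFS.items():
    print(f"{term}: OR per unit = {odds_ratio(coef):.3f}")

print(f"Age, per 5 years: {odds_ratio(SHRUNKEN_COEFS['Age (years)'], 5):.3f}")
print(f"BMI, per 5 kg/m2: {odds_ratio(SHRUNKEN_COEFS['BMI (kg/m2)'], 5):.3f}")
```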
Discussion
Breast cancer, the leading malignant tumor in women globally, exhibits rising incidence and mortality trends (1,2). Research advancements in breast cancer have been significant in recent years. Notably, the application of immunohistochemical markers has improved clinicians’ ability to predict prognosis and personalize treatment by analyzing protein expression in the tumor microenvironment. These studies lay a solid foundation for deeper insights into breast cancer biology and its progression (37,38).
This study investigates immunohistochemical marker expression and its correlation with breast cancer severity. Analysis of 356 patients revealed that 78.7% were stage I/II (mild group) and 21.3% were stage III/IV (severe group). Significant differences in clinicopathological features distinguished the mild and severe groups, with the latter showing a higher rate of lymph node metastasis, EGFR and HER2 positivity, elevated Ki-67 expression, and greater average age (39). These findings are consistent with the literature linking such aggressive traits to poorer outcomes (40). The elevated Ki-67 expression in the severe group supports the association between higher proliferation rates and aggressive tumor behavior (41-43). Moreover, elevated ER and PR expression correlate with a better prognosis (44).
Univariate logistic regression analysis associated breast cancer severity with age, BMI, lymph node metastasis, molecular subtype, and the biomarkers ER, PR, HER2, Ki-67, and EGFR. Importantly, lymph node metastasis, age, and BMI were significantly linked to severe disease. Age serves as a critical predictor of prognosis in breast cancer. Patients at age extremes often have poorer outcomes (45). Younger patients are more likely to present with hormone receptor-negative tumors and more aggressive molecular subtypes, while older patients may experience reduced survival due to comorbidities or less intensive treatment. BMI has been widely recognized as a negative prognostic factor in breast cancer (45). In particular, obese patients exhibit significantly higher rates of recurrence and mortality compared with those of normal weight (46). Given their clinical relevance and potential confounding effects, both age and BMI were included as covariates in the multivariable analysis (Figure 3). This adjustment enhances the model’s alignment with real-world clinical settings and improves its interpretability and practical value. High ER and PR expression is generally associated with a more favorable prognosis. Recent studies have confirmed that tumors in ER- and PR-positive patients tend to exhibit lower aggressiveness and better responses to endocrine therapy, resulting in improved clinical outcomes and a reduced risk of recurrence (44). HER2 overexpression is a well-established adverse prognostic factor in breast cancer. Elevated HER2 levels are closely associated with increased tumor proliferation, invasiveness, and metastatic potential, leading to accelerated disease progression and poorer clinical outcomes (47). Elevated Ki-67 expression indicates a more aggressive disease. As a marker of cellular proliferation, Ki-67 reflects the proliferative and invasive potential of tumor cells.
High Ki-67 levels have been widely associated with poor prognosis, higher histological grade, and reduced disease-free and overall survival rates (41-43). Clinical guidelines also recommend Ki-67 as an important reference marker for guiding breast cancer treatment decisions (48). EGFR positivity is strongly associated with increased tumor aggressiveness and poorer prognosis. EGFR is a key receptor tyrosine kinase involved in tumor cell growth, survival, and metastasis, primarily acting through downstream signaling pathways such as PI3K/AKT and MAPK (49). Numerous studies have shown that EGFR overexpression is particularly common in TNBC and basal-like subtypes, both of which are characterized by high aggressiveness and limited treatment options (49,50). Therefore, patients with EGFR-positive tumors may require more aggressive treatment strategies and closer clinical monitoring. In univariate logistic regression analysis, using the luminal A subtype as the reference, both the HER2-enriched and TNBC subtypes showed a significant positive association with disease severity. The HER2-enriched subtype had an OR of 5.74 (95% CI: 2.48–13.5; P<0.001), indicating that the odds of presenting with advanced disease were approximately 5.7-fold higher than for the luminal A subtype. The TNBC subtype had an OR of 3.98 (95% CI: 1.86–8.60; P<0.001), indicating approximately fourfold higher odds compared with the luminal A subtype. Both associations were highly statistically significant (P<0.001), suggesting that, relative to the biologically less aggressive luminal A subtype, patients with HER2-enriched and TNBC subtypes have significantly higher odds of presenting with severe disease.
These findings are consistent with established clinical knowledge: HER2-enriched and TNBC subtypes are generally more aggressive and associated with poorer prognosis compared with the luminal A subtype, with patients more likely to present with higher tumor grade or advanced stage at diagnosis (51). In HER2-positive breast cancer, overexpression of the HER2 receptor is a key driver of tumor progression. HER2 is a tyrosine kinase receptor that, upon dimerization with other members of the HER family, activates downstream signaling cascades such as the RAS-RAF-MEK-ERK pathway, which promotes cell proliferation, and the PI3K-AKT-mTOR pathway, which enhances cell survival (52). The hyperactivation of these signaling pathways accelerates tumor cell proliferation and inhibits apoptosis, contributing to the high proliferative capacity and aggressiveness of HER2-positive breast cancers. Consequently, these tumors often exhibit higher histological grades and more rapid growth. HER2-driven signaling facilitates invasion into surrounding tissues and promotes distant metastasis, leading to a poorer prognosis. Although anti-HER2 targeted therapies, such as trastuzumab, have significantly improved outcomes in this patient population, HER2-positive tumors remain biologically more aggressive than luminal A subtypes (53). TNBC lacks expression of hormone receptors and HER2, rendering it unresponsive to endocrine therapy or HER2-targeted treatments. As a result, patients must rely on non-specific systemic therapies such as chemotherapy, with limited treatment options available. This subtype often corresponds to the ‘basal-like’ molecular phenotype. TNBC tumors frequently exhibit high expression of basal markers, such as the basal cytokeratin CK5/6 and EGFR, while lacking expression of genes associated with the estrogen signaling pathway. The basal-like phenotype is characterized by high proliferative activity, poor differentiation, and increased genomic instability (51).
Studies have shown that TNBC is characterized by a high frequency of tumor suppressor gene mutations, particularly p53, and frequent dysregulation of pathways such as the cell cycle and PI3K/AKT. These molecular alterations collectively contribute to the aggressive growth of TNBC. This subtype is also strongly linked to breast cancer 1, early onset (BRCA1) gene mutations and is more prevalent among breast cancer patients carrying germline BRCA1 mutations (53). Loss of BRCA1 function impairs DNA repair, which may further drive the aggressive behavior and early recurrence of TNBC. Due to the lack of effective targeted therapies, TNBC carries a high risk of early recurrence and metastasis, and is considered one of the most aggressive subtypes of breast cancer. The tumor immune microenvironment of TNBC also differs from other subtypes. TNBC tumors often exhibit a high abundance of tumor-infiltrating lymphocytes (TILs), indicating a higher degree of immunogenicity in this subtype (54). A high density of TILs is generally associated with better responses to chemotherapy and immunotherapy. Numerous clinical studies have demonstrated that immune checkpoint inhibitors, such as programmed cell death protein 1 (PD-1)/programmed death-ligand 1 (PD-L1) inhibitors, can provide survival benefits for patients with TNBC (55). However, some TNBC tumors can evade immune surveillance through multiple mechanisms. Combined with the lack of targetable driver mutations, this often leads to disease progression despite intensified chemotherapy. In summary, the heightened proliferative and metastatic potential driven by aberrant HER2 signaling in HER2-positive subtypes and the basal-like biological features of TNBC, along with the lack of specific targeted therapies and the complex immune microenvironment in TNBC, contribute to the greater aggressiveness and disease severity of these subtypes compared with the luminal A subtype. 
In the revised primary model excluding nodal status, age remained independently associated with severe disease, whereas ER was not independently significant after adjustment, and EGFR showed a positive but borderline association. In the sensitivity model including nodal status, axillary lymph node metastasis remained the strongest predictor, and EGFR positivity remained independently associated with severity. Lymph node metastasis, a strong prognostic indicator, typically signifies more extensive disease and poorer survival. Patients with lymph node metastases have been reported to have a significantly worse prognosis than those without (41-43). Consistent with prior research, ER negativity is associated with aggressive breast cancer and poor prognosis (56-58). Notably, our study identified EGFR positivity as a significant predictor of severe disease at presentation in the sensitivity analysis. This underscores the necessity of evaluating multiple factors and biomarkers for accurate assessment of breast cancer severity and prognosis.
Like prior studies, this study found that ER status and lymph node metastasis were closely linked to breast cancer severity. The association between EGFR positivity and severe disease is also consistent with previous findings linking EGFR to poorer outcomes. The findings suggest that future research should explore combinations of immunohistochemical markers to predict breast cancer prognosis. Moreover, EGFR’s role as a potential target for prevention and therapy should be further investigated. Unlike previous studies that focused on single biomarkers, the novelty of this study lies in its use of breast cancer severity (stages I–II vs. III–IV) as the grouping criterion and the integration of multiple immunohistochemical markers to construct a comprehensive predictive model. This approach provides a new tool for risk stratification. The revised primary nomogram showed moderate discrimination (apparent AUC =0.713; optimism-corrected AUC =0.679) with acceptable calibration after bootstrap internal validation. By incorporating multiple biomarkers, the model may help identify high-risk patients, particularly those with lymph node metastasis, EGFR positivity, and ER negativity, who may benefit from more intensive treatment strategies. Despite these findings, this study has several limitations. First, its single-center, retrospective design with limited sampling may introduce selection bias. Second, it did not include other potential factors, such as genetic and lifestyle factors, which could bias the findings. Third, although key immunohistochemical markers were included, factors such as tumor grade and lymphovascular invasion were not considered and may serve as potential directions for future model refinement. Prospective studies are needed to address these limitations.
Limitations
In addition, the events-per-variable (EPV) ratio was limited (EPV =9.5, i.e., 76 severe cases for the 8 prespecified predictors in the primary model), which may affect model stability. The use of a binary stage-defined endpoint (I–II vs. III–IV) may also have reduced clinical granularity. Although nodal status was strongly associated with severity, it was excluded from the primary model because of its structural overlap with stage and was retained only in the sensitivity analysis. Moreover, the study was retrospective, single-center, and internally validated only; therefore, external validation is required.
Conclusions
In the revised primary model excluding nodal status, age remained independently associated with severe breast cancer at presentation, while BMI and EGFR showed borderline positive associations, and ER was not independently significant after adjustment. In the sensitivity analysis, axillary lymph node metastasis remained the strongest predictor, and EGFR positivity remained independently associated with severity. The nomogram showed moderate discrimination after internal validation and acceptable calibration; however, external validation is required before clinical application. This study highlights key insights into breast cancer progression, especially for high-risk patients with lymph node metastasis, EGFR positivity, and ER negativity, who may benefit from more aggressive treatment strategies. While the retrospective design and single-center sample limit the study, its findings offer valuable evidence to enhance prognostic prediction models.
Acknowledgments
We would like to express our gratitude to Qinghai University Affiliated Hospital for providing the immunohistochemical data.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0116/rc
Data Sharing Statement: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0116/dss
Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0116/prf
Funding: This study was supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2026-1-0116/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Qinghai University Affiliated Hospital (No. QHDXYY-2023-IRB-056). The requirement for informed consent was waived by the Ethics Committee due to the retrospective design and anonymized nature of the data.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021;71:209-49.
- Siegel RL, Miller KD, Fuchs HE, et al. Cancer statistics, 2022. CA Cancer J Clin 2022;72:7-33.
- Han B, Zheng R, Zeng H, et al. Cancer incidence and mortality in China, 2022. J Natl Cancer Cent 2024;4:47-53.
- Xu Y, Gong M, Wang Y, et al. Global trends and forecasts of breast cancer incidence and deaths. Sci Data 2023;10:334.
- Fumagalli C, Barberis M. Breast Cancer Heterogeneity. Diagnostics (Basel) 2021;11:1555.
- Guo L, Kong D, Liu J, et al. Breast cancer heterogeneity and its implication in personalized precision therapy. Exp Hematol Oncol 2023;12:3.
- Carlino F, Solinas C, Orditura M, et al. Editorial: Heterogeneity in breast cancer: clinical and therapeutic implications. Front Oncol 2024;14:1321654.
- Abubakar M, Figueroa J, Ali HR, et al. Combined quantitative measures of ER, PR, HER2, and KI67 provide more prognostic information than categorical combinations in luminal breast cancer. Mod Pathol 2019;32:1244-56.
- Schulmeyer CE, Fasching PA, Häberle L, et al. Expression of the Immunohistochemical Markers CK5, CD117, and EGFR in Molecular Subtypes of Breast Cancer Correlated with Prognosis. Diagnostics (Basel) 2023;13:372.
- Lashen AG, Toss MS, Ghannam SF, et al. Expression, assessment and significance of Ki67 expression in breast cancer: an update. J Clin Pathol 2023;76:357-64.
- Zhao X, Yang X, Fu L, et al. Associations of Estrogen Receptor, Progesterone Receptor, Human Epidemic Growth Factor Receptor-2 and Ki-67 with Ultrasound Signs and Prognosis of Breast Cancer Patients. Cancer Manag Res 2021;13:4579-86.
- Lopez-Gonzalez L, Sanchez Cendra A, Sanchez Cendra C, et al. Exploring Biomarkers in Breast Cancer: Hallmarks of Diagnosis, Treatment, and Follow-Up in Clinical Practice. Medicina (Kaunas) 2024;60:168.
- Neves Rebello Alves L, Dummer Meira D, Poppe Merigueti L, et al. Biomarkers in Breast Cancer: An Old Story with a New End. Genes (Basel) 2023;14:1364.
- Sun H, Ding Q, Sahin AA. Immunohistochemistry in the Diagnosis and Classification of Breast Tumors. Arch Pathol Lab Med 2023;147:1119-32.
- Chand P, Garg A, Singla V, et al. Evaluation of Immunohistochemical Profile of Breast Cancer for Prognostics and Therapeutic Use. Niger J Surg 2018;24:100-6.
- Kwak Y, Jang SY, Choi JY, et al. Progesterone Receptor Expression Level Predicts Prognosis of Estrogen Receptor-Positive/HER2-Negative Young Breast Cancer: A Single-Center Prospective Cohort Study. Cancers (Basel) 2023;15:3435.
- Luo Y, Li Q, Fang J, et al. ER+/PR- phenotype exhibits more aggressive biological features and worse outcome compared with ER+/PR+ phenotype in HER2-negative inflammatory breast cancer. Sci Rep 2024;14:197.
- Abdullah N, Al-Mansouri L, Ali N, et al. Molecular and serological biomarkers to predict trastuzumab responsiveness in HER-2 positive breast cancer. J Med Life 2023;16:1633-8.
- Thomas R, Weihua Z. Rethink of EGFR in Cancer With Its Kinase Independent Function on Board. Front Oncol 2019;9:800.
- Hashmi AA, Hashmi SK, Irfan M, et al. Prognostic utility of epidermal growth factor receptor (EGFR) expression in prostatic acinar adenocarcinoma. Appl Cancer Res 2019;39:2.
- The Society of Breast Cancer, China Anti-Cancer Association; Breast Oncology Group of the Oncology Branch of the Chinese Medical Association. Guidelines for breast cancer diagnosis and treatment by China Anti-cancer Association (2024 edition). China Oncology 2023;33:1092-187. Available online: https://www.china-oncology.com/thesisDetails#10.19401/j.cnki.1007-3639.2023.12.004&lang=zh
- Đokić S, Gazić B, Grčar Kuzmanov B, et al. Clinical and Analytical Validation of Two Methods for Ki-67 Scoring in Formalin Fixed and Paraffin Embedded Tissue Sections of Early Breast Cancer. Cancers (Basel) 2024;16:1405.
- Bustreo S, Osella-Abate S, Cassoni P, et al. Optimal Ki67 cut-off for luminal breast cancer prognostic evaluation: a large case series study with a long-term follow-up. Breast Cancer Res Treat 2016;157:363-71.
- Davey MG, Hynes SO, Kerin MJ, et al. Ki-67 as a Prognostic Biomarker in Invasive Breast Cancer. Cancers (Basel) 2021;13:4455.
- Recommended by Breast Cancer Expert Panel. Guideline for HER2 detection in breast cancer, the 2019 version. Zhonghua Bing Li Xue Za Zhi 2019;48:169-75.
- Bhargava R, Gerald WL, Li AR, et al. EGFR gene amplification in breast cancer: correlation with epidermal growth factor receptor mRNA and protein expression and HER-2 status and absence of EGFR-activating mutations. Mod Pathol 2005;18:1027-33.
- Shawarby MA, Al-Tamimi DM, Ahmed A. Very low prevalence of epidermal growth factor receptor (EGFR) protein expression and gene amplification in Saudi breast cancer patients. Diagn Pathol 2011;6:57.
- Jiang YH, Cheng B, Ge MH, et al. The prognostic significance of p63 and Ki-67 expression in myoepithelial carcinoma. Head Neck Oncol 2012;4:9.
- Kim SK, Jung WH, Koo JS. p40 (ΔNp63) expression in breast disease and its correlation with p63 immunohistochemistry. Int J Clin Exp Pathol 2014;7:1032-41.
- Chu PG, Weiss LM. Expression of cytokeratin 5/6 in epithelial neoplasms: an immunohistochemical study of 509 cases. Mod Pathol 2002;15:6-10.
- Sun K, Peng F, Xu K, et al. A novel multivariate logistic model for predicting risk factors of failed treatment with carbapenem-resistant Acinetobacter baumannii ventilator-associated pneumonia. Front Public Health 2024;12:1385118.
- Li X, Li J, Hu Q, et al. Association of physical weight statuses defined by body mass index (BMI) with molecular subtypes of premenopausal breast cancer: a systematic review and meta-analysis. Breast Cancer Res Treat 2024;203:429-47.
- Lipsyc-Sharf M, Ballman KV, Campbell JD, et al. Age, Body Mass Index, Tumor Subtype, and Racial and Ethnic Disparities in Breast Cancer Survival. JAMA Netw Open 2023;6:e2339584.
- Kong YH, Huang JY, Ding Y, et al. The effect of BMI on survival outcome of breast cancer patients: a systematic review and meta-analysis. Clin Transl Oncol 2025;27:403-16.
- Pang Y, Wei Y, Kartsonaki C. Associations of adiposity and weight change with recurrence and survival in breast cancer patients: a systematic review and meta-analysis. Breast Cancer 2022;29:575-88.
- Brandt J, Garne JP, Tengrup I, et al. Age at diagnosis in relation to survival following breast cancer: a cohort study. World J Surg Oncol 2015;13:33.
- Ahn SK, Jung SY. Current Biomarkers for Precision Medicine in Breast Cancer. Adv Exp Med Biol 2021;1187:363-79.
- Wang YY, Wang T, Yu H, et al. Intraobserver reproducibility of Ki-67 assessment of breast cancers based on digital slide. Zhonghua Bing Li Xue Za Zhi 2020;49:1163-8.
- Lian CL, Zhang HY, Wang J, et al. Staging for Breast Cancer With Internal Mammary Lymph Nodes Metastasis: Utility of Incorporating Biologic Factors. Front Oncol 2020;10:584009.
- Dieci MV, Guarneri V. Should triple-positive breast cancer be recognized as a distinct subtype? Expert Rev Anticancer Ther 2020;20:1011-4.
- Wang B, Ding W, Sun K, et al. Impact of the 2018 ASCO/CAP guidelines on HER2 fluorescence in situ hybridization interpretation in invasive breast cancers with immunohistochemically equivocal results. Sci Rep 2019;9:16726.
- Ács B, Zámbó V, Vízkeleti L, et al. Ki-67 as a controversial predictive and prognostic marker in breast cancer patients treated with neoadjuvant chemotherapy. Diagn Pathol 2017;12:20.
- Tan S, Fu X, Xu S, et al. Quantification of Ki67 Change as a Valid Prognostic Indicator of Luminal B Type Breast Cancer After Neoadjuvant Therapy. Pathol Oncol Res 2021;27:1609972.
- Waks AG, Winer EP. Breast Cancer Treatment: A Review. JAMA 2019;321:288-300.
- Erić I, Petek Erić A, Koprivčić I, et al. Independent factors for poor prognosis in young patients with stage I-III breast cancer. Acta Clin Croat 2020;59:242-51.
- Li Z, Shen G, Shi M, et al. Association between high body mass index and prognosis of patients with early-stage breast cancer: A systematic review and meta-analysis. Cancer Pathog Ther 2023;1:205-15.
- Loibl S, Gianni L. HER2-positive breast cancer. Lancet 2017;389:2415-29.
- Dowsett M, Nielsen TO, A'Hern R, et al. Assessment of Ki67 in breast cancer: recommendations from the International Ki67 in Breast Cancer working group. J Natl Cancer Inst 2011;103:1656-64.
- Garrido-Castro AC, Lin NU, Polyak K. Insights into Molecular Classifications of Triple-Negative Breast Cancer: Improving Patient Selection for Treatment. Cancer Discov 2019;9:176-98.
- Costa RLB, Han HS, Gradishar WJ. Targeting the PI3K/AKT/mTOR pathway in triple-negative breast cancer: a review. Breast Cancer Res Treat 2018;169:397-406.
- Zagami P, Carey LA. Triple negative breast cancer: Pitfalls and progress. NPJ Breast Cancer 2022;8:95.
- Britten CD. PI3K and MEK inhibitor combinations: examining the evidence in selected tumor types. Cancer Chemother Pharmacol 2013;71:1395-409.
- Miricescu D, Totan A, Stanescu-Spinu II, et al. PI3K/AKT/mTOR Signaling Pathway in Breast Cancer: From Molecular Landscape to Clinical Aspects. Int J Mol Sci 2020;22:173.
- Shin SJ, Park I, Go H, et al. Immune environment of high-TIL breast cancer: triple negative and hormone receptor positive HER2 negative. NPJ Breast Cancer 2024;10:102.
- Cortes J, Rugo HS, Cescon DW, et al. Pembrolizumab plus Chemotherapy in Advanced Triple-Negative Breast Cancer. N Engl J Med 2022;387:217-26.
- Hu X, Chen W, Li F, et al. Expression changes of ER, PR, HER2, and Ki-67 in primary and metastatic breast cancer and its clinical significance. Front Oncol 2023;13:1053125.
- Xie P, An R, Yu S, et al. A novel immune subtype classification of ER-positive, PR-negative and HER2-negative breast cancer based on the genomic and transcriptomic landscape. J Transl Med 2021;19:398.
- Fu Y, Jiang J, Chen S, et al. Establishment of risk prediction nomogram for ipsilateral axillary lymph node metastasis in T1 breast cancer. Zhejiang Da Xue Xue Bao Yi Xue Ban 2021;50:81-9.


