The value of different machine learning radiomics based on DCE-MRI in predicting axillary lymph node status of breast cancer
Highlight box
Key findings
• A combined model integrating dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) radiomics and clinicopathologic data effectively predicted axillary lymph node (ALN) stages (N0, N1, N2–3) in breast cancer patients.
• The combined model achieved areas under the curve (AUCs) of 0.890 (training) and 0.854 (testing) for distinguishing N0 (no metastasis) vs. N+ (≥1 metastatic node).
• The combined model demonstrated high accuracy in differentiating N1 from N2–3 (AUC: 0.973 training, 0.835 testing) and multiclass prediction (micro-AUC: 0.861; macro-AUC: 0.812).
• Decision curve analysis confirmed superior clinical utility over radiomics-only or clinical-only models.
What is known and what is new?
• ALN status is a critical prognostic factor, but invasive biopsies [sentinel lymph node biopsy (SLNB)/axillary lymph node dissection (ALND)] carry complications. Existing radiomics studies focus on qualitative ALN metastasis or node counts, often overlooking tumor-node-metastasis (TNM) staging.
• This is the first study to predict American Joint Committee on Cancer (AJCC)/Union for International Cancer Control (UICC) TNM N stages (N0/N1/N2–3) using DCE-MRI radiomics of primary tumors (not lymph nodes) integrated with clinicopathologic data. The results were validated on a public cohort from The Cancer Genome Atlas-The Cancer Imaging Archive (TCGA-TCIA).
What is the implication, and what should change now?
• This noninvasive model could reduce unnecessary invasive biopsies (SLNB/ALND) by improving preoperative ALN staging accuracy, guiding tailored treatments (e.g., identifying N2–3 patients needing systemic therapy).
Introduction
Breast cancer (BC) is the leading cause of cancer-related mortality among women globally (1). Importantly, the prognosis of the disease varies significantly across different lymph node stages (N stages), necessitating different surgical and adjuvant treatment approaches (2,3). Initially, the N stage is evaluated through either sentinel lymph node biopsy (SLNB) or axillary lymph node dissection (ALND) on the basis of each patient’s individual circumstances (4,5). The National Comprehensive Cancer Network (NCCN) strongly recommends the application of preoperative systemic therapy in patients with N2–3 disease. Postmastectomy radiotherapy is also recommended for patients with N2–3 disease according to both the European Society for Medical Oncology and the American Society of Clinical Oncology (6). Nevertheless, both SLNB and ALND are invasive procedures and carry potential complications, such as numbness, seromas, lymphedema, and infections (7). Moreover, SLNB has been criticized for its high false-negative rate (8). Therefore, it is crucial to explore and develop noninvasive and accurate diagnostic methods for the preoperative assessment of the patient’s N stage. Such methods could reduce unnecessary lymphadenectomies, alleviate the psychological and physical burdens on patients, and minimize activity limitations and the risk of surgical complications. Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is widely used in clinical settings for identifying high-risk individuals, determining tumor stage, and evaluating the efficacy of neoadjuvant chemotherapy (NACT) (9). Despite its broad application (10,11), DCE-MRI generally requires manual annotation of a limited number of qualitative descriptors of the tumor, potentially introducing observer bias into the results (12,13).
Machine learning-based radiomics methods have received considerable attention, despite demonstrating various limitations in clinical trials (14,15). Radiomics involves the automated extraction of numerous quantitative image features that are often imperceptible to humans (16). High-dimensional data, including texture features, intensity, and shape, can be extracted via specialized software and analyzed with specific algorithms (17,18).
However, existing models focus primarily on the qualitative analysis of axillary lymph node metastasis (ALNM) (19), with only a few conducting quantitative assessments (20-22). Furthermore, these quantitative analyses often overlook the N stage as defined by the American Joint Committee on Cancer (AJCC)/Union for International Cancer Control (UICC) tumor-node-metastasis (TNM) staging system (8th edition) (23), a gap that is particularly critical to address for patients with supraclavicular or internal mammary lymph node metastasis.
To address these issues, we developed a machine learning framework that integrates DCE-MRI radiomics of primary tumors with clinicopathologic data from The Cancer Genome Atlas-The Cancer Imaging Archive (TCGA-TCIA) repository, aiming to noninvasively discriminate among N0, N1, and N2–3 disease. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1418/rc).
Methods
Study population
The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The dataset used in this study was obtained from the TCIA, specifically the Duke-Breast-Cancer-MRI collection (24). The retrieved data comprise preoperative DCE-MRI scans of 922 patients with biopsy-confirmed BC, including details of the tumor characteristics. All MRI examinations were performed at initial diagnosis, prior to any systemic or neoadjuvant therapy. The use of this public database adhered to the citation requirements and data use policies outlined on the TCGA-TCIA website. This study was exempt from institutional review board oversight, as patient identifiers are not accessible to database users.
Among the 922 patients, 622 had unilateral BC. The axillary lymph node (ALN) status for each patient was determined through postoperative pathological evaluation and subsequently classified according to the N classification of the AJCC TNM staging system, 8th edition. Following the exclusion of 18 patients lacking lymph node pathology data, 604 patients were included in the study.
Imaging data
Preoperative DCE-MRI scan parameters were obtained from the TCIA. Both 1.5T and 3T MRI scanners were used, and the patients were scanned primarily in the prone position. Images from the following MR sequences were obtained in DICOM format: a fat-saturated gradient echo T1-weighted precontrast sequence, a non-fat-saturated T1-weighted sequence, and four postcontrast T1-weighted sequences acquired after the administration of intravenous contrast material (weight-based protocol, 0.2 mL/kg). The scanner and MRI acquisition parameters have been documented in detail in previous studies (25,26).
Clinicopathologic and radiological analysis
Clinicopathologic and radiological data were acquired from the TCIA. Clinicopathologic parameters, such as age, menopausal status, tumor location, histologic type, Nottingham grade, and T stage (tumor size), were retrospectively retrieved and analyzed. In addition, we analyzed the following imaging parameters: multicentricity, lymphadenopathy, skin or nipple involvement, and chest involvement.
Radiomics feature extraction and selection
A compilation of 529 computer-extracted imaging features was procured from TCIA (27). This dataset included commonly published features from the literature alongside uniquely extracted features. The patients were divided into training and testing cohorts at a ratio of 8:2. In the training cohort, dimensionality reduction and feature selection proceeded as follows. First, all feature values were normalized via z score normalization, in which features are rescaled to a mean of zero and a standard deviation of one, ensuring comparability across features with varying scales. Spearman’s correlation analysis was then used to assess correlations between features; for each pair of features whose correlation coefficient exceeded 0.9, only one feature was retained to minimize multicollinearity. The minimum redundancy maximum relevance (mRMR) algorithm was employed to identify the most pertinent features for tumor classification. Finally, a least absolute shrinkage and selection operator (LASSO) regression model with 10-fold cross-validation was developed, retaining features with nonzero coefficients. All procedures were validated in the data of the test cohort.
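The first two selection steps described above (z score normalization and the Spearman redundancy filter) can be illustrated with a minimal sketch. It uses only the Python standard library; the feature names and values are hypothetical toy data, and the greedy keep-first rule is an assumption, since the text does not state which member of a correlated pair was retained. The mRMR and LASSO steps are not reproduced here.

```python
import math
import statistics


def zscore(column):
    """Rescale one feature column to mean 0 and (population) standard deviation 1.

    Assumes the column is not constant (a constant column has zero deviation).
    """
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(v - mean) / sd for v in column]


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def drop_redundant(features, threshold=0.9):
    """Greedy filter: keep a feature only if |rho| <= threshold with all kept ones."""
    kept = []
    for name, column in features.items():
        if all(abs(spearman(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy feature table (names and values are illustrative only).
toy = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "volume_cubed": [1.0, 8.0, 27.0, 64.0, 125.0],  # monotone in volume, rho = 1
    "entropy": [2.0, 5.0, 1.0, 4.0, 3.0],
}
scaled = {name: zscore(col) for name, col in toy.items()}
selected = drop_redundant(scaled)
print(selected)
```

In practice the mean and standard deviation would be estimated on the training cohort only and then reused for the test cohort, so that no information from the test data enters the selection.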
Machine learning model construction
Radiomics models were constructed using the final set of selected features. Machine learning techniques were applied to create precise, objective, and reliable models to aid in clinical decision-making (28). Four widely used algorithms were evaluated: support vector machine (SVM), random forest (RF), logistic regression (LR), and extreme gradient boosting (XGBoost). The diagnostic models were compared via metrics such as the area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
Clinical and combined model construction
Univariable LR analysis was conducted on the clinicopathologic and radiological characteristics. Multivariable LR analysis was subsequently conducted for those features that were deemed significantly different between the groups in the univariable analysis to determine the final predictor variables for model development.
The optimal radiomics model was integrated with the clinical predictors to create a combined model. The performance of the combined model was assessed via receiver operating characteristic (ROC) curves, whereas the clinical efficacy of the model in tumor classification was evaluated via decision curve analysis (DCA), which quantifies the net benefit across various threshold probabilities.
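For readers unfamiliar with DCA, the net benefit at a threshold probability p_t is commonly defined as TP/n − (FP/n) × p_t/(1 − p_t), and is compared against a "treat all" strategy. The sketch below illustrates that standard definition on hypothetical predicted probabilities and labels; it is not the authors' implementation, and the function names are illustrative.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on every case whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient regardless of the model output."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted probabilities and true labels (1 = node-positive).
probs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

for pt in (0.2, 0.5):
    print(pt, net_benefit(probs, labels, pt), net_benefit_treat_all(labels, pt))
```

A decision curve plots these quantities over a range of thresholds; a model is clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (treat none).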
Statistical analysis
Statistical analysis was performed via SPSS Version 25.0 software, Python version 3.5.6, and R version 3.5.3. Independent t-tests were used to compare continuous variables, and chi-square tests or Fisher’s exact tests were used to compare categorical variables. Statistical significance was determined by a two-tailed P value less than 0.05.
Results
Prediction of ALN status between N0 and N+ (≥1 metastatic ALN)
A total of 605 patients were included in the study (Table 1). Univariable and multivariable LR analyses revealed that multifocal tumor status, T stage, lymphadenopathy or suspicious nodes on MRI, and metastasis outside the lymph nodes were independent predictors of ALNM (Table 2). LASSO regression analysis was employed for dimensionality reduction (λ=0.0295).
Table 1
| Patient characteristics and MRI features | N0 and N ≥1: all patients (n=605) | N0 and N ≥1: testing cohort (n=101) | N0 and N ≥1: training cohort (n=504) | N0 and N ≥1: P value | N1 and N2–3: all patients (n=246) | N1 and N2–3: testing cohort (n=49) | N1 and N2–3: training cohort (n=197) | N1 and N2–3: P value |
|---|---|---|---|---|---|---|---|---|
| Age (years) | 52.09±11.63 | 52.37±11.20 | 52.03±11.72 | 0.71 | 50.10±11.43 | 51.69±11.02 | 49.70±11.52 | 0.27 |
| Menopause | | | | 0.55 | | | | 0.68 |
| Pre-menopause | 281 (46.45) | 42 (41.58) | 239 (47.42) | | 113 (45.93) | 25 (51.02) | 88 (44.67) | |
| Post-menopause | 314 (51.90) | 57 (56.44) | 257 (50.99) | | 4 (1.63) | 1 (2.04) | 3 (1.52) | |
| Not available | 10 (1.65) | 2 (1.98) | 8 (1.59) | | 129 (52.44) | 23 (46.94) | 106 (53.81) | |
| Histologic type | | | | 0.84 | | | | 0.26 |
| No special type | 542 (89.59) | 92 (91.09) | 450 (89.29) | | 219 (89.02) | 42 (85.71) | 177 (89.85) | |
| Lobular | 53 (8.76) | 9 (8.91) | 44 (8.73) | | 23 (9.35) | 7 (14.29) | 16 (8.12) | |
| Metaplastic | 1 (0.17) | 0 | 1 (0.20) | | 0 | 0 | 0 | |
| Tubular | 2 (0.33) | 0 | 2 (0.40) | | 0 | 0 | 0 | |
| Mucinous | 3 (0.50) | 0 | 3 (0.60) | | 0 | 0 | 0 | |
| Not available | 4 (0.66) | 0 | 4 (0.79) | | 4 (1.63) | 0 | 4 (2.03) | |
| T staging (tumor size) | | | | 0.52 | | | | 0.59 |
| T1 | 277 (45.79) | 52 (51.49) | 225 (44.64) | | 70 (28.46) | 11 (22.45) | 59 (29.95) | |
| T2 | 254 (41.98) | 41 (40.59) | 213 (42.26) | | 110 (44.72) | 24 (48.98) | 86 (43.65) | |
| T3 | 52 (8.60) | 6 (5.94) | 46 (9.13) | | 46 (18.70) | 10 (20.41) | 36 (18.27) | |
| T4 | 17 (2.81) | 2 (1.98) | 15 (2.98) | | 15 (6.10) | 4 (8.16) | 11 (5.58) | |
| Not available | 5 (0.83) | 0 | 5 (0.99) | | 5 (2.03) | 0 | 5 (2.54) | |
| Nottingham grade | | | | 0.41 | | | | 0.46 |
| Low | 100 (16.53) | 19 (18.81) | 81 (16.07) | | 25 (10.16) | 7 (14.29) | 18 (9.14) | |
| Intermediate | 294 (48.60) | 54 (53.47) | 240 (47.62) | | 127 (51.63) | 21 (42.86) | 106 (53.81) | |
| High | 199 (32.89) | 26 (25.74) | 173 (34.33) | | 87 (35.37) | 20 (40.82) | 67 (34.01) | |
| Not available | 12 (1.98) | 2 (1.98) | 10 (1.98) | | 7 (2.85) | 1 (2.04) | 6 (3.05) | |
| PR | | | | 0.46 | | | | >0.99 |
| Negative | 220 (36.36) | 33 (32.67) | 187 (37.10) | | 101 (41.06) | 20 (40.82) | 81 (41.12) | |
| Positive | 385 (63.64) | 68 (67.33) | 317 (62.90) | | 145 (58.94) | 29 (59.18) | 116 (58.88) | |
| ER | | | | 0.26 | | | | >0.99 |
| Negative | 149 (24.63) | 20 (19.80) | 129 (25.60) | | 68 (27.64) | 14 (28.57) | 54 (27.41) | |
| Positive | 456 (75.37) | 81 (80.20) | 375 (74.40) | | 178 (72.36) | 35 (71.43) | 143 (72.59) | |
| HER2 | | | | 0.26 | | | | 0.32 |
| Negative | 494 (81.65) | 78 (77.23) | 416 (82.54) | | 185 (75.20) | 40 (81.63) | 145 (73.60) | |
| Positive | 111 (18.35) | 23 (22.77) | 88 (17.46) | | 61 (24.80) | 9 (18.37) | 52 (26.40) | |
| Triple negative | | | | 0.23 | | | | 0.20 |
| No | 493 (81.49) | 87 (86.14) | 406 (80.56) | | 199 (80.89) | 36 (73.47) | 163 (82.74) | |
| Yes | 112 (18.51) | 14 (13.86) | 98 (19.44) | | 47 (19.11) | 13 (26.53) | 34 (17.26) | |
| Metastasis (outside of lymph nodes) | | | | >0.99 | | | | 0.31 |
| No | 590 (97.52) | 98 (97.03) | 492 (97.62) | | 231 (93.90) | 44 (89.80) | 187 (94.92) | |
| Yes | 15 (2.48) | 3 (2.97) | 12 (2.38) | | 15 (6.10) | 5 (10.20) | 10 (5.08) | |
| MRI | | | | | | | | |
| Multicentric | | | | 0.20 | | | | 0.07 |
| No | 355 (58.68) | 53 (52.48) | 302 (59.92) | | 115 (46.75) | 29 (59.18) | 86 (43.65) | |
| Yes | 250 (41.32) | 48 (47.52) | 202 (40.08) | | 131 (53.25) | 20 (40.82) | 111 (56.35) | |
| Lymphadenopathy | | | | 0.71 | | | | 0.94 |
| No | 402 (66.45) | 65 (64.36) | 337 (66.87) | | 99 (40.24) | 19 (38.78) | 80 (40.61) | |
| Yes | 203 (33.55) | 36 (35.64) | 167 (33.13) | | 147 (59.76) | 30 (61.22) | 117 (59.39) | |
| Skin or nipple involvement | | | | 0.02 | | | | 0.12 |
| No | 539 (89.09) | 83 (82.18) | 456 (90.48) | | 202 (82.11) | 36 (73.47) | 166 (84.26) | |
| Yes | 66 (10.91) | 18 (17.82) | 48 (9.52) | | 44 (17.89) | 13 (26.53) | 31 (15.74) | |
| Chest involvement | | | | >0.99 | | | | 0.37 |
| No | 594 (98.18) | 99 (98.02) | 495 (98.21) | | 242 (98.37) | 47 (95.92) | 195 (98.98) | |
| Yes | 11 (1.82) | 2 (1.98) | 9 (1.79) | | 4 (1.63) | 2 (4.08) | 2 (1.02) | |
Data are shown as mean ± standard deviation or n (%). ER, estrogen receptor; HER2, human epidermal growth factor receptor 2; MRI, magnetic resonance imaging; N, lymph node staging; PR, progesterone receptor; T, tumor size staging.
Table 2
| Variable | Univariate analysis: OR (95% CI) | Univariate analysis: P value | Multivariate analysis: OR (95% CI) | Multivariate analysis: P value | |
|---|---|---|---|---|---|
| Age | 0.994 (0.991–0.997) | <0.001 | 0.997 (0.994–1.000) | 0.10 | |
| Menopause | 0.916 (0.862–0.975) | 0.02 | 1.041 (0.969–1.116) | 0.35 | |
| Histologic type | 0.948 (0.902–0.995) | 0.07 | |||
| Nottingham grade | 1.053 (1.008–1.101) | 0.053 | |||
| PR | 0.921 (0.860–0.986) | 0.047 | 1.033 (0.973–1.096) | 0.37 | |
| ER | 0.936 (0.868–1.010) | 0.15 | |||
| HER2 | 1.191 (1.095–1.296) | 0.001 | 1.080 (1.004–1.162) | 0.08 | |
| Triple negative | 1.016 (0.933–1.106) | 0.75 | |||
| T staging (tumor size) | 1.248 (1.198–1.301) | <0.001 | 1.149 (1.103–1.196) | <0.001 | |
| Metastasis (outside of lymph nodes) | 1.838 (1.493–2.261) | <0.001 | 1.374 (1.145–1.649) | 0.004 | |
| MRI | |||||
| Multifocal | 1.222 (1.143–1.305) | <0.001 | 1.145 (1.081–1.212) | <0.001 | |
| Skin or nipple involvement | 1.339 (1.207–1.486) | <0.001 | 1.034 (0.939–1.138) | 0.57 | |
| Lymphadenopathy or suspicious nodes | 1.613 (1.516–1.716) | <0.001 | 1.447 (1.358–1.543) | <0.001 | |
| Chest involvement | 0.957 (0.748–1.225) | 0.77 | |||
CI, confidence interval; ER, estrogen receptor; HER2, human epidermal growth factor receptor-2; MRI, magnetic resonance imaging; N, lymph node staging; OR, odds ratio; PR, progesterone receptor.
The results of ROC curve analysis are displayed in Table 3. The XGBoost model achieved an AUC of 0.999 in the training cohort, indicating almost perfect performance. However, its AUC decreased markedly to 0.776 in the test cohort, reflecting relatively poor generalizability to unseen data. This discrepancy suggests overfitting, a condition wherein the model captures noise and specific patterns in the training data that do not generalize well to independent datasets. In contrast, the SVM model performed more consistently, with AUCs of 0.869 in the training cohort and 0.813 in the test cohort, the latter being the highest test AUC among the four radiomics models; it was therefore selected as the optimal model for this study.
Table 3
| Model | AUC (95% CI) | Accuracy | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|---|
| LR | ||||||
| Training | 0.776 (0.735–0.817) | 0.724 | 0.753 | 0.683 | 0.776 | 0.654 |
| Test | 0.801 (0.715–0.887) | 0.762 | 0.867 | 0.610 | 0.765 | 0.758 |
| SVM* | ||||||
| Training | 0.869 (0.836–0.902) | 0.794 | 0.786 | 0.805 | 0.855 | 0.721 |
| Test | 0.813 (0.729–0.896) | 0.733 | 0.633 | 0.878 | 0.884 | 0.621 |
| RF | ||||||
| Training | 0.999 (0.997–0.999) | 0.980 | 0.983 | 0.976 | 0.983 | 0.976 |
| Test | 0.733 (0.637–0.828) | 0.683 | 0.700 | 0.659 | 0.750 | 0.600 |
| XGBoost | ||||||
| Training | 0.999 (0.997–0.999) | 0.986 | 1.000 | 0.966 | 0.977 | 1.000 |
| Test | 0.776 (0.684–0.868) | 0.743 | 0.783 | 0.683 | 0.783 | 0.683 |
| Clinical* | ||||||
| Training | 0.750 (0.701–0.797) | 0.760 | 0.823 | 0.741 | 0.783 | 0.721 |
| Test | 0.807 (0.711–0.901) | 0.802 | 0.850 | 0.789 | 0.823 | 0.769 |
| Combined* | ||||||
| Training | 0.890 (0.860–0.919) | 0.825 | 0.856 | 0.780 | 0.850 | 0.788 |
| Test | 0.854 (0.779–0.928) | 0.792 | 0.850 | 0.707 | 0.810 | 0.763 |
*, models were constructed using SVM. AUC, area under the curve; CI, confidence interval; LR, logistic regression; N, lymph node staging; NPV, negative predictive value; PPV, positive predictive value; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.
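The threshold-dependent metrics in Table 3 all derive from the four cells of a confusion matrix. The sketch below shows the standard formulas. The counts are not reported in the article; they are an assumption, inferred here as one set of integers (60 positive and 41 negative cases, n=101) that reproduces the rounded values in the SVM test row.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Assumed counts (inferred, not reported by the authors).
metrics = diagnostic_metrics(tp=38, fn=22, tn=36, fp=5)
rounded = {name: round(value, 3) for name, value in metrics.items()}
print(rounded)
```

Because PPV and NPV depend on the proportion of positive cases, they should be interpreted with the cohort composition in mind, whereas sensitivity and specificity do not depend on it.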
By incorporating radiomics features with clinicopathologic and radiological predictors, a combined model was constructed, which demonstrated superior discriminatory capacity over the individual clinical and radiomics feature models in both the training and test cohorts, as shown in its higher AUC values (0.890 in the training cohort, 0.854 in the testing cohort) (Figure 1). Figure 2 displays the DCA plots for the three models, demonstrating that the combined model provided the greatest net benefit in classifying ALN involvement in both the training and test cohorts.
Comparison between N1 and N2–3 patients
For this analysis, N1 BC was adopted as the negative reference standard (Table 1). In the training cohort, the AUC of the radiomics model was 0.972, whereas that of the clinicopathological model was only 0.505. Combining the clinical predictors with the radiomics model yielded an AUC of 0.973 in the training cohort. In the test cohort, the AUC of the combined model decreased but remained at a reasonable value of 0.835. The details of the statistical results are summarized in Table 4. Figure 3 shows the corresponding ROC curves for the comparisons, whereas Figure 4 presents the DCA plots for the three models.
Table 4
| Model | AUC (95% CI) | Accuracy | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|---|
| LR | ||||||
| Training | 0.815 (0.752–0.877) | 0.665 | 0.575 | 0.922 | 0.955 | 0.431 |
| Test | 0.846 (0.709–0.983) | 0.878 | 0.944 | 0.692 | 0.895 | 0.818 |
| SVM* | ||||||
| Training | 0.972 (0.942–1.000) | 0.934 | 0.925 | 0.961 | 0.985 | 0.817 |
| Test | 0.848 (0.726–0.969) | 0.714 | 0.639 | 0.923 | 0.958 | 0.480 |
| RF | ||||||
| Training | 0.999 (0.997–1.000) | 0.995 | 0.993 | 1.000 | 1.000 | 0.981 |
| Test | 0.743 (0.580–0.904) | 0.735 | 0.778 | 0.615 | 0.848 | 0.500 |
| XGBoost | ||||||
| Training | 1.000 (1.000–1.000) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Test | 0.712 (0.532–0.891) | 0.714 | 0.722 | 0.750 | 0.867 | 0.474 |
| Clinical* | ||||||
| Training | 0.505 (0.406–0.604) | 0.685 | 0.801 | 0.367 | 0.780 | 0.383 |
| Test | 0.717 (0.548–0.886) | 0.776 | 0.889 | 0.462 | 0.821 | 0.600 |
| Combined* | ||||||
| Training | 0.973 (0.944–1.000) | 0.934 | 0.932 | 0.941 | 0.978 | 0.828 |
| Test | 0.835 (0.708–0.963) | 0.694 | 0.611 | 0.923 | 0.957 | 0.462 |
*, models were constructed using SVM. AUC, area under the curve; CI, confidence interval; LR, logistic regression; N, lymph node staging; NPV, negative predictive value; PPV, positive predictive value; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.
Prediction of the N stage for N0, N1 and N2–3 patients
We then extended the combined model to a three-class task for predicting ALN status. The clinical endpoints were divided into three groups, the N0, N1 and N2–3 groups, with 359, 182 and 64 lesions, respectively. In multiclass classification, the micro-AUC evaluates overall predictive capability by aggregating predictions across all classes into a single ROC curve, giving equal weight to each instance. The macro-AUC, on the other hand, averages the AUCs of the individual classes, treating all classes equally in an attempt to highlight balanced performance across them. In this study, the combined model achieved a micro-AUC of 0.861, indicating strong overall predictive performance, and a macro-AUC of 0.812, demonstrating some level of effectiveness in maintaining balanced prediction across the N0, N1, and N2–3 groups. These metrics collectively highlight the robustness and generalizability of the combined model. The confusion matrix is depicted in Figure 5, while the ROC curves for the combined model are illustrated in Figure 6. Notably, the combined model demonstrated favorable performance in distinguishing between the N0 and N2–3 groups but yielded less satisfactory outcomes in identifying the N1 group.
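The distinction between the two averages can be made concrete with a small one-vs-rest sketch. The predicted probabilities and labels below are hypothetical, and the pooling scheme shown for the micro-AUC is the common convention rather than a confirmed description of how the values in this study were computed.

```python
def binary_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(prob_rows, y_true, classes):
    """Unweighted mean of the one-vs-rest AUCs, one per class."""
    per_class = []
    for idx, cls in enumerate(classes):
        scores = [row[idx] for row in prob_rows]
        labels = [1 if y == cls else 0 for y in y_true]
        per_class.append(binary_auc(scores, labels))
    return sum(per_class) / len(per_class)


def micro_auc(prob_rows, y_true, classes):
    """Single AUC over all (patient, class) pairs pooled together."""
    scores, labels = [], []
    for row, y in zip(prob_rows, y_true):
        for idx, cls in enumerate(classes):
            scores.append(row[idx])
            labels.append(1 if y == cls else 0)
    return binary_auc(scores, labels)


# Hypothetical predictions for six patients (columns: N0, N1, N2-3).
classes = ["N0", "N1", "N2-3"]
y_true = ["N0", "N0", "N1", "N1", "N2-3", "N2-3"]
prob_rows = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.5, 0.2],
    [0.6, 0.3, 0.1],  # an N1 case scored as most likely N0
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]
macro = macro_auc(prob_rows, y_true, classes)
micro = micro_auc(prob_rows, y_true, classes)
print(round(macro, 3), round(micro, 3))
```

In this toy example the macro-AUC is pulled down by the weakest class (N1), while the pooled micro-AUC is less affected, which mirrors the pattern of a micro-AUC exceeding the macro-AUC when one class is harder to identify.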
Discussion
Clinical significance and current challenges in predicting ALNM
ALNM is a well-established prognostic indicator in BC and significantly influences both the clinical course and treatment decisions for affected patients (21,22). Several studies have employed imaging modalities such as ultrasound (US), mammography (MMG), and MRI to evaluate the utility of breast radiomics in diagnosing BC to identify prognostic factors and to predict therapeutic responses (29-32).
However, research specifically focusing on the application of breast radiomics for predicting ALNM remains limited (33). Most existing studies have focused primarily on the qualitative analysis of ALNM (29,34), with only a few addressing the quantitative assessment of this burden in BC. These studies typically aimed to distinguish between patients with low-load ALNM, defined by one to two positive nodes (N1–2), and those with heavy-load metastasis, defined by three or more positive nodes (N ≥3) (33,35), or, relatedly, to differentiate between (N1–3) and (N ≥4) groups (36). However, these quantitative analyses largely rely on the number of ALNs as the sole classification criterion, neglecting the classification based on the N stage itself in the conventional TNM classification system. This oversight is particularly important when considering metastasis to supraclavicular or internal mammary lymph nodes.
Previous studies have shown that radiomics features obtained from ALNs may aid in predicting ALNM in BC patients (37,38). However, this approach is limited by some challenges, including discrepancies between imaging and pathological findings and the limited scanning scope of standard MRI. Additionally, biases may be introduced by excluding patients with small ALNs to mitigate issues associated with manual delineation.
Notably, models built from radiomics features derived from breast tumor images captured during the peak enhancement phase of DCE-MRI outperform models relying on features from the initial imaging phase in predicting ALNM (39,40). This greater predictive power likely results from the improved visibility of tumor heterogeneity and aggressive characteristics during peak contrast enhancement (41).
Novelty and key findings of the present study
This study is the first to demonstrate the potential of integrating radiomics features extracted from DCE-MRI with clinicopathologic characteristics to predict the N stage of ALN in BC patients. A unique aspect of this study is the classification of lymph nodes according to the conventional TNM staging system rather than a reliance solely on the number of metastatic nodes. This approach specifically distinguishes among patients with stages N0, N1, and N2–3. This stratification is clinically significant, as different N stages are associated with different patient outcomes and therapeutic strategies. The results highlighted the superior performance of the combined model, with AUC values of 0.890 and 0.854 in the training and test cohorts, respectively, in differentiating between patients with no ALNM (N0) and those with at least one metastatic lymph node (N+). Furthermore, ROC curve analysis yielded AUC values of 0.973 and 0.835 in the training and test cohorts, respectively, demonstrating the model’s high accuracy in differentiating between the N1 and N2–3 stages. Importantly, the model also performs well in distinguishing among the three categories (N0, N1, and N2–3), and thus may reduce the need for invasive procedures such as SLNB and ALND.
The integrated model demonstrated increased diagnostic capability because of the synergistic interaction between radiomics and clinicopathologic features. Radiomics texture metrics, such as entropy and uniformity, reflect tumor heterogeneity (42), whereas shape parameters, such as sphericity and compactness, capture invasive growth patterns (43). Meanwhile, clinicopathologic variables, such as tumor size (T stage), multifocality, and lymphadenopathy, provide an essential clinical context.
In contrast to single-center studies, our research utilized publicly available datasets from the TCGA-TCIA repository (44). Developed by leading institutions, these resources ensure reliability through rigorous quality control and standardized protocols. The comprehensive nature of these datasets enhances their statistical power and generalizability while minimizing the time and resources required for data acquisition by researchers.
Limitations and future directions
While the findings are promising, there are notable limitations in this study. First, differences in imaging acquisition parameters may have introduced heterogeneities in the dataset. While such differences can improve generalizability, they may also affect the reproducibility of the radiomics features. Future work should prioritize standardized imaging acquisition parameters to address this issue (45). Second, the exclusion of patients with bilateral BC limits the model’s applicability to unilateral cases. The development of models that can predict ALNM in patients with bilateral BC remains crucial for future research. Future studies should also explore the integration of multimodal data, such as genomic and metabolic data, to refine the model’s predictive accuracy (46). Third, it is important to acknowledge that the pathological N staging employed as our reference standard did not distinguish between macrometastases, micrometastases, and isolated tumor cells, due to the inconsistent availability of such detailed information within the dataset. Furthermore, deep learning frameworks, such as convolutional neural networks (CNNs) (47), represent a critical frontier for exploration (48). Unlike traditional radiomics, which relies on manually curated features, deep learning can automatically extract complex patterns from imaging data, offering the potential to identify novel biomarkers for ALNM (49).
Conclusions
This study demonstrates that integrating DCE-MRI radiomics of the primary tumor with clinicopathologic data provides a powerful, non-invasive tool for accurately predicting AJCC/UICC TNM N stage in BC patients, offering significant potential to refine preoperative planning and reduce reliance on invasive axillary procedures.
Acknowledgments
This work utilized the breast cancer dataset (Duke-Breast-Cancer-MRI cohort) obtained from The Cancer Imaging Archive (TCIA). We would like to thank TCIA, the National Cancer Institute, and Duke University for creating and maintaining this public resource. The data of 922 breast cancer patients used in this study are available at https://doi.org/10.7937/TCIA.e3sv-re93. We also thank all participants in these studies, as well as the OnekeyAI platform and its developers, for their assistance.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1418/rc
Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1418/prf
Funding: None.
Conflicts of Interest: Both authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1418/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021;71:209-49.
- Tirada N, Aujero M, Khorjekar G, et al. Breast Cancer Tissue Markers, Genomic Profiling, and Other Prognostic Factors: A Primer for Radiologists. Radiographics 2018;38:1902-20.
- Gradishar WJ, Moran MS, Abraham J, et al. Breast Cancer, Version 3.2022, NCCN Clinical Practice Guidelines in Oncology. J Natl Compr Canc Netw 2022;20:691-722.
- Giuliano AE, Ballman K, McCall L, et al. Locoregional Recurrence After Sentinel Lymph Node Dissection With or Without Axillary Dissection in Patients With Sentinel Lymph Node Metastases: Long-term Follow-up From the American College of Surgeons Oncology Group (Alliance) ACOSOG Z0011 Randomized Trial. Ann Surg 2016;264:413-20.
- Giuliano AE, Ballman KV, McCall L, et al. Effect of Axillary Dissection vs No Axillary Dissection on 10-Year Overall Survival Among Women With Invasive Breast Cancer and Sentinel Node Metastasis: The ACOSOG Z0011 (Alliance) Randomized Clinical Trial. JAMA 2017;318:918-26.
- Aebi S, Davidson T, Gruber G, et al. Primary breast cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 2010;21:v9-14.
- Abass MO, Gismalla MDA, Alsheikh AA, et al. Axillary Lymph Node Dissection for Breast Cancer: Efficacy and Complication in Developing Countries. J Glob Oncol 2018;4:1-8.
- Qiu SQ, Zhang GJ, Jansen L, et al. Evolution in sentinel lymph node biopsy in breast cancer. Crit Rev Oncol Hematol 2018;123:83-94.
- Eom HJ, Choi WJ, Sun YJ, et al. Preoperative breast MRI in HER2-positive/hormone receptor-negative breast cancer: surgical outcomes using propensity score matching. Eur Radiol 2025;35:5648-57.
- Choi EJ, Youk JH, Choi H, et al. Dynamic contrast-enhanced and diffusion-weighted MRI of invasive breast cancer for the prediction of sentinel lymph node status. J Magn Reson Imaging 2020;51:615-26.
- Zhao M, Wu Q, Guo L, et al. Magnetic resonance imaging features for predicting axillary lymph node metastasis in patients with breast cancer. Eur J Radiol 2020;129:109093.
- Chang JM, Leung JWT, Moy L, et al. Axillary Nodal Evaluation in Breast Cancer: State of the Art. Radiology 2020;295:500-15.
- Samiei S, Smidt ML, Vanwetswinkel S, et al. Diagnostic performance of standard breast MRI compared to dedicated axillary MRI for assessment of node-negative and node-positive breast cancer. Eur Radiol 2020;30:4212-22.
- Qi TH, Hian OH, Kumaran AM, et al. Multi-center evaluation of artificial intelligent imaging and clinical models for predicting neoadjuvant chemotherapy response in breast cancer. Breast Cancer Res Treat 2022;193:121-38.
- Vithayathil M, Koku D, Campani C, et al. Machine learning based radiomic models outperform clinical biomarkers in predicting outcomes after immunotherapy for hepatocellular carcinoma. J Hepatol 2025;83:959-70.
- Wang K, Lu X, Zhou H, et al. Deep learning Radiomics of shear wave elastography significantly improved diagnostic performance for assessing liver fibrosis in chronic hepatitis B: a prospective multicentre study. Gut 2019;68:729-41.
- Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology 2016;278:563-77.
- Sun M, Wang J, Xu P, et al. Development and validation of MRI-based radiomics model for clinical symptom stratification of extrinsic adenomyosis. Ann Med 2025;57:2534521.
- Qu L, Mei X, Yi Z, et al. An unsupervised learning model based on CT radiomics features accurately predicts axillary lymph node metastasis in breast cancer patients: diagnostic study. Int J Surg 2024;110:5363-73.
- Hong M, Fan S, Xu Z, et al. MRI radiomics and biological correlations for predicting axillary lymph node burden in early-stage breast cancer. J Transl Med 2024;22:826.
- Bhushan A, Gonsalves A, Menon JU. Current State of Breast Cancer Diagnosis, Treatment, and Theranostics. Pharmaceutics 2021;13:723.
- Allison KH. Prognostic and predictive parameters in breast pathology: a pathologist's primer. Mod Pathol 2021;34:94-106.
- Sawaki M, Shien T, Iwata H. TNM classification of malignant tumors (Breast Cancer Study Group). Jpn J Clin Oncol 2019;49:228-31.
- Saha A, Harowicz MR, Grimm LJ, et al. A machine learning approach to radiogenomics of breast cancer: a study of 922 subjects and 529 DCE-MRI features. Br J Cancer 2018;119:508-16.
- Grimm LJ, Zhang J, Mazurowski MA. Computational approach to radiogenomics of breast cancer: Luminal A and luminal B molecular subtypes are associated with imaging features on routine breast MRI extracted using computer vision algorithms. J Magn Reson Imaging 2015;42:902-7.
- Mazurowski MA, Grimm LJ, Zhang J, et al. Recurrence-free survival in breast cancer is associated with MRI tumor enhancement dynamics quantified using computer algorithms. Eur J Radiol 2015;84:2117-22.
- Saha A, Yu X, Sahoo D, et al. Effects of MRI scanner parameters on breast cancer radiomics. Expert Syst Appl 2017;87:384-91.
- Parmar C, Grossmann P, Bussink J, et al. Machine Learning methods for Quantitative Radiomic Biomarkers. Sci Rep 2015;5:13087.
- Chen Y, Wang L, Dong X, et al. Deep Learning Radiomics of Preoperative Breast MRI for Prediction of Axillary Lymph Node Metastasis in Breast Cancer. J Digit Imaging 2023;36:1323-31.
- Bae MS, Shin SU, Ryu HS, et al. Pretreatment MR Imaging Features of Triple-Negative Breast Cancer: Association with Response to Neoadjuvant Chemotherapy and Recurrence-Free Survival. Radiology 2016;281:392-400.
- Zhou J, Zhang Y, Chang KT, et al. Diagnosis of Benign and Malignant Breast Lesions on DCE-MRI by Using Radiomics and Deep Learning With Consideration of Peritumor Tissue. J Magn Reson Imaging 2020;51:798-809.
- Wang LJ, Yu JC, Hong ZJ. The predict value of lymph node status pre-operation by ultrasound, mammography and MRI in early breast cancer. J Formos Med Assoc 2025; Epub ahead of print.
- Zheng X, Yao Z, Huang Y, et al. Deep learning radiomics can predict axillary lymph node status in early-stage breast cancer. Nat Commun 2020;11:1236.
- Liu H, Zou L, Xu N, et al. Deep learning radiomics based prediction of axillary lymph node metastasis in breast cancer. NPJ Breast Cancer 2024;10:22.
- Wei W, Ma Q, Feng H, et al. Deep learning radiomics for prediction of axillary lymph node metastasis in patients with clinical stage T1-2 breast cancer. Quant Imaging Med Surg 2023;13:4995-5011.
- Li L, Yu T, Sun J, et al. Prediction of the number of metastatic axillary lymph nodes in breast cancer by radiomic signature based on dynamic contrast-enhanced MRI. Acta Radiol 2022;63:1014-22.
- Tang YL, Wang B, Ou-Yang T, et al. Ultrasound radiomics based on axillary lymph nodes images for predicting lymph node metastasis in breast cancer. Front Oncol 2023;13:1217309.
- Tang Y, Che X, Wang W, et al. Radiomics model based on features of axillary lymphatic nodes to predict axillary lymphatic node metastasis in breast cancer. Med Phys 2022;49:7555-66.
- Liu C, Ding J, Spuhler K, et al. Preoperative prediction of sentinel lymph node metastasis in breast cancer by radiomic signatures from dynamic contrast-enhanced MRI. J Magn Reson Imaging 2019;49:131-40.
- Liu J, Sun D, Chen L, et al. Radiomics Analysis of Dynamic Contrast-Enhanced Magnetic Resonance Imaging for the Prediction of Sentinel Lymph Node Metastasis in Breast Cancer. Front Oncol 2019;9:980.
- Mao N, Dai Y, Lin F, et al. Radiomics Nomogram of DCE-MRI for the Prediction of Axillary Lymph Node Metastasis in Breast Cancer. Front Oncol 2020;10:541849.
- Ho LM, Lam SK, Zhang J, et al. Association of Multi-Phasic MR-Based Radiomic and Dosimetric Features with Treatment Response in Unresectable Hepatocellular Carcinoma Patients following Novel Sequential TACE-SBRT-Immunotherapy. Cancers (Basel) 2023;15:1105.
- Rathore S, Akbari H, Rozycki M, et al. Radiomic MRI signature reveals three distinct subtypes of glioblastoma with different clinical and molecular characteristics, offering prognostic value beyond IDH1. Sci Rep 2018;8:5087.
- Kirby J, Prior F, Petrick N, et al. Introduction to special issue on datasets hosted in The Cancer Imaging Archive (TCIA). Med Phys 2020;47:6026-8.
- Lisson CS, Lisson CG, Mezger MF, et al. Deep Neural Networks and Machine Learning Radiomics Modelling for Prediction of Relapse in Mantle Cell Lymphoma. Cancers (Basel) 2022;14:2008.
- Zhang L, Wang Y, Peng Z, et al. The progress of multimodal imaging combination and subregion based radiomics research of cancers. Int J Biol Sci 2022;18:3458-69.
- Xu X, Xi L, Zhu J, et al. Intelligent Diagnosis of Cervical Lymph Node Metastasis Using a CNN Model. J Dent Res 2025;104:955-63.
- Mayfield JD, Ataya D, Abdalah M, et al. Presurgical Upgrade Prediction of DCIS to Invasive Ductal Carcinoma Using Time-dependent Deep Learning Models with DCE MRI. Radiol Artif Intell 2024;6:e230348.
- Lee G, Park H, Lee HY, et al. Tumor Margin Contains Prognostic Information: Radiomic Margin Characteristics Analysis in Lung Adenocarcinoma Patients. Cancers (Basel) 2021;13:1676.

