The value of an integrated multi-omics model in the diagnosis of benign and malignant pulmonary nodules

Wucui Huang; Xiaoli Zhu

doi:10.21037/tcr-2025-664

Original Article

Wucui Huang¹,², Xiaoli Zhu¹,²

¹Department of Respiratory and Critical Care Medicine, Zhongda Hospital, Southeast University, Nanjing, China; ²School of Medicine, Southeast University, Nanjing, China

Contributions: (I) Conception and design: Both authors; (II) Administrative support: X Zhu; (III) Provision of study materials or patients: X Zhu; (IV) Collection and assembly of data: W Huang; (V) Data analysis and interpretation: W Huang; (VI) Manuscript writing: Both authors; (VII) Final approval of manuscript: Both authors.

Correspondence to: Xiaoli Zhu, PhD. Department of Respiratory and Critical Care Medicine, Zhongda Hospital, Southeast University, Nanjing, China; School of Medicine, Southeast University, No. 87 Dingjiaqiao Road, Hunanlu Street, Nanjing 210009, China. Email: zhuxiaoli62@163.com.

Background: In recent years, multi-omics models based on a variety of biomarkers have been continuously developed and increasingly applied in the field of oncology, especially in the early diagnosis of lung cancer. This study aimed to integrate computed tomography (CT) radiomics with seven lung cancer-associated autoantibodies (AABs) to develop multi-omics predictive models for pulmonary nodule (PN) characterization.

Methods: This retrospective study enrolled 179 patients with PNs measuring 5 to 30 mm in diameter who underwent thoracic surgery at Zhongda Hospital, Southeast University between January 2020 and December 2024. The patients were pathologically categorized into lung cancer (n=87) and non-lung cancer (n=92) groups, and then randomly allocated to training and test sets at a ratio of 7:3. Least absolute shrinkage and selection operator (LASSO) regression was used for feature screening, and a clinical model was constructed based on five clinical characteristics. After the regions of interest were delineated and the radiomics features extracted, a radiomics prediction model was constructed from the selected features. The rad-score for each patient was then calculated and combined with the other markers to develop a comprehensive multi-omics model. The diagnostic performances of the models were compared using the area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value.

Results: The multi-omics model demonstrated superior diagnostic accuracy with an AUC of 0.902 [95% confidence interval (CI): 0.817–0.986], accuracy of 82.4%, sensitivity of 88.5%, and specificity of 80.0%, outperforming the clinical (AUC =0.848; 95% CI: 0.777–0.919) and radiomics (AUC =0.854; 95% CI: 0.786–0.922) models. Notably, the radiomics model exhibited high sensitivity (96.6%) but poor specificity (63.6%), while the multi-omics model resolved this trade-off via the synergistic integration of clinical-radiomic-biomarker features, achieving significant improvements in the PPV (81.5% vs. 72.7%) compared to the clinical model.

Conclusions: Integrating CT radiomics with seven lung cancer-AABs established a robust multi-omics framework for PN diagnosis. Compared to the standalone clinical or radiomics models, this comprehensive model demonstrated superior diagnostic performance.

Keywords: Pulmonary nodules (PNs); radiomics; autoantibody (AAB); tumor-associated antigen (TAA); multi-omics; early diagnosis

Submitted Mar 26, 2025. Accepted for publication Dec 02, 2025. Published online Feb 25, 2026.

Highlight box

Key findings

• A multi-omics prediction model integrating clinical parameters, radiographic imaging features, serum biomarkers, and computed tomography (CT)-based radiomics signatures was constructed, which exhibited superior performance in distinguishing between benign and malignant pulmonary nodules (PNs) compared with previously reported models.

What is known, and what is new?

• With the advancement of CT imaging and the discovery of hematological biomarkers, an increasing number of prediction models have been developed and applied for the differential diagnosis of benign and malignant PNs.

• Our integrated model showed better predictive efficacy than single-modal models, providing a more accurate tool for PN assessment.

What is the implication, and what should change now?

• Our multi-omics model holds great clinical value for the early diagnosis of lung cancer and the differential identification of PNs. Given the clinical demand for more accurate and convenient predictive tools, the adoption of this integrated model should be promoted to identify early-stage lung cancer, thereby improving the survival rate of patients.

Introduction

As one of the most prevalent and lethal malignant neoplasms worldwide, lung cancer represents a significant public health challenge. Lung cancer not only severely compromises patient survival and quality of life, but also places substantial economic burdens on families and healthcare systems; effective early detection strategies therefore urgently need to be established (1,2). The 5-year survival rate for advanced-stage lung cancer (stage III–IV) is ≤20%, while that for early-stage lung cancer (stage I–II) exceeds 60–70% (3,4). Thus, early diagnosis and intervention are critical for improving prognosis. However, the detection of small pulmonary nodules (PNs) ≤1 cm poses significant challenges.

Imaging alone often fails to distinguish between benign and malignant PNs, particularly for histopathological subtypes such as adenocarcinoma in situ (AIS) and stage IA non-small-cell lung cancer (NSCLC). Further, expert consensus guidelines emphasize that the definitive pathological diagnosis of AIS necessitates surgical resection, and have reclassified AIS as a glandular precursor lesion rather than an invasive lung cancer (5). The latest evidence indicates that pre-invasive and invasive lung adenocarcinomas have distinct molecular mechanisms and clinical prognoses (6,7). Thus, the development of non-invasive approaches or models for detecting pre-invasive lesions is of particular importance.

In recent decades, low-dose computed tomography (LDCT) has emerged as a cornerstone of lung cancer screening and has been reported to reduce mortality by 20–25% in high-risk populations (8). However, the widespread use of LDCT has significantly increased the workload of radiologists, leading to challenges in maintaining diagnostic accuracy and inadvertently contributing to higher rates of missed diagnoses and misclassification (9,10). Additionally, the manual interpretation of computed tomography (CT) images remains highly subjective, with inter-reader variability affecting clinical reliability. To address these limitations, artificial intelligence-driven radiomics and machine learning technologies have gained traction in automating image analysis and extracting quantitative features predictive of malignancy. Radiomics, which involves the extraction and analysis of high-dimensional imaging data, has been successfully applied across various modalities, including CT, ultrasound, and positron emission tomography-CT to aid in tumor characterization and diagnosis (11,12). Its integration with PN evaluation holds particular promise for enhancing diagnostic precision.

Concomitantly, advancements in liquid biopsy technology have revealed that serum tumor-associated autoantibodies (T-AABs), such as anti-tumor protein p53, are consistently detectable in lung cancer patients and exhibit stable expression profiles (13,14). These biomarkers offer a non-invasive alternative to tissue sampling, enabling the dynamic monitoring of disease progression and the therapeutic response. However, existing PN prediction models, including the Mayo Clinic model (15), Brock model (16), and Peking University People’s Hospital model (17), rely solely on clinical or radiographic features, omitting serum biomarkers and radiomics data. Recent studies have demonstrated that hybrid models integrating clinical characteristics, biomarkers, and radiomics exhibit superior diagnostic performance compared to conventional approaches, highlighting the potential of multi-omics strategies for PN stratification (18,19).

This study retrospectively analyzed a cohort of PN patients who underwent thoracic surgery at Zhongda Hospital, Southeast University (2020–2024). Our objectives were: (I) to develop a comprehensive diagnostic model incorporating clinical demographics, a serum panel of seven T-AABs (7-AABs), and CT radiomics features; (II) to evaluate the incremental discriminatory power of this multi-omics approach compared to existing models; and (III) to present a clinically actionable nomogram for visualizing individualized malignancy risk and guiding treatment decisions. By synergizing complementary data sources, this study aimed to address the current diagnostic gaps in PN management and advance precision oncology for early lung cancer. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-664/rc).

Methods

Study design and participants

Patients’ information was collected retrospectively. Patients with PNs (5–30 mm) who underwent pathological evaluation and chest CT scanning at Zhongda Hospital, Southeast University from January 2020 to December 2024 were screened. The inclusion criteria were as follows: (I) age over 18 years; (II) definitive pathological diagnosis confirmed by thoracic surgery; (III) available preoperative results for tumor-associated antigen (TAA) and serum 7-AABs; and (IV) complete CT imaging with a slice thickness of 1.25 mm performed within 3 months before surgery. The exclusion criteria were as follows: (I) a history of lung cancer; (II) a history of other tumors within 5 years; (III) radiotherapy, chemotherapy, immunotherapy, or targeted therapy before enrollment and evaluation; and/or (IV) hilar or mediastinal lymph node metastasis and pleural effusion suggested by CT. The sample size was chosen to satisfy the rule of thumb that logistic regression requires at least 10 events per predictor variable, so that the performance metrics would be stable.
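The events-per-variable rule of thumb mentioned above can be illustrated with a short sketch. The function name is hypothetical, and taking the smaller outcome class as the limiting "event" count is an assumption commonly made for binary logistic regression, not a procedure stated by the authors.

```python
def max_predictors(n_positive, n_negative, events_per_variable=10):
    """Upper bound on the number of predictors under the events-per-variable rule.

    Assumption: the limiting count is the smaller of the two outcome classes.
    """
    limiting_events = min(n_positive, n_negative)
    return limiting_events // events_per_variable


# Group sizes reported in this study: 87 lung cancer, 92 non-lung cancer.
print(max_predictors(87, 92))  # 8
```

Under these assumptions, the full cohort would support up to eight predictors, which accommodates the six covariates of the integrated model described in the Results.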

Blood samples of the surgical patients were collected before the operation and then tested at the Laboratory Department of the Zhongda Hospital, Southeast University. TAA was detected using the chemiluminescence method, while the panel of 7-AABs was detected by enzyme-linked immunosorbent assay. Based on clinical reference ranges, values below the critical threshold were defined as negative for TAA and the 7-AABs, while values above the threshold were defined as positive. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Zhongda Hospital, Southeast University (No. 2019040012), and the requirement for individual consent for this retrospective analysis was waived.

CT image acquisition and segmentation

All patients underwent chest CT examinations using continuous scanning with a slice thickness of 1.25 mm. The CT images were reviewed by two radiologists, each with more than 5 years of working experience. Disagreements were resolved by consensus after consultation. The PN location, number, type, diameter, and other features were recorded in the process.

Radiomics feature extraction

CT images were obtained from the data system of Zhongda Hospital, Southeast University in Digital Imaging and Communications in Medicine (DICOM) format, and subsequently exported and uploaded to 3D Slicer software (http://www.Slicer.org/). In the Editor Segmentation module, the two radiologists independently performed image segmentation and delineated the regions of interest. The tumor boundaries were outlined layer by layer on thin-layer CT images, avoiding adjacent vascular, bronchial, chest wall, or mediastinal structures. The window settings were as follows: window width, 1,400; window level, −600 (20).

Radiomics feature extraction was performed in Python (version 3.7; Python Software Foundation, Beaverton, OR, USA) using the PyRadiomics package. A total of five sets of radiomics features were extracted: (I) first-order statistical features; (II) two-/three-dimensional shape features; (III) texture features, comprising the gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), gray-level size zone matrix (GLSZM), and gray-level dependence matrix (GLDM); (IV) Laplacian-of-Gaussian (LOG) transformation features; and (V) wavelet transform features. The intraclass correlation coefficients (ICCs) were calculated to evaluate the reliability and reproducibility of the feature extraction. This study complied with the Image Biomarker Standardization Initiative (21).
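The inter-reader reliability check described above can be sketched as follows. The text does not state which form of the ICC was used; the example assumes ICC(2,1) (two-way random effects, absolute agreement, single reader), and the function name and input layout are illustrative only.

```python
def icc_2_1(ratings):
    """ICC(2,1) for one feature; ratings[i][j] is the value for lesion i from reader j."""
    n = len(ratings)         # number of lesions
    k = len(ratings[0])      # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    # Two-way ANOVA sums of squares: lesions, readers, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical feature values from two readers on four lesions.
identical = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
offset = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]]
print(icc_2_1(identical), icc_2_1(offset))
```

Because this form measures absolute agreement, a constant offset between readers lowers the ICC even when the readers rank the lesions identically. A feature would then be retained if its ICC exceeded the chosen cut-off.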

Statistical analysis

The SPSS 22.0 software (IBM Corp., Armonk, NY, USA) was used for data analysis. Continuous data with a normal distribution and equal variance were compared between groups using the t-test, and non-normally distributed data were compared using the Mann-Whitney U test. Categorical data are presented as n (%) and were compared using the χ² test. R statistical software (version 4.1.2, https://www.r-project.org) was used to conduct the ICC, least absolute shrinkage and selection operator (LASSO), and receiver operating characteristic (ROC) curve analyses, and to generate the calibration plots and decision curves. A two-sided P value <0.05 was considered statistically significant.

Results

Patient characteristics

As shown in Table 1, 179 patients with PNs were ultimately included in the study. Based on the 2021 World Health Organization lung cancer classification guidelines, the patients were divided into two groups: the lung cancer group, comprising 87 (48.6%) patients, and the non-lung cancer group, comprising 92 (51.4%) patients. The lung cancer group comprised 16 (18.4%) patients with minimally invasive adenocarcinoma and 71 (81.6%) with invasive adenocarcinoma. The non-lung cancer group comprised 44 (47.8%) patients with benign lesions, 7 (7.6%) with atypical adenomatous hyperplasia, and 41 (44.6%) with AIS. The benign lesion subgroup comprised 12 cases of inflammatory nodules, eight cases of pulmonary hamartoma, seven cases of organizing pneumonia, three cases each of chronic inflammation and pulmonary tuberculosis, two cases each of granuloma, inflammatory pseudotumor, and cryptococcosis, and one case each of sarcoidosis, pulmonary sclerosing hemangioma, pulmonary aspergillosis, pulmonary hemangioma, and pulmonary fibrotic nodule.

Table 1

Characteristics of patients

Characteristics	Non-lung cancer (n=92)	Lung cancer (n=87)	Z/χ^2†	P
Age (years)	55 [49, 64]	63 [56, 70]	–4.296	<0.001
Gender			0.858	0.35
Male	56 (60.870)	47 (54.023)
Female	36 (39.130)	40 (45.977)
Smoking			2.333	0.12
No	79 (85.870)	67 (77.011)
Yes	13 (14.130)	20 (22.989)
Passive smoking			4.547	0.03
No	92 (100.000)	83 (95.402)
Yes	0 (0.000)	4 (4.598)
Personal cancer history			0.047	0.83
No	86 (93.478)	82 (94.253)
Yes	6 (6.522)	5 (5.747)
Family cancer history			0.007	0.94
No	88 (95.652)	83 (95.402)
Yes	4 (4.348)	4 (4.598)
Chronic obstructive pulmonary disease			1.141	0.29
No	91 (98.913)	84 (96.552)
Yes	1 (1.087)	3 (3.448)
Tuberculosis			0.003	0.96
No	90 (97.826)	85 (97.701)
Yes	2 (2.174)	2 (2.299)
Nodule type			12.566	0.002
Solid	35 (38.043)	49 (56.322)
Mixed ground glass	19 (20.652)	23 (26.437)
Pure ground glass	38 (41.304)	15 (17.241)
Location			4.519	0.34
LLL	25 (27.174)	36 (41.379)
LUL	7 (7.609)	6 (6.897)
RLL	23 (25.000)	20 (22.989)
RML	24 (26.087)	17 (19.540)
RUL	13 (14.130)	8 (9.195)
Single			0	0.99
No	57 (61.957)	54 (62.069)
Yes	35 (38.043)	33 (37.931)
Lobulation			9.316	0.002
No	75 (81.522)	53 (60.920)
Yes	17 (18.478)	34 (39.080)
Spiculation			13.656	<0.001
No	70 (76.087)	43 (49.425)
Yes	22 (23.913)	44 (50.575)
Pleural indentation			23.147	<0.001
No	82 (89.130)	50 (57.471)
Yes	10 (10.870)	37 (42.529)
Cavitation			0.018	0.89
No	83 (90.217)	79 (90.805)
Yes	9 (9.783)	8 (9.195)
Vascular convergence			9.074	0.003
No	82 (89.130)	62 (71.264)
Yes	10 (10.870)	25 (28.736)
Calcification			6.319	0.01
No	83 (90.217)	86 (98.851)
Yes	9 (9.783)	1 (1.149)
Pathology			–	–
Benign	44 (47.826)	0 (0.000)
Atypical adenomatous hyperplasia	7 (7.609)	0 (0.000)
AIS	41 (44.565)	0 (0.000)
Minimally invasive adenocarcinoma	0 (0.000)	16 (18.391)
Invasive adenocarcinoma	0 (0.000)	71 (81.609)
T-AABs			11.078	<0.001
Negative	74 (80.435)	50 (57.471)
Positive	18 (19.565)	37 (42.529)
7-AABs for lung cancer			9.284	0.002
Negative	64 (69.565)	41 (47.126)
Positive	28 (30.435)	46 (52.874)
Progastrin-releasing peptide			3.229	0.07
Negative	90 (97.826)	80 (91.954)
Positive	2 (2.174)	7 (8.046)
Carcinoembryonic antigen			2.614	0.11
Negative	86 (93.478)	75 (86.207)
Positive	6 (6.522)	12 (13.793)
Cytokeratin 19 fragment antigen			6.178	0.01
Negative	77 (83.696)	59 (67.816)
Positive	15 (16.304)	28 (32.184)
Neuron-specific-enolase			0.285	0.59
Negative	90 (97.826)	86 (98.851)
Positive	2 (2.174)	1 (1.149)
Cancer-testis antigen 10			1.05	0.31
Negative	82 (89.130)	73 (83.908)
Positive	10 (10.870)	14 (16.092)
Germ cell-expressed 7			1.358	0.24
Negative	86 (93.478)	77 (88.506)
Positive	6 (6.522)	10 (11.494)
Germ cell-expressed 4–5			1.253	0.26
Negative	85 (92.391)	76 (87.356)
Positive	7 (7.609)	11 (12.644)
Melanoma-associated antigen 1			0.025	0.88
Negative	84 (91.304)	80 (91.954)
Positive	8 (8.696)	7 (8.046)
Tumor protein p53			0.141	0.71
Negative	84 (91.304)	78 (89.655)
Positive	8 (8.696)	9 (10.345)
Protein gene product 9.5			2.138	0.14
Negative	87 (94.565)	77 (88.506)
Positive	5 (5.435)	10 (11.494)
SRY-related HMG-box gene 2			0.785	0.38
Negative	85 (92.391)	77 (88.506)
Positive	7 (7.609)	10 (11.494)

Categorical data are expressed as the n (%); continuous data are presented as the median [IQR]. ^†, only age was reported as Z value. Exact P values are reported without symbolic annotation, and statistical significance is defined as P<0.05. 7-AABs, 7-associated autoantibodies; AIS, adenocarcinoma in situ; IQR, interquartile range; LLL, left lower lobe; LUL, left upper lobe; RLL, right lower lobe; RML, right middle lobe; RUL, right upper lobe; T-AABs, tumor-associated autoantibodies.

The median age of the patients in the non-lung cancer group was 55 years; 60.9% were male and 39.1% were female. The median age of the lung cancer group was 63 years, which was significantly higher than that of the non-lung cancer group (P<0.05). A history of passive smoking was more common in the lung cancer group than in the non-lung cancer group. The CT imaging characteristics, including the diameter, nodule type, lobulation, spiculation, pleural indentation, vascular convergence, and calcification, differed significantly between the two groups (P<0.05). The median diameter of the lung cancer group was 17 mm, while that of the non-lung cancer group was 10 mm. The lung cancer group usually presented with solid nodules or mixed ground glass nodules, which together accounted for 82.8% of the nodules. The most common nodule type in the non-lung cancer group was pure ground glass nodules, which accounted for 41.3% of the nodules. In terms of the blood biomarkers, significant differences were observed in TAA and 7-AABs between the two groups (P<0.05).

Radiomics features

A total of 1,218 radiomics features across eight feature categories were extracted from 179 lung nodule lesions. The number of radiomics features in each category is shown in Table 2. Among them, 1,027 features showed good consistency, with ICC values over 0.75. The extracted radiomics features were standardized using the Z-score method. After dimensionality reduction by t-test, 474 features were entered into LASSO regression for feature screening (Figure 1). The λ value giving the minimum mean squared error was 0.036, corresponding to 15 radiomics features. Among them, one shape feature, one texture feature, four LOG operator transformation features, and nine wavelet transform features were included in the construction of the radiomics model.
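The Z-score standardization step can be sketched as follows. This is a minimal illustration with hypothetical values; the use of the population standard deviation and the handling of constant features are assumptions, as the text does not give these details.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize one feature column to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD (assumption); the sample SD is also common
    if sigma == 0:
        return [0.0 for _ in values]  # a constant feature carries no information
    return [(v - mu) / sigma for v in values]


# Hypothetical raw values of one radiomics feature across three lesions.
print(z_score([1.0, 2.0, 3.0]))
```

In a training/test workflow, the mean and standard deviation would normally be estimated on the training set and reused for the test set; the text does not state which approach was taken here.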

Table 2

Radiomics features

Categories	Number
First-order features	18
Two-/three-dimensional shape features	14
GLCM-based features	22
GLRLM-based features	16
GLSZM-based features	16
GLDM-based features	14
LOG-based features	430
Wavelet-based features	688

GLCM, gray-level co-occurrence matrix; GLDM, gray-level dependence matrix; GLRLM, gray-level run-length matrix; GLSZM, gray-level size zone matrix; LOG, Laplacian-of-Gaussian.

Figure 1 Feature selection and LASSO regression. (A) Coefficient profiles in LASSO regression. (B) The relationship between lambda and binomial deviance. LASSO, least absolute shrinkage and selection operator.
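LASSO performs feature selection because its L1 penalty shrinks small coefficients exactly to zero. The sketch below shows the soft-thresholding operator that produces this behavior. In the idealized case of uncorrelated, standardized features in a least-squares model, each LASSO coefficient is the soft-thresholded unpenalized estimate, and iterative solvers apply the same operator one coordinate at a time. The feature names and unpenalized coefficients are hypothetical; only the λ value of 0.036 comes from the text.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficients for standardized features (illustrative only).
raw = {"feature_a": 0.50, "feature_b": -0.02, "feature_c": 0.03, "feature_d": -0.40}
lam = 0.036  # penalty value reported in the text

shrunk = {name: soft_threshold(beta, lam) for name, beta in raw.items()}
selected = [name for name, beta in shrunk.items() if beta != 0.0]
print(selected)  # ['feature_a', 'feature_d']
```

Features whose coefficients fall inside the threshold are dropped, which is how a large candidate set can be reduced to a small number of retained features as λ increases.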

Radiomics model construction

The radiomics model was constructed using the 15 radiomics features, with the data randomly split 7:3 into a training set and a test set. As shown in Figure 2, the AUC of the model was 0.909 [95% confidence interval (CI): 0.828–0.988] in the training set and 0.854 (95% CI: 0.786–0.922) in the test set. In the test set, the sensitivity and positive predictive value (PPV) of the model were high, reaching 96.6% and 77.4%, respectively, but its specificity was poor (63.6%), and the negative predictive value was 75.0% (Table 3).

Figure 2 ROC curves for the radiomics model. (A) Training set. (B) Test set. AUC, area under the curve; CI, confidence interval; ROC, receiver operating characteristic.
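The AUC can be read as the probability that a randomly chosen malignant nodule receives a higher model score than a randomly chosen benign nodule. The following sketch computes it directly from that definition (the Mann-Whitney formulation); the scores are hypothetical and the function is illustrative rather than the routine used in this study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (malignant, benign) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for three malignant and three benign nodules.
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

In this toy example, eight of the nine malignant-benign pairs are ordered correctly, so the AUC is 8/9.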

Table 3

Evaluation indicators for radiomics model

Data set	AUC (95% CI)	Accuracy	Sensitivity	Specificity	Positive predictive value	Negative predictive value
Training set	0.909 (0.828–0.988)	0.839	0.960	0.751	0.803	0.904
Test set	0.854 (0.786–0.922)	0.760	0.966	0.636	0.774	0.750

AUC, area under the curve; CI, confidence interval.
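The evaluation indicators in the table above all derive from the four cells of a confusion matrix. A minimal sketch is given below; the counts are hypothetical, as the per-cell counts are not reported in the text.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic indicators from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # malignant nodules correctly flagged
        "specificity": tn / (tn + fp),   # benign nodules correctly cleared
        "ppv": tp / (tp + fp),           # flagged nodules that are malignant
        "npv": tn / (tn + fn),           # cleared nodules that are benign
    }


# Hypothetical counts for illustration only.
print(diagnostic_metrics(tp=9, fp=2, tn=8, fn=1))
```

Sensitivity and specificity depend only on the model and the threshold, whereas the PPV and negative predictive value also depend on the proportion of malignant nodules in the data set.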

Building a multi-omics integrated model

The rad-score, computed from the LASSO logistic coefficients of the radiomics model (Table 4), was incorporated as a predictive feature into the integrated multi-omics model. The integrated model included the following covariates: age, nodule diameter, pleural indentation on CT imaging, TAA, 7-AABs, and rad-score. Logistic regression analysis was employed for model development. As shown in Table 5 and Figure 3, the multi-omics model demonstrated superior discriminatory performance in the test cohort, achieving an AUC of 0.902 (95% CI: 0.817–0.986), with a corresponding accuracy of 82.4%, robust sensitivity (88.5%), and a favorable PPV (81.5%). Notably, its specificity reached 80.0%, a marked improvement over that of the standalone radiomics model (63.6%). Decision curves and calibration plots were generated from this predictive model (Figures 4,5). The Brier score was 0.143, indicating good agreement between the predicted and observed risks. These findings collectively suggest that the multi-omics predictive model provided enhanced clinical decision-making utility and improved clinical consistency compared to single-modality approaches.
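The Brier score reported above is the mean squared difference between the predicted probability of malignancy and the observed outcome. A minimal sketch with hypothetical predictions follows.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical predictions for three nodules (1 = malignant, 0 = benign).
print(brier_score([0.9, 0.2, 0.6], [1, 0, 1]))
```

Lower values are better: a perfect model scores 0, and a model that always predicts 0.5 scores 0.25.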

Table 4

LASSO logistic analysis for rad-score

Features	β
Original_shape_Maximum2DDiameterSlice	0.008
Original_glszm_LowGrayLevelZoneEmphasis	−2.790
Log-sigma-1-0-mm-3D_glszm_SmallAreaLowGrayLevelEmphasis	−1.000
Log-sigma-3-0-mm-3D_glszm_ZonePercentage	−0.089
Log-sigma-5-0-mm-3D_firstorder_InterquartileRange	0.001
Log-sigma-5-0-mm-3D_glszm_SizeZoneNonUniformityNormalized	−0.254
Wavelet-LHL_firstorder_RootMeanSquared	0.005
Wavelet-LHH_glszm_SizeZoneNonUniformityNormalized	−0.381
Wavelet-HLL_glcm_Correlation	0.057
Wavelet-HLL_glcm_Imc1	0.468
Wavelet-HLL_glcm_Imc2	−0.468
Wavelet-HHL_glcm_Correlation	1.195
Wavelet-HHH_glcm_Imc2	−0.228
Wavelet-HHH_glszm_SizeZoneNonUniformityNormalized	−0.071
Wavelet-LLL_glcm_JointEntropy	0.003

LASSO, least absolute shrinkage and selection operator.
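A rad-score of this kind is a linear combination of the selected features weighted by their LASSO coefficients. The sketch below uses three of the coefficients listed in the table above; the feature values are hypothetical, the intercept is not reported in the text and is set to zero as a placeholder, and whether the coefficients apply to raw or standardized feature values is not stated.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of feature values and their coefficients."""
    return intercept + sum(beta * features[name] for name, beta in coefficients.items())


# Three coefficients taken from the table; the full model uses all 15.
coefficients = {
    "Original_shape_Maximum2DDiameterSlice": 0.008,
    "Original_glszm_LowGrayLevelZoneEmphasis": -2.790,
    "Wavelet-HHL_glcm_Correlation": 1.195,
}

# Hypothetical feature values for one nodule (illustrative only).
features = {
    "Original_shape_Maximum2DDiameterSlice": 15.0,
    "Original_glszm_LowGrayLevelZoneEmphasis": 0.1,
    "Wavelet-HHL_glcm_Correlation": 0.5,
}

print(rad_score(features, coefficients))
```

Positive coefficients raise the score as the feature value increases, and negative coefficients lower it.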

Table 5

Evaluation indicators for multi-omics integrated models

Data set	AUC (95% CI)	Accuracy	Sensitivity	Specificity	Positive predictive value	Negative predictive value
Training set	0.913 (0.861–0.967)	0.844	0.847	0.857	0.840	0.849
Test set	0.902 (0.817–0.986)	0.824	0.885	0.800	0.815	0.833

AUC, area under the curve; CI, confidence interval.

Figure 3 ROC curves for the integrated multi-omics models. (A) Training set. (B) Test set. AUC, area under the curve; CI, confidence interval; ROC, receiver operating characteristic.

Figure 4 Calibration curve of integrated multi-omics models.

Figure 5 DCA curve of the integrated multi-omics models. DCA, decision curve analysis.
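Decision curve analysis plots the net benefit of a model across threshold probabilities. At each threshold, true positives are credited and false positives are penalized by the odds of the threshold. The sketch below shows the calculation for a single threshold; the predictions and outcomes are hypothetical.

```python
def net_benefit(probabilities, outcomes, threshold):
    """Net benefit of intervening on patients whose predicted risk is at or above threshold."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: intervene on every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and outcomes (1 = malignant, 0 = benign).
probs = [0.9, 0.8, 0.3, 0.6, 0.1]
truth = [1, 1, 0, 0, 0]
print(net_benefit(probs, truth, 0.5), net_benefit_treat_all(truth, 0.5))
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none strategy).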

Application of multi-omics integrated model

To enhance its clinical applicability, we developed a clinically interpretable nomogram based on the logistic regression model for risk stratification (Figure 6). The rad-score emerged as the most significant predictor in this comprehensive model, with the risk of malignant PN increasing substantially per unit increment. The multivariate analysis identified several independent risk factors: (I) larger nodule diameter; (II) advancing age; (III) pleural indentation on CT imaging; (IV) positive tumor-associated antigen (TAA) status; and (V) positive serum 7-AAB status.

Figure 6 Nomogram of the integrated multi-omics models. 0, negative; 1, positive. AABs, autoantibodies; PI, pleural indentation; TAA, tumor-associated antigen.
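A nomogram of this type is a graphical form of a logistic regression model: the covariates are combined into a linear predictor, which is then mapped to a probability by the logistic function. The sketch below illustrates that mapping. All coefficients and the intercept are hypothetical placeholders, because the fitted values are not reported in the text; only the list of covariates follows the model described above.

```python
import math

# Hypothetical coefficients for illustration only; not the fitted values.
HYPOTHETICAL_COEFFICIENTS = {
    "age_years": 0.03,
    "diameter_mm": 0.10,
    "pleural_indentation": 1.0,  # 0 = negative, 1 = positive
    "taa_positive": 0.8,         # 0 = negative, 1 = positive
    "aab7_positive": 0.7,        # 0 = negative, 1 = positive
    "rad_score": 2.0,
}
HYPOTHETICAL_INTERCEPT = -5.0


def malignancy_probability(patient,
                           coefficients=HYPOTHETICAL_COEFFICIENTS,
                           intercept=HYPOTHETICAL_INTERCEPT):
    """Map covariates to a predicted probability through the logistic function."""
    linear_predictor = intercept + sum(
        coefficients[name] * patient[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical patient (illustrative values only).
patient = {
    "age_years": 63,
    "diameter_mm": 17,
    "pleural_indentation": 1,
    "taa_positive": 0,
    "aab7_positive": 1,
    "rad_score": 0.4,
}
print(malignancy_probability(patient))
```

On the nomogram, each covariate contributes points in proportion to its term in the linear predictor, and the total points are read off against the probability scale.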

Discussion

The widespread implementation of LDCT has significantly increased the detection rate of PNs (22). Despite this progress, pathological biopsy remains the definitive diagnostic gold standard for PNs, particularly those with diameters <2 cm, where non-invasive biopsy techniques frequently yield inconclusive pathological evidence (3). Current studies predominantly employ comprehensive analyses of clinical demographics and radiological characteristics to quantitatively assess malignant risk, thereby guiding personalized management strategies (23-25). The integration of multi-omics data (genomic, proteomic, and imaging-derived biomarkers) for PN risk stratification has emerged as a pivotal research frontier in thoracic oncology (26-28).

Several critical clinical and radiological parameters exert substantial influence on early lung cancer diagnosis. Age serves as a fundamental criterion for identifying high-risk populations in lung cancer screening protocols. However, international guidelines exhibit notable discrepancies in defining age thresholds for eligibility criteria. For instance, 55–74 years has been established as the target age range for screening; however, other organizations like the United States Preventive Services Task Force (USPSTF) have proposed alternative age criteria (29,30). In this study, the lung cancer group had a median age of 63 years [interquartile range (IQR), 56–70 years], which was significantly higher than that of the non-lung cancer group (55 years; IQR, 49–64 years). Consistent with the classical Mayo and Brock models (15,16), both age and nodule diameter emerged as independent risk factors for malignant PNs. Quantitatively, the lung cancer group had larger nodules (median diameter: 17 mm; range, 13–20 mm) than the non-cancer group (median diameter: 10 mm; range, 8–13.9 mm). The statistical analysis revealed a significant positive correlation between nodule diameter and malignancy risk. Spiculation (the burr sign) and upper lobe location were identified as independent factors. Additional independent predictors identified in this study included clinical-radiological features. Notably, pleural indentation showed substantial discriminatory power in our clinical prediction model.

Serum autoantibodies (AABs), which are low-cost and highly accessible, serve as a valuable supplement to current lung cancer early-screening protocols. Recent advancements have positioned 7-AABs as one of the most promising biomarkers for early lung cancer detection (31-33). Empirical evidence demonstrates that 7-AABs exhibit superior detection rates in early-stage NSCLC patients compared to conventional tumor markers. Specifically, Ling et al. reported a sensitivity of 67.5% for 7-AABs in stage I–II lung adenocarcinoma patients (34). The panel-based detection of lung AABs has emerged as a more clinically meaningful approach for differentiating between benign and malignant PNs, with significant implications for the management of incidentally detected nodules (35,36).

TAA, another clinically prevalent and accessible serum biomarker, plays a critical role in tumor diagnosis and prognosis across multiple malignancies (37,38). Ouyang et al. developed a predictive model integrating 7-AABs with carcinoembryonic antigen and cytokeratin 19 fragment 21-1 (CYFRA21-1) for distinguishing between lung cancer patients and healthy controls. Although the model demonstrated only moderate sensitivity (44.02%), its high specificity (83%) underscores the synergistic diagnostic value of combining 7-AABs with TAA (39). Our current study corroborates these findings, showing higher positivity rates of 7-AABs in the overall cohort compared to traditional biomarkers, and elevated TAA/7-AAB co-expression in the lung cancer group compared to the non-lung cancer group. To enhance the predictive accuracy, we incorporated both TAA and 7-AABs as covariates in our clinical prediction model. Emerging approaches such as circulating tumor cells, cell-free DNA methylation, circulating microRNAs, metabolomic profiling, and genome-wide association studies have shown preliminary promise in lung cancer diagnostics (40,41). However, no single blood-based biomarker has achieved routine clinical adoption. This has prompted investigations into multi-parametric models that integrate biomarkers with clinical characteristics and radiological features to improve the accuracy of diagnosing malignant PNs (42,43).

CT has emerged as a critical tool for early lung cancer screening. The Dutch-Belgian Lung Cancer Screening Trial (NELSON), which utilized chest CT-based volumetric analysis, substantiated the diagnostic value of CT in early lung cancer detection. Compared with the National Lung Screening Trial, the NELSON study demonstrated improved sensitivity (93.5% vs. 92.5%) and remarkably enhanced specificity (98.3% vs. 73.4%) (44). In our cohort, surgical pathology revealed 59 cases of stage IA, five cases of stage IIB, and seven cases of stage IIIA lung cancer. Notably, a minority of patients (8.3%) exhibited no preoperative CT evidence of hilar, bronchial, mediastinal, or subcarinal lymphadenopathy, yet pathological examination ultimately revealed lymph node metastasis or pleural invasion. These findings underscore the inherent limitations of CT-based screening: despite continuous technological advancements in sensitivity and specificity, CT cannot replace invasive biopsy procedures. Pathological diagnosis remains the definitive gold standard for PN characterization.

In recent years, radiomics has been extensively applied in early lung cancer diagnosis. Huang et al. developed a hybrid radiomics approach by extracting features from both nodule regions and peri-nodular tissues, demonstrating that peri-nodular imaging characteristics significantly predict pathological invasiveness in solitary PNs (45). Wu et al. investigated the correlation between CT-based radiomics features and histopathological aggressiveness in lung adenocarcinoma, confirming radiomics signatures as predictive biomarkers for tumor aggression. Their model demonstrated impressive diagnostic performance with a sensitivity of 84.8%, a specificity of 79.2%, and an AUC of 0.878 (46). The C-Lung-RADS stepwise risk stratification model, validated in a Chinese population cohort, achieved a sensitivity of 96% for early lung cancer detection and an accuracy of 92.46% for distinguishing between benign and malignant lesions, while the PD-L1 Expression Score (PD-L1ES) radiomics model exhibited an AUC of 0.946 (47,48).

The present study specifically focused on thin-section chest CT-derived radiomics features, selecting 15 discriminative features for model development. Our findings align with previous studies: CT-based predictive models show strong sensitivity but limited specificity. Kammer et al. developed a radiomics model across four multicenter cohorts of PN patients, which demonstrated superior diagnostic performance compared to the Mayo Clinic risk model. Subsequently, they integrated radiomics scores, Mayo Clinic risk estimates, and highly sensitive CYFRA21-1 levels to construct a multi-biomarker predictive framework. This hybrid model achieved a 12.4% improvement in AUC (95% CI: 9.1–15.6%) compared to the clinical Mayo model (49). In our study, the multi-omics predictive model demonstrated the highest diagnostic accuracy (82.4%) with an AUC of 0.902 (95% CI: 0.817–0.986). Notably, compared with the clinical model, the integration of radiomics features with clinical parameters enhanced the sensitivity for distinguishing between malignant and benign nodules (88.5% vs. 78.0%) and improved the PPV (81.5% vs. 72.7%), albeit with a moderate reduction in specificity (80.0% vs. 83.3%). The standalone radiomics model exhibited the highest sensitivity (96.6%) but the poorest specificity (63.6%), highlighting the classic sensitivity-specificity trade-off in biomarker development.

This study integrated clinical parameters, radiographic imaging features, serum biomarkers, and CT-based radiomics signatures in PN patients to establish a comprehensive multi-omics predictive framework. The study has limitations that warrant consideration, including its single-center retrospective design and CT protocol heterogeneity. Nevertheless, our integrated model exhibited superior diagnostic performance compared to conventional clinical prediction models and outperformed standalone radiomics signature-based approaches. This multi-modal fusion strategy may help to mitigate the sensitivity-specificity trade-off inherent in single-biomarker studies and offers a promising approach to early lung cancer diagnosis.
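As a rough illustration of how scores from several modalities can be fused into a single malignancy probability, the following Python sketch assumes a logistic-regression-style combination. This form, the function name, the input names, and every coefficient are hypothetical assumptions made for illustration; they are not the fitted model or coefficients of this study.

```python
import math

# Hypothetical coefficients for illustration only; NOT fitted values from the study.
HYPOTHETICAL_WEIGHTS = {
    "intercept": -1.0,
    "rad_score": 2.0,       # weight on the radiomics score
    "clinical_score": 1.0,  # weight on a clinical-model score
    "aab_positive": 0.8,    # weight on autoantibody panel positivity
}


def fused_probability(rad_score: float, clinical_score: float, aab_positive: bool) -> float:
    """Combine modality-level inputs through a logistic link.

    The linear predictor is a weighted sum of the inputs; the logistic
    function maps it to a probability between 0 and 1.
    """
    w = HYPOTHETICAL_WEIGHTS
    linear_predictor = (
        w["intercept"]
        + w["rad_score"] * rad_score
        + w["clinical_score"] * clinical_score
        + w["aab_positive"] * (1.0 if aab_positive else 0.0)
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Two hypothetical patients: one with low-risk inputs, one with high-risk inputs.
p_low = fused_probability(rad_score=-0.5, clinical_score=0.0, aab_positive=False)
p_high = fused_probability(rad_score=1.0, clinical_score=1.0, aab_positive=True)
print(f"low-risk example:  {p_low:.3f}")
print(f"high-risk example: {p_high:.3f}")
```

In a combination of this kind, each modality contributes additively on the log-odds scale, so a positive autoantibody result can raise the predicted probability of a nodule with a borderline radiomics score, which is one way in which complementary markers may offset the weaknesses of any single marker.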

Conclusions

Our multi-omics model, incorporating clinical phenotypes, radiological features, serological biomarkers, and radiomics signatures, exhibited superior performance compared to its uni-modal counterparts. This strategy holds substantial promise for optimizing early lung cancer diagnosis and advancing precision oncology via individualized patient risk stratification.

Acknowledgments

None.

Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-664/rc

Data Sharing Statement: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-664/dss

Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-664/prf

Funding: The work was supported by the Postgraduate Research and Practice Innovation Program of Jiangsu Province (No. SJCX20_0054) and the Fundamental Research Funds for the Central Universities (No. 322402109D).

Conflicts of Interest: Both authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-664/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the ethics committee of Zhongda Hospital, Southeast University (No. 2019040012) and individual consent for this retrospective analysis was waived.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021;71:209-49.
2. Xia C, Dong X, Li H, et al. Cancer statistics in China and United States, 2022: profiles, trends, and determinants. Chin Med J (Engl) 2022;135:584-90.
3. Mazzone PJ, Silvestri GA, Souter LH, et al. Screening for Lung Cancer: CHEST Guideline and Expert Panel Report. Chest 2021;160:e427-94.
4. Baum P, Schlamp K, Klotz LV, et al. Incidental Pulmonary Nodules: Differential Diagnosis and Clinical Management. Dtsch Arztebl Int 2024;121:853-60.
5. Chinese Medical Association guideline for clinical diagnosis and treatment of lung cancer (2025 edition). Zhonghua Yi Xue Za Zhi 2025;105:2918-59.
6. Pan Z, Hu G, Zhu Z, et al. Predicting Invasiveness of Lung Adenocarcinoma at Chest CT with Deep Learning Ternary Classification Models. Radiology 2024;311:e232057.
7. Liang P, Peng M, Tao J, et al. Development of a genome atlas for discriminating benign, preinvasive, and invasive lung nodules. MedComm (2020) 2024;5:e644.
8. Zhong D, Sidorenkov G, Jacobs C, et al. Lung Nodule Management in Low-Dose CT Screening for Lung Cancer: Lessons from the NELSON Trial. Radiology 2024;313:e240535.
9. Yoo J, Cheon M, Park YJ, et al. Machine learning-based diagnostic method of pre-therapeutic (18)F-FDG PET/CT for evaluating mediastinal lymph nodes in non-small cell lung cancer. Eur Radiol 2021;31:4184-94.
10. Tian L, Zhang D, Bao S, et al. Radiomics-based machine-learning method for prediction of distant metastasis from soft-tissue sarcomas. Clin Radiol 2021;76:158.e19-25.
11. Li C, Cheng B, Li J, et al. Non-Risk-Based Lung Cancer Screening With Low-Dose Computed Tomography. JAMA 2025;333:2108-10.
12. Zhao R, Liu J, Liang W, et al. PSMA PET/CT for Improved Staging Accuracy and Imaging of Neovascularization-associated Features in Primary Lung Cancer. Clin Nucl Med 2025;50:e638-45.
13. Ma H, Wu T, Zhang Q, et al. The role of seven tumor-associated autoantibodies in the diagnosis, staging and treatment guidance of lung cancer. BMC Pulm Med 2024;24:250.
14. Ren F, Chen F, Xu X, et al. Clinical Value of Seven Autoantibodies Against Tumor-Associated Antigens and Tumor Markers in Lung Cancer Patients: A Retrospective Analysis from a Single Institution. Technol Cancer Res Treat 2024;23:15330338241293490.
15. Swensen SJ, Silverstein MD, Ilstrup DM, et al. The probability of malignancy in solitary pulmonary nodules. Application to small radiologically indeterminate nodules. Arch Intern Med 1997;157:849-55.
16. Li Y, Chen KZ, Wang J. Development and validation of a clinical prediction model to estimate the probability of malignancy in solitary pulmonary nodules in Chinese people. Clin Lung Cancer 2011;12:313-9.
17. McWilliams A, Tammemagi MC, Mayo JR, et al. Probability of cancer in pulmonary nodules detected on first screening CT. N Engl J Med 2013;369:910-9.
18. Zhao M, Xue G, He B, et al. Integrated multiomics signatures to optimize the accurate diagnosis of lung cancer. Nat Commun 2025;16:84.
19. Yang M, Yu H, Feng H, et al. Enhancing the differential diagnosis of small pulmonary nodules: a comprehensive model integrating plasma methylation, protein biomarkers, and LDCT imaging features. J Transl Med 2024;22:984.
20. Velazquez ER, Parmar C, Jermoumi M, et al. Volumetric CT-based segmentation of NSCLC using 3D-Slicer. Sci Rep 2013;3:3529.
21. Hu Z, Tang J, Wang Z, et al. Deep learning for image-based cancer detection and diagnosis−A survey. Pattern Recognition 2018;83:134-49.
22. Barta JA, Farjah F, Thomson CC, et al. The American Cancer Society National Lung Cancer Roundtable strategic plan: Optimizing strategies for lung nodule evaluation and management. Cancer 2024;130:4177-87.
23. Zuo Z, Zhang G, Chen J, et al. CT Radiomic Nomogram Using Optimal Volume of Interest for Preoperatively Predicting Invasive Mucinous Adenocarcinomas in Patients with Incidental Pulmonary Nodules: A Multicenter, Large-Scale Study. Technol Cancer Res Treat 2024;23:15330338241308307.
24. Zhang C, Zhou H, Li M, et al. The diagnostic value of CT-based radiomics nomogram for solitary indeterminate smoothly marginated solid pulmonary nodules. Front Oncol 2024;14:1427404.
25. Zhan Y, Song F, Zhang W, et al. Prediction of benign and malignant pulmonary nodules using preoperative CT features: using PNI-GARS as a predictor. Front Immunol 2024;15:1446511.
26. Sun L, Zhang M, Lu Y, et al. Nodule-CLIP: Lung nodule classification based on multi-modal contrastive learning. Comput Biol Med 2024;175:108505.
27. Li Y, Xie F, Zheng Q, et al. Non-invasive diagnosis of pulmonary nodules by circulating tumor DNA methylation: A prospective multicenter study. Lung Cancer 2024;195:107930.
28. Baeza S, Gil D, Sanchez C, et al. Radiomics and Clinical Data for the Diagnosis of Incidental Pulmonary Nodules and Lung Cancer Screening: Radiolung Integrative Predictive Model. Arch Bronconeumol 2024;60:S22-30.
29. US Preventive Services Task Force. Screening for Lung Cancer: US Preventive Services Task Force Recommendation Statement. JAMA 2021;325:962-70.
30. Buck B, Yates A, Bui J, et al. A retrospective cohort study investigating factors affecting recommendation for continued low-dose computed tomography lung cancer screening in the national lung cancer screening trial. J Med Screen 2025;32:215-23.
31. Guo H, Zhao W, Li C, et al. The diagnostic efficacy of seven autoantibodies in early detection of ground-glass nodular lung adenocarcinoma. Front Oncol 2024;14:1499140.
32. Ren Z, Ding HM, Qian X, et al. Clinical value of tumor-associated autoantibodies in diagnosis of early non-small cell lung cancer. Zhonghua Yu Fang Yi Xue Za Zhi 2021;55:1426-34.
33. Wang Y, Jiao Y, Ding CM, et al. The role of autoantibody detection in the diagnosis and staging of lung cancer. Ann Transl Med 2021;9:1673.
34. Ling Z, Chen J, Wen Z, et al. The Value of a Seven-Autoantibody Panel Combined with the Mayo Model in the Differential Diagnosis of Pulmonary Nodules. Dis Markers 2021;2021:6677823.
35. Tong L, Sun J, Zhang X, et al. Development of an autoantibody panel for early detection of lung cancer in the Chinese population. Front Med (Lausanne) 2023;10:1209747.
36. Hao Y, Wu LN, Lyu YT, et al. Evaluation of the application value of seven tumor-associated autoantibodies in non-small cell lung cancer based on machine learning algorithms. Zhonghua Yu Fang Yi Xue Za Zhi 2023;57:1827-38.
37. Tong H, Dan B, Dai H, et al. Clinical application of serum tumor abnormal protein combined with tumor markers in lung cancer patients. Future Oncol 2022;18:1357-69.
38. Mu Y, Li J, Xie F, et al. Efficacy of autoantibodies combined with tumor markers in the detection of lung cancer. J Clin Lab Anal 2022;36:e24504.
39. Ouyang R, Wu S, Zhang B, et al. Clinical value of tumor-associated antigens and autoantibody panel combination detection in the early diagnostic of lung cancer. Cancer Biomark 2021;32:401-9.
40. Du C, Tan L, Xiao X, et al. Detection of the DNA methylation of seven genes contribute to the early diagnosis of lung cancer. J Cancer Res Clin Oncol 2024;150:77.
41. Xu S, Luo J, Tang W, et al. Detecting pulmonary malignancy against benign nodules using noninvasive cell-free DNA fragmentomics assay. ESMO Open 2024;9:103595.
42. Su H, Chen L, Wu J, et al. Proteogenomic characterization reveals tumorigenesis and progression of lung cancer manifested as subsolid nodules. Nat Commun 2025;16:2414.
43. Peng F, Sinjab A, Dai Y, et al. Multimodal spatial-omics reveal co-evolution of alveolar progenitors and proinflammatory niches in progression of lung precursor lesions. Cancer Cell 2026;44:321-39.
44. Lindholt JS, Søgaard R. Lung-Cancer Screening and the NELSON Trial. N Engl J Med 2020;382:2164.
45. Huang L, Lin W, Xie D, et al. Development and validation of a preoperative CT-based radiomic nomogram to predict pathology invasiveness in patients with a solitary pulmonary nodule: a machine learning approach, multicenter, diagnostic study. Eur Radiol 2022;32:1983-96.
46. Wu YJ, Liu YC, Liao CY, et al. A comparative study to evaluate CT-based semantic and radiomic features in preoperative diagnosis of invasive pulmonary adenocarcinomas manifesting as subsolid nodules. Sci Rep 2021;11:66.
47. Wang C, Shao J, He Y, et al. Data-driven risk stratification and precision management of pulmonary nodules detected on chest computed tomography. Nat Med 2024;30:3184-95.
48. Wang C, Chen B, Liang S, et al. China Protocol for early screening, precise diagnosis, and individualized treatment of lung cancer. Signal Transduct Target Ther 2025;10:175.
49. Kammer MN, Lakhani DA, Balar AB, et al. Integrated Biomarkers for the Management of Indeterminate Pulmonary Nodules. Am J Respir Crit Care Med 2021;204:1306-16.

(English Language Editor: L. Huleatt)

Cite this article as: Huang W, Zhu X. The value of an integrated multi-omics model in the diagnosis of benign and malignant pulmonary nodules. Transl Cancer Res 2026;15(2):127. doi: 10.21037/tcr-2025-664
