Prognostic role of tumor microenvironment and immune- and autophagy-related genes in colorectal adenocarcinoma
Highlight box
Key findings
• This study found that the bioinformatic analysis proved that immune-a1-related genes could be used to distinguish between normal and colorectal adenocarcinoma (COADREAD) specimens and that immunity and autophagy are associated with low-risk COADREAD; therefore, these genes have the potential to improve clinical predictions of COADREAD risk.
What is known and what is new?
• Immunity and autophagy play a key role in the development and progression of COADREAD.
• The immune- and autophagy-related genes (IARGs) were explored to establish prognostic risk assessment and clinical prediction models and to understand the molecular basis of COADREAD.
What is the implication, and what should change now?
• IARG and IARDEG biomarkers could be used as prognostic factors for colorectal cancer. The IARGs and differentially expressed immune-autophagy-related genes (IARDEGs) in the model play a key role in the occurrence and progression of COADREAD and could become new targets for treatment.
Introduction
There are approximately 1.9 million new cases and 904,000 deaths annually due to colorectal cancer (COADREAD), and it is the second most common cause of cancer-associated deaths globally (1). Approximately 20% of patients at initial diagnosis exhibit distant metastases. Without treatment, the survival rate is only 6 to 12 months (2), and the prognosis is poor. Thus, the early diagnosis and establishment of prognostic risk models for COADREAD are urgently required.
Autophagy is a highly conserved degradation pathway in eukaryotes in the evolutionary process and plays a complex role in tumors, inhibiting or promoting cancer growth at different stages of cancer development. Autophagy genes, such as LC3, BECN1, and BCL2, are potential prognostic markers for COADREAD (3-5). Immunotherapies based on programmed cell death ligand 1, programmed cell death protein 1, and cytotoxic T lymphocyte-associated antigen 4 have shown advantages in treating COADREAD (6). Therefore, targeting immune and autophagy mechanisms may provide new opportunities to improve the prognosis of colorectal cancer patients.
However, there are currently no effective markers for predicting the immunotherapy efficacy. Therefore, studying the relationship between COADREAD, autophagy, and immunity to effectively predict tumor prognosis has become a new direction for COADREAD clinical research. Although there are many reports on immune- or autophagy-related genes associated with COADREAD prognosis, there have been no studies on immune- and autophagy-related genes as prognostic markers of COADREAD and their relationship with immunotherapy responses. Hence, it is important to discover such genes and to construct a gene-based risk model for the treatment and prognosis of COADREAD.
In this paper, gene expression data from two commonly used databases, the Gene Expression Omnibus (GEO) and the Cancer Genome Atlas (TCGA), were used to identify immune- and autophagy-related genes (IARGs) and immune- and autophagy-related differentially expressed genes (IARDEGs), construct molecular subtypes, and establish risk score models and clinical prediction models based on risk scores to provide a significant reference for clinical prognosis prediction. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-24-1708/rc).
Methods
Data extraction
Transcripts per kilobase of exon model per million mapped reads (TPM) expression profiles (C group: 483 cases; control group: 41 cases; R group: 167 cases; control group: 10 cases) and clinical data (Table 1) of colon cancer (COAD) and rectal cancer (READ) were obtained from TCGA (https://portal.gdc.cancer.gov/) (7) database via the TCGAbiolinks package (8). Since this study required overall survival time (OS time) and OS information of patients with COAD samples, “01A” cancer samples and samples with OS status and OS times greater than 0 were obtained. A total of 340 COAD samples (excluding adjacent cancer samples) and 41 control samples with 11A were involved in the analysis.
Table 1
Characteristics | Alive (n=401) | Dead (n=59) | Total (n=460) |
---|---|---|---|
Age (years) | |||
Mean | 64.7 | 68.4 | 65.2 |
Median | 66 | 71 | 67 |
≤67 | 217 (54%) | 17 (29%) | 234 (51%) |
>67 | 184 (46%) | 42 (71%) | 226 (49%) |
Sex | |||
Female | 182 (45%) | 25 (42%) | 207 (45%) |
Male | 219 (55%) | 34 (58%) | 253 (55%) |
T | |||
T1 | 14 (3.5%) | 2 (3.4%) | 16 (3.5%) |
T2 | 72 (18%) | 4 (6.8%) | 76 (16.5%) |
T3 | 275 (68.6%) | 40 (67.8%) | 315 (68.5%) |
T4 | 39 (9.7%) | 12 (20.3%) | 51 (11.1%) |
TX | 1 (0.2%) | 1 (1.7%) | 2 (0.4%) |
N | |||
N0 | 234 (58.4%) | 23 (39%) | 257 (55.9%) |
N1 | 99 (24.7%) | 18 (30.5%) | 117 (25.4%) |
N2 | 66 (16.4%) | 17 (28.8%) | 83 (18%) |
NX | 2 (0.5%) | 1 (1.7%) | 3 (0.7%) |
M | |||
M0 | 298 (74.3%) | 31 (52.5%) | 329 (72%) |
M1 | 45 (11.2%) | 20 (33.9%) | 65 (14%) |
MX | 58 (14.5%) | 8 (13.6%) | 66 (14%) |
Stage | |||
I | 73 (18.2%) | 3 (5%) | 76 (16.5%) |
II | 146 (36.4%) | 17 (28.8%) | 163 (35.4%) |
III | 120 (30%) | 15 (25.4%) | 135 (29.3%) |
IV | 46 (11.4%) | 20 (34%) | 66 (14.4%) |
Site | |||
Colon | 290 (72%) | 50 (85%) | 340 (74%) |
Rectum | 111 (28%) | 9 (15%) | 120 (26%) |
TCGA, The Cancer Genome Atlas.
The READ samples also required information on the OS and OS time; therefore, “01A” cancer samples and samples with OS status and OS Time greater than 0 were obtained. A total of 120 READ samples (no adjacent cancer samples) and 11 control samples with 11A were involved in the analysis. Owing to the analysis of the overall characteristics of patients with COADREAD in this study, the expression spectrum and clinical data of COAD and READ patients were combined for analysis, and 460 COADREAD samples (no adjacent cancer samples) were involved in the analysis. There were 401 cases in the survival state group, 59 cases in the death state group, and 51 cases in the control group. Only protein-coding genes were retained for the genes in the TPM expression profile, and half of the genes that were not expressed were deleted, leaving 16,891 genes.
The “Masked Somatic Mutation” data was obtained from TCGA database through the TCGAbiolinks package as somatic mutation data for patients with COADREAD. There were 596 cases. We applied the R maftools package (9) to visually analyze the somatic mutations. The “Masked Copy Number Segment” data was taken from TCGA database via the R TCGAbiolinks package as the copy number alteration data for patients with COADREAD. There were 611 cases. The R ggplot2 package was used for the visual analysis of the copy number variation.
We downloaded the COADREAD-related datasets GSE161158 (10) and GSE17536 (11) from the gene expression omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) (12) database using the R GEOquery package (13). Among them, the GSE161158 dataset was from Homo sapiens, its data platform was GPL570, and it included 250 cancer samples (no adjacent cancer samples). As the survival time needed to be greater than 0, the samples with 0 survival time were deleted, and 191 cancer samples (no adjacent cancer samples) were included in the analysis. The GSE17536 dataset was from Homo sapiens, its data platform was GPL570, and it contained 177 cancer samples (no adjacent cancer samples) for analysis.
As there were no data on immunotherapy responses for COADREAD, we used immunotherapy data from cancers not studied in this study for validation with reference to PMID: 35444656 literature (14). The immunotherapy cohort data for bladder cancer (BLCA) (15) were downloaded through the R IMvigor210CoreBiologies package, which included 298 BLCA samples (no adjacent cancer samples) for analysis. Among them, 68 cases were in the complete response (CR)/partial response (PR) group, and 230 cases were in the stable disease (SD)/progressive disease (PD) group.
Immune-related genes (IRGs) were collected through the InnateDB (http://www.innatedb.com/) database (16) and the Immport (https://www.immport.org/) database (17). By combining the results of the two database, 359 IRGs (available online: https://cdn.amegroups.cn/static/public/tcr-24-1708-1.csv) were obtained. The Human Autophagy Database (HADb, http://www.autophagy.lu/indes.html) is the first specialized autophagy database and a public repository that holds information on the human artophagy-related genes (ARGs) described so far. The ARG data were downloaded from HADb, resulting in 222 ARGs (Table S1). After merging and deduplication, we obtained 22 IARGs (Table S2).
Construction of molecular subtypes based on IARGs
Consensus clustering determines the number and members of possible clusters in a dataset (microarray gene expression) (18). To differentiate between different subtypes of COADREAD better, we performed consistency clustering of TCGA expression profile data by processing IARGs using the R ConsensusClusterPlus package (19). In this development, we set the cluster number to between 2–8 and extracted 80% of the total sample 1,000 times, with distance = “Euclidean” and clusterAlg = “pam”.
GSVA
To study the diversity in biological processes between the different groups, GSVA was performed (20) on the TCGA-COADREAD patient gene-expression profile dataset. GSVA is an unsupervised nonparametric algorithm that does not require prior grouping of samples and can compute a specific gene-set-enrichment score in each sample. It is often applied for estimating changes in samples expressing dataset pathways and bioprocess activity. We obtained the “h.all.v7.5.1.symbols” gene set from the MSigDB database (21) for the GSVA of the IARG molecular subtypes in the TCGA-COADREAD dataset. A less-than-0.05 modified P value was considered statistically significant.
Differential gene analysis related to immunity and autophagy
To evaluate the differences between different molecular subtypes, the groupings were analyzed using the R limma package (22) based on the grouping information of different molecular subtypes in TCGA data. The genes with adjPvalue <0.01 and |logFC| >1 were screened as the threshold to obtain IARDEGs. Among them, genes with logFC >1 and adjP value <0.01 were upregulated IARDEGs, and genes with logFC <−1 and adjP value <0.01 were downregulated IARDEGs.
Construction of molecular isoforms based on IARDEGs
To further determine the impact of IARDEGs on the prognosis of COADREAD, we used the R ConsensusClusterPlus package (19) to perform consistent clustering of IARDEGs from TCGA expression profile data to identify different subtypes of COADREAD more accurately. In this development, we set the cluster number to between 2–8 and extracted 80% of the total sample 1000 times, with distance = “maximum” and clusterAlg = “pam”.
Feature enrichment analysis
Gene Ontology (GO) analysis (23) is a typical tool used to study large-scale functional enrichment, such as molecular functions, cellular components, and biological processes. Kyoto Encyclopedia of Genes and Genomes (KEGG) (24) is a common database for storing information about genomes, diseases, drugs, and biological pathways. We performed KEGG pathway enrichment and GO annotation analyses of IARDEGs in the TCGA-COADREAD dataset using the R package clusterProfiler (25). The false discovery rate (FDR) cut-off value of <0.05 was considered statistically significant. The Benjamin-Hochberg method was used to correct the P value. The entry screening criteria were adjP value <0.05 and q value <0.05.
Risk model construction based on IARDEGs
LASSO regression (26) is a commonly adopted method to establish prognostic medals based on linear regression by adding the penalty term (the absolute value of lambda*slope), reducing the overfitting of the model, and improving the generalization ability of the model. For the purpose of evaluating the ability of IARDEGs to predict patient survival, the TCGA dataset was used as the test set, and univariate and multivariate Cox regression analyses and LASSO regression analysis were applied to test for prognosis-associated genes further and establish a prognostic model accordingly. First, we applied the univariate Cox regression analysis to test the link among each IARDEG expression and OS and retained genes with P values <0.05. Subsequently, we used the LASSO algorithm to eliminate multicollinearity and screened for meaningful variables from the univariate Cox regression analysis. To obtain more accurate independent prognostic factors (prognostic characteristic genes), multivariate Cox regression analysis and final screening via stepwise regression were used. Finally, the risk score formula was calculated using multivariate Cox regression coefficients, considering the expression of optimized genes:
The best risk score grouping cut-off value was calculated using the R surv cutpoint function. The TCGA-COADREAD patients were divided into high-risk and low-risk groups based on the cut-off value. A logarithmic rank test and Kaplan-Meier analysis were applied using the R survival package to investigate the OS of TCGA patients.
To analyze the OS in GSE17536 and GSE161158, we used the expression spectrum data of GSE17536 and GSE161158 as the verification set, tested the risk score separately using the above formula, grouped the data based on the risk cut-off value, and performed log-rank tests and Kaplan-Meier analysis using the R survival package.
Gene set enrichment analysis (GSEA)
To analyze the differences in biological processes among the two groups, GSEA was performed (27) based on the TCGA-COADREAD gene-expression profiling dataset. GSEA is a computational method that analyzes whether a particular set of genes is statistically different between two biological states and measures changes in pathway and bioprocess activity in expression dataset samples. The gene collection “h.all.v7.5.1.symbols” was loaded from the MSigDB database (21) for GSEA of all genes in high- and low-risk patient samples. An adjusted P value of <0.05 was considered statistically significant.
Immunoinfiltration analysis
The TME mainly includes surrounding inflammatory and immune cells, tumor tissue, tumor-associated interstitial tissues, fibroblasts, various chemokines, and cytokines, and is a comprehensive system of loading. The immune-cell-infiltration analysis in cancer tissues plays a vital role in disease research and treatment prognosis. We obtained IRGs from PMID:28052254 (28) and the gene set contained 28 cell types and 782 genes, including activated dendritic cells, natural killer T cells, activated CD8 T cells, regulatory T cells, macrophages, and other human immune cell subtypes. We processed TCGA-COADREAD expression profile data with a GSVA package in R and analyzed the degree of invasion of immune cells using the ssGSEA algorithm. Immune cell correlation maps were plotted using the R corrplot package.
Construction of clinical prediction model based on IADEG model
A nomogram (29) is a plot in a planar cartesian coordinate system that represents the functional relationship between multiple independent variables with a cluster of disjoint line segments. It relies on multivariate regression analysis and usually sets a certain scale to score and characterize each variable of the multivariate regression model, finally calculating the total score to forecast the probability of events occurring. Thus, to demonstrate that risk scores are linked with clinicopathological features and could be used for individualizing the prognosis of patients with COADREAD, we used multivariate and univariate Cox regression based on TCGA-COADREAD expression profile data to investigate the ability of the risk scores linked with clinicopathological characteristics of patients with COADREAD to predict OS. After the risk score model was generated, the clinicopathological features (P<0.05) that significantly correlated with OS were included in the model, and the clinical prediction nomogram was constructed using the R rms package. To evaluate the performance of the nomogram, a calibration curve was created, which compared the estimated values of the nomogram with the observed actual survival rate.
Statistical analysis
All data processing and analysis involved in this study were performed using R software (Version 4.2.2, https://www.r-project.org/). The statistical significance of normally distributed variables for comparison of two sets of continuous variables was evaluated using the independent Student’s t-test. The differences in non-normally distributed variables were analyzed using the Mann-Whitney U test (Wilcoxon rank sum test). The statistical significance among the two groups of categorical variables was compared and evaluated using the Chi-square test or Fisher’s exact test. We performed survival analysis using the R survival package, showed survival differences using the Kaplan-Meier survival curve, and assessed the significance of the difference in survival time between the two groups using the log-rank test. Both multivariate and univariate Cox analyses were based on the R survival package, and Lasso analysis was based on the R glmnet package (30). All statistical P values in tests were two-tailed, with P<0.05 as statistically significant. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Results
Effects of IARGs on colorectal cancer in TCGA-COADREAD dataset
The technical roadmap for this study is shown in Figure 1. To analyze the effect of IARGs on COADREAD, we performed a principal component analysis of TCGA data from patients with COADREAD and IARGs using the R prcomp function (Figure 2A). The results showed that IARGs could be used to distinguish most COADREAD samples from normal samples. We then used the R ggplot function to analyze the mRNA levels in TCGA-COADREAD normal and cancer samples (Figure 2B), which revealed that the expression of more than 85% of IARGs differed greatly between these two groups (P<0.05). Therefore, to visually identify the differences between IARGs in the normal and COADREAD groups, we identified the differentially expressed genes (adjP value <0.01, |logFC| >1) among the two groups using the limma package of R and generated a volcano plot (Figure 2C) and heat map (Figure 2D) using the R ggplot function and heatmap function.


The results showed that VEGFA (P<0.001), BIRC5 (P<0.001), BID (P<0.001), and TNFSF10 (P<0.001 were significantly differently expressed among the differentially expressed genes. The VEGFA (P<0.001), BIRC5 (P<0.001), and BID (P<0.001) genes were highly expressed in the COADREAD group, and TNFSF10 expression (P<0.001) was low in the COADREAD group. We then performed a single nucleotide mutation analysis of IARGs via the R maftools package (Figure 2E). The results showed that the IARGs were mutated in 77 samples with a mutation frequency of 17.19%, and the highest mutation frequency of the EGFR gene was 4%. By analyzing the copy number variation of IARGs in COADREAD (Figure 2F), we found that most of the samples exhibited copy number changes and amplification.
Molecular isoform construction of IARGs in TCGA-COADREAD dataset
To analyze the correlation between IARGs in the TCGA-COADREAD dataset, correlation heat maps of IARGs were plotted using the R corrplot package (Figure 3A). The expression of most IARGs was positively correlated with COADREAD. Among them, the TBK1 and EIF2AK2 genes were highly positively correlated (cor=0.772, P<0.001), the MAPK8 and TBK1 genes were positively correlated (cor=0.752, P<0.001, and the BID and EGFR genes were negatively correlated (cor=−0.229, P<0.001).

To analyze the effect of IARGs on COADREAD prognosis, we performed unsupervised clustering using a consistent clustering algorithm on the expression profiles of 22 IARGs for all samples from patients with COADREAD. First, we determined the optimal number of clusters by computation (consistent cumulative distribution function: Figure 3B; lithography: Figure 3C; cluster result: Figure 3D). The results indicated that the clustering effect was the best when the cluster number was four. Therefore, we selected k=4 as the optimal number of clusters for unsupervised clustering, and all samples were clustered into four categories.
Subsequently, we performed prognostic analysis on four types of samples. Survival significantly differed between the four subtypes (log-rank P=0.03) (Figure 3E), demonstrating the accuracy of the clustering results. We then plotted a boxplot showing the differential expression of IARGs between different groups using the ggplot2 package of R (Figure 3F). The results indicated that, except for the BIRC5 gene, there were significant differences in the expression of IARGs between different groups (P<0.05). Additionally, heat maps of differences in IARG expression between different groups were plotted (Figure 3G). Genes such as CXCR4, CCL2, and CTSB were highly expressed in the C4 class, which exhibited a better prognosis, and exhibited low expression in the C1 class, which had a poorer prognosis.
Gene set variation analysis (GSVA) of IARG model in TCGA-COADREAD dataset
We conducted an enrichment analysis of hallmark gene sets from the expression profile data of the four subtypes using the GSVA package in R (Figure 4A, Table S3) to analyze the functional differences among different isoforms in the TCGA-COADREAD dataset. The subtypes were mainly enriched in complement, inflammatory responses, KRAS signaling, STAT5/IL2 signaling, JAK/IL6/STAT3 signaling, allograft rejection, apoptosis, interferon-gamma response, and other hallmark gene sets (Figure 4A). The androgen response and protein secretion pathways were highly expressed in C4 samples. Moreover, xenobiotic metabolism and peroxisome pathways were highly expressed in C1.

To analyze the correlation between the four subtypes and different biological characteristics, the proportion of different IARG molecular subtypes in the disease was plotted using the ggplot function of R (Figure 4B). The proportion of the C4 class in the colon was higher than that of the C1 class (colon: C4 class accounted for 81.2%, and C1 class accounted for 65.4%). The proportion of IARG molecular subtypes in the M stage was then plotted using the ggplot function of R (Figure 4C). The results showed that in the M0 group, the proportion of the C4 class was higher than that of the C1 class (M0: C4 class accounted for 25.53%, and the C1 class accounted for 23.40%). C1 class samples with a better prognosis were more dominant in the M0 group than in the M1 group (C4: 25.53% in the M0 group and 7.69% in the M1 group). A Sanchi plot was plotted using the R ggalluvial function to show the relationship between tumor stage, IARG molecular subtypes, and N stage (Figure 4D). The results showed that the tumors of most of the C4 patients with a better prognosis were Stage II and Stage III. Most of the C4 patients with better prognoses were in the N0 stage, and most of the N1 patients with poor prognoses comprised the C1 class.
Molecular subtype construction and functional enrichment analysis of IARDEGs in the TCGA-COADREAD dataset
We performed an analysis of the differential gene expression of four subtypes in the TCGA-COADREAD dataset to determine the biological differences between different subtypes using the R limma package. In patients with COADREAD, a total of 1,754 IARDEGs (available online: https://cdn.amegroups.cn/static/public/tcr-24-1708-2.csv) were acquired. To analyze the effect of IARDEGs on COADREAD prognosis, we performed unsupervised clustering using a consistent clustering algorithm on the expression profiles of 1,754 IARDEGs for all patient samples. First, we calculated the optimal number of clusters (Figure 5A). The effect of clustering was best when the cluster number was three. Thus, k=3 was selected as the optimal number of clusters for unsupervised clustering, and all samples were clustered into three classes. Then, we conducted a prognosis analysis of the three types of samples. The findings showed a remarkable difference in survival among the three groups (Log-rank P=0.045) (Figure 5B), which proved the accuracy of the clustering results.

GO (biological processes: Figure 5C; cell composition: Figure 5D; molecular function: Figure 5E, Table 2) and KEGG (Figure 5F, Table 3) functional enrichment analyses were performed on 1754 IARDEGs to analyze the associated molecular functions, cellular components, biological processes, and biological pathways. Next, GO functional enrichment analysis on IARDEGs was completed. The results demonstrated that the IARDEGs were largely enriched in leukocyte migration, cell-adhesion positive regulation, cytokine production positive regulation, cell-cell adhesion regulation, leukocyte cell-cell adhesion, T cell-activation regulation, cell chemotaxis, and other biological processes (Figures 5C), as well as the external side of endoplasmic reticulum lumen, collagen-containing extracellular matrix, plasma membrane, secretory granule membrane, membrane raft, membrane microdomain, endocytic vesicle, cell-substrate junction, and other cellular components (Figure 5D). The enriched molecular functions were extracellular matrix structural constituent, integrin binding, collagen binding, cytokine binding, glycosaminoglycan binding, immune receptor activity, heparin binding, growth factor binding, and other molecular functions (Figure 5E) (Table 2). Pathway enrichment analysis of IARDEGs showed enrichment in KEGG pathways such as hematopoietic cell lineage, rheumatoid arthritis, viral protein interaction with cytokines and cytokine receptors, cell-adhesion molecules, chemokine signaling pathway, cytokine-cytokine receptor interaction, and Staphylococcus aureus infection (Figure 5F, Table 3). In addition, we visualized the Th1 and Th2 cell differentiation pathways (Figure 5G) and the Th17 cell differentiation pathway (Figure 5H).
Table 2
ONTOLOGY | ID | Description | P.adjust |
---|---|---|---|
BP | GO:0050900 | Leukocyte migration | <0.001 |
GO:0045785 | Positive regulation of cell adhesion | <0.001 | |
GO:0007159 | Leukocyte cell-cell adhesion | <0.001 | |
GO:0001819 | Positive regulation of cytokine production | <0.001 | |
GO:0022407 | Regulation of cell-cell adhesion | <0.001 | |
GO:0050863 | Regulation of T cell activation | <0.001 | |
GO:0060326 | Cell chemotaxis | <0.001 | |
GO:1903037 | Regulation of leukocyte cell-cell adhesion | <0.001 | |
GO:0022409 | Positive regulation of cell-cell adhesion | <0.001 | |
GO:0002683 | Negative regulation of immune system process | <0.001 | |
CC | GO:0062023 | Collagen-containing extracellular matrix | <0.001 |
GO:0009897 | External side of plasma membrane | <0.001 | |
GO:0005788 | Endoplasmic reticulum lumen | <0.001 | |
GO:0030667 | Secretory granule membrane | <0.001 | |
GO:0045121 | Membrane raft | <0.001 | |
GO:0098857 | Membrane microdomain | <0.001 | |
GO:0030139 | Endocytic vesicle | <0.001 | |
GO:0030055 | Cell-substrate junction | <0.001 | |
GO:0005925 | Focal adhesion | <0.001 | |
GO:0005581 | Collagen trimer | <0.001 | |
MF | GO:0005201 | Extracellular matrix structural constituent | <0.001 |
GO:0005178 | Integrin binding | <0.001 | |
GO:0005539 | Glycosaminoglycan binding | <0.001 | |
GO:0019955 | Cytokine binding | <0.001 | |
GO:0140375 | Immune receptor activity | <0.001 | |
GO:0005518 | Collagen binding | <0.001 | |
GO:0008201 | Heparin binding | <0.001 | |
GO:0019838 | Growth factor binding | <0.001 | |
GO:0005126 | Cytokine receptor binding | <0.001 | |
GO:0005125 | Cytokine activity | <0.001 |
BP, biological process; CC, cellular component; COADREAD, colorectal adenocarcinoma; GO, Gene Ontology; IARGs, immune and autophagy-related genes; MF, molecular function; TCGA, The Cancer Genome Atlas.
Table 3
ID | Description | P.adjust |
---|---|---|
hsa04640 | Hematopoietic cell lineage | <0.001 |
hsa04061 | Viral protein interaction with cytokine and cytokine receptor | <0.001 |
hsa05323 | Rheumatoid arthritis | <0.001 |
hsa04514 | Cell adhesion molecules | <0.001 |
hsa04060 | Cytokine-cytokine receptor interaction | <0.001 |
hsa04062 | Chemokine signaling pathway | <0.001 |
hsa05150 | Staphylococcus aureus infection | <0.001 |
hsa05144 | Malaria | <0.001 |
hsa04380 | Osteoclast differentiation | <0.001 |
hsa05140 | Leishmaniasis | <0.001 |
COADREAD, colorectal adenocarcinoma; KEGG, Kyoto Encyclopedia of Genes and Genomes; IARGs, immune and autophagy-related genes; TCGA, The Cancer Genome Atlas.
To identify the correlations between the three subtypes of IARDEGs and different biological features in the TCGA-COADREAD dataset, the proportions of different IARDEGs in the M stages (Figure 5I) and age groups (Figure 5J) were plotted using the R ggplot function. The proportion of G2 class samples with better prognosis was less than that of G1 class samples with poor prognosis in the M1 stage (M1: G2 class accounted for 35.4%, G1 class accounted for 50.8%), and the proportion of G2 class samples with a better prognosis in the M0 stage was higher than that in the M1 stage (G2: 39.8% in M0 stage and 35.4% in M1 stage). Meanwhile, the proportion of G2 samples with better survival was higher in the ≤67 years age group than in the >67 years age group (G1: 40.6% in ≤67 years, 33.6% in >67 years). A Sanchi plot was plotted using the R ggalluvial function to show the relationship between tumor stage, IARDEG molecular subtypes, and N stage (Figure 5K). The results demonstrated that most G2 patients with a good prognosis were N0 patients, and most Stage III and Stage IV patients with a poor prognosis were included in the G1 class.
Model construction of IARDEGs and GSEA of TCGA-COADREAD
To quantify the influence of IARDEGs on COADREAD prognosis, we synthesized the expression of 1,754 IARDEGs to institute a risk score model. Initially, 1,754 IARDEGs in the TCGA-COADREAD dataset were screened according to univariate Cox regression analysis. We selected 26 genes with scores less than 0.05 for subsequent analysis (Figure 6A, available online: https://cdn.amegroups.cn/static/public/tcr-24-1708-3.csv). Next, we screened the 26 IARDEGs associated with the prognosis of COADREAD patients as per LASSO regression analysis to eliminate collinearity. Then, the optimal lambda value was determined, and 15 genes (RPS10, HLA-DQB2, ADAM9, CITED2, BATF, OXCT1, TIMP1, CCL22, CLEC10A, CST2, AFF1, EGFL7, CTLA4, DUOXA2, and CCDC85B) were obtained for further analysis (Figure 6B). Overall, 11 genes were (RPS10, HLA-DQB2, ADAM9, BATF, TIMP1, CCL22, CST2, AFF1, EGFL7, CTLA4, and DUOXA2) obtained via stepwise regression screening for optimal combination (Figure 6C). The model scoring formula was:

To investigate the model’s effectiveness, the risk score cut-off value (12.10278) was used to divide the TCGA-COADREAD patient training dataset into low-risk and high-risk groups. The survival analysis results demonstrated that there were remarkable differences between the two groups of the TCGA dataset, and the survival of the low-risk group was significantly better than that of the high-risk group (P< 0.001) (Figure 6D). Subsequently, the risk score cut-off values (GSE17536: 16.20523; GSE161158: 17.29614) were used to divide patients in the GSE17536 and GSE161158 validation sets into low-risk and high-risk groups, respectively. Survival analysis demonstrated that the survival of the low-risk group in the GSE17536 dataset was significantly better than that of the high-risk group (P=0.02) (Figure 6E), and the survival of the low-risk group in the GSE161158 dataset was also significantly better than that of the high-risk group (P<0.001) (Figure 6F).
To determine the effect of the gene expression levels in the two groups of COADREAD, the biological processes associated with TCGA data gene expression were analyzed via GSEA. The results showed that allograft rejection (Figure 6G, NES =−2.36,P<0.001), interferon-gamma response (Figure 6H, NES =−2.35, P<0.001), inflammatory response (Figure 6I, NES =−2.25, P<0.001), interferon alpha response (Figure 6J, NES =−2.08, P<0.001), JAK IL6 STAT3 signaling (Figure 6K, NES =−1.59, P<0.001), and IL2 STAT3 signaling (Figure 6L, NES =−1.17, P<0.001) were greatly enriched in low-risk patients (Table 4).
Table 4
ID | NES | P.adjust |
---|---|---|
HALLMARK_ALLOGRAFT_REJECTION | −2.35526 | <0.001 |
HALLMARK_INTERFERON_GAMMA_RESPONSE | −2.3504 | <0.001 |
HALLMARK_INFLAMMATORY_RESPONSE | −2.25353 | <0.001 |
HALLMARK_INTERFERON_ALPHA_RESPONSE | −2.08445 | <0.001 |
HALLMARK_IL6_JAK_STAT3_SIGNALING | −2.08219 | <0.001 |
HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | −2.00582 | <0.001 |
HALLMARK_TNFA_SIGNALING_VIA_NFKB | −1.91784 | <0.001 |
HALLMARK_COMPLEMENT | −1.7767 | <0.001 |
HALLMARK_UV_RESPONSE_DN | −1.8522 | <0.001 |
HALLMARK_IL2_STAT5_SIGNALING | −1.71186 | <0.001 |
COADREAD, colorectal adenocarcinoma; GSEA, gene-set enrichment analysis; NES, normalized enrichment score; TCGA, The Cancer Genome Atlas.
COADREAD dataset
Immune cell invasion analysis of TCGA patients with COADREAD
To assess the impact of the IARDEG model risk score on the different immune cell infiltration levels in patients with COADREAD, the ssGSEA algorithm was used to detect the relative abundance of each infiltrating immune cell and identify the type of each infiltrating immune cell. Heat plots were plotted using the R pheatmap function to show the immune cell infiltration scores in TCGA data to examine the immune cell expression (Figure 7A, available online: https://cdn.amegroups.cn/static/public/tcr-24-1708-4.csv). The results demonstrated that immune cells such as type 17 T helper cells and CD56dim natural killer cells were highly expressed in the high-risk group, and cells such as immature dendritic, central memory CD8 T, and gamma delta T cells were highly expressed in the low-risk group (Figure 7A).

A heat map of the correlation among the 11 IARDEGs in the immune cell and risk models was plotted using the R ggplot function (Figure 7B). The results demonstrated that genes such as BATF were highly expressed in most immune cells; for example, BATF was highly expressed in central memory CD4 T cells (cor=0.119, P=0.01). Some genes, such as ADAM9, were expressed at low levels in most immune cells; ADAM9 was expressed in memory B cells (cor=−0.119, P=0.01) (Figure 7B). We then plotted a box plot using the R ggplot function to investigate whether there was a difference in the immune cell infiltration degree among the two groups. We found that over 70% of the immune cells were significantly different between the two groups (P<0.05), and most immune cells exhibited significantly higher infiltration in the low-risk group than in the high-risk group (Figure 7C, P<0.05).
Then, the R corrplot function was used to plot the correlation between the two groups in the TCGA-COADREAD data and differentiated immune cells. The results revealed a positive correlation among most immune cells in the high-risk group (Figure 7D). Among them, central memory CD4 T cells were significantly positively correlated with natural killer cells (cor=0.861, P<0.001), and type 2 T helper cells were remarkably negatively associated with CD56dim natural killer cells (cor=−0.260, P=0.01). Most low-risk group immune cells (Figure 7E) were also positively correlated. Among them, effector memory CD8 T cells were significantly positively correlated with MDSC immune cells (cor=0.915, P<0.001), and effector memory CD4 T cells were significantly negatively correlated with CD56dim natural killer cells (cor=−0.255, P<0.001).
Prediction of immunotherapy efficacy based on risk scores of TCGA-COADREAD dataset
For the purpose of analyzing the prediction of immunotherapy efficacy in different datasets using the risk score model, the risk score cut-off value (10.97891) was used to divide the patients in the immunotherapy dataset IMvigor210 into low-risk and high-risk groups. The result of the survival analysis showed that there was a remarkable difference between the two groups. The survival of the low-risk group was better than that of the high-risk group (Figure 8A, P=0.03), which indicated that the risk score could predict immunotherapy efficacy and patient survival.

Then, the proportion of immunotherapy responses in the IMvigor210 dataset in the two groups was plotted using the ggplot function of R (Figure 8B). The immunotherapy response group accounted for more of the low-risk group than the high-risk group (CR/PR: low group: 27%; high group: 14%), and the immunotherapy non-response group accounted for more of the high-risk group than the low-risk group (SD/PD: low group: 73%; high group: 86%). Finally, we plotted characteristic gene-expression patterns, survival status, and the risk score distribution of the IMvigor210 dataset (Figure 8C) and showed that the high-risk patients were more likely to die and had roughly the same gene expression pattern.
Immune checkpoints are intrinsic regulatory mechanisms in the immune system that prevent collateral damage during physiological immune responses and maintain self-tolerance. To investigate the connection between risk scores and immune checkpoints, we plotted a difference boxplot of immune checkpoint genes in the two groups using the R ggplot function (CD274, Figure 8D; CD160, Figure 8E; CD86, Figure 8F; CD47, Figure 8G; CD226, Figure 8H; TNFRSF9, Figure 8I). The results showed that the expression levels of the majority of the immune checkpoint genes were remarkably different, and the low-risk group expression levels were significantly higher than those of the high-risk group, such as CD274 (P<0.001) and CD160 (P<0.001) expression levels.
Construction of clinical prediction model based on risk scores in TCGA-COADREAD dataset
To judge whether multiple clinicopathological features and risk scores were independent prognostic factors, multivariate and univariate Cox regressions were performed. The results of univariate Cox regression revealed that risk score (P<0.001), M stage (P<0.001), stages (P<0.001), age group (P=0.004), and N stage (P=0.01) were related to the overall survival (OS) of the patients (Figure 9A, Table 5). Multivariate Cox regression of the significant factors in univariate Cox regression demonstrated that risk score (P<0.001), age group (P=0.001), and M-stage (P<0.001) were greatly correlated with the OS of the patient (Figure 9B, Table 6). Therefore, the risk score was combined with the M stage and age group to establish a predictive nomogram to assess the OS status of patients with COADREAD (Figure 9C). The nomogram was calibrated and compared with the actual observed OS of the patient at one, three, and five years. Good consistency between the two was found (Figure 9D), indicating that the model accuracy was good.

Table 5
Variable | HR | Lower 95% CI | Upper 95% CI | P value |
---|---|---|---|---|
Risk score | 2.72 | 1.95 | 3.79 | 3.43E−09 |
TNM_M | 4.14 | 2.32 | 7.37 | 1.44E−06 |
Stage | 1.39 | 0.398 | 4.85 | 1.16E−05 |
Age | 2.32 | 1.32 | 4.07 | 0.00355 |
TNM_N | 1.95 | 1.15 | 3.33 | 0.0136 |
Gender | 1.37 | 0.815 | 2.3 | 0.235 |
TNM_T | 1.45 | 0.617 | 3.41 | 0.393 |
Data are presented as HR with 95% CI, derived from univariate Cox proportional hazards regression analysis. HR represents the relative risk of an event occurring, and the corresponding 95% CI describes the uncertainty range of the HR estimate. The P values were calculated using the Wald test, with P<0.05 considered statistically significant. CI, confidence interval; HR, hazard ratio; TNM_M, Tumor Node Metastasis_Metastasis; TNM_N, Tumor Node Metastasis_Node; TNM_T, Tumor Node Metastasis_Tumor.
Table 6
Variable | HR | Lower 95% CI | Upper 95% CI | P value |
---|---|---|---|---|
Risk score | 2.76 | 1.92 | 3.97 | <0.001 |
Stage II | 3.11 | 0.4 | 24.14 | 0.278 |
Stage III | 3.64 | 0.47 | 28.16 | 0.216 |
Stage IV | 17.1 | 2.28 | 128.16 | 0.006 |
Age >67 years | 2.97 | 1.52 | 5.8 | 0.001 |
Data are presented as HR with 95% CI, derived from multivariate Cox proportional hazards regression analysis. The multivariate model was adjusted for tumor stage and patient age. HR represents the relative risk of an event occurring after adjusting for confounding factors. The P values were derived from the Wald test, with statistical significance set at P<0.05. CI, confidence interval; HR, hazard ratio.
Discussion
The initiation and progression of COADREAD is a multi-factor, multi-gene, and multi-stage process with many recognized risk factors. Its prognosis is closely related to early diagnosis and early intervention. At present, the early diagnosis rate of colorectal lesions has been significantly raised by advances in electronic colonoscopy, multi-row spiral CT, magnetic resonance, immunological fecal occult blood tests, etc. In terms of treatment, surgery has gradually become standardized and precise, radiotherapy and chemotherapy regimens have been continuously updated and optimized, and research into immunotherapy and targeted drugs has made huge progress. However, over 50% of patients are still diagnosed at Stage III–IV (31,32), the five-year survival rate for distant metastasis is 12% (33), and the prognosis is poor. This is in part because the molecular mechanisms of colorectal cancer are not fully understood.
Autophagy lysosomes degrade damaged, defective, or unwanted intracellular components to protect cells from damage, and this process is strictly regulated and highly conserved. Abnormalities in autophagy-related genes lead to dysregulation that causes cells to become cancerous. The initiation and progression of COADREAD are closely associated with autophagy (34). The TME is the complex microecological environment in which tumorigenesis and development occur, including hypoxic, acidic, immune, and inflammatory microenvironments (35). The immune microenvironment includes T cells, B cells, macrophages, and their related signaling molecules, and its main role is immunosuppression, which plays a vital part in cancer cell development (36).
In this study, 22 immune and autophagy-related genes were acquired from 359 genes related to immunity and 222 genes related to autophagy; more than 85% of the genes related to immunity and autophagy differed between the normal and tumor samples. Among them, VEGFA, BIRC5, and BID were highly expressed, and TNFSF10 was lowly expressed in the colorectal cancer group. There were 17.19% gene mutations in immune- and autophagy-related genes, and the EGFR mutation rate reached 4%, mainly owing to copy number amplification. VEGFA is a vital angiogenic growth factor that induces and stimulates the growth of lymphatic and blood vessels in malignant tumors (37,38). VEGFA is also able to participate in tumor immune responses. VEGFA binds to VEGFR2 on dendritic cells (DCs) to inhibit DC maturation, resulting in reduced tumor antigen presentation and potential immune evasion from tumors (39). VEGFA leads to T cell depletion by binding to R on the CD8+ T cell surface (40).
BIRC5 is a key gene for inhibiting apoptosis, and BIRC5 is highly expressed in tumor tissues, causing tumor metastasis and recurrence by inhibiting apoptosis (41). The expression of BIRC5 is elevated in patients with COADREAD, and its overexpression is related to poor prognosis (42). TNFSF10, a member of the TNF-α family, is a pro-apoptotic cytokine that induces apoptosis in many tumor cell lines (43,44). Its expression in those tumors is consistent with the expression in tumors in this study. Immunity and autophagy promote the occurrence and development of colorectal cancer by promoting apoptosis and angiogenesis and inhibiting cell death.
According to the correlation heat map, most colorectal cancer immunity and autophagy genes were positively correlated. After consistent clustering, molecular subtypes were constructed, and significant differences in the prognosis of colorectal cancer were found between different immune and autophagy-related genes, which were mainly enriched in hallmark genes such as KRAS and IL6/JAK/STAT3 signaling genes. KRAS participates in regulating cell growth, differentiation, proliferation, and survival signaling pathways (45). Patients with COADREAD have the highest rates of KRAS gene mutations (46). IL-6 is an inflammatory factor, and IL-6 expression is strongly related to tumor growth in patients with COADREAD (47).
To analyze the biological differences among different subtypes, a total of 1754 IARDEGs were obtained. The GO and KEGG functional enrichment analyses demonstrated that in terms of biological processes, the IARDEGs were mostly enriched in inflammatory processes, T cell activation regulation, and cell chemotaxis. Among the cellular components, the IARDEGs were mainly enriched in autophagosomes. Cytokines are mainly expressed in the TME of tumor cells and immune cells and are information carriers between cancer and immune cells, affecting changes in cancer immunity (48). IARDEGs act as cytokines and cytokine receptors in COADREAD.
Consistent clustering was performed to identify the prognosis risk, which was significantly different between the IARGs in the prognosis groups of patients with COADREAD. Then, a prognostic risk model was established using LASSO and univariate Cox regression analyses, and 11 genes (RPS10, HLA-DQB2, ADAM9, BATF, TIMP1, CCL22, CST2, AFF1, EGFL7, CTLA4, and DUOXA2) were screened to establish the model. EGFL7 is a vascular endothelial-specific expression gene, and the protein encoded by it plays an important role in embryonic vascular development, vascular lumen formation, and inducing endothelial cell migration (49). CTLA4 is closely related to tumor immunity, inhibits T cell activation, and reduces T cell proliferation and cytokine secretion (50).
DUOXA2 is a vital member of the NADPH oxidase family and participates in synthesizing thyroxine. DUOXA2 is highly expressed in COADREAD and connected to lymph node metastasis (51). These genes could be prognostic factors for COADREAD. Additionally, this study showed a higher degree of T cell and NK cell infiltration in COADREAD. Immune checkpoint genes, such as CD86 and TNFRSF9, were expressed at high levels in low-risk groups. Therefore, a nomogram was established that included age, risk score, and tumor stage, and had good survival-prediction abilities of patients with COADREAD.
The study had the following limitations. This study was conducted based on public databases and did not experimentally validate the findings. Despite these limitations, these findings could serve as a theoretical basis for further research on biomarkers correlated with colorectal cancer prognosis. We will focus on elucidating the roles of these genes in the future.
Conclusions
IARG and IARDEG biomarkers could be used as prognostic factors for colorectal cancer. The established risk model is highly associated with immunity and autophagy, which provides a valuable target for the early diagnosis of patients with COADREAD and an indicator for evaluating prognosis. The IARGs and IARDEGs in the model play a key role in the occurrence and progression of COADREAD and could become new targets for treatment.
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist Available at https://tcr.amegroups.com/article/view/10.21037/tcr-24-1708/rc
Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-24-1708/prf
Funding: This work was financially supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-24-1708/coif). M.Z. reports that this work was financially supported by the Science Research Foundation of Inner Mongolia People’s Hospital (No. 2022YN11). W.J. reports that this work was financially supported by the Natural Science Foundation of Inner Mongolia (No. 2022QN06007). The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2024;74:229-63. [Crossref] [PubMed]
- Chakedis J, Schmidt CR. Surgical Treatment of Metastatic Colorectal Cancer. Surg Oncol Clin N Am 2018;27:377-99.
- Wu S, Sun C, Tian D, et al. Expression and clinical significances of Beclin1, LC3 and mTOR in colorectal cancer. Int J Clin Exp Pathol 2015;8:3882-91.
- Selvakumaran M, Amaravadi RK, Vasilevskaya IA, et al. Autophagy inhibition sensitizes colon cancer cells to antiangiogenic and cytotoxic therapy. Clin Cancer Res 2013;19:2995-3007. [Crossref] [PubMed]
- Koehler BC, Scherr AL, Lorenz S, et al. Beyond cell death - antiapoptotic Bcl-2 proteins regulate migration and invasion of colorectal cancer cells in vitro. PLoS One 2013;8:e76446. [Crossref] [PubMed]
- Weng J, Li S, Zhu Z, et al. Exploring immunotherapy in colorectal cancer. J Hematol Oncol 2022;15:95. [Crossref] [PubMed]
- Tomczak K, Czerwińska P, Wiznerowicz M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol (Pozn) 2015;19:A68-77. [Crossref] [PubMed]
- Silva TC, Colaprico A, Olsen C, et al. TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages. F1000Res 2016;5:1542. [Crossref] [PubMed]
- Mayakonda A, Lin DC, Assenov Y, et al. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res 2018;28:1747-56. [Crossref] [PubMed]
- Szeglin BC, Wu C, Marco MR, et al. A SMAD4-modulated gene profile predicts disease-free survival in stage II and III colorectal cancer. Cancer Rep (Hoboken) 2022;5:e1423. [Crossref] [PubMed]
- Smith JJ, Deane NG, Wu F, et al. Experimentally derived metastasis gene expression profile predicts recurrence and death in patients with colon cancer. Gastroenterology 2010;138:958-68. [Crossref] [PubMed]
- Barrett T, Wilhite SE, Ledoux P, et al. NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res 2013;41:D991-5. [Crossref] [PubMed]
- Davis S, Meltzer PS. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics 2007;23:1846-7. [Crossref] [PubMed]
- Zhang HC, Deng SH, Pi YN, et al. Identification and Validation in a Novel Quantification System of Ferroptosis Patterns for the Prediction of Prognosis and Immunotherapy Response in Left- and Right-Sided Colon Cancer. Front Immunol 2022;13:855849. [Crossref] [PubMed]
- Mariathasan S, Turley SJ, Nickles D, et al. TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature 2018;554:544-8. [Crossref] [PubMed]
- Breuer K, Foroushani AK, Laird MR, et al. InnateDB: systems biology of innate immunity and beyond--recent updates and continuing curation. Nucleic Acids Res 2013;41:D1228-33. [Crossref] [PubMed]
- Bhattacharya S, Dunn P, Thomas CG, et al. ImmPort, toward repurposing of open access immunological assay data for translational and clinical research. Sci Data 2018;5:180015. [Crossref] [PubMed]
- Monti S, Tamayo P, Mesirov JP, Golub TR. Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data. Machine Learning 2003;52:91-118.
- Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 2010;26:1572-3. [Crossref] [PubMed]
- Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 2013;14:7. [Crossref] [PubMed]
- Liberzon A, Birger C, Thorvaldsdóttir H, et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst 2015;1:417-25. [Crossref] [PubMed]
- Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47. [Crossref] [PubMed]
- Gene Ontology Consortium. going forward. Nucleic Acids Res 2015;43:D1049-56. [Crossref] [PubMed]
- Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27-30. [Crossref] [PubMed]
- Wu T, Hu E, Xu S, et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation (Camb) 2021;2:100141. [Crossref] [PubMed]
- Cai W, van der Laan M. Nonparametric bootstrap inference for the targeted highly adaptive least absolute shrinkage and selection operator (LASSO) estimator. Int J Biostat 2020;/j/ijb.ahead-of-print/ijb-2017-0070/ijb-2017-0070.xml.
- Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 2005;102:15545-50. [Crossref] [PubMed]
- Charoentong P, Finotello F, Angelova M, et al. Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade. Cell Rep 2017;18:248-62.
- Park SY. Nomogram: An analogue tool to deliver digital knowledge. J Thorac Cardiovasc Surg 2018;155:1793. [Crossref] [PubMed]
- Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw 2010;33:1-22.
- Li X, Zhou Y, Luo Z, et al. The impact of screening on the survival of colorectal cancer in Shanghai, China: a population based study. BMC Public Health 2019;19:1016. [Crossref] [PubMed]
- Lee YH, Kung PT, Wang YH, et al. Effect of length of time from diagnosis to treatment on colorectal cancer survival: A population-based study. PLoS One 2019;14:e0210465. [Crossref] [PubMed]
- Miller KD, Nogueira L, Mariotto AB, et al. Cancer treatment and survivorship statistics, 2019. CA Cancer J Clin 2019;69:363-85. [Crossref] [PubMed]
- Mokarram P, Albokashy M, Zarghooni M, et al. New frontiers in the treatment of colorectal cancer: Autophagy and the unfolded protein response as promising targets. Autophagy 2017;13:781-819. [Crossref] [PubMed]
- Roma-Rodrigues C, Mendes R, Baptista PV, et al. Targeting Tumor Microenvironment for Cancer Therapy. Int J Mol Sci 2019;20:840. [Crossref] [PubMed]
- Sun Y. Tumor microenvironment and cancer therapy resistance. Cancer Lett 2016;380:205-15. [Crossref] [PubMed]
- Wei D, Xin Y, Rong Y, et al. Correlation between the Expression of VEGF and Ki67 and Lymph Node Metastasis in Non-small-Cell Lung Cancer: A Systematic Review and Meta-Analysis. Evid Based Complement Alternat Med 2022;2022:9693746. Retracted Publication. [Crossref] [PubMed]
- Tian L, Chen X, Cao L, et al. Effects of plant-based medicinal food on postoperative recurrence and lung metastasis of gastric cancer regulated by Wnt/β-catenin-EMT signaling pathway and VEGF-C/D-VEGFR-3 cascade in a mouse model. BMC Complement Med Ther 2022;22:233. [Crossref] [PubMed]
- Rahma OE, Hodi FS. The Intersection between Tumor Angiogenesis and Immune Suppression. Clin Cancer Res 2019;25:5449-57. [Crossref] [PubMed]
- Bosisio D, Ronca R, Salvi V, et al. Dendritic cells in inflammatory angiogenesis and lymphangiogenesis. Curr Opin Immunol 2018;53:180-6. [Crossref] [PubMed]
- Li F, Aljahdali I, Ling X. Cancer therapeutics using survivin BIRC5 as a target: what can we do after over two decades of study? J Exp Clin Cancer Res 2019;38:368. [Crossref] [PubMed]
- Wang R, Kang Y, Löhr CV, et al. Reciprocal regulation of BMF and BIRC5 (Survivin) linked to Eomes overexpression in colorectal cancer. Cancer Lett 2016;381:341-8. [Crossref] [PubMed]
- Zhou K, Yan Y, Zhao S. Esophageal cancer-selective expression of TRAIL mediated by MREs of miR-143 and miR-122. Tumour Biol 2014;35:5787-95. [Crossref] [PubMed]
- Cantarella G, Di Benedetto G, Puzzo D, et al. Neutralization of TNFSF10 ameliorates functional outcome in a murine model of Alzheimer's disease. Brain 2015;138:203-16. [Crossref] [PubMed]
- Herbst RS, Schlessinger J. Small molecule combats cancer-causing KRAS protein at last. Nature 2019;575:294-5. [Crossref] [PubMed]
- Scott A, Goffredo P, Ginader T, et al. The Impact of KRAS Mutation on the Presentation and Prognosis of Non-Metastatic Colon Cancer: an Analysis from the National Cancer Database. J Gastrointest Surg 2020;24:1402-10. [Crossref] [PubMed]
- Turano M, Cammarota F, Duraturo F, et al. A Potential Role of IL-6/IL-6R in the Development and Management of Colon Cancer. Membranes (Basel) 2021;11:312. [Crossref] [PubMed]
- Binnewies M, Roberts EW, Kersten K, et al. Understanding the tumor immune microenvironment (TIME) for effective therapy. Nat Med 2018;24:541-50. [Crossref] [PubMed]
- Feng S, Yin J, Xie H, et al. Correlation of serum EGFL7 and MTA1 with clinical characteristics andprognosis of patients with nasopharyngeal carcinoma. J Mol Diagn Ther 2020;12:616-20.
- Rowshanravan B, Halliday N, Sansom DM. CTLA-4: a moving target in immunotherapy. Blood 2018;131:58-67. [Crossref] [PubMed]
- Li M, Zhao LM, Li SL, et al. Differentially expressed lncRNAs and mRNAs identified by NGS analysis in colorectal cancer patients. Cancer Med 2018;7:4650-64. [Crossref] [PubMed]