Original Article
Survival-based bioinformatics analysis to identify hub genes and key pathways in non-small cell lung cancer
Abstract
Background: Lung cancer is one of the leading causes of cancer mortality worldwide. Here, we performed an integrative bioinformatics analysis to screen for hub genes and critical pathways in non-small cell lung cancer (NSCLC), based on the association of differentially expressed genes (DEGs) with overall survival.
Methods: Four datasets from the Gene Expression Omnibus (GEO) were used to identify DEGs. To obtain robust DEGs in NSCLC, only DEGs present in all four datasets were retained for subsequent analysis. The association of these genes with overall survival was then analyzed using the Kaplan-Meier plotter database. Genes significantly correlated with survival were subjected to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis and were then used to construct a protein-protein interaction (PPI) network. MCODE and CytoHubba were used to identify the clusters and hub genes. Finally, the hub genes were validated in The Cancer Genome Atlas (TCGA) and the Human Protein Atlas (HPA).
Results: We found 522 up-regulated and 989 down-regulated DEGs between NSCLC and normal lung tissue, of which 895 were correlated with a higher overall survival. GO analysis showed that the DEGs associated with a higher overall survival were enriched in cell division, cell cycle, DNA replication, angiogenesis, and cell migration. KEGG analysis was consistent with the GO analysis and showed that the p53 signaling pathway, pyrimidine metabolism, the cGMP-PKG signaling pathway, and the renin secretion pathway were associated with overall survival in NSCLC. In the protein-protein interaction network analysis, we identified seven clusters and six hub genes: BUB1B, CCNB1, CENPE, KIF18A, NDC10, and MAD2L1. Of these genes, CENPE and KIF18A had not been reported previously. Finally, the dysregulated expression of the six hub genes was validated using data from the TCGA and HPA.
Conclusions: We identified hub genes and potential mechanisms of NSCLC based on multiple-microarray analysis and overall survival, and then validated the hub genes in the TCGA and HPA databases. These hub genes may serve as potential therapeutic targets.
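The robust-DEG selection described in the Methods (keeping only DEGs that appear in all four datasets) amounts to a set intersection. The following is a minimal sketch of that one step, not the authors' actual pipeline: the function name `robust_degs`, the dataset labels, and the gene symbols are hypothetical placeholders, and the per-dataset DEG lists are assumed to have been computed beforehand.

```python
from functools import reduce


def robust_degs(deg_sets):
    """Return the genes present in every per-dataset DEG collection.

    deg_sets: iterable of iterables of gene symbols, one per dataset.
    (Hypothetical helper for illustration only.)
    """
    sets = [set(s) for s in deg_sets]
    if not sets:
        return set()
    # Intersect all sets pairwise; a gene survives only if every dataset has it.
    return reduce(set.intersection, sets)


# Toy input: placeholder dataset labels and placeholder gene symbols,
# not real GEO accessions or real results.
up_by_dataset = {
    "dataset_1": {"GENE_A", "GENE_B", "GENE_C"},
    "dataset_2": {"GENE_A", "GENE_B", "GENE_D"},
    "dataset_3": {"GENE_A", "GENE_B", "GENE_C", "GENE_E"},
    "dataset_4": {"GENE_B", "GENE_A"},
}

robust_up = robust_degs(up_by_dataset.values())
print(sorted(robust_up))
```

In practice the up-regulated and down-regulated lists would likely be intersected separately, so that a gene is retained only when its direction of change agrees across datasets; that choice is an assumption here rather than something the abstract states.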