DeepSGP: Deep learning for gene selection and survival group prediction in glioblastoma

Article Type

Research Article

Publication Title

Electronics (Switzerland)


Glioblastoma Multiforme (GBM) is an aggressive form of glioma with very poor survival. Genomic input, in the form of RNA sequencing (RNA-seq) data, is expected to provide vital information about the characteristics of the genes that affect the Overall Survival (OS) of patients. This could have a significant impact on treatment planning. We present a new Autoencoder (AE)-based strategy for predicting the survival group (low or high) of GBM patients, using the RNA-seq data of 129 GBM samples from The Cancer Genome Atlas (TCGA). This is a novel interdisciplinary approach that integrates genomics with deep learning for survival prediction. First, the Differentially Expressed Genes (DEGs) were selected using EdgeR. These were further reduced using correlation-based analysis. Next, ranking was applied with different feature subset selection and feature extraction algorithms, including the AE. In each case, fifty features were selected or extracted for subsequent prediction with different classifiers. An exhaustive study of survival group prediction, using eight different classifiers evaluated by accuracy and Area Under the Curve (AUC), established the superiority of the AE-based feature extraction method, called DeepSGP. It achieved an accuracy of 0.83 and an AUC of 0.90. Of the eight classifiers using the features extracted by DeepSGP, the MLP was the best at OS prediction, with an accuracy of 0.89 and an AUC of 0.97. The biological significance of the genes extracted by the AE was also analyzed to establish their importance. Finally, the statistical significance of the predicted output of the DeepSGP algorithm was established using the concordance index.
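The abstract does not say how the concordance index was computed. As a rough illustration only, the sketch below implements the common pairwise (Harrell-style) definition in plain Python: among comparable patient pairs, it measures the fraction in which the patient who died earlier was assigned the higher predicted risk. The function name, argument names, and toy values are all hypothetical and are not taken from the article, and tied survival times are simply skipped, which is a simplifying assumption.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk: Sequence[float],
) -> float:
    """Pairwise (Harrell-style) concordance index.

    times  : observed survival or follow-up times
    events : 1 if death was observed, 0 if the patient was censored
    risk   : predicted risk scores (higher means shorter expected survival)

    Assumption: pairs with identical times are ignored.
    """
    if not (len(times) == len(events) == len(risk)):
        raise ValueError("times, events and risk must have the same length")

    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the earlier time is an observed death.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0   # earlier death had higher risk
                elif risk[i] == risk[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration, not from TCGA).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.1, 0.4, 0.7]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to uninformative predictions, and 0.0 means every pair is ordered the wrong way round.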



Publication Date



Open Access, Gold
