Scalable Non-Linear Graph Fusion for Prioritizing Cancer-Causing Genes
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Microarray data and protein-protein interaction (PPI) networks have been studied extensively because they capture important characteristics of disease-associated genes. This paper presents a new gene prioritization algorithm that identifies cancer-causing genes by judiciously integrating the complementary information from these two data sources. The proposed algorithm selects disease-causing genes by maximizing both the importance of the selected genes and the functional similarity among them. A new quantitative index is introduced to evaluate the importance of a gene; it considers whether the gene exhibits a differential expression pattern and has strong connectivity in the PPI network. Because disease-associated genes are expected to have similar expression profiles and topological structures, a scalable non-linear graph fusion technique, termed ScaNGraF, is proposed to learn a disease-dependent functional similarity network from the co-expression and common-neighbor-based similarity networks. ScaNGraF, which is based on a message passing algorithm, efficiently combines the shared and complementary information provided by the different data sources at significantly lower computational cost. A new measure, termed DiCoIN, is introduced to evaluate the quality of the learned affinity network. The performance of the proposed graph fusion technique and gene selection algorithm is compared extensively with that of several existing methods on several cancer data sets.
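The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the importance index, the similarity definitions (absolute Pearson correlation, Jaccard overlap of PPI neighborhoods), the greedy selection rule, and all function names are assumptions chosen for illustration. In particular, a plain average of the two similarity networks stands in for the fusion step in the usage note; ScaNGraF's non-linear message-passing fusion and the DiCoIN measure are not reproduced here.

```python
import numpy as np


def coexpression_similarity(X):
    """Absolute Pearson correlation between gene rows of X (genes x samples)."""
    S = np.nan_to_num(np.abs(np.corrcoef(X)))
    np.fill_diagonal(S, 0.0)
    return S


def common_neighbor_similarity(A):
    """Jaccard overlap of PPI neighborhoods; A is a symmetric adjacency matrix."""
    A = (np.asarray(A) > 0).astype(float)
    shared = A @ A                      # shared[i, j] = number of common neighbors
    deg = A.sum(axis=1)
    union = deg[:, None] + deg[None, :] - shared
    S = np.divide(shared, union, out=np.zeros_like(shared), where=union > 0)
    np.fill_diagonal(S, 0.0)
    return S


def gene_importance(X, labels, A):
    """Hypothetical index: average of scaled |Welch t| and scaled PPI degree."""
    X0, X1 = X[:, labels == 0], X[:, labels == 1]
    se = np.sqrt(X0.var(axis=1, ddof=1) / X0.shape[1]
                 + X1.var(axis=1, ddof=1) / X1.shape[1]) + 1e-12
    t = np.abs(X1.mean(axis=1) - X0.mean(axis=1)) / se
    deg = (np.asarray(A) > 0).sum(axis=1).astype(float)
    return 0.5 * (t / max(t.max(), 1e-12) + deg / max(deg.max(), 1.0))


def select_genes(importance, similarity, k):
    """Greedy rule (an assumption): add the gene with the highest importance
    plus mean similarity to the genes already selected."""
    selected = [int(np.argmax(importance))]
    while len(selected) < k:
        score = importance + similarity[:, selected].mean(axis=1)
        score[selected] = -np.inf       # never pick the same gene twice
        selected.append(int(np.argmax(score)))
    return selected
```

With an expression matrix `X` (genes x samples), binary class `labels`, and a PPI adjacency matrix `A`, one might call `select_genes(gene_importance(X, labels, A), 0.5 * (coexpression_similarity(X) + common_neighbor_similarity(A)), k)` to obtain `k` candidate gene indices.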
Shah, Ekta and Maji, Pradipta, "Scalable Non-Linear Graph Fusion for Prioritizing Cancer-Causing Genes" (2020). Journal Articles. 458.