A supervised weighted similarity measure for gene expressions using biological knowledge
Article Type
Research Article
Publication Title
Gene
Abstract
A supervised similarity measure for Saccharomyces cerevisiae gene expressions is developed that captures gene similarity when multiple types of experimental conditions, such as cell cycle and heat shock, are available for all the genes. The measure is called Weighted Pearson correlation (WPC); its weights are determined systematically for each type of experiment by maximizing the positive predictive value (PPV) for gene pairs having Pearson correlation greater than 0.80. The PPV is computed from the yeast GO-Slim process annotations available in the Saccharomyces Genome Database (SGD). Genes are then clustered by the k-medoid algorithm using the newly computed WPC, and the functions of 135 unclassified genes are predicted with a p-value cutoff of 10⁻⁵ using Munich Information for Protein Sequences (MIPS) annotations. Out of these genes, the functional categories of 55 genes are predicted with a p-value cutoff greater than 10⁻¹⁰ and reported in this investigation. The superiority of WPC over existing similarity measures such as Pearson correlation and Euclidean distance is demonstrated using the PPV of gene pairs for different Saccharomyces cerevisiae data sets. The related code is available at http://www.sampa.droppages.com/WPC.html.
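The abstract does not give the exact form of WPC, so the following is only a minimal sketch under stated assumptions: it uses the textbook weighted Pearson correlation, with each condition carrying the weight of its experiment type, and it counts a gene pair as a true positive when the two genes share at least one annotation. The function names `weighted_pearson` and `ppv`, the per-condition weight vector, and the set-based annotation format are hypothetical and are not taken from the authors' code.

```python
import numpy as np
from itertools import combinations


def weighted_pearson(x, y, w):
    """Textbook weighted Pearson correlation (assumed form, not the paper's)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                      # normalize weights to sum to 1
    mx, my = np.sum(w * x), np.sum(w * y)  # weighted means
    cov = np.sum(w * (x - mx) * (y - my))
    vx = np.sum(w * (x - mx) ** 2)
    vy = np.sum(w * (y - my) ** 2)
    return cov / np.sqrt(vx * vy)


def ppv(expr, cond_weights, annotations, threshold=0.80):
    """Fraction of gene pairs above the threshold that share an annotation.

    expr: 2-D array, one row per gene, one column per condition.
    cond_weights: one weight per condition (hypothetical layout; e.g. the
        weight of an experiment type repeated for each of its conditions).
    annotations: list of sets of annotation labels, one set per gene.
    """
    hits = total = 0
    for i, j in combinations(range(len(expr)), 2):
        if weighted_pearson(expr[i], expr[j], cond_weights) > threshold:
            total += 1
            hits += bool(annotations[i] & annotations[j])
    return hits / total if total else 0.0


# Toy data (made up for illustration): 3 genes, 4 conditions from 2 types.
expr = np.array([[1.0, 2.0, 3.0, 4.0],
                 [2.0, 4.0, 6.0, 9.0],
                 [4.0, 3.0, 2.0, 1.0]])
annotations = [{"a"}, {"a", "b"}, {"c"}]
type_weights = np.array([0.7, 0.3])           # one weight per experiment type
cond_weights = np.repeat(type_weights, [2, 2])  # expand to per-condition weights
score = ppv(expr, cond_weights, annotations)
```

In a supervised setting of the kind the abstract describes, `type_weights` would be the quantity searched over (for example on a grid) to maximize `ppv`; the search strategy itself is not specified here and is left out of the sketch.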
First Page
150
Last Page
160
DOI
10.1016/j.gene.2016.09.033
Publication Date
12-31-2016
Recommended Citation
Ray, Shubhra Sankar and Misra, Sampa, "A supervised weighted similarity measure for gene expressions using biological knowledge" (2016). Journal Articles. 4082.
https://digitalcommons.isical.ac.in/journal-articles/4082