Journal Articles

Integrating Multiple Data Sources for Combinatorial Marker Discovery: A Study in Tumorigenesis

Sanghamitra Bandyopadhyay, Indian Statistical Institute, Kolkata
Saurav Mallik, Indian Statistical Institute, Kolkata

Article Type

Research Article

Publication Title

IEEE/ACM Transactions on Computational Biology and Bioinformatics

Abstract

Identifying combinatorial markers from multiple data sources is a challenging task in bioinformatics. Here, we propose a novel computational framework for identifying significant combinatorial markers ($SCM$s) using both gene expression and methylation data. The gene expression and methylation data are integrated into a single continuous dataset, as well as a (post-discretized) boolean dataset, based on their intrinsic (i.e., inverse) relationship. A novel combined score of methylation and expression data (viz., $CoMEx$) is introduced, which is computed on the integrated continuous data to identify an initial non-redundant set of genes. Thereafter, (maximal) frequent closed homogeneous genesets are identified using a well-known biclustering algorithm applied to the integrated boolean data of the determined non-redundant set of genes. A novel sample-based weighted support ($WS$) is then proposed, which is subsequently calculated on the same integrated boolean data in order to identify the non-redundant significant genesets. The top few resulting genesets are identified as potential $SCM$s. Since our proposed method generates fewer significant non-redundant genesets than other popular methods, it is much faster than the others. Application of the proposed technique to expression and methylation data for Uterine tumor or Prostate Carcinoma produces a set of significant marker combinations. We expect that such a combination of markers will produce fewer false positives than individual markers.
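
The abstract does not give formulas for $CoMEx$ or $WS$, so the following is only an illustrative sketch of the general idea of integrating the two data types into boolean data through their inverse relationship, and then measuring how often a geneset co-occurs across samples. The per-gene median thresholds, the function names `integrate_boolean` and `support`, and the plain (unweighted) support are all assumptions made for illustration; they are not the paper's scores.

```python
import numpy as np


def integrate_boolean(expr, meth):
    """Combine expression and methylation into one boolean matrix.

    Both inputs are genes x samples. ASSUMPTION: a gene is marked True in a
    sample when its expression is above the gene's median AND its methylation
    is below the gene's median (one simple reading of an inverse relationship).
    This is not the paper's discretization rule.
    """
    expr = np.asarray(expr, dtype=float)
    meth = np.asarray(meth, dtype=float)
    if expr.shape != meth.shape:
        raise ValueError("expr and meth must have the same shape")
    expr_median = np.median(expr, axis=1, keepdims=True)
    meth_median = np.median(meth, axis=1, keepdims=True)
    return (expr > expr_median) & (meth < meth_median)


def support(bool_data, geneset):
    """Fraction of samples in which every gene of the geneset is True.

    ASSUMPTION: this is ordinary itemset support, standing in for the
    sample-based weighted support (WS), whose weights are not described here.
    """
    rows = bool_data[list(geneset)]
    return float(rows.all(axis=0).mean())


if __name__ == "__main__":
    # Toy data: 3 genes x 4 samples (made-up values for demonstration only).
    expr = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 2, 3, 4]]
    meth = [[0.9, 0.8, 0.2, 0.1], [0.1, 0.2, 0.8, 0.9], [0.9, 0.8, 0.2, 0.1]]
    data = integrate_boolean(expr, meth)
    print(support(data, [0, 2]))  # genes 0 and 2 behave identically
    print(support(data, [0, 1]))  # genes 0 and 1 never co-occur
```

In this toy setting, genesets with high support are the candidates a frequent-pattern or biclustering step would retain; a weighting over samples would replace the simple mean in `support`.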

First Page

673

Last Page

687

DOI

10.1109/TCBB.2016.2636207

Publication Date

3-1-2018

Comments

All Open Access, Bronze

Recommended Citation

Bandyopadhyay, Sanghamitra and Mallik, Saurav, "Integrating Multiple Data Sources for Combinatorial Marker Discovery: A Study in Tumorigenesis" (2018). Journal Articles. 1448.
https://digitalcommons.isical.ac.in/journal-articles/1448

This document is currently not available here.
