On some statistical problems in single-cell transcriptome data analysis

Date of Submission

July 2021

Date of Award

7-13-2022

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Doctoral Thesis

Degree Name

Doctor of Philosophy

Subject Name

Statistics

Department

Human Genetics Unit (HGU-Kolkata)

Supervisor

Mukhopadhyay, Indranil (HGU-Kolkata; ISI)

Abstract (Summary of the Work)

Single-cell transcriptome data offer enormous scope for studying biological systems at the cellular level. We address several problems in the statistical analysis of single-cell RNA-seq data. First, we develop a realistic statistical model for single-cell transcriptome data, based on a two-part model for the gene-wise unimodal or bimodal distribution combined with a generalized linear model with a probit link for zero occurrences. Building on this work, we discuss testing methods for comparing transcriptome profiles between two groups, and we suggest two likelihood ratio-based tests under unimodal and bimodal assumptions. We also propose a cell pseudotime reconstruction method that avoids dimensionality reduction, since such reduction may lead to loss of information in the data. We view pseudotime reconstruction as the problem of finding the best permutation of cells with respect to a cost function, and we invoke a genetic algorithm to find the optimum permutation. Finally, we discuss a novel method for removing batch effects, which facilitates merging two or more single-cell RNA-seq datasets. All our approaches are supported by simulation studies and real data analysis.

Comments

ProQuest Collection ID: https://www.proquest.com/pqdtlocal1010185/dissertations/fromDatabasesLayer?accountid=27563

Control Number

ISILib-TH526

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

DOI

http://dspace.isical.ac.in:8080/jspui/handle/10263/2146
