Discriminative Deep Joint Subspace Analysis for Multi-View Data

Date of Submission

December 2023

Date of Award

12-1-2024

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Doctoral Thesis

Degree Name

Doctor of Philosophy

Subject Name

Computer Science

Department

Machine Intelligence Unit (MIU-Kolkata)

Supervisor

Maji, Pradipta (MIU-Kolkata; ISI)

Abstract (Summary of the Work)

Over the past few years, multi-view data analysis has emerged as an essential approach for identifying sample categories. In the multi-view data classification problem, the joint subspace is expected to be learned from the given input views in such a way that similarity in the latent space implies similarity in the corresponding concepts. Since each view has different statistical properties, the joint subspace should reflect the intrinsic properties of each input view. Another important aspect is the coherent knowledge of the multiple views: the learning objective of the multi-view model must efficiently capture the non-linear correlated structures across different views. Cross-view dependency is also an essential attribute of multi-view learning, in which the primary focus is to discover the dependency shared between pairs of input views. If one or more input views correspond to images, then the joint subspace should be learned in such a way that the topological properties of the image views are properly preserved along with the inherent characteristics of the rest of the views. In this regard, the thesis addresses the classification problem of multi-view data, where the primary objective is to identify and analyze the inherent structures or patterns of the data that are relevant for classifying the given observations into different categories. To evaluate the relevance of a view in differentiating observations of a particular class from observations belonging to the remaining classes, a novel framework is developed by integrating the theory of rough sets with Bayes decision theory. While rough set theory deals with the uncertainty due to incompleteness in class definition, the probabilistic model addresses the uncertainty due to overlapping classes by measuring the belongingness of an observation to a specific class.
In multi-view learning, it is essential that a joint subspace is learned from the given input views which can efficiently encapsulate the underlying non-linear data distribution of the given observations. In this regard, the thesis develops deep predictive models based on the framework of the deep Boltzmann machine for discriminability, correlation, and dependency analysis. In discriminability analysis, class nodes are incorporated into the deep architecture, where the supervised information is clamped. Through proper learning of the weights associated with the class nodes, the discriminative ability of the latent subspace is enhanced. In correlation analysis, the learning objective of the deep architecture is integrated with canonical correlation analysis such that, given the input views, the joint subspace is learned from maximally correlated subspaces. In dependency analysis, the relationship between each pair of views is assumed to be unique. Hence, a view-pair specific approach is developed, based on the Hilbert-Schmidt independence criterion, to efficiently encapsulate the cross-view dependency in terms of consensus and/or complementary knowledge from the input pairs of views. Based on Bayes error analysis, an upper bound on the error probability of the proposed deep model is estimated in terms of the model architecture. This bound facilitates determining the optimal architecture of the proposed model for each database considered. Combining information from multiple views is particularly challenging when the input views involve both image and non-image information. In multi-view data analysis, it is essential that descriptive and comprehensive information is efficiently extracted from all the views of the given input data. If one or more input views correspond to image information, then it should be ensured that the innate topological properties of each of the input image views are appropriately reflected in the joint subspace.
In this regard, a geometrically motivated deep predictive model is developed, which can process multiple image and non-image views simultaneously. In order to recognize and represent the geometric structures of the image manifolds, embedded in the high-dimensional ambient space, the theory of Laplacian eigenmaps is integrated with the learning objective of the deep predictive model. An approximate common eigenbasis of the Laplacians is computed to consolidate the intrinsic geometric structures of the manifolds corresponding to each of the input image views.
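
The dependency analysis summarized in the abstract builds on the Hilbert-Schmidt independence criterion (HSIC). As a rough illustration only, the sketch below computes the standard biased empirical HSIC estimate, trace(K H L H) / (n - 1)^2, between two views of the same samples. The Gaussian kernel, the bandwidth `sigma`, the toy data, and the function names `gaussian_gram`, `center`, and `hsic` are assumptions made for this example; they are not taken from the thesis and do not reproduce its view-pair specific model.

```python
import math
import random


def gaussian_gram(samples, sigma=1.0):
    """Gram matrix with entries exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    n = len(samples)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return gram


def center(gram):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - col_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(view_a, view_b, sigma=1.0):
    """Biased empirical HSIC between two views given as lists of feature vectors."""
    n = len(view_a)
    k_centered = center(gaussian_gram(view_a, sigma))
    l_gram = gaussian_gram(view_b, sigma)
    # trace(K H L H) equals trace((H K H) L) by the cyclic property of the trace.
    trace = sum(k_centered[i][j] * l_gram[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Toy data (assumed for illustration): one view, a nonlinear function of it,
# and an unrelated view drawn independently.
random.seed(0)
view_x = [[random.gauss(0.0, 1.0)] for _ in range(60)]
view_dependent = [[v[0] ** 2] for v in view_x]
view_independent = [[random.gauss(0.0, 1.0)] for _ in range(60)]

hsic_self = hsic(view_x, view_x)
hsic_dependent = hsic(view_x, view_dependent)
hsic_independent = hsic(view_x, view_independent)
print(hsic_self, hsic_dependent, hsic_independent)
```

In this toy setting, a larger value suggests stronger statistical dependence between the two views, while values near zero are consistent with independence. The estimate depends on the kernel and bandwidth chosen, so the numbers are only meaningful relative to one another.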

Comments

ProQuest Collection ID: https://www.proquest.com/pqdtlocal1010185/dissertations/fromDatabasesLayer?accountid=27563

Control Number

ISILib-TH

Creative Commons License

Creative Commons Attribution 4.0 International License

DOI

http://dspace.isical.ac.in:8080/jspui/handle/10263/2146
