Transfer Learning Using MMD

Date of Submission

December 2016

Date of Award

Winter 12-12-2017

Institute Name (Publisher)

Indian Statistical Institute

Document Type

Master's Dissertation

Degree Name

Master of Technology

Subject Name

Computer Science


Electronics and Communication Sciences Unit (ECSU-Kolkata)


Pal, Nikhil Ranjan (ECSU-Kolkata; ISI)

Abstract (Summary of the Work)

In standard machine learning tasks, the domain of the data on which a classifier learns and the domain on which it predicts are usually the same. However, this assumption does not always hold. In such cases, we transfer the knowledge learnt in one domain to design a classifier for a related domain; this task of knowledge transfer is termed Transfer Learning. Transfer Learning has many facets: many authors use some labeled data from the target domain (the test domain) in addition to the labeled data from the source domain, and there are several different scenarios in which knowledge may be transferred. In our method, we simultaneously minimize the classifier objective function and the MMD (Maximum Mean Discrepancy), which measures the distance between the means of the two domains after the data are mapped into a higher-dimensional space by some non-linear transformation. In this work, our purpose is to transfer knowledge between a source domain and a target domain that lie in the same feature space but have different distributions. In the literature we could not find any work that optimizes the kernel parameters with respect to the MMD. We therefore first investigate how effective optimizing the kernel parameters to minimize the MMD is for achieving better performance in Transfer Learning. Our investigation reveals that a lower MMD does not necessarily mean better classifier performance on the target domain; it also suggests the existence of multiple local minima with almost equal values of the MMD. We then move to our main problem of building a classifier for the target domain. Various authors have minimized the MMD using multiple kernels to find a suitable latent space; typically, a convex combination of the kernels is used and the weights of the combination are learnt to minimize the MMD.
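For reference, the empirical (biased) estimate of the squared MMD can be computed from kernel matrices alone. The sketch below is an illustration of that estimator, not the dissertation's implementation; it assumes a Gaussian (RBF) kernel with a single bandwidth parameter `gamma`, and the function names are our own.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise squared Euclidean distances, then the Gaussian kernel
    # k(x, y) = exp(-gamma * ||x - y||^2).
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd_squared(Xs, Xt, gamma=1.0):
    # Biased empirical estimate of the squared MMD between source samples Xs
    # and target samples Xt: mean(Kss) + mean(Ktt) - 2 * mean(Kst).
    Kss = rbf_kernel(Xs, Xs, gamma)
    Ktt = rbf_kernel(Xt, Xt, gamma)
    Kst = rbf_kernel(Xs, Xt, gamma)
    return Kss.mean() + Ktt.mean() - 2 * Kst.mean()
```

When the two samples are identical the estimate is exactly zero, and it grows as the distributions drift apart; optimizing `gamma` with respect to this quantity is the kernel-parameter optimization discussed above.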
In our method, we cluster both the source and target domain data into a predefined number of clusters and then establish a correspondence between source and target clusters using the Hungarian Algorithm. Finally, we minimize the MMD over the corresponding pairs of clusters to obtain a latent space that represents the source and target data in a manner better suited to our problem. Our method shows some improvement in performance when there is good cluster structure in the source and target domains. Our approach does not use any label information from the target data.
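The cluster-correspondence step can be sketched as follows. This is a minimal illustration under our own assumptions, not the thesis code: it takes cluster centroids as already computed (e.g. by k-means), builds a squared-Euclidean cost matrix between every source/target centroid pair, and solves the assignment with SciPy's implementation of the Hungarian algorithm; the function name is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_clusters(src_centroids, tgt_centroids):
    # Cost matrix: squared Euclidean distance between every pair of
    # source and target cluster centroids.
    diff = src_centroids[:, None, :] - tgt_centroids[None, :, :]
    cost = np.sum(diff**2, axis=2)
    # Hungarian algorithm: minimum-cost one-to-one assignment.
    rows, cols = linear_sum_assignment(cost)
    return [(int(r), int(c)) for r, c in zip(rows, cols)]
```

Each returned pair (source cluster, target cluster) would then contribute one per-pair MMD term to the overall objective.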


Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.
