Sparse Fuzzy Switching Regression Model.
Date of Submission
December 2016
Date of Award
Winter 12-12-2017
Institute Name (Publisher)
Indian Statistical Institute
Document Type
Master's Dissertation
Degree Name
Master of Technology
Subject Name
Computer Science
Department
Electronics and Communication Sciences Unit (ECSU-Kolkata)
Supervisor
Pal, Nikhil Ranjan (ECSU-Kolkata; ISI)
Abstract (Summary of the Work)
Unlike multiple regression, in switching regression the data are assumed to have come from more than one regression model, but the association between the sample points and the models is not known. One approach to obtaining the parameters of the switching regression model is to formulate the problem using a mixture distribution. The estimators for this kind of distribution can be obtained using an iterative maximum likelihood method. The second approach is to obtain a fuzzy partition of the data using the fuzzy c-regression model (FCRM) algorithm. Here, the prototypes of the clusters are in the form of regression models. For switching regression, although there is evidence to believe that the data are generated by more than one model, it is usually not known whether all predictors are important for all regimes. This work focuses on identifying useful predictors (independent variables) and eliminating the irrelevant ones in the fuzzy switching regression setup. We employ two different regularizers in the FCRM objective function to induce sparsity in the models and thereby select useful features. In the first case, the ordinary FCRM objective function is regularized using the least absolute shrinkage and selection operator (lasso) penalty, i.e., using the l1 norm of the parameters of the regression models as the regularizer. In order to deal with the l1 norm, each parameter is modelled using two non-negative variables. For a given partition matrix, this leads to a bound-constrained quadratic optimization problem. In the second case, we formulate the non-negative garrotte penalty for the fuzzy c-regression model. In this case, we associate a non-negative weight, or importance, with each variable. We consider two versions of the problem: (1) a different set of weights is used for every model, (2) only one common set of weights is used for all models. We test both approaches on synthetic as well as real datasets.
After comparing the results of both cases on these datasets, we conclude that the garrotte is more effective in inducing sparsity while maintaining the same level of root mean square error. Lastly, we discuss a method to evaluate the goodness of the feature selection methods. This evaluation method affirms that the features selected by the non-negative garrotte penalty are useful.
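The lasso step described in the abstract (each parameter modelled by two non-negative variables, giving a bound-constrained quadratic problem for a fixed partition matrix) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `weighted_lasso_split`, the use of plain projected gradient descent as the solver, and the toy data are all assumptions made for illustration. The weights `w` stand in for the fuzzified memberships of one regime taken from a fixed partition matrix.

```python
def weighted_lasso_split(X, y, w, lam, n_iter=500):
    """Hypothetical helper: minimise
        sum_k w[k] * (y[k] - x_k . beta)^2 + lam * ||beta||_1
    by writing beta = p - q with p >= 0 and q >= 0, so the l1 norm becomes
    the linear term lam * sum(p + q) and only bound constraints remain."""
    n, d = len(X), len(X[0])
    p = [0.0] * d
    q = [0.0] * d
    # Fixed step 1/L, where L is an upper bound on the Lipschitz constant of
    # the gradient in (p, q); assumes at least one weighted row is non-zero.
    L = 4.0 * sum(w[k] * sum(v * v for v in X[k]) for k in range(n))
    step = 1.0 / L
    for _ in range(n_iter):
        beta = [p[j] - q[j] for j in range(d)]
        resid = [y[k] - sum(X[k][j] * beta[j] for j in range(d))
                 for k in range(n)]
        # Gradient of the weighted squared error with respect to beta.
        g = [-2.0 * sum(w[k] * X[k][j] * resid[k] for k in range(n))
             for j in range(d)]
        for j in range(d):
            # Gradient step followed by projection onto the bounds p, q >= 0.
            p[j] = max(0.0, p[j] - step * (g[j] + lam))
            q[j] = max(0.0, q[j] - step * (-g[j] + lam))
    return [p[j] - q[j] for j in range(d)]


# Toy data (an assumption): y depends only on the first predictor.
X = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0],
     [-1.0, 1.0], [-2.0, -1.0], [-3.0, 1.0], [-4.0, -1.0]]
y = [2.0 * row[0] for row in X]
w = [1.0] * len(X)  # memberships of a single regime, all set to 1 here
beta = weighted_lasso_split(X, y, w, lam=12.0)
print(beta)
```

In this toy problem the coefficient of the second, irrelevant predictor is driven to zero while the first is shrunk slightly below its true value of 2, which is the sparsity-inducing behaviour the abstract attributes to the lasso penalty. In a full FCRM loop, a solve of this kind would alternate with the membership update, once per regime.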
Control Number
ISI-DISS-2016-349
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
DOI
http://dspace.isical.ac.in:8080/jspui/handle/10263/6506
Recommended Citation
Majumder, Biswajit, "Sparse Fuzzy Switching Regression Model." (2017). Master’s Dissertations. 247.
https://digitalcommons.isical.ac.in/masters-dissertations/247
Comments
ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843271