
A Concave Pairwise Fusion Approach to Clustering of Multi-Response Regression and Its Robust Extensions

Chen, Chen (ORCID 0000-0003-1175-3027), January 2022
Ma and Huang (2017) combined solution-path convex clustering with concave penalties to reduce clustering bias. Their method was introduced in the setting of single-response regression to handle heterogeneity, which may come from either the regression intercepts or the regression slopes. The procedure, implemented with the alternating direction method of multipliers (ADMM) algorithm, simultaneously identifies the grouping structure of the observations and estimates the regression coefficients. In the first part of our work, we extend this procedure to multi-response regression. We propose models for cases with heterogeneity in either the regression intercepts or the regression slopes, and we combine the existing ADMM machinery with group-wise concave penalties to solve them. Our work improves both clustering accuracy and estimation accuracy. We also demonstrate the necessity of such an extension by showing that using information in the multi-dimensional response space greatly improves performance. In the second part, we introduce robust versions of the proposed method, with two approaches to handling outliers or long-tailed distributions. The first replaces the squared loss with a robust loss, such as the absolute loss or the Huber loss. The second characterizes and removes the effects of outliers through a mean-shift vector. We demonstrate that these robust solutions outperform the squared-loss-based method when outliers are present or the underlying distribution is long-tailed. / Statistics
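To illustrate why swapping the loss matters, the following minimal sketch compares the three losses named in the abstract on a single residual. It is not the thesis's implementation: the function names, the 0.5 scaling of the squared loss, and the Huber threshold `delta` (set here to the commonly used 1.345) are assumptions made for illustration only.

```python
# Illustrative sketch only; names, scaling, and delta are assumptions,
# not taken from the thesis.

def squared_loss(r):
    """Squared loss on a residual r (with the conventional 1/2 factor)."""
    return 0.5 * r * r


def absolute_loss(r):
    """Absolute loss: grows linearly in |r|."""
    return abs(r)


def huber_loss(r, delta=1.345):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


if __name__ == "__main__":
    # The last residual plays the role of an outlier.
    for r in [0.1, -0.5, 1.0, 8.0]:
        print(r, squared_loss(r), absolute_loss(r), huber_loss(r))
```

For small residuals the Huber loss coincides with the squared loss, while for a large residual it grows only linearly, so a single outlier contributes far less to the fitted objective than it would under the squared loss.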
