Blocked Clusterwise Regression

Jan-29-2020–arXiv.org Machine Learning

Such models have been shown to allow estimation and inference by regression clustering methods. This paper is motivated by the finding that the clustered heterogeneity models studied in this literature can be badly misspecified, even when the panel has significant discrete cross-sectional structure. To address this issue, we generalize previous approaches to discrete unobserved heterogeneity by allowing each unit to have multiple, imperfectly correlated latent variables that describe its response type to different covariates. We give inference results for a k-means style estimator of our model and develop information criteria to jointly select the number of clusters for each latent variable. Monte Carlo simulations confirm our theoretical results and give intuition about the finite-sample performance of estimation and model selection. We also contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting. Our results suggest that over-fitting can be severe in k-means style estimators when the number of clusters is over-specified.
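To make the idea of a k-means style estimator with several latent variables per unit more concrete, the following is a minimal sketch and not the paper's estimator. It assumes a linear panel model in which the covariates are split into blocks and each unit carries one discrete label per block, and it alternates between pooled least squares (given labels) and an exhaustive per-unit search over label combinations (given coefficients). The function name `fit_blocked_clusterwise`, its arguments, the random initialization, and the stopping rule are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def fit_blocked_clusterwise(y, X_blocks, n_groups, n_iter=50, seed=0):
    """Alternating least-squares sketch (illustrative assumptions throughout).

    y        : (N, T) outcomes for N units over T periods
    X_blocks : list of (N, T, p_b) covariate arrays, one per block
    n_groups : list with the assumed number of groups G_b for each block
    Returns (labels, betas): labels is (N, B); betas[b] is (G_b, p_b).
    """
    rng = np.random.default_rng(seed)
    N, T = y.shape
    # Random initial labels: one column of group labels per covariate block.
    labels = np.column_stack([rng.integers(0, G, size=N) for G in n_groups])
    # All joint label combinations a unit could take across blocks.
    combos = np.array(list(itertools.product(*[range(G) for G in n_groups])))
    betas = None
    for _ in range(n_iter):
        # Update step: pooled OLS with each block interacted with group dummies.
        cols = []
        for b, (Xb, G) in enumerate(zip(X_blocks, n_groups)):
            for g in range(G):
                mask = (labels[:, b] == g)[:, None, None]
                cols.append((Xb * mask).reshape(N * T, -1))
        Z = np.hstack(cols)
        coef, *_ = np.linalg.lstsq(Z, y.reshape(-1), rcond=None)
        # Unpack the stacked coefficient vector into one (G_b, p_b) array per block.
        betas, pos = [], 0
        for Xb, G in zip(X_blocks, n_groups):
            p = Xb.shape[2]
            betas.append(coef[pos:pos + G * p].reshape(G, p))
            pos += G * p
        # Assignment step: each unit picks the label combination with lowest SSR.
        fitted = [Xb @ betas[b].T for b, Xb in enumerate(X_blocks)]  # (N, T, G_b)
        ssr = np.empty((N, len(combos)))
        for c, combo in enumerate(combos):
            pred = sum(fitted[b][:, :, g] for b, g in enumerate(combo))
            ssr[:, c] = ((y - pred) ** 2).sum(axis=1)
        new_labels = combos[ssr.argmin(axis=1)]
        if np.array_equal(new_labels, labels):
            break  # labels stopped changing, so betas match the returned labels
        labels = new_labels
    return labels, betas
```

A call such as `fit_blocked_clusterwise(y, [X1, X2], [2, 3])` would fit two groups for the first covariate block and three for the second. Like k-means, an alternating scheme of this kind can stop at a local minimum, so in practice one would likely rerun it from several seeds and keep the fit with the lowest sum of squared residuals. The exhaustive search over label combinations grows with the product of the group counts, which is only reasonable for a small number of blocks and groups.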


