complete observation
K-means, SOM, k-nn or classical clustering methods?
The best-known optimization clustering algorithm is k-means clustering. Hierarchical clustering methods require processing time proportional to the square or cube of the number of observations, whereas the time required by the k-means algorithm is proportional to the number of observations. This means that k-means clustering can be used on larger data sets. In fact, k-means clustering is inappropriate for small (fewer than 100 observations) data sets. If the data set is small, the k-means solution becomes sensitive to the order in which the observations appear (the order effect).
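The order effect can be illustrated with a minimal sketch. It assumes a Lloyd-style k-means that takes the first k observations as its initial centers. That seeding rule is an assumption made for illustration, since the text does not say which initialization is used, and the function name `kmeans_first_k_seeds` and the four-point data set are hypothetical. Each pass performs n × k distance evaluations, which is where the cost proportional to the number of observations comes from.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_first_k_seeds(points, k, max_iter=100):
    """Lloyd-style k-means seeded with the first k observations.

    The seeding rule is an assumption chosen to expose the order effect.
    Returns (centers, labels, within-cluster sum of squares).
    """
    centers = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: n * k distance evaluations, linear in n.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the solution is stable
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse


# The same four observations (corners of a 4-by-1 rectangle) in two orders.
order_a = [(0, 0), (0, 1), (4, 0), (4, 1)]
order_b = [(0, 0), (4, 0), (0, 1), (4, 1)]

centers_a, labels_a, sse_a = kmeans_first_k_seeds(order_a, k=2)
centers_b, labels_b, sse_b = kmeans_first_k_seeds(order_b, k=2)

print(sse_a, sse_b)  # 16.0 1.0
```

With the first ordering the algorithm settles on a split into the bottom and top rows, while the second ordering recovers the left and right pairs, which has a much lower within-cluster sum of squares. Randomized or repeated initialization reduces this dependence on observation order.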
Data Generation as Sequential Decision Making
Bachman, Philip; Precup, Doina
We connect a broad class of generative models through their shared reliance on sequential decision making. Motivated by this view, we develop extensions to an existing model, and then explore the idea further in the context of data imputation -- perhaps the simplest setting in which to investigate the relation between unconditional and conditional generative modelling. We formulate data imputation as an MDP and develop models capable of representing effective policies for it. We construct the models using neural networks and train them using a form of guided policy search. Our models generate predictions through an iterative process of feedback and refinement. We show that this approach can learn effective policies for imputation problems of varying difficulty and across multiple datasets.