Developed back in the 50s by Rosenblatt and colleagues, this extremely simple algorithm can be viewed as the foundation for some of the most successful classifiers today, including suport vector machines and logistic regression, solved using stochastic gradient descent. The convergence proof for the Perceptron algorithm is one of the most elegant pieces of math I've seen in ML. Most useful: Boosting, especially boosted decision trees. This intuitive approach allows you to build highly accurate ML models, by combining many simple ones. Boosting is one of the most practical methods in ML, it's widely used in industry, can handle a wide variety of data types, and can be implemented at scale.
We introduce a Deep Boltzmann Machine model suitable for modeling and extracting latent semantic representations from a large unstructured collection of documents. We overcome the apparent difficulty of training a DBM with judicious parameter tying. This parameter tying enables an efficient pretraining algorithm and a state initialization scheme that aids inference. The model can be trained just as efficiently as a standard Restricted Boltzmann Machine. Our experiments show that the model assigns better log probability to unseen data than the Replicated Softmax model. Features extracted from our model outperform LDA, Replicated Softmax, and DocNADE models on document retrieval and document classification tasks.
We present a mixture model whose components are Restricted Boltzmann Machines (RBMs). This possibility has not been considered before because computing the partition function of an RBM is intractable, which appears to make learning a mixture of RBMs intractable as well. Surprisingly, when formulated as a third-order Boltzmann machine, such a mixture model can be learned tractably using contrastive divergence. The energy function of the model captures three-way interactions among visible units, hidden units, and a single hidden multinomial unit that represents the cluster labels. The distinguishing feature of this model is that, unlike other mixture models, the mixing proportions are not explicitly parameterized. Instead, they are defined implicitly via the energy function and depend on all the parameters in the model. We present results for the MNIST and NORB datasets showing that the implicit mixture of RBMs learns clusters that reflect the class structure in the data.
This paper presents a novel approach to speaker subspace modelling based on Gaussian-Binary Restricted Boltzmann Machines (GRBM). The proposed model is based on the idea of shared factors as in the Probabilistic Linear Discriminant Analysis (PLDA). GRBM hidden layer is divided into speaker and channel factors, herein the speaker factor is shared over all vectors of the speaker. Then Maximum Likelihood Parameter Estimation (MLE) for proposed model is introduced. Various new scoring techniques for speaker verification using GRBM are proposed. The results for NIST i-vector Challenge 2014 dataset are presented.