Multiple-Instance Active Learning
Settles, Burr, Craven, Mark, Ray, Soumya
In a multiple instance (MI) learning problem, instances are naturally organized into bags and it is the bags, instead of individual instances, that are labeled for training. MI learners assume that every instance in a bag labeled negative is actually negative, whereas at least one instance in a bag labeled positive is actually positive. We present a framework for active learning in the multiple-instance setting. In particular, we consider the case in which an MI learner is allowed to selectively query unlabeled instances in positive bags. This approach is well motivated in domains in which it is inexpensive to acquire bag labels and possible, but expensive, to acquire instance labels. We describe a method for learning from labels at mixed levels of granularity, and introduce two active query selection strategies motivated by the MI setting. Our experiments show that learning from instance labels can significantly improve the performance of a basic MI learning algorithm in two multiple-instance domains: content-based image retrieval and text classification.
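For illustration, the following is a minimal sketch of instance-level uncertainty sampling restricted to positive bags, a simplified stand-in for the paper's query strategies rather than the authors' method. The toy bag construction, the logistic-regression base learner, and helper names such as make_bag and select_query are assumptions.

# Minimal sketch of multiple-instance active learning (illustrative only).
# Bags of feature vectors carry bag-level labels; instances inside positive
# bags can be individually queried.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_bag(positive, n=5, d=3):
    """Toy bag: negative bags contain only background instances;
    positive bags contain at least one shifted (truly positive) instance."""
    X = rng.normal(0.0, 1.0, size=(n, d))
    if positive:
        X[0] += 2.0  # the 'witness' instance that makes the bag positive
    return X

bags = [make_bag(i % 2 == 1) for i in range(40)]
bag_labels = np.array([i % 2 for i in range(40)])

# Naive MI baseline: every instance inherits its bag's label
X = np.vstack(bags)
y = np.concatenate([[lab] * len(b) for b, lab in zip(bags, bag_labels)])
clf = LogisticRegression().fit(X, y)

def select_query(clf, bags, bag_labels):
    """Uncertainty sampling restricted to instances in positive bags:
    query the instance whose predicted probability is closest to 0.5."""
    best = None
    for b_idx, (b, lab) in enumerate(zip(bags, bag_labels)):
        if lab != 1:
            continue
        p = clf.predict_proba(b)[:, 1]
        i = int(np.argmin(np.abs(p - 0.5)))
        score = abs(p[i] - 0.5)
        if best is None or score < best[0]:
            best = (score, b_idx, i)
    return best[1], best[2]

print("query (bag, instance):", select_query(clf, bags, bag_labels))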
Topmoumoute Online Natural Gradient Algorithm
Roux, Nicolas L., Manzagol, Pierre-antoine, Bengio, Yoshua
Guided by the goal of obtaining an optimization algorithm that is both fast and yields good generalization, we study the descent direction that maximizes the decrease in generalization error, or the probability of not increasing generalization error. The surprising result is that, from both the Bayesian and frequentist perspectives, this can yield the natural gradient direction. Although that direction can be very expensive to compute, we develop an efficient, general, online approximation to natural gradient descent that is suited to large-scale problems. We report experimental results showing much faster convergence, in computation time and in number of iterations, with TONGA (Topmoumoute Online Natural Gradient Algorithm) than with stochastic gradient descent, even on very large datasets.
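To make the idea concrete, here is a minimal sketch of a damped natural-gradient update on logistic regression; it is not TONGA itself. The empirical-Fisher estimate from per-example gradients, the damping term eps, and the learning rate are illustrative assumptions.

# Damped natural-gradient descent on a toy logistic-regression problem.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.1 * rng.normal(size=n) > 0).astype(float)

def per_example_grads(w):
    """Gradient of the logistic loss for each example (n x d matrix)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (p - y)[:, None] * X

w = np.zeros(d)
eps, lr = 1e-2, 0.5
for step in range(100):
    G = per_example_grads(w)
    g = G.mean(axis=0)                    # ordinary (stochastic) gradient
    F = (G.T @ G) / n + eps * np.eye(d)   # empirical Fisher + damping
    w -= lr * np.linalg.solve(F, g)       # natural-gradient update

print("train accuracy:", np.mean(((X @ w) > 0) == (y > 0.5)))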
HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation
We present a novel paradigm for statistical machine translation (SMT), based on joint modeling of word alignment and the topical aspects underlying bilingual document pairs via a hidden Markov Bilingual Topic AdMixture (HM-BiTAM). In this new paradigm, sentence pairs from a parallel document pair are coupled via a certain semantic flow, to ensure coherence of topical context in the alignment of matching words between languages, during likelihood-based training of topic-dependent translation lexicons as well as topic representations in each language. The trained HM-BiTAM can not only display topic patterns, as methods such as LDA do, but now for bilingual corpora; it also offers a principled way of inferring optimal translations in a context-dependent way. Our method integrates the conventional HMM-based IBM models --- a key component of most state-of-the-art SMT systems --- with the recently proposed BiTAM model, and we report an extensive empirical analysis of our method, in many ways complementary to the description above, in three aspects: word alignment, bilingual topic representation, and translation.
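As a toy illustration of topic-dependent translation lexicons (only one ingredient of HM-BiTAM, omitting the HMM alignment and admixture inference), the sketch below mixes per-topic lexicons with a document-level topic vector. The vocabularies, probabilities, and variable names are invented for illustration, not taken from the paper.

# Topic-weighted translation lexicons: p(f|e) = sum_k theta_k * p(f|e,k).
import numpy as np

topics = 2
src_vocab = ["bank", "river"]    # source-language words
tgt_vocab = ["banque", "rive"]   # target-language words

# p(target | source, topic): one lexicon per topic (rows sum to 1)
lexicon = np.array([
    [[0.9, 0.1],    # topic 0 ("finance"): bank -> banque, river -> rive
     [0.2, 0.8]],
    [[0.3, 0.7],    # topic 1 ("geography"): bank -> rive more often
     [0.1, 0.9]],
])

def translation_prob(src, tgt, theta):
    """Topic-marginalized translation probability under mixture theta."""
    s, t = src_vocab.index(src), tgt_vocab.index(tgt)
    return float(sum(theta[k] * lexicon[k, s, t] for k in range(topics)))

finance_doc = np.array([0.9, 0.1])   # inferred topic mixture for one document
geo_doc = np.array([0.1, 0.9])

print(translation_prob("bank", "banque", finance_doc))  # favored in finance context
print(translation_prob("bank", "rive", geo_doc))        # favored in geography context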
Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes
Hinton, Geoffrey E., Salakhutdinov, Ruslan R.
We show how to use unlabeled data and a deep belief net (DBN) to learn a good covariance kernel for a Gaussian process. We first learn a deep generative model of the unlabeled data using the fast, greedy algorithm introduced by Hinton et al. If the data is high-dimensional and highly structured, a Gaussian kernel applied to the top layer of features in the DBN works much better than a similar kernel applied to the raw input. Performance at both regression and classification can then be further improved by using backpropagation through the DBN to discriminatively fine-tune the covariance kernel.
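The sketch below conveys the "learned features as kernel inputs" idea using a single sklearn BernoulliRBM as a stand-in for the greedily trained DBN; the stacked layers and discriminative fine-tuning by backpropagation described in the paper are omitted, and the toy data and hyperparameters are assumptions.

# Gaussian-kernel GP regression on raw inputs vs. on unsupervised RBM features.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.random((100, 20))             # high-dimensional inputs in [0, 1)
y = np.sin(X[:, :5].sum(axis=1))      # labels for a small regression task

# Unsupervised feature learning (stand-in for the DBN's top-layer features)
rbm = BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=20, random_state=0)
H = rbm.fit_transform(X)

gp_raw = GaussianProcessRegressor(kernel=RBF(), alpha=1e-6).fit(X[:80], y[:80])
gp_feat = GaussianProcessRegressor(kernel=RBF(), alpha=1e-6).fit(H[:80], y[:80])

print("raw-input GP score:      ", gp_raw.score(X[80:], y[80:]))
print("learned-feature GP score:", gp_feat.score(H[80:], y[80:]))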
Expectation Maximization and Posterior Constraints
Ganchev, Kuzman, Taskar, Ben, Graça, João
The expectation maximization (EM) algorithm is a widely used maximum likelihood estimation procedure for statistical models when the values of some of the variables in the model are not observed. Very often, however, our aim is primarily to find a model that assigns values to the latent variables that have intended meaning for our data, and maximizing expected likelihood only sometimes accomplishes this. Unfortunately, it is typically difficult to add even simple a-priori information about latent variables in graphical models without making the models overly complex or intractable. In this paper, we present an efficient, principled way to inject rich constraints on the posteriors of latent variables into the EM algorithm. Our method can be used to learn tractable graphical models that satisfy additional, otherwise intractable constraints. Focusing on clustering and the alignment problem for statistical machine translation, we show that simple, intuitive posterior constraints can greatly improve the performance over standard baselines and be competitive with more complex, intractable models.
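A minimal sketch of a constrained E-step on a two-component Gaussian mixture is given below; it is not the paper's general framework. The balance constraint on expected cluster sizes and the simple multiplicative projection of the posteriors are illustrative assumptions.

# EM for a 1-D two-component mixture with a toy posterior constraint.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 70), rng.normal(2, 1, 30)])

mu = np.array([-1.0, 1.0])
for _ in range(20):
    # Standard E-step: posteriors p(z = k | x)
    lik = np.stack([norm.pdf(x, mu[k], 1.0) for k in range(2)], axis=1)
    q = lik / lik.sum(axis=1, keepdims=True)

    # Posterior constraint: push expected cluster sizes toward 50/50
    target = np.array([0.5, 0.5]) * len(x)
    q = q * (target / q.sum(axis=0))
    q = q / q.sum(axis=1, keepdims=True)

    # M-step with the constrained posteriors
    mu = (q * x[:, None]).sum(axis=0) / q.sum(axis=0)

print("estimated means:", mu)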