AITopics | Ding, Nan

Plotting

Ding, Nan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators

Chen, Changyou, Ding, Nan, Carin, Lawrence

Neural Information Processing SystemsDec-31-2015

Recent advances in Bayesian learning with large-scale data have witnessed emergence of stochastic gradient MCMC algorithms (SG-MCMC), such as stochastic gradient Langevin dynamics (SGLD), stochastic gradient Hamiltonian MCMC (SGHMC), and the stochastic gradient thermostat. While finite-time convergence properties of the SGLD with a 1st-order Euler integrator have recently been studied, corresponding theory for general SG-MCMCs has not been explored. In this paper we consider general SG-MCMCs with high-order integrators, and develop theory to analyze finite-time convergence properties and their asymptotic invariant measures. Our theoretical results show faster convergence rates and more accurate invariant measures for SG-MCMCs with higher-order integrators. For example, with the proposed efficient 2nd-order symmetric splitting integrator, the mean square error (MSE) of the posterior average for the SGHMC achieves an optimal convergence rate of $L^{-4/5}$ at $L$ iterations, compared to $L^{-2/3}$ for the SGHMC and SGLD with 1st-order Euler integrators. Furthermore, convergence results of decreasing-step-size SG-MCMCs are also developed, with the same convergence rates as their fixed-step-size counterparts for a specific decreasing sequence. Experiments on both synthetic and real datasets verify our theory, and show advantages of the proposed method in two large-scale real applications.

artificial intelligence, integrator, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Embedding Inference for Structured Multilabel Prediction

Mirzazadeh, Farzaneh, Ravanbakhsh, Siamak, Ding, Nan, Schuurmans, Dale

Neural Information Processing SystemsDec-31-2015

A key bottleneck in structured output prediction is the need for inference during training and testing, usually requiring some form of dynamic programming. Rather than using approximate inference or tailoring a specialized inference method for a particular structure---standard responses to the scaling challenge---we propose to embed prediction constraints directly into the learned representation. By eliminating the need for explicit inference a more scalable approach to structured output prediction can be achieved, particularly at test time. We demonstrate the idea for multi-label prediction under subsumption and mutual exclusion constraints, where a relationship to maximum margin structured output prediction can be established. Experiments demonstrate that the benefits of structured output training can still be realized even after inference has been eliminated.

constraint, inductive learning, optimization problem, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > Alberta (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

Add feedback

Bayesian Sampling Using Stochastic Gradient Thermostats

Ding, Nan, Fang, Youhan, Babbush, Ryan, Chen, Changyou, Skeel, Robert D., Neven, Hartmut

Neural Information Processing SystemsDec-31-2014

Dynamics-based sampling methods, such as Hybrid Monte Carlo (HMC) and Langevin dynamics (LD), are commonly used to sample target distributions. Recently, such approaches have been combined with stochastic gradient techniques to increase sampling efficiency when dealing with large datasets. An outstanding problem with this approach is that the stochastic gradient introduces an unknown amount of noise which can prevent proper sampling after discretization. To remedy this problem, we show that one can leverage a small number of additional variables in order to stabilize momentum fluctuations induced by the unknown noise. Our method is inspired by the idea of a thermostat in statistical physics and is justified by a general theory.

artificial intelligence, machine learning, sgnht, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

Add feedback

Dependent Hierarchical Normalized Random Measures for Dynamic Topic Modeling

Chen, Changyou, Ding, Nan, Buntine, Wray

arXiv.org Machine LearningJun-18-2012

We develop dependent hierarchical normalized random measures and apply them to dynamic topic modeling. The dependency arises via superposition, subsampling and point transition on the underlying Poisson processes of these measures. The measures used include normalised generalised Gamma processes that demonstrate power law properties, unlike Dirichlet processes used previously in dynamic topic modeling. Inference for the model includes adapting a recently developed slice sampler to directly manipulate the underlying Poisson process. Experiments performed on news, blogs, academic and Twitter collections demonstrate the technique gives superior perplexity over a number of previous models.

artificial intelligence, dirichlet process, social media, (17 more...)

arXiv.org Machine Learning

1206.4671

Country:

Oceania > Australia (0.28)
North America > United States > California (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Add feedback

Theory of Dependent Hierarchical Normalized Random Measures

Chen, Changyou, Buntine, Wray, Ding, Nan

arXiv.org Machine LearningMay-25-2012

This paper presents theory for Normalized Random Measures (NRMs), Normalized Generalized Gammas (NGGs), a particular kind of NRM, and Dependent Hierarchical NRMs which allow networks of dependent NRMs to be analysed. These have been used, for instance, for time-dependent topic modelling. In this paper, we first introduce some mathematical background of completely random measures (CRMs) and their construction from Poisson processes, and then introduce NRMs and NGGs. Slice sampling is also introduced for posterior inference. The dependency operators in Poisson processes and for the corresponding CRMs and NRMs is then introduced and Posterior inference for the NGG presented. Finally, we give dependency and composition results when applying these operators to NRMs so they can be used in a network with hierarchical and dependent relations.

artificial intelligence, machine learning, random measure, (15 more...)

arXiv.org Machine Learning

1205.4159

Country:

Oceania > Australia (0.28)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

t-divergence Based Approximate Inference

Ding, Nan, Qi, Yuan, Vishwanathan, S.v.n.

Neural Information Processing SystemsDec-31-2011

Approximate inference is an important technique for dealing with large, intractable graphical models based on the exponential family of distributions. We extend the idea of approximate inference to the t-exponential family by defining a new t-divergence. This divergence measure is obtained via convex duality between the log-partition function of the t-exponential family and a new t-entropy. We illustrate our approach on the Bayes Point Machine with a Student's t-prior.

artificial intelligence, bayesian inference, t-exponential family, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

t-logistic regression

Ding, Nan, Vishwanathan, S.v.n.

Neural Information Processing SystemsDec-31-2010

We extend logistic regression by using t-exponential families which were introduced recently in statistical physics. This gives rise to a regularized risk minimization problem with a non-convex loss function. An efficient block coordinate descent optimization scheme can be derived for estimating the parameters. Because of the nature of the loss function, our algorithm is tolerant to label noise. Furthermore, unlike other algorithms which employ non-convex loss functions, our algorithm is fairly robust to the choice of initial values. We verify both these observations empirically on a number of synthetic and real datasets.

artificial intelligence, logistic regression, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.29)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report > Experimental Study (0.77)
Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.77)

Add feedback