AITopics | Yaoliang Yu

Collaborating Authors

Yaoliang Yu

Deep Homogeneous Mixture Models: Representation, Separation, and Approximation

Priyank Jaini, Pascal Poupart, Yaoliang Yu

Neural Information Processing Systems, Mar-26-2025, 23:36:15 GMT

At their core, many unsupervised learning models provide a compact representation of homogeneous density mixtures, but their similarities and differences are not always clearly understood. In this work, we formally establish the relationships among latent tree graphical models (including special cases such as hidden Markov models and tensorial mixture models), hierarchical tensor formats and sum-product networks. Based on this connection, we then give a unified treatment of exponential separation in exact representation size between deep mixture architectures and shallow ones.

artificial intelligence, machine learning, node, (18 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Convex Two-Layer Modeling with Latent Structure

Vignesh Ganapathiraman, Xinhua Zhang, Yaoliang Yu, Junfeng Wen

Neural Information Processing Systems, Jan-20-2025, 11:25:28 GMT

Unsupervised learning of structured predictors has been a long-standing pursuit in machine learning. Recently, a conditional random field auto-encoder has been proposed in a two-layer setting, allowing a latent structured representation to be inferred automatically. Aside from being nonconvex, it also requires the demanding inference of normalization. In this paper, we develop a convex relaxation of a two-layer conditional model that captures latent structure and estimates model parameters jointly and optimally. We further expand its applicability by resorting to a weaker form of inference: maximum a posteriori. The flexibility of the model is demonstrated on two structures based on total unimodularity: graph matching and linear chain. Experimental results confirm the promise of the method.

artificial intelligence, constraint, machine learning, (18 more...)

Country: North America > Canada > Alberta (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.94)

Bregman Divergence for Stochastic Variance Reduction: Saddle-Point and Adversarial Prediction

Zhan Shi, Xinhua Zhang, Yaoliang Yu

Neural Information Processing Systems, Oct-8-2024, 07:28:08 GMT

Adversarial machines, where a learner competes against an adversary, have regained much recent interest in machine learning. They naturally take the form of saddle-point optimization, often with separable structure but sometimes also with unmanageably large dimension. In this work we show that adversarial prediction under multivariate losses can be solved much faster than before. We first reduce the problem size exponentially by using appropriate sufficient statistics, and then we adapt the new stochastic variance-reduced algorithm of Balamurugan & Bach (2016) to allow any Bregman divergence. We prove that the same linear rate of convergence is retained, and we show that for adversarial prediction using KL-divergence we can achieve a further speedup by a factor of the number of examples compared with the Euclidean alternative. We verify the theoretical findings through extensive experiments on two example applications: adversarial prediction and LPboosting.
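
For readers unfamiliar with the term, the sketch below illustrates what a Bregman divergence is and why KL-divergence and the Euclidean distance are both instances of it. It is not the paper's algorithm: the function names (`bregman`, `neg_entropy`, `half_sq_norm`, `kl`) are hypothetical and chosen only for this illustration. It uses the standard definition D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>.

```python
import math

def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    g = grad_phi(y)
    inner = sum(gi * (xi - yi) for gi, xi, yi in zip(g, x, y))
    return phi(x) - phi(y) - inner

# Generator 1: half the squared Euclidean norm, which yields 0.5 * ||x - y||^2.
def half_sq_norm(x):
    return 0.5 * sum(xi * xi for xi in x)

def half_sq_norm_grad(x):
    return list(x)

# Generator 2: negative entropy, which yields KL(x || y) when x and y
# are probability vectors with strictly positive entries.
def neg_entropy(x):
    return sum(xi * math.log(xi) for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) + 1.0 for xi in x]

def kl(x, y):
    """KL-divergence between two strictly positive probability vectors."""
    return sum(xi * math.log(xi / yi) for xi, yi in zip(x, y))

if __name__ == "__main__":
    p = [0.2, 0.3, 0.5]
    q = [0.4, 0.4, 0.2]
    print(bregman(half_sq_norm, half_sq_norm_grad, p, q))
    print(bregman(neg_entropy, neg_entropy_grad, p, q), kl(p, q))
```

Swapping the generator function changes the geometry in which an update is measured, which is the kind of flexibility the abstract refers to when it says the algorithm is adapted "to allow any Bregman divergence".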

artificial intelligence, inductive learning, machine learning, (19 more...)

Country: North America > United States (0.46)

Technology: