Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation

Neural Information Processing Systems

In this work, we propose to apply trust region optimization to deep reinforcement learning using a recently proposed Kronecker-factored approximation to the curvature. We extend the framework of natural policy gradient and propose to optimize both the actor and the critic using Kronecker-factored approximate curvature (K-FAC) with trust region; hence we call our method Actor Critic using Kronecker-Factored Trust Region (ACKTR). To the best of our knowledge, this is the first scalable trust region natural gradient method for actor-critic methods. It is also a method that learns non-trivial tasks in continuous control as well as discrete control policies directly from raw pixel inputs. We tested our approach across discrete domains in Atari games as well as continuous domains in the MuJoCo environment. With the proposed methods, we are able to achieve higher rewards and a 2- to 3-fold improvement in sample efficiency on average, compared to previous state-of-the-art on-policy actor-critic methods.
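The efficiency of the Kronecker-factored approximation rests on a standard identity: if a layer's Fisher block is approximated as a Kronecker product F ≈ A ⊗ G, then F⁻¹ = A⁻¹ ⊗ G⁻¹, so preconditioning the gradient needs only two small inverses instead of one large one. A minimal numpy sketch of that identity (the matrices here are random stand-ins, not quantities from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(n):
    """A random symmetric positive-definite matrix of size n."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

A = random_spd(3)   # stand-in for the input-activation covariance factor
G = random_spd(4)   # stand-in for the output-gradient covariance factor

# K-FAC approximates a layer's 12x12 Fisher block as F ≈ A ⊗ G.
F = np.kron(A, G)

# Key identity: (A ⊗ G)^-1 = A^-1 ⊗ G^-1, so the 12x12 inverse
# reduces to one 3x3 and one 4x4 inverse.
F_inv_cheap = np.kron(np.linalg.inv(A), np.linalg.inv(G))
F_inv_direct = np.linalg.inv(F)
assert np.allclose(F_inv_cheap, F_inv_direct)

# Preconditioning a flattened weight gradient with F^-1 yields the
# approximate natural-gradient direction used by methods like ACKTR.
grad = rng.standard_normal(12)
nat_grad = F_inv_cheap @ grad
```

For a weight matrix of size n×m this turns an (nm)³ inversion cost into n³ + m³, which is what makes the approach scale to deep networks.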


Countering Feedback Delays in Multi-Agent Learning

Neural Information Processing Systems

We consider a model of game-theoretic learning based on online mirror descent (OMD) with asynchronous and delayed feedback information. Instead of focusing on specific games, we consider a broad class of continuous games defined by the general equilibrium stability notion, which we call λ-variational stability. Our first contribution is that, in this class of games, the actual sequence of play induced by OMD-based learning converges to Nash equilibria provided that the feedback delays faced by the players are synchronous and bounded. Subsequently, to tackle fully decentralized, asynchronous environments with (possibly) unbounded delays between actions and feedback, we propose a variant of OMD which we call delayed mirror descent (DMD), and which relies on the repeated leveraging of past information. With this modification, the algorithm converges to Nash equilibria with no feedback synchronicity assumptions and even when the delays grow superlinearly relative to the horizon of play.
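To make the delayed-feedback setting concrete, here is a toy entropic online mirror descent (exponentiated-gradient) loop on the simplex where every gradient arrives a fixed number of rounds late. The setup and names are illustrative, not the paper's DMD algorithm:

```python
import numpy as np

def delayed_omd(loss_vec, T=500, d=5, eta=0.1):
    """Entropic OMD on the simplex; each gradient is observed d rounds late."""
    n = len(loss_vec)
    x = np.full(n, 1.0 / n)           # start at the uniform distribution
    pending = []                       # gradients still "in flight"
    for t in range(T):
        pending.append(loss_vec)       # gradient of the linear loss <loss_vec, x>
        if t >= d:
            g = pending[t - d]         # feedback from round t - d arrives now
            x = x * np.exp(-eta * g)   # mirror (exponentiated-gradient) step
            x /= x.sum()               # project back onto the simplex
    return x

x = delayed_omd(np.array([0.3, 0.1, 0.6]))
# Despite the delay, play concentrates on the lowest-loss action (index 1).
assert x.argmax() == 1
```

With a fixed bounded delay the update is simply shifted in time; the harder regime the paper addresses is unbounded, asynchronous delays, which this sketch does not capture.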


Parallel Streaming Wasserstein Barycenters

Neural Information Processing Systems

Efficiently aggregating data from different sources is a challenging problem, particularly when samples from each source are distributed differently. These differences can be inherent to the inference task or present for other reasons: sensors in a sensor network may be placed far apart, affecting their individual measurements. Conversely, it is computationally advantageous to split Bayesian inference tasks across subsets of data, but data need not be identically distributed across subsets. One principled way to fuse probability distributions is via the lens of optimal transport: the Wasserstein barycenter is a single distribution that summarizes a collection of input measures while respecting their geometry. However, computing the barycenter scales poorly and requires discretization of all input distributions and the barycenter itself.
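As a point of intuition for what a Wasserstein barycenter fuses, the one-dimensional case has a closed form: the 2-Wasserstein barycenter's quantile function is the (weighted) average of the input quantile functions. A toy numpy sketch of that special case (not the paper's parallel streaming algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
# Empirical samples from two differently distributed "sources".
source_a = rng.normal(loc=-2.0, scale=1.0, size=10_000)
source_b = rng.normal(loc=+2.0, scale=1.0, size=10_000)

qs = np.linspace(0.001, 0.999, 999)
quant_a = np.quantile(source_a, qs)    # empirical inverse CDF of source A
quant_b = np.quantile(source_b, qs)    # empirical inverse CDF of source B

# 1-D W2 barycenter (equal weights): average the quantile functions.
bary_quant = 0.5 * (quant_a + quant_b)

# For N(-2, 1) and N(2, 1) the barycenter is approximately N(0, 1),
# so its median should sit near zero.
assert abs(bary_quant[len(qs) // 2]) < 0.1
```

Note how this differs from averaging densities, which would produce a bimodal mixture: the barycenter respects the geometry and yields a single unimodal summary.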


Learning Linear Dynamical Systems via Spectral Filtering

Neural Information Processing Systems

We present an efficient and practical algorithm for the online prediction of discrete-time linear dynamical systems with a symmetric transition matrix. We circumvent the non-convex optimization problem using improper learning: carefully overparameterize the class of LDSs by a polylogarithmic factor, in exchange for convexity of the loss functions. From this arises a polynomial-time algorithm with a near-optimal regret guarantee, with an analogous sample complexity bound for agnostic learning. Our algorithm is based on a novel filtering technique, which may be of independent interest: we convolve the time series with the eigenvectors of a certain Hankel matrix.
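The filtering step can be sketched in a few lines: build the fixed Hankel matrix with entries 2/((i+j)³ − (i+j)) for 1-based indices, take its top eigenvectors, and use them as filters on the input series. This shows only the feature construction; the convex prediction step on top of these features is omitted:

```python
import numpy as np

T, k = 100, 5                          # series length, number of filters
i = np.arange(1, T + 1)
S = i[:, None] + i[None, :]            # 1-based index sums i + j
Z = 2.0 / (S ** 3 - S)                 # Hankel matrix from the paper

eigvals, eigvecs = np.linalg.eigh(Z)   # Z is symmetric, so use eigh
filters = eigvecs[:, -k:]              # top-k eigenvectors serve as filters

x = np.sin(0.1 * np.arange(T))         # toy input time series
features = filters.T @ x               # convolve: one inner product per filter
assert features.shape == (k,)
```

The eigenvalues of this Hankel matrix decay rapidly, which is why a small, polylogarithmic number of filters suffices and the overparameterization stays cheap.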


Kernel functions based on triplet comparisons

Neural Information Processing Systems

Given only information in the form of similarity triplets "Object A is more similar to object B than to object C" about a data set, we propose two ways of defining a kernel function on the data set. While previous approaches construct a low-dimensional Euclidean embedding of the data set that reflects the given similarity triplets, we aim at defining kernel functions that correspond to high-dimensional embeddings. These kernel functions can subsequently be used to apply any kernel method to the data set.
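One natural construction in this spirit (illustrative, not necessarily the paper's exact definition) is to represent each object by the ±1 vector of its answers to all triplet queries "is A closer to B than to C?", and take inner products of these vectors as the kernel:

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.standard_normal((6, 2))   # hidden geometry generating the triplets
n = len(points)

def triplet_features(a):
    """±1 answer of object a to every ordered pair (b, c) with b != c."""
    feats = []
    for b in range(n):
        for c in range(n):
            if b == c:
                continue
            d_ab = np.linalg.norm(points[a] - points[b])
            d_ac = np.linalg.norm(points[a] - points[c])
            feats.append(1.0 if d_ab < d_ac else -1.0)
    return np.array(feats)

# The kernel matrix is built from triplet answers alone -- the coordinates
# of `points` are never exposed to the downstream kernel method.
Phi = np.stack([triplet_features(a) for a in range(n)])
K = Phi @ Phi.T

# A Gram matrix of explicit features is symmetric PSD by construction.
assert np.allclose(K, K.T)
assert np.linalg.eigvalsh(K).min() > -1e-8
```

Because K is a genuine Gram matrix, any kernel method (SVMs, kernel PCA, kernel clustering) can consume it directly, without ever constructing a Euclidean embedding of the objects.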


Inverse Filtering for Hidden Markov Models

Neural Information Processing Systems

This paper considers a number of related inverse filtering problems for hidden Markov models (HMMs). In particular, given a sequence of state posteriors and the system dynamics; i) estimate the corresponding sequence of observations, ii) estimate the observation likelihoods, and iii) jointly estimate the observation likelihoods and the observation sequence. We show how to avoid a computationally expensive mixed integer linear program (MILP) by exploiting the algebraic structure of the HMM filter using simple linear algebra operations, and provide conditions for when the quantities can be uniquely reconstructed. We also propose a solution to the more general case where the posteriors are noisily observed. Finally, the proposed inverse filtering algorithms are evaluated on real-world polysomnographic data used for automatic sleep segmentation.
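The algebraic structure being exploited is visible in the filter recursion itself: since the HMM filter update is π_k ∝ diag(b_k) Pᵀ π_{k−1}, consecutive posteriors and the transition matrix determine the observation-likelihood vector b_k up to scale by elementwise division. A toy two-state sketch of that inverse step (noise-free case, with made-up numbers):

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])             # transition matrix (rows sum to 1)
b_true = np.array([0.7, 0.2])          # per-state likelihoods of observation y_k

pi_prev = np.array([0.6, 0.4])         # filter posterior at time k-1
pred = P.T @ pi_prev                   # one-step predicted state distribution
pi_next = b_true * pred
pi_next /= pi_next.sum()               # filter posterior at time k

# Inverse filtering step: divide out the prediction, then fix the
# unobservable overall scale of the likelihood vector.
b_recovered = pi_next / pred
b_recovered *= b_true.sum() / b_recovered.sum()

assert np.allclose(b_recovered, b_true)
```

In the noise-free case no integer programming is needed at all; the paper's harder contributions concern uniqueness conditions and the setting where the posteriors are only noisily observed.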


WNBA investigation finds no evidence of hateful comments toward Angel Reese

FOX News

The WNBA and the Indiana Fever announced that the allegations of "hateful comments" directed toward Angel Reese on May 17 were "not substantiated." Reese and her Chicago Sky faced the Fever and Caitlin Clark, and at one point, the two had to be separated after a flagrant foul by Clark against Reese. The association announced the next day that it would launch an investigation into the alleged comments.


If Ted Talks are getting shorter, what does that say about our attention spans?

The Guardian

Age: Ted started in 1984.
And has Ted been talking ever since? I know, and they do the inspirational online talks.
Correct, under the slogan "Ideas change everything".
She was talking at the Hay festival, in Wales.


Jasmine Crockett shares bizarre song clip calling herself 'leader of the future'

FOX News

Texas Rep. Jasmine Crockett attacked President Donald Trump's West Point address on MSNBC and called it proof of his unfitness as commander in chief. Rep. Jasmine Crockett, D-Texas, appears to be leaning in on her rising political stardom this week, briefly sharing what appeared to be a fan-made song that referred to the Democratic firebrand as the "leader of the future." "Jasmine Crockett, she rises with the dawn. Fighting for justice, her light will never be gone. Infectious with passion, she'll never bow down," the song went.


These robot cats have glowing eyes and artificial heartbeats – and could help reduce stress in children

The Guardian

At Springwood library in the Blue Mountains, a librarian appears with a cat carrier in each hand. About 30 children gather around in a semicircle. Inside each carrier, a pair of beaming, sci-fi-like eyes peer out at the expectant crowd. "That is the funniest thing ever," one child says. The preschoolers have just finished reading The Truck Cat by Deborah Frenkel and Danny Snell for the annual National Simultaneous Storytime.