AITopics | Trimponias, George

Collaborating Authors

Trimponias, George

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reinforcement Learning with Exogenous States and Rewards

Trimponias, George, Dietterich, Thomas G.

arXiv.org Artificial IntelligenceMar-22-2023

Exogenous state variables and rewards can slow reinforcement learning by injecting uncontrolled variation into the reward signal. This paper formalizes exogenous state variables and rewards and shows that if the reward function decomposes additively into endogenous and exogenous components, the MDP can be decomposed into an exogenous Markov Reward Process (based on the exogenous reward) and an endogenous Markov Decision Process (optimizing the endogenous reward). Any optimal policy for the endogenous MDP is also an optimal policy for the original MDP, but because the endogenous reward typically has reduced variance, the endogenous MDP is easier to solve. We study settings where the decomposition of the state space into exogenous and endogenous state spaces is not given but must be discovered. The paper introduces and proves correctness of algorithms for discovering the exogenous and endogenous subspaces of the state space when they are mixed through linear combination. These algorithms can be applied during reinforcement learning to discover the exogenous space, remove the exogenous reward, and focus reinforcement learning on the endogenous MDP. Experiments on a variety of challenging synthetic MDPs show that these methods, applied online, discover large exogenous state spaces and produce substantial speedups in reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2303.12957

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Telecommunications (0.45)
Government > Regional Government > North America Government > United States Government (0.45)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)

Add feedback

Comparing EM with GD in Mixture Models of Two Components

Zhang, Guojun, Poupart, Pascal, Trimponias, George

arXiv.org Machine LearningJul-18-2019

The expectation-maximization (EM) algorithm has been widely used in minimizing the negative log likelihood (also known as cross entropy) of mixture models. However, little is understood about the goodness of the fixed points it converges to. In this paper, we study the regions where one component is missing in two-component mixture models, which we call one-cluster regions. We analyze the propensity of such regions to trap EM and gradient descent (GD) for mixtures of two Gaussians and mixtures of two Bernoullis. In the case of Gaussian mixtures, EM escapes one-cluster regions exponentially fast, while GD escapes them linearly fast. In the case of mixtures of Bernoullis, we find that there exist one-cluster regions that are stable for GD and therefore trap GD, but those regions are unstable for EM, allowing EM to escape. Those regions are local minima that appear universally in experiments and can be arbitrarily bad. This work implies that EM is less likely than GD to converge to certain bad local optima in mixture models.

artificial intelligence, converge, machine learning, (19 more...)

arXiv.org Machine Learning

1907.03783

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Discovering and Removing Exogenous State Variables and Rewards for Reinforcement Learning

Dietterich, Thomas G., Trimponias, George, Chen, Zhitang

arXiv.org Machine LearningJun-5-2018

Exogenous state variables and rewards can slow down reinforcement learning by injecting uncontrolled variation into the reward signal. We formalize exogenous state variables and rewards and identify conditions under which an MDP with exogenous state can be decomposed into an exogenous Markov Reward Process involving only the exogenous state+reward and an endogenous Markov Decision Process defined with respect to only the endogenous rewards. We also derive a variance-covariance condition under which Monte Carlo policy evaluation on the endogenous MDP is accelerated compared to using the full MDP. Similar speedups are likely to carry over to all RL algorithms. We develop two algorithms for discovering the exogenous variables and test them on several MDPs. Results show that the algorithms are practical and can significantly speed up reinforcement learning.

artificial intelligence, mdp, télécommunications, (14 more...)

arXiv.org Machine Learning

1806.01584

Country:

North America > United States > Oregon (0.14)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Telecommunications (0.68)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Dynamic Sum Product Networks for Tractable Inference on Sequence Data (Extended Version)

Melibari, Mazen, Poupart, Pascal, Doshi, Prashant, Trimponias, George

arXiv.org Artificial IntelligenceJul-15-2016

Sum-Product Networks (SPN) have recently emerged as a new class of tractable probabilistic graphical models. Unlike Bayesian networks and Markov networks where inference may be exponential in the size of the network, inference in SPNs is in time linear in the size of the network. Since SPNs represent distributions over a fixed set of variables only, we propose dynamic sum product networks (DSPNs) as a generalization of SPNs for sequence data of varying length. A DSPN consists of a template network that is repeated as many times as needed to model data sequences of any length. We present a local search technique to learn the structure of the template network. In contrast to dynamic Bayesian networks for which inference is generally exponential in the number of variables per time slice, DSPNs inherit the linear inference complexity of SPNs. We demonstrate the advantages of DSPNs over DBNs and other models on several datasets of sequence data.

artificial intelligence, machine learning, node, (17 more...)

arXiv.org Artificial Intelligence

1511.04412

Country:

Asia > Japan > Honshū > Chūbu (0.14)
North America > United States > Georgia > Clarke County > Athens (0.14)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback