AITopics

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.30)

AAAI ConferencesAug-8-2011

FAQ-Learning in Matrix Games: Demonstrating Convergence Near Nash Equilibria, and Bifurcation of Attractors in the Battle of Sexes

Kaisers, Michael (Maastricht University) | Tuyls, Karl (Maastricht University)

This article studies Frequency Adjusted Q-learning (FAQ-learning), a variation of Q-learning that simulates simultaneous value function updates. The main contributions are empirical and theoretical support for the convergence of FAQ-learning to attractors near Nash equilibria in two-agent two-action matrix games.The games can be divided into three types: Matching pennies, Prisoners' Dilemma and Battle of Sexes. This article shows that the Matching pennies and Prisoners' Dilemma yield one attractor of the learning dynamics, while the Battle of Sexes exhibits a supercritical pitchfork bifurcation at a critical temperature, where one attractor splits into two attractors and one repellent fixed point. Experiments illustrate that the distance between fixed points of the FAQ-learning dynamics and Nash equilibria tends to zero as the exploration parameter of FAQ-learning approaches zero.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > Limburg > Maastricht (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Leisure & Entertainment > Games (0.49)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Comparing Action-Query Strategies in Semi-Autonomous Agents

Cohn, Robert (University of Michigan, Ann Arbor) | Durfee, Edmund (University of Michigan, Ann Arbor) | Singh, Satinder (University of Michigan)

We consider settings in which a semi-autonomous agent has uncertain knowledge about its environment, but can ask what action the human operator would prefer taking in the current or in a potential future state. Asking queries can improve behavior, but if queries come at a cost (e.g., due to limited operator attention), the value of each query should be maximized. We compare two strategies for selecting action queries: 1) based on myopically maximizing expected gain in long-term value, and 2) based on myopically minimizing uncertainty in the agent's policy representation. We show empirically that the first strategy tends to select more valuable queries, and that a hybrid method can outperform either method alone in settings with limited computation.

agent, emg-aq, query, (17 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Michigan (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Taylor, Matthew Edmund (Lafayette College)

Teaching Reinforcement Learning with Mario: An Argument and Case Study

Integrating games into the computer science curriculum has been gaining acceptance in recent years, particularly when used to improve student engagement in introductory courses. This paper argues that games can also be useful in upper level courses, such as general artificial intelligence and machine learning. We provide a case study of using a Mario game in a machine learning class to provide one successful data point where both content-specific and general learning outcomes were successfully achieved.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Second AAAI Symposium on Educational Advances in Artificial Intelligence

Country: North America > United States > Pennsylvania > Northampton County > Easton (0.04)

Genre: Instructional Material > Course Syllabus & Notes (0.66)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education > Curriculum (1.00)
Education > Educational Setting (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Tziortziotis, Nikolaos (University of Ioannina) | Blekas, Konstantinos (University of Ioannina)

A Bayesian Reinforcement Learning framework Using Relevant Vector Machines

In this work we present an advanced Bayesian formulation to the task of control learning that employs the Relevance Vector Machines (RVM) generative model for value function evaluation. The key aspect of the proposed method is the design of the discount return as a generalized linear model that constitutes a well-known probabilistic approach. This allows to augment the model with advantageous sparse priors provided by the RVM's regression framework. We have also taken into account the significant issue of selecting the proper parameters of the kernel design matrix. Experiments have shown that our method produces improved performance in both simulated and real test environments.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Europe > Greece > Epirus > Ioannina (0.05)
North America > United States (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Zhang, Chongjie (University of Massachusetts Amherst) | Lesser, Victor (University of Massachusetts Amherst)

Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs

In many multi-agent applications such as distributed sensor nets, a network of agents act collaboratively under uncertainty and local interactions. Networked Distributed POMDP (ND-POMDP) provides a framework to model such cooperative multi-agent decision making. Existing work on ND-POMDPs has focused on offline techniques that require accurate models, which are usually costly to obtain in practice. This paper presents a model-free, scalable learning approach that synthesizes multi-agent reinforcement learning (MARL) and distributed constraint optimization (DCOP). By exploiting structured interaction in ND-POMDPs, our approach distributes the learning of the joint policy and employs DCOP techniques to coordinate distributed learning to ensure the global learning performance. Our approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability. Experimental results show that, with communication during learning and execution, our approach significantly outperforms the nearly-optimal non-communication policies computed offline.

machine learning, nd-pomdp, reinforcement learning, (17 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Melo, Francisco S. (Instituto Superior Técnico/INESC-ID)

Differential Eligibility Vectors for Advantage Updating and Gradient Methods

In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of each action in the TD-error at each state. Specifically, we use DEV in TD-Q(lambda) to more accurately learn the relative value of the actions, rather than their absolute value. We identify conditions that ensure convergence w.p.1 of TD-Q(lambda) with DEV and show that this algorithm can also be used to directly approximate the advantage function associated with a given policy, without the need to compute an auxiliary function - something that, to the extent of our knowledge, was not known possible. Finally, we discuss the integration of DEV in LSTDQ and actor-critic algorithms.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country: Europe > Portugal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Scaling Up Reinforcement Learning through Targeted Exploration

Mann, Timothy Arthur (Texas A&M University) | Choe, Yoonsuck (Texas A&M University)

Recent Reinforcement Learning (RL) algorithms, such as R-MAX, make (with high probability) only a small number of poor decisions. In practice, these algorithms do not scale well as the number of states grows because the algorithms spend too much effort exploring. We introduce an RL algorithm State TArgeted R-MAX (STAR-MAX) that explores a subset of the state space, called the exploration envelope ξ. When ξ equals the total state space, STAR-MAX behaves identically to R-MAX. When ξ is a subset of the state space, to keep exploration within ξ, a recovery rule β is needed. We compared existing algorithms with our algorithm employing various exploration envelopes. With an appropriate choice of ξ, STAR-MAX scales far better than existing RL algorithms as the number of states increases. A possible drawback of our algorithm is its dependence on a good choice of ξ and β. However, we show that an effective recovery rule β can be learned on-line and ξ can be learned from demonstrations. We also find that even randomly sampled exploration envelopes can improve cumulative rewards compared to R-MAX. We expect these results to lead to more efficient methods for RL in large-scale problems.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country: North America > United States > Texas > Brazos County > College Station (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Konidaris, George (Massachusetts Institute of Technology) | Osentoski, Sarah (Brown University) | Thomas, Philip (University of Massachusetts Amherst)

Value Function Approximation in Reinforcement Learning Using the Fourier Basis

We describe the Fourier basis, a linear value function approximation scheme based on the Fourier series. We empirically demonstrate that it performs well compared to radial basis functions and the polynomial basis, the two most popular fixed bases for linear value function approximation, and is competitive with learned proto-value functions.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Singapore (0.05)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.83)

Basis Function Discovery Using Spectral Clustering and Bisimulation Metrics

Comanici, Gheorghe (McGill University) | Precup, Doina (McGill University)

We study the problem of automatically generating features for function approximation in reinforcement learning. We build on the work of Mahadevan and his colleagues, who pioneered the use of spectral clustering methods for basis function construction. Their methods work on top of a graph that captures state adjacency. Instead, we use bisimulation metrics in order to provide state distances for spectral clustering. The advantage of these metrics is that they incorporate reward information in a natural way, in addition to the state transition information. We provide theoretical bounds on the quality of the obtained approximation, which justify the importance of incorporating reward information. We also demonstrate empirically that the approximation quality improves when bisimulation metrics are used instead of the state adjacency graph in the basis function construction process.

approximation, machine learning, reinforcement learning, (16 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)