AITopics

doi: 10.1007/s11227-018-2591-3

2006.01022

Country:

Asia > China > Heilongjiang Province > Harbin (0.05)
Africa > Middle East > Algeria > Khenchela Province > Khenchela (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

arXiv.org Machine Learning, Jun-27-2020

Thermodynamic Machine Learning through Maximum Work Production

Boyd, A. B., Crutchfield, J. P., Gu, M.

Adaptive thermodynamic systems -- such as a biological organism attempting to gain survival advantage, an autonomous robot performing a functional task, or a motor protein transporting intracellular nutrients -- can improve their performance by effectively modeling the regularities and stochasticity in their environments. Analogously, but in a purely computational realm, machine learning algorithms seek to estimate models that capture predictable structure and identify irrelevant noise in training data by optimizing performance measures, such as a model's log-likelihood of having generated the data. Is there a sense in which these computational models are physically preferred? For adaptive physical systems we introduce the organizing principle that thermodynamic work is the most relevant performance measure of advantageously modeling an environment. Specifically, a physical agent's model determines how much useful work it can harvest from an environment. We show that when such agents maximize work production they also maximize their environmental model's log-likelihood, establishing an equivalence between thermodynamics and learning. In this way, work maximization appears as an organizing principle that underlies learning in adaptive thermodynamic systems.

artificial intelligence, machine learning, work production

2006.15416

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Yolo County > Davis (0.04)

Genre: Research Report (0.50)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

Khetarpal, Khimya, Ahmed, Zafarali, Comanici, Gheorghe, Abel, David, Precup, Doina

What can I do here? A Theory of Affordances in Reinforcement Learning

arXiv.org Artificial Intelligence, Jun-26-2020

Reinforcement learning algorithms usually assume that all actions are always available to an agent. However, both people and animals understand the general link between the features of their environment and the actions that are feasible. Gibson (1977) coined the term "affordances" to describe the fact that certain states enable an agent to do certain actions, in the context of embodied agents. In this paper, we develop a theory of affordances for agents who learn and plan in Markov Decision Processes. Affordances play a dual role in this case. On one hand, they allow faster planning, by reducing the number of actions available in any given situation. On the other hand, they facilitate more efficient and precise learning of transition models from data, especially when such models require function approximation. We establish these properties through theoretical results as well as illustrative examples. We also propose an approach to learn affordances and use it to estimate transition models that are simpler and generalize better.
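The planning benefit described above can be illustrated with a minimal sketch, which is not the authors' method: value iteration in which the maximization at each state runs only over the actions afforded there. The names `value_iteration`, `affordances`, and `transitions`, as well as the three-state chain, are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

State = int
Action = str
# transitions[(s, a)] is a list of (probability, next_state, reward) triples.
Transitions = Dict[Tuple[State, Action], List[Tuple[float, State, float]]]


def value_iteration(
    states: List[State],
    affordances: Dict[State, List[Action]],
    transitions: Transitions,
    gamma: float = 0.9,
    tol: float = 1e-8,
) -> Dict[State, float]:
    """Value iteration that only considers the actions afforded in each state."""
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            actions = affordances.get(s, [])
            if not actions:
                continue  # no afforded action: treat as terminal, value stays 0
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in transitions[(s, a)])
                for a in actions
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


# Hypothetical 3-state chain: state 2 is terminal, reaching it pays reward 1.
chain_states = [0, 1, 2]
chain_transitions: Transitions = {
    (0, "right"): [(1.0, 1, 0.0)],
    (1, "left"): [(1.0, 0, 0.0)],
    (1, "right"): [(1.0, 2, 1.0)],
}
# "left" is not afforded at the wall (state 0), so it is never evaluated there.
chain_affordances = {0: ["right"], 1: ["left", "right"], 2: []}
chain_values = value_iteration(chain_states, chain_affordances, chain_transitions)
```

Because the inner maximization skips actions that are not afforded, each sweep does less work, and the transition model only needs entries for afforded state-action pairs.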

affordance, machine learning, reinforcement learning

2006.15085

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

arXiv.org Artificial Intelligence, Jun-26-2020

Algorithm for Computing Approximate Nash equilibrium in Continuous Games with Application to Continuous Blotto

Ganzfried, Sam

Successful algorithms have been developed for computing Nash equilibrium in a variety of finite game classes. However, solving continuous games---in which the pure strategy space is (potentially uncountably) infinite---is far more challenging. Nonetheless, many real-world domains have continuous action spaces, e.g., where actions refer to an amount of time, money, or other resource that is naturally modeled as being real-valued as opposed to integral. We present a new algorithm for computing Nash equilibrium strategies in continuous games. In addition to two-player zero-sum games, our algorithm also applies to multiplayer games and games of imperfect information. We experiment with our algorithm on a continuous imperfect-information Blotto game, in which two players distribute resources over multiple battlefields. Blotto games have frequently been used to model national security scenarios and have also been applied to electoral competition and auction theory. Experiments show that our algorithm is able to quickly compute close approximations of Nash equilibrium strategies for this game.

algorithm, artificial intelligence, optimization problem

2006.07443

Country:

North America > United States > Texas (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment > Games (1.00)
Government > Military (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Kovalev, Maxim S., Utkin, Lev V.

Counterfactual explanation of machine learning survival models

arXiv.org Machine Learning, Jun-26-2020

A method for counterfactual explanation of machine learning survival models is proposed. One of the difficulties of solving the counterfactual explanation problem is that the classes of examples are implicitly defined through outcomes of a machine learning survival model in the form of survival functions. A condition that establishes the difference between survival functions of the original example and the counterfactual is introduced. This condition is based on using a distance between mean times to event. It is shown that the counterfactual explanation problem can be reduced to a standard convex optimization problem with linear constraints when the explained black-box model is the Cox model. For other black-box models, it is proposed to apply the well-known Particle Swarm Optimization algorithm. Numerous numerical experiments with real and synthetic data demonstrate the proposed method.

evolutionary algorithm, explanation, machine learning

2006.16793

Country:

Asia > Russia (0.14)
North America > United States > New Jersey (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)

Sankararaman, Abishek, Basu, Soumya, Sankararaman, Karthik Abinav

Dominate or Delete: Decentralized Competing Bandits with Uniform Valuation

arXiv.org Machine Learning, Jun-26-2020

We study regret minimization problems in a two-sided matching market where uniformly valued demand-side agents (a.k.a. agents) continuously compete to be matched with supply-side agents (a.k.a. arms) with unknown and heterogeneous valuations. Such markets abstract online matching platforms (e.g., UpWork, TaskRabbit) and fall within the purview of the matching bandit models introduced by Liu et al. The uniform valuation on the demand side admits a unique stable matching equilibrium in the system. We design the first decentralized algorithm for matching bandits under uniform valuation that does not require any knowledge of reward gaps or time horizon, and thus partially resolves an open question posed by Liu et al. The algorithm works in phases of exponentially increasing length. In each phase $i$, an agent first deletes dominated arms, i.e., the arms preferred by agents ranked higher than itself. Deletion is followed by dynamic explore-exploit using the UCB algorithm on the remaining arms for $2^i$ rounds. Finally, the preferred arm is broadcast in a decentralized fashion to other agents through pure exploitation in $(N-1)K$ rounds with $N$ agents and $K$ arms. Comparing the obtained reward with respect to the unique stable matching, we show that the algorithm achieves $O(\log(T)/\Delta^2)$ regret in $T$ rounds, where $\Delta$ is the minimum gap across all agents and arms. We provide an (orderwise) matching regret lower bound.
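The phase structure in this abstract (delete dominated arms, then run UCB on the rest for $2^i$ rounds) can be sketched from a single agent's point of view. This is a simplified illustration under stated assumptions, not the paper's algorithm: the set of dominated arms is passed in as a fixed input rather than learned through the $(N-1)K$-round broadcast step, rewards are assumed Bernoulli, and the function names `ucb_phase` and `run_agent` are hypothetical.

```python
import math
import random
from typing import Dict, List, Set, Tuple


def ucb_phase(
    means: List[float],
    active: List[int],
    counts: Dict[int, int],
    sums: Dict[int, float],
    rounds: int,
    rng: random.Random,
) -> int:
    """Run UCB on the active arms for `rounds` rounds; return the most-played arm."""
    plays = {a: 0 for a in active}
    for _ in range(rounds):
        t = sum(counts.values()) + 1  # total pulls so far, plus the current one

        def index(a: int) -> float:
            if counts[a] == 0:
                return float("inf")  # force one pull of every untried arm
            return sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])

        arm = max(active, key=index)
        # Assumption: Bernoulli rewards with the given (unknown to the agent) means.
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        plays[arm] += 1
    return max(active, key=lambda a: plays[a])


def run_agent(
    means: List[float], dominated: Set[int], num_phases: int, seed: int = 0
) -> Tuple[int, Dict[int, int]]:
    """Phases i = 1..num_phases: drop dominated arms, then UCB for 2**i rounds."""
    rng = random.Random(seed)
    counts = {a: 0 for a in range(len(means))}
    sums = {a: 0.0 for a in range(len(means))}
    preferred = -1
    for i in range(1, num_phases + 1):
        # Assumes at least one arm remains after deletion.
        active = [a for a in range(len(means)) if a not in dominated]
        preferred = ucb_phase(means, active, counts, sums, 2 ** i, rng)
    return preferred, counts
```

For example, `run_agent([0.9, 0.2, 0.1], dominated={0}, num_phases=5)` never pulls arm 0, even though it has the highest mean, because a higher-ranked agent is assumed to hold it.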

agent, artificial intelligence, data mining

2006.15166

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (0.74)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Data Science > Data Mining (0.93)

González-Duque, Miguel, Palm, Rasmus Berg, Ha, David, Risi, Sebastian

Finding Game Levels with the Right Difficulty in a Few Trials through Intelligent Trial-and-Error

arXiv.org Artificial Intelligence, Jun-25-2020

Methods for dynamic difficulty adjustment allow games to be tailored to particular players to maximize their engagement. However, current methods often only modify a limited set of game features such as the difficulty of the opponents, or the availability of resources. Other approaches, such as experience-driven Procedural Content Generation (PCG), can generate complete levels with desired properties such as levels that are neither too hard nor too easy, but require many iterations. This paper presents a method that can generate and search for complete levels with a specific target difficulty in only a few trials. This advance is enabled by an Intelligent Trial-and-Error algorithm, originally developed to allow robots to adapt quickly. Our algorithm first creates a large variety of different levels that vary across predefined dimensions such as leniency or map coverage. The performance of an AI playing agent on these maps gives a proxy for how difficult the level would be for another AI agent (e.g. one that employs Monte Carlo Tree Search instead of Greedy Tree Search); using this information, a Bayesian Optimization procedure is deployed, updating the difficulty of the prior map to reflect the ability of the agent. The approach can reliably find levels with a specific target difficulty for a variety of planning agents in only a few trials, while maintaining an understanding of their skill landscape.

agent, artificial intelligence, machine learning

2005.07677

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.89)

Brooks, Nathan A., Powers, Simon T., Borg, James M.

A mechanism to promote social behaviour in household load balancing

arXiv.org Artificial Intelligence, Jun-25-2020

Reducing the peak energy consumption of households is essential for the effective use of renewable energy sources, in order to ensure that as much household demand as possible can be met by renewable sources. This entails spreading out the use of high-powered appliances such as dishwashers and washing machines throughout the day. Traditional approaches to this problem have relied on differential pricing set by a centralised utility company. But this mechanism has not been effective in promoting widespread shifting of appliance usage. Here we consider an alternative decentralised mechanism, where agents receive an initial allocation of time-slots to use their appliances and can then exchange these with other agents. If agents are willing to be more flexible in the exchanges they accept, then overall satisfaction, in terms of the percentage of agents' time-slot preferences that are satisfied, will increase. This requires a mechanism that can incentivise agents to be more flexible. Building on previous work, we show that a mechanism incorporating social capital - the tracking of favours given and received - can incentivise agents to act flexibly and give favours by accepting exchanges that do not immediately benefit them. We demonstrate that a mechanism that tracks favours increases the overall satisfaction of agents, and crucially allows social agents that give favours to outcompete selfish agents that do not under payoff-biased social learning. Thus, even completely self-interested agents are expected to learn to produce socially beneficial outcomes.
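The idea of tracking favours given and received can be made concrete with a small sketch. The decision rule below is an assumption made for illustration and is not taken from the paper: a social agent grants a non-beneficial (but harmless) exchange unless the requester already owes it a favour, while a selfish agent only accepts exchanges that benefit it. The names `Agent`, `credit`, and `request_exchange` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Agent:
    name: str
    preferred: Set[int]          # time-slots this agent would like to hold
    social: bool = True          # selfish agents never give favours
    # credit[other] > 0 means `other` owes this agent that many favours.
    credit: Dict[str, int] = field(default_factory=dict)


def request_exchange(requester: Agent, responder: Agent,
                     offered: int, requested: int) -> bool:
    """Return True if `responder` agrees to swap `requested` for `offered`."""
    if requested in responder.preferred:
        return False  # responder would lose a slot it wants (assumed rule)
    if offered in responder.preferred:
        return True   # mutually beneficial swap, no favour involved
    if not responder.social:
        return False  # selfish agents reject exchanges with no direct gain
    if responder.credit.get(requester.name, 0) > 0:
        return False  # requester already owes a favour and has not repaid it
    # Responder gives a favour: record it on both ledgers.
    responder.credit[requester.name] = responder.credit.get(requester.name, 0) + 1
    requester.credit[responder.name] = requester.credit.get(responder.name, 0) - 1
    return True
```

Under this rule a requester cannot keep drawing favours from the same agent without repaying one, which is one simple way a ledger can make flexibility sustainable; the slot swap itself is left out to keep the sketch short.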

agent, social agent, social capital

doi: 10.1162/isal_a_00290

2006.14526

Country:

Europe > United Kingdom > England > Staffordshire (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Indiana > Marion County > Indianapolis (0.04)

Genre: Research Report > New Finding (0.47)

Industry:

Energy > Renewable (1.00)
Energy > Power Industry > Utilities (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Ghods, Ramina, Banerjee, Arundhati, Schneider, Jeff

Asynchronous Multi Agent Active Search

arXiv.org Machine Learning, Jun-25-2020

Active search refers to the problem of efficiently locating targets in an unknown environment by actively making data-collection decisions, and has many applications including detecting gas leaks, radiation sources or human survivors of disasters using aerial and/or ground robots (agents). Existing active search methods are in general only amenable to a single agent or, if they extend to multiple agents, they require a central control system to coordinate the actions of all agents. However, such control systems are often impractical in robotics applications. In this paper, we propose two distinct active search algorithms called SPATS (Sparse Parallel Asynchronous Thompson Sampling) and LATSI (LAplace Thompson Sampling with Information gain) that allow multiple agents to independently make data-collection decisions without a central coordinator. Throughout, we assume that targets are sparsely located around the environment, in keeping with compressive sensing assumptions and their applicability in real-world scenarios. Additionally, while most common search algorithms assume that agents can sense the entire environment (e.g. compressive sensing) or sense point-wise (e.g. Bayesian Optimization) at all times, we make the realistic assumption that each agent can only sense a contiguous region of space at a time. We provide simulation results as well as theoretical analysis to demonstrate the efficacy of our proposed algorithms.

artificial intelligence, bayesian inference, machine learning