AITopics

1206.5264

Country: North America > Canada (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Xie, Ning, Hachiya, Hirotaka, Sugiyama, Masashi

Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting

arXiv.org Machine Learning Jun-18-2012

Oriental ink painting, called Sumi-e, is one of the most appealing painting styles and has attracted artists around the world. The major challenges in computer-based Sumi-e simulation are to abstract complex scene information and to draw smooth and natural brush strokes. To automatically find such strokes, we propose to model the brush as a reinforcement learning agent, and learn desired brush-trajectories by maximizing the sum of rewards in the policy search framework. We also provide an elaborate design of actions, states, and rewards tailored for a Sumi-e agent. The effectiveness of our proposed approach is demonstrated through simulated Sumi-e experiments.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

doi: 10.1587/transinf.E96.D.1134

1206.4634

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Levine, Sergey, Koltun, Vladlen

Continuous Inverse Optimal Control with Locally Optimal Examples

arXiv.org Artificial Intelligence Jun-18-2012

Inverse optimal control, also known as inverse reinforcement learning, is the problem of recovering an unknown reward function in a Markov decision process from expert demonstrations of the optimal policy. We introduce a probabilistic inverse optimal control algorithm that scales gracefully with task dimensionality, and is suitable for large, continuous domains where even computing a full policy is impractical. By using a local approximation of the reward function, our method can also drop the assumption that the demonstrations are globally optimal, requiring only local optimality. This allows it to learn from examples that are unsuitable for prior methods.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

1206.4617

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Brunskill, Emma, Leffler, Bethany, Li, Lihong, Littman, Michael L., Roy, Nicholas

CORL: A Continuous-state Offset-dynamics Reinforcement Learner

arXiv.org Machine Learning Jun-13-2012

Continuous state spaces and stochastic, switching dynamics characterize a number of rich, real-world domains, such as robot navigation across varying terrain. We describe a reinforcement-learning algorithm for learning in these domains and prove for certain environments the algorithm is probably approximately correct with a sample complexity that scales polynomially with the state-space dimension. Unfortunately, no optimal planning techniques exist in general for such problems; instead we use fitted value iteration to solve the learned MDP, and include the error due to approximate planning in our bounds. Finally, we report an experiment using a robotic car driving over varying terrain to demonstrate that these dynamics representations adequately capture real-world dynamics and that our algorithm can be used to efficiently solve such problems.
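The abstract above mentions fitted value iteration as the planner for the learned MDP. As a rough illustration only, here is a minimal sketch of fitted value iteration on a toy one-dimensional problem. The toy dynamics, the reward, the sample grid, and the linear-interpolation approximator are all assumptions made for this example; they are not taken from the CORL paper, which deals with stochastic, switching offset dynamics.

```python
import bisect

GAMMA = 0.9            # discount factor (assumed for this toy problem)
ACTIONS = (-0.1, 0.1)  # move left or right along the unit interval


def step(state, action):
    """Hypothetical deterministic dynamics: move, clip to [0, 1], reward near the right edge."""
    next_state = min(1.0, max(0.0, state + action))
    reward = 1.0 if next_state > 0.875 else 0.0
    return next_state, reward


def interpolate(xs, ys, x):
    """Piecewise-linear value approximator over the sorted sample states xs."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    w = (x - x0) / (x1 - x0)
    return (1.0 - w) * ys[i - 1] + w * ys[i]


def fitted_value_iteration(n_points=21, n_iters=200):
    """Repeat: compute Bellman backup targets at the sample states, then refit the approximator."""
    xs = [i / (n_points - 1) for i in range(n_points)]
    ys = [0.0] * n_points
    for _ in range(n_iters):
        targets = []
        for s in xs:
            backups = []
            for a in ACTIONS:
                s2, r = step(s, a)
                backups.append(r + GAMMA * interpolate(xs, ys, s2))
            targets.append(max(backups))
        # With an interpolating approximator, "fitting" is just storing the targets.
        ys = targets
    return xs, ys


if __name__ == "__main__":
    states, values = fitted_value_iteration()
    for s, v in zip(states, values):
        print(f"V({s:.2f}) = {v:.3f}")
```

In a learned model with stochastic dynamics, the backup would average over sampled next states instead of using a single deterministic transition, and the approximator would generally be a regressor rather than an exact interpolant.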

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1206.3231

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

AAAI Conferences Jun-8-2012

Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes

Weinstein, Ari (Rutgers University) | Littman, Michael L. (Rutgers University)

Recent research leverages results from the continuous-armed bandit literature to create a reinforcement-learning algorithm for continuous state and action spaces. The algorithm was initially proposed in a theoretical setting; we provide the first examination of its empirical properties. Through experimentation, we demonstrate the effectiveness of this planning method when coupled with exploration and model learning and show that, in addition to its formal guarantees, the approach is very competitive with other continuous-action reinforcement learners.

algorithm, holop, sequence, (17 more...)

Twenty-Second International Conference on Automated Planning and Scheduling

Country: North America > United States > New Jersey > Middlesex County > Piscataway (0.04)

Genre:

Workflow (0.68)
Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.41)

arXiv.org Artificial Intelligence Jun-8-2012

Preconditioned Temporal Difference Learning

HengShuai, Yao

This paper has been withdrawn by the author. This draft is withdrawn for its poor quality in English, unfortunately produced by the author when he was just starting his science route. Look at the ICML version instead: http://icml2008.cs.helsinki.fi/papers/111.pdf

machine learning, preconditioned temporal difference learning, reinforcement learning, (1 more...)

arXiv.org Artificial Intelligence

0704.1409

Country: Europe > Finland > Uusimaa > Helsinki (0.24)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)

N., Pradyot Korupolu V. (Indian Institute of Technology Madras) | Sivamurugan, Manimaran S. (Indian Institute of Technology Madras) | Ravindran, Balaraman (IIT Madras)

Instructing a Reinforcement Learner

AAAI Conferences May-20-2012

In reinforcement learning (RL), rewards have been considered the most important feedback in understanding the environment. However, recently there have been interesting forays into other modes such as using sporadic supervisory inputs. This brings into the learning process richer information about the world of interest. In this paper, we model these supervisory inputs as specific types of instructions that provide information in the form of an expert's control decision and certain structural regularities in the state space. We further provide a mathematical formulation for the same and propose a framework to incorporate them into the learning process.

instructing, reinforcement learner, supervisory input, (1 more...)

Twenty-Fifth International FLAIRS Conference

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

Oladell, Marcus Carlos (University of Texas at Arlington) | Huber, Manfred (University of Texas at Arlington)

Symbol Generation and Grounding for Reinforcement Learning Agents Using Affordances and Dictionary Compression

AAAI Conferences May-20-2012

One of the challenges for artificial agents is managing the complexity of their environment as they learn tasks, especially if they are grounded in the physical world. A scalable solution to address the state explosion problem is thus a prerequisite of physically grounded, agent-based systems. This paper presents a framework for developing grounded, symbolic representations aimed at scaling subsequent learning as well as forming a basis for symbolic reasoning. These symbols partition the environment so the agent need only consider an abstract view of the original space when learning new tasks, and they allow it to apply acquired symbols to novel situations.

abstract feature, affordance, representation, (11 more...)

Twenty-Fifth International FLAIRS Conference

Country: North America > United States > Texas > Tarrant County > Arlington (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

N., Pradyot Korupolu V. (Indian Institute of Technology Madras) | Sivamurugan, Manimaran S. (Indian Institute of Technology Madras) | Ravindran, Balaraman (IIT Madras)

Instructing a Reinforcement Learner

AAAI Conferences May-20-2012

agent, instruction, retriever, (16 more...)

Twenty-Fifth International FLAIRS Conference

Country: Asia > India (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Ortega, Pedro A., Braun, Daniel A.

Free Energy and the Generalized Optimality Equations for Sequential Decision Making

arXiv.org Machine Learning May-17-2012

The free energy functional has recently been proposed as a variational principle for bounded rational decision-making, since it instantiates a natural trade-off between utility gains and information processing costs that can be axiomatically derived. Here we apply the free energy principle to general decision trees that include both adversarial and stochastic environments. We derive generalized sequential optimality equations that not only include the Bellman optimality equations as a limit case, but also lead to well-known decision-rules such as Expectimax, Minimax and Expectiminimax. We show how these decision-rules can be derived from a single free energy principle that assigns a resource parameter to each node in the decision tree. These resource parameters express a concrete computational cost that can be measured as the number of samples that are needed from the distribution that belongs to each node. The free energy principle therefore provides the normative basis for generalized optimality equations that account for both adversarial and stochastic environments.
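The abstract above describes a single operator, with one resource parameter per node, that recovers max, expectation, and min nodes as limit cases. As a hedged illustration, the sketch below assumes the commonly used log-sum-exp form of the free energy, (1/beta) * log sum_i p_i * exp(beta * u_i); the abstract itself does not state the formula, and the function name and example numbers are hypothetical.

```python
import math


def free_energy_value(utilities, probs, beta):
    """Assumed certainty-equivalent form: (1/beta) * log(sum_i p_i * exp(beta * u_i)).

    beta -> +inf approaches max(utilities) (a maximizing node),
    beta -> 0    gives the expectation (a chance node),
    beta -> -inf approaches min(utilities) (an adversarial node).
    """
    if abs(beta) < 1e-12:
        # Limit case: the plain expected utility.
        return sum(p * u for p, u in zip(probs, utilities))
    # Shifted log-sum-exp for numerical stability; zero-probability outcomes are skipped.
    terms = [(beta * u, p) for u, p in zip(utilities, probs) if p > 0]
    shift = max(t for t, _ in terms)
    total = sum(p * math.exp(t - shift) for t, p in terms)
    return (shift + math.log(total)) / beta


if __name__ == "__main__":
    utilities = [1.0, 2.0, 3.0]
    probs = [1 / 3, 1 / 3, 1 / 3]
    for beta in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
        print(beta, free_energy_value(utilities, probs, beta))
```

Under this assumed form, applying the operator recursively from the leaves of a decision tree, with a large positive beta at the agent's nodes, beta near zero at chance nodes, and a large negative beta at opponent nodes, would approximate an Expectiminimax backup.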

artificial intelligence, machine learning, reinforcement learning, (12 more...)

1205.3997

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)