
 Atkeson, Christopher G.


Soft Robotic Dynamic In-Hand Pen Spinning

arXiv.org Artificial Intelligence

Dynamic in-hand manipulation remains challenging for soft robotic systems, which offer safe, compliant interaction but struggle with high-speed dynamic tasks. In this work, we present SWIFT, a system for learning dynamic tasks using a soft and compliant robotic hand. Unlike previous works that rely on simulation, quasi-static actions, and precise object models, the proposed system learns to spin a pen through trial and error using only real-world data, without requiring explicit prior knowledge of the pen's physical attributes. With self-labeled trials sampled from the real world, the system discovers the set of pen-grasping and spinning primitive parameters that enables a soft hand to spin a pen robustly and reliably. After 130 sampled actions per object, SWIFT achieves a 100% success rate across three pens with different weights and weight distributions, demonstrating the system's generalizability and robustness to changes in object properties. The results highlight the potential for soft robotic end-effectors to perform dynamic tasks, including rapid in-hand manipulation. We also demonstrate that SWIFT generalizes to spinning items with different shapes and weights, such as a brush and a screwdriver, which we spin with 10/10 and 5/10 success rates respectively. Videos, data, and code are available at https://soft-spin.github.io.
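
The trial-and-error search described above, driven only by self-labeled real-world trials, can be pictured as a black-box optimization over grasping and spinning primitive parameters. The Python sketch below uses a simple cross-entropy-method sampler for that loop; the sampler choice, the parameter dimensionality, and the run_trial interface are illustrative assumptions, not SWIFT's actual implementation.

import numpy as np

def optimize_spin_params(run_trial, dim=4, iters=13, pop=10, elite=3, seed=0):
    """run_trial(params) -> 1.0 for a successful spin, 0.0 otherwise,
    self-labeled from a real-world rollout (e.g., by tracking the pen)."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)           # belief over primitive parameters
    for _ in range(iters):                            # 13 x 10 = 130 real-world trials
        params = rng.normal(mu, sigma, size=(pop, dim))
        scores = np.array([run_trial(p) for p in params])
        best = params[np.argsort(-scores)[:elite]]    # keep the elite samples
        mu, sigma = best.mean(axis=0), best.std(axis=0) + 1e-3
    return mu                                         # parameters to deploy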


Shape from Shading for Robotic Manipulation

arXiv.org Artificial Intelligence

Controlling illumination can generate high-quality information about object surface normals and depth discontinuities at low computational cost. In this work we demonstrate a controlled-illumination approach at robot-workspace scale that generates high-quality information about tabletop-scale objects for robotic manipulation. With our low-angle-of-incidence directional illumination approach, we can precisely capture the surface normals and depth discontinuities of monochromatic Lambertian objects. We show that this approach to shape estimation 1) is valuable for general-purpose grasping with a single-point vacuum gripper, 2) can measure the deformation of known objects, and 3) can estimate the pose of known objects and track unknown objects in the robot's workspace.
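
The controlled-illumination idea above can be made concrete with standard photometric stereo, which recovers per-pixel surface normals of a Lambertian surface from images taken under known directional lights. The Python/NumPy sketch below solves the Lambertian model in least squares; camera and light calibration, shadow handling, and the system's treatment of depth discontinuities are not specified in the abstract and remain assumptions here.

import numpy as np

def estimate_normals(images, light_dirs):
    """images: (k, H, W) grayscale captures, one per directional light.
    light_dirs: (k, 3) unit vectors from the surface toward each light.
    Returns per-pixel unit normals (H, W, 3) and albedo (H, W)."""
    k, H, W = images.shape
    I = images.reshape(k, -1)                             # stack pixels: (k, H*W)
    # Lambertian model: I = L @ g with g = albedo * normal; solve for all
    # pixels at once in least squares.
    g, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)    # (3, H*W)
    albedo = np.linalg.norm(g, axis=0)
    normals = g / np.maximum(albedo, 1e-8)                # normalize, avoid divide-by-zero
    return normals.T.reshape(H, W, 3), albedo.reshape(H, W)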


Learning Exploration Strategies to Solve Real-World Marble Runs

arXiv.org Artificial Intelligence

Tasks involving locally unstable or discontinuous dynamics (such as bifurcations and collisions) remain challenging in robotics, because small variations in the environment can have a significant impact on task outcomes. For such tasks, learning a robust deterministic policy is difficult. We focus on structuring exploration with multiple stochastic policies based on a mixture-of-experts (MoE) policy representation that can be efficiently adapted. The MoE policy is composed of stochastic sub-policies that allow exploration of multiple distinct regions of the action space (or strategies) and a high-level selection policy that guides exploration toward the most promising regions. We develop a robot system to evaluate our approach in a real-world physical problem-solving domain. After training the MoE policy in simulation, online learning in the real world demonstrates efficient adaptation within just a few dozen attempts, with a minimal sim2real gap. Our results confirm that representing multiple strategies promotes efficient adaptation in new environments and that strategies learned under different dynamics can still provide useful information about where to look for good strategies.
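
As a rough illustration of the structure described above, the Python sketch below pairs a softmax high-level selection policy over strategy values with Gaussian sub-policies and updates the selector from observed outcomes; the selector form, sub-policy parameterization, and update rule are illustrative assumptions rather than the paper's exact method.

import numpy as np

class MoEPolicy:
    def __init__(self, means, stds, temperature=1.0, lr=0.3, seed=0):
        self.means, self.stds = np.asarray(means), np.asarray(stds)
        self.values = np.zeros(len(self.means))  # running value estimate per strategy
        self.temp, self.lr = temperature, lr
        self.rng = np.random.default_rng(seed)

    def act(self):
        probs = np.exp(self.values / self.temp)
        probs /= probs.sum()                     # high-level softmax selection policy
        k = self.rng.choice(len(probs), p=probs)
        action = self.rng.normal(self.means[k], self.stds[k])  # explore within strategy k
        return k, action

    def update(self, k, reward):
        # Online adaptation: move strategy k's value toward the observed outcome.
        self.values[k] += self.lr * (reward - self.values[k])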


Nonparametric Representation of Policies and Value Functions: A Trajectory-Based Approach

Neural Information Processing Systems

A longstanding goal of reinforcement learning is to develop nonparametric representations of policies and value functions that support rapid learning without suffering from interference or the curse of dimensionality. We have developed a trajectory-based approach, in which policies and value functions are represented nonparametrically along trajectories. These trajectories, policies, and value functions are updated as the value function becomes more accurate or as a model of the task is updated. We have applied this approach to periodic tasks such as hopping and walking, which required handling discount factors and discontinuities in the task dynamics, and using function approximation to represent value functions at discontinuities. We also describe extensions of the approach to make the policies more robust to modeling error and sensor noise.
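
One way to picture the nonparametric, trajectory-based representation is a store of states, values, and actions collected along optimized trajectories, read back by locally weighted averaging near a query state. The Python sketch below uses a Gaussian kernel for that lookup; the kernel, the bandwidth, and the treatment of discounting and discontinuities are illustrative assumptions rather than the paper's exact scheme.

import numpy as np

class TrajectoryValueFunction:
    def __init__(self, bandwidth=0.5):
        self.states, self.values, self.actions = [], [], []
        self.h = bandwidth

    def add_trajectory(self, states, values, actions):
        # Store samples from a newly optimized (or re-optimized) trajectory.
        self.states += list(states)
        self.values += list(values)
        self.actions += list(actions)

    def query(self, x):
        X = np.asarray(self.states)
        w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * self.h ** 2))  # Gaussian kernel weights
        w /= w.sum()
        value = w @ np.asarray(self.values)      # locally weighted value estimate
        action = w @ np.asarray(self.actions)    # locally weighted policy lookup
        return value, action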


Minimax Differential Dynamic Programming: An Application to Robust Biped Walking

Neural Information Processing Systems

We developed a robust control policy design method in high-dimensional state space by using differential dynamic programming with a minimax criterion. As an example, we applied our method to a simulated five-link biped robot. The results show lower joint torques from the optimal control policy compared to a hand-tuned PD servo controller. Results also show that the simulated biped robot can successfully walk with unknown disturbances that cause controllers generated by standard differential dynamic programming and the hand-tuned PD servo to fail. Learning to compensate for modeling error and previously unknown disturbances, in conjunction with robust control design, is also demonstrated.
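
The minimax criterion behind this design method can be illustrated with a one-step backup in which the control is chosen to minimize cost under the worst-case disturbance. Full minimax differential dynamic programming propagates such a backup through local quadratic models along a trajectory, so the discretized search in the Python sketch below is only a simplified illustration with assumed interfaces.

import numpy as np

def minimax_action(x, controls, disturbances, dynamics, cost, value):
    """Return the control minimizing worst-case one-step cost-to-go:
    u* = argmin_u max_w [ cost(x, u, w) + value(dynamics(x, u, w)) ]."""
    best_u, best_worst = None, np.inf
    for u in controls:
        worst = max(cost(x, u, w) + value(dynamics(x, u, w)) for w in disturbances)
        if worst < best_worst:                   # keep the control with the best worst case
            best_u, best_worst = u, worst
    return best_u, best_worst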


Nonparametric Model-Based Reinforcement Learning

Neural Information Processing Systems

This paper describes some of the interactions of model learning algorithms and planning algorithms we have found in exploring model-based reinforcement learning. The paper focuses on how local trajectory optimizers can be used effectively with learned nonparametric models. We find that trajectory planners that are fully consistent with the learned model often have difficulty finding reasonable plans in the early stages of learning. Trajectory planners that balance obeying the learned model with minimizing cost (or maximizing reward) often do better, even if the plan is not fully consistent with the learned model.
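
The point that planners balancing model consistency against cost can do better than fully consistent ones can be illustrated by a trajectory optimizer in which the learned dynamics enter the objective as a weighted penalty rather than a hard constraint. In the Python sketch below, the penalty weight lam, the L-BFGS-B optimizer, and the learned_model and step_cost interfaces are illustrative assumptions, not the paper's implementation.

import numpy as np
from scipy.optimize import minimize

def plan(x0, horizon, n_u, learned_model, step_cost, lam=1.0):
    """Jointly optimize states x_1..x_T and controls u_0..u_{T-1}. A small lam
    lets the planner partially ignore an inaccurate early model; a large lam
    forces near-consistency with the learned dynamics."""
    n_x = len(x0)

    def objective(z):
        xs = np.vstack([x0, z[:horizon * n_x].reshape(horizon, n_x)])
        us = z[horizon * n_x:].reshape(horizon, n_u)
        cost = sum(step_cost(xs[t], us[t]) for t in range(horizon))
        defect = sum(np.sum((xs[t + 1] - learned_model(xs[t], us[t])) ** 2)
                     for t in range(horizon))    # soft model-consistency penalty
        return cost + lam * defect

    z0 = np.zeros(horizon * (n_x + n_u))         # crude initialization
    res = minimize(objective, z0, method="L-BFGS-B")
    return res.x[horizon * n_x:].reshape(horizon, n_u)   # planned control sequence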