Makoviychuk, Viktor
DextrAH-G: Pixels-to-Action Dexterous Arm-Hand Grasping with Geometric Fabrics
Lum, Tyler Ga Wei, Matak, Martin, Makoviychuk, Viktor, Handa, Ankur, Allshire, Arthur, Hermans, Tucker, Ratliff, Nathan D., Van Wyk, Karl
A pivotal challenge in robotics is achieving fast, safe, and robust dexterous grasping across a diverse range of objects, an important goal within industrial applications. However, existing methods are often limited in speed, dexterity, and generality, and offer few or no hardware safety guarantees. In this work, we introduce DextrAH-G, a depth-based dexterous grasping policy trained entirely in simulation that combines reinforcement learning, geometric fabrics, and teacher-student distillation. We address key challenges in joint arm-hand policy learning, such as high-dimensional observation and action spaces, the sim2real gap, collision avoidance, and hardware constraints. DextrAH-G enables a 23-motor arm-hand robot to safely and continuously grasp and transport a large variety of objects at high speed using multi-modal inputs including depth images, allowing generalization across object geometry. Videos at https://sites.google.com/view/dextrah-g.
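The abstract mentions teacher-student distillation from a state-based policy to a depth-based one. The sketch below shows only the generic pattern: a frozen, privileged-state teacher supervises a depth-only student with an imitation loss. The module names and the plain MSE objective are illustrative assumptions, not the DextrAH-G implementation.

```python
import torch
import torch.nn as nn

def distill_step(teacher: nn.Module, student: nn.Module,
                 depth_images: torch.Tensor, privileged_state: torch.Tensor,
                 optimizer: torch.optim.Optimizer) -> float:
    """One imitation step: the student sees only depth, the teacher sees full state.

    `teacher` and `student` are hypothetical stand-in networks; the loss here is
    a simple action-matching MSE, used purely to illustrate the distillation idea.
    """
    with torch.no_grad():
        target_actions = teacher(privileged_state)   # frozen RL-trained teacher
    predicted_actions = student(depth_images)        # depth-only student policy
    loss = nn.functional.mse_loss(predicted_actions, target_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```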
Geometric Fabrics: a Safe Guiding Medium for Policy Learning
Van Wyk, Karl, Handa, Ankur, Makoviychuk, Viktor, Guo, Yijie, Allshire, Arthur, Ratliff, Nathan D.
Robot policies are always subject to complex, second-order dynamics that entangle their actions with the resulting states. In reinforcement learning (RL) contexts, policies bear the burden of deciphering these complicated interactions over massive amounts of experience and complex reward functions to learn how to accomplish tasks. Moreover, policies typically issue actions directly to controllers like Operational Space Control (OSC) or joint PD control, which induce straight-line motion towards these action targets in task or joint space. However, straight-line motion in these spaces for the most part does not capture the rich, nonlinear behavior our robots need to exhibit, shifting the burden of discovering these behaviors more completely onto the agent. Unlike these simpler controllers, geometric fabrics capture a much richer and more desirable set of behaviors via artificial, second-order dynamics grounded in nonlinear geometry. These artificial dynamics shift the uncontrolled dynamics of a robot, via an appropriate control law, into behavioral dynamics. Behavioral dynamics unlock a new action space and safe, guiding behavior over which RL policies are trained. They enable bang-bang-like RL policy actions that remain safe for real robots, simplify reward engineering, and help sequence real-world, high-performance policies. We describe the framework in general and create a specific instantiation for the problem of dexterous, in-hand reorientation of a cube by a highly actuated robot hand.
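To make the "artificial second-order dynamics as an action space" idea concrete, here is a minimal sketch: the policy emits a possibly abrupt target, and a designed damped attractor integrates it into smooth joint commands. The simple spring-damper form, gains, and time step are illustrative assumptions and are not the geometric-fabrics equations from the paper.

```python
import numpy as np

def behavioral_dynamics_step(q, qd, q_des, dt=0.002, k=60.0, d=15.0):
    """One integration step of a toy second-order behavioral system.

    The policy's action q_des acts as an attractor target; the designed
    dynamics turn it into smooth position/velocity commands for the robot.
    """
    qdd = k * (q_des - q) - d * qd   # second-order attractor acceleration
    qd_new = qd + qdd * dt           # integrate to commanded velocity
    q_new = q + qd_new * dt          # integrate to commanded position
    return q_new, qd_new

# Even extreme, bang-bang-like policy targets produce smooth commanded motion.
q, qd = np.zeros(7), np.zeros(7)
for t in range(500):
    q_des = np.ones(7) if t < 250 else -np.ones(7)
    q, qd = behavioral_dynamics_step(q, qd, q_des)
```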
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning
Huang, Shengyi, Gallouédec, Quentin, Felten, Florian, Raffin, Antonin, Dossa, Rousslan Fernand Julien, Zhao, Yanxiao, Sullivan, Ryan, Makoviychuk, Viktor, Makoviichuk, Denys, Danesh, Mohamad H., Roumégous, Cyril, Weng, Jiayi, Chen, Chufan, Rahman, Md Masudur, Araújo, João G. M., Quan, Guorui, Tan, Daniel, Klein, Timo, Charakorn, Rujikorn, Towers, Mark, Berthelot, Yann, Mehta, Kinal, Chakraborty, Dipam, KG, Arjun, Charraut, Valentin, Ye, Chang, Liu, Zichen, Alegre, Lucas N., Nikulin, Alexander, Hu, Xiao, Liu, Tianlin, Choi, Jongwook, Yi, Brent
In many Reinforcement Learning (RL) papers, learning curves are useful indicators to measure the effectiveness of RL algorithms. However, the complete raw data of the learning curves are rarely available. As a result, it is usually necessary to reproduce the experiments from scratch, which can be time-consuming and error-prone. We present Open RL Benchmark, a set of fully tracked RL experiments, including not only the usual data such as episodic return, but also all algorithm-specific and system metrics. Open RL Benchmark is community-driven: anyone can download, use, and contribute to the data. At the time of writing, more than 25,000 runs have been tracked, for a cumulative duration of more than 8 years. Open RL Benchmark covers a wide range of RL libraries and reference implementations. Special care is taken to ensure that each experiment is precisely reproducible by providing not only the full parameters, but also the versions of the dependencies used to generate it. In addition, Open RL Benchmark comes with a command-line interface (CLI) that makes it easy to fetch data and generate figures presenting the results. In this document, we include two case studies to demonstrate the usefulness of Open RL Benchmark in practice. To the best of our knowledge, Open RL Benchmark is the first RL benchmark of its kind, and the authors hope that it will improve and facilitate the work of researchers in the field.
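As a rough illustration of what consuming such tracked data looks like, the sketch below aggregates episodic-return curves across seeds from a locally exported CSV and plots a mean curve with a standard-deviation band. The file name and column names are assumptions made for this example; the benchmark itself provides a CLI and fully tracked runs for exactly this purpose.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical export: "runs.csv" with columns seed, global_step, episodic_return.
df = pd.read_csv("runs.csv")
grouped = df.groupby("global_step")["episodic_return"]
mean, std = grouped.mean(), grouped.std()

plt.plot(mean.index, mean.values, label="mean over seeds")
plt.fill_between(mean.index, mean - std, mean + std, alpha=0.3)
plt.xlabel("global step")
plt.ylabel("episodic return")
plt.legend()
plt.savefig("learning_curve.png")
```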
DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality
Handa, Ankur, Allshire, Arthur, Makoviychuk, Viktor, Petrenko, Aleksei, Singh, Ritvik, Liu, Jingzhou, Makoviichuk, Denys, Van Wyk, Karl, Zhurkevich, Alexander, Sundaralingam, Balakumar, Narang, Yashraj, Lafleche, Jean-Francois, Fox, Dieter, State, Gavriel
Recent work has demonstrated the ability of deep reinforcement learning (RL) algorithms to learn complex robotic behaviours in simulation, including in the domain of multi-fingered manipulation. However, such models can be challenging to transfer to the real world due to the gap between simulation and reality. In this paper, we present our techniques to train a) a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand and b) a robust pose estimator suitable for providing reliable real-time information on the state of the object being manipulated. Our policies are trained to adapt to a wide range of conditions in simulation. Consequently, our vision-based policies significantly outperform the best vision policies in the literature on the same reorientation task and are competitive with policies that are given privileged state information via motion capture systems. Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups, and in our case, with the Allegro Hand and Isaac Gym GPU-based simulation. Furthermore, it opens up possibilities for researchers to achieve such results with commonly-available, affordable robot hands and cameras. Videos of the resulting policy and supplementary information, including experiments and demos, can be found at https://dextreme.org/
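The abstract notes that the policies are trained to adapt to a wide range of conditions in simulation. The snippet below sketches the generic domain-randomization pattern behind that statement: sampling physics and sensing parameters per episode so the policy must be robust across the whole range. The specific parameters and ranges are illustrative assumptions, not the actual DeXtreme randomization configuration.

```python
import numpy as np

def sample_randomization(rng: np.random.Generator) -> dict:
    """Sample one episode's worth of (hypothetical) randomization parameters."""
    return {
        "object_mass_scale": rng.uniform(0.5, 1.5),
        "friction": rng.uniform(0.5, 1.2),
        "joint_damping_scale": rng.uniform(0.8, 1.2),
        "observation_noise_std": rng.uniform(0.0, 0.02),
        "action_delay_steps": int(rng.integers(0, 3)),
    }

rng = np.random.default_rng(0)
episode_params = sample_randomization(rng)  # applied to the simulator at each reset
```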
DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with Population Based Training
Petrenko, Aleksei, Allshire, Arthur, State, Gavriel, Handa, Ankur, Makoviychuk, Viktor
In this work, we propose algorithms and methods that enable learning dexterous object manipulation using simulated one- or two-armed robots equipped with multi-fingered hand end-effectors. Using a parallel GPU-accelerated physics simulator (Isaac Gym), we implement challenging tasks for these robots, including regrasping, grasp-and-throw, and object reorientation. To solve these problems we introduce a decentralized Population-Based Training (PBT) algorithm that allows us to massively amplify the exploration capabilities of deep reinforcement learning. We find that this method significantly outperforms regular end-to-end learning and is able to discover robust control policies in challenging tasks. Video demonstrations of learned behaviors and the code can be found at https://sites.google.com/view/dexpbt
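For context, the core exploit/explore step of Population-Based Training can be sketched as follows: a poorly performing population member copies the weights of a stronger one and perturbs its hyperparameters. This is a simplified, single-process illustration with hypothetical member attributes (recent_score, weights, hparams), not the decentralized implementation described in the paper.

```python
import copy
import random

def pbt_update(member, population):
    """Exploit/explore step for one population member (illustrative thresholds)."""
    scores = sorted(p.recent_score for p in population)
    if member.recent_score <= scores[len(scores) // 4]:        # bottom quartile
        donor = max(population, key=lambda p: p.recent_score)  # exploit the best member
        member.weights = copy.deepcopy(donor.weights)
        member.hparams = {
            k: v * random.choice([0.8, 1.2])                    # explore: perturb hyperparameters
            for k, v in donor.hparams.items()
        }
```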
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine
Weng, Jiayi, Lin, Min, Huang, Shengyi, Liu, Bo, Makoviichuk, Denys, Makoviychuk, Viktor, Liu, Zichen, Song, Yufan, Luo, Ting, Jiang, Yukun, Xu, Zhongwen, Yan, Shuicheng
There has been significant progress in developing reinforcement learning (RL) training systems. Past works such as IMPALA, Ape-X, SEED RL, and Sample Factory aim to improve the system's overall throughput. In this paper, we aim to address a common bottleneck in RL training systems, i.e., parallel environment execution, which is often the slowest part of the whole system but receives little attention. With a curated design for parallelizing RL environments, we have improved RL environment simulation speed across different hardware setups, ranging from a laptop and a modest workstation to a high-end machine such as the NVIDIA DGX-A100. On a high-end machine, EnvPool achieves one million frames per second for environment execution on Atari environments and three million frames per second on MuJoCo environments. When running EnvPool on a laptop, the speed is 2.8x that of a Python subprocess-based implementation. Moreover, great compatibility with existing RL training libraries has been demonstrated in the open-source community, including CleanRL, rl_games, DeepMind Acme, etc. Finally, EnvPool allows researchers to iterate on their ideas at a much faster pace and has great potential to become the de facto RL environment execution engine. Example runs show that it takes only five minutes to train agents to play Atari Pong and MuJoCo Ant on a laptop. EnvPool is open-sourced at https://github.com/sail-sg/envpool.
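A minimal throughput check using EnvPool's gym-style interface is sketched below. It follows the project's documented usage pattern at the time of writing; exact return signatures can vary between EnvPool versions and the chosen env_type, so treat this as an assumption-laden sketch rather than a definitive recipe.

```python
import time
import numpy as np
import envpool

num_envs = 64
env = envpool.make("Pong-v5", env_type="gym", num_envs=num_envs)
obs = env.reset()
actions = np.zeros(num_envs, dtype=np.int32)  # no-op actions, just to measure stepping speed

start, frames = time.time(), 0
for _ in range(1_000):
    obs, rew, done, info = env.step(actions)  # batched step over all vectorized envs
    frames += num_envs
print(f"{frames / (time.time() - start):.0f} env frames/s")
```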
OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation
Wong, Josiah, Makoviychuk, Viktor, Anandkumar, Anima, Zhu, Yuke
Learning performant robot manipulation policies can be challenging due to high-dimensional continuous actions and complex physics-based dynamics. This can be alleviated through intelligent choice of action space. Operational Space Control (OSC) has been used as an effective task-space controller for manipulation. Nonetheless, its effectiveness depends on the fidelity of the underlying model, and it is prone to failure when there are modeling errors. In this work, we propose OSC for Adaptation and Robustness (OSCAR), a data-driven variant of OSC that compensates for modeling errors by inferring relevant dynamics parameters from online trajectories. OSCAR decomposes dynamics learning into task-agnostic and task-specific phases, decoupling the dynamics dependencies of the robot itself from the extrinsics due to its environment. This structure enables robust zero-shot performance under out-of-distribution conditions and rapid adaptation to significant domain shifts through additional finetuning. We evaluate our method on a variety of simulated manipulation problems, and find substantial improvements over an array of controller baselines. For more results and information, please visit https://cremebrule.github.io/oscar-web/.
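For reference, the textbook operational space control law that OSCAR builds on maps a task-space PD error through the task-space inertia and the Jacobian transpose into joint torques. The numpy sketch below shows that classical law only; the quality of the mass matrix M and Jacobian J is exactly the modeling fidelity the abstract refers to, and what OSCAR compensates for by inferring dynamics parameters from data. Gains are illustrative.

```python
import numpy as np

def osc_torques(x_err, xd, J, M, kp=150.0, kd=2.0 * np.sqrt(150.0)):
    """Classical OSC law (not OSCAR itself): task-space PD mapped to joint torques."""
    M_inv = np.linalg.inv(M)
    lambda_inv = J @ M_inv @ J.T          # inverse task-space inertia
    lam = np.linalg.pinv(lambda_inv)      # task-space inertia (pseudo-inverse for robustness)
    desired_acc = kp * x_err - kd * xd    # task-space PD acceleration
    return J.T @ (lam @ desired_acc)      # joint torques
```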
Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?
de Witt, Christian Schroeder, Gupta, Tarun, Makoviichuk, Denys, Makoviychuk, Viktor, Torr, Philip H. S., Sun, Mingfei, Whiteson, Shimon
Most recently developed approaches to cooperative multi-agent reinforcement learning in the centralized training with decentralized execution setting involve estimating a centralized, joint value function. In this paper, we demonstrate that, despite its various theoretical shortcomings, Independent PPO (IPPO), a form of independent learning in which each agent simply estimates its local value function, can perform just as well as or better than state-of-the-art joint learning approaches on the popular multi-agent benchmark suite SMAC with little hyperparameter tuning. We also compare IPPO to several variants; the results suggest that IPPO's strong performance may be due to its robustness to some forms of environment non-stationarity.
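The structural point of IPPO is that each agent owns its own actor-critic and is trained only on its local observations, actions, advantages, and returns, with no centralized joint value function. The sketch below shows a standard clipped PPO loss applied per agent under that constraint; the actor-critic interface is a hypothetical stand-in and network definitions are omitted, so this illustrates the structure rather than the paper's exact implementation.

```python
import torch
import torch.nn as nn

def ippo_loss(actor_critic: nn.Module, obs, actions, old_logp,
              advantages, returns, clip=0.2, vf_coef=0.5):
    """Per-agent PPO loss computed from local data only (no joint value function)."""
    logp, value = actor_critic(obs, actions)   # hypothetical interface: log-prob and local value
    ratio = torch.exp(logp - old_logp)
    clipped = torch.clamp(ratio, 1 - clip, 1 + clip)
    policy_loss = -torch.min(ratio * advantages, clipped * advantages).mean()
    value_loss = nn.functional.mse_loss(value, returns)
    return policy_loss + vf_coef * value_loss
```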