workcell

Logic-informed reinforcement learning for cross-domain optimization of large-scale cyber-physical systems

Wan, Guangxi, Zeng, Peng, Dong, Xiaoting, Song, Chunhe, Cui, Shijie, Li, Dong, Dong, Qingwei, Liu, Yiyang, Bai, Hongfei

arXiv.org Artificial Intelligence

Cyber-physical systems (CPS) require the joint optimization of discrete cyber actions and continuous physical parameters under stringent safety logic constraints. Existing hierarchical approaches often compromise global optimality, whereas reinforcement learning (RL) in hybrid action spaces typically relies on brittle reward penalties, masking, or shielding and struggles to guarantee constraint satisfaction. We present logic-informed reinforcement learning (LIRL), which equips standard policy-gradient algorithms with a projection that maps a low-dimensional latent action onto the admissible hybrid manifold defined on the fly by first-order logic. This guarantees the feasibility of every exploratory step without penalty tuning. Experimental evaluations have been conducted across multiple scenarios, including industrial manufacturing, electric vehicle charging stations, and traffic signal control, in all of which the proposed method outperforms existing hierarchical optimization approaches. Taking a robotic reducer assembly system in industrial manufacturing as an example, LIRL reduces the combined makespan-energy objective by 36.47% to 44.33% compared to conventional industrial hierarchical scheduling methods, while consistently maintaining zero constraint violations and significantly surpassing state-of-the-art hybrid-action reinforcement learning baselines. Thanks to its declarative logic-based constraint formulation, the framework transfers seamlessly to other domains such as smart transportation and the smart grid, paving the way for safe, real-time optimization in large-scale CPS.
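The projection idea can be illustrated with a toy sketch. This is a hypothetical EV-charging example, not the paper's implementation: `project_action`, the charger caps, and the grid limit are all illustrative assumptions. The latent action scores the discrete choices and proposes a continuous power level; the projection keeps only combinations admitted by the logic constraints.

```python
def project_action(latent_scores, latent_power, caps, available, grid_limit):
    """Project a latent (scores, power) action onto a feasible hybrid action.

    Toy admissibility logic: a charger must be available (discrete predicate),
    and the power must lie within both the charger's cap and the grid limit
    (continuous interval constraint).
    """
    # Keep only chargers whose availability predicate holds.
    feasible = [i for i, ok in enumerate(available) if ok]
    if not feasible:
        return None  # empty admissible set: no action is executed
    # Discrete part: highest-scoring feasible choice.
    i = max(feasible, key=lambda j: latent_scores[j])
    # Continuous part: clip the proposed power into the admissible interval.
    power = max(0.0, min(latent_power, caps[i], grid_limit))
    return i, power
```

Because every exploratory action passes through the projection, infeasible actions never reach the environment, which is the property the paper relies on instead of reward penalties.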


Learning to Optimize Package Picking for Large-Scale, Real-World Robot Induction

Li, Shuai, Keipour, Azarakhsh, Zhao, Sicong, Rajagopalan, Srinath, Swan, Charles, Bekris, Kostas E.

arXiv.org Artificial Intelligence

Warehouse automation plays a pivotal role in enhancing operational efficiency, minimizing costs, and improving resilience to workforce variability. While prior research has demonstrated the potential of machine learning (ML) models to increase picking success rates in large-scale robotic fleets by prioritizing high-probability picks and packages, these efforts primarily focused on predicting success probabilities for picks sampled using heuristic methods. Limited attention has been given, however, to leveraging data-driven approaches to directly optimize sampled picks for better performance at scale. In this study, we propose an ML-based framework that predicts transform adjustments and improves the selection of suction cups for multi-suction end effectors, enhancing the success probabilities of sampled picks. The framework was integrated and evaluated in test workcells that resemble the operations of Amazon Robotics' Robot Induction (Robin) fleet, which is used for package manipulation. Evaluated on over 2 million picks, the proposed method achieves a 20% reduction in pick failure rates compared to a heuristic-based pick sampling baseline, demonstrating its effectiveness in large-scale warehouse automation scenarios.


Stow: Robotic Packing of Items into Fabric Pods

Hudson, Nicolas, Hooks, Josh, Warrier, Rahul, Salisbury, Curt, Hartley, Ross, Kumar, Kislay, Chandrashekhar, Bhavana, Birkmeyer, Paul, Tang, Bosch, Frost, Matt, Thakar, Shantanu, Piaskowy, Tony, Nilsson, Petter, Petersen, Josh, Doshi, Neel, Slatter, Alan, Bhatia, Ankit, Meeker, Cassie, Xue, Yuechuan, Cox, Dylan, Kyriazis, Alex, Lou, Bai, Hasan, Nadeem, Rana, Asif, Chacko, Nikhil, Xu, Ruinian, Faal, Siamak, Seraj, Esi, Agrawal, Mudit, Jamieson, Kevin, Bisagni, Alessio, Samzun, Valerie, Fuller, Christine, Keklak, Alex, Frenkel, Alex, Ratliff, Lillian, Parness, Aaron

arXiv.org Artificial Intelligence

This paper presents a compliant manipulation system capable of placing items onto densely packed shelves. The wide diversity of items and strict business requirements for high production rates and low defect generation have so far prohibited warehouse robotics from performing this task. Our innovations in hardware, perception, decision-making, motion planning, and control have enabled this system to perform over 500,000 stows in a large e-commerce fulfillment center. The system achieves human levels of packing density and speed while prioritizing work on overhead shelves to enhance the safety of humans working alongside the robots.


RobotGraffiti: An AR tool for semi-automated construction of workcell models to optimize robot deployment

Zieliński, Krzysztof, Penning, Ryan, Blumberg, Bruce, Schlette, Christian, Kjærgaard, Mikkel Baun

arXiv.org Artificial Intelligence

Improving robot deployment is a central step towards speeding up robot-based automation in manufacturing. A main challenge in robot deployment is how best to place the robot within the workcell. To tackle this challenge, we combine two knowledge sources, the system's robotic knowledge and the user's workcell context awareness, and intersect them through an Augmented Reality interface. RobotGraffiti is a unique tool that empowers the user in robot deployment tasks. One simply takes a 3D scan of the workcell with a mobile device, adds contextual data points that would otherwise be difficult to infer from the system, and receives a robot base position that satisfies the automation task. The proposed approach is an alternative to expensive and time-consuming digital twins: a fast and easy-to-use tool that focuses on the selected workcell features needed to run the placement optimization algorithm. The main contributions of this paper are the novel user interface for robot base placement data collection and a study comparing traditional offline simulation with our proposed method. We showcase the method with a robot base placement solution and obtain up to a 16-fold reduction in deployment time.


Good Grasps Only: A data engine for self-supervised fine-tuning of pose estimation using grasp poses for verification

Hagelskjær, Frederik

arXiv.org Artificial Intelligence

In this paper, we present a novel method for self-supervised fine-tuning of pose estimation for bin picking. Leveraging zero-shot pose estimation, our approach enables the robot to obtain training data automatically, without manual labeling. After pose estimation, the object is grasped, and in-hand pose estimation is used for data validation. Our pipeline allows the system to fine-tune while the process is running, removing the need for a separate learning phase. The motivation behind our work lies in the need for rapid setup of pose estimation solutions. Specifically, we address the challenging task of bin picking, which plays a pivotal role in flexible robotic setups. Our method is implemented on a robotic workcell and tested with four different objects. For all objects, our method increases performance and outperforms a state-of-the-art method trained on the CAD models of the objects.


Multi-view Pose Fusion for Occlusion-Aware 3D Human Pose Estimation

Bragagnolo, Laura, Terreran, Matteo, Allegro, Davide, Ghidoni, Stefano

arXiv.org Artificial Intelligence

Robust 3D human pose estimation is crucial to ensure safe and effective human-robot collaboration. Accurate human perception, however, is particularly challenging in these scenarios due to strong occlusions and limited camera viewpoints, and current 3D human pose estimation approaches are rather vulnerable in such conditions. In this work we present a novel approach for robust 3D human pose estimation in the context of human-robot collaboration. Instead of relying on triangulation of noisy 2D features, we perform multi-view fusion on 3D skeletons provided by absolute monocular methods. Accurate 3D pose estimation is then obtained via reprojection error optimization, introducing limb-length symmetry constraints. We evaluate our approach on the public Human3.6M dataset and on a novel version, Human3.6M-Occluded, derived by adding synthetic occlusions to the camera views in order to test pose estimation algorithms under severe occlusions. We further validate our method on real human-robot collaboration workcells, in which we strongly surpass current 3D human pose estimation methods. Our approach outperforms state-of-the-art multi-view human pose estimation techniques and demonstrates superior capabilities in handling challenging scenarios with strong occlusions, representing a reliable and effective solution for real human-robot collaboration setups.
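A minimal NumPy sketch of the two ingredients, fusing per-view skeletons and enforcing limb-length symmetry, is given below. This is illustrative only: the joint indices and the closed-form two-bone symmetrization are assumptions, whereas the paper imposes symmetry as a constraint inside a reprojection-error optimization.

```python
import numpy as np

def fuse_skeletons(skeletons):
    # Per-joint median across K candidate skeletons (K x J x 3):
    # robust when one view is badly occluded.
    return np.median(np.asarray(skeletons, dtype=float), axis=0)

def symmetrize_pair(sk, parent_l, child_l, parent_r, child_r):
    # Rescale a left/right bone pair to their mean length,
    # keeping each bone's direction unchanged.
    vl = sk[child_l] - sk[parent_l]
    vr = sk[child_r] - sk[parent_r]
    target = 0.5 * (np.linalg.norm(vl) + np.linalg.norm(vr))
    sk[child_l] = sk[parent_l] + vl * (target / np.linalg.norm(vl))
    sk[child_r] = sk[parent_r] + vr * (target / np.linalg.norm(vr))
    return sk
```

For example, if the left and right "knee" bones measure 1 m and 3 m, symmetrization moves both to the mean length of 2 m along their original directions.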


Multi-Camera Hand-Eye Calibration for Human-Robot Collaboration in Industrial Robotic Workcells

Allegro, Davide, Terreran, Matteo, Ghidoni, Stefano

arXiv.org Artificial Intelligence

In industrial scenarios, effective human-robot collaboration relies on multi-camera systems to robustly monitor human operators despite the occlusions that typically arise in a robotic workcell. In this setting, precise localization of the person in the robot coordinate system is essential, making the hand-eye calibration of the camera network critical. This process presents significant challenges when high calibration accuracy must be achieved in a short time to minimize production downtime, and when dealing with the extensive camera networks used for monitoring wide areas, such as industrial robotic workcells. Our paper introduces an innovative and robust multi-camera hand-eye calibration method, designed to optimize each camera's pose relative to both the robot's base and every other camera. The optimization integrates two types of key constraints: i) a single board-to-end-effector transformation, and ii) the relative camera-to-camera transformations. We demonstrate the superior performance of our method through comprehensive experiments on the METRIC dataset and real-world data collected in industrial scenarios, showing notable advancements over state-of-the-art techniques even when using fewer than 10 images. Additionally, we release an open-source version of our multi-camera hand-eye calibration algorithm at https://github.com/davidea97/Multi-Camera-Hand-Eye-Calibration.git.
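The board-to-end-effector constraint amounts to chaining rigid transforms. A minimal sketch with 4x4 homogeneous matrices follows; this is a hypothetical single-view simplification (with rotations kept trivial), whereas the actual method jointly optimizes over many views together with the camera-to-camera constraints.

```python
import numpy as np

def se3(t):
    # Homogeneous 4x4 transform from a pure translation
    # (rotation left as identity to keep the sketch minimal).
    T = np.eye(4)
    T[:3, 3] = t
    return T

def camera_from_base(cam_T_board, board_T_ee, ee_T_base):
    # Chain the calibration-board and end-effector transforms to express the
    # camera pose in the robot-base frame:
    #   cam_T_base = cam_T_board @ board_T_ee @ ee_T_base
    return cam_T_board @ board_T_ee @ ee_T_base
```

With pure translations the chained translation is simply the sum of the three offsets, which makes the composition easy to sanity-check.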


Toward Automated Programming for Robotic Assembly Using ChatGPT

Macaluso, Annabella, Cote, Nicholas, Chitta, Sachin

arXiv.org Artificial Intelligence

Despite significant technological advancements, the process of programming robots for adaptive assembly remains labor-intensive, demanding expertise in multiple domains and often resulting in task-specific, inflexible code. This work explores the potential of Large Language Models (LLMs), like ChatGPT, to automate this process, leveraging their ability to understand natural language instructions, generalize examples to new tasks, and write code. In this paper, we suggest how these abilities can be harnessed and applied to real-world challenges in the manufacturing industry. We present a novel system that uses ChatGPT to automate the process of programming robots for adaptive assembly by decomposing complex tasks into simpler subtasks, generating robot control code, executing the code in a simulated workcell, and debugging syntax and control errors, such as collisions. We outline the architecture of this system and strategies for task decomposition and code generation. Finally, we demonstrate how our system can autonomously program robots for various assembly tasks in a real-world project.
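The decompose-generate-debug loop described above can be sketched as follows. The interface is hypothetical: `llm` and `run` are stand-ins for the ChatGPT API and the simulated workcell, not the paper's actual code.

```python
def decompose(task, llm):
    # Ask the language model for one subtask per line.
    reply = llm(f"List the steps to {task}, one per line.")
    return [line.strip() for line in reply.splitlines() if line.strip()]

def generate_with_debugging(subtask, llm, run, max_retries=2):
    # Generate robot control code, execute it in simulation, and feed any
    # error (e.g. a collision report) back to the model for a fix.
    code = llm(f"Write robot code for: {subtask}")
    for _ in range(max_retries + 1):
        ok, err = run(code)
        if ok:
            return code
        code = llm(f"The code failed with: {err}. Fix it.\n{code}")
    return None  # give up after repeated failures
```

In practice `run` would dispatch the generated code to the simulated workcell and report syntax or collision errors back as `err`.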


ALOHA 2: An Enhanced Low-Cost Hardware for Bimanual Teleoperation

ALOHA 2 Team, null, Aldaco, Jorge, Armstrong, Travis, Baruch, Robert, Bingham, Jeff, Chan, Sanky, Draper, Kenneth, Dwibedi, Debidatta, Finn, Chelsea, Florence, Pete, Goodrich, Spencer, Gramlich, Wayne, Hage, Torr, Herzog, Alexander, Hoech, Jonathan, Nguyen, Thinh, Storz, Ian, Tabanpour, Baruch, Takayama, Leila, Tompson, Jonathan, Wahid, Ayzaan, Wahrburg, Ted, Xu, Sichun, Yaroshenko, Sergey, Zakka, Kevin, Zhao, Tony Z.

arXiv.org Artificial Intelligence

(Authors listed in alphabetical order, with contributions listed in the Appendix.) Diverse demonstration datasets have powered significant advances in robot learning, but the dexterity and scale of such data can be limited by hardware cost, hardware robustness, and the ease of teleoperation. We introduce ALOHA 2, an enhanced version of ALOHA with greater performance, ergonomics, and robustness than the original design. To accelerate research in large-scale bimanual manipulation, we open-source all hardware designs of ALOHA 2 with a detailed tutorial, together with a MuJoCo model of ALOHA 2 with system identification. [Figure: a detailed image of an ALOHA 2 workcell with gravity compensation, the redesigned leader and follower grippers, and images from the frame-mounted cameras.] ALOHA 2, like the original (Zhao et al., 2023), consists of a bimanual parallel-jaw gripper workcell with two ViperX 6-DoF "follower" arms (Trossen Robotics) along with two smaller WidowX "leader" arms (Trossen Robotics).


Multi-FLEX: An Automatic Task Sequence Execution Framework to Enable Reactive Motion Planning for Multi-Robot Applications

Misra, Gaurav, Suzumura, Akihiro, Campo, Andres Rodriguez, Chenna, Kautilya, Bailey, Sean, Drinkard, John

arXiv.org Artificial Intelligence

In this letter, an integrated task planning and reactive motion planning framework termed Multi-FLEX is presented that targets real-world, industrial multi-robot applications. Reactive motion planning is attractive for collision avoidance, particularly when there are sources of uncertainty and variation. Most industrial applications, though, typically require parts of the motion to be at least partially non-reactive in order to achieve functional objectives. Multi-FLEX resolves this dissonance and enables such applications to take advantage of reactive motion planning. The Multi-FLEX framework achieves 1) coordination of motion requests to resolve task-level conflicts and overlaps, 2) incorporation of application-specific task constraints into online motion planning using the new concepts of task dependency accommodation, task decomposition, and task bundling, and 3) online generation of robot trajectories using a custom online reactive motion planner. This planner combines fast-to-create, sparse dynamic roadmaps (to find a complete path to the goal) with fast-to-execute, short-horizon, online, optimization-based local planning (for collision avoidance and high performance). To demonstrate, we use two six-degree-of-freedom, high-speed industrial robots in a deburring application and show that this approach not only handles collision avoidance and task variations but also meets the functional requirements of real industrial applications.