workspace
TheAgentCompany: Benchmarking LLMAgents on Consequential Real World Tasks
We interact with computers on an everyday basis, be it in everyday life or work, and many aspects of work can be done entirely with access to a computer and the Internet. At the same time, thanks to improvements in large language models (LLMs), there has also been a rapid development in AI agents that interact with and affect change in their surrounding environments. But how performant are AI agents at accelerating or even autonomously performing work-related tasks? The answer to this question has important implications both for industry looking to adopt AI into their workflows and for economic policy to understand the effects that adoption of AI may have on the labor market. To measure the progress of these LLM agents' performance on performing real-world professional tasks, in this paper we introduce TheAgentCompany, an extensible benchmark for evaluating AI agents that interact with the world in similar ways to those of a digital worker: by browsing the Web, writing code, running programs, and communicating with other coworkers. We build a self-contained environment with internal web sites and data that mimics a small software company environment, and create a variety of tasks that may be performed by workers in such a company. We test baseline agents powered by both closed API-based and open-weights language models (LMs), and find that the most competitive agent can complete 30% of tasks autonomously. This paints a nuanced picture on task automation with LM agents-in a setting simulating a real workplace, a good portion of simpler tasks could be solved autonomously, but more difficult long-horizon tasks are still beyond the reach of current systems. We release code, data, environment, and experiments on https://the-agent-company.com.
The MAGICAL Benchmark for Robust Imitation
The robot could learn from these demonstrations to complete the tasks autonomously. For IL algorithms to be useful, however, they must be able to learn how to perform tasks from few demonstrations. A domestic robot wouldn't be very helpful if it required thirty demonstrations before it figured out that you are deliberately washing your purple cravat
Fast Functionally Redundant Inverse Kinematics for Robotic Toolpath Optimisation in Manufacturing Tasks
Razjigaev, Andrew, Lohr, Hans, Vargas-Uscategui, Alejandro, King, Peter, Bandyopadhyay, Tirthankar
Abstract--Industrial automation with six-axis robotic arms is critical for many manufacturing tasks, including welding and additive manufacturing applications; however, many of these operations are functionally redundant due to the symmetrical tool axis, which effectively makes the operation a five-axis task. Exploiting this redundancy is crucial for achieving the desired workspace and dexterity required for the feasibility and optimisation of toolpath planning. Inverse kinematics algorithms can solve this in a fast, reactive framework, but these techniques are underutilised over the more computationally expensive offline planning methods. We propose a novel algorithm to solve functionally redundant inverse kinematics for robotic manipulation utilising a task space decomposition approach, the damped least-squares method and Halley's method to achieve fast and robust solutions with reduced joint motion. We evaluate our methodology in the case of toolpath optimisation in a cold spray coating application on a non-planar surface. The functionally redundant inverse kinematics algorithm can quickly solve motion plans that minimise joint motion, expanding the feasible operating space of the complex toolpath.
Efficient Computation of a Continuous Topological Model of the Configuration Space of Tethered Mobile Robots
Battocletti, Gianpietro, Boskos, Dimitris, De Schutter, Bart
Despite the attention that the problem of path planning for tethered robots has garnered in the past few decades, the approaches proposed to solve it typically rely on a discrete representation of the configuration space and do not exploit a model that can simultaneously capture the topological information of the tether and the continuous location of the robot. In this work, we explicitly build a topological model of the configuration space of a tethered robot starting from a polygonal representation of the workspace where the robot moves. To do so, we first establish a link between the configuration space of the tethered robot and the universal covering space of the workspace, and then we exploit this link to develop an algorithm to compute a simplicial complex model of the configuration space. We show how this approach improves the performances of existing algorithms that build other types of representations of the configuration space. The proposed model can be computed in a fraction of the time required to build traditional homotopy-augmented graphs, and is continuous, allowing to solve the path planning task for tethered robots using a broad set of path planning algorithms.
Parametric Design of a Cable-Driven Coaxial Spherical Parallel Mechanism for Ultrasound Scans
Seraj, Maryam, Kamrava, Mohammad Hossein, Tiseo, Carlo
Haptic interfaces play a critical role in medical teleoperation by enabling surgeons to interact with remote environments through realistic force and motion feedback. Achieving high fidelity in such systems requires balancing performance trade-off among workspace, dexterity, stiffness, inertia, and bandwidth, particularly in applications demanding pure rotational motion. This paper presents the design methodology and kinematic analysis of a Cable-Driven Coaxial Spherical Parallel Mechanism (CDC-SPM) developed to address these challenges. The proposed cable-driven interface design allows for reducing the mass placed at the robot arm end-effector, thereby minimizing inertial loads, enhancing stiffness, and improving dynamic responsiveness. Through parallel and coaxial actuation, the mechanism achieves decoupled rotational degrees of freedom with isotropic force and torque transmission. Simulation and analysis demonstrate that the CDC-SPM provides accurate, responsive, and safe motion characteristics suitable for high-precision haptic applications. These results highlight the mechanism's potential for medical teleoperation tasks such as ultrasound imaging, where precise and intuitive manipulation is essential.
Multimodal Control of Manipulators: Coupling Kinematics and Vision for Self-Driving Laboratory Operations
Sulaiman, Shifa, H, Amarnath, Bogh, Simon, Marturi, Naresh
Motion planning schemes are used for planning motions of a manipulator from an initial pose to a final pose during a task execution. A motion planning scheme generally comprises of a trajectory planning method and an inverse kinematic solver to determine trajectories and joints solutions respectively. In this paper, 3 motion planning schemes developed based on Jacobian methods are implemented to traverse a redundant manipulator with a coupled finger gripper through given trajectories. RRT* algorithm is used for planning trajectories and screw theory based forward kinematic equations are solved for determining joint solutions of the manipulator and gripper. Inverse solutions are computed separately using 3 Jacobian based methods such as Jacobian Transpose (JT), Pseudo Inverse (PI), and Damped Least Square (DLS) methods. Space Jacobian and manipulability measurements of the manipulator and gripper are obtained using screw theory formulations. Smoothness and RMSE error of generated trajectories and velocity continuity, acceleration profile, jerk, and snap values of joint motions are analysed for determining an efficient motion planning method for a given task. Advantages and disadvantages of the proposed motion planning schemes mentioned above are analysed using simulation studies to determine a suitable inverse solution technique for the tasks.
PerFACT: Motion Policy with LLM-Powered Dataset Synthesis and Fusion Action-Chunking Transformers
Soleymanzadeh, Davood, Liang, Xiao, Zheng, Minghui
Deep learning methods have significantly enhanced motion planning for robotic manipulators by leveraging prior experiences within planning datasets. However, state-of-the-art neural motion planners are primarily trained on small datasets collected in manually generated workspaces, limiting their generalizability to out-of-distribution scenarios. Additionally, these planners often rely on monolithic network architectures that struggle to encode critical planning information. To address these challenges, we introduce Motion Policy with Dataset Synthesis powered by large language models (LLMs) and Fusion Action-Chunking Transformers (PerFACT), which incorporates two key components. Firstly, a novel LLM-powered workspace generation method, MotionGeneralizer, enables large-scale planning data collection by producing a diverse set of semantically feasible workspaces. Secondly, we introduce Fusion Motion Policy Networks (MpiNetsFusion), a generalist neural motion planner that uses a fusion action-chunking transformer to better encode planning signals and attend to multiple feature modalities. Leveraging MotionGeneralizer, we collect 3.5M trajectories to train and evaluate MpiNetsFusion against state-of-the-art planners, which shows that the proposed MpiNetsFusion can plan several times faster on the evaluated tasks.