 forklift


Hannah Fry: 'AI can do some superhuman things – but so can forklifts'

New Scientist

Mathematician Hannah Fry travels to the front lines of AI in her new BBC documentary AI Confidential with Hannah Fry. The chances are that you think about artificial intelligence far more today than you did five years ago. Since ChatGPT was launched in November 2022, we have become accustomed to interacting with AIs in most spheres of life, from chatbots and smart home tech to banking and healthcare. But such rapid change brings unexpected problems - as mathematician and broadcaster Hannah Fry shows in AI Confidential with Hannah Fry, a new three-part BBC documentary in which she talks to people whose lives have been transformed by the technology. She spoke to New Scientist about how we should view AI, its role in modern mathematics - and why it will upend the global economy.


A CARLA-based Simulation of Electrically Driven Forklifts

Claus, David, Thielemann, Christiane, Stark, Hans-Georg

arXiv.org Artificial Intelligence

This paper presents the simulation of the operation of an electric forklift fleet within an intralogistics scenario. For this purpose, the open source simulation tool CARLA is used; to our knowledge, this is a novel approach in the context of logistics simulation. First, CARLA is used to generate and visualize a realistic 3D outdoor warehouse scenario, incorporating a number of randomly moving forklifts. In a next step, intralogistics transport tasks, such as pick-and-place, are simulated for the forklift fleet, including shortest-path finding. Furthermore, the capability to play back localization data, previously recorded from a "real" forklift fleet, is demonstrated. This playback is done in the original recreated environment, thereby enabling the visualization of the forklifts' movements. Finally, the energy consumption of the forklift trucks is simulated by integrating a physical battery model that generates the state of charge (SOC) of each truck as a function of load and activity. To demonstrate the wide range of possible applications for the CARLA simulation platform, we describe two use cases. The first deals with the problem of detecting regions with critically high traffic densities, the second with optimal placement of charging stations for the forklift trucks. Both use cases are calculated for an exemplary warehouse model.
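
The abstract describes a battery model that yields each truck's state of charge (SOC) as a function of load and activity, without giving its equations. The sketch below is not the authors' model: it is a minimal energy-counting illustration in which the class `Battery`, the function `power_draw_kw`, and every numeric value (capacity, base power per activity, load coefficient) are assumptions chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Battery:
    capacity_kwh: float  # usable capacity (assumed value, not from the paper)
    soc: float = 1.0     # state of charge, 1.0 = full, 0.0 = empty

    def step(self, power_kw: float, dt_s: float) -> float:
        """Drain the battery at constant power for dt_s seconds; return the new SOC."""
        energy_kwh = power_kw * dt_s / 3600.0
        self.soc = max(0.0, self.soc - energy_kwh / self.capacity_kwh)
        return self.soc


def power_draw_kw(activity: str, load_kg: float) -> float:
    """Hypothetical power model: a base power per activity plus a linear load term."""
    base_kw = {"idle": 0.5, "drive": 4.0, "lift": 8.0}[activity]
    return base_kw + 0.002 * load_kg


# One truck: drive empty for 10 min, lift a 1000 kg pallet for 1 min, drive loaded for 10 min.
battery = Battery(capacity_kwh=20.0)
for activity, load_kg, dt_s in [("drive", 0.0, 600), ("lift", 1000.0, 60), ("drive", 1000.0, 600)]:
    battery.step(power_draw_kw(activity, load_kg), dt_s)
```

A per-truck SOC trace of this kind is the sort of signal a charging-station placement study could consume; a physically grounded model would replace the linear power assumption.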


Few-Shot Neuro-Symbolic Imitation Learning for Long-Horizon Planning and Acting

Lorang, Pierrick, Lu, Hong, Huemer, Johannes, Zips, Patrik, Scheutz, Matthias

arXiv.org Artificial Intelligence

Imitation learning enables intelligent systems to acquire complex behaviors with minimal supervision. However, existing methods often focus on short-horizon skills, require large datasets, and struggle to solve long-horizon tasks or generalize across task variations and distribution shifts. We propose a novel neuro-symbolic framework that jointly learns continuous control policies and symbolic domain abstractions from a few skill demonstrations. Our method abstracts high-level task structures into a graph, discovers symbolic rules via an Answer Set Programming solver, and trains low-level controllers using diffusion policy imitation learning. A high-level oracle filters task-relevant information to focus each controller on a minimal observation and action space. Our graph-based neuro-symbolic framework enables capturing complex state transitions, including non-spatial and temporal relations, that data-driven learning or clustering techniques often fail to discover in limited demonstration datasets. We validate our approach in six domains: four robotic-arm environments (Stacking, Kitchen, Assembly, and Towers of Hanoi) and a distinct Automated Forklift domain with two environments. The results demonstrate high data efficiency with as few as five skill demonstrations, strong zero- and few-shot generalizations, and interpretable decision making.


R-ConstraintBench: Evaluating LLMs on NP-Complete Scheduling

Jain, Raj, Wetter, Marc

arXiv.org Artificial Intelligence

The reliability of large language models (LLMs) when reasoning under high-constraint regimes is insufficiently characterized. To address this gap, we present R-ConstraintBench, a scalable framework that evaluates models on Resource-Constrained Project Scheduling Problems (RCPSP), an NP-Complete feasibility class, while difficulty increases via linear growth in constraints. R-ConstraintBench incrementally increases non-redundant precedence constraints in Directed Acyclic Graphs (DAGs) and then introduces downtime, temporal windows, and disjunctive constraints. As an illustrative example, we instantiate the benchmark in a data center migration setting and evaluate multiple LLMs using feasibility and error analysis, identifying degradation thresholds and constraint types most associated with failure. Empirically, strong models are near-ceiling on precedence-only DAGs, but feasibility performance collapses when downtime, temporal windows, and disjunctive constraints interact, implicating constraint interaction, not graph depth, as the principal bottleneck. Performance on clean synthetic ramps also does not guarantee transfer to domain-grounded scenarios, underscoring limited generalization.
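
To make "feasibility" concrete, here is a minimal sketch of a checker for a proposed schedule under precedence constraints and a single renewable resource. It is not the benchmark's evaluator: the function `is_feasible`, the task names, and all numbers are hypothetical, and downtime, temporal windows, and disjunctive constraints are deliberately left out.

```python
def is_feasible(durations, precedences, demand, capacity, start):
    """Check a schedule against precedence and single-resource capacity constraints."""
    # Precedence: a successor may not start before its predecessor has finished.
    for before, after in precedences:
        if start[after] < start[before] + durations[before]:
            return False
    # Resource: usage only rises at task start times, so checking those instants suffices.
    for t in set(start.values()):
        usage = sum(
            demand[task]
            for task in durations
            if start[task] <= t < start[task] + durations[task]
        )
        if usage > capacity:
            return False
    return True


# Hypothetical data-center migration instance (illustrative values only).
durations = {"backup": 2, "move_db": 3, "move_app": 2, "verify": 1}
precedences = [
    ("backup", "move_db"),
    ("backup", "move_app"),
    ("move_db", "verify"),
    ("move_app", "verify"),
]
demand = {"backup": 1, "move_db": 2, "move_app": 2, "verify": 1}  # technicians needed
capacity = 3  # technicians available at any time

sequential = {"backup": 0, "move_db": 2, "move_app": 5, "verify": 7}
parallel = {"backup": 0, "move_db": 2, "move_app": 2, "verify": 5}

ok_sequential = is_feasible(durations, precedences, demand, capacity, sequential)
ok_parallel = is_feasible(durations, precedences, demand, capacity, parallel)
```

The parallel schedule respects every precedence edge but needs four technicians at time 2, so it fails only on the resource constraint. This is a small instance of the constraint interaction the abstract identifies as the bottleneck.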


Lang2Lift: A Framework for Language-Guided Pallet Detection and Pose Estimation Integrated in Autonomous Outdoor Forklift Operation

Nguyen, Huy Hoang, Huemer, Johannes, Murschitz, Markus, Glueck, Tobias, Vu, Minh Nhat, Kugi, Andreas

arXiv.org Artificial Intelligence

The logistics and construction industries face persistent challenges in automating pallet handling, especially in outdoor environments with variable payloads, inconsistencies in pallet quality and dimensions, and unstructured surroundings. In this paper, we tackle automation of a critical step in pallet transport: the pallet pick-up operation. Our work is motivated by labor shortages, safety concerns, and inefficiencies in manually locating and retrieving pallets under such conditions. We present Lang2Lift, a framework that leverages foundation models for natural language-guided pallet detection and 6D pose estimation, enabling operators to specify targets through intuitive commands such as "pick up the steel beam pallet near the crane." The perception pipeline integrates Florence-2 and SAM-2 for language-grounded segmentation with FoundationPose for robust pose estimation in cluttered, multi-pallet outdoor scenes under variable lighting. The resulting poses feed into a motion planning module for fully autonomous forklift operation. We validate Lang2Lift on the ADAPT autonomous forklift platform, achieving 0.76 mIoU pallet segmentation accuracy on a real-world test dataset. Timing and error analysis demonstrate the system's robustness and confirm its feasibility for deployment in operational logistics and construction environments. Video demonstrations are available at https://eric-nguyen1402.github.io/lang2lift.github.io/
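
For readers unfamiliar with the reported metric, the sketch below shows one common way to compute a mean intersection-over-union (mIoU) over predicted and ground-truth masks. It assumes masks are given as sets of (row, column) pixel coordinates and that the mean is taken over mask pairs; the abstract does not state how the authors average, and the helper names `iou` and `mean_iou` are illustrative.

```python
def iou(pred_pixels, true_pixels):
    """Intersection over union of two masks given as collections of (row, col) pixels."""
    pred, true = set(pred_pixels), set(true_pixels)
    union = pred | true
    if not union:
        return 1.0  # assumption: two empty masks count as a perfect match
    return len(pred & true) / len(union)


def mean_iou(pairs):
    """Average the IoU over (predicted, ground-truth) mask pairs."""
    scores = [iou(pred, true) for pred, true in pairs]
    return sum(scores) / len(scores)


# Two toy masks: one shifted by a column (IoU 2/6), one predicted exactly (IoU 1).
shifted_pred = {(0, 0), (0, 1), (1, 0), (1, 1)}
shifted_true = {(0, 1), (1, 1), (0, 2), (1, 2)}
exact = {(5, 5), (5, 6)}

score = mean_iou([(shifted_pred, shifted_true), (exact, exact)])
```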


Leveraging Knowledge Graphs and LLM Reasoning to Identify Operational Bottlenecks for Warehouse Planning Assistance

Parekh, Rishi, Gopalakrishnan, Saisubramaniam, Ahmad, Zishan, Deodhar, Anirudh

arXiv.org Artificial Intelligence

Analyzing large, complex output datasets from Discrete Event Simulations (DES) of warehouse operations to identify bottlenecks and inefficiencies is a critical yet challenging task, often demanding significant manual effort or specialized analytical tools. Our framework integrates Knowledge Graphs (KGs) and Large Language Model (LLM)-based agents to analyze complex Discrete Event Simulation (DES) output data from warehouse operations. It transforms raw DES data into a semantically rich KG, capturing relationships between simulation events and entities. An LLM-based agent uses iterative reasoning, generating interdependent sub-questions. For each sub-question, it creates Cypher queries for KG interaction, extracts information, and self-reflects to correct errors. This adaptive, iterative, and self-correcting process identifies operational issues mimicking human analysis. Our DES approach for warehouse bottleneck identification, tested with equipment breakdowns and process irregularities, outperforms baseline methods. For operational questions, it achieves near-perfect pass rates in pinpointing inefficiencies. For complex investigative questions, we demonstrate its superior diagnostic ability to uncover subtle, interconnected issues. This work bridges simulation modeling and AI (KG+LLM), offering a more intuitive method for actionable insights, reducing time-to-insight, and enabling automated warehouse inefficiency evaluation and diagnosis.


LiDAR Based Semantic Perception for Forklifts in Outdoor Environments

Serfling, Benjamin, Reichert, Hannes, Bayerlein, Lorenzo, Doll, Konrad, Radkhah-Lens, Kati

arXiv.org Artificial Intelligence

In this study, we present a novel LiDAR-based semantic segmentation framework tailored for autonomous forklifts operating in complex outdoor environments. Central to our approach is the integration of a dual LiDAR system, which combines forward-facing and downward-angled LiDAR sensors to enable comprehensive scene understanding, specifically tailored for industrial material handling tasks. The dual configuration improves the detection and segmentation of dynamic and static obstacles with high spatial precision. Using high-resolution 3D point clouds captured from two sensors, our method employs a lightweight yet robust approach that segments the point clouds into safety-critical instance classes such as pedestrians, vehicles, and forklifts, as well as environmental classes such as driveable ground, lanes, and buildings. Experimental validation demonstrates that our approach achieves high segmentation accuracy while satisfying strict runtime requirements, establishing its viability for safety-aware, fully autonomous forklift navigation in dynamic warehouse and yard environments.


ADAPT: An Autonomous Forklift for Construction Site Operation

Huemer, Johannes, Murschitz, Markus, Schörghuber, Matthias, Reisinger, Lukas, Kadiofsky, Thomas, Weidinger, Christoph, Niedermeyer, Mario, Widy, Benedikt, Zeilinger, Marcel, Beleznai, Csaba, Glück, Tobias, Kugi, Andreas, Zips, Patrik

arXiv.org Artificial Intelligence

Efficient material logistics play a critical role in controlling costs and schedules in the construction industry. However, manual material handling remains prone to inefficiencies, delays, and safety risks. Autonomous forklifts offer a promising solution to streamline on-site logistics, reducing reliance on human operators and mitigating labor shortages. This paper presents the development and evaluation of the Autonomous Dynamic All-terrain Pallet Transporter (ADAPT), a fully autonomous off-road forklift designed for construction environments. Unlike structured warehouse settings, construction sites pose significant challenges, including dynamic obstacles, unstructured terrain, and varying weather conditions. To address these challenges, our system integrates AI-driven perception techniques with traditional approaches for decision making, planning, and control, enabling reliable operation in complex environments. We validate the system through extensive real-world testing, comparing its long-term performance against an experienced human operator across various weather conditions. We also provide a comprehensive analysis of challenges and key lessons learned, contributing to the advancement of autonomous heavy machinery. Our findings demonstrate that autonomous outdoor forklifts can operate near human-level performance, offering a viable path toward safer and more efficient construction logistics.


Evaluating Efficiency and Engagement in Scripted and LLM-Enhanced Human-Robot Interactions

Schreiter, Tim, Rüppel, Jens V., Hazra, Rishi, Rudenko, Andrey, Magnusson, Martin, Lilienthal, Achim J.

arXiv.org Artificial Intelligence

To achieve natural and intuitive interaction with people, HRI frameworks combine a wide array of methods for human perception, intention communication, human-aware navigation and collaborative action. In practice, when encountering unpredictable behavior of people or unexpected states of the environment, these frameworks may lack the ability to dynamically recognize such states, adapt and recover to resume the interaction. Large Language Models (LLMs), owing to their advanced reasoning capabilities and context retention, present a promising solution for enhancing robot adaptability. This potential, however, may not directly translate to improved interaction metrics. This paper considers a representative interaction with an industrial robot involving approach, instruction, and object manipulation, implemented in two conditions: (1) fully scripted and (2) including LLM-enhanced responses. We use gaze tracking and questionnaires to measure the participants' task efficiency, engagement, and robot perception. The results indicate higher subjective ratings for the LLM condition, but objective metrics show that the scripted condition performs comparably, particularly in efficiency and focus during simple tasks. We also note that the scripted condition may have an edge over LLM-enhanced responses in terms of response latency and energy consumption, especially for trivial and repetitive interactions.


Aim My Robot: Precision Local Navigation to Any Object

Meng, Xiangyun, Yang, Xuning, Jung, Sanghun, Ramos, Fabio, Jujjavarapu, Srid Sadhan, Paul, Sanjoy, Fox, Dieter

arXiv.org Artificial Intelligence

Existing navigation systems mostly count an episode as a "success" when the robot comes within a 1 m radius of the goal. To this end, we design and implement Aim-My-Robot (AMR), a local navigation system that enables a robot to reach any object in its vicinity at the desired relative pose, with centimeter-level precision. AMR shows strong sim2real transfer and can adapt to different robot kinematics and unseen objects with little to no fine-tuning. [...] the goal reached when the robot is within 1m radius to the object goal [8], [11], [12]. This lax definition of success hinders their applicability to the growing need for mobile robots to navigate to objects with precisely [...] But this usually requires specific information such as 3D models [13], and the object being initially visible. This limits its applicability when the object 3D model is not available or the object is initially out [...]