quaternion
RLCNet: An end-to-end deep learning framework for simultaneous online calibration of LiDAR, RADAR, and Camera
Cholakkal, Hafeez Husain, Arrigoni, Stefano, Braghin, Francesco
Autonomous vehicles are poised to revolutionize transportation by improving road safety, reducing traffic congestion, and increasing mobility convenience [1]. To perceive and interact with their environment accurately, these vehicles rely on a combination of complementary sensors, including LiDAR, RADAR, and cameras. Each sensor offers unique advantages: cameras capture rich visual detail, LiDAR provides precise 3D spatial measurements, and RADAR performs robustly under adverse weather conditions [2]. Sensor fusion leverages the strengths of these modalities to ensure redundancy and resilience, allowing the vehicle to maintain accurate perception in diverse and dynamic environments [3]. A critical component of sensor fusion is extrinsic calibration, which involves the determination of the relative positions and orientations of sensors in a common coordinate frame. However, maintaining precise calibration over time is a persistent challenge. Factors such as mechanical vibrations, temperature changes, and minor collisions can lead to sensor drift, where even small misalignments in sensor orientation or position can result in substantial perception errors, potentially compromising vehicle safety.
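To make the notion of extrinsic calibration concrete, the following is a minimal sketch, not taken from RLCNet, of how a rotation (stored as a unit quaternion) and a translation map a LiDAR point into a camera frame, and of how a small orientation error displaces a distant point. The function names, the identity "true" extrinsics, and the 1 degree yaw error are assumptions chosen for illustration only.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    n = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate(q, v):
    """Rotate 3D vector v by unit quaternion q: v' = v + w*t + u x t, with t = 2 (u x v)."""
    w, x, y, z = q
    tx = 2.0 * (y * v[2] - z * v[1])
    ty = 2.0 * (z * v[0] - x * v[2])
    tz = 2.0 * (x * v[1] - y * v[0])
    return (
        v[0] + w * tx + (y * tz - z * ty),
        v[1] + w * ty + (z * tx - x * tz),
        v[2] + w * tz + (x * ty - y * tx),
    )

def lidar_to_camera(point, q, t):
    """Apply hypothetical extrinsics (rotation q, translation t) to a LiDAR point."""
    r = rotate(q, point)
    return (r[0] + t[0], r[1] + t[1], r[2] + t[2])

# Assumed values for illustration: identity rotation as the "true" extrinsics,
# and a drifted rotation that is off by 1 degree of yaw.
t = (0.1, 0.0, -0.2)
q_true = (1.0, 0.0, 0.0, 0.0)
q_drift = quat_from_axis_angle((0.0, 0.0, 1.0), math.radians(1.0))

point = (50.0, 0.0, 0.0)  # a LiDAR return 50 m ahead
drift_error = math.dist(lidar_to_camera(point, q_true, t),
                        lidar_to_camera(point, q_drift, t))
print(f"displacement caused by 1 degree of yaw drift at 50 m: {drift_error:.2f} m")
```

Under these assumptions the displaced point ends up about 0.87 m from its correct location, which illustrates why small orientation drift matters at range.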
GeoPE: A Unified Geometric Positional Embedding for Structured Tensors
Standard Vision Transformers flatten 2D images into 1D sequences, disrupting the natural spatial topology. While Rotary Positional Embedding (RoPE) excels in 1D, it inherits this limitation, often treating spatially distant patches (e.g., at row edges) as sequence neighbors. Existing 2D approaches typically treat spatial axes independently, failing to decouple this false sequential proximity from true spatial distance. To restore the 2D spatial manifold, we introduce Geometric Positional Embedding (GeoPE), a framework that extends rotations to 3D Euclidean space using quaternions. To overcome non-commutativity and ensure symmetry, GeoPE constructs a unified rotational operator by computing the geometric mean in the Lie algebra. This creates a geometrically coupled encoding that effectively separates spatial dimensions. Extensive experiments on image classification, object detection, and 3D semantic segmentation demonstrate that GeoPE consistently outperforms existing 2D RoPE variants and significantly enhances shape bias, confirming its ability to capture true geometric structure.
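The role of the Lie-algebra mean can be sketched as follows. This is one plausible reading of the abstract, not the GeoPE implementation: two per-axis quaternion rotations are mapped to the Lie algebra with a logarithm, averaged, and mapped back with an exponential. The exact construction in the paper may differ, and all function names and angles here are hypothetical.

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def qexp(v):
    """Exponential map: pure vector v = (theta/2) * axis -> unit quaternion."""
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(n) / n
    return (math.cos(n), v[0] * s, v[1] * s, v[2] * s)

def qlog(q):
    """Logarithm of a unit quaternion: returns the pure vector (theta/2) * axis."""
    w, x, y, z = q
    n = math.sqrt(x * x + y * y + z * z)
    if n < 1e-12:
        return (0.0, 0.0, 0.0)
    half = math.atan2(n, w)
    return (x / n * half, y / n * half, z / n * half)

def lie_mean(q1, q2):
    """Average two rotations in the Lie algebra, then map back to a unit quaternion."""
    l1, l2 = qlog(q1), qlog(q2)
    return qexp(tuple((a + b) / 2.0 for a, b in zip(l1, l2)))

# Assumed per-axis rotations for a patch: one angle per image axis.
qx = qexp((0.7 / 2.0, 0.0, 0.0))  # rotation of 0.7 rad about the x axis
qy = qexp((0.0, 1.3 / 2.0, 0.0))  # rotation of 1.3 rad about the y axis

print("qx*qy   :", qmul(qx, qy))
print("qy*qx   :", qmul(qy, qx))      # differs from qx*qy (non-commutative)
print("lie_mean:", lie_mean(qx, qy))  # the same for either argument order
```

Composing the two rotations directly gives a result that depends on the order of multiplication, whereas the averaged operator treats both axes symmetrically.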
Clifford Algebraic Rotor Embeddings: Maybe embeddings should start to CARE
Sriram, Sameeksha, Paliwal, Ayush, Ecker, Alexander S., van de Geijn, Chase
Rotary Positional Embeddings (RoPE) have demonstrated exceptional performance as a positional encoding method, consistently outperforming their baselines. While recent work has sought to extend RoPE to higher-dimensional inputs, many such extensions are non-commutative, thereby forfeiting RoPE's shift-equivariance property. Spherical RoPE is one such non-commutative variant, motivated by the idea of rotating embedding vectors on spheres rather than circles. However, spherical rotations are inherently non-commutative, making the choice of rotation sequence ambiguous. In this work, we explore a quaternion-based approach -- Quaternion Rotary Embeddings (QuatRo) -- in place of Euler angles, leveraging quaternions' ability to represent 3D rotations to parameterize the axes of rotation. We show Mixed RoPE and Spherical RoPE to be special cases of QuatRo. Further, we propose a generalization of QuatRo to Clifford Algebraic Rotary Embeddings (CARE) using geometric algebra. Viewing quaternions as the even subalgebra of Cl(3,0,0), we extend the notion of rotary embeddings from quaternions to Clifford rotors acting on multivectors. This formulation enables two key generalizations: (1) extending rotary embeddings to arbitrary dimensions, and (2) encoding positional information in multivectors of multiple grades, not just vectors. We present preliminary experiments comparing spherical, quaternion, and Clifford-based rotary embeddings.
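The shift-equivariance property mentioned above, and its loss under non-commuting rotations, can be checked numerically. The sketch below is an illustration under stated assumptions, not the QuatRo or CARE implementation: `score_1d` rotates vectors about a single axis, as in standard RoPE, while `score_spherical` composes rotations about two different axes in a fixed order as a stand-in for a non-commutative spherical variant. All names, vectors, and the frequency `theta` are hypothetical.

```python
import math

def rot_x(v, a):
    """Rotate 3D vector v by angle a about the x axis."""
    x, y, z = v
    c, s = math.cos(a), math.sin(a)
    return (x, c * y - s * z, s * y + c * z)

def rot_y(v, a):
    """Rotate 3D vector v by angle a about the y axis."""
    x, y, z = v
    c, s = math.cos(a), math.sin(a)
    return (c * x + s * z, y, -s * x + c * z)

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def score_1d(q, k, m, n, theta=0.3):
    """Attention score with single-axis rotations (commuting case)."""
    return dot(rot_x(q, m * theta), rot_x(k, n * theta))

def score_spherical(q, k, p, p2, theta=0.3):
    """Attention score when a 2D position is encoded as Rx(px) Ry(py) (fixed order)."""
    enc_q = rot_x(rot_y(q, p[1] * theta), p[0] * theta)
    enc_k = rot_x(rot_y(k, p2[1] * theta), p2[0] * theta)
    return dot(enc_q, enc_k)

q = (0.0, 0.0, 1.0)
k = (0.0, 0.0, 1.0)

# Single axis: shifting both positions by 7 leaves the score unchanged.
s_a = score_1d(q, k, 3, 7)
s_b = score_1d(q, k, 10, 14)

# Two non-commuting axes: shifting both positions by (4, 4) changes the score.
t_a = score_spherical(q, k, (1, 2), (3, 5))
t_b = score_spherical(q, k, (5, 6), (7, 9))

print("1D scores       :", s_a, s_b)
print("spherical scores:", t_a, t_b)
```

In the single-axis case the score depends only on the relative offset between positions. With two non-commuting axes the score also depends on absolute position, which is the forfeited property the abstract refers to.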
MirrorLimb: Implementing hand pose acquisition and robot teleoperation based on RealMirror
Tai, Cong, Wu, Hansheng, Long, Haixu, Long, Zhengbin, Zheng, Zhaoyu, Xiang, Haodong, Shen, Tao
In this work, we present a PICO-based robot remote operating framework that enables low-cost, real-time acquisition of hand motion and pose data, outperforming mainstream visual tracking and motion capture solutions in terms of cost-effectiveness. The framework is natively compatible with the RealMirror ecosystem, offering ready-to-use functionality for stable and precise robotic trajectory recording within the Isaac simulation environment, thereby facilitating the construction of Vision-Language-Action (VLA) datasets. Additionally, the system supports real-time teleoperation of a variety of end-effector-equipped robots, including dexterous hands and robotic grippers. This work aims to lower the technical barriers in the study of upper-limb robotic manipulation, thereby accelerating advancements in VLA-related research.
Adaptive Multirobot Virtual Structure Control using Dual Quaternions
Giribet, Juan I., Ghersin, Alejandro S., Mas, Ignacio, Marciano, Harrison Neves, Villa, Daniel Khede Dourado, Sarcinelli-Filho, Mario
Unmanned Aerial Vehicles (UAVs), particularly multi-rotor platforms, have rapidly advanced in research and applications due to their unique capabilities, including vertical takeoff and landing (VTOL), hovering, and high maneuverability. These features make them ideal for complex environments and have driven their adoption in fields such as environmental monitoring, precision agriculture, infrastructure inspection, and emergency response, among others. A key area of recent interest is the control and coordination of multiple UAVs in formation. Formation control enables groups of UAVs to maintain specific geometric arrangements while performing tasks, offering advantages such as enhanced coverage, efficiency, and redundancy [24]. These benefits are critical for applications ranging from search and rescue to cooperative tasks like cargo transport and aerial cinematography.
Adaptive Inverse Kinematics Framework for Learning Variable-Length Tool Manipulation in Robotics
Kothavale, Prathamesh, Boddepalli, Sravani
Conventional robots possess a limited understanding of their kinematics and are confined to preprogrammed tasks, hindering their ability to leverage tools efficiently. Driven by the essential components of tool usage (grasping the desired outcome, selecting the most suitable tool, determining optimal tool orientation, and executing precise manipulations), we introduce a framework that expands the capabilities of the robot's inverse kinematics solver, empowering it to acquire a sequential repertoire of actions using tools of varying lengths. By integrating a simulation-learned action trajectory with the tool, we showcase the practicality of transferring acquired skills from simulation to real-world scenarios through comprehensive experimentation. Our extended inverse kinematics solver achieves an error of less than 1 cm, and our trained policy achieves a mean error of 8 cm in simulation. Notably, our model achieves virtually indistinguishable performance when employing two distinct tools of different lengths. This research provides an indication of potential advances in the exploration of all four fundamental aspects of tool usage, enabling robots to master tool manipulation across diverse tasks. Tool use is the employment of a device or object held in a robotic gripper or hand to fulfill a task goal. Humans and animals like the New Caledonian crow have learned to use tools to accomplish tasks that they were not previously able to do when using only their own bodies or appendages.
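The idea of extending an inverse kinematics solver to tools of different lengths can be illustrated with a textbook case. The sketch below is not the authors' solver: it assumes a planar two-link arm whose tool is rigidly aligned with the last link, so that the tool length simply extends that link. The link lengths, tool lengths, and target are hypothetical values.

```python
import math

def forward(l1, l2, tool, q1, q2):
    """Tool-tip position of a planar 2-link arm with a tool along the last link."""
    r = l2 + tool  # effective length of the second link including the tool
    return (
        l1 * math.cos(q1) + r * math.cos(q1 + q2),
        l1 * math.sin(q1) + r * math.sin(q1 + q2),
    )

def inverse(l1, l2, tool, x, y):
    """Analytic inverse kinematics for the tool tip (one of the two elbow branches)."""
    r = l2 + tool
    c2 = (x * x + y * y - l1 * l1 - r * r) / (2.0 * l1 * r)
    if abs(c2) > 1.0:
        raise ValueError("target is out of reach for this tool length")
    q2 = math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(r * math.sin(q2), l1 + r * math.cos(q2))
    return q1, q2

# Assumed arm geometry and target, in metres.
L1, L2 = 0.4, 0.3
target = (0.5, 0.4)

for tool_length in (0.1, 0.3):
    q1, q2 = inverse(L1, L2, tool_length, *target)
    tip = forward(L1, L2, tool_length, q1, q2)
    err = math.dist(tip, target)
    print(f"tool={tool_length:.1f} m  q1={q1:.3f} rad  q2={q2:.3f} rad  tip error={err:.2e} m")
```

The same target is reached with different joint angles for each tool, which is the behaviour a tool-aware solver has to capture. A learned or numerical solver would replace the closed-form step for arms with more joints.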