inertia
DZ-TDPO: Non-Destructive Temporal Alignment for Mutable State Tracking in Long-Context Dialogue
Long-context dialogue systems suffer from State Inertia, where static constraints prevent models from resolving conflicts between evolving user intents and established historical context. To address this, we propose DZ-TDPO, a non-destructive alignment framework that synergizes conflict-aware dynamic KL constraints with a calibrated temporal attention bias. Experiments on the Multi-Session Chat (MSC) dataset demonstrate that DZ-TDPO achieves state-of-the-art win rates (55.4% on Phi-3.5) while maintaining robust zero-shot generalization. Our scaling analysis reveals a "Capacity-Stability Trade-off": while smaller models incur an "alignment tax" (perplexity surge) to overcome historical inertia, the larger Qwen2.5-7B model achieves 50.8% win rate with negligible perplexity overhead. This confirms that TAI can be alleviated via precise attention regulation rather than destructive weight updates, preserving general capabilities (MMLU) across model scales. Code and data are available: https://github.com/lyj20071013/DZ-TDPO
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (2 more...)
Experimental Comparison of Whole-Body Control Formulations for Humanoid Robots in Task Acceleration and Task Force Spaces
Sovukluk, Sait, Zambella, Grazia, Egle, Tobias, Ott, Christian
This paper studies the experimental comparison of two different whole-body control formulations for humanoid robots: inverse dynamics whole-body control (ID-WBC) and passivity-based whole-body control (PB-WBC). The two controllers fundamentally differ from each other as the first is formulated in task acceleration space and the latter is in task force space with passivity considerations. Even though both control methods predict stability under ideal conditions in closed-loop dynamics, their robustness against joint friction, sensor noise, unmodeled external disturbances, and non-perfect contact conditions is not evident. Therefore, we analyze and experimentally compare the two controllers on a humanoid robot platform through swing foot position and orientation control, squatting with and without unmodeled additional weights, and jumping. We also relate the observed performance and characteristic differences with the controller formulations and highlight each controller's advantages and disadvantages.
- Europe > Austria > Vienna (0.14)
- North America > United States > Ohio (0.04)
- Europe > Germany (0.04)
AutoTool: Efficient Tool Selection for Large Language Model Agents
Large Language Model (LLM) agents have emerged as powerful tools for automating complex tasks by leveraging the reasoning and decision-making abilities of LLMs. However, a major bottleneck in current agent frameworks lies in the high inference cost of tool selection, especially in approaches like ReAct that repeatedly invoke the LLM to determine which tool to use at each step. In this work, we propose AutoTool, a novel graph-based framework that bypasses repeated LLM inference by exploiting a key empirical observation: tool usage inertia - the tendency of tool invocations to follow predictable sequential patterns. AutoTool constructs a directed graph from historical agent trajectories, where nodes represent tools and edges capture transition probabilities, effectively modeling the inertia in tool selection. It further integrates parameter-level information to refine tool input generation. By traversing this structured representation, AutoTool efficiently selects tools and their parameters with minimal reliance on LLM inference. Extensive experiments across diverse agent tasks demonstrate that AutoTool reduces inference costs by up to 30% while maintaining competitive task completion rates, offering a practical and scalable enhancement for inference-heavy frameworks. Our work highlights the promise of integrating statistical structure into LLM agent design for greater efficiency without sacrificing performance.
- Workflow (1.00)
- Research Report (1.00)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Austria > Vienna (0.14)
- (18 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (0.67)
- North America > Canada (0.04)
- Asia > China > Hong Kong (0.04)
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion
Liao, Qiayuan, Truong, Takara E., Huang, Xiaoyu, Gao, Yuman, Tevet, Guy, Sreenath, Koushil, Liu, C. Karen
The human-like form of humanoid robots positions them uniquely to achieve the agility and versatility in motor skills that humans possess. Learning from human demonstrations offers a scalable approach to acquiring these capabilities. However, prior works either produce unnatural motions or rely on motion-specific tuning to achieve satisfactory naturalness. Furthermore, these methods are often motion- or goal-specific, lacking the versatility to compose diverse skills, especially when solving unseen tasks. We present BeyondMimic, a framework that scales to diverse motions and carries the versatility to compose them seamlessly in tackling unseen downstream tasks. At heart, a compact motion-tracking formulation enables mastering a wide range of radically agile behaviors, including aerial cartwheels, spin-kicks, flip-kicks, and sprinting, with a single setup and shared hyperparameters, all while achieving state-of-the-art human-like performance. Moving beyond the mere imitation of existing motions, we propose a unified latent diffusion model that empowers versatile goal specification, seamless task switching, and dynamic composition of these agile behaviors. Leveraging classifier guidance, a diffusion-specific technique for test-time optimization toward novel objectives, our model extends its capability to solve downstream tasks never encountered during training, including motion inpainting, joystick teleoperation, and obstacle avoidance, and transfers these skills zero-shot to real hardware. This work opens new frontiers for humanoid robots by pushing the limits of scalable human-like motor skill acquisition from human motion and advancing seamless motion synthesis that achieves generalization and versatility beyond training setups.
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Education (0.87)
- Leisure & Entertainment > Sports (0.46)
- Information Technology > Artificial Intelligence > Robots > Locomotion (0.68)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.62)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Fooling Algorithms in Non-Stationary Bandits using Belief Inertia
We study the problem of worst case regret in piecewise stationary multi armed bandits. While the minimax theory for stationary bandits is well established, understanding analogous limits in time-varying settings is challenging. Existing lower bounds rely on what we refer to as infrequent sampling arguments, where long intervals without exploration allow adversarial reward changes that induce large regret. In this paper, we introduce a fundamentally different approach based on a belief inertia argument. Our analysis captures how an algorithm's empirical beliefs, encoded through historical reward averages, create momentum that resists new evidence after a change. We show how this inertia can be exploited to construct adversarial instances that mislead classical algorithms such as Explore Then Commit, epsilon greedy, and UCB, causing them to suffer regret that grows linearly with T and with a substantial constant factor, regardless of how their parameters are tuned, even with a single change point. We extend the analysis to algorithms that periodically restart to handle non stationarity and prove that, even then, the worst case regret remains linear in T. Our results indicate that utilizing belief inertia can be a powerful method for deriving sharp lower bounds in non stationary bandits.
- North America > United States > North Carolina (0.04)
- Asia > Middle East > Israel (0.04)
Towards Active Excitation-Based Dynamic Inertia Identification in Satellites
El-Hariry, Matteo, Franzese, Vittorio, Olivares-Mendez, Miguel
Abstract-- This paper presents a comprehensive analysis of how excitation design influences the identification of the inertia properties of rigid nano-and micro-satellites. We simulate nonlinear attitude dynamics with reaction-wheel coupling, actuator limits, and external disturbances, and excite the system using eight torque profiles of varying spectral richness. Two estimators are compared, a batch Least Squares method and an Extended Kalman Filter, across three satellite configurations and time-varying inertia scenarios. Results show that excitation frequency content and estimator assumptions jointly determine estimation accuracy and robustness, offering practical guidance for in-orbit adaptive inertia identification by outlining the conditions under which each method performs best. The ability to accurately characterize the inertia properties of a spacecraft while in mission flight is key for attitude and trajectory control.
Floating-Base Deep Lagrangian Networks
Schulze, Lucas, Negri, Juliano Decico, Barasuol, Victor, Medeiros, Vivian Suzano, Becker, Marcelo, Peters, Jan, Arenz, Oleg
Grey-box methods for system identification combine deep learning with physics-informed constraints, capturing complex dependencies while improving out-of-distribution generalization. Yet, despite the growing importance of floating-base systems such as humanoids and quadrupeds, current grey-box models ignore their specific physical constraints. For instance, the inertia matrix is not only positive definite but also exhibits branch-induced sparsity and input independence. Moreover, the 6x6 composite spatial inertia of the floating base inherits properties of single-rigid-body inertia matrices. As we show, this includes the triangle inequality on the eigenvalues of the composite rotational inertia. To address the lack of physical consistency in deep learning models of floating-base systems, we introduce a parameterization of inertia matrices that satisfies all these constraints. Inspired by Deep Lagrangian Networks (DeLaN), we train neural networks to predict physically plausible inertia matrices that minimize inverse dynamics error under Lagrangian mechanics. For evaluation, we collected and released a dataset on multiple quadrupeds and humanoids. In these experiments, our Floating-Base Deep Lagrangian Networks (FeLaN) achieve highly competitive performance on both simulated and real robots, while providing greater physical interpretability.
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- South America > Brazil (0.04)
- North America > Canada (0.04)
- Asia > China > Hong Kong (0.04)