Goto

Collaborating Authors

 coil


On Efficient Online Imitation Learning via Classification

Neural Information Processing Systems

Imitation learning (IL) is a general learning paradigm for sequential decision-making problems. Interactive imitation learning, where learners can interactively query for expert annotations, has been shown to achieve provably superior sample efficiency guarantees compared with its offline counterpart or reinforcement learning. In this work, we study classification-based online imitation learning (abbrev. COIL) and the fundamental feasibility to design oracle-efficient regret-minimization algorithms in this setting, with a focus on the general non-realizable case. We make the following contributions: (1) we show that in the COIL problem, any proper online learning algorithm cannot guarantee a sublinear regret in general; (2) we propose Logger, an improper online learning algorithmic framework, that reduces COIL to online linear optimization, by utilizing a new definition of mixed policy class; (3) we design two oracle-efficient algorithms within the Logger framework that enjoy different sample and interaction round complexity tradeoffs, and show their improvements over behavior cloning; (4) we show that under standard complexity-theoretic assumptions, efficient dynamic regret minimization is infeasible in the Logger framework.


Curriculum Offline Imitating Learning

Neural Information Processing Systems

Offline reinforcement learning (RL) tasks require the agent to learn from a pre-collected dataset with no further interactions with the environment. Despite the potential to surpass the behavioral policies, RL-based methods are generally impractical due to the training instability and bootstrapping the extrapolation errors, which always require careful hyperparameter tuning via online evaluation. In contrast, offline imitation learning (IL) has no such issues since it learns the policy directly without estimating the value function by bootstrapping. However, IL is usually limited in the capability of the behavioral policy and tends to learn a mediocre behavior from the dataset collected by the mixture of policies. In this paper, we aim to take advantage of IL but mitigate such a drawback. Observing that behavior cloning is able to imitate neighboring policies with less data, we propose \textit{Curriculum Offline Imitation Learning (COIL)}, which utilizes an experience picking strategy to make the agent imitate from adaptive neighboring policies with a higher return, and improves the current policy along curriculum stages. On continuous control benchmarks, we compare COIL against both imitation-based methods and RL-based methods, showing that COIL not only avoids just learning a mediocre behavior on mixed datasets but is also even competitive with state-of-the-art offline RL methods.


Correspondence-Oriented Imitation Learning: Flexible Visuomotor Control with 3D Conditioning

Cao, Yunhao, Bhaumik, Zubin, Jia, Jessie, He, Xingyi, Fang, Kuan

arXiv.org Artificial Intelligence

We introduce Correspondence-Oriented Imitation Learning (COIL), a conditional policy learning framework for visuomotor control with a flexible task representation in 3D. At the core of our approach, each task is defined by the intended motion of keypoints selected on objects in the scene. Instead of assuming a fixed number of keypoints or uniformly spaced time intervals, COIL supports task specifications with variable spatial and temporal granularity, adapting to different user intents and task requirements. To robustly ground this correspondence-oriented task representation into actions, we design a conditional policy with a spatio-temporal attention mechanism that effectively fuses information across multiple input modalities. The policy is trained via a scalable self-supervised pipeline using demonstrations collected in simulation, with correspondence labels automatically generated in hindsight. COIL generalizes across tasks, objects, and motion patterns, achieving superior performance compared to prior methods on real-world manipulation tasks under both sparse and dense specifications.


Are induction stoves better? These chefs think so.

Popular Science

Induction stoves use electromagnetism to heat food more efficiently than any other kind of stovetop. Breakthroughs, discoveries, and DIY tips sent every weekday. Ask someone in the United States about "electric cooking" and they'll probably describe one of those awful coil stoves, the ones that take forever to heat up and then burn your dinner to a crisp the moment you take your eyes off it. This unfortunate association is perhaps one reason why induction cooking hasn't quite taken off in the U.S. the way it has elsewhere in the world--in Europe, for example, where induction stoves are commonplace. How do induction stoves differ from electric stoves?


Certified Coil Geometry Learning for Short-Range Magnetic Actuation and Spacecraft Docking Application

Takahashi, Yuta, Tajima, Hayate, Sakai, Shin-ichiro

arXiv.org Artificial Intelligence

This paper presents a learning-based framework for approximating an exact magnetic-field interaction model, supported by both numerical and experimental validation. High-fidelity magnetic-field interaction modeling is essential for achieving exceptional accuracy and responsiveness across a wide range of fields, including transportation, energy systems, medicine, biomedical robotics, and aerospace robotics. In aerospace engineering, magnetic actuation has been investigated as a fuel-free solution for multi-satellite attitude and formation control. Although the exact magnetic field can be computed from the Biot-Savart law, the associated computational cost is prohibitive, and prior studies have therefore relied on dipole approximations to improve efficiency. However, these approximations lose accuracy during proximity operations, leading to unstable behavior and even collisions. To address this limitation, we develop a learning-based approximation framework that faithfully reproduces the exact field while dramatically reducing computational cost. The proposed method additionally provides a certified error bound, derived from the number of training samples, ensuring reliable prediction accuracy. The learned model can also accommodate interactions between coils of different sizes through appropriate geometric transformations, without retraining. To verify the effectiveness of the proposed framework under challenging conditions, a spacecraft docking scenario is examined through both numerical simulations and experimental validation.





NODA-MMH: Certified Learning-Aided Nonlinear Control for Magnetically-Actuated Swarm Experiment Toward On-Orbit Proof

Takahashi, Yuta, Ochi, Atsuki, Tomioka, Yoichi, Sakai, Shin-Ichiro

arXiv.org Artificial Intelligence

This study experimentally validates the principle of large-scale satellite swarm control through learning-aided magnetic field interactions generated by satellite-mounted magnetorquers. This actuation presents a promising solution for the long-term formation maintenance of multiple satellites and has primarily been demonstrated in ground-based testbeds for two-satellite position control. However, as the number of satellites increases beyond three, fundamental challenges coupled with the high nonlinearity arise: 1) nonholonomic constraints, 2) underactuation, 3) scalability, and 4) computational cost. Previous studies have shown that time-integrated current control theoretically solves these problems, where the average actuator outputs align with the desired command, and a learning-based technique further enhances their performance. Through multiple experiments, we validate critical aspects of learning-aided time-integrated current control: (1) enhanced controllability of the averaged system dynamics, with a theoretically guaranteed error bound, and (2) decentralized current management. We design two-axis coils and a ground-based experimental setup utilizing an air-bearing platform, enabling a mathematical replication of orbital dynamics. Based on the effectiveness of the learned interaction model, we introduce NODA-MMH (Neural power-Optimal Dipole Allocation for certified learned Model-based Magnetically swarm control Harness) for model-based power-optimal swarm control. This study complements our tutorial paper on magnetically actuated swarms for the long-term formation maintenance problem.


MicroRoboScope: A Portable and Integrated Mechatronic Platform for Magnetic and Acoustic Microrobotic Experimentation

Sokolich, Max, Yang, Yanda, Cherukumilli, Subrahmanyam, Kirmizitas, Fatma Ceren, Das, Sambeeta

arXiv.org Artificial Intelligence

Microscale robots have a variety of potential applications in medicine, environmental monitoring, and tissue engineering, due to their small size and capabilities of sensing and manipulation at the small scale [1]. Recent research has demonstrated their potential in applications ranging from ocular drug delivery and in vitro fertilization to root canal prevention and tumor treatment [2, 3]. The most common actuation methods for microscale robots are acoustic and electromagnetic actuation [4]. Acoustic microrobots, for instance, can be manipulated using sound waves to achieve precise movements, while electromagnetic microrobots rely on magnetic fields for their actuation and control. Traditional open-loop control systems for acoustic and magnetic microrobots often fail to provide the necessary accuracy and reliability required for the above applications [5].