 wbc



HoMeR: Learning In-the-Wild Mobile Manipulation via Hybrid Imitation and Whole-Body Control

Sundaresan, Priya, Malhotra, Rhea, Miao, Phillip, Yang, Jingyun, Wu, Jimmy, Hu, Hengyuan, Antonova, Rika, Engelmann, Francis, Sadigh, Dorsa, Bohg, Jeannette

arXiv.org Artificial Intelligence

We introduce HoMeR, an imitation learning framework for mobile manipulation that combines whole-body control with hybrid action modes that handle both long-range and fine-grained motion, enabling effective performance on realistic in-the-wild tasks. At its core is a fast, kinematics-based whole-body controller that maps desired end-effector poses to coordinated motion across the mobile base and arm. Within this reduced end-effector action space, HoMeR learns to switch between absolute pose predictions for long-range movement and relative pose predictions for fine-grained manipulation, offloading low-level coordination to the controller and focusing learning on task-level decisions. We deploy HoMeR on a holonomic mobile manipulator with a 7-DoF arm in a real home. We compare HoMeR to baselines without hybrid actions or whole-body control across 3 simulated and 3 real household tasks such as opening cabinets, sweeping trash, and rearranging pillows. Across tasks, HoMeR achieves an overall success rate of 79.17% using just 20 demonstrations per task, outperforming the next best baseline by 29.17% on average. HoMeR is also compatible with vision-language models and can leverage their internet-scale priors to better generalize to novel object appearances, layouts, and cluttered scenes. In summary, HoMeR moves beyond tabletop settings and demonstrates a scalable path toward sample-efficient, generalizable manipulation in everyday indoor spaces. Code, videos, and supplementary material are available at: http://homer-manip.github.io
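
The switch between absolute and relative pose predictions described above can be illustrated with a minimal sketch. This is not HoMeR's implementation: the `HybridAction` type, the mode labels, and `resolve_target` are hypothetical names, and the example tracks position only (orientation is omitted for brevity). The resolved target is what would then be handed to a whole-body controller.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class HybridAction:
    """Hypothetical policy output: a mode flag plus a 3D vector."""
    mode: str    # "absolute" or "relative" (illustrative labels)
    value: Vec3  # world-frame target if absolute, offset if relative


def resolve_target(current: Vec3, action: HybridAction) -> Vec3:
    """Turn a hybrid action into a world-frame end-effector target."""
    if action.mode == "absolute":
        # Long-range motion: the prediction is already the target position.
        return action.value
    if action.mode == "relative":
        # Fine-grained motion: the prediction is an offset from the current position.
        return tuple(c + d for c, d in zip(current, action.value))
    raise ValueError(f"unknown action mode: {action.mode!r}")


if __name__ == "__main__":
    here = (1.0, 2.0, 0.5)
    print(resolve_target(here, HybridAction("absolute", (3.0, 0.0, 1.0))))
    print(resolve_target(here, HybridAction("relative", (0.5, -1.0, 0.25))))
```

Keeping the mode decision inside the action itself means the downstream controller only ever sees a single target pose, whichever mode produced it.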


WBCAtt: A White Blood Cell Dataset Annotated with Detailed Morphological Attributes

Neural Information Processing Systems

We then annotated ten thousand WBC images with these attributes, resulting in 113k labels (11 attributes x 10.3k images). Annotating at this level of detail and scale is unprecedented, offering unique value to AI in pathology. Moreover, we conduct experiments to predict these attributes from cell images, and also demonstrate specific applications that can benefit from our detailed annotations.


The Role of Embodiment in Intuitive Whole-Body Teleoperation for Mobile Manipulation

Moyen, Sophia Bianchi, Krohn, Rickmer, Lueth, Sophie, Pompetzki, Kay, Peters, Jan, Prasad, Vignesh, Chalvatzaki, Georgia

arXiv.org Artificial Intelligence

Intuitive teleoperation interfaces are essential for mobile manipulation robots to ensure high-quality data collection while reducing operator workload. A strong sense of embodiment combined with minimal physical and cognitive demands not only enhances the user experience during large-scale data collection, but also helps maintain data quality over extended periods. This becomes especially crucial for challenging long-horizon mobile manipulation tasks that require whole-body coordination. We compare two distinct robot control paradigms: a coupled embodiment integrating arm manipulation and base navigation functions, and a decoupled embodiment treating these systems as separate control entities. Additionally, we evaluate two visual feedback mechanisms: immersive virtual reality and conventional screen-based visualization of the robot's field of view. These configurations were systematically assessed across a complex, multi-stage task sequence requiring integrated planning and execution. Our results show that the use of VR as a feedback modality increases task completion time, cognitive workload, and perceived effort of the teleoperator. Coupling manipulation and navigation leads to a comparable workload on the user as decoupling the embodiments, while preliminary experiments suggest that data acquired by coupled teleoperation leads to better imitation learning performance. Our holistic view on intuitive teleoperation interfaces provides valuable insight into collecting high-quality, high-dimensional mobile manipulation data at scale with the human operator in mind.


Assessing the Impact of Image Super Resolution on White Blood Cell Classification Accuracy

Nagarhalli, Tatwadarshi P., Pawar, Shruti S., Dahanukar, Soham A., Aswalekar, Uday, Save, Ashwini M., Patil, Sanket D.

arXiv.org Artificial Intelligence

Accurately classifying white blood cells from microscopic images is essential to identify several illnesses and conditions in medical diagnostics. Many deep learning technologies are being employed to quickly and automatically classify images. However, most of the time, the resolution of these microscopic pictures is quite low, which might make it difficult to classify them correctly. Some picture improvement techniques, such as image super-resolution, are being utilized to improve the resolution of the photos to get around this issue. The suggested study uses large image dimension upscaling to investigate how picture-enhancing approaches affect classification performance. The study specifically looks at how deep learning models may be able to understand more complex visual information by capturing subtler morphological changes when image resolution is increased using cutting-edge techniques. The model may learn from standard and augmented data since the improved images are incorporated into the training process. This dual method seeks to comprehend the impact of image resolution on model performance and enhance classification accuracy. A well-known model for picture categorization is used to conduct extensive testing and thoroughly evaluate the effectiveness of this approach. This research intends to create more efficient image identification algorithms customized to a particular dataset of white blood cells by understanding the trade-offs between ordinary and enhanced images.


Bridging the Sim-to-Real Gap for Athletic Loco-Manipulation

Fey, Nolan, Margolis, Gabriel B., Peticco, Martin, Agrawal, Pulkit

arXiv.org Artificial Intelligence

Achieving athletic loco-manipulation on robots requires moving beyond traditional tracking rewards - which simply guide the robot along a reference trajectory - to task rewards that drive truly dynamic, goal-oriented behaviors. Commands such as "throw the ball as far as you can" or "lift the weight as quickly as possible" compel the robot to exhibit the agility and power inherent in athletic performance. However, training solely with task rewards introduces two major challenges: these rewards are prone to exploitation (reward hacking), and the exploration process can lack sufficient direction. To address these issues, we propose a two-stage training pipeline. First, we introduce the Unsupervised Actuator Net (UAN), which leverages real-world data to bridge the sim-to-real gap for complex actuation mechanisms without requiring access to torque sensing. UAN mitigates reward hacking by ensuring that the learned behaviors remain robust and transferable. Second, we use a pre-training and fine-tuning strategy that leverages reference trajectories as initial hints to guide exploration. With these innovations, our robot athlete learns to lift, throw, and drag with remarkable fidelity from simulation to reality.


Benchmarking Different QP Formulations and Solvers for Dynamic Quadrupedal Walking

Stark, Franek, Middelberg, Jakob, Mronga, Dennis, Vyas, Shubham, Kirchner, Frank

arXiv.org Artificial Intelligence

Quadratic Programs (QPs) are widely used in the control of walking robots, especially in Model Predictive Control (MPC) and Whole-Body Control (WBC). In both cases, the controller design requires the formulation of a QP and the selection of a suitable QP solver, both requiring considerable time and expertise. While computational performance benchmarks exist for QP solvers, studies comparing optimal combinations of computational hardware (HW), QP formulation, and solver performance are lacking. In this work, we compare dense and sparse QP formulations, and multiple solving methods on different HW architectures, focusing on their computational efficiency in dynamic walking of four-legged robots using MPC. We introduce the Solve Frequency per Watt (SFPW) as a performance measure to enable a cross-hardware comparison of the efficiency of QP solvers. We also benchmark different QP solvers for WBC that we use for trajectory stabilization in quadrupedal walking. As a result, this paper provides recommendations for the selection of QP formulations and solvers for different HW architectures in walking robots and indicates which problems deserve greater technical effort in this domain in the future.
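
The abstract names the Solve Frequency per Watt (SFPW) measure without giving its formula. The sketch below assumes the most literal reading, solve frequency (the inverse of the mean solve time) divided by mean power draw. The function name, its parameters, and the sample numbers are hypothetical, and the paper's exact definition may differ.

```python
def solve_frequency_per_watt(solve_times_s, mean_power_w):
    """Assumed definition: (1 / mean solve time) / mean power, in Hz per watt."""
    if not solve_times_s:
        raise ValueError("need at least one solve time")
    if mean_power_w <= 0:
        raise ValueError("power must be positive")
    mean_solve_time = sum(solve_times_s) / len(solve_times_s)
    solve_frequency_hz = 1.0 / mean_solve_time
    return solve_frequency_hz / mean_power_w


if __name__ == "__main__":
    # Made-up measurements for two hypothetical platforms.
    desktop = solve_frequency_per_watt([0.0010, 0.0012, 0.0011], mean_power_w=65.0)
    embedded = solve_frequency_per_watt([0.0040, 0.0042, 0.0041], mean_power_w=7.5)
    print(f"desktop:  {desktop:.2f} Hz/W")
    print(f"embedded: {embedded:.2f} Hz/W")
```

Under this reading, a slower but low-power board can score higher than a fast desktop CPU, which is the kind of cross-hardware comparison the measure is meant to enable.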


Automated Quantification of White Blood Cells in Light Microscopic Images of Injured Skeletal Muscle

Jiao, Yang, Derakhshan, Hananeh, Schneider, Barbara St. Pierre, Regentova, Emma, Yang, Mei

arXiv.org Artificial Intelligence

White blood cells (WBCs) are the most diverse cell types observed in the healing process of injured skeletal muscles. In the course of healing, WBCs exhibit dynamic cellular response and undergo multiple protein expression changes. The progress of healing can be analyzed by quantifying the number of WBCs or the amount of specific proteins in light microscopic images obtained at different time points after injury. In this paper, we propose an automated quantifying and analysis framework to analyze WBCs using light microscopic images of uninjured and injured muscles. The proposed framework is based on the Localized Iterative Otsu's threshold method with muscle edge detection and region of interest extraction. Compared with the threshold methods used in ImageJ, the LI Otsu's threshold method has high resistance to background area and achieves better accuracy. The CD68-positive cell results are presented for demonstrating the effectiveness of the proposed work.
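
For orientation, here is a minimal sketch of the standard global Otsu threshold, which picks the gray level that maximizes between-class variance. It is not the Localized Iterative variant proposed in the paper, and it omits muscle edge detection and region-of-interest extraction. The function name and the flat list-of-pixels input are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    `pixels` is a flat iterable of integers in [0, levels).
    """
    hist = [0] * levels
    total = 0
    for p in pixels:
        hist[p] += 1
        total += 1
    if total == 0:
        raise ValueError("no pixels given")

    sum_all = sum(i * h for i, h in enumerate(hist))
    w_lo = 0        # pixel count in the lower class
    sum_lo = 0      # intensity sum of the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_lo += hist[t]
        sum_lo += t * hist[t]
        if w_lo == 0:
            continue            # lower class still empty
        w_hi = total - w_lo
        if w_hi == 0:
            break               # upper class empty from here on
        mean_lo = sum_lo / w_lo
        mean_hi = (sum_all - sum_lo) / w_hi
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_lo * w_hi * (mean_lo - mean_hi) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


if __name__ == "__main__":
    # Synthetic bimodal data: dark background plus bright stained cells.
    image = [10] * 500 + [200] * 50
    print(otsu_threshold(image))
```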


Deriving Hematological Disease Classes Using Fuzzy Logic and Expert Knowledge: A Comprehensive Machine Learning Approach with CBC Parameters

Ameen, Salem, Balachandran, Ravivarman, Theodoridis, Theodoros

arXiv.org Artificial Intelligence

In the intricate field of medical diagnostics, capturing the subtle manifestations of diseases remains a challenge. Traditional methods, often binary in nature, may not encapsulate the nuanced variances that exist in real-world clinical scenarios. This paper introduces a novel approach by leveraging Fuzzy Logic Rules to derive disease classes based on expert domain knowledge from a medical practitioner. By recognizing that diseases do not always fit into neat categories, and that expert knowledge can guide the fuzzification of these boundaries, our methodology offers a more sophisticated and nuanced diagnostic tool. Using a dataset procured from a prominent hospital, containing detailed patient blood count records, we harness Fuzzy Logic Rules, a computational technique celebrated for its ability to handle ambiguity. This approach, moving through stages of fuzzification, rule application, inference, and ultimately defuzzification, produces refined diagnostic predictions. When combined with the Random Forest classifier, the system adeptly predicts hematological conditions using Complete Blood Count (CBC) parameters. Preliminary results showcase high accuracy levels, underscoring the advantages of integrating fuzzy logic into the diagnostic process. When juxtaposed with traditional diagnostic techniques, it becomes evident that Fuzzy Logic, especially when guided by medical expertise, offers significant advancements in the realm of hematological diagnostics. This paper not only paves the path for enhanced patient care but also beckons a deeper dive into the potentialities of fuzzy logic in various medical diagnostic applications.
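
The four stages named above (fuzzification, rule application, inference, defuzzification) can be sketched on a single input. Everything specific here is an assumption made for illustration: the hemoglobin breakpoints are not clinical reference values, the two rules and their output scores are invented, and the weighted-average defuzzifier over singleton outputs is one common choice rather than the paper's method.

```python
def ramp_down(x, lo, hi):
    """Membership that is 1 below lo, 0 above hi, and linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def fuzzify(hemoglobin):
    """Stage 1: map a crisp value to degrees of membership.

    The 10.0 and 13.0 breakpoints are illustrative only, not clinical values.
    """
    low = ramp_down(hemoglobin, 10.0, 13.0)
    return {"low": low, "normal": 1.0 - low}


# Stage 2: hypothetical rules, each mapping a fuzzy label to a singleton score.
RULES = [
    ("low", 0.9),     # IF hemoglobin is low THEN score is high
    ("normal", 0.1),  # IF hemoglobin is normal THEN score is low
]


def infer(memberships):
    """Stage 3: each rule fires with the strength of its antecedent."""
    return [(memberships[label], score) for label, score in RULES]


def defuzzify(fired):
    """Stage 4: weighted average of the singleton outputs."""
    total = sum(strength for strength, _ in fired)
    return sum(strength * score for strength, score in fired) / total


def score(hemoglobin):
    return defuzzify(infer(fuzzify(hemoglobin)))


if __name__ == "__main__":
    for value in (9.0, 11.5, 14.0):
        print(value, round(score(value), 3))
```

In a pipeline like the one the abstract describes, scores or class labels derived this way would serve as targets for a downstream classifier such as a Random Forest.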


Task-Space Riccati Feedback based Whole Body Control for Underactuated Legged Locomotion

Yang, Shunpeng, Hong, Zejun, Li, Sen, Wensing, Patrick, Zhang, Wei, Chen, Hua

arXiv.org Artificial Intelligence

This manuscript primarily aims to enhance the performance of whole-body controllers (WBC) for underactuated legged locomotion. We introduce a systematic parameter design mechanism for the floating-base feedback control within the WBC. The proposed approach involves utilizing the linearized model of unactuated dynamics to formulate a Linear Quadratic Regulator (LQR) and solving a Riccati gain while accounting for potential physical constraints through a second-order approximation of the log-barrier function. The user-tuned feedback gain for the floating base task is then replaced by a new one constructed from the solved Riccati gain. Extensive simulations conducted in MuJoCo with a point bipedal robot, as well as real-world experiments performed on a quadruped robot, demonstrate the effectiveness of the proposed method. In the different bipedal locomotion tasks, compared with the user-tuned method, the proposed approach is at least 12% better and up to 50% better at linear velocity tracking, and at least 7% better and up to 47% better at angular velocity tracking. In the quadruped experiment, linear velocity tracking is improved by at least 3% and angular velocity tracking is improved by at least 23% using the proposed method.
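
To make the idea of replacing a hand-tuned gain with a Riccati gain concrete, the sketch below solves the simplest possible case: an unconstrained, scalar, discrete-time LQR for x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2. It is a toy example under those assumptions, not the paper's method: there is no floating-base model and no log-barrier constraint term, and the function name is hypothetical.

```python
def scalar_dlqr(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Iterate the scalar discrete-time Riccati equation to a fixed point.

    Returns (p, k): the cost-to-go weight and the feedback gain for u = -k * x.
    """
    p = q
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = a * b * p / (r + b * b * p)
    return p, k


if __name__ == "__main__":
    p, k = scalar_dlqr(a=1.0, b=1.0, q=1.0, r=1.0)
    print(f"Riccati solution p = {p:.6f}, gain k = {k:.6f}")
    print(f"closed-loop pole a - b*k = {1.0 - k:.6f}")
```

For a = b = q = r = 1 the fixed point satisfies p^2 - p - 1 = 0, so p is the golden ratio and the closed-loop pole a - b*k lies inside the unit circle. The gain follows from the cost weights instead of being tuned by hand.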