Goto

Collaborating Authors

 Li, Keyu


Deep learning automates Cobb angle measurement compared with multi-expert observers

arXiv.org Artificial Intelligence

Scoliosis, a prevalent condition characterized by abnormal spinal curvature leading to deformity, requires precise assessment methods for effective diagnosis and management. The Cobb angle is a widely used scoliosis quantification method that measures the degree of curvature between the tilted vertebrae. Yet, manual measuring of Cobb angles is time-consuming and labor-intensive, fraught with significant interobserver and intraobserver variability. To address these challenges and the lack of interpretability found in certain existing automated methods, we have created fully automated software that not only precisely measures the Cobb angle but also provides clear visualizations of these measurements. This software integrates deep neural network-based spine region detection and segmentation, spine centerline identification, pinpointing the most significantly tilted vertebrae, and direct visualization of Cobb angles on the original images. Upon comparison with the assessments of 7 expert readers, our algorithm exhibited a mean deviation in Cobb angle measurements of 4.17 degrees, notably surpassing the manual approach's average intra-reader discrepancy of 5.16 degrees. The algorithm also achieved intra-class correlation coefficients (ICC) exceeding 0.96 and Pearson correlation coefficients above 0.944, reflecting robust agreement with expert assessments and superior measurement reliability. Through the comprehensive reader study and statistical analysis, we believe this algorithm not only ensures a higher consensus with expert readers but also enhances interpretability and reproducibility during assessments. It holds significant promise for clinical application, potentially aiding physicians in more accurate scoliosis assessment and diagnosis, thereby improving patient care.


Causal Mediation Analysis with Multi-dimensional and Indirectly Observed Mediators

arXiv.org Artificial Intelligence

Causal mediation analysis (CMA) is a powerful method to dissect the total effect of a treatment into direct and mediated effects within the potential outcome framework. This is important in many scientific applications to identify the underlying mechanisms of a treatment effect. However, in many scientific applications the mediator is unobserved, but there may exist related measurements. For example, we may want to identify how changes in brain activity or structure mediate an antidepressant's effect on behavior, but we may only have access to electrophysiological or imaging brain measurements. To date, most CMA methods assume that the mediator is one-dimensional and observable, which oversimplifies such real-world scenarios. To overcome this limitation, we introduce a CMA framework that can handle complex and indirectly observed mediators based on the identifiable variational autoencoder (iVAE) architecture. We prove that the true joint distribution over observed and latent variables is identifiable with the proposed method. Additionally, our framework captures a disentangled representation of the indirectly observed mediator and yields accurate estimation of the direct and mediated effects in synthetic and semi-synthetic experiments, providing evidence of its potential utility in real-world applications.


Closed-Loop Magnetic Manipulation for Robotic Transesophageal Echocardiography

arXiv.org Artificial Intelligence

This paper presents a closed-loop magnetic manipulation framework for robotic transesophageal echocardiography (TEE) acquisitions. Different from previous work on intracorporeal robotic ultrasound acquisitions that focus on continuum robot control, we first investigate the use of magnetic control methods for more direct, intuitive, and accurate manipulation of the distal tip of the probe. We modify a standard TEE probe by attaching a permanent magnet and an inertial measurement unit sensor to the probe tip and replacing the flexible gastroscope with a soft tether containing only wires for transmitting ultrasound signals, and show that 6-DOF localization and 5-DOF closed-loop control of the probe can be achieved with an external permanent magnet based on the fusion of internal inertial measurement and external magnetic field sensing data. The proposed method does not require complex structures or motions of the actuator and the probe compared with existing magnetic manipulation methods. We have conducted extensive experiments to validate the effectiveness of the framework in terms of localization accuracy, update rate, workspace size, and tracking accuracy. In addition, our results obtained on a realistic cardiac tissue-mimicking phantom show that the proposed framework is applicable in real conditions and can generally meet the requirements for tele-operated TEE acquisitions.


Estimating Causal Effects using a Multi-task Deep Ensemble

arXiv.org Artificial Intelligence

A number of methods have been proposed for causal effect estimation, yet few have demonstrated efficacy in handling data with complex structures, such as images. To fill this gap, we propose Causal Multi-task Deep Ensemble (CMDE), a novel framework that learns both shared and group-specific information from the study population. We provide proofs demonstrating equivalency of CDME to a multi-task Gaussian process (GP) with a coregionalization kernel a priori. Compared to multi-task GP, CMDE efficiently handles high-dimensional and multi-modal covariates and provides pointwise uncertainty estimates of causal effects. We evaluate our method across various types of datasets and tasks and find that CMDE outperforms state-of-the-art methods on a majority of these tasks.


Style Transfer Enabled Sim2Real Framework for Efficient Learning of Robotic Ultrasound Image Analysis Using Simulated Data

arXiv.org Artificial Intelligence

Robotic ultrasound (US) systems have shown great potential to make US examinations easier and more accurate. Recently, various machine learning techniques have been proposed to realize automatic US image interpretation for robotic US acquisition tasks. However, obtaining large amounts of real US imaging data for training is usually expensive or even unfeasible in some clinical applications. An alternative is to build a simulator to generate synthetic US data for training, but the differences between simulated and real US images may result in poor model performance. This work presents a Sim2Real framework to efficiently learn robotic US image analysis tasks based only on simulated data for real-world deployment. A style transfer module is proposed based on unsupervised contrastive learning and used as a preprocessing step to convert the real US images into the simulation style. Thereafter, a task-relevant model is designed to combine CNNs with vision transformers to generate the task-dependent prediction with improved generalization ability. We demonstrate the effectiveness of our method in an image regression task to predict the probe position based on US images in robotic transesophageal echocardiography (TEE). Our results show that using only simulated US data and a small amount of unlabelled real data for training, our method can achieve comparable performance to semi-supervised and fully supervised learning methods. Moreover, the effectiveness of our previously proposed CT-based US image simulation method is also indirectly confirmed.


Planning-oriented Autonomous Driving

arXiv.org Artificial Intelligence

Modern autonomous driving system is characterized as modular tasks in sequential order, i.e., perception, prediction, and planning. In order to perform a wide diversity of tasks and achieve advanced-level intelligence, contemporary approaches either deploy standalone models for individual tasks, or design a multi-task paradigm with separate heads. However, they might suffer from accumulative errors or deficient task coordination. Instead, we argue that a favorable framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning of the self-driving car. Oriented at this, we revisit the key components within perception and prediction, and prioritize the tasks such that all these tasks contribute to planning. We introduce Unified Autonomous Driving (UniAD), a comprehensive framework up-to-date that incorporates full-stack driving tasks in one network. It is exquisitely devised to leverage advantages of each module, and provide complementary feature abstractions for agent interaction from a global perspective. Tasks are communicated with unified query interfaces to facilitate each other toward planning. We instantiate UniAD on the challenging nuScenes benchmark. With extensive ablations, the effectiveness of using such a philosophy is proven by substantially outperforming previous state-of-the-arts in all aspects. Code and models are public.


Domain Adaptation via Rebalanced Sub-domain Alignment

arXiv.org Artificial Intelligence

Unsupervised domain adaptation (UDA) is a technique used to transfer knowledge from a labeled source domain to a different but related unlabeled target domain. While many UDA methods have shown success in the past, they often assume that the source and target domains must have identical class label distributions, which can limit their effectiveness in real-world scenarios. To address this limitation, we propose a novel generalization bound that reweights source classification error by aligning source and target sub-domains. We prove that our proposed generalization bound is at least as strong as existing bounds under realistic assumptions, and we empirically show that it is much stronger on real-world data. We then propose an algorithm to minimize this novel generalization bound. We demonstrate by numerical experiments that this approach improves performance in shifted class distribution scenarios compared to state-of-the-art methods.


Image-Guided Navigation of a Robotic Ultrasound Probe for Autonomous Spinal Sonography Using a Shadow-aware Dual-Agent Framework

arXiv.org Artificial Intelligence

Ultrasound (US) imaging is commonly used to assist in the diagnosis and interventions of spine diseases, while the standardized US acquisitions performed by manually operating the probe require substantial experience and training of sonographers. In this work, we propose a novel dual-agent framework that integrates a reinforcement learning (RL) agent and a deep learning (DL) agent to jointly determine the movement of the US probe based on the real-time US images, in order to mimic the decision-making process of an expert sonographer to achieve autonomous standard view acquisitions in spinal sonography. Moreover, inspired by the nature of US propagation and the characteristics of the spinal anatomy, we introduce a view-specific acoustic shadow reward to utilize the shadow information to implicitly guide the navigation of the probe toward different standard views of the spine. Our method is validated in both quantitative and qualitative experiments in a simulation environment built with US data acquired from 17 volunteers. The average navigation accuracy toward different standard views achieves 5.18mm/5.25deg and 12.87mm/17.49deg in the intra- and inter-subject settings, respectively. The results demonstrate that our method can effectively interpret the US images and navigate the probe to acquire multiple standard views of the spine.


Human-Aware Robot Navigation via Reinforcement Learning with Hindsight Experience Replay and Curriculum Learning

arXiv.org Artificial Intelligence

In recent years, the growing demand for more intelligent service robots is pushing the development of mobile robot navigation algorithms to allow safe and efficient operation in a dense crowd. Reinforcement learning (RL) approaches have shown superior ability in solving sequential decision making problems, and recent work has explored its potential to learn navigation polices in a socially compliant manner. However, the expert demonstration data used in existing methods is usually expensive and difficult to obtain. In this work, we consider the task of training an RL agent without employing the demonstration data, to achieve efficient and collision-free navigation in a crowded environment. To address the sparse reward navigation problem, we propose to incorporate the hindsight experience replay (HER) and curriculum learning (CL) techniques with RL to efficiently learn the optimal navigation policy in the dense crowd. The effectiveness of our method is validated in a simulated crowd-robot coexisting environment. The results demonstrate that our method can effectively learn human-aware navigation without requiring additional demonstration data.


Automatic Recognition of Abdominal Organs in Ultrasound Images based on Deep Neural Networks and K-Nearest-Neighbor Classification

arXiv.org Artificial Intelligence

Abdominal ultrasound imaging has been widely used to assist in the diagnosis and treatment of various abdominal organs. In order to shorten the examination time and reduce the cognitive burden on the sonographers, we present a classification method that combines the deep learning techniques and k-Nearest-Neighbor (k-NN) classification to automatically recognize various abdominal organs in the ultrasound images in real time. Fine-tuned deep neural networks are used in combination with PCA dimension reduction to extract high-level features from raw ultrasound images, and a k-NN classifier is employed to predict the abdominal organ in the image. We demonstrate the effectiveness of our method in the task of ultrasound image classification to automatically recognize six abdominal organs. A comprehensive comparison of different configurations is conducted to study the influence of different feature extractors and classifiers on the classification accuracy. Both quantitative and qualitative results show that with minimal training effort, our method can "lazily" recognize the abdominal organs in the ultrasound images in real time with an accuracy of 96.67%. Our implementation code is publicly available at: https://github.com/LeeKeyu/abdominal_ultrasound_classification.