Collaborating Authors

 sonographer


UltraDP: Generalizable Carotid Ultrasound Scanning with Force-Aware Diffusion Policy

Chen, Ruoqu, Yan, Xiangjie, Lv, Kangchen, Huang, Gao, Li, Zheng, Li, Xiang

arXiv.org Artificial Intelligence

Ultrasound scanning is a critical imaging technique for real-time, non-invasive diagnostics. However, variations in patient anatomy and complex human-in-the-loop interactions pose significant challenges for autonomous robotic scanning. Existing ultrasound scanning robots commonly suffer from limited generalization and inefficient data utilization. To overcome these limitations, we present UltraDP, a Diffusion-Policy-based method that receives multi-sensory inputs (ultrasound images, wrist camera images, contact wrench, and probe pose) and generates actions that fit the multi-modal action distributions arising in autonomous ultrasound scanning of the carotid artery. We propose a specialized guidance module that enables the policy to output actions that center the artery in ultrasound images. To ensure stable contact and safe interaction between the robot and the human subject, a hybrid force-impedance controller drives the robot to track the generated trajectories. We have also built a large-scale training dataset for carotid scanning comprising 210 scans with 460k sample pairs from 21 volunteers of both genders. By exploiting our guidance module and DP's strong generalization ability, UltraDP achieves a 95% success rate in transverse scanning on previously unseen subjects, demonstrating its effectiveness.


Measurement and Potential Field-Based Patient Modeling for Model-Mediated Tele-ultrasound

Yeung, Ryan S., Black, David G., Salcudean, Septimiu E.

arXiv.org Artificial Intelligence

Teleoperated ultrasound can improve diagnostic medical imaging access for remote communities. Having accurate force feedback is important for enabling sonographers to apply the appropriate probe contact force to optimize ultrasound image quality. However, large time delays in communication make direct force feedback impractical. Prior work investigated using point cloud-based model-mediated teleoperation and internal potential field models to estimate contact forces and torques. We expand on this by introducing a method to update the internal potential field model of the patient with measured positions and forces for more transparent model-mediated tele-ultrasound. We first generate a point cloud model of the patient's surface and transmit this to the sonographer in a compact data structure. This is converted to a static voxelized volume where each voxel contains a potential field value. These values determine the forces and torques, which are rendered based on overlap between the voxelized volume and a point shell model of the ultrasound transducer. We solve for the potential field using a convex quadratic that combines the spatial Laplace operator with measured forces. This was evaluated on volunteer patients ($n=3$) by computing the accuracy of rendered forces. Results showed the addition of measured forces to the model reduced the force magnitude error by an average of 7.23 N and force vector angle error by an average of 9.37$^{\circ}$ compared to using only Laplace's equation.
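The idea of combining a Laplace smoothness term with measurements in one convex quadratic objective can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the 1-D grid, the softly pinned boundary values, the `weight` parameter, and the treatment of measurements as direct potential values (rather than forces derived from the field) are all simplifying assumptions made here for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_potential(n, boundary, measured, weight=10.0):
    """Minimise ||L phi||^2 + weight * sum_k (phi[k] - m_k)^2 on a 1-D grid.

    boundary: (phi[0], phi[n-1]) end values, pinned by a strong penalty
              (hypothetical setup).
    measured: dict mapping grid index -> measured value (hypothetical).
    """
    rows, rhs = [], []
    # Discrete Laplacian rows: phi[i-1] - 2*phi[i] + phi[i+1] should be ~0.
    for i in range(1, n - 1):
        row = [0.0] * n
        row[i - 1], row[i], row[i + 1] = 1.0, -2.0, 1.0
        rows.append(row)
        rhs.append(0.0)
    # Boundary values enforced softly with a large weight.
    big = 1e3
    for idx, val in ((0, boundary[0]), (n - 1, boundary[1])):
        row = [0.0] * n
        row[idx] = big
        rows.append(row)
        rhs.append(big * val)
    # Measurement rows pull the field toward the measured values.
    sw = weight ** 0.5
    for idx, val in measured.items():
        row = [0.0] * n
        row[idx] = sw
        rows.append(row)
        rhs.append(sw * val)
    # Normal equations of the stacked least-squares problem.
    h = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    g = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(n)]
    return solve_linear(h, g)


laplace_only = fit_potential(5, (0.0, 1.0), {})
with_measurement = fit_potential(5, (0.0, 1.0), {2: 0.8})
```

With no measurements the solution is the linear profile that satisfies Laplace's equation exactly; adding a measurement at the midpoint pulls the field toward the measured value, with `weight` controlling how far.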


Optimal patient allocation for echocardiographic assessments

Sun, Bozhi, Tierney, Seda, Feinstein, Jeffrey A., Damen, Frederick, Marsden, Alison L., Schiavazzi, Daniele E.

arXiv.org Artificial Intelligence

Scheduling echocardiographic exams in a hospital presents significant challenges due to non-deterministic factors (e.g., patient no-shows, patient arrival times, and diverse exam durations) and asymmetric resource constraints between fetal and non-fetal patient streams. To address these challenges, we first conducted extensive pre-processing on one week of operational data from the Echo Laboratory at Stanford University's Lucile Packard Children's Hospital to estimate patient no-show probabilities and derive empirical distributions of arrival times and exam durations. Based on these inputs, we developed a discrete-event stochastic simulation model using SimPy and integrated it with the open-source Gymnasium Python library. As a baseline for policy optimization, we developed a comparative framework to evaluate on-the-fly versus reservation-based allocation strategies, in which different proportions of resources are reserved in advance. Considering a hospital configuration with a 1:6 ratio of fetal to non-fetal rooms and a 4:2 ratio of fetal to non-fetal sonographers, we show that on-the-fly allocation generally yields better performance, more effectively adapting to patient variability and resource constraints. Building on this foundation, we apply reinforcement learning (RL) to derive an approximately optimal dynamic allocation policy. This RL-based policy is benchmarked against the best-performing rule-based strategies, allowing us to quantify their differences and provide actionable insights for improving echo lab efficiency through intelligent, data-driven resource management.
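The contrast between on-the-fly and reservation-based allocation can be illustrated with a toy comparison. The sketch below is plain Python rather than SimPy, and it is a greedy arrival-order assignment, not the paper's simulation model: the function name `total_wait`, the rule that reserved rooms are fetal-only while the remaining rooms are shared, and the three-patient example are all hypothetical.

```python
def total_wait(patients, n_rooms, reserved=0):
    """Total waiting time when patients are assigned in arrival order.

    patients: list of (arrival_time, is_fetal, exam_duration) tuples.
    reserved: number of rooms (indices 0..reserved-1) kept for fetal
              patients only; reserved=0 is on-the-fly allocation.
    """
    if not 0 <= reserved < n_rooms:
        raise ValueError("reserved must satisfy 0 <= reserved < n_rooms")
    free_at = [0] * n_rooms  # time at which each room next becomes free
    waited = 0
    for arrival, is_fetal, duration in sorted(patients, key=lambda p: p[0]):
        # Fetal patients may use any room; others skip the reserved ones.
        first = 0 if is_fetal else reserved
        room = min(range(first, n_rooms), key=lambda r: free_at[r])
        start = max(arrival, free_at[room])
        waited += start - arrival
        free_at[room] = start + duration
    return waited


# Two non-fetal patients arrive at t=0, one fetal patient at t=1.
patients = [(0, False, 10), (0, False, 10), (1, True, 5)]
on_the_fly = total_wait(patients, n_rooms=2, reserved=0)   # 9
one_reserved = total_wait(patients, n_rooms=2, reserved=1)  # 10
```

In this tiny example reserving a room spares the fetal patient any wait but delays a non-fetal patient by more, so the total wait rises. Which policy wins in general depends on the arrival and duration distributions, which is what a full stochastic simulation is meant to quantify.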


Mixed Reality Tele-ultrasound over 750 km: a Clinical Study

Yeung, Ryan, Black, David, Chen, Patrick B., Lessoway, Victoria, Reid, Janice, Rangel-Suarez, Sergio, Chang, Silvia D., Salcudean, Septimiu E.

arXiv.org Artificial Intelligence

Ultrasound is a hand-held, low-cost, non-invasive medical imaging modality which plays a vital role in diagnosing various diseases. Despite this, many rural and remote communities do not have access to ultrasound scans due to the lack of local experts trained to perform them. To address this challenge, we built a mixed reality and haptics-based tele-ultrasound system to enable an expert to precisely guide a novice remotely in carrying out an ultrasound exam. The precision and flexibility of our solution make it more practical than existing tele-ultrasound solutions. We tested the system in Skidegate on the islands of Haida Gwaii, BC, Canada, with the experts positioned 754 km away at the University of British Columbia, Vancouver, Canada. We performed 11 scans with 10 novices and 2 experts. The experts were tasked with acquiring 5 target images and measurements in the epigastric region. The novices, of various backgrounds and ages, were all inexperienced in mixed reality and were not required to have prior ultrasound experience. The captured images were evaluated by two radiologists who were not present for the tests. These results are discussed along with new insights into the human-computer interaction in such a system. We show that human teleoperation is feasible and can achieve high performance for completing remote ultrasound procedures, even at a large distance and with completely novice followers.


Coaching a Robotic Sonographer: Learning Robotic Ultrasound with Sparse Expert's Feedback

Raina, Deepak, Balakuntala, Mythra V., Kim, Byung Wook, Wachs, Juan, Voyles, Richard

arXiv.org Artificial Intelligence

Ultrasound is widely employed for clinical intervention and diagnosis because it offers non-invasive, radiation-free, real-time imaging. However, the accessibility of this dexterous procedure is limited by the substantial training and expertise required of operators. Robotic ultrasound (RUS) offers a viable solution to this limitation; nonetheless, achieving human-level proficiency remains challenging. Learning from demonstrations (LfD) methods have been explored in RUS; these learn a policy prior from a dataset of offline demonstrations to encode the mental model of the expert sonographer. However, active engagement of experts during the training of RUS, i.e., coaching, has not been explored thus far. Coaching is known to enhance efficiency and performance in human training. This paper proposes a coaching framework for RUS to improve its performance. The framework combines deep reinforcement learning (DRL), which serves as self-supervised practice, with sparse expert feedback through coaching. The DRL employs an off-policy Soft Actor-Critic (SAC) network, with a reward based on image quality rating. The coaching by experts is modeled as a Partially Observable Markov Decision Process (POMDP), which updates the policy parameters based on the correction by the expert. The validation study on phantoms showed that coaching increases the learning rate by $25\%$ and the number of high-quality image acquisitions by $74.5\%$.


LCE: A Framework for Explainability of DNNs for Ultrasound Image Based on Concept Discovery

Kong, Weiji, Gong, Xun, Wang, Juan

arXiv.org Artificial Intelligence

Explaining the decisions of Deep Neural Networks (DNNs) for medical images has become increasingly important. Existing attribution methods have difficulty explaining the meaning of pixels, while existing concept-based methods are limited by additional annotations or specific model structures that are difficult to apply to ultrasound images. In this paper, we propose the Lesion Concept Explainer (LCE) framework, which combines attribution methods with concept-based methods. We introduce the Segment Anything Model (SAM), fine-tuned on a large number of medical images, for concept discovery to enable a meaningful explanation of ultrasound image DNNs. The proposed framework is evaluated in terms of both faithfulness and understandability. We point out deficiencies in the popular faithfulness evaluation metrics and propose a new evaluation metric. Our evaluation on public and private breast ultrasound datasets (BUSI and FG-US-B) shows that LCE performs well compared to commonly used explainability methods. Finally, we also validate that LCE can consistently provide reliable explanations for more meaningful fine-grained diagnostic tasks in breast ultrasound.


Imitation Learning for Robotic Assisted Ultrasound Examination of Deep Venous Thrombosis using Kernelized Movement Primitives

Dall'Alba, Diego, Busellato, Lorenzo, Savarimuthu, Thiusius Rajeeth, Cheng, Zhuoqi, Iturrate, Iñigo

arXiv.org Artificial Intelligence

Deep Vein Thrombosis (DVT) is a common yet potentially fatal condition, often leading to critical complications like pulmonary embolism. DVT is commonly diagnosed using Ultrasound (US) imaging, which can be inconsistent due to its high dependence on the operator's skill. Robotic US Systems (RUSs) aim to improve diagnostic test consistency but face challenges with the complex scanning pattern needed for DVT assessment, where precise control over US probe pressure is crucial for indirectly detecting occlusions. This work introduces an imitation learning method, based on Kernelized Movement Primitives (KMP), to standardize DVT US exams by training an autonomous robotic controller using sonographer demonstrations. A new recording device design enhances demonstration ergonomics, integrating with US probes and enabling seamless force and position data recording. KMPs are used to capture scanning skills, linking scan trajectory and force, enabling generalization beyond the demonstrations. Our approach, evaluated on synthetic models and volunteers, shows that the KMP-based RUS can replicate an expert's force control and image quality in DVT US examination. It outperforms previous methods using manually defined force profiles, improving exam standardization and reducing reliance on specialized sonographers.


Cardiac Copilot: Automatic Probe Guidance for Echocardiography with World Model

Jiang, Haojun, Sun, Zhenguo, Jia, Ning, Li, Meng, Sun, Yu, Luo, Shaqi, Song, Shiji, Huang, Gao

arXiv.org Artificial Intelligence

Echocardiography is the only technique capable of real-time imaging of the heart and is vital for diagnosing the majority of cardiac diseases. However, there is a severe shortage of experienced cardiac sonographers, due to the heart's complex structure and significant operational challenges. To mitigate this situation, we present a Cardiac Copilot system capable of providing real-time probe movement guidance to assist less experienced sonographers in conducting freehand echocardiography. This system can enable non-experts, especially in primary departments and medically underserved areas, to perform cardiac ultrasound examinations, potentially improving global healthcare delivery. The core innovation lies in proposing a data-driven world model, named Cardiac Dreamer, for representing cardiac spatial structures. This world model can provide structure features of any cardiac planes around the current probe position in the latent space, serving as a precise navigation map for autonomous plane localization. We train our model with real-world ultrasound data and corresponding probe motion from 110 routine clinical scans with 151K sample pairs by three certified sonographers. Evaluations on three standard planes with 37K sample pairs demonstrate that the world model can reduce navigation errors by up to 33% and exhibit more stable performance.


Machine Learning in Robotic Ultrasound Imaging: Challenges and Perspectives

Bi, Yuan, Jiang, Zhongliang, Duelmer, Felix, Huang, Dianye, Navab, Nassir

arXiv.org Artificial Intelligence

This article reviews the recent advances in intelligent robotic ultrasound (US) imaging systems. We commence by presenting the commonly employed robotic mechanisms and control techniques in robotic US imaging, along with their clinical applications. Subsequently, we focus on the deployment of machine learning techniques in the development of robotic sonographers, emphasizing crucial developments aimed at enhancing the intelligence of these systems. The methods for achieving autonomous action reasoning are categorized into two sets of approaches: those relying on implicit environmental data interpretation and those using explicit interpretation. Throughout this exploration, we also discuss practical challenges, including those related to the scarcity of medical data, the need for a deeper understanding of the physical aspects involved, and effective data representation approaches. Moreover, we conclude by highlighting the open problems in the field and analyzing different possible perspectives on how the community could move forward in this research area.


Whole-examination AI estimation of fetal biometrics from 20-week ultrasound scans

Venturini, Lorenzo, Budd, Samuel, Farruggia, Alfonso, Wright, Robert, Matthew, Jacqueline, Day, Thomas G., Kainz, Bernhard, Razavi, Reza, Hajnal, Jo V.

arXiv.org Artificial Intelligence

The current approach to fetal anomaly screening is based on biometric measurements derived from individually selected ultrasound images. In this paper, we introduce a paradigm shift that attains human-level performance in biometric measurement by aggregating automatically extracted biometrics from every frame across an entire scan, with no need for operator intervention. We use a convolutional neural network to classify each frame of an ultrasound video recording. We then measure fetal biometrics in every frame where appropriate anatomy is visible. We use a Bayesian method to estimate the true value of each biometric from a large number of measurements and probabilistically reject outliers. We performed a retrospective experiment on 1457 recordings (comprising 48 million frames) of 20-week ultrasound scans, estimated fetal biometrics in those scans and compared our estimates to the measurements sonographers took during the scan. Our method achieves human-level performance in estimating fetal biometrics and estimates well-calibrated credible intervals in which the true biometric value is expected to lie.
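One way to picture estimating a single value from many per-frame measurements while probabilistically discounting outliers is a grid posterior under a mixture likelihood. The sketch below is an illustration under stated assumptions, not the authors' method: the Gaussian inlier model with known `sigma`, the uniform outlier component, the flat prior, the function name `robust_posterior`, and the example numbers are all hypothetical.

```python
import math


def robust_posterior(measurements, sigma, lo, hi, outlier_prob=0.1, steps=2000):
    """Posterior mean and 95% credible interval for the true value.

    Each measurement is modelled as an inlier N(mu, sigma^2) with
    probability 1 - outlier_prob, or as an outlier drawn uniformly
    from [lo, hi]. The prior on mu is flat over [lo, hi].
    """
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    uniform = 1.0 / (hi - lo)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    log_post = []
    for mu in grid:
        total = 0.0
        for x in measurements:
            inlier = norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)
            total += math.log((1.0 - outlier_prob) * inlier
                              + outlier_prob * uniform)
        log_post.append(total)
    # Normalise in a numerically stable way.
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean = sum(p * mu for p, mu in zip(probs, grid))
    # Read the 2.5% and 97.5% quantiles off the cumulative distribution.
    cum, lower, upper = 0.0, None, grid[-1]
    found_upper = False
    for p, mu in zip(probs, grid):
        cum += p
        if lower is None and cum >= 0.025:
            lower = mu
        if not found_upper and cum >= 0.975:
            upper, found_upper = mu, True
    return mean, (lower, upper)


# Five consistent measurements and one gross outlier (made-up values, in mm).
values = [50.1, 49.9, 50.0, 50.2, 49.8, 80.0]
estimate, interval = robust_posterior(values, sigma=0.5, lo=0.0, hi=100.0)
plain_mean = sum(values) / len(values)
```

The plain average is dragged toward the outlier, whereas the mixture posterior assigns the stray value to the uniform component and concentrates near the consistent measurements.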