Goto

Collaborating Authors

 vertebra


Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

Neural Information Processing Systems

Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deployment of the LLMs in real-world applications can be challenging due to their high computational requirements and concerns on data privacy.


Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT

Atad, Matan, Marka, Alexander W., Steinhelfer, Lisa, Curto-Vilalta, Anna, Leonhardt, Yannik, Foreman, Sarah C., Dietrich, Anna-Sophia Walburga, Graf, Robert, Gersing, Alexandra S., Menze, Bjoern, Rueckert, Daniel, Kirschke, Jan S., Möller, Hendrik

arXiv.org Artificial Intelligence

Accurate segmentation of vertebral metastasis in CT is clinically important yet difficult to scale, as voxel-level annotations are scarce and both lytic and blastic lesions often resemble benign degenerative changes. We introduce a weakly supervised method trained solely on vertebra-level healthy/malignant labels, without any lesion masks. The method combines a Diffusion Autoencoder (DAE) that produces a classifier-guided healthy edit of each vertebra with pixel-wise difference maps that propose candidate lesion regions. To determine which regions truly reflect malignancy, we introduce Hide-and-Seek Attribution: each candidate is revealed in turn while all others are hidden, the edited image is projected back to the data manifold by the DAE, and a latent-space classifier quantifies the isolated malignant contribution of that component. High-scoring regions form the final lytic or blastic segmentation. On held-out radiologist annotations, we achieve strong blastic/lytic performance despite no mask supervision (F1: 0.91/0.85; Dice: 0.87/0.78), exceeding baselines (F1: 0.79/0.67; Dice: 0.74/0.55). These results show that vertebra-level labels can be transformed into reliable lesion masks, demonstrating that generative editing combined with selective occlusion supports accurate weakly supervised segmentation in CT.


3D Path Planning for Robot-assisted Vertebroplasty from Arbitrary Bi-plane X-ray via Differentiable Rendering

Inigo, Blanca, Killeen, Benjamin D., Choi, Rebecca, Song, Michelle, Uneri, Ali, Khan, Majid, Bailey, Christopher, Krieger, Axel, Unberath, Mathias

arXiv.org Artificial Intelligence

Robotic systems are transforming image-guided interventions by enhancing accuracy and minimizing radiation exposure. A significant challenge in robotic assistance lies in surgical path planning, which often relies on the registration of intraoperative 2D images with preoperative 3D CT scans. This requirement can be burdensome and costly, particularly in procedures like vertebroplasty, where preoperative CT scans are not routinely performed. To address this issue, we introduce a differentiable rendering-based framework for 3D transpedicular path planning utilizing bi-planar 2D X-rays. Our method integrates differentiable rendering with a vertebral atlas generated through a Statistical Shape Model (SSM) and employs a learned similarity loss to refine the SSM shape and pose dynamically, independent of fixed imaging geometries. We evaluated our framework in two stages: first, through vertebral reconstruction from orthogonal X-rays for benchmarking, and second, via clinician-in-the-loop path planning using arbitrary-view X-rays. Our results indicate that our method outperformed a normalized cross-correlation baseline in reconstruction metrics (DICE: 0.75 vs. 0.65) and achieved comparable performance to the state-of-the-art model ReVerteR (DICE: 0.77), while maintaining generalization to arbitrary views. Success rates for bipedicular planning reached 82% with synthetic data and 75% with cadaver data, exceeding the 66% and 31% rates of a 2D-to-3D baseline, respectively. In conclusion, our framework facilitates versatile, CT-free 3D path planning for robot-assisted vertebroplasty, effectively accommodating real-world imaging diversity without the need for preoperative CT scans.


US-X Complete: A Multi-Modal Approach to Anatomical 3D Shape Recovery

Gafencu, Miruna-Alexandra, Velikova, Yordanka, Navab, Nassir, Azampour, Mohammad Farid

arXiv.org Artificial Intelligence

Ultrasound offers a radiation-free, cost-effective solution for real-time visualization of spinal landmarks, paraspinal soft tissues and neurovascular structures, making it valuable for intraoperative guidance during spinal procedures. However, ultrasound suffers from inherent limitations in visualizing complete vertebral anatomy, in particular vertebral bodies, due to acoustic shadowing effects caused by bone. In this work, we present a novel multi-modal deep learning method for completing occluded anatomical structures in 3D ultrasound by leveraging complementary information from a single X-ray image. To enable training, we generate paired training data consisting of: (1) 2D lateral vertebral views that simulate X-ray scans, and (2) 3D partial vertebrae representations that mimic the limited visibility and occlusions encountered during ultrasound spine imaging. Our method integrates morphological information from both imaging modalities and demonstrates significant improvements in vertebral reconstruction (p < 0.001) compared to state of art in 3D ultrasound vertebral completion. We perform phantom studies as an initial step to future clinical translation, and achieve a more accurate, complete volumetric lumbar spine visualization overlayed on the ultrasound scan without the need for registration with preoperative modalities such as computed tomography. This demonstrates that integrating a single X-ray projection mitigates ultrasound's key limitation while preserving its strengths as the primary imaging modality.


VoxTell: Free-Text Promptable Universal 3D Medical Image Segmentation

Rokuss, Maximilian, Langenberg, Moritz, Kirchhoff, Yannick, Isensee, Fabian, Hamm, Benjamin, Ulrich, Constantin, Regnery, Sebastian, Bauer, Lukas, Katsigiannopulos, Efthimios, Norajitra, Tobias, Maier-Hein, Klaus

arXiv.org Artificial Intelligence

We introduce VoxTell, a vision-language model for text-prompted volumetric medical image segmentation. It maps free-form descriptions, from single words to full clinical sentences, to 3D masks. Trained on 62K+ CT, MRI, and PET volumes spanning over 1K anatomical and pathological classes, VoxTell uses multi-stage vision-language fusion across decoder layers to align textual and visual features at multiple scales. It achieves state-of-the-art zero-shot performance across modalities on unseen datasets, excelling on familiar concepts while generalizing to related unseen classes. Extensive experiments further demonstrate strong cross-modality transfer, robustness to linguistic variations and clinical language, as well as accurate instance-specific segmentation from real-world text. Code is available at: https://www.github.com/MIC-DKFZ/VoxTell





A Biomimetic Vertebraic Soft Robotic Tail for High-Speed, High-Force Dynamic Maneuvering

Liu, Sicong, Liu, Jianhui, Chen, Fang, Yang, Wenjian, Yi, Juan, Zheng, Yu, Wang, Zheng, Chi, Wanchao, Song, Chaoyang

arXiv.org Artificial Intelligence

Robotic tails can enhance the stability and maneuverability of mobile robots, but current designs face a trade-off between the power of rigid systems and the safety of soft ones. Rigid tails generate large inertial effects but pose risks in unstructured environments, while soft tails lack sufficient speed and force. We present a Biomimetic Vertebraic Soft Robotic (BVSR) tail that resolves this challenge through a compliant pneumatic body reinforced by a passively jointed vertebral column inspired by musculoskeletal structures. This hybrid design decouples load-bearing and actuation, enabling high-pressure actuation (up to 6 bar) for superior dynamics while preserving compliance. A dedicated kinematic and dynamic model incorporating vertebral constraints is developed and validated experimentally. The BVSR tail achieves angular velocities above 670°/s and generates inertial forces and torques up to 5.58 N and 1.21 Nm, indicating over 200% improvement compared to non-vertebraic designs. Demonstrations on rapid cart stabilization, obstacle negotiation, high-speed steering, and quadruped integration confirm its versatility and practical utility for agile robotic platforms.


RadGS-Reg: Registering Spine CT with Biplanar X-rays via Joint 3D Radiative Gaussians Reconstruction and 3D/3D Registration

Shen, Ao, Fu, Xueming, Jiang, Junfeng, Zeng, Qiang, Tang, Ye, Chen, Zhengming, Nong, Luming, Wang, Feng, Zhou, S. Kevin

arXiv.org Artificial Intelligence

Computed Tomography (CT)/X-ray registration in image-guided navigation remains challenging because of its stringent requirements for high accuracy and real-time performance. Traditional "render and compare" methods, relying on iterative projection and comparison, suffer from spatial information loss and domain gap. 3D reconstruction from biplanar X-rays supplements spatial and shape information for 2D/3D registration, but current methods are limited by dense-view requirements and struggles with noisy X-rays. To address these limitations, we introduce RadGS-Reg, a novel framework for vertebral-level CT/X-ray registration through joint 3D Radiative Gaussians (RadGS) reconstruction and 3D/3D registration. Specifically, our biplanar X-rays vertebral RadGS reconstruction module explores learning-based RadGS reconstruction method with a Counterfactual Attention Learning (CAL) mechanism, focusing on vertebral regions in noisy X-rays. Additionally, a patient-specific pre-training strategy progressively adapts the RadGS-Reg from simulated to real data while simultaneously learning vertebral shape prior knowledge. Experiments on in-house datasets demonstrate the state-of-the-art performance for both tasks, surpassing existing methods. The code is available at: https://github.com/shenao1995/RadGS_Reg.