Goto

Collaborating Authors

 key feature


Advancing Embodied Intelligence in Robotic-Assisted Endovascular Procedures: A Systematic Review of AI Solutions

Yao, Tianliang, Lu, Bo, Kowarschik, Markus, Yuan, Yixuan, Zhao, Hubin, Ourselin, Sebastien, Althoefer, Kaspar, Ge, Junbo, Qi, Peng

arXiv.org Artificial Intelligence

Endovascular procedures have revolutionized vascular disease treatment, yet their manual execution is challenged by the demands for high precision, operator fatigue, and radiation exposure. Robotic systems have emerged as transformative solutions to mitigate these inherent limitations. A pivotal moment has arrived, where a confluence of pressing clinical needs and breakthroughs in AI creates an opportunity for a paradigm shift toward Embodied Intelligence (EI), enabling robots to navigate complex vascular networks and adapt to dynamic physiological conditions. Data-driven approaches, leveraging advanced computer vision, medical image analysis, and machine learning, drive this evolution by enabling real-time vessel segmentation, device tracking, and anatomical landmark detection. Reinforcement learning and imitation learning further enhance navigation strategies and replicate expert techniques. This review systematically analyzes the integration of EI into endovascular robotics, identifying profound systemic challenges such as the heterogeneity in validation standards and the gap between human mimicry and machine-native capabilities. Based on this analysis, a conceptual roadmap is proposed that reframes the ultimate objective away from systems that supplant clinical decision-making. This vision of augmented intelligence, where the clinician's role evolves into that of a high-level supervisor, provides a principled foundation for the future of the field.


key feature of the method is that the measured data in a given direction is not directly used to estimate the denoised

Neural Information Processing Systems

We thank the reviewers for their positive comments on clarity, novelty, and convincing experiments. It was unexpected to us too that such a simple method would work so well. Sparsity in a learned basis is an important approach distinct from our own, and we will mention it. We train one regressor per held-out volume ( 2.2). Pastur (MP) are patch based algorithms, assembling voxels from a patch across volumes into a matrix.


Machine Learning Algorithm for Noise Reduction and Disease-Causing Gene Feature Extraction in Gene Sequencing Data

Si, Weichen, Ou, Yihao, Tian, Zhen

arXiv.org Artificial Intelligence

In this study, we propose a machine learning-based method for noise reduction and disease-causing gene feature extraction in gene sequencing DeepSeqDenoise algorithm combines CNN and RNN to effectively remove the sequencing noise, and improves the signal-to-noise ratio by 9.4 dB. We screened 17 key features by feature engineering, and constructed an integrated learning model to predict disease-causing genes with 94.3% accuracy. We successfully identified 57 new candidate disease-causing genes in a cardiovascular disease cohort validation, and detected 3 missed variants in clinical applications. The method significantly outperforms existing tools and provides strong support for accurate diagnosis of genetic diseases.


Utility Inspired Generalizations of TOPSIS

Susmaga, Robert, Szczech, Izabela

arXiv.org Artificial Intelligence

TOPSIS, a popular method for ranking alternatives is based on aggregated distances to ideal and anti-ideal points. As such, it was considered to be essentially different from widely popular and acknowledged `utility-based methods', which build rankings from weight-averaged utility values. Nonetheless, TOPSIS has recently been shown to be a natural generalization of these `utility-based methods' on the grounds that the distances it uses can be decomposed into so called weight-scaled means (WM) and weight-scaled standard deviations (WSD) of utilities. However, the influence that these two components exert on the final ranking cannot be in any way influenced in the standard TOPSIS. This is why, building on our previous results, in this paper we put forward modifications that make TOPSIS aggregations responsive to WM and WSD, achieving some amount of well interpretable control over how the rankings are influenced by WM and WSD. The modifications constitute a natural generalization of the standard TOPSIS method because, thanks to them, the generalized TOPSIS may turn into the original TOPSIS or, otherwise, following the decision maker's preferences, may trade off WM for WSD or WSD for WM. In the latter case, TOPSIS gradually reduces to a regular `utility-based method'. All in all, we believe that the proposed generalizations constitute an interesting practical tool for influencing the ranking by controlled application of a new form of decision maker's preferences.


CCDP: Composition of Conditional Diffusion Policies with Guided Sampling

Razmjoo, Amirreza, Calinon, Sylvain, Gienger, Michael, Zhang, Fan

arXiv.org Artificial Intelligence

-- Imitation Learning offers a promising approach to learn directly from data without requiring explicit models, simulations, or detailed task definitions. During inference, actions are sampled from the learned distribution and executed on the robot. However, sampled actions may fail for various reasons, and simply repeating the sampling step until a successful action is obtained can be inefficient. In this work, we propose an enhanced sampling strategy that refines the sampling distribution to avoid previously unsuccessful actions. We demonstrate that by solely utilizing data from successful demonstrations, our method can infer recovery actions without the need for additional exploratory behavior or a high-level controller . Furthermore, we leverage the concept of diffusion model decomposition to break down the primary problem--which may require long-horizon history to manage failures--into multiple smaller, more manageable sub-problems in learning, data collection, and inference, thereby enabling the system to adapt to variable failure counts. Our approach yields a low-level controller that dynamically adjusts its sampling space to improve efficiency when prior samples fall short. We validate our method across several tasks, including door opening with unknown directions, object manipulation, and button-searching scenarios, demonstrating that our approach outperforms traditional baselines. Supplementary materials for this paper are available on our website: https://hri-eu.github.io/ccdp/. I. INTRODUCTION Recent advances in imitation learning--exemplified by methods such as Implicit Behavior Cloning [1] and diffusion/flow-matching policies [2], [3]--have demonstrated remarkable capabilities in learning and replicating complex data distributions from demonstrations.


Grasping Partially Occluded Objects Using Autoencoder-Based Point Cloud Inpainting

Koebler, Alexander, Gross, Ralf, Buettner, Florian, Thon, Ingo

arXiv.org Artificial Intelligence

Flexible industrial production systems will play a central role in the future of manufacturing due to higher product individualization and customization. A key component in such systems is the robotic grasping of known or unknown objects in random positions. Real-world applications often come with challenges that might not be considered in grasping solutions tested in simulation or lab settings. Partial occlusion of the target object is the most prominent. Examples of occlusion can be supporting structures in the camera's field of view, sensor imprecision, or parts occluding each other due to the production process. In all these cases, the resulting lack of information leads to shortcomings in calculating grasping points. In this paper, we present an algorithm to reconstruct the missing information. Our inpainting solution facilitates the real-world utilization of robust object matching approaches for grasping point calculation. We demonstrate the benefit of our solution by enabling an existing grasping system embedded in a real-world industrial application to handle occlusions in the input. With our solution, we drastically decrease the number of objects discarded by the process.


LLM-based User Profile Management for Recommender System

Bang, Seunghwan, Song, Hwanjun

arXiv.org Artificial Intelligence

The rapid advancement of Large Language Models (LLMs) has opened new opportunities in recommender systems by enabling zero-shot recommendation without conventional training. Despite their potential, most existing works rely solely on users' purchase histories, leaving significant room for improvement by incorporating user-generated textual data, such as reviews and product descriptions. Addressing this gap, we propose PURE, a novel LLM-based recommendation framework that builds and maintains evolving user profiles by systematically extracting and summarizing key information from user reviews. PURE consists of three core components: a Review Extractor for identifying user preferences and key product features, a Profile Updater for refining and updating user profiles, and a Recommender for generating personalized recommendations using the most current profile. To evaluate PURE, we introduce a continuous sequential recommendation task that reflects real-world scenarios by adding reviews over time and updating predictions incrementally. Our experimental results on Amazon datasets demonstrate that PURE outperforms existing LLM-based methods, effectively leveraging long-term user information while managing token limitations.


Concept Bottleneck Large Language Models

Sun, Chung-En, Oikarinen, Tuomas, Ustun, Berk, Weng, Tsui-Wei

arXiv.org Artificial Intelligence

We introduce the Concept Bottleneck Large Language Model (CB-LLM), a pioneering approach to creating inherently interpretable Large Language Models (LLMs). Unlike traditional black-box LLMs that rely on post-hoc interpretation methods with limited neuron function insights, CB-LLM sets a new standard with its built-in interpretability, scalability, and ability to provide clear, accurate explanations. We investigate two essential tasks in the NLP domain: text classification and text generation. In text classification, CB-LLM narrows the performance gap with traditional black-box models and provides clear interpretability. In text generation, we show how interpretable neurons in CB-LLM can be used for concept detection and steering text generation. Our CB-LLMs enable greater interaction between humans and LLMs across a variety of tasks -- a feature notably absent in existing LLMs. Large Language Models (LLMs) have become instrumental in advancing Natural Language Processing (NLP) tasks.


SADDE: Semi-supervised Anomaly Detection with Dependable Explanations

Yuan, Yachao, Huang, Yu, Yuan, Yali, Wang, Jin

arXiv.org Artificial Intelligence

Semi-supervised learning holds a pivotal position in anomaly detection applications, yet identifying anomaly patterns with a limited number of labeled samples poses a significant challenge. Furthermore, the absence of interpretability poses major obstacles to the practical adoption of semi-supervised frameworks. The majority of existing interpretation techniques are tailored for supervised/unsupervised frameworks or non-security domains, falling short in providing dependable interpretations. In this research paper, we introduce SADDE, a general framework designed to accomplish two primary objectives: (1) to render the anomaly detection process interpretable and enhance the credibility of interpretation outcomes, and (2) to assign high-confidence pseudo labels to unlabeled samples, thereby boosting the performance of anomaly detection systems when supervised data is scarce. To achieve the first objective, we devise a cutting-edge interpretation method that utilizes both global and local interpreters to furnish trustworthy explanations. For the second objective, we conceptualize a novel two-stage semi-supervised learning framework tailored for network anomaly detection, ensuring that the model predictions of both stages align with specific constraints. We apply SADDE to two illustrative network anomaly detection tasks and conduct extensive evaluations in comparison with notable prior works. The experimental findings underscore that SADDE is capable of delivering precise detection results alongside dependable interpretations for semi-supervised network anomaly detection systems. The source code for SADDE is accessible at: https://github.com/M-Code-Space/SADDE.


Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents

Gan, Yuyou, Yang, Yong, Ma, Zhe, He, Ping, Zeng, Rui, Wang, Yiming, Li, Qingming, Zhou, Chunyi, Li, Songze, Wang, Ting, Gao, Yunjun, Wu, Yingcai, Ji, Shouling

arXiv.org Artificial Intelligence

With the continuous development of large language models (LLMs), transformer-based models have made groundbreaking advances in numerous natural language processing (NLP) tasks, leading to the emergence of a series of agents that use LLMs as their control hub. While LLMs have achieved success in various tasks, they face numerous security and privacy threats, which become even more severe in the agent scenarios. To enhance the reliability of LLM-based applications, a range of research has emerged to assess and mitigate these risks from different perspectives. To help researchers gain a comprehensive understanding of various risks, this survey collects and analyzes the different threats faced by these agents. To address the challenges posed by previous taxonomies in handling cross-module and cross-stage threats, we propose a novel taxonomy framework based on the sources and impacts. Additionally, we identify six key features of LLM-based agents, based on which we summarize the current research progress and analyze their limitations. Subsequently, we select four representative agents as case studies to analyze the risks they may face in practical use. Finally, based on the aforementioned analyses, we propose future research directions from the perspectives of data, methodology, and policy, respectively.