AITopics | Information Fusion

Information Fusion

A Robust Approach for LiDAR-Inertial Odometry Without Sensor-Specific Modeling

Malladi, Meher V. R., Guadagnino, Tiziano, Lobefaro, Luca, Stachniss, Cyrill

arXiv.org Artificial Intelligence, Sep-9-2025

Figure 1: Our robust LiDAR-inertial odometry system is directly operational in different environments, sensor configurations, and robotic platforms with distinct motion behaviours, all without any change in configuration or modeling approach. We depict the local map result of our odometry system in four distinct scenarios, shown clockwise from the top left: urban city with Ouster OS1-128 and built-in InvenSense IMU mounted on a car; mixed indoor-outdoor university buildings with Hesai QT64 and Alphasense IMU on a backpack (data from Tao et al. [31]); forest with Hesai XT32 and Xsens MTi-100 IMU mounted on the SAHA tree-harvesting machine (see Jelavic et al. [14]); and parking lot with Velodyne VLP-16 and onboard IMU on a Unitree Go1 quadruped (data from Ou et al. [25]).

Abstract: Accurate odometry is a critical component in a robotic navigation stack, and subsequent modules such as planning and control often rely on an estimate of the robot's motion. Sensor-based odometry approaches should be robust across sensor types and deployable in different target domains, from solid-state LiDARs mounted on cars in urban-driving scenarios to spinning LiDARs on handheld packages used in unstructured natural environments. In this paper, we propose a robust LiDAR-inertial odometry system that does not rely on sensor-specific modeling. Sensor fusion techniques for LiDAR and inertial measurement unit (IMU) data typically integrate IMU data iteratively in a Kalman filter or use pre-integration in a factor graph framework, combined with LiDAR scan matching often exploiting some form of feature extraction. We propose an alternative strategy that only requires a simplified motion model for IMU integration and directly registers LiDAR scans in a scan-to-map approach. Our approach allows us to impose a novel regularization on the LiDAR registration, improving the overall odometry performance. We detail extensive experiments on a number of datasets covering a wide array of commonly used robotic sensors and platforms. We show that our approach works with the exact same configuration in all these scenarios, demonstrating its robustness.

artificial intelligence, dataset, information fusion, (16 more...)

arXiv.org Artificial Intelligence

2509.06593

Country: Europe (0.68)

Genre: Research Report (1.00)

Industry:

Energy (0.48)
Transportation (0.46)
Information Technology (0.34)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)

Spoken in Jest, Detected in Earnest: A Systematic Review of Sarcasm Recognition -- Multimodal Fusion, Challenges, and Future Prospects

Gao, Xiyuan, Nayak, Shekhar, Coler, Matt

arXiv.org Artificial Intelligence, Sep-8-2025

Sarcasm, a common feature of human communication, poses challenges in interpersonal interactions and human-machine interactions. Linguistic research has highlighted the importance of prosodic cues, such as variations in pitch, speaking rate, and intonation, in conveying sarcastic intent. Although previous work has focused on text-based sarcasm detection, the role of speech data in recognizing sarcasm has been underexplored. Recent advancements in speech technology emphasize the growing importance of leveraging speech data for automatic sarcasm recognition, which can enhance social interactions for individuals with neurodegenerative conditions and improve machine understanding of complex human language use, leading to more nuanced interactions. This systematic review is the first to focus on speech-based sarcasm recognition, charting the evolution from unimodal to multimodal approaches. It covers datasets, feature extraction, and classification methods, and aims to bridge gaps across diverse research domains. The findings include limitations in datasets for sarcasm recognition in speech, the evolution of feature extraction techniques from traditional acoustic features to deep learning-based representations, and the progression of classification methods from unimodal approaches to multimodal fusion techniques. In so doing, we identify the need for greater emphasis on cross-cultural and multilingual sarcasm recognition, as well as the importance of addressing sarcasm as a multimodal phenomenon, rather than a text-based challenge.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.04605

Country:

Europe (1.00)
Asia > India (0.93)
North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

On Transferring, Merging, and Splitting Task-Oriented Network Digital Twins

Zhang, Zifan, Fang, Minghong, Chen, Mingzhe, Liu, Yuchen

arXiv.org Artificial Intelligence, Sep-3-2025

The integration of digital twinning technologies is driving next-generation networks toward new capabilities, allowing operators to thoroughly understand network conditions, efficiently analyze valuable radio data, and innovate applications through user-friendly, immersive interfaces. Building on this foundation, network digital twins (NDTs) accurately depict the operational processes and attributes of network infrastructures, facilitating predictive management through real-time analysis and measurement. However, constructing precise NDTs poses challenges, such as integrating diverse data sources, mapping necessary attributes from physical networks, and maintaining scalability for various downstream tasks. Unlike previous works that focused on the creation and mapping of NDTs from scratch, we explore intra- and inter-operations among NDTs within a Unified Twin Transformation (UTT) framework, which uncovers a new computing paradigm for efficient transfer, merging, and splitting of NDTs to create task-oriented twins. By leveraging joint multi-modal and distributed mapping mechanisms, UTT optimizes resource utilization and reduces the cost of creating NDTs, while ensuring twin model consistency. A theoretical analysis of the distributed mapping problem is conducted to establish convergence bounds for this multi-modal gated aggregation process. Evaluations on real-world twin-assisted applications, such as trajectory reconstruction, human localization, and sensory data generation, demonstrate the feasibility and effectiveness of interoperability among NDTs for corresponding task development.

In the domain of telecommunications, wireless networks are experiencing a paradigmatic evolution, driven by the integration of advanced technologies such as edge computing [1], millimeter-wave communication [2], and machine learning [3]. These technologies are instrumental in laying groundwork for an array of novel applications and services in mixed physical and digital contexts, boosting capabilities of mobile broadband and enabling thorough integration of cyber-physical interactive systems.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.02551

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Telecommunications (0.88)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
(2 more...)

Detecting Rug Pulls in Decentralized Exchanges: Machine Learning Evidence from the TON Blockchain

Yaremus, Dmitry, Li, Jianghai, Kalacheva, Alisa, Vodolazov, Igor, Yanovich, Yury

arXiv.org Artificial Intelligence, Sep-3-2025

This paper presents a machine learning framework for the early detection of rug pull scams on decentralized exchanges (DEXs) within The Open Network (TON) blockchain. TON's unique architecture, characterized by asynchronous execution and a massive web2 user base from Telegram, presents a novel and critical environment for fraud analysis. We conduct a comprehensive study on the two largest TON DEXs, Ston.Fi and DeDust, fusing data from both platforms to train our models. A key contribution is the implementation and comparative analysis of two distinct rug pull definitions, TVL-based (a catastrophic liquidity withdrawal) and idle-based (a sudden cessation of all trading activity), within a single, unified study. We demonstrate that Gradient Boosting models can effectively identify rug pulls within the first five minutes of trading, with the TVL-based method achieving superior AUC (up to 0.891) while the idle-based method excels at recall. Our analysis reveals that while feature sets are consistent across exchanges, their underlying distributions differ significantly, challenging straightforward data fusion and highlighting the need for robust, platform-aware models. This work provides a crucial early-warning mechanism for investors and enhances the security infrastructure of the rapidly growing TON DeFi ecosystem.

Introduction: The Open Network [1] was originally conceived and developed by Telegram, and is now independently operated by the TON Foundation. It is a high-performance decentralized platform designed to support large-scale decentralized applications (DApps) [2] and smart contracts [3].
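
The two rug pull definitions named in this abstract (TVL-based and idle-based) can be illustrated with a minimal labeling sketch. The abstract does not give thresholds, so the function names, the 90% drop fraction, and the 24-hour idle window below are hypothetical choices for illustration only, not the authors' settings.

```python
def label_tvl_rug_pull(tvl_series, drop_fraction=0.9):
    """TVL-based label (hypothetical threshold): True if total value locked
    ever falls by at least `drop_fraction` from its running peak."""
    peak = 0.0
    for tvl in tvl_series:
        peak = max(peak, tvl)
        if peak > 0 and tvl <= peak * (1.0 - drop_fraction):
            return True
    return False


def label_idle_rug_pull(trade_timestamps, observation_end, idle_seconds=86_400):
    """Idle-based label (hypothetical window): True if no trade occurred in
    the last `idle_seconds` before the end of the observation period."""
    if not trade_timestamps:
        return False  # a pool that never traded is not labeled here
    return observation_end - max(trade_timestamps) >= idle_seconds


# Liquidity collapses from a peak of 120 to 5, so the TVL rule fires.
print(label_tvl_rug_pull([100.0, 120.0, 5.0]))
# Last trade at t=200, observation ends a full day later, so the idle rule fires.
print(label_idle_rug_pull([0, 100, 200], observation_end=200 + 86_400))
```

Because the two rules look at different signals (liquidity versus trading activity), the same pool can receive different labels under each, which is consistent with the abstract reporting different AUC and recall for the two definitions.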

artificial intelligence, machine learning, rug pull, (18 more...)

arXiv.org Artificial Intelligence

2509.01168

Country: Europe > Russia (0.15)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Data-driven Discovery of Digital Twins in Biomedical Research

Métayer, Clémence, Ballesta, Annabelle, Martinelli, Julien

arXiv.org Artificial Intelligence, Sep-3-2025

Recent technological advances have expanded the availability of high-throughput biological datasets, enabling the reliable design of digital twins of biomedical systems or patients. Such computational tools represent key reaction networks driving perturbation or drug response and can guide drug discovery and personalized therapeutics. Yet, their development still relies on laborious data integration by the human modeler, so that automated approaches are critically needed. The success of data-driven system discovery in Physics, rooted in clean datasets and well-defined governing laws, has fueled interest in applying similar techniques in Biology, which presents unique challenges. Here, we review methodologies for automatically inferring digital twins from biological time series, which mostly involve symbolic or sparse regression. We evaluate algorithms according to eight biological and methodological challenges, associated with noisy/incomplete data, multiple conditions, prior knowledge integration, latent variables, high dimensionality, unobserved variable derivatives, candidate library design, and uncertainty quantification. On these criteria, sparse regression generally outperformed symbolic regression, particularly when using Bayesian frameworks. We further highlight the emerging role of deep learning and large language models, which enable innovative prior knowledge integration, though the reliability and consistency of such approaches must be improved. While no single method addresses all challenges, we argue that progress in learning digital twins will come from hybrid and modular frameworks combining chemical reaction network-based mechanistic grounding, Bayesian uncertainty quantification, and the generative and knowledge integration capacities of deep learning. To support their development, we further propose a benchmarking framework to evaluate methods across all challenges.

large language model, machine learning, regression, (20 more...)

arXiv.org Artificial Intelligence

2508.21484

Country:

Europe (0.67)
North America > United States (0.46)

Genre:

Research Report (0.81)
Workflow (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(5 more...)

MoE-Health: A Mixture of Experts Framework for Robust Multimodal Healthcare Prediction

Wang, Xiaoyang, Yang, Christopher C.

arXiv.org Artificial Intelligence, Sep-1-2025

Healthcare systems generate diverse multimodal data, including Electronic Health Records (EHR), clinical notes, and medical images. Effectively leveraging this data for clinical prediction is challenging, particularly as real-world samples often present with varied or incomplete modalities. Existing approaches typically require complete modality data or rely on manual selection strategies, limiting their applicability in real-world clinical settings where data availability varies across patients and institutions. To address these limitations, we propose MoE-Health, a novel Mixture of Experts framework designed for robust multimodal fusion in healthcare prediction. The MoE-Health architecture is specifically developed to handle samples with differing modalities and improve performance on critical clinical tasks. By leveraging specialized expert networks and a dynamic gating mechanism, our approach selects and combines relevant experts based on available data modalities, enabling flexible adaptation to varying data availability scenarios. We evaluate MoE-Health on the MIMIC-IV dataset across three critical clinical prediction tasks: in-hospital mortality prediction, long length of stay, and hospital readmission prediction. Experimental results demonstrate that MoE-Health achieves superior performance compared to existing multimodal fusion methods while maintaining robustness across different modality availability patterns. The framework effectively integrates multimodal information, offering improved predictive performance and robustness in handling heterogeneous and incomplete healthcare data, making it particularly suitable for deployment in diverse healthcare environments.
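
The idea of gating over only the modalities that are present can be sketched generically as a masked softmax over expert outputs. This is a minimal illustration of that general technique, not the MoE-Health architecture itself: the function name, the scalar expert outputs, and the fixed gate logits are all assumptions made for the example.

```python
import math


def gated_fusion(expert_outputs, gate_logits):
    """Combine per-modality expert predictions with a softmax gate that is
    restricted to the modalities actually available for this sample.

    expert_outputs: dict of modality name -> prediction, or None if missing.
    gate_logits:    dict of modality name -> unnormalized gate score.
    Returns (fused_prediction, weights_over_available_modalities).
    """
    available = [m for m, out in expert_outputs.items() if out is not None]
    if not available:
        raise ValueError("at least one modality must be available")

    # Numerically stable softmax over available modalities only.
    max_logit = max(gate_logits[m] for m in available)
    exps = {m: math.exp(gate_logits[m] - max_logit) for m in available}
    total = sum(exps.values())
    weights = {m: exps[m] / total for m in available}

    fused = sum(weights[m] * expert_outputs[m] for m in available)
    return fused, weights


# The imaging modality is missing, so its expert receives no weight at all.
fused, weights = gated_fusion(
    {"ehr": 0.8, "notes": 0.4, "image": None},
    {"ehr": 0.0, "notes": 0.0, "image": 2.0},
)
print(fused, weights)
```

Masking before normalization means the remaining weights still sum to one, so a sample with fewer modalities is handled without imputing the missing input.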

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2508.21793

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Mining (0.89)
Information Technology > Information Management (0.88)

COMETH: Convex Optimization for Multiview Estimation and Tracking of Humans

Martini, Enrico, Choi, Ho Jin, Figueroa, Nadia, Bombieri, Nicola

arXiv.org Artificial Intelligence, Aug-29-2025

In the era of Industry 5.0, monitoring human activity is essential for ensuring both ergonomic safety and overall well-being. While multi-camera centralized setups improve pose estimation accuracy, they often suffer from high computational costs and bandwidth requirements, limiting scalability and real-time applicability. Distributing processing across edge devices can reduce network bandwidth and computational load. On the other hand, the constrained resources of edge devices lead to accuracy degradation, and the distribution of computation leads to temporal and spatial inconsistencies. We address this challenge by proposing COMETH (Convex Optimization for Multiview Estimation and Tracking of Humans), a lightweight algorithm for real-time multi-view human pose fusion that relies on three concepts: it integrates kinematic and biomechanical constraints to increase the joint positioning accuracy; it employs convex optimization-based inverse kinematics for spatial fusion; and it implements a state observer to improve temporal consistency. We evaluate COMETH on both public and industrial datasets, where it outperforms state-of-the-art methods in localization, detection, and tracking accuracy. The proposed fusion pipeline enables accurate and scalable human motion tracking, making it well-suited for industrial and safety-critical applications. The code is publicly available at https://github.com/PARCO-LAB/COMETH.

artificial intelligence, information fusion, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.2092

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine (0.47)
Information Technology (0.46)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(3 more...)

MedVQA-TREE: A Multimodal Reasoning and Retrieval Framework for Sarcopenia Prediction

Moradbeiki, Pardis, Ghadiri, Nasser, Zahabi, Sayed Jalal, Wiil, Uffe Kock, Brockhattingen, Kristoffer Kittelmann, Ebrahimi, Ali

arXiv.org Artificial Intelligence, Aug-28-2025

Accurate sarcopenia diagnosis via ultrasound remains challenging due to subtle imaging cues, limited labeled data, and the absence of clinical context in most models. We propose MedVQA-TREE, a multimodal framework that integrates a hierarchical image interpretation module, a gated feature-level fusion mechanism, and a novel multi-hop, multi-query retrieval strategy. The vision module includes anatomical classification, region segmentation, and graph-based spatial reasoning to capture coarse, mid-level, and fine-grained structures. A gated fusion mechanism selectively integrates visual features with textual queries, while clinical knowledge is retrieved through a UMLS-guided pipeline accessing PubMed and a sarcopenia-specific external knowledge base. MedVQA-TREE was trained and evaluated on two public MedVQA datasets (VQA-RAD and PathVQA) and a custom sarcopenia ultrasound dataset. The model achieved up to 99% diagnostic accuracy and outperformed previous state-of-the-art methods by over 10%. These results underscore the benefit of combining structured visual understanding with guided knowledge retrieval for effective AI-assisted diagnosis in sarcopenia.

large language model, machine learning, natural language, (24 more...)

arXiv.org Artificial Intelligence

2508.19319

Country:

Europe > Denmark > Southern Denmark (0.04)
North America > United States > Maryland > Montgomery County > Bethesda (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
Asia > Middle East > Iran > Isfahan Province > Isfahan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Information Technology (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.92)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
(6 more...)

Towards a Spatiotemporal Fusion Approach to Precipitation Nowcasting

Curcio, Felipe, Castro, Pedro, Fonseca, Augusto, Castro, Rafaela, Franco, Raquel, Ogasawara, Eduardo, Stepanenko, Victor, Porto, Fabio, Ferro, Mariza, Bezerra, Eduardo

arXiv.org Artificial Intelligence, Aug-28-2025

With the increasing availability of meteorological data from various sensors, numerical models and reanalysis products, the need for efficient data integration methods has become paramount for improving weather forecasts and hydrometeorological studies. In this work, we propose a data fusion approach for precipitation nowcasting by integrating data from meteorological and rain gauge stations in the Rio de Janeiro metropolitan area with ERA5 reanalysis data and GFS numerical weather prediction. We employ the spatiotemporal deep learning architecture called STConvS2S, leveraging a structured dataset covering a 9 x 11 grid. The study spans from January 2011 to October 2024, and we evaluate the impact of integrating three surface station systems. Among the tested configurations, the fusion-based model achieves an F1-score of 0.2033 for forecasting heavy precipitation events (greater than 25 mm/h) at a one-hour lead time. Additionally, we present an ablation study to assess the contribution of each station network and propose a refined inference strategy for precipitation nowcasting, integrating the GFS numerical weather prediction (NWP) data with in-situ observations.

Precipitation nowcasting (or very short-range forecasting [1]) involves predicting rainfall within a six-hour lead time. Objective analysis techniques are then employed to synthesize these disparate measurements into a coherent, gridded spatial map for precipitation nowcasting [16]. Accurate precipitation forecasting is critical for mitigating natural disasters, such as floods, landslides, and droughts, and supports informed decision-making across sectors including agriculture, transportation, energy, and public health [3]. Recent advancements in machine learning, particularly deep learning, have demonstrated significant potential in geoscientific applications, including precipitation nowcasting.
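
The F1-score quoted above for heavy precipitation events (greater than 25 mm/h) is the standard harmonic mean of precision and recall on the thresholded event class. A minimal sketch of that computation follows; the function name and the sample values are illustrative, and how the authors aggregate over grid cells and time steps is not stated in this abstract.

```python
def f1_heavy_rain(observed_mm_h, predicted_mm_h, threshold=25.0):
    """F1-score for the binary event 'rain rate exceeds `threshold` mm/h'.

    F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there are no events
    and no predicted events.
    """
    tp = fp = fn = 0
    for obs, pred in zip(observed_mm_h, predicted_mm_h):
        event = obs > threshold
        forecast = pred > threshold
        if event and forecast:
            tp += 1
        elif forecast:
            fp += 1
        elif event:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# One hit, one false alarm, one miss, one correct negative.
observed = [30.0, 10.0, 40.0, 5.0]
predicted = [28.0, 26.0, 10.0, 3.0]
print(f1_heavy_rain(observed, predicted))  # 0.5
```

Correct negatives do not enter the formula, which is why F1 is a common choice for rare events such as heavy rain, where accuracy would be dominated by dry hours.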

artificial intelligence, machine learning, precipitation, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.23919/FUSION65864.2025.11123942

2505.19258

Country:

North America > United States (0.68)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.26)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

UNIFORM: Unifying Knowledge from Large-scale and Diverse Pre-trained Models

Wang, Yimu, Zhuang, Weiming, Chen, Chen, Huang, Jiabo, Li, Jingtao, Lyu, Lingjuan

arXiv.org Artificial Intelligence, Aug-28-2025

In the era of deep learning, the increasing number of pre-trained models available online presents a wealth of knowledge. These models, developed with diverse architectures and trained on varied datasets for different tasks, provide unique interpretations of the real world. Their collective consensus is likely universal and generalizable to unseen data. However, effectively harnessing this collective knowledge poses a fundamental challenge due to the heterogeneity of pre-trained models. Existing knowledge integration solutions typically rely on strong assumptions about training data distributions and network architectures, limiting them to learning only from specific types of models and resulting in data and/or inductive biases. In this work, we introduce a novel framework, namely UNIFORM, for knowledge transfer from a diverse set of off-the-shelf models into one student model without such constraints. Specifically, we propose a dedicated voting mechanism to capture the consensus of knowledge both at the logit level -- incorporating teacher models that are capable of predicting target classes of interest -- and at the feature level, utilizing visual representations learned on arbitrary label spaces. Extensive experiments demonstrate that UNIFORM effectively enhances unsupervised object recognition performance compared to strong knowledge transfer baselines. Notably, it exhibits remarkable scalability by benefiting from over one hundred teachers, while existing methods saturate at a much smaller scale.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.19498

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.93)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(2 more...)
