AITopics | Information Fusion

Information Fusion

A Methodological and Structural Review of Parkinsons Disease Detection Across Diverse Data Modalities

Miah, Abu Saleh Musa, Suzuki, Taro, Shin, Jungpil

arXiv.org Artificial Intelligence May-2-2025

Parkinson's Disease (PD) is a progressive neurological disorder that primarily affects motor functions and can lead to mild cognitive impairment (MCI) and dementia in its advanced stages. With approximately 10 million people diagnosed globally (1 to 1.8 per 1,000 individuals, according to reports by the Japan Times and the Parkinson Foundation), early and accurate diagnosis of PD is crucial for improving patient outcomes. While numerous studies have utilized machine learning (ML) and deep learning (DL) techniques for PD recognition, existing surveys are limited in scope, often focusing on single data modalities and failing to capture the potential of multimodal approaches. To address these gaps, this study presents a comprehensive review of PD recognition systems across diverse data modalities, including Magnetic Resonance Imaging (MRI), gait-based pose analysis, gait sensory data, handwriting analysis, speech test data, Electroencephalography (EEG), and multimodal fusion techniques. Based on over 347 articles from leading scientific databases, this review examines key aspects such as data collection methods, settings, feature representations, and system performance, with a focus on recognition accuracy and robustness. This survey aims to serve as a comprehensive resource for researchers, providing actionable guidance for the development of next-generation PD recognition systems. By leveraging diverse data modalities and cutting-edge machine learning paradigms, this work contributes to advancing the state of PD diagnostics and improving patient care through innovative, multimodal approaches.

evolutionary algorithm, machine learning, pd recognition system, (17 more...)

arXiv.org Artificial Intelligence

2505.00525

Country:

North America > United States (1.00)
Asia > Middle East (1.00)
Asia > Japan > Honshū (0.27)
Europe > United Kingdom > England (0.27)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Decentralized Fusion of 3D Extended Object Tracking based on a B-Spline Shape Model

Han, Longfei, Kefferpütz, Klaus, Beyerer, Jürgen

arXiv.org Artificial Intelligence Apr-29-2025

Extended Object Tracking (EOT) exploits the high resolution of modern sensors for detailed environmental perception. Combined with decentralized fusion, it contributes to a more scalable and robust perception system. This paper investigates the decentralized fusion of 3D EOT using a B-spline curve based model. The spline curve is used to represent the side-view profile, which is then extruded with a width to form a 3D shape. We use covariance intersection (CI) for the decentralized fusion and discuss the challenge of applying it to EOT. We further evaluate the tracking result of the decentralized fusion with simulated and real datasets of traffic scenarios. We show that the CI-based fusion can significantly improve the tracking performance for sensors with unfavorable perspective.
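The covariance intersection step mentioned in this abstract can be illustrated with a minimal sketch. It fuses two 2-D estimates whose cross-correlation is unknown by forming a convex combination of their inverse covariances. The helper names (`inv2`, `ci_fuse`), the grid search over the weight, and the trace criterion are assumptions made for illustration; the abstract does not say how the paper selects the weight or how CI is adapted to the B-spline shape state.

```python
# Covariance intersection (CI) for two 2-D estimates, pure-Python sketch.
# inv2 and ci_fuse are hypothetical helper names, not taken from the paper.

def inv2(m):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix")
    return [[d / det, -b / det], [-c / det, a / det]]


def ci_fuse(x1, P1, x2, P2, steps=100):
    """Fuse (x1, P1) and (x2, P2) with CI.

    The weight w in [0, 1] is chosen by grid search so that the trace of
    the fused covariance is smallest (an assumed, common criterion).
    Returns (w, fused_mean, fused_covariance).
    """
    I1, I2 = inv2(P1), inv2(P2)
    best = None
    for k in range(steps + 1):
        w = k / steps
        # Fused information matrix: w * P1^-1 + (1 - w) * P2^-1
        info = [[w * I1[i][j] + (1 - w) * I2[i][j] for j in range(2)]
                for i in range(2)]
        P = inv2(info)
        trace = P[0][0] + P[1][1]
        if best is None or trace < best[0]:
            # Fused information vector: w * P1^-1 x1 + (1 - w) * P2^-1 x2
            v = [sum(w * I1[i][j] * x1[j] + (1 - w) * I2[i][j] * x2[j]
                     for j in range(2))
                 for i in range(2)]
            # Fused mean: P * v
            x = [sum(P[i][j] * v[j] for j in range(2)) for i in range(2)]
            best = (trace, w, x, P)
    return best[1], best[2], best[3]


w, x, P = ci_fuse([0.0, 0.0], [[1.0, 0.0], [0.0, 4.0]],
                  [2.0, 2.0], [[4.0, 0.0], [0.0, 1.0]])
print(w, x, P)
```

In this symmetric example each sensor is accurate along a different axis, so the search settles on w = 0.5 and the fused mean is pulled toward whichever sensor is more certain in each coordinate.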

artificial intelligence, information fusion, international conference on information fusion, (11 more...)

arXiv.org Artificial Intelligence

2504.18708

Country: Europe > Germany (0.31)

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Robots (0.47)
Information Technology > Artificial Intelligence > Vision (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.31)

Implementation Analysis of Collaborative Robot Digital Twins in Physics Engines

König, Christian, Petershans, Jan, Herbst, Jan, Rüb, Matthias, Krummacker, Dennis, Mittag, Eric, Schotten, Hans D.

arXiv.org Artificial Intelligence Apr-29-2025

This paper presents a Digital Twin (DT) of a 6G communications system testbed that integrates two robotic manipulators with a high-precision optical infrared tracking system in Unreal Engine 5. Practical details of the setup and implementation insights provide valuable guidance for users aiming to replicate such systems, an endeavor that is crucial to advancing DT applications within the scientific community. Key topics discussed include video streaming, integration within the Robot Operating System 2 (ROS 2), and bidirectional communication. The insights provided are intended to support the development and deployment of DTs in robotics and automation research.

application, artificial intelligence, information fusion, (16 more...)

arXiv.org Artificial Intelligence

2504.182

Country: Europe > Germany (0.28)

Genre: Research Report (0.82)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.34)

Rethinking Few-Shot Image Fusion: Granular Ball Priors Enable General-Purpose Deep Fusion

Deng, Minjie, Wei, Yan, Zhai, Hao, Wu, An, Ouyang, Yuncan, Peng, Qianyao

arXiv.org Machine Learning Apr-25-2025

In image fusion tasks, the absence of real fused images as priors presents a fundamental challenge. Most deep learning-based fusion methods rely on large-scale paired datasets to extract global weighting features from raw images, thereby generating fused outputs that approximate real fused images. In contrast to previous studies, this paper explores few-shot training of neural networks under the condition of having prior knowledge. We propose a novel fusion framework named GBFF, and a Granular Ball Significant Extraction algorithm specifically designed for the few-shot prior setting. All pixel pairs involved in the fusion process are initially modeled as a Coarse-Grained Granular Ball. At the local level, Fine-Grained Granular Balls are used to slide through the brightness space to extract Non-Salient Pixel Pairs, and perform splitting operations to obtain Salient Pixel Pairs. Pixel-wise weights are then computed to generate a pseudo-supervised image. At the global level, pixel pairs with significant contributions to the fusion process are categorized into the Positive Region, while those whose contributions cannot be accurately determined are assigned to the Boundary Region. The Granular Ball performs modality-aware adaptation based on the proportion of the positive region, thereby adjusting the neural network's loss function and enabling it to complement the information of the boundary region. Extensive experiments demonstrate the effectiveness of both the proposed algorithm and the underlying theory. Compared with state-of-the-art (SOTA) methods, our approach shows strong competitiveness in terms of both fusion time and image expressiveness. Our code is publicly available at:

artificial intelligence, information fusion, machine learning, (15 more...)

arXiv.org Machine Learning

2504.08937

Genre:

Overview (0.67)
Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.69)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

STFM: A Spatio-Temporal Information Fusion Model Based on Phase Space Reconstruction for Sea Surface Temperature Prediction

Wang, Yin, Gong, Chunlin, Wu, Xiang, Zhang, Hanleran

arXiv.org Artificial Intelligence Apr-25-2025

The sea surface temperature (SST), a key environmental parameter, is crucial to optimizing production planning, making its accurate prediction a vital research topic. However, the inherent nonlinearity of the marine dynamic system presents significant challenges. Current forecasting methods mainly include physics-based numerical simulations and data-driven machine learning approaches. The former, while describing SST evolution through differential equations, suffers from high computational complexity and limited applicability, whereas the latter, despite its computational benefits, requires large datasets and faces interpretability challenges. This study presents a prediction framework based solely on data-driven techniques. Using phase space reconstruction, we construct initial-delay attractor pairs with a mathematical homeomorphism and design a Spatio-Temporal Fusion Mapping (STFM) to uncover their intrinsic connections. Unlike conventional models, our method captures SST dynamics efficiently through phase space reconstruction and achieves high prediction accuracy with minimal training data in comparative tests.
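Phase space reconstruction is commonly done with a delay-coordinate embedding, which turns a scalar time series into vectors of lagged samples. The sketch below shows only that standard step. The function name `delay_embed` and the parameters `dim` and `tau` are illustrative assumptions; the abstract does not specify how the initial-delay attractor pairs or the STFM mapping are built.

```python
# Delay-coordinate embedding of a scalar series (standard technique).
# delay_embed is a hypothetical helper name, not taken from the paper.

def delay_embed(series, dim, tau):
    """Return delay vectors [x[t], x[t + tau], ..., x[t + (dim - 1) * tau]]."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n = len(series) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[t + k * tau] for k in range(dim)] for t in range(n)]


# Toy stand-in for an SST record: six consecutive samples.
vectors = delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2)
print(vectors)  # [[0, 2, 4], [1, 3, 5]]
```

Each row is one point on the reconstructed trajectory. In practice the embedding dimension and delay would have to be chosen from the data, which this sketch leaves to the caller.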

artificial intelligence, machine learning, spatio-temporal information fusion model, (10 more...)

arXiv.org Artificial Intelligence

2504.1697

Country: Asia > China (0.68)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

4D Multimodal Co-attention Fusion Network with Latent Contrastive Alignment for Alzheimer's Diagnosis

Wei, Yuxiang, Zhang, Yanteng, Xiao, Xi, Wang, Tianyang, Wang, Xiao, Calhoun, Vince D.

arXiv.org Artificial Intelligence Apr-24-2025

Multimodal neuroimaging provides complementary structural and functional insights into both human brain organization and disease-related dynamics. Recent studies demonstrate enhanced diagnostic sensitivity for Alzheimer's disease (AD) through synergistic integration of neuroimaging data (e.g., sMRI, fMRI) with behavioral cognitive scores tabular data biomarkers. However, the intrinsic heterogeneity across modalities (e.g., 4D spatiotemporal fMRI dynamics vs. 3D anatomical sMRI structure) presents critical challenges for discriminative feature fusion. To bridge this gap, we propose M2M-AlignNet: a geometry-aware multimodal co-attention network with latent alignment for early AD diagnosis using sMRI and fMRI. At the core of our approach is a multi-patch-to-multi-patch (M2M) contrastive loss function that quantifies and reduces representational discrepancies via geometry-weighted patch correspondence, explicitly aligning fMRI components across brain regions with their sMRI structural substrates without one-to-one constraints. Additionally, we propose a latent-as-query co-attention module to autonomously discover fusion patterns, circumventing modality prioritization biases while minimizing feature redundancy. We conduct extensive experiments to confirm the effectiveness of our method and highlight the correspondence between fMRI and sMRI as AD biomarkers.

artificial intelligence, machine learning, modality, (15 more...)

arXiv.org Artificial Intelligence

2504.16798

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Multi-Modal Fusion of In-Situ Video Data and Process Parameters for Online Forecasting of Cookie Drying Readiness

Li, Shichen, Shao, Chenhui

arXiv.org Artificial Intelligence Apr-23-2025

Food drying is essential for food production, extending shelf life, and reducing transportation costs. Accurate real-time forecasting of drying readiness is crucial for minimizing energy consumption, improving productivity, and ensuring product quality. However, this remains challenging due to the dynamic nature of drying, limited data availability, and the lack of effective predictive analytical methods. To address this gap, we propose an end-to-end multi-modal data fusion framework that integrates in-situ video data with process parameters for real-time food drying readiness forecasting. Our approach leverages a new encoder-decoder architecture with modality-specific encoders and a transformer-based decoder to effectively extract features while preserving the unique structure of each modality. We apply our approach to sugar cookie drying, where time-to-ready is predicted at each timestamp. Experimental results demonstrate that our model achieves an average prediction error of only 15 seconds, outperforming state-of-the-art data fusion methods by 65.69% and a video-only model by 11.30%. The proposed model is extensible to various other industrial modality fusion tasks for online decision-making.

Introduction

Drying is a fundamental process in the food industry that plays a critical role in both food production and preservation. By removing moisture, it transforms raw ingredients into their final, consumable forms while enhancing texture, flavor, and structural integrity [1]. However, food drying is a highly time- and energy-intensive process that accounts for 15% of energy consumption in U.S. industrial processes [2]. As a result, advancing drying technologies and improving product quality are key strategies for minimizing waste and enhancing energy efficiency [3].

artificial intelligence, information fusion, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2504.15599

Country:

North America > United States > Illinois (0.28)
North America > United States > Michigan (0.28)

Genre: Research Report > New Finding (0.88)

Industry:

Food & Agriculture (1.00)
Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.94)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Enhanced UAV Navigation Systems through Sensor Fusion with Trident Quaternions

Incicco, Sebastian, Giribet, Juan Ignacio, Colombo, Leonardo

arXiv.org Artificial Intelligence Apr-22-2025

Integrated Navigation (IN) techniques have emerged as a promising solution by combining multiple sensor measurements, such as those obtained from Inertial Measurement Units (IMU), Global Navigation Satellite Systems (GNSS), and vision-based sensors. IN approaches offer significant advantages, including robustness, improved accuracy, and the ability to overcome the limitations of individual sensors. Among the various mathematical tools employed in IN, quaternions have garnered considerable attention for estimating a vehicle's attitude (orientation). Quaternions provide an elegant and compact representation of orientation, avoiding the limitations of traditional Euler angles, such as singularities and ambiguity.
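As a rough illustration of why quaternions are convenient for attitude, the sketch below builds unit quaternions from an axis and angle, composes them with the Hamilton product, and rotates a vector. It uses ordinary unit quaternions only. The trident quaternions named in the title are not described in this excerpt and are not implemented here, and the function names are illustrative assumptions.

```python
import math

# Ordinary unit-quaternion attitude helpers (w, x, y, z), stdlib only.
# Function names are hypothetical and not taken from the paper.

def quat_from_axis_angle(axis, angle):
    """Unit quaternion for a rotation of `angle` radians about `axis`."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0:
        raise ValueError("axis must be non-zero")
    s = math.sin(angle / 2) / norm
    return (math.cos(angle / 2), ax * s, ay * s, az * s)


def quat_mul(p, q):
    """Hamilton product p * q."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def rotate(q, v):
    """Rotate 3-vector v by unit quaternion q via q * (0, v) * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), conj)
    return (rx, ry, rz)


# A 90-degree yaw about the z axis maps the x axis onto the y axis.
yaw90 = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
print(rotate(yaw90, (1.0, 0.0, 0.0)))
```

Composing rotations is a single `quat_mul`, and the four-component form stays well defined at orientations where Euler angles become singular, such as a pitch of 90 degrees.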

artificial intelligence, information fusion, quaternion, (16 more...)

arXiv.org Artificial Intelligence

2504.14133

Country: South America > Argentina (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.50)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.50)

Translating Multimodal AI into Real-World Inspection: TEMAI Evaluation Framework and Pathways for Implementation

Li, Zehan, Deng, Jinzhi, Ma, Haibing, Zhang, Chi, Xiao, Dan

arXiv.org Artificial Intelligence Apr-22-2025

Zehan Li (1,3), Jinzhi Deng (1,2), Haibing Ma (1,2), Chi Zhang (1), and Dan Xiao (1). Affiliations: (1) Moximize.ai; (2) Shanghai Zhongqiao Vocational And Technical University; (3) China Creative Studies Institute. April 22, 2025.

Abstract: This paper introduces the Translational Evaluation of Multimodal AI for Inspection (TEMAI) framework, bridging multimodal AI capabilities with industrial inspection implementation. Adapting translational research principles from healthcare to industrial contexts, TEMAI establishes three core dimensions: Capability (technical feasibility), Adoption (organizational readiness), and Utility (value realization). The framework demonstrates that technical capability alone yields limited value without corresponding adoption mechanisms. TEMAI incorporates specialized metrics including the Value Density Coefficient and structured implementation pathways. Empirical validation through retail and photovoltaic inspection implementations revealed significant differences in value realization patterns despite similar capability reduction rates, confirming the framework's effectiveness across diverse industrial sectors while highlighting the importance of industry-specific adaptation strategies.

Keywords: Multimodal AI, Industrial Inspection, Translational Framework, TEMAI

1 Introduction

Industrial inspection tasks are fundamental to ensuring operational continuity and safety in manufacturing sectors, serving as a cornerstone for preventive maintenance and risk mitigation. These tasks, however, are plagued by systemic inefficiencies, including labor-intensive workflows, hazardous working environments (e.g., high-temperature zones or toxic gas exposure), and heavy reliance on empirical knowledge that is difficult to standardize or transfer across industries [1]. Despite incremental advancements in automation technologies (such as drones, AR-assisted devices, and IoT-enabled sensors), the integration of these tools into inspection workflows has yielded limited returns due to fragmented deployment, high implementation costs, and insufficient interoperability between hardware and software systems [2]. For instance, while drones have reduced human exposure to dangerous environments in power grid inspections, their operational scope remains constrained by battery life and data processing bottlenecks [3].

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2504.13873

Country: Asia > China > Shanghai > Shanghai (0.24)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Energy > Renewable > Solar (0.88)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.46)

DeepMLF: Multimodal language model with learnable tokens for deep fusion in sentiment analysis

Georgiou, Efthymios, Katsouros, Vassilis, Avrithis, Yannis, Potamianos, Alexandros

arXiv.org Artificial Intelligence Apr-16-2025

While multimodal fusion has been extensively studied in Multimodal Sentiment Analysis (MSA), the role of fusion depth and multimodal capacity allocation remains underexplored. In this work, we position fusion depth, scalability, and dedicated multimodal capacity as primary factors for effective fusion. We introduce DeepMLF, a novel multimodal language model (LM) with learnable tokens tailored toward deep fusion. DeepMLF leverages an audiovisual encoder and a pretrained decoder LM augmented with multimodal information across its layers. We append learnable tokens to the LM that: 1) capture modality interactions in a controlled fashion and 2) preserve independent information flow for each modality. These fusion tokens gather linguistic information via causal self-attention in LM Blocks and integrate with audiovisual information through cross-attention MM Blocks. Serving as dedicated multimodal capacity, this design enables progressive fusion across multiple layers, providing depth in the fusion process. Our training recipe combines modality-specific losses and language modelling loss, with the decoder LM tasked to predict ground truth polarity. Across three MSA benchmarks with varying dataset characteristics, DeepMLF achieves state-of-the-art performance. Our results confirm that deeper fusion leads to better performance, with optimal fusion depths (5-7) exceeding those of existing approaches. Additionally, our analysis on the number of fusion tokens reveals that small token sets (~20) achieve optimal performance. We examine the importance of representation learning order (fusion curriculum) through audiovisual encoder initialization experiments. Our ablation studies demonstrate the superiority of the proposed fusion design and gating while providing a holistic examination of DeepMLF's scalability to LLMs, and the impact of each training objective and embedding regularization.

deepmlf, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2504.11082

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.67)
