Goto

Collaborating Authors

 Information Fusion


Multi-source and multimodal data fusion for predicting academic performance in blended learning university courses

arXiv.org Artificial Intelligence

In this paper we applied data fusion approaches for predicting the final academic performance of university students using multiple-source, multimodal data from blended learning environments. We collected and preprocessed data about first-year university students from different sources: theory classes, practical sessions, on-line Moodle sessions, and a final exam. Our objective was to discover which data fusion approach produced the best results using our data. We carried out experiments by applying four different data fusion approaches and six classification algorithms. The results showed that the best predictions were produced using ensembles and selecting the best attributes approach with discretized data. The best prediction models showed us that the level of attention in theory classes, scores in Moodle quizzes, and the level of activity in Moodle forums were the best set of attributes for predicting students' final performance in our courses.


PAS-SLAM: A Visual SLAM System for Planar Ambiguous Scenes

arXiv.org Artificial Intelligence

Visual SLAM (Simultaneous Localization and Mapping) based on planar features has found widespread applications in fields such as environmental structure perception and augmented reality. However, current research faces challenges in accurately localizing and mapping in planar ambiguous scenes, primarily due to the poor accuracy of the employed planar features and data association methods. In this paper, we propose a visual SLAM system based on planar features designed for planar ambiguous scenes, encompassing planar processing, data association, and multi-constraint factor graph optimization. We introduce a planar processing strategy that integrates semantic information with planar features, extracting the edges and vertices of planes to be utilized in tasks such as plane selection, data association, and pose optimization. Next, we present an integrated data association strategy that combines plane parameters, semantic information, projection IoU (Intersection over Union), and non-parametric tests, achieving accurate and robust plane data association in planar ambiguous scenes. Finally, we design a set of multi-constraint factor graphs for camera pose optimization. Qualitative and quantitative experiments conducted on publicly available datasets demonstrate that our proposed system competes effectively in both accuracy and robustness in terms of map construction and camera localization compared to state-of-the-art methods.


Combining shape and contour features to improve tool wear monitoring in milling processes

arXiv.org Artificial Intelligence

In this paper, a new system based on combinations of a shape descriptor and a contour descriptor has been proposed for classifying inserts in milling processes according to their wear level following a computer vision based approach. To describe the wear region shape we have proposed a new descriptor called ShapeFeat and its contour has been characterized using the method BORCHIZ that, to the best of our knowledge, achieves the best performance for tool wear monitoring following a computer vision-based approach. Results show that the combination of BORCHIZ with ShapeFeat using a late fusion method improves the classification performance significantly, obtaining an accuracy of 91.44% in the binary classification (i.e. the classification of the wear as high or low) and 82.90% using three target classes (i.e. classification of the wear as high, medium or low). These results outperform the ones obtained by both descriptors used on their own, which achieve accuracies of 88.70 and 80.67% for two and three classes, respectively, using ShapeFeat and 87.06 and 80.24% with B-ORCHIZ. This study yielded encouraging results for the manufacturing community in order to classify automatically the inserts in terms of their wear for milling processes.


Learning on Multimodal Graphs: A Survey

arXiv.org Artificial Intelligence

Multimodal data pervades various domains, including healthcare, social media, and transportation, where multimodal graphs play a pivotal role. Machine learning on multimodal graphs, referred to as multimodal graph learning (MGL), is essential for successful artificial intelligence (AI) applications. The burgeoning research in this field encompasses diverse graph data types and modalities, learning techniques, and application scenarios. This survey paper conducts a comparative analysis of existing works in multimodal graph learning, elucidating how multimodal learning is achieved across different graph types and exploring the characteristics of prevalent learning techniques. Additionally, we delineate significant applications of multimodal graph learning and offer insights into future directions in this domain. Consequently, this paper serves as a foundational resource for researchers seeking to comprehend existing MGL techniques and their applicability across diverse scenarios.


Select2Col: Leveraging Spatial-Temporal Importance of Semantic Information for Efficient Collaborative Perception

arXiv.org Artificial Intelligence

Collaborative perception by leveraging the shared semantic information plays a crucial role in overcoming the individual limitations of isolated agents. However, existing collaborative perception methods tend to focus solely on the spatial features of semantic information, while neglecting the importance of the temporal dimension. Consequently, the potential benefits of collaboration remain underutilized. In this article, we propose Select2Col, a novel collaborative perception framework that takes into account the \underline{s}patial-t\underline{e}mpora\underline{l} importanc\underline{e} of semanti\underline{c} informa\underline{t}ion. Within the Select2Col, we develop a collaborator selection method that utilizes a lightweight graph neural network (GNN) to estimate the importance of semantic information (IoSI) of each collaborator in enhancing perception performance, thereby identifying contributive collaborators while excluding those that potentially bring negative impact. Moreover, we present a semantic information fusion algorithm called HPHA (historical prior hybrid attention), which integrates multi-scale attention and short-term attention modules to capture the IoSI in feature representation from the spatial and temporal dimensions respectively, and assigns IoSI-consistent weights for efficient fusion of information from selected collaborators. Extensive experiments on three open datasets demonstrate that our proposed Select2Col significantly improves the perception performance compared to state-of-the-art approaches. The code associated with this research is publicly available at https://github.com/huangqzj/Select2Col/.


Interpretable Multi-Source Data Fusion Through Latent Variable Gaussian Process

arXiv.org Artificial Intelligence

With the advent of artificial intelligence (AI) and machine learning (ML), various domains of science and engineering communites has leveraged data-driven surrogates to model complex systems from numerous sources of information (data). The proliferation has led to significant reduction in cost and time involved in development of superior systems designed to perform specific functionalities. A high proposition of such surrogates are built extensively fusing multiple sources of data, may it be published papers, patents, open repositories, or other resources. However, not much attention has been paid to the differences in quality and comprehensiveness of the known and unknown underlying physical parameters of the information sources that could have downstream implications during system optimization. Towards resolving this issue, a multi-source data fusion framework based on Latent Variable Gaussian Process (LVGP) is proposed. The individual data sources are tagged as a characteristic categorical variable that are mapped into a physically interpretable latent space, allowing the development of source-aware data fusion modeling. Additionally, a dissimilarity metric based on the latent variables of LVGP is introduced to study and understand the differences in the sources of data. The proposed approach is demonstrated on and analyzed through two mathematical (representative parabola problem, 2D Ackley function) and two materials science (design of FeCrAl and SmCoFe alloys) case studies. From the case studies, it is observed that compared to using single-source and source unaware ML models, the proposed multi-source data fusion framework can provide better predictions for sparse-data problems, interpretability regarding the sources, and enhanced modeling capabilities by taking advantage of the correlations and relationships among different sources.


AONeuS: A Neural Rendering Framework for Acoustic-Optical Sensor Fusion

arXiv.org Artificial Intelligence

Underwater perception and 3D surface reconstruction are challenging problems with broad applications in construction, security, marine archaeology, and environmental monitoring. Treacherous operating conditions, fragile surroundings, and limited navigation control often dictate that submersibles restrict their range of motion and, thus, the baseline over which they can capture measurements. In the context of 3D scene reconstruction, it is well-known that smaller baselines make reconstruction more challenging. Our work develops a physics-based multimodal acoustic-optical neural surface reconstruction framework (AONeuS) capable of effectively integrating high-resolution RGB measurements with low-resolution depth-resolved imaging sonar measurements. By fusing these complementary modalities, our framework can reconstruct accurate high-resolution 3D surfaces from measurements captured over heavily-restricted baselines. Through extensive simulations and in-lab experiments, we demonstrate that AONeuS dramatically outperforms recent RGB-only and sonar-only inverse-differentiable-rendering--based surface reconstruction methods. A website visualizing the results of our paper is located at this address: https://aoneus.github.io/


Improving Robustness of LiDAR-Camera Fusion Model against Weather Corruption from Fusion Strategy Perspective

arXiv.org Artificial Intelligence

In recent years, LiDAR-camera fusion models have markedly advanced 3D object detection tasks in autonomous driving. However, their robustness against common weather corruption such as fog, rain, snow, and sunlight in the intricate physical world remains underexplored. In this paper, we evaluate the robustness of fusion models from the perspective of fusion strategies on the corrupted dataset. Based on the evaluation, we further propose a concise yet practical fusion strategy to enhance the robustness of the fusion models, namely flexibly weighted fusing features from LiDAR and camera sources to adapt to varying weather scenarios. Experiments conducted on four types of fusion models, each with two distinct lightweight implementations, confirm the broad applicability and effectiveness of the approach.


Custom IMU-Based Wearable System for Robust 2.4 GHz Wireless Human Body Parts Orientation Tracking and 3D Movement Visualization on an Avatar

arXiv.org Artificial Intelligence

Recent studies confirm the applicability of Inertial Measurement Unit (IMU)-based systems for human motion analysis. Notwithstanding, high-end IMU-based commercial solutions are yet too expensive and complex to democratize their use among a wide range of potential users. Less featured entry-level commercial solutions are being introduced in the market, trying to fill this gap, but still present some limitations that need to be overcome. At the same time, there is a growing number of scientific papers using not commercial, but custom do-it-yourself IMU-based systems in medical and sports applications. Even though these solutions can help to popularize the use of this technology, they have more limited features and the description on how to design and build them from scratch is yet too scarce in the literature. The aim of this work is two-fold: (1) Proving the feasibility of building an affordable custom solution aimed at simultaneous multiple body parts orientation tracking; while providing a detailed bottom-up description of the required hardware, tools, and mathematical operations to estimate and represent 3D movement in real-time. (2) Showing how the introduction of a custom 2.4 GHz communication protocol including a channel hopping strategy can address some of the current communication limitations of entry-level commercial solutions. The proposed system can be used for wireless real-time human body parts orientation tracking with up to 10 custom sensors, at least at 50 Hz. In addition, it provides a more reliable motion data acquisition in Bluetooth and Wi-Fi crowded environments, where the use of entry-level commercial solutions might be unfeasible. This system can be used as a groundwork for developing affordable human motion analysis solutions that do not require an accurate kinematic analysis.


iMove: Exploring Bio-impedance Sensing for Fitness Activity Recognition

arXiv.org Artificial Intelligence

Automatic and precise fitness activity recognition can be beneficial in aspects from promoting a healthy lifestyle to personalized preventative healthcare. While IMUs are currently the prominent fitness tracking modality, through iMove, we show bio-impedence can help improve IMU-based fitness tracking through sensor fusion and contrastive learning.To evaluate our methods, we conducted an experiment including six upper body fitness activities performed by ten subjects over five days to collect synchronized data from bio-impedance across two wrists and IMU on the left wrist.The contrastive learning framework uses the two modalities to train a better IMU-only classification model, where bio-impedance is only required at the training phase, by which the average Macro F1 score with the input of a single IMU was improved by 3.22 \% reaching 84.71 \% compared to the 81.49 \% of the IMU baseline model. We have also shown how bio-impedance can improve human activity recognition (HAR) directly through sensor fusion, reaching an average Macro F1 score of 89.57 \% (two modalities required for both training and inference) even if Bio-impedance alone has an average macro F1 score of 75.36 \%, which is outperformed by IMU alone. In addition, similar results were obtained in an extended study on lower body fitness activity classification, demonstrating the generalisability of our approach.Our findings underscore the potential of sensor fusion and contrastive learning as valuable tools for advancing fitness activity recognition, with bio-impedance playing a pivotal role in augmenting the capabilities of IMU-based systems.