Information Fusion
Hybrid Indoor Localization via Reinforcement Learning-based Information Fusion
Salimibeni, Mohammad, Mohammadi, Arash
The paper is motivated by the importance of the Smart Cities (SC) concept for future management of global urbanization. Among all Internet of Things (IoT)-based communication technologies, Bluetooth Low Energy (BLE) plays a vital role in city-wide decision making and services. Extreme fluctuations of the Received Signal Strength Indicator (RSSI), however, prevent this technology from being a reliable solution with acceptable accuracy in dynamic indoor tracking/localization approaches for ever-changing SC environments. The latest version, BLE v5.1, introduced a better means of tracking users through direction-finding approaches based on the Angle of Arrival (AoA), which is more reliable. Some fundamental issues still remain to be addressed. Existing works mainly focus on implementing stand-alone models, overlooking potential fusion strategies. The paper addresses this gap and proposes a novel Reinforcement Learning (RL)-based information fusion framework (RL-IFF) by coupling AoA with RSSI-based particle filtering and Inertial Measurement Unit (IMU)-based Pedestrian Dead Reckoning (PDR) frameworks. The proposed RL-IFF solution is evaluated through a comprehensive set of experiments illustrating superior performance compared to its counterparts.
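As a rough illustration of the PDR component mentioned in the abstract, the following sketch shows the textbook dead-reckoning position update: each detected step advances the position by a step length along the current heading. The function name, the fixed step length, and the heading convention (radians, measured from the x-axis) are assumptions made for this example; the abstract does not describe the authors' PDR implementation or how RL-IFF weights it against AoA and RSSI.

```python
import math

def pdr_step(x, y, heading_rad, step_length):
    """Advance a 2D position by one pedestrian step (hypothetical helper).

    heading_rad is assumed to be measured from the x-axis in radians.
    """
    new_x = x + step_length * math.cos(heading_rad)
    new_y = y + step_length * math.sin(heading_rad)
    return new_x, new_y

# Walk two steps east, then one step north, with an assumed 0.7 m step length.
position = (0.0, 0.0)
for heading in (0.0, 0.0, math.pi / 2):
    position = pdr_step(position[0], position[1], heading, 0.7)
print(position)
```

Because each update adds to the previous estimate, heading and step-length errors accumulate over time, which is the usual reason PDR is combined with absolute measurements such as AoA or RSSI.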
Shared Manifold Learning Using a Triplet Network for Multiple Sensor Translation and Fusion with Missing Data
Dutt, Aditya, Zare, Alina, Gader, Paul
Heterogeneous data fusion can enhance the robustness and accuracy of an algorithm on a given task. However, due to the difference in various modalities, aligning the sensors and embedding their information into discriminative and compact representations is challenging. In this paper, we propose a Contrastive learning based MultiModal Alignment Network (CoMMANet) to align data from different sensors into a shared and discriminative manifold where class information is preserved. The proposed architecture uses a multimodal triplet autoencoder to cluster the latent space in such a way that samples of the same classes from each heterogeneous modality are mapped close to each other. Since all the modalities exist in a shared manifold, a unified classification framework is proposed. A comparison made with other methods demonstrates the superiority of this method.
... outstanding results on tasks like land-use and land-cover classification (LULC) [1] [2], mineral exploration [3] [4] [5], urban planning [6], biodiversity conservation [7], sentiment ... This method is also called decision fusion. In the context of a neural network, these representations are generated by the convolutional layers and fused gradually to form a shared representation layer. ... Fusion methods can be classified into two groups: concatenation and alignment-based methods. To increase the interpretability of fusion models, Hong et al. [27] proposed shared and specific feature learning (S2FL), which is capable of decomposing data into modality-shared and modality-specific components, enabling a better information blending of multiple heterogeneous modalities. ... learn spatial information by using a structured morphological element of predefined size and shape. They proposed a graph-based model to couple the dimension reduction and fusion of information. However, using this method, the cloud-covered regions are not accurately classified because the morphological ...
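To make the triplet objective behind a triplet network concrete, here is a minimal sketch of the standard margin-based triplet loss on plain Python lists. The function names, the Euclidean distance, and the margin value are assumptions for illustration; the abstract does not give the exact loss used in CoMMANet.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: pull same-class pairs together and push
    different-class pairs at least `margin` further away."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Anchor and positive could be embeddings of the same class from two
# different sensors; the negative comes from another class.
loss_satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
loss_violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
print(loss_satisfied, loss_violated)
```

The loss is zero once the negative is further from the anchor than the positive by at least the margin, so training only moves embeddings that still violate that ordering.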
An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks
Yu, Changlong, Xiao, Tianyi, Kong, Lingpeng, Song, Yangqiu, Ng, Wilfred
Though linguistic knowledge emerges during large-scale language model pretraining, recent work attempts to explicitly incorporate human-defined linguistic priors into task-specific fine-tuning. Infusing language models with syntactic or semantic knowledge from parsers has shown improvements on many language understanding tasks. To further investigate the effectiveness of structural linguistic priors, we conduct an empirical study of replacing parsed graphs or trees with trivial ones (rarely carrying linguistic knowledge, e.g., a balanced tree) for tasks in the GLUE benchmark. Encoding with trivial graphs achieves competitive or even better performance in fully-supervised and few-shot settings. This reveals that the gains might not be significantly attributable to explicit linguistic priors but rather to more feature interactions brought by fusion layers. Hence, we call for attention to using trivial graphs as necessary baselines when designing advanced knowledge fusion methods in the future.
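One way to picture a "trivial" structure such as a balanced tree is the sketch below, which builds a balanced binary tree over token positions and returns its edges. The node numbering (tokens are leaves 0..n-1, internal nodes receive the next free ids) and the function name are assumptions for this example; the abstract does not specify how the authors construct their trivial graphs.

```python
def balanced_tree_edges(n_tokens):
    """Build a balanced binary tree whose leaves are token indices.

    Returns (root_id, edges), where edges is a list of (parent, child)
    pairs. Internal nodes get ids n_tokens, n_tokens + 1, ...
    """
    if n_tokens < 1:
        raise ValueError("need at least one token")
    edges = []
    next_id = n_tokens

    def build(lo, hi):
        nonlocal next_id
        if hi - lo == 1:          # a single token is a leaf
            return lo
        mid = (lo + hi) // 2      # split the span in half
        left = build(lo, mid)
        right = build(mid, hi)
        node = next_id
        next_id += 1
        edges.append((node, left))
        edges.append((node, right))
        return node

    root = build(0, n_tokens)
    return root, edges

root, edges = balanced_tree_edges(4)
print(root, edges)
```

The resulting edge list depends only on sentence length, not on the words, so it carries no parser-derived linguistic information while still giving a fusion layer a graph to operate on.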
Senior Digital Clinical Data Manager
At Biogen Digital Health (BDH), we aspire to transform Biogen and patients' lives by making personalized & digital medicine in neuroscience a reality. Powered by data-science and digital technologies, we drive solutions to advance research, clinical care, and patient empowerment. Our team strives for real impact through excellence, innovation, and collaboration. This role is of key importance to achieve the strategic vision and objective to make Biogen a recognized leader in digital health sciences, hence contributing to our corporate vision & strategy.
Artificial Intelligence-Based Methods for Fusion of Electronic Health Records and Imaging Data
Mohsen, Farida, Ali, Hazrat, Hajj, Nady El, Shah, Zubair
Healthcare data are inherently multimodal, including electronic health records (EHR), medical images, and multi-omics data. Combining these multimodal data sources contributes to a better understanding of human health and provides optimal personalized healthcare. Advances in artificial intelligence (AI) technologies, particularly machine learning (ML), enable the fusion of these different data modalities to provide multimodal insights. To this end, in this scoping review, we focus on synthesizing and analyzing the literature that uses AI techniques to fuse multimodal medical data for different clinical applications. More specifically, we focus on studies that only fused EHR with medical imaging data to develop various AI methods for clinical applications. We present a comprehensive analysis of the various fusion strategies, the diseases and clinical outcomes for which multimodal fusion was used, the ML algorithms used to perform multimodal fusion for each clinical application, and the available multimodal medical datasets. We followed the PRISMA-ScR guidelines. We searched Embase, PubMed, Scopus, and Google Scholar to retrieve relevant studies. We extracted data from 34 studies that fulfilled the inclusion criteria. In our analysis, a typical workflow was observed: feeding raw data, fusing different data modalities by applying conventional ML or deep learning (DL) algorithms, and finally, evaluating the multimodal fusion through clinical outcome predictions. Early fusion was the most commonly used technique for multimodal learning (22 out of 34 studies). We found that multimodality fusion models outperformed traditional single-modality models for the same task. Disease diagnosis and prediction were the most common clinical outcomes (reported in 20 and 10 studies, respectively).
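For readers unfamiliar with the term, early fusion generally means combining the modalities at the feature level before a single model sees them. The sketch below contrasts it with late fusion, which combines per-modality predictions instead. The function names, the toy feature values, and the weighted-average rule for late fusion are assumptions for illustration; late fusion is included only for contrast, and the code does not come from the reviewed studies.

```python
def early_fusion(ehr_features, image_features):
    """Early (feature-level) fusion: concatenate the per-modality feature
    vectors into one input vector for a single downstream model."""
    return list(ehr_features) + list(image_features)

def late_fusion(prob_ehr, prob_image, weight_ehr=0.5):
    """Late (decision-level) fusion: combine the output probabilities of
    two separately trained models with a weighted average."""
    return weight_ehr * prob_ehr + (1.0 - weight_ehr) * prob_image

# Toy values: two EHR features (age, lab value) and three image features.
fused_input = early_fusion([63.0, 1.2], [0.10, 0.85, 0.40])
fused_score = late_fusion(0.75, 0.25)
print(fused_input, fused_score)
```

In the early-fusion case a single classifier can learn interactions between EHR and imaging features, whereas late fusion keeps the modalities independent until the final score.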
IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments
Soliman, Abanob, Bonardi, Fabien, Sidibé, Désiré, Bouchafa, Samia
The development process of high-fidelity SLAM systems depends on their validation on reliable datasets. Towards this goal, we propose IBISCape, a simulated benchmark that includes data synchronization and acquisition APIs for telemetry from heterogeneous sensors: stereo-RGB/DVS, Depth, IMU, and GPS, along with ground-truth scene segmentation and vehicle ego-motion. Our benchmark is built upon the CARLA simulator, whose back-end is the Unreal Engine, rendering highly dynamic scenery that simulates the real world. Moreover, we offer 34 multi-modal datasets suitable for autonomous vehicle navigation, including scenarios for scene understanding evaluation like accidents, along with a wide range of frame quality based on a dynamic weather simulation class integrated with our APIs. We also introduce the first calibration targets to CARLA maps to solve the unknown distortion parameters problem of CARLA simulated DVS and RGB cameras. Finally, we evaluate the performance of four ORB-SLAM3 systems (monocular RGB, stereo RGB, Stereo Visual Inertial (SVI), and RGB-D) and the BASALT Visual-Inertial Odometry (VIO) system on various IBISCape sequences collected in simulated large-scale dynamic environments. Keywords: benchmark, multi-modal, datasets, Odometry, Calibration, DVS, SLAM
Global Big Data Conference
Redbird, a New York-based enterprise analytics operating system, announced it has raised $7.6 million in an oversubscribed seed round. The Redbird platform allows non-technical users to automate and unify analytics work without writing code and connects all data sources into a no-code environment for data prep, wrangling, analysis, reporting, and data science, according to a company release. Though the company touts its platform's no-code features as friendly for non-technical users, it also aims to make life easier for data professionals: "Even for technical teams with these [data] skill sets, it can be challenging and time consuming, ultimately distracting them from higher value work. We created Redbird with the goal of making it easier for organizations who would like all of their employees to be equipped with a more unified, automated, and accessible approach to doing this type of work," said Erin Tavgac, Redbird CEO and co-founder. Data engineers may find Redbird helpful for building data integrations, managing ETL workflows, provisioning data views, and maintaining data science models.
A sensor-to-pattern calibration framework for multi-modal industrial collaborative cells
Rato, Daniela, Oliveira, Miguel, Santos, Vítor, Gomes, Manuel, Sappa, Angel
Collaborative robotic industrial cells are workspaces where robots collaborate with human operators. In this context, safety is paramount, which requires a complete perception of the space in which the collaborative robot operates. To ensure this, collaborative cells are equipped with a large set of sensors of multiple modalities, covering the entire work volume. However, the fusion of information from all these sensors requires an accurate extrinsic calibration. The calibration of such complex systems is challenging, due to the number of sensors and modalities, and also due to the small overlapping fields of view between the sensors, which are positioned to capture different viewpoints of the cell. This paper proposes a sensor-to-pattern methodology that can calibrate a complex system such as a collaborative cell in a single optimization procedure. Our methodology can handle RGB and depth cameras, as well as LiDARs. Results show that our methodology is able to accurately calibrate a collaborative cell containing three RGB cameras, a depth camera and three 3D LiDARs.
On Uncertainty in Deep State Space Models for Model-Based Reinforcement Learning
Becker, Philipp, Neumann, Gerhard
Improved state space models, such as Recurrent State Space Models (RSSMs), are a key factor behind recent advances in model-based reinforcement learning (RL). Yet, despite their empirical success, many of the underlying design choices are not well understood. We show that RSSMs use a suboptimal inference scheme and that models trained using this inference overestimate the aleatoric uncertainty of the ground truth system. We find this overestimation implicitly regularizes RSSMs and allows them to succeed in model-based RL. We postulate that this implicit regularization fulfills the same functionality as explicitly modeling epistemic uncertainty, which is crucial for many other model-based RL approaches. Yet, overestimating aleatoric uncertainty can also impair performance in cases where accurately estimating it matters, e.g., when we have to deal with occlusions, missing observations, or fusing sensor modalities at different frequencies. Moreover, the implicit regularization is a side-effect of the inference scheme and not the result of a rigorous, principled formulation, which renders analyzing or improving RSSMs difficult. Thus, we propose an alternative approach building on well-understood components for modeling aleatoric and epistemic uncertainty, dubbed Variational Recurrent Kalman Network (VRKN). This approach uses Kalman updates for exact smoothing inference in a latent space and Monte Carlo Dropout to model epistemic uncertainty. Due to the Kalman updates, the VRKN can naturally handle missing observations or sensor fusion problems with varying numbers of observations per time step. Our experiments show that using the VRKN instead of the RSSM improves performance in tasks where appropriately capturing aleatoric uncertainty is crucial while matching it in the deterministic standard benchmarks.
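The remark that Kalman updates naturally handle missing observations can be illustrated with a scalar Kalman filter: when no observation arrives, only the prediction step runs and the uncertainty grows. This is a textbook one-dimensional filter under an assumed random-walk model with assumed noise variances, not the VRKN itself, which performs smoothing in a learned latent space.

```python
def kalman_step(mean, var, obs, process_var=0.1, obs_var=0.5):
    """One predict/update step of a scalar Kalman filter.

    Assumes a random-walk state model. Pass obs=None for a missing
    observation: the update is skipped and only the prediction is returned.
    """
    # Predict: the state stays put on average, uncertainty grows.
    mean_pred = mean
    var_pred = var + process_var
    if obs is None:
        return mean_pred, var_pred
    # Update: blend prediction and observation by their relative certainty.
    gain = var_pred / (var_pred + obs_var)
    mean_new = mean_pred + gain * (obs - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new

# Filter a short sequence in which the second reading is missing.
mean, var = 0.0, 1.0
for reading in (1.0, None, 1.2):
    mean, var = kalman_step(mean, var, reading)
print(mean, var)
```

Several sensors reporting at different rates can be handled the same way by applying one update per available observation at each time step and none when a sensor is silent.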
CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection
Hwang, Jyh-Jing, Kretzschmar, Henrik, Manela, Joshua, Rafferty, Sean, Armstrong-Crews, Nicholas, Chen, Tiffany, Anguelov, Dragomir
Robust 3D object detection is critical for safe autonomous driving. Camera and radar sensors are synergistic as they capture complementary information and work well under different environmental conditions. Fusing camera and radar data is challenging, however, as each of the sensors lacks information along a perpendicular axis, that is, depth is unknown to camera and elevation is unknown to radar. We propose the camera-radar matching network CramNet, an efficient approach to fuse the sensor readings from camera and radar in a joint 3D space. To leverage radar range measurements for better camera depth predictions, we propose a novel ray-constrained cross-attention mechanism that resolves the ambiguity in the geometric correspondences between camera features and radar features. Our method supports training with sensor modality dropout, which leads to robust 3D object detection, even when a camera or radar sensor suddenly malfunctions on a vehicle. We demonstrate the effectiveness of our fusion approach through extensive experiments on the RADIATE dataset, one of the few large-scale datasets that provide radar radio frequency imagery. A camera-only variant of our method achieves competitive performance in monocular 3D object detection on the Waymo Open Dataset. Keywords: Sensor fusion; cross attention; robust 3D object detection.
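Sensor modality dropout during training can be sketched as randomly blanking one sensor's features for a training example so the model learns not to rely on either sensor alone. The dropout probability, the zero-filling, and the rule that at most one modality is dropped per example are assumptions for this illustration; the abstract does not detail the scheme CramNet uses.

```python
import random

def modality_dropout(camera_feats, radar_feats, p_drop=0.2, rng=random):
    """Randomly zero out one modality's features for a training example.

    With probability p_drop the camera features are zeroed, with the same
    probability the radar features are zeroed, otherwise both are kept.
    At most one modality is dropped, so p_drop should not exceed 0.5.
    """
    r = rng.random()
    if r < p_drop:
        camera_feats = [0.0] * len(camera_feats)
    elif r < 2 * p_drop:
        radar_feats = [0.0] * len(radar_feats)
    return camera_feats, radar_feats

# Apply dropout to a toy example; a seeded generator keeps runs repeatable.
rng = random.Random(0)
cam, rad = modality_dropout([0.3, 0.9], [1.5, 0.2, 0.7], rng=rng)
print(cam, rad)
```

At inference time the same zeroed input can stand in for a sensor that has failed, which is the situation this kind of training is meant to prepare the detector for.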