 Information Fusion


Decentralized Distributed Expert Assisted Learning (D2EAL) approach for cooperative target-tracking

arXiv.org Artificial Intelligence

This paper addresses the problem of cooperative target tracking using a heterogeneous multi-robot system, where the robots are communicating over a dynamic communication network, and heterogeneity is in terms of different types of sensors and prediction algorithms installed in the robots. The problem is cast into a distributed learning framework, where robots are considered as 'agents' connected over a dynamic communication network. Their prediction algorithms are considered as 'experts' giving their look-ahead predictions of the target's trajectory. In this paper, a novel Decentralized Distributed Expert-Assisted Learning (D2EAL) algorithm is proposed, which improves the overall tracking performance by enabling each robot to improve its look-ahead prediction of the target's trajectory by sharing information and running a weighted information fusion process combined with online learning of weights based on a prediction loss metric. Theoretical analysis of D2EAL is carried out, which involves the analysis of worst-case bounds on cumulative prediction loss, and weight convergence analysis. Simulation studies show that in adverse scenarios involving large dynamic bias or drift in the expert predictions, D2EAL outperforms well-known covariance-based estimate/prediction fusion methods, both in terms of prediction performance and scalability.
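The idea of fusing expert predictions with weights learned online from a prediction loss can be illustrated with a generic exponentially weighted forecaster. This is a minimal single-agent sketch and not the D2EAL algorithm itself: the squared loss, the learning rate `eta`, the function names, and the toy data are all assumptions made for illustration.

```python
import math

def fuse(predictions, weights):
    """Return the weighted average of scalar expert predictions."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total

def update_weights(weights, predictions, truth, eta=0.5):
    """Multiplicative update: experts with larger squared loss lose weight.

    eta is an assumed learning rate, not a value from the paper.
    """
    scaled = [w * math.exp(-eta * (p - truth) ** 2)
              for w, p in zip(weights, predictions)]
    total = sum(scaled)
    return [w / total for w in scaled]

# Hypothetical data: expert 0 is accurate, expert 1 has a constant bias of +2.
weights = [0.5, 0.5]
fused_history = []
for truth in [1.0, 2.0, 3.0, 4.0]:
    preds = [truth, truth + 2.0]
    fused_history.append(fuse(preds, weights))   # predict before seeing truth
    weights = update_weights(weights, preds, truth)
```

After a few rounds almost all weight sits on the unbiased expert, so the fused prediction tracks it closely. A covariance-based fusion rule would not react to the bias in this way unless the bias were reflected in the reported covariances.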


Upper Limb Movement Recognition utilising EEG and EMG Signals for Rehabilitative Robotics

arXiv.org Artificial Intelligence

Upper limb movement classification, which maps input signals to the target activities, is a key building block in the control of rehabilitative robotics. Classifiers are trained for the rehabilitative system to comprehend the desires of the patient whose upper limbs do not function properly. Electromyography (EMG) signals and Electroencephalography (EEG) signals are used widely for upper limb movement classification. By analysing the classification results of the real-time EEG and EMG signals, the system can understand the intention of the user and predict the events that the user would like to carry out. Accordingly, it will provide external help to the user. However, noise in the real-time EEG and EMG data collection process degrades the quality of the data, which undermines classification performance. Moreover, not all patients possess strong EMG signals due to muscle damage and neuromuscular disorder. To address these issues, this paper explores different feature extraction techniques and machine learning and deep learning models for EEG and EMG signal classification and proposes a novel decision-level multisensor fusion technique to integrate EEG signals with EMG signals. This system retrieves effective information from both sources to understand and predict the desire of the user, and thus provide aid. By testing the proposed technique on the publicly available WAY-EEG-GAL dataset, which contains EEG and EMG signals that were recorded simultaneously, we demonstrate the feasibility and effectiveness of the novel system.
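Decision-level fusion combines the outputs of separately trained classifiers rather than their raw signals or features. The abstract does not specify the fusion rule, so the following is only a sketch of one common variant, a weighted average of class probabilities; the function name, the weight `w_eeg`, and the probability vectors are hypothetical.

```python
def fuse_decisions(p_eeg, p_emg, w_eeg=0.5):
    """Fuse two class-probability vectors and return (label, fused_probs).

    w_eeg is an assumed trust weight for the EEG classifier; the EMG
    classifier receives 1 - w_eeg.
    """
    fused = [w_eeg * a + (1.0 - w_eeg) * b for a, b in zip(p_eeg, p_emg)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused

# Hypothetical outputs of two classifiers over three movement classes.
p_eeg = [0.6, 0.3, 0.1]
p_emg = [0.2, 0.7, 0.1]

label_equal, _ = fuse_decisions(p_eeg, p_emg)             # equal trust
label_eeg, _ = fuse_decisions(p_eeg, p_emg, w_eeg=0.9)    # weak EMG signal
```

Lowering the EMG weight is one way such a scheme could accommodate patients whose EMG signals are weak: with equal weights the fused decision follows the EMG classifier here, while with `w_eeg=0.9` it follows the EEG classifier.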


Best Axes Composition Extended: Multiple Gyroscopes and Accelerometers Data Fusion to Reduce Systematic Error

arXiv.org Artificial Intelligence

Multiple rigidly attached Inertial Measurement Unit (IMU) sensors provide a richer flow of data compared to a single IMU. State-of-the-art methods follow a probabilistic model of IMU measurements based on the random nature of errors combined under a Bayesian framework. However, affordable low-grade IMUs additionally suffer from systematic errors due to imperfections not covered by their corresponding probabilistic model. In this paper, we propose the Best Axes Composition (BAC), a method of combining Multiple IMU (MIMU) sensor data for accurate 3D-pose estimation that takes into account both random and systematic errors by dynamically choosing the best IMU axes from the set of all available axes. We evaluate our approach on our MIMU visual-inertial sensor and compare the performance of the method with a purely probabilistic state-of-the-art approach of MIMU data fusion. We show that BAC outperforms the latter and achieves up to 20% accuracy improvement for both orientation and position estimation in open loop, but needs proper treatment to keep the obtained gain.


A Secure Healthcare 5.0 System Based on Blockchain Technology Entangled with Federated Learning Technique

arXiv.org Artificial Intelligence

In recent years, the global Internet of Medical Things (IoMT) industry has evolved at a tremendous speed. Security and privacy are key concerns in the IoMT, owing to the huge scale and deployment of IoMT networks. Machine learning (ML) and blockchain (BC) technologies have significantly enhanced the capabilities and facilities of healthcare 5.0, spawning a new area known as "Smart Healthcare." By identifying concerns early, a smart healthcare system can help avoid long-term damage. This will enhance the quality of life for patients while reducing their stress and healthcare costs. The IoMT enables a range of functionalities in the field of information technology, one of which is smart and interactive health care. However, combining medical data into a single storage location to train a powerful machine learning model raises concerns about privacy, ownership, and compliance with greater concentration. Federated learning (FL) overcomes the preceding difficulties by utilizing a centralized aggregate server to disseminate a global learning model. Simultaneously, the local participant keeps control of patient information, assuring data confidentiality and security. This article conducts a comprehensive analysis of the findings on blockchain technology entangled with federated learning in healthcare 5.0. The purpose of this study is to construct a secure health monitoring system in healthcare 5.0 by utilizing blockchain technology and an Intrusion Detection System (IDS) to detect any malicious activity in a healthcare network, and to enable physicians to monitor patients through medical sensors and take necessary measures periodically by predicting diseases.
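The federated learning step described above, in which a central server aggregates locally trained models while patient data stays with each participant, is often realized with federated averaging. The sketch below assumes that aggregation rule, which the abstract does not confirm; the function name and the toy parameter vectors are hypothetical, and the blockchain and IDS components are not modeled.

```python
def federated_average(client_params, client_sizes):
    """Average client parameter vectors, weighted by local sample counts.

    Only model parameters reach the server; the raw records used to train
    each client model never leave the client.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * n for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

# Hypothetical example: two hospitals with 1 and 3 local samples.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
# global_params == [2.5, 3.5]
```

Weighting by sample count lets a participant with more data contribute proportionally more to the global model, which the server then redistributes for the next training round.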


AFT-VO: Asynchronous Fusion Transformers for Multi-View Visual Odometry Estimation

arXiv.org Artificial Intelligence

Motion estimation approaches typically employ sensor fusion techniques, such as the Kalman Filter, to handle individual sensor failures. More recently, deep learning-based fusion approaches have been proposed, increasing the performance and requiring fewer model-specific implementations. However, current deep fusion approaches often assume that sensors are synchronised, which is not always practical, especially for low-cost hardware. To address this limitation, in this work, we propose AFT-VO, a novel transformer-based sensor fusion architecture to estimate visual odometry (VO) from multiple sensors. Our framework combines predictions from asynchronous multi-view cameras and accounts for the time discrepancies of measurements coming from different sources. Our approach first employs a Mixture Density Network (MDN) to estimate the probability distributions of the 6-DoF poses for every camera in the system. Then a novel transformer-based fusion module, AFT-VO, is introduced, which combines these asynchronous pose estimations, along with their confidences. More specifically, we introduce Discretiser and Source Encoding techniques which enable the fusion of multi-source asynchronous signals. We evaluate our approach on the popular nuScenes and KITTI datasets. Our experiments demonstrate that multi-view fusion for VO estimation provides robust and accurate trajectories, outperforming the state of the art in both challenging weather and lighting conditions.


Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge

arXiv.org Artificial Intelligence

Medical vision-and-language pre-training (Med-VLP) has received considerable attention owing to its applicability to extracting generic vision-and-language representations from medical images and texts. Most existing methods mainly contain three elements: uni-modal encoders (i.e., a vision encoder and a language encoder), a multi-modal fusion module, and pretext tasks, with few studies considering the importance of medical domain expert knowledge and explicitly exploiting such knowledge to facilitate Med-VLP. Although there exist knowledge-enhanced vision-and-language pre-training (VLP) methods in the general domain, most require off-the-shelf toolkits (e.g., object detectors and scene graph parsers), which are unavailable in the medical domain. In this paper, we propose a systematic and effective approach to enhance Med-VLP by structured medical knowledge from three perspectives. First, considering knowledge can be regarded as the intermediate medium between vision and language, we align the representations of the vision encoder and the language encoder through knowledge. Second, we inject knowledge into the multi-modal fusion model to enable the model to perform reasoning using knowledge as the supplementation of the input image and text. Third, we guide the model to put emphasis on the most critical information in images and texts by designing knowledge-induced pretext tasks. To perform a comprehensive evaluation and facilitate further research, we construct a medical vision-and-language benchmark including three tasks. Experimental results illustrate the effectiveness of our approach, where state-of-the-art performance is achieved on all downstream tasks. Further analyses explore the effects of different components of our approach and various settings of pre-training.


Knowledge Graph Induction enabling Recommending and Trend Analysis: A Corporate Research Community Use Case

arXiv.org Artificial Intelligence

A research division plays an important role in driving innovation in an organization. Drawing insights, following trends, keeping abreast of new research, and formulating strategies are becoming increasingly challenging for both researchers and executives as the amount of information grows in both velocity and volume. In this paper we present a use case of how a corporate research community, IBM Research, utilizes Semantic Web technologies to induce a unified Knowledge Graph from both structured and textual data obtained by integrating various applications used by the community related to research projects, academic papers, datasets, achievements and recognition. In order to make the Knowledge Graph more accessible to application developers, we identified a set of common patterns for exploiting the induced knowledge and exposed them as APIs. Those patterns were born out of user research which identified the most valuable use cases or user pain points to be alleviated. We outline two distinct scenarios: recommendation and analytics for business use. We discuss these scenarios in detail and provide an empirical evaluation on entity recommendation specifically. The methodology used and the lessons learned from this work can be applied to other organizations facing similar challenges.


Open-Source LiDAR Time Synchronization System by Mimicking GNSS-clock

arXiv.org Artificial Intelligence

Data fusion algorithms that employ LiDAR measurements, such as Visual-LiDAR, LiDAR-Inertial, or Multiple LiDAR Odometry and simultaneous localization and mapping (SLAM), rely on precise timestamping schemes that grant synchronicity to data from LiDAR and other sensors. Poor synchronization performance, due to an incorrect timestamping procedure, may negatively affect the algorithms' state estimation results. To provide highly accurate and precise synchronization between the sensors, we introduce an open-source hardware-software system for time synchronization between a LiDAR and other sensors. It exploits a dedicated hardware LiDAR time synchronization interface by providing an emulated GNSS-clock to this interface, so no physical GNSS-receiver is needed. The emulator is based on a general-purpose microcontroller and, due to concise hardware and software architecture, can be easily modified or extended for synchronization of sets of different sensors such as cameras, inertial measurement units (IMUs), wheel encoders, other LiDARs, etc. In the paper, we provide an example of such a system with synchronized LiDAR and IMU sensors. We conducted an evaluation of the sensor synchronization accuracy and precision, and report 1 microsecond performance. We compared our results with timestamping provided by ROS software and by a LiDAR inner clocking scheme to underline clear advantages over these two baseline methods.


Multi-modal Streaming 3D Object Detection

arXiv.org Artificial Intelligence

Modern autonomous vehicles rely heavily on mechanical LiDARs for perception. Current perception methods generally require 360° point clouds, collected sequentially as the LiDAR scans the azimuth and acquires consecutive wedge-shaped slices. The acquisition latency of a full scan (~100 ms) may lead to outdated perception which is detrimental to safe operation. Recent streaming perception works proposed directly processing LiDAR slices and compensating for the narrow field of view (FoV) of a slice by reusing features from preceding slices. These works, however, are all based on a single modality and require past information which may be outdated. Meanwhile, images from high-frequency cameras can support streaming models as they provide a larger FoV compared to a LiDAR slice. However, this difference in FoV complicates sensor fusion. To address this research gap, we propose an innovative camera-LiDAR streaming 3D object detection framework that uses camera images instead of past LiDAR slices to provide an up-to-date, dense, and wide context for streaming perception. The proposed method outperforms prior streaming models on the challenging NuScenes benchmark. It also outperforms powerful full-scan detectors while being much faster. Our method is shown to be robust to missing camera images, narrow LiDAR slices, and small camera-LiDAR miscalibration.


UiPath Partners with Snowflake to Launch Data Integration

#artificialintelligence

UiPath, a leading enterprise automation software company, announced it has strengthened its partnership with Snowflake, the Data Cloud company, by launching a new bi-directional integration that will extend the value of automation across the enterprise. UiPath and Snowflake are enabling joint customers to design and build workflows based on 360-degree views of trusted and accessible data on Snowflake's platform. By leveraging the Snowflake Data Cloud, UiPath robots can quickly connect data directly to business processes in the Data Cloud without using complex code, speeding up time to value. Automation is helping organizations around the world become faster and more agile in the face of increased demand and rapidly changing environments. The UiPath end-to-end platform provides robotic process automation (RPA) at its core, removing manual work so users can focus on what matters most.