Country
Feature-wise change detection and robust indoor positioning using RANSAC-like approach
Fingerprinting-based positioning, one of the promising indoor positioning solutions, has been broadly explored owing to the pervasiveness of sensor-rich mobile devices, the prosperity of opportunistically measurable location-relevant signals and the progress of data-driven algorithms. One critical challenge is to controland improve the quality of the reference fingerprint map (RFM), which is built at the offline stage and applied for online positioning. The key concept concerningthe quality control of the RFM is updating the RFM according to the newly measured data. Though varies methods have been proposed for adapting the RFM, they approach the problem by introducing extra-positioning schemes (e.g. PDR orUGV) and directly adjust the RFM without distinguishing whether critical changes have occurred. This paper aims at proposing an extra-positioning-free solution by making full use of the redundancy of measurable features. Loosely inspired by random sampling consensus (RANSAC), arbitrarily sampled subset of features from the online measurement are used for generating multi-resamples, which areused for estimating the intermediate locations. In the way of resampling, it can mitigate the impact of the changed features on positioning and enables to retrieve accurate location estimation. The users location is robustly computed by identifying the candidate locations from these intermediate ones using modified Jaccardindex (MJI) and the feature-wise change belief is calculated according to the world model of the RFM and the estimated variability of features. In order to validate our proposed approach, two levels of experimental analysis have been carried out. On the simulated dataset, the average change detection accuracy is about 90%. Meanwhile, the improvement of positioning accuracy within 2 m is about 20% by dropping out the features that are detected as changed when performing positioning comparing to that of using all measured features for location estimation. On the long-term collected dataset, the average change detection accuracy is about 85%.
Improving Question Generation with Sentence-level Semantic Matching and Answer Position Inferring
Ma, Xiyao, Zhu, Qile, Zhou, Yanlin, Li, Xiaolin, Wu, Dapeng
Taking an answer and its context as input, sequence-to- sequence models have made considerable progress on question generation. However, we observe that these approaches often generate wrong question words or keywords and copy answer-irrelevant words from the input. We believe that lacking global question semantics and exploiting answer position-awareness not well are the key root causes. In this paper, we propose a neural question generation model with two concrete modules: sentence-level semantic matching and answer position inferring. Further, we enhance the initial state of the decoder by leveraging the answer-aware gated fusion mechanism. Experimental results demonstrate that our model outperforms the state-of-the-art (SOT A) models on SQuAD and MARCO datasets. Owing to its generality, our work also improves the existing models significantly.
Balancing the Tradeoff Between Clustering Value and Interpretability
Saisubramanian, Sandhya, Galhotra, Sainyam, Zilberstein, Shlomo
Graph clustering groups entities -- the vertices of a graph -- based on their similarity, typically using a complex distance function over a large number of features. Successful integration of clustering approaches in automated decision-support systems hinges on the interpretability of the resulting clusters. This paper addresses the problem of generating interpretable clusters, given features of interest that signify interpretability to an end-user, by optimizing interpretability in addition to common clustering objectives. We propose a $\beta$-interpretable clustering algorithm that ensures that at least $\beta$ fraction of nodes in each cluster share the same feature value. The tunable parameter $\beta$ is user-specified. We also present a more efficient algorithm for scenarios with $\beta\!=\!1$ and analyze the theoretical guarantees of the two algorithms. Finally, we empirically demonstrate the benefits of our approaches in generating interpretable clusters using four real-world datasets. The interpretability of the clusters is complemented by generating simple explanations denoting the feature values of the nodes in the clusters, using frequent pattern mining.
Interestingness Elements for Explainable Reinforcement Learning: Understanding Agents' Capabilities and Limitations
Sequeira, Pedro, Gervasio, Melinda
We propose an explainable reinforcement learning (XRL) framework that analyzes an agent's history of interaction with the environment to extract interestingness elements that help explain its behavior. The framework relies on data readily available from standard RL algorithms, augmented with data that can easily be collected by the agent while learning. We describe how to create visual explanations of an agent's behavior in the form of short video-clips highlighting key interaction moments, based on the proposed elements. We also report on a user study where we evaluated the ability of humans in correctly perceiving the aptitude of agents with different characteristics, including their capabilities and limitations, given explanations automatically generated by our framework. The results show that the diversity of aspects captured by the different interestingness elements is crucial to help humans correctly identify the agents' aptitude in the task, and determine when they might need adjustments to improve their performance.
Unsupervised Adversarial Image Inpainting
Pajot, Arthur, de Bezenac, Emmanuel, Gallinari, Patrick
We consider inpainting in an unsupervised setting where there is neither access to paired nor unpaired training data. The only available information is provided by the uncomplete observations and the inpainting process statistics. In this context, an observation should give rise to several plausible reconstructions which amounts at learning a distribution over the space of reconstructed images. We model the reconstruction process by using a conditional GAN with constraints on the stochastic component that introduce an explicit dependency between this component and the generated output. This allows us sampling from the latent component in order to generate a distribution of images associated to an observation. We demonstrate the capacity of our model on several image datasets: faces (CelebA), food images (Recipe-1M) and bedrooms (LSUN Bedrooms) with different types of imputation masks. The approach yields comparable performance to model variants trained with additional supervision.
Cooperative Perception for 3D Object Detection in Driving Scenarios using Infrastructure Sensors
Arnold, Eduardo, Dianati, Mehrdad, de Temple, Robert
The perception system of an autonomous vehicle is responsible for mapping sensor observations into a semantic description of the vehicle's environment. 3D object detection is a common function within this system and outputs a list of 3D bounding boxes around objects of interest. Various 3D object detection methods have relied on fusion of different sensor modalities to overcome limitations of individual sensors. However, occlusion, limited field-of-view and low-point density of the sensor data cannot be reliably and cost-effectively addressed by multi-modal sensing from a single point of view. Alternatively, cooperative perception incorporates information from spatially diverse sensors distributed around the environment as a way to mitigate these limitations. This paper proposes two schemes for cooperative 3D object detection. The early fusion scheme combines point clouds from multiple spatially diverse sensing points of view before detection. In contrast, the late fusion scheme fuses the independently estimated bounding boxes from multiple spatially diverse sensors. We evaluate the performance of both schemes using a synthetic cooperative dataset created in two complex driving scenarios, a T-junction and a roundabout. The evaluation show that the early fusion approach outperforms late fusion by a significant margin at the cost of higher communication bandwidth. The results demonstrate that cooperative perception can recall more than 95% of the objects as opposed to 30% for single-point sensing in the most challenging scenario. To provide practical insights into the deployment of such system, we report how the number of sensors and their configuration impact the detection performance of the system.
Variable-lag Granger Causality for Time Series Analysis
Amornbunchornvej, Chainarong, Zheleva, Elena, Berger-Wolf, Tanya Y.
Granger causality is a fundamental technique for causal inference in time series data, commonly used in the social and biological sciences. Typical operationalizations of Granger causality make a strong assumption that every time point of the effect time series is influenced by a combination of other time series with a fixed time delay. However, the assumption of the fixed time delay does not hold in many applications, such as collective behavior, financial markets, and many natural phenomena. To address this issue, we develop variable-lag Granger causality, a generalization of Granger causality that relaxes the assumption of the fixed time delay and allows causes to influence effects with arbitrary time delays. In addition, we propose a method for inferring variable-lag Granger causality relations. We demonstrate our approach on an application for studying coordinated collective behavior and show that it performs better than several existing methods in both simulated and real-world datasets. Our approach can be applied in any domain of time series analysis.
MedCAT -- Medical Concept Annotation Tool
Kraljevic, Zeljko, Bean, Daniel, Mascio, Aurelie, Roguski, Lukasz, Folarin, Amos, Roberts, Angus, Bendayan, Rebecca, Dobson, Richard
Biomedical documents such as Electronic Health Records (EHRs) contain a large amount of information in an unstructured format. The data in EHRs is a hugely valuable resource documenting clinical narratives and decisions, but whilst the text can be easily understood by human doctors it is challenging to use in research and clinical applications. To uncover the potential of biomedical documents we need to extract and structure the information they contain. The task at hand is Named Entity Recognition and Linking (NER+L). The number of entities, ambiguity of words, overlapping and nesting make the biomedical area significantly more difficult than many others. To overcome these difficulties, we have developed the Medical Concept Annotation Tool (MedCAT), an open-source unsupervised approach to NER+L. MedCAT uses unsupervised machine learning to disambiguate entities. It was validated on MIMIC-III (a freely accessible critical care database) and MedMentions (Biomedical papers annotated with mentions from the Unified Medical Language System). In case of NER+L, the comparison with existing tools shows that MedCAT improves the previous best with only unsupervised learning (F1=0.848 vs 0.691 for disease detection; F1=0.710 vs. 0.222 for general concept detection). A qualitative analysis of the vector embeddings learnt by MedCAT shows that it captures latent medical knowledge available in EHRs (MIMIC-III). Unsupervised learning can improve the performance of large scale entity extraction, but it has some limitations when working with only a couple of entities and a small dataset. In that case options are supervised learning or active learning, both of which are supported in MedCAT via the MedCATtrainer extension. Our approach can detect and link millions of different biomedical concepts with state-of-the-art performance, whilst being lightweight, fast and easy to use.
Expanding Label Sets for Graph Convolutional Networks
Coskun, Mustafa, Gungor, Burcu Bakir, Koyuturk, Mehmet
In recent years, Graph Convolutional Networks (GCNs) and their variants have been widely utilized in learning tasks that involve graphs. These tasks include recommendation systems, node classification, among many others. In node classification problem, the input is a graph in which the edges represent the association between pairs of nodes, multi-dimensional feature vectors are associated with the nodes, and some of the nodes in the graph have known labels. The objective is to predict the labels of the nodes that are not labeled, using the nodes features, in conjunction with graph topology. While GCNs have been successfully applied to this problem, the caveats that they inherit from traditional deep learning models pose significant challenges to broad utilization of GCNs in node classification. One such caveat is that training a GCN requires a large number of labeled training instances, which is often not the case in realistic settings. To remedy this requirement, state-of-the-art methods leverage network diffusion-based approaches to propagate labels across the network before training GCNs. However, these approaches ignore the tendency of the network diffusion methods in biasing proximity with centrality, resulting in the propagation of labels to the nodes that are well-connected in the graph. To address this problem, here we present an alternate approach to extrapolating node labels in GCNs in the following three steps: (i) clustering of the network to identify communities, (ii) use of network diffusion algorithms to quantify the proximity of each node to the communities, thereby obtaining a low-dimensional topological profile for each node, (iii) comparing these topological profiles to identify nodes that are most similar to the labeled nodes.
Location Forensics Analysis Using ENF Sequences Extracted from Power and Audio Recordings
Chowdhury, Dhiman, Sarkar, Mrinmoy
Electrical network frequency (ENF) is the signature of a power distribution grid which represents the nominal frequency (50 or 60 Hz) of a power system network. Due to load variations in a power grid, ENF sequences experience fluctuations. These ENF variations are inherently located in a multimedia signal which is recorded close to the grid or directly from the mains power line. Therefore, a multimedia recording can be localized by analyzing the ENF sequences of that signal in absence of the concurrent power signal. In this paper, a novel approach to analyze location forensics using ENF sequences extracted from a number of power and audio recordings is proposed. The digital recordings are collected from different grid locations around the world. Potential feature components are determined from the ENF sequences. Then, a multi-class support vector machine (SVM) classification model is developed to validate the location authenticity of the recordings. The performance assessments affirm the efficacy of the presented work.