Country
On Two Distinct Sources of Nonidentifiability in Latent Position Random Graph Models
Agterberg, Joshua, Tang, Minh, Priebe, Carey E.
The statistical analysis of network data is important for fields such as neuroscience (Vogelstein et al., 2012), sociology (Hoff et al., 2002), and physics (Newman and Girvan, 2004; Bickel and Chen, 2009). Recently, network data have become ubiquitous in the the modern data-science landscape, and a large literature on statistical methods for analyzing these data has developed. Popular statistical models for conditionally independent random graphs include, but are not limited to, the stochastic block model (Holland et al., 1983), the random dot product graph (Young and Scheinerman, 2007; Athreya et al., 2017), and graphons (Lovász, 2012; Diaconis and Janson, 2007). Both the stochastic block model and the random dot product graph are examples of latent position random graphs (Hoff et al., 2002), a graph model that is motivated by the idea that individual nodes have latent positions whose values determine their propensity to form connections. The purpose of this manuscript is to explain a curious phenomenon that arises in latent position random graph settings.
Deep State Space Models for Nonlinear System Identification
Gedon, Daniel, Wahlström, Niklas, Schön, Thomas B., Ljung, Lennart
An actively evolving model class for generative temporal models developed in the deep learning community are deep state space models (SSMs) which have a close connection to classic SSMs. In this work six new deep SSMs are implemented and evaluated for the identification of established nonlinear dynamic system benchmarks. The models and their parameter learning algorithms are elaborated rigorously. The usage of deep SSMs as a black-box identification model can describe a wide range of dynamics due to the flexibility of deep neural networks. Additionally, the uncertainty of the system is modelled and therefore one obtains a much richer representation and a whole class of systems to describe the underlying dynamics.
Peri-Diagnostic Decision Support Through Cost-Efficient Feature Acquisition at Test-Time
Vivar, Gerome, Mullakaeva, Kamilia, Zwergal, Andreas, Navab, Nassir, Ahmadi, Seyed-Ahmad
Computer-aided diagnosis (CADx) algorithms in medicine provide patient-specific decision support for physicians. These algorithms are usually applied after full acquisition of high-dimensional multimodal examination data, and often assume feature-completeness. This, however, is rarely the case due to examination costs, invasiveness, or a lack of indication. A sub-problem in CADx, which to our knowledge has received very little attention among the CADx community so far, is to guide the physician during the entire peri-diagnostic workflow, including the acquisition stage. We model the following question, asked from a physician's perspective: ''Given the evidence collected so far, which examination should I perform next, in order to achieve the most accurate and efficient diagnostic prediction?''. In this work, we propose a novel approach which is enticingly simple: use dropout at the input layer, and integrated gradients of the trained network at test-time to attribute feature importance dynamically. We validate and explain the effectiveness of our proposed approach using two public medical and two synthetic datasets. Results show that our proposed approach is more cost- and feature-efficient than prior approaches and achieves a higher overall accuracy. This directly translates to less unnecessary examinations for patients, and a quicker, less costly and more accurate decision support for the physician.
A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification
Coria, Juan M., Bredin, Hervé, Ghannay, Sahar, Rosset, Sophie
Despite the growing popularity of metric learning approaches, very little work has attempted to perform a fair comparison of these techniques for speaker verification. We try to fill this gap and compare several metric learning loss functions in a systematic manner on the VoxCeleb dataset. The first family of loss functions is derived from the cross entropy loss (usually used for supervised classification) and includes the congenerous cosine loss, the additive angular margin loss, and the center loss. The second family of loss functions focuses on the similarity between training samples and includes the contrastive loss and the triplet loss. We show that the additive angular margin loss function outperforms all other loss functions in the study, while learning more robust representations. Based on a combination of SincNet trainable features and the x-vector architecture, the network used in this paper brings us a step closer to a really-end-to-end speaker verification system, when combined with the additive angular margin loss, while still being competitive with the x-vector baseline. In the spirit of reproducible research, we also release open source Python code for reproducing our results, and share pretrained PyTorch models on torch.hub that can be used either directly or after fine-tuning.
Deep semantic gaze embedding and scanpath comparison for expertise classification during OPT viewing
Castner, Nora, Kübler, Thomas, Scheiter, Katharina, Richter, Juilane, Eder, Thérése, Hüttig, Fabian, Keutel, Constanze, Kasneci, Enkelejda
Modeling eye movement indicative of expertise behavior is decisive in user evaluation. However, it is indisputable that task semantics affect gaze behavior. We present a novel approach to gaze scanpath comparison that incorporates convolutional neural networks (CNN) to process scene information at the fixation level. Image patches linked to respective fixations are used as input for a CNN and the resulting feature vectors provide the temporal and spatial gaze information necessary for scanpath similarity comparison.We evaluated our proposed approach on gaze data from expert and novice dentists interpreting dental radiographs using a local alignment similarity score. Our approach was capable of distinguishing experts from novices with 93% accuracy while incorporating the image semantics. Moreover, our scanpath comparison using image patch features has the potential to incorporate task semantics from a variety of tasks
A Thorough Comparison Study on Adversarial Attacks and Defenses for Common Thorax Disease Classification in Chest X-rays
Rao, Chendi, Cao, Jiezhang, Zeng, Runhao, Chen, Qi, Fu, Huazhu, Xu, Yanwu, Tan, Mingkui
Recently, deep neural networks (DNNs) have made great progress on automated diagnosis with chest X-rays images. However, DNNs are vulnerable to adversarial examples, which may cause misdiagnoses to patients when applying the DNN based methods in disease detection. Recently, there is few comprehensive studies exploring the influence of attack and defense methods on disease detection, especially for the multi-label classification problem. In this paper, we aim to review various adversarial attack and defense methods on chest X-rays. First, the motivations and the mathematical representations of attack and defense methods are introduced in details. Second, we evaluate the influence of several state-of-the-art attack and defense methods for common thorax disease classification in chest X-rays. We found that the attack and defense methods have poor performance with excessive iterations and large perturbations. To address this, we propose a new defense method that is robust to different degrees of perturbations. This study could provide new insights into methodological development for the community.
Graph Hawkes Network for Reasoning on Temporal Knowledge Graphs
Han, Zhen, Wang, Yuyi, Ma, Yunpu, Günnemann, Stephan, Tresp, Volker
The Hawkes process has become a standard method for modeling self-exciting event sequences with different event types. A recent work generalizing the Hawkes process to a neurally self-modulating multivariate point process enables the capturing of more complex and realistic influences of past events on the future. However, this approach is limited by the number of event types, making it impossible to model the dynamics of evolving graph sequences, where each possible link between two nodes can be considered as an event type. The problem becomes even more dramatic when links are directional and labeled, since, in this case, the number of event types scales with the number of nodes and link types. To address this issue, we propose the Graph Hawkes Network to capture the dynamics of evolving graph sequences. Extensive experiments on large-scale temporal relational databases, such as temporal knowledge graphs, demonstrate the effectiveness of our approach.
A Decentralized Policy with Logarithmic Regret for a Class of Multi-Agent Multi-Armed Bandit Problems with Option Unavailability Constraints and Stochastic Communication Protocols
Pankayaraj, Pathmanathan, Maithripala, D. H. S., Berg, J. M.
This paper considers a multi-armed bandit (MAB) problem in which multiple mobile agents receive rewards by sampling from a collection of spatially dispersed stochastic processes, called bandits. The goal is to formulate a decentralized policy for each agent, in order to maximize the total cumulative reward over all agents, subject to option availability and inter-agent communication constraints. The problem formulation is motivated by applications in which a team of autonomous mobile robots cooperates to accomplish an exploration and exploitation task in an uncertain environment. Bandit locations are represented by vertices of the spatial graph. At any time, an agent's option consist of sampling the bandit at its current location, or traveling along an edge of the spatial graph to a new bandit location. Communication constraints are described by a directed, non-stationary, stochastic communication graph. At any time, agents may receive data only from their communication graph in-neighbors. For the case of a single agent on a fully connected spatial graph, it is known that the expected regret for any optimal policy is necessarily bounded below by a function that grows as the logarithm of time. A class of policies called upper confidence bound (UCB) algorithms asymptotically achieve logarithmic regret for the classical MAB problem. In this paper, we propose a UCB-based decentralized motion and option selection policy and a non-stationary stochastic communication protocol that guarantee logarithmic regret. To our knowledge, this is the first such decentralized policy for non-fully connected spatial graphs with communication constraints. When the spatial graph is fully connected and the communication graph is stationary, our decentralized algorithm matches or exceeds the best reported prior results from the literature.
Correlated daily time series and forecasting in the M4 competition
Ingel, Anti, Shahroudi, Novin, Kängsepp, Markus, Tättar, Andre, Komisarenko, Viacheslav, Kull, Meelis
We participated in the M4 competition for time series forecasting and describe here our methods for forecasting daily time series. We used an ensemble of five statistical forecasting methods and a method that we refer to as the correlator. Our retrospective analysis using the ground truth values published by the M4 organisers after the competition demonstrates that the correlator was responsible for most of our gains over the naive constant forecasting method. We identify data leakage as one reason for its success, partly due to test data selected from different time intervals, and partly due to quality issues in the original time series. We suggest that future forecasting competitions should provide actual dates for the time series so that some of those leakages could be avoided by the participants.
Importance of using appropriate baselines for evaluation of data-efficiency in deep reinforcement learning for Atari
Reinforcement learning (RL) has seen great advancements in the past few years. Nevertheless, the consensus among the RL community is that currently used methods, despite all their benefits, suffer from extreme data inefficiency, especially in the rich visual domains like Atari. To circumvent this problem, novel approaches were introduced that often claim to be much more efficient than popular variations of the state-of-the-art DQN algorithm. In this paper, however, we demonstrate that the newly proposed techniques simply used unfair baselines in their experiments. Namely, we show that the actual improvement in the efficiency came from allowing the algorithm for more training updates for each data sample, and not from employing the new methods. By allowing DQN to execute network updates more frequently we manage to reach similar or better results than the recently proposed advancement, often at a fraction of complexity and computational costs. Furthermore, based on the outcomes of the study, we argue that the agent similar to the modified DQN that is presented in this paper should be used as a baseline for any future work aimed at improving sample efficiency of deep reinforcement learning.