Country
Learning-Based Link Scheduling in Millimeter-wave Multi-connectivity Scenarios
Tatino, Cristian, Pappas, Nikolaos, Malanchini, Ilaria, Ewe, Lutz, Yuan, Di
Multi-connectivity is emerging as a promising solution to provide reliable communications and seamless connectivity for the millimeter-wave frequency range. Due to the blockage sensitivity at such high frequencies, connectivity with multiple cells can drastically increase the network performance in terms of throughput and reliability. However, an inefficient link scheduling, i.e., over and under-provisioning of connections, can lead either to high interference and energy consumption or to unsatisfied user's quality of service (QoS) requirements. In this work, we present a learning-based solution that is able to learn and then to predict the optimal link scheduling to satisfy users' QoS requirements while avoiding communication interruptions. Moreover, we compare the proposed approach with two base line methods and the genie-aided link scheduling that assumes perfect channel knowledge. We show that the learning-based solution approaches the optimum and outperforms the base line methods.
Self-Tuning Deep Reinforcement Learning
Zahavy, Tom, Xu, Zhongwen, Veeriah, Vivek, Hessel, Matteo, Oh, Junhyuk, van Hasselt, Hado, Silver, David, Singh, Satinder
Reinforcement learning (RL) algorithms often require expensive manual or automated hyperparameter searches in order to perform well on a new domain. This need is particularly acute in modern deep RL architectures which often incorporate many modules and multiple loss functions. In this paper, we take a step towards addressing this issue by using metagradients (Xu et al., 2018) to tune these hyperparameters via differentiable cross validation, whilst the agent interacts with and learns from the environment. We present the Self-Tuning Actor Critic (STAC) which uses this process to tune the hyperparameters of the usual loss function of the IMPALA actor critic agent(Espeholt et. al., 2018), to learn the hyperparameters that define auxiliary loss functions, and to balance trade offs in off policy learning by introducing and adapting the hyperparameters of a novel leaky V-trace operator. The method is simple to use, sample efficient and does not require significant increase in compute. Ablative studies show that the overall performance of STAC improves as we adapt more hyperparameters. When applied to 57 games on the Atari 2600 environment over 200 million frames our algorithm improves the median human normalized score of the baseline from 243% to 364%.
Compressive Learning of Generative Networks
Schellekens, Vincent, Jacques, Laurent
Generative networks implicitly approximate complex densities from their sampling with impressive accuracy. However, because of the enormous scale of modern datasets, this training process is often computationally expensive. We cast generative network training into the recent framework of compressive learning: we reduce the computational burden of large-scale datasets by first harshly compressing them in a single pass as a single sketch vector. We then propose a cost function, which approximates the Maximum Mean Discrepancy metric, but requires only this sketch, which makes it time- and memory-efficient to optimize.
Transformer++
Recent advancements in attention mechanisms have replaced recurrent neural networks and its variants for machine translation tasks. Transformer using attention mechanism solely achieved state-of-the-art results in sequence modeling. Neural machine translation based on the attention mechanism is parallelizable and addresses the problem of handling long-range dependencies among words in sentences more effectively than recurrent neural networks. One of the key concepts in attention is to learn three matrices, query, key, and value, where global dependencies among words are learned through linearly projecting word embeddings through these matrices. Multiple query, key, value matrices can be learned simultaneously focusing on a different subspace of the embedded dimension, which is called multi-head in Transformer. We argue that certain dependencies among words could be learned better through an intermediate context than directly modeling word-word dependencies. This could happen due to the nature of certain dependencies or lack of patterns that lend them difficult to be modeled globally using multi-head self-attention. In this work, we propose a new way of learning dependencies through a context in multi-head using convolution. This new form of multi-head attention along with the traditional form achieves better results than Transformer on the WMT 2014 English-to-German and English-to-French translation tasks. We also introduce a framework to learn POS tagging and NER information during the training of encoder which further improves results achieving a new state-of-the-art of 32.1 BLEU, better than existing best by 1.4 BLEU, on the WMT 2014 English-to-German and 44.6 BLEU, better than existing best by 1.1 BLEU, on the WMT 2014 English-to-French translation tasks. We call this Transformer++.
A Question-Centric Model for Visual Question Answering in Medical Imaging
Vu, Minh H., Lรถfstedt, Tommy, Nyholm, Tufve, Sznitman, Raphael
Deep learning methods have proven extremely effective at performing a variety of medical image analysis tasks. With their potential use in clinical routine, their lack of transparency has however been one of their few weak points, raising concerns regarding their behavior and failure modes. While most research to infer model behavior has focused on indirect strategies that estimate prediction uncertainties and visualize model support in the input image space, the ability to explicitly query a prediction model regarding its image content offers a more direct way to determine the behavior of trained models. To this end, we present a novel Visual Question Answering approach that allows an image to be queried by means of a written question. Experiments on a variety of medical and natural image datasets show that by fusing image and question features in a novel way, the proposed approach achieves an equal or higher accuracy compared to current methods.
Two Decades of AI4NETS-AI/ML for Data Networks: Challenges & Research Directions
The popularity of Artificial Intelligence (AI) -- and of Machine Learning (ML) as an approach to AI, has dramatically increased in the last few years, due to its outstanding performance in various domains, notably in image, audio, and natural language processing. In these domains, AI success-stories are boosting the applied field. When it comes to AI/ML for data communication Networks (AI4NETS), and despite the many attempts to turn networks into learning agents, the successful application of AI/ML in networking is limited. There is a strong resistance against AI/ML-based solutions, and a striking gap between the extensive academic research and the actual deployments of such AI/ML-based systems in operational environments. The truth is, there are still many unsolved complex challenges associated to the analysis of networking data through AI/ML, which hinders its acceptability and adoption in the practice. In this positioning paper I elaborate on the most important show-stoppers in AI4NETS, and present a research agenda to tackle some of these challenges, enabling a natural adoption of AI/ML for networking. In particular, I focus the future research in AI4NETS around three major pillars: (i) to make AI/ML immediately applicable in networking problems through the concepts of effective learning, turning it into a useful and reliable way to deal with complex data-driven networking problems; (ii) to boost the adoption of AI/ML at the large scale by learning from the Internet-paradigm itself, conceiving novel distributed and hierarchical learning approaches mimicking the distributed topological principles and operation of the Internet itself; and (iii) to exploit the softwarization and distribution of networks to conceive AI/ML-defined Networks (AIDN), relying on the distributed generation and re-usage of knowledge through novel Knowledge Delivery Networks (KDNs).
Detecting and Characterizing Bots that Commit Code
Dey, Tapajit, Mousavi, Sara, Ponce, Eduardo, Fry, Tanner, Vasilescu, Bogdan, Filippova, Anna, Mockus, Audris
Background: Some developer activity traditionally performed manually, such as making code commits, opening, managing, or closing issues is increasingly subject to automation in many OSS projects. Specifically, such activity is often performed by tools that react to events or run at specific times. We refer to such automation tools as bots and, in many software mining scenarios related to developer productivity or code quality it is desirable to identify bots in order to separate their actions from actions of individuals. Aim: Find an automated way of identifying bots and code committed by these bots, and to characterize the types of bots based on their activity patterns. Method and Result: We propose BIMAN, a systematic approach to detect bots using author names, commit messages, files modified by the commit, and projects associated with the ommits. For our test data, the value for AUC-ROC was 0.9. We also characterized these bots based on the time patterns of their code commits and the types of files modified, and found that they primarily work with documentation files and web pages, and these files are most prevalent in HTML and JavaScript ecosystems. We have compiled a shareable dataset containing detailed information about 461 bots we found (all of whom have more than 1000 commits) and 14,678,222 commits they created.
IntrA: 3D Intracranial Aneurysm Dataset for Deep Learning
Yang, Xi, Xia, Ding, Kin, Taichi, Igarashi, Takeo
Medicine is an important application area for deep learning models. Research in this field is a combination of medical expertise and data science knowledge. In this paper, instead of 2D medical images, we introduce an open-access 3D intracranial aneurysm dataset, IntrA, that makes the application of points-based and mesh-based classification and segmentation models available. Our dataset can be used to diagnose intracranial aneurysms and to extract the neck for a clipping operation in medicine and other areas of deep learning, such as normal estimation and surface reconstruction. We provide a large-scale benchmark of classification and part segmentation by testing state-of-the-art networks. We also discuss the performance of each method and demonstrate the challenges of our dataset. The published dataset can be accessed here: https://github.com/intra3d2019/IntrA.
Learning in the Sky: An Efficient 3D Placement of UAVs
Arani, Atefeh Hajijamali, Azari, M. Mahdi, Melek, William, Safavi-Naeini, Safieddin
Deployment of unmanned aerial vehicles (UAVs) as aerial base stations can deliver a fast and flexible solution for serving varying traffic demand. In order to adequately benefit of UAVs deployment, their efficient placement is of utmost importance, and requires to intelligently adapt to the environment changes. In this paper, we propose a learning-based mechanism for the three-dimensional deployment of UAVs assisting terrestrial cellular networks in the downlink. The problem is modeled as a non-cooperative game among UAVs in satisfaction form. To solve the game, we utilize a low complexity algorithm, in which unsatisfied UAVs update their locations based on a learning algorithm. Simulation results reveal that the proposed UAV placement algorithm yields significant performance gains up to about 52% and 74% in terms of throughput and the number of dropped users, respectively, compared to an optimized baseline algorithm.
Explainable and Scalable Machine-Learning Algorithms for Detection of Autism Spectrum Disorder using fMRI Data
Eslami, Taban, Raiker, Joseph S., Saeed, Fahad
Specifically, misdiagnosis as well as the provision of treatments intended for one disorder to those impacted by another disorder represent critical challenges to the field of mental health and highlight an urgent need for the identification of appropriate biomarkers as well as novel approaches to aid clinicians in diagnosing mental illness and monitoring treatment outcomes [1]. The current psychiatric diagnostic process is based nearly exclusively on reports of behavioral and/or emotional observations of symptomology (DSM-5/ICD-10) and requires careful consideration of multiple aspects related to descriptions of these observations by informants such as parents, teachers, or children (e.g., when symptoms started, what they look like over time, how long they have lasted). Collectively, all of these factors rely heavily upon: a) clinicians obtaining sufficient information to characterize the nature of symptomatology and b) patient's and/or other informant's (e.g., parents, teachers) ability to sufficiently describe these symptoms as well as recall their onset, course, and duration. Unfortunately, in contrast to other health conditions (e.g., diabetes, HIV, hepatitis-C) which can be characterized based on interpretation of quantitative tests of biological markers (e.g., blood tests), no such quantitative tests of biological processes currently exist for mental health conditions such as ASD [2]. Fortunately, the emergence of "big data" analytics and the increasing availability of data shared across various repositories - particularly in the case of federally funded research - provide a renewed opportunity for leveraging more advanced analytic approaches (e.g., machine learning, deep learning) to develop diagnostic aids for clinicians as well as isolate potential biomarkers for a number of these mental health conditions (e.g., ADHD, ASD). Autism Spectrum Disorder (ASD) represents a lifelong neuro-developmental disorder characterized by deficits in social communication, social interaction, and stereotypical interests and/or repetitive behaviors that emerge in the early developmental period [3].