Resource Management in Wireless Networks via Multi-Agent Deep Reinforcement Learning

arXiv.org Machine Learning

We propose a mechanism for distributed radio resource management using multi-agent deep reinforcement learning (RL) for interference mitigation in wireless networks. We equip each transmitter in the network with a deep RL agent, which receives partial delayed observations from its associated users, while also exchanging observations with its neighboring agents, and decides on which user to serve and what transmit power to use at each scheduling interval. Our proposed framework enables the agents to make decisions simultaneously and in a distributed manner, without any knowledge about the concurrent decisions of other agents. Moreover, our design of the agents' observation and action spaces is scalable, in the sense that an agent trained on a scenario with a specific number of transmitters and receivers can be readily applied to scenarios with different numbers of transmitters and/or receivers. Simulation results demonstrate the superiority of our proposed approach compared to decentralized baselines in terms of the tradeoff between average and 5th percentile user rates, while achieving performance close to, and even in certain cases outperforming, that of a centralized information-theoretic scheduling algorithm. We also show that our trained agents are robust and maintain their performance gains when experiencing mismatches between training and testing deployments.

The authors are with Intel Corporation, Santa Clara, CA 95054.

I. INTRODUCTION

One of the key drivers for improving throughput in future wireless networks, including fifth generation mobile networks (5G), is the densification achieved by deploying more base stations. The rise of such ultra-dense network paradigms implies that the limited physical wireless resources (in time, frequency, etc.) need to support an increasing number of simultaneous transmissions.
Effective radio resource management procedures are, therefore, critical to mitigate the interference among such concurrent transmissions and achieve the desired performance enhancement in these ultra-dense environments. The radio resource management problem is in general non-convex and therefore computationally complex, especially as the network size increases. There is a rich literature of centralized and distributed algorithms for radio resource management, using various techniques in different areas such as geometric programming [1], weighted minimum mean square optimization [2], game theory [3], information theory [4], [5], and fractional programming [6].


Ensemble Slice Sampling

arXiv.org Machine Learning

Slice Sampling has emerged as a powerful Markov Chain Monte Carlo algorithm that adapts to the characteristics of the target distribution with minimal hand-tuning. However, Slice Sampling's performance is highly sensitive to the user-specified initial length scale hyperparameter. Moreover, Slice Sampling generally struggles with poorly scaled or strongly correlated distributions. This paper introduces Ensemble Slice Sampling, a new class of algorithms that bypasses such difficulties by adaptively tuning the length scale. Furthermore, Ensemble Slice Sampling's performance is immune to linear correlations by exploiting an ensemble of parallel walkers. These algorithms are trivial to construct, require no hand-tuning, and can easily be implemented in parallel computing environments. Empirical tests show that Ensemble Slice Sampling can improve efficiency by more than an order of magnitude compared to conventional MCMC methods on highly correlated target distributions such as the Autoregressive Process of Order 1 and the Correlated Funnel distribution.
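
To make the role of the initial length scale concrete, the following is a minimal sketch of a standard univariate slice sampler with stepping-out and shrinkage. It is not the ensemble algorithm introduced in the paper; the function name `slice_sample`, the parameter `width` (standing in for the initial length scale), and the standard normal target are assumptions chosen for illustration.

```python
import math
import random


def slice_sample(log_density, x0, width, n_samples, rng):
    """Univariate slice sampling with stepping-out and shrinkage (illustrative)."""
    samples = []
    x = x0
    for _ in range(n_samples):
        # Draw the log-height of the slice uniformly under the density at x.
        log_y = log_density(x) + math.log(1.0 - rng.random())
        # Place an interval of length `width` at a random offset around x.
        left = x - width * rng.random()
        right = left + width
        # Step out until both ends fall outside the slice.
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        # Propose uniformly in the interval, shrinking it after each rejection.
        while True:
            proposal = rng.uniform(left, right)
            if log_density(proposal) >= log_y:
                x = proposal
                break
            if proposal < x:
                left = proposal
            else:
                right = proposal
        samples.append(x)
    return samples


def standard_normal_log_density(x):
    # Log-density of N(0, 1) up to an additive constant (assumed target).
    return -0.5 * x * x


rng = random.Random(0)
draws = slice_sample(standard_normal_log_density, 0.0, 1.0, 5000, rng)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(f"mean={mean:.3f} var={var:.3f}")
```

A `width` far from the natural scale of the target forces many stepping-out or shrinkage iterations per sample, which is the sensitivity the abstract refers to.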


Fast Fair Regression via Efficient Approximations of Mutual Information

arXiv.org Machine Learning

Most work in algorithmic fairness to date has focused on discrete outcomes, such as deciding whether to grant someone a loan or not. In these classification settings, group fairness criteria such as independence, separation and sufficiency can be measured directly by comparing rates of outcomes between subpopulations. Many important problems, however, require the prediction of a real-valued outcome, such as a risk score or insurance premium. In such regression settings, measuring group fairness criteria is computationally challenging, as it requires estimating information-theoretic divergences between conditional probability density functions. This paper introduces fast approximations of the independence, separation and sufficiency group fairness criteria for regression models from their (conditional) mutual information definitions, and uses such approximations as regularisers to enforce fairness within a regularised risk minimisation framework. Experiments on real-world datasets indicate that, in spite of its superior computational efficiency, our algorithm still displays state-of-the-art accuracy/fairness tradeoffs.


Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function

arXiv.org Machine Learning

Optimization is a central ingredient of machine learning. First-order optimization algorithms, for instance, are particularly popular for deep learning tasks due to their scalability to high-dimensional problems, because they employ gradient but not higher-order information of objective functions for iteratively approximating minimizers. Among first-order methods, arguably the most used is the gradient descent method (GD), or rather one of its variants, the stochastic gradient descent method (SGD). Designed for objective functions that sum a large number of terms, which for instance can originate from big data, SGD introduces a randomization mechanism of gradient subsampling to improve the scalability of GD (e.g., Zhang [2004], Moulines and Bach [2011], Roux et al. [2012]). Consequently, the iteration of SGD, unlike GD, is not deterministic even when it is started at a fixed initial condition.
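
The contrast between the two iterations can be sketched on a toy problem. The example below is an illustrative assumption, not taken from the paper: the objective is the average of the terms (x - a_i)^2 / 2, whose minimizer is the mean of the a_i, and the function names, learning rate, and batch size are hypothetical choices.

```python
import random


def mean_gradient(x, terms):
    # Gradient of the average of (x - a)^2 / 2 over the given terms.
    return sum(x - a for a in terms) / len(terms)


def gd(x0, data, lr, steps):
    """Full-batch gradient descent: every step uses all terms."""
    x = x0
    for _ in range(steps):
        x -= lr * mean_gradient(x, data)
    return x


def sgd(x0, data, lr, steps, batch_size, rng):
    """Stochastic gradient descent: every step uses a random subsample."""
    x = x0
    for _ in range(steps):
        batch = rng.sample(data, batch_size)
        x -= lr * mean_gradient(x, batch)
    return x


data = [float(i) for i in range(10)]  # the minimizer is the mean, 4.5

# GD from a fixed initial condition is reproducible.
gd_a = gd(0.0, data, 0.1, 200)
gd_b = gd(0.0, data, 0.1, 200)

# SGD from the same initial condition depends on the random subsampling.
sgd_a = sgd(0.0, data, 0.1, 200, 2, random.Random(1))
sgd_b = sgd(0.0, data, 0.1, 200, 2, random.Random(2))

print(gd_a, gd_b, sgd_a, sgd_b)
```

With a constant learning rate, the SGD iterates in this sketch keep fluctuating around the minimizer instead of converging to it, whereas the two GD runs coincide exactly.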


Combining Parametric Land Surface Models with Machine Learning

arXiv.org Machine Learning

A hybrid machine learning and process-based-modeling (PBM) approach is proposed and evaluated at a handful of AmeriFlux sites to simulate the top-layer soil moisture state. The Hybrid-PBM (HPBM) employed here uses the Noah land-surface model integrated with Gaussian Processes. It is designed to correct the model only in climatological situations similar to the training data; otherwise, it reverts to the PBM. In this way, our approach avoids bad predictions in scenarios where similar training data is not available and incorporates our physical understanding of the system. Here we assume an autoregressive model and obtain out-of-sample results with upwards of a 3-fold reduction in the RMSE using a one-year leave-one-out cross-validation at each of the selected sites. A path is outlined for using hybrid modeling to build global land-surface models with the potential to significantly outperform the current state-of-the-art.


Scalable Neural Methods for Reasoning With a Symbolic Knowledge Base

arXiv.org Machine Learning

We describe a novel way of representing a symbolic knowledge base (KB) called a sparse-matrix reified KB. This representation enables neural modules that are fully differentiable, faithful to the original semantics of the KB, expressive enough to model multi-hop inferences, and scalable enough to use with realistically large KBs. The sparse-matrix reified KB can be distributed across multiple GPUs, can scale to tens of millions of entities and facts, and is orders of magnitude faster than naive sparse-matrix implementations. The reified KB enables very simple end-to-end architectures to obtain competitive performance on several benchmarks representing two families of tasks: KB completion, and learning semantic parsers from denotations.
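
As a generic illustration of why sparse matrices suit multi-hop inference, the sketch below treats a weighted set of entities as a sparse vector and each relation as a sparse matrix, so that following a relation is a sparse vector-matrix product. This is not the reified construction described in the paper; the dictionary-based storage, the function `follow`, and the toy entities and relations are all assumptions made for the example.

```python
def follow(entity_weights, relation):
    """One inference hop: sparse vector-matrix product over a relation.

    entity_weights maps entity -> weight (a sparse vector).
    relation maps subject -> {object: edge weight} (a sparse matrix).
    """
    result = {}
    for subject, weight in entity_weights.items():
        for obj, edge_weight in relation.get(subject, {}).items():
            result[obj] = result.get(obj, 0.0) + weight * edge_weight
    return result


# Hypothetical toy knowledge base with two relations.
born_in = {"alice": {"paris": 1.0}, "bob": {"lyon": 1.0}}
located_in = {"paris": {"france": 1.0}, "lyon": {"france": 1.0}}

# A soft (weighted) starting set of entities.
start = {"alice": 0.75, "bob": 0.25}

# Two-hop inference: birthplace, then the country containing it.
cities = follow(start, born_in)
countries = follow(cities, located_in)
print(cities, countries)
```

Because each hop only uses sums and products of weights, the same computation written with a tensor library would be differentiable with respect to the starting weights.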


Multi-variate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows

arXiv.org Machine Learning

Time series forecasting is often fundamental to scientific and engineering problems and enables decision making. With ever increasing data set sizes, a trivial solution to scale up predictions is to assume independence between interacting time series. However, modeling statistical dependencies can improve accuracy and enable analysis of interaction effects. Deep learning methods are well suited for this problem, but multi-variate models often assume a simple parametric distribution and do not scale to high dimensions. In this work we model the multi-variate temporal dynamics of time series via an autoregressive deep learning model, where the data distribution is represented by a conditioned normalizing flow. This combination retains the power of autoregressive models, such as good performance in extrapolation into the future, with the flexibility of flows as a general purpose high-dimensional distribution model, while remaining computationally tractable. We show that it improves over the state-of-the-art for standard metrics on many real-world data sets with several thousand interacting time-series.


Robust Reinforcement Learning via Adversarial training with Langevin Dynamics

arXiv.org Machine Learning

We introduce a sampling perspective to tackle the challenging task of training robust Reinforcement Learning (RL) agents. Leveraging the powerful Stochastic Gradient Langevin Dynamics, we present a novel, scalable two-player RL algorithm, which is a sampling variant of the two-player policy gradient method. Our algorithm consistently outperforms existing baselines, in terms of generalization across different training and testing conditions, on several MuJoCo environments. Our experiments also show that, even for objective functions that entirely ignore potential environmental shifts, our sampling approach remains highly robust in comparison to standard RL algorithms.


Estimating Gradients for Discrete Random Variables by Sampling without Replacement

arXiv.org Machine Learning

We derive an unbiased estimator for expectations over discrete random variables based on sampling without replacement, which reduces variance as it avoids duplicate samples. We show that our estimator can be derived as the Rao-Blackwellization of three different estimators. Combining our estimator with REINFORCE, we obtain a policy gradient estimator and we reduce its variance using a built-in control variate which is obtained without additional model evaluations. The resulting estimator is closely related to other gradient estimators. Experiments with a toy problem, a categorical Variational Auto-Encoder and a structured prediction problem show that our estimator is the only estimator that is consistently among the best estimators in both high and low entropy settings.


Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances

arXiv.org Machine Learning

Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions according to the results obtained for early NIST SRE (Speaker Recognition Evaluation) datasets. From the practical point of view, taking into account the increased interest in virtual assistants (such as Amazon Alexa, Google Home, Apple Siri, etc.), speaker verification on short utterances in uncontrolled noisy environment conditions is one of the most challenging and highly demanded tasks. This paper presents approaches aimed at achieving two goals: a) improve the quality of far-field speaker verification systems in the presence of environmental noise and reverberation, and b) reduce the system quality degradation for short utterances. For these purposes, we considered deep neural network architectures based on TDNN (Time Delay Neural Network) and ResNet (Residual Neural Network) blocks. We experimented with state-of-the-art embedding extractors and their training procedures. The obtained results confirm that ResNet architectures outperform the standard x-vector approach in terms of speaker verification quality for both long-duration and short-duration utterances. We also investigate the impact of the speech activity detector, different scoring models, and adaptation and score normalization techniques. The experimental results are presented for publicly available data and verification protocols for the VoxCeleb1, VoxCeleb2, and VOiCES datasets.