 Optimization


Deep Learning for Radio Resource Allocation with Diverse Quality-of-Service Requirements in 5G

arXiv.org Machine Learning

To accommodate diverse Quality-of-Service (QoS) requirements in 5th generation cellular networks, base stations need real-time optimization of radio resources in time-varying network conditions. This brings high computing overheads and long processing delays. In this work, we develop a deep learning framework to approximate the optimal resource allocation policy that minimizes the total power consumption of a base station by optimizing bandwidth and transmit power allocation. We find that a fully connected neural network (NN) cannot fully guarantee the QoS requirements due to the approximation errors and quantization errors of the numbers of subcarriers. To tackle this problem, we propose a cascaded structure of NNs, where the first NN approximates the optimal bandwidth allocation, and the second NN outputs the transmit power required to satisfy the QoS requirement with given bandwidth allocation. Considering that the distribution of wireless channels and the types of services in the wireless networks are non-stationary, we apply deep transfer learning to update NNs in non-stationary wireless networks. Simulation results validate that the cascaded NNs outperform the fully connected NN in terms of QoS guarantee. In addition, deep transfer learning can remarkably reduce the number of training samples required to train the NNs.

I. INTRODUCTION

A. Background

The 5th Generation (5G) cellular networks are expected to support various emerging applications with diverse Quality-of-Service (QoS) requirements, such as enhanced mobile broadband services, massive [...]

To guarantee the QoS requirements of different types of services, existing optimization algorithms for radio resource allocation are designed to maximize spectrum efficiency or energy efficiency by optimizing scarce radio resources, such as time-frequency resource blocks and transmit power, subject to QoS constraints [3-9]. There are two major challenges for implementing existing optimization algorithms in practical 5G networks. First, QoS constraints of some services, such as delay-sensitive and URLLC services, may not have closed-form expressions. To execute an optimization algorithm, the system needs to evaluate the QoS achieved by a certain policy via extensive simulations or experiments, and thus suffers from long processing delay [9, 10]. Second, even if the closed-form expressions of QoS constraints can be obtained in some scenarios, the optimization problems are non-convex in general [8, 10, 11].

This paper has been presented in part at the IEEE Global Communications Conference 2019 [1]. The authors are with the School of Electrical and Information Engineering, University of Sydney, Sydney, NSW 2006, Australia (email: {rui.dong, [...]


A General Large Neighborhood Search Framework for Solving Integer Programs

arXiv.org Machine Learning

This paper studies how to design abstractions of large-scale combinatorial optimization problems that can leverage existing state-of-the-art solvers in general-purpose ways, and that are amenable to data-driven design. The goal is to arrive at new approaches that can reliably outperform existing solvers in wall-clock time. We focus on solving integer programs, and ground our approach in the large neighborhood search (LNS) paradigm, which iteratively chooses a subset of variables to optimize while leaving the remainder fixed. The appeal of LNS is that it can easily use any existing solver as a subroutine, and thus can inherit the benefits of carefully engineered heuristic approaches and their software implementations. We also show that one can learn a good neighborhood selector from training data. Through an extensive empirical validation, we demonstrate that our LNS framework can significantly outperform state-of-the-art commercial solvers such as Gurobi in wall-clock time.
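The LNS loop described in this abstract (pick a subset of variables, re-optimize them, keep the rest fixed) can be illustrated on a toy 0/1 knapsack problem. This is a minimal sketch under stated assumptions, not the paper's framework: the function names, the random neighborhood selector, and the brute-force `solve_subproblem` routine (standing in for a call to an existing solver) are all hypothetical.

```python
import itertools
import random


def solve_subproblem(values, weights, capacity, x, free):
    """Re-optimize only the variables in `free`; all others stay fixed.

    Brute force is a stand-in for an off-the-shelf solver (assumption).
    """
    best = list(x)
    best_val = sum(v * xi for v, xi in zip(values, x))
    for bits in itertools.product((0, 1), repeat=len(free)):
        cand = list(x)
        for i, b in zip(free, bits):
            cand[i] = b
        if sum(w * c for w, c in zip(weights, cand)) <= capacity:
            val = sum(v * c for v, c in zip(values, cand))
            if val > best_val:
                best, best_val = cand, val
    return best, best_val


def lns_knapsack(values, weights, capacity, subset_size=3, iters=200, seed=0):
    """Large neighborhood search with a uniformly random neighborhood selector."""
    rng = random.Random(seed)
    n = len(values)
    x, val = [0] * n, 0  # the empty knapsack is always feasible
    for _ in range(iters):
        free = rng.sample(range(n), min(subset_size, n))
        x, val = solve_subproblem(values, weights, capacity, x, free)
    return x, val


values, weights, capacity = [10, 13, 7, 8, 4], [3, 4, 2, 3, 1], 7
x, val = lns_knapsack(values, weights, capacity)
print(x, val)
```

Because each subproblem keeps the incumbent as a candidate, the objective never gets worse from one iteration to the next. A learned neighborhood selector, as the abstract proposes, would replace the `rng.sample` call.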


A hybrid optimization procedure for solving a tire curing scheduling problem

arXiv.org Artificial Intelligence

This paper addresses a lot-sizing and scheduling problem variant arising from the study of the curing process of a tire factory. The aim is to find the minimum makespan needed for producing enough tires to meet the demand requirements on time, considering the availability and compatibility of the different resources involved. To solve this problem, we suggest a hybrid approach that consists of first applying a heuristic to obtain an estimated value of the makespan and then solving a mathematical model to determine the minimum value. We note that the size of the model (number of variables and constraints) depends significantly on the estimated makespan. Extensive numerical experiments over different instances based on real data are presented to evaluate the effectiveness of the proposed hybrid procedure. The results show that the hybrid approach is able to achieve the optimal makespan for many of the instances, even large ones, since the results provided by the heuristic allow the size of the mathematical model to be reduced significantly.


Differentially Private Federated Learning for Resource-Constrained Internet of Things

arXiv.org Machine Learning

With the proliferation of smart devices having built-in sensors, Internet connectivity, and programmable computation capability in the era of the Internet of Things (IoT), a tremendous amount of data is being generated at the network edge. Federated learning is capable of analyzing the large amount of data from a distributed set of smart devices without requiring them to upload their data to a central place. However, the commonly used federated learning algorithm is based on stochastic gradient descent (SGD) and is not suitable for resource-constrained IoT environments due to its high communication resource requirement. Moreover, the privacy of sensitive data on smart devices has become a key concern and needs to be protected rigorously. This paper proposes a novel federated learning framework called DP-PASGD for training a machine learning model efficiently from the data stored across resource-constrained smart devices in IoT while guaranteeing differential privacy. The optimal schematic design of DP-PASGD that maximizes the learning performance while satisfying the limits on resource cost and privacy loss is formulated as an optimization problem, and an approximate solution method based on the convergence analysis of DP-PASGD is developed to solve the optimization problem efficiently. Numerical results based on real-world datasets verify the effectiveness of the proposed DP-PASGD scheme.


Bayesian Hierarchical Multi-Objective Optimization for Vehicle Parking Route Discovery

arXiv.org Artificial Intelligence

Finding an optimal route to the most feasible parking lot is a concern for any driver, and the problem is aggravated during peak hours and at congested places, leading to considerable waste of time and fuel. This paper proposes a Bayesian hierarchical technique for obtaining an optimal route to a parking lot. The route selection is based on conflicting objectives, and hence the problem belongs to the domain of multi-objective optimization. A probabilistic data-driven method is used to overcome the inherent problem of weight selection in the popular weighted sum technique. The weights of these conflicting objectives are refined using a Bayesian hierarchical model based on the Multinomial distribution and a Dirichlet prior. A genetic algorithm is used to obtain optimal solutions. Simulated data is used to obtain routes that are in close agreement with real-life situations.
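To make the weighted sum idea concrete, the sketch below derives objective weights from a Dirichlet prior updated with observed counts and uses them to rank two routes. It assumes, for illustration only, a Multinomial likelihood over how often each objective was preferred; the abstract does not specify the model at this level. The route names, objective values, and counts are hypothetical, and a simple minimum over candidates replaces the genetic algorithm.

```python
def posterior_mean_weights(prior_alpha, counts):
    """Posterior mean of a Dirichlet prior after Multinomial counts.

    Conjugacy gives Dirichlet(alpha + counts), whose mean is the
    normalized sum of prior parameters and counts.
    """
    total = sum(prior_alpha) + sum(counts)
    return [(a + n) / total for a, n in zip(prior_alpha, counts)]


def weighted_sum_cost(objectives, weights):
    """Scalarize several (normalized, lower-is-better) objectives."""
    return sum(w * o for w, o in zip(weights, objectives))


def best_route(routes, weights):
    """Return the route name with the lowest weighted-sum cost."""
    return min(routes, key=lambda name: weighted_sum_cost(routes[name], weights))


# Hypothetical normalized objectives: (travel time, distance, congestion).
routes = {"A": [0.2, 0.9, 0.5], "B": [0.6, 0.3, 0.4]}

# Hypothetical counts of how often each objective mattered most to drivers.
weights = posterior_mean_weights([1, 1, 1], [6, 3, 1])
print(weights, best_route(routes, weights))
```

With the uniform prior alone, route B has the lower cost. Once the counts shift weight toward travel time, route A is preferred, which shows how the choice of weights changes the selected route.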


Gryffin: An algorithm for Bayesian optimization for categorical variables informed by physical intuition with applications to chemistry

arXiv.org Machine Learning

Designing functional molecules and advanced materials requires complex interdependent design choices: tuning continuous process parameters such as temperatures or flow rates, while simultaneously selecting categorical variables like catalysts or solvents. To date, the development of data-driven experiment planning strategies for autonomous experimentation has largely focused on continuous process parameters, despite the urgent need for efficient strategies for selecting categorical variables to substantially accelerate scientific discovery. We introduce Gryffin, a general-purpose optimization framework for the autonomous selection of categorical variables driven by expert knowledge. Gryffin augments Bayesian optimization with kernel density estimation using smooth approximations to categorical distributions. Leveraging domain knowledge from physicochemical descriptors to characterize categorical options, Gryffin can significantly accelerate the search for promising molecules and materials. Gryffin can further highlight relevant correlations between the provided descriptors to inspire physical insights and foster scientific intuition. In addition to comprehensive benchmarks, we demonstrate the capabilities and performance of Gryffin on three examples in materials science and chemistry: (i) the discovery of non-fullerene acceptors for organic solar cells, (ii) the design of hybrid organic-inorganic perovskites for light-harvesting, and (iii) the identification of ligands and process parameters for Suzuki-Miyaura reactions. Our observations suggest that Gryffin, in its simplest form without descriptors, constitutes a competitive categorical optimizer compared to state-of-the-art approaches. However, when leveraging domain knowledge provided via descriptors, Gryffin can optimize at considerably higher rates and refine this domain knowledge to spark scientific understanding.


Preferential Batch Bayesian Optimization

arXiv.org Machine Learning

Most research in Bayesian optimization (BO) has focused on direct feedback scenarios, where one has access to exact, or perturbed, values of some expensive-to-evaluate objective. This direction has been mainly driven by the use of BO in machine learning hyper-parameter configuration problems. However, in domains such as modelling human preferences, A/B tests or recommender systems, there is a need for methods that are able to replace direct feedback with preferential feedback, obtained via rankings or pairwise comparisons. In this work, we present Preferential Batch Bayesian Optimization (PBBO), a new framework for finding the optimum of a latent function of interest, given any type of parallel preferential feedback for a group of two or more points. We do so by using a Gaussian process model with a likelihood specially designed to enable parallel and efficient data collection mechanisms, which are key in modern machine learning. We show how the acquisitions developed under this framework generalize and augment previous approaches in Bayesian optimization, expanding the use of these techniques to a wider range of domains. An extensive simulation study shows the benefits of this approach, both with simulated functions and four real data sets.


Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence

arXiv.org Machine Learning

This paper develops a unified framework, based on iterated random operator theory, to analyze the convergence of constant stepsize recursive stochastic algorithms (RSAs) in machine learning and reinforcement learning. RSAs use randomization to efficiently compute expectations, and so their iterates form a stochastic process. The key idea is to lift the RSA into an appropriate higher-dimensional space and then express it as an equivalent Markov chain. Instead of determining the convergence of this Markov chain (which may not converge under constant stepsize), we study the convergence of the distribution of this Markov chain. To study this, we define a new notion of Wasserstein divergence. We show that if the distribution of the iterates in the Markov chain satisfies a certain contraction property with respect to the Wasserstein divergence, then the Markov chain admits an invariant distribution. Inspired by the SVRG algorithm, we develop a method to convert any RSA to a variance reduced RSA that converges to the optimal solution in the almost sure sense or in probability. We show that convergence of a large family of constant stepsize RSAs can be understood using this framework. We apply this framework to ascertain the convergence of mini-batch SGD, forward-backward splitting with catalyst, SVRG, SAGA, empirical Q value iteration, synchronous Q-learning, enhanced policy iteration, and MDPs with a generative model. We also develop two new algorithms for reinforcement learning and establish their convergence using this framework.


Zeroth-order Optimization on Riemannian Manifolds

arXiv.org Machine Learning

We propose and analyze zeroth-order algorithms for optimization over Riemannian manifolds, where we observe only potentially noisy evaluations of the objective function. Our approach is based on estimating the Riemannian gradient from the objective function evaluations. We consider three settings for the objective function: (i) deterministic and smooth, (ii) stochastic and smooth, and (iii) composition of smooth and non-smooth parts. For each setting, we characterize the oracle complexity of our algorithm to obtain appropriately defined notions of $\epsilon$-stationary points. Notably, our complexities are independent of the ambient dimension of the Euclidean space in which the manifold is embedded, and depend only on the intrinsic dimension of the manifold. As a proof of concept, we demonstrate the applicability of our method to the problem of black-box attacks on deep neural networks, by providing experimental results based on simulated and real-world image data.
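The idea of estimating a Riemannian gradient from function evaluations alone can be sketched on the unit sphere. The code below is an illustrative assumption, not the paper's algorithm: it uses a Gaussian-smoothing finite-difference estimator with directions projected onto the tangent space, a renormalizing retraction, and a toy quadratic objective. All function names and parameter values are hypothetical.

```python
import math
import random


def f(x):
    """Toy objective x^T diag(1, 2, 3) x; its minimum on the unit sphere is 1."""
    return sum(c * xi * xi for c, xi in zip((1.0, 2.0, 3.0), x))


def retract(x, v):
    """Move from x along tangent vector v, then renormalize onto the sphere."""
    y = [xi + vi for xi, vi in zip(x, v)]
    norm = math.sqrt(sum(yi * yi for yi in y))
    return [yi / norm for yi in y]


def zo_riemannian_grad(f, x, mu, batch, rng):
    """Average of finite-difference estimates along random tangent directions."""
    g = [0.0] * len(x)
    fx = f(x)
    for _ in range(batch):
        u = [rng.gauss(0.0, 1.0) for _ in x]
        dot = sum(ui * xi for ui, xi in zip(u, x))
        u = [ui - dot * xi for ui, xi in zip(u, x)]  # project onto tangent space at x
        scale = (f(retract(x, [mu * ui for ui in u])) - fx) / mu
        g = [gi + scale * ui / batch for gi, ui in zip(g, u)]
    return g


def zo_descent(f, x0, step=0.05, mu=1e-4, batch=20, iters=300, seed=0):
    """Gradient descent on the sphere using only evaluations of f."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        g = zo_riemannian_grad(f, x, mu, batch, rng)
        x = retract(x, [-step * gi for gi in g])
    return x


x0 = [1 / math.sqrt(3)] * 3
x = zo_descent(f, x0)
print(f(x0), f(x))
```

Only calls to `f` are used, never its derivative. The random directions live in the two-dimensional tangent space of the sphere rather than in the three-dimensional ambient space, which mirrors the abstract's point about intrinsic versus ambient dimension.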


An Inverse-free Truncated Rayleigh-Ritz Method for Sparse Generalized Eigenvalue Problem

arXiv.org Machine Learning

This paper considers the sparse generalized eigenvalue problem (SGEP), which aims to find the leading eigenvector with at most $k$ nonzero entries. SGEP naturally arises in many applications in machine learning, statistics, and scientific computing, for example, sparse principal component analysis (SPCA), sparse discriminant analysis (SDA), and sparse canonical correlation analysis (SCCA). In this paper, we focus on the development of a three-stage algorithm named the inverse-free truncated Rayleigh-Ritz method (IFTRR) to efficiently solve SGEP. In each iteration of IFTRR, only a small number of matrix-vector products is required. This makes IFTRR well-suited for large-scale problems. In particular, a new truncation strategy is proposed, which is able to find the support set of the leading eigenvector effectively. Theoretical results are developed to explain why IFTRR works well. Numerical simulations demonstrate the merits of IFTRR.