 Optimization


On Efficient Multilevel Clustering via Wasserstein Distances

arXiv.org Machine Learning

We propose a novel approach to the problem of multilevel clustering, which aims to simultaneously partition data in each group and discover grouping patterns among groups in a potentially large hierarchically structured corpus of data. Our method involves a joint optimization formulation over several spaces of discrete probability measures, which are endowed with Wasserstein distance metrics. We propose several variants of this problem, which admit fast optimization algorithms, by exploiting the connection to the problem of finding Wasserstein barycenters. Consistency properties are established for the estimates of both local and global clusters. Finally, the experimental results with both synthetic and real data are presented to demonstrate the flexibility and scalability of the proposed approach.
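As a small illustration of the distance the abstract refers to, the sketch below computes the p-Wasserstein distance between two discrete measures on the real line. It assumes both measures are uniform over the same number of atoms, in which case the optimal coupling simply matches sorted atoms. The function name `wasserstein_1d` is hypothetical and is not taken from the paper, whose method operates on more general spaces of measures.

```python
def wasserstein_1d(xs, ys, p=1):
    """p-Wasserstein distance between two uniform empirical measures on the line.

    Assumption of this sketch: both measures have the same number of atoms,
    each carrying equal mass, so sorting gives the optimal matching.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal numbers of atoms")
    if not xs:
        raise ValueError("measures must have at least one atom")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    # Average transport cost under the sorted (monotone) matching.
    cost = sum(abs(x - y) ** p for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return cost ** (1.0 / p)


# Shifting every atom by 1 moves each unit of mass a distance of 1.
shifted_distance = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

For measures in higher dimensions or with unequal weights, the matching is no longer given by sorting and a linear program or an entropic approximation is typically used instead.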


Visualizing Movement Control Optimization Landscapes

arXiv.org Machine Learning

A large body of animation research focuses on optimization of movement control, either as action sequences or policy parameters. However, as closed-form expressions of the objective functions are often not available, our understanding of the optimization problems is limited. Building on recent work on analyzing neural network training, we contribute novel visualizations of high-dimensional control optimization landscapes; this yields insights into why control optimization is hard and why common practices like early termination and spline-based action parameterizations make optimization easier. For example, our experiments show how trajectory optimization can become increasingly ill-conditioned with longer trajectories, but parameterizing control as partial target states - e.g., target angles converted to torques using a PD-controller - can act as an efficient preconditioner. Both our visualizations and quantitative empirical data also indicate that neural network policy optimization scales better than trajectory optimization for long planning horizons. Our work advances the understanding of movement optimization and our visualizations should also provide value in educational use.
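The PD-controller parameterization mentioned in the abstract can be sketched as follows. This is a minimal illustration only: the gains `kp` and `kd`, the unit-inertia joint, and the integration step are assumptions chosen for the example, not values from the paper.

```python
def pd_torque(target_angle, angle, angular_velocity, kp=50.0, kd=5.0):
    """Convert a target angle into a torque with a proportional-derivative law."""
    return kp * (target_angle - angle) - kd * angular_velocity


def simulate_joint(target_angle, steps=1000, dt=0.01, inertia=1.0):
    """Drive a single frictionless joint toward the target (semi-implicit Euler)."""
    angle, velocity = 0.0, 0.0
    for _ in range(steps):
        torque = pd_torque(target_angle, angle, velocity)
        velocity += (torque / inertia) * dt  # update velocity from torque first
        angle += velocity * dt               # then integrate the new velocity
    return angle


final_angle = simulate_joint(target_angle=1.0)
```

With this parameterization the optimizer searches over target angles rather than raw torques, and the controller supplies the stabilizing feedback.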


Deep Learning Assisted Heuristic Tree Search for the Container Pre-marshalling Problem

arXiv.org Artificial Intelligence

The container pre-marshalling problem (CPMP) is concerned with the re-ordering of containers in container terminals during off-peak times so that containers can be quickly retrieved when the port is busy. The problem has received significant attention in the literature and is addressed by a large number of exact and heuristic methods. Existing methods for the CPMP heavily rely on problem-specific components (e.g., proven lower bounds) that need to be developed by domain experts with knowledge of optimization techniques and a deep understanding of the problem at hand. With the goal to automate the costly and time-intensive design of heuristics for the CPMP, we propose a new method called Deep Learning Heuristic Tree Search (DLTS). It uses deep neural networks to learn solution strategies and lower bounds customized to the CPMP solely through analyzing existing (near-) optimal solutions to CPMP instances. The networks are then integrated into a tree search procedure to decide which branch to choose next and to prune the search tree. DLTS produces the highest quality heuristic solutions to the CPMP to date with gaps to optimality below 2% on real-world sized instances.


Two complexity results on c-optimality in experimental design

#artificialintelligence

Finding a c-optimal design of a regression model is a basic optimization problem in statistics. We study the computational complexity of the problem in the case of a finite experimental domain. We formulate a decision version of the problem and prove its NP-completeness. We provide examples of computationally complex instances of the design problem, motivated by cryptography. The problem, being NP-complete, is then relaxed; we prove that a decision version of the relaxation, called approximate c-optimality, is P-complete.


Solving Combinatorial Optimization problems with Quantum inspired Evolutionary Algorithm Tuned using a Novel Heuristic Method

arXiv.org Artificial Intelligence

Quantum inspired Evolutionary Algorithms (QEAs) were proposed more than a decade ago and have been employed for solving a wide range of difficult search and optimization problems. A number of changes have been proposed to improve the performance of canonical QEA. However, canonical QEA is one of the few evolutionary algorithms that uses a search operator with a relatively large number of parameters. It is well known that the performance of evolutionary algorithms depends on the specific parameter values chosen for a given problem. The advantage of having a large number of parameters in an operator is that the search process can be made more powerful even with a single operator, without requiring a combination of other operators for exploration and exploitation. However, tuning operators with a large number of parameters is complex and computationally expensive. This paper proposes a novel heuristic method for tuning the parameters of canonical QEA. The tuned QEA outperforms canonical QEA on a class of discrete combinatorial optimization problems, which validates the design of the proposed parameter tuning framework. The proposed framework can be used for tuning other algorithms with both large and small numbers of tunable parameters.


Sparse Canonical Correlation Analysis via Concave Minimization

arXiv.org Machine Learning

A new approach to sparse Canonical Correlation Analysis (sCCA) is proposed with the aim of discovering interpretable associations in very high-dimensional multi-view problems, i.e., problems with observations of multiple sets of variables on the same subjects. Inspired by the sparse PCA approach of Journee et al. (2010), we also show that the sparse CCA formulation, while non-convex, is equivalent to a maximization program of a convex objective over a compact set, for which we propose a first-order gradient method. This result helps us reduce the search space drastically to the boundaries of the set. Consequently, we propose a two-step algorithm: we first infer the sparsity pattern of the canonical directions using our fast algorithm, then we shrink each view, i.e., the observations of a set of covariates, to contain only observations on the covariates selected in the previous step, and compute their canonical directions via any CCA algorithm. We also introduce Directed Sparse CCA, which is able to find associations that are aligned with a specified experiment design, and Multi-View sCCA, which is used to discover associations between multiple sets of covariates. Our simulations establish the superior convergence properties and computational efficiency of our algorithm, as well as its accuracy in terms of the canonical correlation and its ability to recover the supports of the canonical directions. We study the associations between metabolomics, transcriptomics and microbiomics in a multi-omic study using MuLe, an R package that implements our approach, in order to form hypotheses on mechanisms of adaptation of Drosophila melanogaster to high doses of environmental toxicants, specifically Atrazine, which is a commonly used chemical fertilizer.


A Joint Learning and Communications Framework for Federated Learning over Wireless Networks

arXiv.org Machine Learning

In this paper, the problem of training federated learning (FL) algorithms over a realistic wireless network is studied. In particular, in the considered model, wireless users execute an FL algorithm while training their local FL models using their own data and transmitting the trained local FL models to a base station (BS) that will generate a global FL model and send it back to the users. Since all training parameters are transmitted over wireless links, the quality of the training will be affected by wireless factors such as packet errors and the availability of wireless resources. Meanwhile, due to the limited wireless bandwidth, the BS must select an appropriate subset of users to execute the FL algorithm so as to build a global FL model accurately. This joint learning, wireless resource allocation, and user selection problem is formulated as an optimization problem whose goal is to minimize an FL loss function that captures the performance of the FL algorithm. To address this problem, a closed-form expression for the expected convergence rate of the FL algorithm is first derived to quantify the impact of wireless factors on FL. Finally, the user selection and uplink resource block (RB) allocation are optimized so as to minimize the FL loss function.

M. Chen is with the Chinese University of Hong Kong, Shenzhen, 518172, China, and also with the Department of Electrical Engineering, Princeton University, Princeton, NJ, 08544, USA, Email: mingzhec@princeton.edu. Z. Yang is with the Centre for Telecommunications Research, Department of Informatics, King's College London, WC2B 4BG, UK, Email: yang.zhaohui@kcl.ac.uk. W. Saad is with the Wireless@VT, Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA, 24060, USA, Email: walids@vt.edu. C. Yin is with the Beijing Key Laboratory of Network System Architecture and Convergence, Beijing University of Posts and Telecommunications, Beijing, 100876, China, Email: ccyin@ieee.org. Poor is with the Department of Electrical Engineering, Princeton University, Princeton, NJ, 08544, USA, Email: poor@princeton.edu. S. Cui is with the Shenzhen Research Institute of Big Data and School of Science and Engineering, the Chinese University of Hong Kong, Shenzhen, 518172, China, Email: robert.cui@gmail.com. This work was supported in part by the U.S. National Science Foundation under Grants CNS-1836802 and CCF-0939370.
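To make the aggregation step in the federated learning abstract concrete, the sketch below averages local models at the base station while skipping users whose uploads were lost to packet errors. It assumes a FedAvg-style average weighted by local sample counts; the function name `aggregate_global_model` and the boolean `received` flags are hypothetical and do not come from the paper.

```python
def aggregate_global_model(local_models, sample_counts, received):
    """Sample-weighted average of the local models that arrived without error.

    local_models:  list of parameter vectors (lists of floats), one per user
    sample_counts: number of training samples held by each user
    received:      True if the user's upload reached the base station intact
    """
    total = sum(n for n, ok in zip(sample_counts, received) if ok)
    if total == 0:
        raise ValueError("no local model was received in this round")
    dim = len(local_models[0])
    global_model = [0.0] * dim
    for weights, n, ok in zip(local_models, sample_counts, received):
        if not ok:
            continue  # a packet error removes this user from the round
        for j in range(dim):
            global_model[j] += n * weights[j] / total
    return global_model


# The third user's upload is dropped, so only the first two contribute.
global_model = aggregate_global_model(
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
    sample_counts=[1, 3, 5],
    received=[True, True, False],
)
```

The example shows why packet errors and user selection matter for training quality: a dropped upload changes which data the global model reflects in that round.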


Gated Recurrent Units Learning for Optimal Deployment of Visible Light Communications Enabled UAVs

arXiv.org Machine Learning

In this paper, the problem of optimizing the deployment of unmanned aerial vehicles (UAVs) equipped with visible light communication (VLC) capabilities is studied. In the studied model, the UAVs can simultaneously provide communications and illumination to service ground users. Ambient illumination increases the interference over VLC links while reducing the illumination threshold of the UAVs. Therefore, it is necessary to consider the illumination distribution of the target area for UAV deployment optimization. This problem is formulated as an optimization problem whose goal is to minimize the total transmit power while meeting the illumination and communication requirements of users. To solve this problem, an algorithm based on the machine learning framework of gated recurrent units (GRUs) is proposed. Using GRUs, the UAVs can model the long-term historical illumination distribution and predict the future illumination distribution. In order to reduce the complexity of the prediction algorithm while accurately predicting the illumination distribution, a Gaussian mixture model (GMM) is used to fit the illumination distribution of the target area at each time slot. Based on the predicted illumination distribution, the optimization problem is proved to be a convex optimization problem that can be solved by using duality. Simulations using real data from the Earth observations group (EOG) at NOAA/NCEI show that the proposed approach can achieve up to 22.1% reduction in transmit power compared to a conventional optimal UAV deployment that does not consider the illumination distribution. The results also show that UAVs must hover at areas having strong illumination, thus providing useful guidelines on the deployment of VLC-enabled UAVs.


Bayesian Optimization under Heavy-tailed Payoffs

arXiv.org Machine Learning

We consider black box optimization of an unknown function in the nonparametric Gaussian process setting when the noise in the observed function values can be heavy tailed. This is in contrast to existing literature that typically assumes sub-Gaussian noise distributions for queries. Under the assumption that the unknown function belongs to the Reproducing Kernel Hilbert Space (RKHS) induced by a kernel, we first show that an adaptation of the well-known GP-UCB algorithm with reward truncation enjoys sublinear $\tilde{O}(T^{\frac{2 + \alpha}{2(1+\alpha)}})$ regret even with only the $(1+\alpha)$-th moments, $\alpha \in (0,1]$, of the reward distribution being bounded ($\tilde{O}$ hides logarithmic factors). However, for the common squared exponential (SE) and Matérn kernels, this is seen to be significantly larger than a fundamental $\Omega(T^{\frac{1}{1+\alpha}})$ lower bound on regret. We resolve this gap by developing novel Bayesian optimization algorithms, based on kernel approximation techniques, with regret bounds matching the lower bound in order for the SE kernel. We numerically benchmark the algorithms on environments based on both synthetic models and real-world data sets.
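Reward truncation, as mentioned in the abstract, can be illustrated with a very small sketch. It assumes the common form in which an observation is replaced by zero whenever its magnitude exceeds a threshold. The threshold value and the name `truncate_reward` are assumptions for illustration; the abstract does not state how the authors choose the threshold.

```python
def truncate_reward(reward, threshold):
    """Keep the observation if it is within the threshold, otherwise zero it out."""
    return reward if abs(reward) <= threshold else 0.0


# One heavy-tailed outlier (57.0) would dominate an ordinary average.
observations = [0.4, -0.2, 57.0, 0.1]
truncated = [truncate_reward(y, threshold=5.0) for y in observations]
```

Feeding the truncated values to the GP update limits the influence of rare extreme observations, at the cost of a small bias.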


Gumbel-softmax Optimization: A Simple General Framework for Combinatorial Optimization Problems on Graphs

arXiv.org Machine Learning

Many real-life problems can be converted to combinatorial optimization problems (COPs) on graphs, that is, finding the best node state configuration or network structure such that a designed objective function is optimized under some constraints. However, these problems are notoriously hard to solve because most of them are NP-hard or NP-complete. Although traditional general methods such as simulated annealing (SA) and genetic algorithms (GA), among others, have been devised for these hard problems, their accuracy and running time are not satisfactory in practice. In this work, we propose a simple, fast, and general algorithmic framework called Gumbel-softmax Optimization (GSO) for COPs. By introducing the Gumbel-softmax technique developed in the machine learning community, we can optimize the objective function directly by gradient descent regardless of the discrete nature of the variables. We test our algorithm on four different problems: the Sherrington-Kirkpatrick (SK) model, the maximum independent set (MIS) problem, modularity optimization, and a structural optimization problem. High-quality solutions can be obtained in much less time than with traditional approaches.
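The Gumbel-softmax relaxation that the abstract builds on can be sketched in a few lines. This is a standalone sampling example, not the GSO framework itself: the function name `gumbel_softmax`, the clamping constant `eps`, and the temperature value are assumptions for illustration, and a real implementation would use an automatic differentiation library so that gradients can flow through the sample.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random, eps=1e-12):
    """Draw a relaxed one-hot sample from the categorical distribution given by logits."""
    perturbed = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logarithms are finite.
        u = min(max(rng.random(), eps), 1.0 - eps)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(perturbed)
    exps = [math.exp(s - peak) for s in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


# A relaxed state assignment for one node with three possible states.
sample = gumbel_softmax([0.0, 1.0, -1.0], temperature=0.5, rng=random.Random(0))
```

Lower temperatures push the sample toward a one-hot vector, while higher temperatures give smoother outputs, which is the trade-off that lets gradient descent operate on discrete choices.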