 Optimization


Overcomplete Independent Component Analysis via SDP

arXiv.org Machine Learning

We present a novel algorithm for overcomplete independent component analysis (ICA), where the number of latent sources k exceeds the dimension p of observed variables. Previous algorithms either suffer from high computational complexity or make strong assumptions about the form of the mixing matrix. Our algorithm does not make any sparsity assumption yet enjoys favorable computational and theoretical properties. Our algorithm consists of two main steps: (a) estimation of the Hessians of the cumulant generating function (as opposed to the fourth and higher order cumulants used by most algorithms) and (b) a novel semi-definite programming (SDP) relaxation for recovering a mixing component. We show that this relaxation can be efficiently solved with a projected accelerated gradient descent method, which makes the whole algorithm computationally practical. Moreover, we conjecture that the proposed program recovers a mixing component at the rate k < p^2/4, and we prove that a mixing component can be recovered with high probability when k < (2 - epsilon) p log p, provided the original components are sampled uniformly at random on the hypersphere. Experiments are provided on synthetic data and the CIFAR-10 dataset of real images.


Pretending Fair Decisions via Stealthily Biased Sampling

arXiv.org Machine Learning

Fairness by decision-makers is believed to be auditable by third parties. In this study, we show that this is not always true. We consider the following scenario. Imagine a decision-maker who discloses a subset of his dataset with decisions to make his decisions auditable. If he is corrupt, and he deliberately selects a subset that looks fair even though the overall decision is unfair, can we identify this decision-maker's fraud? We answer this question negatively. We first propose a sampling method that produces a subset whose distribution is biased from the original (to pretend to be fair), yet is difficult to distinguish from uniform sampling. We call such a sampling method stealthily biased sampling; it is formulated as a Wasserstein distance minimization problem and is solved through a minimum-cost flow computation. We prove that stealthily biased sampling minimizes an upper bound of the indistinguishability. We conducted experiments showing that stealthily biased sampling is, in fact, difficult to detect.


A Review on Quantile Regression for Stochastic Computer Experiments

arXiv.org Machine Learning

We report on an empirical study of the main strategies for conditional quantile estimation in the context of stochastic computer experiments. To ensure adequate diversity, six metamodels are presented, divided into three categories based on order statistics, functional approaches, and those of Bayesian inspiration. The metamodels are tested on several problems characterized by the size of the training set, the input dimension, the quantile order and the value of the probability density function in the neighborhood of the quantile. The metamodels studied reveal good contrasts in our set of 480 experiments, enabling several patterns to be extracted. Based on our results, guidelines are proposed to allow users to select the best method for a given problem.
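As a rough illustration of what conditional quantile estimation optimizes, the sketch below shows the standard pinball (quantile) loss together with a plain order-statistic estimate. It is not taken from the paper: the function names `pinball_loss`, `mean_pinball_loss` and `empirical_quantile` are hypothetical, and the six metamodels in the study are considerably more elaborate.

```python
import math


def pinball_loss(y, q, tau):
    """Pinball loss of predicting q for an observation y at quantile order tau."""
    diff = y - q
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(samples, q, tau):
    """Average pinball loss of a constant prediction q over the samples."""
    return sum(pinball_loss(y, q, tau) for y in samples) / len(samples)


def empirical_quantile(samples, tau):
    """Order-statistic estimate: the ceil(tau * n)-th smallest sample."""
    ordered = sorted(samples)
    index = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[index]
```

On a fixed sample, the order-statistic estimate is one of the minimizers of the mean pinball loss, which is why loss-based and order-statistic approaches target the same quantity.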


SPI-Optimizer: an integral-Separated PI Controller for Stochastic Optimization

arXiv.org Machine Learning

To overcome the oscillation problem in the classical momentum-based optimizer, recent work associates it with the proportional-integral (PI) controller and artificially adds a D term, producing a PID controller. This suppresses oscillation at the cost of introducing an extra hyperparameter. In this paper, we start by asking why momentum-based methods oscillate about the optimal point, and answer that the fluctuation problem relates to the lag effect of the integral (I) term. Inspired by the conditional integration idea from the classical control community, we propose SPI-Optimizer, an integral-Separated PI controller based optimizer that does NOT introduce an extra hyperparameter. It separates the momentum term adaptively when the current and historical gradient directions are inconsistent. Extensive experiments demonstrate that SPI-Optimizer generalizes well on popular network architectures to eliminate the oscillation, and achieves competitive performance with faster convergence (up to 40% reduction in epochs) and more accurate classification results on MNIST, CIFAR10, and CIFAR100 (up to 27.5% error reduction) than the state-of-the-art methods.
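The idea of separating the momentum term can be sketched as follows. This is only one plausible reading of the abstract, not the authors' update rule: the per-coordinate sign test, the function names `spi_step` and `minimize_quadratic`, and the values of `lr` and `momentum` are all assumptions made for illustration.

```python
def spi_step(x, velocity, grad, lr=0.1, momentum=0.9):
    """One momentum step that drops the accumulated term on sign disagreement.

    Assumption: "inconsistency" is tested per coordinate by comparing the sign
    of the current gradient with the sign of the accumulated velocity.
    """
    new_x, new_v = [], []
    for xi, vi, gi in zip(x, velocity, grad):
        if gi * vi > 0:
            # Directions agree: keep the integral (momentum) term.
            vi = momentum * vi + lr * gi
        else:
            # Directions disagree (or no history yet): plain gradient step.
            vi = lr * gi
        new_v.append(vi)
        new_x.append(xi - vi)
    return new_x, new_v


def minimize_quadratic(x0, steps=200):
    """Toy driver: minimize f(x) = sum(x_i^2), whose gradient is 2 * x."""
    x = list(x0)
    v = [0.0] * len(x)
    for _ in range(steps):
        grad = [2.0 * xi for xi in x]
        x, v = spi_step(x, v, grad)
    return x
```

In this toy setting the velocity is reset each time an iterate overshoots the minimum, so the stale momentum does not keep pushing in the wrong direction. No extra hyperparameter is needed beyond the usual learning rate and momentum coefficient.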


Thirty Years of Machine Learning: The Road to Pareto-Optimal Next-Generation Wireless Networks

arXiv.org Machine Learning

Next-generation wireless networks (NGWN) have substantial potential for supporting a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of machine learning by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning, respectively. Furthermore, we investigate their employment in the compelling applications of NGWNs, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims to assist readers in clarifying the motivation and methodology of the various machine learning algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks.


Trajectory Normalized Gradients for Distributed Optimization

arXiv.org Machine Learning

Recently, researchers have proposed various low-precision gradient compression schemes for efficient communication in large-scale distributed optimization. Building on this work, we try to reduce the communication complexity from a new direction. We pursue an ideal bijective mapping between two spaces of gradient distribution, so that the mapped gradient carries greater information entropy after the compression. In our setting, all servers should share a reference gradient in advance, and they communicate via the normalized gradients, which are the difference or quotient between the current gradients and the reference. To obtain a reference vector that yields a stronger signal-to-noise ratio, dynamically in each iteration, we extract and fuse information from the past trajectory in hindsight, and search for an optimal reference for compression. We call this approach trajectory-based normalized gradients (TNG). It bridges research from different communities, such as coding, optimization, systems, and learning. It is easy to implement and can be combined universally with existing algorithms. Our experiments on hard non-convex benchmark functions and on convex problems such as logistic regression demonstrate that TNG is more compression-efficient for communication in distributed optimization of general functions.
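The subtraction variant of a normalized gradient can be illustrated with a toy quantizer. This is a minimal sketch under stated assumptions, not the TNG algorithm itself: the uniform quantizer, the function names (`quantize`, `encode`, `decode`, `max_error`) and the hand-picked reference vector are hypothetical, and the search for a good reference along the trajectory is omitted.

```python
def quantize(vector, levels):
    """Toy uniform quantizer: snap entries to a grid with `levels` steps per side.

    The grid spacing is tied to the largest magnitude in the vector, so vectors
    with small entries are represented on a finer grid.
    """
    scale = max(abs(v) for v in vector)
    if scale == 0.0:
        return list(vector)
    step = scale / levels
    return [round(v / step) * step for v in vector]


def encode(gradient, reference, levels):
    """Compress the residual between the gradient and the shared reference."""
    residual = [g - r for g, r in zip(gradient, reference)]
    return quantize(residual, levels)


def decode(message, reference):
    """Receiver side: add the shared reference back to the decoded residual."""
    return [m + r for m, r in zip(message, reference)]


def max_error(a, b):
    """Largest absolute coordinate-wise difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


gradient = [1.02, -0.98, 2.01]
reference = [1.0, -1.0, 2.0]  # assumed to be known to every server in advance

plain = quantize(gradient, levels=2)
with_reference = decode(encode(gradient, reference, levels=2), reference)
```

When the reference is close to the current gradient, the residual has a much smaller range, so the same number of quantization levels yields a smaller reconstruction error; in this example `max_error(with_reference, gradient)` is far below `max_error(plain, gradient)`.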


Message-passing algorithm of quantum annealing with nonstoquastic Hamiltonian

arXiv.org Machine Learning

Quantum annealing (QA) is a generic method for solving optimization problems using fictitious quantum fluctuation. The current device performing QA involves controlling the transverse field; it is classically simulatable by using the standard technique for mapping the quantum spin systems to the classical ones. In this sense, the current system for QA is not powerful despite utilizing quantum fluctuation. Hence, we developed a system with a time-dependent Hamiltonian consisting of a combination of the formulated Ising model and the "driver" Hamiltonian with only quantum fluctuation. In the previous study, for a fully connected spin model, quantum fluctuation can be addressed in a relatively simple way. We proved that the fully connected antiferromagnetic interaction can be transformed into a fluctuating transverse field and is thus classically simulatable at sufficiently low temperatures. Using the fluctuating transverse field, we established several ways to simulate part of the nonstoquastic Hamiltonian on classical computers. We formulated a message-passing algorithm in the present study. This algorithm is capable of assessing the performance of QA with part of the nonstoquastic Hamiltonian having a large number of spins. In other words, we developed a different approach for simulating the nonstoquastic Hamiltonian without using the quantum Monte Carlo technique. Our results were validated by comparison to the results obtained by the replica method.


Multiobjective Coverage Path Planning: Enabling Automated Inspection of Complex, Real-World Structures

arXiv.org Artificial Intelligence

An important open problem in robotic planning is the autonomous generation of 3D inspection paths -- that is, planning the best path to move a robot along in order to inspect a target structure. We recently suggested a new method for planning paths allowing the inspection of complex 3D structures, given a triangular mesh model of the structure. The method differs from previous approaches in its emphasis on also generating and considering plans that result in imperfect coverage of the inspection target. In many practical tasks, one would accept imperfections in coverage if this results in a substantially more energy-efficient inspection path. The key idea is to use a multiobjective evolutionary algorithm to optimize the energy usage and coverage of inspection plans simultaneously - and the result is a set of plans exploring the different ways to balance the two objectives. Here we test our method on a set of inspection targets with large variation in size and complexity, and compare its performance with two state-of-the-art methods for complete coverage path planning. The results strengthen our confidence in the ability of our method to generate good inspection plans for different types of targets. The method's advantage is most clearly seen for real-world inspection targets, since traditional complete coverage methods have no good way of generating plans for structures with hidden parts. Multiobjective evolution, by optimizing energy usage and coverage together, ensures a good balance between the two - both when 100% coverage is feasible, and when large parts of the object are hidden.
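The "set of plans" produced by a multiobjective optimizer is typically the set of non-dominated trade-offs. The sketch below shows only that filtering step, assuming each plan is summarized as a hypothetical `(energy, coverage)` pair; the evolutionary search and the mesh-based plan representation from the paper are not modeled.

```python
def dominates(a, b):
    """True if plan a is at least as good as b on both objectives and better on one.

    Each plan is an (energy, coverage) pair: lower energy and higher coverage
    are preferred.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(plans):
    """Keep the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Hypothetical candidates: (energy used, fraction of the target covered).
candidates = [(10.0, 1.0), (6.0, 0.9), (7.0, 0.85), (3.0, 0.6), (4.0, 0.5)]
front = pareto_front(candidates)
```

Here the plan with full coverage survives alongside cheaper plans with partial coverage, which mirrors the point that imperfect coverage can be acceptable when it saves substantial energy.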


A New CGAN Technique for Constrained Topology Design Optimization

arXiv.org Machine Learning

This paper presents a new conditional GAN (named convex relaxing CGAN or crCGAN) to replicate the conventional constrained topology optimization algorithms in an extremely effective and efficient process. The proposed crCGAN consists of a generator and a discriminator, both of which are deep convolutional neural networks (CNN), and the topology design constraint can be conditionally set to both the generator and discriminator. To improve training efficiency and accuracy, which are affected by the dependency between the training images and the condition, a variety of crCGAN formulations are introduced to relax the non-convex design space. These new formulations were evaluated and validated via a series of comprehensive experiments. Moreover, a minibatch discrimination technique was introduced in the crCGAN training process to stabilize the convergence and avoid mode collapse problems. Additional verifications were conducted using the state-of-the-art MNIST digits and CIFAR-10 images conditioned by class labels. The experimental evaluations clearly reveal that the new objective formulation with minibatch discrimination training provides not only accuracy but also consistency of the designs.


Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information

arXiv.org Machine Learning

This paper offers a methodological contribution at the intersection of machine learning and operations research. Namely, we propose a methodology to quickly predict tactical solutions to a given operational problem. In this context, the tactical solution is less detailed than the operational one but it has to be computed in very short time and under imperfect information. The problem is of importance in various applications where tactical and operational planning problems are interrelated and information about the operational problem is revealed over time. This is for instance the case in certain capacity planning and demand management systems. We formulate the problem as a two-stage optimal prediction stochastic program whose solution we predict with a supervised machine learning algorithm. The training data set consists of a large number of deterministic (second stage) problems generated by controlled probabilistic sampling. The labels are computed based on solutions to the deterministic problems (solved independently and offline) employing appropriate aggregation and subselection methods to address uncertainty. Results on our motivating application in load planning for rail transportation show that deep learning algorithms produce highly accurate predictions in very short computing time (milliseconds or less). The prediction accuracy is comparable to solutions computed by sample average approximation of the stochastic program.