 Optimization


Incorporating Expert Prior Knowledge into Experimental Design via Posterior Sampling

arXiv.org Machine Learning

Scientific experiments are usually expensive due to complex experimental preparation and processing. Experimental design is therefore concerned with finding the optimal experimental input that yields the desired output using as few experiments as possible. Experimenters can often acquire knowledge about the location of the global optimum. However, they do not know how to exploit this knowledge to accelerate experimental design. In this paper, we adopt Bayesian optimization for experimental design, since it has established itself as an efficient tool for optimizing expensive black-box functions. However, it is unknown how to incorporate expert prior knowledge about the global optimum into the Bayesian optimization process. To address this, we represent the expert knowledge by placing a prior distribution on the global optimum, and we then derive its posterior distribution. We propose an efficient Bayesian optimization approach based on sampling from this posterior distribution of the global optimum. We theoretically analyze the convergence of the proposed algorithm and discuss the robustness of incorporating the expert prior. We evaluate the efficiency of our algorithm by optimizing synthetic functions and tuning hyperparameters of classifiers, along with a real-world experiment on the synthesis of short polymer fiber. The results clearly demonstrate the advantages of our proposed method.


Training Binary Neural Networks using the Bayesian Learning Rule

arXiv.org Machine Learning

Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as the Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm which justifies some of the algorithmic choices made by previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation for continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks which justifies and extends existing approaches.
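To make the Straight-Through Estimator mentioned in the abstract concrete, here is a minimal sketch in plain Python. It illustrates only the gradient-based baseline, not the Bayesian learning rule the paper proposes. The function names, the single linear neuron, the squared-error loss, the clipping of latent weights to [-1, 1], and the toy data are all assumptions made for illustration.

```python
def sign(value):
    """Binarize a real-valued latent weight to +1.0 or -1.0."""
    return 1.0 if value >= 0 else -1.0


def train_binary_weights(data, steps=20, lr=0.1):
    """Fit binary weights for one linear neuron with a straight-through update.

    data is a list of (inputs, target) pairs. The forward pass uses the
    binarized weights; the backward pass pretends d sign(v)/dv = 1 and
    applies the gradient to the real-valued latent weights instead.
    """
    n_inputs = len(data[0][0])
    latent = [0.01] * n_inputs  # real-valued latent weights (assumed init)
    for _ in range(steps):
        for inputs, target in data:
            weights = [sign(v) for v in latent]  # forward: binary weights
            prediction = sum(w * x for w, x in zip(weights, inputs))
            error = prediction - target  # gradient of 0.5 * error**2
            # Straight-through step on the latent weights, clipped to [-1, 1]
            # (clipping is a common heuristic, assumed here).
            latent = [
                max(-1.0, min(1.0, v - lr * error * x))
                for v, x in zip(latent, inputs)
            ]
    return [sign(v) for v in latent]


# Toy data generated by the binary weights [1, -1, 1].
toy_data = [
    ((1.0, 0.0, 0.0), 1.0),
    ((0.0, 1.0, 0.0), -1.0),
    ((0.0, 0.0, 1.0), 1.0),
]
learned = train_binary_weights(toy_data)
print(learned)
```

The latent weights carry the accumulated gradient information, while only their signs are ever used in the forward pass, which is the heuristic the abstract says needs a principled justification.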


On the regularity and conditioning of low rank semidefinite programs

arXiv.org Machine Learning

Low rank matrix recovery problems appear widely in statistics, combinatorics, and imaging. One celebrated method for solving these problems is to formulate and solve a semidefinite program (SDP). It is often known that the exact solution to the SDP with perfect data recovers the solution to the original low rank matrix recovery problem. It is more challenging to show that an approximate solution to the SDP formulated with noisy problem data acceptably solves the original problem; arguments are usually ad hoc for each problem setting, and can be complex. In this note, we identify a set of conditions that we call regularity that limit the error due to noisy problem data or incomplete convergence. In this sense, regular SDPs are robust: regular SDPs can be (approximately) solved efficiently at scale; and the resulting approximate solutions, even with noisy data, can be trusted. Moreover, we show that regularity holds generically, and also for many structured low rank matrix recovery problems, including the stochastic block model, $\mathbb{Z}_2$ synchronization, and matrix completion. Formally, we call an SDP regular if it has a surjective constraint map, admits a unique primal and dual solution pair, and satisfies strong duality and strict complementarity. However, regularity is not a panacea: we show the Burer-Monteiro formulation of the SDP may have spurious second-order critical points, even for a regular SDP with a rank 1 solution.


Algorithms for Optimizing Fleet Scheduling of Air Ambulances

arXiv.org Artificial Intelligence

Proper scheduling of air assets can be the difference between life and death for a patient. While poor scheduling can be incredibly problematic during hospital transfers, it can be potentially catastrophic in the case of a disaster. These issues are amplified in the case of an air emergency medical service (EMS) system where populations are dispersed and resources are limited. Exact methodologies exist for scheduling missions, although actual calculation times can be quite significant given a large enough problem space. For this research, known coordinates of air and health facilities were used in conjunction with a formulated integer linear programming model. This was then programmed through Gurobi so that performance could be compared against custom algorithmic solutions. Two methods were developed, one based on neighbourhood search and the other on Tabu search. While both were able to achieve results quite close to the Gurobi solution, the Tabu search outperformed the neighbourhood search. Additionally, it was able to do so in a greatly decreased time, with Gurobi actually being unable to reach an optimal solution in larger examples. Parallel variations were also developed with the compute unified device architecture (CUDA), though they did not improve the timing given the smaller sample size.
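The abstract does not describe the paper's Tabu search in detail, so the following is only a generic sketch of the technique on a toy problem. The cost matrix, the one-aircraft-per-mission assignment encoding, the swap neighbourhood, and the tenure value are all hypothetical choices, not taken from the paper.

```python
import itertools
from collections import deque


def assignment_cost(assignment, cost):
    """Total cost when mission i is served by aircraft assignment[i]."""
    return sum(cost[i][assignment[i]] for i in range(len(assignment)))


def tabu_search(cost, iterations=50, tenure=3):
    """Generic Tabu search over one-to-one mission/aircraft assignments."""
    n = len(cost)
    current = list(range(n))  # start from the identity assignment
    best, best_cost = current[:], assignment_cost(current, cost)
    tabu = deque(maxlen=tenure)  # recently used swaps are forbidden
    for _ in range(iterations):
        candidates = []
        for i, j in itertools.combinations(range(n), 2):
            neighbour = current[:]
            neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
            c = assignment_cost(neighbour, cost)
            # Aspiration rule: a tabu swap is allowed if it beats the best.
            if (i, j) not in tabu or c < best_cost:
                candidates.append((c, (i, j), neighbour))
        if not candidates:
            break
        # Move to the best admissible neighbour, even if it is worse than
        # the current solution; this is how the search leaves local optima.
        c, move, current = min(candidates, key=lambda item: item[0])
        tabu.append(move)
        if c < best_cost:
            best, best_cost = current[:], c
    return best, best_cost


# Hypothetical flight-time matrix: cost[mission][aircraft].
cost = [
    [9, 2, 7, 8],
    [6, 4, 3, 7],
    [5, 8, 1, 8],
    [7, 6, 9, 4],
]
best, best_cost = tabu_search(cost)
print(best, best_cost)
```

Unlike a plain neighbourhood search, which stops at the first local optimum, the tabu list forces the search to keep moving, which is one plausible reason for the difference in solution quality the abstract reports.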


Topologically sensitive metaheuristics

arXiv.org Artificial Intelligence

We present the conceptual design of two topologically sensitive metaheuristics: 1. Topologically Sensitive Variable Neighborhood Search (TVNS) and 2. Topologically Sensitive Electromagnetism Metaheuristic (TEM). Our intention is to show that this topological enhancement can be done in the general case. We therefore select two complementary techniques: VNS is a single-solution-based, discrete-coded metaheuristic, while EM is a population-based, real-coded metaheuristic. The usability of such metaheuristics and their theoretical aspects will be discussed in further papers.


Google Open Sources TFCO to Help Build Fair Machine Learning Models

#artificialintelligence

Fairness is a highly subjective concept, and machine learning is no different. We typically feel that the referees are "unfair" to our favorite team when it loses a close match, or that any outcome is extremely "fair" when it goes our way. Given that machine learning models cannot rely on subjectivity, we need an efficient way to quantify fairness. A lot of research has been done in this area, mostly framing fairness as an outcome optimization problem. Recently, Google AI research open sourced the TensorFlow Constrained Optimization (TFCO) library, an optimization framework that can be used for optimizing different objectives of a machine learning model, including fairness.


From Chess and Atari to StarCraft and Beyond: How Game AI is Driving the World of AI

arXiv.org Artificial Intelligence

This paper reviews the field of Game AI, which not only deals with creating agents that can play a certain game, but also with areas as diverse as creating game content automatically, game analytics, or player modelling. While Game AI was for a long time not very well recognized by the larger scientific community, it has established itself as a research area for developing and testing the most advanced forms of AI algorithms, and articles covering advances in mastering video games such as StarCraft 2 and Quake III appear in the most prestigious journals. Because of the growth of the field, a single review cannot cover it completely. Therefore, we focus on important recent developments, including the fact that advances in Game AI are starting to be extended to areas outside of games, such as robotics or the synthesis of chemicals. In this article, we review the algorithms and methods that have paved the way for these breakthroughs, report on the other important areas of Game AI research, and also point out exciting directions for the future of Game AI.


Bio-inspired Optimization: metaheuristic algorithms for optimization

arXiv.org Artificial Intelligence

Solving complex real-world problems has become a vital and critical task. Many of these are combinatorial problems, where optimal solutions are sought rather than exact solutions. Traditional optimization methods are effective for small-scale problems. However, for large-scale real-world problems, traditional methods either do not scale, fail to obtain optimal solutions, or return solutions only after a long running time. Even earlier artificial-intelligence-based techniques used to solve these problems could not give acceptable results. However, the last two decades have seen many new AI methods based on the characteristics and behaviors of living organisms in nature, which are categorized as bio-inspired or nature-inspired optimization algorithms. These methods, also termed meta-heuristic optimization methods, have been proved theoretically, implemented in simulation, and used to create many useful applications. They have been used extensively to solve many complex industrial and engineering problems because they are easy to understand, flexible, simple to adapt to the problem at hand and, most importantly, able to escape local optima traps. This local optima avoidance property helps in finding globally optimal solutions. This paper aims to explain how nature has inspired many optimization algorithms, give a basic categorization of them, and survey major bio-inspired optimization algorithms invented in recent times along with their applications.


Three Approaches for Personalization with Applications to Federated Learning

arXiv.org Machine Learning

The standard objective in machine learning is to train a single model for all users. However, in many learning scenarios, such as cloud computing and federated learning, it is possible to learn one personalized model per user. In this work, we present a systematic learning-theoretic study of personalization. We propose and analyze three approaches: user clustering, data interpolation, and model interpolation. For all three approaches, we provide learning-theoretic guarantees and efficient algorithms for which we also demonstrate the performance empirically. All of our algorithms are model agnostic and work for any hypothesis class.
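Of the three approaches named in the abstract, model interpolation is the easiest to illustrate. The sketch below assumes it means a convex combination of a per-user (local) predictor and a shared (global) predictor, with the mixing weight chosen on held-out user data. The constant-mean predictors, the function names, the grid search, and the toy data are illustrative assumptions, not the paper's algorithm.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def interpolate(local_pred, global_pred, lam):
    """Convex combination: lam = 1 is fully local, lam = 0 is fully global."""
    return lam * local_pred + (1.0 - lam) * global_pred


def squared_error(pred, targets):
    """Mean squared error of a constant prediction on the targets."""
    return mean([(pred - t) ** 2 for t in targets])


def choose_lambda(local_pred, global_pred, validation, grid=11):
    """Pick the mixing weight with the lowest held-out error (grid search)."""
    candidates = [k / (grid - 1) for k in range(grid)]
    return min(
        candidates,
        key=lambda lam: squared_error(
            interpolate(local_pred, global_pred, lam), validation
        ),
    )


# Hypothetical per-user training targets.
users = {"a": [1.0, 1.2, 0.8], "b": [3.0, 3.1, 2.9]}
global_pred = mean([v for values in users.values() for v in values])
local_pred = mean(users["a"])

# Held-out observations for user "a" decide how much to personalize.
validation = [1.4, 1.6]
lam = choose_lambda(local_pred, global_pred, validation)
personalized = interpolate(local_pred, global_pred, lam)
print(lam, personalized)
```

A user with little data would tend toward a small mixing weight and lean on the global model, while a user whose data differs strongly from the population would tend toward a large one.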


Subspace Fitting Meets Regression: The Effects of Supervision and Orthonormality Constraints on Double Descent of Generalization Errors

arXiv.org Machine Learning

We study the linear subspace fitting problem in the overparameterized setting, where the estimated subspace can perfectly interpolate the training examples. Our scope includes the least-squares solutions to subspace fitting tasks with varying levels of supervision in the training data (i.e., the proportion of input-output examples of the desired low-dimensional mapping) and orthonormality of the vectors defining the learned operator. This flexible family of problems connects standard, unsupervised subspace fitting that enforces strict orthonormality with a corresponding regression task that is fully supervised and does not constrain the linear operator structure. This class of problems is defined over a supervision-orthonormality plane, where each coordinate induces a problem instance with a unique pair of supervision level and softness of orthonormality constraints. We explore this plane and show that the generalization errors of the corresponding subspace fitting problems follow double descent trends as the settings become more supervised and less orthonormally constrained.