AITopics

2006.12399

Country:

Europe > United Kingdom (0.14)
North America > United States > New York > New York County > New York City (0.05)
Europe > Sweden > Stockholm > Stockholm (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Law (0.67)
Law Enforcement & Public Safety (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Moss, Henry B., Leslie, David S., Rayson, Paul

MUMBO: MUlti-task Max-value Bayesian Optimization

arXiv.org Machine LearningJun-22-2020

We propose MUMBO, the first high-performing yet computationally efficient acquisition function for multi-task Bayesian optimization. Here, the challenge is to perform efficient optimization by evaluating low-cost functions somehow related to our true target function. This is a broad class of problems including the popular task of multi-fidelity optimization. However, while information-theoretic acquisition functions are known to provide state-of-the-art Bayesian optimization, existing implementations for multi-task scenarios have prohibitive computational requirements. Previous acquisition functions have therefore been suitable only for problems with both low-dimensional parameter spaces and function query costs sufficiently large to overshadow very significant optimization overheads. In this work, we derive a novel multi-task version of entropy search, delivering robust performance with low computational overheads across classic optimization challenges and multi-task hyper-parameter tuning. MUMBO is scalable and efficient, allowing multi-task Bayesian optimization to be deployed in problems with rich parameter and fidelity spaces.

machine learning, natural language, optimization, (16 more...)

2006.12093

Country: Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

John, Philips George, Vijaykeerthy, Deepak, Saha, Diptikalyan

Verifying Individual Fairness in Machine Learning Models

arXiv.org Artificial IntelligenceJun-21-2020

We consider the problem of whether a given decision model, working with structured data, has individual fairness. Following the work of Dwork, a model is individually biased (or unfair) if there is a pair of valid inputs which are close to each other (according to an appropriate metric) but are treated differently by the model (different class label, or large difference in output), and it is unbiased (or fair) if no such pair exists. Our objective is to construct verifiers for proving individual fairness of a given model, and we do so by considering appropriate relaxations of the problem. We construct verifiers which are sound but not complete for linear classifiers, and kernelized polynomial/radial basis function classifiers. We also report the experimental results of evaluating our proposed algorithms on publicly available datasets.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

2006.11737

Country: Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Ma, Xingchen, Blaschko, Matthew B.

Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization

Bayesian optimization (BO) is a sample-efficient global optimization algorithm for black-box functions which are expensive to evaluate. Existing literature on model based optimization in conditional parameter spaces are usually built on trees. In this work, we generalize the additive assumption to tree-structured functions and propose an additive tree-structured covariance function, showing improved sample-efficiency, wider applicability and greater flexibility. Furthermore, by incorporating the structure information of parameter spaces and the additive assumption in the BO loop, we develop a parallel algorithm to optimize the acquisition function and this optimization can be performed in a low dimensional space. We demonstrate our method on an optimization benchmark function, as well as on a neural network model compression problem, and experimental results show our approach significantly outperforms the current state of the art for conditional parameter optimization including SMAC, TPE and Jenatton et al. (2017).

artificial intelligence, covariance function, machine learning, (15 more...)

2006.11771

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Iwazaki, Shogo, Inatsu, Yu, Takeuchi, Ichiro

Bayesian Quadrature Optimization for Probability Threshold Robustness Measure

In many product development problems, the performance of the product is governed by two types of parameters called design parameter and environmental parameter. While the former is fully controllable, the latter varies depending on the environment in which the product is used. The challenge of such a problem is to find the design parameter that maximizes the probability that the performance of the product will meet the desired requisite level given the variation of the environmental parameter. In this paper, we formulate this practical problem as active learning (AL) problems and propose efficient algorithms with theoretically guaranteed performance. Our basic idea is to use Gaussian Process (GP) model as the surrogate model of the product development process, and then to formulate our AL problems as Bayesian Quadrature Optimization problems for probabilistic threshold robustness (PTR) measure. We derive credible intervals for the PTR measure and propose AL algorithms for the optimization and level set estimation of the PTR measure. We clarify the theoretical properties of the proposed algorithms and demonstrate their efficiency in both synthetic and real-world product development problems.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

2006.11986

Country:

Europe > Italy > Sicily > Palermo (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Modeling & Simulation (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Solozabal, Ruben, Ceberio, Josu, Takáč, Martin

Constrained Combinatorial Optimization with Reinforcement Learning

This paper presents a framework to tackle constrained combinatorial optimization problems using deep Reinforcement Learning (RL). To this end, we extend the Neural Combinatorial Optimization (NCO) theory in order to deal with constraints in its formulation. Notably, we propose defining constrained combinatorial problems as fully observable Constrained Markov Decision Processes (CMDP). In that context, the solution is iteratively constructed based on interactions with the environment. The model, in addition to the reward signal, relies on penalty signals generated from constraint dissatisfaction to infer a policy that acts as a heuristic algorithm. Moreover, having access to the complete state representation during the optimization process allows us to rely on memory-less architectures, enhancing the results obtained in previous sequence-to-sequence approaches. Conducted experiments on the constrained Job Shop and Resource Allocation problems prove the superiority of the proposal for computing rapid solutions when compared to classical heuristic, metaheuristic, and Constraint Programming (CP) solvers.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2006.11984

Country:

Europe > Spain > Basque Country (0.04)
North America > United States > Pennsylvania > Northampton County > Bethlehem (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Shin, Ahnjae, Kim, Do Yoon, Jeong, Joo Seong, Chun, Byung-Gon

Hippo: Taming Hyper-parameter Optimization of Deep Learning with Stage Trees

Hyper-parameter optimization is crucial for pushing the accuracy of a deep learning model to its limits. A hyper-parameter optimization job, referred to as a study, involves numerous trials of training a model using different training knobs, and therefore is very computation-heavy, typically taking hours and days to finish. We observe that trials issued from hyper-parameter optimization algorithms often share common hyper-parameter sequence prefixes. Based on this observation, we propose Hippo, a hyper-parameter optimization system that removes redundancy in the training process to reduce the overall amount of computation significantly. Instead of executing each trial independently as in existing hyper-parameter optimization systems, Hippo breaks down the hyper-parameter sequences into stages and merges common stages to form a tree of stages (called a stage-tree), then executes a stage once per tree on a distributed GPU server environment. Hippo is applicable to not only single studies, but multi-study scenarios as well, where multiple studies of the same model and search space can be formulated as trees of stages. Evaluations show that Hippo's stage-based execution strategy outperforms trial-based methods such as Ray Tune for several models and hyper-parameter optimization algorithms, reducing GPU-hours and end-to-end training time significantly.

artificial intelligence, machine learning, optimization problem, (20 more...)

2006.11972

Country:

North America > United States > California > San Diego County > Carlsbad (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kovalev, Dmitry, Salim, Adil, Richtárik, Peter

Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization

We consider the task of decentralized minimization of the sum of smooth strongly convex functions stored across the nodes a network. For this problem, lower bounds on the number of gradient computations and the number of communication rounds required to achieve $\varepsilon$ accuracy have recently been proven. We propose two new algorithms for this decentralized optimization problem and equip them with complexity guarantees. We show that our first method is optimal both in terms of the number of communication rounds and in terms of the number of gradient computations. Unlike existing optimal algorithms, our algorithm does not rely on the expensive evaluation of dual gradients. Our second algorithm is optimal in terms of the number of communication rounds, without a logarithmic factor. Our approach relies on viewing the two proposed algorithms as accelerated variants of the Forward Backward algorithm to solve monotone inclusions associated with the decentralized optimization problem. We also verify the efficacy of our methods against state-of-the-art algorithms through numerical experiments.

algorithm, artificial intelligence, machine learning, (16 more...)

2006.11773

Country:

North America > United States > New York (0.04)
Asia > Middle East > Saudi Arabia > Mecca Province > Thuwal (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)

Ahn, Kwangjun, Yun, Chulhee, Sra, Suvrit

SGD with shuffling: optimal rates without component convexity and large epoch requirements

We study without-replacement SGD for solving finite-sum optimization problems. Specifically, depending on how the indices of the finite-sum are shuffled, we consider the RandomShuffle (shuffle at the beginning of each epoch) and SingleShuffle (shuffle only once) algorithms. First, we establish minimax optimal convergence rates of these algorithms up to poly-log factors. Notably, our analysis is general enough to cover gradient dominated nonconvex costs, and does not rely on the convexity of individual component functions unlike existing optimal convergence results. Secondly, assuming convexity of the individual components, we further sharpen the tight convergence results for RandomShuffle by removing the drawbacks common to all prior arts: large number of epochs required for the results to hold, and extra poly-log factor gaps to the lower bound.

artificial intelligence, machine learning, optimization problem, (16 more...)

2006.06946

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.65)

A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning

Liu, Sijia, Chen, Pin-Yu, Kailkhura, Bhavya, Zhang, Gaoyuan, Hero, Alfred, Varshney, Pramod K.

Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many signal processing and machine learning applications. It is used for solving optimization problems similarly to gradient-based methods. However, it does not require the gradient, using only function evaluations. Specifically, ZO optimization iteratively performs three major steps: gradient estimation, descent direction computation, and solution update. In this paper, we provide a comprehensive review of ZO optimization, with an emphasis on showing the underlying intuition, optimization principles and recent advances in convergence analysis. Moreover, we demonstrate promising applications of ZO optimization, such as evaluating robustness and generating explanations from black-box deep learning models, and efficient online sensor management.

artificial intelligence, machine learning, optimization, (19 more...)

2006.06224

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Overview (1.00)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
(2 more...)