AITopics

1907.01543

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Schwalbe-Koda, Daniel, Gómez-Bombarelli, Rafael

Generative Models for Automatic Chemical Design

arXiv.org Machine LearningJul-2-2019

Materials discovery is decisive for tackling urgent challenges related to energy, the environment, health care and many others. In chemistry, conventional methodologies for innovation usually rely on expensive and incremental strategies to optimize properties from molecular structures. On the other hand, inverse approaches map properties to structures, thus expediting the design of novel useful compounds. In this chapter, we examine the way in which current deep generative models are addressing the inverse chemical discovery paradigm. We begin by revisiting early inverse design algorithms. Then, we introduce generative models for molecular systems and categorize them according to their architecture and molecular representation. Using this classification, we review the evolution and performance of important molecular generation schemes reported in the literature. Finally, we conclude highlighting the prospects and challenges of generative models as cutting edge tools in materials discovery.

artificial intelligence, machine learning, natural language, (20 more...)

1907.01632

Country: North America > United States > Massachusetts (0.28)

Genre:

Research Report (0.50)
Overview (0.34)

Industry:

Materials > Chemicals (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Daxberger, Erik, Makarova, Anastasia, Turchetta, Matteo, Krause, Andreas

Mixed-Variable Bayesian Optimization

arXiv.org Machine LearningJul-2-2019

The optimization of expensive to evaluate, black-box, mixed-variable functions, i.e. functions that have continuous and discrete inputs, is a difficult and yet pervasive problem in science and engineering. In Bayesian optimization (BO), special cases of this problem that consider fully continuous or fully discrete domains have been widely studied. However, few methods exist for mixed-variable domains. In this paper, we introduce MiVaBo, a novel BO algorithm for the efficient optimization of mixed-variable functions that combines a linear surrogate model based on expressive feature representations with Thompson sampling. We propose two methods to optimize its acquisition function, a challenging problem for mixed-variable domains, and we show that MiVaBo can handle complex constraints over the discrete part of the domain that other methods cannot take into account. Moreover, we provide the first convergence analysis of a mixed-variable BO algorithm. Finally, we show that MiVaBo is significantly more sample efficient than state-of-the-art mixed-variable BO algorithms on hyperparameter tuning tasks.

artificial intelligence, constraint, machine learning, (18 more...)

1907.01329

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.92)
(2 more...)

arXiv.org Artificial IntelligenceJul-1-2019

Two-stage Optimization for Machine Learning Workflow

Quemy, Alexandre

Machines learning techniques plays a preponderant role in dealing with massive amount of data and are employed in almost every possible domain. Building a high quality machine learning model to be deployed in production is a challenging task, from both, the subject matter experts and the machine learning practitioners. For a broader adoption and scalability of machine learning systems, the construction and configuration of machine learning workflow need to gain in automation. In the last few years, several techniques have been developed in this direction, known as autoML. In this paper, we present a two-stage optimization process to build data pipelines and configure machine learning algorithms. First, we study the impact of data pipelines compared to algorithm configuration in order to show the importance of data preprocessing over hyperparameter tuning. The second part presents policies to efficiently allocate search time between data pipeline construction and algorithm configuration. Those policies are agnostic from the metaoptimizer. Last, we present a metric to determine if a data pipeline is specific or independent from the algorithm, enabling fine-grain pipeline pruning and meta-learning for the coldstart problem.

configuration, midstream oil & gas, optimization problem, (20 more...)

arXiv.org Artificial Intelligence

1907.00678

Country:

Europe > Sweden (0.14)
Europe > Poland > Lesser Poland Province > Kraków (0.14)

Genre: Workflow (0.85)

Industry: Energy > Oil & Gas > Midstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

The Cost of a Reductions Approach to Private Fair Optimization

Alabi, Daniel

We examine a reductions approach to fair optimization and learning where a black-box optimizer is used to learn a fair model for classification or regression [Alabi et al., 2018, Agarwal et al., 2018] and explore the creation of such fair models that adhere to data privacy guarantees (specifically differential privacy). For this approach, we consider two suites of use cases: the first is for optimizing convex performance measures of the confusion matrix (such as $G$-mean and $H$-mean); the second is for satisfying statistical definitions of algorithmic fairness (such as equalized odds, demographic parity, and the gini index of inequality). The reductions approach to fair optimization can be abstracted as the constrained group-objective optimization problem where we aim to optimize an objective that is a function of losses of individual groups, subject to some constraints. We present two differentially private algorithms: an $(\epsilon, 0)$ exponential sampling algorithm and an $(\epsilon, \delta)$ algorithm that uses a linear optimizer to incrementally move toward the best decision. We analyze the privacy and utility guarantees of these empirical risk minimization algorithms. Compared to a previous method for ensuring differential privacy subject to a relaxed form of the equalized odds fairness constraint, the $(\epsilon, \delta)$ differentially private algorithm we present provides asymptotically better sample complexity guarantees. The technique of using an approximate linear optimizer oracle to achieve privacy might be applicable to other problems not considered in this paper. Finally, we show an algorithm-agnostic lower bound on the accuracy of any solution to the problem of $(\epsilon, 0)$ or $(\epsilon, \delta)$ private constrained group-objective optimization.

algorithm, artificial intelligence, machine learning, (13 more...)

1906.09613

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(15 more...)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Kilbertus, Niki, Ball, Philip J., Kusner, Matt J., Weller, Adrian, Silva, Ricardo

The Sensitivity of Counterfactual Fairness to Unmeasured Confounding

Causal approaches to fairness have seen substantial recent interest, both from the machine learning community and from wider parties interested in ethical prediction algorithms. In no small part, this has been due to the fact that causal models allow one to simultaneously leverage data and expert knowledge to remove discriminatory effects from predictions. However, one of the primary assumptions in causal modeling is that you know the causal graph. This introduces a new opportunity for bias, caused by misspecifying the causal model. One common way for misspecification to occur is via unmeasured confounding: the true causal effect between variables is partially described by unobserved quantities. In this work we design tools to assess the sensitivity of fairness measures to this confounding for the popular class of non-linear additive noise models (ANMs). Specifically, we give a procedure for computing the maximum difference between two counterfactually fair predictors, where one has become biased due to confounding. For the case of bivariate confounding our technique can be swiftly computed via a sequence of closed-form updates. For multivariate confounding we give an algorithm that can be efficiently solved via automatic differentiation. We demonstrate our new sensitivity analysis tools in real-world fairness scenarios to assess the bias arising from confounding.

artificial intelligence, correlation, machine learning, (17 more...)

1907.0104

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (0.93)

Industry:

Health & Medicine (1.00)
Education (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Okuno, Shunya, Aihara, Kazuyuki, Hirata, Yoshito

Forecasting high-dimensional dynamics exploiting suboptimal embeddings

Delay embedding---a method for reconstructing dynamical systems by delay coordinates---is widely used to forecast nonlinear time series as a model-free approach. When multivariate time series are observed, several existing frameworks can be applied to yield a single forecast combining multiple forecasts derived from various embeddings. However, the performance of these frameworks is not always satisfactory because they randomly select embeddings or use brute force and do not consider the diversity of the embeddings to combine. Herein, we develop a forecasting framework that overcomes these existing problems. The framework exploits various "suboptimal embeddings" obtained by minimizing the in-sample error via combinatorial optimization. The framework achieves the best results among existing frameworks for sample toy datasets and a real-world flood dataset. We show that the framework is applicable to a wide range of data lengths and dimensions. Therefore, the framework can be applied to various fields such as neuroscience, ecology, finance, fluid dynamics, weather, and disaster prevention.

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

1907.01552

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.68)

Lee, Jonathan N., Pacchiano, Aldo, Jordan, Michael I.

Approximate Sherali-Adams Relaxations for MAP Inference via Entropy Regularization

Maximum a posteriori (MAP) inference is a fundamental computational paradigm for statistical inference. In the setting of graphical models, MAP inference entails solving a combinatorial optimization problem to find the most likely configuration of the discrete-valued model. Linear programming (LP) relaxations in the Sherali-Adams hierarchy are widely used to attempt to solve this problem. We leverage recent work in entropy-regularized linear programming to propose an iterative projection algorithm (SMPLP) for large scale MAP inference that is guaranteed to converge to a near-optimal solution to the relaxation. With an appropriately chosen regularization constant, we show the resulting rounded solution solves the exact MAP problem whenever the LP is tight. We further provide theoretical guarantees on the number of iterations sufficient to achieve $\epsilon$-close solutions. Finally, we show in practice that SMPLP is competitive for solving Sherali-Adams relaxations.

artificial intelligence, optimization problem, projection, (19 more...)

1907.01127

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Yang, Lei, Chen, Xiaojun, Xiang, Shuhuang

The Constrained $L_p$-$L_q$ Basis Pursuit Denoising Problem

In this paper, we consider the constrained $L_p$-$L_q$ basis pursuit denoising problem, which aims to find a minimizer of $\|\bf{x}\|_p^p$ subject to $\|A\bf{x}-\bf{b}\|_q\leq\sigma$ for given $A \in \mathbb{R}^{m \times n}$, $\bf{b}\in\mathbb{R}^m$, $\sigma \geq0$, $0\leq p\leq1$ and $q \geq 1$. We first study the properties of the optimal solutions of this problem. Specifically, without any condition on the matrix $A$, we provide upper bounds in cardinality and infinity norm for the optimal solutions, and show that all optimal solutions must be on the boundary of the feasible set when $0

artificial intelligence, optimal solution, optimization problem, (18 more...)

1907.0088

Country: Asia > China (0.46)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Sadeghi, Omid, Fazel, Maryam

Online Continuous DR-Submodular Maximization with Long-Term Budget Constraints

arXiv.org Machine LearningJun-30-2019

In this paper, we study a class of online optimization problems with long-term budget constraints where the objective functions are not necessarily concave (nor convex) but they instead satisfy the Diminishing Returns (DR) property. Specifically, a sequence of monotone DR-submodular objective functions $\{f_t(x)\}_{t=1}^T$ and monotone linear budget functions $\{\langle p_t,x \rangle \}_{t=1}^T$ arrive over time and assuming a total targeted budget $B_T$, the goal is to choose points $x_t$ at each time $t\in\{1,\dots,T\}$, without knowing $f_t$ and $p_t$ on that step, to achieve sub-linear regret bound while the total budget violation $\sum_{t=1}^T \langle p_t,x_t \rangle -B_T$ is sub-linear as well. Prior work has shown that achieving sub-linear regret is impossible if the budget functions are chosen adversarially. Therefore, we modify the notion of regret by comparing the agent against a $(1-\frac{1}{e})$-approximation to the best fixed decision in hindsight which satisfies the budget constraint proportionally over any window of length $W$. We propose the Online Saddle Point Hybrid Gradient (OSPHG) algorithm to solve this class of online problems. For $W=T$, we recover the aforementioned impossibility result. However, when $W=o(T)$, we show that it is possible to obtain sub-linear bounds for both the $(1-\frac{1}{e})$-regret and the total budget violation.

algorithm, artificial intelligence, machine learning, (16 more...)

1907.00316

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.46)