AITopics

2003.1087

Country: Europe > Germany > Berlin (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Matei, Ion, de Kleer, Johan, Somarakis, Christoforos, Rai, Rahul, Baras, John S.

Interpretable machine learning models: a physics-based view

arXiv.org Artificial IntelligenceMar-22-2020

To understand changes in physical systems and facilitate decisions, explaining how model predictions are made is crucial. We use model-based interpretability, where models of physical systems are constructed by composing basic constructs that explain locally how energy is exchanged and transformed. We use the port Hamiltonian (p-H) formalism to describe the basic constructs that contain physically interpretable processes commonly found in the behavior of physical systems. We describe how we can build models out of the p-H constructs and how we can train them. In addition we show how we can impose physical properties such as dissipativity that ensure numerical stability of the training process. We give examples on how to build and train models for describing the behavior of two physical systems: the inverted pendulum and swarm dynamics. I. Introduction The necessity for interpretability comes from the fact that it is not always enough to train and model and get an answer, but is also important to understand why a particular answer was given. A simple but meaningful definition of model interpretability given in [17] relates this notion to the degree to which a human can understand the cause of a decision. In our case, since we care about models that describe the behavior of physical systems, we change the definition to the degree to which a human can understand the physical processes that cause a prediction. Throughout this paper we focus on physically-interpretable models: models that embed physical laws that explain how energy is transformed and exchanged in the system. A physically-interpretable model facilitates learning and updating the model when something unexpected happens. This update is done by finding an explanation for an unexpected event. For example, an electrical motor unexpectedly overheats and we ask ourselves: "Why is the motor overheating?".

constraint, interpretability, time complexity, (16 more...)

arXiv.org Artificial Intelligence

2003.10025

Country:

North America > United States > District of Columbia > Washington (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre:

Overview (0.46)
Research Report (0.40)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.64)

Merchán, Eduardo C. Garrido, Pérez, Luis C. Jariego

Towards Automatic Bayesian Optimization: A first step involving acquisition functions

arXiv.org Artificial IntelligenceMar-21-2020

Bayesian Optimization is the state of the art technique for the optimization of black boxes, i.e., functions where we do not have access to their analytical expression nor its gradients, they are expensive to evaluate and its evaluation is noisy. The most popular application of bayesian optimization is the automatic hyperparameter tuning of machine learning algorithms, where we obtain the best configuration of machine learning algorithms by optimizing the estimation of the generalization error of these algorithms. Despite being applied with success, bayesian optimization methodologies also have hyperparameters that need to be configured such as the probabilistic surrogate model or the acquisition function used. A bad decision over the configuration of these hyperparameters implies obtaining bad quality results. Typically, these hyperparameters are tuned by making assumptions of the objective function that we want to evaluate but there are scenarios where we do not have any prior information about the objective function. In this paper, we propose a first attempt over automatic bayesian optimization by exploring several heuristics that automatically tune the acquisition function of bayesian optimization. We illustrate the effectiveness of these heurisitcs in a set of benchmark problems and a hyperparameter tuning problem of a machine learning algorithm.

acquisition function, optimization problem, upstream oil & gas, (18 more...)

arXiv.org Artificial Intelligence

2003.09643

Country: Europe > Spain (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Gergatsouli, Evangelia, Lucier, Brendan, Tzamos, Christos

Black-box Methods for Restoring Monotonicity

In many practical applications, heuristic or approximation algorithms are used to efficiently solve the task at hand. However their solutions frequently do not satisfy natural monotonicity properties of optimal solutions. In this work we develop algorithms that are able to restore monotonicity in the parameters of interest. Specifically, given oracle access to a (possibly non-monotone) multi-dimensional real-valued function $f$, we provide an algorithm that restores monotonicity while degrading the expected value of the function by at most $\varepsilon$. The number of queries required is at most logarithmic in $1/\varepsilon$ and exponential in the number of parameters. We also give a lower bound showing that this exponential dependence is necessary. Finally, we obtain improved query complexity bounds for restoring the weaker property of $k$-marginal monotonicity. Under this property, every $k$-dimensional projection of the function $f$ is required to be monotone. The query complexity we obtain only scales exponentially with $k$.

algorithm, monotone, monotonicity, (14 more...)

2003.09554

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry: Transportation > Air (0.41)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Tran, Anh, McCann, Scott, Furlan, John M., Pagalthivarthi, Krishnan V., Visintainer, Robert J., Wildey, Tim

aphBO-2GP-3B: A budgeted asynchronously-parallel multi-acquisition for known/unknown constrained Bayesian optimization on high-performing computing architecture

High-fidelity complex engineering simulations are highly predictive, but also computationally expensive and often require substantial computational efforts. The mitigation of computational burden is usually enabled through parallelism in high-performance cluster (HPC) architecture. In this paper, an asynchronous constrained batch-parallel Bayesian optimization method is proposed to efficiently solve the computationally-expensive simulation-based optimization problems on the HPC platform, with a budgeted computational resource, where the maximum number of simulations is a constant. The advantages of this method are three-fold. First, the efficiency of the Bayesian optimization is improved, where multiple input locations are evaluated massively parallel in an asynchronous manner to accelerate the optimization convergence with respect to physical runtime. This efficiency feature is further improved so that when each of the inputs is finished, another input is queried without waiting for the whole batch to complete. Second, the method can handle both known and unknown constraints. Third, the proposed method considers several acquisition functions at the same time and sample based on an evolving probability mass distribution function using GP-Hedge scheme, where parameters are corresponding to the performance of each acquisition function. The proposed framework is termed aphBO-2GP-3B, which corresponds to asynchronous parallel hedge Bayesian optimization with two Gaussian processes and three batches. The aphBO-2GP-3B framework is demonstrated using two high-fidelity expensive industrial applications, where the first one is based on finite element analysis (FEA) and the second one is based on computational fluid dynamics (CFD) simulations.

acquisition function, constraint, optimization, (15 more...)

2003.09436

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.14)
North America > United States > New York (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(2 more...)

Genre:

Research Report (0.64)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Dall'Amico, Lorenzo, Couillet, Romain, Tremblay, Nicolas

A unified framework for spectral clustering in sparse graphs

One of the most natural tasks in graph theory is community detection, i.e., the identification of similarity groups on a given network. Practically, for an unweighted and undirected graph G(V, E) with V n nodes and E edges, community detection consists in finding a non-overlapping partition of the nodes that identifies underlying communities in a completely unsupervised manner. There is no unique definition of a community, but a general criterion is to impose that nodes in the same community have more interconnections than nodes in different communities, as a consequence of the stronger affinity among members of the same community [17]. There exist many ways of formalizing this intuition, some of them under the form of a cost function to minimize, such as the MinCut, RatioCut, and NormalizedCut costs [53]. The resulting optimizations are however NPhard problems and, as a consequence, many algorithms consist in retrieving relaxed continuous solutions of the problem.

eigenvalue, eigenvector, equation, (14 more...)

2003.09198

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
(3 more...)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Communications (0.93)
Information Technology > Data Science > Data Mining (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Deng, Kangkang, Peng, Zheng

An Inexact Manifold Augmented Lagrangian Method for Adaptive Sparse Canonical Correlation Analysis with Trace Lasso Regularization

Canonical correlation analysis (CCA for short) describes the relationship between two sets of variables by finding some linear combinations of these variables that maximizing the correlation coefficient. However, in high-dimensional settings where the number of variables exceeds sample size, or in the case of that the variables are highly correlated, the traditional CCA is no longer appropriate. In this paper, an adaptive sparse version of CCA (ASCCA for short) is proposed by using the trace Lasso regularization. The proposed ASCCA reduces the instability of the estimator when the covariates are highly correlated, and thus improves its interpretation. The ASCCA is further reformulated to an optimization problem on Riemannian manifolds, and an manifold inexact augmented Lagrangian method is then proposed for the resulting optimization problem. The performance of the ASCCA is compared with the other sparse CCA techniques in different simulation settings, which illustrates that the ASCCA is feasible and efficient.

optimization problem, regularization, trace lasso regularization, (9 more...)

2003.09195

Country:

Asia > China > Fujian Province > Fuzhou (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Chang, Kai-Hung, Cheng, Chin-Yi

Learning to simulate and design for structural engineering

In the architecture and construction industries, structural design for large buildings has always been laborious, time-consuming, and difficult to optimize. It is an iterative process that involves two steps: analyzing the current structural design by a slow and computationally expensive simulation, and then manually revising the design based on professional experience and rules. In this work, we propose an end-to-end learning pipeline to solve the size design optimization problem, which is to design the optimal cross-sections for columns and beams, given the design objectives and building code as constraints. We pre-train a graph neural network as a surrogate model to not only replace the structural simulation for speed but also use its differentiable nature to provide gradient signals to the other graph neural network for size optimization. Our results show that the pre-trained surrogate model can predict simulation results accurately, and the trained optimization model demonstrates the capability of designing convincing cross-section designs for buildings under various scenarios.

constraint, optimization model, surrogate model, (11 more...)

2003.09103

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan (0.04)

Genre: Research Report > New Finding (0.86)

Industry:

Construction & Engineering (1.00)
Materials > Construction Materials (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Icarte, Rodrigo Toro, Waldie, Ethan, Klassen, Toryn, Valenzano, Rick, Castro, Margarita, McIlraith, Sheila

Learning Reward Machines for Partially Observable Reinforcement Learning

Neural Information Processing SystemsMar-19-2020, 03:03:29 GMT

Reward Machines (RMs), originally proposed for specifying problems in Reinforcement Learning (RL), provide a structured, automata-based representation of a reward function that allows an agent to decompose problems into subproblems that can be efficiently learned using off-policy learning. Here we show that RMs can be learned from experience, instead of being specified by the user, and that the resulting problem decomposition can be used to effectively solve partially observable RL problems. We pose the task of learning RMs as a discrete optimization problem where the objective is to find an RM that decomposes the problem into a set of subproblems such that the combination of their optimal memoryless policies is an optimal policy for the original problem. We show the effectiveness of this approach on three partially observable domains, where it significantly outperforms A3C, PPO, and ACER, and discuss its advantages, limitations, and broader potential. Papers published at the Neural Information Processing Systems Conference.

learning reward machine, observable reinforcement learning, reinforcement learning, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Ibragimov, Bulat, Gusev, Gleb

Minimal Variance Sampling in Stochastic Gradient Boosting

Neural Information Processing SystemsMar-19-2020, 02:48:22 GMT

Stochastic Gradient Boosting (SGB) is a widely used approach to regularization of boosting models based on decision trees. It was shown that, in many cases, random sampling at each iteration can lead to better generalization performance of the model and can also decrease the learning time. Different sampling approaches were proposed, where probabilities are not uniform, and it is not currently clear which approach is the most effective. In this paper, we formulate the problem of randomization in SGB in terms of optimization of sampling probabilities to maximize the estimation accuracy of split scoring used to train decision trees.This optimization problem has a closed-form nearly optimal solution, and it leads to a new sampling technique, which we call Minimal Variance Sampling (MVS).The method both decreases the number of examples needed for each iteration of boosting and increases the quality of the model significantly as compared to the state-of-the art sampling methods. The superiority of the algorithm was confirmed by introducing MVS as a new default option for subsampling in CatBoost, a gradient boosting library achieving state-of-the-art quality on various machine learning tasks. Papers published at the Neural Information Processing Systems Conference.