AITopics

1911.01024

Country:

North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > Ohio > Hamilton County > Cincinnati (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (0.55)
Transportation > Electric Vehicle (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Yue, Xubo, Kontar, Raed Al

Lookahead Bayesian Optimization via Rollout: Guarantees and Sequential Rolling Horizons

arXiv.org Machine LearningNov-3-2019

We consider the optimization problem: x arg max x X f (x), (1) where x is a d-dimensional vector and X is a compact (closed and bounded) set in R d . Given limited budget B, BO aims to search for the optimal x by itera-tively updating a surrogate model of f (x), where this surrogate is used to find the next design to evaluate. Typically, in BO, the surrogate model is a Gaussian process ( GP), due to its Bayesian interpretation and uncertainty quantification capability (see Rasmussen (2003) for more information). More specifically, given the current data D k, BO aims to determine the next informative sampling point x k 1 by solving the auxiliary problem: x k 1: x arg max x X Q k(x; D k). (2) where Q k is a acquisition/utility function that only involves evaluating the surrogate and not the expensive objective function f . Typically, evaluation of acquisition function is relatively cheap. The rationale is to seek design points that produce maximum increment of the objective function. After Eq. (2) is solved, the iterative algorithm proceeds by augmenting the current training data D k with a new observation to obtain D k 1 D k { (x k 1,y k 1) }. Popular choices of acquisition functions are entropy search (ES) (Hennig and Schuler, 2012), predictive entropy search (PES) (Hern andez-Lobato et al., 2014) and expectation improvement (EI) (Lam et al., 2016). All aforementioned functions exploit myopic strategies and ignore the future information.

algorithm, optimization, rollout, (14 more...)

1911.01004

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Michigan (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

He, Jiayi, Kumar, Aravind Sampath, Chada, Arun, Mutnury, Bhyrav, Drewniak, James

Machine Learning for high speed channel optimization

arXiv.org Machine LearningNov-1-2019

-- Design of printed circuit board (PCB) stack - up requires the consideration of characteristic impedance, insertion loss and crosstalk. As there are many parameters in a PCB stack - up design, the optimization of these parameters needs to be efficient and accurate. A le ss optimal stack - up would lead to expensive PCB material choices in high speed designs. In this paper, a n efficient global optimization method using parallel and intelligent Bayesian optimization is proposed for the stripline design . In high speed system design, optimizing printed circuit board (PCB) stack - up is playing a more and more important role in design stage.

bayesian optimization, objective function, optimization, (15 more...)

1911.04317

Country:

Asia > Japan (0.05)
North America > United States > Texas > Williamson County > Round Rock (0.04)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Energy (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.56)

Bae, Kiwook, Ryu, Heechang, Shin, Hayong

Does Adam optimizer keep close to the optimal point?

arXiv.org Machine LearningNov-1-2019

The adaptive optimizer for training neural networks has continually evolved to overcome the limitations of the previously proposed adaptive methods. Recent studies have found the rare counterexamples that Adam cannot converge to the optimal point. Those counterexamples reveal the distortion of Adam due to a small second momentum from a small gradient. Unlike previous studies, we show Adam cannot keep closer to the optimal point for not only the counterexamples but also a general convex region when the effective learning rate exceeds the certain bound. Subsequently, we propose an algorithm that overcomes Adam's limitation and ensures that it can reach and stay at the optimal point region.

effective learning rate, learning rate, optimal point, (14 more...)

1911.00289

Country: North America > Canada (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.37)

Haddadpour, Farzin, Mahdavi, Mehrdad

On the Convergence of Local Descent Methods in Federated Learning

In federated distributed learning, the goal is to optimize a global training objective defined over distributed devices, where the data shard at each device is sampled from a possibly different distribution (a.k.a., heterogeneous or non i.i.d. data samples). In this paper, we generalize the local stochastic and full gradient descent with periodic averaging-- originally designed for homogeneous distributed optimization, to solve nonconvex optimization problems in federated learning. Although scant research is available on the effectiveness of local SGD in reducing the number of communication rounds in homogeneous setting, its convergence and communication complexity in heterogeneous setting is mostly demonstrated empirically and lacks through theoretical understating. To bridge this gap, we demonstrate that by properly analyzing the effect of unbiased gradients and sampling schema in federated setting, under mild assumptions, the implicit variance reduction feature of local distributed methods generalize to heterogeneous data shards and exhibits the best known convergence rates of homogeneous setting both in general nonconvex and under {\pl}~ condition (generalization of strong-convexity). Our theoretical results complement the recent empirical studies that demonstrate the applicability of local GD/SGD to federated learning. We also specialize the proposed local method for networked distributed optimization. To the best of our knowledge, the obtained convergence rates are the sharpest known to date on the convergence of local decant methods with periodic averaging for solving nonconvex federated optimization in both centralized and networked distributed optimization.

artificial intelligence, convergence, machine learning, (16 more...)

1910.14425

Country:

North America > United States > Virginia (0.04)
North America > United States > Pennsylvania > Centre County > University Park (0.04)

Genre: Research Report > New Finding (0.45)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)

Babier, Aaron, Mahmood, Rafid, McNiven, Andrea L., Diamant, Adam, Chan, Timothy C. Y.

The importance of evaluating the complete automated knowledge-based planning pipeline

arXiv.org Artificial IntelligenceOct-31-2019

We determine how prediction methods combine with optimization methods in two-stage knowledge-based planning (KBP) pipelines to produce radiation therapy treatment plans. We trained two dose prediction methods, a generative adversarial network (GAN) and a random forest (RF) with the same 130 treatment plans. The models were applied to 87 out-of-sample patients to create two sets of predicted dose distributions that were used as input to two optimization models. The first optimization model, inverse planning (IP), estimates weights for dose-objectives from a predicted dose distribution and generates new plans using conventional inverse planning. The second optimization model, dose mimicking (DM), minimizes the sum of one-sided quadratic penalties between the predictions and the generated plans using several dose-objectives. Altogether, four KBP pipelines (GAN-IP, GAN-DM, RF-IP, and RF-DM) were constructed and benchmarked against the corresponding clinical plans using clinical criteria; the error of both prediction methods was also evaluated. The best performing plans were GAN-IP plans, which satisfied the same criteria as their corresponding clinical plans (78%) more often than any other KBP pipeline. However, GAN did not necessarily provide the best prediction for the second-stage optimization models. Specifically, both the RF-IP and RF-DM plans satisfied all clinical criteria 25% and 15% more often than GAN-DM plans (the worst performing planning), respectively. GAN predictions also had a higher mean absolute error (3.9 Gy) than those from RF (3.6 Gy). We find that state-of-the-art prediction methods when paired with different optimization algorithms, produce treatment plans with considerable variation in quality.

dose distribution, optimization model, prediction, (16 more...)

arXiv.org Artificial Intelligence

1910.14257

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.51)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Nuclear Medicine (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Lee, Hao-Chih, Danieletto, Matteo, Miotto, Riccardo, Cherng, Sarah T., Dudley, Joel T.

Scaling structural learning with NO-BEARS to infer causal transcriptome networks

Constructing gene regulatory networks is a critical step in revealing disease mechanisms from transcriptomic data. In this work, we present NO-BEARS, a novel algorithm for estimating gene regulatory networks. The NO-BEARS algorithm is built on the basis of the NOTEARS algorithm with two improvements. First, we propose a new constraint and its fast approximation to reduce the computational cost of the NO-TEARS algorithm. Next, we introduce a polynomial regression loss to handle non-linearity in gene expressions. Our implementation utilizes modern GPU computation that can decrease the time of hours-long CPU computation to seconds. Using synthetic data, we demonstrate improved performance, both in processing time and accuracy, on inferring gene regulatory networks from gene expression data.

algorithm, iteration, no-tear algorithm, (14 more...)

1911.00081

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.47)

Graber, Colin, Schwing, Alexander

Graph Structured Prediction Energy Networks

For joint inference over multiple variables, a variety of structured prediction techniques have been developed to model correlations among variables and thereby improve predictions. However, many classical approaches suffer from one of two primary drawbacks: they either lack the ability to model high-order correlations among variables while maintaining computationally tractable inference, or they do not allow to explicitly model known correlations. To address this shortcoming, we introduce `Graph Structured Prediction Energy Networks,' for which we develop inference techniques that allow to both model explicit local and implicit higher-order correlations while maintaining tractability of inference. We apply the proposed method to tasks from the natural language processing and computer vision domain and demonstrate its general utility.

inference, prediction, proc, (14 more...)

1910.1467

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
North America > Canada (0.04)

Genre: Research Report (0.50)

Industry: Energy > Power Industry (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
(3 more...)

Bojchevski, Aleksandar, Günnemann, Stephan

Certifiable Robustness to Graph Perturbations

Despite the exploding interest in graph neural networks there has been little effort to verify and improve their robustness. This is even more alarming given recent findings showing that they are extremely vulnerable to adversarial attacks on both the graph structure and the node attributes. We propose the first method for verifying certifiable (non-)robustness to graph perturbations for a general class of models that includes graph neural networks and label/feature propagation. By exploiting connections to PageRank and Markov decision processes our certificates can be efficiently (and under many threat models exactly) computed. Furthermore, we investigate robust training procedures that increase the number of certifiably robust nodes while maintaining or improving the clean predictive accuracy.

budget, constraint, node, (17 more...)

1910.14356

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Government (0.67)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

#artificialintelligenceOct-30-2019, 09:27:24 GMT

Efficient New Approaches for Data-Driven Global Optimization, with Applications in Computer-Aided Design

Alongside derivative-based methods, which scale better to higher-dimensional problems, derivative-free methods play an essential role in the optimization of many practical engineering systems, especially those in which function evaluations are determined by statistical averaging, and those for which the function of interest is nonconvex in the adjustable parameters. This work focuses on the development of a new family of surrogate-based derivative-free optimization schemes, namely $\Delta$-DOGS schemes. The idea unifying this efficient and (under the appropriate assumptions) provably-globally-convergent family of schemes is the minimization of a search function which linearly combines a computationally inexpensive ''surrogate (that is, an interpolation, or in some cases a regression, of recent function evaluations - we generally favor some variant of polyharmonic splines for this purpose), to summarize the trends evident in the data available thus far, with a synthetic piecewise-quadratic ''uncertainty function (built on the framework of a Delaunay triangulation of existing datapoints), to characterize the reliability of the surrogate by quantifying the distance of any given point in parameter space to the nearest function evaluations. The grid is successively refined as convergence is approached. Moreover, it handles the linear constraint domain. This work also introduces a method to scale the parameter domain under consideration based on adaptive variation of the seen data in the optimization process, thereby obtaining a significantly smoother surrogate.

application, data-driven global optimization, delta, (10 more...)

#artificialintelligence

Industry: Information Technology > Software (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.52)
Information Technology > CAD (0.40)