AITopics

1812.08598

Country: Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Genre: Research Report (0.82)

Industry: Energy > Power Industry > Utilities > Nuclear (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

arXiv.org Artificial IntelligenceDec-13-2018

Doubly Bayesian Optimization

Lavin, Alexander

Bayesian optimization (BO) is a powerful method for optimizing complex black-box functions that are costly to evaluate directly. Although useful out of the box, complexities arise when the domain exhibits non-smooth structure, noise, or greater than five dimensions. Extending BO for these issues is non-trivial, which is why we suggest casting BO methods into the probabilistic programming paradigm. These systems (PPS) enable users to encode model structure and naturally reason about uncertainties, which can be leveraged towards improved BO methods. Here we present a probabilistic domain-specific language where BO is native, showing the Bayesian approach to optimization is more naturally expressed in a PPS, and better equipped to address the above issues. We validate the approach on standard optimization benchmarks, while demonstrating the utility of programmable structure to address the inner-optimization problem of BO. Importantly, we also show that the framework enables the user to more readily use advanced techniques such as unscented BO and noisy expected improvement.

artificial intelligence, machine learning, optimization, (15 more...)

1812.04562

Country:

North America > United States (0.46)
North America > Canada (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Altschuler, Jason, Bach, Francis, Rudi, Alessandro, Weed, Jonathan

Massively scalable Sinkhorn distances via the Nystr\"om method

arXiv.org Machine LearningDec-12-2018

The Sinkhorn distance, a variant of the Wasserstein distance with entropic regularization, is an increasingly popular tool in machine learning and statistical inference. We give a simple, practical, parallelizable algorithm NYS-SINK, based on Nystr\"om approximation, for computing Sinkhorn distances on a massive scale. As we show in numerical experiments, our algorithm easily computes Sinkhorn distances on data sets hundreds of times larger than can be handled by state-of-the-art approaches. We also give provable guarantees establishing that the running time and memory requirements of our algorithm adapt to the intrinsic dimension of the underlying data.

algorithm, artificial intelligence, machine learning, (19 more...)

1812.05189

Country:

North America > United States (1.00)
Europe (0.93)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

arXiv.org Artificial IntelligenceDec-12-2018

MetaStyle: Three-Way Trade-Off Among Speed, Flexibility, and Quality in Neural Style Transfer

Zhang, Chi, Zhu, Yixin, Zhu, Song-Chun

An unprecedented booming has been witnessed in the research area of artistic style transfer ever since Gatys et al. introduced the neural method. One of the remaining challenges is to balance a trade-off among three critical aspects---speed, flexibility, and quality: (i) the vanilla optimization-based algorithm produces impressive results for arbitrary styles, but is unsatisfyingly slow due to its iterative nature, (ii) the fast approximation methods based on feed-forward neural networks generate satisfactory artistic effects but bound to only a limited number of styles, and (iii) feature-matching methods like AdaIN achieve arbitrary style transfer in a real-time manner but at a cost of the compromised quality. We find it considerably difficult to balance the trade-off well merely using a single feed-forward step and ask, instead, whether there exists an algorithm that could adapt quickly to any style, while the adapted model maintains high efficiency and good image quality. Motivated by this idea, we propose a novel method, coined MetaStyle, which formulates the neural style transfer as a bilevel optimization problem and combines learning with only a few post-processing update steps to adapt to a fast approximation model with satisfying artistic effects, comparable to the optimization-based methods for an arbitrary style. The qualitative and quantitative analysis in the experiments demonstrates that the proposed approach achieves high-quality arbitrary artistic style transfer effectively, with a good trade-off among speed, flexibility, and quality.

artificial intelligence, machine learning, style transfer, (17 more...)

1812.05233

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningDec-11-2018

Online Newton Step Algorithm with Estimated Gradient

Liu, Binbin, Li, Jundong, Song, Yunquan, Liang, Xijun, Jian, Ling, Liu, Huan

They have shown to be effective in handling large-scale and high-velocity streaming data and emerged to become popular in the big data era Hoi et al. [2018, 2014]. In recent years, a number of effective online learning algorithms have been investigated and applied in a variety of high impact domains,ranging from game theory, information theory to machine learning and data mining Ding et al. [2017], Shalev-Shwartz [2011], Wang et al. [2003]. Most previously proposed online learning algorithms fall into the wellestablished frameworkof online convex optimization Gordon [1999], Zinkevich [2003]. In terms of the optimization algorithms, online learning algorithms can be grouped into the following categories: (i) first-order algorithms which aim to optimize the objective function using the first-order (sub) gradient such as the well-known OGD algorithm Zinkevich [2003]; and (ii) second-order algorithms which aim to exploit second-order information to speed up the convergence of the optimization, such as the ONS algorithm Hazan et al. [2007]. In online convex optimization, previous approaches are mainly based on the first-order optimization, i.e., optimization using the first-order derivative of the cost function. Theregret bound achieved by these algorithms is proportional to the polynomial of the number of rounds T . For example, Zinkevich [2003] showed that with the simple OGD, we can achieve the regret bound of O( T). Later on, Hazan et al. [2007] introduced a new algorithm with ONS by exploiting the second-order derivative of the cost function, which can be viewed as an online 2 analogy of the Newton-Raphson method Ypma and Tjalling [1995] in the offline learning.Although the time complexity O(d

artificial intelligence, data mining, machine learning, (16 more...)

1811.09955

Country: North America (0.46)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

arXiv.org Machine LearningDec-11-2018

On the Curved Geometry of Accelerated Optimization

Defazio, Aaron

In this work we propose a differential geometric motivation for Nesterov's accelerated gradient method (AGM) for strongly-convex problems. By considering the optimization procedure as occurring on a Riemannian manifold with a natural structure, The AGM method can be seen as the proximal point method applied in this curved space. This viewpoint can also be extended to the continuous time case, where the accelerated gradient method arises from the natural block-implicit Euler discretization of an ODE on the manifold. We provide an analysis of the convergence rate of this ODE for quadratic objectives.

artificial intelligence, machine learning, optimization problem, (16 more...)

1812.04634

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

arXiv.org Machine LearningDec-11-2018

Why Does Stagewise Training Accelerate Convergence of Testing Error Over SGD?

Yang, Tianbao, Yan, Yan, Yuan, Zhuoning, Jin, Rong

Stagewise training strategy is commonly used for learning neural networks, which uses a stochastic algorithm (e.g., SGD) starting with a relatively large step size (aka learning rate) and geometrically decreasing the step size after a number of iterations. It has been observed that the stagewise SGD has much faster convergence than the vanilla SGD with a polynomial decaying step size in terms of both training error and testing error. {\it But how to explain this phenomenon has been largely ignored by existing studies.} This paper provides some theoretical evidence for explaining this faster convergence. In particular, we consider the stagewise training strategy for minimizing empirical risk that satisfies the Polyak-\L ojasiewicz condition, which has been observed/proved for neural networks and also holds for a broad family of convex functions. For convex loss functions and "nice-behaviored" non-convex loss functions that are close to a convex function (namely weakly convex functions), we establish faster convergence of stagewise training than the vanilla SGD under the same condition on both training error and testing error. Indeed, the proposed algorithm has additional favorable features that come with theoretical guarantee for the considered non-convex optimization problems, including using explicit algorithmic regularization at each stage, using stagewise averaged solution for restarting, and returning the last stagewise averaged solution as the final solution. To differentiate from commonly used stagewise SGD, we refer to our algorithm as stagewise regularized training algorithm. Of independent interest, the proved testing error bounds for a family of non-convex loss functions are dimensionality and norm independent.

artificial intelligence, machine learning, optimization problem, (13 more...)

1812.03934

Country: North America > United States > Iowa (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Gora, Paweł, Brzeski, Maciej, Możejko, Marcin, Klemenko, Arkadiusz, Kochański, Adrian

Investigating performance of neural networks and gradient boosting models approximating microscopic traffic simulations in traffic optimization tasks

arXiv.org Machine LearningDec-10-2018

We analyze the accuracy of traffic simulations metamodels based on neural networks and gradient boosting models (LightGBM), applied to traffic optimization as fitness functions of genetic algorithms. Our metamodels approximate outcomes of traffic simulations (the total time of waiting on a red signal) taking as an input different traffic signal settings, in order to efficiently find (sub)optimal settings. Their accuracy was proven to be very good on randomly selected test sets, but it turned out that the accuracy may drop in case of settings expected (according to genetic algorithms) to be close to local optima, which makes the traffic optimization process more difficult. In this work, we investigate 16 different metamodels and 20 settings of genetic algorithms, in order to understand what are the reasons of this phenomenon, what is its scale, how it can be mitigated and what can be potentially done to design better real-time traffic optimization methods.

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

1812.00401

Country: Europe > Poland (0.14)

Genre: Research Report (0.84)

Industry: Transportation > Infrastructure & Services (0.38)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

arXiv.org Machine LearningDec-10-2018

KF-LAX: Kronecker-factored curvature estimation for control variate optimization in reinforcement learning

Firouzi, Mohammad

A key challenge for gradient based optimization methods in model-free reinforcement learning is to develop an approach that is sample efficient and has low variance. In this work, we apply Kronecker-factored curvature estimation technique (KFAC) to a recently proposed gradient estimator for control variate optimization, RELAX, to increase the sample efficiency of using this gradient estimation method in reinforcement learning. The performance of the proposed method is demonstrated on a synthetic problem and a set of three discrete control task Atari games.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1812.04181

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Song, Jiaming, Kalluri, Pratyusha, Grover, Aditya, Zhao, Shengjia, Ermon, Stefano

Learning Controllable Fair Representations

arXiv.org Artificial IntelligenceDec-10-2018

Learning data representations that are transferable and fair with respect to certain protected attributes is crucial to reducing unfair decisions made downstream, while preserving the utility of the data. We propose an information-theoretically motivated objective for learning maximally expressive representations subject to fairness constraints. We demonstrate that a range of existing approaches optimize approximations to the Lagrangian dual of our objective. In contrast to these existing approaches, our objective provides the user control over the fairness of representations by specifying limits on unfairness. We introduce a dual optimization method that optimizes the model as well as the expressiveness-fairness trade-off. Empirical evidence suggests that our proposed method can account for multiple notions of fairness and achieves higher expressiveness at a lower computational cost.

artificial intelligence, machine learning, representation, (17 more...)

1812.04218

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)