AITopics

1812.07211

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.27)
North America > United States > New York (0.04)
Europe > France (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine (1.00)
Banking & Finance > Trading (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Hämäläinen, Perttu, Babadi, Amin, Ma, Xiaoxiao, Lehtinen, Jaakko

PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation

Proximal Policy Optimization (PPO) is a highly popular model-free reinforcement learning (RL) approach. However, in continuous state and actions spaces and a Gaussian policy -- common in computer animation and robotics -- PPO is prone to getting stuck in local optima. In this paper, we observe a tendency of PPO to prematurely shrink the exploration variance, which naturally leads to slow progress. Motivated by this, we borrow ideas from CMA-ES, a black-box optimization method designed for intelligent adaptive Gaussian exploration, to derive PPO-CMA, a novel proximal policy optimization approach that can expand the exploration variance on objective function slopes and shrink the variance when close to the optimum. This is implemented by using separate neural networks for policy mean and variance and training the mean and variance in separate passes. Our experiments demonstrate a clear improvement over vanilla PPO in many difficult OpenAI Gym MuJoCo tasks.

machine learning, reinforcement learning, variance, (17 more...)

1810.02541

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Leng, Yan, Dong, Xiaowen, Pentland, Alex

Learning Quadratic Games on Networks

Individuals, or organizations, cooperate with or compete against one another in a wide range of practical situations. In the economics literature, such strategic interactions are often modeled as games played on networks, where an individual's payoff depends not only on her action but also that of her neighbors. The current literature has largely focused on analyzing the characteristics of network games in the scenario where the structure of the network, which is represented by a graph, is known beforehand. It is often the case, however, that the actions of the players are readily observable while the underlying interaction network remains hidden. In this paper, we propose two novel frameworks for learning, from the observations on individual actions, network games with linear-quadratic payoffs, and in particular the structure of the interaction network. Our frameworks are based on the Nash equilibrium of such games and involve solving a joint optimization problem for the graph structure and the individual marginal benefits. We test the proposed frameworks in synthetic settings and further study several factors that affect their learning performance. Moreover, with experiments on three real world examples, we show that our methods can effectively and more accurately learn the games than the baselines. The proposed approach is among the first of its kind for learning quadratic games, and have both theoretical and practical implications for understanding strategic interactions in a network environment.

artificial intelligence, machine learning, optimization problem, (17 more...)

1811.0879

Country:

North America > United States > Massachusetts (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report (1.00)

Industry:

Banking & Finance > Trading (0.68)
Health & Medicine (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Generative One-Shot Learning (GOL): A Semi-Parametric Approach to One-Shot Learning in Autonomous Vision

Grigorescu, Sorin

Highly Autonomous Driving (HAD) systems rely on deep neural networks for the visual perception of the driving environment. Such networks are trained on large manually annotated databases. In this work, a semi-parametric approach to one-shot learning is proposed, with the aim of bypassing the manual annotation step required for training perceptions systems used in autonomous driving. The proposed generative framework, coined Generative One-Shot Learning (GOL), takes as input single one-shot objects, or generic patterns, and a small set of so-called regularization samples used to drive the generative process. New synthetic data is generated as Pareto optimal solutions from one-shot objects using a set of generalization functions built into a generalization generator. GOL has been evaluated on environment perception challenges encountered in autonomous vision.

artificial intelligence, machine learning, optimization problem, (17 more...)

doi: 10.1109/ICRA.2018.8461174

1812.07567

Country:

Europe (0.68)
North America > Canada (0.28)

Genre: Research Report (0.64)

Industry: Transportation > Ground > Road (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Immorlica, Nicole, Sankararaman, Karthik Abinav, Schapire, Robert, Slivkins, Aleksandrs

Adversarial Bandits with Knapsacks

We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a well-known knapsack problem: find an optimal packing of items into a limited-size knapsack. The BwK problem is a common generalization of numerous motivating examples, which range from dynamic pricing to repeated auctions to dynamic ad allocation to network routing and scheduling. While the prior work on BwK focused on the stochastic version, we pioneer the other extreme in which the outcomes can be chosen adversarially. This is a considerably harder problem, compared to both the stochastic version and the "classic" adversarial bandits, in that regret minimization is no longer feasible. Instead, the objective is to minimize the competitive ratio: the ratio of the benchmark reward to the algorithm's reward. We design an algorithm with competitive ratio O(log T) relative to the best fixed distribution over actions, where T is the time horizon; we also prove a matching lower bound. The key conceptual contribution is a new perspective on the stochastic version of the problem. We suggest a new algorithm for the stochastic version, which builds on the framework of regret minimization in repeated games and admits a substantially simpler analysis compared to prior work. We then analyze this algorithm for the adversarial version and use it as a subroutine to solve the latter.

artificial intelligence, data mining, machine learning, (23 more...)

1811.11881

Country: North America > United States > Maryland (0.27)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Lecoutre, Christophe, Roussel, Olivier

Proceedings of the 2018 XCSP3 Competition

arXiv.org Artificial IntelligenceDec-17-2018

This document represents the proceedings of the 2018 XCSP3 Competition. The results of this competition of constraint solvers were presented at CP'18, the 24th International Conference on Principles and Practice of Constraint Programming, held in Lille, France from 27th August 2018 to 31th August, 2018.

machine learning, natural language, programming language, (22 more...)

1901.0183

Country:

Europe > France > Hauts-de-France > Nord > Lille (0.24)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(14 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Sports (1.00)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(5 more...)

Chou, Glen, Berenson, Dmitry, Ozay, Necmiye

Learning Constraints from Demonstrations

arXiv.org Artificial IntelligenceDec-17-2018

We extend the learning from demonstration paradigm by providing a method for learning unknown constraints shared across tasks, using demonstrations of the tasks, their cost functions, and knowledge of the system dynamics and control constraints. Given safe demonstrations, our method uses hit-and-run sampling to obtain lower cost, and thus unsafe, trajectories. Both safe and unsafe trajectories are used to obtain a consistent representation of the unsafe set via solving an integer program. Our method generalizes across system dynamics and learns a guaranteed subset of the constraint. We also provide theoretical analysis on what subset of the constraint can be learnable from safe demonstrations. We demonstrate our method on linear and nonlinear system dynamics, show that it can be modified to work with suboptimal demonstrations, and that it can also be used to learn constraints in a feature space.

constraint, demonstration, trajectory, (15 more...)

1812.07084

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Lv, Zhaoyang, Dellaert, Frank, Rehg, James M., Geiger, Andreas

Taking a Deeper Look at the Inverse Compositional Algorithm

arXiv.org Artificial IntelligenceDec-17-2018

In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment. We first discuss the assumptions made by this well-established technique, and subsequently propose to relax these assumptions by incorporating data-driven priors into this model. More specifically, we unroll a robust version of the inverse compositional algorithm and replace multiple components of this algorithm using more expressive models whose parameters we train in an end-to-end fashion from data. Our experiments on several challenging 3D rigid motion estimation tasks demonstrate the advantages of combining optimization with learning-based techniques, outperforming the classic inverse compositional algorithm as well as data-driven image-to-pose regression approaches.

artificial intelligence, computer vision, machine learning, (15 more...)

1812.06861

Country: North America > United States (0.67)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

Bertsimas, Dimitris, Li, Michael Lingzhi

Interpretable Matrix Completion: A Discrete Optimization Approach

arXiv.org Machine LearningDec-17-2018

We consider the problem of matrix completion with side information on an $n\times m$ matrix. We formulate the problem exactly as a sparse regression problem of selecting features and show that it can be reformulated as a binary convex optimization problem. We design OptComplete, based on a novel concept of stochastic cutting planes to enable efficient scaling of the algorithm up to matrices of sizes $n = 10^6$ and $m = 10^5$. We report experiments on both synthetic and real-world datasets that show that OptComplete outperforms current state-of-the-art methods both in terms of accuracy and scalability, while providing insight on the factors that affect the ratings.

algorithm, artificial intelligence, optimization problem, (14 more...)

1812.06647

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > Promising Solution (0.68)

Industry:

Media > Film (0.94)
Leisure & Entertainment (0.94)
Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Dantas, Cassio, Gribonval, Rémi

Stable safe screening and structured dictionaries for faster $\ell\_{1}$ regularization

arXiv.org Machine LearningDec-17-2018

In this paper, we propose a way to combine two acceleration techniques for the $\ell_1$-regularized least squares problem: safe screening tests, which allow to eliminate useless dictionary atoms, and the use of fast structured approximations of the dictionary matrix. To do so, we introduce a new family of screening tests, termed stable screening, which can cope with approximation errors on the dictionary atoms while keeping the safety of the test (i.e. zero risk of rejecting atoms belonging to the solution support). Some of the main existing screening tests are extended to this new framework. The proposed algorithm consists in using a coarser (but faster) approximation of the dictionary at the initial iterations and then switching to better approximations until eventually adopting the original dictionary. A systematic switching criterion based on the duality gap saturation and the screening ratio is derived.Simulation results show significant reductions in both computational complexity and execution times for a wide range of tested scenarios.

algorithm, artificial intelligence, machine learning, (20 more...)

1812.06635

Country: North America (0.67)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)