AITopics

Lagrangian methods are widely used algorithms for constrained optimization problems, but their learning dynamics exhibit oscillations and overshoot which, when applied to safe reinforcement learning, lead to constraint-violating behavior during agent training. We address this shortcoming by proposing a novel Lagrange multiplier update method that utilizes derivatives of the constraint function. We take a controls perspective, wherein the traditional Lagrange multiplier update behaves as \emph{integral} control; our terms introduce \emph{proportional} and \emph{derivative} control, achieving favorable learning dynamics through damping and predictive measures. We apply our PID Lagrangian methods in deep RL, setting a new state of the art in Safety Gym, a safe RL benchmark. Lastly, we introduce a new method to ease controller tuning by providing invariance to the relative numerical scales of reward and cost. Our extensive experiments demonstrate improved performance and hyperparameter robustness, while our algorithms remain nearly as simple to derive and implement as the traditional Lagrangian approach.
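As a rough illustration of the controls perspective in this abstract, the sketch below computes a Lagrange multiplier from the constraint violation using proportional, integral, and derivative terms. The class name `PIDLagrangian`, the gains `kp`, `ki`, `kd`, the `cost_limit` parameter, and the clipping choices are illustrative assumptions, not the authors' implementation. With `kp = kd = 0` it reduces to a plain integral update.

```python
class PIDLagrangian:
    """Hypothetical PID-style Lagrange multiplier update (illustrative only)."""

    def __init__(self, kp, ki, kd, cost_limit):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.cost_limit = cost_limit   # assumed constraint threshold
        self.integral = 0.0            # accumulated constraint violation
        self.prev_cost = None          # last observed cost, for the derivative

    def update(self, cost):
        """Return a non-negative multiplier for the latest measured cost."""
        error = cost - self.cost_limit
        # Integral term: running sum of violations, kept non-negative.
        self.integral = max(0.0, self.integral + error)
        # Derivative term: react only to rising cost (assumed design choice).
        if self.prev_cost is None:
            derivative = 0.0
        else:
            derivative = max(0.0, cost - self.prev_cost)
        self.prev_cost = cost
        multiplier = (self.kp * error
                      + self.ki * self.integral
                      + self.kd * derivative)
        return max(0.0, multiplier)


controller = PIDLagrangian(kp=1.0, ki=0.5, kd=2.0, cost_limit=10.0)
multipliers = [controller.update(c) for c in (12.0, 14.0, 0.0)]
```

In this sketch the multiplier rises quickly while the cost is climbing and drops back to zero once the cost falls well below the limit.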

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2007.03964

Country:

Europe > Austria > Vienna (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Marteau-Ferey, Ulysse, Bach, Francis, Rudi, Alessandro

Non-parametric Models for Non-negative Functions

Linear models have shown great effectiveness and flexibility in many fields such as machine learning, signal processing and statistics. They can represent rich spaces of functions while preserving the convexity of the optimization problems where they are used, and are simple to evaluate, differentiate and integrate. However, for modeling non-negative functions, which are crucial for unsupervised learning, density estimation, or non-parametric Bayesian methods, linear models are not applicable directly. Moreover, current state-of-the-art models like generalized linear models either lead to non-convex optimization problems, or cannot be easily integrated. In this paper we provide the first model for non-negative functions which benefits from the same good properties as linear models. In particular, we prove that it admits a representer theorem and provide an efficient dual formulation for convex problems. We study its representation power, showing that the resulting space of functions is strictly richer than that of generalized linear models. Finally, we extend the model and the theoretical results to functions with outputs in convex cones. The paper is complemented by an experimental evaluation of the model showing its effectiveness in terms of formulation, algorithmic derivation and practical results on the problems of density estimation, regression with heteroscedastic errors, and multiple quantile regression.

artificial intelligence, linear model, machine learning, (20 more...)

2007.03926

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)
(2 more...)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Sun, Chuangchuang, Kim, Dong-Ki, How, Jonathan P.

Set-Invariant Constrained Reinforcement Learning with a Meta-Optimizer

This paper investigates reinforcement learning with constraints, which is indispensable in safety-critical environments. To make the constraint violation decrease monotonically, the constraints are taken as Lyapunov functions, and new linear constraints are imposed on the updating dynamics of the policy parameters such that the original safety set is forward-invariant in expectation. As the new guaranteed-feasible constraints are imposed on the updating dynamics instead of the original policy parameters, classic optimization algorithms are no longer applicable. To address this, we propose to learn a neural network-based meta-optimizer to optimize the objective while satisfying such linear constraints. Constraint satisfaction is achieved via projection onto a polytope formulated by multiple linear inequality constraints, which can be solved analytically with our newly designed metric. Ultimately, the meta-optimizer trains the policy network to monotonically decrease the constraint violation and maximize the cumulative reward. Numerical results validate the theoretical findings.

constraint, machine learning, reinforcement learning, (14 more...)

2006.11419

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Liu, Ruisen, Natarajan, Manisha, Gombolay, Matthew

Human-Robot Team Coordination with Dynamic and Latent Human Task Proficiencies: Scheduling with Learning Curves

As robots become ubiquitous in the workforce, it is essential that human-robot collaboration be both intuitive and adaptive. A robot's quality improves based on its ability to explicitly reason about the time-varying (i.e., learning curves) and stochastic capabilities of its human counterparts, and adjust the joint workload to improve efficiency while factoring in human preferences. We introduce a novel resource coordination algorithm that enables robots to explore the relative strengths and learning abilities of their human teammates, by constructing schedules that are robust to stochastic and time-varying human task performance. We first validate our algorithmic approach using data we collected from a user study (n = 20), showing we can quickly generate and evaluate a robust schedule while discovering the latest individual worker proficiency. Second, we conduct a between-subjects experiment (n = 90) to validate the efficacy of our coordinating algorithm. Results from the human-subjects experiment indicate that scheduling strategies favoring exploration tend to be beneficial for human-robot collaboration, as they improve team fluency (p = 0.0438) while also maximizing team efficiency (p < 0.001).

artificial intelligence, constraint-based reasoning, optimization problem, (17 more...)

2007.01921

Country: North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education (0.54)
Leisure & Entertainment > Games (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.84)

arXiv.org Machine Learning, Jul-8-2020

Linear-Time Algorithms for Adaptive Submodular Maximization

Tang, Shaojie

In this paper, we develop fast algorithms for two stochastic submodular maximization problems. We start with the well-studied adaptive submodular maximization problem subject to a cardinality constraint. We develop the first linear-time algorithm which achieves a $(1-1/e-\epsilon)$ approximation ratio. Notably, the time complexity of our algorithm is $O(n\log\frac{1}{\epsilon})$ (number of function evaluations), which is independent of the cardinality constraint, where $n$ is the size of the ground set. Then we introduce the concept of fully adaptive submodularity, and develop a linear-time algorithm for maximizing a fully adaptive submodular function subject to a partition matroid constraint. We show that our algorithm achieves a $\frac{1-1/e-\epsilon}{4-2/e-2\epsilon}$ approximation ratio using only $O(n\log\frac{1}{\epsilon})$ function evaluations.
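To give a feel for the two approximation ratios quoted in this abstract, the short sketch below evaluates them numerically for a given $\epsilon$. The function names are hypothetical and the code only evaluates the stated formulas; it does not implement either algorithm.

```python
import math


def cardinality_ratio(eps):
    """Evaluate the stated ratio 1 - 1/e - eps (cardinality constraint)."""
    return 1.0 - 1.0 / math.e - eps


def partition_matroid_ratio(eps):
    """Evaluate the stated ratio (1 - 1/e - eps) / (4 - 2/e - 2*eps)."""
    return cardinality_ratio(eps) / (4.0 - 2.0 / math.e - 2.0 * eps)


# As eps approaches 0 the ratios approach roughly 0.632 and 0.194.
limit_cardinality = cardinality_ratio(0.0)
limit_partition = partition_matroid_ratio(0.0)
```

Writing $r = 1-1/e-\epsilon$, the second ratio equals $r/(2+2r)$, so it always stays below $1/4$.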

algorithm, artificial intelligence, machine learning, (19 more...)

2007.04214

Country: North America > United States > Texas (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Feurer, Matthias, Eggensperger, Katharina, Falkner, Stefan, Lindauer, Marius, Hutter, Frank

Auto-Sklearn 2.0: The Next Generation

arXiv.org Machine Learning, Jul-8-2020

Automated Machine Learning, which supports practitioners and researchers with the tedious task of manually designing machine learning pipelines, has recently achieved substantial success. In this paper we introduce new Automated Machine Learning (AutoML) techniques motivated by our winning submission to the second ChaLearn AutoML challenge, PoSH Auto-sklearn. For this, we extend Auto-sklearn with a new, simpler meta-learning technique, improve its way of handling iterative algorithms and enhance it with a successful bandit strategy for budget allocation. Furthermore, we go one step further and study the design space of AutoML itself and propose a solution towards truly hands-free AutoML. Together, these changes give rise to the next generation of our AutoML system, Auto-sklearn (2.0). We verify the improvement from these additions in a large experimental study on 39 AutoML benchmark datasets and conclude the paper by comparing to Auto-sklearn (1.0), reducing the regret by up to a factor of five.

artificial intelligence, machine learning, optimization problem, (14 more...)

2007.04074

Country:

Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

Jiang, Yiping, Chen, Tianshi

Accelerated Sparse Bayesian Learning via Screening Test and Its Applications

arXiv.org Machine Learning, Jul-8-2020

In high-dimensional settings, sparse structures are critical for efficiency in terms of memory and computational complexity. For a linear system, directly finding the sparsest solution given an over-complete dictionary of features is typically NP-hard, and thus alternative approximate methods should be considered. In this paper, our choice of alternative method is sparse Bayesian learning, which, as an empirical Bayesian approach, uses a parameterized prior to encourage sparsity in the solution, unlike methods with fixed priors such as LASSO. The screening test, however, aims at quickly identifying a subset of features whose coefficients are guaranteed to be zero in the optimal solution, and which can then be safely removed from the complete dictionary to obtain a smaller, more easily solved problem. Next, we solve the smaller problem, after which the solution of the original problem can be recovered by padding the smaller solution with zeros. The performance of the proposed method will be examined on various data sets and applications.
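The reduce-then-pad step described in this abstract can be sketched as follows. The screening rule itself is not given here, so the set `screened_out` is assumed to come from some screening test, and the reduced solution is assumed to come from any solver; the function names are hypothetical.

```python
def drop_screened(dictionary_columns, screened_out):
    """Remove screened-out features; return the reduced dictionary and
    the original indices of the features that were kept."""
    kept = [j for j in range(len(dictionary_columns)) if j not in screened_out]
    reduced = [dictionary_columns[j] for j in kept]
    return reduced, kept


def pad_with_zeros(reduced_solution, kept, n_features):
    """Recover a full-length solution by placing zeros at screened features."""
    full = [0.0] * n_features
    for j, value in zip(kept, reduced_solution):
        full[j] = value
    return full


# Toy dictionary with four features (columns); features 1 and 3 are
# assumed to have been flagged as zero by a screening test.
columns = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
reduced, kept = drop_screened(columns, screened_out={1, 3})
# Stand-in for the solution of the smaller problem (made-up values).
reduced_solution = [0.5, -1.5]
full_solution = pad_with_zeros(reduced_solution, kept, len(columns))
```

The padded vector has the same length as the original dictionary, with zeros exactly at the screened positions.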

artificial intelligence, bayesian inference, machine learning, (18 more...)

2007.04006

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Zhou, Yingxue, Wu, Zhiwei Steven, Banerjee, Arindam

Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification

arXiv.org Machine Learning, Jul-7-2020

Differentially private SGD (DP-SGD) is one of the most popular methods for solving differentially private empirical risk minimization (ERM). Due to its noisy perturbation on each gradient update, the error rate of DP-SGD scales with the ambient dimension $p$, the number of parameters in the model. Such dependence can be problematic for over-parameterized models where $p \gg n$, the number of training samples. Existing lower bounds on private ERM show that such dependence on $p$ is inevitable in the worst case. In this paper, we circumvent the dependence on the ambient dimension by leveraging a low-dimensional structure of gradient space in deep networks---that is, the stochastic gradients for deep nets usually stay in a low dimensional subspace in the training process. We propose Projected DP-SGD that performs noise reduction by projecting the noisy gradients to a low-dimensional subspace, which is given by the top gradient eigenspace on a small public dataset. We provide a general sample complexity analysis on the public dataset for the gradient subspace identification problem and demonstrate that under certain low-dimensional assumptions the public sample complexity only grows logarithmically in $p$. Finally, we provide a theoretical analysis and empirical evaluations to show that our method can substantially improve the accuracy of DP-SGD.
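The projection idea in this abstract can be sketched with NumPy as below: estimate a top-k gradient eigenspace from public gradients, then clip, perturb, and project the private gradients. The function names, the per-sample clipping, and the way `noise_std` is applied to the averaged gradient are assumptions for illustration; the sketch performs no privacy accounting and is not the authors' implementation.

```python
import numpy as np


def top_k_subspace(public_grads, k):
    """Orthonormal (p, k) basis of the top-k eigenspace of the public
    gradients' second-moment matrix (rows of public_grads are gradients)."""
    second_moment = public_grads.T @ public_grads / public_grads.shape[0]
    _, eigvecs = np.linalg.eigh(second_moment)  # eigenvalues in ascending order
    return eigvecs[:, -k:]                      # keep the k largest


def projected_noisy_gradient(private_grads, basis, clip_norm, noise_std, rng):
    """Clip each per-sample gradient, average, add Gaussian noise, and
    project the result onto the span of `basis`."""
    norms = np.linalg.norm(private_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped_mean = (private_grads * scale).mean(axis=0)
    noisy = clipped_mean + rng.normal(0.0, noise_std, size=clipped_mean.shape)
    return basis @ (basis.T @ noisy)


rng = np.random.default_rng(0)
public = rng.normal(size=(20, 5))    # 20 public gradients, p = 5
private = rng.normal(size=(8, 5))    # 8 private per-sample gradients
basis = top_k_subspace(public, k=2)
update = projected_noisy_gradient(private, basis, 1.0, 0.5, rng)
```

Because the noise is projected too, only its component inside the k-dimensional subspace survives, which is the intuition behind the reduced dependence on the ambient dimension $p$.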

artificial intelligence, machine learning, optimization problem, (18 more...)

2007.03813

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Minnesota (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)

Yang, Yuwen, Rajgopal, Jayant

Learning Combined Set Covering and Traveling Salesman Problem

arXiv.org Artificial Intelligence, Jul-7-2020

The Traveling Salesman Problem is one of the most intensively studied combinatorial optimization problems due both to its range of real-world applications and its computational complexity. When combined with the Set Covering Problem, it raises even more issues related to tractability and scalability. We study a combined Set Covering and Traveling Salesman problem and provide a mixed integer programming formulation to solve the problem. Motivated by applications where the optimal policy needs to be updated on a regular basis and repeatedly solving this via MIP can be computationally expensive, we propose a machine learning approach to effectively deal with this problem by providing an opportunity to learn from historical optimal solutions that are derived from the MIP formulation. We also present a case study using the vaccine distribution chain of the World Health Organization, and provide numerical results with data derived from four countries in sub-Saharan Africa.

artificial intelligence, machine learning, population center, (18 more...)

2007.03203

Country:

Africa > Sub-Saharan Africa (0.25)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Vaccines (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Tang, Shaojie, Yuan, Jing

Adaptive Cascade Submodular Maximization

arXiv.org Artificial Intelligence, Jul-7-2020

In this paper, we propose and study the cascade submodular maximization problem under the adaptive setting. The input of our problem is a set of items, each of which is in a particular state (i.e., the marginal contribution of an item) drawn from a known probability distribution. However, we cannot know its actual state before selecting it. As compared with existing studies on stochastic submodular maximization, one unique setting of our problem is that each item is associated with a continuation probability which represents the probability that one can continue to select the next item after selecting the current one. Intuitively, this term captures the externality of one item to all its subsequent items in terms of the opportunity of being selected. Therefore, the actual set of items that can be selected by a policy depends on the specific ordering it adopts to select items; this makes our problem fundamentally different from classical submodular set optimization problems. Our objective is to identify the best sequence of selecting items so as to maximize the expected utility of the selected items. We propose a class of stochastic utility functions, \emph{adaptive cascade submodular functions}, and show that the objective functions in many practical application domains satisfy adaptive cascade submodularity. Then we develop a $\frac{1-1/e}{8}$ approximation algorithm to the adaptive cascade submodular maximization problem.
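To illustrate why the selection order matters when each item carries a continuation probability, the sketch below computes the expected utility of a fixed sequence. It assumes a simplified, non-adaptive model in which item values are additive and independent, which is an assumption for illustration and much narrower than the adaptive cascade submodular setting of the paper; the function name is hypothetical.

```python
def expected_cascade_utility(sequence):
    """sequence: list of (expected_value, continuation_probability) pairs,
    in the order the items are selected."""
    reach_probability = 1.0   # probability that this position is reached
    total = 0.0
    for expected_value, continuation in sequence:
        total += reach_probability * expected_value
        # The next item is reachable only if the process continues.
        reach_probability *= continuation
    return total


item_a = (10.0, 0.5)   # valuable, but often ends the cascade
item_b = (4.0, 1.0)    # less valuable, never ends the cascade
utility_a_first = expected_cascade_utility([item_a, item_b])
utility_b_first = expected_cascade_utility([item_b, item_a])
```

Here selecting `item_b` first gives an expected utility of 14.0 versus 12.0 for the reverse order, even though both orders contain the same items.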

artificial intelligence, avg, optimization problem, (15 more...)

2007.03592

Country: North America > United States > Texas (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.35)