AITopics | Schapire, Robert

A Unified Model and Dimension for Interactive Estimation

Brukhim, Nataly, Dudik, Miroslav, Pacchiano, Aldo, Schapire, Robert

arXiv.org Artificial Intelligence, Jun-9-2023

We study an abstract framework for interactive learning called interactive estimation in which the goal is to estimate a target from its "similarity" to points queried by the learner. We introduce a combinatorial measure called dissimilarity dimension which largely captures learnability in our model. We present a simple, general, and broadly applicable algorithm, for which we obtain both regret and PAC generalization bounds that are polynomial in the new dimension. We show that our framework subsumes and thereby unifies two classic learning models: statistical-query learning and structured bandits. We also delineate how the dissimilarity dimension is related to well-known parameters for both frameworks, in some cases yielding significantly improved analyses.

artificial intelligence, dimension, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2306.06184

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.94)

Interactive Learning from Activity Description

Nguyen, Khanh, Misra, Dipendra, Schapire, Robert, Dudík, Miro, Shafto, Patrick

arXiv.org Artificial Intelligence, Feb-13-2021

We present a novel interactive learning protocol that enables training request-fulfilling agents by verbally describing their activities. Our protocol gives rise to a new family of interactive learning algorithms that offer complementary advantages against traditional algorithms like imitation learning (IL) and reinforcement learning (RL). We develop an algorithm that practically implements this protocol and employ it to train agents in two challenging request-fulfilling problems using purely language-description feedback. Empirical results demonstrate the strengths of our algorithm: compared to RL baselines, it is more sample-efficient; compared to IL baselines, it achieves competitive success rates while not requiring feedback providers to have agent-specific expertise. We also provide theoretical guarantees of the algorithm under certain assumptions on the teacher and the environment.

agent, artificial intelligence, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2102.07024

Country:

Europe (1.00)
North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report > New Finding (0.87)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Reinforcement Learning with Convex Constraints

Miryoosefi, Sobhan, Brantley, Kianté, Daumé, Hal III, Dudik, Miroslav, Schapire, Robert

arXiv.org Artificial Intelligence, Jun-21-2019

In standard reinforcement learning (RL), a learning agent seeks to optimize the overall reward. However, many key aspects of a desired behavior are more naturally expressed as constraints. For instance, the designer may want to limit the use of unsafe actions, increase the diversity of trajectories to enable exploration, or approximate expert trajectories when rewards are sparse. In this paper, we propose an algorithmic scheme that can handle a wide class of constraints in RL tasks, specifically, any constraints that require expected values of some vector measurements (such as the use of an action) to lie in a convex set. This captures previously studied constraints (such as safety and proximity to an expert), but also enables new classes of constraints (such as diversity). Our approach comes with rigorous theoretical guarantees and only relies on the ability to approximately solve standard RL tasks. As a result, it can be easily adapted to work with any model-free or model-based RL algorithm. In our experiments, we show that it matches previous algorithms that enforce safety via constraints, but can also enforce new properties that these algorithms cannot incorporate, such as diversity.

artificial intelligence, constraint, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

1906.09323

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Adversarial Bandits with Knapsacks

Immorlica, Nicole, Sankararaman, Karthik Abinav, Schapire, Robert, Slivkins, Aleksandrs

arXiv.org Machine Learning, Dec-18-2018

We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a well-known knapsack problem: find an optimal packing of items into a limited-size knapsack. The BwK problem is a common generalization of numerous motivating examples, which range from dynamic pricing to repeated auctions to dynamic ad allocation to network routing and scheduling. While the prior work on BwK focused on the stochastic version, we pioneer the other extreme in which the outcomes can be chosen adversarially. This is a considerably harder problem, compared to both the stochastic version and the "classic" adversarial bandits, in that regret minimization is no longer feasible. Instead, the objective is to minimize the competitive ratio: the ratio of the benchmark reward to the algorithm's reward. We design an algorithm with competitive ratio O(log T) relative to the best fixed distribution over actions, where T is the time horizon; we also prove a matching lower bound. The key conceptual contribution is a new perspective on the stochastic version of the problem. We suggest a new algorithm for the stochastic version, which builds on the framework of regret minimization in repeated games and admits a substantially simpler analysis compared to prior work. We then analyze this algorithm for the adversarial version and use it as a subroutine to solve the latter.

algorithm, game theory, optimization problem, (22 more...)

arXiv.org Machine Learning

1811.11881

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Functional Frank-Wolfe Boosting for General Loss Functions

Wang, Chu, Wang, Yingfei, E, Weinan, Schapire, Robert

arXiv.org Machine Learning, Oct-8-2015

Boosting is a generic learning method for classification and regression. Yet, as the number of base hypotheses becomes larger, boosting can lead to a deterioration of test performance. Overfitting is an important and ubiquitous phenomenon, especially in regression settings. To avoid overfitting, we consider using $l_1$ regularization. We propose a novel Frank-Wolfe type boosting algorithm (FWBoost) applied to general loss functions. By using exponential loss, the FWBoost algorithm can be rewritten as a variant of AdaBoost for binary classification. FWBoost algorithms have exactly the same form as existing boosting methods, in terms of making calls to a base learning algorithm with different weights update. This direct connection between boosting and Frank-Wolfe yields a new algorithm that is as practical as existing boosting methods but with new guarantees and rates of convergence. Experimental results show that the test performance of FWBoost is not degraded with larger rounds in boosting, which is consistent with the theoretical analysis.
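To make the link between $l_1$ regularization and Frank-Wolfe concrete, here is a minimal sketch of the generic Frank-Wolfe method over an $l_1$ ball. It is not the FWBoost algorithm from the paper; the function name `frank_wolfe_l1`, the step size 2/(t+2), and the toy quadratic objective are assumptions chosen for illustration. The point it shows is that each step moves toward a single signed coordinate, much as a boosting round adds a single base hypothesis.

```python
def frank_wolfe_l1(grad, dim, radius, steps):
    """Generic Frank-Wolfe over the l1 ball {x : ||x||_1 <= radius}.

    grad: function returning the gradient (a list) of a smooth convex loss.
    This is an illustrative sketch, not the paper's FWBoost procedure.
    """
    x = [0.0] * dim  # the origin is always feasible
    for t in range(steps):
        g = grad(x)
        # Linear minimization oracle for the l1 ball: the minimizer of <g, s>
        # is a vertex, i.e. one coordinate set to +/- radius.
        i = max(range(dim), key=lambda j: abs(g[j]))
        s = [0.0] * dim
        s[i] = -radius if g[i] > 0 else radius
        gamma = 2.0 / (t + 2.0)  # standard diminishing step size
        # Convex combination keeps the iterate inside the ball.
        x = [(1.0 - gamma) * xj + gamma * sj for xj, sj in zip(x, s)]
    return x


def quadratic_grad(center):
    """Gradient of the toy loss 0.5 * ||x - center||^2 (hypothetical example)."""
    return lambda x: [xj - cj for xj, cj in zip(x, center)]


if __name__ == "__main__":
    solution = frank_wolfe_l1(quadratic_grad([0.6, -0.5, 0.0]), 3, 1.0, 2000)
    print(solution)
```

For this toy loss the constrained minimizer is the projection of the center onto the unit $l_1$ ball, so the iterates should approach it while never leaving the ball.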

algorithm, artificial intelligence, optimization problem, (17 more...)

arXiv.org Machine Learning

1510.02558

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Convex Risk Minimization and Conditional Probability Estimation

Telgarsky, Matus, Dudík, Miroslav, Schapire, Robert

arXiv.org Machine Learning, Jun-15-2015

This paper proves, in very general settings, that convex risk minimization is a procedure to select a unique conditional probability model determined by the classification problem. Unlike most previous work, we give results that are general enough to include cases in which no minimum exists, as occurs typically, for instance, with standard boosting algorithms. Concretely, we first show that any sequence of predictors minimizing convex risk over the source distribution will converge to this unique model when the class of predictors is linear (but potentially of infinite dimension). Secondly, we show the same result holds for empirical risk minimization whenever this class of predictors is finite dimensional, where the essential technical contribution is a norm-free generalization bound.

artificial intelligence, bayesian inference, ker, (17 more...)

arXiv.org Machine Learning

1506.04513

Country: North America > United States > Michigan (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.62)

Improving Performance in Neural Networks Using a Boosting Algorithm

Drucker, Harris, Schapire, Robert, Simard, Patrice

Neural Information Processing Systems, Dec-31-1993

A boosting algorithm converts a learning machine with error rate less than 50% to one with an arbitrarily low error rate. However, the algorithm discussed here depends on having a large supply of independent training samples. We show how to circumvent this problem and generate an ensemble of learning machines whose performance in optical character recognition problems is dramatically improved over that of a single network. We report the effect of boosting on four databases (all handwritten) consisting of 12,000 digits from segmented ZIP codes from the United States Postal Service (USPS) and the following from the National Institute of Standards and Technology (NIST): 220,000 digits, 45,000 upper case alphas, and 45,000 lower case alphas. We use two performance measures: the raw error rate (no rejects) and the reject rate required to achieve a 1% error rate on the patterns not rejected.
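The claim that an error rate below 50% can be driven arbitrarily low can be illustrated with a simplified calculation. The sketch below assumes three classifiers that err independently with the same rate and are combined by majority vote, then applies that combination recursively. This is an idealized assumption for illustration, not the training procedure used in the paper, and the function names are hypothetical.

```python
def majority_of_three_error(e):
    """Error of a majority vote over three classifiers that each err
    independently with probability e (an idealized assumption).

    P(at least two wrong) = 3*e^2*(1 - e) + e^3 = 3*e^2 - 2*e^3.
    """
    return 3 * e**2 - 2 * e**3


def levels_needed(e, target):
    """Count how many recursive majority-of-three levels are needed to
    push an initial error rate e (< 0.5) down to at most target (> 0)."""
    if not 0.0 <= e < 0.5:
        raise ValueError("initial error rate must be in [0, 0.5)")
    if target <= 0.0:
        raise ValueError("target must be positive")
    levels = 0
    while e > target:
        e = majority_of_three_error(e)
        levels += 1
    return levels


if __name__ == "__main__":
    print(majority_of_three_error(0.4))
    print(levels_needed(0.4, 0.01))
```

Under this assumption the combined error is strictly smaller than the individual error whenever the latter is strictly between 0 and 0.5, and an error of exactly 0.5 is a fixed point, which is why the 50% threshold matters.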

error rate, neural network, us government, (20 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.70)
Government > Post Office (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Estimating Average-Case Learning Curves Using Bayesian, Statistical Physics and VC Dimension Methods

Haussler, David, Kearns, Michael, Opper, Manfred, Schapire, Robert

Neural Information Processing Systems, Dec-31-1992

In this paper we investigate an average-case model of concept learning, and give results that place the popular statistical physics and VC dimension theories of learning curve behavior in a common framework.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Cruz County > Santa Cruz (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
