AITopics | Syed, Umar

Collaborating Authors

Syed, Umar

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Imitation Learning with a Value-Based Prior

Syed, Umar, Schapire, Robert E.

arXiv.org Artificial IntelligenceJun-20-2012

The goal of imitation learning is for an apprentice to learn how to behave in a stochastic environment by observing a mentor demonstrating the correct behavior. Accurate prior knowledge about the correct behavior can reduce the need for demonstrations from the mentor. We present a novel approach to encoding prior knowledge about the correct behavior, where we assume that this prior knowledge takes the form of a Markov Decision Process (MDP) that is used by the apprentice as a rough and imperfect model of the mentor's behavior. Specifically, taking a Bayesian approach, we treat the value of a policy in this modeling MDP as the log prior probability of the policy. In other words, we assume a priori that the mentor's behavior is likely to be a high value policy in the modeling MDP, though quite possibly different from the optimal policy. We describe an efficient algorithm that, given a modeling MDP and a set of demonstrations by a mentor, provably converges to a stationary point of the log posterior of the mentor's policy, where the posterior is computed with respect to the "value based" prior. We also present empirical evidence that this prior does in fact speed learning of the mentor's policy, and is an improvement in our experiments over similar previous methods.

artificial intelligence, bayesian inference, mentor, (18 more...)

arXiv.org Artificial Intelligence

1206.529

Country: Europe > Portugal (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

Graphical Models for Bandit Problems

Amin, Kareem, Kearns, Michael, Syed, Umar

arXiv.org Artificial IntelligenceFeb-14-2012

We introduce a rich class of graphical models for multi-armed bandit problems that permit both the state or context space and the action space to be very large, yet succinctly specify the payoffs for any context-action pair. Our main result is an algorithm for such models whose regret is bounded by the number of parameters and whose running time depends only on the treewidth of the graph substructure induced by the action space.

algorithm, artificial intelligence, big data, (17 more...)

arXiv.org Artificial Intelligence

1202.3782

Country: North America > United States (0.28)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Semi-Supervised Learning with Adversarially Missing Label Information

Syed, Umar, Taskar, Ben

Neural Information Processing SystemsDec-31-2010

We address the problem of semi-supervised learning in an adversarial setting. Instead of assuming that labels are missing at random, we analyze a less favorable scenario where the label information can be missing partially and arbitrarily, which is motivated by several practical examples. We present nearly matching upper and lower generalization bounds for learning in this setting under reasonable assumptions about available label information. Motivated by the analysis, we formulate a convex optimization problem for parameter estimation, derive an efficient algorithm, and analyze its convergence. We provide experimental results on several standard data sets showing the robustness of our algorithm to the pattern of missing label information, outperforming several strong baselines.

artificial intelligence, inductive learning, information, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

A Reduction from Apprenticeship Learning to Classification

Syed, Umar, Schapire, Robert E.

Neural Information Processing SystemsDec-31-2010

We provide new theoretical results for apprenticeship learning, a variant of reinforcement learning in which the true reward function is unknown, and the goal is to perform well relative to an observed expert. We study a common approach to learning from expert demonstrations: using a classification algorithm to learn to imitate the expert's behavior. Although this straightforward learning strategy is widely-used in practice, it has been subject to very little formal analysis. We prove that, if the learned classifier has error rate $\eps$, the difference between the value of the apprentice's policy and the expert's policy is $O(\sqrt{\eps})$. Further, we prove that this difference is only $O(\eps)$ when the expert's policy is close to optimal. This latter result has an important practical consequence: Not only does imitating a near-optimal expert result in a better policy, but far fewer demonstrations are required to successfully imitate such an expert. This suggests an opportunity for substantial savings whenever the expert is known to be good, but demonstrations are expensive or difficult to obtain.

apprentice, artificial intelligence, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Private and Third-Party Randomization in Risk-Sensitive Equilibrium Concepts

Brautbar, Mickey (University of Pennsylvania) | Kearns, Michael (University of Pennsylvania) | Syed, Umar (University of Pennsylvania)

AAAI ConferencesJul-15-2010

We consider risk-sensitive generalizations of Nash and correlated equilibria in noncooperative games. We prove that, except for a class of degenerate games, unless a two-player game has a pure Nash equilibrium, it does not have a risk-sensitive Nash equilibrium. We also show that every game has a risk-sensitive correlated equilibrium. The striking contrast between these existence results is due to the different sources of randomization in Nash (private randomization) and correlated equilibria (third-party randomization).

artificial intelligence, equilibrium, game theory, (18 more...)

AAAI Conferences

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.14)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence (1.00)

Add feedback

Adapting to the Shifting Intent of Search Queries

Syed, Umar, Slivkins, Aleksandrs, Mishra, Nina

Neural Information Processing SystemsDec-31-2009

Search engines today present results that are often oblivious to recent shifts in intent. For example, the meaning of the query independence day shifts in early July to a US holiday and to a movie around the time of the box office release. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc account for 1/2 the search queries. This paper shows that the signals a search engine receives can be used to both determine that a shift in intent happened, as well as find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases.

big data, classifier, information management, (20 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Genre: Research Report > Experimental Study (0.34)

Industry: Media (0.34)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.72)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Add feedback

A Game-Theoretic Approach to Apprenticeship Learning

Syed, Umar, Schapire, Robert E.

Neural Information Processing SystemsDec-31-2008

We study the problem of an apprentice learning to behave in an environment with an unknown reward function by observing the behavior of an expert. We follow on the work of Abbeel and Ng [1] who considered a framework in which the true reward function is assumed to be a linear combination of a set of known and observable features.We give a new algorithm that, like theirs, is guaranteed to learn a policy that is nearly as good as the expert's, given enough examples. However, unlike their algorithm, we show that ours may produce a policy that is substantially better than the expert's. Moreover, our algorithm is computationally faster, is easier toimplement, and can be applied even in the absence of an expert. The method is based on a game-theoretic view of the problem, which leads naturally to a direct application of the multiplicative-weights algorithm of Freund and Schapire [2] for playing repeated matrix games. In addition to our formal presentation and analysis of the new algorithm, we sketch how the method can be applied when the transition functionitself is unknown, and we provide an experimental demonstration of the algorithm on a toy video-game environment.

algorithm, artificial intelligence, computer game, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Industry: Leisure & Entertainment > Games > Computer Games (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.84)

Add feedback