AITopics

doi: 10.1613/jair.2655

10600

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(18 more...)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Journal of Artificial Intelligence ResearchMar-27-2009

Identification of Pleonastic It Using the Web

Li, Y., Musilek, P., Reformat, M., Wyard-Scott, L.

In a significant minority of cases, certain pronouns, especially the pronoun it, can be used without referring to any specific entity. This phenomenon of pleonastic pronoun usage poses serious problems for systems aiming at even a shallow understanding of natural language texts. In this paper, a novel approach is proposed to identify such uses of it: the extrapositional cases are identified using a series of queries against the web, and the cleft cases are identified using a simple set of syntactic rules. The system is evaluated with four sets of news articles containing 679 extrapositional cases as well as 78 cleft constructs. The identification results are comparable to those obtained by human efforts.

category, extraposition, query, (17 more...)

doi: 10.1613/jair.2622

10593

Country:

Europe > United Kingdom > England > Greater London > London (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
(14 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Banking & Finance (1.00)
Law (0.67)
Government > Regional Government (0.45)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(3 more...)

Journal of Artificial Intelligence ResearchMar-12-2009

Generic Preferences over Subsets of Structured Objects

Binshtok, M., Brafman, R. I., Domshlak, C., Shiomony, S. E.

Various tasks in decision making and decision support systems require selecting a preferred subset of a given set of items. Here we focus on problems where the individual items are described using a set of characterizing attributes, and a generic preference specification is required, that is, a specification that can work with an arbitrary set of items. For example, preferences over the content of an online newspaper should have this form: At each viewing, the newspaper contains a subset of the set of articles currently available. Our preference specification over this subset should be provided offline, but we should be able to use it to select a subset of any currently available set of articles, e.g., based on their tags. We present a general approach for lifting formalisms for specifying preferences over objects with multiple attributes into ones that specify preferences over subsets of such objects. We also show how we can compute an optimal subset given such a specification in a relatively efficient manner. We provide an empirical evaluation of the approach as well as some worst-case complexity results.

algorithm, constraint, subset, (15 more...)

doi: 10.1613/jair.2653

10588

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Israel (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(4 more...)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Media > News (0.74)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Zinkevich, Martin, Johanson, Michael, Bowling, Michael, Piccione, Carmelo

Regret Minimization in Games with Incomplete Information

Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for solving large games based on regret minimization. In particular, we introduce the notion of counterfactual regret, which exploits the degree of incomplete information in an extensive game. We show how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium. We demonstrate this technique in the domain of poker, showing we can solve abstractions of limit Texas Hold'em with as many as 10

abstraction, extensive game, information, (17 more...)

Country:

North America > United States > Texas (0.25)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Leisure & Entertainment > Games > Poker (0.90)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Games > Poker (0.70)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Learning with Tree-Averaged Densities and Distributions

Kirshner, Sergey

We utilize the ensemble of trees framework, a tractable mixture over superexponential number of tree-structured distributions [1], to develop a new model for multivariate density estimation. The model is based on a construction of treestructured copulas - multivariate distributions with uniform on [0, 1] marginals.

algorithm, copula, dependence, (15 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Bhatnagar, Shalabh, Ghavamzadeh, Mohammad, Lee, Mark, Sutton, Richard S.

Incremental Natural Actor-Critic Algorithms

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients, and they extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.

algorithm, approximation, gradient, (14 more...)

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Zinkevich, Martin, Johanson, Michael, Bowling, Michael, Piccione, Carmelo

Regret Minimization in Games with Incomplete Information

Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for solving large games based on regret minimization. In particular, we introduce the notion of counterfactual regret, which exploits the degree of incomplete information in an extensive game. We show how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium. We demonstrate this technique in the domain of poker, showing we can solve abstractions of limit Texas Hold'em with as many as 10

abstraction, extensive game, information, (17 more...)

Country:

North America > United States > Texas (0.25)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Leisure & Entertainment > Games > Poker (0.90)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Games > Poker (0.70)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Bhatnagar, Shalabh, Ghavamzadeh, Mohammad, Lee, Mark, Sutton, Richard S.

Incremental Natural Actor-Critic Algorithms

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients, and they extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.

algorithm, approximation, gradient, (14 more...)

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Bhatnagar, Shalabh, Ghavamzadeh, Mohammad, Lee, Mark, Sutton, Richard S.

Incremental Natural Actor-Critic Algorithms

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learningmethods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility withfunction approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variancein some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learningin the actor and by incorporating natural gradients, and they extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.

algorithm, approximation, gradient, (14 more...)

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Journal of Artificial Intelligence ResearchJul-1-2008

A General Theory of Additive State Space Abstractions

Yang, F., Culberson, J., Holte, R., Zahavi, U., Felner, A.

Informally, a set of abstractions of a state space S is additive if the distance between any two states in S is always greater than or equal to the sum of the corresponding distances in the abstract spaces. The first known additive abstractions, called disjoint pattern databases, were experimentally demonstrated to produce state of the art performance on certain state spaces. However, previous applications were restricted to state spaces with special properties, which precludes disjoint pattern databases from being defined for several commonly used testbeds, such as Rubik's Cube, TopSpin and the Pancake puzzle. In this paper we give a general definition of additive abstractions that can be applied to any state space and prove that heuristics based on additive abstractions are consistent as well as admissible. We use this new definition to create additive abstractions for these testbeds and show experimentally that well chosen additive abstractions can reduce search time substantially for the (18,4)-TopSpin puzzle and by three orders of magnitude over state of the art methods for the 17-Pancake puzzle. We also derive a way of testing if the heuristic value returned by additive abstractions is provably too low and show that the use of this test can reduce search time for the 15-puzzle and TopSpin by roughly a factor of two.

abstract space, abstraction, additive abstraction, (16 more...)

doi: 10.1613/jair.2486

10558