McAllester, David A.
PAC Generalization Bounds for Co-training
Dasgupta, Sanjoy, Littman, Michael L., McAllester, David A.
The rule-based bootstrapping introduced by Yarowsky, and its co-training variant by Blum and Mitchell, have met with considerable empirical success. Earlier work on the theory of co-training has been only loosely related to empirically useful co-training algorithms. Here we give a new PAC-style bound on generalization error which justifies both the use of confidences -- partial rules and partial labeling of the unlabeled data -- and the use of an agreement-based objective function as suggested by Collins and Singer. Our bounds apply to the multiclass case, i.e., where instances are to be assigned one of k labels for k ≥ 2.
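
To make the agreement-based objective mentioned above concrete, here is a minimal sketch: on unlabeled data, a pair of partial rules (one per view) is scored by how rarely they disagree when both commit to a label. The function and rule names below, and the convention that a partial rule returns None to abstain, are assumptions for illustration, not the paper's notation.

    # Hypothetical sketch of an agreement-based objective on unlabeled data:
    # a pair of partial rules is scored by how rarely they disagree when
    # both commit to a label. `rule1`, `rule2`, and the None-for-abstention
    # convention are illustrative assumptions.

    def disagreement_rate(rule1, rule2, unlabeled_pairs):
        """Fraction of examples where both rules label, but differently."""
        both = conflicts = 0
        for x1, x2 in unlabeled_pairs:  # the two views of each instance
            y1, y2 = rule1(x1), rule2(x2)
            if y1 is not None and y2 is not None:
                both += 1
                conflicts += (y1 != y2)
        return conflicts / both if both else 0.0

Minimizing such a disagreement rate over rule pairs, while keeping the rules from abstaining too often, is the flavor of objective the bound is meant to justify.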
Boosting with Multi-Way Branching in Decision Trees
Mansour, Yishay, McAllester, David A.
It is known that decision tree learning can be viewed as a form of boosting. However, existing boosting theorems for decision tree learning allow only binary-branching trees and the generalization to multi-branching trees is not immediate. Practical decision tree algorithms, such as CART and C4.5, implement a tradeoff between the number of branches and the improvement in tree quality as measured by an index function. Here we give a boosting justification for a particular quantitative tradeoff curve. Our main theorem states, in essence, that if we require an improvement proportional to the log of the number of branches then top-down greedy construction of decision trees remains an effective boosting algorithm.
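
The quantitative tradeoff in the main theorem can be illustrated with a short sketch: a k-way split is kept only when its index improvement grows with log k. The function names and the constant c below are assumptions, not the paper's notation.

    import math

    # Illustrative acceptance test for a multi-way split: require the index
    # gain (e.g., in Gini or entropy) to be at least proportional to the
    # log of the number of branches, the tradeoff the theorem justifies.
    # `c` is an assumed proportionality constant, not a value from the paper.

    def accept_split(index_before, index_after, num_branches, c=0.01):
        gain = index_before - index_after
        return gain >= c * math.log(num_branches)

A top-down greedy builder would apply such a test before replacing a leaf with a multi-way branch, so that wide splits must earn their extra branches.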
Policy Gradient Methods for Reinforcement Learning with Function Approximation
Sutton, Richard S., McAllester, David A., Singh, Satinder P., Mansour, Yishay
Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly represented by its own function approximator, independent of the value function, and is updated according to the gradient of expected reward with respect to the policy parameters. Williams's REINFORCE method and actor-critic methods are examples of this approach. Our main new result is to show that the gradient can be written in a form suitable for estimation from experience aided by an approximate action-value or advantage function. Using this result, we prove for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
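
As a concrete instance of the gradient-based updates the abstract describes, here is a minimal REINFORCE-style sketch with a linear softmax policy. It illustrates the approach rather than the paper's exact algorithm (which pairs the policy with a compatible value-function approximator), and all names below are assumptions.

    import numpy as np

    # Minimal REINFORCE-style sketch: a softmax policy over linear action
    # preferences, updated along grad log pi(a|s) weighted by the return.
    # One example of the policy-gradient approach, not the paper's exact
    # construction.

    def softmax_policy(theta, features):
        # features: (n_actions, d) matrix of state-action feature vectors
        prefs = features @ theta
        p = np.exp(prefs - prefs.max())
        return p / p.sum()

    def reinforce_update(theta, episode, alpha=0.01):
        # episode: list of (features, action, return_from_that_step)
        for features, a, G in episode:
            p = softmax_policy(theta, features)
            grad_log_pi = features[a] - p @ features  # softmax score function
            theta = theta + alpha * G * grad_log_pi
        return theta

The paper's main result shows that replacing the sampled return G with a suitably constrained learned approximation still yields an unbiased direction, which is what enables the convergence proof.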