 Decision Tree Learning


Tight performance bounds on greedy policies based on imperfect value functions

Classics

Reinforcement learning is an effective technique for learning action policies in discrete stochastic environments, but its efficiency can decay exponentially with the size of the state space. In many situations significant portions of a large state space may be irrelevant to a specific goal and can be aggregated into a few relevant states. The U Tree algorithm generates a tree-based state discretization that efficiently finds the relevant state chunks of large propositional domains. In this paper, we extend the U Tree algorithm to challenging domains with a continuous state space for which there is no initial discretization.


Human and Machine 'Quick Modeling'

Neural Information Processing Systems

We present here an interesting experiment in 'quick modeling' by humans, performed independently on small samples, in several languages and two continents, over the last three years. Comparisons to decision tree procedures and neural net processing are given. From these, we conjecture that human reasoning is better represented by the latter, but substantially different from both. Implications for the 'strong convergence hypothesis' between neural networks and machine learning are discussed, now expanded to include human reasoning comparisons.

1 INTRODUCTION

Until recently the fields of symbolic and connectionist learning evolved separately. Suddenly in the last two years a significant number of papers comparing the two methodologies have appeared. A beginning synthesis of these two fields was forged at the NIPS '90 Workshop #5 last year (Pratt and Norton, 1990), where one may find a good bibliography of the recent work of Atlas, Dietterich, Omohundro, Sanger, Shavlik, Tsoi, Utgoff and others. It was at that NIPS '90 Workshop that we learned of these studies, most of which concentrate on performance comparisons of decision tree algorithms (such as ID3, CART) and neural net algorithms (such as Perceptrons, Backpropagation). Independently three years ago we had looked at Quinlan's ID3 scheme (Quinlan, 1984) and intuitively and rather instantly not agreeing with the generalization he obtains by ID3 from a sample of 8 items generalized to 12 items, we subjected this example to a variety of human experiments. We report our findings, as compared to the performance of ID3 and also to various neural net computations.
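For readers unfamiliar with ID3, the sketch below illustrates the information-gain criterion that ID3 uses to choose which attribute to split on. It is a minimal illustration, not the authors' experiment: the four-row sample, the attribute names `size` and `color`, and the helper names are all hypothetical and are not Quinlan's 8-item sample.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the partitions produced by the split.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels):
    """Attribute with the highest information gain (the greedy ID3 choice)."""
    return max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy sample: the class depends only on "size".
rows = [
    {"size": "small", "color": "red"},
    {"size": "small", "color": "blue"},
    {"size": "large", "color": "red"},
    {"size": "large", "color": "blue"},
]
labels = ["yes", "yes", "no", "no"]

print(best_attribute(rows, labels))
```

On this toy sample, splitting on `size` yields pure partitions while splitting on `color` leaves the labels evenly mixed, so the greedy choice is `size`. A full ID3 implementation would apply the same choice recursively to each partition.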


Comparison of three classification techniques: CART, C4.5 and Multi-Layer Perceptrons

Neural Information Processing Systems

In this paper, after some introductory remarks into the classification problem as considered in various research communities, and some discussions concerning some of the reasons for ascertaining the performances of the three chosen algorithms, viz., CART (Classification and Regression Tree), C4.5 (one of the more recent versions of a popular induction tree technique known as ID3), and a multi-layer perceptron (MLP), it is proposed to compare the performances of these algorithms under two criteria: classification and generalisation. It is found that, in general, the MLP has better classification and generalisation accuracies compared with the other two algorithms.

1 Introduction

Classification of data into categories has been pursued by a number of research communities, viz., applied statistics, knowledge acquisition, neural networks. In applied statistics, there are a number of techniques, e.g., clustering algorithms (see e.g., Hartigan), CART (Classification and Regression Trees, see e.g., Breiman et al.). Clustering algorithms are used when the underlying data naturally fall into a number of groups, the distance among groups are measured by various metrics [Hartigan]. CART [Breiman et al.] has been very popular among applied statisticians. It assumes that the underlying data can be separated into categories, the decision boundaries can either be parallel to the axis or they can be a linear combination of these axes¹. Under certain assumptions on the input data and their associated

¹ In CART and C4.5, the axes are the same as the input features
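The two kinds of decision boundary mentioned in the excerpt can be sketched as two node tests. This is only an illustrative sketch: the function names, the example point, the weights and the thresholds are hypothetical and do not come from the paper or from any CART implementation.

```python
def axis_parallel_split(x, feature, threshold):
    """Axis-parallel test: compare one input feature against a threshold."""
    return x[feature] <= threshold


def linear_combination_split(x, weights, threshold):
    """Linear-combination test: compare a weighted sum of features against a threshold."""
    return sum(w * v for w, v in zip(weights, x)) <= threshold


# Hypothetical two-feature input.
point = (2.0, 5.0)

# Boundary x0 = 3.0, parallel to the x1 axis.
goes_left_axis = axis_parallel_split(point, feature=0, threshold=3.0)

# Boundary x0 + x1 = 6.0, a line that is not parallel to either axis.
goes_left_linear = linear_combination_split(point, weights=(1.0, 1.0), threshold=6.0)

print(goes_left_axis, goes_left_linear)
```

An axis-parallel test is the special case of the linear-combination test in which exactly one weight is non-zero, which is why the second form can express boundaries the first cannot.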


Basis-Function Trees as a Generalization of Local Variable Selection Methods for Function Approximation

Neural Information Processing Systems

Function approximation on high-dimensional spaces is often thwarted by a lack of sufficient data to adequately "fill" the space, or lack of sufficient computational resources. The technique of local variable selection provides a partial solution to these problems by attempting to approximate functions locally using fewer than the complete set of input dimensions.

