Department of Computer and Information Science, Linköping University, S-581 83 Linköping, Sweden

Abstract. We investigate the computational properties of the spatial algebra RCC-5, which is a restricted version of the RCC framework for spatial reasoning. The satisfiability problem for RCC-5 is known to be NP-complete, but not much is known about its approximately four billion subclasses. In the process, we identify all maximal tractable subalgebras, which are four in total. The main reason for this is, probably, that spatial reasoning has proved to be applicable to real-world problems in, for example, geographical database systems (Egenhofer, 1991; Grigni, Papadias, & Papadimitriou, 1995) and molecular biology (Cui, 1994). In both these applications, the size of the problem instances can be huge, so the complexity of problems and algorithms is a highly relevant area to study. However, questions of computational complexity have not received much attention in the literature; two notable exceptions are the results reported by Nebel (1995) and Renz and Nebel (1997). A well-known framework for qualitative spatial reasoning is the so-called RCC approach (Randell & Cohn, 1989; Randell, Cui, & Cohn, 1992). This approach is based on modelling qualitative spatial relations between regions using first-order logic. Of special interest, from a complexity-theoretic standpoint, are the two subclasses RCC-5 and RCC-8. It is well-known that both RCC-5 and RCC-8 have quite weak expressive power.
This paper presents new experimental evidence against the utility of Occam's razor. A systematic procedure is presented for post-processing decision trees produced by C4.5. This procedure was derived by rejecting Occam's razor and instead attending to the assumption that similar objects are likely to belong to the same class. It increases a decision tree's complexity without altering the performance of that tree on the training data from which it is inferred. The resulting more complex decision trees are demonstrated to have, on average, for a variety of common learning tasks, higher predictive accuracy than the less complex original decision trees. This result raises considerable doubt about the utility of Occam's razor as it is commonly applied in modern machine learning.
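The post-processing procedure itself is not spelled out in the abstract, but its guiding assumption — that similar objects are likely to share a class — can be illustrated with a heavily simplified sketch. Everything below (the data, the leaf region, the threshold) is invented for illustration and is not the paper's actual C4.5 post-processor: an empty sub-region of a decision-tree leaf is relabelled to agree with the nearest training examples, making the tree strictly more complex while leaving every training example's classification unchanged.

```python
# Toy illustration (an assumption, not the paper's actual procedure):
# a leaf of a decision tree covers 0 <= x <= 10 but contains only
# class-0 training examples clustered at small x, so the original
# leaf predicts class 0 everywhere.
leaf_train = [(0.5, 0), (1.0, 0), (2.5, 0)]
outside_train = [(11.0, 1), (12.0, 1)]  # class-1 evidence just beyond the leaf

def original_leaf(x):
    """Original leaf: the majority class of its training examples."""
    return 0

def grafted_leaf(x, threshold=6.5):
    """More complex replacement: the empty right half of the leaf is
    given the class of the *nearest* training examples (class 1),
    following the similarity assumption."""
    return 0 if x <= threshold else 1

# Training performance is unchanged: every training example in the
# leaf is still classified 0 ...
unchanged = all(grafted_leaf(x) == original_leaf(x) for x, _ in leaf_train)
# ... but an unseen point near the class-1 evidence now gets class 1,
# where the simpler original leaf would have said class 0.
new_prediction = grafted_leaf(9.0)
```

The sketch shows why the grafted tree is "more complex without altering training performance": the added split only changes predictions in a region containing no training data.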
Cue phrases may be used in a discourse sense to explicitly signal discourse structure, but also in a sentential sense to convey semantic rather than structural information. Correctly classifying cue phrases as discourse or sentential is critical in natural language processing systems that exploit discourse structure, e.g., for performing tasks such as anaphora resolution and plan recognition. This paper explores the use of machine learning for classifying cue phrases as discourse or sentential. Two machine learning programs (Cgrendel and C4.5) are used to induce classification models from sets of pre-classified cue phrases and their features in text and speech. Machine learning is shown to be an effective technique for not only automating the generation of classification models, but also for improving upon previous results. When compared to manually derived classification models already in the literature, the learned models often perform with higher accuracy and contain new linguistic insights into the data. In addition, the ability to automatically construct classification models makes it easier to comparatively analyze the utility of alternative feature representations of the data. Finally, the ease of retraining makes the learning approach more scalable and flexible than manual methods.
Stacked generalization is a general method of using a high-level model to combine lower-level models to achieve greater predictive accuracy. In this paper we address two crucial issues which have been considered to be a 'black art' in classification tasks ever since the introduction of stacked generalization in 1992 by Wolpert: the type of generalizer that is suitable to derive the higher-level model, and the kind of attributes that should be used as its input. We find that best results are obtained when the higher-level model combines the confidence (and not just the predictions) of the lower-level ones. We demonstrate the effectiveness of stacked generalization for combining three different types of learning algorithms for classification tasks. We also compare the performance of stacked generalization with majority vote and published results of arcing and bagging.
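The abstract's central finding — that the higher-level model should combine the base models' confidences rather than their bare predictions — can be sketched in a few lines. Everything below is an illustrative assumption: the two base models, the data, and the perceptron-style meta-learner are toys, and a faithful implementation would derive the meta-level training set via cross-validated base-model outputs, as Wolpert's scheme prescribes.

```python
# Minimal sketch of stacked generalization where the level-1 model's
# inputs are the level-0 models' *confidences*, not hard predictions.
# Base models, data, and the linear meta-learner are toy assumptions.

def base_a(x):
    """Toy level-0 model: confidence (in [0, 1]) that x is class 1,
    read off the first feature."""
    return min(1.0, max(0.0, x[0]))

def base_b(x):
    """Second toy level-0 model, using the second feature."""
    return min(1.0, max(0.0, x[1]))

def meta_features(x):
    """Level-1 input vector: the confidences of the level-0 models."""
    return [base_a(x), base_b(x)]

def train_meta(data, epochs=200, lr=0.1):
    """Learn a simple linear combiner (perceptron-style) over the
    confidence features -- the 'higher-level model' of stacking."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = meta_features(x)
            pred = 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0
            err = y - pred  # -1, 0, or +1
            w[0] += lr * err * f[0]
            w[1] += lr * err * f[1]
            b += lr * err
    return w, b

def stacked_predict(x, w, b):
    f = meta_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# Toy training set: class 1 only when both base models are confident.
data = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.1, 0.9], 0),
        ([0.9, 0.1], 0), ([0.2, 0.1], 0), ([0.85, 0.75], 1)]
w, b = train_meta(data)
```

Note that a majority vote over hard predictions cannot express "class 1 only when both models are *strongly* confident", whereas the meta-learner over confidences can — which is one intuition behind the paper's finding.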
Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search strategy should consistently construct clusterings of high quality, but be computationally inexpensive as well. In general, we cannot have it both ways, but we can partition the search so that a system inexpensively constructs a 'tentative' clustering for initial examination, followed by iterative optimization, which continues to search in the background for improved clusterings. Given this motivation, we evaluate an inexpensive strategy for creating initial clusterings, coupled with several control strategies for iterative optimization, each of which repeatedly modifies an initial clustering in search of a better one. One of these methods appears novel as an iterative optimization strategy in clustering contexts. Once a clustering has been constructed it is judged by analysts -- often according to task-specific criteria. Several authors have abstracted these criteria and posited a generic performance task akin to pattern completion, where the error rate over completed patterns is used to 'externally' judge clustering utility. Given this performance task, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus promising to ease post-clustering analysis. Finally, we propose a number of objective functions, based on attribute-selection measures for decision-tree induction, that might perform well on the error rate and simplicity dimensions.
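The two-phase idea in this abstract — build a cheap 'tentative' clustering, then iteratively optimize it — can be illustrated generically. The concrete choices below are assumptions made for illustration: a contiguous-block initial partition and a k-means-style reassignment loop stand in for the paper's strategies, which actually operate on hierarchical clusterings.

```python
# Generic two-phase sketch: inexpensive initial clustering, followed
# by iterative optimization that repeatedly modifies it in search of
# a better one. The specific algorithms here are illustrative stand-ins.

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def initial_clustering(points, k):
    """Phase 1 -- inexpensive 'tentative' clustering: split the data
    into k contiguous blocks, with no regard for quality."""
    size = (len(points) + k - 1) // k
    return [points[i:i + size] for i in range(0, len(points), size)]

def optimize(clusters, iters=10):
    """Phase 2 -- iterative optimization: repeatedly reassign each
    point to the cluster with the nearest centroid."""
    points = [p for c in clusters for p in c]
    for _ in range(iters):
        cents = [centroid(c) for c in clusters if c]
        new = [[] for _ in cents]
        for p in points:
            j = min(range(len(cents)), key=lambda j: dist2(p, cents[j]))
            new[j].append(p)
        clusters = [c for c in new if c]
    return clusters

# Two obvious groups: near the origin and near (1, 1).
points = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
clusters = optimize(initial_clustering(points, 2))
```

In the paper's setting the optimization phase runs in the background, so the analyst can inspect the tentative clustering immediately while better ones are still being sought.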