Rule Mining and Missing-Value Prediction in the Presence of Data Ambiguities
Wickramaratna, Kasun (University of Miami) | Kubat, Miroslav (University of Miami) | Premaratne, Kamal (University of Miami) | Wickramarathne, Thanuka (University of Miami)
The success of knowledge discovery in real-world domains often depends on our ability to handle data imperfections. Here we study this problem in the framework of association mining, seeking to identify frequent itemsets in transactional databases where the presence of some items in a given transaction is unknown. We want to use the frequent itemsets to predict "missing items": based on the partial contents of a shopping cart, predict what else will be added. We describe a technique that addresses this task, and report experiments illustrating its behavior.
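The abstract does not spell out the authors' technique, so the following is only a minimal sketch of the general idea of predicting missing items from frequent itemsets. It assumes fully known transactions (it does not model the data ambiguities the paper studies) and uses brute-force enumeration; the function names `frequent_itemsets` and `predict_missing`, the scoring rule, and the sample data are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every itemset of up to max_size items and keep those that
    occur in at least min_support transactions (brute force, for clarity)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def predict_missing(cart, itemsets):
    """Rank items absent from the cart. An item scores the summed support of
    every frequent itemset that overlaps the cart and adds exactly that item.
    This scoring rule is an assumption, not the paper's method."""
    cart = set(cart)
    scores = Counter()
    for itemset, support in itemsets.items():
        extra = itemset - cart
        if len(extra) == 1 and itemset & cart:
            scores[next(iter(extra))] += support
    return [item for item, _ in scores.most_common()]


# Hypothetical shopping-cart data.
transactions = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "jam"],
    ["bread", "butter", "jam"],
]
itemsets = frequent_itemsets(transactions, min_support=2)
ranking = predict_missing(["bread"], itemsets)
```

Here `ranking` lists the candidate items for a cart that so far holds only bread, best candidate first.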
VipBoost: A More Accurate Boosting Algorithm
Su, Xiaoyuan (Florida Atlantic University) | Khoshgoftaar, Taghi M | Greiner, Russell
Boosting is a well-known method for improving the accuracy of many learning algorithms. In this paper, we propose a novel boosting algorithm, VipBoost (voting on boosting classifications from imputed learning sets), which first generates multiple incomplete datasets from the original dataset by randomly removing a small percentage of observed attribute values, then uses an imputer to fill in the missing values. It then applies AdaBoost (using some base learner) to each of the imputed learning sets, producing multiple classifiers. The prediction for a new test case is the most frequent classification among these classifiers. Our empirical results show that VipBoost produces very effective classifiers that significantly improve accuracy for unstable base learners and some stable learners, especially when the initial dataset is incomplete.
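The remove, impute, train, and vote steps described above can be sketched roughly as follows. This is not the authors' implementation: the AdaBoost step is replaced by a caller-supplied `train` function (a nearest-centroid stand-in is used here so the example needs no external library), the imputer is a simple column mean, and all names and parameter values are assumptions.

```python
import random
from collections import Counter


def mask_values(rows, fraction, rng):
    """Copy rows, replacing roughly `fraction` of the values with None."""
    return [[None if rng.random() < fraction else v for v in row] for row in rows]


def impute_mean(rows):
    """Fill None entries with the mean of the observed values in that column
    (an assumed imputer; the abstract does not name one)."""
    means = []
    for column in zip(*rows):
        observed = [v for v in column if v is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def train_vote_ensemble(rows, labels, train, n_sets=5, fraction=0.05, seed=0):
    """Train one model per randomly masked, then imputed, copy of the data."""
    rng = random.Random(seed)
    return [train(impute_mean(mask_values(rows, fraction, rng)), labels)
            for _ in range(n_sets)]


def predict(models, x):
    """Return the most frequent classification among the models."""
    return Counter(model(x) for model in models).most_common(1)[0][0]


def train_nearest_centroid(rows, labels):
    """Stand-in for the AdaBoost step: classify by the closest class centroid."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    centroids = {y: [sum(col) / len(col) for col in zip(*rs)]
                 for y, rs in groups.items()}

    def model(x):
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))
    return model


# Hypothetical two-class toy data.
rows = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
labels = ["a", "a", "a", "b", "b", "b"]
models = train_vote_ensemble(rows, labels, train_nearest_centroid,
                             n_sets=5, fraction=0.1, seed=1)
```

Passing a real boosting routine as `train` would bring the sketch closer to the procedure in the abstract; the voting logic in `predict` stays the same.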
Multivariate Time Series Classification with Temporal Abstractions
Batal, Iyad (University of Pittsburgh) | Sacchi, Lucia (University of Pavia) | Bellazzi, Riccardo (University of Pavia) | Hauskrecht, Milos (University of Pittsburgh)
The increase in the number of complex temporal datasets collected today has prompted the development of methods that extend classical machine learning and data mining methods to time-series data. This work focuses on methods for multivariate time-series classification. Time-series classification is a challenging problem mostly because the number of temporal features that describe the data and are potentially useful for classification is enormous. We study and develop a temporal abstraction framework for generating multivariate time-series features suitable for classification tasks. We propose the STF-Mine algorithm that automatically mines discriminative temporal abstraction patterns from the time-series data and uses them to learn a classification model. Our experimental evaluations, carried out on both synthetic and real-world medical data, demonstrate the benefit of our approach in learning accurate classifiers for time-series datasets.
Beating the Defense: Using Plan Recognition to Inform Learning Agents
Molineaux, Matthew (Knexus Research Corporation) | Aha, David W. (Naval Research Laboratory) | Sukthankar, Gita (University of Central Florida)
In this paper, we investigate the hypothesis that plan recognition can significantly improve the performance of a case-based reinforcement learner in an adversarial action selection task. Our environment is a simplification of an American football game. The performance task is to control the behavior of a quarterback in a pass play, where the goal is to maximize yardage gained. Plan recognition focuses on predicting the play of the defensive team. We modeled plan recognition as an unsupervised learning task, and conducted a lesion study. We found that plan recognition was accurate, and that it significantly improved performance. More generally, our studies show that plan recognition reduced the dimensionality of the state space, which allowed learning to be conducted more effectively. We describe the algorithms, explain the reasons for performance improvement, and also describe a further empirical comparison that highlights the utility of plan recognition for this task.
Discovering Patterns of Collaboration for Recommendation
Gunawardena, Sidath (Drexel University) | Weber, Rosina (Drexel University)
Collaboration between research scientists, particularly those with diverse backgrounds, is a driver of scientific innovation. However, finding the right collaborator is often an unscientific process that is subject to chance. This paper explores recommending collaborators based on repeating patterns of previous successful collaboration experiences, what we term prototypical collaborations. We investigate a method for discovering such prototypes to use them as a basis to guide the recommendation of new collaborations. To this end, we also examine two methods for matching collaboration seekers to these prototypical collaborations. Our initial studies reveal that though promising, improving collaborations through recommendation is a complex goal.
Methodology for Classifying and Indexing Case-Based Reasoning Systems in the Health Sciences
Bichindaritz, Isabelle (University of Washington Tacoma) | Reed, John C., Jr. (University of Washington Tacoma)
As the amount of information available to researchers grows at an increasing rate, it becomes much more difficult to find relevant resources. An approach taken by several authoritative bodies, such as the Association for Computing Machinery and the U.S. National Library of Medicine, is the introduction of a classification scheme. However, even the most modern schemes are not capable of adequately distinguishing one research paper from another, due mainly to their broad generality. This paper describes a methodology for building a much narrower, specialized classification scheme focused on the area of Case-Based Reasoning in the Health Sciences. It is derived from thorough analysis of the field, but with a framework that can be adapted to other areas. Using a tiered approach to further subdivide systems into more specific classes according to criteria specific to this particular field, this classification scheme affords interdisciplinary search, which is generally left out of generic indexing systems. This paper presents the resulting classification scheme and showcases its usefulness for classifying and tracking the evolution of research.
Improving KD-Tree Based Retrieval for Attribute Dependent Generalized Cases
Bergmann, Ralph (University of Trier) | Tartakovski, Alexander (Piterion GmbH)
Generalized cases are cases that cover a subspace rather than a point in the problem-solution space. Attribute dependent generalized cases are a subclass of generalized cases, which cause a high computational complexity during similarity assessment. We present a new approach for an efficient index-based retrieval of such generalized cases by an improved kd-tree approach. The experimental evaluation demonstrates a significant improvement in retrieval efficiency compared to previous methods.
What a Legal CBR Ontology Should Provide
Ashley, Kevin D. (University of Pittsburgh)
This paper discusses the state of the art in CBR ontologies from the perspective of one developing an improved system for case-based legal reasoning. The paper proposes three specific roles for a CBR ontology and illustrates them in the context of the intended output of the new system: a legal classroom discussion of how to decide a case featuring hypothetical reasoning and abstract analogies. The paper distills the ontological requirements for modeling the example’s case-based arguments and assesses whether current research can meet those requirements. The concrete example helps to focus on and define goals for improving CBR ontologies.
Special Track on Case-Based Reasoning
Watson, Ian (University of Auckland) | Ontanon, Santiago (Georgia Institute of Technology)
Following successful special tracks on case-based reasoning at FLAIRS over the past seven years, we invited papers for the Eighth Special Track on CBR at the 22nd International FLAIRS Conference. Case-based reasoning is an AI problem solving and analysis methodology that retrieves and adapts previous experiences to fit new contexts. This forum is intended to gather AI researchers and practitioners with an interest in CBR to present and discuss developments in CBR theory and application. Submission topics included foundations of CBR; methods for CBR (such as representation, indexing, retrieval, adaptation); evaluation methods for CBR systems and integrations; practical applications of CBR; textual CBR; CBR and creativity; CBR and design; distributed CBR; case based maintenance; spatiotemporal CBR; CBR in the health sciences; CBR integrations; case based planning; and CBR and games. The invited speaker for the special track for 2009 is Ashok Goel from the Georgia Institute of Technology, USA.
Extracting Meaning from Cell Phone Improvement Ideas
Turner, Jenine (Athenahealth) | Lencevicius, Raimondas (Qwobl) | Adler, Mark (Nokia Research Center)
Numerous companies nowadays gather product improvement ideas. Reviewing all of the resulting thousands of ideas without tools would require a great deal of time and resources. Automatic tools can help these reviewers in a number of ways. The questions we address here are categorization, finding common ideas, and finding idea trends over time. We explore techniques to answer these questions using ...
There are two additional modifications we use to adjust our feature set that provide improvements over the original feature counts. The first is based upon our assumption that words in the title are more important than words in the other text fields. We simply weight unigrams and bigrams that appear in the title ten times as heavily as those that appear in the rest of the text.
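The title weighting described above might look roughly like this. It is a minimal sketch, assuming lowercase whitespace tokenization and raw counts as the base features; the function names and the sample idea text are hypothetical, and only the factor of ten comes from the text.

```python
from collections import Counter

TITLE_WEIGHT = 10  # from the text: title n-grams count ten times as heavily


def ngrams(text):
    """Return the unigrams and bigrams of a text as tuples of tokens.
    Lowercasing and whitespace splitting are assumptions of this sketch."""
    tokens = text.lower().split()
    unigrams = [(token,) for token in tokens]
    bigrams = list(zip(tokens, tokens[1:]))
    return unigrams + bigrams


def weighted_features(title, body, title_weight=TITLE_WEIGHT):
    """Count n-grams, adding title_weight per title occurrence and 1 per
    occurrence in the remaining text."""
    features = Counter()
    for gram in ngrams(title):
        features[gram] += title_weight
    for gram in ngrams(body):
        features[gram] += 1
    return features


# Hypothetical improvement idea.
features = weighted_features("longer battery", "battery drains fast")
```

In this example the unigram `("battery",)` appears once in the title and once in the body, so its weight combines both contributions.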