AITopics | Performance Analysis

Performance Analysis

Multiple Hypothesis Testing in Pattern Discovery

Hanhijärvi, Sami, Puolamäki, Kai, Garriga, Gemma C.

arXiv.org Machine Learning, Jun-29-2009

The problem of multiple hypothesis testing arises when more than one hypothesis must be tested simultaneously for statistical significance. This is a very common situation in many data mining applications. For instance, simultaneously assessing the significance of all frequent itemsets of a single dataset entails a host of hypotheses, one for each itemset. A multiple hypothesis testing method is needed to control the number of false positives (Type I errors). Our contribution in this paper is to extend the multiple hypothesis framework to be used with a generic data mining algorithm. We provide a method that provably controls the family-wise error rate (FWER, the probability of at least one false positive) in the strong sense. We evaluate the performance of our solution on both real and generated data. The results show that our method controls the FWER while maintaining the power of the test.
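
As background for the FWER notion used in this abstract, the sketch below shows the classical Holm-Bonferroni step-down procedure, a standard textbook method that controls the FWER in the strong sense. It is not the method proposed in the paper; the function name `holm_bonferroni` and the example p-values are hypothetical and chosen only for illustration.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Textbook Holm-Bonferroni step-down procedure (illustrative only,
    not the paper's method).
    """
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The threshold loosens as hypotheses are rejected:
        # alpha/m, alpha/(m-1), ..., alpha/1.
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            # Stop at the first failure; all larger p-values are retained.
            break
    return rejected


# Hypothetical p-values, e.g. one per candidate itemset.
decisions = holm_bonferroni([0.01, 0.04, 0.03, 0.005], alpha=0.05)
print(decisions)  # [True, False, False, True]
```

With four hypotheses, the two smallest p-values pass their thresholds (0.0125 and about 0.0167), while 0.03 fails against 0.025, so the procedure stops there.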

data mining, machine learning, pattern recognition, (17 more...)

arXiv.org Machine Learning

0906.5263

Genre: Research Report > Experimental Study (0.56)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
(2 more...)

Self-Supervised Aerial Images Analysis for Extracting Parking lot Structure

Seo, Young-Woo (Robotics Institute, Carnegie Mellon University) | Ratliff, Nathan (Robotics Institute, Carnegie Mellon University) | Urmson, Chris (Robotics Institute, Carnegie Mellon University)

AAAI Conferences, Jun-23-2009

Road network information simplifies autonomous driving by providing strong priors about environments. It tells a robotic vehicle where it can drive, provides models of what can be expected, and supplies contextual cues that influence driving behaviors. Currently, however, road network information is manually generated using a combination of GPS survey and aerial imagery. These manual techniques are labor intensive and error prone. To fully exploit the benefits of digital imagery, these processes should be automated. As a step toward this goal, we present an algorithm that extracts the structure of a parking lot visible in a given aerial image. To minimize human intervention in the use of aerial imagery, we devise a self-supervised learning algorithm that automatically generates a set of parking spot templates to learn the appearance of a parking lot and estimates the structure of the parking lot from the learned model. The data set extracted from a single image alone is too small to learn an accurate parking spot model. However, strong priors trained using large data sets collected across multiple images dramatically improve performance. Our self-supervised approach outperforms the prior alone by adapting the distribution of examples toward that found in the current image. A thorough empirical analysis compares leading state-of-the-art learning techniques on this problem.

hypothesis, parking lot, parking spot, (16 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.73)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.69)
(2 more...)

Abnormal Activity Recognition based on HDP-HMM Models

Hu, Derek Hao (Hong Kong University of Science and Technology) | Zhang, Xian-Xing (Nanjing University) | Yin, Jie (CSIRO ICT Centre) | Zheng, Vincent Wenchen (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)

AAAI Conferences, Jun-23-2009

Detecting abnormal activities from sensor readings is an important research problem in activity recognition. A number of different algorithms have been proposed in the past to tackle this problem. Many of the previous state-based approaches fail to determine the appropriate number of states, which is difficult to find through a trial-and-error approach in real-world applications. In this paper, we propose an accurate and flexible framework for abnormal activity recognition from sensor readings that involves less human tuning of model parameters. Our approach first applies a Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM), which supports an infinite number of states, to automatically find an appropriate number of states. We incorporate a Fisher Kernel into the One-Class Support Vector Machine (OCSVM) model to filter out the activities that are likely to be normal. Finally, we derive an abnormal activity model from the normal activity models to reduce the false positive rate in an unsupervised manner. Our main contribution is that our proposed HDP-HMM models can decide the appropriate number of states automatically, and that by incorporating a Fisher Kernel into the OCSVM model, we can combine the advantages of generative and discriminative models. We demonstrate the effectiveness of our approach by using several real-world datasets to test our algorithm's performance.

educe, execution, state-context pair, (14 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country: Asia > China > Guangdong Province > Guangzhou (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.53)

Probabilistic Counting with Randomized Storage

Durme, Benjamin Van (University of Rochester) | Lall, Ashwin (Georgia Institute of Technology)

AAAI Conferences, Jun-23-2009

Previous work by Talbot and Osborne [2007] explored the use of randomized storage mechanisms in language modeling. These structures trade a small amount of error for significant space savings, enabling the use of larger language models on relatively modest hardware. Going beyond space-efficient count storage, here we present the Talbot Osborne Morris Bloom (TOMB) Counter, an extended model for performing space-efficient counting over streams of finite length. Theoretical and experimental results are given, showing the promise of approximate counting over large vocabularies in the context of limited space.

bloom counter, bloom filter, frequency, (12 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > New York > Monroe County > Rochester (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > California (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.31)

Introspection and Adaptable Model Integration for Dialogue-based Question Answering

Sonntag, Daniel (German Research Center for AI (DFKI))

AAAI Conferences, Jun-23-2009

Dialogue-based Question Answering (QA) is a highly complex task that brings together a QA system including various natural language processing components (i.e., components for question classification, information extraction, and retrieval) with dialogue systems for effective and natural communication. Dialogue-based access is difficult to establish when the QA system in use is complex and combines many different answer services with different quality and access characteristics. For example, some questions are processed by open-domain QA services with broad coverage. Others should be processed by using a domain-specific instance ontology for more reliable answers. Different answer services may change their characteristics over time, and the dialogue reaction models have to be updated accordingly. To solve this problem, we developed introspective methods to integrate adaptable models of the answer services. We evaluated the impact of the learned models on the dialogue performance, i.e., whether the adaptable models can be used for a more convenient dialogue formulation process. We show significant effectiveness improvements in the resulting dialogues when using the machine learning (ML) models. Examples are provided in the context of the generation of system-initiative feedback to user questions and answers, as provided by heterogeneous information services.

dialogue, dialogue system, information, (14 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country:

South America > Chile (0.04)
South America > Brazil (0.04)
North America > United States > New York (0.04)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.96)

Detection of Imperative and Declarative Question-Answer Pairs in Email Conversations

Kwong, Helen (Stanford University) | Yorke-Smith, Neil (SRI International)

AAAI Conferences, Jun-23-2009

Question-answer pairs extracted from email threads can help construct summaries of the thread, as well as inform semantic-based assistance with email. Previous work dedicated to email threads extracts only questions in interrogative form. We extend the scope of question and answer detection and pairing to encompass also questions in imperative and declarative forms, and to operate at sentence-level fidelity. Building on prior work, our methods are based on learned models over a set of features that include the content, context, and structure of email threads. For two large email corpora, we show that our methods balance precision and recall in extracting question-answer pairs, while maintaining a modest computation time.

algorithm, detection, email thread, (16 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

Combining Speech and Sketch to Interpret Unconstrained Descriptions of Mechanical Devices

Bischel, David Tyler (University of California, Riverside) | Stahovich, Thomas F. (University of California, Riverside) | Davis, Randall (Massachusetts Institute of Technology) | Adler, Aaron (Massachusetts Institute of Technology) | Peterson, Eric J. (University of California, Riverside)

AAAI Conferences, Jun-23-2009

Mechanical design tools would be considerably more useful if we could interact with them in the way that human designers communicate design ideas to one another, i.e., using crude sketches and informal speech. Those crude sketches frequently contain pen strokes of two different sorts, one type portraying device structure, the other denoting gestures, such as arrows used to indicate motion. We report here on techniques we developed that use information from both sketch and speech to distinguish gesture strokes from non-gestures -- a critical first step in understanding a sketch of a device. We collected and analyzed unconstrained device descriptions, which revealed six common types of gestures. Guided by this knowledge, we developed a classifier that uses both sketch and speech features to distinguish gesture strokes from non-gestures. Experiments with our techniques indicate that the sketch and speech modalities alone produce equivalent classification accuracy, but combining them produces higher accuracy.

classifier, probability, sketch, (15 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Riverside County > Riverside (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Questionnaire & Opinion Survey (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Human Computer Interaction (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.35)

Succinct Approximate Counting of Skewed Data

Talbot, David (Google Inc.)

AAAI Conferences, Jun-23-2009

Practical data analysis relies on the ability to count observations of objects succinctly and efficiently. Unfortunately, the space usage of an exact estimator grows with the size of the a priori set from which objects are drawn, while the time required to maintain such an estimator grows with the size of the data set. We present static and on-line approximation schemes that avoid these limitations when approximate frequency estimates are acceptable. Our Log-Frequency Sketch extends the approximate counting algorithm of Morris [Morris1978] to estimate frequencies with bounded relative error via a single pass over a data set. It uses constant space per object when the frequencies follow a power law and can be maintained in constant time per observation. We give an (epsilon, delta)-approximation scheme which we verify empirically on a large natural language data set where, for instance, 95 percent of frequencies are estimated with relative error less than 0.25 using fewer than 11 bits per object in the static case and 15 bits per object on-line.
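
The Morris approximate counting algorithm that this abstract builds on can be sketched roughly as follows. This is the textbook base-2 variant for a single counter, not the Log-Frequency Sketch itself; the class name `MorrisCounter` and its methods are hypothetical names used only for illustration.

```python
import random


class MorrisCounter:
    """Textbook base-2 Morris counter (illustrative sketch only).

    Stores only an exponent, so counting up to n needs roughly
    log2(log2(n)) bits instead of log2(n).
    """

    def __init__(self, rng=None):
        self.exponent = 0
        # An injectable random source makes runs reproducible.
        self.rng = rng if rng is not None else random.Random()

    def increment(self):
        # Bump the exponent with probability 2^(-exponent).
        if self.rng.random() < 2.0 ** -self.exponent:
            self.exponent += 1

    def estimate(self):
        # 2^exponent - 1 is an unbiased estimate of the true count.
        return 2 ** self.exponent - 1


counter = MorrisCounter(random.Random(0))
for _ in range(1000):
    counter.increment()
print(counter.exponent, counter.estimate())
```

A single counter of this kind has high variance; the relative-error guarantees quoted in the abstract come from the paper's own scheme, which this sketch does not reproduce.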

frequency, hash function, relative error, (16 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country:

North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Czechia > Prague (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Bootstrap Voting Experts

Hewlett, Daniel (University of Arizona) | Cohen, Paul (University of Arizona)

AAAI Conferences, Jun-23-2009

Bootstrap Voting Experts (BVE) is an extension to the Voting Experts algorithm for unsupervised chunking of sequences. BVE generates a series of segmentations, each of which incorporates knowledge gained from the previous segmentation. We show that this method of bootstrapping improves the performance of Voting Experts in a variety of unsupervised word segmentation scenarios, and generally improves both precision and recall of the algorithm. We also show that Minimum Description Length (MDL) can be used to choose nearly optimal parameters for Voting Experts in an unsupervised manner.

algorithm, segmentation, xpert, (14 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country: North America > United States > Arizona > Pima County > Tucson (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Acquiring Agent-Based Models of Conflict from Event Data

Taylor, Glenn (Soar Technology, Inc.) | Quist, Michael (Soar Technology, Inc.) | Hicken, Allen (University of Michigan)

AAAI Conferences, Jun-23-2009

Building and using agent-based models is often impractical, in part due to the cost of including expensive subject matter experts (SMEs) in the development process. In this paper, we describe a method for "bootstrapping" model building to lower the cost of overall model development. The models we are interested in here capture dynamic phenomena related to international and subnational conflict. The method of acquiring these models begins with event data drawn from news reports about a conflict region, and infers model characteristics particular to a conflict modeling framework called the Power Structure Toolkit (PSTK). We describe the toolkit and how it has been used prior to this work. We then describe the current problem of modeling conflict and the empirical data available to learn models, and extensions to the PSTK for model generation from this data. We also describe a formative evaluation of the system that compares the performance and costs of models built entirely by an SME against models built with an SME aided by the automated model generation process. Early results indicate at least equivalent prediction rates with significant savings in model generation costs.

actor, conflict, pstk, (11 more...)

AAAI Conferences

Twenty-First International Joint Conference on Artificial Intelligence

Country:

Asia > Philippines (0.05)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > New York (0.04)
(6 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)
