AITopics

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia (0.04)
Africa (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.51)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.49)

AAAI Conferences, Jul-19-2011

Distance Metric Learning under Covariate Shift

Cao, Bin (Hong Kong University of Science and Technology) | Ni, Xiaochuan (Microsoft Research Asia) | Sun, Jian-Tao (Microsoft Research Asia) | Wang, Gang (Microsoft) | Yang, Qiang (Hong Kong University of Science and Technology)

Learning distance metrics is a fundamental problem in machine learning. Previous distance-metric learning research assumes that the training and test data are drawn from the same distribution, which may be violated in practical applications. When the distributions differ, a situation referred to as covariate shift, the metric learned from training data may not work well on the test data. In this case the metric is said to be inconsistent. In this paper, we address this problem by proposing a novel metric learning framework known as consistent distance metric learning (CDML), which solves the problem under covariate shift situations. We theoretically analyze the conditions when the metrics learned under covariate shift are consistent. Based on the analysis, a convex optimization problem is proposed to deal with the CDML problem. An importance sampling method is proposed for metric learning and two importance weighting strategies are proposed and compared in this work. Experiments are carried out on synthetic and real world datasets to show the effectiveness of the proposed method.
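The abstract does not spell out the two importance weighting strategies, so the following is only a generic sketch of importance weighting under covariate shift, not the CDML method itself. It assumes the training and test inputs follow known one-dimensional Gaussian densities (an assumption made purely for illustration) and reweights per-example training losses by the density ratio p_test(x) / p_train(x). The function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weight(x, train_mean, train_std, test_mean, test_std):
    """Density ratio p_test(x) / p_train(x).

    Assumes both densities are known Gaussians (illustrative only);
    in practice the ratio has to be estimated from data.
    """
    return gaussian_pdf(x, test_mean, test_std) / gaussian_pdf(x, train_mean, train_std)


def weighted_average_loss(losses, weights):
    """Self-normalized importance-weighted average of per-example losses."""
    total = sum(weights)
    return sum(loss * w for loss, w in zip(losses, weights)) / total


# Training inputs drawn around 0, test inputs around 1 (hypothetical values).
train_inputs = [-1.0, 0.0, 0.5, 2.0]
weights = [importance_weight(x, 0.0, 1.0, 1.0, 1.0) for x in train_inputs]
```

Training points that are more likely under the test distribution (here, those near 1) receive larger weights, so a loss averaged with these weights approximates the loss the model would incur on test data.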

covariate shift, learning, metric learning, (14 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Oregon (0.04)
Asia > China > Hong Kong (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

AAAI Conferences, Jul-19-2011

Semi-Supervised Learning from a Translation Model Between Data Distributions

Anaya-Sánchez, Henry (Universitat Jaume I) | Martínez-Sotoca, José (Universitat Jaume I) | Martínez-Usó, Adolfo (Universitat Jaume I)

In this paper, we introduce a probabilistic classification model to address the task of semi-supervised learning. The major novelty of our proposal stems from measuring distributional relationships between the labeled and unlabeled data. This is achieved from a stochastic translation model between data distributions that is estimated from a mixture model. The proposed classifier is defined from the combination of both the translation model and a kernel logistic regression on labeled data. Experimental results obtained over synthetic and real-world data sets validate the usefulness of our proposal.

classifier, probability, translation model, (15 more...)

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > Spain (0.14)
North America > United States > Wisconsin (0.04)
North America > Canada > Newfoundland and Labrador > Labrador (0.04)

Genre:

Research Report > New Finding (0.35)
Research Report > Experimental Study (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

AAAI Conferences, Jul-12-2011

Classifying the Political Leaning of News Articles and Users from User Votes

Zhou, Daniel Xiaodan (University of Michigan) | Resnick, Paul (University of Michigan) | Mei, Qiaozhu (University of Michigan)

Social news aggregator services generate readers’ subjective reactions to news opinion articles. Can we use those as a resource to classify articles as liberal or conservative, even without knowing the self-identified political leaning of most users? We applied three semi-supervised learning methods that propagate classifications of political news articles and users as conservative or liberal, based on the assumption that liberal users will vote for liberal articles more often, and similarly for conservative users and articles. Starting from a few labeled articles and users, the algorithms propagate political leaning labels to the entire graph. In cross-validation, the best algorithm achieved 99.6% accuracy on held-out users and 96.3% accuracy on held-out articles. Adding social data such as users’ friendship or text features such as cosine similarity did not improve accuracy. The propagation algorithms, using the subjective liking data from users, also performed better than an SVM-based text classifier, which achieved 92.0% accuracy on articles.
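The abstract does not name the three propagation methods, so the sketch below is a generic label-propagation scheme on a bipartite user-article vote graph and should not be read as any of the authors' algorithms. It assumes leanings are encoded as scores in [-1, +1] (an illustrative choice), clamps the labeled seed nodes, and repeatedly sets every other node to the mean score of its neighbors. All node names are hypothetical.

```python
def propagate_leaning(votes, seeds, iterations=20):
    """Propagate leaning scores over a user-article vote graph.

    votes: list of (user, article) pairs, one per positive vote.
    seeds: dict mapping labeled nodes to +1.0 or -1.0 (assumed encoding).
    Returns a dict mapping every node to a score in [-1, +1].
    """
    # Build an undirected adjacency map; user and article ids must not collide.
    neighbors = {}
    for user, article in votes:
        neighbors.setdefault(user, set()).add(article)
        neighbors.setdefault(article, set()).add(user)

    # Unlabeled nodes start at 0.0 (no leaning).
    scores = {node: seeds.get(node, 0.0) for node in neighbors}

    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbors.items():
            if node in seeds:
                # Labeled nodes stay clamped to their known label.
                updated[node] = seeds[node]
            else:
                # Unlabeled nodes take the mean score of their neighbors.
                updated[node] = sum(scores[n] for n in adjacent) / len(adjacent)
        scores = updated
    return scores


votes = [("u:alice", "a:1"), ("u:carol", "a:1"), ("u:bob", "a:2")]
seeds = {"u:alice": 1.0, "u:bob": -1.0}
scores = propagate_leaning(votes, seeds)
```

In this toy graph, the unlabeled user `u:carol` shares a vote with the seed user `u:alice`, so both `a:1` and `u:carol` end up with positive scores, while `a:2` inherits the negative label of `u:bob`.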

artificial intelligence, machine learning, natural language, (19 more...)

Fifth International AAAI Conference on Weblogs and Social Media

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)

Industry:

Media > News (0.66)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.55)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)

Oblinger, Daniel (Defense Advanced Research Projects Agency)

Toward a Computational Model of Transfer

AI Magazine, Jul-9-2011

TLP and the field as a whole made great strides in each of these dimensions. Indeed, the program has helped TL become a recognized subdiscipline of machine learning. Other articles in this special issue detail the work accomplished in TLP; this article focuses on a broad framing of the research conducted and an assessment of its progress, limitations, and challenges, from an admittedly personal but DARPA-influenced perspective. Traditionally, every DARPA program has focused its research by requiring a precise measure of progress. The DARPA TLP decided to measure transfer by comparing the learning of tasks A and B versus the learning of B alone. In figure 1 the curve labeled B represents a traditional learning curve of the performance on target task B as a function of the number of training instances.

artificial intelligence, knowledge, machine learning, (17 more...)

AI Magazine

Country: North America > United States (0.58)

Genre:

Collection > Journal > Special Issue (0.54)
Overview (0.34)

Industry:

Leisure & Entertainment > Sports > Football (0.70)
Government > Regional Government > North America Government > United States Government (0.58)
Government > Military (0.58)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)

Knoblock, C. A., Lerman, K., Minton, S. N.

Wrapper Maintenance: A Machine Learning Approach

arXiv.org Artificial Intelligence, Jun-23-2011

The proliferation of online information sources has led to an increased use of wrappers for extracting data from Web sources. While most of the previous research has focused on quick and efficient generation of wrappers, the development of tools for wrapper maintenance has received less attention. This is an important research problem because Web sources often change in ways that prevent the wrappers from extracting data correctly. We present an efficient algorithm that learns structural information about data from positive examples alone. We describe how this information can be used for two wrapper maintenance applications: wrapper verification and reinduction. The wrapper verification system detects when a wrapper is not extracting correct data, usually because the Web source has changed its format. The reinduction algorithm automatically recovers from changes in the Web source by identifying data on Web pages so that a new wrapper may be generated for this source. To validate our approach, we monitored 27 wrappers over a period of a year. The verification algorithm correctly discovered 35 of the 37 wrapper changes, and made 16 mistakes, resulting in precision of 0.73 and recall of 0.95. We validated the reinduction algorithm on ten Web sources. We were able to successfully reinduce the wrappers, obtaining precision and recall values of 0.90 and 0.80 on the data extraction task.

artificial intelligence, logic & formal reasoning, machine learning, (21 more...)

doi: 10.1613/jair.1145

arXiv: 1106.4872

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New York > Kings County > New York City (0.04)

Genre: Research Report (1.00)

Industry:

Consumer Products & Services > Restaurants (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.71)

arXiv.org Artificial Intelligence, Jun-22-2011

Acquiring Word-Meaning Mappings for Natural Language Interfaces

Thompson, C.

This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. WOLFIE is part of an integrated system that learns to transform sentences into representations such as logical database queries. Experimental results are presented demonstrating WOLFIE's ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by WOLFIE is compared to that of lexicons acquired by a similar system, with results favorable to WOLFIE. A second set of experiments demonstrates WOLFIE's ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.

artificial intelligence, machine learning, natural language, (20 more...)

doi: 10.1613/jair.1063

arXiv: 1106.4571

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Provost, F., Weiss, G. M.

Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction

arXiv.org Artificial Intelligence, Jun-22-2011

For large, real-world inductive learning problems, the number of training examples often must be limited due to the costs associated with procuring, preparing, and storing the training examples and/or the computational costs associated with learning from them. In such circumstances, one question of practical importance is: if only n training examples can be selected, in what proportion should the classes be represented? In this article we help to answer this question by analyzing, for a fixed training-set size, the relationship between the class distribution of the training data and the performance of classification trees induced from these data. We study twenty-six data sets and, for each, determine the best class distribution for learning. The naturally occurring class distribution is shown to generally perform well when classifier performance is evaluated using undifferentiated error rate (0/1 loss). However, when the area under the ROC curve is used to evaluate classifier performance, a balanced distribution is shown to perform well. Since neither of these choices for class distribution always generates the best-performing classifier, we introduce a "budget-sensitive" progressive sampling algorithm for selecting training examples based on the class associated with each example. An empirical analysis of this algorithm shows that the class distribution of the resulting training set yields classifiers with good (nearly-optimal) classification performance.
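The question posed in the abstract (given a budget of n training examples, in what proportion should the classes be represented?) can be illustrated with a short sketch. The code below is not the paper's budget-sensitive progressive sampling algorithm; it only shows, under simplifying assumptions, how a fixed-size training set with a chosen class proportion might be drawn from a labeled pool. The function name, the (features, label) tuple layout, and the two-class setup are hypothetical.

```python
import random


def sample_with_class_proportion(examples, n, minority_fraction, minority_label, seed=0):
    """Draw n examples so that about minority_fraction of them carry minority_label.

    examples: list of (features, label) pairs (assumed layout).
    Raises ValueError if either class has too few examples for the request.
    """
    rng = random.Random(seed)
    minority = [e for e in examples if e[1] == minority_label]
    majority = [e for e in examples if e[1] != minority_label]

    n_minority = round(n * minority_fraction)
    n_majority = n - n_minority
    if n_minority > len(minority) or n_majority > len(majority):
        raise ValueError("not enough examples in one class for the requested proportion")

    # Sample each class without replacement, then shuffle the combined set.
    sample = rng.sample(minority, n_minority) + rng.sample(majority, n_majority)
    rng.shuffle(sample)
    return sample


# A pool with a 10% / 90% natural class distribution (made-up data).
pool = [(i, "pos") for i in range(10)] + [(i, "neg") for i in range(90)]
balanced = sample_with_class_proportion(pool, n=20, minority_fraction=0.5, minority_label="pos")
natural = sample_with_class_proportion(pool, n=20, minority_fraction=0.1, minority_label="pos")
```

Training one classifier on `balanced` and another on `natural`, then comparing error rate and area under the ROC curve, mirrors the kind of comparison the abstract describes for a fixed training-set size.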

artificial intelligence, inductive learning, machine learning, (17 more...)

doi: 10.1613/jair.1199

arXiv: 1106.4557

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > San Mateo County > Menlo Park (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

arXiv.org Artificial Intelligence, Jun-22-2011

Specific-to-General Learning for Temporal Events with Application to Learning Event Definitions from Video

Fern, A., Givan, R., Siskind, J. M.

We develop, analyze, and evaluate a novel, supervised, specific-to-general learner for a simple temporal logic and use the resulting algorithm to learn visual event definitions from video sequences. First, we introduce a simple, propositional, temporal, event-description language called AMA that is sufficiently expressive to represent many events yet sufficiently restrictive to support learning. We then give algorithms, along with lower and upper complexity bounds, for the subsumption and generalization problems for AMA formulas. We present a positive-examples-only specific-to-general learning method based on these algorithms. We also present a polynomial-time-computable "syntactic" subsumption test that implies semantic subsumption without being equivalent to it. A generalization algorithm based on syntactic subsumption can be used in place of semantic generalization to improve the asymptotic complexity of the resulting learning algorithm. Finally, we apply this algorithm to the task of learning relational event definitions from video and show that it yields definitions that are competitive with hand-coded ones.

formula, logic & formal reasoning, machine learning, (19 more...)

doi: 10.1613/jair.1050

arXiv: 1106.4572

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.84)

Rampone, Salvatore, Russo, Ciro

A fuzzified BRAIN algorithm for learning DNF from incomplete data

arXiv.org Artificial Intelligence, Jun-16-2011

The aim of this paper is to address the problem of learning Boolean functions from training data with missing values. We present an extension of the BRAIN algorithm, called U-BRAIN (Uncertainty-managing Batch Relevance-based Artificial INtelligence), conceived for learning DNF Boolean formulas from partial truth tables, possibly with uncertain values or missing bits. The algorithm is obtained from BRAIN by introducing fuzzy sets in order to manage uncertainty. In the case where no missing bits are present, the algorithm reduces to the original BRAIN.

artificial intelligence, inductive learning, machine learning, (18 more...)

doi: 10.1285/i20705948v5n2p256

arXiv: 1002.4014

Country:

Europe > Italy (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)