AITopics

0809.4530

Country:

North America > United States > Texas (0.14)
North America > Canada > Ontario > Middlesex County > London (0.14)
Oceania > New Zealand > North Island > Waikato (0.04)
(31 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Media > Film (0.92)
Leisure & Entertainment > Sports (0.92)
(5 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(7 more...)

Zaffalon, M., Miranda, E.

Conservative Inference Rule for Uncertain Reasoning under Incompleteness

Journal of Artificial Intelligence Research, Apr-29-2009

In this paper we formulate the problem of inference under incomplete information in very general terms. This includes modelling the process responsible for the incompleteness, which we call the incompleteness process. We allow the process's behaviour to be partly unknown. Then we use Walley's theory of coherent lower previsions, a generalisation of the Bayesian theory to imprecision, to derive the rule to update beliefs under incompleteness that logically follows from our assumptions, and that we call the conservative inference rule. This rule has some remarkable properties: it is an abstract rule to update beliefs that can be applied in any situation or domain; it gives us the opportunity to be neither too optimistic nor too pessimistic about the incompleteness process, which is a necessary condition for drawing conclusions that are reliable yet strong enough; and it is a coherent rule, in the sense that it cannot lead to inconsistencies. We give examples to show how the new rule can be applied in expert systems, in parametric statistical inference, and in pattern classification, and discuss more generally the view of incompleteness processes defended here as well as some of its consequences.

lower prevision, prevision, probability, (15 more...)

doi: 10.1613/jair.2736

10603

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia (0.05)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Tsatsaronis, George, Halkidi, Maria, Giakoumakis, Emmanouel A.

Quality Classifiers for Open Source Software Repositories

arXiv.org Artificial Intelligence, Apr-29-2009

Initial open source software (OSS) projects rely on large repositories for hosting and distribution until they become independent. A huge amount of project metadata is collected and maintained in such software repositories providing useful information about projects and their success. In this paper we propose a data mining approach that processes the metadata contained in such OSS repositories. The proposed approach aims at the construction of a classifier that is trained on the metadata of existing projects and predicts the successful continuation of any given OSS. The successfulness of a project is defined with regard to the confidence level of the classifier which predicts that this project will be ported in widely used OSS projects (e.g.

classifier, data mining, machine learning, (15 more...)

0904.4708

Genre: Research Report (0.82)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Journal of Artificial Intelligence Research, Apr-23-2009

Learning Document-Level Semantic Properties from Free-Text Annotations

Branavan, S. R. K., Chen, H., Eisenstein, J., Barzilay, R.

This paper presents a new method for inferring the semantic properties of documents by leveraging free-text keyphrase annotations. Such annotations are becoming increasingly abundant due to the recent dramatic growth in semi-structured, user-generated online content. One especially relevant domain is product reviews, which are often annotated by their authors with pros/cons keyphrases such as "a real bargain" or "good value." These annotations are representative of the underlying semantic properties; however, unlike expert annotations, they are noisy: lay authors may use different labels to denote the same property, and some labels may be missing. To learn using such noisy annotations, we find a hidden paraphrase structure which clusters the keyphrases. The paraphrase structure is linked with a latent topic model of the review texts, enabling the system to predict the properties of unannotated documents and to effectively aggregate the semantic properties of multiple reviews. Our approach is implemented as a hierarchical Bayesian model with joint inference. We find that joint inference increases the robustness of the keyphrase clustering and encourages the latent topics to correlate with semantically meaningful properties. Multiple evaluations demonstrate that our model substantially outperforms alternative approaches for summarizing single and multiple documents into a set of semantically salient keyphrases.

annotation, keyphrase, semantic property, (17 more...)

doi: 10.1613/jair.2633

10598

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Consumer Products & Services (0.46)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial Intelligence, Apr-22-2009

Lexicographic probability, conditional probability, and nonstandard probability

Halpern, Joseph Y.

The relationship between Popper spaces (conditional probability spaces that satisfy some regularity conditions), lexicographic probability systems (LPS's), and nonstandard probability spaces (NPS's) is considered. If countable additivity is assumed, Popper spaces and a subclass of LPS's are equivalent; without the assumption of countable additivity, the equivalence no longer holds. If the state space is finite, LPS's are equivalent to NPS's. However, if the state space is infinite, NPS's are shown to be more general than LPS's.

artificial intelligence, machine learning, probability measure, (18 more...)

cs/0306106

Country:

Europe > United Kingdom > England (0.45)
North America > United States > New York (0.28)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.61)

Potscher, Benedikt M., Leeb, Hannes

On the Distribution of Penalized Maximum Likelihood Estimators: The LASSO, SCAD, and Thresholding

arXiv.org Machine Learning, Apr-17-2009

We study the distributions of the LASSO, SCAD, and thresholding estimators, in finite samples and in the large-sample limit. The asymptotic distributions are derived for both the case where the estimators are tuned to perform consistent model selection and for the case where the estimators are tuned to perform conservative model selection. Our findings complement those of Knight and Fu (2000) and Fan and Li (2001). We show that the distributions are typically highly nonnormal regardless of how the estimator is tuned, and that this property persists in large samples. The uniform convergence rate of these estimators is also obtained, and is shown to be slower than 1/root(n) in case the estimator is tuned to perform consistent model selection. An impossibility result regarding estimation of the estimators' distribution function is also provided.
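For readers unfamiliar with the three estimators named in the abstract, the following is a minimal sketch of their usual textbook forms in the simplest setting: a single coefficient with an unpenalized estimate z and tuning parameter lam. It assumes an orthonormal design, where the LASSO reduces to soft thresholding, and the SCAD rule of Fan and Li (2001) with their suggested default a = 3.7. The function names are illustrative and the paper's exact setup may differ.

```python
import math

def hard_threshold(z, lam):
    """Keep z unchanged if it exceeds the threshold in magnitude, else set it to zero."""
    return z if abs(z) > lam else 0.0

def soft_threshold(z, lam):
    """Shrink z toward zero by lam (the LASSO solution under an orthonormal design)."""
    return math.copysign(max(abs(z) - lam, 0.0), z)

def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding rule: soft thresholding near zero, no shrinkage for large |z|,
    and a linear interpolation between the two regimes. Requires a > 2."""
    if abs(z) <= 2 * lam:
        return soft_threshold(z, lam)
    if abs(z) <= a * lam:
        return ((a - 1) * z - math.copysign(a * lam, z)) / (a - 2)
    return z

if __name__ == "__main__":
    for z in (0.5, 1.5, 3.0, 5.0):
        print(z, hard_threshold(z, 1.0), soft_threshold(z, 1.0), scad_threshold(z, 1.0))
```

All three rules map a whole interval of z values to exactly zero, so the resulting estimator has a point mass at zero. This is one intuitive reason why its distribution can be far from normal.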

artificial intelligence, estimator, machine learning, (19 more...)

doi: 10.1016/j.jmva.2009.06.010

0711.0660

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.41)

Vennekens, Joost, Denecker, Marc, Bruynooghe, Maurice

CP-logic: A Language of Causal Probabilistic Events and Its Relation to Logic Programming

arXiv.org Artificial Intelligence, Apr-10-2009

This paper develops a logical language for representing probabilistic causal laws. Our interest in such a language is twofold. First, it can be motivated as a fundamental study of the representation of causal knowledge. Causality has an inherent dynamic aspect, which has been studied at the semantical level by Shafer in his framework of probability trees. In such a dynamic context, where the evolution of a domain over time is considered, the idea of a causal law as something which guides this evolution is quite natural. In our formalization, a set of probabilistic causal laws can be used to represent a class of probability trees in a concise, flexible and modular way. In this way, our work extends Shafer's by offering a convenient logical representation for his semantical objects. Second, this language also has relevance for the area of probabilistic logic programming. In particular, we prove that the formal semantics of a theory in our language can be equivalently defined as a probability distribution over the well-founded models of certain logic programs, rendering it formally quite similar to existing languages such as ICL or PRISM. Because we can motivate and explain our language in a completely self-contained way as a representation of probabilistic causal laws, this provides a new way of explaining the intuitions behind such probabilistic logic programs: we can say precisely which knowledge such a program expresses, in terms that are equally understandable by a non-logician. Moreover, we also obtain an additional piece of knowledge representation methodology for probabilistic logic programs, by showing how they can express probabilistic causal laws.

artificial intelligence, logic & formal reasoning, machine learning, (16 more...)

0904.1672

Country:

North America > United States (0.45)
Europe > Germany (0.27)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

Freeman, Guy, Smith, Jim Q.

Bayesian MAP Model Selection of Chain Event Graphs

arXiv.org Machine Learning, Apr-6-2009

The class of chain event graph models is a generalisation of the class of discrete Bayesian networks, retaining most of the structural advantages of the Bayesian network for model interrogation, propagation and learning, while more naturally encoding asymmetric state spaces and the order in which events happen. In this paper we demonstrate how with complete sampling, conjugate closed form model selection based on product Dirichlet priors is possible, and prove that suitable homogeneity assumptions characterise the product Dirichlet prior on this class of models. We demonstrate our techniques using two educational examples.
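To illustrate what "conjugate closed form model selection based on product Dirichlet priors" can look like in practice, here is a minimal sketch that scores a model as a sum of Dirichlet-multinomial log marginal likelihoods, one per independent group of transition probabilities. The grouping into "stages", the function names, and the input layout are assumptions made for illustration. The sketch does not reproduce the paper's search procedure or its rule for setting priors.

```python
from math import lgamma, exp

def log_marginal(counts, alpha):
    """Log marginal likelihood of an observed sequence with the given category counts,
    under a Dirichlet(alpha) prior on the category probabilities."""
    n, a = sum(counts), sum(alpha)
    out = lgamma(a) - lgamma(a + n)
    for c, al in zip(counts, alpha):
        out += lgamma(al + c) - lgamma(al)
    return out

def log_score(stages):
    """Log marginal likelihood of a model whose parameters carry a product of
    independent Dirichlet priors. `stages` is a list of (counts, alpha) pairs."""
    return sum(log_marginal(counts, alpha) for counts, alpha in stages)

if __name__ == "__main__":
    # Two hypothetical stages, each with two outgoing edges and a uniform prior.
    model = [([8, 2], [1.0, 1.0]), ([2, 8], [1.0, 1.0])]
    print(log_score(model))
```

Because the score is available in closed form, competing models can be compared by their log scores, and with a uniform prior over models the highest score identifies the MAP model.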

artificial intelligence, ceg, machine learning, (17 more...)

0904.0977

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Gabrilovich, E., Markovitch, S.

Wikipedia-based Semantic Interpretation for Natural Language Processing

Journal of Artificial Intelligence Research, Mar-30-2009

Adequate representation of natural language semantics requires access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was based on purely statistical techniques that did not make use of background knowledge, on limited lexicographic knowledge bases such as WordNet, or on huge manual efforts such as the CYC project. Here we propose a novel method, called Explicit Semantic Analysis (ESA), for fine-grained semantic interpretation of unrestricted natural language texts. Our method represents meaning in a high-dimensional space of concepts derived from Wikipedia, the largest encyclopedia in existence. We explicitly represent the meaning of any text in terms of Wikipedia-based concepts. We evaluate the effectiveness of our method on text categorization and on computing the degree of semantic relatedness between fragments of natural language text. Using ESA results in significant improvements over the previous state of the art in both tasks. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users.
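The idea of representing a text as a weighted vector over concepts, and comparing texts in that space, can be sketched as follows. This is an illustration only: the three toy "concepts" are hypothetical stand-ins for Wikipedia articles, and the TF-IDF weighting and cosine similarity used here are assumptions of the sketch rather than details stated in the abstract.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy concepts standing in for Wikipedia articles.
CONCEPTS = {
    "Cat": "cat feline pet animal whiskers",
    "Dog": "dog canine pet animal bark",
    "Computer": "computer machine software hardware program",
}

def build_index(concepts):
    """Map each word to a {concept: TF-IDF weight} dictionary (an inverted index)."""
    n = len(concepts)
    tf = {name: Counter(text.split()) for name, text in concepts.items()}
    df = Counter(word for counts in tf.values() for word in counts)
    index = defaultdict(dict)
    for name, counts in tf.items():
        for word, count in counts.items():
            index[word][name] = count * math.log(n / df[word])
    return index

def interpret(text, index):
    """Represent a text as a concept vector by summing the vectors of its words."""
    vec = defaultdict(float)
    for word in text.lower().split():
        for concept, weight in index.get(word, {}).items():
            vec[concept] += weight
    return dict(vec)

def relatedness(a, b, index):
    """Cosine similarity between the concept vectors of two texts."""
    va, vb = interpret(a, index), interpret(b, index)
    dot = sum(w * vb.get(c, 0.0) for c, w in va.items())
    na = math.sqrt(sum(w * w for w in va.values()))
    nb = math.sqrt(sum(w * w for w in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

if __name__ == "__main__":
    idx = build_index(CONCEPTS)
    print(interpret("cat pet", idx))
    print(relatedness("cat pet", "dog pet", idx))
```

Because each dimension of the vector is a named concept, the representation can be read directly, which is the interpretability property the abstract emphasizes.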

category, proceedings, text categorization, (15 more...)

doi: 10.1613/jair.2669

10595

Country:

Asia > Middle East > Iraq (0.28)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
South America > Brazil (0.04)
(21 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Information Technology (1.00)
(14 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Journal of Artificial Intelligence Research, Mar-28-2009

Solving #SAT and Bayesian Inference with Backtracking Search

Bacchus, F., Dalmao, S., Pitassi, T.

Inference in Bayes Nets (BAYES) is an important problem with numerous applications in probabilistic reasoning. Counting the number of satisfying assignments of a propositional formula (#SAT) is a closely related problem of fundamental theoretical importance. Both these problems, and others, are members of the class of sum-of-products (SUMPROD) problems. In this paper we show that standard backtracking search, when augmented with a simple memoization scheme (caching), can solve any sum-of-products problem with time complexity that is at least as good as any other state-of-the-art exact algorithm, and that it can also achieve the best known time-space tradeoff. Furthermore, backtracking's ability to utilize more flexible variable orderings allows us to prove that it can achieve an exponential speedup over other standard algorithms for SUMPROD on some instances. The ideas presented here have been utilized in a number of solvers that have been applied to various types of sum-of-products problems. These systems have exploited the fact that backtracking can naturally exploit more of the problem's structure to achieve improved performance on a range of problem instances. Empirical evidence of this performance gain has appeared in published works describing these solvers, and we provide references to these works.
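The combination of backtracking search with caching can be sketched for #SAT as below. This is a minimal illustration, not the algorithm from the paper: it branches on variables in a fixed order and caches the count for each residual formula, whereas the paper's results rely on more refined caching and flexible variable orderings. The clause encoding (DIMACS-style signed integers) and the function name are assumptions of the sketch.

```python
from functools import lru_cache

def count_models(clauses, num_vars):
    """Count satisfying assignments of a CNF formula over variables 1..num_vars.
    Each clause is an iterable of nonzero ints; a negative int is a negated variable."""
    formula = frozenset(frozenset(c) for c in clauses)
    if frozenset() in formula:
        return 0  # an empty clause can never be satisfied

    def condition(cls, lit):
        """Residual formula after making `lit` true, or None on a conflict."""
        out = set()
        for clause in cls:
            if lit in clause:
                continue  # clause satisfied, drop it
            reduced = clause - {-lit}
            if not reduced:
                return None  # clause falsified
            out.add(reduced)
        return frozenset(out)

    @lru_cache(maxsize=None)  # the cache: identical subproblems are solved once
    def count(cls, var):
        if not cls:
            # No clauses left: every remaining variable is free.
            return 2 ** (num_vars - var + 1)
        if var > num_vars:
            return 0  # defensive: clauses mention a variable outside 1..num_vars
        total = 0
        for lit in (var, -var):
            residual = condition(cls, lit)
            if residual is not None:
                total += count(residual, var + 1)
        return total

    return count(formula, 1)

if __name__ == "__main__":
    # (x1 or x2) and (not x1 or x3) over three variables
    print(count_models([[1, 2], [-1, 3]], 3))
```

The cache key is the residual formula together with the next variable, so two different partial assignments that leave the same residual formula share a single computation.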

algorithm, decomposition, node, (15 more...)

doi: 10.1613/jair.2648

10594

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Mateo County > San Mateo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)