Goto

Collaborating Authors

 Uncertainty


Ontology-Mediated Queries for Probabilistic Databases

AAAI Conferences

Probabilistic databases (PDBs) are usually incomplete, e.g., containing only the facts that have been extracted from the Web with high confidence. However, missing facts are often treated as being false, which leads to unintuitive results when querying PDBs. Recently, open-world probabilistic databases (OpenPDBs) were proposed to address this issue by allowing probabilities of unknown facts to take any value from a fixed probability interval. In this paper, we extend OpenPDBs by Datalog+/- ontologies, under which both upper and lower probabilities of queries become even more informative, enabling us to distinguish queries that were indistinguishable before. We show that the dichotomy between P and PP in (Open)PDBs can be lifted to the case of first-order rewritable positive programs (without negative constraints); and that the problem can become NP^PP-complete, once negative constraints are allowed. We also propose an approximating semantics that circumvents the increase in complexity caused by negative constraints.


JAG: A Crowdsourcing Framework for Joint Assessment and Peer Grading

AAAI Conferences

Generation and evaluation of crowdsourced content is commonly treated as two separate processes, performed at different times and by two distinct groups of people: content creators and content assessors. As a result, most crowdsourcing tasks follow this template: one group of workers generates content and another group of workers evaluates it. In an educational setting, for example, content creators are traditionally students that submit open-response answers to assignments (e.g., a short answer, a circuit diagram, or a formula) and content assessors are instructors that grade these submissions. Despite the considerable success of peer-grading in massive open online courses (MOOCs), the process of test-taking and grading are still treated as two distinct tasks which typically occur at different times, and require an additional overhead of grader training and incentivization. Inspired by this problem in the context of education, we propose a general crowdsourcing framework that fuses open-response test-taking (content generation) and assessment into a single, streamlined process that appears to students in the form of an explicit test, but where everyone also acts as an implicit grader. The advantages offered by our framework include: a common incentive mechanism for both the creation and evaluation of content, and a probabilistic model that jointly models the processes of contribution and evaluation, facilitating efficient estimation of the quality of the contributions and the competency of the contributors. We demonstrate the effectiveness and limits of our framework via simulations and a real-world user study.


Anytime Anyspace AND/OR Search for Bounding the Partition Function

AAAI Conferences

Bounding the partition function is a key inference task in many graphical models. In this paper, we develop an anytime anyspace search algorithm taking advantage of AND/OR tree structure and optimized variational heuristics to tighten deterministic bounds on the partition function. We study how our priority-driven best-first search scheme can improve on state-of-the-art variational bounds in an anytime way within limited memory resources, as well as the effect of the AND/OR framework to exploit conditional independence structure within the search process within the context of summation. We compare our resulting bounds to a number of existing methods, and show that our approach offers a number of advantages on real-world problem instances taken from recent UAI competitions.


An Ambiguity Aversion Model for Decision Making under Ambiguity

AAAI Conferences

In real life, decisions are often made under ambiguity, where it is difficult to estimate accurately the probability of each single possible consequence of a choice. However, this problem has not been solved well in existing work for the following two reasons. (i) Some of them cannot cover the Ellsberg paradox and the Machina Paradox. Thus, the choices that they predict could be inconsistent with empirical observations. (ii) Some of them rely on parameter tuning without offering explanations for the reasonability of setting such bounds of parameters. Thus, the prediction of such a model in new decision making problems is doubtful. To the end, this paper proposes a new decision making model based on D-S theory and the emotion of ambiguity aversion. Some insightful properties of our model and the validating on two famous paradoxes show that our model indeed is a better alternative for decision making under ambiguity.


Learning Visual Sentiment Distributions via Augmented Conditional Probability Neural Network

AAAI Conferences

Visual sentiment analysis is raising more and more attention with the increasing tendency to express emotions through images. While most existing works assign a single dominant emotion to each image, we address the sentiment ambiguity by label distribution learning (LDL), which is motivated by the fact that image usually evokes multiple emotions. Two new algorithms are developed based on conditional probability neural network (CPNN). First, we proposed BCPNN which encodes image label into a binary representation to replace the signless integers used in CPNN, and employ it as a part of input for the neural network. Then, we train our ACPNN model by adding noises to ground truth label and augmenting affective distributions. Since current datasets are mostly annotated for single-label learning, we build two new datasets, one of which is relabeled on the popular Flickr dataset and the other is collected from Twitter. These datasets contain 20,745 images with multiple affective labels, which are over ten times larger than the existing ones. Experimental results show that the proposed methods outperform the state-of-the-art works on our large-scale datasets and other publicly available benchmarks.


A Declarative Approach to Data-Driven Fact Checking

AAAI Conferences

Fact checking is an essential part of any investigative work. For linguistic, psychological and social reasons, it is an inherently human task. Yet, modern media make it increasingly difficult for experts to keep up with the pace at which information is produced. Hence, we believe there is value in tools to assist them in this process. Much of the effort on Web data research has been focused on coping with incompleteness and uncertainty. Comparatively, dealing with context has received less attention, although it is crucial in judging the validity of a claim. For instance, what holds true in a US state, might not in its neighbors, e.g., due to obsolete or superseded laws. In this work, we address the problem of checking the validity of claims in multiple contexts. We define a language to represent and query facts across different dimensions. The approach is non-intrusive and allows relatively easy modeling, while capturing incompleteness and uncertainty. We describe the syntax and semantics of the language. We present algorithms to demonstrate its feasibility, and we illustrate its usefulness through examples.


PAC-Bayesian Theory Meets Bayesian Inference

arXiv.org Machine Learning

We exhibit a strong link between frequentist PAC-Bayesian risk bounds and the Bayesian marginal likelihood. That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood. This provides an alternative explanation to the Bayesian Occam's razor criteria, under the assumption that the data is generated by an i.i.d distribution. Moreover, as the negative log-likelihood is an unbounded loss function, we motivate and propose a PAC-Bayesian theorem tailored for the sub-gamma loss family, and we show that our approach is sound on classical Bayesian linear regression tasks.


Bayesian Opponent Exploitation in Imperfect-Information Games

arXiv.org Artificial Intelligence

Two fundamental problems in computational game theory are computing a Nash equilibrium and learning to exploit opponents given observations of their play (opponent exploitation). The latter is perhaps even more important than the former: Nash equilibrium does not have a compelling theoretical justification in game classes other than two-player zero-sum, and for all games one can potentially do better by exploiting perceived weaknesses of the opponent than by following a static equilibrium strategy throughout the match. The natural setting for opponent exploitation is the Bayesian setting where we have a prior model that is integrated with observations to create a posterior opponent model that we respond to. The most natural, and a well-studied prior distribution is the Dirichlet distribution. An exact polynomial-time algorithm is known for best-responding to the posterior distribution for an opponent assuming a Dirichlet prior with multinomial sampling in normal-form games; however, for imperfect-information games the best known algorithm is based on approximating an infinite integral without theoretical guarantees. We present the first exact algorithm for a natural class of imperfect-information games. We demonstrate that our algorithm runs quickly in practice and outperforms the best prior approaches. We also present an algorithm for the uniform prior setting.


[็ณปๅˆ—ๆดปๅ‹•] Machine Learning ๆฉŸๅ™จๅญธ็ฟ’่ชฒ็จ‹

#artificialintelligence

Our learning task: Given a training set S {(x1 , y1), (x2 , y2), . . . The simplest case: y { 1, 1} called binary classification problem If y is a real number it becomes a regression problem More general case, y can be a vector and each element is drawn from a finite set.


Shakespeare and Fuzzy Logic

@machinelearnbot

There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy. Shakespeare teaches us in this Hamlet quote that reality is much more complex than our mental projections and understanding. Reality is fuzzier than we would care to think. Although introducing subjectivities to modeling seems to harm the'objectivity' for the purists, this objectivity is more of a deliberate ignorance of real life issues than a sound strategy for modeling. There is a myriad of ambiguities and uncertainties in the information we receive decode and signal which tends to limit the functionality of traditional methods that are based on crisp logic. For instance, while USD 500 premium means you will have to give USD 500 to purchase the policy, the opinion whether this premium is adequate for the insurer or not, and reasonable or too expensive for the consumer is quite subjective.