Framework and Schema for Semantic Web Knowledge Bases

AAAI Conferences

There is a growing need for scalable semantic web repositories that support inference and provide efficient queries. There is also a growing interest in representing uncertain knowledge in semantic web datasets and ontologies. In this paper, I present a bit vector schema specifically designed for RDF (Resource Description Framework) datasets. I propose a system for materializing and storing inferred knowledge using this schema. I show experimental results demonstrating that this solution simplifies inference queries and drastically improves query performance. I also propose and describe a solution for materializing and persisting uncertain information and probabilities. Thresholds and bit vectors are used to provide efficient query access to this uncertain knowledge. My goal is to provide a semantic web repository that supports knowledge inference, uncertainty reasoning, and Bayesian networks without sacrificing performance or scalability.
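
The abstract does not spell out the schema, so the following is only a minimal, hypothetical sketch of bit-vector indexing for RDF triples: each (predicate, object) pair owns a bit vector indexed by subject ID, so conjunctive queries over stored and materialized triples reduce to bitwise ANDs. The class name, layout, and example data are illustrative assumptions, not the paper's design.

# Illustrative sketch (not the paper's actual schema): a bit-vector index over
# RDF triples where each (predicate, object) pair maps to a bit vector indexed
# by subject ID, so conjunctive queries reduce to bitwise ANDs.

class BitVectorTripleIndex:
    def __init__(self, num_subjects):
        self.num_subjects = num_subjects
        self.vectors = {}  # (predicate, object) -> int used as a bit vector

    def add_triple(self, subject_id, predicate, obj):
        key = (predicate, obj)
        self.vectors[key] = self.vectors.get(key, 0) | (1 << subject_id)

    def subjects_matching(self, *patterns):
        """Return subject IDs satisfying every (predicate, object) pattern."""
        result = (1 << self.num_subjects) - 1
        for key in patterns:
            result &= self.vectors.get(key, 0)
        return [s for s in range(self.num_subjects) if result >> s & 1]


if __name__ == "__main__":
    idx = BitVectorTripleIndex(num_subjects=4)
    idx.add_triple(0, "rdf:type", "ex:Student")
    idx.add_triple(1, "rdf:type", "ex:Student")
    idx.add_triple(1, "ex:enrolledIn", "ex:CS101")
    # A materialized (inferred) triple can be stored the same way:
    idx.add_triple(1, "rdf:type", "ex:Person")
    print(idx.subjects_matching(("rdf:type", "ex:Student"),
                                ("ex:enrolledIn", "ex:CS101")))  # -> [1]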


Eliciting Categorical Data for Optimal Aggregation

Neural Information Processing Systems

Models for collecting and aggregating categorical data on crowdsourcing platforms typically fall into two broad categories: those that assume agents are honest and consistent but have heterogeneous error rates, and those that assume agents are strategic and seek to maximize their expected reward. The former often leads to tractable aggregation of elicited data, while the latter usually focuses on optimal elicitation and does not consider aggregation. In this paper, we develop a Bayesian model in which agents have differing quality of information but also respond to incentives. Our model generalizes both categories and enables the joint exploration of optimal elicitation and aggregation. It allows us to explore, both analytically and experimentally, optimal aggregation of categorical data and optimal multiple-choice interface design.
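
As a point of reference for the aggregation side, here is a minimal sketch of Bayesian aggregation of categorical reports under the simplifying assumption that agents are honest and independent, with known accuracies and uniform errors; it illustrates the general computation, not the model proposed in the paper.

# Minimal sketch of Bayesian aggregation of categorical reports, assuming each
# agent answers independently and is correct with a known probability
# (uniform error over the remaining categories). Illustrative only.

def aggregate_posterior(reports, accuracies, num_categories, prior=None):
    """reports: list of category indices; accuracies: P(correct) per agent."""
    if prior is None:
        prior = [1.0 / num_categories] * num_categories
    posterior = list(prior)
    for report, acc in zip(reports, accuracies):
        wrong = (1.0 - acc) / (num_categories - 1)
        for k in range(num_categories):
            posterior[k] *= acc if report == k else wrong
    total = sum(posterior)
    return [p / total for p in posterior]


if __name__ == "__main__":
    # Three agents label an item from {0, 1, 2}; the more accurate agents agree.
    print(aggregate_posterior(reports=[1, 1, 0],
                              accuracies=[0.9, 0.8, 0.6],
                              num_categories=3))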


A Fuzzy Logic Approach to Target Tracking

arXiv.org Artificial Intelligence

This paper discusses a target tracking problem in which no dynamic mathematical model is explicitly assumed. A nonlinear filter based on fuzzy If-then rules is developed. A comparison with a Kalman filter is made, and empirical results show that the fuzzy filter performs better. Intensive simulations suggest that a theoretical justification of the empirical results is possible.
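
To make the idea concrete, the following is a generic sketch of a fuzzy If-then rule filter for one-dimensional tracking: rules map the magnitude of the innovation (measurement minus prediction) to a correction gain via membership functions and weighted-average defuzzification. The rule base, membership functions, and constants are illustrative assumptions, not the filter developed in the paper.

# Generic fuzzy If-then rule filter sketch for 1-D tracking (illustrative only).

def tri(x, a, b, c):
    """Triangular membership function peaking at b on the support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def shoulder(x, a, b):
    """Ramp membership: 0 below a, rising to 1 at b and beyond."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def fuzzy_gain(innovation):
    e = abs(innovation)
    # Rule base: IF |innovation| is SMALL/MEDIUM/LARGE THEN gain is low/mid/high.
    rules = [
        (tri(e, -1.0, 0.0, 1.0), 0.2),    # SMALL  -> low gain
        (tri(e, 0.5, 1.5, 2.5), 0.5),     # MEDIUM -> mid gain
        (shoulder(e, 2.0, 3.0), 0.9),     # LARGE  -> high gain
    ]
    # Weighted-average (centroid-style) defuzzification.
    num = sum(w * g for w, g in rules)
    den = sum(w for w, _ in rules)
    return num / den if den > 0 else 0.5

def fuzzy_filter(measurements, x0=0.0, velocity=1.0, dt=1.0):
    """Track a roughly constant-velocity target without an explicit noise model."""
    x = x0
    estimates = []
    for z in measurements:
        prediction = x + velocity * dt     # crude constant-velocity prediction
        innovation = z - prediction
        x = prediction + fuzzy_gain(innovation) * innovation
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    print(fuzzy_filter([1.1, 2.3, 2.9, 4.2, 5.0]))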


A Bayesian Framework for Robust Reasoning from Sensor Networks

AAAI Conferences

This paper defines a Bayesian framework for using noisy but redundant data from multiple sensor streams and combining it with the contextual and domain knowledge provided both by the physical constraints of the local environment in which the sensors are located and by the people involved in the surveillance tasks. The paper also presents preliminary results from applying the Bayesian framework to the people localization problem in an indoor environment, using a sensor network that consists of video cameras, infrared tag readers, and a fingerprint reader.
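
A minimal sketch of the kind of fusion described here, under simplifying assumptions: each sensor stream contributes an independent likelihood over a discrete set of locations, and physical constraints and domain knowledge enter as a prior that rules out impossible locations. The sensor models and numbers are hypothetical, not the framework in the paper.

# Illustrative Bayesian fusion over a discrete set of indoor locations.

def fuse(prior, sensor_likelihoods):
    """prior: {location: p}; sensor_likelihoods: list of {location: P(obs | loc)}."""
    posterior = dict(prior)
    for likelihood in sensor_likelihoods:
        for loc in posterior:
            posterior[loc] *= likelihood.get(loc, 0.0)
    total = sum(posterior.values())
    if total == 0.0:
        # Sensors contradict the constraints; fall back to the prior.
        return dict(prior)
    return {loc: p / total for loc, p in posterior.items()}


if __name__ == "__main__":
    # Domain knowledge: the tracked person cannot be in the locked lab.
    prior = {"office": 0.4, "hallway": 0.4, "lab": 0.0, "lobby": 0.2}
    camera = {"office": 0.7, "hallway": 0.2, "lab": 0.6, "lobby": 0.1}
    tag_reader = {"office": 0.6, "hallway": 0.3, "lab": 0.5, "lobby": 0.05}
    print(fuse(prior, [camera, tag_reader]))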


Bayesian Bias Mitigation for Crowdsourcing

Neural Information Processing Systems

Biased labelers are a systemic problem in crowdsourcing, and a comprehensive toolbox for handling their responses is still being developed. A typical crowdsourcing application can be divided into three steps: data collection, data curation, and learning. At present these steps are often treated separately. We present Bayesian Bias Mitigation for Crowdsourcing (BBMC), a Bayesian model to unify all three. Most data curation methods account for the effects of labeler bias by modeling all labels as coming from a single latent truth. Our model captures the sources of bias by describing labelers as influenced by shared random effects. This approach can account for more complex bias patterns that arise in ambiguous or hard labeling tasks and allows us to merge data curation and learning into a single computation. Active learning integrates data collection with learning, but is commonly considered infeasible with Gibbs sampling inference. We propose a general approximation strategy for Markov chains to efficiently quantify the effect of a perturbation on the stationary distribution and specialize this approach to active learning. Experiments show BBMC to outperform many common heuristics.
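
To illustrate the distinction between the effects and the sources of bias, the following toy generative sketch draws labeler biases from a small set of shared effects, so that errors are correlated across labelers who hold the same effect; a single-latent-truth model with independent error rates cannot represent this pattern. All distributions and parameter values are illustrative assumptions, not the BBMC model.

# Toy generative sketch of shared-random-effects labeler bias (illustrative only).

import random

def simulate_labels(num_items, num_labelers, num_effects=2, seed=0):
    rng = random.Random(seed)
    truths = [rng.randint(0, 1) for _ in range(num_items)]
    # Each shared effect flips a particular subset of items for everyone who holds it.
    effects = [{i for i in range(num_items) if rng.random() < 0.3}
               for _ in range(num_effects)]
    # Each labeler is influenced by one shared effect, so errors are correlated.
    assignments = [rng.randrange(num_effects) for _ in range(num_labelers)]
    labels = []
    for effect_id in assignments:
        flipped = effects[effect_id]
        row = []
        for item, truth in enumerate(truths):
            y = 1 - truth if item in flipped else truth
            if rng.random() < 0.1:          # small amount of idiosyncratic noise
                y = 1 - y
            row.append(y)
        labels.append(row)
    return truths, assignments, labels


if __name__ == "__main__":
    truths, assignments, labels = simulate_labels(num_items=8, num_labelers=4)
    print("truth               :", truths)
    for labeler, row in enumerate(labels):
        print(f"labeler {labeler} (effect {assignments[labeler]}):", row)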