There is a growing need for scalable semantic web repositories which support inference and provide efficient queries. There is also a growing interest in representing uncertain knowledge in semantic web datasets and ontologies. In this paper, I present a bit vector schema specifically designed for RDF (Resource Description Framework) datasets. I propose a system for materializing and storing inferred knowledge using this schema. I show experimental results that demonstrate that this solution simplifies inference queries and drastically improves results. I also propose and describe a solution for materializing and persisting uncertain information and probabilities. Thresholds and bit vectors are used to provide efficient query access to this uncertain knowledge. My goal is to provide a semantic web repository that supports knowledge inference, uncertainty reasoning, and Bayesian networks, without sacrificing performance or scalability.
Models for collecting and aggregating categorical data on crowdsourcing platforms typically fall into two broad categories: those assuming agents honest and consistent but with heterogeneous error rates, and those assuming agents strategic and seek to maximize their expected reward. The former often leads to tractable aggregation of elicited data, while the latter usually focuses on optimal elicitation and does not consider aggregation. In this paper, we develop a Bayesian model, wherein agents have differing quality of information, but also respond to incentives. Our model generalizes both categories and enables the joint exploration of optimal elicitation and aggregation. This model enables our exploration, both analytically and experimentally, of optimal aggregation of categorical data and optimal multiple-choice interface design.
This paper discusses a target tracking problem in which no dynamic mathematical model is explicitly assumed. A nonlinear filter based on the fuzzy If-then rules is developed. A comparison with a Kalman filter is made, and empirical results show that the performance of the fuzzy filter is better. Intensive simulations suggest that theoretical justification of the empirical results is possible.
Xu, Qianqian (Institute of Information Engineering, CAS, Beijing) | Xiong, Jiechao (BICMR and School of Mathematical Sciences, Peking University, Beijing) | Chen, Xi (BICMR and School of Mathematical Sciences, Peking University, Beijing) | Huang, Qingming (Stern School of Business, New York University) | Yao, Yuan (University of Chinese Academy of Sciences, Beijing)
Recently, crowdsourcing has emerged as an effective paradigm for human-powered large scale problem solving in various domains. However, task requester usually has a limited amount of budget, thus it is desirable to have a policy to wisely allocate the budget to achieve better quality. In this paper, we study the principle of information maximization for active sampling strategies in the framework of HodgeRank, an approach based on Hodge Decomposition of pairwise ranking data with multiple workers. The principle exhibits two scenarios of active sampling: Fisher information maximization that leads to unsupervised sampling based on a sequential maximization of graph algebraic connectivity without considering labels; and Bayesian information maximization that selects samples with the largest information gain from prior to posterior, which gives a supervised sampling involving the labels collected. Experiments show that the proposed methods boost the sampling efficiency as compared to traditional sampling schemes and are thus valuable to practical crowdsourcing experiments.
The work described in this paper defines a Bayesian framework to use noisy, but redundant data from multiple sensor streams and incorporate it with the contextual and domain knowledge that is provided by both the physical constraints imposed by the local environment where the sensors are located and by the people that are involved in the surveillance tasks. The paper also presents the preliminary results of applying the Bayesian framework to the people localization problem in indoor environment using a sensor network that consists of video cameras, infrared tag readers and a fingerprint reader.