Goto

Collaborating Authors

 Learning Graphical Models


The Bernstein Mechanism: Function Release under Differential Privacy

AAAI Conferences

We address the problem of general function release under differential privacy, by developing a functional mechanism that applies under the weak assumptions of oracle access to target function evaluation and sensitivity. These conditions permit treatment of functions described explicitly or implicitly as algorithmic black boxes. We achieve this result by leveraging the iterated Bernstein operator for polynomial approximation of the target function, and polynomial coefficient perturbation. Under weak regularity conditions, we establish fast rates on utility measured by high-probability uniform approximation. We provide a lower bound on the utility achievable for any functional mechanism that is epsilon-differentially private. The generality of our mechanism is demonstrated by the analysis of a number of example learners, including naive Bayes, non-parametric estimators and regularized empirical risk minimization. Competitive rates are demonstrated for kernel density estimation; and epsilon-differential privacy is achieved for a broader class of support vector machines than known previously.


Learning Bayesian Networks with Incomplete Data by Augmentation

AAAI Conferences

We present new algorithms for learning Bayesian networks from data with missing values using a data augmentation approach. An exact Bayesian network learning algorithm is obtained by recasting the problem into a standard Bayesian network learning problem without missing data. As expected, the exact algorithm does not scale to large domains. We build on the exact method to create an approximate algorithm using a hill-climbing technique. This algorithm scales to large domains so long as a suitable standard structure learning method for complete data is available. We perform a wide range of experiments to demonstrate the benefits of learning Bayesian networks with such new approach.


Knowing What to Ask: A Bayesian Active Learning Approach to the Surveying Problem

AAAI Conferences

We examine the surveying problem, where we attempt to predict how a target user is likely to respond to questions by iteratively querying that user, collaboratively based on the responses of a sample set of users. We focus on an active learning approach, where the next question we select to ask the user depends on their responses to the previous questions. We propose a method for solving the problem based on a Bayesian dimensionality reduction technique. We empirically evaluate our method, contrasting it to benchmark approaches based on augmented linear regression, and show that it achieves much better predictive performance, and is much more robust when there is missing data.


LPMLN, Weak Constraints, and P-log

AAAI Conferences

LP MLN is a recently introduced formalism that extends answer set programs by adopting the log-linear weight scheme of Markov Logic. This paper investigates the relationships between LPMLN and two other extensions of answer set programs: weak constraints to express a quantitative preference among answer sets, and P-log to incorporate probabilistic uncertainty. We present a translation of LP MLN into programs with weak constraints and a translation of P-log into LPMLN, which complement the existing translations in the opposite directions. The first translation allows us to compute the most probable stable models (i.e., MAP estimates) of LP MLN programs using standard ASP solvers. This result can be extended to other formalisms, such as Markov Logic, ProbLog, and Pearl's Causal Models, that are shown to be translatable into LP MLN . The second translation tells us how probabilistic nonmonotonicity (the ability of the reasoner to change his probabilistic model as a result of new information) of P-log can be represented in LP MLN , which yields a way to compute P-log using standard ASP solvers and MLN solvers.


Learning Visual Sentiment Distributions via Augmented Conditional Probability Neural Network

AAAI Conferences

Visual sentiment analysis is raising more and more attention with the increasing tendency to express emotions through images. While most existing works assign a single dominant emotion to each image, we address the sentiment ambiguity by label distribution learning (LDL), which is motivated by the fact that image usually evokes multiple emotions. Two new algorithms are developed based on conditional probability neural network (CPNN). First, we proposed BCPNN which encodes image label into a binary representation to replace the signless integers used in CPNN, and employ it as a part of input for the neural network. Then, we train our ACPNN model by adding noises to ground truth label and augmenting affective distributions. Since current datasets are mostly annotated for single-label learning, we build two new datasets, one of which is relabeled on the popular Flickr dataset and the other is collected from Twitter. These datasets contain 20,745 images with multiple affective labels, which are over ten times larger than the existing ones. Experimental results show that the proposed methods outperform the state-of-the-art works on our large-scale datasets and other publicly available benchmarks.


Multi-Task Deep Learning for User Intention Understanding in Speech Interaction Systems

AAAI Conferences

Speech interaction systems have been gaining popularity in recent years. The main purpose of these systems is to generate more satisfactory responses according to users' speech utterances, in which the most critical problem is to analyze user intention. Researches show that user intention conveyed through speech is not only expressed by content, but also closely related with users' speaking manners (e.g. with or without acoustic emphasis). How to incorporate these heterogeneous attributes to infer user intention remains an open problem. In this paper, we define Intention Prominence (IP) as the semantic combination of focus by text and emphasis by speech, and propose a multi-task deep learning framework to predict IP. Specifically, we first use long short-term memory (LSTM) which is capable of modeling long short-term contextual dependencies to detect focus and emphasis, and incorporate the tasks for focus and emphasis detection with multi-task learning (MTL) to reinforce the performance of each other. We then employ Bayesian network (BN) to incorporate multimodal features (focus, emphasis, and location reflecting users' dialect conventions) to predict IP based on feature correlations. Experiments on a data set of 135,566 utterances collected from real-world Sogou Voice Assistant illustrate that our method can outperform the comparison methods over 6.9-24.5% in terms of F1-measure. Moreover, a real practice in the Sogou Voice Assistant indicates that our method can improve the performance on user intention understanding by 7%.


A Declarative Approach to Data-Driven Fact Checking

AAAI Conferences

Fact checking is an essential part of any investigative work. For linguistic, psychological and social reasons, it is an inherently human task. Yet, modern media make it increasingly difficult for experts to keep up with the pace at which information is produced. Hence, we believe there is value in tools to assist them in this process. Much of the effort on Web data research has been focused on coping with incompleteness and uncertainty. Comparatively, dealing with context has received less attention, although it is crucial in judging the validity of a claim. For instance, what holds true in a US state, might not in its neighbors, e.g., due to obsolete or superseded laws. In this work, we address the problem of checking the validity of claims in multiple contexts. We define a language to represent and query facts across different dimensions. The approach is non-intrusive and allows relatively easy modeling, while capturing incompleteness and uncertainty. We describe the syntax and semantics of the language. We present algorithms to demonstrate its feasibility, and we illustrate its usefulness through examples.


Marrying Uncertainty and Time in Knowledge Graphs

AAAI Conferences

The management of uncertainty is crucial when harvesting structured content from unstructured and noisy sources. Knowledge Graphs ( KGs ) are a prominent example. KGs maintain both numerical and non-numerical facts, with the support of an underlying schema. These facts are usually accompanied by a confidence score that witnesses how likely is for them to hold. Despite their popularity, most of existing KGs focus on static data thus impeding the availabilityof timewise knowledge. What is missing is a comprehensive solution for the management of uncertain and temporal data in KGs . The goal of this paper is to fill this gap. We rely on two main ingredients. The first is a numerical extension of Markov Logic Networks (MLNs) that provide the necessary underpinning to formalize the syntax and semantics of uncertain temporal KGs . The second is a set of Datalog constraints with inequalities that extend the underlying schema of the KGs and help to detect inconsistencies. From a theoretical point of view, we discuss the complexity of two important classes of queries for uncertain temporal KGs: maximuma-posteriori and conditional probability inference. Due to the hardness of these problems and the fact that MLN solvers do not scale well, we also explore the usage of Probabilistic Soft Logics (PSL) as a practical tool to support our reasoning tasks. We report on an experimental evaluation comparing the MLN and PSL approaches.


Telugu OCR Framework using Deep Learning

arXiv.org Artificial Intelligence

In this paper, we address the task of Optical Character Recognition(OCR) for the Telugu script. We present an end-to-end framework that segments the text image, classifies the characters and extracts lines using a language model. The segmentation is based on mathematical morphology. The classification module, which is the most challenging task of the three, is a deep convolutional neural network. The language is modelled as a third degree markov chain at the glyph level. Telugu script is a complex alphasyllabary and the language is agglutinative, making the problem hard. In this paper we apply the latest advances in neural networks to achieve state-of-the-art error rates. We also review convolutional neural networks in great detail and expound the statistical justification behind the many tricks needed to make Deep Learning work.


PAC-Bayesian Theory Meets Bayesian Inference

arXiv.org Machine Learning

We exhibit a strong link between frequentist PAC-Bayesian risk bounds and the Bayesian marginal likelihood. That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood. This provides an alternative explanation to the Bayesian Occam's razor criteria, under the assumption that the data is generated by an i.i.d distribution. Moreover, as the negative log-likelihood is an unbounded loss function, we motivate and propose a PAC-Bayesian theorem tailored for the sub-gamma loss family, and we show that our approach is sound on classical Bayesian linear regression tasks.