Goldszmidt, Moises
Active Learning with Expected Error Reduction
Mussmann, Stephen, Reisler, Julia, Tsai, Daniel, Mousavi, Ehsan, O'Brien, Shayne, Goldszmidt, Moises
Active learning has been studied extensively as a method for efficient data collection. Among the many approaches in the literature, Expected Error Reduction (EER) (Roy and McCallum) has been shown to be an effective method for active learning: select the candidate sample that, in expectation, maximally decreases the error on an unlabeled set. However, EER requires the model to be retrained for every candidate sample, a computational cost that has kept it from being widely used with modern deep neural networks. In this paper we reformulate EER under the lens of Bayesian active learning and derive a computationally efficient version that can use any Bayesian parameter sampling method (such as arXiv:1506.02142). We then compare the empirical performance of our method, using Monte Carlo dropout for parameter sampling, against state-of-the-art methods in the deep active learning literature. Experiments are performed on four standard benchmark datasets and three WILDS datasets (arXiv:2012.07421). The results indicate that our method outperforms all other methods except one in the data-shift scenario: a model-dependent, non-information-theoretic method that requires an order of magnitude more computation (arXiv:1906.03671).
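As a rough illustration of the selection rule only (not the paper's efficient Bayesian version, which avoids retraining by reusing posterior samples), a minimal sketch of classical EER might look as follows; fit and predict_proba are hypothetical caller-supplied model callbacks:

```python
import numpy as np

def select_by_eer(candidates_X, pool_X, labeled_X, labeled_y, fit, predict_proba):
    """Classical EER (Roy and McCallum): pick the candidate whose hypothetical
    labeling minimizes the expected error over the unlabeled pool.  `fit` and
    `predict_proba` are hypothetical caller-supplied model callbacks."""
    model = fit(labeled_X, labeled_y)
    cand_probs = predict_proba(model, candidates_X)   # p(y | x) per candidate
    scores = []
    for i, x in enumerate(candidates_X):
        expected_error = 0.0
        for y, p_y in enumerate(cand_probs[i]):
            # Retrain with the hypothetical label (x, y) -- the expensive step
            # that the paper's Bayesian reformulation avoids.
            m = fit(np.vstack([labeled_X, x[None]]), np.append(labeled_y, y))
            pool_probs = predict_proba(m, pool_X)
            # Expected 0/1 error on the pool under the retrained model.
            expected_error += p_y * np.mean(1.0 - pool_probs.max(axis=1))
        scores.append(expected_error)
    return int(np.argmin(scores))
```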
Deciding Consistency of Databases Containing Defeasible and Strict Information
Goldszmidt, Moises, Pearl, Judea
We propose a norm of consistency for a mixed set of defeasible and strict sentences, based on a probabilistic semantics. This norm establishes a clear distinction between knowledge bases depicting exceptions and those containing outright contradictions. We then define a notion of entailment, also based on probabilistic considerations, and provide a characterization of the relation between consistency and entailment. We derive necessary and sufficient conditions for consistency, and provide a simple decision procedure for testing consistency and deciding whether a sentence is entailed by a database. Finally, we show that if all sentences are Horn clauses, consistency and entailment can be tested in polynomial time.
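A minimal sketch of how such a consistency check can work, using the tolerance-based test from this line of work but replacing the satisfiability tests with brute-force enumeration over worlds (feasible only for small, purely defeasible knowledge bases; the paper's full procedure also handles strict sentences):

```python
from itertools import product

def worlds(variables):
    """Enumerate all truth assignments over `variables` (brute force; a SAT
    solver plays this role in practice)."""
    for bits in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, bits))

def tolerated(rule, rules, variables):
    """A default a -> c is tolerated by `rules` if some world verifies it
    (a and c both true) while satisfying the material counterpart of every rule."""
    a, c = rule
    return any(a(w) and c(w) and all((not a2(w)) or c2(w) for a2, c2 in rules)
               for w in worlds(variables))

def consistent(rules, variables):
    """Consistency test: repeatedly remove a tolerated rule; the set is
    probabilistically consistent iff it can be emptied this way."""
    remaining = list(rules)
    while remaining:
        found = next((r for r in remaining if tolerated(r, remaining, variables)), None)
        if found is None:
            return False   # a subset with no tolerated rule: outright contradiction
        remaining.remove(found)
    return True

# Birds fly, penguins are birds, penguins don't fly: exceptional but consistent.
rules = [(lambda w: w["bird"],    lambda w: w["fly"]),
         (lambda w: w["penguin"], lambda w: w["bird"]),
         (lambda w: w["penguin"], lambda w: not w["fly"])]
print(consistent(rules, ["bird", "penguin", "fly"]))   # True
```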
Reasoning With Qualitative Probabilities Can Be Tractable
Goldszmidt, Moises, Pearl, Judea
We recently described a formalism for reasoning with if-then rules that are expressed with different levels of firmness [18]. The formalism interprets these rules as extreme conditional probability statements, specifying orders of magnitude of disbelief, which impose constraints over possible rankings of worlds. It was shown that, once we compute a priority function Z+ on the rules, the degree to which a given query is confirmed or denied can be computed in O(log n) propositional satisfiability tests, where n is the number of rules in the knowledge base. In this paper, we show that computing Z+ requires O(n² × log n) satisfiability tests, not an exponential number as was conjectured in [18], which reduces to polynomial complexity in the case of Horn expressions. We also show how reasoning with imprecise observations can be incorporated in our formalism and how the popular notions of belief revision and epistemic entrenchment are embodied naturally and tractably.
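For illustration, the special case of the priority computation in which all rules have equal firmness (the plain System-Z ranking) can be sketched as follows, again with brute-force world enumeration standing in for the satisfiability tests:

```python
from itertools import product

def worlds(variables):
    for bits in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, bits))

def tolerated(rule, rules, variables):
    a, c = rule
    return any(a(w) and c(w) and all((not a2(w)) or c2(w) for a2, c2 in rules)
               for w in worlds(variables))

def z_ranking(rules, variables):
    """System-Z partition: peel off, layer by layer, the rules tolerated by
    everything still unranked.  Lower rank means a more normal, more easily
    overridden rule."""
    rank, remaining, z = 0, list(rules), {}
    while remaining:
        layer = [r for r in remaining if tolerated(r, remaining, variables)]
        if not layer:
            raise ValueError("inconsistent rule set")
        for r in layer:
            z[r] = rank
            remaining.remove(r)
        rank += 1
    return z

rules = [(lambda w: w["bird"],    lambda w: w["fly"]),
         (lambda w: w["penguin"], lambda w: w["bird"]),
         (lambda w: w["penguin"], lambda w: not w["fly"])]
print(list(z_ranking(rules, ["bird", "penguin", "fly"]).values()))  # [0, 1, 1]
```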
Learning Bayesian Networks with Local Structure
Friedman, Nir, Goldszmidt, Moises
In this paper we examine a novel addition to the known methods for learning Bayesian networks from data that improves the quality of the learned networks. Our approach explicitly represents and learns the local structure in the conditional probability tables (CPTs) that quantify these networks. This increases the space of possible models, enabling the representation of CPTs with a variable number of parameters that depends on the learned local structures. The resulting learning procedure is capable of inducing models that better emulate the real complexity of the interactions present in the data. We describe the theoretical foundations and practical aspects of learning local structures, as well as an empirical evaluation of the proposed method. This evaluation indicates that learning curves characterizing the procedure that exploits the local structure converge faster than those of the standard procedure. Our results also show that networks learned with local structure tend to be more complex (in terms of arcs), yet require fewer parameters.
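A small worked example of why local structure pays off in scoring (illustrative only; the paper's procedure learns the local structures themselves): a tree-structured CPT with a handful of leaves carries far fewer free parameters, and hence a smaller MDL/BIC penalty, than a full table over the same parents:

```python
import math

def bic_penalty(num_params, n_samples):
    # Standard MDL/BIC penalty: (log N) / 2 per free parameter.
    return 0.5 * math.log(n_samples) * num_params

def cpt_params_full(card_child, card_parents):
    """Free parameters of a full tabular CPT: one (card_child - 1)-dimensional
    distribution per joint parent configuration."""
    return math.prod(card_parents) * (card_child - 1)

def cpt_params_tree(card_child, num_leaves):
    """Tree-structured CPT: one distribution per leaf, so the count scales
    with the learned local structure, not with all parent configurations."""
    return num_leaves * (card_child - 1)

# A binary child with 6 binary parents: 64 parameters for the full table
# versus, say, 5 for a tree with 5 leaves.
n = 1000
full = cpt_params_full(2, [2] * 6)   # 64
tree = cpt_params_tree(2, 5)         # 5
print(bic_penalty(full, n) - bic_penalty(tree, n))  # penalty saved by local structure
```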
Context-Specific Independence in Bayesian Networks
Boutilier, Craig, Friedman, Nir, Goldszmidt, Moises, Koller, Daphne
Bayesian networks provide a language for qualitatively representing the conditional independence properties of a distribution. This allows a natural and compact representation of the distribution, eases knowledge acquisition, and supports effective inference algorithms. It is well-known, however, that there are certain independencies that we cannot capture qualitatively within the Bayesian network structure: independencies that hold only in certain contexts, i.e., given a specific assignment of values to certain variables. In this paper, we propose a formal notion of context-specific independence (CSI), based on regularities in the conditional probability tables (CPTs) at a node. We present a technique, analogous to (and based on) d-separation, for determining when such independence holds in a given network. We then focus on a particular qualitative representation scheme - tree-structured CPTs - for capturing CSI. We suggest ways in which this representation can be used to support effective inference algorithms. In particular, we present a structural decomposition of the resulting network which can improve the performance of clustering algorithms, and an alternative algorithm based on cutset conditioning.
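A minimal sketch of the CSI test this representation suggests (hypothetical data structure; the criterion is that no root-to-leaf path consistent with the given context tests the parent in question):

```python
class Node:
    """A CPT-tree node: internal nodes test a parent variable; leaves hold a
    distribution over the child (represented here just by a label)."""
    def __init__(self, var=None, children=None, leaf=None):
        self.var, self.children, self.leaf = var, children, leaf

def csi_holds(node, parent, context):
    """Context-specific independence: the child is independent of `parent`
    given `context` if no tree path consistent with the context tests `parent`."""
    if node.leaf is not None:
        return True
    if node.var == parent:
        return False
    if node.var in context:              # the context fixes this test's outcome
        return csi_holds(node.children[context[node.var]], parent, context)
    return all(csi_holds(ch, parent, context) for ch in node.children.values())

# Tree for P(Alarm | Burglary, Earthquake): when Burglary = 1 the distribution
# ignores Earthquake, so Alarm is independent of Earthquake in that context.
tree = Node(var="Burglary", children={
    0: Node(var="Earthquake", children={0: Node(leaf="p0"), 1: Node(leaf="p1")}),
    1: Node(leaf="p2"),
})
print(csi_holds(tree, "Earthquake", {"Burglary": 1}))  # True
print(csi_holds(tree, "Earthquake", {"Burglary": 0}))  # False
```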
Sequential Update of Bayesian Network Structure
Friedman, Nir, Goldszmidt, Moises
There is an obvious need for improving the performance and accuracy of a Bayesian network as new data is observed. Because of errors in model construction and changes in the dynamics of the domains, we cannot afford to ignore the information in new data. While sequential update of parameters for a fixed structure can be accomplished using standard techniques, sequential update of network structure is still an open problem. In this paper, we investigate sequential update of Bayesian networks where both parameters and structure are expected to change. We introduce a new approach that allows for the flexible manipulation of the tradeoff between the quality of the learned networks and the amount of information that is maintained about past observations. We formally describe our approach including the necessary modifications to the scoring functions for learning Bayesian networks, evaluate its effectiveness through an empirical study, and extend it to the case of missing data.
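For the parameter half of the problem, the standard sequential update the abstract refers to can be sketched with Dirichlet pseudo-counts; the decay factor below is an illustrative, simplified stand-in for the tradeoff between network quality and how much past data is remembered:

```python
class SequentialCPT:
    """Sequential parameter update for one node under a fixed structure:
    per parent configuration, keep counts of child values and estimate the
    conditional distribution with Dirichlet smoothing.  A decay < 1
    geometrically discounts old observations."""
    def __init__(self, child_values, prior=1.0, decay=1.0):
        self.child_values = list(child_values)
        self.prior, self.decay = prior, decay
        self.counts = {}   # parent_config -> {child_value: count}

    def _table(self, parent_config):
        return self.counts.setdefault(parent_config,
                                      {v: 0.0 for v in self.child_values})

    def update(self, parent_config, child_value):
        table = self._table(parent_config)
        for v in table:
            table[v] *= self.decay        # forget a little of the past
        table[child_value] += 1.0

    def prob(self, parent_config, child_value):
        table = self._table(parent_config)
        total = sum(table.values()) + self.prior * len(table)
        return (table[child_value] + self.prior) / total

cpt = SequentialCPT(child_values=[0, 1], decay=0.99)
cpt.update(parent_config=("rain",), child_value=1)
print(cpt.prob(("rain",), 1))   # smoothed estimate with forgetting
```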
Continuous Value Function Approximation for Sequential Bidding Policies
Boutilier, Craig, Goldszmidt, Moises, Sabata, Bikash
Market-based mechanisms such as auctions are being studied as an appropriate means for resource allocation in distributed and multiagent decision problems. When agents value resources in combination rather than in isolation, they must often deliberate about appropriate bidding strategies for a sequence of auctions offering resources of interest. We briefly describe a discrete dynamic programming model for constructing appropriate bidding policies for resources exhibiting both complementarities and substitutability. We then introduce a continuous approximation of this model, assuming that money (or the numeraire good) is infinitely divisible. Though this has the potential to reduce the computational cost of computing policies, value functions in the transformed problem do not have a convenient closed-form representation. We develop a grid-based approximation for such value functions, representing them as piecewise linear functions. We show that these methods can offer significant computational savings with relatively small cost in solution quality.
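As a toy stand-in for the model (a single first-price auction per stage, an assumption not taken from the paper), a grid-based dynamic-programming backup with piecewise linear interpolation of the value function over money might look like:

```python
import numpy as np

def pwl_value(grid, values, m):
    """Evaluate a piecewise-linear value function stored on a money grid."""
    return float(np.interp(m, grid, values))

def bid_backup(grid, v_next, price_dist, item_value):
    """One backup: for each money level, choose the bid maximizing expected
    value, interpolating V at the off-grid money levels reached after payment.
    `price_dist` is a list of (competing_price, probability) pairs."""
    v_new = np.empty_like(v_next)
    for i, m in enumerate(grid):
        best = pwl_value(grid, v_next, m)             # option: do not bid
        for bid in grid[grid <= m]:
            p_win = sum(p for price, p in price_dist if price <= bid)
            # Winning pays the bid (first-price toy model, an assumption).
            val = (p_win * (item_value + pwl_value(grid, v_next, m - bid))
                   + (1 - p_win) * pwl_value(grid, v_next, m))
            best = max(best, val)
        v_new[i] = best
    return v_new

grid = np.linspace(0.0, 10.0, 11)                     # money levels
v = bid_backup(grid, np.zeros_like(grid), [(2.0, 0.5), (5.0, 0.5)], item_value=4.0)
```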
Data Analysis with Bayesian Networks: A Bootstrap Approach
Friedman, Nir, Goldszmidt, Moises, Wyner, Abraham
In recent years there has been significant progress in algorithms and methods for inducing Bayesian networks from data. However, in complex data analysis problems, we need to go beyond being satisfied with inducing networks with high scores. We need to provide confidence measures on features of these networks: Is the existence of an edge between two nodes warranted? Is the Markov blanket of a given node robust? Can we say something about the ordering of the variables? We should be able to address these questions, even when the amount of data is not enough to induce a high scoring network. In this paper we propose Efron's Bootstrap as a computationally efficient approach for answering these questions. In addition, we propose to use these confidence measures to induce better structures from the data, and to detect the presence of latent variables.
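A minimal sketch of the bootstrap procedure for edge confidence, assuming a caller-supplied learn_structure routine that returns a set of directed edges (hypothetical; any structure-learning method fits):

```python
import numpy as np

def edge_confidence(data, learn_structure, n_boot=100, seed=0):
    """Non-parametric (Efron) bootstrap over a dataset of shape (N, d):
    resample rows with replacement, relearn a network each time, and report
    how often each directed (parent, child) edge appears."""
    rng = np.random.default_rng(seed)
    n = len(data)
    counts = {}
    for _ in range(n_boot):
        sample = data[rng.integers(0, n, size=n)]     # resample with replacement
        for edge in learn_structure(sample):
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: c / n_boot for edge, c in counts.items()}
```

The same counts answer the abstract's other questions: Markov-blanket robustness and variable-ordering confidence are frequencies of the corresponding features across the bootstrapped networks.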
Making life better one large system at a time: Challenges for UAI research
Goldszmidt, Moises
The rapid growth and diversity in service offerings and the ensuing complexity of information technology ecosystems present numerous management challenges (both operational and strategic). Instrumentation and measurement technology is, by and large, keeping pace with this development and growth. However, the algorithms, tools, and technology required to transform the data into relevant information for decision making are not. The claim in this paper (and the invited talk) is that the line of research conducted in Uncertainty in Artificial Intelligence is very well suited to address these challenges and close this gap. I will support this claim and discuss open problems using recent examples in diagnosis, model discovery, and policy optimization on three real-life distributed systems.