probability tree


Slaves to the Law of Large Numbers: An Asymptotic Equipartition Property for Perplexity in Generative Language Models

Mudumbai, Raghu, Bell, Tyler

arXiv.org Artificial Intelligence

We propose a new asymptotic equipartition property for the perplexity of a large piece of text generated by a language model and present theoretical arguments for this property. Perplexity, defined as an inverse likelihood function, is widely used as a performance metric for training language models. Our main result states that the logarithmic perplexity of any large text produced by a language model must asymptotically converge to the average entropy of its token distributions. This means that language models are constrained to produce outputs only from a "typical set", which, we show, is a vanishingly small subset of all possible grammatically correct outputs. We present preliminary experimental results from an open-source language model to support our theoretical claims. This work has possible practical applications for understanding and improving "AI detection" tools, and theoretical implications for the uniqueness, predictability and creative potential of generative models.
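The convergence the abstract claims is an instance of the law of large numbers, and can be illustrated with a toy sketch (the token distribution here is invented, not from the paper): for text sampled from a model's own distribution, the per-token log-perplexity concentrates around the distribution's entropy.

```python
import math
import random

random.seed(0)

# Toy "language model": a fixed token distribution. For text sampled from
# the model itself, the average negative log-likelihood per token (the
# logarithmic perplexity) converges to the entropy of the distribution.
probs = {"the": 0.5, "cat": 0.3, "sat": 0.2}
entropy = -sum(p * math.log(p) for p in probs.values())

tokens = random.choices(list(probs), weights=list(probs.values()), k=100_000)
log_perplexity = -sum(math.log(probs[t]) for t in tokens) / len(tokens)

print(f"entropy={entropy:.4f}  log-perplexity={log_perplexity:.4f}")
```

With 100,000 sampled tokens the two quantities agree to within a few thousandths, which is the "typical set" phenomenon in miniature: almost all sampled texts have log-perplexity near the entropy.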


A new class of generative classifiers based on staged tree models

Carli, Federico, Leonelli, Manuele, Varando, Gherardo

arXiv.org Machine Learning

Generative models for classification use the joint probability distribution of the class variable and the features to construct a decision rule. Among generative models, Bayesian networks and naive Bayes classifiers are the most commonly used and provide a clear graphical representation of the relationships among all variables. However, these have the disadvantage of highly restricting the type of relationships that can exist, by not allowing for context-specific independences. Here we introduce a new class of generative classifiers, called staged tree classifiers, which formally account for context-specific independence. They are constructed by partitioning the vertices of an event tree, from which conditional independence can be formally read. The naive staged tree classifier is also defined, which extends the classic naive Bayes classifier whilst retaining the same complexity. An extensive simulation study shows that the classification accuracy of staged tree classifiers is competitive with that of state-of-the-art classifiers. An applied analysis predicting the fate of the passengers of the Titanic highlights the insights that the new class of generative classifiers can give.
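The key construction, partitioning vertices into stages that share a conditional distribution, can be sketched in a few lines (the variables, stage names and probabilities below are hypothetical, not taken from the paper):

```python
# Illustrative staged event tree: vertices assigned to the same "stage"
# share one conditional distribution. Here, after X1=0 the distribution of
# X2 is the same in both class contexts (a context-specific independence),
# while after X1=1 it differs by class.
stages = {
    ("C=0", "X1=0"): "s1", ("C=1", "X1=0"): "s1",  # shared stage
    ("C=0", "X1=1"): "s2", ("C=1", "X1=1"): "s3",
}
stage_dist = {"s1": 0.5, "s2": 0.8, "s3": 0.2}  # P(X2=1 | stage)

def p_x2_given(c, x1):
    """Look up P(X2=1 | C=c, X1=x1) via the vertex's stage."""
    return stage_dist[stages[(f"C={c}", f"X1={x1}")]]

# Context-specific independence: given X1=0, X2 does not depend on C.
assert p_x2_given(0, 0) == p_x2_given(1, 0)
print(p_x2_given(0, 1), p_x2_given(1, 1))
```

A Bayesian network over the same variables could only assert that X2 depends on C given X1 everywhere or nowhere; the staging makes the dependence hold in one context but not the other.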


Algorithms for Causal Reasoning in Probability Trees

Genewein, Tim, McGrath, Tom, Déletang, Grégoire, Mikulik, Vladimir, Martic, Miljan, Legg, Shane, Ortega, Pedro A.

arXiv.org Artificial Intelligence

Probability trees are one of the simplest models of causal generative processes. They possess clean semantics and -- unlike causal Bayesian networks -- they can represent context-specific causal dependencies, which are necessary for e.g. causal induction. Yet, they have received little attention from the AI and ML community. Here we present concrete algorithms for causal reasoning in discrete probability trees that cover the entire causal hierarchy (association, intervention, and counterfactuals), and operate on arbitrary propositional and causal events. Our work expands the domain of causal reasoning to a very general class of discrete stochastic processes.
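A hedged sketch of the intervention idea on a probability tree (this is not DeepMind's implementation; the events and numbers are invented): an intervention at a node replaces its branch probabilities, here forcing one branch to probability one.

```python
# A two-level probability tree as nested dicts: each branch carries its
# probability and a subtree of second-level branches.
tree = {
    "smoker":     (0.4, {"cancer": (0.3, {}), "healthy": (0.7, {})}),
    "non-smoker": (0.6, {"cancer": (0.1, {}), "healthy": (0.9, {})}),
}

def marginal(tree, event):
    """P(event at the second level), summing over first-level branches."""
    return sum(p * sub[event][0] for p, sub in tree.values())

def intervene(tree, event):
    """do(event): force the chosen first-level branch to probability one."""
    return {k: ((1.0 if k == event else 0.0), sub)
            for k, (p, sub) in tree.items()}

print(marginal(tree, "cancer"))                       # P(cancer)
print(marginal(intervene(tree, "smoker"), "cancer"))  # P(cancer | do(smoker))
```

The first marginal mixes over both first-level branches (0.4 * 0.3 + 0.6 * 0.1 = 0.18); after the intervention only the forced branch contributes (0.3).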


DeepMind Research Introduces Algorithms for Causal Reasoning in Probability Trees

#artificialintelligence

For cutting-edge AI researchers looking for models with clean semantics that can represent the context-specific causal dependencies essential for causal induction, this DeepMind algorithm is a reason to revisit good old-fashioned probability trees. A probability tree diagram represents a probability space: it illustrates a series of independent events or conditional probabilities. Each node in the diagram represents an event and its probability, and the root node represents the certain event, whose probability equals one.
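The node-and-probability structure described above can be sketched directly (the events and numbers here are illustrative): the probability of any node is the product of the branch probabilities on the root-to-node path, with the root contributing one.

```python
# A probability tree as nested dicts: each child maps to
# (branch probability, subtree).
tree = {
    "rain": (0.3, {"wet": (0.9, {}), "dry": (0.1, {})}),
    "sun":  (0.7, {"wet": (0.1, {}), "dry": (0.9, {})}),
}

def path_probability(tree, path):
    """Multiply branch probabilities along a root-to-node path."""
    prob = 1.0  # the root event has probability one
    for event in path:
        p, tree = tree[event]
        prob *= p
    return prob

prob = path_probability(tree, ["rain", "wet"])  # 0.3 * 0.9, i.e. about 0.27
```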


DeepMind Introduces Algorithms for Causal Reasoning in Probability Trees

#artificialintelligence

Are you a cutting-edge AI researcher looking for models with clean semantics that can represent the context-specific causal dependencies necessary for causal induction? If so, maybe you should take a look at good old-fashioned probability trees. Probability trees may have been around for decades, but they have received little attention from the AI and ML community. "Probability trees are one of the simplest models of causal generative processes," explains the new DeepMind paper Algorithms for Causal Reasoning in Probability Trees, which the authors say is the first to propose concrete algorithms for causal reasoning in discrete probability trees. Humans naturally learn to reason in large part through inducing causal relationships from our observations, and we do this remarkably well, cognitive scientists say. Even when the data we perceive is sparse and limited, humans can quickly learn causal structures, such as interactions between physical objects, from observations of the co-occurrence frequencies between causes and effects. Causal induction is also a classic problem in statistics and machine learning.


Is Machine Learning Getting Us Closer to Predicting Eruptions?

#artificialintelligence

When Whakaari (White Island) in New Zealand unexpectedly erupted in December 2019, more than 40 tourists found themselves trapped on a small island that was exploding. The hot gases and water, flying rocks and ash killed 21 people during that eruption. This tragedy was a wake-up call for tour operators who would regularly bring people to this restless volcano in the Bay of Plenty. It is a volcano that produces steam-driven explosions that come with little warning, and it is these types of blasts that have killed dozens of people on volcanoes around the world over the past decade. Part of the problem is how we think about volcanic danger.


Review of The Art of Causal Conjecture

AI Magazine

However, he found his attention increasingly distracted by the possibilities provided by probability trees for understanding probability and causality, so much so, in fact, that instead of finishing the first book, he wrote a different one on this second topic. It is this second book that is the subject of this review, and it is easy to see why the power and breadth of the ideas seduced Shafer from his original purpose. (Do not despair, however: those three draft chapters have now also appeared, although without the others that were originally intended to accompany them, in Probabilistic Expert Systems, Society for Industrial and Applied Mathematics, 1996.) The author describes The Art of Causal Conjecture as presenting "a new mathematical and philosophical foundation for probability" (p. ). This is a large claim.


Towards a General Framework for Actual Causation Using CP-logic

Beckers, Sander, Vennekens, Joost

arXiv.org Artificial Intelligence

Since Pearl's seminal work on providing a formal language for causality, the subject has garnered a lot of interest among philosophers and researchers in artificial intelligence alike. One of the most debated topics in this context regards the notion of actual causation, which concerns itself with specific - as opposed to general - causal claims. The search for a proper formal definition of actual causation has evolved into a controversial debate that is pervaded with ambiguities and confusion. The goal of our research is twofold. First, we wish to provide a clear way to compare competing definitions. Second, we also want to improve upon these definitions so they can be applied to a more diverse range of instances, including non-deterministic ones. To achieve these goals we provide a general, abstract definition of actual causation, formulated in the context of the expressive language of CP-logic (Causal Probabilistic logic). We then show that three recent definitions by Ned Hall (originally formulated for structural models) and a definition of our own (formulated for CP-logic directly) can be viewed and directly compared as instantiations of this abstract definition, which allows them to deal with a broader range of examples.


Sensitivity analysis for finite Markov chains in discrete time

de Cooman, Gert, Hermans, Filip, Quaeghebeur, Erik

arXiv.org Artificial Intelligence

When the initial and transition probabilities of a finite Markov chain in discrete time are not well known, we should perform a sensitivity analysis. This is done by taking as basic uncertainty models the so-called credal sets that these probabilities are known or believed to belong to, and by allowing the probabilities to vary over such sets. This leads to the definition of an imprecise Markov chain. We show that the time evolution of such a system can be studied very efficiently using so-called lower and upper expectations. We also study how the inferred credal set about the state at time n evolves as n->infinity: under quite unrestrictive conditions, it converges to a unique invariant credal set, regardless of the credal set given for the initial state. This leads to a non-trivial generalisation of the classical Perron-Frobenius Theorem to imprecise Markov chains.
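The lower-expectation machinery can be sketched for a tiny two-state chain (the transition bounds below are invented, not from the paper): a backward recursion that, at each step, minimises each row's expectation over its credal set.

```python
# Two-state imprecise Markov chain: each transition row is only known to
# lie in an interval. Row i is (p, 1 - p) with p = P(move to state 0),
# and p is constrained to bounds[i].

def lower_transition(row_bounds, f):
    """Min of p*f[0] + (1-p)*f[1] over p in [lo, hi].
    The objective is linear in p, so the minimum is at an endpoint."""
    lo, hi = row_bounds
    return min(p * f[0] + (1 - p) * f[1] for p in (lo, hi))

def lower_expectation(bounds, f, n):
    """n-step lower expectation of f, applied backwards step by step."""
    for _ in range(n):
        f = [lower_transition(bounds[i], f) for i in range(2)]
    return f

# From state 0, P(stay in 0) lies in [0.4, 0.6];
# from state 1, P(move to 0) lies in [0.3, 0.5].
bounds = [(0.4, 0.6), (0.3, 0.5)]
result = lower_expectation(bounds, [1.0, 0.0], 20)  # f = indicator of state 0
print(result)
```

After enough steps the two entries coincide: the lower expectation no longer depends on the initial state, which is the convergence to a unique invariant credal set that the abstract describes.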


Markov Chain Monte Carlo using Tree-Based Priors on Model Structure

Angelopoulos, Nicos, Cussens, James

arXiv.org Artificial Intelligence

We present a general framework for defining priors on model structure and sampling from the posterior using the Metropolis-Hastings algorithm. The key idea is that structure priors are defined via a probability tree and that the proposal mechanism for the Metropolis-Hastings algorithm operates by traversing this tree, thereby defining a cheaply computable acceptance probability. We have applied this approach to Bayesian net structure learning using a number of priors and tree traversal strategies. Our results show that these must be chosen appropriately for this approach to be successful.
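The cheap acceptance probability can be illustrated with a minimal sketch (the branch probabilities and likelihood are made up; this is not the authors' code): a structure is drawn by walking the tree root-to-leaf, its prior is the product of branch probabilities collected along the way, and an independence proposal drawn from that prior makes the Metropolis-Hastings ratio collapse to a likelihood ratio.

```python
import math
import random

random.seed(1)

# Hypothetical tree prior over binary "structures" of length 3:
# at depth d, take branch 1 with probability BRANCH[d].
BRANCH = [0.7, 0.4, 0.5]

def sample_structure():
    """Walk the probability tree; return the structure and its prior."""
    bits, prior = [], 1.0
    for p in BRANCH:
        b = 1 if random.random() < p else 0
        bits.append(b)
        prior *= p if b == 1 else 1 - p  # accumulate branch probabilities
    return tuple(bits), prior

def likelihood(bits):
    """Toy score favouring structures with more 1-bits."""
    return math.exp(sum(bits))

current, current_prior = sample_structure()
samples = []
for _ in range(20_000):
    proposal, proposal_prior = sample_structure()
    # Proposal q equals the prior, so prior and q cancel in the
    # Metropolis-Hastings ratio, leaving only the likelihood ratio.
    if random.random() < min(1.0, likelihood(proposal) / likelihood(current)):
        current, current_prior = proposal, proposal_prior
    samples.append(current)
```

Because the posterior factorises here, the chain's marginal for the first bit should settle above its prior value of 0.7, pulled up by the likelihood's preference for 1-bits.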