ceg
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
stCEG: An R Package for Modelling Events over Spatial Areas Using Chain Event Graphs
Calley, Hollie, Williamson, Daniel
stCEG is an R package which allows a user to fully specify a Chain Event Graph (CEG) model from data and to produce interactive plots. It includes functions for the user to visualise spatial variables they wish to include in the model. There is also a web-based graphical user interface (GUI) provided, increasing ease of use for those without knowledge of R. We demonstrate stCEG using a dataset of homicides in London, which is included in the package. stCEG is the first software package for CEGs that allows for full model customisation.
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Rethinking LLM Advancement: Compute-Dependent and Independent Paths to Progress
Sanderson, Jack, Foley, Teddy, Guo, Spencer, Qu, Anqi, Josephson, Henry
Regulatory efforts to govern large language model (LLM) development have predominantly focused on restricting access to high-performance computational resources. This study evaluates the efficacy of such measures by examining whether LLM capabilities can advance through algorithmic innovation in compute-constrained environments. We propose a novel framework distinguishing compute-dependent innovations--which yield disproportionate benefits at high compute--from compute-independent innovations, which improve efficiency across compute scales. The impact is quantified using Compute-Equivalent Gain (CEG). Experimental validation with nanoGPT models confirms that compute-independent advancements yield significant performance gains (e.g., with combined CEG up to $3.5\times$) across the tested scales. In contrast, compute-dependent advancements were detrimental to performance at smaller experimental scales, but showed improved CEG (on par with the baseline) as model size increased, a trend consistent with their definition of yielding primary benefits at higher compute. Crucially, these findings indicate that restrictions on computational hardware, while potentially slowing LLM progress, are insufficient to prevent all capability gains driven by algorithmic advancements. We argue that effective AI oversight must therefore incorporate mechanisms for understanding, anticipating, and potentially guiding algorithmic research, moving beyond a singular focus on hardware. The proposed framework also serves as an analytical tool for forecasting AI progress.
- Asia > China (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California (0.04)
- Asia > Middle East > Jordan (0.04)
AI capabilities can be significantly improved without expensive retraining
Davidson, Tom, Denain, Jean-Stanislas, Villalobos, Pablo, Bas, Guillem
State-of-the-art AI systems can be significantly improved without expensive retraining via "post-training enhancements"-techniques applied after initial training like fine-tuning the system to use a web browser. We review recent post-training enhancements, categorizing them into five types: tool-use, prompting methods, scaffolding, solution selection, and data generation. Different enhancements improve performance on different tasks, making it hard to compare their significance. So we translate improvements from different enhancements into a common currency, the compute-equivalent gain: how much additional training compute would be needed to improve performance by the same amount as the enhancement. Our non-experimental work shows that post-training enhancements have significant benefits: most surveyed enhancements improve benchmark performance by more than a 5x increase in training compute, some by more than 20x. Post-training enhancements are relatively cheap to develop: fine-tuning costs are typically <1% of the original training cost. Governing the development of capable post-training enhancements may be challenging because frontier models could be enhanced by a wide range of actors.
- Research Report (1.00)
- Overview (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
CLEVRER-Humans: Describing Physical and Causal Events the Human Way
Mao, Jiayuan, Yang, Xuelin, Zhang, Xikun, Goodman, Noah D., Wu, Jiajun
Building machines that can reason about physical events and their causal relationships is crucial for flexible interaction with the physical world. However, most existing physical and causal reasoning benchmarks are exclusively based on synthetically generated events and synthetic natural language descriptions of causal relationships. This design brings up two issues. First, there is a lack of diversity in both event types and natural language descriptions; second, causal relationships based on manually-defined heuristics are different from human judgments. To address both shortcomings, we present the CLEVRER-Humans benchmark, a video reasoning dataset for causal judgment of physical events with human labels. We employ two techniques to improve data collection efficiency: first, a novel iterative event cloze task to elicit a new representation of events in videos, which we term Causal Event Graphs (CEGs); second, a data augmentation technique based on neural language generative models. We convert the collected CEGs into questions and answers to be consistent with prior work. Finally, we study a collection of baseline approaches for CLEVRER-Humans question-answering, highlighting the great challenges set forth by our benchmark.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Conditional Generative Modeling for High-dimensional Marked Temporal Point Processes
Dong, Zheng, Fan, Zekai, Zhu, Shixiang
Point processes offer a versatile framework for sequential event modeling. However, the computational challenges and constrained representational power of the existing point process models have impeded their potential for wider applications. This limitation becomes especially pronounced when dealing with event data that is associated with multi-dimensional or high-dimensional marks such as texts or images. To address this challenge, this study proposes a novel event generative framework for modeling point processes with high-dimensional marks. We aim to capture the distribution of events without explicitly specifying the conditional intensity or probability density function. Instead, we use a conditional generator that takes the history of events as input and generates the high-quality subsequent event that is likely to occur given the prior observations. The proposed framework offers a host of benefits, including considerable representational power to capture intricate dynamics in multi- or even high-dimensional event space, as well as exceptional efficiency in learning the model and generating samples. Our numerical results demonstrate superior performance compared to other state-of-the-art baselines.
- North America > United States > California (0.15)
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
Structural Learning of Simple Staged Trees
Leonelli, Manuele, Varando, Gherardo
Bayesian networks faithfully represent the symmetric conditional independences existing between the components of a random vector. Staged trees are an extension of Bayesian networks for categorical random vectors whose graph represents non-symmetric conditional independences via vertex coloring. However, since they are based on a tree representation of the sample space, the underlying graph becomes cluttered and difficult to visualize as the number of variables increases. Here we introduce the first structural learning algorithms for the class of simple staged trees, entertaining a compact coalescence of the underlying tree from which non-symmetric independences can be easily read. We show that data-learned simple staged trees often outperform Bayesian networks in model fit and illustrate how the coalesced graph is used to identify non-symmetric conditional independences.
- Asia (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Oceania > New Zealand (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Constructing a Chain Event Graph from a Staged Tree
Chain Event Graphs (CEGs) are a recent family of probabilistic graphical models - a generalisation of Bayesian Networks - providing an explicit representation of structural zeros and context-specific conditional independences within their graph topology. A CEG is constructed from an event tree through a sequence of transformations beginning with the colouring of the vertices of the event tree to identify one-step transition symmetries. This coloured event tree, also known as a staged tree, is the output of the learning algorithms used for this family. Surprisingly, no general algorithm has yet been devised that automatically transforms any staged tree into a CEG representation. In this paper we provide a simple iterative backward algorithm for this transformation. Additionally, we show that no information is lost from transforming a staged tree into a CEG. Finally, we demonstrate that with an optimal stopping time, our algorithm is more efficient than the generalisation of a special case presented in Silander and Leong (2013). We also provide Python code using this algorithm to obtain a CEG from any staged tree along with the functionality to add edges with sampling zeros.
- Oceania > New Zealand (0.04)
- Europe > United Kingdom > England > West Midlands > Coventry (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
Bayesian Learning of Causal Relationships for System Reliability
Yu, Xuewen, Smith, Jim Q., Nichols, Linda
Concurrently advances in causal modelling has led to a better understanding of how to predict complex systems when these are subjected to control. A major breakthrough then occurred about 20 years ago when it was then discovered that causal and graphical modelling could be combined. This paper reports on how the combination of these two technologies are being applied to give more insights concerning causal hypotheses that relate specially to reliability and system safety. Of course causal ideas have been embedded within reliability theory for a long time, both to explore the reasons behind a failure and to also estimate the efficacy of various interventions in the system that might ameliorate adverse outcomes. However, the graphical frameworks around which these ideas appear have been tree like - for example fault trees or event chains (Leverson 2011) - rather than the most common graphical framework for analyzing causation: the BN. Only recently have graphical causal methods emerged for methodologies and algorithms to exist for causal discovery and causal reasoning on such tree structures. The primary such graph is the chain event graph (CEG), see Thwaites, Smith and Riccomagno (2010), Barclay, Hutton and Smith (2013) and Collazo et.
Automated Generation of Test Models from Semi-Structured Requirements
Fischbach, Jannik, Junker, Maximilian, Vogelsang, Andreas, Freudenstein, Dietmar
[Context:] Model-based testing is an instrument for automated generation of test cases. It requires identifying requirements in documents, understanding them syntactically and semantically, and then translating them into a test model. One light-weight language for these test models are Cause-Effect-Graphs (CEG) that can be used to derive test cases. [Problem:] The creation of test models is laborious and we lack an automated solution that covers the entire process from requirement detection to test model creation. In addition, the majority of requirements is expressed in natural language (NL), which is hard to translate to test models automatically. [Principal Idea:] We build on the fact that not all NL requirements are equally unstructured. We found that 14 % of the lines in requirements documents of our industry partner contain "pseudo-code"-like descriptions of business rules. We apply Machine Learning to identify such semi-structured requirements descriptions and propose a rule-based approach for their translation into CEGs. [Contribution:] We make three contributions: (1) an algorithm for the automatic detection of semi-structured requirements descriptions in documents, (2) an algorithm for the automatic translation of the identified requirements into a CEG and (3) a study demonstrating that our proposed solution leads to 86 % time savings for test model creation without loss of quality.
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)