AITopics | kolmogorov complexity

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > British Columbia (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh

Causal Inference via Kernel Deviance Measures

Neural Information Processing Systems, Feb-13-2026, 06:19:07 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, conditional distribution, machine learning, (18 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning, Jan-7-2026

From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence

Finzi, Marc, Qiu, Shikai, Jiang, Yiding, Izmailov, Pavel, Kolter, J. Zico, Wilson, Andrew Gordon

Can we learn more from data than existed in the generating process itself? Can new and useful information be constructed from merely applying deterministic transformations to existing data? Can the learnable content in data be evaluated without considering a downstream task? On these questions, Shannon information and Kolmogorov complexity come up nearly empty-handed, in part because they assume observers with unlimited computational capacity and fail to target the useful information content. In this work, we identify and exemplify three seeming paradoxes in information theory: (1) information cannot be increased by deterministic transformations; (2) information is independent of the order of data; (3) likelihood modeling is merely distribution matching. To shed light on the tension between these results and modern practice, and to quantify the value of data, we introduce epiplexity, a formalization of information capturing what computationally bounded observers can learn from data. Epiplexity captures the structural content in data while excluding time-bounded entropy, the random unpredictable content exemplified by pseudorandom number generators and chaotic dynamical systems. With these concepts, we demonstrate how information can be created with computation, how it depends on the ordering of the data, and how likelihood modeling can produce more complex programs than present in the data generating process itself. We also present practical procedures to estimate epiplexity which we show capture differences across data sources, track with downstream performance, and highlight dataset interventions that improve out-of-distribution generalization. In contrast to principles of model selection, epiplexity provides a theoretical foundation for data selection, guiding how to select, generate, or transform data for learning systems.
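The contrast the abstract draws between structural content and time-bounded entropy can be illustrated with a rough sketch. It assumes zlib as a stand-in for a computationally bounded observer and uses compressed size as a crude, computable proxy for description length; it is not the paper's epiplexity estimator, and the helper names `compressed_size` and `pseudorandom_bytes` are hypothetical. The pseudorandom stream is fully determined by a short program and a seed, yet the bounded compressor finds almost nothing to exploit in it, while a simple periodic stream shrinks dramatically.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes after zlib compression: a crude upper-bound proxy for description length."""
    return len(zlib.compress(data, 9))


def pseudorandom_bytes(seed: int, n: int) -> bytes:
    """n bytes from Python's seeded Mersenne Twister; fully determined by (seed, n)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


n = 100_000
structured = bytes(i % 7 for i in range(n))   # simple periodic structure
noise = pseudorandom_bytes(seed=0, n=n)       # short generating program, but looks random

structured_size = compressed_size(structured)
noise_size = compressed_size(noise)
print(structured_size, noise_size)
```

Under these assumptions the pseudorandom stream stays close to its raw size even though a few lines of code regenerate it exactly, which is the gap between unbounded and bounded notions of information that the abstract points at.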

information, large language model, machine learning, (19 more...)

arXiv.org Machine Learning

2601.0322

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.63)

Industry:

Education (0.92)
Information Technology > Security & Privacy (0.67)
Leisure & Entertainment > Games > Chess (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Neural Information Processing Systems, Dec-24-2025, 15:03:34 GMT

Causal Discovery from Event Sequences by Local Cause-Effect Attribution

Sequences of events, such as crashes in the stock market or outages in a network, contain strong temporal dependencies, whose understanding is crucial to react to and influence future events. In this paper, we study the problem of discovering the underlying causal structure from event sequences. To this end, we introduce a new causal model, where individual events of the cause trigger events of the effect with dynamic delays. We show that in contrast to existing methods based on Granger causality, our model is identifiable for both instant and delayed effects. We base our approach on the Algorithmic Markov Condition, by which we identify the true causal network as the one that minimizes the Kolmogorov complexity. As the Kolmogorov complexity is not computable, we instantiate our model using Minimum Description Length and show that the resulting score identifies the causal direction. To discover causal graphs, we introduce the Cascade algorithm, which adds edges in topological order. Extensive evaluation shows that Cascade outperforms existing methods in settings with instantaneous effects, noise, and multiple colliders, and discovers insightful causal graphs on real-world data.
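The idea of replacing Kolmogorov complexity with a computable description length and picking the cheaper causal direction can be sketched with a toy two-part code. This is not the Cascade algorithm or the paper's score: it assumes binary event sequences, a single fixed delay, and a Bernoulli code for residuals, and all function names are hypothetical. Because the toy only allows non-negative delays, it separates directions through time ordering alone and would not distinguish instantaneous effects.

```python
import math
import random


def bernoulli_bits(seq):
    """Bits to encode a binary sequence: the count of ones, then the data under the fitted Bernoulli model."""
    n, k = len(seq), sum(seq)
    p = k / n
    entropy = 0.0 if p in (0.0, 1.0) else -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return math.log2(n + 1) + n * entropy


def conditional_bits(effect, cause, max_delay):
    """Bits to encode `effect` given `cause`: the chosen delay plus the residual (effect XOR delayed cause)."""
    best = math.inf
    for d in range(max_delay + 1):
        shifted = [0] * d + cause[: len(cause) - d]
        residual = [e ^ c for e, c in zip(effect, shifted)]
        best = min(best, bernoulli_bits(residual))
    return math.log2(max_delay + 1) + best


def direction_scores(x, y, max_delay=5):
    """Total description length in bits under the models X -> Y and Y -> X."""
    x_to_y = bernoulli_bits(x) + conditional_bits(y, x, max_delay)
    y_to_x = bernoulli_bits(y) + conditional_bits(x, y, max_delay)
    return x_to_y, y_to_x


rng = random.Random(0)
n, delay = 2000, 3
x = [1 if rng.random() < 0.1 else 0 for _ in range(n)]
# Each cause event triggers an effect event `delay` steps later; about 1% of positions are flipped as noise.
y = [(x[t - delay] if t >= delay else 0) ^ (1 if rng.random() < 0.01 else 0) for t in range(n)]

x_to_y, y_to_x = direction_scores(x, y)
print(round(x_to_y), round(y_to_x))
```

On this synthetic data the X -> Y model needs fewer bits, since the delayed cause explains almost all effect events, whereas the reverse model has to encode both sequences nearly from scratch.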

artificial intelligence, machine learning, proceedings, (5 more...)

Country: North America > United States (0.07)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.83)

arXiv.org Artificial Intelligence, Dec-8-2025

On the Holographic Geometry of Deterministic Computation

Nye, Logan

Standard simulations of Turing machines suggest a linear relationship between the temporal duration $t$ of a run and the amount of information that must be stored by known simulations to certify, verify, or regenerate the configuration at time $t$. For deterministic multitape Turing machines over a fixed finite alphabet, this apparent linear dependence is not intrinsic: any length-$t$ run can be simulated using $O(\sqrt{t})$ work-tape cells via a Height Compression Theorem for succinct computation trees together with an Algebraic Replay Engine. In this paper we recast that construction in geometric and information-theoretic language. We interpret the execution trace as a spacetime DAG of local update events and exhibit a family of recursively defined holographic boundary summaries such that, along the square-root-space simulation, the total description length of all boundary data stored at any time is $O(\sqrt{t})$. Using Kolmogorov complexity, we prove that every internal configuration has constant conditional description complexity given the appropriate boundary summary and time index, establishing that the spacetime bulk carries no additional algorithmic information beyond its boundary. We express this as a one-dimensional computational area law: there exists a simulation in which the information capacity of the active "holographic screen" needed to generate a spacetime region of volume proportional to $t$ is bounded by $O(\sqrt{t})$. In this precise sense, deterministic computation on a one-dimensional work tape admits a holographic representation, with the bulk history algebraically determined by data residing on a lower-dimensional boundary screen.

artificial intelligence, configuration, kolmogorov complexity, (14 more...)

2512.00607

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence (0.89)

Chung, Woojin, Kim, Jeonghoon

Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training

arXiv.org Artificial Intelligence, Dec-1-2025

Large language models are trained with tokenizers, and the resulting token distribution is highly imbalanced: a few words dominate the stream while most occur rarely. Recent practice favors ever-larger vocabularies, but it is unclear where the benefit comes from. To this end, we perform a controlled study that scales the vocabulary of the language model from 24K to 196K while holding data, computation, and optimization unchanged. We begin by quantifying the complexity of tokenized text -- formalized via Kolmogorov complexity -- and show that larger vocabularies reduce this complexity. Above 24K, every common word is already tokenized as a single token, so enlarging vocabulary only deepens the relative token-frequency imbalance. Word-level loss decomposition shows that larger vocabularies reduce cross-entropy loss almost exclusively by lowering uncertainty on the 2,500 most frequent words, even though loss on the rare tail rises. The same frequent words cover roughly 75% of tokens in downstream benchmarks, so this training advantage transfers intact. We further show that enlarging model parameters with a fixed vocabulary yields the same frequent-word benefit. Our results recast "bigger vocabularies help" as "lowering complexity of tokenized text helps," offering a simple, principled knob for tokenizer-model co-design and clarifying the loss dynamics that govern language model scaling in pre-training.
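The coverage statistic mentioned in the abstract (the share of a token stream taken up by the most frequent words) is straightforward to compute. The sketch below is only an illustration under simplifying assumptions: it uses whitespace splitting instead of a trained tokenizer, a toy sentence instead of a benchmark corpus, and a hypothetical helper name `top_k_coverage`.

```python
from collections import Counter


def top_k_coverage(tokens, k):
    """Fraction of the token stream accounted for by the k most frequent token types."""
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(k))
    return covered / len(tokens)


text = "the cat sat on the mat and the dog sat on the log"
tokens = text.split()  # whitespace split as a stand-in for a real tokenizer

print(top_k_coverage(tokens, 1))  # share covered by the single most frequent type
print(top_k_coverage(tokens, 3))  # share covered by the three most frequent types
```

Applied to a real tokenized corpus with k set to 2,500, the same function would give the kind of coverage figure the abstract reports, though the exact value depends on the tokenizer and data.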

large language model, machine learning, natural language, (17 more...)

2508.1539

Country:

Europe > Austria > Vienna (0.14)
North America > Canada > British Columbia > Vancouver (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh

Causal Inference via Kernel Deviance Measures

Neural Information Processing Systems, Nov-20-2025, 17:27:02 GMT

In many areas of science, we strive to answer questions that are fundamentally causal in nature.

causal direction, complexity, conditional distribution, (16 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Neural Information Processing Systems, Nov-15-2025, 07:02:34 GMT

Causal Discovery from Event Sequences by Local Cause-Effect Attribution

Suppose we are considering a multivariate event sequence. What caused a specific event to happen? Which variables are causes of each other?

event sequence, experiment, node, (16 more...)

Country:

North America > United States (0.04)
Oceania > Australia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan (0.04)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.47)

arXiv.org Artificial Intelligence, Nov-4-2025

The Limits of AI Explainability: An Algorithmic Information Theory Approach

Rao, Shrisha

This paper establishes a theoretical foundation for understanding the fundamental limits of AI explainability through algorithmic information theory. We formalize explainability as the approximation of complex models by simpler ones, quantifying both approximation error and explanation complexity using Kolmogorov complexity. Our key theoretical contributions include: (1) a complexity gap theorem proving that any explanation significantly simpler than the original model must differ from it on some inputs; (2) precise bounds showing that explanation complexity grows exponentially with input dimension but polynomially with error tolerance for Lipschitz functions; and (3) a characterization of the gap between local and global explainability, demonstrating that local explanations can be significantly simpler while maintaining accuracy in relevant regions. We further establish a regulatory impossibility theorem proving that no governance framework can simultaneously pursue unrestricted AI capabilities, human-interpretable explanations, and negligible error. These results highlight considerations likely to be relevant to the design, evaluation, and oversight of explainable AI systems.

artificial intelligence, machine learning, natural language, (20 more...)

2504.20676

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > West Sussex (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Banking & Finance (0.92)
Government > Regional Government > North America Government > United States Government (0.92)
Law > Statutes (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.36)

arXiv.org Artificial Intelligence, Oct-21-2025

Epistemic Trade-Off: An Analysis of the Operational Breakdown and Ontological Limits of "Certainty-Scope" in AI

Immediato, Generoso

The recently published "certainty-scope" conjecture offers a compelling insight into the inherent trade-off present within artificial intelligence (AI) systems. As general research, this investigation remains vital as a philosophical undertaking and a potential guide for directing AI investments, design, and deployment, especially in safety-critical and mission-critical domains where risk levels are substantially elevated. While maintaining intellectual coherence, its formalization ultimately consolidates this insight into a suspended epistemic truth, which resists operational implementation within practical systems. This paper argues that the conjecture's objective to furnish insights for engineering design and regulatory decision-making is limited by two fundamental factors: first, its dependence on incomputable constructs and its failure to capture the generality factors of AI, rendering it practically unimplementable and unverifiable; second, its foundational ontological assumption of AI systems as self-contained epistemic entities, distancing it from the complex and dynamic socio-technical environments where knowledge is co-constructed. We conclude that this dual breakdown -- an epistemic closure deficit and an embeddedness bypass -- hinders the conjecture's transition to a practical and actionable framework suitable for informing and guiding AI deployments. In response, we point towards a possible framing of the epistemic challenge, emphasizing the inherent epistemic burdens of AI within complex human-centric domains. Keywords: artificial intelligence (AI), AI governance, algorithmic information theory (AIT), certainty-scope trade-off, complex systems, computability & operationalization, epistemic entanglement, epistemic certainty, hybrid AI systems, information theory, Kolmogorov complexity, risk-based assurance, safety-critical AI, socio-technical systems, verification and validation (V&V).

artificial intelligence, generoso immediato generoso, machine learning, (13 more...)

2508.19304

Country:

North America > United States > Indiana > Monroe County > Bloomington (0.04)
North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Industry: Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)