Minimum Complexity Machines


Fast and fully-automated histograms for large-scale data sets

arXiv.org Artificial Intelligence

G-Enum histograms are a new fast and fully automated method for irregular histogram construction. By framing histogram construction as a density estimation problem and its automation as a model selection task, these histograms leverage the Minimum Description Length principle (MDL) to derive two different model selection criteria. Several proven theoretical results about these criteria give insight into their asymptotic behavior and are used to speed up their optimisation. These insights, combined with a greedy search heuristic, are used to construct histograms in linearithmic time rather than the polynomial time incurred by previous works. The capabilities of the proposed MDL density estimation method are illustrated with reference to other fully automated methods in the literature, on both synthetic and large real-world data sets.
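
For a flavour of the kind of criterion at play, the sketch below computes a naive two-part MDL score for an irregular histogram (a model cost for the cut points plus the negative log-likelihood of the resulting density); the cost assignments are illustrative assumptions, not the G-Enum criteria derived in the paper.

import numpy as np

def two_part_mdl_score(data, cut_points):
    """Illustrative two-part MDL score for an irregular histogram.

    L(model): bits to state the number of bins and the interior cut points
              (here: a naive uniform code over n candidate positions).
    L(data | model): negative log2-likelihood of the histogram density.
    A toy criterion for illustration, not the G-Enum criterion from the paper.
    """
    data = np.sort(np.asarray(data, dtype=float))
    n = len(data)
    edges = np.concatenate(([data[0]], np.sort(cut_points), [data[-1]]))
    counts, _ = np.histogram(data, bins=edges)
    widths = np.diff(edges)

    # Model cost: state K, then K-1 interior cut points among n candidates.
    k = len(counts)
    model_bits = np.log2(n) + (k - 1) * np.log2(n)

    # Data cost: -log2 of the piecewise-constant density at each point.
    density = counts / (n * widths)
    data_bits = -np.sum(counts * np.log2(np.maximum(density, 1e-300)))
    return model_bits + data_bits

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
print(two_part_mdl_score(x, cut_points=[-1.0, 0.0, 1.0]))
print(two_part_mdl_score(x, cut_points=np.linspace(-2, 2, 30)))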


MDL-based Compressing Sequential Rules

arXiv.org Artificial Intelligence

With the rapid development of the Internet, the era of big data has arrived: the Internet generates huge amounts of data every day, and extracting meaningful information from it is like looking for a needle in a haystack. Data mining techniques provide feasible ways to address this problem. Many sequential rule mining (SRM) algorithms have been proposed to find sequential rules in databases with sequential characteristics, and these rules help people extract meaningful information from massive amounts of data. How can we compress the mined results and reduce their size to save storage space and transmission time? Until now, there has been little research on compression in SRM. In this paper, combining the Minimum Description Length (MDL) principle with the two standard metrics of support and confidence, we introduce the problem of compressing SRM results and propose a solution named ComSR for MDL-based compression of sequential rules, built on a purpose-designed sequential rule coding scheme. To our knowledge, we are the first to use sequential rules to encode an entire database. A heuristic method is proposed to find as compact and meaningful a set of sequential rules as possible. ComSR has two trade-off algorithms, ComSR_non and ComSR_ful, depending on whether the database can be completely compressed. Experiments on a real dataset with different thresholds show that compact and meaningful sets of sequential rules can indeed be found, demonstrating that the proposed method works.
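
To illustrate the kind of trade-off such a coding scheme formalises, the sketch below computes a toy two-part description length for a sequence database under a pattern table; the greedy cover and the uniform codeword costs are assumptions made for illustration and are not ComSR's coding scheme.

import math

def description_length(database, patterns):
    """Toy two-part MDL cost of a sequence database under a pattern table.

    Each pattern gets one codeword; occurrences in a sequence are greedily
    replaced by that codeword, and leftover symbols are coded individually.
    All codewords are charged a uniform log2(|alphabet| + |patterns|) bits.
    An illustrative scheme, not ComSR's sequential rule coding scheme.
    """
    alphabet = {s for seq in database for s in seq}
    code_bits = math.log2(len(alphabet) + len(patterns))

    # L(model): cost of writing down each pattern symbol by symbol.
    model_bits = sum(len(p) * math.log2(len(alphabet)) for p in patterns)

    # L(data | model): greedy left-to-right cover of each sequence.
    data_bits = 0.0
    for seq in database:
        i = 0
        while i < len(seq):
            match = next((p for p in sorted(patterns, key=len, reverse=True)
                          if tuple(seq[i:i + len(p)]) == p), None)
            i += len(match) if match else 1
            data_bits += code_bits
    return model_bits + data_bits

db = [list("abcabcxabc"), list("abcyyabc")]
print(description_length(db, patterns=[]))               # raw encoding
print(description_length(db, patterns=[tuple("abc")]))   # with one pattern: fewer bits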


Natural Language Syntax Complies with the Free-Energy Principle

arXiv.org Artificial Intelligence

Natural language syntax yields an unbounded array of hierarchically structured expressions. We claim that these are used in the service of active inference in accord with the free-energy principle (FEP). While conceptual advances alongside modelling and simulation work have attempted to connect speech segmentation and linguistic communication with the FEP, we extend this program to the underlying computations responsible for generating syntactic objects. We argue that recently proposed principles of economy in language design - such as "minimal search" criteria from theoretical syntax - adhere to the FEP. This affords a greater degree of explanatory power to the FEP - with respect to higher language functions - and offers linguistics a grounding in first principles with respect to computability. We show how both tree-geometric depth and a Kolmogorov complexity estimate (recruiting a Lempel-Ziv compression algorithm) can be used to accurately predict legal operations on syntactic workspaces, directly in line with formulations of variational free energy minimization. This is used to motivate a general principle of language design that we term Turing-Chomsky Compression (TCC). We use TCC to align concerns of linguists with the normative account of self-organization furnished by the FEP, by marshalling evidence from theoretical linguistics and psycholinguistics to ground core principles of efficient syntactic computation within active inference.
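
Kolmogorov complexity estimates of this kind are commonly obtained from off-the-shelf Lempel-Ziv compressors; here is a minimal sketch applied to made-up bracketed encodings of two hypothetical syntactic workspaces (the tree strings and the use of zlib's DEFLATE are illustrative assumptions, not the paper's exact procedure).

import zlib

def lz_complexity_estimate(s: str) -> int:
    """Crude upper bound on Kolmogorov complexity: length in bytes of the
    DEFLATE (LZ77 + Huffman) compression of the string."""
    return len(zlib.compress(s.encode("utf-8"), level=9))

# Hypothetical bracketed encodings of two candidate workspace states.
workspace_a = "[S [NP the dog] [VP chased [NP the cat]]]"
workspace_b = "[S [NP the dog] [VP chased [NP the cat [S that [VP saw [NP the dog]]]]]]"

for label, ws in [("A", workspace_a), ("B", workspace_b)]:
    print(label, lz_complexity_estimate(ws), "bytes")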


Learning Logic Programs From Noisy Failures

arXiv.org Artificial Intelligence

Inductive Logic Programming (ILP) is a form of machine learning (ML) which, in contrast to many other state-of-the-art ML methods, typically produces highly interpretable and reusable models. However, many ILP systems lack the ability to learn naturally from noisy or partially misclassified training data. We introduce the relaxed learning from failures approach to ILP, a noise-handling modification of the previously introduced learning from failures (LFF) approach, which is incapable of handling noise. We additionally introduce the novel Noisy Popper ILP system, which implements this relaxed approach and is a modification of the existing Popper system. Like Popper, Noisy Popper uses a generate-test-constrain loop to search its hypothesis space, wherein failed hypotheses are used to construct hypothesis constraints. These constraints are used to prune the hypothesis space, making the hypothesis search more efficient. However, in the relaxed setting, constraints are generated more laxly so as to avoid letting noisy training data produce hypothesis constraints that prune optimal hypotheses. Constraints unique to the relaxed setting are generated via hypothesis comparison. Additional constraints are generated by weighing the accuracy of hypotheses against their sizes to avoid overfitting, through an application of the minimum description length principle. We support this new setting through theoretical proofs as well as experimental results which suggest that Noisy Popper improves the noise-handling capabilities of Popper, but at the cost of overall runtime efficiency.
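
The size-versus-accuracy trade-off can be pictured as a simple MDL-style score in which a hypothesis pays for its own length plus a list of the examples it misclassifies; the weights below are assumptions chosen for illustration, not Noisy Popper's actual scoring function.

import math

def mdl_score(hypothesis_size, num_errors, num_examples):
    """Toy MDL-style score: bits for the hypothesis plus bits to list its
    exceptions (misclassified examples). Lower is better. Illustrative only;
    not the exact criterion used by Noisy Popper."""
    hypothesis_bits = hypothesis_size * 8          # assume 8 bits per literal
    exception_bits = num_errors * math.log2(max(num_examples, 2))
    return hypothesis_bits + exception_bits

# A small, slightly inaccurate program can beat a large, perfectly fitting one.
print(mdl_score(hypothesis_size=4, num_errors=3, num_examples=100))   # ~51.9
print(mdl_score(hypothesis_size=20, num_errors=0, num_examples=100))  # 160.0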


First Steps of an Approach to the ARC Challenge based on Descriptive Grid Models and the Minimum Description Length Principle

arXiv.org Artificial Intelligence

The Abstraction and Reasoning Corpus (ARC) was recently introduced by François Chollet as a tool to measure broad intelligence in both humans and machines. It is very challenging, and the best approach in a Kaggle competition could only solve 20% of the tasks, relying on brute-force search over chains of hand-crafted transformations. In this paper, we present first steps in exploring an approach based on descriptive grid models and the Minimum Description Length (MDL) principle. The grid models describe the contents of a grid and support both parsing grids and generating grids. The MDL principle is used to guide the search for good models, i.e. models that compress the grids the most. We report on our progress over a year, improving both the general approach and the models. Out of the 400 training tasks, our performance increased from 5 to 29 solved tasks, using only 30 s of computation time per task. Our approach not only predicts the output grids, but also outputs an intelligible model and explanations of how it was incrementally built.
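
To make "models that compress the grids the most" concrete, the sketch below compares a raw cell-by-cell encoding of a toy grid against a naive "background colour plus exceptions" model; both the model class and the cost assignments are invented for illustration and are much simpler than the paper's grid models.

import math
from collections import Counter

NUM_COLORS = 10  # ARC grids use colours 0-9

def raw_bits(grid):
    """Cost of naming every cell's colour directly."""
    return len(grid) * len(grid[0]) * math.log2(NUM_COLORS)

def background_plus_points_bits(grid):
    """Toy model: one background colour, then (row, col, colour) per exception."""
    h, w = len(grid), len(grid[0])
    background, _ = Counter(c for row in grid for c in row).most_common(1)[0]
    exceptions = sum(1 for row in grid for c in row if c != background)
    return (math.log2(NUM_COLORS)
            + exceptions * (math.log2(h) + math.log2(w) + math.log2(NUM_COLORS)))

grid = [[0, 0, 0, 0],
        [0, 3, 3, 0],
        [0, 0, 0, 0]]
print(raw_bits(grid))                     # ~39.9 bits
print(background_plus_points_bits(grid))  # ~17.1 bits: the model compresses the grid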


Causal Direction of Data Collection Matters: Implications of Causal and Anticausal Learning for NLP

arXiv.org Artificial Intelligence

The principle of independent causal mechanisms (ICM) states that the generative processes of real-world data consist of independent modules which do not influence or inform each other. While this idea has led to fruitful developments in the field of causal inference, it is not widely known in the NLP community. In this work, we argue that the causal direction of the data collection process bears nontrivial implications that can explain a number of published NLP findings, such as differences in semi-supervised learning (SSL) and domain adaptation (DA) performance across different settings. We categorize common NLP tasks according to their causal direction and empirically assay the validity of the ICM principle for text data using minimum description length. We conduct an extensive meta-analysis of over 100 published SSL studies and 30 DA studies, and find that the results are consistent with our expectations based on causal insights. This work presents the first attempt to analyze the ICM principle in NLP, and provides constructive suggestions for future modeling choices. Code available at https://github.com/zhijing-jin/icm4nlp
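
One generic way to carry out such an MDL-based assay on discrete data is to compare prequential (plug-in) code lengths under the two factorizations P(X)P(Y|X) and P(Y)P(X|Y); the Laplace-smoothed coder and the toy data below are assumptions for illustration and do not reproduce the paper's estimator or its findings.

import math
import random
from collections import defaultdict

def prequential_bits(symbols, contexts=None):
    """Prequential (plug-in) code length in bits with add-one smoothing,
    optionally conditioning each symbol on a parallel context stream."""
    alphabet = sorted(set(symbols))
    counts = defaultdict(lambda: {a: 0 for a in alphabet})
    bits = 0.0
    for i, s in enumerate(symbols):
        ctx = contexts[i] if contexts is not None else None
        total = sum(counts[ctx].values())
        bits -= math.log2((counts[ctx][s] + 1) / (total + len(alphabet)))
        counts[ctx][s] += 1
    return bits

# Toy "cause" X and noisy "effect" Y (made-up data, purely to show the mechanics).
random.seed(0)
X = [random.choice("abc") for _ in range(500)]
Y = [x.upper() if random.random() < 0.9 else random.choice("ABC") for x in X]

print("L(X) + L(Y|X) =", prequential_bits(X) + prequential_bits(Y, contexts=X))
print("L(Y) + L(X|Y) =", prequential_bits(Y) + prequential_bits(X, contexts=Y))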


Computing Complexity-aware Plans Using Kolmogorov Complexity

arXiv.org Artificial Intelligence

In this paper, we introduce complexity-aware planning for finite-horizon deterministic finite automata with rewards as outputs, based on Kolmogorov complexity. Kolmogorov complexity is used because it can detect the computational regularities of deterministic optimal policies. We present a planning objective that yields an explicit trade-off between a policy's performance and its complexity. We prove that maximising this objective is non-trivial in the sense that dynamic programming is infeasible. We present two algorithms for obtaining low-complexity policies: the first obtains a low-complexity optimal policy, and the second finds a policy that maximises performance while maintaining local (stage-wise) complexity constraints. We evaluate the algorithms on a simple navigation task for a mobile robot, where they yield low-complexity policies that concur with intuition.
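
A common practical proxy for the (uncomputable) Kolmogorov complexity of a policy is the compressed length of its action sequence; the sketch below scores candidate open-loop policies by reward minus such a penalty. The weighting, the zlib proxy, and the toy task are assumptions for illustration, and correspond to neither the paper's objective nor its two algorithms.

import zlib

def complexity_bits(actions):
    """LZ-based proxy for the Kolmogorov complexity of an action sequence."""
    return 8 * len(zlib.compress("".join(actions).encode(), level=9))

def complexity_aware_score(actions, reward_fn, weight=1.0):
    """Reward minus a complexity penalty: an illustrative trade-off objective."""
    return reward_fn(actions) - weight * complexity_bits(actions)

# Toy navigation: reward counts how far right the robot ends up after 12 steps.
reward = lambda acts: sum(+1 if a == "R" else -1 for a in acts)

regular = list("RRRRRRRRRRRR")        # simple, highly compressible
irregular = list("RRLRRRLRRRRL")      # slightly erratic, less compressible
for name, policy in [("regular", regular), ("irregular", irregular)]:
    print(name, complexity_aware_score(policy, reward, weight=0.05))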


Discovering Useful Compact Sets of Sequential Rules in a Long Sequence

arXiv.org Artificial Intelligence

We are interested in understanding the underlying generation process for long sequences of symbolic events. To do so, we propose COSSU, an algorithm to mine small and meaningful sets of sequential rules. The rules are selected using an MDL-inspired criterion that favors compactness and relies on a novel rule-based encoding scheme for sequences. Our evaluation shows that COSSU can successfully retrieve relevant sets of closed sequential rules from a long sequence. Such rules constitute an interpretable model that exhibits competitive accuracy for the tasks of next-element prediction and classification.
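
As a usage illustration of how a compact rule set supports next-element prediction, the sketch below matches rule antecedents against the suffix of a history and returns the most confident consequent; the rule format and matching policy are simplified assumptions, not COSSU's procedure.

def predict_next(history, rules):
    """Predict the next symbol from (antecedent, consequent, confidence) rules:
    among rules whose antecedent is a suffix of the history, return the
    consequent of the most confident one. Simplified illustration, not COSSU."""
    matching = [(conf, cons) for ante, cons, conf in rules
                if tuple(history[-len(ante):]) == tuple(ante)]
    return max(matching)[1] if matching else None

rules = [
    (("a", "b"), "c", 0.9),
    (("b",), "a", 0.6),
    (("c",), "a", 0.7),
]
print(predict_next(list("aab"), rules))   # -> 'c' (rule ab -> c beats b -> a)
print(predict_next(list("abc"), rules))   # -> 'a'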


Rissanen Data Analysis: Examining Dataset Characteristics via Description Length

arXiv.org Artificial Intelligence

We introduce a method to determine whether a certain capability helps to achieve an accurate model of given data. We view labels as being generated from the inputs by a program composed of subroutines with different capabilities, and we posit that a subroutine is useful if and only if the minimal program that invokes it is shorter than the one that does not. Since minimum program length is uncomputable, we instead estimate the labels' minimum description length (MDL) as a proxy, giving us a theoretically grounded method for analyzing dataset characteristics. We call the method Rissanen Data Analysis (RDA) after the father of MDL, and we showcase its applicability in a wide variety of NLP settings, ranging from evaluating the utility of generating subquestions before answering a question, to analyzing the value of rationales and explanations, to investigating the importance of different parts of speech, to uncovering dataset gender bias.
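
In practice such MDL estimates are often computed prequentially (by online coding): a model is fit on a growing prefix of the data and charged the negative log-likelihood of the next block of labels. The sketch below follows that recipe with scikit-learn's logistic regression (assumed to be available) standing in for the paper's learners; the block schedule and toy data are also assumptions for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression  # assumed to be installed

def blockwise_mdl_bits(X, y, block_size=100, eps=1e-6):
    """Block-wise prequential MDL estimate of labels y given features X:
    each block is charged -log2 p(labels) under a model fit on all earlier
    blocks (the first block gets a uniform code). A sketch of the general
    recipe behind RDA, not the paper's exact models or block schedule."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = list(np.unique(y))
    bits = 0.0
    for start in range(0, len(y), block_size):
        Xb, yb = X[start:start + block_size], y[start:start + block_size]
        if start == 0:
            bits += len(yb) * np.log2(len(classes))
            continue
        model = LogisticRegression(max_iter=1000).fit(X[:start], y[:start])
        proba = model.predict_proba(Xb)
        cols = [list(model.classes_).index(c) for c in yb]
        bits += -np.sum(np.log2(np.clip(proba[np.arange(len(yb)), cols], eps, 1)))
    return bits

# Toy check: labels depend on feature 0; dropping it should lengthen the code.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = (X[:, 0] > 0).astype(int)
print(blockwise_mdl_bits(X, y))            # informative feature kept
print(blockwise_mdl_bits(X[:, 1:], y))     # "capability" removed: more bits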


Sequential prediction under log-loss and misspecification

arXiv.org Machine Learning

We consider the question of sequential prediction under log-loss in terms of cumulative regret. Namely, given a hypothesis class of distributions, the learner sequentially predicts the (distribution of the) next letter in the sequence, and its performance is compared to the baseline of the best constant predictor from the hypothesis class. The well-specified case corresponds to the additional assumption that the data-generating distribution belongs to the hypothesis class as well. Here we present results in the more general misspecified case. Due to special properties of the log-loss, the same problem arises in the context of competitive optimality in density estimation and model selection. For the $d$-dimensional Gaussian location hypothesis class, we show that the cumulative regrets in the well-specified and misspecified cases asymptotically coincide. In other words, we provide an $o(1)$ characterization of the distribution-free (or PAC) regret in this case, which is, as far as we know, the first such result. We recall that the worst-case (or individual-sequence) regret in this case is larger by an additive constant ${d\over 2} + o(1)$. Surprisingly, neither the traditional Bayesian estimators nor Shtarkov's normalized maximum likelihood achieve the PAC regret, and our estimator requires a special "robustification" against heavy-tailed data. In addition, we show two general results for the misspecified regret: the existence and uniqueness of the optimal estimator, and a bound sandwiching the misspecified regret between well-specified regrets for (asymptotically) close hypothesis classes.
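
To make cumulative regret under log-loss against the best constant predictor concrete, here is a minimal numerical sketch for the one-dimensional Gaussian location family with known variance; it uses a simple plug-in (running-mean) forecaster rather than the robustified estimator constructed in the paper.

import numpy as np

def cumulative_log_loss(mu_preds, data, sigma=1.0):
    """Sum over t of -log q_t(x_t), where q_t = N(mu_preds[t], sigma^2)."""
    return np.sum(0.5 * np.log(2 * np.pi * sigma**2)
                  + (data - mu_preds) ** 2 / (2 * sigma**2))

rng = np.random.default_rng(0)
x = rng.normal(loc=0.3, scale=1.0, size=10_000)   # well-specified data source

# Plug-in sequential forecaster: predict N(mean of past observations, 1).
past_means = np.concatenate(([0.0], np.cumsum(x)[:-1] / np.arange(1, len(x))))
learner_loss = cumulative_log_loss(past_means, x)

# Best constant predictor from the hypothesis class, chosen in hindsight.
best_constant_loss = cumulative_log_loss(np.full_like(x, x.mean()), x)

# Roughly (1/2) * ln(n) + O(1) in expectation for this plug-in rule.
print("cumulative regret (nats):", learner_loss - best_constant_loss)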