AITopics | Minimum Complexity Machines

Minimum Complexity Machines

Model selection by minimum description length: Lower-bound sample sizes for the Fisher information approximation

Heck, Daniel W., Moshagen, Morten, Erdfelder, Edgar

arXiv.org Machine Learning, Aug-1-2018

The Fisher information approximation (FIA) is an implementation of the minimum description length principle for model selection. Unlike information criteria such as AIC or BIC, it has the advantage of taking the functional form of a model into account. Unfortunately, FIA can be misleading in finite samples, resulting in an inversion of the correct rank order of complexity terms for competing models in the worst case. As a remedy, we propose a lower-bound $N'$ for the sample size that suffices to preclude such errors. We illustrate the approach using three examples from the family of multinomial processing tree models.
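For orientation, the criterion discussed in this abstract is commonly written as below. This is the standard textbook form of FIA rather than a formula quoted from the listing; the symbols $x$ (data), $\hat\theta$ (maximum-likelihood estimate), $k$ (number of free parameters), $N$ (sample size), $\Theta$ (parameter space) and $I(\theta)$ (expected Fisher information of a single observation) are notation assumed here.

```latex
\mathrm{FIA} \;=\; -\log f\!\left(x \mid \hat\theta\right)
  \;+\; \frac{k}{2}\,\log\frac{N}{2\pi}
  \;+\; \log \int_{\Theta} \sqrt{\det I(\theta)}\;\mathrm{d}\theta
```

The last two terms form the complexity penalty. The second grows with $N$ while the third does not, so for small $N$ the comparison of penalties between two models can come out differently than it does for large $N$, which is the kind of rank inversion the abstract refers to.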

artificial intelligence, machine learning, model selection, (16 more...)

doi: 10.1016/j.jmp.2014.06.002

1808.00212

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.65)

Binary Matrix Factorization via Dictionary Learning

Ramirez, Ignacio

arXiv.org Machine Learning, Jul-25-2018

Matrix factorization is a key tool in data analysis; its applications include recommender systems, correlation analysis, and signal processing, among others. Binary matrices are a particular case which has received significant attention for over thirty years, especially within the field of data mining. Dictionary learning refers to a family of methods for learning overcomplete bases (also called frames) in order to efficiently encode samples of a given type; this area, now also about twenty years old, was mostly developed within the signal processing field. In this work we propose two binary matrix factorization methods based on an adaptation of the dictionary learning paradigm to binary matrices. The proposed algorithms focus on speed and scalability; they work with binary factors combined with bit-wise operations and a few auxiliary integer ones. Furthermore, the methods are readily applicable to online binary matrix factorization. Another important issue in matrix factorization is the choice of rank for the factors; we address this model selection problem with an efficient method based on the Minimum Description Length principle. Our preliminary results show that the proposed methods are effective at producing interpretable factorizations of data of various types.

health & medicine, operation, survey article, (20 more...)

doi: 10.1109/JSTSP.2018.2875674

1804.05482

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.48)

High-dimensional Penalty Selection via Minimum Description Length Principle

Miyaguchi, Kohei, Yamanishi, Kenji

arXiv.org Machine Learning, Apr-26-2018

We tackle the problem of penalty selection for regularization on the basis of the minimum description length (MDL) principle. In particular, we consider the case in which the design space of the penalty function is high-dimensional. In this situation, the luckiness-normalized-maximum-likelihood (LNML) minimization approach is favorable, because LNML quantifies the goodness of regularized models with any form of penalty function in view of the minimum description length principle, and guides us to a good penalty function through the high-dimensional space. However, the minimization of LNML entails two major challenges: 1) the computation of the normalizing factor of LNML and 2) its minimization in high-dimensional spaces. In this paper, we present a novel regularization selection method (MDL-RS), in which a tight upper bound of LNML (uLNML) is minimized with a local convergence guarantee. Our main contribution is the derivation of uLNML, which is a uniform-gap upper bound of LNML in an analytic expression. This solves the above challenges in an approximate manner because it allows us to accurately approximate LNML and then efficiently minimize it. The experimental results show that MDL-RS improves the generalization performance of regularized estimates, specifically when the model has redundant parameters.

artificial intelligence, bayesian inference, ulnml, (16 more...)

1804.09904

Country: Asia > Japan (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Finite Biased Teaching with Infinite Concept Classes

Hernandez-Orallo, Jose, Telle, Jan Arne

arXiv.org Artificial Intelligence, Apr-19-2018

We investigate the teaching of infinite concept classes through the effect of the learning bias (which is used by the learner to prefer some concepts over others and by the teacher to devise the teaching examples) and the sampling bias (which determines how the concepts are sampled from the class). We analyse two important classes: Turing machines and finite-state machines. We derive bounds for the biased teaching dimension when the learning bias is derived from a complexity measure (Kolmogorov complexity and minimal number of states respectively) and analyse the sampling distributions that lead to finite expected biased teaching dimensions. We highlight the existing trade-off between the bound and the representativeness of the sample, and its implications for the understanding of what teaching rich concepts to machines entails.

artificial intelligence, machine learning, teaching dimension, (19 more...)

1804.07121

Country:

Europe > Spain (0.14)
Europe > Norway (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.48)

Automatic Segmentation of Data Sequences

Chen, Liangzhe (Virginia Tech) | Amiri, Sorour E. (Virginia Tech) | Prakash, B. Aditya (Virginia Tech)

AAAI Conferences, Feb-8-2018

Segmenting temporal data sequences is an important problem which helps in understanding data dynamics in multiple applications such as epidemic surveillance, motion capture sequences, etc. In this paper, we present DASSA, the first self-guided and efficient algorithm to automatically find a segmentation that best detects the change of pattern in data sequences. To avoid introducing tuning parameters, we design DASSA to be a multi-level method which examines segments at each level of granularity via a compact data structure called the segment-graph. We build this data structure by carefully leveraging the information bottleneck method with the MDL principle to effectively represent each segment. Next, DASSA efficiently finds the optimal segmentation via a novel average-longest-path optimization on the segment-graph. Finally, we show how the outputs from DASSA can be naturally interpreted to reveal meaningful patterns. We ran DASSA on multiple real datasets of varying sizes and it is very effective in finding the time-cut points of the segmentations (in some cases recovering the cut points perfectly) as well as in finding the corresponding changing patterns.

immunology, segmentation, upstream oil & gas, (24 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

South America (0.29)
North America > United States (0.14)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.96)
Health & Medicine > Epidemiology (0.95)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.35)

Nonparametric Quantile-Based Causal Discovery

Tagasovska, Natasa, Vatter, Thibault, Chavez-Demoulin, Valérie

arXiv.org Machine Learning, Jan-31-2018

Telling cause from effect using observational data is a challenging problem, especially in the bivariate case. Contemporary methods often assume an independence between the cause and the generating mechanism of the effect given the cause. From this postulate, they derive asymmetries to uncover causal relationships. In this work, we propose such an approach, based on the link between Kolmogorov complexity and quantile scoring. We use a nonparametric conditional quantile estimator based on copulas to implement our procedure, thus avoiding restrictive assumptions about the joint distribution between cause and effect. In an extensive study on real and synthetic data, we show that quantile copula causal discovery (QCCD) compares favorably to state-of-the-art methods, while at the same time being computationally efficient and scalable.

artificial intelligence, causal discovery, survey article, (18 more...)

1801.10579

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.68)

Causal Inference on Multivariate and Mixed-Type Data

Marx, Alexander, Vreeken, Jilles

arXiv.org Machine Learning, Oct-16-2017

Given data over the joint distribution of two random variables $X$ and $Y$, we consider the problem of inferring the most likely causal direction between $X$ and $Y$. In particular, we consider the general case where both $X$ and $Y$ may be univariate or multivariate, and of the same or mixed data types. We take an information theoretic approach, based on Kolmogorov complexity, from which it follows that first describing the data over cause and then that of effect given cause is shorter than the reverse direction. The ideal score is not computable, but can be approximated through the Minimum Description Length (MDL) principle. Based on MDL, we propose two scores, one for when both $X$ and $Y$ are of the same single data type, and one for when they are mixed-type. We model dependencies between $X$ and $Y$ using classification and regression trees. As inferring the optimal model is NP-hard, we propose Crack, a fast greedy algorithm to determine the most likely causal direction directly from the data. Empirical evaluation on a wide range of data shows that Crack reliably, and with high accuracy, infers the correct causal direction on both univariate and multivariate cause-effect pairs over both single and mixed-type data.
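The description-length asymmetry stated in this abstract is often written as the inequality below. It is given here as a reading aid in notation assumed for this note: $K(\cdot)$ denotes Kolmogorov complexity, and $X \to Y$ is taken to be the true causal direction.

```latex
K\bigl(P(X)\bigr) + K\bigl(P(Y \mid X)\bigr)
  \;\le\;
K\bigl(P(Y)\bigr) + K\bigl(P(X \mid Y)\bigr)
```

Because $K$ is not computable, a practical score replaces each term with a code length $L(\cdot)$ from a fixed model class and prefers the direction with the smaller total, which is the role the abstract assigns to MDL.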

artificial intelligence, decision tree learning, dependency, (16 more...)

1702.06385

Country: Europe > Germany > Saarland (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Telling Cause from Effect using MDL-based Local and Global Regression

Marx, Alexander, Vreeken, Jilles

arXiv.org Machine Learning, Sep-26-2017

We consider the fundamental problem of inferring the causal direction between two univariate numeric random variables $X$ and $Y$ from observational data. The two-variable case is especially difficult to solve since it is not possible to use standard conditional independence tests between the variables. To tackle this problem, we follow an information theoretic approach based on Kolmogorov complexity and use the Minimum Description Length (MDL) principle to provide a practical solution. In particular, we propose a compression scheme to encode local and global functional relations using MDL-based regression. We infer that $X$ causes $Y$ if it is shorter to describe $Y$ as a function of $X$ than the inverse direction. In addition, we introduce Slope, an efficient linear-time algorithm; through thorough empirical evaluation on both synthetic and real-world data, we show that it outperforms the state of the art by a wide margin.
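The decision rule in this abstract (pick the direction in which the effect is cheaper to describe as a function of the cause) can be illustrated with a toy two-part code. The sketch below is not the Slope algorithm: it assumes a piecewise-constant regression over equal-width bins, a Gaussian code for the residuals, and a cost of half of log2(n) bits per bin mean. The names `standardize`, `conditional_code_length` and `infer_direction` are hypothetical helpers introduced only for this illustration. Both variables are standardized so that the cost of describing the cause alone is the same in either direction and drops out of the comparison.

```python
import math


def standardize(values):
    """Rescale to zero mean and unit variance so both variables are comparable."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def conditional_code_length(cause, effect, bins=8):
    """Approximate bits needed to describe `effect` given `cause`.

    Assumed model (illustration only): split the range of `cause` into
    equal-width bins and predict `effect` by the mean of its bin.
    """
    n = len(cause)
    lo, hi = min(cause), max(cause)
    width = (hi - lo) / bins
    groups = [[] for _ in range(bins)]
    for c, e in zip(cause, effect):
        idx = min(int((c - lo) / width), bins - 1)
        groups[idx].append(e)

    # Sum of squared residuals around each bin mean.
    sse = 0.0
    for group in groups:
        if group:
            mean = sum(group) / len(group)
            sse += sum((e - mean) ** 2 for e in group)

    variance = max(sse / n, 1e-12)  # floor avoids log(0) on a perfect fit
    # Gaussian code for the residuals (differential, so it can be negative;
    # only the difference between the two directions is used).
    data_bits = 0.5 * n * math.log2(2 * math.pi * math.e * variance)
    # Assumed parameter cost: (1/2) log2(n) bits per bin mean.
    model_bits = 0.5 * bins * math.log2(n)
    return data_bits + model_bits


def infer_direction(x, y, bins=8):
    """Return the direction with the shorter conditional description."""
    xs, ys = standardize(x), standardize(y)
    forward = conditional_code_length(xs, ys, bins)   # describe Y from X
    backward = conditional_code_length(ys, xs, bins)  # describe X from Y
    if forward < backward:
        return "X->Y"
    if backward < forward:
        return "Y->X"
    return "undecided"
```

For a non-invertible relation such as a noisy parabola, describing $Y$ from $X$ leaves small residuals, while describing $X$ from $Y$ leaves residuals close to the full spread of $X$, so the forward code is shorter. For invertible or nearly linear relations this crude model class gives little asymmetry, which is one reason published methods use richer encodings.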

artificial intelligence, lope, machine learning, (18 more...)

1709.08915

Country: Europe > Germany > Saarland (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.91)

Stochastic Generative Hashing

Dai, Bo, Guo, Ruiqi, Kumar, Sanjiv, He, Niao, Song, Le

arXiv.org Machine Learning, Aug-12-2017

Learning-based binary hashing has become a powerful paradigm for fast search and retrieval in massive databases. However, due to the requirement of discrete outputs for the hash functions, learning such functions is known to be very challenging. In addition, the objective functions adopted by existing hashing techniques are mostly chosen heuristically. In this paper, we propose a novel generative approach to learn hash functions through the Minimum Description Length principle such that the learned hash codes maximally compress the dataset and can also be used to regenerate the inputs. We also develop an efficient learning algorithm based on the stochastic distributional gradient, which avoids the notorious difficulty caused by binary output constraints, to jointly optimize the parameters of the hash function and the associated generative model. Extensive experiments on a variety of large-scale datasets show that the proposed method achieves better retrieval results than the existing state-of-the-art methods.

algorithm, neural network, optimization problem, (17 more...)

1701.02815

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.34)

Soft Weight-Sharing for Neural Network Compression

Ullrich, Karen, Meeds, Edward, Welling, Max

arXiv.org Machine Learning, May-9-2017

The success of deep learning in numerous application domains has created the desire to run and train such models on mobile devices. This, however, conflicts with their computationally, memory- and energy-intensive nature, leading to a growing interest in compression. Recent work by Han et al. (2015a) proposes a pipeline that involves retraining, pruning and quantization of neural network weights, obtaining state-of-the-art compression rates. In this paper, we show that competitive compression rates can be achieved by using a version of soft weight-sharing (Nowlan & Hinton, 1992). Our method achieves both quantization and pruning in one simple (re-)training procedure. This point of view also exposes the relation between compression and the minimum description length (MDL) principle.
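As a rough illustration of the soft weight-sharing idea (Nowlan & Hinton, 1992), the sketch below computes the negative log-density of a weight vector under a Gaussian mixture prior. It is a minimal sketch, not the training procedure of the paper listed here: the function name `mixture_penalty` and its arguments are hypothetical, and the mixture parameters are passed in as fixed values rather than learned jointly with the weights.

```python
import math


def mixture_penalty(weights, pis, mus, sigmas):
    """Negative log-density of `weights` under a 1-D Gaussian mixture prior.

    pis, mus, sigmas: mixing proportions, means and standard deviations of
    the mixture components (assumed fixed here for illustration).
    """
    total = 0.0
    for w in weights:
        density = sum(
            pi * math.exp(-0.5 * ((w - mu) / sigma) ** 2)
            / (sigma * math.sqrt(2.0 * math.pi))
            for pi, mu, sigma in zip(pis, mus, sigmas)
        )
        # Floor guards against log(0) when a weight is far from every component.
        total -= math.log(max(density, 1e-300))
    return total
```

Weights that sit close to a component mean receive a small penalty, so adding this term to a training loss pulls weights toward a few shared values (quantization), and a component centered at zero pulls weights toward zero (pruning). Read in nats, the penalty is, up to a discretization constant, the cost of encoding the weights under the prior, which is the link to MDL mentioned in the abstract.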

artificial intelligence, deep learning, neural network, (16 more...)

1702.04008

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.68)
