Minimum Complexity Machines
Mapping Network States Using Connectivity Queries
Rodríguez, Alexander, Adhikari, Bijaya, González, Andrés D., Nicholson, Charles, Vullikanti, Anil, Prakash, B. Aditya
Can we infer all the failed components of an infrastructure network, given a sample of reachable nodes from supply nodes? One of the most critical post-disruption processes after a natural disaster is to quickly determine the damage or failure states of critical infrastructure components. However, this is non-trivial, considering that often only a fraction of components may be accessible or observable after a disruptive event. Past work has looked into inferring failed components given point probes, i.e., a direct sample of failed components. In contrast, we study the harder problem of inferring failed components given partial information about 'serviceable' reachable nodes and a small sample of point probes, the former often being more practical to obtain. We formulate this novel problem using the Minimum Description Length (MDL) principle, and then present a greedy algorithm that effectively minimizes the MDL cost. We evaluate our algorithm on domain-expert simulations of real networks in the aftermath of an earthquake. Our algorithm successfully identifies failed components, especially the critical ones affecting the overall system performance.
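As an illustration of how such a formulation can be searched greedily, the sketch below minimizes a generic two-part MDL cost: bits to describe a candidate failure set plus bits to describe disagreements with the observed reachable nodes. The cost terms and names such as `mdl_cost` and `greedy_infer_failures` are illustrative assumptions, not the authors' implementation.

```python
from collections import deque
from math import log2, comb

def reachable(adj, supplies, failed):
    """Nodes reachable from any supply node once `failed` nodes are removed."""
    seen = set(s for s in supplies if s not in failed)
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen and v not in failed:
                seen.add(v)
                queue.append(v)
    return seen

def mdl_cost(adj, supplies, observed_reachable, failed):
    """Two-part cost: bits to describe the failure set, plus bits to describe
    disagreements between predicted and observed reachability (both sets of nodes)."""
    n = len(adj)
    model_bits = log2(n + 1) + log2(comb(n, len(failed)))
    errors = reachable(adj, supplies, failed) ^ observed_reachable
    data_bits = log2(n + 1) + log2(comb(n, len(errors)))
    return model_bits + data_bits

def greedy_infer_failures(adj, supplies, observed_reachable):
    """Greedily add the node whose failure most reduces the MDL cost."""
    failed = set()
    best = mdl_cost(adj, supplies, observed_reachable, failed)
    improved = True
    while improved:
        improved = False
        for v in adj:
            if v in failed or v in supplies:
                continue
            cost = mdl_cost(adj, supplies, observed_reachable, failed | {v})
            if cost < best:
                best, best_node, improved = cost, v, True
        if improved:
            failed.add(best_node)
    return failed
```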
Mint: MDL-based approach for Mining INTeresting Numerical Pattern Sets
Makhalova, Tatiana, Kuznetsov, Sergei O., Napoli, Amedeo
Pattern mining is well established in data mining research, especially for mining binary datasets. Surprisingly, there is much less work on numerical pattern mining, and this research area remains under-explored. In this paper, we propose Mint, an efficient MDL-based algorithm for mining numerical datasets. The MDL principle is a robust and reliable framework widely used in pattern mining, as well as in subgroup discovery. In Mint we reuse MDL for discovering useful patterns and returning a set of non-redundant overlapping patterns with well-defined boundaries that cover meaningful groups of objects. Mint is not alone in the category of numerical pattern miners based on MDL. In the experiments presented in the paper we show that Mint outperforms its competitors, among which are Slim and RealKrimp.
Detecting Hierarchical Changes in Latent Variable Models
Fukushima, Shintaro, Yamanishi, Kenji
This paper addresses the issue of detecting hierarchical changes in latent variable models (HCDL) from data streams. There are three different levels of changes for latent variable models: 1) the first level is a change in the data distribution for fixed latent variables, 2) the second is a change in the distribution over latent variables, and 3) the third is a change in the number of latent variables. It is important to detect these changes because we can analyze the causes of changes by identifying which level a change comes from (change interpretability). This paper proposes an information-theoretic framework for detecting changes at the three levels in a hierarchical way. The key idea is to employ the MDL (minimum description length) change statistics for measuring the degree of change, in combination with DNML (decomposed normalized maximum likelihood) code-length calculation. We give a theoretical basis for raising reliable change alarms. Focusing on stochastic block models, we employ synthetic and benchmark datasets to empirically demonstrate the effectiveness of our framework in terms of change interpretability as well as change detection.
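A minimal sketch of the MDL change-statistic idea, on a binary stream and with a simple plug-in Bernoulli code standing in for the DNML code lengths used in the paper: a change is flagged when encoding the two halves of a window separately is cheaper than encoding the window as a whole.

```python
from math import log2

def bernoulli_code_length(xs):
    """Two-part code: log2(n+1) bits for the parameter grid index plus
    the Shannon code length of the data under the ML parameter."""
    n, k = len(xs), sum(xs)
    p = k / n
    data_bits = 0.0
    if 0 < p < 1:
        data_bits = -k * log2(p) - (n - k) * log2(1 - p)
    return log2(n + 1) + data_bits

def mdl_change_statistic(window, t):
    """Positive values suggest a change at position t within the window."""
    whole = bernoulli_code_length(window)
    split = bernoulli_code_length(window[:t]) + bernoulli_code_length(window[t:])
    return whole - split

# Example: the statistic peaks near the true change point.
stream = [0] * 50 + [1] * 50
scores = {t: mdl_change_statistic(stream, t) for t in range(10, 90)}
print(max(scores, key=scores.get))  # close to 50
```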
Measure Theoretic Approach to Nonuniform Learnability
An earlier characterization of nonuniform learnability, which allows the sample size to depend on the hypothesis to which the learner is compared, is redefined using a measure-theoretic approach; nonuniform learnability is a strict relaxation of the Probably Approximately Correct (PAC) framework. A new algorithm, the Generalized Measure Learnability framework (GML), is introduced to implement this approach, together with a study of its sample and computational complexity bounds. Like the Minimum Description Length (MDL) principle, this approach can be regarded as an explication of Occam's razor. Furthermore, many situations are presented, namely countable hypothesis classes, in which the GML scheme can be applied and achieves statistical consistency.
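The selection rule underlying such nonuniform (Occam/MDL-style) learning can be sketched as follows, assuming a countable hypothesis class with prefix-code weights w(h). This generic structural-risk-style rule is an illustration, not the paper's GML algorithm.

```python
from math import log, sqrt

def empirical_error(h, sample):
    return sum(h(x) != y for x, y in sample) / len(sample)

def occam_style_select(hypotheses, weights, sample, delta=0.05):
    """`weights` assigns w(h) > 0 with sum <= 1 (e.g. 2**-|description of h|).
    Pick the hypothesis minimizing empirical error plus a confidence term
    that is larger for hypotheses with longer descriptions (smaller weight)."""
    m = len(sample)
    def bound(h, w):
        return empirical_error(h, sample) + sqrt((log(1 / w) + log(2 / delta)) / (2 * m))
    return min(zip(hypotheses, weights), key=lambda hw: bound(*hw))[0]

# Example: threshold classifiers h_k(x) = [x >= k], weighted 2**-(k+1).
hypotheses = [lambda x, k=k: int(x >= k) for k in range(10)]
weights = [2 ** -(k + 1) for k in range(10)]
sample = [(x, int(x >= 3)) for x in range(10)]
best = occam_style_select(hypotheses, weights, sample)
print([best(x) for x in range(10)])  # flips from 0 to 1 near x = 3
```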
Unsupervised Discretization by Two-dimensional MDL-based Histogram
Yang, Lincen, Baratchi, Mitra, van Leeuwen, Matthijs
Unsupervised discretization is a crucial step in many knowledge discovery tasks. The state-of-the-art method for one-dimensional data infers locally adaptive histograms using the minimum description length (MDL) principle, but the multi-dimensional case is far less studied: current methods consider the dimensions one at a time (if not independently), which results in discretizations based on rectangular cells of adaptive size. Unfortunately, this approach is unable to adequately characterize dependencies among dimensions and/or results in discretizations consisting of more cells (or bins) than is desirable. To address this problem, we propose an expressive model class that allows for far more flexible partitions of two-dimensional data. We extend the state of the art for the one-dimensional case to obtain a model selection problem based on the normalised maximum likelihood, a form of refined MDL. As the flexibility of our model class comes at the cost of a vast search space, we introduce a heuristic algorithm, named PALM, which partitions each dimension alternately and then merges neighbouring regions, all using the MDL principle. Experiments on synthetic data show that PALM 1) accurately reveals ground truth partitions that are within the model class (i.e., the search space), given a large enough sample size; 2) approximates well a wide range of partitions outside the model class; 3) converges, in contrast to its closest competitor IPD; and 4) is self-adaptive with regard to both sample size and local density structure of the data despite being parameter-free. Finally, we apply our algorithm to two geographic datasets to demonstrate its real-world potential.
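A heavily simplified, one-dimensional analogue of this MDL-guided partitioning can be sketched as follows: start from fine bins and greedily merge neighbours while the total code length decreases. The two-part code used here is an assumption; the paper uses a normalised-maximum-likelihood (refined MDL) code and works in two dimensions.

```python
from math import log2

def code_length(counts, widths, n):
    """Bits for the bin boundaries plus Shannon bits for the data under a
    piecewise-constant density with ML bin probabilities."""
    model_bits = len(counts) * log2(n + 1)       # crude cost per boundary
    data_bits = 0.0
    for c, w in zip(counts, widths):
        if c:
            data_bits += -c * log2((c / n) / w)  # density inside the bin
    return model_bits + data_bits

def merge_bins_mdl(counts, widths):
    """Greedily merge neighbouring bins while the MDL cost decreases."""
    n = sum(counts)
    best = code_length(counts, widths, n)
    improved = True
    while improved and len(counts) > 1:
        improved = False
        for i in range(len(counts) - 1):
            c2 = counts[:i] + [counts[i] + counts[i + 1]] + counts[i + 2:]
            w2 = widths[:i] + [widths[i] + widths[i + 1]] + widths[i + 2:]
            cost = code_length(c2, w2, n)
            if cost < best:
                best, counts, widths, improved = cost, c2, w2, True
                break
    return counts, widths
```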
Causal Discovery using Compression-Complexity Measures
The task of learning a causal model from observational data, or a combination of observational and interventional data, is commonly referred to as causal discovery or causal structure learning [1]. Causal discovery from two variables based on observational data, in the absence of time series or controlled interventions, is a challenging problem and necessitates additional assumptions [2]. This is a ubiquitous problem in almost all domains of science, but particularly so in econometrics, meteorology, biology and medicine, where interventional approaches are difficult or in several cases not feasible. Model-free, data-driven approaches for causal discovery have developed significantly over the past decade or so; examples include Granger Causality (GC) [3], Transfer Entropy (TE) [4] and Compression-Complexity Causality (CCC) [5]. These methods have been used in various disciplines, including neuroscience, climatology and econometrics, and rely on properties of time-series data. Both GC and TE have assumptions that need to be met for satisfactory inference, while CCC is assumption-free and robust to many artefacts and nuisance variables. All three need careful parameter calibration and selection for optimally accurate performance. A class of model-free causal discovery methods does not assume a temporal structure in the data and is rooted in algorithmic information theory, chiefly based on the notion of Kolmogorov complexity. The Kolmogorov complexity of a finite binary string is the length of the shortest binary program that generates that string and reflects the computational resources needed to specify it.
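The sketch below illustrates this algorithmic-information viewpoint by approximating Kolmogorov complexity with an off-the-shelf compressor (zlib) and comparing the total description length in the two causal directions. This toy score is an illustration of the general idea only, not the GC, TE or CCC measures cited above.

```python
import zlib

def C(data: bytes) -> int:
    """Compressed length in bytes, a crude upper bound on K(data)."""
    return len(zlib.compress(data, 9))

def C_conditional(target: bytes, given: bytes) -> int:
    """Crude stand-in for K(target | given): extra bytes needed to
    compress `target` once `given` has already been seen."""
    return C(given + target) - C(given)

def direction_score(x: bytes, y: bytes) -> float:
    """Compare the two-part descriptions C(cause) + C(effect | cause) in
    both directions; a clearly negative value would favour x -> y."""
    x_to_y = C(x) + C_conditional(y, x)
    y_to_x = C(y) + C_conditional(x, y)
    return x_to_y - y_to_x

# Usage: direction_score(x_bytes, y_bytes) on two observed byte sequences.
```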
The Minimum Description Length Principle for Pattern Mining: A Survey
The aim of this document is to review the development of pattern mining methods based on or inspired by the Minimum Description Length (MDL) principle. Although completeness is an unrealistic goal, we strive for it. The reader is expected to be familiar with common pattern mining tasks and techniques, but not necessarily with concepts from information theory and coding, of which we therefore give an outline in Section 2. Background work is covered in Section 3, starting with the theory behind the MDL principle and similar principles, going over a few examples of uses of the principle in the adjacent fields of machine learning and natural language processing, and ending with a review of data mining methods that involve practical compression as a tool or that consider the problem of selecting patterns.
Revisiting complexity and the bias-variance tradeoff
Dwivedi, Raaz, Singh, Chandan, Yu, Bin, Wainwright, Martin J.
The recent success of high-dimensional models, such as deep neural networks (DNNs), has led many to question the validity of the bias-variance tradeoff principle in high dimensions. We reexamine it with respect to two key choices: the model class and the complexity measure. We argue that failing to suitably specify either one can falsely suggest that the tradeoff does not hold. This observation motivates us to seek a valid complexity measure, defined with respect to a reasonably good class of models. Building on Rissanen's principle of minimum description length (MDL), we propose a novel MDL-based complexity (MDL-COMP). We focus on the context of linear models, which have recently been used as a stylized tractable approximation to DNNs in high dimensions. MDL-COMP is defined via an optimality criterion over the encodings induced by a good Ridge estimator class. We derive closed-form expressions for MDL-COMP and show that for a dataset with $n$ observations and $d$ parameters it is \emph{not always} equal to $d/n$, and is a function of the singular values of the design matrix and the signal-to-noise ratio. For random Gaussian design, we find that while MDL-COMP scales linearly with $d$ in low dimensions ($d < n$), it grows far more slowly in high dimensions ($d > n$).
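A rough numerical analogue of this construction is sketched below: among a family of ridge estimators, pick the one whose induced Gaussian code length for the observations is shortest, with a penalty that depends on the singular values of the design matrix. The formula is a generic two-part code written for illustration, not the paper's closed-form MDL-COMP.

```python
import numpy as np

def ridge_code_length(X, y, lam, sigma2=1.0):
    """Bits to encode y given X with a ridge fit: residual cost plus a
    log-determinant penalty that grows with the effective dimension."""
    d = X.shape[1]
    beta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    resid = y - X @ beta
    fit_bits = float(resid @ resid) / (2 * sigma2 * np.log(2))
    s = np.linalg.svd(X, compute_uv=False)          # singular values of the design
    penalty_bits = 0.5 * np.sum(np.log2(1 + s**2 / lam))
    return fit_bits + penalty_bits

def mdl_comp_like(X, y, lambdas=np.logspace(-3, 3, 25)):
    """Shortest code length over the ridge family: a crude analogue of
    optimizing over the encodings induced by ridge estimators."""
    return min(ridge_code_length(X, y, lam) for lam in lambdas)
```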
Discovering outstanding subgroup lists for numeric targets using MDL
Proença, Hugo M., Grünwald, Peter, Bäck, Thomas, van Leeuwen, Matthijs
The task of subgroup discovery (SD) is to find interpretable descriptions of subsets of a dataset that stand out with respect to a target attribute. To address the problem of mining large numbers of redundant subgroups, subgroup set discovery (SSD) has been proposed. State-of-the-art SSD methods have their limitations though, as they typically rely heavily on heuristics and/or user-chosen hyperparameters. We propose a dispersion-aware problem formulation for subgroup set discovery that is based on the minimum description length (MDL) principle and subgroup lists. We argue that the best subgroup list is the one that best summarizes the data given the overall distribution of the target. We restrict our focus to a single numeric target variable and show that our formalization coincides with an existing quality measure when finding a single subgroup, but that, in addition, it allows trading off subgroup quality against the complexity of the subgroup. We next propose SSD++, a heuristic algorithm for which we empirically demonstrate that it returns outstanding subgroup lists: non-redundant sets of compact subgroups that stand out by having strongly deviating means and small spread.
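The kind of trade-off this formulation encodes can be sketched as follows for a numeric target: a candidate subgroup is worth adding if encoding its target values with a local Gaussian, plus the cost of stating the rule, beats encoding them with the dataset-wide Gaussian. The exact codes used by SSD++ differ; this is only an illustrative assumption.

```python
import numpy as np

def gaussian_bits(values, mean, var):
    """Shannon code length (bits) of `values` under N(mean, var)."""
    values = np.asarray(values, dtype=float)
    nll = 0.5 * np.log(2 * np.pi * var) + (values - mean) ** 2 / (2 * var)
    return float(np.sum(nll) / np.log(2))

def mdl_gain(subgroup_targets, all_targets, rule_bits):
    """Positive gain => the subgroup compresses its targets better than
    the default (dataset-wide) distribution, after paying for the rule."""
    mu0, var0 = np.mean(all_targets), np.var(all_targets)
    mu1, var1 = np.mean(subgroup_targets), np.var(subgroup_targets) + 1e-9
    baseline = gaussian_bits(subgroup_targets, mu0, var0)
    local = gaussian_bits(subgroup_targets, mu1, var1) + rule_bits
    return baseline - local

# Example: a strongly deviating subgroup yields a clearly positive gain.
rng = np.random.default_rng(0)
all_targets = np.concatenate([rng.normal(0, 1, 900), rng.normal(5, 0.5, 100)])
print(mdl_gain(all_targets[900:], all_targets, rule_bits=30.0))
```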
Categorical anomaly detection in heterogeneous data using minimum description length clustering
Cheney, James, Gombau, Xavier, Berrada, Ghita, Benabderrahmane, Sidahmed
Anomaly detection algorithms have been proposed for categorical data based on the minimum description length (MDL) principle. However, they can be ineffective when detecting anomalies in heterogeneous datasets representing a mixture of different sources, such as security scenarios in which system and user processes have distinct behavior patterns. We propose a meta-algorithm for enhancing any MDL-based anomaly detection model to deal with heterogeneous data by fitting a mixture model to the data, via a variant of k-means clustering. Our experimental results show that using a discrete mixture model provides competitive performance relative to two previous anomaly detection algorithms, while mixtures of more sophisticated models yield further gains, on both synthetic datasets and realistic datasets from a security scenario.

Two examples of anomaly detection based on MDL have been studied and shown to perform well: the OC3 algorithm [21], based on an itemset mining technique called Krimp [26], and the CompreX algorithm [2]. Broadly speaking, both take a similar approach: first, a model H of the data that compresses it well is found using a heuristic search, balancing the model complexity L(H) (the number of bits required to encode the model structure and parameters) against the data complexity L(X | H) (the number of bits required to compress the data given the model). Once such a model H is found, each object x in X is assigned a score corresponding to its compressed size L(x | H) given the selected model. Intuitively, if the model accurately characterizes the data as a whole, records that are representative will compress well, yielding a low anomaly score.
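A minimal sketch of the meta-algorithm idea described above, with a simple per-attribute categorical code standing in for Krimp/CompreX as the per-cluster MDL model. The function names and the k-means-style clustering loop are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from math import log2
import random

def fit_component(records):
    """Per-attribute frequency tables for one cluster of categorical records."""
    tables = [Counter(r[j] for r in records) for j in range(len(records[0]))]
    return len(records), tables

def code_length(record, component, alphabet_sizes):
    """Bits to encode one record under a cluster's Laplace-smoothed code."""
    n, tables = component
    bits = 0.0
    for j, v in enumerate(record):
        p = (tables[j][v] + 1) / (n + alphabet_sizes[j])
        bits += -log2(p)
    return bits

def anomaly_scores(records, k=2, iters=10, seed=0):
    """Alternate between refitting per-cluster codes and reassigning each
    record to the cluster that compresses it best; the anomaly score of a
    record is its shortest code length across clusters (larger = more anomalous)."""
    alphabet_sizes = [len({r[j] for r in records}) for j in range(len(records[0]))]
    random.seed(seed)
    assign = [random.randrange(k) for _ in records]
    for _ in range(iters):
        comps = [fit_component([r for r, a in zip(records, assign) if a == c] or records)
                 for c in range(k)]
        assign = [min(range(k), key=lambda c: code_length(r, comps[c], alphabet_sizes))
                  for r in records]
    return [min(code_length(r, comps[c], alphabet_sizes) for c in range(k))
            for r in records]
```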