AITopics | Computational Learning Theory

Collaborating Authors

Computational Learning Theory

In computer science, computational learning theory (or just learning theory) is a subfield of Artificial Intelligence devoted to studying the design and analysis of machine learning algorithms (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Learning temporal formulas from examples is hard

Mascle, Corto, Fijalkow, Nathanaël, Lagarde, Guillaume

arXiv.org Artificial IntelligenceDec-26-2023

We study the problem of learning linear temporal logic (LTL) formulas from examples, as a first step towards expressing a property separating positive and negative instances in a way that is comprehensible for humans. In this paper we initiate the study of the computational complexity of the problem. Our main results are hardness results: we show that the LTL learning problem is NP-complete, both for the full logic and for almost all of its fragments. This motivates the search for efficient heuristics, and highlights the complexity of expressing separating properties in concise natural language.

formula, ltl, suffix, (16 more...)

arXiv.org Artificial Intelligence

2312.16336

Country: Europe > France > Nouvelle-Aquitaine > Gironde > Bordeaux (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)

Add feedback

Quantum Learning Theory Beyond Batch Binary Classification

Mohan, Preetham, Tewari, Ambuj

arXiv.org Machine LearningDec-26-2023

Arunachalam and de Wolf (2018) showed that the sample complexity of quantum batch learning of boolean functions, in the realizable and agnostic settings, has the same form and order as the corresponding classical sample complexities. In this paper, we extend this, ostensibly surprising, message to batch multiclass learning, online boolean learning, and online multiclass learning. For our online learning results, we first consider an adaptive adversary variant of the classical model of Dawid and Tewari (2022). Then, we introduce the first (to the best of our knowledge) model of online learning with quantum examples.

artificial intelligence, learner, machine learning, (15 more...)

arXiv.org Machine Learning

2302.07409

Country:

North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Data is Moody: Discovering Data Modification Rules from Process Event Logs

Schuster, Marco Bjarne, Wiegand, Boris, Vreeken, Jilles

arXiv.org Artificial IntelligenceDec-22-2023

Although event logs are a powerful source to gain insight about the behavior of the underlying business process, existing work primarily focuses on finding patterns in the activity sequences of an event log, while ignoring event attribute data. Event attribute data has mostly been used to predict event occurrences and process outcome, but the state of the art neglects to mine succinct and interpretable rules how event attribute data changes during process execution. Subgroup discovery and rule-based classification approaches lack the ability to capture the sequential dependencies present in event logs, and thus lead to unsatisfactory results with limited insight into the process behavior. Given an event log, we are interested in finding accurate yet succinct and interpretable if-then rules how the process modifies data. We formalize the problem in terms of the Minimum Description Length (MDL) principle, by which we choose the model with the best lossless description of the data. Additionally, we propose the greedy Moody algorithm to efficiently search for rules. By extensive experiments on both synthetic and real-world data, we show Moody indeed finds compact and interpretable rules, needs little data for accurate discovery, and is robust to noise.

event log, modification rule, moody, (14 more...)

arXiv.org Artificial Intelligence

2312.14571

Country:

Europe > Italy (0.04)
Europe > Germany > Bremen > Bremen (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.69)

Add feedback

Statistical learning theory and Occam's razor: The argument from empirical risk minimization

Sterkenburg, Tom F.

arXiv.org Artificial IntelligenceDec-21-2023

This paper considers the epistemic justification for a simplicity preference in inductive inference that may be obtained from the machine learning framework of statistical learning theory. Uniting elements from both earlier arguments suggesting and rejecting such a justification, the paper spells out a qualified means-ends and model-relative justificatory argument, built on statistical learning theory's central mathematical learning guarantee for the method of empirical risk minimization.

hypothesis, hypothesis class, justification, (14 more...)

arXiv.org Artificial Intelligence

2312.13842

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(2 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Automatic Parameter Selection for Non-Redundant Clustering

Leiber, Collin, Mautz, Dominik, Plant, Claudia, Böhm, Christian

arXiv.org Artificial IntelligenceDec-19-2023

High-dimensional datasets often contain multiple meaningful clusterings in different subspaces. For example, objects can be clustered either by color, weight, or size, revealing different interpretations of the given dataset. A variety of approaches are able to identify such non-redundant clusterings. However, most of these methods require the user to specify the expected number of subspaces and clusters for each subspace. Stating these values is a non-trivial problem and usually requires detailed knowledge of the input dataset. In this paper, we propose a framework that utilizes the Minimum Description Length Principle (MDL) to detect the number of subspaces and clusters per subspace automatically. We describe an efficient procedure that greedily searches the parameter space by splitting and merging subspaces and clusters within subspaces. Additionally, an encoding strategy is introduced that allows us to detect outliers in each subspace. Extensive experiments show that our approach is highly competitive to state-of-the-art methods.

algorithm, dataset, subspace, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1137/1.9781611977172.26

2312.11952

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.34)

Add feedback

A Competitive Algorithm for Agnostic Active Learning

Price, Eric, Zhou, Yihan

arXiv.org Artificial IntelligenceDec-16-2023

For some hypothesis classes and input distributions, active agnostic learning needs exponentially fewer samples than passive learning; for other classes and distributions, it offers little to no improvement. The most popular algorithms for agnostic active learning express their performance in terms of a parameter called the disagreement coefficient, but it is known that these algorithms are inefficient on some inputs. We take a different approach to agnostic active learning, getting an algorithm that is competitive with the optimal algorithm for any binary hypothesis class $H$ and distribution $D_X$ over $X$. In particular, if any algorithm can use $m^*$ queries to get $O(\eta)$ error, then our algorithm uses $O(m^* \log |H|)$ queries to get $O(\eta)$ error. Our algorithm lies in the vein of the splitting-based approach of Dasgupta [2004], which gets a similar result for the realizable ($\eta = 0$) setting. We also show that it is NP-hard to do better than our algorithm's $O(\log |H|)$ overhead in general.

algorithm, hypothesis, probability, (17 more...)

arXiv.org Artificial Intelligence

2310.18786

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.66)

Add feedback

On the Computational Benefit of Multimodal Learning

Lu, Zhou

arXiv.org Artificial IntelligenceDec-15-2023

Human perception inherently operates in a multimodal manner. Similarly, as machines interpret the empirical world, their learning processes ought to be multimodal. The recent, remarkable successes in empirical multimodal learning underscore the significance of understanding this paradigm. Yet, a solid theoretical foundation for multimodal learning has eluded the field for some time. While a recent study by Lu (2023) has shown the superior sample complexity of multimodal learning compared to its unimodal counterpart, another basic question remains: does multimodal learning also offer computational advantages over unimodal learning? This work initiates a study on the computational benefit of multimodal learning. We demonstrate that, under certain conditions, multimodal learning can outpace unimodal learning exponentially in terms of computation. Specifically, we present a learning task that is NP-hard for unimodal learning but is solvable in polynomial time by a multimodal algorithm. Our construction is based on a novel modification to the intersection of two half-spaces problem.

intersection, learning, multimodal, (13 more...)

arXiv.org Artificial Intelligence

2309.13782

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.91)

Add feedback

Balancing Summarization and Change Detection in Graph Streams

Fukushima, Shintaro, Yamanishi, Kenji

arXiv.org Machine LearningDec-12-2023

This study addresses the issue of balancing graph summarization and graph change detection. Graph summarization compresses large-scale graphs into a smaller scale. However, the question remains: To what extent should the original graph be compressed? This problem is solved from the perspective of graph change detection, aiming to detect statistically significant changes using a stream of summary graphs. If the compression rate is extremely high, important changes can be ignored, whereas if the compression rate is extremely low, false alarms may increase with more memory. This implies that there is a trade-off between compression rate in graph summarization and accuracy in change detection. We propose a novel quantitative methodology to balance this trade-off to simultaneously realize reliable graph summarization and change detection. We introduce a probabilistic structure of hierarchical latent variable model into a graph, thereby designing a parameterized summary graph on the basis of the minimum description length principle. The parameter specifying the summary graph is then optimized so that the accuracy of change detection is guaranteed to suppress Type I error probability (probability of raising false alarms) to be less than a given confidence level. First, we provide a theoretical framework for connecting graph summarization with change detection. Then, we empirically demonstrate its effectiveness on synthetic and real datasets.

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Machine Learning

2311.18694

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.05)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.35)

Add feedback

Adaptive Parameter Selection for Kernel Ridge Regression

Lin, Shao-Bo

arXiv.org Artificial IntelligenceDec-10-2023

This paper focuses on parameter selection issues of kernel ridge regression (KRR). Due to special spectral properties of KRR, we find that delicate subdivision of the parameter interval shrinks the difference between two successive KRR estimates. Based on this observation, we develop an early-stopping type parameter selection strategy for KRR according to the so-called Lepskii-type principle. Theoretical verifications are presented in the framework of learning theory to show that KRR equipped with the proposed parameter selection strategy succeeds in achieving optimal learning rates and adapts to different norms, providing a new record of parameter selection for kernel methods.

krr, lemma 4, log 2 8, (15 more...)

arXiv.org Artificial Intelligence

2312.05885

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.57)

Add feedback

On the Role of Entanglement and Statistics in Learning

Arunachalam, Srinivasan, Havlicek, Vojtech, Schatzki, Louis

arXiv.org Artificial IntelligenceDec-10-2023

In this work we make progress in understanding the relationship between learning models with access to entangled, separable and statistical measurements in the quantum statistical query (QSQ) model. To this end, we show the following results. $\textbf{Entangled versus separable measurements.}$ The goal here is to learn an unknown $f$ from the concept class $C\subseteq \{f:\{0,1\}^n\rightarrow [k]\}$ given copies of $\frac{1}{\sqrt{2^n}}\sum_x \vert x,f(x)\rangle$. We show that, if $T$ copies suffice to learn $f$ using entangled measurements, then $O(nT^2)$ copies suffice to learn $f$ using just separable measurements. $\textbf{Entangled versus statistical measurements}$ The goal here is to learn a function $f \in C$ given access to separable measurements and statistical measurements. We exhibit a class $C$ that gives an exponential separation between QSQ learning and quantum learning with entangled measurements (even in the presence of noise). This proves the "quantum analogue" of the seminal result of Blum et al. [BKW'03]. that separates classical SQ and PAC learning with classification noise. $\textbf{QSQ lower bounds for learning states.}$ We introduce a quantum statistical query dimension (QSD), which we use to give lower bounds on the QSQ learning. With this we prove superpolynomial QSQ lower bounds for testing purity, shadow tomography, Abelian hidden subgroup problem, degree-$2$ functions, planted bi-clique states and output states of Clifford circuits of depth $\textsf{polylog}(n)$. $\textbf{Further applications.}$ We give and $\textit{unconditional}$ separation between weak and strong error mitigation and prove lower bounds for learning distributions in the QSQ model. Prior works by Quek et al. [QFK+'22], Hinsche et al. [HIN+'22], and Nietner et al. [NIS+'23] proved the analogous results $\textit{assuming}$ diagonal measurements and our work removes this assumption.

algorithm, complexity, query, (15 more...)

arXiv.org Artificial Intelligence

2306.03161

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Hardware (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

Add feedback