Goto

Collaborating Authors

 Computational Learning Theory


A Compression Based Classification Framework Using Symbolic Dynamics of Chaotic Maps

arXiv.org Artificial Intelligence

We propose a novel classification framework grounded in symbolic dynamics and data compression using chaotic maps. The core idea is to model each class by generating symbolic sequences from thresholded real-valued training data, which are then evolved through a one-dimensional chaotic map. For each class, we compute the transition probabilities of symbolic patterns (e.g., `00', `01', `10', and `11' for the second return map) and aggregate these statistics to form a class-specific probabilistic model. During testing phase, the test data are thresholded and symbolized, and then encoded using the class-wise symbolic statistics via back iteration, a dynamical reconstruction technique. The predicted label corresponds to the class yielding the shortest compressed representation, signifying the most efficient symbolic encoding under its respective chaotic model. This approach fuses concepts from dynamical systems, symbolic representations, and compression-based learning. We evaluate the proposed method: \emph{ChaosComp} on both synthetic and real-world datasets, demonstrating competitive performance compared to traditional machine learning algorithms (e.g., macro F1-scores for the proposed method on Breast Cancer Wisconsin = 0.9531, Seeds = 0.9475, Iris = 0.8469 etc.). Rather than aiming for state-of-the-art performance, the goal of this research is to reinterpret the classification problem through the lens of dynamical systems and compression, which are foundational perspectives in learning theory and information processing.


A Conjecture on a Fundamental Trade-Off between Certainty and Scope in Symbolic and Generative AI

arXiv.org Artificial Intelligence

This article introduces a conjecture that formalises a fundamental trade-off between provable correctness and broad data-mapping capacity in Artificial Intelligence (AI) systems. When an AI system is engineered for deductively watertight guarantees (demonstrable certainty about the error-free nature of its outputs) -- as in classical symbolic AI -- its operational domain must be narrowly circumscribed and pre-structured. Conversely, a system that can input high-dimensional data to produce rich information outputs -- as in contemporary generative models -- necessarily relinquishes the possibility of zero-error performance, incurring an irreducible risk of errors or misclassification. By making this previously implicit trade-off explicit and open to rigorous verification, the conjecture significantly reframes both engineering ambitions and philosophical expectations for AI. After reviewing the historical motivations for this tension, the article states the conjecture in information-theoretic form and contextualises it within broader debates in epistemology, formal verification, and the philosophy of technology. It then offers an analysis of its implications and consequences, drawing on notions of underdetermination, prudent epistemic risk, and moral responsibility. The discussion clarifies how, if correct, the conjecture would help reshape evaluation standards, governance frameworks, and hybrid system design. The conclusion underscores the importance of eventually proving or refuting the inequality for the future of trustworthy AI.


Learning in Structured Stackelberg Games

arXiv.org Artificial Intelligence

We study structured Stackelberg games, in which both players (the leader and the follower) observe contextual information about the state of the world at time of play. The leader plays against one of a finite number of followers, but the follower's type is not known until after the game has ended. Importantly, we assume a fixed relationship between the contextual information and the follower's type, thereby allowing the leader to leverage this additional structure when deciding her strategy. Under this setting, we find that standard learning theoretic measures of complexity do not characterize the difficulty of the leader's learning task. Instead, we introduce a new notion of dimension, the Stackelberg-Littlestone dimension, which we show characterizes the instance-optimal regret of the leader in the online setting. Based on this, we also provide a provably optimal learning algorithm. We extend our results to the distributional setting, where we use two new notions of dimension, the $ฮณ$-Stackelberg-Natarajan dimension and $ฮณ$-Stackelberg-Graph dimension. We prove that these control the sample complexity lower and upper bounds respectively, and we design a simple, improper algorithm that achieves the upper bound.


An $\tilde{O}$ptimal Differentially Private Learner for Concept Classes with VC Dimension 1

arXiv.org Artificial Intelligence

Machine learning algorithms can access sensitive information from the training dataset. We research the privacy-preserving machine learning technique, introduced by Kasiviswanathan et al. [17], that targets to learn a hypothesis while preserving the privacy of individual entries in the dataset. Informally, the goal is to construct a learner that satisfies the requirements of probably approximately correct (PAC) learning [22] and, simultaneously, differential privacy [11].


SAT-Based Bounded Fitting for the Description Logic ALC

arXiv.org Artificial Intelligence

Bounded fitting is a general paradigm for learning logical formulas from positive and negative data examples, that has received considerable interest recently. We investigate bounded fitting for the description logic ALC and its syntactic fragments. We show that the underlying size-restricted fitting problem is NP-complete for all studied fragments, even in the special case of a single positive and a single negative example. By design, bounded fitting comes with probabilistic guarantees in Valiant's PAC learning framework. In contrast, we show that other classes of algorithms for learning ALC concepts do not provide such guarantees. Finally, we present an implementation of bounded fitting in ALC and its fragments based on a SAT solver. We discuss optimizations and compare our implementation to other concept learning tools.


Probably Approximately Correct Causal Discovery

arXiv.org Machine Learning

The discovery of causal relationships is a foundational problem in artificial intelligence, statistics, epidemiology, economics, and beyond. While elegant theories exist for accurate causal discovery given infinite data, real-world applications are inherently resource-constrained. Effective methods for inferring causal relationships from observational data must perform well under finite data and time constraints, where "performing well" implies achieving high, though not perfect accuracy. In his seminal paper A Theory of the Learnable (Valiant, 1984), Valiant highlighted the importance of resource constraints in supervised machine learning, introducing the concept of Probably Approximately Correct (PAC) learning as an alternative to exact learning. Inspired by Valiant's work, we propose the Probably Approximately Correct Causal (PACC) Discovery framework, which extends PAC learning principles to the causal field. This framework emphasizes both computational and sample efficiency for established causal methods such as propensity score techniques and instrumental variable approaches. Furthermore, we show that it can provide theoretical guarantees for other widely used methods, such as the Self-Controlled Case Series (SCCS) method, which had previously lacked such guarantees.


Feature Selection and Junta Testing are Statistically Equivalent

arXiv.org Machine Learning

For a function $f \colon \{0,1\}^n \to \{0,1\}$, the junta testing problem asks whether $f$ depends on only $k$ variables. If $f$ depends on only $k$ variables, the feature selection problem asks to find those variables. We prove that these two tasks are statistically equivalent. Specifically, we show that the ``brute-force'' algorithm, which checks for any set of $k$ variables consistent with the sample, is simultaneously sample-optimal for both problems, and the optimal sample size is \[ ฮ˜\left(\frac 1 \varepsilon \left( \sqrt{2^k \log {n \choose k}} + \log {n \choose k}\right)\right). \]


Context-Based Fake News Detection using Graph Based Approach: ACOVID-19 Use-case

arXiv.org Artificial Intelligence

In todayล› digital world, fake news is spreading with immense speed. Its a significant concern to address. In this work, we addressed that challenge using novel graph based approach. We took dataset from Kaggle that contains real and fake news articles. To test our approach we incorporated recent covid-19 related news articles that contains both genuine and fake news that are relevant to this problem. This further enhances the dataset as well instead of relying completely on the original dataset. We propose a contextual graph-based approach to detect fake news articles. We need to convert news articles into appropriate schema, so we leverage Natural Language Processing (NLP) techniques to transform news articles into contextual graph structures. We then apply the Minimum Description Length (MDL)-based Graph-Based Anomaly Detection (GBAD) algorithm for graph mining. Graph-based methods are particularly effective for handling rich contextual data, as they enable the discovery of complex patterns that traditional query-based or statistical techniques might overlook. Our proposed approach identifies normative patterns within the dataset and subsequently uncovers anomalous patterns that deviate from these established norms.


On statistical learning of graphs

arXiv.org Artificial Intelligence

We study PAC and online learnability of hypothesis classes formed by copies of a countably infinite graph G, where each copy is induced by permuting G's vertices. This corresponds to learning a graph's labeling, knowing its structure and label set. We consider classes where permutations move only finitely many vertices. Our main result shows that PAC learnability of all such finite-support copies implies online learnability of the full isomorphism type of G, and is equivalent to the condition of automorphic triviality. We also characterize graphs where copies induced by swapping two vertices are not learnable, using a relaxation of the extension property of the infinite random graph. Finally, we show that, for all G and k>2, learnability for k-vertex permutations is equivalent to that for 2-vertex permutations, yielding a four-class partition of infinite graphs, whose complexity we also determine using tools coming from both descriptive set theory and computability theory.


(Exhaustive) Symbolic Regression and model selection by minimum description length

arXiv.org Artificial Intelligence

Symbolic regression is the machine learning method for learning functions from data. After a brief overview of the symbolic regression landscape, I will describe the two main challenges that traditional algorithms face: they have an unknown (and likely significant) probability of failing to find any given good function, and they suffer from ambiguity and poorly-justified assumptions in their function-selection procedure. To address these I propose an exhaustive search and model selection by the minimum description length principle, which allows accuracy and complexity to be directly traded off by measuring each in units of information. I showcase the resulting publicly available Exhaustive Symbolic Regression algorithm on three open problems in astrophysics: the expansion history of the universe, the effective behaviour of gravity in galaxies and the potential of the inflaton field. In each case the algorithm identifies many functions superior to the literature standards. This general purpose methodology should find widespread utility in science and beyond.