AITopics

2502.20229

Country:

Asia > Middle East > Israel (0.14)
North America > United States > New York (0.14)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)

Nazaret, Achille, Blei, David

Extremely Greedy Equivalence Search

arXiv.org Machine LearningFeb-26-2025

The goal of causal discovery is to learn a directed acyclic graph from data. One of the most well-known methods for this problem is Greedy Equivalence Search (GES). GES searches for the graph by incrementally and greedily adding or removing edges to maximize a model selection criterion. It has strong theoretical guarantees on infinite data but can fail in practice on finite data. In this paper, we first identify some of the causes of GES's failure, finding that it can get blocked in local optima, especially in denser graphs. We then propose eXtremely Greedy Equivalent Search (XGES), which involves a new heuristic to improve the search strategy of GES while retaining its theoretical guarantees. In particular, XGES favors deleting edges early in the search over inserting edges, which reduces the possibility of the search ending in local optima. A further contribution of this work is an efficient algorithmic formulation of XGES (and GES). We benchmark XGES on simulated datasets with known ground truth. We find that XGES consistently outperforms GES in recovering the correct graphs, and it is 10 times faster. XGES implementations in Python and C++ are available at https://github.com/ANazaret/XGES.

artificial intelligence, machine learning, optimization problem, (18 more...)

2502.19551

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

arXiv.org Artificial IntelligenceFeb-25-2025

Comparative Analysis of MDL-VAE vs. Standard VAE on 202 Years of Gynecological Data

Santos, Paula

This study presents a comparative evaluation of a Variational Autoencoder (VAE) enhanced with Minimum Description Length (MDL) regularization against a Standard Autoencoder for reconstructing high - dimensional gynecological data. The MDL - VAE exhibits significantly lower reconstruction errors (MSE, MAE, RMSE) and more structured latent representations, driven by effective KL divergence regularization. Statistical analyses confirm these performance improvements are significant. Furthermore, the MDL - VAE shows consistent training and validation losses and achieves efficient inference times, underscoring its robustness and practical viability. Our findings suggest that incorporating MDL principles into VAE architectures can substantially improve data reconstruction and generalization, making it a promising approach for advanced applica tions in healthcare data modeling and analysis. Despite substantial advances in medical research, early detection of menstrual disorders and tumors in the female reproductive system remains a significant challenge. This issue is critical because timely detection is essential for improving treatment outcomes, quality of life, and patient survival rates.

artificial intelligence, machine learning, representation, (16 more...)

doi: 10.5121/csit.2025.150403

2502.18412

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Consumer Health (0.34)
Health & Medicine > Therapeutic Area > Oncology (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.56)

arXiv.org Artificial IntelligenceFeb-25-2025

Contrastive Learning with Nasty Noise

Zhao, Ziruo

Contrastive learning has emerged as a powerful paradigm for self-supervised representation learning. This work analyzes the theoretical limits of contrastive learning under nasty noise, where an adversary modifies or replaces training samples. Using PAC learning and VC-dimension analysis, lower and upper bounds on sample complexity in adversarial settings are established. Additionally, data-dependent sample complexity bounds based on the l2-distance function are derived.

algorithm, artificial intelligence, machine learning, (16 more...)

2502.17872

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.56)

arXiv.org Machine LearningFeb-21-2025

Towards Efficient Contrastive PAC Learning

Shen, Jie

We study contrastive learning under the PAC learning framework. While a series of recent works have shown statistical results for learning under contrastive loss, based either on the VC-dimension or Rademacher complexity, their algorithms are inherently inefficient or not implying PAC guarantees. In this paper, we consider contrastive learning of the fundamental concept of linear representations. Surprisingly, even under such basic setting, the existence of efficient PAC learners is largely open. We first show that the problem of contrastive PAC learning of linear representations is intractable to solve in general. We then show that it can be relaxed to a semi-definite program when the distance between contrastive samples is measured by the $\ell_2$-norm. We then establish generalization guarantees based on Rademacher complexity, and connect it to PAC guarantees under certain contrastive large-margin conditions. To the best of our knowledge, this is the first efficient PAC learning algorithm for contrastive learning.

artificial intelligence, complexity, machine learning, (17 more...)

2502.15962

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Yamchote, Phaphontee, Win, Saw Nay Htet, Amornbunchornvej, Chainarong, Noraset, Thanapon

Some Insights of Construction of Feature Graph to Learn Pairwise Feature Interactions with Graph Neural Networks

arXiv.org Machine LearningFeb-19-2025

Feature interaction is crucial in predictive machine learning models, as it captures the relationships between features that influence model performance. In this work, we focus on pairwise interactions and investigate their importance in constructing feature graphs for Graph Neural Networks (GNNs). Rather than proposing new methods, we leverage existing GNN models and tools to explore the relationship between feature graph structures and their effectiveness in modeling interactions. Through experiments on synthesized datasets, we uncover that edges between interacting features are important for enabling GNNs to model feature interactions effectively. We also observe that including non-interaction edges can act as noise, degrading model performance. Furthermore, we provide theoretical support for sparse feature graph selection using the Minimum Description Length (MDL) principle. We prove that feature graphs retaining only necessary interaction edges yield a more efficient and interpretable representation than complete graphs, aligning with Occam's Razor. Our findings offer both theoretical insights and practical guidelines for designing feature graphs that improve the performance and interpretability of GNN models.

artificial intelligence, graph, machine learning, (18 more...)

2502.13471

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

Wendong, Liang, Buchholz, Simon, Schölkopf, Bernhard

Algorithmic causal structure emerging through compression

arXiv.org Artificial IntelligenceFeb-18-2025

We explore the relationship between causality, symmetry, and compression. We build on and generalize the known connection between learning and compression to a setting where causal models are not identifiable. We propose a framework where causality emerges as a consequence of compressing data across multiple environments. We define algorithmic causality as an alternative definition of causality when traditional assumptions for causal identifiability do not hold. We demonstrate how algorithmic causal and symmetric structures can emerge from minimizing upper bounds on Kolmogorov complexity, without knowledge of intervention targets. We hypothesize that these insights may also provide a novel perspective on the emergence of causality in machine learning models, such as large language models, where causal relationships may not be explicitly identifiable.

artificial intelligence, machine learning, mechanism, (19 more...)

2502.0421

Country:

Europe (0.68)
North America > United States (0.27)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

B, Harikrishnan N, Vats, Anuja, Nagaraj, Nithin, Pedersen, Marius

Chaotic Map based Compression Approach to Classification

arXiv.org Artificial IntelligenceFeb-17-2025

Modern machine learning approaches often prioritize performance at the cost of increased complexity, computational demands, and reduced interpretability. This paper introduces a novel framework that challenges this trend by reinterpreting learning from an information-theoretic perspective, viewing it as a search for encoding schemes that capture intrinsic data structures through compact representations. Rather than following the conventional approach of fitting data to complex models, we propose a fundamentally different method that maps data to intervals of initial conditions in a dynamical system. Our GLS (Generalized L\"uroth Series) coding compression classifier employs skew tent maps - a class of chaotic maps - both for encoding data into initial conditions and for subsequent recovery. The effectiveness of this simple framework is noteworthy, with performance closely approaching that of well-established machine learning methods. On the breast cancer dataset, our approach achieves 92.98\% accuracy, comparable to Naive Bayes at 94.74\%. While these results do not exceed state-of-the-art performance, the significance of our contribution lies not in outperforming existing methods but in demonstrating that a fundamentally simpler, more interpretable approach can achieve competitive results.

artificial intelligence, machine learning, sequence, (15 more...)

2502.12302

Country:

Asia > India (0.46)
Europe (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

arXiv.org Artificial IntelligenceFeb-16-2025

Structure based SAT dataset for analysing GNN generalisation

Fu, Yi, Tompkins, Anthony, Song, Yang, Pagnucco, Maurice

Satisfiability (SAT) solvers based on techniques such as conflict driven clause learning (CDCL) have produced excellent performance on both synthetic and real world industrial problems. While these CDCL solvers only operate on a per-problem basis, graph neural network (GNN) based solvers bring new benefits to the field by allowing practitioners to exploit knowledge gained from solved problems to expedite solving of new SAT problems. However, one specific area that is often studied in the context of CDCL solvers, but largely overlooked in GNN solvers, is the relationship between graph theoretic measure of structure in SAT problems and the generalisation ability of GNN solvers. To bridge the gap between structural graph properties (e.g., modularity, self-similarity) and the generalisability (or lack thereof) of GNN based SAT solvers, we present StructureSAT: a curated dataset, along with code to further generate novel examples, containing a diverse set of SAT problems from well known problem domains. Furthermore, we utilise a novel splitting method that focuses on deconstructing the families into more detailed hierarchies based on their structural properties. With the new dataset, we aim to help explain problematic generalisation in existing GNN SAT solvers by exploiting knowledge of structural graph properties. We conclude with multiple future directions that can help researchers in GNN based SAT solving develop more effective and generalisable SAT solvers.

artificial intelligence, deep learning, machine learning, (19 more...)

2502.1141

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Block, Adam, Shetty, Abhishek

Small Loss Bounds for Online Learning Separated Function Classes: A Gaussian Process Perspective

arXiv.org Machine LearningFeb-14-2025

In order to develop practical and efficient algorithms while circumventing overly pessimistic computational lower bounds, recent work has been interested in developing oracle-efficient algorithms in a variety of learning settings. Two such settings of particular interest are online and differentially private learning. While seemingly different, these two fields are fundamentally connected by the requirement that successful algorithms in each case satisfy stability guarantees; in particular, recent work has demonstrated that algorithms for online learning whose performance adapts to beneficial problem instances, attaining the so-called small-loss bounds, require a form of stability similar to that of differential privacy. In this work, we identify the crucial role that separation plays in allowing oracle-efficient algorithms to achieve this strong stability. Our notion, which we term $\rho$-separation, generalizes and unifies several previous approaches to enforcing this strong stability, including the existence of small-separator sets and the recent notion of $\gamma$-approximability. We present an oracle-efficient algorithm that is capable of achieving small-loss bounds with improved rates in greater generality than previous work, as well as a variant for differentially private learning that attains optimal rates, again under our separation condition. In so doing, we prove a new stability result for minimizers of a Gaussian process that strengthens and generalizes previous work.

artificial intelligence, data mining, machine learning, (19 more...)

2502.10292

Country:

North America > United States (1.00)
Asia (0.92)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (0.67)
Education > Educational Setting > Online (0.63)

Technology:

Information Technology > Modeling & Simulation (0.72)
Information Technology > Data Science > Data Mining (0.67)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)