AITopics

2407.19777

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > Denmark (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Walker, Thomas, Lomuscio, Alessio

Tightening the Evaluation of PAC Bounds Using Formal Verification Results

arXiv.org Artificial IntelligenceJul-29-2024

Probably Approximately Correct (PAC) bounds are widely used to derive probabilistic guarantees for the generalisation of machine learning models. They highlight the components of the model which contribute to its generalisation capacity. However, current state-of-the-art results are loose in approximating the generalisation capacity of deployed machine learning models. Consequently, while PAC bounds are theoretically useful, their applicability for evaluating a model's generalisation property in a given operational design domain is limited. The underlying classical theory is supported by the idea that bounds can be tightened when the number of test points available to the user to evaluate the model increases. Yet, in the case of neural networks, the number of test points required to obtain bounds of interest is often impractical even for small problems. In this paper, we take the novel approach of using the formal verification of neural systems to inform the evaluation of PAC bounds. Rather than using pointwise information obtained from repeated tests, we use verification results on regions around test points. We show that conditioning existing bounds on verification results leads to a tightening proportional to the underlying probability mass of the verified region.

generalisation, neural network, verification, (13 more...)

2407.20122

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)

Blum, Avrim, Ravichandran, Kavya

A Model for Combinatorial Dictionary Learning and Inference

arXiv.org Artificial IntelligenceJul-25-2024

We are often interested in decomposing complex, structured data into simple components that explain the data. The linear version of this problem is well-studied as dictionary learning and factor analysis. In this work, we propose a combinatorial model in which to study this question, motivated by the way objects occlude each other in a scene to form an image. First, we identify a property we call "well-structuredness" of a set of low-dimensional components which ensures that no two components in the set are too similar. We show how well-structuredness is sufficient for learning the set of latent components comprising a set of sample instances. We then consider the problem: given a set of components and an instance generated from some unknown subset of them, identify which parts of the instance arise from which components. We consider two variants: (1) determine the minimal number of components required to explain the instance; (2) determine the correct explanation for as many locations as possible. For the latter goal, we also devise a version that is robust to adversarial corruptions, with just a slightly stronger assumption on the components. Finally, we show that the learning problem is computationally infeasible in the absence of any assumptions.

background, pixel, probability, (14 more...)

2407.18436

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Africa > Eswatini > Manzini > Manzini (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.40)

Industry: Education (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Charusaie, Mohammad-Amin, Samadi, Samira

A Unifying Post-Processing Framework for Multi-Objective Learn-to-Defer Problems

arXiv.org Artificial IntelligenceJul-17-2024

Learn-to-Defer is a paradigm that enables learning algorithms to work not in isolation but as a team with human experts. In this paradigm, we permit the system to defer a subset of its tasks to the expert. Although there are currently systems that follow this paradigm and are designed to optimize the accuracy of the final human-AI team, the general methodology for developing such systems under a set of constraints (e.g., algorithmic fairness, expert intervention budget, defer of anomaly, etc.) remains largely unexplored. In this paper, using a $d$-dimensional generalization to the fundamental lemma of Neyman and Pearson (d-GNP), we obtain the Bayes optimal solution for learn-to-defer systems under various constraints. Furthermore, we design a generalizable algorithm to estimate that solution and apply this algorithm to the COMPAS and ACSIncome datasets. Our algorithm shows improvements in terms of constraint violation over a set of baselines.

classifier, constraint, probability, (15 more...)

2407.1271

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France (0.04)
Africa > South Africa (0.04)

Genre: Research Report (0.63)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Transportation > Ground (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.87)
(2 more...)

Braverman, Vladimir, Dharangutte, Prathamesh, Shah, Vihan, Wang, Chen

Learning-augmented Maximum Independent Set

arXiv.org Artificial IntelligenceJul-16-2024

We study the Maximum Independent Set (MIS) problem on general graphs within the framework of learning-augmented algorithms. The MIS problem is known to be NP-hard and is also NP-hard to approximate to within a factor of $n^{1-\delta}$ for any $\delta>0$. We show that we can break this barrier in the presence of an oracle obtained through predictions from a machine learning model that answers vertex membership queries for a fixed MIS with probability $1/2+\varepsilon$. In the first setting we consider, the oracle can be queried once per vertex to know if a vertex belongs to a fixed MIS, and the oracle returns the correct answer with probability $1/2 + \varepsilon$. Under this setting, we show an algorithm that obtains an $\tilde{O}(\sqrt{\Delta}/\varepsilon)$-approximation in $O(m)$ time where $\Delta$ is the maximum degree of the graph. In the second setting, we allow multiple queries to the oracle for a vertex, each of which is correct with probability $1/2 + \varepsilon$. For this setting, we show an $O(1)$-approximation algorithm using $O(n/\varepsilon^2)$ total queries and $\tilde{O}(m)$ runtime.

algorithm, query, vertex, (14 more...)

2407.11364

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(22 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)

Faber, Daniel, Jabrayilov, Adalat, Mutzel, Petra

SAT Encoding of Partial Ordering Models for Graph Coloring Problems

arXiv.org Artificial IntelligenceJul-12-2024

In this paper, we revisit SAT encodings of the partial-ordering based ILP model for the graph coloring problem (GCP) and suggest a generalization for the bandwidth coloring problem (BCP). The GCP asks for the minimum number of colors that can be assigned to the vertices of a given graph such that each two adjacent vertices get different colors. The BCP is a generalization, where each edge has a weight that enforces a minimal'distance' between the assigned colors, and the goal is to minimize the'largest' color used. For the widely studied GCP, we experimentally compare the partial-ordering based SAT encoding to the state-of-the-art approaches on the DIMACS benchmark set. Our evaluation confirms that this SAT encoding is effective for sparse graphs and even outperforms the state-of-the-art on some DIMACS instances. For the BCP, our theoretical analysis shows that the partial-ordering based SAT and ILP formulations have an asymptotically smaller size than that of the classical assignment-based model. Our practical evaluation confirms not only a dominance compared to the assignment-based encodings but also to the state-of-the-art approaches on a set of benchmark instances. Up to our knowledge, we have solved several open instances of the BCP from the literature for the first time.

constraint, formulation, vertex, (15 more...)

2403.15961

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

arXiv.org Machine LearningJul-10-2024

Ramsey Theorems for Trees and a General 'Private Learning Implies Online Learning' Theorem

Fioravanti, Simone, Hanneke, Steve, Moran, Shay, Schefler, Hilla, Tsubari, Iska

This work continues to investigate the link between differentially private (DP) and online learning. Alon, Livni, Malliaris, and Moran (2019) showed that for binary concept classes, DP learnability of a given class implies that it has a finite Littlestone dimension (equivalently, that it is online learnable). Their proof relies on a model-theoretic result by Hodges (1997), which demonstrates that any binary concept class with a large Littlestone dimension contains a large subclass of thresholds. In a follow-up work, Jung, Kim, and Tewari (2020) extended this proof to multiclass PAC learning with a bounded number of labels. Unfortunately, Hodges's result does not apply in other natural settings such as multiclass PAC learning with an unbounded label space, and PAC learning of partial concept classes. This naturally raises the question of whether DP learnability continues to imply online learnability in more general scenarios: indeed, Alon, Hanneke, Holzman, and Moran (2021) explicitly leave it as an open question in the context of partial concept classes, and the same question is open in the general multiclass setting. In this work, we give a positive answer to these questions showing that for general classification tasks, DP learnability implies online learnability. Our proof reasons directly about Littlestone trees, without relying on thresholds. We achieve this by establishing several Ramsey-type theorems for trees, which might be of independent interest.

algorithm, subtree, vertex, (16 more...)

2407.07765

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > Austria > Styria > Graz (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.82)

Industry:

Education > Educational Setting > Online (0.60)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.88)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.60)

arXiv.org Artificial IntelligenceJul-7-2024

Foundations and Frontiers of Graph Learning Theory

Huang, Yu, Zhou, Min, Yang, Menglin, Wang, Zhen, Zhang, Muhan, Wang, Jie, Xie, Hong, Wang, Hao, Lian, Defu, Chen, Enhong

Recent advancements in graph learning have revolutionized the way to understand and analyze data with complex structures. Notably, Graph Neural Networks (GNNs), i.e. neural network architectures designed for learning graph representations, have become a popular paradigm. With these models being usually characterized by intuition-driven design or highly intricate components, placing them within the theoretical analysis framework to distill the core concepts, helps understand the key principles that drive the functionality better and guide further development. Given this surge in interest, this article provides a comprehensive summary of the theoretical foundations and breakthroughs concerning the approximation and learning behaviors intrinsic to prevalent graph learning models. Encompassing discussions on fundamental aspects such as expressiveness power, generalization, optimization, and unique phenomena such as over-smoothing and over-squashing, this piece delves into the theoretical foundations and frontier driving the evolution of graph learning. In addition, this article also presents several challenges and further initiates discussions on possible solutions.

gnn, graph, neural network, (13 more...)

2407.03125

Country:

North America > United States > District of Columbia > Washington (0.04)
North America > Canada > British Columbia (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Afzali, Mohammad, Ashtiani, Hassan, Liaw, Christopher

Agnostic Private Density Estimation via Stable List Decoding

arXiv.org Machine LearningJul-5-2024

We introduce a new notion of stability--which we call stable list decoding--and demonstrate its applicability in designing differentially private density estimators. This definition is weaker than global stability [ABLMM22] and is related to the notions of replicability [ILPS22] and list replicability [CMY23]. We show that if a class of distributions is stable list decodable, then it can be learned privately in the agnostic setting. As the main application of our framework, we prove the first upper bound on the sample complexity of private density estimation for Gaussian Mixture Models in the agnostic setting, extending the realizable result of Afzali et al. [AAL24].

algorithm, sample complexity, stable list, (14 more...)

2407.04783

Country:

North America > Canada > Ontario > Hamilton (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Ang, Andersen, Hamed, Waqas Bin, De Sterck, Hans

Sum-of-norms regularized Nonnegative Matrix Factorization

arXiv.org Machine LearningJun-30-2024

When applying nonnegative matrix factorization (NMF), generally the rank parameter is unknown. Such rank in NMF, called the nonnegative rank, is usually estimated heuristically since computing the exact value of it is NP-hard. In this work, we propose an approximation method to estimate such rank while solving NMF on-the-fly. We use sum-of-norm (SON), a group-lasso structure that encourages pairwise similarity, to reduce the rank of a factor matrix where the rank is overestimated at the beginning. On various datasets, SON-NMF is able to reveal the correct nonnegative rank of the data without any prior knowledge nor tuning. SON-NMF is a nonconvx nonsmmoth non-separable non-proximable problem, solving it is nontrivial. First, as rank estimation in NMF is NP-hard, the proposed approach does not enjoy a lower computational complexity. Using a graph-theoretic argument, we prove that the complexity of the SON-NMF is almost irreducible. Second, the per-iteration cost of any algorithm solving SON-NMF is possibly high, which motivated us to propose a first-order BCD algorithm to approximately solve SON-NMF with a low per-iteration cost, in which we do so by the proximal average operator. Lastly, we propose a simple greedy method for post-processing. SON-NMF exhibits favourable features for applications. Beside the ability to automatically estimate the rank from data, SON-NMF can deal with rank-deficient data matrix, can detect weak component with small energy. Furthermore, on the application of hyperspectral imaging, SON-NMF handle the issue of spectral variability naturally.

complexity, dataset, son-nmf, (14 more...)

2407.00706

Country:

North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Europe > United Kingdom > England > Hampshire > Southampton (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)