decision list
Privately Learning Decision Lists and a Differentially Private Winnow
We give new differentially private algorithms for the classic problems of learning decision lists and large-margin halfspaces in the PAC and online models. In the PAC model, we give a computationally efficient algorithm for learning decision lists with minimal sample overhead over the best non-private algorithms. In the online model, we give a private analog of the influential Winnow algorithm for learning halfspaces with mistake bound polylogarithmic in the dimension and inverse polynomial in the margin. As an application, we describe how to privately learn decision lists in the online model, qualitatively matching state-of-the-art non-private guarantees.
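For context, the classic (non-private) Winnow update that the paper makes differentially private looks roughly as follows; this sketch uses the standard promotion/demotion factor of 2 and omits the privacy mechanism entirely.

```python
import numpy as np

def winnow(examples, labels, n):
    """Classic (non-private) Winnow for Boolean inputs in {0,1}^n.

    Predicts positive when the weighted sum reaches the threshold n,
    promotes active weights on false negatives and demotes them on
    false positives. The paper's private variant (not reproduced here)
    perturbs this process to satisfy differential privacy.
    """
    w = np.ones(n)          # one weight per attribute, initialised to 1
    theta = n               # standard Winnow threshold
    mistakes = 0
    for x, y in zip(examples, labels):   # y in {0, 1}
        x = np.asarray(x)
        pred = 1 if w @ x >= theta else 0
        if pred != y:
            mistakes += 1
            if y == 1:
                w[x == 1] *= 2.0         # promotion on a false negative
            else:
                w[x == 1] /= 2.0         # demotion on a false positive
    return w, mistakes
```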
Generating Part-Based Global Explanations Via Correspondence
Rathore, Kunal, Tadepalli, Prasad
Deep learning models are notoriously opaque. Existing explanation methods often focus on localized visual explanations for individual images. Concept-based explanations, while offering global insights, require extensive annotations, incurring significant labeling cost. We propose an approach that leverages user-defined part labels from a limited set of images and efficiently transfers them to a larger dataset. This enables the generation of global symbolic explanations by aggregating part-based local explanations, ultimately providing human-understandable explanations for model decisions on a large scale.
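As a rough illustration of the aggregation step only (not the paper's method), one simple way to turn per-image part attributions into a global per-class profile is to average them; the `local_explanations` format and the averaging rule below are assumptions made for this sketch.

```python
from collections import defaultdict

def aggregate_part_explanations(local_explanations):
    """Aggregate per-image part attributions into a global per-class profile.

    local_explanations: iterable of (predicted_class, {part_name: attribution})
    pairs produced by any local explanation method over part-labelled images.
    Averaging per part is one simple aggregation; the paper's symbolic
    aggregation may differ.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for cls, parts in local_explanations:
        counts[cls] += 1
        for part, score in parts.items():
            sums[cls][part] += score
    return {cls: {part: s / counts[cls] for part, s in parts.items()}
            for cls, parts in sums.items()}
```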
Reviews: Distribution-Independent PAC Learning of Halfspaces with Massart Noise
The paper gives a PAC learning algorithm for the basic problem of halfspaces in a model of learning with noise. The algorithm uses ideas from previous related results in the simpler model of random classification noise, together with important new ideas. Learning with noise is a basic topic in learning theory. It can be argued that the most studied models are either unrealistically benign (random misclassification noise, even though the related SQ model is very important) or unrealistically malicious (malicious noise), and there is a great need for the study of more realistic models. The Massart noise model is a candidate for such a model.
Probabilistic Scoring Lists for Interpretable Machine Learning
Hanselle, Jonas, Heid, Stefan, Fürnkranz, Johannes, Hüllermeier, Eyke
A scoring system is a simple decision model that checks a set of features, adds a certain number of points to a total score for each feature that is satisfied, and finally makes a decision by comparing the total score to a threshold. Scoring systems have a long history of active use in safety-critical domains such as healthcare and justice, where they provide guidance for making objective and accurate decisions. Given their genuine interpretability, the idea of learning scoring systems from data is obviously appealing from the perspective of explainable AI. In this paper, we propose a practically motivated extension of scoring systems called probabilistic scoring lists (PSL), as well as a method for learning PSLs from data. Instead of making a deterministic decision, a PSL represents uncertainty in the form of probability distributions, or, more generally, probability intervals. Moreover, in the spirit of decision lists, a PSL evaluates features one by one and stops as soon as a decision can be made with enough confidence. To evaluate our approach, we conduct a case study in the medical domain.
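A minimal sketch of the idea, assuming a finite list of binary features with integer point values and a hypothetical `stage_probs` table giving a probability estimate per partial score at each stage (the paper works with learned distributions or probability intervals):

```python
def score_then_decide(features, points, threshold):
    """Plain scoring system: add points for satisfied features, compare to threshold."""
    total = sum(p for f, p in zip(features, points) if f)
    return total >= threshold

def psl_decide(features, points, stage_probs, confidence=0.9):
    """Sketch of a probabilistic scoring list evaluation.

    stage_probs[k] maps the partial score after the first k features to an
    estimated probability of the positive class. Features are evaluated one
    by one, and evaluation stops as soon as the estimate is confident enough.
    """
    partial = 0
    for k, (f, p) in enumerate(zip(features, points), start=1):
        if f:
            partial += p
        prob = stage_probs[k][partial]
        if prob >= confidence or prob <= 1 - confidence:
            return prob >= confidence, k   # decision and number of features used
    return stage_probs[len(features)][partial] >= 0.5, len(features)
```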
Linear Mode Connectivity in Differentiable Tree Ensembles
Kanoh, Ryuichi, Sugiyama, Mahito
Linear Mode Connectivity (LMC) refers to the phenomenon that performance remains consistent for linearly interpolated models in the parameter space. For independently optimized model pairs from different random initializations, achieving LMC is considered crucial for validating the stable success of the non-convex optimization in modern machine learning models and for facilitating practical parameter-based operations such as model merging. While LMC has been achieved for neural networks by considering the permutation invariance of neurons in each hidden layer, its attainment for other models remains an open question. In this paper, we first achieve LMC for soft tree ensembles, which are tree-based differentiable models extensively used in practice. We show the necessity of incorporating two invariances: subtree flip invariance and splitting order invariance, which do not exist in neural networks but are inherent to tree architectures, in addition to permutation invariance of trees. Moreover, we demonstrate that it is even possible to exclude such additional invariances while keeping LMC by designing decision list-based tree architectures, where such invariances do not exist by definition. Our findings indicate the significance of accounting for architecture-specific invariances in achieving LMC.
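A minimal sketch of how LMC is typically checked, assuming the two models' parameters have already been aligned (e.g., by the permutation, subtree flip, and splitting order corrections discussed above); `eval_loss` is a placeholder for the evaluation routine.

```python
import numpy as np

def lmc_barrier(params_a, params_b, eval_loss, num_points=11):
    """Estimate the loss barrier along the linear path between two models.

    params_a / params_b: lists of parameter arrays with matching shapes
    (after any alignment has been applied).
    eval_loss: function mapping a parameter list to a scalar loss.
    LMC holds (approximately) when the interpolated loss never rises much
    above the endpoint losses.
    """
    alphas = np.linspace(0.0, 1.0, num_points)
    losses = []
    for a in alphas:
        interp = [(1 - a) * pa + a * pb for pa, pb in zip(params_a, params_b)]
        losses.append(eval_loss(interp))
    barrier = max(losses) - max(losses[0], losses[-1])
    return barrier, losses
```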
Nearest Neighbor Representations of Neural Circuits
Kilic, Kordag Mehmet, Sima, Jin, Bruck, Jehoshua
Neural networks successfully capture the computational power of the human brain for many tasks. Similarly inspired by the brain architecture, Nearest Neighbor (NN) representations are a novel approach to computation. We establish a firmer correspondence between NN representations and neural networks. Although it was known how to represent a single neuron using NN representations, there were no results even for small-depth neural networks. Specifically, for depth-2 threshold circuits, we provide explicit constructions for their NN representation with an explicit bound on the number of bits needed to represent it. Example functions include NN representations of convex polytopes (AND of threshold gates), IP2, OR of threshold gates, and linear or exact decision lists.
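For intuition, the known single-neuron case mentioned above can be realized with just two anchor points placed symmetrically about the separating hyperplane; the sketch below shows that construction only, not the paper's depth-2 results.

```python
import numpy as np

def nn_anchors_for_threshold_gate(w, t, eps=1.0):
    """Two-anchor nearest-neighbor representation of a single threshold gate.

    The gate computes f(x) = 1 iff w.x >= t. Placing one anchor on each side
    of the hyperplane w.x = t, offset along w, makes the nearest anchor decide
    the gate: x is closer to the positive anchor exactly when w.x >= t.
    """
    w = np.asarray(w, dtype=float)
    c = t * w / (w @ w)            # a point on the hyperplane w.x = t
    a_pos = c + eps * w            # anchor labelled 1
    a_neg = c - eps * w            # anchor labelled 0
    return a_pos, a_neg

def nn_predict(x, a_pos, a_neg):
    """Evaluate the gate by comparing distances to the two anchors."""
    x = np.asarray(x, dtype=float)
    return int(np.linalg.norm(x - a_pos) <= np.linalg.norm(x - a_neg))
```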
Sample Complexity of Robust Learning against Evasion Attacks
It is becoming increasingly important to understand the vulnerability of machine learning models to adversarial attacks. One of the fundamental problems in adversarial machine learning is to quantify how much training data is needed in the presence of evasion attacks, where data is corrupted at test time. In this thesis, we work with the exact-in-the-ball notion of robustness and study the feasibility of adversarially robust learning from the perspective of learning theory, with a focus on sample complexity. We first explore the setting where the learner has access to random examples only, and show that distributional assumptions are essential. We then focus on learning problems with distributions on the input data that satisfy a Lipschitz condition and show that robustly learning monotone conjunctions has sample complexity at least exponential in the adversary's budget (the maximum number of bits it can perturb on each input). However, if the adversary is restricted to perturbing $O(\log n)$ bits, then one can robustly learn conjunctions and decision lists w.r.t. log-Lipschitz distributions. We then study learning models where the learner is given more power. We first consider local membership queries, where the learner can query the label of points near the training sample. We show that, under the uniform distribution, the exponential dependence on the adversary's budget to robustly learn conjunctions remains inevitable. We then introduce a local equivalence query oracle, which returns whether the hypothesis and target concept agree in a given region around a point in the training sample, and a counterexample if it exists. We show that if the query radius is equal to the adversary's budget, we can develop robust empirical risk minimization algorithms in the distribution-free setting. We give general query complexity upper and lower bounds, as well as bounds for concrete concept classes.
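A minimal sketch of the exact-in-the-ball robust risk over the Boolean hypercube, assuming black-box access to the hypothesis `h` and target concept `c` and using brute-force enumeration of the Hamming ball (exponential in the budget `rho`, for illustration only):

```python
from itertools import combinations

def robustly_correct(x, h, c, rho):
    """Exact-in-the-ball check at one point x in {0,1}^n.

    x is robustly correct if the hypothesis h agrees with the target concept c
    on every point within Hamming distance rho of x (the adversary's budget).
    """
    n = len(x)
    for k in range(rho + 1):
        for flips in combinations(range(n), k):
            z = list(x)
            for i in flips:
                z[i] = 1 - z[i]          # perturb up to rho bits
            if h(z) != c(z):
                return False
    return True

def robust_risk(sample, h, c, rho):
    """Empirical exact-in-the-ball robust risk on a sample of points."""
    return sum(not robustly_correct(x, h, c, rho) for x in sample) / len(sample)
```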
Delivering Inflated Explanations
Izza, Yacine, Ignatiev, Alexey, Stuckey, Peter, Marques-Silva, Joao
In the quest for Explainable Artificial Intelligence (XAI), one of the questions that frequently arises given a decision made by an AI system is, ``why was the decision made in this way?'' Formal approaches to explainability build a formal model of the AI system and use it to reason about the properties of the system. Given a set of feature values for an instance to be explained, and a resulting decision, a formal abductive explanation is a set of features such that, if they take the given values, the decision is always the same. Such an explanation is useful: it shows that only some features were used in making the final decision. But it is also narrow: it only shows that the decision is unchanged if the selected features take their given values, whereas some features might change values and still lead to the same decision. In this paper we formally define inflated explanations, which consist of a set of features and, for each feature, a set of values (always including the value of the instance being explained) such that the decision remains unchanged. Inflated explanations are more informative than abductive explanations since, for example, they allow us to see whether the exact value of a feature is important or whether any nearby value would do. Overall, they allow us to better understand the role of each feature in the decision. We show that inflated explanations can be computed at not much greater cost than abductive explanations, and that duality results for abductive explanations extend to inflated explanations.
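A minimal brute-force sketch of checking an inflated explanation for a model over small finite feature domains; `model`, `domains`, and `value_sets` are placeholders for this illustration, and real implementations rely on the formal reasoning machinery described in the paper rather than enumeration.

```python
from itertools import product

def is_inflated_explanation(model, domains, instance, value_sets):
    """Brute-force check of an inflated explanation over finite feature domains.

    value_sets maps a feature index to the set of values it may take (it must
    contain the instance's own value); unmentioned features are free to take
    any value from their domain. The explanation holds if every such completion
    keeps the model's decision unchanged.
    """
    target = model(instance)
    choices = []
    for i, dom in enumerate(domains):
        if i in value_sets:
            assert instance[i] in value_sets[i]
            choices.append(sorted(value_sets[i]))   # constrained feature
        else:
            choices.append(sorted(dom))             # free feature
    return all(model(list(x)) == target for x in product(*choices))
```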