AITopics

2202.0103

Country:

North America > United States > Texas (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Germany > Baden-Württemberg (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Marsili, Matteo, Roudi, Yasser

Quantifying Relevance in Learning and Inference

arXiv.org Machine LearningFeb-1-2022

Learning is a distinctive feature of intelligent behaviour. High-throughput experimental data and Big Data promise to open new windows on complex systems such as cells, the brain or our societies. Yet, the puzzling success of Artificial Intelligence and Machine Learning shows that we still have a poor conceptual understanding of learning. These applications push statistical inference into uncharted territories where data is high-dimensional and scarce, and prior information on "true" models is scant if not totally absent. Here we review recent progress on understanding learning, based on the notion of "relevance". The relevance, as we define it here, quantifies the amount of information that a dataset or the internal representation of a learning machine contains on the generative model of the data. This allows us to define maximally informative samples, on one hand, and optimal learning machines on the other. These are ideal limits of samples and of machines, that contain the maximal amount of information about the unknown generative process, at a given resolution (or level of compression). Both ideal limits exhibit critical features in the statistical sense: Maximally informative samples are characterised by a power-law frequency distribution (statistical criticality) and optimal learning machines by an anomalously large susceptibility. The trade-off between resolution (i.e. compression) and relevance distinguishes the regime of noisy representations from that of lossy compression. These are separated by a special point characterised by Zipf's law statistics. This identifies samples obeying Zipf's law as the most compressed loss-less representations that are optimal in the sense of maximal relevance. Criticality in optimal learning machines manifests in an exponential degeneracy of energy levels, that leads to unusual thermodynamic properties.

information, relevance, representation, (16 more...)

2202.00339

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
North America > United States > Ohio (0.04)
(6 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine LearningJan-23-2022

Active Learning Polynomial Threshold Functions

Ben-Eliezer, Omri, Hopkins, Max, Yang, Chutong, Yu, Hantao

We initiate the study of active learning polynomial threshold functions (PTFs). While traditional lower bounds imply that even univariate quadratics cannot be non-trivially actively learned, we show that allowing the learner basic access to the derivatives of the underlying classifier circumvents this issue and leads to a computationally efficient algorithm for active learning degree-$d$ univariate PTFs in $\tilde{O}(d^3\log(1/\varepsilon\delta))$ queries. We also provide near-optimal algorithms and analyses for active learning PTFs in several average case settings. Finally, we prove that access to derivatives is insufficient for active learning multivariate PTFs, even those of just two variables.

learning, query, query complexity, (13 more...)

2201.09433

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.93)

arXiv.org Artificial IntelligenceJan-15-2022

Hyperplane bounds for neural feature mappings

Yepes, Antonio Jimeno

When minimising the empirical risk, the generalisation of the learnt function still depends on the performance on the training data, the Vapnik-Chervonenkis(VC)- dimension of the function and the number of training examples. Neural networks have a large number of parameters, which correlates with their VC-dimension that is typically large but not infinite, and typically a large number of training instances are needed to effectively train them. In this work, we explore how to optimize feature mappings using neural network with the intention to reduce the effective VC-dimension of the hyperplane found in the space generatedby the mapping. An interpretationofthe resultsofthis study isthat it ispossible to define a loss that controls the VC-dimension of the separating hyperplane. We evaluate this approach and observe that the performance when using this method improves when the size of the training set is small.

feature space, hyperplane, neural network, (13 more...)

2201.05799

Country:

Oceania > Australia > Victoria (0.04)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

arXiv.org Artificial IntelligenceJan-13-2022

Exact learning for infinite families of concepts

Moshkov, Mikhail

In this paper, based on results of exact learning, test theory, and rough set theory, we study arbitrary infinite families of concepts each of which consists of an infinite set of elements and an infinite set of subsets of this set called concepts. We consider the notion of a problem over a family of concepts that is described by a finite number of elements: for a given concept, we should recognize which of the elements under consideration belong to this concept. As algorithms for problem solving, we consider decision trees of five types: (i) using membership queries, (ii) using equivalence queries, (iii) using both membership and equivalence queries, (iv) using proper equivalence queries, and (v) using both membership and proper equivalence queries. As time complexity, we study the depth of decision trees. In the worst case, with the growth of the number of elements in the problem description, the minimum depth of decision trees of the first type either grows as a logarithm or linearly, and the minimum depth of decision trees of each of the other types either is bounded from above by a constant or grows as a logarithm, or linearly. The obtained results allow us to distinguish seven complexity classes of infinite families of concepts.

decision tree, equivalence query, query, (16 more...)

2201.08225

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > Slovakia > Bratislava > Bratislava (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

#artificialintelligenceDec-30-2021, 16:39:32 GMT

Pinaki Laskar on LinkedIn: #AI #MachineLearning #DeepLearning

AI Researcher, Cognitive Technologist Inventor - AI Thinking, Think Chain Innovator - AIOT, XAI, Autonomous Cars, IIOT Founder Fisheyebox Spatial Computing Savant, Transformative Leader, Industry X.0 Practitioner Meta-AI is about modeling and simulating reality, causality, mentality by digital technologies. It is key source of data is science as the sum of universal knowledge, all the world's information as coordinated and systematized. It is typically divided into three major branches that consist of the following, - the natural sciences (e.g., biology, chemistry, and physics), which study nature in the broadest sense; - the social sciences (e.g., economics, psychology, and sociology), which study individuals and societies; - the formal sciences (e.g., logic, mathematics, and theoretical computer science), which deal with symbols governed by rules; As to empiricism, stating that knowledge comes only or primarily from sensory experience, both the philosophical sciences and the formal sciences as well as mathematics are out of any science as they do not rely on empirical evidence. It is plain and clear, data or information or knowledge have real value if only coordinated and systematized and organized. Again, drawing on pattern recognition and computational learning theory, Meta-ML is dedicated to the study of problem-solving by computer programs in general, enabling computers to reason about the world and learn from data, to effectively interact with any realities, physical, mental, social, digital, or virtual. AI/ML/DL modelling should consist of the following necessary features, Meta-physical Assumptions: prior knowledge, the basis of our knowing, understanding, or thinking about the whole world or a domain problem (primary causes, principles and elements).

deeplearning, machinelearning, pinaki laskar, (4 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.58)

Leventi-Peetz, A. -M., Peetz, Jörg-Volker, Rohde, Martina

ML Supported Predictions for SAT Solvers Performance

arXiv.org Artificial IntelligenceDec-17-2021

In order to classify the indeterministic termination behavior of the open source SAT solver CryptoMiniSat in multi-threading mode while processing hard to solve boolean satisfiability problem instances, internal solver runtime parameters have been collected and analyzed. A subset of these parameters has been selected and employed as features vector to successfully create a machine learning model for the binary classification of the solver's termination behavior with any single new solving run of a not yet solved instance. The model can be used for the early estimation of a solving attempt as belonging or not belonging to the class of candidates with good chances for a fast termination. In this context a combination of active profiles of runtime characteristics appear to mirror the influence of the solver's momentary heuristics on the immediate quality of the solver's resolution process. Because runtime parameters of already the first two solving iterations are enough to forecast termination of the attempt with good success scores, the results of the present work deliver a promising basis which can be further developed in order to enrich CryptoMiniSat or generally any modern SAT solver with AI abilities.

iteration, runtime parameter, solver, (14 more...)

doi: 10.1007/978-3-030-32520-6_7

2112.09438

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Europe > United Kingdom > Wales > Swansea (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.97)

Wang, Chaoqi, Singla, Adish, Chen, Yuxin

Teaching an Active Learner with Contrastive Examples

arXiv.org Machine LearningDec-10-2021

We study the problem of active learning with the added twist that the learner is assisted by a helpful teacher. We consider the following natural interaction protocol: At each round, the learner proposes a query asking for the label of an instance $x^q$, the teacher provides the requested label $\{x^q, y^q\}$ along with explanatory information to guide the learning process. In this paper, we view this information in the form of an additional contrastive example ($\{x^c, y^c\}$) where $x^c$ is picked from a set constrained by $x^q$ (e.g., dissimilar instances with the same label). Our focus is to design a teaching algorithm that can provide an informative sequence of contrastive examples to the learner to speed up the learning process. We show that this leads to a challenging sequence optimization problem where the algorithm's choices at a given round depend on the history of interactions. We investigate an efficient teaching algorithm that adaptively picks these contrastive examples. We derive strong performance guarantees for our algorithm based on two problem-dependent parameters and further show that for specific types of active learners (e.g., a generalized binary search learner), the proposed teaching algorithm exhibits strong approximation guarantees. Finally, we illustrate our bounds and demonstrate the effectiveness of our teaching framework via two numerical case studies.

active learner, hypothesis, learner, (15 more...)

2110.14888

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Abbas, Amira, Sutter, David, Figalli, Alessio, Woerner, Stefan

Effective dimension of machine learning models

arXiv.org Machine LearningDec-9-2021

Making statements about the performance of trained models on tasks involving new data is one of the primary goals of machine learning, i.e., to understand the generalization power of a model. Various capacity measures try to capture this ability, but usually fall short in explaining important characteristics of models that we observe in practice. In this study, we propose the local effective dimension as a capacity measure which seems to correlate well with generalization error on standard data sets. Importantly, we prove that the local effective dimension bounds the generalization error and discuss the aptness of this capacity measure for machine learning models.

dimension, effective dimension, local effective dimension, (11 more...)

2112.04807

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre:

Research Report (0.70)
Overview (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)

arXiv.org Artificial IntelligenceDec-1-2021

First Steps of an Approach to the ARC Challenge based on Descriptive Grid Models and the Minimum Description Length Principle

Ferré, Sébastien

The Abstraction and Reasoning Corpus (ARC) was recently introduced by Fran\c{c}ois Chollet as a tool to measure broad intelligence in both humans and machines. It is very challenging, and the best approach in a Kaggle competition could only solve 20% of the tasks, relying on brute-force search for chains of hand-crafted transformations. In this paper, we present the first steps exploring an approach based on descriptive grid models and the Minimum Description Length (MDL) principle. The grid models describe the contents of a grid, and support both parsing grids and generating grids. The MDL principle is used to guide the search for good models, i.e. models that compress the grids the most. We report on our progress over a year, improving on the general approach and the models. Out of the 400 training tasks, our performance increased from 5 to 29 solved tasks, only using 30s computation time per task. Our approach not only predicts the output grids, but also outputs an intelligible model and explanations for how the model was incrementally built.

grid, grid model, output grid, (14 more...)

2112.00848

Country: Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)

Genre:

Research Report (0.64)
Workflow (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (1.00)