fundamental theorem
Null Measurability at the Symmetrization Interface in VC Learning
Recent work revisiting measurability in the fundamental theorem of statistical learning imposes Borel measurability of ghost-gap suprema. We show that, at the one-sided ghost-gap interface actually used by the standard symmetrization proof, this requirement is stronger than necessary. For any Borel-parameterized concept class on a Polish domain, the bad event "there exists a hypothesis whose ghost empirical error exceeds its training empirical error by at least ฮต/2" is analytic. By Choquet capacitability, it is therefore measurable in the completion of every finite Borel measure. We then construct a concept class whose bad event is null-measurable but not Borel, giving a strict separation from the Borel supremum condition. Finally, we prove closure under patching, fixed and countable interpolation, and fiber-product amalgamation, showing that the weaker regularity level is stable under natural concept-class constructors. In the realizable setting, where targets belong to the class and are measurable, these results weaken the measurability hypothesis needed by the symmetrization route from finite VC dimension to PAC learnability. The main results and the descriptive-set-theoretic infrastructure used by them are formalized in Lean 4.
Recursively Enumerably Representable Classes and Computable Versions of the Fundamental Theorem of Statistical Learning
Kattermann, David, Krapp, Lothar Sebastian
We study computable probably approximately correct (CPAC) learning, where learners are required to be computable functions. It had been previously observed that the Fundamental Theorem of Statistical Learning, which characterizes PAC learnability by finiteness of the Vapnik-Chervonenkis (VC-)dimension, no longer holds in this framework. Recent works recovered analogs of the Fundamental Theorem in the computable setting, for instance by introducing an effective VC-dimension. Guided by this, we investigate the connection between CPAC learning and recursively enumerable representable (RER) classes, whose members can be algorithmically listed. Our results show that the effective VC-dimensions can take arbitrary values above the traditional one, even for RER classes, which creates a whole family of (non-)examples for various notions of CPAC learning. Yet the two dimensions coincide for classes satisfying sufficiently strong notions of CPAC learning. We then observe that CPAC learnability can also be characterized via containment of RER classes that realize the same samples. Furthermore, it is shown that CPAC learnable classes satisfying a unique identification property are necessarily RER. Finally, we establish that agnostic learnability can be guaranteed for RER classes, by considering the relaxed notion of nonuniform CPAC learning.
If you can distinguish, you can express: Galois theory, Stone--Weierstrass, machine learning, and linguistics
Blum-Smith, Ben, Brugman, Claudia, Conners, Thomas, Villar, Soledad
This essay develops a parallel between the Fundamental Theorem of Galois Theory and the Stone--Weierstrass theorem: both can be viewed as assertions that tie the distinguishing power of a class of objects to their expressive power. We provide an elementary theorem connecting the relevant notions of "distinguishing power". We also discuss machine learning and data science contexts in which these theorems, and more generally the theme of links between distinguishing power and expressive power, appear. Finally, we discuss the same theme in the context of linguistics, where it appears as a foundational principle, and illustrate it with several examples.
The Emergence of Grammar through Reinforcement Learning
Wechsler, Stephen, Shearer, James W., Erk, Katrin
Reinforcement learning in psychology (as opposed to machine learning) refers to a family of mathematical models of how animals and humans learn. It has its origins with Thorndike's Law of Effect: behavior with positive outcomes is reinforced and likely to be repeated (learned). Reinforcement learning is part of a larger family of stochastic learning models where behavior is probabilistic (Bush and Mosteller 1951, 1953, 1955). The key ideas are that the STATE OF LEARNING of a SUBJECT (person or animal) is represented by a vector in a STATE SPACE. The subject's behavior (or RESPONSE) given a STIMULUS is not deterministic, but depends on probabilities determined by the state of learning. The OUTCOME(or PAYOFF) changes the state of learning. In reinforcement learning, the relative size of the payoff determines how strongly (if at all) the behavior is reinforced.
Measurability in the Fundamental Theorem of Statistical Learning
Krapp, Lothar Sebastian, Wirth, Laura
The Fundamental Theorem of Statistical Learning states that a hypothesis space is PAC learnable if and only if its VC dimension is finite. For the agnostic model of PAC learning, the literature so far presents proofs of this theorem that often tacitly impose several measurability assumptions on the involved sets and functions. We scrutinize these proofs from a measure-theoretic perspective in order to extract the assumptions needed for a rigorous argument. This leads to a sound statement as well as a detailed and self-contained proof of the Fundamental Theorem of Statistical Learning in the agnostic setting, showcasing the minimal measurability requirements needed. We then discuss applications in Model Theory, considering NIP and o-minimal structures. Our main theorem presents sufficient conditions for the PAC learnability of hypothesis spaces defined over o-minimal expansions of the reals.
Theory and Explicit Design of a Path Planner for an SE(3) Robot
Zhang, Zhaoqi, Chiang, Yi-Jen, Yap, Chee
We consider path planning for a rigid spatial robot with 6 degrees of freedom (6 DOFs), moving amidst polyhedral obstacles. A correct, complete and practical path planner for such a robot has never been achieved, although this is widely recognized as a key challenge in robotics. This paper provides a complete "explicit" design, down to explicit geometric primitives that are easily implementable. Our design is within an algorithmic framework for path planners, called Soft Subdivision Search (SSS). The framework is based on the twin foundations of $\epsilon$-exactness and soft predicates, which are critical for rigorous numerical implementations. The practicality of SSS has been previously demonstrated for various robots including 5-DOF spatial robots. In this paper, we solve several significant technical challenges for SE(3) robots: (1) We first ensure the correct theory by proving a general form of the Fundamental Theorem of the SSS theory. We prove this within an axiomatic framework, thus making it easy for future applications of this theory. (2) One component of $SE(3) = R^3 \times SO(3)$ is the non-Euclidean, non-orientable space SO(3). We design a novel topologically correct data structure for SO(3). Using the concept of subdivision charts and atlases for SO(3), we can now carry out subdivision of SO(3). (3) The geometric problem of collision detection takes place in $R^3$, via the footprint map. Unlike sampling-based approaches, we must reason with the notion of footprints of configuration boxes, which is much harder to characterize. Exploiting the theory of soft predicates, we design suitable approximate footprints which, when combined with the highly effective feature-set technique, lead to soft predicates. (4) Finally, we make the underlying geometric computation "explicit", i.e., avoiding a general solver of polynomial systems, in order to allow a direct implementation.
A Formal Proof of the Expressiveness of Deep Learning - Journal of Automated Reasoning
Deep learning algorithms enable computers to perform tasks that seem beyond what we can program them to do using traditional techniques. In recent years, we have seen the emergence of unbeatable computer go players, practical speech recognition systems, and self-driving cars. These algorithms also have applications to image recognition, bioinformatics, and many other domains. Yet, on the theoretical side, we are only starting to understand why deep learning works so well. Recently, Cohen et al. [16] used tensor theory to explain the superiority of deep learning over shallow learning for one specific learning architecture called convolutional arithmetic circuits (CACs). Machine learning algorithms attempt to model abstractions of their input data.
Revisit the Fundamental Theorem of Linear Algebra
This survey is meant to provide an introduction to the fundamental theorem of linear algebra and the theories behind them. Our goal is to give a rigorous introduction to the readers with prior exposure to linear algebra. Specifically, we provide some details and proofs of some results from (Strang, 1993). We then describe the fundamental theorem of linear algebra from different views and find the properties and relationships behind the views. The fundamental theorem of linear algebra is essential in many fields, such as electrical engineering, computer science, machine learning, and deep learning.
Revisit the Fundamental Theorem of Linear Algebra
This survey is meant to provide an introduction to the fundamental theorem of linear algebra and the theories behind them. Our goal is to give a rigorous introduction to the readers with prior exposure to linear algebra. Specifically, we provide some details and proofs of some results from (Strang, 1993). We then describe the fundamental theorem of linear algebra from different views and find the properties and relationships behind the views. The fundamental theorem of linear algebra is essential in many fields, such as electrical engineering, computer science, machine learning, and deep learning. This survey is primarily a summary of purpose, significance of important theories behind it. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in theory behind the fundamental theorem of linear algebra and rigorous analysis in order to seamlessly introduce its properties in four subspaces in subsequent sections. However, we clearly realize our inability to cover all the useful and interesting results and given the paucity of scope to present this discussion, e.g., the separated analysis of the (orthogonal) projection matrices. We refer the reader to literature in the field of linear algebra for a more detailed introduction to the related fields. Some excellent examples include (Rose, 1982; Strang, 2009; Trefethen and Bau III, 1997; Strang, 2019, 2021).