codomain
Mistake-bounded online learning with operation caps
Geneson, Jesse, Li, Meien, Tang, Linus
We investigate the mistake-bound model of online learning with caps on the number of arithmetic operations per round. We prove general bounds on the minimum number of arithmetic operations per round that are necessary to learn an arbitrary family of functions with finitely many mistakes. We solve a problem on agnostic mistake-bounded online learning with bandit feedback from (Filmus et al., 2024) and (Geneson \& Tang, 2024). We also extend this result to the setting of operation caps.
Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures
Sanborn, Sophia, Mathe, Johan, Papillon, Mathilde, Buracas, Domas, Lillemark, Hansen J, Shewmake, Christian, Bertics, Abby, Pennec, Xavier, Miolane, Nina
The enduring legacy of Euclidean geometry underpins classical machine learning, which, for decades, has been primarily developed for data lying in Euclidean space. Yet, modern machine learning increasingly encounters richly structured data that is inherently non-Euclidean. This data can exhibit intricate geometric, topological and algebraic structure: from the geometry of the curvature of space-time, to topologically complex interactions between neurons in the brain, to the algebraic transformations describing symmetries of physical systems. Extracting knowledge from such non-Euclidean data necessitates a broader mathematical perspective. Echoing the 19th-century revolutions that gave rise to non-Euclidean geometry, an emerging line of research is redefining modern machine learning with non-Euclidean structures. Its goal: generalizing classical methods to unconventional data types with geometry, topology, and algebra. In this review, we provide an accessible gateway to this fast-growing field and propose a graphical taxonomy that integrates recent advances into an intuitive unified framework. We subsequently extract insights into current challenges and highlight exciting opportunities for future development in this field.
Bounds on the price of feedback for mistake-bounded online learning
We improve several worst-case bounds for various online learning scenarios from (Auer and Long, Machine Learning, 1999). In particular, we sharpen an upper bound for delayed ambiguous reinforcement learning by a factor of 2 and an upper bound for learning compositions of families of functions by a factor of 2.41. We also improve a lower bound from the same paper for learning compositions of $k$ families of functions by a factor of $\Theta(\ln{k})$, matching the upper bound up to a constant factor. In addition, we solve a problem from (Long, Theoretical Computer Science, 2020) on the price of bandit feedback with respect to standard feedback for multiclass learning, and we improve an upper bound from (Feng et al., Theoretical Computer Science, 2023) on the price of $r$-input delayed ambiguous reinforcement learning by a factor of $r$, matching a lower bound from the same paper up to the leading term.
Merging with unknown reliability
Such a scenario occurs, but not especially often. Two identical temperature sensors produce readings that are equally likely to be close to the actual value, but a difference in make, age, or position changes their reliability. Two experts hardly ever have exactly the same knowledge, experience, and ability. The reliability of two databases on a certain area may depend on factors that are unknown when merging them. Merging under equal and unequal reliability are two scenarios, but a third exists: unknown reliability. Most previous work in belief merging is about the first [41, 43, 13, 22, 36, 31, 23]; some is about the second [53, 42, 12, 35]; this one is about the third. The difference between equal and unknown reliability becomes clear when its implications are shown on some examples.
eHarmony/aloha
The Aloha libraries provide implementations of machine learning models used at eHarmony. Aloha models are not written in terms of Instances, Tensors, or DataModels. Instead, models are written generically, and different semantics implementations are provided to give meaning to the features extracted from the arbitrary input types on which the models operate. While these differences may not sound extremely useful, together they produce a number of advantages. The most notable is probably the way input features make their way to the models.
Calculus for Machine Learning
Although anyone who has completed a basic course in Maths knows what a function is, it is a good idea to review this basic concept and its associated terminology, just in case you need a refresher. A function is a relationship that defines how one quantity depends on another. A function takes an input from one set and maps it to an output from another set. The input set is known as the domain, and the output set is known as the codomain or target set of the function. There are many ways to denote functions in the machine learning literature and in Mathematics.
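The terminology above can be made concrete with a small sketch. The `FiniteFunction` class, its field names, and the example `square` are illustrative assumptions and do not come from the original text. The sketch models a function on finite sets and shows that the set of values actually reached (the image) may be smaller than the codomain.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FiniteFunction:
    """Hypothetical helper: a function between two finite sets."""
    domain: frozenset      # set of allowed inputs
    codomain: frozenset    # set in which every output must lie
    rule: Callable         # how an input is mapped to an output

    def __post_init__(self):
        # Every input must be sent to some element of the codomain.
        for x in self.domain:
            if self.rule(x) not in self.codomain:
                raise ValueError(f"{x!r} is mapped outside the codomain")

    def __call__(self, x):
        if x not in self.domain:
            raise ValueError(f"{x!r} is not in the domain")
        return self.rule(x)

    def image(self) -> frozenset:
        # Outputs that are actually reached; always a subset of the codomain.
        return frozenset(self.rule(x) for x in self.domain)


# Example: squaring on {-2, ..., 2} with codomain {0, ..., 5}.
square = FiniteFunction(
    domain=frozenset({-2, -1, 0, 1, 2}),
    codomain=frozenset(range(6)),
    rule=lambda x: x * x,
)
```

Here `square.image()` contains only 0, 1, and 4, while the codomain also contains 2, 3, and 5. The codomain is the declared target set, not necessarily the set of values the function takes.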
Towards arrow-theoretic semantics of ontologies: conceptories
Ontologies [1] are used in computer science for representing and sharing knowledge about the real world. Usually ontological structures are described in terms of classes (of things) and relationships (between things). This is rather similar to the category-theoretic notions of objects and morphisms (see [2, 3] for information about algebraic category theory). Since category theory already brings us many benefits in other areas of computer science, it is desirable to find arrow-theoretic approaches in the area of knowledge representation. Some authors have proposed category-theoretic techniques that are helpful in different aspects of knowledge representation [5, 6]. Usually they operate with (co)limits, which are convenient for merging and interoperating between existing models and metamodels. Our aim is to find category-theoretic tools that would be useful for describing ontological models from the very beginning.
Automatically Discovering Hidden Transformation Chaining Constraints
Chenouard, Raphael, Jouault, Frédéric
Model transformations operate on models conforming to precisely defined metamodels. Consequently, it often seems relatively easy to chain them: the output of a transformation may be given as input to a second one if metamodels match. However, this simple rule has some obvious limitations. For instance, a transformation may only use a subset of a metamodel. Therefore, chaining transformations appropriately requires more information. We present here an approach that automatically discovers more detailed information about actual chaining constraints by statically analyzing transformations. The objective is to provide developers who decide to chain transformations with more data on which to base their choices. This approach has been successfully applied to the case of a library of endogenous transformations. They all have the same source and target metamodel but have some hidden chaining constraints. In such a case, the simple metamodel matching rule given above does not provide any useful information.
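The limitation of the simple matching rule can be illustrated with a toy sketch. Everything here is a hypothetical illustration and not the authors' static analysis: the `Transformation` class, the `requires` and `produces` sets, and the metamodel name `MM` are all assumptions. For a library of endogenous transformations, the rule "target metamodel equals source metamodel" accepts every ordered pair, while a finer check on element kinds rejects some of them.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Transformation:
    """Hypothetical description of a model transformation."""
    name: str
    source: str                        # source metamodel name
    target: str                        # target metamodel name
    requires: frozenset = frozenset()  # element kinds expected in the input (assumed)
    produces: frozenset = frozenset()  # element kinds present in the output (assumed)


def metamodels_match(first: Transformation, second: Transformation) -> bool:
    # The simple rule: the output metamodel of `first` is the input metamodel of `second`.
    return first.target == second.source


def elements_compatible(first: Transformation, second: Transformation) -> bool:
    # Toy refinement: `second` only needs the element kinds that `first` produces.
    return metamodels_match(first, second) and second.requires <= first.produces


# An endogenous library: every transformation maps MM to MM.
library = [
    Transformation("t1", "MM", "MM", frozenset({"A"}), frozenset({"B"})),
    Transformation("t2", "MM", "MM", frozenset({"B"}), frozenset({"C"})),
    Transformation("t3", "MM", "MM", frozenset({"C"}), frozenset({"A"})),
]

simple_pairs = [(a.name, b.name) for a, b in permutations(library, 2)
                if metamodels_match(a, b)]
refined_pairs = [(a.name, b.name) for a, b in permutations(library, 2)
                 if elements_compatible(a, b)]
```

With this toy library, `simple_pairs` contains all six ordered pairs, so the simple rule carries no information. `refined_pairs` keeps only the three pairs whose element kinds line up.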