Can neural operators always be continuously discretized?

Furuya, Takashi

Neural Information Processing Systems

We consider the problem of discretization of neural operators between Hilbert spaces in a general framework including skip connections. We focus on bijective neural operators through the lens of diffeomorphisms in infinite dimensions.
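
A minimal NumPy sketch may help fix ideas: one discretized neural-operator layer with a skip connection, u |-> u + sigma(Ku), where the integral operator (Ku)(x) = \int k(x,y) u(y) dy becomes a kernel matrix on an n-point grid. The Gaussian kernel, random perturbation W, and tanh nonlinearity are illustrative stand-ins, not the paper's construction; the paper asks whether such layers, especially bijective ones, admit discretizations that behave continuously as the grid is refined.

    # One neural-operator layer on an n-point grid, with a skip connection.
    import numpy as np

    def make_layer(n, rng):
        """u |-> u + tanh((K + W) u): kernel integral operator + skip."""
        x = np.linspace(0.0, 1.0, n)
        # Illustrative smooth kernel k(x, y); 1/n is the quadrature weight.
        K = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.1) / n
        W = rng.normal(scale=0.1, size=(n, n))  # stand-in for learned weights
        return lambda u: u + np.tanh((K + W) @ u)

    rng = np.random.default_rng(0)
    for n in (64, 128, 256):  # refine the discretization
        grid = np.linspace(0.0, 1.0, n)
        u = np.sin(2 * np.pi * grid)  # a test function sampled on the grid
        print(n, float(np.linalg.norm(make_layer(n, rng)(u)) / np.sqrt(n)))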



Bayesian Networks, Markov Networks, Moralisation, Triangulation: a Categorical Perspective

Lorenzin, Antonio, Zanasi, Fabio

arXiv.org Artificial Intelligence

Moralisation and Triangulation are transformations that allow switching between different ways of factoring a probability distribution into a graphical model. Moralisation lets one view a Bayesian network (a directed model) as a Markov network (an undirected model), whereas triangulation addresses the opposite direction. We present a categorical framework where these transformations are modelled as functors between a category of Bayesian networks and one of Markov networks. The two kinds of network (the objects of these categories) are themselves represented as functors from a 'syntax' domain to a 'semantics' codomain. Notably, moralisation and triangulation can be defined inductively on such syntax via functor pre-composition. Moreover, while moralisation is fully syntactic, triangulation relies on semantics. This leads to a discussion of the variable elimination algorithm, reinterpreted here as a functor in its own right, which splits the triangulation procedure in two: one part purely syntactic, the other purely semantic. This approach introduces a functorial perspective into the theory of probabilistic graphical models, highlighting the distinctions between syntactic and semantic modifications.
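
The moralisation step itself is simple enough to show directly. Below is a minimal sketch using networkx: marry the parents of each node, then forget edge directions. This is the purely syntactic transformation the abstract refers to; triangulation is omitted here precisely because it depends on an elimination order, i.e., on semantics. The example graph is the classic collider A -> C <- B.

    # Moralisation: connect co-parents, then drop edge directions.
    import itertools
    import networkx as nx

    def moralise(dag: nx.DiGraph) -> nx.Graph:
        moral = dag.to_undirected()
        for node in dag.nodes:
            for u, v in itertools.combinations(dag.predecessors(node), 2):
                moral.add_edge(u, v)  # "marry" the parents of node
        return moral

    bn = nx.DiGraph([("A", "C"), ("B", "C")])  # collider: A -> C <- B
    print(sorted(moralise(bn).edges()))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]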


Transparent Semantic Spaces: A Categorical Approach to Explainable Word Embeddings

Fabregat-Hernández, Ares, Palanca, Javier, Botti, Vicent

arXiv.org Artificial Intelligence

The paper introduces a novel framework based on category theory to enhance the explainability of artificial intelligence systems, with a particular focus on word embeddings. It defines the categories of configurations Conf and word embeddings Emb, accompanied by the concept of divergence as a decoration on Emb. It establishes a mathematically precise method for comparing word embeddings, demonstrating the equivalence between the GloVe and Word2Vec algorithms and the metric MDS algorithm, thereby moving from neural network algorithms (black boxes) to a transparent framework. Finally, the paper presents a mathematical approach to computing biases before embedding and offers insights on mitigating biases at the semantic space level, advancing the field of explainable artificial intelligence.

Introduction: Word embeddings have emerged as a cornerstone in natural language processing (NLP) and machine learning (ML) applications, revolutionizing the representation of textual data (see [IUS23]). At the heart of word embeddings lies the idea of capturing semantic relationships between words in a continuous vector space, enabling machines to understand and process human language more effectively (see [HAMJ16, LG14]). By mapping words to high-dimensional vectors, word embeddings encode semantic similarities and syntactic structures, thereby facilitating a wide array of downstream tasks such as sentiment analysis, named entity recognition, machine translation, and document classification. In addition to enhancing model performance and accuracy, word embeddings offer several practical advantages in ML applications. They provide a compact and dense representation of textual data, enabling efficient storage, retrieval, and computation. Moreover, word embeddings capture contextual nuances and semantic meanings that traditional bag-of-words or one-hot encoding schemes fail to capture, leading to more nuanced and context-aware language understanding. As such, word embeddings serve as foundational building blocks for a broad spectrum of ML tasks, empowering researchers and practitioners to unlock new capabilities in language understanding and processing. In recent years, word embeddings have become indispensable tools for natural language processing tasks, offering compact representations of textual data that capture semantic relationships between words. However, biases embedded in the training data can be perpetuated in word embeddings, leading to unfair associations and stereotypes. Addressing these challenges requires interdisciplinary efforts from researchers in machine learning, natural language processing, and ethics. Strategies such as debiasing techniques, dimensionality reduction methods, and transparency-enhancing approaches are being actively explored to mitigate these challenges and improve the reliability and fairness of word embeddings in practical applications.
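
The divergence on Emb is defined categorically in the paper; as a concrete stand-in, the sketch below compares two embeddings of the same vocabulary by the classical orthogonal Procrustes residual, the minimum over orthogonal Q of ||XQ - Y||_F. This is a standard alignment construction chosen purely for illustration, not the paper's definition.

    # Compare two row-aligned embeddings after optimal orthogonal alignment.
    import numpy as np

    def procrustes_divergence(X, Y):
        """min over orthogonal Q of ||X Q - Y||_F (solved via SVD)."""
        U, _, Vt = np.linalg.svd(X.T @ Y)
        return float(np.linalg.norm(X @ (U @ Vt) - Y))

    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 50))               # embedding A: 1000 words, 50 dims
    Q0, _ = np.linalg.qr(rng.normal(size=(50, 50)))
    Y = X @ Q0 + 0.01 * rng.normal(size=X.shape)  # a rotated, noisy copy of A
    print(procrustes_divergence(X, Y))            # small: the embeddings nearly agree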


Consciousness as a Functor

Mahadevan, Sridhar

arXiv.org Artificial Intelligence

We propose a novel theory of consciousness as a functor (CF) that receives contents from unconscious memory and transmits them into conscious memory. Our CF framework can be seen as a categorial formulation of the Global Workspace Theory proposed by Baars. CF models the ensemble of unconscious processes as a topos category of coalgebras. The internal language of thought in CF is defined as a Multi-modal Universal Mitchell-Benabou Language Embedding (MUMBLE). We model the transmission of information from conscious short-term working memory to long-term unconscious memory using our recently proposed Universal Reinforcement Learning (URL) framework. To model the transmission of information from unconscious long-term memory into resource-constrained short-term memory, we propose a network economic model.


Universal Reinforcement Learning in Coalgebras: Asynchronous Stochastic Computation via Coinduction

Mahadevan, Sridhar

arXiv.org Artificial Intelligence

In this paper, we introduce a categorial generalization of reinforcement learning (RL), termed universal reinforcement learning (URL), building on powerful mathematical abstractions from the study of coinduction on non-well-founded sets and universal coalgebras, topos theory, and categorial models of asynchronous parallel distributed computation. In the first half of the paper, we review the basic RL framework and illustrate the use of categories and functors in RL, showing how they lead to interesting insights. We also introduce a standard model of asynchronous distributed minimization proposed by Bertsekas and Tsitsiklis, and describe the relationship between metric coinduction and their proof of the Asynchronous Convergence Theorem. The space of algorithms for MDPs or PSRs can be modeled as a functor category, where the codomain category forms a topos, which admits all (co)limits, possesses a subobject classifier, and has exponential objects. In the second half of the paper, we move on to universal coalgebras. Dynamical system models, such as Markov decision processes (MDPs), partially observed MDPs (POMDPs), predictive state representations (PSRs), and linear dynamical systems (LDSs), are all special types of coalgebras. We describe a broad family of universal coalgebras, extending the dynamical system models studied previously in RL. The core problem in RL of finding fixed points to determine the exact or approximate (action) value function is generalized in URL to computing the final coalgebra asynchronously in a parallel distributed manner.
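
A few lines make the coalgebraic reading concrete: a Markov reward process is a coalgebra sending each state to a (reward, next-state distribution) pair, and the value function is the fixed point of the Bellman operator. The sketch below computes it by asynchronous updates in arbitrary state order, in the spirit of the Bertsekas-Tsitsiklis model the abstract cites; the three-state chain and fixed policy are illustrative assumptions.

    # Policy evaluation as an asynchronous fixed-point computation.
    import random

    GAMMA = 0.9
    # Coalgebra: state |-> (reward, transition distribution) under a fixed policy.
    dynamics = {
        "s0": (0.0, {"s0": 0.5, "s1": 0.5}),
        "s1": (1.0, {"s0": 0.1, "s2": 0.9}),
        "s2": (5.0, {"s2": 1.0}),
    }

    V = {s: 0.0 for s in dynamics}
    random.seed(0)
    for _ in range(5000):                  # asynchronous sweeps
        s = random.choice(list(dynamics))  # update states in arbitrary order
        r, P = dynamics[s]
        V[s] = r + GAMMA * sum(p * V[t] for t, p in P.items())

    print({s: round(v, 2) for s, v in V.items()})  # approximate fixed point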


Unraveling the iterative CHAD

Nunes, Fernando Lucatelli, Plotkin, Gordon, Vákár, Matthijs

arXiv.org Artificial Intelligence

Combinatory Homomorphic Automatic Differentiation (CHAD) was originally formulated as a semantics-driven source-to-source transformation for reverse-mode AD of total (terminating) functional programs. In this work, we extend CHAD to encompass programs featuring constructs such as partial (potentially non-terminating) operations, data-dependent conditionals (e.g., real-valued tests), and iteration constructs (i.e. while-loops), while maintaining CHAD's core principle of structure-preserving semantics. A central contribution is the introduction of iteration-extensive indexed categories, which provide a principled integration of iteration into dependently typed programming languages. This integration is achieved by requiring that iteration in the base category lifts to parameterized initial algebras in the indexed category, yielding an op-fibred iterative structure that models while-loops and other iteration constructs in the total category, which corresponds to the category of containers of our dependently typed language. Through the idea of iteration-extensive indexed categories, we extend the CHAD transformation to looping programs as the unique structure-preserving functor in a suitable sense. Specifically, it is the unique iterative Freyd category morphism from the iterative Freyd category corresponding to the source language to the category of containers obtained from the target language, such that each primitive operation is mapped to its (transposed) derivative. We establish the correctness of this extended transformation via the universal property of the syntactic categorical model of the source language, showing that the differentiated programs compute correct reverse-mode derivatives of their originals.
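
The structure-preserving idea behind CHAD can be glimpsed in a few lines: compile each primitive to a pair (primal value, transposed derivative) and compose pairs so that a whole program carries its own reverse-mode derivative. The sketch below covers only the total, loop-free fragment; the iteration-extensive machinery for while-loops that the paper contributes is well beyond it. Function names are illustrative.

    # CHAD-style pairing: each compiled program returns (value, pullback).
    import math

    def d_sin(x):
        return math.sin(x), lambda ct: ct * math.cos(x)

    def d_square(x):
        return x * x, lambda ct: ct * 2.0 * x

    def compose(f, g):
        """Composition of pairs: pullbacks run in reverse order."""
        def h(x):
            y, pb_f = f(x)
            z, pb_g = g(y)
            return z, lambda ct: pb_f(pb_g(ct))
        return h

    prog = compose(d_sin, d_square)  # x |-> sin(x)^2
    value, pullback = prog(1.0)
    print(value, pullback(1.0))      # d/dx sin(x)^2 at x = 1 is sin(2) ~ 0.909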


A Rose by Any Other Name Would Smell as Sweet: Categorical Homotopy Theory for Large Language Models

Mahadevan, Sridhar

arXiv.org Artificial Intelligence

Natural language is replete with superficially different statements, such as "Charles Darwin wrote" and "Charles Darwin is the author of", which carry the same meaning. Large language models (LLMs) should generate the same next-token probabilities in such cases, but usually do not. Empirical workarounds have been explored, such as using k-NN estimates of sentence similarity to produce smoothed estimates. In this paper, we tackle this problem more abstractly, introducing a categorical homotopy framework for LLMs. We introduce an LLM Markov category to represent probability distributions in language generated by an LLM, where the probability of a sentence, such as "Charles Darwin wrote", is defined by an arrow in a Markov category. However, this approach runs into difficulties, as language is full of equivalent rephrases, each of which generates a non-isomorphic arrow in the LLM Markov category. To address this fundamental problem, we use categorical homotopy techniques to capture "weak equivalences" in an LLM Markov category. We present a detailed overview of the application of categorical homotopy to LLMs, from higher algebraic K-theory to model categories, building on powerful theoretical results developed over the past half-century.
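
As a toy rendering of the problem, the sketch below treats two prompts as "weakly equivalent" when their next-token distributions nearly coincide under a KL threshold. The distributions are invented for illustration, and the paper's homotopy-theoretic treatment is of course far more than a threshold test; the sketch only shows the phenomenon being formalized.

    # Two paraphrases are "weakly equivalent" if their next-token
    # distributions are close in KL divergence (toy numbers throughout).
    import math

    def kl(p, q, eps=1e-9):
        keys = set(p) | set(q)
        return sum(p.get(k, eps) * math.log(p.get(k, eps) / q.get(k, eps))
                   for k in keys)

    p_wrote  = {"On": 0.40, "The": 0.35, "a": 0.25}   # after "Charles Darwin wrote"
    p_author = {"On": 0.38, "The": 0.37, "a": 0.25}   # after "... is the author of"

    print(kl(p_wrote, p_author) < 0.01)  # True: treat the prompts as equivalent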