Goto

Collaborating Authors

 computable function


FPT-Approximability of Stable Matching Problems

arXiv.org Artificial Intelligence

We study parameterized approximability of three optimization problems related to stable matching: (1) Min-BP-SMI: Given a stable marriage instance and a number k, find a size-at-least-k matching that minimizes the number $ฮฒ$ of blocking pairs; (2) Min-BP-SRI: Given a stable roommates instance, find a matching that minimizes the number $ฮฒ$ of blocking pairs; (3) Max-SMTI: Given a stable marriage instance with preferences containing ties, find a maximum-size stable matching. The first two problems are known to be NP-hard to approximate to any constant factor and W[1]-hard with respect to $ฮฒ$, making the existence of an EPTAS or FPT-algorithms unlikely. We show that they are W[1]-hard with respect to $ฮฒ$ to approximate to any function of $ฮฒ$. This means that unless FPT=W[1], there is no FPT-approximation scheme for the parameter $ฮฒ$. The last problem (Max-SMTI) is known to be NP-hard to approximate to factor-29/33 and W[1]-hard with respect to the number of ties. We complement this and present an FPT-approximation scheme for the parameter "number of agents with ties".


Learning Algorithms in the Limit

arXiv.org Artificial Intelligence

This paper studies the problem of learning computable functions in the limit by extending Gold's inductive inference framework to incorporate \textit{computational observations} and \textit{restricted input sources}. Complimentary to the traditional Input-Output Observations, we introduce Time-Bound Observations, and Policy-Trajectory Observations to study the learnability of general recursive functions under more realistic constraints. While input-output observations do not suffice for learning the class of general recursive functions in the limit, we overcome this learning barrier by imposing computational complexity constraints or supplementing with approximate time-bound observations. Further, we build a formal framework around observations of \textit{computational agents} and show that learning computable functions from policy trajectories reduces to learning rational functions from input and output, thereby revealing interesting connections to finite-state transducer inference. On the negative side, we show that computable or polynomial-mass characteristic sets cannot exist for the class of linear-time computable functions even for policy-trajectory observations.


Position: A Theory of Deep Learning Must Include Compositional Sparsity

arXiv.org Artificial Intelligence

Overparametrized Deep Neural Networks (DNNs) have demonstrated remarkable success in a wide variety of domains too high-dimensional for classical shallow networks subject to the curse of dimensionality. However, open questions about fundamental principles, that govern the learning dynamics of DNNs, remain. In this position paper we argue that it is the ability of DNNs to exploit the compositionally sparse structure of the target function driving their success. As such, DNNs can leverage the property that most practically relevant functions can be composed from a small set of constituent functions, each of which relies only on a low-dimensional subset of all inputs. We show that this property is shared by all efficiently Turing-computable functions and is therefore highly likely present in all current learning problems. While some promising theoretical insights on questions concerned with approximation and generalization exist in the setting of compositionally sparse functions, several important questions on the learnability and optimization of DNNs remain. Completing the picture of the role of compositional sparsity in deep learning is essential to a comprehensive theory of artificial, and even general, intelligence.


Technical Perspective: When Proofs Meet Programs: An Extension of Dependent Type Theory with Church's Thesis

Communications of the ACM

What is a mathematical proof? It can be described as a sequence of logical steps and calculations that serve as evidence of the correctness of a statement. The steps must follow rules that are accepted as correct by the community. One might think there is a set of universal rules. However, this is far from being the case.


Computability of Classification and Deep Learning: From Theoretical Limits to Practical Feasibility through Quantization

arXiv.org Artificial Intelligence

The unwavering success of deep learning in the past decade led to the increasing prevalence of deep learning methods in various application fields. However, the downsides of deep learning, most prominently its lack of trustworthiness, may not be compatible with safety-critical or high-responsibility applications requiring stricter performance guarantees. Recently, several instances of deep learning applications have been shown to be subject to theoretical limitations of computability, undermining the feasibility of performance guarantees when employed on real-world computers. We extend the findings by studying computability in the deep learning framework from two perspectives: From an application viewpoint in the context of classification problems and a general limitation viewpoint in the context of training neural networks. In particular, we show restrictions on the algorithmic solvability of classification problems that also render the algorithmic detection of failure in computations in a general setting infeasible. Subsequently, we prove algorithmic limitations in training deep neural networks even in cases where the underlying problem is well-behaved. Finally, we end with a positive observation, showing that in quantized versions of classification and deep network training, computability restrictions do not arise or can be overcome to a certain degree.


Hallucination is Inevitable: An Innate Limitation of Large Language Models

arXiv.org Artificial Intelligence

Hallucination has been widely recognized to be a significant drawback for large language models (LLMs). There have been many works that attempt to reduce the extent of hallucination. These efforts have mostly been empirical so far, which cannot answer the fundamental question whether it can be completely eliminated. In this paper, we formalize the problem and show that it is impossible to eliminate hallucination in LLMs. Specifically, we define a formal world where hallucination is defined as inconsistencies between a computable LLM and a computable ground truth function. By employing results from learning theory, we show that LLMs cannot learn all of the computable functions and will therefore always hallucinate. Since the formal world is a part of the real world which is much more complicated, hallucinations are also inevitable for real world LLMs. Furthermore, for real world LLMs constrained by provable time complexity, we describe the hallucination-prone tasks and empirically validate our claims. Finally, using the formal world framework, we discuss the possible mechanisms and efficacies of existing hallucination mitigators as well as the practical implications on the safe deployment of LLMs.


Physical Computing: A Category Theoretic Perspective on Physical Computation and System Compositionality

arXiv.org Artificial Intelligence

This paper introduces a category theory-based framework to redefine physical computing in light of advancements in quantum computing and non-standard computing systems. By integrating classical definitions within this broader perspective, the paper rigorously recontextualizes what constitutes physical computing devices and processes. It demonstrates how the compositional nature and relational structures of physical computing systems can be coherently formalized using category theory. This approach not only encapsulates recent formalisms in physical computing but also offers a structured method to explore the dynamic interactions within these systems. Keywords: Computation, Computability, Physical Computation, Category Theory 1. Introduction Roots of computability trace back to Leibniz, who invented a mechanical calculator for automating the evaluation of mathematical expressions [23, 20, 9]. At the Paris international conference for mathematics, David Hilbert extended Leibniz's fascination by proposing the Entscheidungsproblem (the decision problem), questioning the existence of an "effective procedure" to determine the truth or falsity of mathematical statements [29]. Alan Turing and Alonso Church independently demonstrated the impossibility of resolving the Entscheidungsproblem. This discovery, known as the "Church-Turing thesis", posited that no effective procedure (or "algorithm" in contemporary terms) can Present address: McGovern Institute for Brain Research, MIT, Cambridge, 02139, MA, USA Prior to the emergence of algorithm(s), procedural calculation through a finite number of exact, finite instructions was labeled "effective procedure (or effective calculation).


Computability of Optimizers

arXiv.org Artificial Intelligence

Optimization problems are a staple of today's scientific and technical landscape. However, at present, solvers of such problems are almost exclusively run on digital hardware. Using Turing machines as a mathematical model for any type of digital hardware, in this paper, we analyze fundamental limitations of this conceptual approach of solving optimization problems. Since in most applications, the optimizer itself is of significantly more interest than the optimal value of the corresponding function, we will focus on computability of the optimizer. In fact, we will show that in various situations the optimizer is unattainable on Turing machines and consequently on digital computers. Moreover, even worse, there does not exist a Turing machine, which approximates the optimizer itself up to a certain constant error. We prove such results for a variety of well-known problems from very different areas, including artificial intelligence, financial mathematics, and information theory, often deriving the even stronger result that such problems are not Banach-Mazur computable, also not even in an approximate sense.


Is Data Science a science?

#artificialintelligence

At its core, all fundamental science is about making predictions in the form of experiments: precise, quantifiable, falsifiable predictions. As Richard P. Feynman put it: "The fundamental principle of science, the definition almost, is this: the sole test of the validity of any idea is experiment." So if science is about making predictions, how is it different from the predictions that astrologers make? The core distinction is in the kinds of predictions each makes. Most horoscopes, for example, will give you general predictions.


If it's interpretable it's pretty much useless.

#artificialintelligence

Some days ago I was interviewing a candidate for a data-related position: after a couple of technical questions I asked him what algorithm he would have used to have a reliable starting point for a random classification problem. I was just curious to understand how used he was in doing some data science and if he knew some state-of-the-art algorithms and techniques. He told me that he would have gone with a simple decision tree because it's somehow easy to explain and interpret. That answer surprised me a little: I mean, why a decision tree in 2019 when you can get way better and, above all, more robust results using more advanced algorithms? As always happens, once you notice something you see it everywhere, and from that day I keep seeing and reading here and there blog posts about interpretability, explicability and how all of these concepts are connected to machine learning and trust.