AITopics | blowup

Pushdown Reward Machines for Reinforcement Learning

Varricchione, Giovanni, Klassen, Toryn Q., Alechina, Natasha, Dastani, Mehdi, Logan, Brian, McIlraith, Sheila A.

arXiv.org Artificial Intelligence | Nov-13-2025

Reward machines (RMs) are automata structures that encode (non-Markovian) reward functions for reinforcement learning (RL). RMs can reward any behaviour representable in regular languages and, when paired with RL algorithms that exploit RM structure, have been shown to significantly improve sample efficiency in many domains. In this work, we present pushdown reward machines (pdRMs), an extension of reward machines based on deterministic pushdown automata. pdRMs can recognise and reward temporally extended behaviours representable in deterministic context-free languages, making them more expressive than reward machines. We introduce two variants of pdRM-based policies, one which has access to the entire stack of the pdRM, and one which can only access the top $k$ symbols (for a given constant $k$) of the stack. We propose a procedure to check when the two kinds of policies (for a given environment, pdRM, and constant $k$) achieve the same optimal state values. We then provide theoretical results establishing the expressive power of pdRMs, and space complexity results for the proposed learning problems. Lastly, we propose an approach for off-policy RL algorithms that exploits counterfactual experiences with pdRMs. We conclude by providing experimental results showing how agents can be trained to perform tasks representable in deterministic context-free languages using pdRMs.
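
As a rough illustration of the idea, the sketch below hand-codes a tiny pushdown reward machine in Python. Everything in it is hypothetical and not taken from the paper: the class name `PushdownRewardMachine`, the event labels `"a"` and `"b"`, and the task itself (give a reward once n `a` events have been followed by exactly n `b` events, a behaviour that is context-free but not regular). The `top_k` method mimics the restricted view of the stack that the second policy variant would receive.

```python
from dataclasses import dataclass, field


@dataclass
class PushdownRewardMachine:
    """Toy deterministic pushdown reward machine for the task a^n b^n (n >= 1)."""

    state: str = "push"  # "push": reading a's, "pop": reading b's, "dead": failed
    stack: list = field(default_factory=list)

    def step(self, event: str) -> float:
        """Consume one high-level event and return the reward it triggers."""
        if self.state == "push" and event == "a":
            self.stack.append("A")  # remember one pending 'a'
            return 0.0
        if self.state in ("push", "pop") and event == "b" and self.stack:
            self.stack.pop()  # match this 'b' against a stored 'a'
            self.state = "pop"
            # The reward is paid at the moment the stack becomes empty.
            return 1.0 if not self.stack else 0.0
        self.state = "dead"  # any other event breaks the pattern
        return 0.0

    def top_k(self, k: int) -> tuple:
        """Return the top k stack symbols (fewer if the stack is shorter)."""
        return tuple(self.stack[-k:]) if k > 0 else ()


def total_reward(events) -> float:
    """Run a fresh machine over a sequence of events and sum the rewards."""
    machine = PushdownRewardMachine()
    return sum(machine.step(e) for e in events)


print(total_reward("aabb"))  # 1.0: two a's matched by two b's
print(total_reward("aab"))   # 0.0: one 'a' is still on the stack
```

In an RL loop, a policy of the first kind would condition on the environment state, `machine.state`, and the whole `machine.stack`, while a policy of the second kind would see only `machine.top_k(k)`.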

artificial intelligence, machine learning, reinforcement learning, (21 more...)

arXiv:2508.06894

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Netherlands (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(3 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Government (0.46)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Why are Sensitive Functions Hard for Transformers?

Hahn, Michael, Rofin, Mark

arXiv.org Artificial Intelligence | May-27-2024

Empirical studies have identified a range of learnability biases and limitations of transformers, such as a persistent difficulty in learning to compute simple formal languages such as PARITY, and a bias towards low-degree functions. However, theoretical understanding remains limited, with existing expressiveness theory either overpredicting or underpredicting realistic learning abilities. We prove that, under the transformer architecture, the loss landscape is constrained by the input-space sensitivity: Transformers whose output is sensitive to many parts of the input string inhabit isolated points in parameter space, leading to a low-sensitivity bias in generalization. We show theoretically and empirically that this theory unifies a broad array of empirical observations about the learning abilities and biases of transformers, such as their generalization bias towards low sensitivity and low degree, and difficulty in length generalization for PARITY. This shows that understanding transformers' inductive biases requires studying not just their in-principle expressivity, but also their loss landscape.
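
To make the notion of input-space sensitivity concrete, the sketch below computes the average sensitivity of a few Boolean functions by brute force, that is, the mean number of single-bit flips that change the output. The helper names (`average_sensitivity`, `parity`, `first_bit`, `majority`) are illustrative choices, and this is the standard combinatorial definition rather than code from the paper.

```python
from itertools import product


def average_sensitivity(f, n: int) -> float:
    """Mean number of single-bit flips that change f(x), over all x in {0,1}^n."""
    total = 0
    for x in product((0, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            total += f(flipped) != fx  # True counts as 1
    return total / 2 ** n


def parity(x):
    return sum(x) % 2


def first_bit(x):
    return x[0]


def majority(x):
    return int(sum(x) > len(x) / 2)


n = 5
print(average_sensitivity(parity, n))     # 5.0: every flip changes the output
print(average_sensitivity(first_bit, n))  # 1.0: only the first bit matters
print(average_sensitivity(majority, n))   # 1.875
```

PARITY attains the maximum possible value n, which is the sense in which it is a highly sensitive function.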

parity, sensitivity, transformer, (15 more...)

arXiv:2402.09963

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Approximate Nearest Neighbor Search with Window Filters

Engels, Joshua, Landrum, Benjamin, Yu, Shangdi, Dhulipala, Laxman, Shun, Julian

arXiv.org Artificial Intelligence | Feb-1-2024

The nearest neighbor search problem has been widely studied for more than 30 years (Arya & Mount, 1993). Given a dataset D, the problem requires the construction of an index that can efficiently answer queries of the form "what is the closest vector to x in D?" Solving this problem exactly degrades to a brute force linear search in high dimensions (Rubinstein, 2018), so instead both theoreticians and practitioners focus on the relaxed c-approximate nearest neighbor search problem (ANNS), which asks "what is a ...

Although this problem has many motivating examples, there is a dearth of papers examining it in the literature. Some vector databases analyze window search-like problem instances as an additional feature of their system, but this analysis is typically secondary to their main approach and too slow for large-scale real-world systems; as far as we are aware, we are the first to propose, analyze, and experiment with a non-trivial solution to the window search problem.
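
For orientation, here is a minimal brute-force baseline for a window-filtered query. It assumes that a window filter restricts the candidates to points whose scalar label (for example a timestamp) lies in a query range `[lo, hi]`. The function name `window_filtered_nn` and its parameters are hypothetical. This is the exact linear scan that an index would aim to beat, not the solution the paper proposes.

```python
import math


def window_filtered_nn(data, labels, query, lo, hi):
    """Exact nearest neighbor to `query` among points with lo <= label <= hi.

    Returns the index into `data`, or None if no label falls in the window.
    """
    best_index, best_dist = None, math.inf
    for i, (vec, label) in enumerate(zip(data, labels)):
        if not (lo <= label <= hi):
            continue  # the point is outside the label window
        d = math.dist(vec, query)  # Euclidean distance
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index


data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.1, 0.1)]
labels = [10, 20, 30, 40]

print(window_filtered_nn(data, labels, (0.0, 0.0), 15, 35))  # 1
print(window_filtered_nn(data, labels, (0.0, 0.0), 50, 60))  # None
```

The scan costs time linear in the dataset size for every query, regardless of how narrow the window is.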

approximate nearest neighbor search, filter fraction, query, (8 more...)

arXiv:2402.00943

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Maryland (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

How a Fervent Belief Split Silicon Valley---and Fueled the Blowup at OpenAI

WSJ.com: WSJD - Technology | Nov-22-2023, 19:25:00 GMT

Sam Altman's firing showed the influence of effective altruism and its view that AI development must slow down; his return marked its limits

blowup, fervent belief split silicon valley, fueled, (1 more...)

Country: North America > United States > California (0.66)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.66)

Scalable tensor methods for nonuniform hypergraphs

Aksoy, Sinan G., Amburg, Ilya, Young, Stephen J.

arXiv.org Artificial Intelligence | Jun-30-2023

While multilinear algebra appears natural for studying the multiway interactions modeled by hypergraphs, tensor methods for general hypergraphs have been stymied by theoretical and practical barriers. A recently proposed adjacency tensor is applicable to nonuniform hypergraphs, but is prohibitively costly to form and analyze in practice. We develop tensor times same vector (TTSV) algorithms for this tensor which improve complexity from $O(n^r)$ to a low-degree polynomial in $r$, where $n$ is the number of vertices and $r$ is the maximum hyperedge size. Our algorithms are implicit, avoiding formation of the order $r$ adjacency tensor. We demonstrate the flexibility and utility of our approach in practice by developing tensor-based hypergraph centrality and clustering algorithms. We also show these tensor measures offer complementary information to analogous graph-reduction approaches on data, and are also able to detect higher-order structure that many existing matrix-based approaches provably cannot.
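
The idea of an implicit tensor-times-same-vector product can be sketched in the simpler uniform case. The code below assumes an r-uniform hypergraph and the normalized adjacency tensor whose entries equal 1/(r-1)! on every ordering of a hyperedge. Under that assumption the product reduces to a sum over hyperedges, so the order-r tensor is never formed. The function name `ttsv_uniform` is hypothetical, and the nonuniform algorithms in the paper are more involved than this sketch.

```python
from math import prod


def ttsv_uniform(edges, x):
    """Compute y = A x^{r-1} for an r-uniform hypergraph without forming A.

    With entries 1/(r-1)! on each ordering of a hyperedge, the (r-1)!
    orderings of the remaining vertices cancel the normalization, giving
    y[i] = sum over edges e containing i of prod_{j in e, j != i} x[j].
    """
    y = [0.0] * len(x)
    for edge in edges:
        for i in edge:
            y[i] += prod(x[j] for j in edge if j != i)
    return y


edges = [(0, 1, 2), (1, 2, 3)]  # a 3-uniform hypergraph on 4 vertices
x = [1.0, 2.0, 3.0, 4.0]
print(ttsv_uniform(edges, x))  # [6.0, 15.0, 10.0, 6.0]
```

Each hyperedge contributes r products of r-1 factors, so the cost grows with the number of hyperedges rather than with n^r.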

artificial intelligence, data mining, machine learning, (19 more...)

arXiv:2306.17825

Country:

North America > United States (0.14)
Africa > Senegal > Kolda Region > Kolda (0.04)
Europe > Portugal (0.04)

Genre: Research Report (0.41)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Data Science > Data Mining (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Blow-up Algorithm for Sum-of-Products Polynomials and Real Log Canonical Thresholds

Hirose, Joe

arXiv.org Artificial Intelligence | Mar-21-2023

The real log canonical threshold is an invariant that gives a Bayesian generalization error. To compute it, papers generally replace the mean error function with a relatively simple polynomial whose real log canonical threshold matches that of the mean error function, and then obtain the threshold by resolving the polynomial's singularities through an algebraic operation called blow-up. Although it is known that the singularities of any polynomial can be resolved by a finite number of blow-up iterations, it is not well established whether a specific blow-up algorithm can resolve the singularities of a specific polynomial. This paper therefore proposes a blow-up algorithm that can be applied to the polynomials called sum-of-products polynomials and proves that it halts. Furthermore, this paper considers real log canonical thresholds of sum-of-products polynomials by using the algorithm. First, this section explains the foundation of Bayesian learning theory and details the relation to a real log canonical threshold and blow-up. Then this section defines exclusive sum-of-products polynomials, which are the subject of previous studies, and explains the novelty and utility of this paper.
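
As a small worked example of how a blow-up exposes a real log canonical threshold, consider the polynomial x^2 + y^2 near the origin, with a prior that is positive and constant there. This toy case is chosen for illustration and is not taken from the paper. The threshold is read off from the largest pole of the zeta function after one blow-up chart puts the polynomial into normal crossing form.

```latex
\begin{align*}
\zeta(z) &= \iint_{U} (x^2 + y^2)^{z} \, dx \, dy
  && \text{($U$ a neighbourhood of the origin)} \\
x = u,\quad y = uv \;&\Longrightarrow\;
  x^2 + y^2 = u^2 (1 + v^2), \qquad dx \, dy = |u| \, du \, dv
  && \text{(one chart of the blow-up)} \\
\text{chart contribution} &= \iint |u|^{2z + 1} (1 + v^2)^{z} \, du \, dv
  && \text{($1 + v^2 > 0$, so only $u$ matters)} \\
2z + 2 = 0 \;&\Longrightarrow\; z = -1, \qquad \lambda = 1
  && \text{(largest pole is $z = -\lambda$)}
\end{align*}
```

The second chart, x = uv and y = v, gives the same pole by symmetry, so the real log canonical threshold of this polynomial is 1.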

artificial intelligence, machine learning, polynomial, (17 more...)

arXiv:2303.11619

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Fourier Continuation for Exact Derivative Computation in Physics-Informed Neural Operators

Maust, Haydn, Li, Zongyi, Wang, Yixuan, Leibovici, Daniel, Bruno, Oscar, Hou, Thomas, Anandkumar, Anima

arXiv.org Artificial Intelligence | Nov-29-2022

The physics-informed neural operator (PINO) is a machine learning architecture that has shown promising empirical results for learning partial differential equations. PINO uses the Fourier neural operator (FNO) architecture to overcome the optimization challenges often faced by physics-informed neural networks. Since the convolution operator in PINO uses the Fourier series representation, its gradient can be computed exactly on the Fourier space. While Fourier series cannot represent nonperiodic functions, PINO and FNO still have the expressivity to learn nonperiodic problems with Fourier extension via padding. However, computing the Fourier extension in the physics-informed optimization requires solving an ill-conditioned system, resulting in inaccurate derivatives which prevent effective optimization. In this work, we present an architecture that leverages Fourier continuation (FC) to apply the exact gradient method to PINO for nonperiodic problems. This paper investigates three different ways that FC can be incorporated into PINO by testing their performance on a 1D blowup problem. Experiments show that FC-PINO outperforms padded PINO, improving equation loss by several orders of magnitude, and it can accurately capture the third order derivatives of nonsmooth solution functions.
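
The claim that derivatives are exact in Fourier space, and the reason nonperiodic data breaks it, can be sketched with a plain discrete Fourier transform. The helpers `dft`, `idft`, and `spectral_derivative` below are hypothetical names written with the standard library only. The sketch implements neither PINO nor Fourier continuation; it only shows the periodic case working and the nonperiodic case failing.

```python
import cmath
import math


def dft(values):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values))
            for k in range(n)]


def idft(coeffs):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(coeffs)) / n
            for t in range(n)]


def spectral_derivative(samples, period):
    """Differentiate periodic samples by scaling each Fourier mode by i * wavenumber."""
    n = len(samples)
    scaled = []
    for k, c in enumerate(dft(samples)):
        freq = k if k < n / 2 else k - n  # signed integer frequency
        if n % 2 == 0 and k == n // 2:
            freq = 0  # drop the ambiguous Nyquist mode
        scaled.append(1j * (2 * math.pi * freq / period) * c)
    return [v.real for v in idft(scaled)]


n = 16
xs = [2 * math.pi * t / n for t in range(n)]

# Periodic input: the derivative of sin is recovered to rounding error.
d_sin = spectral_derivative([math.sin(x) for x in xs], 2 * math.pi)
print(max(abs(d - math.cos(x)) for d, x in zip(d_sin, xs)))

# Nonperiodic input: f(x) = x on [0, 1) has true derivative 1 everywhere,
# but the implied periodic extension jumps at the boundary.
d_ramp = spectral_derivative([t / n for t in range(n)], 1.0)
print(max(abs(d - 1.0) for d in d_ramp))
```

The first printed error is at the level of floating-point rounding, while the second is large because the method differentiates a sawtooth rather than the ramp.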

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv:2211.15960

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
