Emergent Riemannian geometry over learning discrete computations on continuous manifolds

Brandon, Julian, Chadwick, Angus, Pellegrino, Arthur

arXiv.org Machine Learning

Many tasks require mapping continuous input data (e.g. images) to discrete task outputs (e.g. class labels). Yet, how neural networks learn to perform such discrete computations on continuous data manifolds remains poorly understood. Here, we show that signatures of such computations emerge in the representational geometry of neural networks as they learn. By analysing the Riemannian pullback metric across layers of a neural network, we find that network computation can be decomposed into two functions: discretising continuous input features and performing logical operations on these discretised variables. Furthermore, we demonstrate how different learning regimes (rich vs. lazy) have contrasting metric and curvature structures, affecting the ability of the networks to generalise to unseen inputs. Overall, our work provides a geometric framework for understanding how neural networks learn to perform discrete computations on continuous manifolds.
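The central quantity here, the Riemannian pullback metric, is easy to compute for a single layer: for a layer map f, the metric induced on input space is G(x) = J(x)ᵀJ(x), where J is the layer's Jacobian at x. A minimal sketch, assuming a single tanh layer with illustrative sizes (the paper's actual architectures are not given here):

```python
import numpy as np

# Hypothetical one-layer network; sizes and weights are illustrative.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(5, 3))
b = np.zeros(5)

def pullback_metric(x):
    # Analytic Jacobian of tanh(Wx + b) is diag(1 - tanh^2(Wx + b)) @ W.
    h = np.tanh(W @ x + b)
    J = (1.0 - h**2)[:, None] * W   # shape (5, 3)
    return J.T @ J                   # shape (3, 3), symmetric PSD

G = pullback_metric(np.array([0.3, -0.1, 0.7]))
# The eigenvalues of G measure how strongly each input direction is
# stretched by the layer, i.e. the local geometry induced on the input
# manifold; composing layers multiplies the Jacobians.
```

Regions where G becomes nearly singular along some direction correspond to input features being collapsed, which is one signature of the discretisation the abstract describes.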


The QCET Taxonomy of Standard Quality Criterion Names and Definitions for the Evaluation of NLP Systems

Belz, Anya, Mille, Simon, Thomson, Craig

arXiv.org Artificial Intelligence

Prior work has shown that two NLP evaluation experiments that report results for the same quality criterion name (e.g. Fluency) do not necessarily evaluate the same aspect of quality, and the comparability implied by the name can be misleading. Not knowing when two evaluations are comparable in this sense means we currently lack the ability to draw reliable conclusions about system quality on the basis of multiple, independently conducted evaluations. This in turn hampers the ability of the field to progress scientifically as a whole, a pervasive issue in NLP since its beginning (Sparck Jones, 1981). It is hard to see how the issue of unclear comparability can be fully addressed other than by the creation of a standard set of quality criterion names and definitions that the several hundred quality criterion names actually in use in the field can be mapped to, and grounded in. Taking a strictly descriptive approach, the QCET Quality Criteria for Evaluation Taxonomy derives a standard set of quality criterion names and definitions from three surveys of evaluations reported in NLP, and structures them into a hierarchy where each parent node captures common aspects of its child nodes. We present QCET and the resources it consists of, and discuss its three main uses in (i) establishing comparability of existing evaluations, (ii) guiding the design of new evaluations, and (iii) assessing regulatory compliance.



SOInter: A Novel Deep Energy Based Interpretation Method for Explaining Structured Output Models

Seyyedsalehi, S. Fatemeh, Soleymani, Mahdieh, Rabiee, Hamid R.

arXiv.org Artificial Intelligence

We propose a novel interpretation technique to explain the behavior of structured output models, which learn mappings from an input vector to a set of output variables simultaneously. Because the computational paths of output variables in structured models are interrelated, a feature can affect the value of one output through the others. We focus on one of the outputs as the target and try to find the most important features the structured model uses to decide on the target in each locality of the input space. In this paper, we assume an arbitrary structured output model is available as a black box and argue how considering the correlations between output variables can improve explanation performance. The goal is to train a function as an interpreter for the target output variable over the input space. We introduce an energy-based training process for the interpreter function, which effectively accounts for the structural information incorporated into the model being explained. The effectiveness of the proposed method is confirmed using a variety of simulated and real data sets.


Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints

Yang, Junxiao, Zhang, Zhexin, Cui, Shiyao, Wang, Hongning, Huang, Minlie

arXiv.org Artificial Intelligence

Jailbreaking attacks can effectively induce unsafe behaviors in Large Language Models (LLMs); however, the transferability of these attacks across different models remains limited. This study aims to understand and enhance the transferability of gradient-based jailbreaking methods, which are among the standard approaches for attacking white-box models. Through a detailed analysis of the optimization process, we introduce a novel conceptual framework to elucidate transferability and identify superfluous constraints (specifically, the response pattern constraint and the token tail constraint) as significant barriers to improved transferability. Removing these unnecessary constraints substantially enhances the transferability and controllability of gradient-based attacks. Evaluated on Llama-3-8B-Instruct as the source model, our method increases the overall Transfer Attack Success Rate (T-ASR) across a set of target models with varying safety levels from 18.4% to 50.3%, while also improving the stability and controllability of jailbreak behaviors on both source and target models.


Asymptotic evaluation of the information processing capacity in reservoir computing

Saito, Yohei

arXiv.org Artificial Intelligence

Recurrent neural networks (RNNs) can store past input by recursively connecting hidden nodes [1] and can approximate the relationship between input and output time series with arbitrary accuracy [2]. Backpropagation through time (BPTT) is mainly used to train RNNs, but it is difficult to optimize network parameters due to gradient vanishing or gradient explosion [3]. Many variants of RNNs, such as LSTM [4] and GRU [5], have been proposed to address this training difficulty and have been very successful. However, BPTT calculations become slower for longer training data. An echo state network (ESN) [6] is a kind of RNN that can finish training quickly by fixing the recurrent connections at their initial values and optimizing only the linear transformation of the readout layer. Not limited to neural networks, a linear combination of nonlinear dynamical systems can be used to approximate the relationship between input and output time series; such a system is called a reservoir computing (RC) system [7].
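The ESN training scheme described above can be sketched in a few lines: the recurrent weights are drawn once at random (scaled to spectral radius below 1) and never updated, and only the readout is fit by ridge regression. The network size, delayed-recall task, and regularization strength below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Minimal echo state network: fixed random reservoir, trained linear readout.
rng = np.random.default_rng(1)
N, T = 100, 2000
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius 0.9 < 1
w_in = rng.normal(size=N)

u = rng.uniform(-1, 1, size=T)   # input time series
y_target = np.roll(u, 3)          # illustrative task: recall input delayed by 3 steps

# Drive the reservoir and collect hidden states
x = np.zeros(N)
X = np.zeros((T, N))
for t in range(T):
    x = np.tanh(W @ x + w_in * u[t])
    X[t] = x

# Train only the readout by ridge regression, discarding a warm-up period
warm = 100
A = X[warm:]
w_out = np.linalg.solve(A.T @ A + 1e-6 * np.eye(N), A.T @ y_target[warm:])
mse = np.mean((A @ w_out - y_target[warm:])**2)
```

Because only the linear solve touches trainable parameters, training cost is independent of how the reservoir was generated, which is the speed advantage the abstract points to.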


Ornstein-Uhlenbeck Adaptation as a Mechanism for Learning in Brains and Machines

Fernandez, Jesus Garcia, Ahmad, Nasir, van Gerven, Marcel

arXiv.org Artificial Intelligence

One of the main properties of any intelligent system is that it has the capacity to learn. This holds for biological systems, ranging from bacteria and fungi to plants and animals [17, 21, 42, 50], as well as for engineered systems designed by artificial intelligence (AI) researchers [59, 29, 9]. Modern intelligent systems, such as those used in machine learning, typically rely on gradient descent for learning by minimizing error gradients [32, 65, 49]. While gradient-based methods have driven significant advances in AI [29], their reliance on exact gradients, centralized updates, and complex information pathways limits their applicability in biological and neuromorphic systems. In contrast, biological learning likely relies on different mechanisms, as organisms often lack the exact gradient information and centralized control that gradient descent requires [31, 68]. Neuromorphic computing, inspired by these principles, aims to replicate the distributed, energy-efficient learning of biological systems [38, 40]. However, integrating traditional gradient-based methods into neuromorphic hardware has proven challenging, highlighting a critical gap: the need for gradient-free learning mechanisms that rely exclusively on operations that are local in space and time [24, 12]. To address this, alternative learning principles to gradient descent have been proposed for both rate-based [45, 7, 51, 5, 67] and spike-based models [36, 4, 22, 46]. One class of methods leverages the inherent noise present in biological systems to facilitate learning: perturbation-based methods [57, 69, 66], which adjust the system's parameters based on noise effects and global reinforcement signals, offering gradient-free, local learning suitable for biological or neuromorphic systems.
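A generic weight-perturbation rule of the kind the abstract alludes to (not the paper's Ornstein-Uhlenbeck scheme itself) can be sketched as follows: parameters are nudged with local Gaussian noise, and a single global scalar (the change in loss) gates the update. The regression task, noise scale, and learning rate are illustrative assumptions.

```python
import numpy as np

# Perturbation-based learning on a toy linear regression problem:
# no gradients are computed; only a global error signal is broadcast.
rng = np.random.default_rng(2)
w_true = np.array([1.5, -0.7])
X = rng.normal(size=(200, 2))
y = X @ w_true

def loss(w):
    return np.mean((X @ w - y)**2)

w = np.zeros(2)
sigma, lr = 0.1, 0.05
for _ in range(5000):
    xi = rng.normal(scale=sigma, size=2)   # local perturbation noise
    delta = loss(w + xi) - loss(w)          # global reinforcement signal
    w -= lr * delta * xi / sigma**2         # correlate noise with outcome
```

In expectation, the update `delta * xi / sigma**2` equals the true gradient, so this behaves like noisy gradient descent while using only local noise and one broadcast scalar, which is what makes such rules attractive for neuromorphic hardware.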


Reviews: Houdini: Fooling Deep Structured Visual and Speech Recognition Models with Adversarial Examples

Neural Information Processing Systems

This paper presents a way to create adversarial examples based on a task loss. The approach is tested on a few different domains (pose estimation, semantic segmentation, speech recognition). Overall the approach is nice and the results are impressive. My main issue with the paper (prompting my "marginal accept" decision) is that the math and notation are confusing and contradictory in places and need to be cleaned up.


Rethinking LLM memorization

AIHub

A central question in the discussion of large language models (LLMs) concerns the extent to which they memorize their training data versus how they generalize to new tasks and settings. Most practitioners seem to (at least informally) believe that LLMs do some degree of both: they clearly memorize parts of the training data--for example, they are often able to reproduce large portions of training data verbatim [Carlini et al., 2023]--but they also seem to learn from this data, allowing them to generalize to new settings. The precise extent to which they do one or the other has massive implications for the practical and legal aspects of such models [Cooper et al., 2023]. Do LLMs truly produce new content, or do they only remix their training data? When dealing with humans, we distinguish plagiarizing content from learning from it, but how should this extend to LLMs?