Emergent Riemannian geometry over learning discrete computations on continuous manifolds

Brandon, Julian, Chadwick, Angus, Pellegrino, Arthur

arXiv.org Machine Learning

Many tasks require mapping continuous input data (e.g. images) to discrete task outputs (e.g. class labels). Yet, how neural networks learn to perform such discrete computations on continuous data manifolds remains poorly understood. Here, we show that signatures of such computations emerge in the representational geometry of neural networks as they learn. By analysing the Riemannian pullback metric across layers of a neural network, we find that network computation can be decomposed into two functions: discretising continuous input features and performing logical operations on these discretised variables. Furthermore, we demonstrate how different learning regimes (rich vs. lazy) have contrasting metric and curvature structures, affecting the ability of the networks to generalise to unseen inputs. Overall, our work provides a geometric framework for understanding how neural networks learn to perform discrete computations on continuous manifolds.
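The central quantity here, the Riemannian pullback metric, is easy to compute for a single layer: for a layer map f, the metric induced on input space is G(x) = J(x)ᵀJ(x), where J is the layer's Jacobian at x. A minimal sketch, assuming a single tanh layer with illustrative sizes (the paper's actual architectures are not given here):

```python
import numpy as np

# Hypothetical one-layer network; sizes and weights are illustrative.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(5, 3))
b = np.zeros(5)

def pullback_metric(x):
    # Analytic Jacobian of tanh(Wx + b) is diag(1 - tanh^2(Wx + b)) @ W.
    h = np.tanh(W @ x + b)
    J = (1.0 - h**2)[:, None] * W   # shape (5, 3)
    return J.T @ J                   # shape (3, 3), symmetric PSD

G = pullback_metric(np.array([0.3, -0.1, 0.7]))
# The eigenvalues of G measure how strongly each input direction is
# stretched by the layer, i.e. the local geometry induced on the input
# manifold; composing layers multiplies the Jacobians.
```

Regions where G becomes nearly singular along some direction correspond to input features being collapsed, which is one signature of the discretisation the abstract describes.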


The QCET Taxonomy of Standard Quality Criterion Names and Definitions for the Evaluation of NLP Systems

Belz, Anya, Mille, Simon, Thomson, Craig

arXiv.org Artificial Intelligence

Prior work has shown that two NLP evaluation experiments that report results for the same quality criterion name (e.g. Fluency) do not necessarily evaluate the same aspect of quality, and the comparability implied by the name can be misleading. Not knowing when two evaluations are comparable in this sense means we currently lack the ability to draw reliable conclusions about system quality on the basis of multiple, independently conducted evaluations. This in turn hampers the ability of the field to progress scientifically as a whole, a pervasive issue in NLP since its beginning (Sparck Jones, 1981). It is hard to see how the issue of unclear comparability can be fully addressed other than by the creation of a standard set of quality criterion names and definitions that the several hundred quality criterion names actually in use in the field can be mapped to, and grounded in. Taking a strictly descriptive approach, the QCET Quality Criteria for Evaluation Taxonomy derives a standard set of quality criterion names and definitions from three surveys of evaluations reported in NLP, and structures them into a hierarchy where each parent node captures common aspects of its child nodes. We present QCET and the resources it consists of, and discuss its three main uses in (i) establishing comparability of existing evaluations, (ii) guiding the design of new evaluations, and (iii) assessing regulatory compliance.



SOInter: A Novel Deep Energy Based Interpretation Method for Explaining Structured Output Models

Seyyedsalehi, S. Fatemeh, Soleymani, Mahdieh, Rabiee, Hamid R.

arXiv.org Artificial Intelligence

We propose a novel interpretation technique to explain the behavior of structured output models, which learn mappings from an input vector to a set of output variables simultaneously. Because the computational paths of output variables in structured models are interrelated, a feature can affect the value of one output through the others. We focus on one of the outputs as the target and try to find the most important features the structured model uses to decide on the target in each locality of the input space. In this paper, we assume an arbitrary structured output model is available as a black box and argue how considering the correlations between output variables can improve explanation performance. The goal is to train a function as an interpreter for the target output variable over the input space. We introduce an energy-based training process for the interpreter function, which effectively accounts for the structural information incorporated into the model being explained. The effectiveness of the proposed method is confirmed using a variety of simulated and real data sets.


Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints

Yang, Junxiao, Zhang, Zhexin, Cui, Shiyao, Wang, Hongning, Huang, Minlie

arXiv.org Artificial Intelligence

Jailbreaking attacks can effectively induce unsafe behaviors in Large Language Models (LLMs); however, the transferability of these attacks across different models remains limited. This study aims to understand and enhance the transferability of gradient-based jailbreaking methods, which are among the standard approaches for attacking white-box models. Through a detailed analysis of the optimization process, we introduce a novel conceptual framework to elucidate transferability and identify superfluous constraints (specifically, the response pattern constraint and the token tail constraint) as significant barriers to improved transferability. Removing these unnecessary constraints substantially enhances the transferability and controllability of gradient-based attacks. Evaluated on Llama-3-8B-Instruct as the source model, our method increases the overall Transfer Attack Success Rate (T-ASR) across a set of target models with varying safety levels from 18.4% to 50.3%, while also improving the stability and controllability of jailbreak behaviors on both source and target models.


Asymptotic evaluation of the information processing capacity in reservoir computing

Saito, Yohei

arXiv.org Artificial Intelligence

Recurrent neural networks (RNNs) can store past input by recursively connecting hidden nodes [1] and can approximate the relationship between input and output time series with arbitrary accuracy [2]. Backpropagation through time (BPTT) is mainly used to train RNNs, but it is difficult to optimize network parameters due to gradient vanishing or gradient explosion [3]. Many variants of RNNs, such as LSTM [4] and GRU [5], have been proposed to address this training difficulty and have been very successful. However, BPTT calculations become slower for longer training data. An echo state network (ESN) [6] is a kind of RNN that can finish training quickly by fixing the recurrent connections at their initial values and optimizing only the linear transformation of the readout layer. Not limited to neural networks, a linear combination of nonlinear dynamical systems can be used to approximate the relationship between input and output time series; such a system is called a reservoir computing (RC) system [7].
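The ESN training scheme described above can be sketched in a few lines: the recurrent weights are drawn once at random (scaled to spectral radius below 1) and never updated, and only the readout is fit by ridge regression. The network size, delayed-recall task, and regularization strength below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Minimal echo state network: fixed random reservoir, trained linear readout.
rng = np.random.default_rng(1)
N, T = 100, 2000
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius 0.9 < 1
w_in = rng.normal(size=N)

u = rng.uniform(-1, 1, size=T)   # input time series
y_target = np.roll(u, 3)          # illustrative task: recall input delayed by 3 steps

# Drive the reservoir and collect hidden states
x = np.zeros(N)
X = np.zeros((T, N))
for t in range(T):
    x = np.tanh(W @ x + w_in * u[t])
    X[t] = x

# Train only the readout by ridge regression, discarding a warm-up period
warm = 100
A = X[warm:]
w_out = np.linalg.solve(A.T @ A + 1e-6 * np.eye(N), A.T @ y_target[warm:])
mse = np.mean((A @ w_out - y_target[warm:])**2)
```

Because only the linear solve touches trainable parameters, training cost is independent of how the reservoir was generated, which is the speed advantage the abstract points to.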


Ornstein-Uhlenbeck Adaptation as a Mechanism for Learning in Brains and Machines

Fernandez, Jesus Garcia, Ahmad, Nasir, van Gerven, Marcel

arXiv.org Artificial Intelligence

One of the main properties of any intelligent system is that it has the capacity to learn. This holds for biological systems, ranging from bacteria and fungi to plants and animals [17, 21, 42, 50], as well as for engineered systems designed by artificial intelligence (AI) researchers [59, 29, 9]. Modern intelligent systems, such as those used in machine learning, typically rely on gradient descent for learning by minimizing error gradients [32, 65, 49]. While gradient-based methods have driven significant advances in AI [29], their reliance on exact gradients, centralized updates, and complex information pathways limits their applicability in biological and neuromorphic systems. In contrast, biological learning likely relies on different mechanisms, as organisms often lack the exact gradient information and centralized control that gradient descent requires [31, 68]. Neuromorphic computing, inspired by these principles, aims to replicate the distributed, energy-efficient learning of biological systems [38, 40]. However, integrating traditional gradient-based methods into neuromorphic hardware has proven challenging, highlighting a critical gap: the need for gradient-free learning mechanisms that rely exclusively on operations that are local in space and time [24, 12]. To address this, alternative learning principles to gradient descent have been proposed for both rate-based [45, 7, 51, 5, 67] and spike-based models [36, 4, 22, 46]. One class of methods leverages the inherent noise present in biological systems to facilitate learning: perturbation-based methods [57, 69, 66], which adjust the system's parameters based on noise effects and global reinforcement signals, offering gradient-free, local learning suitable for biological or neuromorphic systems.
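A generic weight-perturbation rule of the kind the abstract alludes to (not the paper's Ornstein-Uhlenbeck scheme itself) can be sketched as follows: parameters are nudged with local Gaussian noise, and a single global scalar (the change in loss) gates the update. The regression task, noise scale, and learning rate are illustrative assumptions.

```python
import numpy as np

# Perturbation-based learning on a toy linear regression problem:
# no gradients are computed; only a global error signal is broadcast.
rng = np.random.default_rng(2)
w_true = np.array([1.5, -0.7])
X = rng.normal(size=(200, 2))
y = X @ w_true

def loss(w):
    return np.mean((X @ w - y)**2)

w = np.zeros(2)
sigma, lr = 0.1, 0.05
for _ in range(5000):
    xi = rng.normal(scale=sigma, size=2)   # local perturbation noise
    delta = loss(w + xi) - loss(w)          # global reinforcement signal
    w -= lr * delta * xi / sigma**2         # correlate noise with outcome
```

In expectation, the update `delta * xi / sigma**2` equals the true gradient, so this behaves like noisy gradient descent while using only local noise and one broadcast scalar, which is what makes such rules attractive for neuromorphic hardware.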


Reviews: Houdini: Fooling Deep Structured Visual and Speech Recognition Models with Adversarial Examples

Neural Information Processing Systems

This paper presents a way to create adversarial examples based on a task loss. The approach is tested on a few different domains (pose estimation, semantic segmentation, speech recognition). Overall the approach is nice and the results are impressive. My main issue with the paper (prompting my "marginal accept" decision) is that the math and notation are confusing and contradictory in places and need to be cleaned up.


Rethinking LLM memorization

AIHub

A central question in the discussion of large language models (LLMs) concerns the extent to which they memorize their training data versus how they generalize to new tasks and settings. Most practitioners seem to (at least informally) believe that LLMs do some degree of both: they clearly memorize parts of the training data--for example, they are often able to reproduce large portions of training data verbatim [Carlini et al., 2023]--but they also seem to learn from this data, allowing them to generalize to new settings. The precise extent to which they do one or the other has massive implications for the practical and legal aspects of such models [Cooper et al., 2023]. Do LLMs truly produce new content, or do they only remix their training data? When dealing with humans, we distinguish plagiarizing content from learning from it, but how should this extend to LLMs?