

LLMs as Policy-Agnostic Teammates: A Case Study in Human Proxy Design for Heterogeneous Agent Teams

Justus, Aju Ani, Baber, Chris

arXiv.org Artificial Intelligence

A critical challenge in modelling Heterogeneous-Agent Teams is training agents to collaborate with teammates whose policies are inaccessible or non-stationary, such as humans. Traditional approaches rely on expensive human-in-the-loop data, which limits scalability. We propose using Large Language Models (LLMs) as policy-agnostic human proxies to generate synthetic data that mimics human decision-making. To evaluate this, we conduct three experiments in a grid-world capture game inspired by Stag Hunt, a game theory paradigm that balances risk and reward. In Experiment 1, we compare decisions from 30 human participants and 2 expert judges with outputs from LLaMA 3.1 and Mixtral 8x22B models. LLMs, prompted with game-state observations and reward structures, align more closely with experts than participants, demonstrating consistency in applying underlying decision criteria. Experiment 2 modifies prompts to induce risk-sensitive strategies (e.g. "be risk averse"). LLM outputs mirror human participants' variability, shifting between risk-averse and risk-seeking behaviours. Finally, Experiment 3 tests LLMs in a dynamic grid-world where the LLM agents generate movement actions. LLMs produce trajectories resembling human participants' paths. While LLMs cannot yet fully replicate human adaptability, their prompt-guided diversity offers a scalable foundation for simulating policy-agnostic teammates.
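The abstract describes prompting LLMs with game-state observations, reward structures, and risk hints, then reading back an action. A minimal sketch of that protocol is below; the prompt wording, state fields, and the `build_prompt`/`parse_action` helpers are illustrative assumptions, not the authors' exact setup.

```python
# Hypothetical sketch of using an LLM as a policy-agnostic human proxy
# in a Stag Hunt grid world. Prompt wording and field names are assumptions.

def build_prompt(state, risk_hint=None):
    """Render a game-state observation and reward structure as an LLM prompt."""
    lines = [
        "You are a player in a grid-world capture game based on Stag Hunt.",
        f"Your position: {state['agent']}",
        f"Teammate position: {state['teammate']}",
        f"Stag position: {state['stag']} (reward {state['stag_reward']}, needs both players)",
        f"Hare position: {state['hare']} (reward {state['hare_reward']}, capture alone)",
    ]
    if risk_hint:  # Experiment 2: induce risk-sensitive strategies via the prompt
        lines.append(f"Strategy hint: {risk_hint}")
    lines.append("Reply with exactly one move: UP, DOWN, LEFT, or RIGHT.")
    return "\n".join(lines)

def parse_action(reply):
    """Extract the first legal move token from a free-text LLM reply."""
    for token in reply.upper().split():
        word = token.strip(".,!")
        if word in {"UP", "DOWN", "LEFT", "RIGHT"}:
            return word
    return None

state = {"agent": (0, 0), "teammate": (4, 4), "stag": (2, 2),
         "hare": (0, 1), "stag_reward": 10, "hare_reward": 2}
prompt = build_prompt(state, risk_hint="be risk averse")
action = parse_action("I will play it safe and go RIGHT.")
```

In this shape, swapping the `risk_hint` string is all it takes to move the proxy between risk-averse and risk-seeking behaviour, which is the knob Experiment 2 turns.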


Quantizing Text-attributed Graphs for Semantic-Structural Integration

Bo, Jianyuan, Wu, Hao, Fang, Yuan

arXiv.org Artificial Intelligence

Text-attributed graphs (TAGs) have emerged as a powerful representation for modeling complex relationships across diverse domains. With the rise of large language models (LLMs), there is growing interest in leveraging their capabilities for graph learning. However, current approaches face significant challenges in embedding structural information into LLM-compatible formats, requiring either computationally expensive alignment mechanisms or manual graph verbalization techniques that often lose critical structural details. Moreover, these methods typically require labeled data from source domains for effective transfer learning, significantly constraining their adaptability. We propose STAG, a novel self-supervised framework that directly quantizes graph structural information into discrete tokens using a frozen codebook. Unlike traditional quantization approaches, our method employs soft assignment and KL divergence guided quantization to address the unique challenges of graph data, which lacks natural tokenization structures. Our framework enables both LLM-based and traditional learning approaches, supporting true zero-shot transfer learning without requiring labeled data even in the source domain. Extensive experiments demonstrate state-of-the-art performance across multiple node classification benchmarks while maintaining compatibility with different LLM architectures, offering an elegant solution to bridging graph learning with LLMs.
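The core mechanism named here is soft assignment of embeddings to a frozen codebook, with a KL term guiding codebook usage. A rough NumPy sketch of those two pieces follows; the temperature parameter, uniform-prior KL target, and function names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def soft_quantize(z, codebook, tau=1.0):
    """Softly assign each embedding to entries of a frozen codebook."""
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (N, K) sq. distances
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)                 # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)                           # soft assignments
    tokens = p @ codebook                                       # quantized embeddings
    return tokens, p

def kl_to_uniform(p):
    """KL(p || uniform) averaged over nodes -- one possible usage regularizer."""
    K = p.shape[1]
    return float((p * np.log(p * K + 1e-12)).sum(axis=1).mean())

rng = np.random.default_rng(0)
z = rng.standard_normal((5, 8))          # 5 node embeddings, dim 8
codebook = rng.standard_normal((16, 8))  # frozen codebook with 16 entries
tokens, p = soft_quantize(z, codebook)
```

Soft (rather than hard, argmax) assignment keeps the operation differentiable, which is what lets gradients flow through the quantization step during self-supervised training.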


Facets in Argumentation: A Formal Approach to Argument Significance

Fichte, Johannes, Fröhlich, Nicolas, Hecher, Markus, Lagerkvist, Victor, Mahmood, Yasir, Meier, Arne, Persson, Jonathan

arXiv.org Artificial Intelligence

Argumentation is a central subarea of Artificial Intelligence (AI) for modeling and reasoning about arguments. The semantics of abstract argumentation frameworks (AFs) is given by sets of arguments (extensions) and conditions on the relationships between them, such as stable or admissible. Today's solvers implement tasks such as finding extensions, deciding credulous or skeptical acceptance, and counting or enumerating extensions. While these tasks are well charted, the area between decision and counting/enumeration, where fine-grained reasoning about individual arguments is possible, has so far required expensive computation. We introduce a novel concept (facets) for reasoning between decision and enumeration. Facets are arguments that belong to some extensions (credulously accepted) but not to all extensions (not skeptically accepted). They are most natural when a user aims to navigate, filter, or comprehend the significance of specific arguments according to their needs. We study the complexity of facet reasoning and show that tasks involving facets are much easier than counting extensions. Finally, we provide an implementation and conduct experiments to demonstrate feasibility.
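Given the extensions of an AF, the facet definition above is a simple set computation: credulously accepted arguments minus skeptically accepted ones. A small sketch (the function name and input format are my own, assuming extensions are already enumerated, which the paper's complexity results are precisely about avoiding):

```python
def facets(extensions):
    """Arguments in some but not all extensions: credulous minus skeptical."""
    extensions = [set(e) for e in extensions]
    if not extensions:
        return set()
    credulous = set().union(*extensions)          # in at least one extension
    skeptical = set.intersection(*extensions)     # in every extension
    return credulous - skeptical

# Two stable extensions: "a" is skeptically accepted, so only "b" and "c" are facets.
exts = [{"a", "b"}, {"a", "c"}]
result = facets(exts)  # -> {"b", "c"}
```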


Seeing World Dynamics in a Nutshell

Shen, Qiuhong, Yi, Xuanyu, Lin, Mingbao, Zhang, Hanwang, Yan, Shuicheng, Wang, Xinchao

arXiv.org Artificial Intelligence

We consider the problem of efficiently representing casually captured monocular videos in a spatially- and temporally-coherent manner. While existing approaches predominantly rely on 2D/2.5D techniques that treat videos as collections of spatiotemporal pixels, they struggle with complex motions, occlusions, and geometric consistency due to the absence of temporal coherence and explicit 3D structure. Drawing inspiration from the fact that a monocular video is a projection of the dynamic 3D world, we explore representing videos in their intrinsic 3D form through continuous flows of Gaussian primitives in space-time. In this paper, we propose NutWorld, a novel framework that efficiently transforms monocular videos into dynamic 3D Gaussian representations in a single forward pass. At its core, NutWorld introduces a structured spatial-temporal aligned Gaussian (STAG) representation, enabling optimization-free scene modeling with effective depth and flow regularization. Through comprehensive experiments, we demonstrate that NutWorld achieves high-fidelity video reconstruction quality while enabling various downstream applications in real time. Demos and code will be available at https://github.com/Nut-World/NutWorld.


Think Smart, Act SMARL! Analyzing Probabilistic Logic Driven Safety in Multi-Agent Reinforcement Learning

Chatterji, Satchit, Acar, Erman

arXiv.org Artificial Intelligence

An important challenge for enabling the deployment of reinforcement learning (RL) algorithms in the real world is safety. This has resulted in the recent research field of Safe RL, which aims to learn optimal policies that are safe. One successful approach in that direction is probabilistic logic shields (PLS), a model-based Safe RL technique that uses formal specifications based on probabilistic logic programming, constraining an agent's policy to comply with those specifications in a probabilistic sense. However, safety is inherently a multi-agent concept, since real-world environments often involve multiple agents interacting simultaneously, leading to a complex system which is hard to control. Moreover, safe multi-agent RL (Safe MARL) is still underexplored. In order to address this gap, in this paper we ($i$) introduce Shielded MARL (SMARL) by extending PLS to MARL -- in particular, we introduce Probabilistic Logic Temporal Difference Learning (PLTD) to enable shielded independent Q-learning (SIQL), and introduce shielded independent PPO (SIPPO) using probabilistic logic policy gradients; ($ii$) show its positive effect and use as an equilibrium selection mechanism in various game-theoretic environments including two-player simultaneous games, extensive-form games, stochastic games, and some grid-world extensions in terms of safety, cooperation, and alignment with normative behaviors; and ($iii$) look into the asymmetric case where only one agent is shielded, and show that the shielded agent has a significant influence on the unshielded one, providing further evidence of SMARL's ability to enhance safety and cooperation in diverse multi-agent environments.
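A probabilistic logic shield, as described above, constrains a policy to comply with a safety specification in a probabilistic sense. One common way to realize that is to reweight the agent's action distribution by each action's safety probability and renormalize; the sketch below illustrates that idea only, under my assumption of this reweighting form, and is not the paper's PLTD/SIQL/SIPPO machinery (which derives the safety probabilities from probabilistic logic programs).

```python
def shield_policy(policy, p_safe):
    """Reweight an action distribution by per-action safety probabilities.

    policy: list of action probabilities (sums to 1).
    p_safe: list of P(safe | state, action) values in [0, 1].
    """
    weighted = [p * s for p, s in zip(policy, p_safe)]
    z = sum(weighted)
    if z == 0:
        raise ValueError("no action has positive safety probability")
    return [w / z for w in weighted]

# An action judged certainly unsafe gets probability 0 after shielding.
shielded = shield_policy([0.5, 0.5], [1.0, 0.0])  # -> [1.0, 0.0]
```

Because the shield acts multiplicatively on the policy rather than hard-masking Q-values, partially unsafe actions are merely down-weighted, which is what "comply in a probabilistic sense" amounts to here.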


Spectral Toolkit of Algorithms for Graphs: Technical Report (2)

Macgregor, Peter, Sun, He

arXiv.org Artificial Intelligence

Spectral Toolkit of Algorithms for Graphs (STAG) is an open-source C++ and Python library providing several methods for working with graphs and performing graph-based data analysis. In this technical report, we provide an update on the development of the STAG library. The report serves as a user's guide for the newly implemented algorithms, and gives implementation details and engineering choices made in the development of the library. The report is structured as follows: Section 2 describes locality-sensitive hashing and the main components used in its construction. Section 3 describes kernel density estimation and the state-of-the-art algorithm for computing it.
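For reference, kernel density estimation in its naive O(n)-per-query form is a few lines; the sketch below shows the textbook Gaussian-kernel version in plain Python, not the STAG library's API or its state-of-the-art (hashing-based) algorithm.

```python
import math

def gaussian_kde(query, data, bandwidth=1.0):
    """Naive kernel density estimate at one query point with a Gaussian kernel."""
    n = len(data)
    total = sum(
        math.exp(-((query - x) ** 2) / (2 * bandwidth ** 2))
        for x in data
    )
    # Normalize so the estimate integrates to 1 over the real line.
    return total / (n * bandwidth * math.sqrt(2 * math.pi))
```

The point of the algorithms in the report is to avoid exactly this O(n) cost per query on large data sets, e.g. via locality-sensitive hashing.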


Spectral Toolkit of Algorithms for Graphs: Technical Report (1)

Macgregor, Peter, Sun, He

arXiv.org Artificial Intelligence

Spectral Toolkit of Algorithms for Graphs (STAG) is an open-source C++ and Python library of efficient spectral algorithms for graphs. Our objective is to implement advanced graph algorithms developed through algorithmic spectral graph theory, while making them practical for end users. This series of technical reports documents our progress on STAG, including implementation details, engineering considerations, and the data sets against which our implementation is tested. The report is structured as follows: Section 2 describes the local clustering algorithm, which is the main update in this STAG release. The discussion is at a high level such that domain knowledge beyond basic algorithms is not needed. Section 3 provides a user guide to the essential features of STAG which allow a user to apply local clustering. Section 4 includes experiments and demonstrations of the functionality of STAG. Finally, Section 5 discusses several technical details, including our choice of implemented algorithms, the default setup of parameters, and other technical choices. We leave these details to the final section, as the reader does not need to understand them in order to use STAG.


Stochastic Aggregation in Graph Neural Networks

Wang, Yuanqing, Karaletsos, Theofanis

arXiv.org Artificial Intelligence

We herein present a unifying framework for stochastic aggregation (STAG) in GNNs, where noise is (adaptively) injected into the aggregation process from the neighborhood to form node embeddings. Standard aggregation schemes limit GNNs in two ways. Firstly, without proper choices of aggregation functions, GNNs are not always as powerful as the WL test: when pooling from (transformed) neighborhood representations whose underlying multiset (see Definition 1 of Xu et al. (2018)) is countable, different multiset functions learn different attributes of the neighborhood (MAX learns distinct elements and MEAN learns distributions), but only SUM is injective, as studied in detail in Xu et al. (2018). Secondly, repeated aggregation leads to oversmoothing. We provide theoretical arguments that STAG models, with little overhead, remedy both of the aforementioned problems. In addition to fixed-noise models, we also propose probabilistic versions of STAG models and a variational inference framework to learn the noise posterior. We conduct illustrative experiments clearly targeting the oversmoothing and multiset-aggregation limitations.
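The core operation is a standard aggregator with noise injected into the neighborhood messages. A minimal fixed-noise sketch with SUM aggregation is below; the function name, Gaussian noise model, and fixed scale are illustrative assumptions, and the adaptive/variational versions mentioned in the abstract would learn the noise instead.

```python
import numpy as np

def stag_sum_aggregate(h, neighbors, sigma=0.1, rng=None):
    """SUM aggregation with fixed-scale Gaussian noise injected per neighbor message.

    h: (num_nodes, dim) node feature matrix.
    neighbors: index array of the target node's neighbors.
    """
    rng = rng or np.random.default_rng(0)
    msgs = h[neighbors]                                 # (k, dim) neighbor features
    noise = sigma * rng.standard_normal(msgs.shape)     # injected aggregation noise
    return (msgs + noise).sum(axis=0)

h = np.arange(12, dtype=float).reshape(4, 3)  # 4 nodes, 3-dim features
neighbors = np.array([1, 2, 3])
agg = stag_sum_aggregate(h, neighbors, sigma=0.1)
```

With `sigma=0` this reduces to the usual (injective, WL-strength) SUM aggregator; the noise is what perturbs otherwise-identical neighborhoods apart and counteracts oversmoothing.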


Representing Pure Nash Equilibria in Argumentation

Yun, Bruno, Vesic, Srdjan, Oren, Nir

arXiv.org Artificial Intelligence

In this paper we describe an argumentation-based representation of normal form games, and demonstrate how argumentation can be used to compute pure strategy Nash equilibria. Our approach builds on Modgil's Extended Argumentation Frameworks. We demonstrate its correctness, prove several theoretical properties it satisfies, and outline how it can be used to explain why certain strategies are Nash equilibria to a non-expert human user.
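For concreteness, pure-strategy Nash equilibria of a normal-form game can be found by checking, at each strategy profile, that neither player can improve by deviating unilaterally. The enumeration sketch below is that baseline computation, not the paper's argumentation-based representation (the function name and payoff layout are my own).

```python
def pure_nash(payoffs):
    """Enumerate pure-strategy Nash equilibria of a 2-player normal-form game.

    payoffs[i][j] = (row_payoff, col_payoff) for row action i, column action j.
    """
    rows, cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            r, c = payoffs[i][j]
            # No profitable unilateral deviation for either player.
            row_best = all(payoffs[k][j][0] <= r for k in range(rows))
            col_best = all(payoffs[i][k][1] <= c for k in range(cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

# Stag Hunt payoffs: action 0 = hunt stag, action 1 = hunt hare.
stag_hunt = [[(4, 4), (0, 3)],
             [(3, 0), (3, 3)]]
eq = pure_nash(stag_hunt)  # -> [(0, 0), (1, 1)]
```

Stag Hunt's two pure equilibria (both-stag, both-hare) are exactly the kind of result the paper's argumentation framework is designed to represent and explain to a non-expert user.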


Tactical Reward Shaping: Bypassing Reinforcement Learning with Strategy-Based Goals

Zhang, Yizheng, Rosendo, Andre

arXiv.org Artificial Intelligence

Deep Reinforcement Learning (DRL) has shown promising capabilities to learn optimal policies directly from trial and error. However, learning can be hindered if the goal of the learning, defined by the reward function, is "not optimal". We demonstrate that by setting the goal/target of the competition in a counter-intuitive but intelligent way, instead of heuristically trying solutions over many hours, the DRL simulation can quickly converge to a winning strategy. The ICRA-DJI RoboMaster AI Challenge is a game of cooperation and competition between robots in a partially observable environment, quite similar to the game Counter-Strike. Unlike the traditional approach to games, where the reward is given for winning the match or hitting the enemy, our DRL algorithm rewards our robots when they hold a geometric-strategic advantage, which implicitly increases the winning chances. Furthermore, we use Deep Q-Learning (DQL) to generate multi-agent movement paths, which improves cooperation between the two robots by avoiding collisions. Finally, we implement a variant of the A* algorithm with the same implicit geometric goal as DQL and compare the results. We conclude that a well-set goal can put in question the need for learning algorithms, with geometry-based searches outperforming DQL by many orders of magnitude.
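The idea of rewarding geometric-strategic advantage rather than match outcomes can be sketched as a shaping function that compares distances to a contested objective; the Manhattan metric, function name, and grid abstraction below are my own illustrative assumptions, not the paper's actual reward definition.

```python
def geometric_advantage_reward(ally, enemy, objective):
    """Shaped reward: positive when our robot is closer to the objective than the enemy."""
    def dist(a, b):
        # Manhattan distance on a grid map.
        return abs(a[0] - b[0]) + abs(a[1] - b[1])
    return dist(enemy, objective) - dist(ally, objective)

# Ally at (0, 0) is closer to the objective (1, 1) than the enemy at (3, 3).
r = geometric_advantage_reward((0, 0), (3, 3), (1, 1))  # -> 2
```

Because such a reward is dense and computable at every step, it can guide both a learner (DQL) and a pure search (the A* variant), which is what makes the paper's head-to-head comparison possible.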