AITopics

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Neural Information Processing SystemsFeb-16-2026, 13:17:13 GMT

8cea78701eb986f3ec357eb9b7c6badd-Paper-Conference.pdf

large language model, machine learning, natural language, (21 more...)

Country:

North America > United States (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Game Theory (0.93)

Neural Information Processing SystemsFeb-11-2026, 13:55:28 GMT

e140dbab44e01e699491a59c9978b924-Paper.pdf

Success stories of deep reinforcement learning (RL) from high dimensional inputs such as pixels or large spatial layouts include achieving superhuman performance on Atari games [30, 37, 1], grandmaster levelinStarcraft II[50]andgrasping adiverse setofobjects with impressivesuccess rates and generalization with robots in the real world [21].

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Industry: Leisure & Entertainment > Games > Computer Games (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing SystemsDec-25-2025, 02:01:56 GMT

Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings

Recent advances in off-policy deep reinforcement learning (RL) have led to impressive success in complex tasks from visual observations. Experience replay improves sample-efficiency by reusing experiences from the past, and convolutional neural networks (CNNs) process high-dimensional inputs effectively. However, such techniques demand high memory and computational bandwidth. In this paper, we present Stored Embeddings for Efficient Reinforcement Learning (SEER), a simple modification of existing off-policy RL methods, to address these computational and memory requirements. To reduce the computational overhead of gradient updates in CNNs, we freeze the lower layers of CNN encoders early in training due to early convergence of their parameters. Additionally, we reduce memory requirements by storing the low-dimensional latent vectors for experience replay instead of high-dimensional images, enabling an adaptive increase in the replay buffer capacity, a useful technique in constrained-memory settings. In our experiments, we show that SEER does not degrade the performance of RL agents while significantly saving computation and memory across a diverse set of DeepMind Control environments and Atari games.

computational efficiency, stored embedding, visual reinforcement learning, (4 more...)

Industry: Leisure & Entertainment > Games > Computer Games (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

arXiv.org Artificial IntelligenceNov-25-2025

Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning

Qin, Ruoyu, He, Weiran, Huang, Weixiao, Zhang, Yangkun, Zhao, Yikai, Pang, Bo, Xu, Xinran, Shan, Yingdi, Wu, Yongwei, Zhang, Mingxing

Reinforcement Learning (RL) has become critical for advancing modern Large Language Models (LLMs), yet existing synchronous RL systems face severe performance bottlenecks. The rollout phase, which dominates end-to-end iteration time, suffers from substantial long-tail latency and poor resource utilization due to inherent workload imbalance. We present Seer, a novel online context learning system that addresses these challenges by exploiting previously overlooked similarities in output lengths and generation patterns among requests sharing the same prompt. Seer introduces three key techniques: divided rollout for dynamic load balancing, context-aware scheduling, and adaptive grouped speculative decoding. Together, these mechanisms substantially reduce long-tail latency and improve resource efficiency during rollout. Evaluations on production-grade RL workloads demonstrate that Seer improves end-to-end rollout throughput by 74% to 97% and reduces long-tail latency by 75% to 93% compared to state-of-the-art synchronous RL systems, significantly accelerating RL training iterations.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2511.14617

Genre:

Research Report (1.00)
Instructional Material > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Sampath, Aneesha, Aran, Oya, Provost, Emily Mower

SEER: The Span-based Emotion Evidence Retrieval Benchmark

arXiv.org Artificial IntelligenceOct-29-2025

We introduce the SEER (Span-based Emotion Evidence Retrieval) Benchmark to test Large Language Models' (LLMs) ability to identify the specific spans of text that express emotion. Unlike traditional emotion recognition tasks that assign a single label to an entire sentence, SEER targets the underexplored task of emotion evidence detection: pinpointing which exact phrases convey emotion. This span-level approach is crucial for applications like empathetic dialogue and clinical support, which need to know how emotion is expressed, not just what the emotion is. SEER includes two tasks: identifying emotion evidence within a single sentence, and identifying evidence across a short passage of five consecutive sentences. It contains new annotations for both emotion and emotion evidence on 1200 real-world sentences. We evaluate 14 open-source LLMs and find that, while some models approach average human performance on single-sentence inputs, their accuracy degrades in longer passages. Our error analysis reveals key failure modes, including overreliance on emotion keywords and false positives in neutral text.

large language model, machine learning, natural language, (21 more...)

2510.0349

Country: North America (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Neural Information Processing SystemsOct-10-2025, 09:07:50 GMT

Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf

As a variant of the famous communication game Werewolf, One Night Ultimate W erewolf (ONUW) requires players to develop strategic discussion policies due to the potential role changes that increase the uncertainty and complexity of the game.

agent, player 1, werewolf, (16 more...)

Country:

North America > United States (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Game Theory (0.93)

arXiv.org Artificial IntelligenceSep-18-2025

Reasoning Efficiently Through Adaptive Chain-of-Thought Compression: A Self-Optimizing Framework

Huang, Kerui, Liu, Shuhan, Hu, Xing, Xu, Tongtong, Bao, Lingfeng, Xia, Xin

Chain-of-Thought (CoT) reasoning enhances Large Language Models (LLMs) by prompting intermediate steps, improving accuracy and robustness in arithmetic, logic, and commonsense tasks. However, this benefit comes with high computational costs: longer outputs increase latency, memory usage, and KV-cache demands. These issues are especially critical in software engineering tasks where concise and deterministic outputs are required. To investigate these trade-offs, we conduct an empirical study based on code generation benchmarks. The results reveal that longer CoT does not always help. Excessive reasoning often causes truncation, accuracy drops, and latency up to five times higher, with failed outputs consistently longer than successful ones. These findings challenge the assumption that longer reasoning is inherently better and highlight the need for adaptive CoT control. Motivated by this, we propose SEER (Self-Enhancing Efficient Reasoning), an adaptive framework that compresses CoT while preserving accuracy. SEER combines Best-of-N sampling with task-aware adaptive filtering, dynamically adjusting thresholds based on pre-inference outputs to reduce verbosity and computational overhead. We then evaluate SEER on three software engineering tasks and one math task. On average, SEER shortens CoT by 42.1%, improves accuracy by reducing truncation, and eliminates most infinite loops. These results demonstrate SEER as a practical method to make CoT-enhanced LLMs more efficient and robust, even under resource constraints.

large language model, machine learning, natural language, (16 more...)

2509.14093

Country: Asia > China (0.29)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial IntelligenceSep-18-2025

Self-Guided Function Calling in Large Language Models via Stepwise Experience Recall

Cui, Sijia, He, Aiyao, Xu, Shuai, Zhang, Hongming, Wang, Yanna, Zhang, Qingyang, Wang, Yajing, Xu, Bo

Function calling enables large language models (LLMs) to interact with external systems by leveraging tools and APIs. When faced with multi-step tool usage, LLMs still struggle with tool selection, parameter generation, and tool-chain planning. Existing methods typically rely on manually designing task-specific demonstrations, or retrieving from a curated library. These approaches demand substantial expert effort and prompt engineering becomes increasingly complex and inefficient as tool diversity and task difficulty scale. To address these challenges, we propose a self-guided method, Stepwise Experience Recall (SEER), which performs fine-grained, stepwise retrieval from a continually updated experience pool. Instead of relying on static or manually curated library, SEER incrementally augments the experience pool with past successful trajectories, enabling continuous expansion of the pool and improved model performance over time. Evaluated on the ToolQA benchmark, SEER achieves an average improvement of 6.1% on easy and 4.7% on hard questions. We further test SEER on $τ$-bench, which includes two real-world domains. Powered by Qwen2.5-7B and Qwen2.5-72B models, SEER demonstrates substantial accuracy gains of 7.44% and 23.38%, respectively.

large language model, machine learning, seer, (18 more...)

2508.15214

Country:

Asia (0.68)
North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceAug-25-2025

Ethical Considerations of Large Language Models in Game Playing

Zhang, Qingquan, Li, Yuchen, Yuan, Bo, Togelius, Julian, Yannakakis, Georgios N., Liu, Jialin

Large language models (LLMs) have demonstrated tremendous potential in game playing, while little attention has been paid to their ethical implications in those contexts. This work investigates and analyses the ethical considerations of applying LLMs in game playing, using Werewolf, also known as Mafia, as a case study. Gender bias, which affects game fairness and player experience, has been observed from the behaviour of LLMs. Some roles, such as the Guard and Werewolf, are more sensitive than others to gender information, presented as a higher degree of behavioural change. We further examine scenarios in which gender information is implicitly conveyed through names, revealing that LLMs still exhibit discriminatory tendencies even in the absence of explicit gender labels. This research showcases the importance of developing fair and ethical LLMs. Beyond our research findings, we discuss the challenges and opportunities that lie ahead in this field, emphasising the need for diving deeper into the ethical implications of LLMs in gaming and other interactive domains.

large language model, machine learning, natural language, (14 more...)