AITopics | student agent

Collaborating Authors

student agent

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching

Neural Information Processing SystemsJun-22-2026, 22:12:34 GMT

Instruction fine-tuning is crucial in NLP tasks, enhancing pretrained models' instruction-following capabilities and task-specific performance. However, obtaining high-quality fine-tuning data for large models is challenging due to data collection difficulties and high production costs. To address this, we propose MASTER, a novel data augmentation method that enriches original data through interactions among multiple agents with varying cognitive levels. We simulate three pedagogically grounded teaching scenarios, leveraging multi-agent conversations to generate high-quality teacher-student interaction data. Utilizing MASTER, we construct BOOST-QA, a fine-tuning dataset augmented from existing datasets like Orca-Math-200k, ProcQA, and OpenHermes2.5. Experiments show that models fine-tuned with BOOST-QA perform excellently across multiple benchmarks, demonstrating strong multitask generalization. Notably, MASTER significantly improves models' reasoning abilities in complex tasks, providing valuable insights for future research. Our code is publicly available at https://github.com/Toyhom/MASTER.

artificial intelligence, large language model, natural language, (17 more...)

Neural Information Processing Systems

Country:

Asia (1.00)
Europe (0.93)
North America > United States (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.93)

Industry:

Information Technology (0.92)
Education > Educational Setting (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Results

Neural Information Processing SystemsApr-24-2026, 17:09:42 GMT

In this section we prove the theoretical results around the dual curriculum game and use these results to show approximation bounds for our methods, given that they have reached a Nash equilibrium (NE). The first theorem is the main result that allows us to analyze dual curriculum games. The high-level result says that the NE of a dual curriculum game are approximate NE of the base game from the perspective of any of the individual players, or from the perspective of the joint strategy. Let Bbe the maximum difference between U1t and U2t, and let (π,θ1,θ2) be a NE for G. Then (π,pθ1 + (1 p)θ2) is an approximate NE for the base game with either teacher or for a teacher optimizing their joint objective. More precisely, it is a 2Bp(1 p)-approximate NE when Ut = pU1t + (1 p)U2t, a 2B(1 p)-approximate NE when Ut = U1t, and a 2Bp-approximate NE when Ut = U2t. At a high level, this is true because, for low values of p, the best-response strategies for the individual players can be thought of as approximate-best response strategies for the joint-player, and vis-versa. Since the Nash Equilibrium consists of each of the players playing their own best response, they must be playing an approximate best response for the joint-player. We provide a formal proof below: Proof. Let B be the maximum difference between U1t and U2t, and let (π,θ1,θ2) be a Nash Equilibrium for G. Then consider pθ1 + (1 p)θ2 as a strategy in the base game for the joint player pU1t + (1 p)U2t.

agent, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Sports > Motorsports > Formula One (1.00)
Leisure & Entertainment > Games (0.74)

Technology:

Information Technology > Game Theory (0.90)
Information Technology > Artificial Intelligence > Machine Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)

Add feedback

fa54b0edce5eef0bb07654e8ee800cb4-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 18:52:59 GMT

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (0.68)
Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Improving Environment Novelty Quantification for Effective Unsupervised Environment Design

Neural Information Processing SystemsFeb-18-2026, 17:11:31 GMT

Unsupervised Environment Design (UED) formalizes the problem of autocur-ricula through interactive training between a teacher agent and a student agent. The teacher generates new training environments with high learning potential, curating an adaptive curriculum that strengthens the student's ability to handle unseen scenarios. Existing UED methods mainly rely on regret, a metric that measures the difference between the agent's optimal and actual performance, to

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
North America > United States (0.04)
South America > Brazil (0.04)
(18 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Leisure & Entertainment > Sports > Motorsports (0.46)
Education > Educational Technology > Educational Software (0.34)
Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

fa54b0edce5eef0bb07654e8ee800cb4-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 22:05:04 GMT

agent, reflection, reflexion, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(4 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Education (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

f445ba15f0f05c26e1d24f908ea78d60-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 21:33:33 GMT

agent, algorithm, novelty, (14 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
North America > United States (0.04)
South America > Brazil (0.04)
(18 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Education (1.00)
Leisure & Entertainment > Sports > Motorsports (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Add feedback

Style-Preserving Policy Optimization for Game Agents

Li, Lingfeng, Lu, Yunlong, Wang, Yongyi, Li, Wenxin

arXiv.org Artificial IntelligenceSep-23-2025

Proficient game agents with diverse play styles enrich the gaming experience and enhance the replay value of games. However, recent advancements in game AI based on reinforcement learning have predominantly focused on improving proficiency, whereas methods based on evolution algorithms generate agents with diverse play styles but exhibit subpar performance compared to RL methods. To address this gap, this paper proposes Mixed Proximal Policy Optimization (MPPO), a method designed to improve the proficiency of existing suboptimal agents while retaining their distinct styles. MPPO unifies loss objectives for both online and offline samples and introduces an implicit constraint to approximate demonstrator policies by adjusting the empirical distribution of samples. Empirical results across environments of varying scales demonstrate that MPPO achieves proficiency levels comparable to, or even superior to, pure online algorithms while preserving demonstrators' play styles. This work presents an effective approach for generating highly proficient and diverse game agents, ultimately contributing to more engaging gameplay experiences.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2506.16995

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Towards Agent-based Test Support Systems: An Unsupervised Environment Design Approach

Ogbodo, Collins O., Rogers, Timothy J., Borgo, Mattia Dal, Wagg, David J.

arXiv.org Artificial IntelligenceAug-21-2025

Modal testing plays a critical role in structural analysis by providing essential insights into dynamic behaviour across a wide range of engineering industries. In practice, designing an effective modal test campaign involves complex experimental planning, comprising a series of interdependent decisions that significantly influence the final test outcome. Traditional approaches to test design are typically static-focusing only on global tests without accounting for evolving test campaign parameters or the impact of such changes on previously established decisions, such as sensor configurations, which have been found to significantly influence test outcomes. These rigid methodologies often compromise test accuracy and adaptability. To address these limitations, this study introduces an agent-based decision support framework for adaptive sensor placement across dynamically changing modal test environments. The framework formulates the problem using an underspecified partially observable Markov decision process, enabling the training of a generalist reinforcement learning agent through a dual-curriculum learning strategy. A detailed case study on a steel cantilever structure demonstrates the efficacy of the proposed method in optimising sensor locations across frequency segments, validating its robustness and real-world applicability in experimental settings.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2508.14135

Country: Europe > United Kingdom (0.29)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.87)

Add feedback

1.5M Steps 3.1M Steps RND BeBold 6.4M Steps 4.6M Steps 7.5M Steps 9.8M Steps 1.0M Steps 1.4M Steps 3.4M Steps 2.4M Steps 3.9M Steps 4.8M Steps

Neural Information Processing SystemsAug-17-2025, 14:08:59 GMT

We provide final testing performance for NovelD and all baselines in MiniGrid. We also provide more intrinsic analysis similar to Sec. 4.2 in a seven-room environment in Figure 1. There are other categories of static environment. The initial position of the agent and goal can be random. The position of the agent and goal is randomized.

artificial intelligence, coefficient, step 3, (15 more...)

Neural Information Processing Systems

Genre: Workflow (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.33)

Add feedback

AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes

Qiu, Jiahao, Juan, Xinzhe, Wang, Yimin, Yang, Ling, Qi, Xuan, Zhang, Tongcheng, Guo, Jiacheng, Lu, Yifu, Yao, Zixin, Wang, Hongru, Liu, Shilong, Jiang, Xun, Leqi, Liu, Wang, Mengdi

arXiv.org Artificial IntelligenceJun-18-2025

While knowledge distillation has become a mature field for compressing large language models (LLMs) into smaller ones by aligning their outputs or internal representations, the distillation of LLM-based agents, which involve planning, memory, and tool use, remains relatively underexplored. Existing agent distillation methods typically replay full teacher trajectories or imitate step-by-step teacher tool usage, but they often struggle to train student agents to dynamically plan and act in novel environments. We propose AgentDistill, a novel, training-free agent distillation framework that enables efficient and scalable knowledge transfer via direct reuse of Model-Context-Protocols (MCPs), which are structured and reusable task-solving modules autonomously generated by teacher agents. The reuse of these distilled MCPs enables student agents to generalize their capabilities across domains and solve new problems with minimal supervision or human intervention. Experiments on biomedical and mathematical benchmarks demonstrate that our distilled student agents, built on small language models, can achieve performance comparable to advanced systems using large LLMs such as OctoTools (GPT-4o), highlighting the effectiveness of our framework in building scalable and cost-efficient intelligent agents.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.14728

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.93)

Industry: