AITopics

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

Neural Information Processing Systems, Feb-17-2026, 17:25:33 GMT

e7938ede51225b490bb69f7b361a9259-Supplemental-Conference.pdf

artificial intelligence, dataset, machine learning, (18 more...)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

Neural Information Processing Systems, Feb-17-2026, 17:25:28 GMT

Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots

Ruixiang Tang

In the field of natural language processing, the prevalent approach involves fine-tuning pretrained language models (PLMs) using local samples.

artificial intelligence, machine learning, natural language, (19 more...)

Country:

Asia > Nepal (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.73)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Artificial Intelligence, Oct-27-2025

Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm

Guo, Dadi, Zhou, Tianyi, Liu, Dongrui, Qian, Chen, Ren, Qihan, Shao, Shuai, Fan, Zhiyuan, Fung, Yi R., Wang, Kun, Zhang, Linfeng, Shao, Jing

Recent advances in large language models (LLMs) and agent system designs have empowered agents with unprecedented levels of capability. However, existing agent benchmarks are showing a trend of rapid ceiling-hitting by newly developed agents, making it difficult to meet the demands for evaluating agent abilities. To address this problem, we propose the Trajectory-based Validated-by-Reproducing Agent-benchmark Complexity Evolution (TRACE) framework. This framework takes an original task from an existing benchmark and encourages agents to freely explore and evolve it into a new task with higher difficulty while recording validatable agent trajectories. The framework proceeds in three stages: (1) evolutionary proposal mining, which provides task evolution proposals through preliminary exploration and divergent thinking; (2) problem formation and free exploration, where proposals are conceptualized into feasible problem candidates and the agents then explore them freely while recording their execution trajectories; and (3) multi-level validation, which ensures that the evolved tasks are accompanied by validatable and reproducible trajectories. Experiments on the GAIA benchmark demonstrate that the TRACE framework consistently enhances task complexity while improving the reliability of correctness through validatable execution trajectories. In addition, our framework can successfully adapt to and improve reasoning datasets represented by AIME-2024. This work marks a paradigm shift from static, manually curated benchmarks to dynamic, self-evolving evaluation systems, providing a sustainable and challenging runway for agent development.

large language model, machine learning, natural language, (20 more...)

2510.00415

Country:

Asia > China (0.28)
Europe > Austria (0.28)

Genre: Research Report (0.82)

Industry:

Media > Music (0.68)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Neural Information Processing Systems, Oct-9-2025, 10:29:40 GMT

e7938ede51225b490bb69f7b361a9259-Paper-Conference.pdf

artificial intelligence, machine learning, natural language, (19 more...)

Country:

Asia > Nepal (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.37)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Neural Information Processing Systems, Oct-3-2025, 00:38:58 GMT

Rethinking Deep Neural Network Ownership Verification: Embedding Passports to Defeat Ambiguity Attacks

Lixin Fan, Kam Woh Ng, Chee Seng Chan

This work examines DNN ownership verification methods in the face of ambiguity attacks, which aim to cast doubt on ownership verification by forging counterfeit watermarks.

artificial intelligence, machine learning, passport, (16 more...)

Country: Asia > Malaysia (0.28)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)

van Buuren, Jannick, Giglio, Roberto, Roveda, Loris, Peternel, Luka

Robotic Skill Diversification via Active Mutation of Reward Functions in Reinforcement Learning During a Liquid Pouring Task

arXiv.org Artificial Intelligence, Sep-24-2025

This paper explores how deliberate mutations of the reward function in reinforcement learning can produce diversified skill variations in robotic manipulation tasks, examined with a liquid pouring use case. To this end, we developed a new reward function mutation framework that is based on applying Gaussian noise to the weights of the different terms in the reward function. Inspired by the cost-benefit tradeoff model from human motor control, we designed the reward function with the following key terms: accuracy, time, and effort. The study was performed in a simulation environment created in NVIDIA Isaac Sim, and the setup included a Franka Emika Panda robotic arm holding a glass with a liquid that needed to be poured into a container. The reinforcement learning algorithm was based on Proximal Policy Optimization. We systematically explored how different configurations of mutated weights in the reward function would affect the learned policy. The resulting policies exhibit a wide range of behaviours: from variations in execution of the originally intended pouring task to novel skills useful for unexpected tasks, such as container rim cleaning, liquid mixing, and watering. This approach offers promising directions for robotic systems to perform diversified learning of specific tasks, while also potentially deriving meaningful skills for future tasks.
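
The mutation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the baseline weights, the additive form of the noise, the value of `sigma`, and the sign convention (accuracy rewarded, time and effort penalised) are all assumptions made for illustration.

```python
import random

# Hypothetical baseline weights for the three reward terms named in the abstract.
BASE_WEIGHTS = {"accuracy": 1.0, "time": 0.5, "effort": 0.1}


def mutate_weights(weights, sigma, rng):
    """Return a copy of `weights` with independent Gaussian noise added to each term."""
    return {name: w + rng.gauss(0.0, sigma) for name, w in weights.items()}


def reward(weights, accuracy, time_cost, effort_cost):
    """Weighted sum (assumed form): accuracy is rewarded, time and effort are penalised."""
    return (weights["accuracy"] * accuracy
            - weights["time"] * time_cost
            - weights["effort"] * effort_cost)


rng = random.Random(0)  # fixed seed so the mutated weight sets are reproducible
mutants = [mutate_weights(BASE_WEIGHTS, sigma=0.3, rng=rng) for _ in range(5)]
for w in mutants:
    # Each mutated weight set would define the reward for one separate training run.
    print(w, reward(w, accuracy=0.9, time_cost=2.0, effort_cost=1.5))
```

In a setup like this, each mutated weight set would be used to train its own policy, and the resulting behaviours compared afterwards.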

machine learning, reinforcement learning, variation, (12 more...)

2509.18463

Country: Europe (0.68)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.68)
Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Pozanco, Alberto, Morales, Marianela, Borrajo, Daniel, Veloso, Manuela

Planning with Minimal Disruption

arXiv.org Artificial Intelligence, Aug-22-2025

In many planning applications, we might be interested in finding plans that minimally modify the initial state to achieve the goals. We refer to this concept as plan disruption. In this paper, we formally introduce it, and define various planning-based compilations that aim to jointly optimize both the sum of action costs and plan disruption. Experimental results in different benchmarks show that the reformulated task can be effectively solved in practice to generate plans that balance both objectives.
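
A toy sketch can make the trade-off between action cost and plan disruption concrete. The abstract does not give the formal definition, so this sketch assumes that disruption is the number of facts that differ between the initial and final state, and that the two objectives are combined as a weighted sum. The `Action` type, the door example, and the `weight` parameter are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]     # facts that must hold before the action
    add: FrozenSet[str]     # facts made true
    delete: FrozenSet[str]  # facts made false
    cost: float


def apply_plan(state: FrozenSet[str], plan: List[Action]) -> FrozenSet[str]:
    """Execute the plan from `state`, checking preconditions along the way."""
    for a in plan:
        if not a.pre <= state:
            raise ValueError(f"precondition of {a.name} not satisfied")
        state = (state - a.delete) | a.add
    return state


def disruption(initial: FrozenSet[str], final: FrozenSet[str]) -> int:
    """Assumed measure: number of facts that differ between the two states."""
    return len(initial ^ final)


def score(initial: FrozenSet[str], plan: List[Action], weight: float = 1.0) -> float:
    """Assumed joint objective: total action cost plus weighted disruption."""
    final = apply_plan(initial, plan)
    return sum(a.cost for a in plan) + weight * disruption(initial, final)


f = frozenset
open_door = Action("open_door", f({"door_closed"}), f({"door_open"}), f({"door_closed"}), 1.0)
go_to_work = Action("go_to_work", f({"at_home", "door_open"}), f({"at_work"}), f({"at_home"}), 1.0)
close_door = Action("close_door", f({"door_open"}), f({"door_closed"}), f({"door_open"}), 1.0)

initial = f({"at_home", "door_closed"})
plan_a = [open_door, go_to_work]              # cheaper, but leaves the door open
plan_b = [open_door, go_to_work, close_door]  # one extra action restores the door
print(score(initial, plan_a), score(initial, plan_b))
```

Under these assumptions the longer plan scores better, because the extra action undoes a change to the initial state.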

artificial intelligence, plan disruption, planning & scheduling, (15 more...)

2508.15358

Genre: Research Report (0.50)

Industry: Banking & Finance (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Neural Information Processing Systems, Aug-15-2025, 05:50:07 GMT

A Proofs Proposition 1 The mapping f

See proof of Proposition 3 below for the form of the Jacobian. Theorem 4.7] and so is the product p Equation (50) is an element-wise division. The main preprocessing we did was to (i) remove the "label" attribute from each data set, and (ii) Descriptions for all data sets are below. All data have been completely anonymized. The original task was to predict whether an applicant would be recommended for acceptance by a hierarchical decision model, which has been removed during preprocessing.

test example, training example, validation example, (15 more...)

Country:

North America > United States > Colorado (0.04)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
Europe > Belgium > Flanders (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

arXiv.org Artificial Intelligence, Apr-15-2025

FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions

Marta, Daniel, Holk, Simon, Vasco, Miguel, Lundell, Jens, Homberger, Timon, Busch, Finn, Andersson, Olov, Kragic, Danica, Leite, Iolanda

Preference-based reinforcement learning (PbRL) is a suitable approach for style adaptation of pre-trained robotic behavior: adapting the robot's policy to follow human user preferences while still being able to perform the original task. However, collecting preferences for the adaptation process in robotics is often challenging and time-consuming. In this work we explore the adaptation of pre-trained robots in the low-preference-data regime. We show that, in this regime, recent adaptation approaches suffer from catastrophic reward forgetting (CRF), where the updated reward model overfits to the new preferences, leading the agent to become unable to perform the original task. To mitigate CRF, we propose to enhance the original reward model with a small number of parameters (low-rank matrices) responsible for modeling the preference adaptation. Our evaluation shows that our method can efficiently and effectively adjust robotic behavior to human preferences across simulation benchmark tasks and multiple real-world robotic tasks.
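
The low-rank idea mentioned in this abstract can be sketched on a single linear layer. This is an illustrative sketch, not the authors' method: it assumes the usual low-rank adaptation convention in which the pre-trained weights `W` stay frozen, a trainable product `B @ A` is added to them, and `B` starts at zero. The layer size and all numeric values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Frozen weights of one hypothetical 2x3 layer of the pre-trained reward model.
W = [[0.5, -1.0, 0.25],
     [1.5, 0.0, -0.5]]

# Rank-1 adapter: B is 2x1 (zero-initialised), A is 1x3. Only these would be trained.
B = [[0.0], [0.0]]
A = [[0.1, 0.2, 0.3]]


def adapted_layer(x, W, B, A):
    """Apply the layer with effective weights W + B @ A to the input vector x."""
    w_eff = add(W, matmul(B, A))
    return [row[0] for row in matmul(w_eff, [[v] for v in x])]


x = [1.0, 2.0, 3.0]
print(adapted_layer(x, W, B, A))  # matches the frozen layer's output while B is zero
```

Because `W` is never modified, the original reward can always be recovered by dropping the adapter, which is the intuition behind using a small additive update to limit forgetting.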

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2504.10002

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)