AITopics | toppo

toppo

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

TOPPO: Rethinking PPO for Multi-Task Reinforcement Learning with Critic Balancing

Li, Yuanpeng, Lin, Gefei, Qu, Annie, Miao, Rui

arXiv.org Machine Learning | May-13-2026

Soft Actor-Critic (SAC) and its variants dominate Multi-Task Reinforcement Learning (MTRL) due to their off-policy sample efficiency, while on-policy methods such as Proximal Policy Optimization (PPO) remain underexplored. We diagnose that PPO in MTRL suffers from a previously overlooked issue: critic-side gradient ill-conditioning, which may cause tail tasks to stall while easy tasks dominate the value function's updates. To address this, we propose TOPPO (Tail-Optimized PPO), a reformulation of PPO via Critic Balancing -- a set of modules that improve gradient conditioning and balance learning dynamics across tasks. Unlike prior approaches that rely on modular architectures or large models, TOPPO targets the optimization bottleneck within PPO itself. Empirically, TOPPO achieves stronger mean and tail-task performance than published SAC-family and ARS-family baselines while using substantially fewer parameters and environment steps on the Meta-World+ benchmark. Notably, TOPPO matches or surpasses strong SAC baselines early in training and maintains superior performance at full budget. Ablations confirm the effectiveness of each module in TOPPO and provide insights into their interactions. Our results demonstrate that, with proper optimization, on-policy methods can rival or exceed off-policy approaches in MTRL, challenging the prevailing reliance on SAC and highlighting critic-side gradient conditioning as the central bottleneck.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2605.11473

Genre: Research Report > New Finding (0.85)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Transductive Off-policy Proximal Policy Optimization

Gan, Yaozhong, Yan, Renye, Tan, Xiaoyang, Wu, Zhe, Xing, Junliang

arXiv.org Artificial Intelligence | Jun-6-2024

Proximal Policy Optimization (PPO) is a popular model-free reinforcement learning algorithm, esteemed for its simplicity and efficacy. However, due to its inherent on-policy nature, its proficiency in harnessing data from disparate policies is constrained. This paper introduces a novel off-policy extension to the original PPO method, christened Transductive Off-policy PPO (ToPPO). Herein, we provide theoretical justification for incorporating off-policy data in PPO training and prudent guidelines for its safe application. Our contribution includes a novel formulation of the policy improvement lower bound for prospective policies derived from off-policy data, accompanied by a computationally efficient mechanism to optimize this bound, underpinned by assurances of monotonic improvement. Comprehensive experimental results across six representative tasks underscore ToPPO's promising performance.
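
For context, PPO's clipped surrogate objective weights each sample by the probability ratio between the policy being optimized and the policy that collected the data, which is why data from unrelated policies is hard to reuse. The sketch below shows that standard PPO objective only; it is not ToPPO's off-policy lower bound, which the abstract does not spell out. The function name `ppo_clip_loss`, its arguments, and the default `clip_eps=0.2` are illustrative assumptions.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between current and data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic choice: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Return a loss to minimize, hence the sign flip.
    return -total / len(advantages)
```

When the two policies agree, the ratio is 1 and the loss reduces to the negative mean advantage; as the ratio moves outside the clipping range, the clipped term removes any incentive to push it further in the direction the advantage favors.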

algorithm, timestep, toppo, (11 more...)

2406.03894

Country:

North America > United States (0.05)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

How High School Should Change for an Era of AI and Robots

#artificialintelligence | Nov-30-2022, 09:45:15 GMT

Public high school in America was the product of the time of its invention, which was way back in 1821. But in this era of rapid technological change marked by artificial intelligence and robots moving into more aspects of work and social life, maybe the way teaching is done in high school needs a reboot. The discussion is framed around a thought experiment: What would an ideal high school of the year 2040 look like? The tour guides of this imagined school of the future are two authors: Jim Tracy, a senior advisor at the nonprofit Jobs for the Future who in his career has led private K-12 schools and served as a college president; and Greg Toppo, a longtime education journalist. They focus on how coming technological change will end up shifting the relationship between people and machines, and therefore between students and teachers.

algorithm, high school, student, (13 more...)

Country: North America > United States > Iowa (0.06)

Industry: Education > Educational Setting > K-12 Education > Secondary School (1.00)

Technology: Information Technology > Artificial Intelligence > Robots (0.74)
