Refining Minimax Regret for Unsupervised Environment Design
Beukman, Michael, Coward, Samuel, Matthews, Michael, Fellows, Mattie, Jiang, Minqi, Dennis, Michael, Foerster, Jakob
In unsupervised environment design, reinforcement learning agents are trained on environment configurations (levels) generated by an adversary that maximises some objective. Regret is a commonly used objective that theoretically results in a minimax regret (MMR) policy with desirable robustness guarantees; in particular, the agent's maximum regret is bounded. However, once the agent reaches this regret bound on all levels, the adversary will only sample levels where regret cannot be further reduced. Although there are possible performance improvements to be made outside of these regret-maximising levels, learning stagnates. In this work, we introduce Bayesian level-perfect MMR (BLP), a refinement of the minimax regret objective that overcomes this limitation. We formally show that solving for this objective results in a subset of MMR policies, and that BLP policies act consistently with a Perfect Bayesian policy over all levels. We further introduce an algorithm, ReMiDi, that results in a BLP policy at convergence. We empirically demonstrate that training on levels from a minimax regret adversary causes learning to prematurely stagnate, but that ReMiDi continues learning.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (6 more...)
- Research Report (0.50)
- Instructional Material (0.46)
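The regret-maximising adversary described in the abstract above can be sketched in a few lines. The function names are illustrative, and the level-optimal returns are assumed to be known here (in practice, UED methods must estimate them); this is the generic regret-maximising adversary of UED, not the ReMiDi algorithm itself:

```python
def regret(optimal_return, agent_return):
    """Regret of the agent on one level: gap to the level-optimal policy."""
    return optimal_return - agent_return

def adversary_pick(levels, optimal_returns, agent_returns):
    """A regret-maximising adversary proposes the level where the
    agent's regret is currently largest."""
    regrets = [regret(o, a) for o, a in zip(optimal_returns, agent_returns)]
    best = max(range(len(regrets)), key=lambda i: regrets[i])
    return levels[best], regrets[best]

# Toy numbers: once the agent reaches the same regret bound on several
# levels, the adversary keeps sampling among them, and performance that
# could still improve on other levels (here maze_c) stagnates.
levels = ["maze_a", "maze_b", "maze_c"]
opt = [1.0, 1.0, 0.5]
agent = [0.7, 0.7, 0.5]
picked, r = adversary_pick(levels, opt, agent)
```

This makes the paper's failure mode concrete: maze_c has zero regret and is never proposed, even though nothing prevents the agent from improving elsewhere.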
Generalization through Diversity: Improving Unsupervised Environment Design
Li, Wenjun, Varakantham, Pradeep, Li, Dexun
Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing Chess on an 8x8 board). Due to this dependence, small changes in the environment (e.g., positions of obstacles in the maze, size of the board) can severely affect the effectiveness of the policy learned by the agent. To that end, existing work has proposed training RL agents on an adaptive curriculum of environments (generated automatically) to improve performance on out-of-distribution (OOD) test scenarios. Specifically, existing research has employed the potential for the agent to learn in an environment (captured using Generalized Advantage Estimation, GAE) as the key factor to select the next environment(s) to train the agent. However, such a mechanism can select similar environments (with a high potential to learn), thereby making agent training redundant on all but one of those environments. To address this, we provide a principled approach to adaptively identify diverse environments based on a novel distance measure relevant to environment design. We empirically demonstrate the versatility and effectiveness of our method in comparison to multiple leading approaches for unsupervised environment design on three distinct benchmark problems used in the literature.
- Education (0.96)
- Leisure & Entertainment > Games (0.66)
- Leisure & Entertainment > Sports > Motorsports (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
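The GAE score mentioned in the abstract above, which prior work uses as the "potential to learn" signal for selecting environments, is a standard computation. A minimal sketch of generic GAE (not this paper's selection mechanism or its diversity measure):

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one trajectory.
    `values` has one extra entry: the bootstrap value of the final state."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of TD errors (the GAE recursion).
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages

# Environments where the advantage magnitudes are large are the ones
# scored as having high "potential to learn".
adv = gae([1.0, 0.0, 1.0], [0.5, 0.5, 0.5, 0.0])
```

The paper's point is that ranking environments by this score alone can surface near-duplicates, which its diversity-aware distance measure is designed to avoid.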
CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
Azad, Abdus Salam, Gur, Izzeddin, Emhoff, Jasper, Alexis, Nathaniel, Faust, Aleksandra, Abbeel, Pieter, Stoica, Ion
Reinforcement Learning (RL) algorithms are often known for sample inefficiency and difficult generalization. Recently, Unsupervised Environment Design (UED) emerged as a new paradigm for zero-shot generalization by simultaneously learning a task distribution and agent policies on the generated tasks. This is a non-stationary process in which the task distribution evolves along with agent policies, creating instability over time. While past works demonstrated the potential of such approaches, sampling effectively from the task space remains an open challenge, bottlenecking these approaches. To this end, we introduce CLUTR: a novel unsupervised curriculum learning algorithm.
- Education (1.00)
- Information Technology (0.93)
- Leisure & Entertainment > Sports (0.46)
Uncertain Time Series Classification With Shapelet Transform
Mbouopda, Michael Franklin, Nguifo, Engelbert Mephu
Time series classification is a task that aims at classifying chronological data. It is used in a diverse range of domains such as meteorology, medicine and physics. In the last decade, many algorithms have been built to perform this task with very appreciable accuracy. However, applications where time series carry uncertainty have been under-explored. Using uncertainty propagation techniques, we propose a new uncertain dissimilarity measure based on Euclidean distance. We then propose the uncertain shapelet transform algorithm for the classification of uncertain time series. The extensive experiments we conducted on state-of-the-art datasets show the effectiveness of our contribution. The source code of our contribution and the datasets we used are available in a public repository.
- North America > United States > New York > New York County > New York City (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > France > Auvergne-Rhône-Alpes > Puy-de-Dôme > Clermont-Ferrand (0.04)
- Asia (0.04)
- Research Report (0.64)
- Workflow (0.46)
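A minimal sketch of how first-order uncertainty propagation can extend the Euclidean distance to values that carry uncertainties, as the abstract above describes. This follows the generic propagation rules for products and square roots; it is not necessarily the paper's exact dissimilarity measure, and the function name is illustrative:

```python
import math

def uncertain_euclidean(x, dx, y, dy):
    """Euclidean distance between two series whose values carry
    uncertainties dx, dy, propagated to first order.
    Returns (distance, distance_uncertainty)."""
    # Sum of squared differences.
    s = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    # First-order propagation through each (xi - yi)**2 term:
    # delta[(xi - yi)**2] = 2 * |xi - yi| * (dxi + dyi).
    ds = sum(2 * abs(xi - yi) * (dxi + dyi)
             for xi, dxi, yi, dyi in zip(x, dx, y, dy))
    d = math.sqrt(s)
    # Propagation through the square root: delta[sqrt(s)] = ds / (2 * sqrt(s)).
    dd = ds / (2 * d) if d > 0 else math.sqrt(ds)
    return d, dd

d, dd = uncertain_euclidean([1.0, 2.0], [0.1, 0.1], [2.0, 4.0], [0.1, 0.1])
```

The returned pair carries the uncertainty through to the dissimilarity itself, which is what allows a shapelet-transform classifier to take measurement uncertainty into account rather than discarding it.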