AITopics | hyperagent

Collaborating Authors

hyperagent

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Adaptive Foundation Models for Online Decisions: HyperAgent with Fast Incremental Uncertainty Estimation

Li, Yingru, Xu, Jiawei, Luo, Zhi-Quan

arXiv.org Artificial IntelligenceJul-18-2024

Foundation models often struggle with uncertainty when faced with new situations in online decision-making, necessitating scalable and efficient exploration to resolve this uncertainty. We introduce GPT-HyperAgent, an augmentation of GPT with HyperAgent for uncertainty-aware, scalable exploration in contextual bandits, a fundamental online decision problem involving natural language input. We prove that HyperAgent achieves fast incremental uncertainty estimation with $\tilde{O}(\log T)$ per-step computational complexity over $T$ periods under the linear realizable assumption. Our analysis demonstrates that HyperAgent's regret order matches that of exact Thompson sampling in linear contextual bandits, closing a significant theoretical gap in scalable exploration. Empirical results in real-world contextual bandit tasks, such as automated content moderation with human feedback, validate the practical effectiveness of GPT-HyperAgent for safety-critical decisions. Our code is open-sourced at \url{https://github.com/szrlee/GPT-HyperAgent/}.

adaptive foundation model, fast incremental uncertainty estimation, online decision, (1 more...)

arXiv.org Artificial Intelligence

2407.13195

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence (0.53)

Add feedback

HyperAgent: A Simple, Scalable, Efficient and Provable Reinforcement Learning Framework for Complex Environments

Li, Yingru, Xu, Jiawei, Han, Lei, Luo, Zhi-Quan

arXiv.org Machine LearningMar-18-2024

To solve complex tasks under resource constraints, reinforcement learning (RL) agents need to be simple, efficient, and scalable, addressing (1) large state spaces and (2) the continuous accumulation of interaction data. We propose HyperAgent, an RL framework featuring the hypermodel and index sampling schemes that enable computation-efficient incremental approximation for the posteriors associated with general value functions without the need for conjugacy, and data-efficient action selection. Implementing HyperAgent is straightforward, requiring only one additional module beyond what is necessary for Double-DQN. HyperAgent stands out as the first method to offer robust performance in large-scale deep RL benchmarks while achieving provably scalable per-step computational complexity and attaining sublinear regret under tabular assumptions. HyperAgent can solve Deep Sea hard exploration problems with episodes that optimally scale with problem size and exhibits significant efficiency gains in both data and computation under the Atari benchmark. The core of our theoretical analysis is the sequential posterior approximation argument, enabled by the first analytical tool for sequential random projection -- a non-trivial martingale extension of the Johnson-Lindenstrauss. This work bridges the theoretical and practical realms of RL, establishing a new benchmark for RL algorithm design.

complex environment, hyperagent, provable reinforcement learning framework, (1 more...)

arXiv.org Machine Learning

2402.10228

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback