AITopics | zero-shot reinforcement learning

zero-shot reinforcement learning

Zero-Shot Reinforcement Learning Under Partial Observability

Jeen, Scott, Bewley, Tom, Cullen, Jonathan M.

arXiv.org Artificial Intelligence, Jun-19-2025

Recent work has shown that, under certain assumptions, zero-shot reinforcement learning (RL) methods can generalise to any unseen task in an environment after reward-free pre-training. Access to Markov states is one such assumption, yet, in many real-world applications, the Markov state is only partially observable. Here, we explore how the performance of standard zero-shot RL methods degrades when subjected to partial observability, and show that, as in single-task RL, memory-based architectures are an effective remedy. We evaluate our memory-based zero-shot RL methods in domains where the states, rewards and a change in dynamics are partially observed, and show improved performance over memory-free baselines.

arxiv preprint arxiv, large language model, machine learning, (11 more...)

2506.15446

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Zero-Shot Reinforcement Learning from Low Quality Data

Neural Information Processing Systems, May-26-2025, 18:08:31 GMT

Zero-shot reinforcement learning (RL) promises to provide agents that can perform any task in an environment after an offline, reward-free pre-training phase. Methods leveraging successor measures and successor features have shown strong performance in this setting, but require access to large heterogeneous datasets for pre-training, which cannot be expected for most real problems. Here, we explore how the performance of zero-shot RL methods degrades when trained on small homogeneous datasets, and propose fixes inspired by conservatism, a well-established feature of performant single-task offline RL algorithms. We evaluate our proposals across various datasets, domains and tasks, and show that conservative zero-shot RL algorithms outperform their non-conservative counterparts on low-quality datasets, and perform no worse on high-quality datasets. Somewhat surprisingly, our proposals also outperform baselines that get to see the task during training.

large language model, machine learning, natural language, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Zero-Shot Reinforcement Learning via Function Encoders

Ingebrand, Tyler, Zhang, Amy, Topcu, Ufuk

arXiv.org Artificial Intelligence, Jan-30-2024

Although reinforcement learning (RL) can solve many challenging sequential decision-making problems, achieving zero-shot transfer across related tasks remains a challenge. The difficulty lies in finding a good representation for the current task so that the agent understands how it relates to previously seen tasks. To achieve zero-shot transfer, we introduce the function encoder, a representation learning algorithm which represents a function as a weighted combination of learned, non-linear basis functions. By using a function encoder to represent the reward function or the transition function, the agent has information on how the current task relates to previously seen tasks via a coherent vector representation. Thus, the agent is able to achieve transfer between related tasks at run time with no additional training. We demonstrate state-of-the-art data efficiency, asymptotic performance, and training stability in three RL fields by augmenting basic RL algorithms with a function encoder task representation.
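The idea of representing a function as a weighted combination of basis functions can be illustrated with a minimal sketch. This is not the authors' implementation: the basis here is a fixed, hand-picked set of orthonormal Legendre polynomials rather than learned neural networks, the coefficients are estimated with a simple inner-product average over samples, and the names `BASIS`, `encode`, `decode`, `midpoint_grid` and `reward` are all hypothetical.

```python
import math

# ASSUMPTION: a fixed orthonormal basis on [-1, 1] (normalised Legendre
# polynomials). In a function encoder these would be learned networks.
BASIS = [
    lambda x: math.sqrt(0.5),
    lambda x: math.sqrt(1.5) * x,
    lambda x: math.sqrt(2.5) * 0.5 * (3 * x * x - 1),
]


def encode(xs, ys):
    """Estimate coefficients c_i ~ <f, g_i> from samples (x, f(x)).

    Assumes the xs cover [-1, 1] evenly, so the inner product is
    approximated by the interval length times the sample mean.
    """
    volume = 2.0  # length of the interval [-1, 1]
    n = len(xs)
    return [volume * sum(y * g(x) for x, y in zip(xs, ys)) / n for g in BASIS]


def decode(coeffs, x):
    """Reconstruct the function value as a weighted sum of basis functions."""
    return sum(c * g(x) for c, g in zip(coeffs, BASIS))


def midpoint_grid(n):
    """Return n evenly spaced midpoints covering [-1, 1]."""
    return [-1 + (2 * i + 1) / n for i in range(n)]


def reward(x):
    """Hypothetical task reward, chosen to lie in the span of the basis."""
    return 1.0 + 2.0 * x - 3.0 * x * x


xs = midpoint_grid(1000)
coeffs = encode(xs, [reward(x) for x in xs])  # the task representation
approx = decode(coeffs, 0.3)                  # close to reward(0.3)
```

In this sketch the coefficient vector `coeffs` plays the role of the task representation: a policy or value function could take it as an extra input, so a new reward function only requires recomputing the coefficients from a handful of samples rather than retraining.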

algorithm, function encoder, representation, (14 more...)

2401.17173

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Low Emission Building Control with Zero-Shot Reinforcement Learning

Jeen, Scott R., Abate, Alessandro, Cullen, Jonathan M.

arXiv.org Artificial Intelligence, Mar-5-2023

Heating and cooling systems in buildings account for 31% of global energy use, much of which is regulated by Rule Based Controllers (RBCs) that neither maximise energy efficiency nor minimise emissions by interacting optimally with the grid. Control via Reinforcement Learning (RL) has been shown to significantly improve building energy efficiency, but existing solutions require access to building-specific simulators or data that cannot be expected for every building in the world. In response, we show it is possible to obtain emission-reducing policies without such knowledge a priori, a paradigm we call zero-shot building control. We combine ideas from system identification and model-based RL to create PEARL (Probabilistic Emission-Abating Reinforcement Learning) and show that a short period of active exploration is all that is required to build a performant model. In experiments across three varied building energy simulations, we show PEARL outperforms an existing RBC once, and popular RL baselines in all cases, reducing building emissions by as much as 31% whilst maintaining thermal comfort. Our source code is available online via https://enjeeneer.io/projects/pearl/

low emission building control, zero-shot reinforcement learning

doi: 10.1609/aaai.v37i12.26668

2206.14191

Genre: Research Report (0.40)

Industry:

Energy (0.53)
Construction & Engineering > HVAC (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.80)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.60)

Low Emission Building Control with Zero-Shot Reinforcement Learning

Jeen, Scott R., Abate, Alessandro, Cullen, Jonathan M.

arXiv.org Artificial Intelligence, Aug-15-2022

Heating and cooling systems in buildings account for 31% of global energy use, much of which is regulated by Rule Based Controllers (RBCs) that neither maximise energy efficiency nor minimise emissions by interacting optimally with the grid. Control via Reinforcement Learning (RL) has been shown to significantly improve building energy efficiency, but existing solutions require access to building-specific simulators or data that cannot be expected for every building in the world. In response, we show it is possible to obtain emission-reducing policies without such knowledge a priori, a paradigm we call zero-shot building control. We combine ideas from system identification and model-based RL to create PEARL (Probabilistic Emission-Abating Reinforcement Learning) and show that a short period of active exploration is all that is required to build a performant model. In experiments across three varied building energy simulations, we show PEARL outperforms an existing RBC once, and popular RL baselines in all cases, reducing building emissions by as much as 31% whilst maintaining thermal comfort. Our source code is available online via https://enjeeneer.io/projects/pearl

low emission building control, zero-shot reinforcement learning

2208.06385

Genre: Research Report (0.66)

Industry:

Energy (0.53)
Construction & Engineering > HVAC (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.80)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.60)
