AITopics | openai gym

openai gym

e615c82aba461681ade82da2da38004a-AuthorFeedback.pdf

Neural Information Processing Systems | Aug-17-2025, 01:17:27 GMT

augmentation, data augmentation, final draft, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.34)

Prompt Informed Reinforcement Learning for Visual Coverage Path Planning

Margapuri, Venkat

arXiv.org Artificial Intelligence | Jul-15-2025

Visual coverage path planning with unmanned aerial vehicles (UAVs) requires agents to strategically coordinate UAV motion and camera control to maximize coverage, minimize redundancy, and maintain battery efficiency. Traditional reinforcement learning (RL) methods rely on environment-specific reward formulations that lack semantic adaptability. This study proposes Prompt-Informed Reinforcement Learning (PIRL), a novel approach that integrates the zero-shot reasoning ability and in-context learning capability of large language models with curiosity-driven RL. PIRL leverages semantic feedback from an LLM, GPT-3.5, to dynamically shape the reward function of the Proximal Policy Optimization (PPO) RL policy, guiding the agent in position and camera adjustments for optimal visual coverage. The PIRL agent is trained using OpenAI Gym and evaluated in various environments. Furthermore, the sim-to-real-like ability and zero-shot generalization of the agent are tested by operating the agent in the Webots simulator, which introduces realistic physical dynamics. Results show that PIRL outperforms multiple learning-based baselines such as PPO with static rewards, PPO with exploratory weight initialization, imitation learning, and an LLM-only controller. Across different environments, PIRL outperforms the best-performing baseline by achieving up to 14% higher visual coverage in OpenAI Gym and 27% higher in Webots, up to 25% higher battery efficiency, and up to 18% lower redundancy, depending on the environment. The results highlight the effectiveness of LLM-guided reward shaping in complex spatial exploration tasks and suggest a promising direction for integrating natural language priors into RL for robotics.
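To make the idea of reward shaping concrete, the following is a minimal sketch, not the PIRL implementation. It assumes a toy one-dimensional coverage task with a Gym-style `reset`/`step` interface, and it replaces the LLM with a hypothetical rule-based `semantic_feedback` function. The names `GridCoverageEnv`, `ShapedRewardEnv`, `semantic_feedback`, and `run_episode` are illustrative assumptions, and no PPO training is shown.

```python
class GridCoverageEnv:
    """Toy Gym-style environment (assumption): an agent moves along a 1-D
    strip of cells and is rewarded for visiting cells not yet covered."""

    def __init__(self, size=5):
        self.size = size

    def reset(self):
        self.pos = 0
        self.covered = {0}
        return self.pos

    def step(self, action):
        # action: 0 = move left, 1 = move right (clipped to the strip)
        delta = 1 if action == 1 else -1
        self.pos = max(0, min(self.size - 1, self.pos + delta))
        new_cell = self.pos not in self.covered
        self.covered.add(self.pos)
        reward = 1.0 if new_cell else 0.0
        done = len(self.covered) == self.size
        return self.pos, reward, done, {"new_cell": new_cell}


def semantic_feedback(info):
    """Hypothetical stand-in for an LLM's judgement of the last move:
    penalise redundant visits, leave useful moves untouched."""
    return 0.0 if info["new_cell"] else -0.5


class ShapedRewardEnv:
    """Wrapper that adds a weighted feedback term to the base reward."""

    def __init__(self, env, feedback_fn, weight=1.0):
        self.env = env
        self.feedback_fn = feedback_fn
        self.weight = weight

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        shaped = reward + self.weight * self.feedback_fn(info)
        return obs, shaped, done, info


def run_episode(env, actions):
    """Play a fixed action sequence and return the accumulated reward."""
    env.reset()
    total = 0.0
    for action in actions:
        _, reward, done, _ = env.step(action)
        total += reward
        if done:
            break
    return total
```

With the action sequence `[1, 0, 1, 1]` on a three-cell strip, the unshaped environment returns 2.0 while the shaped one returns 1.0, because the two redundant moves are each penalised by 0.5. A learning agent trained on the shaped signal would therefore be pushed away from revisiting cells.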

large language model, machine learning, natural language, (18 more...)

2507.10284

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (0.87)
Information Technology > Robotics & Automation (0.35)
Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym

La, Ngoc, Mon-Williams, Ruaridh, Shah, Julie A.

arXiv.org Artificial Intelligence | May-29-2025

In recent years, reinforcement learning (RL) methods have been widely tested using tools like OpenAI Gym, though many tasks in these environments could also benefit from hierarchical planning. However, no existing tool enables seamless integration of hierarchical planning with RL. Hierarchical Domain Definition Language (HDDL), used in classical planning, introduces a structured approach well-suited for model-based RL to address this gap. To support this integration, we introduce HDDLGym, a Python-based tool that automatically generates OpenAI Gym environments from HDDL domains and problems. HDDLGym serves as a link between RL and hierarchical planning, supporting multi-agent scenarios and enabling collaborative planning among agents. This paper provides an overview of HDDLGym's design and implementation, highlighting the challenges and design choices involved in integrating HDDL with the Gym interface, and applying RL policies to support hierarchical planning. We also provide detailed instructions and demonstrations for using the HDDLGym framework, including how to work with existing HDDL domains and problems from International Planning Competitions, exemplified by the Transport domain. Additionally, we offer guidance on creating new HDDL domains for multi-agent scenarios and demonstrate the practical use of HDDLGym in the Overcooked domain. By leveraging the advantages of HDDL and Gym, HDDLGym aims to be a valuable tool for studying RL in hierarchical planning, particularly in multi-agent contexts.

artificial intelligence, machine learning, natural language, (18 more...)

2505.22597

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.82)

Stealing That Free Lunch: Exposing the Limits of Dyna-Style Reinforcement Learning

Barkley, Brett, Fridovich-Keil, David

arXiv.org Artificial Intelligence | Dec-20-2024

Dyna-style off-policy model-based reinforcement learning (DMBRL) algorithms are a family of techniques for generating synthetic state transition data and thereby enhancing the sample efficiency of off-policy RL algorithms. This paper identifies and investigates a surprising performance gap observed when applying DMBRL algorithms across different benchmark environments with proprioceptive observations. We show that, while DMBRL algorithms perform well in OpenAI Gym, their performance can drop significantly in DeepMind Control Suite (DMC), even though these settings offer similar tasks and identical physics backends. Modern techniques designed to address several key issues that arise in these settings do not provide a consistent improvement across all environments, and overall our results show that adding synthetic rollouts to the training process -- the backbone of Dyna-style algorithms -- significantly degrades performance across most DMC environments. Our findings contribute to a deeper understanding of several fundamental challenges in model-based RL and show that, like many optimization fields, there is no free lunch when evaluating performance across diverse benchmarks in RL.
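As an illustration of what "adding synthetic rollouts to the training process" means, here is a minimal tabular Dyna-Q sketch. It is a deliberately simplified stand-in: the paper studies deep off-policy DMBRL algorithms on continuous-control benchmarks, whereas this example assumes a hypothetical six-state chain, a lookup-table model, and Q-learning updates. All names and hyperparameters below are illustrative assumptions.

```python
import random

N_STATES = 6          # states 0..5; reaching state 5 ends the episode
ACTIONS = (0, 1)      # 0 = left, 1 = right


def env_step(state, action):
    """Deterministic chain: reward 1.0 only on entering the last state."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def dyna_q(episodes=50, planning_steps=10, alpha=0.5, gamma=0.9,
           eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    model = {}  # learned model: (state, action) -> (next_state, reward, done)

    def update(s, a, r, s2, done):
        # One-step Q-learning update, used for real and synthetic data alike
        target = r if done else r + gamma * max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                best = max(q[(s, b)] for b in ACTIONS)
                a = rng.choice([b for b in ACTIONS if q[(s, b)] == best])
            s2, r, done = env_step(s, a)
            update(s, a, r, s2, done)        # learn from the real transition
            model[(s, a)] = (s2, r, done)    # record it in the model
            for _ in range(planning_steps):  # replay synthetic transitions
                ps, pa = rng.choice(list(model))
                ps2, pr, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            s = s2
    return q
```

Setting `planning_steps=0` reduces this to plain Q-learning, which gives a simple way to compare training with and without synthetic data. In this toy setting the model is exact, so planning can only help; the degradation reported in the abstract concerns learned, imperfect models in much harder environments.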

machine learning, mbpo, reinforcement learning, (17 more...)

2412.14312

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym

Raponi, Elena, Carraz, Nathanael Rakotonirina, Rapin, Jérémy, Doerr, Carola, Teytaud, Olivier

arXiv.org Artificial Intelligence | Jan-2-2024

The growing ubiquity of machine learning (ML) has led it to enter various areas of computer science, including black-box optimization (BBO). Recent research is particularly concerned with Bayesian optimization (BO). BO-based algorithms are popular in the ML community, as they are used for hyperparameter optimization and more generally for algorithm configuration. However, their efficiency decreases as the dimensionality of the problem and the budget of evaluations increase. Meanwhile, derivative-free optimization methods have evolved independently in the optimization community. Therefore, we seek to understand whether cross-fertilization is possible between the two communities, ML and BBO, i.e., whether algorithms that are heavily used in ML also work well in BBO and vice versa. Comparative experiments often involve rather small benchmarks and show visible problems in the experimental setup, such as poor initialization of baselines, overfitting due to problem-specific setting of hyperparameters, and low statistical significance. With this paper, we update and extend a comparative study presented by Hutter et al. in 2013. We compare BBO tools for ML with more classical heuristics, first on the well-known BBOB benchmark suite from the COCO environment and then on Direct Policy Search for OpenAI Gym, a reinforcement learning benchmark. Our results confirm that BO-based optimizers perform well on both benchmarks when budgets are limited, albeit with a higher computational cost, while they are often outperformed by algorithms from other families when the evaluation budget becomes larger. We also show that some algorithms from the BBO community perform surprisingly well on ML tasks.
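Direct policy search treats the return of a policy as a black-box function of its parameters and hands that function to a derivative-free optimizer with a fixed evaluation budget. The sketch below is one possible illustration under stated assumptions: `episode_return` is a hypothetical toy rollout standing in for a Gym episode, and the optimizer is a basic (1+1) evolution strategy with a simple success-based step-size rule. It is not one of the specific optimizers or benchmarks compared in the paper.

```python
import random


def episode_return(params):
    """Toy stand-in for a Gym rollout (assumption): a 1-D point starts at
    x = 1.0 and a linear policy u = w * x + b acts on it for 20 steps.
    The return is the negative accumulated squared distance to the origin."""
    w, b = params
    x, total = 1.0, 0.0
    for _ in range(20):
        u = max(-1.0, min(1.0, w * x + b))   # bounded action
        x = x + 0.5 * u
        total -= x * x
    return total


def one_plus_one_es(objective, x0, budget, sigma=0.5, seed=0):
    """(1+1) evolution strategy that maximises `objective`.
    `budget` is the total number of objective evaluations allowed."""
    rng = random.Random(seed)
    best, best_val = list(x0), objective(x0)
    for _ in range(budget - 1):
        cand = [v + sigma * rng.gauss(0.0, 1.0) for v in best]
        val = objective(cand)
        if val >= best_val:
            best, best_val = cand, val
            sigma *= 1.5     # success: widen the search
        else:
            sigma *= 0.9     # failure: narrow the search
    return best, best_val
```

Calling `one_plus_one_es(episode_return, [0.0, 0.0], budget=50)` spends exactly 50 rollouts. Because the strategy is elitist, the returned value is never worse than that of the starting parameters, which makes runs at different budgets easy to compare.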

algorithm, budget, dimension, (14 more...)

doi: 10.1109/TEVC.2023.3346788

2310.00077

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Air (0.62)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.62)

Everything to know about Elon Musk's OpenAI, The Maker Of ChatGPT

#artificialintelligence | Dec-30-2022, 09:40:19 GMT

Speak of Elon Musk and in all probability, companies like Twitter, Tesla or SpaceX will come to your mind. But fewer people know of his connection to OpenAI -- an artificial intelligence (AI) research and development firm that is behind the disruptive chatbot ChatGPT. Co-founded in 2015 by a group that included Musk and former Y Combinator president Sam Altman, OpenAI launched ChatGPT in November 2022, and within a week the application had over a million users. Able to handle tasks ranging from coding to conversation that mimics human intelligence, ChatGPT has surpassed previous standards of AI capabilities and has introduced a new chapter in AI technologies and machine learning systems. If you are intrigued by artificial intelligence and take an interest in deep learning and how it can benefit humanity, then you must know about the history of OpenAI and the levels AI development has reached.

elon musk, intelligence, openai, (15 more...)

Country: North America > United States > California > San Francisco County > San Francisco (0.05)

Industry: Information Technology (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

A Survey on Quantum Reinforcement Learning

Meyer, Nico, Ufrecht, Christian, Periyasamy, Maniraman, Scherer, Daniel D., Plinge, Axel, Mutschler, Christopher

arXiv.org Artificial Intelligence | Nov-7-2022

With recent advances in the fabrication and control of hardware for quantum information processing, the possibilities of merging quantum computing (QC) with machine learning (ML) have received a huge amount of attention within the growing research community. Reinforcement learning (RL) is the third ML paradigm, alongside supervised and unsupervised learning. In this survey article, we provide an overview of so-called quantum reinforcement learning (QRL) algorithms. We understand these as quantum-assisted approaches that solve a particular task (be it classical or quantum in nature) by employing quantum resources (in simulation and/or in experiment). In order to keep this contribution as self-contained as possible, we provide the necessary background before venturing into the QRL literature. We start out with a brief recap of the essentials of the RL paradigm in the fully classical setting in Sec. 2. Further, in Sec. 3 we provide a quick introduction to QC and variational quantum circuits (VQCs). Readers familiar with either of the topics may safely skip these sections. In Sec. 4 we turn our attention to the emerging field of QRL, starting out with a quick overview of the literature.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2211.03464

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Advanced AI: Deep Reinforcement Learning in Python

#artificialintelligence | Sep-24-2022, 21:13:25 GMT

Created by Lazy Programmer Team, Lazy Programmer Inc. This course is all about the application of deep learning and neural networks to reinforcement learning. If you've taken my first reinforcement learning class, then you know that reinforcement learning is on the bleeding edge of what we can do with AI. Specifically, the combination of deep learning with reinforcement learning has led to AlphaGo beating a world champion in the strategy game Go, it has led to self-driving cars, and it has led to machines that can play video games at a superhuman level. Reinforcement learning has been around since the 70s but none of this has been possible until now.

deep reinforcement learning, neural network, reinforcement, (7 more...)

Country: North America > United States > California (0.05)

Genre: Instructional Material > Course Syllabus & Notes (0.70)

Industry:

Leisure & Entertainment > Games (0.91)
Information Technology (0.57)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.78)

COOL-MC: A Comprehensive Tool for Reinforcement Learning and Model Checking

Gross, Dennis, Jansen, Nils, Junges, Sebastian, Perez, Guillermo A.

arXiv.org Artificial Intelligence | Sep-15-2022

This paper presents COOL-MC, a tool that integrates state-of-the-art reinforcement learning (RL) and model checking. Specifically, the tool builds upon the OpenAI gym and the probabilistic model checker Storm. COOL-MC provides the following features: (1) a simulator to train RL policies in the OpenAI gym for Markov decision processes (MDPs) that are defined as input for Storm, (2) a new model builder for Storm, which uses callback functions to verify (neural network) RL policies, (3) formal abstractions that relate models and policies specified in OpenAI gym or Storm, and (4) algorithms to obtain bounds on the performance of so-called permissive policies. We describe the components and architecture of COOL-MC and demonstrate its features on multiple benchmark environments.

cool-mc, machine learning, reinforcement learning, (18 more...)

2209.07133

Country: