LLM-Driven Intrinsic Motivation for Sparse Reward Reinforcement Learning
Quadros, André, Silva, Cassio, Alves, Ronnie
This paper explores the combination of two intrinsic motivation strategies to improve the efficiency of reinforcement learning (RL) agents in environments with extremely sparse rewards, where traditional learning struggles due to infrequent positive feedback. We propose integrating Variational State as Intrinsic Reward (VSIMR), which uses Variational AutoEncoders (VAEs) to reward state novelty, with an intrinsic reward approach derived from Large Language Models (LLMs). The LLMs leverage their pre-trained knowledge to generate reward signals based on environment and goal descriptions, guiding the agent. We implemented this combined approach with an Advantage Actor-Critic (A2C) agent in the MiniGrid DoorKey environment, a benchmark for sparse rewards. Our empirical results show that this combined strategy significantly increases agent performance and sample efficiency compared to using each strategy individually or a standard A2C agent, which failed to learn. Analysis of learning curves indicates that the two strategies complement each other on different aspects of the environment and task: VSIMR drives exploration of new states, while the LLM-derived rewards facilitate progressive exploitation towards goals.
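The abstract does not give the reward formulas, so the following is only a minimal sketch of how a VAE-based novelty signal and an LLM-derived score might be folded into one reward. It assumes the novelty reward is the per-state VAE loss (squared reconstruction error plus KL divergence to a standard normal prior) and that the signals are combined as a weighted sum. The function names and the weights `beta_novelty` and `beta_llm` are hypothetical and do not come from the paper.

```python
import math

def vae_novelty_reward(state, reconstruction, mu, logvar):
    """Assumed VSIMR-style novelty: the VAE loss for one state.

    States the VAE reconstructs poorly (rarely seen) yield a larger reward.
    """
    # Squared reconstruction error between the state and its VAE output
    recon_error = sum((s - r) ** 2 for s, r in zip(state, reconstruction))
    # KL divergence between N(mu, exp(logvar)) and the standard normal prior
    kl = -0.5 * sum(1 + lv - m ** 2 - math.exp(lv) for m, lv in zip(mu, logvar))
    return recon_error + kl

def combined_reward(extrinsic, novelty, llm_score, beta_novelty=0.1, beta_llm=0.1):
    """Weighted sum of the extrinsic reward and both intrinsic signals.

    The additive form and the default weights are illustrative assumptions.
    """
    return extrinsic + beta_novelty * novelty + beta_llm * llm_score
```

In a sparse-reward episode `extrinsic` is zero on almost every step, so under this sketch the agent's learning signal comes mostly from the two intrinsic terms until the goal is first reached.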
LLM-BABYBENCH: Understanding and Evaluating Grounded Planning and Reasoning in LLMs
Choukrani, Omar, Malek, Idriss, Orel, Daniil, Xie, Zhuohan, Iklassov, Zangir, Takáč, Martin, Lahlou, Salem
Assessing the capacity of Large Language Models (LLMs) to plan and reason within the constraints of interactive environments is crucial for developing capable AI agents. We introduce LLM-BabyBench, a new benchmark suite designed specifically for this purpose. Built upon a textual adaptation of the procedurally generated BabyAI grid world, this suite evaluates LLMs on three fundamental aspects of grounded intelligence: (1) predicting the consequences of actions on the environment state (Predict task), (2) generating sequences of low-level actions to achieve specified objectives (Plan task), and (3) decomposing high-level instructions into coherent subgoal sequences (Decompose task). We detail the methodology for generating the three corresponding datasets (LLM-BabyBench-Predict, -Plan, -Decompose) by extracting structured information from an expert agent operating within the text-based environment. Furthermore, we provide a standardized evaluation harness and metrics, including environment interaction for validating generated plans, to facilitate reproducible assessment of diverse LLMs. Initial baseline results highlight the challenges posed by these grounded reasoning tasks. The benchmark suite, datasets, data generation code, and evaluation code are made publicly available (GitHub: https://github.com/choukrani/llm-babybench, HuggingFace: https://huggingface.co/datasets/salem-mbzuai/LLM-BabyBench).
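To illustrate what validating a generated plan through environment interaction can look like, here is a toy sketch that replays a sequence of low-level actions on a small grid and checks whether the agent ends on the goal cell. It is not the benchmark's evaluation harness: the grid model, the action names (`left`, `right`, `forward`), and the functions `run_plan` and `plan_succeeds` are assumptions made for illustration.

```python
# Toy stand-in for a BabyAI-style grid; not the benchmark's actual harness.
# Headings are indices into DIRS: 0 = east, 1 = south, 2 = west, 3 = north.
DIRS = [(1, 0), (0, 1), (-1, 0), (0, -1)]

def run_plan(start, heading, plan, walls, width, height):
    """Execute low-level actions and return the final (position, heading)."""
    (x, y), d = start, heading
    for action in plan:
        if action == "left":
            d = (d - 1) % 4
        elif action == "right":
            d = (d + 1) % 4
        elif action == "forward":
            nx, ny = x + DIRS[d][0], y + DIRS[d][1]
            # Moves into a wall or off the grid leave the agent in place
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in walls:
                x, y = nx, ny
        else:
            raise ValueError(f"unknown action: {action}")
    return (x, y), d

def plan_succeeds(start, heading, plan, goal, walls=frozenset(), width=5, height=5):
    """Treat a plan as valid if the agent ends on the goal cell."""
    position, _ = run_plan(start, heading, plan, walls, width, height)
    return position == goal
```

Checking the end state by execution, rather than comparing the plan text against a reference, accepts any action sequence that reaches the goal, which is one plausible reason to validate plans inside the environment.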
New state of matter powers Microsoft quantum computing chip
Microsoft says its researchers have created a new quantum computer processor that relies on a never-before-seen state of matter. The technological leap--called Majorana 1--represents a major step forward towards an era of powerful quantum computers that unlock currently unachievable advancements across artificial intelligence, medical research, sustainable energy, and many other industries. Since their invention, traditional computers have almost always relied on semiconductor chips that use binary "bits" of information represented as strings of 1's and 0's. While these chips have become increasingly powerful and simultaneously smaller, there is a physical limit to the amount of information that can be stored on this hardware. Quantum computers, by comparison, utilize "qubits" (quantum bits) to exploit the strange properties exhibited by subatomic particles, often at extremely cold temperatures.
Microsoft makes quantum breakthrough by creating new state of matter
Microsoft has announced a breakthrough that could lead to the most powerful quantum computers of all time. The company created a new chip, called 'Majorana 1,' which is powered by a new state of matter called a topological state. This phase of matter is not characterized by the traditional physical properties that define a solid, liquid or gas. Instead, it's defined by its topological properties -- how the material's wavefunctions behave and connect across space. This topological state is created by the chip's topoconductor, a first-of-its-kind material that produces fundamental units of information that serve as the building blocks for quantum computers.