jericho
ReflAct: World-Grounded Decision Making in LLM Agents via Goal-State Reflection
Kim, Jeonghye, Rhee, Sojeong, Kim, Minbeom, Kim, Dohyung, Lee, Sangmook, Sung, Youngchul, Jung, Kyomin
Recent advances in LLM agents have largely built on reasoning backbones like ReAct, which interleave thought and action in complex environments. However, ReAct often produces ungrounded or incoherent reasoning steps, leading to misalignment between the agent's actual state and goal. Our analysis finds that this stems from ReAct's inability to maintain consistent internal beliefs and goal alignment, causing compounding errors and hallucinations. To address this, we introduce ReflAct, a novel backbone that shifts reasoning from merely planning next actions to continuously reflecting on the agent's state relative to its goal. By explicitly grounding decisions in states and enforcing ongoing goal alignment, ReflAct dramatically improves strategic reliability. This design delivers substantial empirical gains: ReflAct surpasses ReAct by 27.7% on average, achieving a 93.3% success rate in ALFWorld. Notably, ReflAct even outperforms ReAct with added enhancement modules (e.g., Reflexion, WKM), showing that strengthening the core reasoning backbone is key to reliable agent performance.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Law (0.94)
- Materials > Metals & Mining (0.46)
- Leisure & Entertainment > Games (0.45)
Affordance Extraction with an External Knowledge Database for Text-Based Simulated Environments
Gelhausen, P., Fischer, M., Peters, G.
The evaluation in the previous section showed that external databases can be used to generate affordances for text-based input with comparatively little effort. The results, however, have to be interpreted carefully to account for all upsides and limitations of this simple approach. For the automated evaluation (see Tables 1 and 2), three major observations have been made: The total number and percentage of valid commands for the basic approach is rather low, yielding 52 commands (0.4%) for Jericho and 330 commands (6.4%) for TextWorld. This illustrates the many potential challenges of automated affordance extraction. The number of valid commands increases by a factor of 2.9 (for Jericho) and 1.3 (for TextWorld), respectively, when trivial "take" affordances are manually added to every evaluation step. This illustrates that information retrieved from external databases might often omit "trivial" information. The comparison between the results of Jericho and TextWorld showed a significant increase of the percentage of valid commands (by a factor of ca.
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- Europe > Germany (0.04)
A Survey of Text Games for Reinforcement Learning informed by Natural Language
Osborne, Philip, Nõmm, Heido, Freitas, Andre
Reinforcement Learning (RL) has shown human-level performance in solving complex, single-setting virtual environments Mnih et al. [2013] & Silver et al. [2016]. However, applications and theory in RL problems have been far less developed and it has been posed that this is due to a wide divide between the empirical methodology associated with virtual environments in RL research and the challenges associated with reality Dulac-Arnold et al. [2019]. Simply put, Text Games provide a safe and data efficient way to learn from environments that mimic language found in real-world scenarios Shridhar et al. [2020]. Natural language (NL) has been introduced as a solution to many of the challenges in RL Luketina et al. [2019], as NL can facilitate the transfer of abstract knowledge to downstream tasks. However, RL approaches on these language driven environments are still limited in their development and therefore a call has been made for an improvement on the evaluation settings where language is a first-class component. Text Games gained wider acceptance as a testbed for NL research following work from Narasimhan et al. [2015] who leveraged the Deep Q Network (DQN) framework for policy learning on a set of synthetic textual games. Text Games are both partially observable (as shown in Figure 1) and include outcomes that make reward signals simple to define, making them a suitable problem for Reinforcement Learning to solve. However, research so far has been performed independently, with many authors …
Figure 1: Sample gameplay from a fantasy Text Game as given by Narasimhan et al. [2015] where the player takes the action 'Go East' to cross the bridge.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Education (0.93)
- Leisure & Entertainment > Games > Computer Games (0.46)
Microsoft Jericho is an Open Source Framework for Training Machine Learning Models Using…
I recently started a new newsletter focused on AI education. TheSequence is a no-BS (meaning no hype, no news, etc.) AI-focused newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers and concepts. Language is one of the hallmarks of human intelligence and one that plays a key role in our learning processes. By using language, we constantly formulate our understanding of a situation of a specific context.
Using Ai to search and save
Plan Jericho has introduced Ai-Search – an artificial intelligence (Ai) prototype – to transform airborne search and rescue. The prototype came about after Air Commodore Darren Goldie challenged Jericho to find a way of using a detector on an aircraft to enhance search and rescue (SAR). Plan Jericho's Ai lead Wing Commander Michael Gan said Jericho saw the opportunity to use Ai to augment and enhance SAR. "The idea was to train a machine-learning algorithm and Ai sensors to complement existing visual search techniques. Our vision was to give any aircraft and other Defence platforms, including unmanned aerial systems, a low-cost, improvised SAR capability," Wing Commander Gan said.
- Government > Regional Government > Oceania Government > Australia Government (0.45)
- Transportation > Air (0.40)
Interactive Fiction Games: A Colossal Adventure
Hausknecht, Matthew, Ammanabrolu, Prithviraj, Côté, Marc-Alexandre, Yuan, Xingdi
A hallmark of human intelligence is the ability to understand and communicate with language. Interactive Fiction games are fully text-based simulation environments where a player issues text commands to effect change in the environment and progress through the story. We argue that IF games are an excellent testbed for studying language-based autonomous agents. In particular, IF games combine challenges of combinatorial action spaces, language understanding, and commonsense reasoning. To facilitate rapid development of language-based agents, we introduce Jericho, a learning environment for man-made IF games and conduct a comprehensive study of text-agents across a rich set of games, highlighting directions in which agents can improve.
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Puerto Rico (0.04)
WWE Payback 2017: Results, Recap, Video For Every Match On 'Monday Night Raw' PPV
WWE Payback 2017 certainly lived up to its name Sunday night, allowing several superstars to get retribution for recent losses. The eight matches resulted in two titles changing hands, and the main event likely set up a match for the No.1 championship on "Monday Night Raw." Braun Strowman ended the pay-per-view by defeating Roman Reigns, continuing to be the most dominant wrestler in all of WWE. He now has his eyes on the WWE Universal Championship, though Finn Balor announced his intention to reclaim the belt when he appeared on "Miz TV" as part of the Payback kickoff show. Let's take a look at the complete results of WWE Payback, including a recap and video for each match. Reigns stood little chance from the start.
WWE WrestleMania 33: Predictions, Match Card, Preview For 2017 PPV
After months of rumors regarding WWE's biggest event of 2017, the WrestleMania 33 card is finally set. Thirteen matches are scheduled for Sunday's pay-per-view at Camping World Stadium in Orlando, and eight titles will be on the line. The WWE Universal Championship Match between Brock Lesnar and Goldberg is expected to go on last, putting an end to their feud that began with Goldberg's return in November. It seems to be clear which WWE superstar is winning the main event, but predictions for other matches aren't as easy to make. The SmackDown Tag Team Championships are the only belts that won't be defended at WrestleMania 33.