

RELIC: Evaluating Compositional Instruction Following via Language Recognition

Petty, Jackson, Hu, Michael Y., Wang, Wentao, Ravfogel, Shauli, Merrill, William, Linzen, Tal

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly expected to perform tasks based only on a specification of the task provided in context, without examples of inputs and outputs; this ability is referred to as instruction following. We introduce the Recognition of Languages In-Context (RELIC) framework to evaluate instruction following using language recognition: the task of determining if a string is generated by a formal grammar. Unlike many standard evaluations of LLMs' ability to use their context, this task requires composing together a large number of instructions (grammar productions) retrieved from the context. Because the languages are synthetic, the task can be increased in complexity as LLMs' skills improve, and new instances can be automatically generated, mitigating data contamination. We evaluate state-of-the-art LLMs on RELIC and find that their accuracy can be reliably predicted from the complexity of the grammar and the individual example strings, and that even the most advanced LLMs currently available show near-chance performance on more complex grammars and samples, in line with theoretical expectations. We also use RELIC to diagnose how LLMs attempt to solve increasingly difficult reasoning tasks, finding that as the complexity of the language recognition task increases, models switch to relying on shallow heuristics instead of following complex instructions.
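The language-recognition task at the core of RELIC, deciding whether a string is generated by a formal grammar, can be illustrated with the classic CYK membership algorithm. The grammar below is a made-up toy example in Chomsky normal form (it generates strings of the form a…ab…b), not one of RELIC's benchmark grammars:

```python
# Toy illustration of language recognition: does a string belong to the
# language of a context-free grammar? Classic CYK algorithm; the grammar
# here is an illustrative example in Chomsky normal form, not a RELIC grammar.

# Productions: S -> A B;  A -> A A | "a";  B -> B B | "b"
BINARY = {"S": [("A", "B")], "A": [("A", "A")], "B": [("B", "B")]}
TERMINAL = {"A": ["a"], "B": ["b"]}

def recognizes(s: str) -> bool:
    n = len(s)
    if n == 0:
        return False
    # table[j][i] = set of nonterminals deriving the substring s[i : i+j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(s):
        for nt, terms in TERMINAL.items():
            if ch in terms:
                table[0][i].add(nt)
    for length in range(2, n + 1):            # span length
        for start in range(n - length + 1):   # span start
            for split in range(1, length):    # where to split the span
                left = table[split - 1][start]
                right = table[length - split - 1][start + split]
                for nt, pairs in BINARY.items():
                    if any(l in left and r in right for l, r in pairs):
                        table[length - 1][start].add(nt)
    return "S" in table[n - 1][0]
```

An LLM attempting RELIC must in effect perform this kind of composition of productions in context, rather than running a dynamic-programming table explicitly.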


ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI

Elawady, Ahmad, Chhablani, Gunjan, Ramrakhya, Ram, Yadav, Karmesh, Batra, Dhruv, Kira, Zsolt, Szot, Andrew

arXiv.org Artificial Intelligence

Intelligent embodied agents need to quickly adapt to new scenarios by integrating long histories of experience into decision-making. For instance, a robot in an unfamiliar house initially wouldn't know the locations of objects needed for tasks and might perform inefficiently. However, as it gathers more experience, it should learn the layout of its environment and remember where objects are, allowing it to complete new tasks more efficiently. To enable such rapid adaptation to new tasks, we present ReLIC, a new approach for in-context reinforcement learning (RL) for embodied agents. With ReLIC, agents are capable of adapting to new environments using 64,000 steps of in-context experience with full attention while being trained through self-generated experience via RL. We achieve this by proposing a novel policy update scheme for on-policy RL called "partial updates" as well as a Sink-KV mechanism that enables effective utilization of a long observation history for embodied agents. Our method outperforms a variety of meta-RL baselines in adapting to unseen houses in an embodied multi-object navigation task. In addition, we find that ReLIC is capable of few-shot imitation learning despite never being trained with expert demonstrations. We also provide a comprehensive analysis of ReLIC, highlighting that the combination of large-scale RL training, the proposed partial updates scheme, and the Sink-KV are essential for effective in-context learning. The code for ReLIC and all our experiments is at github.com/aielawady/relic.

A desired capability of intelligent embodied agents is to rapidly adapt to new scenarios through experience. An essential requirement for this capability is integrating a long history of experience into decision-making to enable an agent to accumulate knowledge about the new scenario that it is encountering. For example, a robot placed in an unseen house initially has no knowledge of the home layout and where to find objects.
The robot should leverage its history of experiences of completing tasks in this new home to learn the home layout details, where to find objects, and how to act to complete tasks successfully. To achieve adaptation of decision-making to new tasks, prior work has leveraged a technique called in-context reinforcement learning (RL) where an agent is trained with RL to utilize past experience in an environment (Wang et al., 2016; Team et al., 2023; Duan et al., 2016; Grigsby et al., 2023; Melo, 2022). By using sequence models over a history of interactions in an environment, these methods adapt to new scenarios by conditioning policy actions on this context of interaction history without updating the policy parameters.
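The defining property of in-context RL, as described above, is that adaptation happens purely through conditioning on the interaction history, with no weight updates at deployment. The sketch below illustrates that idea only; the class and function names are illustrative stand-ins, not ReLIC's actual API, and the tabular "policy" stands in for the trained sequence model:

```python
# Schematic of in-context adaptation: the policy's parameters are frozen;
# adaptation comes solely from conditioning on the growing interaction
# history. All names here are illustrative, not ReLIC's actual API.

class FrozenPolicy:
    """Stand-in for a trained sequence model: explores untried actions,
    then exploits the action with the highest in-context reward."""
    def act(self, history, obs, actions=("left", "right")):
        tried = {}
        for past_obs, action, reward in history:
            if past_obs == obs:
                tried[action] = max(reward, tried.get(action, float("-inf")))
        untried = [a for a in actions if a not in tried]
        if untried:
            return untried[0]             # explore unseen actions first
        return max(tried, key=tried.get)  # then exploit the best one seen

def rollout(env_reward, steps=64):
    policy, history, total = FrozenPolicy(), [], 0.0
    for _ in range(steps):
        obs = "door"                      # single-observation toy environment
        action = policy.act(history, obs)
        reward = env_reward(obs, action)
        history.append((obs, action, reward))  # context grows; no weight update
        total += reward
    return total
```

In this toy setting the frozen policy discovers the rewarding action within two steps of context; ReLIC scales the same principle to 64k-step histories via partial updates and Sink-KV.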


RELIC: Investigating Large Language Model Responses using Self-Consistency

Cheng, Furui, Zouhar, Vilém, Arora, Simran, Sachan, Mrinmaya, Strobelt, Hendrik, El-Assady, Mennatallah

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are notorious for blending fact with fiction and generating non-factual content, known as hallucinations. To tackle this challenge, we propose an interactive system that helps users obtain insights into the reliability of the generated text. Our approach is based on the idea that the self-consistency of multiple samples generated by the same LLM relates to its confidence in individual claims in the generated texts. Using this idea, we design RELIC, an interactive system that enables users to investigate and verify semantic-level variations in multiple long-form responses. This allows users to recognize potentially inaccurate information in the generated text and make necessary corrections. From a user study with ten participants, we demonstrate that our approach helps users better verify the reliability of the generated text. We further summarize the design implications and lessons learned from this research for inspiring future studies on reliable human-LLM interactions.
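The self-consistency idea the abstract describes can be reduced to a minimal sketch: sample several responses from the same model and treat a claim's support across samples as a confidence signal. Simple substring matching below stands in for the semantic-level comparison the actual RELIC system performs, and the sample texts are invented for illustration:

```python
# Minimal sketch of the self-consistency idea: a claim that recurs across
# many samples from the same LLM is treated as higher-confidence than one
# that varies. Substring matching is a crude proxy for semantic comparison.

def claim_confidence(claim: str, samples: list[str]) -> float:
    """Fraction of sampled responses that support the claim."""
    if not samples:
        return 0.0
    supporting = sum(claim.lower() in s.lower() for s in samples)
    return supporting / len(samples)

samples = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Completed in 1889, the Eiffel Tower stands in Paris.",
    "The Eiffel Tower, in Paris, opened in 1887.",  # inconsistent year
]
```

Here "Paris" is fully consistent across samples while the completion year is not, so an interface built on this signal would flag the year for user verification.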


New Relic's Ambitious Plan to Apply AI and ML to Incident Responses - The New Stack

#artificialintelligence

Application performance management company New Relic has begun to apply machine learning (ML) and artificial intelligence (AI) to automate incident response, management and remediation. If successful, the new features could mitigate a major source of lost IT productivity among organizations that must manage diverse operations, including multicloud and on-premises infrastructures. New Relic AI offers a wide sweep of AIOps capabilities to help reduce "noise" and other distractions when managing workflows. The idea is to solve a common pain point of having to devote IT resources to respond to an often overwhelming number of telemetry alerts. Such "noisy" alerts often consist of false positives.


New Relic Previews Artificial Intelligence Technology: Project Seymour – Military Technologies

#artificialintelligence

SAN FRANCISCO–(BUSINESS WIRE)– NEW RELIC FUTURESTACK – Digital intelligence leader New Relic, Inc. (NYSE:NEWR) today shared a preview of its artificial intelligence (AI) technology, code-named "Project Seymour," at the company's fourth annual FutureStack event in San Francisco. Project Seymour is designed to deliver advanced AI and machine learning capabilities to help companies uncover the most interesting, most relevant, and most actionable insights to improve their customer experience, and the performance and availability of their digital initiatives. "Our customers have increasingly complex systems and often struggle to understand all of the facets of what's going on in their customer experience, in their applications, and in their infrastructure. Seymour is another manifestation of New Relic's continued obsession to make it easy for our customers to understand everything going on in their digital business," said Lew Cirne, CEO and founder, New Relic. "New Relic has a unique opportunity to leverage the power of AI because our cloud-based platform already analyzes billions of metrics for our customers every day. We simply do not believe you can get the same benefits from on-premise solutions because you wouldn't have enough data to uncover the same meaningful insights."


Breakaway preview: Amazon's first game blends basketball with the MOBA genre

PCWorld

Breakaway's Black Knight is a fearsome foe--all 400 pounds of him. Clad in spiky black armor, standing seven feet tall, and with a man-sized axe in his massive gauntlets, he's basically a murderous brick wall standing between me and freedom. In a modern-day David and Goliath situation, I strafe my way around the Black Knight's left side, flitting just past his weapon and making a break for-- I sprint up the stairs of fabled El Dorado, leap into the air like some ancient-world Michael Jordan, and slam a golden ball down into the pit on the ground. It's an interesting phenomenon that the best "Sports" video games are at best an abstract representation of real-world sports. Oh sure, developers have made astonishing simulations of real sports, with the world's best football and hockey and soccer stars meticulously recreated not just in appearance, but with tables upon tables of stats to delve into.