arthur
- Asia > China (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- (4 more...)
- Education (0.68)
- Information Technology (0.46)
- North America > Canada > Alberta (0.04)
- Europe > Bulgaria (0.04)
- North America > United States > Kentucky (0.04)
- (10 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology (0.92)
- Media (0.68)
- Government > Regional Government > North America Government > United States Government (0.46)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Education (0.68)
- Information Technology > Communications (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
- Europe > Bulgaria (0.04)
- North America > United States > Kentucky (0.04)
- Asia > Singapore (0.04)
- (8 more...)
Goal-Driven Reasoning in DatalogMTL with Magic Sets
Wang, Shaoyu, Zhao, Kaiyue, Wei, Dongliang, Wałęga, Przemysław Andrzej, Wang, Dingmin, Cai, Hongming, Hu, Pan
DatalogMTL is a powerful rule-based language for temporal reasoning. Due to its high expressive power and flexible modeling capabilities, it is suitable for a wide range of applications, including tasks from the industrial and financial sectors. However, due to its high computational complexity, practical reasoning in DatalogMTL is highly challenging. To address this difficulty, we introduce a new reasoning method for DatalogMTL which exploits the magic sets technique -- a rewriting approach developed for (non-temporal) Datalog to simulate top-down evaluation with bottom-up reasoning. We implement this approach and evaluate it on several publicly available benchmarks, showing that it significantly and consistently outperforms state-of-the-art reasoning techniques.
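The (non-temporal) magic-sets idea the paper builds on can be sketched in miniature: the query goal yields a "magic" set of demanded bindings, and bottom-up evaluation is restricted to facts those bindings demand. The toy reachability program, relation names, and filter below are illustrative assumptions, not the paper's DatalogMTL algorithm:

```python
# Toy (non-temporal) Datalog program, queried as reach(a, X):
#   reach(X, Y) :- edge(X, Y).
#   reach(X, Y) :- reach(X, Z), edge(Z, Y).
EDGES = {("a", "b"), ("b", "c"), ("d", "e")}

def magic_set(goal_const):
    # The rewriting would derive magic_reach(a); the recursive rule adds no
    # new first-argument bindings here, since reach(X, Z) keeps X fixed.
    return {goal_const}

def bottom_up_reach(magic):
    # Fixpoint computation, but only for demanded (magic) first arguments.
    reach = {(x, y) for (x, y) in EDGES if x in magic}
    changed = True
    while changed:
        changed = False
        for (x, z) in list(reach):
            for (u, y) in EDGES:
                if z == u and (x, y) not in reach:
                    reach.add((x, y))
                    changed = True
    return reach

answers = bottom_up_reach(magic_set("a"))  # facts about "d" are never derived
```

The payoff mirrors the paper's motivation: the unfiltered fixpoint would also derive `reach(b, c)` and `reach(d, e)`, which are irrelevant to the goal `reach(a, X)`.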
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
The images of Spain's floods weren't created by AI. The trouble is, people think they were
My eye was caught by a striking photograph in the most recent edition of Charles Arthur's Substack newsletter Social Warming. It shows a narrow street in the aftermath of the "rain bomb" that devastated the region of Valencia in Spain. A year's worth of rain fell in a single day, and in some towns more than 490 litres per square metre fell in eight hours. Water is very heavy (490 litres is nearly half a tonne on every square metre), so if there's a gradient it will flow downhill with the kind of force that can pick up a heavy SUV and toss it around like a toy. And if it channels down a narrow urban street, it will throw parked cars around like King Kong in a bad mood.
- Media (0.32)
- Information Technology > Services (0.31)
On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation
Wen, Xueru, Lu, Xinyu, Guan, Xinyan, Lu, Yaojie, Lin, Hongyu, He, Ben, Han, Xianpei, Sun, Le
Hallucination occurs when large language models (LLMs) exhibit behavior that deviates from the boundaries of their knowledge during the response generation process. Previous learning-based methods focus on detecting knowledge boundaries and finetuning models with instance-level feedback, but they suffer from inaccurate signals due to off-policy data sampling and coarse-grained feedback. In this paper, we introduce Reinforcement Learning for Hallucination (RLFH), a fine-grained, feedback-based online reinforcement learning method for hallucination mitigation. Unlike previous learning-based methods, RLFH enables LLMs to explore the boundaries of their internal knowledge and provide on-policy, fine-grained feedback on these explorations. To construct fine-grained feedback for learning reliable generation behavior, RLFH decomposes the outcomes of large models into atomic facts, provides statement-level evaluation signals, and traces these signals back to the tokens of the original responses. Finally, RLFH adopts an online reinforcement learning algorithm with these token-level rewards to adjust model behavior for hallucination mitigation. For effective on-policy optimization, RLFH also introduces an LLM-based fact assessment framework to verify the truthfulness and helpfulness of atomic facts without human intervention. Experiments on the HotpotQA, SQuADv2, and Biography benchmarks demonstrate that RLFH enables LLMs to balance their usage of internal knowledge during generation and eliminate hallucination behavior.
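The statement-to-token reward tracing described above can be sketched as follows; the whitespace tokenization, the two-level verdicts, and the reward values are illustrative assumptions, not the paper's settings:

```python
def token_rewards(statements, verdicts, reward_map={"true": 0.5, "false": -1.0}):
    """statements: the response split into atomic statements, in order;
    verdicts: parallel 'true'/'false' labels from a fact-assessment step.
    Returns one reward per whitespace token of the full response."""
    rewards = []
    for stmt, verdict in zip(statements, verdicts):
        r = reward_map[verdict]
        # Trace the statement-level signal back to each of its tokens.
        rewards.extend([r] * len(stmt.split()))
    return rewards

resp = ["Paris is in France.", "It has 90 million residents."]
rw = token_rewards(resp, ["true", "false"])
```

The resulting per-token reward vector is what an online RL step would consume: tokens of the hallucinated second statement all carry the negative signal, while the verified first statement is rewarded.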
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > New York (0.04)
- North America > United States > New Jersey (0.04)
- (4 more...)
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Zhang, Jianguo, Lan, Tian, Murthy, Rithesh, Liu, Zhiwei, Yao, Weiran, Tan, Juntao, Hoang, Thai, Yang, Liangwei, Feng, Yihao, Liu, Zuxin, Awalgaonkar, Tulika, Niebles, Juan Carlos, Savarese, Silvio, Heinecke, Shelby, Wang, Huan, Xiong, Caiming
Autonomous agents powered by large language models (LLMs) have garnered significant research attention. However, fully harnessing the potential of LLMs for agent-based tasks presents inherent challenges due to the heterogeneous nature of diverse data sources featuring multi-turn trajectories. In this paper, we introduce AgentOhana as a comprehensive solution to address these challenges. Leveraging its data unification, our training pipeline maintains equilibrium across different data sources and preserves independent randomness across devices during dataset partitioning and model training. Additionally, we present xLAM-v0.1, a large action model tailored for AI agents, which demonstrates exceptional performance across various benchmarks.
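The "independent randomness across devices during dataset partitioning" can be sketched as deterministic sharding plus a per-device random stream. The function name, sharding scheme, and seeding convention below are assumptions for illustration, not AgentOhana's implementation:

```python
import random

def partition_for_device(data, device_rank, world_size, base_seed=0):
    # Deterministic, disjoint shard for this device...
    shard = data[device_rank::world_size]
    # ...shuffled with a device-specific RNG, so each device's randomness
    # is reproducible yet independent of every other device's.
    rng = random.Random(base_seed + device_rank)
    rng.shuffle(shard)
    return shard

data = list(range(10))
shard0 = partition_for_device(data, device_rank=0, world_size=2)
shard1 = partition_for_device(data, device_rank=1, world_size=2)
```

Using a private `random.Random` instance per device (rather than the global RNG) is what keeps the streams independent when the same code runs on every device.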
- Research Report (0.82)
- Workflow (0.72)
AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback
Guan, Jian, Wu, Wei, Wen, Zujie, Xu, Peng, Wang, Hongning, Huang, Minlie
The notable success of large language models (LLMs) has sparked an upsurge in building language agents to complete various complex tasks. We present AMOR, an agent framework based on open-source LLMs, which reasons with external knowledge bases and adapts to specific domains through human supervision of the reasoning process. AMOR builds its reasoning logic over a finite state machine (FSM) that solves problems through autonomous executions and transitions over disentangled modules. This allows humans to provide direct feedback to the individual modules, and thus naturally forms process supervision. Based on this reasoning and feedback framework, we develop AMOR through two-stage fine-tuning: warm-up and adaptation. The former fine-tunes the LLM with examples automatically constructed from various public datasets, enabling AMOR to generalize across different knowledge environments, while the latter tailors AMOR to specific domains using process feedback. Extensive experiments across multiple domains demonstrate the advantage of AMOR over strong baselines, thanks to its FSM-based reasoning and process feedback mechanism.
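The FSM-over-disentangled-modules control flow can be sketched as a dispatch loop whose per-module trace is the natural attachment point for process feedback. The states and modules below are invented toy stand-ins, not AMOR's actual modules:

```python
# Each module reads/writes shared state and returns the next FSM state.
def decompose(state):
    state["query"] = "capital of France"
    return "RETRIEVE"

def retrieve(state):
    state["evidence"] = {"capital of France": "Paris"}
    return "ANSWER"

def answer(state):
    state["answer"] = state["evidence"][state["query"]]
    return "DONE"

MODULES = {"DECOMPOSE": decompose, "RETRIEVE": retrieve, "ANSWER": answer}

def run_fsm(start="DECOMPOSE"):
    state, current, trace = {}, start, []
    while current != "DONE":
        trace.append(current)          # the trace records which module acted
        current = MODULES[current](state)  # when, so a human can give feedback
    return state["answer"], trace          # to one module in isolation

ans, trace = run_fsm()
```

Because each module is a separate function with an explicit transition, a wrong intermediate step (e.g. bad retrieval) can be corrected at that state alone, which is what makes the process supervision "disentangled".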
- North America > Canada > Alberta (0.05)
- Europe > Bulgaria (0.04)
- North America > United States > Kentucky (0.04)
- (6 more...)
- Leisure & Entertainment (0.47)
- Government (0.46)
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Zhou, Andy, Yan, Kai, Shlapentokh-Rothman, Michal, Wang, Haohan, Wang, Yu-Xiong
While large language models (LLMs) have demonstrated impressive performance on a range of decision-making tasks, they rely on simple acting processes and fall short of broad deployment as autonomous agents. We introduce LATS (Language Agent Tree Search), a general framework that synergizes the capabilities of LLMs in planning, acting, and reasoning. Drawing inspiration from Monte Carlo tree search in model-based reinforcement learning, LATS employs LLMs as agents, value functions, and optimizers, repurposing their latent strengths for enhanced decision-making. What is crucial in this method is the use of an environment for external feedback, which offers a more deliberate and adaptive problem-solving mechanism that moves beyond the limitations of existing techniques. Our experimental evaluation across diverse domains, such as programming, HotPotQA, and WebShop, illustrates the applicability of LATS for both reasoning and acting. In particular, LATS achieves 94.4% for programming on HumanEval with GPT-4 and an average score of 75.9 for web browsing on WebShop with GPT-3.5, demonstrating the effectiveness and generality of our method.
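A minimal Monte Carlo tree search skeleton in the spirit described above, with the LLM roles (action proposal, simulation) replaced by stubs over a toy environment; all names, the environment, and the reward are assumptions, not the paper's interface:

```python
import math, random

random.seed(0)

def actions(state):
    # Toy environment: build a binary string of length 3.
    return ["0", "1"] if len(state) < 3 else []

def rollout(state):
    # Stand-in for LLM-guided simulation plus environment feedback:
    # reward is the fraction of "1" choices in the finished trajectory.
    while actions(state):
        state += random.choice(actions(state))
    return state.count("1") / 3

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.n, self.w = [], 0, 0.0

    def uct(self):
        # Upper-confidence bound balancing mean value and exploration.
        return self.w / self.n + math.sqrt(2 * math.log(self.parent.n) / self.n)

def mcts(iterations=200):
    root = Node("")
    for _ in range(iterations):
        node = root
        # Selection: descend by UCT while the node is fully expanded.
        while node.children and len(node.children) == len(actions(node.state)):
            node = max(node.children, key=Node.uct)
        # Expansion: add one untried child, if any remain.
        untried = [a for a in actions(node.state)
                   if node.state + a not in {c.state for c in node.children}]
        if untried:
            child = Node(node.state + untried[0], node)
            node.children.append(child)
            node = child
        # Simulation and backpropagation.
        reward = rollout(node.state)
        while node:
            node.n += 1
            node.w += reward
            node = node.parent
    # Recommend the most-visited first action.
    return max(root.children, key=lambda c: c.n).state

best_first_action = mcts()
```

The environment-derived reward entering at the backpropagation step is the "external feedback" the abstract emphasizes: it, not the model's own judgment alone, steers which branches the search revisits.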
- Asia > South Korea (0.05)
- North America > United States > Illinois (0.04)
- North America > United States > Colorado (0.04)
- North America > United States > New Jersey (0.04)
- Research Report (0.64)
- Workflow (0.46)