reasoning technique
Specification and Evaluation of Multi-Agent LLM Systems -- Prototype and Cybersecurity Applications
Recent advancements in LLMs indicate potential for novel applications, as evidenced by the reasoning capabilities in the latest OpenAI and DeepSeek models. To apply these models to domain-specific applications beyond text generation, LLM-based multi-agent systems can be utilized to solve complex tasks, particularly by combining reasoning techniques, code generation, and software execution across multiple, potentially specialized LLMs. However, while many evaluations are performed on LLMs, reasoning techniques, and applications individually, their joint specification and combined application are not well understood. Defined specifications for multi-agent LLM systems are required to explore their potential and suitability for specific applications, allowing for systematic evaluations of LLMs, reasoning techniques, and related aspects. This paper reports the results of exploratory research on (1.) multi-agent specification by introducing an agent schema language and (2.) the execution and evaluation of the specifications through a multi-agent system architecture and prototype. The specification language, system architecture, and prototype are first presented in this work, building on an LLM system from prior research. Test cases involving cybersecurity tasks indicate the feasibility of the architecture and evaluation approach. As a result, evaluations could be demonstrated for question answering, server security, and network security tasks completed correctly by agents with LLMs from OpenAI and DeepSeek.
Spontaneous Giving and Calculated Greed in Language Models
Large language models demonstrate advanced problem-solving capabilities by incorporating reasoning techniques such as chain of thought and reflection. However, how these reasoning capabilities extend to social intelligence remains unclear. In this study, we investigate this question using economic games that model social dilemmas, where social intelligence plays a crucial role. First, we examine the effects of chain-of-thought and reflection techniques in a public goods game. We then extend our analysis to six economic games on cooperation and punishment, comparing off-the-shelf non-reasoning and reasoning models. We find that reasoning models significantly reduce cooperation and norm enforcement, prioritizing individual rationality. Consequently, groups with more reasoning models exhibit less cooperation and lower gains through repeated interactions. These behaviors parallel human tendencies of "spontaneous giving and calculated greed." Our results suggest the need for AI architectures that incorporate social intelligence alongside reasoning capabilities to ensure that AI supports, rather than disrupts, human cooperative intuition. Recent innovations in reasoning techniques, such as chain of thought [1] and reflection [2], are advancing the intellectual capabilities of large language models (LLMs) to the next level. Models such as OpenAI o1 leverage these techniques to solve complex problems, generate coherent arguments, and improve decision-making in multi-step reasoning scenarios [3-5]. Indeed, these reasoning models have demonstrated excellence in mathematical proofs, logical deduction, and strategic planning [6, 7]. The necessity of social intelligence is highlighted in social dilemmas, where individual rationality leads to collective irrationality [12].
Reasoning Does Not Necessarily Improve Role-Playing Ability
Feng, Xiachong, Dou, Longxu, Kong, Lingpeng
The application of role-playing large language models (LLMs) is rapidly expanding in both academic and commercial domains, driving an increasing demand for high-precision role-playing models. Simultaneously, the rapid advancement of reasoning techniques has continuously pushed the performance boundaries of LLMs. This intersection of practical role-playing demands and evolving reasoning capabilities raises an important research question: "Can reasoning techniques enhance the role-playing capabilities of LLMs?" To address this, we conduct a comprehensive study using 6 role-playing benchmarks, 24 LLMs, and 3 distinct role-playing strategies, comparing the effectiveness of direct zero-shot role-playing, role-playing with Chain-of-Thought (CoT), and role-playing using reasoning-optimized LLMs. Our findings reveal that CoT may reduce role-playing performance, reasoning-optimized LLMs are unsuitable for role-playing, reasoning ability disrupts the role-playing scaling law, large models still lack proficiency in advanced role-playing, and Chinese role-playing performance surpasses English role-playing performance. Furthermore, based on extensive experimental results, we propose two promising future research directions: Role-aware CoT for improving role-playing LLMs and Reinforcement Learning for role-playing LLMs, aiming to enhance the adaptability, consistency, and effectiveness of role-playing LLMs for both research and real-world applications.
RAGAR, Your Falsehood Radar: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models
Khaliq, M. Abdul, Chang, P., Ma, M., Pflugfelder, B., Miletić, F.
The escalating challenge of misinformation, particularly in political discourse, requires advanced fact-checking solutions; this is even clearer in the more complex scenario of multimodal claims. We tackle this issue using a multimodal large language model in conjunction with retrieval-augmented generation (RAG), and introduce two novel reasoning techniques: Chain of RAG (CoRAG) and Tree of RAG (ToRAG). They fact-check multimodal claims by extracting both textual and image content, retrieving external information, and reasoning subsequent questions to be answered based on prior evidence. We achieve a weighted F1-score of 0.85, surpassing a baseline reasoning technique by 0.14 points. Human evaluation confirms that the vast majority of our generated fact-check explanations contain all information from gold standard data.
Meta Reasoning for Large Language Models
Gao, Peizhong, Xie, Ao, Mao, Shaoguang, Wu, Wenshan, Xia, Yan, Mi, Haipeng, Wei, Furu
We introduce Meta-Reasoning Prompting (MRP), a novel and efficient system prompting method for large language models (LLMs) inspired by human meta-reasoning. Traditional in-context learning-based reasoning techniques, such as Tree-of-Thoughts, show promise but lack consistent state-of-the-art performance across diverse tasks due to their specialized nature. MRP addresses this limitation by guiding LLMs to dynamically select and apply different reasoning methods based on the specific requirements of each task, optimizing both performance and computational efficiency. With MRP, LLM reasoning operates in two phases. Initially, the LLM identifies the most appropriate reasoning method using task input cues and objective descriptions of available methods. Subsequently, it applies the chosen method to complete the task. This dynamic strategy mirrors human meta-reasoning, allowing the model to excel in a wide range of problem domains. We evaluate the effectiveness of MRP through comprehensive benchmarks. The results demonstrate that MRP achieves or approaches state-of-the-art performance across diverse tasks. MRP represents a significant advancement in enabling LLMs to identify cognitive challenges across problems and leverage benefits across different reasoning approaches, enhancing their ability to handle diverse and complex problem domains efficiently. Every LLM deserves a Meta-Reasoning Prompting to unlock its full potential and ensure adaptability in an ever-evolving landscape of challenges and applications.
A Survey on the Integration of Generative AI for Critical Thinking in Mobile Networks
Karapantelakis, Athanasios, Nikou, Alexandros, Kattepur, Ajay, Martins, Jean, Mokrushin, Leonid, Mohalik, Swarup Kumar, Orlic, Marin, Feljan, Aneta Vulgarakis
In the near future, mobile networks are expected to broaden their services and coverage to accommodate a larger user base and diverse user needs. Thus, they will increasingly rely on artificial intelligence (AI) to manage network operation and control costs, undertaking complex decision-making roles. This shift will necessitate the application of techniques that incorporate critical thinking abilities, including reasoning and planning. Symbolic AI techniques already facilitate critical thinking based on existing knowledge. Yet, their use in telecommunications is hindered by the high cost of mostly manual curation of this knowledge and high computational complexity of reasoning tasks. At the same time, there is a spurt of innovations in industries such as telecommunications due to Generative AI (GenAI) technologies, operating independently of human-curated knowledge. However, their capacity for critical thinking remains uncertain. This paper aims to address this gap by examining the current status of GenAI algorithms with critical thinking capabilities and investigating their potential applications in telecom networks. Specifically, the aim of this study is to offer an introduction to the potential utilization of GenAI for critical thinking techniques in mobile networks, while also establishing a foundation for future research.
Empirical Study of Zero-Shot NER with ChatGPT
Xie, Tingyu, Li, Qi, Zhang, Jian, Zhang, Yan, Liu, Zuozhu, Wang, Hongwei
Large language models (LLMs) exhibited powerful capability in various natural language processing tasks. This work focuses on exploring LLM performance on zero-shot information extraction, with a focus on the ChatGPT and named entity recognition (NER) task. Inspired by the remarkable reasoning capability of LLM on symbolic and arithmetic reasoning, we adapt the prevalent reasoning methods to NER and propose reasoning strategies tailored for NER. First, we explore a decomposed question-answering paradigm by breaking down the NER task into simpler subproblems by labels. Second, we propose syntactic augmentation to stimulate the model's intermediate thinking in two ways: syntactic prompting, which encourages the model to analyze the syntactic structure itself, and tool augmentation, which provides the model with the syntactic information generated by a parsing tool. Besides, we adapt self-consistency to NER by proposing a two-stage majority voting strategy, which first votes for the most consistent mentions, then the most consistent types. The proposed methods achieve remarkable improvements for zero-shot NER across seven benchmarks, including Chinese and English datasets, and on both domain-specific and general-domain scenarios. In addition, we present a comprehensive analysis of the error types with suggestions for optimization directions. We also verify the effectiveness of the proposed methods on the few-shot setting and other LLMs.
Knowledge-enhanced Neural Machine Reasoning: A Review
Chowdhury, Tanmoy, Ling, Chen, Zhang, Xuchao, Zhao, Xujiang, Bai, Guangji, Pei, Jian, Chen, Haifeng, Zhao, Liang
Knowledge-enhanced neural machine reasoning has garnered significant attention as a cutting-edge yet challenging research area with numerous practical applications. Over the past few years, plenty of studies have leveraged various forms of external knowledge to augment the reasoning capabilities of deep models, tackling challenges such as effective knowledge integration, implicit knowledge mining, and problems of tractability and optimization. However, there is a dearth of a comprehensive technical review of the existing knowledge-enhanced reasoning techniques across the diverse range of application domains. This survey provides an in-depth examination of recent advancements in the field, introducing a novel taxonomy that categorizes existing knowledge-enhanced methods into two primary categories and four subcategories. We systematically discuss these methods and highlight their correlations, strengths, and limitations. Finally, we elucidate the current application domains and provide insight into promising prospects for future research.
Modelling and Reasoning Techniques for Context Aware Computing in Intelligent Transportation System
M, Swarnamugi., R, Chinnaiyan.
The emergence of IoT technology and recent advancement in sensor networks enabled transportation systems to a new dimension called Intelligent Transportation System (ITS). Due to increased usage of vehicles and communication among entities in road traffic scenarios, the amount of raw data generation in ITS is huge. This raw data are to be processed to infer contextual information and provide new services related to different modes of road transport such as traffic signal management, accident prediction, object detection etc.,. To understand the importance of context, this article aims to study context-awareness in the Intelligent Transportation System. We present a review on prominent applications developed in the literature concerning context - awareness in the intelligent transportation system. The objective of this survey article is (i) to highlight context and its features in ITS (ii) to address the applicability of modelling techniques and reasoning approaches in ITS (iii) to shed light on impact of IoT and machine learning in ITS development. I. Introduction The Intelligent Transportation System (ITS) plays a significant role in the transformation of conventional transportation by enhancing user convenience and safety, ensuring efficiency and improving transportation networks. Many countries around the globe have widely accepted and started implementing intelligent transportation systems as a solution to the current road traffic management practices [1]. Research in ITS aims to afford services relating to different modes of road transport and assist users to be better informed on road safety and make safer, coordinated, comfort and smarter use of transportation [2]. The traditional road transportation currently in practise encompasses more of human to human communication. Imparting intelligence to it increases many technical challenges in enabling human to machine communication and machine to machine communication. Yet thanks to advanced technologies such as electronic sensor technology, wireless sensor network, mobile computing, cloud computing, Internet of Things, computer vision, data transmission technology and intelligent control have enabled large scale deployment of intelligent transportation in reality [56].
Pairwise Symmetry Reasoning for Multi-Agent Path Finding Search
Li, Jiaoyang, Harabor, Daniel, Stuckey, Peter J., Koenig, Sven
Multi-Agent Path Finding (MAPF) is a challenging combinatorial problem that asks us to plan collision-free paths for a team of cooperative agents. In this work, we show that one of the reasons why MAPF is so hard to solve is due to a phenomenon called pairwise symmetry, which occurs when two agents have many different paths to their target locations, all of which appear promising, but every combination of them results in a collision. We identify several classes of pairwise symmetries and show that each one arises commonly in practice and can produce an exponential explosion in the space of possible collision resolutions, leading to unacceptable runtimes for current state-of-the-art (bounded-sub)optimal MAPF algorithms. We propose a variety of reasoning techniques that detect the symmetries efficiently as they arise and resolve them by using specialized constraints to eliminate all permutations of pairwise colliding paths in a single branching step. We implement these ideas in the context of the leading optimal MAPF algorithm CBS and show that the addition of the symmetry reasoning techniques can have a dramatic positive effect on its performance - we report a reduction in the number of node expansions by up to four orders of magnitude and an increase in scalability by up to thirty times. These gains allow us to solve to optimality a variety of challenging MAPF instances previously considered out of reach for CBS.