 crewai


Architecting Resilient LLM Agents: A Guide to Secure Plan-then-Execute Implementations

Del Rosario, Ron F., Krawiecka, Klaudia, de Witt, Christian Schroeder

arXiv.org Artificial Intelligence

As Large Language Model (LLM) agents become increasingly capable of automating complex, multi-step tasks, the need for robust, secure, and predictable architectural patterns is paramount. This paper provides a comprehensive guide to the "Plan-then-Execute" (P-t-E) pattern, an agentic design that separates strategic planning from tactical execution. We explore the foundational principles of P-t-E, detailing its core components (the Planner and the Executor) and its architectural advantages in predictability, cost-efficiency, and reasoning quality over reactive patterns like ReAct (Reason + Act). A central focus is placed on the security implications of this design, particularly its inherent resilience to indirect prompt injection attacks by establishing control-flow integrity. We argue that while P-t-E provides a strong foundation, a defense-in-depth strategy is necessary, and we detail essential complementary controls such as the Principle of Least Privilege, task-scoped tool access, and sandboxed code execution. To make these principles actionable, this guide provides detailed implementation blueprints and working code references for three leading agentic frameworks: LangChain (via LangGraph), CrewAI, and AutoGen. Each framework's approach to implementing the P-t-E pattern is analyzed, highlighting unique features like LangGraph's stateful graphs for re-planning, CrewAI's declarative tool scoping for security, and AutoGen's built-in Docker sandboxing. Finally, we discuss advanced patterns, including dynamic re-planning loops, parallel execution with Directed Acyclic Graphs (DAGs), and the critical role of Human-in-the-Loop (HITL) verification, to offer a complete strategic blueprint for architects, developers, and security engineers aiming to build production-grade, resilient, and trustworthy LLM agents.
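The separation described in this abstract can be illustrated with a minimal, framework-free sketch. It is not the paper's code and does not use the LangGraph, CrewAI, or AutoGen APIs; the names `Step`, `plan`, `execute`, and the two toy tools are hypothetical, and the planner is a fixed stand-in for what would be an LLM call. The point it tries to show is that the plan is fixed before any tool output is seen, so tool output is only ever passed along as data and cannot add or reorder steps, and that the executor refuses any tool outside the task's allow-list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Step:
    """One planned action: which tool to call and with what argument."""
    tool: str
    argument: str


# Hypothetical toy tools; a real agent would call external services here.
def search(query: str) -> str:
    return f"results for {query}"


def summarize(text: str) -> str:
    return f"summary of {text}"


TOOLS: Dict[str, Callable[[str], str]] = {"search": search, "summarize": summarize}


def plan(goal: str) -> List[Step]:
    """Stand-in for the Planner (assumed to be an LLM call in practice).

    The full plan is produced up front, before any tool has run.
    """
    return [Step("search", goal), Step("summarize", "{previous}")]


def execute(steps: List[Step], allowed_tools: Set[str]) -> List[str]:
    """Executor: runs the fixed plan step by step.

    Tool output is substituted into later arguments as plain data only;
    it is never interpreted as a new instruction or a new step.
    """
    outputs: List[str] = []
    previous = ""
    for step in steps:
        # Task-scoped tool access: reject anything outside the allow-list.
        if step.tool not in allowed_tools:
            raise PermissionError(f"tool not allowed for this task: {step.tool}")
        argument = step.argument.replace("{previous}", previous)
        previous = TOOLS[step.tool](argument)
        outputs.append(previous)
    return outputs


if __name__ == "__main__":
    for line in execute(plan("crewai"), {"search", "summarize"}):
        print(line)
```

In this sketch, narrowing `allowed_tools` to `{"search"}` makes the same plan fail at its second step, which is the least-privilege behavior the abstract refers to, reduced to a single check.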


Agentic AI Frameworks: Architectures, Protocols, and Design Challenges

Derouiche, Hana, Brahmi, Zaki, Mazeni, Haithem

arXiv.org Artificial Intelligence

| Aspect | Traditional AI agents | Modern agentic AI systems (LLM-based agents) |
| --- | --- | --- |
| Definition | Autonomous entities with fixed sensing/acting loops; limited by static rules or models | Autonomous reasoning systems using LLMs with dynamic behavior, tool orchestration, and context-awareness |
| Autonomy | Limited autonomy; often dependent on human input or predefined instructions | High autonomy; capable of independently performing complex and extended tasks |
| Goal Management | Focused on single, static goals or fixed task planning | Capable of managing multiple, evolving, and nested goals adaptively |
| Architecture | Rule-based or BDI (Belief-Desire-Intention) models; monolithic design | Modular architecture centered on LLMs, with components for memory, tools, context injection, and roles |
| Adaptability | Suited to controlled, predictable environments; poor generalization | Designed for open, dynamic, and unpredictable environments |
| Decision-Making | Deterministic or rule-based logic; symbolic reasoning | Context-sensitive, probabilistic reasoning with adaptive planning and self-reflection |
| Learning Mechanism | Rule-based or supervised learning with limited updates | Self-supervised and reinforcement learning; continual fine-tuning possible |
| Context Handling | Static or manually coded states and rules | Dynamic context injection via agent protocols (e.g., MCP, A2A) and runtime awareness |
| Communication | Message-passing via ACL or KQML | Real-time, event-driven collaboration; natural language interfaces |
| Tool Use | Limited or predefined tools and actions | Dynamic tool invocation, chaining, and API calling based on context |
| Memory | Optional, often hardcoded or task-specific | Integrated memory systems supporting long- and short-term information retention |


AI Agents-as-Judge: Automated Assessment of Accuracy, Consistency, Completeness and Clarity for Enterprise Documents

Dasgupta, Sudip, Shankar, Himanshu

arXiv.org Artificial Intelligence

This study presents a modular, multi-agent system for the automated review of highly structured enterprise business documents using AI agents. Unlike prior solutions focused on unstructured texts or limited compliance checks, this framework leverages modern orchestration tools such as LangChain, CrewAI, TruLens, and Guidance to enable section-by-section evaluation of documents for accuracy, consistency, completeness, and clarity. Specialized agents, each responsible for discrete review criteria such as template compliance or factual correctness, operate in parallel or sequence as required. Evaluation outputs are enforced to a standardized, machine-readable schema, supporting downstream analytics and auditability. Continuous monitoring and a feedback loop with human reviewers allow for iterative system improvement and bias mitigation. Quantitative evaluation demonstrates that the AI Agent-as-Judge system approaches or exceeds human performance in key areas: achieving 99% information consistency (vs. 92% for humans), halving error and bias rates, and reducing average review time from 30 to 2.5 minutes per document, with a 95% agreement rate between AI and expert human judgment. While promising for a wide range of industries, the study also discusses current limitations, including the need for human oversight in highly specialized domains and the operational cost of large-scale LLM usage. The proposed system serves as a flexible, auditable, and scalable foundation for AI-driven document quality assurance in the enterprise context.
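The idea of enforcing evaluation outputs to a standardized, machine-readable schema can be sketched as follows. This is an illustration only, not the study's implementation: the field names (`section`, `criterion`, `score`, `comment`), the 0 to 1 score range, and the function `parse_evaluation` are assumptions made for the example. Only the four criterion names come from the abstract.

```python
import json
from dataclasses import dataclass

# The four review criteria named in the abstract.
CRITERIA = {"accuracy", "consistency", "completeness", "clarity"}


@dataclass(frozen=True)
class Evaluation:
    """Hypothetical record that every reviewing agent must return."""
    section: str
    criterion: str
    score: float   # assumed range: 0.0 (worst) to 1.0 (best)
    comment: str


def parse_evaluation(raw: str) -> Evaluation:
    """Parse one agent's JSON output and reject anything off-schema."""
    data = json.loads(raw)
    # Raises TypeError if a field is missing or an unknown field is present.
    evaluation = Evaluation(**data)
    if evaluation.criterion not in CRITERIA:
        raise ValueError(f"unknown criterion: {evaluation.criterion}")
    if not isinstance(evaluation.score, (int, float)):
        raise ValueError("score must be a number")
    if not 0.0 <= evaluation.score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    return evaluation


if __name__ == "__main__":
    raw = json.dumps({
        "section": "Scope",
        "criterion": "clarity",
        "score": 0.8,
        "comment": "Terms are defined before use.",
    })
    print(parse_evaluation(raw))
```

Rejecting malformed records at this boundary is what would let downstream analytics and audit tooling treat every agent's output uniformly.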


Exploration of LLM Multi-Agent Application Implementation Based on LangGraph+CrewAI

Duan, Zhihua, Wang, Jialin

arXiv.org Artificial Intelligence

With the rapid development of large model technology, agent technology is being applied ever more widely across fields, profoundly changing how people work and live. In complex and dynamic systems, multiple agents can, through division of labor and collaboration, accomplish complex tasks that are difficult for a single agent to complete. This paper discusses the integrated application of LangGraph and CrewAI. LangGraph improves the efficiency of information transmission through its graph architecture, while CrewAI enhances team collaboration capabilities and system performance through intelligent task allocation and resource management. The main research contents of this paper are: (1) designing the architecture of agents based on LangGraph for precise control; (2) enhancing the capabilities of agents based on CrewAI to complete a variety of tasks. This study aims to examine the application of LangGraph and CrewAI in multi-agent systems, providing new perspectives for the future development of agent technology and promoting technological progress and application innovation in the field of large model intelligent agents.