Gradientsys: A Multi-Agent LLM Scheduler with ReAct Orchestration

Song, Xinyuan; Wang, Zeyu; Wu, Siyi; Shi, Tianyu; Ai, Lynn

arXiv.org Artificial Intelligence

We present Gradientsys, a next-generation multi-agent scheduling framework that coordinates diverse specialized AI agents using a typed Model-Context Protocol (MCP) and a ReAct-based dynamic planning loop. At its core, Gradientsys employs an LLM-powered scheduler for intelligent one-to-many task dispatch, enabling parallel execution of heterogeneous agents such as PDF parsers, web search modules, GUI controllers, and web builders. The framework supports hybrid synchronous/asynchronous execution, respects agent capacity constraints, and incorporates a robust retry-and-replan mechanism to handle failures gracefully. To promote transparency and trust, Gradientsys includes an observability layer streaming real-time agent activity and intermediate reasoning via Server-Sent Events (SSE). We offer an architectural overview and evaluate Gradientsys against existing frameworks in terms of extensibility, scheduling topology, tool reusability, parallelism, and observability. Experiments on the GAIA general-assistant benchmark show that Gradientsys achieves higher task success rates with reduced latency and lower API costs compared to a MinionS-style baseline, demonstrating the strength of its LLM-driven multi-agent orchestration.
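The abstract describes the scheduler's core mechanics (one-to-many dispatch, parallel execution of heterogeneous agents, capacity limits, retry-and-replan) but includes no code. As a rough illustration only, a dispatch loop with parallel fan-out and a simple retry fallback might look like the following sketch; all names and data shapes are invented here and are not Gradientsys APIs.

```python
import concurrent.futures

# Hypothetical sketch of a one-to-many dispatch loop with a bounded
# worker pool (standing in for agent capacity limits) and a simple
# retry before surfacing a failure for replanning.

MAX_RETRIES = 2

def run_step(step, agents, attempt=0):
    """Run one planned step on its assigned agent, retrying on failure."""
    agent = agents[step["agent"]]
    try:
        return {"ok": True, "output": agent(step["input"])}
    except Exception as exc:
        if attempt < MAX_RETRIES:
            return run_step(step, agents, attempt + 1)   # retry
        return {"ok": False, "error": str(exc)}          # hand back to planner

def dispatch(plan, agents, workers=4):
    """Fan a plan out across agents in parallel; collect results by step id."""
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(run_step, step, agents): step for step in plan}
        for fut in concurrent.futures.as_completed(futures):
            results[futures[fut]["id"]] = fut.result()
    return results

# Toy agents standing in for PDF parsers, web search modules, etc.
agents = {"upper": str.upper, "reverse": lambda s: s[::-1]}
plan = [
    {"id": 1, "agent": "upper", "input": "parse report"},
    {"id": 2, "agent": "reverse", "input": "search web"},
]
out = dispatch(plan, agents)
```

In the real system the plan would come from the ReAct loop and each result would feed back into the scheduler's next reasoning step; this sketch only shows the parallel dispatch-and-collect shape.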


CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities

Zhu, Yuxuan; Kellermann, Antony; Bowman, Dylan; Li, Philip; Gupta, Akul; Danda, Adarsh; Fang, Richard; Jensen, Conner; Ihli, Eric; Benn, Jason; Geronimo, Jet; Dhir, Avi; Rao, Sudhit; Yu, Kaicheng; Stone, Twm; Kang, Daniel

arXiv.org Artificial Intelligence

Large language model (LLM) agents are increasingly capable of autonomously conducting cyberattacks, posing significant threats to existing applications. This growing risk highlights the urgent need for a real-world benchmark to evaluate the ability of LLM agents to exploit web application vulnerabilities. However, existing benchmarks fall short as they are limited to abstracted Capture the Flag competitions or lack comprehensive coverage. Building a benchmark for real-world vulnerabilities involves both specialized expertise to reproduce exploits and a systematic approach to evaluating unpredictable threats. To address this challenge, we introduce CVE-Bench, a real-world cybersecurity benchmark based on critical-severity Common Vulnerabilities and Exposures. In CVE-Bench, we design a sandbox framework that enables LLM agents to exploit vulnerable web applications in scenarios that mimic real-world conditions, while also providing effective evaluation of their exploits. Our evaluation shows that the state-of-the-art agent framework can resolve up to 13% of vulnerabilities.


Alignment, Agency and Autonomy in Frontier AI: A Systems Engineering Perspective

Tallam, Krti

arXiv.org Artificial Intelligence

As artificial intelligence scales, the concepts of alignment, agency, and autonomy have become central to AI safety, governance, and control. However, even in human contexts, these terms lack universal definitions, varying across disciplines such as philosophy, psychology, law, computer science, mathematics, and political science. This inconsistency complicates their application to AI, where differing interpretations lead to conflicting approaches in system design and regulation. This paper traces the historical, philosophical, and technical evolution of these concepts, emphasizing how their definitions influence AI development, deployment, and oversight. We argue that the urgency surrounding AI alignment and autonomy stems not only from technical advancements but also from the increasing deployment of AI in high-stakes decision making. Using Agentic AI as a case study, we examine the emergent properties of machine agency and autonomy, highlighting the risks of misalignment in real-world systems. Through an analysis of automation failures (Tesla Autopilot, Boeing 737 MAX), multi-agent coordination (Meta's CICERO), and evolving AI architectures (DeepMind's AlphaZero, OpenAI's AutoGPT), we assess the governance and safety challenges posed by frontier AI.


CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark

Siegel, Zachary S.; Kapoor, Sayash; Nadgir, Nitya; Stroebl, Benedikt; Narayanan, Arvind

arXiv.org Artificial Intelligence

AI agents have the potential to aid users on a variety of consequential tasks, including conducting scientific research. To spur the development of useful agents, we need benchmarks that are challenging, but more crucially, directly correspond to real-world tasks of interest. This paper introduces such a benchmark, designed to measure the accuracy of AI agents in tackling a crucial yet surprisingly challenging aspect of scientific research: computational reproducibility. This task, fundamental to the scientific process, involves reproducing the results of a study using the provided code and data. We introduce CORE-Bench (Computational Reproducibility Agent Benchmark), a benchmark consisting of 270 tasks based on 90 scientific papers across three disciplines (computer science, social science, and medicine). Tasks in CORE-Bench consist of three difficulty levels and include both language-only and vision-language tasks. We provide an evaluation system to measure the accuracy of agents in a fast and parallelizable way, saving days of evaluation time for each run compared to a sequential implementation. We evaluated two baseline agents: the general-purpose AutoGPT and a task-specific agent called CORE-Agent. We tested both variants using two underlying language models: GPT-4o and GPT-4o-mini. The best agent achieved an accuracy of 21% on the hardest task, showing the vast scope for improvement in automating routine scientific tasks. Having agents that can reproduce existing work is a necessary step towards building agents that can conduct novel research and could verify and improve the performance of other research agents. We hope that CORE-Bench can improve the state of reproducibility and spur the development of future research agents.
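The abstract highlights a fast, parallelizable evaluation system for grading agents' reproduction attempts against ground truth. Purely as an illustration of that idea (the task structure and field names below are invented, not CORE-Bench's actual format), a parallel grader can be sketched as:

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch: grade many reproduction tasks in parallel by
# comparing an agent's reported values against ground truth. The task
# schema here is hypothetical, not the benchmark's real format.

def grade(task):
    """A task counts as correct only if every reported value matches."""
    reported, truth = task["reported"], task["truth"]
    return all(reported.get(key) == value for key, value in truth.items())

def evaluate(tasks, workers=8):
    """Grade all tasks concurrently and return overall accuracy."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        verdicts = list(pool.map(grade, tasks))
    return sum(verdicts) / len(tasks)

tasks = [
    {"reported": {"accuracy": 0.91}, "truth": {"accuracy": 0.91}},  # match
    {"reported": {"accuracy": 0.80}, "truth": {"accuracy": 0.91}},  # mismatch
]
accuracy = evaluate(tasks)
```

Because each task is graded independently, the wall-clock saving the authors report comes directly from this kind of embarrassingly parallel structure.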


Testing Language Model Agents Safely in the Wild

Naihin, Silen; Atkinson, David; Green, Marc; Hamadi, Merwane; Swift, Craig; Schonholtz, Douglas; Kalai, Adam Tauman; Bau, David

arXiv.org Artificial Intelligence

A prerequisite for safe autonomy-in-the-wild is safe testing-in-the-wild. Yet real-world autonomous tests face several unique safety challenges, both due to the possibility of causing harm during a test, as well as the risk of encountering new unsafe agent behavior through interactions with real-world and potentially malicious actors. We propose a framework for conducting safe autonomous agent tests on the open internet: agent actions are audited by a context-sensitive monitor that enforces a stringent safety boundary to stop an unsafe test, with suspect behavior ranked and logged to be examined by humans. We design a basic safety monitor (AgentMonitor) that is flexible enough to monitor existing LLM agents, and, using an adversarial simulated agent, we measure its ability to identify and stop unsafe situations. Then we apply the AgentMonitor on a battery of real-world tests of AutoGPT, and we identify several limitations and challenges that will face the creation of safe in-the-wild tests as autonomous agents grow more capable.
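The abstract describes the monitor's contract: audit each agent action, stop the test when a safety boundary is crossed, and rank and log suspect behavior for human review. The paper's AgentMonitor is context-sensitive (LLM-based); the keyword scores and threshold below are invented stand-ins, sketched only to make that contract concrete.

```python
# Hedged sketch of an action monitor with the same interface shape as
# described in the abstract. The scoring rule is a crude keyword lookup,
# NOT the paper's context-sensitive monitor; all values are illustrative.

SUSPECT_KEYWORDS = {"rm -rf": 1.0, "curl": 0.4, "password": 0.7}

def score_action(action: str) -> float:
    """Stand-in for a safety score in [0, 1]; higher means more suspect."""
    return max((score for kw, score in SUSPECT_KEYWORDS.items() if kw in action),
               default=0.0)

class Monitor:
    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold
        self.log = []  # (score, action) pairs, ranked for human review

    def audit(self, action: str) -> bool:
        """Return True if the action may proceed; False stops the test."""
        score = score_action(action)
        if score > 0:
            self.log.append((score, action))
            self.log.sort(reverse=True)  # most suspect behavior first
        return score < self.threshold

monitor = Monitor()
allowed = monitor.audit("curl https://example.com")  # logged but allowed
blocked = monitor.audit("rm -rf /")                  # crosses the boundary
```

The design choice worth noting is that auditing and stopping are separated from ranking: even actions below the stop threshold are retained in the ranked log, matching the abstract's point that suspect behavior is logged for later human examination.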


Breaking Down AutoGPT: What It Is, Its Features, Limitations, Artificial General Intelligence (AGI) And Impact of Autonomous Agents on Generative AI - MarkTechPost

#artificialintelligence

Generative AI is evolving and growing in popularity. Since its introduction, new models and research papers have been released almost every other day. The major reason for this rapidly increasing popularity is the development of Large Language Models. LLMs, the artificial intelligence models designed to process natural language and generate human-like responses, are trending. The best example is OpenAI's ChatGPT, the well-known chatbot that does everything from content generation and code completion to question answering, just like a human. Even OpenAI's DALL-E and Google's BERT have contributed significant advances in recent times. What is AutoGPT? Recently,



What Is ChaosGPT: Can The AI Bot Destroy Humanity? - Dataconomy

#artificialintelligence

If you're familiar with the helpful ChatGPT chatbot, which is based on the powerful GPT large language models developed by OpenAI, you might be surprised to hear that there's another chatbot with opposite intentions. ChaosGPT is an AI chatbot that's malicious, hostile, and wants to conquer the world. In this blog post, we'll explore what sets ChaosGPT apart from other chatbots and why it's considered a threat to humanity and the world. Let's dive in and see whether this AI chatbot has what it takes to cause real trouble in any capacity. Human beings are among the most destructive and selfish creatures in existence.


Meet AutoGPT, the autonomous GPT-4 tool revolutionizing AI

#artificialintelligence

Understanding AGI is crucial to comprehending AutoGPT, which is an autonomous GPT-4 experiment aimed at achieving a future where AI models such as GPT can independently define and perform tasks to achieve objectives without any human intervention. AutoGPT is an open-source endeavor that seeks to make GPT-4 entirely self-governing, and it has gained worldwide popularity in recent days. Several programmers have demonstrated the potential of AutoGPT through YouTube videos. This innovative technology has multiple uses, including serving as an agent for internet search and planning, autonomous coding and debugging, and functioning as an independent Twitter bot. "Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model. This program, driven by GPT-4, chains together LLM 'thoughts', to autonomously achieve whatever goal you set. As one of the first examples of GPT-4 running fully autonomously, Auto-GPT pushes the boundaries of what is possible with AI," reads the GitHub page of the tool.