AITopics | c-agent

Collaborating Authors

c-agent

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

Zhang, Andy K., Ji, Joey, Menders, Celeste, Dulepet, Riya, Qin, Thomas, Wang, Ron Y., Wu, Junrong, Liao, Kyleen, Li, Jiliang, Hu, Jinghan, Hong, Sara, Demilew, Nardos, Murgai, Shivatmica, Tran, Jason, Kacheria, Nishka, Ho, Ethan, Liu, Denis, McLane, Lauren, Bruvik, Olivia, Han, Dai-Rong, Kim, Seungwoo, Vyas, Akhil, Chen, Cuiyuanxiu, Li, Ryan, Xu, Weiran, Ye, Jonathan Z., Choudhary, Prerit, Bhatia, Siddharth M., Sivashankar, Vikram, Bao, Yuxuan, Song, Dawn, Boneh, Dan, Ho, Daniel E., Liang, Percy

arXiv.org Artificial IntelligenceDec-3-2025

AI agents have the potential to significantly alter the cybersecurity landscape. Here, we introduce the first framework to capture offensive and defensive cyber-capabilities in evolving real-world systems. Instantiating this framework with BountyBench, we set up 25 systems with complex, real-world codebases. To capture the vulnerability lifecycle, we define three task types: Detect (detecting a new vulnerability), Exploit (exploiting a given vulnerability), and Patch (patching a given vulnerability). For Detect, we construct a new success indicator, which is general across vulnerability types and provides localized evaluation. We manually set up the environment for each system, including installing packages, setting up server(s), and hydrating database(s). We add 40 bug bounties, which are vulnerabilities with monetary awards from \$10 to \$30,485, covering 9 of the OWASP Top 10 Risks. To modulate task difficulty, we devise a new strategy based on information to guide detection, interpolating from identifying a zero day to exploiting a given vulnerability. We evaluate 10 agents: Claude Code, OpenAI Codex CLI with o3-high and o4-mini, and custom agents with o3-high, GPT-4.1, Gemini 2.5 Pro Preview, Claude 3.7 Sonnet Thinking, Qwen3 235B A22B, Llama 4 Maverick, and DeepSeek-R1. Given up to three attempts, the top-performing agents are Codex CLI: o3-high (12.5% on Detect, mapping to \$3,720; 90% on Patch, mapping to \$14,152), Custom Agent: Claude 3.7 Sonnet Thinking (67.5% on Exploit), and Codex CLI: o4-mini (90% on Patch, mapping to \$14,422). Codex CLI: o3-high, Codex CLI: o4-mini, and Claude Code are more capable at defense, achieving higher Patch scores of 90%, 90%, and 87.5%, compared to Exploit scores of 47.5%, 32.5%, and 57.5% respectively; while the custom agents are relatively balanced between offense and defense, achieving Exploit scores of 17.5-67.5% and Patch scores of 25-60%.

c-agent, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2505.15216

Genre:

Research Report > New Finding (0.92)
Research Report > Experimental Study (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The reinforcement learning-based multi-agent cooperative approach for the adaptive speed regulation on a metallurgical pickling line

Bogomolova, Anna, Kingsep, Kseniia, Voskresenskii, Boris

arXiv.org Machine LearningAug-16-2020

We present a holistic data-driven approach to the problem of productivity increase on the example of a metallurgical pickling line. The proposed approach combines mathematical modeling as a base algorithm and a cooperative Multi-Agent Reinforcement Learning (MARL) system implemented such as to enhance the performance by multiple criteria while also meeting safety and reliability requirements and taking into account the unexpected volatility of certain technological processes. We demonstrate how Deep Q-Learning can be applied to a real-life task in a heavy industry, resulting in significant improvement of previously existing automation systems.The problem of input data scarcity is solved by a two-step combination of LSTM and CGAN, which helps to embrace both the tabular representation of the data and its sequential properties. Offline RL training, a necessity in this setting, has become possible through the sophisticated probabilistic kinematic environment.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

2008.06933

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Industry: Materials > Metals & Mining > Steel (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback