
Artemis: Towards Referential Understanding in Complex Videos

Neural Information Processing Systems

Videos carry rich visual information, including object descriptions, actions, and interactions, but existing multimodal large language models (MLLMs) fall short in referential understanding scenarios such as video-based referring. In this paper, we present Artemis, an MLLM that pushes video-based referential understanding to a finer level. Given a video, Artemis receives a natural-language question with a bounding box in any video frame and describes the referred target in the entire video. The key to achieving this goal lies in extracting compact, target-specific video features, for which we set a solid baseline by tracking and selecting spatiotemporal features from the video. We train Artemis on the newly established VideoRef45K dataset with 45K video-QA pairs and design a computationally efficient, three-stage training procedure. Results are promising both quantitatively and qualitatively. Additionally, we show that Artemis can be integrated with video grounding and text summarization tools to understand more complex scenarios.
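The abstract does not specify how the target-specific features are computed, but the "track, then pool" idea can be illustrated with a minimal toy sketch. Everything here is hypothetical (the function name, feature shapes, and mean-pooling choice are assumptions for illustration, not the paper's actual method):

```python
import numpy as np

def pool_target_features(frame_feats, boxes):
    """Mean-pool the feature cells inside each frame's tracked box.

    frame_feats: (T, H, W, C) per-frame feature maps
    boxes: list of T boxes (x0, y0, x1, y1) in feature-map coordinates,
           e.g. produced by a tracker seeded from the user's bounding box
    Returns a (T, C) matrix of compact, target-specific features.
    """
    pooled = []
    for feats, (x0, y0, x1, y1) in zip(frame_feats, boxes):
        region = feats[y0:y1, x0:x1]                      # crop tracked region
        pooled.append(region.reshape(-1, feats.shape[-1]).mean(axis=0))
    return np.stack(pooled)

# Toy example: 4 frames of 8x8 feature maps with 16 channels
feats = np.random.rand(4, 8, 8, 16)
boxes = [(1, 1, 4, 4)] * 4
print(pool_target_features(feats, boxes).shape)  # (4, 16)
```

The resulting (T, C) sequence is small enough to hand to a language model as target tokens, which is the general shape of the problem the abstract describes.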


Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing

Lin, Justin W., Jones, Eliot Krzysztof, Jasper, Donovan Julian, Ho, Ethan Jun-shen, Wu, Anna, Yang, Arnold Tianyi, Perry, Neil, Zou, Andy, Fredrikson, Matt, Kolter, J. Zico, Liang, Percy, Boneh, Dan, Ho, Daniel E.

arXiv.org Artificial Intelligence

We present the first comprehensive evaluation of AI agents against human cybersecurity professionals in a live enterprise environment. We evaluate ten cybersecurity professionals alongside six existing AI agents and ARTEMIS, our new agent scaffold, on a large university network consisting of ~8,000 hosts across 12 subnets. ARTEMIS is a multi-agent framework featuring dynamic prompt generation, arbitrary sub-agents, and automatic vulnerability triaging. In our comparative study, ARTEMIS placed second overall, discovering 9 valid vulnerabilities with an 82% valid submission rate and outperforming 9 of 10 human participants. While existing scaffolds such as Codex and CyAgent underperformed relative to most human participants, ARTEMIS demonstrated technical sophistication and submission quality comparable to the strongest participants. We observe that AI agents offer advantages in systematic enumeration, parallel exploitation, and cost: certain ARTEMIS variants run at $18/hour versus $60/hour for professional penetration testers. We also identify key capability gaps: AI agents exhibit higher false-positive rates and struggle with GUI-based tasks.


Evolving Excellence: Automated Optimization of LLM-based Agents

Brookes, Paul, Voskanyan, Vardan, Giavrimis, Rafail, Truscott, Matthew, Ilieva, Mina, Pavlou, Chrystalla, Staicu, Alexandru, Adham, Manal, Hood, Will Evers-, Gong, Jingzhi, Zhang, Kejia, Fedoseev, Matvey, Sharma, Vishal, Bauer, Roman, Wang, Zheng, Nair, Hema, Jie, Wei, Xu, Tianhua, Constantin, Aurora, Kanthan, Leslie, Basios, Michail

arXiv.org Artificial Intelligence

Agentic AI systems built on large language models (LLMs) offer significant potential for automating complex workflows, from software development to customer support. However, LLM agents often underperform due to suboptimal configurations: poorly tuned prompts, tool descriptions, and parameters that typically require weeks of manual refinement. Existing optimization methods are either too complex for general use or treat components in isolation, missing critical interdependencies. We present ARTEMIS, a no-code evolutionary optimization platform that jointly optimizes agent configurations through semantically-aware genetic operators. Given only a benchmark script and natural language goals, ARTEMIS automatically discovers configurable components, extracts performance signals from execution logs, and evolves configurations without requiring architectural modifications. We evaluate ARTEMIS on four representative agent systems: the \emph{ALE Agent} for competitive programming on AtCoder Heuristic Contest, achieving a \textbf{$13.6\%$ improvement} in acceptance rate; the \emph{Mini-SWE Agent} for code optimization on SWE-Perf, with a statistically significant \textbf{10.1\% performance gain}; and the \emph{CrewAI Agent} for cost and mathematical reasoning on Math Odyssey, achieving a statistically significant \textbf{$36.9\%$ reduction} in the number of tokens required for evaluation. We also evaluate the \emph{MathTales-Teacher Agent} powered by a smaller open-source model (Qwen2.5-7B) on GSM8K primary-level mathematics problems, achieving a \textbf{22\% accuracy improvement} and demonstrating that ARTEMIS can optimize agents based on both commercial and local models.
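The evolutionary loop the abstract describes (mutate configurations, score them against a benchmark, keep the winners) can be sketched in a few lines. This is a generic (1+λ)-style toy, not ARTEMIS's actual platform; the `mutate`/`evolve` names, the config fields, and the naive prompt edit are all illustrative assumptions:

```python
import random

def mutate(config, rng):
    """One toy 'semantic' mutation: tweak a single field of the agent config."""
    c = dict(config)
    key = rng.choice(list(c))
    if isinstance(c[key], float):
        # numeric parameter: small bounded perturbation
        c[key] = round(min(2.0, max(0.0, c[key] + rng.uniform(-0.2, 0.2))), 3)
    else:
        # string parameter: naive prompt edit, purely for illustration
        c[key] = c[key] + " Be concise."
    return c

def evolve(score, seed_config, generations=20, pop_size=8, rng=None):
    """Minimal (1+lambda) evolutionary search: keep the best config found."""
    rng = rng or random.Random(0)
    best, best_score = seed_config, score(seed_config)
    for _ in range(generations):
        for child in [mutate(best, rng) for _ in range(pop_size)]:
            s = score(child)           # in ARTEMIS this would run the benchmark
            if s > best_score:
                best, best_score = child, s
    return best, best_score

# Toy benchmark: pretend the agent scores best at temperature 0.3
seed = {"system_prompt": "You are a helpful agent.", "temperature": 0.9}
best, best_s = evolve(lambda c: -abs(c["temperature"] - 0.3), seed)
print(best["temperature"], best_s)
```

The real system's "semantically-aware" operators would edit prompts with an LLM rather than by string concatenation, and the score would come from parsing benchmark execution logs; the greedy keep-the-best skeleton is the same.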


A Hierarchical, Model-Based System for High-Performance Humanoid Soccer

Wang, Quanyou, Zhu, Mingzhang, Hou, Ruochen, Gillespie, Kay, Zhu, Alvin, Wang, Shiqi, Wang, Yicheng, Fernandez, Gaberiel I., Liu, Yeting, Togashi, Colin, Nam, Hyunwoo, Navghare, Aditya, Xu, Alex, Zhu, Taoyuanmin, Ahn, Min Sung, Alvarez, Arturo Flores, Quan, Justin, Hong, Ethan, Hong, Dennis W.

arXiv.org Artificial Intelligence

The development of athletic humanoid robots has gained significant attention as advances in actuation, sensing, and control enable increasingly dynamic, real-world capabilities. RoboCup, an international competition of fully autonomous humanoid robots, provides a uniquely challenging benchmark for such systems, culminating in the long-term goal of competing against human soccer players by 2050. This paper presents the hardware and software innovations underlying our team's victory in the RoboCup 2024 Adult-Sized Humanoid Soccer Competition. On the hardware side, we introduce an adult-sized humanoid platform built with lightweight structural components, high-torque quasi-direct-drive actuators, and a specialized foot design that enables powerful in-gait kicks while preserving locomotion robustness. On the software side, we develop an integrated perception and localization framework that combines stereo vision, object detection, and landmark-based fusion to provide reliable estimates of the ball, goals, teammates, and opponents. A mid-level navigation stack then generates collision-aware, dynamically feasible trajectories, while a centralized behavior manager coordinates high-level decision making, role selection, and kick execution based on the evolving game state. The seamless integration of these subsystems results in fast, precise, and tactically effective gameplay, enabling robust performance under the dynamic and adversarial conditions of real matches. This paper presents the design principles, system architecture, and experimental results that contributed to ARTEMIS's success as the 2024 Adult-Sized Humanoid Soccer champion.


America's Journey in Space Is About to Face Its Most Consequential Moment in Half a Century. Everyone Agrees: It's a Complete Disaster.

Slate

America's great journey in space is about to face its most consequential moment in half a century. Everyone agrees: It's a complete disaster.

I. Artemis, We Have a Problem

As you may have heard, NASA plans to send a crew of astronauts around the moon in early 2026, followed by a lunar landing in 2027. Or maybe you haven't heard. When I told one of my daughters about this plan to send people to the moon, she said, after a long silence: "But I thought we already sent a bunch of people there a long time ago." This is a standard response when I quiz people about Artemis, NASA's program to return to the moon, and this time to stay. It's named for Apollo's twin sister and the goddess of the moon and the hunt. The other day, I was in a gaggle with six neighbors, all highly informed professional people--two of them with long careers at the National Science Foundation--and none knew anything about Artemis except one thing: It's a plan to send people to Mars.

Artemis is a moon mission. There is no Mars mission: NASA has no Mars rocket, no Mars capsule, no Mars mission crew. What it does have is a very troubled moon program. Artemis faces fundamental engineering challenges that have called into question the program's basic architecture. Reconfiguring a mission this important is hard in the best of times, but the agency is being forced to do it during a year of unprecedented internal turmoil. A new administration always means turnover, but NASA has been in an uncontrolled spin every bit as alarming as the one Neil Armstrong famously pulled out of in 1966. More than a year ago, President-elect Donald Trump nominated a billionaire entrepreneur and Elon Musk ally, Jared Isaacman, to become NASA administrator. It was an unconventional choice, but Isaacman drew support from many quarters in the space community. Then, right before Isaacman was poised for confirmation by the Senate, Trump and Musk had a nasty falling-out, and Trump yanked Isaacman's nomination.
Since Inauguration Day, NASA had been run by acting administrator Janet Petro, a veteran agency official, and with Isaacman out, she remained in charge until one day in July when Trump suddenly named Secretary of Transportation Sean Duffy as interim administrator.





ARTEMIS: Autoregressive End-to-End Trajectory Planning with Mixture of Experts for Autonomous Driving

Feng, Renju, Xi, Ning, Chu, Duanfeng, Wang, Rukang, Deng, Zejian, Wang, Anzheng, Lu, Liping, Wang, Jinxiang, Huang, Yanjun

arXiv.org Artificial Intelligence

This paper presents ARTEMIS, an end-to-end autonomous driving framework that combines autoregressive trajectory planning with Mixture-of-Experts (MoE). Traditional modular methods suffer from error propagation, while existing end-to-end models typically employ static one-shot inference paradigms that inadequately capture the dynamic changes of the environment. ARTEMIS takes a different approach: by generating trajectory waypoints sequentially, it preserves critical temporal dependencies while dynamically routing scene-specific queries to specialized expert networks. This effectively relieves the trajectory quality degradation encountered when guidance information is ambiguous, and overcomes the inherent representational limitations of singular network architectures when processing diverse driving scenarios. Additionally, we use a lightweight batch reallocation strategy that significantly improves the training speed of the Mixture-of-Experts model. In experiments on the NAVSIM dataset, ARTEMIS exhibits superior competitive performance, achieving 87.0 PDMS and 83.1 EPDMS with a ResNet-34 backbone, demonstrating state-of-the-art performance on multiple metrics. Code will be available under https://github.
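The two ideas combined here, autoregressive waypoint generation and top-1 expert routing, fit in a tiny numpy sketch. Everything below is a toy under stated assumptions (random linear "experts", a hard-argmax router, 2-D state); it only illustrates the control flow, not ARTEMIS's networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy "experts": linear maps from (state + scene query) to a waypoint delta
W_experts = [rng.normal(size=(2, 4)) * 0.1 for _ in range(2)]
W_router = rng.normal(size=(2, 4)) * 0.1   # scores each expert from the input

def plan(state, query, horizon=6):
    """Autoregressively emit waypoints, routing each step to one expert."""
    traj = []
    for _ in range(horizon):
        x = np.concatenate([state, query])   # current state + scene query
        k = int(np.argmax(W_router @ x))     # hard top-1 routing
        delta = W_experts[k] @ x             # chosen expert predicts the step
        state = state + delta                # feed back: autoregression
        traj.append(state.copy())
    return np.stack(traj)

traj = plan(np.zeros(2), np.array([1.0, -1.0]))
print(traj.shape)  # (6, 2)
```

Because each waypoint is fed back before the next is predicted, the router can switch experts mid-trajectory as the (state, query) input changes, which is the property a static one-shot planner lacks.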


ARTEMIS: A Mixed Analog-Stochastic In-DRAM Accelerator for Transformer Neural Networks

Afifi, Salma, Thakkar, Ishan, Pasricha, Sudeep

arXiv.org Artificial Intelligence

Transformers have emerged as a powerful tool for natural language processing (NLP) and computer vision. Through the attention mechanism, these models have exhibited remarkable performance gains when compared to conventional approaches like recurrent neural networks (RNNs) and convolutional neural networks (CNNs). Nevertheless, transformers typically demand substantial execution time due to their extensive computations and large memory footprint. Processing in-memory (PIM) and near-memory computing (NMC) are promising solutions for accelerating transformers, as they offer high compute parallelism and memory bandwidth. However, designing PIM/NMC architectures to support the complex operations and the massive amounts of data that must be moved between layers in transformer neural networks remains a challenge. We propose ARTEMIS, a mixed analog-stochastic in-DRAM accelerator for transformer models. By making minimal changes to the conventional DRAM arrays, ARTEMIS efficiently alleviates the costs associated with transformer model execution, supporting stochastic computing for multiplications and temporal analog accumulations using a novel in-DRAM metal-on-metal capacitor. Our analysis indicates that ARTEMIS achieves at least 3.0x speedup, 1.8x lower energy, and 1.9x better energy efficiency compared to GPU, TPU, CPU, and state-of-the-art PIM transformer hardware accelerators.
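The stochastic-computing trick that makes in-DRAM multiplication cheap is standard unipolar SC: encode each value in [0, 1] as a random bitstream, and an AND gate multiplies them. A short simulation (a sketch of the general technique, not ARTEMIS's circuit) shows why:

```python
import numpy as np

def stochastic_multiply(p, q, n_bits=10_000, seed=0):
    """Approximate p*q by ANDing two Bernoulli bitstreams (unipolar SC).

    In hardware the AND is nearly free; here we simulate the bitstreams
    to show that the fraction of 1s in (a AND b) converges to p * q.
    """
    rng = np.random.default_rng(seed)
    a = rng.random(n_bits) < p        # bitstream with P(bit = 1) = p
    b = rng.random(n_bits) < q        # bitstream with P(bit = 1) = q
    return (a & b).mean()             # fraction of 1s ~= p * q

print(stochastic_multiply(0.5, 0.5))  # ~0.25
```

The precision/latency trade-off is the catch: the error shrinks only as 1/sqrt(n_bits), which is why the accelerator pairs stochastic multiplication with analog accumulation rather than using long bitstreams everywhere.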