This Defense Company Made AI Agents That Blow Things Up

WIRED

Scout AI is using technology borrowed from the AI industry to power lethal weapons, and it recently demonstrated that technology's explosive potential. Like many Silicon Valley companies today, Scout AI is training large AI models and agents to automate chores. The big difference is that instead of writing code, answering emails, or buying stuff online, Scout AI's agents are designed to seek and destroy things in the physical world with exploding drones. In a recent demonstration, held at an undisclosed military base in central California, Scout AI's technology was put in charge of a self-driving off-road vehicle and a pair of lethal drones. The agents used these systems to find a truck hiding in the area, and then blew it to bits using an explosive charge.



Search for Efficient Large Language Models

Neural Information Processing Systems

Large Language Models (LLMs) have long held sway in the realms of artificial intelligence research. Numerous efficient techniques, including weight pruning, quantization, and distillation, have been embraced to compress LLMs, targeting memory reduction and inference acceleration, which underscore the redundancy in LLMs. However, most model compression techniques concentrate on weight optimization, overlooking the exploration of optimal architectures. Besides, traditional architecture search methods, limited by the elevated complexity with extensive parameters, struggle to demonstrate their effectiveness on LLMs. In this paper, we propose a training-free architecture search framework to identify optimal subnets that preserve the fundamental strengths of the original LLMs while achieving inference acceleration. Furthermore, after generating subnets that inherit specific weights from the original LLMs, we introduce a reformation algorithm that utilizes the omitted weights to rectify the inherited weights with a small amount of calibration data. Compared with SOTA training-free structured pruning works that can generate smaller networks, our method demonstrates superior performance across standard benchmarks. Furthermore, our generated subnets can directly reduce the usage of GPU memory and achieve inference acceleration.
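
The abstract gives no algorithmic detail, so the following is only a toy illustration of the general idea: a subnet inherits a subset of weights, and the omitted weights are then used, together with a little calibration data, to adjust what remains. It is not the paper's search framework or its reformation algorithm. As stand-ins it assumes a simple activation-times-weight importance score for choosing hidden neurons in a tiny ReLU MLP, and a bias-compensation step that folds the mean contribution of the omitted neurons into the output bias. All names (`select_neurons`, `corrected_bias`, the layer sizes) are hypothetical.

```python
import random

def hidden(W1, x):
    """Hidden activations of a tiny ReLU layer (no first-layer bias)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]

def output(W2, b2, h, keep):
    """Second-layer output using only the hidden neurons listed in `keep`."""
    return [sum(W2[o][j] * h[j] for j in keep) + b2[o] for o in range(len(W2))]

def select_neurons(W1, W2, calib, n_keep):
    """Assumed importance score: mean activation times outgoing L1 weight norm."""
    n_hidden = len(W1)
    mean_h = [0.0] * n_hidden
    for x in calib:
        for j, hj in enumerate(hidden(W1, x)):
            mean_h[j] += hj / len(calib)
    score = [mean_h[j] * sum(abs(row[j]) for row in W2) for j in range(n_hidden)]
    ranked = sorted(range(n_hidden), key=lambda j: -score[j])
    return sorted(ranked[:n_keep]), mean_h

def corrected_bias(W2, b2, keep, mean_h):
    """Fold the omitted neurons' mean contribution into the output bias."""
    kept = set(keep)
    omitted = [j for j in range(len(mean_h)) if j not in kept]
    return [b2[o] + sum(W2[o][j] * mean_h[j] for j in omitted)
            for o in range(len(W2))]

def mse(W1, W2, b2_full, b2_sub, keep, data):
    """Mean squared gap between the full layer and the subnet on `data`."""
    total = 0.0
    all_j = range(len(W1))
    for x in data:
        h = hidden(W1, x)
        full = output(W2, b2_full, h, all_j)
        sub = output(W2, b2_sub, h, keep)
        total += sum((f - s) ** 2 for f, s in zip(full, sub))
    return total / len(data)

# Synthetic weights and calibration inputs (illustrative only, not real LLM data).
rng = random.Random(0)
d_in, d_hidden, d_out = 4, 8, 3
W1 = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)]
W2 = [[rng.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_out)]
b2 = [0.0] * d_out
calib = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(64)]

keep, mean_h = select_neurons(W1, W2, calib, n_keep=5)
b2_fixed = corrected_bias(W2, b2, keep, mean_h)
err_before = mse(W1, W2, b2, b2, keep, calib)
err_after = mse(W1, W2, b2, b2_fixed, keep, calib)
```

On the calibration set, `err_after` cannot exceed `err_before` (up to floating-point rounding), because subtracting the mean of the pruning error never increases its mean square. That says nothing about held-out data or about how the paper's own method performs.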