Agents
The Download: AI "coworkers" and stratospheric internet
Plus: The US House has passed new youth online safety legislation. AI agents are not your "coworkers": Imagine coming in to work to learn that a new underling will report to you. The worker is not a person but an AI tool--one that your company nonetheless calls Alex, an "employee" with a title and defined responsibilities. How well do you think you would work with Alex? If you're anything like the managers studied by Boston University professor Emma Wiles, treating that AI as a coworker would lead you to do a worse job. They caught 18% fewer errors when the work was attributed to an agentic AI employee rather than a chatbot. This is an alarming glimpse of the future Silicon Valley is hurling us toward.
AI agents are not your "coworkers"
Marketing AI agents as digital employees may make human workers worse at spotting errors and more likely to offload accountability. Imagine coming in to work to learn that a new underling will report to you. The worker is not a person but an AI tool--one that your company nonetheless calls Alex, an "employee" with a title and defined responsibilities. How well do you think you would work with Alex? If you're anything like the managers recently studied by Emma Wiles, a Boston University business professor, treating Alex as a "coworker" and not a software tool would lead you to do a worse job. Wiles found that people caught 18% fewer errors when the work was said to have come from an agentic "AI employee" rather than a chatbot. It turns out that what's in a name matters.
Agent confidence on the technical frontier
A ranking of 101 agent tasks reveals where workflows are trending and where connected intelligence is critical. Enterprise investment in AI is booming. Gartner is calling 2026 an "inflection year" for organizations to align their AI projects with strategic business objectives. As the pressure to prove ROI mounts, executives and technology leaders are looking to agentic AI to drive the measurable financial outcomes their businesses seek. A prime opportunity for AI agents exists in the tech function, where IT infrastructure costs are projected to grow two to three times by 2030, even as budgets remain unchanged, according to McKinsey. And in the last 18 months, tech teams--the engineers, developers, architects, and other practitioners who are building, deploying, and continually improving their organizations' infrastructure and applications--are clearly putting agents to work.
SYMPHONY: Synergistic Multi-agent Planning with Heterogeneous Language Model Assembly
Recent advancements have increasingly focused on leveraging large language models (LLMs) to construct autonomous agents for complex problem-solving tasks. However, existing approaches predominantly employ a single-agent framework to generate search branches and estimate rewards during Monte Carlo Tree Search (MCTS) planning. This single-agent paradigm inherently limits exploration capabilities, often resulting in insufficient diversity among generated branches and suboptimal planning performance.
Stay In Control Of AI With The MSI Cubi NUC AI+ 3MG Mini PC
Run private, reliable local AI workflows with MSI's Cubi NUC AI+ 3MG mini PC, built for compact on-device AI, agents, and secure automation. Agentic AI is the current buzzword of the day, but if you're chatting with ChatGPT, having Gemini make images for you, or coding with Claude, then you're almost exclusively interacting with a cloud AI ecosystem. This is great for the most capable, frontier models, but it's dependent on server uptime, gives few guarantees for privacy and security, and once you start running agents, token costs can quickly run away from you. For smaller, everyday AI tasks, persistent agentic work, or partnering with more powerful cloud or local AI tools, small-scale local AI PCs can be ideal.
LOPT: Learning Optimal Pigovian Tax in Sequential Social Dilemmas
Multi-agent reinforcement learning (MARL) has emerged as a powerful framework for modeling autonomous agents that independently optimize their individual objectives. However, in mixed-motive MARL environments, rational self-interested behaviors often lead to collectively suboptimal outcomes, situations commonly referred to as social dilemmas. A key challenge in addressing social dilemmas lies in accurately quantifying and representing them in a numerical form that captures how self-interested agent behaviors impact social welfare. To address this challenge, the economic concept of externalities is adopted and extended to denote the unaccounted-for impact of one agent's actions on others, as a means to rigorously quantify social dilemmas.
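As a rough illustration of the externality idea in the abstract above, the sketch below measures how much one agent's action changes the combined reward of everyone else, compared with a baseline action, and then charges that amount back to the agent. This is not the LOPT method, which learns the tax; the toy reward function, the `harm` parameter, and all function names are assumptions made up for this example.

```python
from typing import Callable, List

RewardFn = Callable[[List[float]], List[float]]


def commons_rewards(harvests: List[float], harm: float = 0.5) -> List[float]:
    """Hypothetical toy dilemma: each agent gains its own harvest but loses
    `harm` for every unit harvested by the other agents."""
    total = sum(harvests)
    return [h - harm * (total - h) for h in harvests]


def externality(rewards: RewardFn, actions: List[float], agent: int,
                baseline: float = 0.0) -> float:
    """Change in the other agents' summed reward caused by `agent` taking its
    actual action instead of the baseline action (negative means harm)."""
    counterfactual = list(actions)
    counterfactual[agent] = baseline
    actual_others = sum(r for j, r in enumerate(rewards(actions)) if j != agent)
    baseline_others = sum(r for j, r in enumerate(rewards(counterfactual)) if j != agent)
    return actual_others - baseline_others


def taxed_rewards(rewards: RewardFn, actions: List[float]) -> List[float]:
    """Pigovian-style adjustment: add each agent's externality to its own reward,
    so harm imposed on others shows up as a cost to the agent causing it."""
    raw = rewards(actions)
    return [raw[i] + externality(rewards, actions, i) for i in range(len(actions))]


if __name__ == "__main__":
    harvests = [2.0, 0.0, 1.0]
    print("raw rewards:  ", commons_rewards(harvests))
    print("externalities:", [externality(commons_rewards, harvests, i) for i in range(3)])
    print("taxed rewards:", taxed_rewards(commons_rewards, harvests))
```

In this toy setting the agent that harvests the most receives the largest charge, while an agent that already takes the baseline action is charged nothing.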
MisoDICE: Multi-Agent Imitation from Unlabeled Mixed-Quality Demonstrations
We study offline imitation learning (IL) in cooperative multi-agent settings, where demonstrations have unlabeled mixed quality -- containing both expert and suboptimal trajectories. Our proposed solution is structured in two stages: trajectory labeling and multi-agent imitation learning, designed jointly to enable effective learning from heterogeneous, unlabeled data. In the first stage, we combine advances in large language models and preference-based reinforcement learning to construct a progressive labeling pipeline that distinguishes expert-quality trajectories. In the second stage, we introduce MisoDICE, a novel multi-agent IL algorithm that leverages these labels to learn robust policies while addressing the computational complexity of large joint state-action spaces. By extending the popular single-agent DICE framework to multi-agent settings with a new value decomposition and mixing architecture, our method yields a convex policy optimization objective and ensures consistency between global and local policies. We evaluate MisoDICE on multiple standard multi-agent RL benchmarks and demonstrate superior performance, especially when expert data is scarce.
Strategic Hypothesis Testing
We examine hypothesis testing within a principal-agent framework, where a strategic agent, holding private beliefs about the effectiveness of a product, submits data to a principal who decides on approval. The principal employs a hypothesis testing rule, aiming to pick a p-value threshold that balances false positives and false negatives while anticipating the agent's incentive to maximize expected profitability. Building on prior work, we develop a game-theoretic model that captures how the agent's participation and reporting behavior respond to the principal's statistical decision rule. Despite the complexity of the interaction, we show that the principal's errors exhibit clear monotonic behavior when segmented by an efficiently computable critical p-value threshold, leading to an interpretable characterization of their optimal p-value threshold.
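To make the false-positive versus false-negative trade-off in the abstract above concrete, here is a minimal sketch for a one-sided z-test with a known effect size. It deliberately ignores the strategic agent, so it is not the paper's game-theoretic model; the effect size, sample size, cost weights, and function names are all assumptions chosen for illustration.

```python
from statistics import NormalDist
from typing import Iterable, Tuple

STD_NORMAL = NormalDist()  # standard normal distribution from the stdlib


def error_rates(alpha: float, effect_size: float, n: int) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate) for a one-sided z-test
    that approves when the p-value falls below `alpha`.

    Assumes unit variance and a true standardized effect of `effect_size`
    whenever the product actually works.
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)        # rejection cutoff under the null
    shift = effect_size * n ** 0.5                  # mean of the z-statistic under the alternative
    false_negative = STD_NORMAL.cdf(z_crit - shift) # chance of missing a real effect
    return alpha, false_negative


def best_threshold(candidates: Iterable[float], effect_size: float, n: int,
                   fp_cost: float = 1.0, fn_cost: float = 1.0) -> float:
    """Pick the candidate threshold that minimizes a weighted sum of the two error rates."""
    def loss(alpha: float) -> float:
        fp, fn = error_rates(alpha, effect_size, n)
        return fp_cost * fp + fn_cost * fn
    return min(candidates, key=loss)


if __name__ == "__main__":
    grid = [0.001, 0.005, 0.01, 0.025, 0.05, 0.10, 0.20]
    for alpha in grid:
        fp, fn = error_rates(alpha, effect_size=0.5, n=25)
        print(f"alpha={alpha:<6} false positive={fp:.3f} false negative={fn:.3f}")
    print("chosen threshold:", best_threshold(grid, effect_size=0.5, n=25, fp_cost=5.0))
```

Loosening the threshold lowers the false-negative rate at the price of more false positives, and raising `fp_cost` never moves the chosen threshold to a looser value.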
A Principle of Targeted Intervention for Multi-Agent Reinforcement Learning
Steering cooperative multi-agent reinforcement learning (MARL) towards desired outcomes is challenging, particularly when global guidance from a human over the whole multi-agent system is impractical in large-scale MARL. On the other hand, designing external mechanisms (e.g., intrinsic rewards and human feedback) to coordinate agents mostly relies on empirical studies, lacking an easy-to-use research tool. In this work, we employ multi-agent influence diagrams (MAIDs) as a graphical framework to address the above issues. First, we introduce the concept of MARL interaction paradigms (orthogonal to MARL learning paradigms), using MAIDs to analyze and visualize both unguided self-organization and global guidance mechanisms in MARL. Then, we design a new MARL interaction paradigm, referred to as the targeted intervention paradigm, which is applied to only a single targeted agent, so that the problem of global guidance can be mitigated. In implementation, we introduce a causal inference technique--referred to as Pre-Strategy Intervention (PSI)--to realize the targeted intervention paradigm. Since MAIDs can be regarded as a special class of causal diagrams, a composite desired outcome that integrates the primary task goal and an additional desired outcome can be achieved by maximizing the corresponding causal effect through the PSI. Moreover, the bundled relevance graph analysis of MAIDs provides a tool to identify whether an MARL learning paradigm is workable under the design of an MARL interaction paradigm. In experiments, we demonstrate the effectiveness of our proposed targeted intervention, and verify the result of relevance graph analysis.
Wonder Wins Ways: Curiosity-Driven Exploration through Multi-Agent Contextual Calibration
Autonomous exploration in complex multi-agent reinforcement learning (MARL) with sparse rewards critically depends on providing agents with effective intrinsic motivation. While artificial curiosity offers a powerful self-supervised signal, it often confuses environmental stochasticity with meaningful novelty.