maestro
MAESTRO: Multi-Agent Environment Shaping through Task and Reward Optimization
Cooperative Multi-Agent Reinforcement Learning (MARL) faces two major design bottlenecks: crafting dense reward functions and constructing curricula that avoid local optima in high-dimensional, non-stationary environments. Existing approaches rely on fixed heuristics or use Large Language Models (LLMs) directly in the control loop, which is costly and unsuitable for real-time systems. We propose MAESTRO (Multi-Agent Environment Shaping through Task and Reward Optimization), a framework that moves the LLM outside the execution loop and uses it as an offline training architect. MAESTRO introduces two generative components: (i) a semantic curriculum generator that creates diverse, performance-driven traffic scenarios, and (ii) an automated reward synthesizer that produces executable Python reward functions adapted to evolving curriculum difficulty. These components guide a standard MARL backbone (MADDPG) without increasing inference cost at deployment. We evaluate MAESTRO on large-scale traffic signal control (Hangzhou, 16 intersections) and conduct controlled ablations. Results show that combining LLM-generated curricula with LLM-generated reward shaping yields improved performance and stability. Across four seeds, the full system achieves +4.0% higher mean return (163.26 vs. 156.93) and 2.2% better risk-adjusted performance (Sharpe 1.53 vs. 0.70) over a strong curriculum baseline. These findings highlight LLMs as effective high-level designers for cooperative MARL training.
- North America > Canada (0.28)
- Asia > China > Zhejiang Province > Hangzhou (0.24)
- Education (0.93)
- Transportation > Ground > Road (0.49)
- Transportation > Infrastructure & Services (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
AAGATE: A NIST AI RMF-Aligned Governance Platform for Agentic AI
Huang, Ken, Lambros, Kyriakos Rock, Huang, Jerry, Mehmood, Yasir, Atta, Hammad, Beck, Joshua, Narajala, Vineeth Sai, Baig, Muhammad Zeeshan, Haq, Muhammad Aziz Ul, Shahzad, Nadeem, Gupta, Bhavya
This paper introduces the Agentic AI Governance Assurance & Trust Engine (AAGATE), a Kubernetes-native control plane designed to address the unique security and governance challenges posed by autonomous, language-model-driven agents in production. Recognizing the limitations of traditional Application Security (AppSec) tooling for improvisational, machine-speed systems, AAGATE operationalizes the NIST AI Risk Management Framework (AI RMF). It integrates specialized security frameworks for each RMF function: the Agentic AI Threat Modeling MAESTRO framework for Map, a hybrid of OWASP's AIVSS and SEI's SSVC for Measure, and the Cloud Security Alliance's Agentic AI Red Teaming Guide for Manage. By incorporating a zero-trust service mesh, an explainable policy engine, behavioral analytics, and decentralized accountability hooks, AAGATE provides a continuous, verifiable governance solution for agentic AI, enabling safe, accountable, and scalable deployment. The framework is further extended with DIRF for digital identity rights, LPCI defenses for logic-layer injection, and QSAF monitors for cognitive degradation, ensuring governance spans systemic, adversarial, and ethical risks.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > San Mateo County > Redwood City (0.04)
- Europe > Germany (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
MAESTRO : Adaptive Sparse Attention and Robust Learning for Multimodal Dynamic Time Series
Mohapatra, Payal, Sui, Yueyuan, Pandey, Akash, Xia, Stephen, Zhu, Qi
From clinical healthcare to daily living, continuous sensor monitoring across multiple modalities has shown great promise for real-world intelligent decision-making but also faces various challenges. In this work, we introduce MAESTRO, a novel framework that overcomes key limitations of existing multimodal learning approaches: (1) reliance on a single primary modality for alignment, (2) pairwise modeling of modalities, and (3) assumption of complete modality observations. These limitations hinder the applicability of these approaches in real-world multimodal time-series settings, where primary modality priors are often unclear, the number of modalities can be large (making pairwise modeling impractical), and sensor failures often result in arbitrary missing observations. At its core, MAESTRO facilitates dynamic intra- and cross-modal interactions based on task relevance, and leverages symbolic tokenization and adaptive attention budgeting to construct long multimodal sequences, which are processed via sparse cross-modal attention. The resulting cross-modal tokens are routed through a sparse Mixture-of-Experts (MoE) mechanism, enabling black-box specialization under varying modality combinations. We evaluate MAESTRO against 10 baselines on four diverse datasets spanning three applications, and observe average relative improvements of 4% and 8% over the best existing multimodal and multivariate approaches, respectively, under complete observations. Under partial observations -- with up to 40% of missing modalities -- MAESTRO achieves an average 9% improvement. Further analysis also demonstrates the robustness and efficiency of MAESTRO's sparse, modality-aware design for learning from dynamic time series.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.92)
Multi-modal Adaptive Estimation for Temporal Respiratory Disease Outbreak
Liu, Hong, Cen, Kerui, Chen, Yanxing, Liu, Zige, Chen, Dong, Yang, Zifeng, Hon, Chitin
Timely and robust influenza incidence forecasting is critical for public health decision-making. This paper presents MAESTRO (Multi-modal Adaptive Estimation for Temporal Respiratory Disease Outbreak), a novel, unified framework that synergistically integrates advanced spectro-temporal modeling with multi-modal data fusion, including surveillance, web search trends, and meteorological data. By adaptively weighting heterogeneous data sources and decomposing complex time series patterns, the model achieves robust and accurate forecasts. Evaluated on over 11 years of Hong Kong influenza data (excluding the COVID-19 period), MAESTRO demonstrates state-of-the-art performance, achieving a superior model fit with an R-square of 0.956. Extensive ablations confirm the significant contributions of its multi-modal and spectro-temporal components. The modular and reproducible pipeline is made publicly available to facilitate deployment and extension to other regions and pathogens, presenting a powerful tool for epidemiological forecasting.
- Asia > China > Hong Kong (0.25)
- Asia > Macao (0.08)
- Asia > China > Guangdong Province > Guangzhou (0.05)
- (3 more...)
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.46)
Maestro: Self-Improving Text-to-Image Generation via Agent Orchestration
Wan, Xingchen, Zhou, Han, Sun, Ruoxi, Nakhost, Hootan, Jiang, Ke, Sinha, Rajarishi, Arık, Sercan Ö.
Text-to-image (T2I) models, while offering immense creative potential, are highly reliant on human intervention, posing significant usability challenges that often necessitate manual, iterative prompt engineering over often underspecified prompts. This paper introduces Maestro, a novel self-evolving image generation system that enables T2I models to autonomously self-improve generated images through iterative evolution of prompts, using only an initial prompt. Maestro incorporates two key innovations: 1) self-critique, where specialized multimodal LLM (MLLM) agents act as 'critics' to identify weaknesses in generated images, correct for under-specification, and provide interpretable edit signals, which are then integrated by a 'verifier' agent while preserving user intent; and 2) self-evolution, utilizing MLLM-as-a-judge for head-to-head comparisons between iteratively generated images, eschewing problematic images, and evolving creative prompt candidates that align with user intents. Extensive experiments on complex T2I tasks using black-box models demonstrate that Maestro significantly improves image quality over initial prompts and state-of-the-art automated methods, with effectiveness scaling with more advanced MLLM components. This work presents a robust, interpretable, and effective pathway towards self-improving T2I generation.
Maestro: Joint Graph & Config Optimization for Reliable AI Agents
Wang, Wenxiao, Kattakinda, Priyatham, Feizi, Soheil
Building reliable LLM agents requires decisions at two levels: the graph (which modules exist and how information flows) and the configuration of each node (models, prompts, tools, control knobs). Most existing optimizers tune configurations while holding the graph fixed, leaving structural failure modes unaddressed. We introduce Maestro, a framework-agnostic holistic optimizer for LLM agents that jointly searches over graphs and configurations to maximize agent quality, subject to explicit rollout/token budgets. Beyond numeric metrics, Maestro leverages reflective textual feedback from traces to prioritize edits, improving sample efficiency and targeting specific failure modes. On the IFBench and HotpotQA benchmarks, Maestro consistently surpasses leading prompt optimizers--MIPROv2, GEPA, and GEPA+Merge--by an average of 12%, 4.9%, and 4.86%, respectively; even when restricted to prompt-only optimization, it still leads by 9.65%, 2.37%, and 2.41%. Maestro achieves these results with far fewer rollouts than GEPA. We further show large gains on two applications (interviewer & RAG agents), highlighting that joint graph & configuration search addresses structural failure modes that prompt tuning alone cannot fix.
- Banking & Finance (1.00)
- Information Technology (0.68)
Guess who brought back Agatha Christie as an AI clone
Feedback is New Scientist's popular sideways look at the latest science and technology news. You can submit items you believe may amuse readers to Feedback by emailing feedback@newscientist.com Now and then, Feedback sees ads for courses promising to teach us how to become an excellent creative writer. It sounds like fun, but why learn to be a good writer when we can just do this stuff instead? One brand that recently caught Feedback's eye is BBC Maestro.
Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs
Omar, Reham, Mangukiya, Omij, Mansour, Essam
Dialogue benchmarks are crucial in training and evaluating chatbots engaging in domain-specific conversations. Knowledge graphs (KGs) represent semantically rich and well-organized data spanning various domains, such as DBLP, DBpedia, and YAGO. Traditionally, dialogue benchmarks have been manually created from documents, neglecting the potential of KGs in automating this process. Some question-answering benchmarks are automatically generated using extensive preprocessing from KGs, but they do not support dialogue generation. This paper introduces Chatty-Gen, a novel multi-stage retrieval-augmented generation platform for automatically generating high-quality dialogue benchmarks tailored to a specific domain using a KG. Chatty-Gen decomposes the generation process into manageable stages and uses assertion rules for automatic validation between stages. Our approach enables control over intermediate results to prevent time-consuming restarts due to hallucinations. It also reduces reliance on costly and more powerful commercial LLMs. Chatty-Gen eliminates upfront processing of the entire KG using efficient query-based retrieval to find representative subgraphs based on the dialogue context. Our experiments with several real and large KGs demonstrate that Chatty-Gen significantly outperforms state-of-the-art systems and ensures consistent model and system performance across multiple LLMs of diverse capabilities, such as GPT-4o, Gemini 1.5, Llama 3, and Mistral.
- North America > Canada (0.04)
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Liaoning Province > Shenyang (0.04)
- Research Report (0.64)
- Workflow (0.46)
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition
Horvath, Samuel, Laskaridis, Stefanos, Rajput, Shashank, Wang, Hongyi
Deep Neural Networks (DNNs) have been a large driver for AI breakthroughs in recent years. However, these models have been getting increasingly large as they become more accurate and safe. This means that their training becomes increasingly costly and time-consuming and typically yields a single model to fit all targets. Various techniques have been proposed in the literature to mitigate this, including pruning, sparsification, or quantization of model weights and updates. While achieving high compression rates, they often incur significant computational overheads at training or lead to non-negligible accuracy penalty. Alternatively, factorization methods have been leveraged for low-rank compression of DNNs. Similarly, such techniques (e.g., SVD) frequently rely on heavy iterative decompositions of layers and are potentially sub-optimal for non-linear models, such as DNNs. We take a further step in designing efficient low-rank models and propose Maestro, a framework for trainable low-rank layers. Instead of iteratively applying a priori decompositions, the low-rank structure is baked into the training process through LoD, a low-rank ordered decomposition. Not only is this the first time importance ordering via sampling is applied on the decomposed DNN structure, but it also allows selecting ranks at a layer granularity. Our theoretical analysis demonstrates that in special cases LoD recovers the SVD decomposition and PCA. Applied to DNNs, Maestro enables the extraction of lower footprint models that preserve performance. Simultaneously, it enables the graceful trade-off between accuracy-latency for deployment to even more constrained devices without retraining.
- North America > United States (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- (2 more...)
Amazon debuts a generative AI-powered playlist feature
Amazon Music is joining Spotify in starting to offer a generative AI-powered playlist feature. For now, Maestro is available in beta to a small number of Amazon Music users in the US on iOS and Android. Folks who are included in the beta will see Maestro on the home screen after they update to the latest version of the app. They can also access the tool by tapping the plus button to create a new playlist. The idea is to use natural language prompts to create any kind of playlist imaginable.
- North America > United States (0.27)
- Oceania > Australia (0.07)
- Europe > United Kingdom (0.07)
- Media > Music (0.43)
- Leisure & Entertainment (0.43)