 orchestrator


Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Neural Information Processing Systems

Federated learning (FL) "involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized" [56] because of privacy concerns or limited communication resources. Hence, clients only communicate with a potentially far-away (e.g., in another continent) orchestrator and do not ... Recent experimental and theoretical work suggests that, in practice, the first effect has been over-estimated by classic worst-case convergence bounds.


Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Neural Information Processing Systems

Federated learning usually employs a client-server architecture where an orchestrator iteratively aggregates model updates from remote clients and pushes a refined model back to them. This approach may be inefficient in cross-silo settings, as close-by data silos with high-speed access links may exchange information faster than with the orchestrator, and the orchestrator may become a communication bottleneck. In this paper, we define the problem of topology design for cross-silo federated learning using the theory of max-plus linear systems to compute the system throughput, that is, the number of communication rounds per time unit. We also propose practical algorithms that, under the knowledge of measurable network characteristics, find a topology with the largest throughput or with provable throughput guarantees.
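To make the throughput notion above concrete, here is a minimal sketch, not the paper's algorithm. It relies on a standard max-plus result: for a strongly connected communication graph, the time per round (cycle time) equals the maximum cycle mean of the delay graph, and the throughput is its reciprocal. The cycle mean is computed with Karp's algorithm. The function names and the delay values are hypothetical, chosen only for illustration.

```python
import math

NEG_INF = -math.inf


def max_cycle_mean(delay):
    """Maximum cycle mean of a strongly connected weighted digraph (Karp).

    delay[u][v] is the (hypothetical) delay of arc u -> v, or -inf if absent.
    """
    n = len(delay)
    # best[k][v]: largest total delay of a walk with exactly k arcs from node 0 to v
    best = [[NEG_INF] * n for _ in range(n + 1)]
    best[0][0] = 0.0
    for k in range(1, n + 1):
        for u in range(n):
            if best[k - 1][u] == NEG_INF:
                continue
            for v in range(n):
                if delay[u][v] != NEG_INF:
                    best[k][v] = max(best[k][v], best[k - 1][u] + delay[u][v])

    result = NEG_INF
    for v in range(n):
        if best[n][v] == NEG_INF:
            continue
        # Karp's formula (max version): max over v of min over k
        candidate = min(
            (best[n][v] - best[k][v]) / (n - k)
            for k in range(n)
            if best[k][v] != NEG_INF
        )
        result = max(result, candidate)
    return result


def throughput(delay):
    """Communication rounds per time unit = 1 / cycle time."""
    return 1.0 / max_cycle_mean(delay)


# Hypothetical 3-silo directed ring 0 -> 1 -> 2 -> 0 with delays 1, 2, 3.
ring = [[NEG_INF] * 3 for _ in range(3)]
ring[0][1], ring[1][2], ring[2][0] = 1.0, 2.0, 3.0

# Same ring plus a slow reverse link 1 -> 0 with delay 5.
ring_plus = [row[:] for row in ring]
ring_plus[1][0] = 5.0

print(max_cycle_mean(ring), throughput(ring))
print(max_cycle_mean(ring_plus), throughput(ring_plus))
```

In this toy example, adding the slow reverse link creates a two-arc cycle with a larger mean delay than the ring itself, so the cycle time grows and the throughput drops. This illustrates why the choice of overlay topology matters.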


An End-to-end Planning Framework with Agentic LLMs and PDDL

La Malfa, Emanuele, Zhu, Ping, Marro, Samuele, Bernardini, Sara, Wooldridge, Michael

arXiv.org Artificial Intelligence

We present an end-to-end framework for planning supported by verifiers. An orchestrator receives a human specification written in natural language and converts it into a PDDL (Planning Domain Definition Language) model, where the domain and problem are iteratively refined by sub-modules (agents) to address common planning requirements, such as time constraints and optimality, as well as ambiguities and contradictions that may exist in the human specification. The validated domain and problem are then passed to an external planning engine to generate a plan. The orchestrator and agents are powered by Large Language Models (LLMs) and require no human intervention at any stage of the process. Finally, a module translates the final plan back into natural language to improve human readability while maintaining the correctness of each step. We demonstrate the flexibility and effectiveness of our framework across various domains and tasks, including the Google NaturalPlan benchmark and PlanBench, as well as planning problems like Blocksworld and the Tower of Hanoi (where LLMs are known to struggle even with small instances). Our framework can be integrated with any PDDL planning engine and validator (such as Fast Downward, LPG, POPF, VAL, and uVAL, which we have tested) and represents a significant step toward end-to-end planning aided by LLMs.


Designing for Distributed Heterogeneous Modularity: On Software Architecture and Deployment of MoonBots

Neppel, Elian, Karimov, Shamistan, Mishra, Ashutosh, Huenupan, Gustavo Hernan Diaz, Gozbasi, Hazal, Uno, Kentaro, Santra, Shreya, Yoshida, Kazuya

arXiv.org Artificial Intelligence

This paper presents the software architecture and deployment strategy behind the MoonBot platform: a modular space robotic system composed of heterogeneous components distributed across multiple computers, networks and ultimately celestial bodies. We introduce a principled approach to distributed, heterogeneous modularity, extending modular robotics beyond physical reconfiguration to software, communication and orchestration. We detail the architecture of our system that integrates component-based design, a data-oriented communication model using ROS2 and Zenoh, and a deployment orchestrator capable of managing complex multi-module assemblies. These abstractions enable dynamic reconfiguration, decentralized control, and seamless collaboration between numerous operators and modules. At the heart of this system lies our open-source Motion Stack software, validated by months of field deployment with self-assembling robots, inter-robot cooperation, and remote operation. Our architecture tackles the major hurdles of modular robotics by significantly reducing integration and maintenance overhead, while remaining scalable and robust. Although tested with space in mind, we propose generalizable patterns for designing robotic systems that must scale across time, hardware, teams and operational environments.


Beyond Connectivity: An Open Architecture for AI-RAN Convergence in 6G

Polese, Michele, Mohamadi, Niloofar, D'Oro, Salvatore, Bonati, Leonardo, Melodia, Tommaso

arXiv.org Artificial Intelligence

Data-intensive Artificial Intelligence (AI) applications at the network edge demand a fundamental shift in Radio Access Network (RAN) design, from merely consuming AI for network optimization, to actively enabling distributed AI workloads. This presents a significant opportunity for network operators to monetize AI while leveraging existing infrastructure. To realize this vision, this article presents a novel converged O-RAN and AI-RAN architecture for unified orchestration and management of telecommunications and AI workloads on shared infrastructure. The proposed architecture extends the Open RAN principles of modularity, disaggregation, and cloud-nativeness to support heterogeneous AI deployments. We introduce two key architectural innovations: (i) the AI-RAN Orchestrator, which extends the O-RAN Service Management and Orchestration (SMO) to enable integrated resource allocation across RAN and AI workloads; and (ii) AI-RAN sites that provide distributed edge AI platforms with real-time processing capabilities. The proposed architecture enables flexible orchestration, meeting requirements for managing heterogeneous workloads at different time scales while maintaining open, standardized interfaces and multi-vendor interoperability. This paper has been submitted to IEEE for publication. M. Polese, L. Bonati, and T. Melodia are with the Institute for the Wireless Internet of Things, Northeastern University, Boston, MA, USA. This article is based upon work partially supported by the NTIA PWSCIF under Award No. 25-60-IF054, the U.S. NSF under award CNS-2112471, and by OUSD(R&E) through Army Research Laboratory Cooperative Agreement Number W911NF-24-2-0065.


AgentODRL: A Large Language Model-based Multi-agent System for ODRL Generation

Zhong, Wanle, Huang, Keman, Du, Xiaoyong

arXiv.org Artificial Intelligence

The Open Digital Rights Language (ODRL) is a pivotal standard for automating data rights management. However, the inherent logical complexity of authorization policies, combined with the scarcity of high-quality "Natural Language-to-ODRL" training datasets, impedes the ability of current methods to efficiently and accurately translate complex rules from natural language into the ODRL format. To address this challenge, this research leverages the potent comprehension and generation capabilities of Large Language Models (LLMs) to achieve both automation and high fidelity in this translation process. We introduce AgentODRL, a multi-agent system based on an Orchestrator-Workers architecture. The architecture consists of specialized Workers, including a Generator for ODRL policy creation, a Decomposer for breaking down complex use cases, and a Rewriter for simplifying nested logical relationships. The Orchestrator agent dynamically coordinates these Workers, assembling an optimal pathway based on the complexity of the input use case. Specifically, we enhance the ODRL Generator by incorporating a validator-based syntax strategy and a semantic reflection mechanism powered by a LoRA-finetuned model, significantly elevating the quality of the generated policies. Extensive experiments were conducted on a newly constructed dataset comprising 770 use cases of varying complexity, all situated within the context of data spaces. The results, evaluated using ODRL syntax and semantic scores, demonstrate that our proposed Orchestrator-Workers system, enhanced with these strategies, achieves superior performance on the ODRL generation task.


Joint Partitioning and Placement of Foundation Models for Real-Time Edge AI

Djuhera, Aladin, Koch, Fernando, Binotto, Alecio

arXiv.org Artificial Intelligence

Static partitioning of model layers presumes temporal stability across compute and network resources, which is misaligned with the volatility of real-world deployments. We introduce a framework in which both the spatial placement and internal segmentation of foundation models are elevated to runtime-resolved constructs. The orchestration problem is formalized as a constrained optimization over layer-wise assignments, subject to evolving latency, utilization, and privacy gradients. The framework implements reactive inference composition responsive to infrastructural fluctuations by integrating model-aware capacity profiling with dynamic graph re-partitioning and reallocation. We introduce architectural and algorithmic components, along with a representative use case in 6G multi-access edge computing.


Energy-Aware Data-Driven Model Selection in LLM-Orchestrated AI Systems

Smirnova, Daria, Nasiri, Hamid, Adamska, Marta, Yu, Zhengxin, Garraghan, Peter

arXiv.org Artificial Intelligence

As modern artificial intelligence (AI) systems become more advanced and capable, they can leverage a wide range of tools and models to perform complex tasks. Today, the task of orchestrating these models is often performed by Large Language Models (LLMs) that rely on qualitative descriptions of models for decision-making. However, the descriptions provided to these LLM-based orchestrators do not reflect true model capabilities and performance characteristics, leading to suboptimal model selection, reduced accuracy, and increased energy costs. In this paper, we conduct an empirical analysis of LLM-based orchestration limitations and propose GUIDE, a new energy-aware model selection framework that accounts for performance-energy trade-offs by incorporating quantitative model performance characteristics in decision-making. Experimental results demonstrate that GUIDE increases accuracy by 0.90%-11.92% across various evaluated tasks, and achieves up to 54% energy efficiency improvement, while reducing orchestrator model selection latency from 4.51 s to 7.2 ms.


Multilingual Conversational AI for Financial Assistance: Bridging Language Barriers in Indian FinTech

Hazarika, Bharatdeep, Suneesh, Arya, Devadiga, Prasanna, Rajpoot, Pawan Kumar, Suresh, Anshuman B, Hussain, Ahmed Ifthaquar

arXiv.org Artificial Intelligence

India's linguistic diversity presents both opportunities and challenges for fintech platforms. While the country has 31 major languages and over 100 minor ones, only 10% of the population understands English, creating barriers to financial inclusion. We present a multilingual conversational AI system for a financial assistance use case that supports code-mixed languages like Hinglish, enabling natural interactions for India's diverse user base. Our system employs a multi-agent architecture with language classification, function management, and multilingual response generation. Through comparative analysis of multiple language models and real-world deployment, we demonstrate significant improvements in user engagement while maintaining low latency overhead (4-8%). This work contributes to bridging the language gap in digital financial services for emerging markets.


ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Su, Hongjin, Diao, Shizhe, Lu, Ximing, Liu, Mingjie, Xu, Jiacheng, Dong, Xin, Fu, Yonggan, Belcak, Peter, Ye, Hanrong, Yin, Hongxu, Dong, Yi, Bakhturina, Evelina, Yu, Tao, Choi, Yejin, Kautz, Jan, Molchanov, Pavlo

arXiv.org Artificial Intelligence

Large language models are powerful generalists, yet solving deep and complex problems such as those in Humanity's Last Exam (HLE) remains both conceptually challenging and computationally expensive. We show that small orchestrators managing other models and a variety of tools can both push the upper bound of intelligence and improve efficiency in solving difficult agentic tasks. We introduce ToolOrchestra, a method for training small orchestrators that coordinate intelligent tools. ToolOrchestra explicitly uses reinforcement learning with outcome-, efficiency-, and user-preference-aware rewards. Using ToolOrchestra, we produce Orchestrator, an 8B model that achieves higher accuracy at lower cost than previous tool-use agents while aligning with user preferences on which tools are to be used for a given query. On HLE, Orchestrator achieves a score of 37.1%, outperforming GPT-5 (35.1%) while being 2.5x more efficient. On tau2-Bench and FRAMES, Orchestrator surpasses GPT-5 by a wide margin while using only about 30% of the cost. Extensive analysis shows that Orchestrator achieves the best trade-off between performance and cost under multiple metrics, and generalizes robustly to unseen tools. These results demonstrate that composing diverse tools with a lightweight orchestration model is both more efficient and more effective than existing methods, paving the way for practical and scalable tool-augmented reasoning systems.