Optimization
ATR-Bench: A Federated Learning Benchmark for Adaptation, Trust, and Reasoning
Ashraf, Tajamul, Peerzada, Mohammed Mohsen, Abdar, Moloud, Xie, Yutong, Zhou, Yuyin, Liu, Xiaofeng, Gillani, Iqra Altaf, Bashir, Janibul
Federated Learning (FL) has emerged as a promising paradigm for collaborative model training while preserving data privacy across decentralized participants. As FL adoption grows, numerous techniques have been proposed to tackle its practical challenges. However, the lack of standardized evaluation across key dimensions hampers systematic progress and fair comparison of FL methods. In this work, we introduce ATR-Bench, a unified framework for analyzing federated learning through three foundational dimensions: Adaptation, Trust, and Reasoning. We provide an in-depth examination of the conceptual foundations, task formulations, and open research challenges associated with each theme. We have extensively benchmarked representative methods and datasets for adaptation to heterogeneous clients and trustworthiness in adversarial or unreliable environments. Due to the lack of reliable metrics and models for reasoning in FL, we only provide literature-driven insights for this dimension. ATR-Bench lays the groundwork for a systematic and holistic evaluation of federated learning with real-world relevance. We will make our complete codebase publicly accessible and a curated repository that continuously tracks new developments and research in the FL literature.
Causal Predictive Optimization and Generation for Business AI
Zhao, Liyang, Seton, Olurotimi, Reddivari, Himadeep Reddy, Jena, Suvendu, Zhao, Shadow, Kumar, Rachit, Wei, Changshuai
The sales process involves sales functions converting leads or opportunities to customers and selling more products to existing customers. The optimization of the sales process thus is key to success of any B2B business. In this work, we introduce a principled approach to sales optimization and business AI, namely the Causal Predictive Optimization and Generation, which includes three layers: 1) prediction layer with causal ML 2) optimization layer with constraint optimization and contextual bandit 3) serving layer with Generative AI and feedback-loop for system enhancement. We detail the implementation and deployment of the system in LinkedIn, showcasing significant wins over legacy systems and sharing learning and insight broadly applicable to this field.
Stochastic Processes with Modified Lognormal Distribution Featuring Flexible Upper Tail
Hristopulos, Dionissios T., Baxevani, Anastassia, Kaniadakis, Giorgio
Asymmetric, non-Gaussian probability distributions are often observed in the analysis of natural and engineering datasets. The lognormal distribution is a standard model for data with skewed frequency histograms and fat tails. However, the lognormal law severely restricts the asymptotic dependence of the probability density and the hazard function for high values. Herein we present a family of three-parameter non-Gaussian probability density functions that are based on generalized kappa-exponential and kappa-logarithm functions and investigate its mathematical properties. These kappa-lognormal densities represent continuous deformations of the lognormal with lighter right tails, controlled by the parameter kappa. In addition, bimodal distributions are obtained for certain parameter combinations. We derive closed-form analytic expressions for the main statistical functions of the kappa-lognormal distribution. For the moments, we derive bounds that are based on hypergeometric functions as well as series expansions. Explicit expressions for the gradient and Hessian of the negative log-likelihood are obtained to facilitate numerical maximum-likelihood estimates of the kappa-lognormal parameters from data. We also formulate a joint probability density function for kappa-lognormal stochastic processes by applying Jacobi's multivariate theorem to a latent Gaussian process. Estimation of the kappa-lognormal distribution based on synthetic and real data is explored. Furthermore, we investigate applications of kappa-lognormal processes with different covariance kernels in time series forecasting and spatial interpolation using warped Gaussian process regression. Our results are of practical interest for modeling skewed distributions in various scientific and engineering fields.
Policy Testing in Markov Decision Processes
Ariu, Kaito, Wang, Po-An, Proutiere, Alexandre, Abe, Kenshi
We study the policy testing problem in discounted Markov decision processes (MDPs) under the fixed-confidence setting. The goal is to determine whether the value of a given policy exceeds a specified threshold while minimizing the number of observations. We begin by deriving an instance-specific lower bound that any algorithm must satisfy. This lower bound is characterized as the solution to an optimization problem with non-convex constraints. We propose a policy testing algorithm inspired by this optimization problem--a common approach in pure exploration problems such as best-arm identification, where asymptotically optimal algorithms often stem from such optimization-based characterizations. As for other pure exploration tasks in MDPs, however, the non-convex constraints in the lower-bound problem present significant challenges, raising doubts about whether statistically optimal and computationally tractable algorithms can be designed. To address this, we reformulate the lower-bound problem by interchanging the roles of the objective and the constraints, yielding an alternative problem with a non-convex objective but convex constraints. Strikingly, this reformulated problem admits an interpretation as a policy optimization task in a newly constructed reversed MDP. Leveraging recent advances in policy gradient methods, we efficiently solve this problem and use it to design a policy testing algorithm that is statistically optimal--matching the instance-specific lower bound on sample complexity--while remaining computationally tractable. We validate our approach with numerical experiments.
Aligning Explanations with Human Communication
Teneggi, Jacopo, Wang, Zhenzhen, Yi, Paul H., Shu, Tianmin, Sulam, Jeremias
Machine learning explainability aims to make the decision-making process of black-box models more transparent by finding the most important input features for a given prediction task. Recent works have proposed composing explanations from semantic concepts (e.g., colors, patterns, shapes) that are inherently interpretable to the user of a model. However, these methods generally ignore the communicative context of explanation-the ability of the user to understand the prediction of the model from the explanation. For example, while a medical doctor might understand an explanation in terms of clinical markers, a patient may need a more accessible explanation to make sense of the same diagnosis. In this paper, we address this gap with listener-adaptive explanations. We propose an iterative procedure grounded in principles of pragmatic reasoning and the rational speech act to generate explanations that maximize communicative utility. Our procedure only needs access to pairwise preferences between candidate explanations, relevant in real-world scenarios where a listener model may not be available. We evaluate our method in image classification tasks, demonstrating improved alignment between explanations and listener preferences across three datasets. Furthermore, we perform a user study that demonstrates our explanations increase communicative utility.
Fast Rate Bounds for Multi-Task and Meta-Learning with Different Sample Sizes
Zakerinia, Hossein, Lampert, Christoph H.
We present new fast-rate generalization bounds for multi-task and meta-learning in the unbalanced setting, i.e. when the tasks have training sets of different sizes, as is typically the case in real-world scenarios. Previously, only standard-rate bounds were known for this situation, while fast-rate bounds were limited to the setting where all training sets are of equal size. Our new bounds are numerically computable as well as interpretable, and we demonstrate their flexibility in handling a number of cases where they give stronger guarantees than previous bounds. Besides the bounds themselves, we also make conceptual contributions: we demonstrate that the unbalanced multi-task setting has different statistical properties than the balanced situation, specifically that proofs from the balanced situation do not carry over to the unbalanced setting. Additionally, we shed light on the fact that the unbalanced situation allows two meaningful definitions of multi-task risk, depending on whether if all tasks should be considered equally important or if sample-rich tasks should receive more weight than sample-poor ones.
Human in the Loop Adaptive Optimization for Improved Time Series Forecasting
Tiomoko, Malik, Cherkaoui, Hamza, Paolo, Giuseppe, Yili, Zhang, Meng, Yu, Keli, Zhang, Ali, Hafiz Tiomoko
Time series forecasting models often produce systematic, predictable errors even in critical domains such as energy, finance, and healthcare. We introduce a novel post training adaptive optimization framework that improves forecast accuracy without retraining or architectural changes. Our method automatically applies expressive transformations optimized via reinforcement learning, contextual bandits, or genetic algorithms to correct model outputs in a lightweight and model agnostic way. Theoretically, we prove that affine corrections always reduce the mean squared error; practically, we extend this idea with dynamic action based optimization. The framework also supports an optional human in the loop component: domain experts can guide corrections using natural language, which is parsed into actions by a language model. Across multiple benchmarks (e.g., electricity, weather, traffic), we observe consistent accuracy gains with minimal computational overhead. Our interactive demo shows the framework's real time usability. By combining automated post hoc refinement with interpretable and extensible mechanisms, our approach offers a powerful new direction for practical forecasting systems.
SwarmDiff: Swarm Robotic Trajectory Planning in Cluttered Environments via Diffusion Transformer
Ding, Kang, Jiao, Chunxuan, Hu, Yunze, Zhou, Kangjie, Wu, Pengying, Mu, Yao, Liu, Chang
Swarm robotic trajectory planning faces challenges in computational efficiency, scalability, and safety, particularly in complex, obstacle-dense environments. To address these issues, we propose SwarmDiff, a hierarchical and scalable generative framework for swarm robots. We model the swarm's macroscopic state using Probability Density Functions (PDFs) and leverage conditional diffusion models to generate risk-aware macroscopic trajectory distributions, which then guide the generation of individual robot trajectories at the microscopic level. To ensure a balance between the swarm's optimal transportation and risk awareness, we integrate Wasserstein metrics and Conditional Value at Risk (CVaR). Additionally, we introduce a Diffusion Transformer (DiT) to improve sampling efficiency and generation quality by capturing long-range dependencies. Extensive simulations and real-world experiments demonstrate that SwarmDiff outperforms existing methods in computational efficiency, trajectory validity, and scalability, making it a reliable solution for swarm robotic trajectory planning.
Design of a 3-DOF Hopping Robot with an Optimized Gearbox: An Intermediate Platform Toward Bipedal Robots
Choe, JongHun, Kim, Gijeong, Kim, Hajun, Kang, Dongyun, Kim, Min-Su, Park, Hae-Won
-- This paper presents a 3-DOF hopping robot with a human-like lower-limb joint configuration and a flat foot, capable of performing dynamic and repetitive jumping motions. T o achieve both high torque output and a large hollow shaft diameter for efficient cable routing, a compact 3K compound planetary gearbox was designed using mixed-integer nonlinear programming for gear tooth optimization. T o meet performance requirements within the constrained joint geometry, all major components--including the actuator, motor driver, and communication interface--were custom-designed. The robot weighs 12.45 kg, including a dummy mass, and measures 840 mm in length when the knee joint is fully extended. A reinforcement learning-based controller was employed, and the robot's performance was validated through hardware experiments, demonstrating stable and repetitive hopping motions in response to user inputs. These experimental results indicate that the platform serves as a solid foundation for future bipedal robot development. A supplementary video is available at: https://youtu.be/BZ2H0dQBcXc
Average Reward Reinforcement Learning for Omega-Regular and Mean-Payoff Objectives
Kazemi, Milad, Perez, Mateo, Somenzi, Fabio, Soudjani, Sadegh, Trivedi, Ashutosh, Velasquez, Alvaro
Recent advances in reinforcement learning (RL) have renewed focus on the design of reward functions that shape agent behavior. Manually designing reward functions is tedious and error-prone. A principled alternative is to specify behaviors in a formal language that can be automatically translated into rewards. Omega-regular languages are a natural choice for this purpose, given their established role in formal verification and synthesis. However, existing methods using omega-regular specifications typically rely on discounted reward RL in episodic settings, with periodic resets. This setup misaligns with the semantics of omega-regular specifications, which describe properties over infinite behavior traces. In such cases, the average reward criterion and the continuing setting -- where the agent interacts with the environment over a single, uninterrupted lifetime -- are more appropriate. To address the challenges of infinite-horizon, continuing tasks, we focus on absolute liveness specifications -- a subclass of omega-regular languages that cannot be violated by any finite behavior prefix, making them well-suited to the continuing setting. We present the first model-free RL framework that translates absolute liveness specifications to average-reward objectives. Our approach enables learning in communicating MDPs without episodic resetting. We also introduce a reward structure for lexicographic multi-objective optimization, aiming to maximize an external average-reward objective among the policies that also maximize the satisfaction probability of a given omega-regular specification. Our method guarantees convergence in unknown communicating MDPs and supports on-the-fly reductions that do not require full knowledge of the environment, thus enabling model-free RL. Empirical results show our average-reward approach in continuing setting outperforms discount-based methods across benchmarks.