Optimization
ScrewSplat: An End-to-End Method for Articulated Object Recognition
Kim, Seungyeon, Ha, Junsu, Kim, Young Hun, Lee, Yonghyeon, Park, Frank C.
Figure 1: Articulated object recognition by splatting screw axes and Gaussians.
Articulated objects with movable parts, such as doors, laptops, and drawers, are common in everyday environments, and manipulating them requires understanding both their 3D geometry and underlying kinematic structure (e.g., joint types and axes). While prior work has addressed this using large-scale datasets of 3D objects with annotated joint axes in supervised settings [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], such methods struggle to generalize to unseen categories, a natural limitation of supervised learning. In this work, we tackle a more challenging yet practical scenario: inferring kinematic structure directly from multi-view RGB images under varying object configurations, without relying on category-specific supervision (see the left of Figure 1). Spurred in part by the success of neural rendering-based 3D reconstruction methods that require no supervised training [12, 13, 14, 15], recent works have adapted these frameworks for articulated object recognition [16, 17, 18, 19, 20], achieving promising results using raw RGB observations. However, a key drawback of these methods lies in their reliance on strong assumptions, such as a known number of articulated components or predefined joint types.
Optimal Batch-Size Control for Low-Latency Federated Learning with Device Heterogeneity
Yang, Huiling, Wang, Zhanwei, Huang, Kaibin
Federated learning (FL) has emerged as a popular approach for collaborative machine learning in sixth-generation (6G) networks, primarily due to its privacy-preserving capabilities. The deployment of FL algorithms is expected to empower a wide range of Internet-of-Things (IoT) applications, e.g., autonomous driving, augmented reality, and healthcare. The mission-critical and time-sensitive nature of these applications necessitates the design of low-latency FL frameworks that guarantee high learning performance. In practice, achieving low-latency FL faces two challenges: the overhead of computing and transmitting high-dimensional model updates, and the heterogeneity in communication-and-computation (C$^2$) capabilities across devices. To address these challenges, we propose a novel C$^2$-aware framework for optimal batch-size control that minimizes end-to-end (E2E) learning latency while ensuring convergence. The framework is designed to balance a fundamental C$^2$ tradeoff as revealed through convergence analysis. Specifically, increasing batch sizes improves the accuracy of gradient estimation in FL and thus reduces the number of communication rounds required for convergence, but results in higher per-round latency, and vice versa. The associated problem of latency minimization is intractable; however, we solve it by designing an accurate and tractable surrogate for convergence speed, with parameters fitted to real data. This approach yields two batch-size control strategies tailored to scenarios with slow and fast fading, while also accommodating device heterogeneity. Extensive experiments using real datasets demonstrate that the proposed strategies outperform conventional batch-size adaptation schemes that do not consider the C$^2$ tradeoff or device heterogeneity.
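The C$^2$ tradeoff described above can be illustrated with a toy model. The sketch below is not the paper's surrogate or its control strategies: it assumes a hypothetical rounds-to-converge model `a + c / b` for batch size `b`, a per-round latency of `t_comm + t_comp * b`, and a single device (no heterogeneity, no fading). All function names and constants are made up for illustration.

```python
import math


def e2e_latency(batch_size, a, c, t_comm, t_comp):
    """End-to-end latency under an ASSUMED toy model (not the paper's surrogate).

    rounds    = a + c / batch_size           (larger batches need fewer rounds)
    per_round = t_comm + t_comp * batch_size (larger batches take longer per round)
    """
    rounds = a + c / batch_size
    per_round = t_comm + t_comp * batch_size
    return rounds * per_round


def best_batch_size(a, c, t_comm, t_comp, max_batch=512):
    """Grid search over integer batch sizes 1..max_batch."""
    return min(
        range(1, max_batch + 1),
        key=lambda b: e2e_latency(b, a, c, t_comm, t_comp),
    )


def closed_form_batch_size(a, c, t_comm, t_comp):
    """Stationary point of the toy objective: b* = sqrt(c * t_comm / (a * t_comp))."""
    return math.sqrt(c * t_comm / (a * t_comp))


if __name__ == "__main__":
    # Hypothetical constants chosen only to make the arithmetic easy to follow.
    params = dict(a=100, c=6400, t_comm=1.0, t_comp=0.01)
    print(best_batch_size(**params))         # 80 under these made-up constants
    print(closed_form_batch_size(**params))  # 80.0
```

In this toy model, a higher communication cost `t_comm` pushes the best batch size up (fewer, heavier rounds), while a higher per-sample compute cost `t_comp` pushes it down, which is the balance the abstract describes.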
Mergenetic: a Simple Evolutionary Model Merging Library
Minut, Adrian Robert, Mencattini, Tommaso, Santilli, Andrea, Crisostomi, Donato, Rodolà, Emanuele
Model merging combines the capabilities of existing models into a new one post hoc, without additional training. Its low cost and the availability of libraries that support merging on consumer GPUs have made it increasingly popular. Recent work shows that pairing merging with evolutionary algorithms can boost performance, but no framework currently supports flexible experimentation with such strategies in language models. We introduce Mergenetic, an open-source library for evolutionary model merging. Mergenetic enables easy composition of merging methods and evolutionary algorithms while incorporating lightweight fitness estimators to reduce evaluation costs. We describe its design and demonstrate that Mergenetic produces competitive results across tasks and languages using modest hardware.
CO-Bench: Benchmarking Language Model Agents in Algorithm Search for Combinatorial Optimization
Sun, Weiwei, Feng, Shengyu, Li, Shanda, Yang, Yiming
Although LLM-based agents have attracted significant attention in domains such as software engineering and machine learning research, their role in advancing combinatorial optimization (CO) remains relatively underexplored. This gap underscores the need for a deeper understanding of their potential in tackling structured, constraint-intensive problems -- a pursuit currently limited by the absence of comprehensive benchmarks for systematic investigation. To address this, we introduce CO-Bench, a benchmark suite featuring 36 real-world CO problems drawn from a broad range of domains and complexity levels. CO-Bench includes structured problem formulations and curated data to support rigorous investigation of LLM agents. We evaluate multiple agentic frameworks against established human-designed algorithms, revealing the strengths and limitations of existing LLM agents and identifying promising directions for future research. CO-Bench is publicly available at https://github.com/sunnweiwei/CO-Bench.
Comparative Explanations: Explanation Guided Decision Making for Human-in-the-Loop Preference Selection
Chakraborty, Tanmay, Wirth, Christian, Seifert, Christin
This paper introduces Multi-Output LOcal Narrative Explanation (MOLONE), a novel comparative explanation method designed to enhance preference selection in human-in-the-loop Preference Bayesian optimization (PBO). Preference elicitation in PBO is non-trivial because it involves navigating implicit trade-offs between vector-valued outcomes, the subjective priorities of decision-makers, and decision-makers' uncertainty in preference selection. Existing explainable AI (XAI) methods for BO primarily focus on input feature importance, neglecting the crucial role of outputs (objectives) in human preference elicitation. MOLONE addresses this gap by providing explanations that highlight both input and output importance, enabling decision-makers to understand the trade-offs between competing objectives and make more informed preference selections. MOLONE focuses on local explanations, comparing the importance of input features and outcomes across candidate samples within a local neighborhood of the search space, thus capturing nuanced differences relevant to preference-based decision-making. We evaluate MOLONE within a PBO framework using benchmark multi-objective optimization functions, demonstrating its effectiveness in improving convergence compared to noisy preference selections. Furthermore, a user study confirms that MOLONE significantly accelerates convergence in human-in-the-loop scenarios by facilitating more efficient identification of preferred options.
5c5bc7df3d37b2a7ea29e1b47b2bd4ab-Paper.pdf
Most real-world applications must deal with stochasticity, such as sensor noise or predictive uncertainty, so formal specifications of desired behavior are inherently probabilistic. Despite the promise of formal verification in ensuring the reliability of neural networks, progress on probabilistic specifications has been limited.
Stateful Strategic Regression
A recent line of research investigates how strategic agents may respond to such scoring tools to receive favorable assessments. While prior work has focused on the short-term strategic interactions between a decision-making institution (modeled as a principal) and individual decision-subjects (modeled as agents), we investigate interactions spanning multiple time-steps. In particular, we consider settings in which the agent's effort investment