optimal cost
LEXICON: a Benchmark for Planning under Temporal Constraints in Natural Language
Owing to their reasoning capabilities, large language models (LLMs) have been evaluated on planning tasks described in natural language. However, LLMs have largely been tested on planning domains without constraints. In order to deploy them in real-world settings where adherence to constraints, in particular safety constraints, is critical, we need to evaluate their performance on constrained planning tasks. We introduce LEXICON--a natural language-based (LEXI) constrained (CON) planning benchmark, consisting of a suite of environments, that can be used to evaluate the planning capabilities of LLMs in a principled fashion. The core idea behind LEXICON is to take existing planning environments and impose temporal constraints on the states.
A-MHA*: Anytime Multi-Heuristic A*
Natarajan, Ramkumar, Saleem, Muhammad Suhail, Xiao, William, Aine, Sandip, Choset, Howie, Likhachev, Maxim
Designing good heuristic functions for graph search requires adequate domain knowledge. It is often easy to design heuristics that perform well and correlate with the underlying true cost-to-go values in certain parts of the search space but these may not be admissible throughout the domain thereby affecting the optimality guarantees of the search. Bounded suboptimal search using several such partially good but inadmissible heuristics was developed in Multi-Heuristic A* (MHA*). Although MHA* leverages multiple inadmissible heuristics to potentially generate a faster suboptimal solution, the original version does not improve the solution over time. It is a one shot algorithm that requires careful setting of inflation factors to obtain a desired one time solution. In this work, we tackle this issue by extending MHA* to an anytime version that finds a feasible suboptimal solution quickly and continually improves it until time runs out. Our work is inspired from the Anytime Repairing A* (ARA*) algorithm. We prove that our precise adaptation of ARA* concepts in the MHA* framework preserves the original suboptimal and completeness guarantees and enhances MHA* to perform in an anytime fashion. Furthermore, we report the performance of A-MHA* in 3-D path planning domain and sliding tiles puzzle and compare against MHA* and other anytime algorithms.
Learning in Repeated Multi-Objective Stackelberg Games with Payoff Manipulation
Srisawad, Phurinut, Branke, Juergen, Tran-Thanh, Long
We study payoff manipulation in repeated multi-objective Stackelberg games, where a leader may strategically influence a follower's deterministic best response, e.g., by offering a share of their own payoff. We assume that the follower's utility function, representing preferences over multiple objectives, is unknown but linear, and its weight parameter must be inferred through interaction. This introduces a sequential decision-making challenge for the leader, who must balance preference elicitation with immediate utility maximisation. We formalise this problem and propose manipulation policies based on expected utility (EU) and long-term expected utility (longEU), which guide the leader in selecting actions and offering incentives that trade off short-term gains with long-term impact. We prove that under infinite repeated interactions, longEU converges to the optimal manipulation. Empirical results across benchmark environments demonstrate that our approach improves cumulative leader utility while promoting mutually beneficial outcomes, all without requiring explicit negotiation or prior knowledge of the follower's utility function.
A Regression Approach to Learning Augmented Online Algorithms (Supplementary)
K. Anand, R. Ge, A. Kumar, D. Panigrahi
Do the main claims made in the abstract and introduction accurately reflect the paper's Did you discuss any potential negative societal impacts of your work? Did you include complete proofs of all theoretical results? If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A] (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they Did you include the total amount of compute and the type of resources used (e.g., type Did you include any new assets either in the supplemental material or as a URL?[N/A] Did you discuss whether and how consent was obtained from people whose data you're If you used crowdsourcing or conducted research with human subjects... (a) In this section, we prove Theorems 6 and 12 which give upper bounds on the sample complexity in the standard and agnostic settings respectively. The following is a well-known result that relates covering numbers to the pseudo dimension (cf. A.1 The Standard Model: Proof of Theorem 6 First, we relate covering numbers to this error measure.
Facility Location Problem with Aleatory Agents
Auricchio, Gennaro, Zhang, Jie
In this paper, we introduce and study the Facility Location Problem with Aleatory Agents (FLPAA), where the facility accommodates n agents larger than the number of agents reporting their preferences, namely n_r. The spare capacity is used by n_u=n-n_r aleatory agents sampled from a probability distribution \mu. The goal of FLPAA is to find a location that minimizes the ex-ante social cost, which is the expected cost of the n_u agents sampled from \mu plus the cost incurred by the agents reporting their position. We investigate the mechanism design aspects of the FLPAA under the assumption that the Mechanism Designer (MD) lacks knowledge of the distribution $\mu$ but can query k quantiles of \mu. We explore the trade-off between acquiring more insights into the probability distribution and designing a better-performing mechanism, which we describe through the strong approximation ratio (SAR). The SAR of a mechanism measures the highest ratio between the cost of the mechanisms and the cost of the optimal solution on the worst-case input x and worst-case distribution \mu, offering a metric for efficiency that does not depend on \mu. We divide our study into four different information settings: the zero information case, in which the MD has access to no quantiles; the median information case, in which the MD has access to the median of \mu; the n_u-quantile information case, in which the MD has access to n_u quantiles of its choice, and the k-quantile information case, in which the MD has access to k
Cost-Based Semantics for Querying Inconsistent Weighted Knowledge Bases
Bienvenu, Meghyn, Bourgaux, Camille, Jean, Robin
In this paper, we explore a quantitative approach to querying inconsistent description logic knowledge bases. We consider weighted knowledge bases in which both axioms and assertions have (possibly infinite) weights, which are used to assign a cost to each interpretation based upon the axioms and assertions it violates. Two notions of certain and possible answer are defined by either considering interpretations whose cost does not exceed a given bound or restricting attention to optimal-cost interpretations. Our main contribution is a comprehensive analysis of the combined and data complexity of bounded cost satisfiability and certain and possible answer recognition, for description logics between ELbot and ALCO.
$A^*$ for Graphs of Convex Sets
Sundar, Kaarthik, Rathinam, Sivakumar
We present a novel algorithm that fuses the existing convex-programming based approach with heuristic information to find optimality guarantees and near-optimal paths for the Shortest Path Problem in the Graph of Convex Sets (SPP-GCS). Our method, inspired by $A^*$, initiates a best-first-like procedure from a designated subset of vertices and iteratively expands it until further growth is neither possible nor beneficial. Traditionally, obtaining solutions with bounds for an optimization problem involves solving a relaxation, modifying the relaxed solution to a feasible one, and then comparing the two solutions to establish bounds. However, for SPP-GCS, we demonstrate that reversing this process can be more advantageous, especially with Euclidean travel costs. In other words, we initially employ $A^*$ to find a feasible solution for SPP-GCS, then solve a convex relaxation restricted to the vertices explored by $A^*$ to obtain a relaxed solution, and finally, compare the solutions to derive bounds. We present numerical results to highlight the advantages of our algorithm over the existing approach in terms of the sizes of the convex programs solved and computation time.