Goto

Collaborating Authors

 algorithm return


Near-Optimal Sample Complexity for Online Constrained MDPs

Neural Information Processing Systems

Safety is a fundamental challenge in reinforcement learning (RL), particularly in real-world applications such as autonomous driving, robotics, and healthcare. To address this, Constrained Markov Decision Processes (CMDPs) are commonly used to enforce safety constraints while optimizing performance. However, existing methods often suffer from significant safety violations or require a high sample complexity to generate near-optimal policies. We address two settings: relaxed feasibility, where small violations are allowed, and strict feasibility, where no violation is allowed. We propose a model-based primal-dual algorithm that balances regret and bounded constraint violations, drawing on techniques from online RL and constrained optimization.


Hierarchical Clustering via Spreading Metrics

Neural Information Processing Systems

We study the cost function for hierarchical clusterings introduced by [Dasgupta, 2015] where hierarchies are treated as first-class objects rather than deriving their cost from projections into flat clusters. It was also shown in [Dasgupta, 2015] that a top-down algorithm returns a hierarchical clustering of cost at most (O\left(\alpha n) is the approximation ratio of the Sparsest Cut subroutine used. Thus using the best known approximation algorithm for Sparsest Cut due to Arora-Rao-Vazirani, the top down algorithm returns a hierarchical clustering of cost at most (O\left(\log^{3/2} n\right)) times the cost of the optimal solution. We improve this by giving an (O(\log{n}))-approximation algorithm for this problem. Our main technical ingredients are a combinatorial characterization of ultrametrics induced by this cost function, deriving an Integer Linear Programming (ILP) formulation for this family of ultrametrics, and showing how to iteratively round an LP relaxation of this formulation by using the idea of \emph{sphere growing} which has been extensively used in the context of graph partitioning. We also prove that our algorithm returns an (O(\log{n}))-approximate hierarchical clustering for a generalization of this cost function also studied in [Dasgupta, 2015]. Experiments show that the hierarchies found by using the ILP formulation as well as our rounding algorithm often have better projections into flat clusters than the standard linkage based algorithms. We conclude with an inapproximability result for this problem, namely that no polynomial sized LP or SDP can be used to obtain a constant factor approximation for this problem.



Efficient Algorithms for Electing Successive Committees

arXiv.org Artificial Intelligence

In a recently introduced model of successive committee elections (Bredereck et al., AAAI-20) for a given set of ordinal or approval preferences one aims to find a sequence of a given length of "best" same-size committees such that each candidate is a member of a limited number of consecutive committees. However, the practical usability of this model remains limited, as the described task turns out to be NP-hard for most selection criteria already for seeking committees of size three. Non-trivial or somewhat efficient algorithms for these cases are lacking too. Motivated by a desire to unlock the full potential of the described temporal model of committee elections, we devise (parameterized) algorithms that effectively solve the mentioned hard cases in realistic scenarios of a moderate number of candidates or of a limited time horizon.


Hierarchical Clustering via Spreading Metrics

Neural Information Processing Systems

We study the cost function for hierarchical clusterings introduced by [Dasgupta, 2015] where hierarchies are treated as first-class objects rather than deriving their cost from projections into flat clusters. It was also shown in [Dasgupta, 2015] that a top-down algorithm returns a hierarchical clustering of cost at most \(O\left(\alpha_n \log n\right)\) times the cost of the optimal hierarchical clustering, where \(\alpha_n\) is the approximation ratio of the Sparsest Cut subroutine used. Thus using the best known approximation algorithm for Sparsest Cut due to Arora-Rao-Vazirani, the top down algorithm returns a hierarchical clustering of cost at most \(O\left(\log {3/2} n\right)\) times the cost of the optimal solution. We improve this by giving an \(O(\log{n})\)-approximation algorithm for this problem. Our main technical ingredients are a combinatorial characterization of ultrametrics induced by this cost function, deriving an Integer Linear Programming (ILP) formulation for this family of ultrametrics, and showing how to iteratively round an LP relaxation of this formulation by using the idea of \emph{sphere growing} which has been extensively used in the context of graph partitioning.


Approximate Equilibrium and Incentivizing Social Coordination

AAAI Conferences

We study techniques to incentivize self-interested agents to form socially desirable solutions in scenarios where they benefit from mutual coordination. Towards this end, we consider coordination games where agents have different intrinsic preferences but they stand to gain if others choose the same strategy as them. For non-trivial versions of our game, stable solutions like Nash Equilibrium may not exist, or may be socially inefficient even when they do exist. This motivates us to focus on designing efficient algorithms to compute (almost) stable solutions like Approximate Equilibrium that can be realized if agents are provided some additional incentives. Our results apply in many settings like adoption of new products, project selection, and group formation, where a central authority can direct agents towards a strategy but agents may defect if they have better alternatives. We show that for any given instance, we can either compute a high quality approximate equilibrium or a near-optimal solution that can be stabilized by providing small payments to some players. Our results imply that a little influence is necessary in order to ensure that selfish players coordinate and form socially efficient solutions.