Strategyproof Facility Location for Five Agents on a Circle using PCD

Farjoun, Ido, Meir, Reshef

arXiv.org Artificial Intelligence

We consider the strategyproof facility location problem on a circle. We focus on the case of 5 agents and find a tight bound for the PCD strategyproof mechanism, which selects the reported location of an agent with probability proportional to the length of the arc in front of it. We methodically "reduce" the size of the instance space and then use standard optimization techniques to find the bound and prove that it is tight. Moreover, we conjecture the approximation ratio of PCD for general odd $n$.
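The abstract fully specifies the selection rule, so a minimal sketch may help. It assumes locations reported as points on a unit-circumference circle and reads "in front" as the clockwise arc to the next agent; both are assumptions, and pcd is a hypothetical name, not the authors' code.

    import random

    def pcd(reported):
        """Sketch of the PCD mechanism described in the abstract.

        Given agent locations reported as points on a circle of
        circumference 1 (values in [0, 1)), select one reported
        location with probability proportional to the length of the
        arc "in front" of it (clockwise orientation is an assumption).
        """
        pts = sorted(reported)
        n = len(pts)
        # Arc in front of pts[i]: wrap-around distance to the next agent.
        arcs = [(pts[(i + 1) % n] - pts[i]) % 1.0 for i in range(n)]
        if sum(arcs) == 0:  # degenerate case: all agents report the same point
            return pts[0]
        # The arcs partition the circle, so they sum to 1 and can be
        # used directly as selection probabilities.
        return random.choices(pts, weights=arcs, k=1)[0]

    # Example with five agents:
    print(pcd([0.05, 0.2, 0.4, 0.7, 0.9]))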


Variational Inference for Continuous-Time Switching Dynamical Systems Supplementary Material

Neural Information Processing Systems

If it is clear from the context, we mostly use the favorable uncluttered notation. A.2.1 Calculation of the Filtering Distribution: the filtering distribution is defined as $\alpha(y,z,t) := p(y,z,t \mid x_{[0,t]})$. Using Leibniz' theorem, and considering the case where there is no observation in the interval $[t, t+h]$, $h > 0$, we compute $\alpha(y,z,t+h)$. A.2.2 Calculation of the Backward Distribution: the backward distribution is defined as $\beta(y,z,t) := p(x_{(t,T]} \mid y,z,t)$. We find the dynamics of the smoothing distribution by calculating its time derivative, using the terms in Eq. (28), and provide the gradient with respect to the dispersion parameter (Appendix A.3.4). A comprehensive overview of the ground-truth and learned parameters is given in Table 2; we utilize this procedure for all experiments.
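Assuming the standard definitions above, the filtering and backward terms combine into the smoothing distribution via the usual forward-backward identity; the exact conditioning sets in the paper's notation are an assumption here, and this is a sketch rather than the paper's own equation.

    % Smoothing as the normalized product of the filtering and backward
    % distributions (standard forward-backward identity; the conditioning
    % sets x_{[0,t]} and x_{(t,T]} are assumed, matching the usual setup).
    p(y, z, t \mid x_{[0,T]})
      \;\propto\;
      \underbrace{p\bigl(y, z, t \mid x_{[0,t]}\bigr)}_{\alpha(y,z,t)}
      \cdot
      \underbrace{p\bigl(x_{(t,T]} \mid y, z, t\bigr)}_{\beta(y,z,t)}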


Minimax-Optimal Multi-Agent Robust Reinforcement Learning

Jiao, Yuchen, Li, Gen

arXiv.org Artificial Intelligence

The rapidly evolving field of multi-agent reinforcement learning (MARL), also referred to as Markov games (MGs) (Littman, 1994; Shapley, 1953), explores how a group of agents interacts in a shared, dynamic environment to maximize their individual expected cumulative rewards (Zhang et al., 2020a; Lanctot et al., 2019; Silver et al., 2017; Vinyals et al., 2019). This area has found wide applications in fields such as ecosystem management (Fang et al., 2015), strategic decision-making in board games (Silver et al., 2017), management science (Saloner, 1991), and autonomous driving (Zhou et al., 2020). However, in real-world applications, environmental uncertainties, stemming from factors such as system noise, model misalignment, and the sim-to-real gap, can significantly alter both the qualitative outcomes of the game and the cumulative rewards that agents receive (Slumbers et al., 2023). It has been demonstrated that when solutions learned in a simulated environment are deployed, even a small deviation of the deployed environment from the expected model can result in catastrophic performance drops for one or more agents (Shi et al., 2024c; Balaji et al., 2019; Yeh et al., 2021; Zeng et al., 2022; Zhang et al., 2020b). These challenges motivate the study of robust Markov games (RMGs), which assume that each agent aims to maximize its worst-case cumulative reward in an environment whose transition model is constrained to an uncertainty set centered around an unknown nominal model. Given the competitive nature of the game, the objective of RMGs is to reach an equilibrium in which no agent has an incentive to unilaterally change its policy to increase its own payoff. A classical type of equilibrium is the robust Nash equilibrium (NE) (Nash Jr, 1950), where each agent's policy is independent and no agent can improve its worst-case performance by deviating from its current strategy. Due to the high computational cost of solving for robust NEs, especially in games with more than two agents, this concept is often relaxed to the robust coarse correlated equilibrium (CCE), where agents' policies may be correlated (Moulin & Vial, 1978). In the context of RMGs, achieving equilibrium with minimal samples is of particular interest, as data is often limited in practical applications.
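To make the worst-case objective concrete, the following sketch solves the inner minimization of a robust Bellman backup over a total-variation ball around one nominal transition row. The TV uncertainty set, the greedy solution, and the name worst_case_return are illustrative assumptions, not the paper's construction.

    import numpy as np

    def worst_case_return(p_nominal, v, rho):
        """Worst-case expected next-state value over a TV uncertainty set.

        Solves  min_p  p @ v  subject to  (1/2) * ||p - p_nominal||_1 <= rho
        and p being a probability vector, by moving up to rho of probability
        mass from the highest-value states onto the lowest-value state.
        """
        p = p_nominal.astype(float).copy()
        dest = int(np.argmin(v))       # adversary piles mass on the worst state
        budget = rho                   # total mass the TV radius allows moving
        for s in np.argsort(v)[::-1]:  # drain mass from the best states first
            if s == dest or budget <= 0:
                continue
            moved = min(p[s], budget)
            p[s] -= moved
            p[dest] += moved
            budget -= moved
        return float(p @ v)

    # Example: nominal kernel row over 4 states, TV radius 0.2.
    p0 = np.array([0.4, 0.3, 0.2, 0.1])
    v = np.array([1.0, 0.5, 0.8, 0.2])
    print(worst_case_return(p0, v, 0.2))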


Inserting a Backdoor into a Machine-Learning System - Schneier on Security

#artificialintelligence

Nice to hear from you, I hope you are well and life is not too hectic. "For myself, it is the front door into ML that is more worrying." What actually worries me is not "the method" of perversion, of which ML appears to have endless varieties at every point (and thus is not fit for honest purpose). As I've pointed out before, in "The King Game" there is the notion of "The Godhead", where the King is a direct conduit to God's words and thus wishes.


Mixing Time Guarantees for Unadjusted Hamiltonian Monte Carlo

Bou-Rabee, Nawaf, Eberle, Andreas

arXiv.org Machine Learning

We provide quantitative upper bounds on the total variation mixing time of the Markov chain corresponding to the unadjusted Hamiltonian Monte Carlo (uHMC) algorithm. For two general classes of models and fixed time discretization step size $h$, the mixing time is shown to depend only logarithmically on the dimension. Moreover, we provide quantitative upper bounds on the total variation distance between the invariant measure of the uHMC chain and the true target measure. As a consequence, we show that an $\varepsilon$-accurate approximation of the target distribution $\mu$ in total variation distance can be achieved by uHMC for a broad class of models with $O\left(d^{3/4}\varepsilon^{-1/2}\log (d/\varepsilon )\right)$ gradient evaluations, and for mean field models with weak interactions with $O\left(d^{1/2}\varepsilon^{-1/2}\log (d/\varepsilon )\right)$ gradient evaluations. The proofs are based on the construction of successful couplings for uHMC that realize the upper bounds.
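For orientation, one uHMC transition works as follows: draw a fresh Gaussian momentum, run a leapfrog trajectory with step size $h$, and return the endpoint with no Metropolis correction, which is exactly why the chain's invariant measure only approximates the target. The leapfrog scheme below is standard; the function name and parameters are illustrative, not the authors' code.

    import numpy as np

    def uhmc_step(x, grad_logp, h, n_leapfrog, rng):
        """One unadjusted HMC transition (sketch).

        Integrates the Hamiltonian H(x, v) = -log p(x) + |v|^2 / 2 with
        n_leapfrog leapfrog steps of size h and returns the endpoint
        WITHOUT an accept/reject step -- the "unadjusted" part.
        """
        v = rng.standard_normal(x.shape)     # fresh momentum each transition
        v = v + 0.5 * h * grad_logp(x)       # initial half step for momentum
        for _ in range(n_leapfrog - 1):
            x = x + h * v                    # full position step
            v = v + h * grad_logp(x)         # full momentum step
        x = x + h * v
        v = v + 0.5 * h * grad_logp(x)       # final half step (discarded)
        return x

    # Example: standard Gaussian target, for which grad log p(x) = -x.
    rng = np.random.default_rng(0)
    x = np.zeros(10)
    for _ in range(1000):
        x = uhmc_step(x, lambda x: -x, h=0.1, n_leapfrog=10, rng=rng)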