- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- North America > United States > Ohio (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Asia > South Korea > Daejeon > Daejeon (0.04)
- North America > Canada (0.04)
A Theoretical Details
A.2 Proof of Theorem 1. We restate the theorem for completeness. Theorem 1: any ODE solution, if it exists and converges, converges to the method's estimate of the conditional effect. We now bound the remaining term, arising from the computation of the surrogate intervention; thus, such error does not accumulate even with large step sizes. Theorem 4: Effect Connectivity is necessary for nonparametric effect estimation. Let Effect Connectivity be violated; then nonparametric effect estimation is impossible.
Figure 7: True positive vs. false negative rate as we vary the threshold on average effect. The effect threshold here is 0.1.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Orange County > Irvine (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Asia > South Korea > Daejeon > Daejeon (0.04)
- North America > Canada (0.04)
- Research Report (0.67)
- Workflow (0.46)
Rank-One Modified Value Iteration
Arman Sharifi Kolarijani, Tolga Ok, Peyman Mohajerin Esfahani, Mohamad Amin Sharif Kolarijani
In this paper, we provide a novel algorithm for solving planning and learning problems of Markov decision processes. The proposed algorithm follows a policy iteration-type update by using a rank-one approximation of the transition probability matrix in the policy evaluation step. This rank-one approximation is closely related to the stationary distribution of the corresponding transition probability matrix, which is approximated using the power method. We provide theoretical guarantees for the convergence of the proposed algorithm to the optimal (action-)value function, with the same rate and computational complexity as the value iteration algorithm in the planning problem and as the Q-learning algorithm in the learning problem. Through extensive numerical simulations, however, we show that the proposed algorithm consistently outperforms first-order algorithms and their accelerated versions for both planning and learning problems.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Netherlands > South Holland > Delft (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.57)
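The policy-evaluation shortcut the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: we assume the rank-one approximation takes the form P^π ≈ 1μᵀ, with μ the stationary distribution estimated by the power method, which makes the evaluation step solvable in closed form instead of requiring a linear solve.

```python
import numpy as np

def stationary_dist(P, iters=200):
    """Approximate the stationary distribution of a row-stochastic matrix P
    (the left eigenvector for eigenvalue 1) via the power method."""
    mu = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        mu = mu @ P
        mu /= mu.sum()
    return mu

def rank_one_policy_evaluation(P_pi, r_pi, gamma):
    """Policy evaluation under the rank-one approximation P_pi ~ 1 mu^T.
    Substituting into V = r + gamma * P_pi V gives V = r + gamma * s * 1,
    where s = mu @ V solves s = mu @ r + gamma * s (since mu sums to 1),
    i.e. s = (mu @ r) / (1 - gamma)."""
    mu = stationary_dist(P_pi)
    s = (mu @ r_pi) / (1.0 - gamma)
    return r_pi + gamma * s * np.ones_like(r_pi)
```

When P^π is exactly rank one (all rows equal), this closed form matches the exact solve of (I − γP^π)V = r; in general it stands in for the linear solve inside each policy-evaluation step.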
Convergence of Time-Averaged Mean Field Gradient Descent Dynamics for Continuous Multi-Player Zero-Sum Games
The approximation of mixed Nash equilibria (MNE) for zero-sum games with mean-field interacting players has recently raised much interest in machine learning. In this paper we propose a mean-field gradient descent dynamics for finding the MNE of zero-sum games involving $K$ players with $K\geq 2$. The evolution of the players' strategy distributions follows coupled mean-field gradient descent flows with momentum, incorporating an exponentially discounted time-averaging of gradients. First, in the case of a fixed entropic regularization, we prove an exponential convergence rate for the mean-field dynamics to the mixed Nash equilibrium with respect to the total variation metric. This improves a previous polynomial convergence rate for a similar time-averaged dynamics with different averaging factors. Moreover, unlike previous two-scale approaches for finding the MNE, our approach treats all player types on the same time scale. We also show that with a suitable choice of decreasing temperature, a simulated annealing version of the mean-field dynamics converges to an MNE of the initial unregularized problem.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Game Theory (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
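To make the exponentially discounted time-averaging of gradients concrete, here is a toy particle sketch on a bilinear zero-sum game. Everything below is our own illustrative setup, not the paper's construction: the confining drift comes from taking the entropic regularizer against a standard Gaussian reference (which contributes the −reg·x term alongside the Langevin noise), and the step size, averaging factor, and temperature are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bilinear zero-sum game f = E[x] * E[y]: player X descends, player Y ascends.
# Each player's mixed strategy is represented by a cloud of particles.
X, Y = rng.normal(size=500), rng.normal(size=500)
mx, my = 0.0, 0.0                        # exponentially discounted gradient averages
eta, beta, reg, temp = 0.05, 0.5, 0.5, 0.01

for _ in range(2000):
    # exponentially discounted time-averaging of the mean-field gradients
    mx = beta * mx + (1.0 - beta) * Y.mean()   # grad_x E[xy] = E[y]
    my = beta * my + (1.0 - beta) * X.mean()   # grad_y E[xy] = E[x]
    # coupled descent/ascent with Gaussian-reference confinement and Langevin noise
    X = X - eta * (mx + reg * X) + np.sqrt(2.0 * eta * temp) * rng.normal(size=X.shape)
    Y = Y + eta * (my - reg * Y) + np.sqrt(2.0 * eta * temp) * rng.normal(size=Y.shape)
```

With these parameters the averaged dynamics contract, and both strategy means settle near the regularized equilibrium at 0; without the averaging, plain gradient descent-ascent on this bilinear game rotates instead of converging.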
Reveal-or-Obscure: A Differentially Private Sampling Algorithm for Discrete Distributions
Naima Tasnim, Atefeh Gilani, Lalitha Sankar, Oliver Kosut
We introduce a differentially private (DP) algorithm called reveal-or-obscure (ROO) to generate a single representative sample from a dataset of n observations drawn i.i.d. Unlike methods that add explicit noise to the estimated empirical distribution, ROO achieves ϵ-differential privacy by randomly choosing whether to "reveal" or "obscure" the empirical distribution. While ROO is structurally identical to Algorithm 1 proposed by Cheu and Nayak [1], we prove a strictly better bound on the sampling complexity than that established in Theorem 12 of [1]. To further improve the privacy-utility trade-off, we propose a novel generalized sampling algorithm called Data-Specific ROO (DS-ROO), where the probability of obscuring the empirical distribution of the dataset is chosen adaptively. We prove that DS-ROO satisfies ϵ-DP and provide empirical evidence that DS-ROO can achieve better utility under the same privacy budget as vanilla ROO. The widespread use of sensitive data across various domains, including healthcare, finance, law enforcement, and social sciences, has heightened the importance of privacy-preserving data analysis.
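The reveal-or-obscure mechanism can be sketched as below. The structure (flip a biased coin, then sample either the empirical distribution or the uniform one) follows the abstract, but the particular reveal probability gamma is our own illustrative choice, with its ϵ-DP argument sketched in the comments; it need not match the constants in the paper or in Cheu and Nayak.

```python
import numpy as np

def roo_sample(data, k, eps, rng=None):
    """Reveal-or-obscure sketch: with probability gamma draw from the
    empirical distribution ('reveal'), otherwise draw uniformly from the
    k symbols {0, ..., k-1} ('obscure').

    The output distribution is q(x) = gamma * p_hat(x) + (1 - gamma) / k.
    Changing one of the n observations moves p_hat(x) by at most 1/n, so
    the worst-case likelihood ratio is 1 + gamma * k / (n * (1 - gamma));
    setting this equal to e^eps gives the gamma below, which makes the
    mechanism eps-DP (an illustrative choice, not the paper's constant).
    """
    rng = rng or np.random.default_rng()
    data = np.asarray(data)
    n = len(data)
    gamma = n * np.expm1(eps) / (k + n * np.expm1(eps))
    if rng.random() < gamma:
        return int(data[rng.integers(n)])   # reveal: sample the empirical distribution
    return int(rng.integers(k))             # obscure: sample the uniform distribution
```

Here the data are assumed to be integer symbols in {0, ..., k-1}; a data-specific variant in the spirit of DS-ROO would make gamma depend on the dataset rather than only on n, k, and ϵ.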
Variance-Reduced Fast Operator Splitting Methods for Stochastic Generalized Equations
We develop two classes of variance-reduced fast operator splitting methods to approximate solutions of both finite-sum and stochastic generalized equations. Our approach integrates recent advances in accelerated fixed-point methods, co-hypomonotonicity, and variance reduction. First, we introduce a class of variance-reduced estimators and establish their variance-reduction bounds. This class covers both unbiased and biased instances and comprises common estimators as special cases, including SVRG, SAGA, SARAH, and Hybrid-SGD. Next, we design a novel accelerated variance-reduced forward-backward splitting (FBS) algorithm using these estimators to solve finite-sum and stochastic generalized equations. Our method achieves both $\mathcal{O}(1/k^2)$ and $o(1/k^2)$ convergence rates on the expected squared norm $\mathbb{E}[ \| G_{\lambda}x^k\|^2]$ of the FBS residual $G_{\lambda}$, where $k$ is the iteration counter. Additionally, we establish, for the first time, almost sure convergence rates and almost sure convergence of iterates to a solution in stochastic accelerated methods. Unlike existing stochastic fixed-point algorithms, our methods accommodate co-hypomonotone operators, which potentially include nonmonotone problems arising from recent applications. We further specify our method to derive an appropriate variant for each stochastic estimator -- SVRG, SAGA, SARAH, and Hybrid-SGD -- demonstrating that they achieve the best-known complexity for each without relying on enhancement techniques. Alternatively, we propose an accelerated variance-reduced backward-forward splitting (BFS) method, which attains convergence rates and oracle complexity similar to those of our FBS method. Finally, we validate our results through several numerical experiments and compare the performance of the proposed methods.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
- North America > United States > New York (0.04)
- Asia > Middle East > Jordan (0.04)
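The SVRG-type estimator inside a forward-backward splitting step can be illustrated on a simple lasso-type finite-sum problem. This is a plain, unaccelerated FBS sketch of our own, chosen only to show the estimator; it omits the acceleration, co-hypomonotone setting, and convergence machinery of the paper, and the problem instance and parameters are arbitrary.

```python
import numpy as np

def soft_threshold(v, t):
    """Resolvent (prox) of t * ||.||_1 -- the backward step of FBS."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def svrg_fbs(A, b, lam=0.05, tau=0.01, epochs=20, rng=None):
    """Forward-backward splitting with an SVRG forward-step estimator for
    min_x (1/2n) * ||Ax - b||^2 + tau * ||x||_1, where the smooth part is
    the finite sum of G_i(x) = a_i (a_i @ x - b_i)."""
    rng = rng or np.random.default_rng(0)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        x_snap = x.copy()
        full_grad = A.T @ (A @ x_snap - b) / n        # G(x_snap), once per epoch
        for _ in range(n):
            i = rng.integers(n)
            gi = A[i] * (A[i] @ x - b[i])             # G_i at the current iterate
            gi_snap = A[i] * (A[i] @ x_snap - b[i])   # G_i at the snapshot
            g = gi - gi_snap + full_grad              # SVRG estimator: unbiased, reduced variance
            x = soft_threshold(x - lam * g, lam * tau)  # forward step, then backward step
    return x
```

Swapping the estimator line is all it takes to obtain SAGA- or SARAH-style variants; the backward (prox) step and the overall loop structure stay the same.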