scalable approach
Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games
Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game theory that is guaranteed to converge to an approximate Nash equilibrium. However, PSRO requires training a reinforcement learning policy at each iteration, making it too slow for large games. We show through counterexamples and experiments that DCH and Rectified PSRO, two existing approaches to scaling up PSRO, fail to converge even in small games. We introduce Pipeline PSRO (P2SRO), the first scalable PSRO-based method for finding approximate Nash equilibria in large zero-sum imperfect-information games. P2SRO is able to parallelize PSRO with convergence guarantees by maintaining a hierarchical pipeline of reinforcement learning workers, each training against the policies generated by lower levels in the hierarchy. We show that unlike existing methods, P2SRO converges to an approximate Nash equilibrium, and does so faster as the number of parallel workers increases, across a variety of imperfect information games. We also introduce an open-source environment for Barrage Stratego, a variant of Stratego with an approximate game tree complexity of 10^50. P2SRO is able to achieve state-of-the-art performance on Barrage Stratego and beats all existing bots.
A Scalable Approach for Privacy-Preserving Collaborative Machine Learning
We consider a collaborative learning scenario in which multiple data-owners wish to jointly train a logistic regression model, while keeping their individual datasets private from the other parties. We propose COPML, a fully-decentralized training framework that achieves scalability and privacy-protection simultaneously. The key idea of COPML is to securely encode the individual datasets to distribute the computation load effectively across many parties and to perform the training computations as well as the model updates in a distributed manner on the securely encoded data. We provide the privacy analysis of COPML and prove its convergence. Furthermore, we experimentally demonstrate that COPML can achieve significant speedup in training over the benchmark protocols. Our protocol provides strong statistical privacy guarantees against colluding parties (adversaries) with unbounded computational power, while achieving up to $16\times$ speedup in the training time against the benchmark protocols.
A Scalable Approach for Safe and Robust Learning via Lipschitz-Constrained Networks
Abdeen, Zain ul, Kekatos, Vassilis, Jin, Ming
Certified robustness is a critical property for deploying neural networks (NN) in safety-critical applications. A principle approach to achieving such guarantees is to constrain the global Lipschitz constant of the network. However, accurate methods for Lipschitz-constrained training often suffer from non-convex formulations and poor scalability due to reliance on global semidefinite programs (SDPs). In this letter, we propose a convex training framework that enforces global Lipschitz constraints via semidefinite relaxation. By reparameterizing the NN using loop transformation, we derive a convex admissibility condition that enables tractable and certifiable training. While the resulting formulation guarantees robustness, its scalability is limited by the size of global SDP. To overcome this, we develop a randomized subspace linear matrix inequalities (RS-LMI) approach that decomposes the global constraints into sketched layerwise constraints projected onto low-dimensional subspaces, yielding a smooth and memory-efficient training objective. Empirical results on MNIST, CIFAR-10, and ImageNet demonstrate that the proposed framework achieves competitive accuracy with significantly improved Lipschitz bounds and runtime performance.
- North America > United States > Virginia (0.04)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- Asia > Middle East > Jordan (0.04)
Review for NeurIPS paper: Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games
Weaknesses: The paper is missing a comparison with the most relevant previous work, namely XFP [1] Heinrich, Johannes, and David Silver. Both of these works are mentioned in the Background and Related Work, but: 1) XFP is just mentioned but never compared to in experiments 2)DeepCFR is just discarded with "However, Deep CFR uses external sampling, which may be impractical for games with a large branching factor such as Stratego and Barrage Stratego." Furthermore, there are newer variants based on this work, and it is not limited to a particular form of sampling. The paper only really compares to other variants from the PSRO family Furthermore, the theory and algorithms (the way described in the text) deal only with matrix games, while the experiments are on extensive form games. If the goal is to run on top of the exponentially large matrix game, this should be discussed.
Safety Cases: A Scalable Approach to Frontier AI Safety
Hilton, Benjamin, Buhl, Marie Davidsen, Korbak, Tomek, Irving, Geoffrey
Safety cases - clear, assessable arguments for the safety of a system in a given context - are a widely-used technique across various industries for showing a decision-maker (e.g. boards, customers, third parties) that a system is safe. In this paper, we cover how and why frontier AI developers might also want to use safety cases. We then argue that writing and reviewing safety cases would substantially assist in the fulfilment of many of the Frontier AI Safety Commitments. Finally, we outline open research questions on the methodology, implementation, and technical details of safety cases.
- Government (0.93)
- Information Technology > Security & Privacy (0.47)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.40)
Review for NeurIPS paper: A Scalable Approach for Privacy-Preserving Collaborative Machine Learning
The initial reviews showed some disagreement about this paper, with two positive reviewers noting the reduction in computational and communication costs compared to prior solutions, and two more negative reviewers with some concerns in particular regarding novelty and comparison with respect to previous work. After reading the author rebuttal and further discussion, the doubts regarding the comparison to recent work were lifted, leading to one reviewer increasing his/her score. While some concerns remain regarding the applicability of the work to non-linear models, the merits of the work are judged significant enough, and we decided the paper should be accepted. In the final version, the authors are asked to be more explicit about the potential limitations of the degree-1 approximation to the sigmod, and to add a discussion about how one may go about extending the approach to more complicated (deep) models.
- Information Technology > Artificial Intelligence > Machine Learning (0.76)
- Information Technology > Data Science > Data Mining > Big Data (0.40)
Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games
Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game theory that is guaranteed to converge to an approximate Nash equilibrium. However, PSRO requires training a reinforcement learning policy at each iteration, making it too slow for large games. We show through counterexamples and experiments that DCH and Rectified PSRO, two existing approaches to scaling up PSRO, fail to converge even in small games. We introduce Pipeline PSRO (P2SRO), the first scalable PSRO-based method for finding approximate Nash equilibria in large zero-sum imperfect-information games.
A Scalable Approach to Covariate and Concept Drift Management via Adaptive Data Segmentation
Yarabolu, Vennela, Waghmare, Govind, Gupta, Sonia, Asthana, Siddhartha
In many real-world applications, continuous machine learning (ML) systems are crucial but prone to data drift, a phenomenon where discrepancies between historical training data and future test data lead to significant performance degradation and operational inefficiencies. Traditional drift adaptation methods typically update models using ensemble techniques, often discarding drifted historical data, and focus primarily on either covariate drift or concept drift. These methods face issues such as high resource demands, inability to manage all types of drifts effectively, and neglecting the valuable context that historical data can provide. We contend that explicitly incorporating drifted data into the model training process significantly enhances model accuracy and robustness. This paper introduces an advanced framework that integrates the strengths of data-centric approaches with adaptive management of both covariate and concept drift in a scalable and efficient manner. Our framework employs sophisticated data segmentation techniques to identify optimal data batches that accurately reflect test data patterns. These data batches are then utilized for training on test data, ensuring that the models remain relevant and accurate over time. By leveraging the advantages of both data segmentation and scalable drift management, our solution ensures robust model accuracy and operational efficiency in large-scale ML deployments. It also minimizes resource consumption and computational overhead by selecting and utilizing relevant data subsets, leading to significant cost savings. Experimental results on classification task on real-world and synthetic datasets show our approach improves model accuracy while reducing operational costs and latency. This practical solution overcomes inefficiencies in current methods, providing a robust, adaptable, and scalable approach.
- South America > Brazil > Maranhão (0.04)
- Oceania > Australia > New South Wales (0.04)
- North America > United States > Nebraska > Sarpy County > Bellevue (0.04)
- (2 more...)
A Scalable Approach for Privacy-Preserving Collaborative Machine Learning
We consider a collaborative learning scenario in which multiple data-owners wish to jointly train a logistic regression model, while keeping their individual datasets private from the other parties. We propose COPML, a fully-decentralized training framework that achieves scalability and privacy-protection simultaneously. The key idea of COPML is to securely encode the individual datasets to distribute the computation load effectively across many parties and to perform the training computations as well as the model updates in a distributed manner on the securely encoded data. We provide the privacy analysis of COPML and prove its convergence. Furthermore, we experimentally demonstrate that COPML can achieve significant speedup in training over the benchmark protocols.