A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning
In Multi-Agent Reinforcement Learning (MARL), multiple agents interact with a common environment, as well as with each other, to solve a shared sequential decision-making problem. It has wide-ranging applications in gaming, robotics, finance, communication, etc. In this work, we derive a novel law of iterated logarithm for a family of distributed nonlinear stochastic approximation schemes that is useful in MARL. In particular, our result describes the convergence rate on almost every sample path where the algorithm converges. This result is the first of its kind in the distributed setup and provides deeper insights than the existing ones, which only discuss convergence rates in the expected or the CLT sense. Importantly, our result holds under significantly weaker assumptions: neither the gossip matrix needs to be doubly stochastic nor the stepsizes square summable. As an application, we show that, for the stepsize $n^{-\gamma}$ with $\gamma \in (0, 1),$ the distributed TD(0) algorithm with linear function approximation has a convergence rate of $O(\sqrt{n^{-\gamma} \ln n })$ a.s.; for the $1/n$ type stepsize, the same is $O(\sqrt{n^{-1} \ln \ln n})$ a.s. These decay rates do not depend on the graph depicting the interactions among the different agents.
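The two almost-sure rate bounds above can be evaluated numerically. The sketch below (ignoring constants, which the abstract does not give) shows how each bound decays with the iteration count $n$; `td0_rate_bound` is a hypothetical helper name, not from the paper.

```python
import math

def td0_rate_bound(n, gamma=None):
    """Almost-sure convergence-rate bounds stated in the abstract.

    gamma in (0, 1): stepsize n^{-gamma} gives O(sqrt(n^{-gamma} ln n)).
    gamma is None:   1/n-type stepsize gives O(sqrt(n^{-1} ln ln n)).
    Constants are dropped; only the decay order is illustrated.
    """
    if gamma is not None:
        return math.sqrt(n ** (-gamma) * math.log(n))
    return math.sqrt(math.log(math.log(n)) / n)

# Both bounds shrink as n grows; by the abstract, the decay order is
# independent of the graph describing the agents' interactions.
for n in (10**3, 10**4, 10**5):
    print(n, td0_rate_bound(n, gamma=0.6), td0_rate_bound(n))
```

Note the $1/n$-type stepsize yields the faster decay, at the classical iterated-logarithm order.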
Learning better with Dale's Law: A Spectral Perspective
Most recurrent neural networks (RNNs) do not include a fundamental constraint of real neural circuits: Dale's Law, which implies that neurons must be excitatory (E) or inhibitory (I). Dale's Law is generally absent from RNNs because simply partitioning a standard network's units into E and I populations impairs learning. However, here we extend a recent feedforward bio-inspired EI network architecture, named Dale's ANNs, to recurrent networks, and demonstrate that good performance is possible while respecting Dale's Law. This raises the question: What makes some forms of EI network learn poorly and others learn well? And, why does the simple approach of incorporating Dale's Law impair learning? Historically the answer was thought to be the sign constraints on EI network parameters, and this was a motivation behind Dale's ANNs. However, here we show that the spectral properties of the recurrent weight matrix at initialisation are more impactful on network performance than sign constraints. We find that simple EI partitioning results in a singular value distribution that is multimodal and dispersed, whereas standard RNNs have a unimodal, more clustered singular value distribution, as do recurrent Dale's ANNs. We also show that the spectral properties and performance of partitioned EI networks are worse for small networks with fewer I units, and we present normalised SVD entropy as a measure of spectrum pathology that correlates with performance.
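Normalised SVD entropy can be sketched as the Shannon entropy of the normalised singular-value distribution, divided by its maximum. The construction of the EI matrix below is an illustrative assumption, not the paper's exact initialisation: column signs are fixed (90% E, 10% I) and inhibitory columns are scaled up so E and I currents roughly balance, which typically produces the dispersed spectrum the abstract describes.

```python
import numpy as np

def normalised_svd_entropy(W):
    """Entropy of the normalised singular values, divided by ln(n);
    near 1 = flat (healthy) spectrum, near 0 = dominated by few modes."""
    s = np.linalg.svd(W, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(len(s)))

rng = np.random.default_rng(0)
n, frac_i = 100, 0.1

# Standard Gaussian recurrent initialisation.
W_std = rng.normal(0.0, 1.0 / np.sqrt(n), (n, n))

# Naive EI partition: nonnegative magnitudes with fixed column signs,
# inhibitory columns upscaled to balance the excitatory drive.
signs = np.ones(n)
signs[: int(n * frac_i)] = -1.0
mags = np.abs(rng.normal(0.0, 1.0 / np.sqrt(n), (n, n)))
mags[:, : int(n * frac_i)] *= (1 - frac_i) / frac_i
W_ei = mags * signs

print(normalised_svd_entropy(W_std), normalised_svd_entropy(W_ei))
```

Under this construction the EI matrix acquires a strong low-rank mean component, so its spectrum is more dispersed and its normalised SVD entropy lower than that of the standard initialisation.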
On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law
Out-of-distribution (OOD) testing is increasingly popular for evaluating a machine learning system's ability to generalize beyond the biases of a training set. OOD benchmarks are designed to present a different joint distribution of data and labels between training and test time. VQA-CP has become the standard OOD benchmark for visual question answering, but we discovered three troubling practices in its current use. First, most published methods rely on explicit knowledge of the construction of the OOD splits. They often exploit the inverted distribution of answers, e.g., answering "yes" when the common training answer was "no".
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
Continual Pre-Training (CPT) on Large Language Models (LLMs) has been widely used to expand the model's fundamental understanding of specific downstream domains (e.g., math and code). For CPT on domain-specific LLMs, one important question is how to choose the optimal mixture ratio between the general corpus (e.g., Dolma, Slim-pajama) and the downstream domain corpus. Existing methods usually rely on laborious grid searches over a set of mixture ratios, which incur high GPU training costs. Moreover, they cannot guarantee that the selected ratio is optimal for the specific domain. To address the limitations of existing methods, inspired by the Scaling Law for performance prediction, we propose to investigate the Scaling Law of Domain-specific Continual Pre-Training (D-CPT Law) to decide the optimal mixture ratio with acceptable training costs for LLMs of different sizes.
Risk of Transfer Learning and its Applications in Finance
Cao, Haoyang, Gu, Haotian, Guo, Xin, Rosenbaum, Mathieu
Transfer learning is an emerging and popular paradigm for utilizing existing knowledge from previous learning tasks to improve the performance of new ones. In this paper, we propose a novel concept of transfer risk and analyze its properties to evaluate the transferability of transfer learning. We apply transfer learning techniques and this concept of transfer risk to stock return prediction and portfolio optimization problems. Numerical results demonstrate a strong correlation between transfer risk and overall transfer learning performance, where transfer risk provides a computationally efficient way to identify appropriate source tasks in transfer learning, including cross-continent, cross-sector, and cross-frequency transfer for portfolio optimization.
- North America > United States (0.46)
- Europe (0.28)
- Asia (0.14)
- Health & Medicine (1.00)
- Banking & Finance > Trading (1.00)
- Energy > Oil & Gas > Upstream (0.54)
Nonlinear Parameter-Varying Modeling for Soft Pneumatic Actuators and Data-Driven Parameter Estimation
Yang, Wu-Te, Stuart, Hannah, Kurkcu, Burak, Tomizuka, Masayoshi
Accurately modeling soft robots remains a challenge due to their inherent nonlinear behavior and parameter variations. This paper presents a novel approach to modeling soft pneumatic actuators using a nonlinear parameter-varying framework. The research begins by introducing Ludwick's Law, providing a more accurate representation of the complex mechanical behavior exhibited by soft materials. Three key material properties, namely Young's modulus, tensile stress, and mixed viscosity, are utilized to estimate the parameters of the nonlinear model using the least squares method. Subsequently, a nonlinear dynamic model for soft actuators is constructed by applying Ludwick's Law. To validate the accuracy and effectiveness of the proposed method, we perform several experiments demonstrating the model's capability to predict the dynamic behavior of soft pneumatic actuators. In conclusion, this work advances soft pneumatic actuator modeling by capturing the actuators' nonlinear behavior.
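Ludwick's Law is commonly written as $\sigma = K\,\varepsilon^{n}$, with stress $\sigma$, strain $\varepsilon$, and material parameters $K$ and $n$. As a minimal sketch of the least-squares estimation step (the paper's full model also involves parameter-varying dynamics, so this is an assumption-laden simplification), the power law can be fit linearly in log space:

```python
import numpy as np

def fit_ludwick(strain, stress):
    """Least-squares fit of Ludwick's law stress = K * strain**n,
    via the log-linear form log(stress) = log(K) + n * log(strain)."""
    A = np.column_stack([np.ones_like(strain), np.log(strain)])
    coef, *_ = np.linalg.lstsq(A, np.log(stress), rcond=None)
    return float(np.exp(coef[0])), float(coef[1])  # K, n

# Synthetic data with ground truth K = 2.0, n = 0.7 and mild
# multiplicative measurement noise (purely illustrative values).
rng = np.random.default_rng(1)
strain = np.linspace(0.01, 0.5, 50)
stress = 2.0 * strain**0.7 * np.exp(rng.normal(0.0, 0.01, strain.size))

K, n = fit_ludwick(strain, stress)
print(K, n)
```

The log-linear form keeps the estimation convex; with real actuator data one would fit separate parameters per operating condition to obtain the parameter-varying model.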
2022 National Security Symposium
On Thursday, June 23, the National Security Institute (NSI), in partnership with the Federalist Society, hosted the 2022 National Security Symposium, entitled "Next Generation National Security." The first panel of the Symposium considered the various threats and potential benefits that digital currencies may pose, and discussed whether and how the U.S. might develop policies on digital assets that protect and encourage freedom and payment security while guarding against, and thwarting the schemes of, bad actors. The panel on Cryptocurrency and National Security featured Gus Coldebella, Partner, True Ventures; William Hughes, Senior Counsel and Director of Global Regulatory Matters, Consensys Software; and Dr. Oonagh McDonald, Senior Adviser, Crito Capital; Hon. Eric Kadel, Partner, Sullivan & Cromwell LLP, served as the moderator. The second panel, titled "Artificial Intelligence: Implications for National Security," addressed the national security ramifications of rapidly scaling artificial intelligence (AI) developments.
- North America > United States (0.36)
- Asia > China (0.10)
- Government > Military (1.00)
- Government > Foreign Policy (1.00)
- Government > Regional Government > North America Government > United States Government (0.36)
Book review: 'The Reasonable Robot -- Artificial Intelligence and the Law'
Today, humans may outperform AI in hazardous activities (e.g., road traffic), but there will come a time when AI surpasses humans, and then the question might be whether a reasonable person could have used AI to avoid damage. However, the principle of AI legal neutrality does not mean that AI and people must be treated equally, or that AI should enjoy the same rights as humans. Therefore, the author argues that AI should be recognized as an entity that morally deserves rights and can, for example, claim tangible or intangible property rights "only" if this would exceptionally benefit people. Furthermore, he states that AI legal neutrality should not come at the expense of transparency and accountability.
Book Reviews
AI Magazine, Volume 9, Number 3 (1988) (AAAI). The first part of the book is intended to be an introduction to computational jurisprudence for both groups. It identifies issues critical to the purpose, behavior, knowledge sources, knowledge structures, and reasoning processes of expert legal systems. The second part implements a simple prototype system for a well-defined area of contract law and is more appropriate for experienced developers of knowledge-based systems. Law is a domain in which the experts are supposed to disagree, and lawyers must be able to argue either side of a case. A judge or juror must decide which argument is "best."