Law


A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning

Neural Information Processing Systems

In Multi-Agent Reinforcement Learning (MARL), multiple agents interact with a common environment, as well as with each other, to solve a shared sequential decision-making problem. It has wide-ranging applications in gaming, robotics, finance, communication, etc. In this work, we derive a novel law of iterated logarithm for a family of distributed nonlinear stochastic approximation schemes that is useful in MARL. In particular, our result describes the convergence rate on almost every sample path where the algorithm converges. This result is the first of its kind in the distributed setup and provides deeper insights than existing ones, which only discuss convergence rates in the expected or the CLT sense. Importantly, our result holds under significantly weaker assumptions: neither does the gossip matrix need to be doubly stochastic, nor the stepsizes square summable. As an application, we show that, for the stepsize $n^{-\gamma}$ with $\gamma \in (0, 1),$ the distributed TD(0) algorithm with linear function approximation has a convergence rate of $O(\sqrt{n^{-\gamma} \ln n})$ a.s.; for the $1/n$ type stepsize, the same is $O(\sqrt{n^{-1} \ln \ln n})$ a.s. These decay rates do not depend on the graph depicting the interactions among the different agents.
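The two decay envelopes quoted in the abstract can be compared numerically. The sketch below simply evaluates the stated rates $\sqrt{n^{-\gamma}\ln n}$ and $\sqrt{n^{-1}\ln\ln n}$ (up to the hidden constants in the $O(\cdot)$, which the abstract does not specify); the function names are illustrative, not from the paper.

```python
import math

def as_rate_polynomial(n, gamma):
    """a.s. rate envelope sqrt(n^-gamma * ln n) for stepsize n^-gamma, gamma in (0, 1)."""
    return math.sqrt(n ** (-gamma) * math.log(n))

def as_rate_one_over_n(n):
    """a.s. rate envelope sqrt(ln ln n / n) for 1/n-type stepsizes."""
    return math.sqrt(math.log(math.log(n)) / n)

# The 1/n-type stepsize gives the faster asymptotic decay:
for n in (10**3, 10**6, 10**9):
    print(n, as_rate_polynomial(n, 0.5), as_rate_one_over_n(n))
```

For every $\gamma \in (0,1)$ the $1/n$ envelope eventually dominates, which is why the iterated-logarithm rate is the sharper of the two.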


Learning better with Dale's Law: A Spectral Perspective

Neural Information Processing Systems

Most recurrent neural networks (RNNs) do not include a fundamental constraint of real neural circuits: Dale's Law, which implies that neurons must be excitatory (E) or inhibitory (I). Dale's Law is generally absent from RNNs because simply partitioning a standard network's units into E and I populations impairs learning. However, here we extend a recent feedforward bio-inspired EI network architecture, named Dale's ANNs, to recurrent networks, and demonstrate that good performance is possible while respecting Dale's Law. This raises the question: What makes some forms of EI network learn poorly and others learn well? And why does the simple approach of incorporating Dale's Law impair learning? Historically, the answer was thought to be the sign constraints on EI network parameters, and this was a motivation behind Dale's ANNs. However, here we show that the spectral properties of the recurrent weight matrix at initialisation are more impactful on network performance than sign constraints. We find that simple EI partitioning results in a singular value distribution that is multimodal and dispersed, whereas standard RNNs have a unimodal, more clustered singular value distribution, as do recurrent Dale's ANNs. We also show that the spectral properties and performance of partitioned EI networks are worse for small networks with fewer I units, and we present normalised SVD entropy as a measure of spectrum pathology that correlates with performance.
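Normalised SVD entropy can be computed directly from a weight matrix. The sketch below uses one common definition (Shannon entropy of the normalised singular-value distribution, divided by the log of the number of singular values so the result lies in [0, 1]); the paper's exact definition and the EI construction here (naive column-sign partitioning of a Gaussian initialisation) are assumptions for illustration.

```python
import numpy as np

def normalised_svd_entropy(W):
    """Entropy of the normalised singular-value distribution of W, in [0, 1]."""
    s = np.linalg.svd(W, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]  # avoid log(0)
    return float(-(p * np.log(p)).sum() / np.log(len(s)))

rng = np.random.default_rng(0)
n = 200
# Standard Gaussian initialisation: clustered, unimodal singular values.
W_std = rng.normal(0.0, 1.0 / np.sqrt(n), (n, n))
# Naive EI partition: each column's sign is fixed (80% E, 20% I), which adds a
# large rank-one component and disperses the spectrum.
signs = np.where(np.arange(n) < int(0.8 * n), 1.0, -1.0)
W_ei = np.abs(rng.normal(0.0, 1.0 / np.sqrt(n), (n, n))) * signs

print(normalised_svd_entropy(W_std), normalised_svd_entropy(W_ei))
```

On this construction the partitioned EI matrix scores lower, consistent with the abstract's claim that a dispersed, spiked spectrum is the pathological case.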


On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law

Neural Information Processing Systems

Out-of-distribution (OOD) testing is increasingly popular for evaluating a machine learning system's ability to generalize beyond the biases of a training set. OOD benchmarks are designed to present a different joint distribution of data and labels between training and test time. VQA-CP has become the standard OOD benchmark for visual question answering, but we discovered three troubling practices in its current use. First, most published methods rely on explicit knowledge of the construction of the OOD splits. They often rely on answering "yes" when the common training answer was "no".


D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models

Neural Information Processing Systems

Continual Pre-Training (CPT) on Large Language Models (LLMs) has been widely used to expand the model's fundamental understanding of specific downstream domains (e.g., math and code). For CPT on domain-specific LLMs, one important question is how to choose the optimal mixture ratio between the general corpus (e.g., Dolma, SlimPajama) and the downstream domain corpus. Existing methods usually rely on laborious grid searches over a set of mixture ratios, which incur high GPU training costs, and there is no guarantee that the selected ratio is optimal for the specific domain. To address these limitations, inspired by the use of Scaling Laws for performance prediction, we propose to investigate the Scaling Law of Domain-specific Continual Pre-Training (D-CPT Law) to decide the optimal mixture ratio with acceptable training costs for LLMs of different sizes.
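The core idea, predicting the best mixture ratio from a fitted law instead of a grid search over full training runs, can be sketched as below. The functional forms and every constant here are made up for illustration; the abstract does not give the D-CPT Law's actual parameterisation.

```python
import numpy as np

# Hypothetical fitted curves (constants invented, not from the paper):
# domain-validation loss falls as the domain ratio r grows, while
# general-corpus loss rises, so a weighted objective has an interior optimum.
def domain_loss(r):
    return 1.5 + 0.8 * (r + 0.05) ** -0.3

def general_loss(r):
    return 2.0 + 0.5 * r ** 1.5

def combined(r, weight=0.5):
    return weight * domain_loss(r) + (1.0 - weight) * general_loss(r)

# Once the law is fitted, the optimal ratio is a cheap one-dimensional search.
ratios = np.linspace(0.0, 1.0, 101)
best = ratios[np.argmin(combined(ratios))]
print(f"predicted optimal mixture ratio: {best:.2f}")
```

The point of the approach is exactly this cost shift: a few fitting runs replace a GPU-expensive grid search over candidate ratios.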


Risk of Transfer Learning and its Applications in Finance

Cao, Haoyang, Gu, Haotian, Guo, Xin, Rosenbaum, Mathieu

arXiv.org Artificial Intelligence

Transfer learning is an emerging and popular paradigm for utilizing existing knowledge from previous learning tasks to improve the performance of new ones. In this paper, we propose a novel concept of transfer risk and analyze its properties to evaluate transferability in transfer learning. We apply transfer learning techniques and this concept of transfer risk to stock return prediction and portfolio optimization problems. Numerical results demonstrate a strong correlation between transfer risk and overall transfer learning performance, where transfer risk provides a computationally efficient way to identify appropriate source tasks in transfer learning, including cross-continent, cross-sector, and cross-frequency transfer for portfolio optimization.


Nonlinear Parameter-Varying Modeling for Soft Pneumatic Actuators and Data-Driven Parameter Estimation

Yang, Wu-Te, Stuart, Hannah, Kurkcu, Burak, Tomizuka, Masayoshi

arXiv.org Artificial Intelligence

Accurately modeling soft robots remains a challenge due to their inherent nonlinear behavior and parameter variations. This paper presents a novel approach to modeling soft pneumatic actuators using a nonlinear parameter-varying framework. The research begins by introducing Ludwick's Law, which provides a more accurate representation of the complex mechanical behavior exhibited by soft materials. Three key material properties, namely Young's modulus, tensile stress, and mixed viscosity, are utilized to estimate the parameters of the nonlinear model using the least squares method. Subsequently, a nonlinear dynamic model for soft actuators is constructed by applying Ludwick's Law. To validate the accuracy and effectiveness of the proposed method, several experiments are performed, demonstrating the model's capability to predict the dynamic behavior of soft pneumatic actuators. In conclusion, this work advances soft pneumatic actuator modeling by capturing the actuators' nonlinear behavior.
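Ludwick's Law relates stress and strain as a power law, $\sigma = K\varepsilon^{n}$, which becomes linear in its parameters after a log transform, so $K$ and $n$ can be recovered by ordinary least squares. The sketch below shows this estimation step on synthetic, noise-free data; the true values and data range are invented for illustration, not taken from the paper's experiments.

```python
import numpy as np

# Ludwick's Law: stress = K * strain**n.
# Taking logs: log(stress) = log(K) + n * log(strain), a linear model.
K_true, n_true = 2.5, 0.7  # synthetic ground truth
strain = np.linspace(0.01, 0.5, 50)
stress = K_true * strain ** n_true

# Least-squares fit for [log K, n].
A = np.column_stack([np.ones_like(strain), np.log(strain)])
coef, *_ = np.linalg.lstsq(A, np.log(stress), rcond=None)
K_est, n_est = np.exp(coef[0]), coef[1]
print(K_est, n_est)  # recovers K ~ 2.5, n ~ 0.7 on noise-free data
```

With measured stress-strain data the same fit gives the material constants that parameterise the nonlinear dynamic model.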


Robots, paperclips and profits

#artificialintelligence

In Asimov's time, keeping AI in line was simple. All you had to do was take the Three Laws of Robotics and upload them into a positronic brain. The Three Laws have been debated ad nauseam and taken far more seriously than, I suspect, Asimov himself ever meant them to be. It turns out that they have lots of problems, not least that they doom intelligent, self-aware beings to perpetual slavery. They are also hopelessly simplistic. When researchers tried to get a handle on the problem, they came up with this diagram.


RPA evolves with AI enhancements

#artificialintelligence

Robotic process automation (RPA) has been well received and is making a significant difference to business processes across organisations. At its next level, RPA is being enhanced by artificial intelligence (AI) to transform business smartly. This is according to speakers at a roundtable hosted by UiPath in Cape Town, where executives discussed AI, automation and the future of work. Michael Law, country manager at UiPath, told delegates: "RPA alone was last year. It has transformed areas such as finance and HR. UiPath is now bringing AI and automation together across the organisation."


UM scholar publishes book on regulating artificial intelligence

#artificialintelligence

MACAU, August 24 - Rostam J Neuwirth, head of the Department of Global Legal Studies of the University of Macau (UM) Faculty of Law, has published a new book titled 'The EU Artificial Intelligence Act: Regulating Subliminal AI Systems'. Through exploring legal, ethical, and scientific issues related to artificial intelligence (AI), the book aims to show how cognitive, technological, and legal questions are intrinsically interwoven and to stimulate a transdisciplinary and transnational global debate between students, academics, practitioners, policymakers, and citizens. The book has been published by the British publisher Routledge. It contextualises the future regulation of AI as proposed by the European Union, specifically addressing the regulatory challenges relating to the planned prohibition of the use of AI systems that deploy subliminal techniques to manipulate the human mind and alter human behaviour. Subliminal perception usually refers to perception received below the threshold of awareness, such as images flashed quickly before the eyes or background music embedded with hidden messages, and these external stimuli can affect people without their being aware of it. In this respect, Prof Neuwirth points out that the convergence of AI with various related technologies, such as brain–computer interfaces, functional magnetic resonance imaging, robotics, and big data, already allows for 'mind reading' or 'dream hacking' through brain spyware, as well as other practices that intrude on cognition and the right to freedom of thought.