Echoes of the past: A unified perspective on fading memory and echo states
Ortega, Juan-Pablo, Rossmannek, Florian
Recurrent neural networks (RNNs) have become increasingly popular in information processing tasks involving time series and temporal data. A fundamental property of RNNs is their ability to create reliable input/output responses, often linked to how the network handles its memory of the information it processed. Various notions have been proposed to conceptualize the behavior of memory in RNNs, including steady states, echo states, state forgetting, input forgetting, and fading memory. Although these notions are often used interchangeably, their precise relationships remain unclear. This work aims to unify these notions in a common language, derive new implications and equivalences between them, and provide alternative proofs to some existing results. By clarifying the relationships between these concepts, this research contributes to a deeper understanding of RNNs and their temporal information processing capabilities.
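The echo-state and state-forgetting notions surveyed above can be illustrated numerically: drive the same recurrent map from two different initial states with one shared input sequence and watch the state difference wash out. A minimal sketch, assuming a generic tanh reservoir with its weight matrix scaled to spectral norm below one (an illustrative choice, not a construction from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50                                   # reservoir size (illustrative)
W = rng.standard_normal((n, n))
W *= 0.9 / np.linalg.norm(W, 2)          # spectral norm 0.9 < 1: contractive map
w_in = rng.standard_normal(n)

def step(x, u):
    # recurrent update driven by a scalar input u
    return np.tanh(W @ x + w_in * u)

x_a = rng.uniform(-1, 1, n)              # two different initial states...
x_b = rng.uniform(-1, 1, n)
inputs = rng.uniform(-1, 1, 200)         # ...driven by one shared input sequence

for u in inputs:
    x_a, x_b = step(x_a, u), step(x_b, u)

# Echo state / state forgetting: the two trajectories have converged, so the
# response depends only on the input history, not on the initial condition.
print(np.linalg.norm(x_a - x_b))         # effectively 0 (below 1e-6)
```

Because tanh is 1-Lipschitz, the spectral-norm scaling makes each step a 0.9-contraction, so the gap shrinks at least geometrically regardless of the input sequence.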
Coherence influx is indispensable for quantum reservoir computing
Kobayashi, Shumpei, Tran, Quoc Hoan, Nakajima, Kohei
The echo state property (ESP) is a fundamental property that allows an input-driven dynamical system to perform information processing tasks. Recently, extensions of ESP to potentially nonstationary systems and subsystems, namely nonstationary ESP and subset/subspace ESP, have been proposed. In this paper, we theoretically and numerically analyze the sufficient and necessary conditions for a quantum system to satisfy nonstationary ESP and subset/subspace nonstationary ESP. Based on extensive use of the Pauli transfer matrix (PTM) form, we find that (1) interaction with a quantum-coherent environment, termed \textit{coherence influx}, is indispensable for realizing nonstationary ESP, and (2) the spectral radius of the PTM can characterize the fading memory property of quantum reservoir computing (QRC). Our numerical experiment, involving a system with a Hamiltonian that entails a spin-glass/many-body localization phase, reveals that the spectral radius of the PTM can describe the dynamical phase transition intrinsic to such a system. To comprehensively understand the mechanisms underlying the ESP of QRC, we propose a simplified model, multiplicative reservoir computing (mRC), a reservoir computing (RC) system with a one-dimensional multiplicative input. Theoretically and numerically, we show that the parameters corresponding to the spectral radius and coherence influx in mRC directly correlate with its linear memory capacity (MC). Our findings on QRC and mRC provide a theoretical perspective on the PTM and on the input multiplicativity of QRC, leading to a better understanding of QRC and information processing in open quantum systems.
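The role of an influx term can be illustrated with a toy scalar recurrence. The exact form below, x_{t+1} = a·u_t·x_t + c, is our own assumption for a "one-dimensional multiplicative input" with a constant influx, not the mRC model as defined in the paper; it merely shows why, in a multiplicative system, some influx is needed for the state to carry input information at all:

```python
import numpy as np

rng = np.random.default_rng(1)

def mrc(x0, inputs, a, c):
    # Assumed scalar recurrence x_{t+1} = a*u_t*x_t + c, where a plays the
    # role of the spectral-radius-like parameter and c the constant influx.
    x = x0
    for u in inputs:
        x = a * u * x + c
    return x

inputs = rng.uniform(0.0, 1.0, 300)

# Without influx (c = 0) the state decays to zero: it forgets the initial
# condition, but it also carries no information about the input.
print(mrc(5.0, inputs, a=0.8, c=0.0))   # ~0

# With influx (c > 0) different initial states still converge (echo state
# property), yet the common limit now encodes products of recent inputs,
# i.e. a nontrivial fading memory of the input history.
xa = mrc(5.0, inputs, a=0.8, c=0.5)
xb = mrc(-3.0, inputs, a=0.8, c=0.5)
print(abs(xa - xb), xa)                 # gap ~0, while xa stays > 0
```

In this toy picture the product |a·u_t| ≤ 0.8 contracts any initial-state difference, while the influx keeps re-injecting a signal whose decaying product weights realize fading memory.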
Parm: Efficient Training of Large Sparsely-Activated Models with Dedicated Schedules
Pan, Xinglin, Lin, Wenxiang, Shi, Shaohuai, Chu, Xiaowen, Sun, Weinong, Li, Bo
Sparsely-activated Mixture-of-Expert (MoE) layers have found practical applications in enlarging the model size of large-scale foundation models, with only a sub-linear increase in computation demands. Despite the wide adoption of hybrid parallel paradigms like model parallelism, expert parallelism, and expert-sharding parallelism (i.e., MP+EP+ESP) to support MoE model training on GPU clusters, the training efficiency is hindered by communication costs introduced by these parallel paradigms. To address this limitation, we propose Parm, a system that accelerates MP+EP+ESP training by designing two dedicated schedules for placing communication tasks. The proposed schedules eliminate redundant computations and communications and enable overlaps between intra-node and inter-node communications, ultimately reducing the overall training time. As the two schedules are not mutually exclusive, we provide comprehensive theoretical analyses and derive an automatic and accurate solution to determine which schedule should be applied in different scenarios. Experimental results on an 8-GPU server and a 32-GPU cluster demonstrate that Parm outperforms the state-of-the-art MoE training system, DeepSpeed-MoE, achieving 1.13$\times$ to 5.77$\times$ speedup on 1296 manually configured MoE layers and approximately 3$\times$ improvement on two real-world MoE models based on BERT and GPT-2.
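The sparse activation that makes this sub-linear scaling possible comes from routing each token to only a few experts. A minimal single-device sketch of generic top-k gating (the shapes, linear experts, and softmax-over-selected-logits gating are common MoE conventions for illustration, not Parm's or DeepSpeed-MoE's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 16, 8, 2                 # hidden dim, experts, top-k (illustrative)
tokens = rng.standard_normal((32, d))      # a batch of 32 token embeddings
W_gate = rng.standard_normal((d, n_experts))
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]  # toy linear experts

def moe_layer(x):
    logits = x @ W_gate
    top = np.argsort(logits, axis=1)[:, -k:]       # k chosen experts per token
    sel = np.take_along_axis(logits, top, axis=1)
    gate = np.exp(sel - sel.max(axis=1, keepdims=True))
    gate /= gate.sum(axis=1, keepdims=True)        # softmax over the chosen experts
    out = np.zeros_like(x)
    for e in range(n_experts):                     # dispatch, compute, combine
        rows, slots = np.nonzero(top == e)
        if rows.size:
            out[rows] += gate[rows, slots][:, None] * (x[rows] @ experts[e])
    return out

y = moe_layer(tokens)
print(y.shape)  # (32, 16)
```

In a distributed setting the dispatch and combine steps in the loop become the all-to-all communications whose cost and placement systems like Parm optimize.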
LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism
Wu, Bingyang, Liu, Shengyu, Zhong, Yinmin, Sun, Peng, Liu, Xuanzhe, Jin, Xin
The context window of large language models (LLMs) is rapidly increasing, leading to a huge variance in resource usage between different requests as well as between different phases of the same request. Restricted by static parallelism strategies, existing LLM serving systems cannot efficiently utilize the underlying resources to serve variable-length requests in different phases. To address this problem, we propose a new parallelism paradigm, elastic sequence parallelism (ESP), to elastically adapt to the variance between different requests and phases. Based on ESP, we design and build LoongServe, an LLM serving system that (1) improves computation efficiency by elastically adjusting the degree of parallelism in real time, (2) improves communication efficiency by reducing key-value cache migration overhead and overlapping partial decoding communication with computation, and (3) improves GPU memory efficiency by reducing key-value cache fragmentation across instances. Our evaluation under diverse real-world datasets shows that LoongServe improves the maximum throughput by up to 3.85$\times$ compared to chunked prefill and 5.81$\times$ compared to prefill-decoding disaggregation.
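Sequence parallelism can be made elastic because attention over a sharded key-value cache decomposes into partial results that merge exactly, so the number of instances serving a request can change without altering its output. A toy single-query sketch of that exact merge (a generic log-sum-exp combination, not LoongServe's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, seq = 8, 1000
q = rng.standard_normal(d)
K = rng.standard_normal((seq, d))
V = rng.standard_normal((seq, d))

def shard_attention(q, K, V, parts):
    # Each "instance" holds a contiguous shard of the KV cache and returns a
    # partial softmax numerator/denominator plus its local max for stability;
    # the partials merge exactly, so the degree of parallelism is free to vary.
    partials = []
    for Ks, Vs in zip(np.array_split(K, parts), np.array_split(V, parts)):
        s = Ks @ q
        m = s.max()
        w = np.exp(s - m)
        partials.append((m, w.sum(), w @ Vs))
    m_all = max(m for m, _, _ in partials)
    denom = sum(z * np.exp(m - m_all) for m, z, _ in partials)
    numer = sum(nv * np.exp(m - m_all) for m, _, nv in partials)
    return numer / denom

# Identical attention output regardless of how many instances serve it:
out2 = shard_attention(q, K, V, parts=2)
out8 = shard_attention(q, K, V, parts=8)
print(np.allclose(out2, out8))  # True
```

The remaining engineering questions, which the paper addresses, are when to re-partition and how to move or avoid moving the cached K/V shards cheaply.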
Hierarchy of the echo state property in quantum reservoir computing
Kobayashi, Shumpei, Tran, Quoc Hoan, Nakajima, Kohei
The echo state property (ESP) is a fundamental concept in the reservoir computing (RC) framework that ensures output-only training of reservoir networks by making them agnostic to initial states and far-past inputs. However, the traditional definition of ESP does not cover possibly non-stationary systems whose statistical properties evolve. To address this issue, we introduce two new categories of ESP: $\textit{non-stationary ESP}$, designed for potentially non-stationary systems, and $\textit{subspace/subset ESP}$, designed for systems whose subsystems have ESP. Following these definitions, we numerically demonstrate, using non-linear autoregressive moving-average (NARMA) tasks, the correspondence between non-stationary ESP and typical Hamiltonian dynamics and input-encoding methods in the quantum reservoir computer (QRC) framework. We also confirm this correspondence by computing linear/non-linear memory capacities that quantify the input-dependent components of the reservoir states. Our study presents a new understanding of the practical design of QRC and of other possibly non-stationary RC systems in which non-stationary systems and subsystems are exploited.
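The NARMA tasks used in such demonstrations are standard nonlinear benchmarks with long input dependencies. The classic 10th-order variant can be generated as follows; the recurrence and the [0, 0.5] input range are the commonly used conventions for this benchmark, not specifics taken from the paper:

```python
import numpy as np

def narma10(u):
    # Classic 10th-order NARMA recurrence (the commonly used benchmark form):
    #   y[t+1] = 0.3*y[t] + 0.05*y[t]*sum(y[t-9..t]) + 1.5*u[t-9]*u[t] + 0.1
    y = np.zeros(len(u))
    for t in range(9, len(u) - 1):
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * y[t - 9:t + 1].sum()
                    + 1.5 * u[t - 9] * u[t]
                    + 0.1)
    return y

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 0.5, 500)   # the customary input range for NARMA tasks
y = narma10(u)
print(y[:12])                    # zeros during warm-up, then the driven series
```

A reservoir is then trained (via linear readout) to predict y from u; because y[t+1] depends on inputs ten steps back, the task directly probes the fading memory the ESP notions formalize.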
Forget Midjourney, AI is being used to figure out animal languages
Many of us talk to our pets as if they were people, as if they could understand us and we could understand them. Artificial intelligence could one day make that a real possibility. According to reports from the World Economic Forum, scientists are working to decode animal languages to help advance humanity's conservation and sustainability efforts. AI has seen many improvements over the past several years, including the introduction of systems that can create pieces of art, as well as chatbots that can carry on conversations with the people interacting with them.
🇺🇸 Machine learning job: Senior AI Research Scientist at Earth Species Project (work from anywhere!)
Senior AI Research Scientist at Earth Species Project Remote › Worldwide, 100% remote position (Posted Aug 3 2022) Job description The Earth Species Project (ESP) is a nonprofit organization dedicated to decoding animal communication and translating non-human language. ESP partners with biologists and machine learning researchers at universities and institutions around the world and we are honored to be supported by many forward-looking philanthropists and groups, including the Internet Archive, TED Audacious 2020, and the entrepreneur and author Reid Hoffman. Our work has been featured on NPR's Invisibilia documentary, "Two Heart Beats a Minute," "How to Talk to Animals" in Wall Street Journal's The Future of Everything, "The Challenges of Animal Translation" in the New Yorker, published in Scientific Reports, and was honored at the inaugural Anthem Awards. We aim to enable every person to more deeply understand our co-inhabitants on Earth and in doing so, to permanently alter human perspective and culture. Purpose of role You will join an incredible and global remote team, and will be responsible for developing pioneering research towards decoding and translating non-human communication, including extending unsupervised translation techniques and tackling cornerstone biological and computational problems on large-scale multimodal behavioral datasets.
Can artificial intelligence really help us talk to the animals?
A dolphin handler makes the signal for "together" with her hands, followed by "create". The two trained dolphins disappear underwater, exchange sounds and then emerge, flip on to their backs and lift their tails. They have devised a new trick of their own and performed it in tandem, just as requested. "It doesn't prove that there's language," says Aza Raskin. "But it certainly makes a lot of sense that, if they had access to a rich, symbolic way of communicating, that would make this task much easier."