lola
Falcon: Fast Spectral Inference on Encrypted Data
In the HE-based MLaaS setting, a client encrypts the sensitive data and uploads the encrypted data to the server, which directly processes the encrypted data without decryption and returns the encrypted result to the client. The client's data privacy is preserved since only the client has the private key. Existing HE-enabled Neural Networks (HENNs), however, suffer from heavy computational overheads.
- North America > United States > Washington > King County > Redmond (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
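The workflow described above (client encrypts, server computes on ciphertexts, client decrypts) can be illustrated with a deliberately tiny sketch. This is not the HE scheme Falcon uses; it relies on the multiplicative homomorphism of unpadded RSA with a toy, insecure key, purely to show that a server can operate on data it cannot read.

```python
# Toy sketch of computing on encrypted data. NOT Falcon's scheme and not
# secure: unpadded RSA with a tiny key, used only because it is
# multiplicatively homomorphic, i.e. Enc(a) * Enc(b) mod n = Enc(a*b).

def encrypt(m, e, n):
    return pow(m, e, n)          # client-side: m^e mod n

def decrypt(c, d, n):
    return pow(c, d, n)          # client-side: c^d mod n

# Insecure demo key: n = 61 * 53 = 3233, e = 17, d = 2753.
n, e, d = 3233, 17, 2753

c1 = encrypt(7, e, n)            # client encrypts 7
c2 = encrypt(6, e, n)            # client encrypts 6
c_prod = (c1 * c2) % n           # SERVER: multiplies ciphertexts only,
                                 # never sees 7, 6, or the private key d
result = decrypt(c_prod, d, n)   # client decrypts the returned ciphertext
assert result == 42              # 7 * 6, computed entirely under encryption
```

Real HENN systems use lattice-based schemes that also support encrypted additions and (leveled) multiplications, which is what makes encrypted neural-network inference possible, at the computational cost the abstract mentions.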
Proximal Learning With Opponent-Learning Awareness
Learning With Opponent-Learning Awareness (LOLA) (Foerster et al. [2018a]) is a multi-agent reinforcement learning algorithm that typically learns reciprocity-based cooperation in partially competitive environments. However, LOLA often fails to learn such behaviour on more complex policy spaces parameterized by neural networks, partly because the update rule is sensitive to the policy parameterization. This problem is especially pronounced in the opponent modeling setting, where the opponent's policy is unknown and must be inferred from observations; in such settings, LOLA is ill-specified because behaviorally equivalent opponent policies can result in non-equivalent updates. To address this shortcoming, we reinterpret LOLA as approximating a proximal operator, and then derive a new algorithm, proximal LOLA (POLA), which uses the proximal formulation directly. Unlike LOLA, the POLA updates are parameterization invariant, in the sense that when the proximal objective has a unique optimum, behaviorally equivalent policies result in behaviorally equivalent updates. We then present practical approximations to the ideal POLA update, which we evaluate in several partially competitive environments with function approximation and opponent modeling. This empirically demonstrates that POLA achieves reciprocity-based cooperation more reliably than LOLA.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
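The proximal reinterpretation above can be sketched numerically. A proximal-point update solves an inner minimization, the objective plus a penalty on distance from the current iterate, rather than taking a single gradient step; POLA's key property comes from measuring that distance in behavior (policy-output) space, while the minimal sketch below uses a plain Euclidean proxy on a toy quadratic objective. All names here are illustrative stand-ins, not the paper's implementation.

```python
import numpy as np

# Proximal-point update sketch (assumed reading of the POLA idea):
#   theta_{t+1} = argmin_theta  -J(theta) + (1/(2*eta)) * dist(theta, theta_t)^2
# solved approximately by inner gradient descent. POLA would measure `dist`
# between policy behaviors; here we use parameter-space distance for brevity.

def proximal_update(theta_t, grad_neg_J, eta=1.0, inner_steps=100, lr=0.1):
    theta = theta_t.copy()
    for _ in range(inner_steps):
        # gradient of -J plus gradient of the proximal penalty
        g = grad_neg_J(theta) + (theta - theta_t) / eta
        theta -= lr * g
    return theta

# Toy objective: J(theta) = -||theta - target||^2, so -J is a quadratic bowl.
target = np.array([1.0, -2.0])
grad_neg_J = lambda th: 2.0 * (th - target)

# With theta_t = 0 and eta = 1, the exact proximal step is (2/3) * target.
theta_next = proximal_update(np.zeros(2), grad_neg_J, eta=1.0)
```

Because the update is defined by the minimizer of an objective rather than by a raw gradient in parameter coordinates, two parameterizations inducing the same behavior (and the same behavioral distance) yield the same update, which is the invariance the abstract describes.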
Remarkable robot images provide a vision of the future
Rollin' Justin can avoid obstacles and serve drinks, among other tasks. We have long been fascinated with our own image. In his 1920 play R.U.R., Czech writer Karel Čapek coined the term robot to describe human-looking creatures forced to work in factories. Since then, we have built many humanoid robots that can move and interact with the world in anthropomorphic ways. Award-winning photographer Henrik Spohler at photo agency laif explores such endeavours in his project Tomorrow Is the Question. The main image, above, shows a metallic creation by the German Aerospace Center's Institute of Robotics and Mechatronics in Oberpfaffenhofen.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.06)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.06)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- North America > United States > Indiana (0.04)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > New York > New York County > New York City (0.04)
LoLA: Low-Rank Linear Attention With Sparse Caching
McDermott, Luke, Heath, Robert W. Jr., Parhi, Rahul
Linear attention is an efficient alternative that maintains a constant memory footprint, even on infinite context lengths. While this is a potential candidate for lifelong learning, it falls short in memory capacity. In this paper, we propose LoLA, a training-free augmentation to linear attention that boosts associative recall. LoLA distributes past key-value pairs from context into three memory systems: (i) recent pairs in a local sliding window cache; (ii) difficult-to-memorize pairs in a sparse, global cache; and (iii) generic pairs in the recurrent hidden state of linear attention. We show through ablations that our self-recall error metric is crucial to efficiently manage long-term associative memories. On pass-key retrieval tasks, LoLA improves the base model's performance from 0.6% to 97.4% accuracy. This is achieved with a 4.6× smaller cache than Llama-3.1 8B on 4K context length. LoLA also outperforms other 1B and 8B parameter subquadratic models on zero-shot commonsense reasoning tasks.

Transformer-based large language models (LLMs) rely on storing all past tokens in an ever-growing key-value (KV) cache (Vaswani et al., 2017). This allows future query tokens to access past memories with associative recall, which enables in-context learning (Olsson et al., 2022). Since no previous information is discarded, the KV cache continues to grow with context length. This eventually leads to a memory bottleneck on long context tasks, such as lifelong in-context learning. Alternative architectures to transformers, such as Mamba (Gu & Dao, 2024), DeltaNet (Schlag et al., 2021), linear attention (Katharopoulos et al., 2020), and others (Yang et al., 2024a; Behrouz et al., 2024; Sun et al., 2024), have been proposed to reduce the compute complexity from quadratic to linear. Additionally, these approaches reduce the memory cost from linear to constant. In particular, linear attention removes the exponential dot product in softmax (Katharopoulos et al., 2020).
- North America > United States > California > San Diego County > San Diego (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
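The constant-size recurrent state that LoLA's third memory system builds on can be sketched directly. The recurrence below is plain causal linear attention in the style of Katharopoulos et al. (2020), with the feature map phi(x) = elu(x) + 1; it is the base mechanism the abstract refers to, not LoLA's three-tier cache itself.

```python
import numpy as np

# Minimal causal linear attention sketch: instead of softmax(Q K^T) V with a
# growing KV cache, keep a constant-size state S = sum_t phi(k_t) v_t^T and a
# normalizer z = sum_t phi(k_t). Memory is O(d * d_v), independent of length.

def phi(x):
    # elu(x) + 1, a positive feature map (Katharopoulos et al., 2020)
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    d, d_v = Q.shape[-1], V.shape[-1]
    S = np.zeros((d, d_v))           # recurrent hidden state
    z = np.zeros(d)                  # running normalizer
    out = []
    for q, k, v in zip(Q, K, V):     # one token at a time, causal order
        S += np.outer(phi(k), v)     # O(d * d_v) update, no cache growth
        z += phi(k)
        out.append(phi(q) @ S / (phi(q) @ z + 1e-6))
    return np.stack(out)

rng = np.random.default_rng(0)
T, d = 5, 4
Q, K, V = rng.normal(size=(3, T, d))
Y = linear_attention(Q, K, V)        # shape (T, d): one output per token
```

Compressing every past pair into S is exactly what limits memory capacity; LoLA's sliding-window and sparse global caches exist to hold the recent and hard-to-memorize pairs that this lossy state would otherwise forget.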
Fast Adaptation with Behavioral Foundation Models
Sikchi, Harshit, Tirinzoni, Andrea, Touati, Ahmed, Xu, Yingchen, Kanervisto, Anssi, Niekum, Scott, Zhang, Amy, Lazaric, Alessandro, Pirotta, Matteo
Unsupervised zero-shot reinforcement learning (RL) has emerged as a powerful paradigm for pretraining behavioral foundation models (BFMs), enabling agents to solve a wide range of downstream tasks specified via reward functions in a zero-shot fashion, i.e., without additional test-time learning or planning. This is achieved by learning self-supervised task embeddings alongside corresponding near-optimal behaviors and incorporating an inference procedure to directly retrieve the latent task embedding and associated policy for any given reward function. Despite promising results, zero-shot policies are often suboptimal due to errors induced by the unsupervised training process, the embedding, and the inference procedure. In this paper, we focus on devising fast adaptation strategies to improve the zero-shot performance of BFMs in a few steps of online interaction with the environment while avoiding any performance drop during the adaptation process. Notably, we demonstrate that existing BFMs learn a set of skills containing more performant policies than those identified by their inference procedure, making them well-suited for fast adaptation. Motivated by this observation, we propose both actor-critic and actor-only fast adaptation strategies that search in the low-dimensional task-embedding space of the pre-trained BFM to rapidly improve the performance of its zero-shot policies on any downstream task. Notably, our approach mitigates the initial "unlearning" phase commonly observed when fine-tuning pre-trained RL models. We evaluate our fast adaptation strategies on top of four state-of-the-art zero-shot RL methods in multiple navigation and locomotion domains. Our results show that they achieve 10-40% improvement over their zero-shot performance in a few tens of episodes, outperforming existing baselines.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
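The actor-only strategy described above can be sketched as a derivative-free search in the BFM's low-dimensional task-embedding space: keep the zero-shot embedding as the incumbent and only replace it when a perturbed embedding's policy earns a higher return, which also avoids the performance drop during adaptation. The function names and the toy return landscape below are illustrative stand-ins, not the paper's method.

```python
import numpy as np

# Hedged sketch of actor-only fast adaptation: the pre-trained BFM maps an
# embedding z to a policy pi_z, so instead of fine-tuning network weights we
# search the low-dimensional z-space. `evaluate_return` stands in for rolling
# out pi_z in the environment for one episode.

def fast_adapt(z0, evaluate_return, n_episodes=50, sigma=0.1, rng=None):
    rng = rng or np.random.default_rng(0)
    best_z, best_r = z0, evaluate_return(z0)   # start from the zero-shot policy
    for _ in range(n_episodes):                # a few tens of episodes
        cand = best_z + sigma * rng.normal(size=z0.shape)
        r = evaluate_return(cand)
        if r > best_r:                         # greedy acceptance: never deploy
            best_z, best_r = cand, r           # a worse policy than the incumbent
    return best_z, best_r

# Toy return landscape peaked at z*; the zero-shot embedding (origin) misses it,
# mimicking the suboptimality of the BFM's inference procedure.
z_star = np.array([0.5, -0.3])
ret = lambda z: -np.sum((z - z_star) ** 2)
z_adapted, r_adapted = fast_adapt(np.zeros(2), ret)
```

Because the incumbent is only ever replaced by a strictly better candidate, the deployed return is monotonically non-decreasing, a simple way to realize the "no unlearning phase" property the abstract highlights.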