propagate
ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields
We introduce ViCA-NeRF, the first view-consistency-aware method for 3D editing with text instructions. In addition to the implicit neural radiance field (NeRF) modeling, our key insight is to exploit two sources of regularization that explicitly propagate the editing information across different views, thus ensuring multi-view consistency. For geometric regularization, we leverage the depth information derived from NeRF to establish image correspondences between different views. For learned regularization, we align the latent codes in the 2D diffusion model between edited and unedited images, enabling us to edit key views and propagate the update throughout the entire scene. Incorporating these two strategies, our ViCA-NeRF operates in two stages. In the initial stage, we blend edits from different views to create a preliminary 3D edit. This is followed by a second stage of NeRF training, dedicated to further refining the scene's appearance. Experimental results demonstrate that ViCA-NeRF provides more flexible, efficient (3 times faster) editing with higher levels of consistency and details, compared with the state of the art.
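The depth-based correspondence mentioned in the abstract can be illustrated with a standard pinhole reprojection. This is a minimal sketch, not the authors' implementation: the function name `reproject`, the intrinsics tuple `K = (fx, fy, cx, cy)`, and the pose convention `p_b = R p_a + t` are all assumptions made for illustration.

```python
def reproject(u, v, depth, K, R, t):
    """Map pixel (u, v) with known depth in view A to pixel coordinates in view B.

    K = (fx, fy, cx, cy) are assumed shared pinhole intrinsics; R (3x3 nested
    lists) and t (length 3) are an assumed rigid transform from A to B.
    """
    fx, fy, cx, cy = K
    # Unproject the pixel to a 3D point in camera A's frame.
    p_a = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # Move the point into camera B's frame: p_b = R @ p_a + t.
    xb, yb, zb = (sum(R[i][j] * p_a[j] for j in range(3)) + t[i] for i in range(3))
    if zb <= 0:
        return None  # the point lies behind camera B, so there is no correspondence
    # Project back onto B's image plane.
    return (fx * xb / zb + cx, fy * yb / zb + cy)
```

With an identity pose the pixel maps to itself; a sideways translation `tx` shifts the column by `fx * tx / depth`, which is why accurate depth matters for carrying an edit from one view to another.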
Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters
How does the uncertainty of the value function propagate when performing temporal difference learning? In this paper, we address this question by proposing a Bayesian framework in which we employ approximate posterior distributions to model the uncertainty of the value function and Wasserstein barycenters to propagate it across state-action pairs. Leveraging these tools, we present an algorithm, Wasserstein Q-Learning (WQL), first in the tabular case, and then show how it can be extended to continuous domains. Furthermore, we prove that, under mild assumptions, a slight variation of WQL enjoys desirable theoretical properties in the tabular setting. Finally, we present an experimental campaign to show the effectiveness of WQL on finite problems, compared to several RL algorithms, some of which are specifically designed for exploration, along with some preliminary results on Atari games.
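To make the notion of a Wasserstein barycenter concrete, the sketch below uses the one-dimensional Gaussian case, where the 2-Wasserstein barycenter has a closed form: its mean and standard deviation are the weighted averages of the component means and standard deviations. It is not the WQL update itself; the function names and the choice of Gaussian posteriors are assumptions for illustration.

```python
import math

def gaussian_w2_barycenter(components):
    """2-Wasserstein barycenter of 1D Gaussians.

    components: list of (weight, mean, std) triples; weights are normalized here.
    Returns the (mean, std) of the barycenter, which is again Gaussian.
    """
    total = sum(w for w, _, _ in components)
    mean = sum(w * m for w, m, _ in components) / total
    std = sum(w * s for w, _, s in components) / total
    return mean, std

def w2_distance(a, b):
    """2-Wasserstein distance between two 1D Gaussians given as (mean, std)."""
    (m1, s1), (m2, s2) = a, b
    return math.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)
```

Unlike a mixture of the two posteriors, the barycenter stays in the Gaussian family, so uncertainty can be carried from one estimate to the next without growing the representation.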
Learning to Propagate for Graph Meta-Learning
Meta-learning extracts the common knowledge from learning different tasks and uses it for unseen tasks. It can significantly improve tasks that suffer from insufficient training data, e.g., few-shot learning. In most meta-learning methods, tasks are implicitly related by sharing parameters or an optimizer. In this paper, we show that a meta-learner that explicitly relates tasks on a graph describing the relations of their output dimensions (e.g., classes) can significantly improve few-shot learning. The graph's structure is usually free or cheap to obtain but has rarely been explored in previous works. We develop a novel meta-learner of this type for prototype-based classification, in which a prototype is generated for each class, such that the nearest neighbor search among the prototypes produces an accurate classification. The meta-learner, called "Gated Propagation Network (GPN)", learns to propagate messages between prototypes of different classes on the graph, so that learning the prototype of each class benefits from the data of other related classes.
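A minimal sketch of prototype-based classification with one propagation step over a class graph is given below. It is only an illustration: GPN learns its gates and attention weights, whereas the hypothetical `propagate` function here uses a fixed mixing weight `mix`, and all function names are assumptions.

```python
def class_prototypes(samples):
    """samples: dict mapping label -> list of feature vectors. Returns label -> mean vector."""
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in samples.items()}

def propagate(protos, graph, mix=0.5):
    """One message-passing step: blend each prototype with the mean of its graph neighbors.

    The fixed weight `mix` stands in for a learned gate (an assumption of this sketch).
    """
    out = {}
    for label, p in protos.items():
        nbrs = [protos[n] for n in graph.get(label, []) if n in protos]
        if not nbrs:
            out[label] = list(p)  # isolated classes keep their prototype
            continue
        msg = [sum(col) / len(nbrs) for col in zip(*nbrs)]
        out[label] = [(1 - mix) * a + mix * b for a, b in zip(p, msg)]
    return out

def classify(x, protos):
    """Nearest-prototype classification under squared Euclidean distance."""
    return min(protos, key=lambda label: sum((a - b) ** 2 for a, b in zip(x, protos[label])))
```

In a few-shot setting the per-class means are noisy, so pulling each prototype toward those of related classes is one way the graph can compensate for scarce data.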
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
Faster Certified Symmetry Breaking Using Orders With Auxiliary Variables
Anders, Markus, Bogaerts, Bart, Bogø, Benjamin, Gontier, Arthur, Koops, Wietze, McCreesh, Ciaran, Myreen, Magnus O., Nordström, Jakob, Oertel, Andy, Rebola-Pardo, Adrian, Tan, Yong Kiam
Symmetry breaking is a crucial technique in modern combinatorial solving, but it is difficult to be sure it is implemented correctly. The most successful approach to deal with bugs is to make solvers certifying, so that they output not just a solution, but also a mathematical proof of correctness in a standard format, which can then be checked by a formally verified checker. This requires justifying symmetry reasoning within the proof, but developing efficient methods for this has remained a long-standing open challenge. A fully general approach was recently proposed by Bogaerts et al. (2023), but it relies on encoding lexicographic orders with big integers, which quickly becomes infeasible for large symmetries. In this work, we develop a method for instead encoding orders with auxiliary variables. We show that this leads to orders-of-magnitude speed-ups in both theory and practice by running experiments on proof logging and checking for SAT symmetry breaking using the state-of-the-art satsuma symmetry breaker and the VeriPB proof checking toolchain.
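The contrast between the two kinds of encoding can be sketched on Boolean vectors. The code below is an illustration only and does not reproduce the VeriPB proof format or the paper's actual encoding: `lex_leq_big_integer` checks the order with a single inequality whose coefficients double at each position, while `lex_leq_auxiliary` tracks a hypothetical prefix-equality bit so that every local check involves only small coefficients.

```python
from itertools import product

def lex_leq_big_integer(x, y):
    """x <=_lex y as one inequality with coefficients 2^(n-1), ..., 2, 1.

    The largest coefficient grows exponentially with the vector length n.
    """
    n = len(x)
    return sum((1 << (n - 1 - i)) * (y[i] - x[i]) for i in range(n)) >= 0

def lex_leq_auxiliary(x, y):
    """Same order via an auxiliary bit eq = 'x and y agree on all earlier positions'."""
    eq = True  # the empty prefix is trivially equal
    for xi, yi in zip(x, y):
        # Local constraint: while the prefixes agree, require x_i <= y_i.
        if eq and xi > yi:
            return False
        eq = eq and (xi == yi)
    return True

def encodings_agree(n):
    """Exhaustively compare the two checks on all pairs of length-n Boolean vectors."""
    vectors = list(product((0, 1), repeat=n))
    return all(lex_leq_big_integer(x, y) == lex_leq_auxiliary(x, y)
               for x in vectors for y in vectors)
```

For a symmetry touching thousands of variables, the single-inequality form needs coefficients thousands of bits long, whereas the chained form adds one auxiliary variable per position.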
- Europe > Austria > Vienna (0.14)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Backdoor Attacks Against Speech Language Models
Fortier, Alexandrine, Thebaud, Thomas, Villalba, Jesús, Dehak, Najim, Cardinal, Patrick
Large Language Models (LLMs) and their multimodal extensions are becoming increasingly popular. One common approach to enable multimodality is to cascade domain-specific encoders with an LLM, making the resulting model inherit vulnerabilities from all of its components. In this work, we present the first systematic study of audio backdoor attacks against speech language models. We demonstrate its effectiveness across four speech encoders and three datasets, covering four tasks: automatic speech recognition (ASR), speech emotion recognition, and gender and age prediction. The attack consistently achieves high success rates, ranging from 90.76% to 99.41%. To better understand how backdoors propagate, we conduct a component-wise analysis to identify the most vulnerable stages of the pipeline. Finally, we propose a fine-tuning-based defense that mitigates the threat of poisoned pretrained encoders. Large language models (LLMs) are increasingly extended to multimodal settings, processing combinations of text, images, video, and audio (DeepMind, 2023; Biadsy et al., 2023; Radford et al., 2021; Rajaa & Tushar, 2024). While powerful, these systems inherit vulnerabilities from each of their components. Among them are backdoor attacks, in which a model behaves normally on clean inputs but produces targeted outputs when a hidden trigger is present (Gu et al., 2017). Prior backdoor studies have largely focused on single-modality large language models (Xu et al., 2023; Yao et al., 2024) or speech processing models (Zhai et al., 2021; Koffas et al., 2022), leaving open questions about how such attacks propagate in a cascaded speech language model.
- North America > United States (0.04)
- North America > Canada (0.04)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Cooperation Under Network-Constrained Communication
Mordo, Tommy, Madmon, Omer, Tennenholtz, Moshe
In this paper, we study cooperation in distributed games under network-constrained communication. Building on the framework of Monderer and Tennenholtz (1999), we derive a sufficient condition for cooperative equilibrium in settings where communication between agents is delayed by the underlying network topology. Each player deploys an agent at every location, and local interactions follow a Prisoner's Dilemma structure. The condition depends on the network diameter and the number of locations, and we analyze the extreme cases of instantaneous, delayed, and proportionally delayed communication. We also discuss the asymptotic case of scale-free communication networks, in which the network diameter grows sub-linearly in the number of locations. These insights clarify how communication latency and network design jointly determine the emergence of distributed cooperation.
- Asia > Middle East > Israel > Haifa District > Haifa (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)