Plotting

Neural Information Processing Systems

We are grateful for all the reviewers' valuable suggestions and questions. The results are displayed in Figure 1, where we can see that mZAS initialization consistently outperforms Xavier initialization. Our setting follows the cited work (ICLR 2019), but with the top layer set to zero. We will clarify this in the revised version.
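As a rough illustration of the described setup (a standard initialization with the top layer set to zero), a PyTorch sketch might look as follows; the function name, layer sizes, and the use of Xavier initialization for the lower layers are assumptions, since the exact mZAS scheme is not reproduced here.

```python
import torch.nn as nn

def init_with_zero_top_layer(model: nn.Sequential) -> None:
    # Xavier-initialize every linear layer, then zero out the top (final) layer,
    # mirroring the "standard init but with the top layer set to zero" description.
    linear_layers = [m for m in model if isinstance(m, nn.Linear)]
    for layer in linear_layers:
        nn.init.xavier_uniform_(layer.weight)
        nn.init.zeros_(layer.bias)
    nn.init.zeros_(linear_layers[-1].weight)  # top-layer weights start at zero

# Hypothetical toy network; the sizes are arbitrary.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
init_with_zero_top_layer(model)
```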



HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness (Mi Luo, Changan Chen, Kristen Grauman)

Neural Information Processing Systems

We study the problem of precisely swapping objects in videos, with a focus on those interacted with by hands, given one user-provided reference object image. Despite the great advancements that diffusion models have made in video editing recently, these models often fall short in handling the intricacies of hand-object interactions (HOI), failing to produce realistic edits, especially when object swapping results in changes to the object's shape or functionality. To bridge this gap, we present HOI-Swap, a novel diffusion-based video editing framework trained in a self-supervised manner. The framework is designed in two stages. The first stage focuses on object swapping in a single frame with HOI awareness; the model learns to adjust the interaction patterns, such as the hand grasp, based on changes in the object's properties. The second stage extends the single-frame edit across the entire sequence; we achieve controllable motion alignment with the original video by (1) warping a new sequence from the stage-I edited frame based on sampled motion points and (2) conditioning video generation on the warped sequence. Comprehensive qualitative and quantitative evaluations demonstrate that HOI-Swap significantly outperforms existing methods, delivering high-quality video edits with realistic HOIs.
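To make the two-stage design concrete, here is a minimal Python sketch of the control flow described in the abstract. Every callable (edit_frame, sample_motion, warp, generate) is a hypothetical placeholder standing in for the stage-I image editor and stage-II video generator, not the actual HOI-Swap API.

```python
from typing import Any, Callable, Sequence

def hoi_swap_pipeline(
    video: Sequence[Any],
    reference_object: Any,
    edit_frame: Callable[[Any, Any], Any],          # stage-I HOI-aware image editor (placeholder)
    sample_motion: Callable[[Sequence[Any]], Any],  # samples motion points from the source video
    warp: Callable[[Any, Any], Any],                # warps the edited frame along the motion points
    generate: Callable[[Any], Sequence[Any]],       # stage-II video model conditioned on the warp
    anchor_idx: int = 0,
) -> Sequence[Any]:
    """Two-stage flow from the abstract: (1) HOI-aware single-frame swap,
    (2) warp-and-condition to extend the edit across the whole sequence."""
    edited_frame = edit_frame(video[anchor_idx], reference_object)  # stage I
    motion_points = sample_motion(video)                            # stage II: motion cues
    warped_sequence = warp(edited_frame, motion_points)             # coarse, motion-aligned guess
    return generate(warped_sequence)                                # final edited video
```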


Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control

Neural Information Processing Systems

Multi-agent reinforcement learning (MARL) has recently received considerable attention due to its applicability to a wide range of real-world applications. However, achieving efficient communication among agents has always been an overarching problem in MARL. In this work, we propose Variance Based Control (VBC), a simple yet efficient technique to improve communication efficiency in MARL. By limiting the variance of the messages exchanged between agents during the training phase, the noisy component of the messages can be eliminated effectively, while the useful part can be preserved and utilized by the agents for better performance. Our evaluation on multiple MARL benchmarks indicates that our method achieves 2x-10x lower communication overhead than state-of-the-art MARL algorithms, while allowing agents to achieve better overall performance.
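The core mechanism, penalizing message variance during training and gating communication at execution time, can be sketched as follows. This is a minimal PyTorch illustration of the idea only; the penalty weight, threshold, and tensor shapes are assumptions, not values from the VBC paper.

```python
import torch

def message_variance_penalty(messages: torch.Tensor, weight: float = 0.1) -> torch.Tensor:
    # Training-time regularizer: penalize the per-dimension variance of the
    # messages an agent emits across a batch, so only low-noise content survives.
    # `messages` is assumed to have shape (batch, message_dim).
    return weight * messages.var(dim=0).sum()

def should_communicate(local_q_values: torch.Tensor, threshold: float) -> bool:
    # Execution-time gate: exchange messages only when the agent's local
    # action-value estimates are ambiguous (their variance exceeds a threshold).
    return bool(local_q_values.var() > threshold)
```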


14cfdb59b5bda1fc245aadae15b1984a-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their insightful comments. We will incorporate the feedback and suggestions into the next revision of the paper. A: The messages exchanged between the agents generally convey agent status information (location, health status, etc.). Over time, the communication level gradually decreases as the agents move to the right positions (steps 250 and 430). We can also design similar experiments to infer the meaning of other types of messages. A: VBC is most beneficial to multi-agent systems that require quick decision making and low communication overhead.


A GAN Variants

Neural Information Processing Systems

We apply top-k training to all of the following GAN variants: DC-GAN [38]: A simple, widely used architecture that uses convolutions and deconvolutions for the critic and the generator. WGAN with Gradient Clipping [1]: Attempts to use an approximate Wasserstein distance as the critic loss by clipping the weights of the critic to bound the gradient of the critic with respect to its inputs. Mode-Seeking GAN [29]: Attempts to generate more diverse images by selecting more samples from under-represented modes of the target distribution. Spectral Normalization GAN [32]: Replaces the gradient penalty with a (loose) bound on the spectral norm of the weight matrices of the critic. The 'Held-out Digit' is the digit that was held out of the training set and treated as the 'anomaly' class.
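For reference, applying top-k training to any of these variants amounts to discarding all but the highest-scoring generated samples in the generator update. A minimal PyTorch sketch is shown below; the loss sign convention is an assumption and would change with the specific GAN variant.

```python
import torch

def topk_generator_loss(critic_scores: torch.Tensor, k: int) -> torch.Tensor:
    # Keep only the k generated samples the critic rates highest and ignore the
    # rest when updating the generator; `critic_scores` holds the critic outputs
    # for a batch of fake samples (shape: (batch,)).
    top_scores, _ = torch.topk(critic_scores, k)
    # Sign convention is an assumption; the exact generator loss depends on the variant.
    return -top_scores.mean()
```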


Causal Temporal Representation Learning with Nonstationary Sparse Transition (Xiangchen Song, Zijian Li, Guangyi Chen)

Neural Information Processing Systems

Causal Temporal Representation Learning (Ctrl) methods aim to identify the temporal causal dynamics of complex nonstationary temporal sequences. Despite the success of existing Ctrl methods, they require either directly observing the domain variables or assuming a Markov prior on them. Such requirements limit the application of these methods in real-world scenarios when we do not have such prior knowledge of the domain variables. To address this problem, this work adopts a sparse transition assumption, aligned with intuitive human understanding, and presents identifiability results from a theoretical perspective. In particular, we explore under what conditions on the significance of the variability of the transitions we can build a model to identify the distribution shifts. Based on the theoretical result, we introduce a novel framework, Causal Temporal Representation Learning with Nonstationary Sparse Transition (CtrlNS), designed to leverage the constraints on transition sparsity and conditional independence to reliably identify both distribution shifts and latent factors. Our experimental evaluations on synthetic and real-world datasets demonstrate significant improvements over existing baselines, highlighting the effectiveness of our approach.
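As a loose illustration of the transition-sparsity idea (not the actual CtrlNS architecture), one could penalize the L1 norm of a learned latent transition, as in the following PyTorch sketch; the linear-plus-tanh transition and the penalty weight are assumptions made for the example.

```python
import torch
import torch.nn as nn

class SparseTransition(nn.Module):
    """Toy latent transition z_{t+1} = tanh(W z_t) with an L1 penalty on W that
    encourages a sparse dependency structure between latent dimensions."""

    def __init__(self, latent_dim: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(latent_dim, latent_dim) * 0.01)

    def forward(self, z_t: torch.Tensor) -> torch.Tensor:
        # z_t: (batch, latent_dim) -> next latent state, same shape
        return torch.tanh(z_t @ self.weight.T)

    def sparsity_penalty(self, weight: float = 1e-3) -> torch.Tensor:
        # Added to the training loss to favor sparse transitions.
        return weight * self.weight.abs().sum()
```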


Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf

Neural Information Processing Systems

Communication is a fundamental aspect of human society, facilitating the exchange of information and beliefs among people. Despite the advancements in large language models (LLMs), recent agents built with these models often neglect control over discussion tactics, which are essential in communication scenarios and games. As a variant of the famous communication game Werewolf, One Night Ultimate Werewolf (ONUW) requires players to develop strategic discussion policies due to the potential role changes that increase the uncertainty and complexity of the game. In this work, we first present the existence of Perfect Bayesian Equilibria (PBEs) in two scenarios of the ONUW game: one with discussion and one without. The results showcase that discussion greatly changes players' utilities by affecting their beliefs, emphasizing the significance of discussion tactics. Based on the insights obtained from these analyses, we propose an RL-instructed language agent framework, where a discussion policy trained by reinforcement learning (RL) is employed to determine appropriate discussion tactics to adopt. Our experimental results on several ONUW game settings demonstrate the effectiveness and generalizability of our proposed framework.
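A minimal sketch of such an RL-instructed turn is given below; the tactic names, prompt format, and the policy/LLM interfaces are illustrative assumptions, not the framework's actual implementation.

```python
import random
from typing import Callable, Sequence

# Hypothetical tactic set; the paper's actual tactic space and prompts differ.
TACTICS = ["honest evidence", "deceptive claim", "evasive deflection"]

def rl_instructed_turn(
    observation: str,
    policy: Callable[[str], Sequence[float]],  # RL-trained discussion policy -> tactic probabilities
    llm_generate: Callable[[str], str],        # any text generator used for the utterance
) -> str:
    probs = policy(observation)                # one probability per entry in TACTICS
    tactic = random.choices(TACTICS, weights=probs, k=1)[0]
    prompt = f"Game state: {observation}\nAdopt the tactic '{tactic}' and speak."
    return llm_generate(prompt)
```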



A Appendix

Neural Information Processing Systems

Define a dual space Ḡ: a set of functions ḡ_x: MAJ(H) → Y defined as ḡ_x(f) = f(x) for each f = MAJ(h_1, ..., h_k) ∈ MAJ(H) and each x ∈ X. Similarly, define another dual space G: a set of functions g_x: H → Y defined as g_x(h) = h(x) for each h ∈ H and each x ∈ X. We begin by describing the construction of the adversary U. Let m ∈ N. For any point x ∈ X \ Z that is remaining, define U(x) = {}. Let B be an arbitrary reduction algorithm, and let ε > 0 be the error requirement. We will now describe the construction of the target class C. The target class C will be constructed randomly. We now state a few properties of the randomly constructed target class C that we will use in the remainder of the proof.
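For readability, the two dual constructions can be restated in display form; this is a reconstruction under the assumption, standard for dual classes, that the dual functions are indexed by points x.

```latex
\[
\bar{\mathcal{G}} = \{\, \bar{g}_x : \mathrm{MAJ}(\mathcal{H}) \to \mathcal{Y} \mid x \in \mathcal{X} \,\},
\qquad \bar{g}_x(f) = f(x) \ \text{for } f = \mathrm{MAJ}(h_1, \dots, h_k),
\]
\[
\mathcal{G} = \{\, g_x : \mathcal{H} \to \mathcal{Y} \mid x \in \mathcal{X} \,\},
\qquad g_x(h) = h(x) \ \text{for each } h \in \mathcal{H} \text{ and } x \in \mathcal{X}.
\]
```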