A Weighted Product of Gaussians

Neural Information Processing Systems

A well-known result is that a product of Gaussian PDFs collapses to a scaled Gaussian PDF. Every neuron in a G-GLN takes one or more Gaussian PDFs as input and produces a Gaussian PDF as output. This raises the question of what input to provide to neurons in the first layer, i.e. the base prediction. We consider three solutions: (1) None. The input sufficient statistics to each neuron are already concatenated with so-called "bias" Gaussians to ensure that the target mean falls within the convex hull defined by the input means (described in Section 3).
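The collapse result above has a simple closed form: a weighted product (geometric mixture) of Gaussian PDFs is proportional to a Gaussian whose precision is the weighted sum of input precisions. A minimal sketch (the function name and unit weights are illustrative, not the paper's API):

```python
import numpy as np

def weighted_gaussian_product(mus, sigmas2, weights):
    """Closed form for prod_i N(x; mu_i, sigma_i^2)^{w_i}.

    The product is proportional to N(x; mu, sigma^2) with
    precision 1/sigma^2 = sum_i w_i / sigma_i^2 and
    mean mu = sigma^2 * sum_i w_i * mu_i / sigma_i^2.
    """
    mus, sigmas2, weights = map(np.asarray, (mus, sigmas2, weights))
    precision = np.sum(weights / sigmas2)
    sigma2 = 1.0 / precision
    mu = sigma2 * np.sum(weights * mus / sigmas2)
    return mu, sigma2

# Two unit-variance, unit-weight Gaussians at 0 and 2:
mu, s2 = weighted_gaussian_product([0.0, 2.0], [1.0, 1.0], [1.0, 1.0])
# mu == 1.0 (precision-weighted mean), s2 == 0.5
```

Each G-GLN neuron applies exactly this kind of precision-weighted combination to its inputs, with the weights selected by gating.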


Gaussian Gated Linear Networks. David Budden, Adam H. Marblestone, Tor Lattimore, Greg Wayne

Neural Information Processing Systems

We propose the Gaussian Gated Linear Network (G-GLN), an extension to the recently proposed GLN family of deep neural networks. Instead of using backpropagation to learn features, GLNs have a distributed and local credit assignment mechanism based on optimizing a convex objective. This gives rise to many desirable properties including universality, data-efficient online learning, trivial interpretability and robustness to catastrophic forgetting. We extend the GLN framework from classification to multiple regression and density modelling by generalizing geometric mixing to a product of Gaussian densities. The G-GLN achieves competitive or state-of-the-art performance on several univariate and multivariate regression benchmarks, and we demonstrate its applicability to practical tasks including online contextual bandits and density estimation via denoising.


DAC: The Double Actor-Critic Architecture for Learning Options

Neural Information Processing Systems

Under this novel formulation, all policy optimization algorithms can be used off the shelf to learn intra-option policies, option termination conditions, and a master policy over options. We apply an actor-critic algorithm on each augmented MDP, yielding the Double Actor-Critic (DAC) architecture. Furthermore, we show that, when state-value functions are used as critics, one critic can be expressed in terms of the other, and hence only one critic is necessary. We conduct an empirical study on challenging robot simulation tasks. In a transfer learning setting, DAC outperforms both its hierarchy-free counterpart and previous gradient-based option learning algorithms.
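The control flow that the two augmented MDPs factorize can be sketched as a standard call-and-return option execution loop (all names below are illustrative; DAC's contribution is training the master and intra-option levels with off-the-shelf actor-critic updates, which this sketch does not show):

```python
def run_options(env_step, master_policy, intra_policies, terminations, s0, horizon):
    """Call-and-return option execution (sketch only).

    The master policy picks an option; that option's intra-option policy
    acts until its termination condition fires, at which point control
    returns to the master policy.
    """
    s, o = s0, None
    trajectory = []
    for _ in range(horizon):
        if o is None or terminations[o](s):
            o = master_policy(s)          # (re)select an option
        a = intra_policies[o](s)          # act with the current option
        trajectory.append((s, o, a))
        s = env_step(s, a)
    return trajectory
```

In DAC, one augmented MDP treats the master policy as the actor and the other treats the intra-option policies and terminations as actors, each with its own critic (and, with state-value critics, one critic suffices).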


Reprogramming Pretrained Target-Specific Diffusion Models for Dual-Target Drug Design. Xiangxin Zhou, Jiaqi Guan

Neural Information Processing Systems

Dual-target therapeutic strategies have become a compelling approach and attracted significant attention due to various benefits, such as their potential in overcoming drug resistance in cancer therapy. Considering the tremendous success that deep generative models have achieved in structure-based drug design in recent years, we formulate dual-target drug design as a generative task and curate a novel dataset of potential target pairs based on synergistic drug combinations. We propose to design dual-target drugs with diffusion models that are trained on single-target protein-ligand complex pairs. Specifically, we align two pockets in 3D space with protein-ligand binding priors and build two complex graphs with shared ligand nodes for SE(3)-equivariant composed message passing, based on which we derive a composed drift in both 3D and categorical probability space in the generative process. Our algorithm can well transfer the knowledge gained in single-target pretraining to dual-target scenarios in a zero-shot manner.
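The idea of a composed drift can be illustrated abstractly: two single-target models, each conditioned on one aligned pocket, produce drifts for the shared ligand atoms, and these are combined at every generative step. The convex combination and the `alpha` weight below are simplifying assumptions for illustration, not the paper's exact composition rule:

```python
import numpy as np

def composed_drift(drift_a, drift_b, x, t, alpha=0.5):
    """Combine two single-target denoising drifts for a shared ligand (sketch).

    drift_a / drift_b are drifts predicted from the two protein-ligand
    complex graphs (after aligning both pockets in 3D); a convex
    combination steers the shared ligand atoms toward both pockets.
    """
    return alpha * drift_a(x, t) + (1.0 - alpha) * drift_b(x, t)
```

The same composition applies in both the 3D coordinate space and the categorical (atom-type) probability space during sampling.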


Bayesian Domain Adaptation with Gaussian Mixture Domain-Indexing

Neural Information Processing Systems

Recent methods improve domain adaptation by inferring domain indices under an adversarial variational Bayesian framework when domain indices are unavailable. However, existing methods typically assume that the global domain indices are sampled from a vanilla Gaussian prior, overlooking the inherent structure among different domains. To address this challenge, we propose the Bayesian Domain Adaptation with Gaussian Mixture Domain-Indexing (GMDI) algorithm. GMDI employs a Gaussian mixture model for domain indices, with the number of component distributions in the "domain-themes" space adaptively determined by a Chinese Restaurant Process. By dynamically adjusting the mixtures at the domain-index level, GMDI significantly improves domain adaptation performance. Our theoretical analysis demonstrates that GMDI achieves a more stringent evidence lower bound, closer to the log-likelihood. For classification, GMDI outperforms all approaches and surpasses the state-of-the-art method, VDI, by up to 3.4%, reaching 99.3%. For regression, GMDI reduces MSE by up to 21% (from 3.160 to 2.493), achieving the lowest errors among all methods. Source code is publicly available at https://github.com/lingyf3/GMDI.
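The Chinese Restaurant Process that sizes the mixture can be sketched directly: each new item joins an existing component with probability proportional to that component's size, or opens a new component with probability proportional to a concentration parameter, so the number of components grows with the data rather than being fixed in advance. A minimal sampler (illustrative, not GMDI's inference code):

```python
import random

def chinese_restaurant_process(n, alpha, seed=0):
    """Sample a partition of n items via the Chinese Restaurant Process.

    Item i joins existing component k with probability n_k / (i + alpha)
    and opens a new component with probability alpha / (i + alpha).
    """
    rng = random.Random(seed)
    assignments, counts = [], []
    for i in range(n):
        r = rng.random() * (i + alpha)
        acc = 0.0
        for k, nk in enumerate(counts):
            acc += nk
            if r < acc:
                assignments.append(k)
                counts[k] += 1
                break
        else:  # open a new component
            assignments.append(len(counts))
            counts.append(1)
    return assignments
```

A larger `alpha` yields more components on average; GMDI uses this mechanism to let the number of Gaussian mixture components in the domain-themes space adapt to the data.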


Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games

Neural Information Processing Systems

We study reinforcement learning (RL) for text-based games, which are interactive simulations in the context of natural language. While different methods have been developed to represent the environment information and language actions, existing RL agents are not empowered with any reasoning capabilities to deal with textual games. In this work, we aim to conduct explicit reasoning with knowledge graphs for decision making, so that the actions of an agent are generated and supported by an interpretable inference procedure. We propose a stacked hierarchical attention mechanism to construct an explicit representation of the reasoning process by exploiting the structure of the knowledge graph. We extensively evaluate our method on a number of man-made benchmark games, and the experimental results demonstrate that our method performs better than existing text-based agents.
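The hierarchical attention pattern over a knowledge graph can be sketched in two levels: attend over the nodes within each subgraph to form a summary vector per subgraph, then attend over those summaries. This is only the generic pattern (the paper's exact levels and parameterization differ, and all names here are illustrative):

```python
import numpy as np

def attention(query, keys, values):
    """Scaled dot-product attention over a set of vectors (sketch)."""
    scores = keys @ query / np.sqrt(query.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values

def stacked_hierarchical_attention(state, subgraph_feats):
    """Two-level read-out over knowledge-graph node features.

    Lower level: attend over each subgraph's node features to get one
    summary vector per subgraph. Upper level: attend over the summaries.
    """
    summaries = np.stack([attention(state, nodes, nodes) for nodes in subgraph_feats])
    return attention(state, summaries, summaries)
```

Because the attention weights at both levels are explicit distributions over nodes and subgraphs, they can be inspected, which is the basis for the interpretable inference procedure described above.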


bf65417dcecc7f2b0006e1f5793b7143-AuthorFeedback.pdf

Neural Information Processing Systems

We thank all reviewers for their valuable comments and suggestions; we will incorporate the suggestions and clarifications in the revision. We first address a point shared by Reviewers 1 and 2, then respond to each reviewer in turn. In fact, subgraphs are allowed to be constructed using different approaches. Q2: What is missing from the full KG that the sub-graph captures? As shown in Figure 1(c), applying sub-graphs enables us to explicitly capture such information. Reviewer 2, Q1: Why does the SHA-KG architecture lead to higher scores?


End-to-End Ontology Learning with Large Language Models

Neural Information Processing Systems

Ontologies are useful for automatic machine processing of domain knowledge as they represent it in a structured format. Yet, constructing ontologies requires substantial manual effort. To automate part of this process, large language models (LLMs) have been applied to solve various subtasks of ontology learning. However, this partial ontology learning does not capture the interactions between subtasks. We address this gap by introducing OLLM, a general and scalable method for building the taxonomic backbone of an ontology from scratch.


Amnesia as a Catalyst for Enhancing Black Box Pixel Attacks in Image Classification and Object Detection

Neural Information Processing Systems

It is well known that query-based attacks tend to achieve relatively high success rates among adversarial black-box attacks. While research on black-box attacks is actively being conducted, relatively few studies have focused on pixel attacks that target only a limited number of pixels. In image classification, query-based pixel attacks often rely on patches, which depend heavily on randomness and neglect the fact that scattered pixels are more suitable for adversarial attacks. Moreover, to the best of our knowledge, query-based pixel attacks have not been explored in the field of object detection. To address these issues, we propose a novel pixel-based black-box attack called Remember and Forget Pixel Attack using Reinforcement Learning (RFPAR), consisting of two main components: the Remember and Forget processes.
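The premise that scattered pixels suit query-based attacks can be illustrated with a generic random-search baseline: propose a few scattered pixel perturbations, query the black-box loss, and keep the candidate only if the loss increases. This is a simplified baseline for a single-channel image, not RFPAR's RL-based Remember/Forget procedure:

```python
import numpy as np

def scattered_pixel_attack(image, loss_fn, n_pixels=8, n_queries=50, eps=1.0, seed=0):
    """Greedy random search over scattered pixel perturbations (sketch).

    `loss_fn` is the only access to the model (black-box queries); a
    candidate perturbing `n_pixels` scattered pixels is kept only if it
    increases the loss. Pixel values are assumed to lie in [0, 1].
    """
    rng = np.random.default_rng(seed)
    best = image.copy()
    best_loss = loss_fn(best)
    h, w = image.shape
    for _ in range(n_queries):
        cand = best.copy()
        ys = rng.integers(0, h, size=n_pixels)
        xs = rng.integers(0, w, size=n_pixels)
        cand[ys, xs] = np.clip(cand[ys, xs] + rng.choice([-eps, eps], size=n_pixels), 0.0, 1.0)
        cand_loss = loss_fn(cand)
        if cand_loss > best_loss:
            best, best_loss = cand, cand_loss
    return best
```

Unlike a patch attack, the perturbed coordinates are drawn independently, so the modified pixels are scattered across the image rather than contiguous.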


Appendix of GAN Memory with No Forgetting. Yulai Cong, Miaoyun Zhao, Jianqiao Li, Sijia Wang, Lawrence Carin. Department of ECE, Duke University. A Experimental settings

Neural Information Processing Systems

For all experiments on GAN memory and MeRGAN, we inherit the architecture and experimental settings from GP-GAN [49]. Note that for both GAN memory and MeRGAN, we use the GP-GAN model pretrained on CelebA. The generator architecture is shown in Figure 7, where we denote the mth residual block as Bm and the nth convolutional layer within a residual block as Ln. In the implementation of GAN memory, we apply style modulation on all layers of the generator except the last Conv layer, and apply the proposed style modulation on all layers of the discriminator except the last FC layer. Figure 7: (a) The generator architecture adopted in this paper.
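The per-layer style modulation applied above can be illustrated with a FiLM-style per-channel scale-and-shift on a feature map; the frozen pretrained layers stay fixed while only the lightweight modulation parameters are learned per task. This shows only the scale-and-shift component (the paper's modulation also transforms convolution weights, and the names below are illustrative):

```python
import numpy as np

def style_modulation(features, gamma, beta):
    """Per-channel scale-and-shift of a (C, H, W) feature map (sketch).

    `gamma` and `beta` are the learned per-task modulation parameters;
    the underlying layer weights that produced `features` remain frozen.
    """
    return gamma[:, None, None] * features + beta[:, None, None]
```

Because each task only adds a small set of (gamma, beta) parameters per modulated layer, earlier tasks are not overwritten, which is the mechanism behind forgetting-free GAN memory.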