Goto

Collaborating Authors

 discriminator


0004d0b59e19461ff126e3a08a814c33-AuthorFeedback.pdf

Neural Information Processing Systems

We sincerely appreciate the reviewers for their careful reading, constructive questions and suggestions. We would very1 much like further exchanges to improve our work, but the following is our best effort within the current limits.2 First, we address questions appeared at least twice. We write P1, P2 for paragraph reference, and Rx for reviewers.3 We discuss two main motivations here: lack of graph loss, and empirical failure4 of distinguishing power.


Appendix

Neural Information Processing Systems

A. The input spikes, xti, are one of the main drivers of the activity of our RSNN. They are 300 Poisson neurons, where the first 100encode the whisker stimulus, the next 100 encode the auditory cue and the last 100 act as an extra noise source for our model. Out of the 300 neurons, 60 of them are inhibitory (red). The input neurons project unrestrictedly to the whole RSNN. The baseline firing rate of all input neurons is 5 Hz.


Learning List-Level Domain-Invariant Representations for Ranking

Neural Information Processing Systems

Domain adaptation aims to transfer the knowledge learned on (data-rich) source domains to (low-resource) target domains, and a popular method is invariant representation learning, which matches and aligns the data distributions on the feature space. Although this method is studied extensively and applied on classification and regression problems, its adoption on ranking problems is sporadic, and the few existing implementations lack theoretical justifications. This paper revisits invariant representation learning for ranking. Upon reviewing prior work, we found that they implement what we call item-level alignment, which aligns the distributions of the items being ranked from all lists in aggregate but ignores their list structure. However, the list structure should be leveraged, because it is intrinsic to ranking problems where the data and the metrics are defined and computed on lists, not the items by themselves. To close this discrepancy, we propose list-level alignment--learning domain-invariant representations at the higher level of lists. The benefits are twofold: it leads to the first domain adaptation generalization bound for ranking, in turn providing theoretical support for the proposed method, and it achieves better empirical transfer performance for unsupervised domain adaptation on ranking tasks, including passage reranking.


Generating multivariate time series with COmmon Source CoordInated GAN (COSCI-GAN)

Neural Information Processing Systems

Generating multivariate time series is a promising approach for sharing sensitive data in many medical, financial, and IoT applications. A common type of multivariate time series originates from a single source such as the biometric measurements from a medical patient. This leads to complex dynamical patterns between individual time series that are hard to learn by typical generation models such as GANs. There is valuable information in those patterns that machine learning models can use to better classify, predict or perform other downstream tasks. We propose a novel framework that takes time series' common origin into account and favors channel/feature relationships preservation. The two key points of our method are: 1) the individual time series are generated from a common point in latent space and 2) a central discriminator favors the preservation of inter-channel/feature dynamics. We demonstrate empirically that our method helps preserve channel/feature correlations and that our synthetic data performs very well in downstream tasks with medical and financial data.


Reliable Estimation of KLDivergence using a Discriminator in Reproducing Kernel Hilbert Space Supplementary Material

Neural Information Processing Systems

Organization: This supplementary material is presented in a format parallel to the main paper. The section numbers and titles are consistent with the main paper. But, here we also add one new section: Section 10 where we describe the societal impacts and possible negative impacts of the paper. Similarly, the Theorem numbers are consistent with the main paper, but we also have several additional theorems and lemmas which were not included in the main paper. GAN-type Objective for KLEstimation Let f be a discriminator, f: X IR. Let p(x) and q(x) be two probability density functions defined over the space X.



The proposition makes use of the following observation: For the discriminator defined in (1), the norm of gradient for wt is upper bounded by k wtDฮธ(x)k F kxk LY

Neural Information Processing Systems

The upper bound of gradient's Frobenius norm for spectrally-normalized discriminators follows directly. As lw(x) is a linear transformation, we have lcw(x) = c lw(x), and lw(cx) = c lw(x). Moreover, since ReLU and leaky ReLU is linear in R+ and R region, we have ai(cx) = c ai(x). In this section we discuss the gradients with respect the actual parameter wi. From Eq. (12) in [30] we know wtDฮธ(x) = A, we know that w0tDฮธ(x) F, otl(x)Dฮธ(x), and kotl (x)k have upper bounds. From Theorem 1.1 in [44] we know that if wt is initialized with i.i.d random variables from uniform or Gaussian distribution, E kwtkspis lower bounded away from zero at initialization. So k wtDฮธ(x)kF is upper bounded at initialization. Moreover, we observe empirically that kwtksp is usually increasing during training. Therefore, k wtDฮธ(x)kF is typically upper bounded during training as well. The following proposition states that spectral normalization also gives an upper bound on kHwi(Dฮธ)(x)ksp for networks with ReLU or leaky ReLU internal activations.



Data-Efficient Instance Generation from Instance Discrimination

Neural Information Processing Systems

Generative Adversarial Networks (GANs) have significantly advanced image synthesis, however, the synthesis quality drops significantly given a limited amount of training data. To improve the data efficiency of GAN training, prior work typically employs data augmentation to mitigate the overfitting of the discriminator yet still learn the discriminator with a bi-classification (i.e., real vs.