
Lifting Weak Supervision To Structured Prediction

Neural Information Processing Systems

For labels taking values in a finite metric space, we introduce techniques new to weak supervision based on pseudo-Euclidean embeddings and tensor decompositions, providing a nearly-consistent noise rate estimator.
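A finite metric space can be embedded with a classical-MDS-style construction that keeps both positive and negative eigenvalues of the double-centered squared-distance matrix; this is one standard way to build a pseudo-Euclidean embedding (a minimal sketch of that generic construction, not the paper's specific estimator):

```python
import numpy as np

def pseudo_euclidean_embedding(D):
    # Double-center the squared distance matrix, then keep BOTH positive
    # and negative eigenvalues: the resulting coordinates live in a
    # pseudo-Euclidean space whose indefinite inner product reproduces
    # the original distances exactly.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    w, V = np.linalg.eigh(B)
    keep = np.abs(w) > 1e-9
    X = V[:, keep] * np.sqrt(np.abs(w[keep]))
    signature = np.sign(w[keep])  # +1 / -1 per embedding coordinate
    return X, signature

# Distances are recovered via the indefinite inner product:
# d(i, j)^2 = sum_k signature[k] * (X[i,k] - X[j,k])**2
D = np.array([[0., 1., 2.], [1., 0., 1.], [2., 1., 0.]])  # a path metric
X, sig = pseudo_euclidean_embedding(D)
diff = X[0] - X[2]
d2 = np.sum(sig * diff ** 2)  # recovers D[0, 2]**2
```

For a Euclidean-embeddable metric like this one the signature is all +1; a genuinely non-Euclidean finite metric produces negative entries.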


Invariant Representations without Adversarial Training

Daniel Moyer, Shuyang Gao, Rob Brekelmans, Aram Galstyan, Greg Ver Steeg

Neural Information Processing Systems

We show that adversarial training is unnecessary and sometimes counter-productive; we instead cast invariant representation learning as a single information-theoretic objective that can be directly optimized.



Correlated Uncertainty for Learning Dense Correspondences from Noisy Labels

Natalia Neverova, David Novotny, Andrea Vedaldi

Neural Information Processing Systems

An alternative approach is to predict instead a distribution p(ŷ|x) = Φ_ŷ(x) over possible values of the annotation y. The annotators are shown a set of points sampled randomly and uniformly over one of the predefined body parts of a person in an image.
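Replacing a point estimate with a predicted distribution means the model outputs parameters of p(ŷ|x) and is trained by the negative log-likelihood of the noisy label. A toy sketch with a categorical head over discrete annotation values (the logits stand in for a hypothetical network output Φ(x)):

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())  # shift for numerical stability
    return e / e.sum()

# Instead of committing to one annotation value, the head outputs a
# distribution over 5 hypothetical candidate values, so annotator
# noise can be expressed as spread-out probability mass.
logits = np.array([0.1, 2.0, 0.3, -1.0, 0.0])  # assumed net output
p = softmax(logits)
nll = -np.log(p[1])  # NLL of an observed (possibly noisy) label y = 1
```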


Symmetry-induced Disentanglement on Graphs

Neural Information Processing Systems

Disentanglement has been formalized using a symmetry-centric notion for unstructured spaces; however, graphs have eluded a similarly rigorous treatment. We fill this gap with a new notion of conditional symmetry for disentanglement, and leverage tools from Lie algebras to encode graph properties into subgroups using suitable adaptations of generative models such as Variational Autoencoders.


Appendix: Variational Continual Bayesian Meta-Learning

Neural Information Processing Systems

In variational continual learning, the posterior distribution of interest is frequently intractable and approximation is required. We summarize the meta-training process of our VC-BML in Algorithm 1. Moreover, we evaluate FTML on the unseen tasks (i.e., tasks sampled from the meta-test set) instead of the training tasks that the original FTML used. It would be unfair to adopt the original initialization procedure in OSML. BOMVI [10]: in our experiments, we use variational inference to approximate the posterior of meta-parameters.

E.3.2 Settings
As the latent variables in this paper are meta-parameters and task-specific parameters, the dimensionality of the latent space is actually determined by the number of parameters in the deep neural network. In particular, we define a CNN architecture and present its details in Table 1.


Supplementary Material for Stochastic Multiple Target Sampling Gradient Descent

Neural Information Processing Systems

By contrast, there is only one quadratic programming problem to solve in our proposed method, which significantly reduces time complexity, especially when the number of particles is high. The mean square error for each task and the average results are shown in Table 1. MT-SGD outperforms the second-best method, MOO-SVGD, with 0.2251 vs. However, on the one hand, computing U's entries can be accelerated in practice by calculating them in parallel, since there is no interaction between them during the forward pass. All images are resized to 64 × 64 × 3. Due to space constraints, we report only the abbreviation of each task in the main paper; their full names are presented below.
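The quadratic program in question is of the MGDA family: find the convex combination of per-task gradients with minimum norm, which yields a common descent direction. For two tasks the QP has a closed form (an illustrative stand-in for the single QP in the method above, not its exact formulation):

```python
import numpy as np

def min_norm_combination(g1, g2):
    # Closed-form solution of the two-task quadratic program
    #   min_w || w*g1 + (1-w)*g2 ||^2,  0 <= w <= 1.
    # The minimizer gives a direction that (when nonzero) has a
    # nonnegative inner product with both task gradients.
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0.0:
        return 0.5  # gradients coincide; any weighting works
    w = ((g2 - g1) @ g2) / denom
    return float(np.clip(w, 0.0, 1.0))

g1 = np.array([1.0, 0.0])  # toy gradient for task 1
g2 = np.array([0.0, 1.0])  # toy gradient for task 2
w = min_norm_combination(g1, g2)
combined = w * g1 + (1 - w) * g2  # common descent direction
```

With more than two tasks the same objective becomes a simplex-constrained QP, which is why solving it only once per update (rather than once per particle) matters.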


Markovian Score Climbing: Variational Inference with KL(p||q)

Neural Information Processing Systems

Output: θ_K, λ_K.
1: for k = 1, ..., K do
2:   Sample z[k] ∼ M(· | z[k−1]; θ_{k−1}, λ_{k−1})
3:   Compute s(z[k]; λ_{k−1}) = ∇_λ log q(z[k]; λ_{k−1})
4:   Compute ĝ_ML(θ_{k−1}) = ∇_θ log p(z[k], x; θ_{k−1})
5:   Set λ_k = λ_{k−1} + ε_k s(z[k]; λ_{k−1})
6:   Set θ_k = θ_{k−1} + α_k ĝ_ML(θ_{k−1})
7: end for
We compare MSC with SMC-based [22] using [29].
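The loop above can be sketched on a toy conjugate model: one latent z with joint p(z, x; θ) = N(z; θ, 1) N(x; z, 1), variational family q(z; λ) = N(λ, 1), and (as a simple stand-in for the paper's kernel) an independence Metropolis–Hastings kernel proposing from q, which leaves p(z | x; θ) invariant. All model choices here are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = 1.5  # single observation

def log_p(z, theta):   # log joint log p(z, x; theta)
    return -0.5 * (z - theta) ** 2 - 0.5 * (x - z) ** 2

def log_q(z, lam):     # log q(z; lam), unit-variance Gaussian
    return -0.5 * (z - lam) ** 2

def kernel(z, theta, lam):
    # Independence MH with proposal q(.; lam): leaves p(z|x; theta)
    # invariant, so consecutive z's form a Markov chain on the posterior.
    z_prop = lam + rng.standard_normal()
    log_acc = (log_p(z_prop, theta) - log_q(z_prop, lam)) \
            - (log_p(z, theta) - log_q(z, lam))
    return z_prop if np.log(rng.random()) < log_acc else z

z, theta, lam = 0.0, 0.0, 0.0
for k in range(1, 5001):
    step = 1.0 / k                      # Robbins-Monro step sizes
    z = kernel(z, theta, lam)
    lam += step * (z - lam)             # score of q: d/dlam log q = z - lam
    theta += step * (z - theta)         # d/dtheta log p(z, x; theta) = z - theta
```

At the fixed point θ should approach the ML estimate (here θ* = x = 1.5) and λ should track the posterior mean of z, which also converges to 1.5.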


8e5e15c4e6d09c8333a17843461041a9-Supplemental.pdf

Neural Information Processing Systems

Tiny-ImageNet is a small subset of the ImageNet dataset, containing 100,000 training images, 10,000 validation images, and 10,000 testing images separated into 200 different classes, the dimensions of which are 64 × 64 pixels. Here, an approximate feature probability q(Z) is introduced to approximate the true feature probability p(Z). The additional results are illustrated in Figure 1. We provide additional feature visualization under various adversarial attack methods including NRF in Figures 1-5 (CIFAR-10, SVHN, and Tiny-ImageNet are utilized). Moreover, the distilled features still include the robust and brittle information even in the failed attack examples.