
Collaborating Authors

 Zhang et al.



fdb55ce855129e05da8374059cc82728-Supplemental.pdf

Neural Information Processing Systems

A.1 Full experimental results

In this section we provide the full experimental results that extend those demonstrated in Section 4.2. Table 8 presents the evaluation on 16 robustly trained CIFAR10 models from RobustBench [28] that was summarized in Table 2. We consider four configurations of the attack for each of the models. SA and AA correspond to the update size schedules proposed by Andriushchenko et al. [1] and Croce and Hein [2], respectively. "Uni" denotes sampling the color for the update uniformly.

A.2 Meta-training the Controllers

The meta-training of the controllers was described in Section 3 and Section 4.1.
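As a hedged illustration of the "Uni" configuration (this is not the authors' code, and the function name and the choice of sampling from the perturbation-budget corners are assumptions based on common square-attack-style updates), uniform color sampling for an update might look like:

```python
import numpy as np

def sample_uniform_color(eps, rng):
    # Hypothetical sketch: pick each RGB channel of the update color
    # uniformly at random from {-eps, +eps}. Other uniform schemes
    # (e.g. uniform over [-eps, eps]) are equally plausible readings.
    return rng.choice([-eps, eps], size=3)

rng = np.random.default_rng(0)
color = sample_uniform_color(8 / 255, rng)  # one color per square update
```

Sampling from the corners of the budget keeps every update at full perturbation magnitude, which is a common design choice for query-efficient black-box attacks.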


Theoretical

Neural Information Processing Systems

The question of if and how rank collapse affects training is still largely unanswered, and its investigation is necessary for a more comprehensive understanding of this architecture.


d1588e685562af341ff2448de4b674d1-Paper.pdf

Neural Information Processing Systems

However, existing algorithms lack universality in the sense that they can only handle one type of convex function and require a priori knowledge of parameters.


Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation

Neural Information Processing Systems

Existing deep learning based solutions typically restore the target layers individually, or with some concerns at the end of the outputs, barely taking into account the interaction across the two streams/branches. In order to utilize information more efficiently, this work presents a general yet simple interactive strategy, namely your trash is my treasure (YTMT), for constructing dual-stream decomposition networks.
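A minimal sketch of the "trash is treasure" idea, assuming (as one plausible reading, not the paper's verified implementation) that each stream keeps its activated features and routes its deactivated, would-be-discarded features to the other stream:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def ytmt_exchange(feat_a, feat_b):
    # Hypothetical interaction block: split each stream's features into
    # an activated ("treasure") part and a deactivated ("trash") part,
    # then hand each stream's trash to the other stream.
    keep_a, trash_a = relu(feat_a), feat_a - relu(feat_a)
    keep_b, trash_b = relu(feat_b), feat_b - relu(feat_b)
    return keep_a + trash_b, keep_b + trash_a

a = np.array([1.0, -2.0])
b = np.array([-0.5, 3.0])
out_a, out_b = ytmt_exchange(a, b)
```

Note that `out_a + out_b == a + b`: under this sketch no information is dropped at the nonlinearity, it is merely rerouted between the two branches.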




DISCO: Adversarial Defense with Local Implicit Functions

Neural Information Processing Systems

In this section, we ablate the kernel size used to train DISCO on ImageNet. Table I shows that s = 3 achieves the best performance, which degrades for s = 5 by a significant margin (3.26%). This is consistent with the well-known difficulty of synthesizing images with global models, such as GANs.

For a single ImageNet image of size 224, STL requires 23.71 seconds, while DISCO (K=1) requires only 0.027 seconds.

In this section, we list the URL links used for training and evaluating DISCO.