Supplementary Material: Learning Distilled Collaboration Graph for Multi-Agent Perception

Neural Information Processing Systems

Vehicles are spawned in CARLA via SUMO and managed by the Traffic Manager. We adopt the dataset format of nuScenes and extend it to multi-agent scenarios, as shown in Fig. IV. Each log file can produce 100 scenes, and each scene includes 100 frames. The input BEV map's dimension is (c, w, h) = (13, 256, 256).

II.1 Architecture of student/teacher encoder

We describe the architecture of the encoder below.
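The encoder itself is specified in the original supplementary tables, which are not reproduced in this excerpt. For orientation only, here is a minimal hypothetical sketch of a convolutional BEV encoder that consumes the (13, 256, 256) input described above; the layer widths, strides, and depth are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class BEVEncoder(nn.Module):
    """Hypothetical conv encoder for a (13, 256, 256) BEV input.

    Channel widths and depth are illustrative assumptions; the actual
    student/teacher encoder is defined in the paper's supplementary tables.
    """

    def __init__(self, in_channels: int = 13):
        super().__init__()
        layers = []
        channels = [in_channels, 32, 64, 128, 256]
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            layers += [
                nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            ]
        self.features = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Four stride-2 stages: 256 -> 128 -> 64 -> 32 -> 16
        return self.features(x)

if __name__ == "__main__":
    bev = torch.randn(1, 13, 256, 256)  # (c, w, h) = (13, 256, 256)
    print(BEVEncoder()(bev).shape)      # torch.Size([1, 256, 16, 16])
```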






A Experimental Setup

Neural Information Processing Systems

A.2 Training Settings of Teacher

We provide training settings of the teacher w.r.t. … In practice, we do not optimize the student and the generator via the plain losses in Eq. 4 and Eq. 6 … Number of steps for pretraining G; δ: the bound in Eqs. …

A.4 Generator Architectures

In Table 8, we show different architectures of the generator w.r.t. …; ConvBlockX(c, …) and ResNetBlockY are provided in Table 9. This is because the "uncond" generator has learned to jump … The "sum" generator enables stable training of our model and gives the best accuracy and cross-entropy. The "cat" generator only yields good results at …; the "uncond" generator does not encounter any problem with MAD and learns faster than the "cat" generator. An important question is "What is a reasonable upper bound …"
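The truncated fragments above contrast three generator variants, apparently named by how conditioning enters the network: "sum" (class embedding added to the noise), "cat" (class embedding concatenated to the noise), and "uncond" (no conditioning). A hypothetical sketch of that design axis follows; the block structure, dimensions, and the class name are assumptions, not the paper's ConvBlockX/ResNetBlockY definitions.

```python
import torch
import torch.nn as nn

class ConditionedGenerator(nn.Module):
    """Hypothetical generator illustrating "sum" / "cat" / "uncond" conditioning."""

    def __init__(self, noise_dim: int = 128, num_classes: int = 10, mode: str = "sum"):
        super().__init__()
        assert mode in ("sum", "cat", "uncond")
        self.mode = mode
        self.embed = nn.Embedding(num_classes, noise_dim)
        in_dim = 2 * noise_dim if mode == "cat" else noise_dim
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 3 * 32 * 32), nn.Tanh(),
        )

    def forward(self, z: torch.Tensor, y: torch.Tensor = None) -> torch.Tensor:
        if self.mode == "sum":      # add the class embedding to the noise
            z = z + self.embed(y)
        elif self.mode == "cat":    # concatenate the class embedding to the noise
            z = torch.cat([z, self.embed(y)], dim=1)
        # "uncond": ignore the label entirely
        return self.net(z).view(-1, 3, 32, 32)

if __name__ == "__main__":
    z = torch.randn(4, 128)
    y = torch.randint(0, 10, (4,))
    for mode in ("sum", "cat", "uncond"):
        print(mode, ConditionedGenerator(mode=mode)(z, y).shape)  # (4, 3, 32, 32)
```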


Temporally Disentangled Representation Learning under Unknown Nonstationarity

Xiangchen Song

Neural Information Processing Systems

However, in nonstationary settings, existing work has only partially addressed the problem, either by utilizing observed auxiliary variables (e.g., class labels and/or domain indexes) as side information or by assuming simplified latent causal dynamics. Both constrain the methods to a limited range of scenarios.


Vision GNN: An Image is Worth Graph of Nodes

Kai Han, Yunhe Wang

Neural Information Processing Systems

Given an FFN module, the diversity γ(FFN(X)) of its output features satisfies

γ(FFN(X)) ≤ λ γ(X),   (2)

where λ is the Lipschitz constant of FFN with respect to the p-norm for p ∈ [1, ∞].
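As a quick sanity check of inequality (2), the sketch below numerically compares γ(FFN(X)) against λ·γ(X) for a random two-layer ReLU FFN. Two assumptions not in the excerpt: γ(X) is taken as the Frobenius distance from X to its closest constant-row matrix (attained at the row mean), and λ is upper-bounded by the product of the weight matrices' spectral norms, since ReLU is 1-Lipschitz.

```python
import torch
import torch.nn as nn

def diversity(X: torch.Tensor) -> torch.Tensor:
    # gamma(X): Frobenius distance from X to the closest matrix with
    # identical rows, which is attained at the row mean (an assumed definition).
    return torch.linalg.norm(X - X.mean(dim=0, keepdim=True))

torch.manual_seed(0)
n, d, hidden = 64, 32, 128
ffn = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, d))

with torch.no_grad():
    # Lipschitz upper bound: product of spectral norms (ReLU is 1-Lipschitz).
    lam = (torch.linalg.matrix_norm(ffn[0].weight, ord=2)
           * torch.linalg.matrix_norm(ffn[2].weight, ord=2))
    X = torch.randn(n, d)
    lhs = diversity(ffn(X))   # gamma(FFN(X))
    rhs = lam * diversity(X)  # lambda * gamma(X)

print(f"gamma(FFN(X)) = {lhs:.3f} <= lambda*gamma(X) = {rhs:.3f}: {bool(lhs <= rhs)}")
```

Under these assumptions the bound follows directly: for a row-wise λ-Lipschitz map f, Σᵢ‖f(xᵢ) − f(x̄)‖² ≤ λ² Σᵢ‖xᵢ − x̄‖², and γ(FFN(X)) can only be smaller than the left-hand side since the row mean of FFN(X) is the minimizer.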


The Online Patch Redundancy Eliminator (OPRE): A novel approach to online agnostic continual learning using dataset compression

Bayle, Raphaël, Mermillod, Martial, French, Robert M.

arXiv.org Artificial Intelligence

In order to achieve Continual Learning (CL), the problem of catastrophic forgetting, one that has plagued neural networks since their inception, must be overcome. The evaluation of continual learning methods relies on splitting a known homogeneous dataset and learning the associated tasks one after the other. We argue that most CL methods introduce a priori information about the data to come and cannot be considered agnostic. We exemplify this point with the case of methods relying on pretrained feature extractors, which are still used in CL. After showing that pretrained feature extractors imply a loss of generality with respect to the data that can be learned by the model, we discuss other kinds of a priori information introduced by other CL methods. We then present the Online Patch Redundancy Eliminator (OPRE), an online dataset compression algorithm which, together with a classifier trained at test time, yields performance on CIFAR-10 and CIFAR-100 superior to a number of other state-of-the-art online continual learning methods. Additionally, OPRE requires only minimal and interpretable hypotheses on the data to come. We suggest that online dataset compression may well be necessary to achieve fully agnostic CL.