augerino
50d005f92a6c5c9646db4b761da676ba-Supplemental-Conference.pdf
Failure case 2: Augerino depends on the parameterisation used for the invariance. The full GGN approximation in Eq. 5 is in O(NP^2 C) for computing N matrix products. The diagonal GGN approximation would be in O(NPC) and computation of the log-determinant only O(P). Computing the log-determinant can be done efficiently in O(D^3 + G^3) by decomposing the Kronecker factors (Immer et al., 2021a). The last two terms dependent on S come up due to the aggregation of augmentation samples in our approximation, that is, the expectations over a and g in the second line of Eq. 15.
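The O(D^3 + G^3) cost can be motivated by a standard identity for Kronecker products. As a sketch, assume one layer's curvature block is approximated as A ⊗ B with symmetric positive semi-definite factors A (D × D) and B (G × G), and that a scalar prior precision δ > 0 is added. The symbols A, B, δ, λ_i and μ_j are illustrative assumptions and are not taken from the source; the exact form used there may differ.

```latex
% Assumed setting: A (D x D) and B (G x G) are symmetric PSD Kronecker factors,
% \delta > 0 is a scalar prior precision (illustrative symbols, not from the source).
\begin{align}
A &= Q_A \,\mathrm{diag}(\lambda_1,\dots,\lambda_D)\, Q_A^\top, \qquad
B = Q_B \,\mathrm{diag}(\mu_1,\dots,\mu_G)\, Q_B^\top, \\
A \otimes B + \delta I
&= (Q_A \otimes Q_B)\,\bigl(\Lambda_A \otimes \Lambda_B + \delta I\bigr)\,(Q_A \otimes Q_B)^\top, \\
\log\det\bigl(A \otimes B + \delta I\bigr)
&= \sum_{i=1}^{D}\sum_{j=1}^{G} \log\bigl(\lambda_i \mu_j + \delta\bigr).
\end{align}
```

Under these assumptions the two eigendecompositions cost O(D^3) and O(G^3), and the double sum adds only O(DG), so the DG × DG matrix is never formed explicitly.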
Appendix A Training details
Models are trained with Stochastic Gradient Descent with momentum equal to 0.9. We use a learning rate annealing scheme, decreasing the learning rate by a factor of 0.1 every 30 epochs. We train all models for 150 epochs. Then, we select the best learning rate and weight decay for each method and run 5 different seeds to report mean and standard deviation. We use the validation set of ImageNet to perform cross-validation and report performance on it. In Section G we train the Augerino method on top of the ResNet-18 architecture.
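The annealing scheme described above can be written as a small step-decay function. This is a minimal sketch, not the authors' training code: the function name `step_lr` and the base learning rate of 0.1 are hypothetical (the source selects the learning rate by cross-validation and does not state its value here), while the decay factor, step size, and epoch count follow the text.

```python
def step_lr(base_lr: float, epoch: int, step_size: int = 30, gamma: float = 0.1) -> float:
    """Learning rate in effect during `epoch` (0-indexed) under step decay.

    The rate is multiplied by `gamma` once every `step_size` epochs.
    """
    return base_lr * gamma ** (epoch // step_size)


BASE_LR = 0.1        # hypothetical value; the source picks it by cross-validation
TOTAL_EPOCHS = 150   # from the text: all models are trained for 150 epochs

# Learning rate used at each of the 150 epochs.
schedule = [step_lr(BASE_LR, epoch) for epoch in range(TOTAL_EPOCHS)]

# Epochs at which the rate drops (every 30 epochs, excluding epoch 0).
decay_epochs = [e for e in range(1, TOTAL_EPOCHS) if e % 30 == 0]
```

With 150 epochs and a step size of 30, the rate is reduced four times, so the final 30 epochs run at the base rate scaled by 0.1 to the fourth power.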
A Experimental settings
The data was generated as described in Figure A.1 and Section 4.1. Augmentations were also initialized with uniform weights for AugNet.
Figure A.1: Illustration of the data generation process for the Mario-Iggy experiment.
Figure A.2: Magnitudes learned by Augerino are not sparse.
The data was generated as described in Figure A.4 and Section 4.2. The trunk model used for AugNet is described in Table A.2. Moreover, λ was set to 0.8 and initial magnitudes set to 0.05 in Section 4.3.
augmentations, from training data
We thank the reviewers for thoughtful feedback. Thank you for the constructive review; we appreciate the feedback and will incorporate the suggestions accordingly. We also note that for Augerino to be applicable we only need the invariance to be represented by a group transformation. We thank the reviewer for the constructive comments and ideas for extensions. One of the key strengths of Augerino is its ability to be combined freely with any model.