40bb79c081828bebdc39d65a82367246-Supplemental-Conference.pdf

Feb-8-2026, 13:34:47 GMT–Neural Information Processing Systems

Table 1: Linear network

| Layer # | Name | Layer | In shape | Out shape |
|---|---|---|---|---|
| 1 | | Flatten() | (3,32,32) | 3072 |
| 2 | fc1 | nn.Linear(3072, 200) | 3072 | 200 |
| 3 | fc2 | nn.Linear(200, 1) | 200 | 1 |

Fully-connected Network. We conduct further experiments on several fully-connected networks with 4 hidden layers and various activation functions. Our subset is smaller because of the computational cost of calculating the Gram matrix. Experiments show that the properties along the GD trajectory (e.g.

We consider simple linear networks, fully-connected networks, and convolutional networks in this appendix. The following Figure 4 illustrates the positive correlation between the sharpness and the A-norm, and the relationship between the loss $\|D(t)\|^2$ and $\|R(t)\|^2$ along the trajectory.
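The layer listing in Table 1 can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' code: the class name `LinearNet`, the batch size of 8, and the random input are hypothetical. It also assumes that no activation sits between `fc1` and `fc2` (the table lists none) and that both layers keep PyTorch's default bias terms, which the table does not specify.

```python
import torch
from torch import nn


class LinearNet(nn.Module):
    """Linear network following Table 1 (hypothetical class name)."""

    def __init__(self) -> None:
        super().__init__()
        self.flatten = nn.Flatten()      # (3, 32, 32) -> 3072
        self.fc1 = nn.Linear(3072, 200)  # 3072 -> 200
        self.fc2 = nn.Linear(200, 1)     # 200 -> 1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumption: no nonlinearity between the layers, since Table 1 lists none.
        return self.fc2(self.fc1(self.flatten(x)))


model = LinearNet()
batch = torch.randn(8, 3, 32, 32)  # hypothetical batch of 8 inputs of shape (3, 32, 32)
out = model(batch)                 # one scalar output per input

# Parameter count under the default-bias assumption:
# fc1 has 3072 * 200 + 200 parameters, fc2 has 200 * 1 + 1.
n_params = sum(p.numel() for p in model.parameters())
```

With these assumptions, `out` has one row per input and a single output column, matching the final "Out shape" of 1 in the table.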

artificial intelligence, lemmac, machine learning


Country:
- North America
  - United States (0.04)
  - Canada > Ontario
    - Toronto (0.04)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)

