AITopics | resnet34

cd5404354496e39d37b7947d8a0d7b72-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-29-2026, 19:33:37 GMT

A.1 Additional Experiments on CIFAR10. We expanded our experiments on the CIFAR10 dataset by utilizing weights pretrained for 100 iterations with a batch size of 128 per iteration. The CIFAR10 dataset consists of 50,000 training images and 10,000 testing images, divided into 10 different classes. The results of these experiments are summarized in Table 1. We observed performance improvement relative to baseline. However, compared to other modes of pretraining for CIFAR10, certain PaI generators exhibited higher-than-expected standard deviation and lower average performance, indicating some instability in generating sparse structures. Specifically, we observed this trend with GraSP in ResNet18 and SNIP in ResNet34.
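
For a sense of scale, the pretraining budget in this snippet can be related to the size of the training set. The sketch below uses only figures stated in the text, reading the pretraining length as 100 iterations and assuming each iteration draws one full batch of 128 images; the variable names are illustrative.

```python
# Figures taken from the snippet; variable names are illustrative.
pretrain_iterations = 100   # assumed reading of the pretraining length
batch_size = 128            # images per iteration
train_images = 50_000       # CIFAR10 training set size

# Total images processed during pretraining, and the share of one epoch.
images_seen = pretrain_iterations * batch_size
fraction_of_epoch = images_seen / train_images

print(images_seen)        # 12800
print(fraction_of_epoch)  # 0.256
```

Under these assumptions the pretraining phase covers roughly a quarter of one pass over the training images.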

artificial intelligence, iteration, machine learning, (14 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (0.82)


Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training

Neural Information Processing Systems, Apr-29-2026, 17:51:37 GMT

Regularization in modern machine learning is crucial, and it can take various forms in algorithmic design: training set, model family, error function, regularization terms, and optimizations. In particular, the learning rate, which can be interpreted as a temperature-like parameter within the statistical mechanics of learning, plays a crucial role in neural network training. Indeed, many widely adopted training strategies essentially define the decay of the learning rate over time. This process can be interpreted as decreasing a temperature, using either a global learning rate (for the entire model) or a learning rate that varies for each parameter. This paper proposes TempBalance, a straightforward yet effective layer-wise learning rate method. TempBalance is based on Heavy-Tailed Self-Regularization (HT-SR) Theory, an approach which characterizes the implicit self-regularization of different layers in trained models. We demonstrate the efficacy of using HT-SR-motivated metrics to guide the scheduling and balancing of temperature across all network layers during model training, resulting in improved performance during testing.
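
The idea of a layer-wise learning rate guided by a per-layer metric can be illustrated with a minimal sketch. This is not the TempBalance algorithm: the function name `layerwise_learning_rates`, the linear interpolation, the scale bounds `low` and `high`, the direction of the mapping (larger metric gets a larger rate), and the example metric values are all assumptions made for illustration.

```python
def layerwise_learning_rates(base_lr, layer_metric, low=0.5, high=1.5):
    """Map a per-layer metric to per-layer learning rates (illustrative only).

    The layer with the smallest metric gets base_lr * low, the layer with the
    largest gets base_lr * high, and the rest are interpolated linearly.
    """
    smallest = min(layer_metric.values())
    largest = max(layer_metric.values())
    span = largest - smallest
    rates = {}
    for name, value in layer_metric.items():
        if span == 0:
            # All layers share the same metric: keep the global rate.
            rates[name] = base_lr
        else:
            position = (value - smallest) / span  # 0.0 .. 1.0
            rates[name] = base_lr * (low + position * (high - low))
    return rates


# Hypothetical per-layer metric values, not taken from the paper.
metrics = {"conv1": 2.0, "layer1": 3.0, "fc": 4.0}
rates = layerwise_learning_rates(0.1, metrics)
for name, lr in rates.items():
    print(name, round(lr, 4))
```

A dictionary like `rates` could then be used to set one learning rate per parameter group in an optimizer that supports per-group rates, in contrast to a single global rate shared by every layer.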

artificial intelligence, machine learning, tempbalance, (17 more...)


Country: Europe (0.67)

Genre: Research Report > New Finding (1.00)


CLDA: Contrastive Learning for Semi-Supervised Domain Adaptation (Supplementary Material)

Neural Information Processing Systems, Apr-25-2026, 05:11:19 GMT

The supplementary material consists of the following. Additional results on the DomainNet dataset for the 5-shot and 10-shot settings, with ResNet34 as the backbone network, are shown in Table 1. Results are reported in Tables 2 and 3. Discussion on Limitations and Societal Impacts. The architecture of the network is similar to [2]. All other hyperparameters used in our framework are described in the main paper.

artificial intelligence, dataset, machine learning, (15 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training Yefan Zhou

Neural Information Processing Systems, Feb-17-2026, 01:43:40 GMT

However, such a global learning rate schedule does not take into account the structural characteristics of neural networks (NNs).

artificial intelligence, machine learning, tempbalance, (17 more...)


Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)


Appendix for "Residual Alignment: Uncovering the Mechanisms of Residual Networks"

Neural Information Processing Systems, Feb-16-2026, 16:14:07 GMT

We start by providing motivation for the unconstrained Jacobians problem introduced in the main text. We will continue our proof using contradiction. Figure 1: Fully-connected ResNet34 (Type 1 model) trained on MNIST. Figure 2: Fully-connected ResNet34 (Type 1 model) trained on FashionMNIST. Figure 10: Fully-connected ResNet34 (Type 1 model) trained on MNIST. Figure 24: Fully-connected ResNet34 (Type 1 model) trained on MNIST.

artificial intelligence, convolutional resnet34, machine learning, (17 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (0.96)
