AITopics

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry:

Health & Medicine > Therapeutic Area (0.98)
Health & Medicine > Diagnostic Medicine > Imaging (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.75)

Neural Information Processing Systems, Feb-7-2026, 08:45:17 GMT

0607f4c705595b911a4f3e7a127b44e0-Paper.pdf

basin, checkpoint, initialization, (14 more...)

Country:

North America > Canada (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Oct-1-2025, 22:38:17 GMT

0607f4c705595b911a4f3e7a127b44e0-Paper.pdf

We use a series of analyses to answer the question of what is being transferred.

artificial intelligence, machine learning, natural language, (18 more...)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Aug-20-2025, 08:16:23 GMT

Transfusion: Understanding Transfer Learning for Medical Imaging

This basic formula has seen almost universal adoption across many different medical specialties.

initialization, pretrained weight, representation, (14 more...)

Country:

North America > United States (0.28)
North America > Canada (0.04)

Genre: Research Report (0.46)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Neural Information Processing Systems, Aug-17-2025, 08:06:26 GMT

Meta-ticket: Finding optimal subnetworks for few-shot learning within randomly initialized neural networks

The main challenge is how to avoid overfitting, since over-parameterized NNs can easily overfit to such a small dataset.

artificial intelligence, machine learning, meta-ticket, (14 more...)

Country:

North America > United States > California (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, May-27-2025, 18:41:18 GMT

Inductive biases of multi-task learning and finetuning: multiple regimes of feature reuse

Neural networks are often trained on multiple tasks, either simultaneously (multi-task learning, MTL) or sequentially (pretraining and subsequent finetuning, PT+FT). In particular, it is common practice to pretrain neural networks on a large auxiliary task before finetuning on a downstream task with fewer samples. Despite the prevalence of this approach, the inductive biases that arise from learning multiple tasks are poorly characterized. In this work, we address this gap. We describe novel implicit regularization penalties associated with MTL and PT+FT in diagonal linear networks and single-hidden-layer ReLU networks.

feature reuse, nested feature selection, regime, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence, Nov-14-2024

ResidualDroppath: Enhancing Feature Reuse over Residual Connections

Park, Sejik

Residual connections are one of the most important components in neural network architectures for mitigating the vanishing gradient problem and facilitating the training of much deeper networks. One possible explanation for how residual connections aid deeper network training is by promoting feature reuse. However, we identify and analyze the limitations of feature reuse with vanilla residual connections. To address these limitations, we propose modifications in training methods. Specifically, we provide an additional opportunity for the model to learn feature reuse with residual connections through two types of iterations during training. The first type of iteration involves using droppath, which enforces feature reuse by randomly dropping a subset of layers. The second type of iteration focuses on training the dropped parts of the model while freezing the undropped parts. As a result, the dropped parts learn in a way that encourages feature reuse, as the model relies on the undropped parts with feature reuse in mind. Overall, we demonstrated performance improvements in models with residual connections for image classification in certain cases.
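The two iteration types described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the toy `tanh` branch, the function names, and the rule that the trainable blocks in the second iteration are exactly the dropped ones are assumptions made for illustration.

```python
import numpy as np

def residual_forward(x, weights, keep_mask):
    """Apply a stack of residual blocks, skipping blocks whose mask entry is False."""
    for w, keep in zip(weights, keep_mask):
        if keep:
            # The residual branch is added to the identity (skip) path.
            x = x + np.tanh(x @ w)
        # A dropped block contributes nothing; only the skip connection remains.
    return x

def sample_keep_mask(num_blocks, drop_prob, rng):
    """First iteration type: sample which blocks survive a droppath step."""
    return rng.random(num_blocks) >= drop_prob

def trainable_blocks(keep_mask):
    """Second iteration type (assumed rule): train dropped blocks, freeze the rest."""
    return [i for i, keep in enumerate(keep_mask) if not keep]

rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.1, size=(4, 4)) for _ in range(3)]
x = rng.normal(size=(2, 4))

all_dropped = residual_forward(x, weights, [False, False, False])
full = residual_forward(x, weights, [True, True, True])
```

With every block dropped, the stack reduces to the identity, which is why the surviving blocks have to produce features the rest of the network can reuse.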

arxiv preprint arxiv, pattern recognition, proceedings, (9 more...)

2411.09475

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Monaco (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

So, Junhyuk, Lee, Jungwon, Park, Eunhyeok

FRDiff: Feature Reuse for Exquisite Zero-shot Acceleration of Diffusion Models

arXiv.org Artificial Intelligence, Dec-6-2023

The substantial computational costs of diffusion models, particularly due to the repeated denoising steps crucial for high-quality image generation, present a major obstacle to their widespread adoption. While several studies have attempted to address this issue by reducing the number of score function evaluations using advanced ODE solvers without fine-tuning, the decreased number of denoising iterations misses the opportunity to update fine details, resulting in noticeable quality degradation. In our work, we introduce an advanced acceleration technique that leverages the temporal redundancy inherent in diffusion models. Reusing feature maps with high temporal similarity opens up a new opportunity to save computation without sacrificing output quality. To realize the practical benefits of this intuition, we conduct an extensive analysis and propose a novel method, FRDiff. FRDiff is designed to harness the advantages of both reduced NFE and feature reuse, achieving a Pareto frontier that balances fidelity and latency trade-offs in various generative tasks.
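The idea of reusing feature maps across neighbouring denoising steps can be sketched as a simple cache. This is a toy illustration, not FRDiff itself: the fixed `reuse_interval` schedule and the `expensive_block` and `cheap_update` callables are hypothetical stand-ins for the parts of a denoiser that are recomputed or reused.

```python
def run_denoising(num_steps, reuse_interval, expensive_block, cheap_update, x):
    """Iterate num_steps times, recomputing the costly feature every reuse_interval steps."""
    cached = None
    recomputed = 0
    for step in range(num_steps):
        if cached is None or step % reuse_interval == 0:
            # Refresh the cached feature; this is the expensive part.
            cached = expensive_block(x, step)
            recomputed += 1
        # Every step still updates x, but from the (possibly stale) cached feature.
        x = cheap_update(x, cached)
    return x, recomputed

# Toy stand-ins for the costly and cheap parts of one denoising step.
expensive = lambda x, step: 0.5 * x
cheap = lambda x, feat: x - 0.1 * feat

_, calls_with_reuse = run_denoising(10, 3, expensive, cheap, 1.0)
_, calls_without_reuse = run_denoising(10, 1, expensive, cheap, 1.0)
```

Here the step count stays at 10 in both runs; only the number of expensive evaluations changes (steps 0, 3, 6, and 9 with an interval of 3).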

diffusion model, feature map, frdiff, (14 more...)

2312.03517

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Chijiwa, Daiki, Yamaguchi, Shin'ya, Kumagai, Atsutoshi, Ida, Yasutoshi

Meta-ticket: Finding optimal subnetworks for few-shot learning within randomly initialized neural networks

arXiv.org Artificial Intelligence, Feb-9-2023

Few-shot learning for neural networks (NNs) is an important problem that aims to train NNs with only a few data points. The main challenge is how to avoid overfitting, since over-parameterized NNs can easily overfit to such a small dataset. Previous work (e.g., MAML by Finn et al. 2017) tackles this challenge by meta-learning, which learns how to learn from a few data points by using various tasks. On the other hand, one conventional approach to avoiding overfitting is to restrict the hypothesis space by endowing NNs with sparse structures, such as convolution layers in computer vision. However, although such manually designed sparse structures are sample-efficient for sufficiently large datasets, they are still insufficient for few-shot learning. The following questions then naturally arise: (1) Can we find sparse structures effective for few-shot learning by meta-learning? (2) What benefits will it bring in terms of meta-generalization? In this work, we propose a novel meta-learning approach, called Meta-ticket, to find optimal sparse subnetworks for few-shot learning within randomly initialized NNs. We empirically validate that Meta-ticket successfully discovers sparse subnetworks that can learn specialized features for each given task. Due to this task-wise adaptation ability, Meta-ticket achieves superior meta-generalization compared to MAML-based methods, especially with large NNs. The code is available at: https://github.com/dchiji-ntt/meta-ticket
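To make "a sparse subnetwork within a randomly initialized NN" concrete, the sketch below keeps the random weights fixed and selects a subnetwork with a binary mask. The abstract does not say how the mask is chosen; ranking entries by a per-weight score and keeping the top fraction is an assumption here, and `topk_mask`, `scores`, and `sparsity` are hypothetical names.

```python
import numpy as np

def topk_mask(scores, sparsity):
    """Keep the (1 - sparsity) fraction of entries with the highest scores."""
    k = int(round((1.0 - sparsity) * scores.size))
    mask = np.zeros(scores.size, dtype=bool)
    if k > 0:
        # Indices of the k largest scores in the flattened array.
        mask[np.argsort(scores.ravel())[-k:]] = True
    return mask.reshape(scores.shape)

rng = np.random.default_rng(0)
weights = rng.normal(size=(2, 2))             # random initialization, never updated
scores = np.array([[0.1, 0.9], [0.5, 0.3]])   # stand-in for learned scores

mask = topk_mask(scores, sparsity=0.5)
subnetwork = weights * mask                   # pruned weights are exactly zero
```

In this setup only the scores would be learned, so the search is over which connections to keep rather than over weight values.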

artificial intelligence, machine learning, meta-ticket, (17 more...)

2205.15619

Country:

North America > United States > California (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Sports (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence, Sep-20-2022

MAC: A Meta-Learning Approach for Feature Learning and Recombination

Tiwari, S., Gogoi, M., Verma, S., Singh, K. P.

Optimization-based meta-learning aims to learn an initialization so that a new unseen task can be learned within a few gradient updates. Model Agnostic Meta-Learning (MAML) is a benchmark algorithm comprising two optimization loops. The inner loop is dedicated to learning a new task and the outer loop leads to meta-initialization. However, the ANIL (almost no inner loop) algorithm shows that feature reuse is an alternative to rapid learning in MAML. Thus, the meta-initialization phase makes MAML primed for feature reuse and obviates the need for rapid learning. Contrary to ANIL, we hypothesize that there may be a need to learn new features during meta-testing. A new unseen task from a non-similar distribution would necessitate rapid learning in addition to reuse and recombination of existing features. In this paper, we invoke the width-depth duality of neural networks, wherein we increase the width of the network by adding extra computational units (ACUs). The ACUs enable the learning of new atomic features in the meta-testing task, and the associated increased width facilitates information propagation in the forward pass. The newly learnt features combine with existing features in the last layer for meta-learning. Experimental results show that our proposed MAC method outperformed the existing ANIL algorithm for non-similar task distributions by approximately 13% (5-shot task setting).
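The contrast between MAML-style rapid learning and ANIL-style feature reuse comes down to which parameters the inner loop updates. The sketch below shows only that inner loop for a toy two-layer linear model with squared loss; the model, the `adapt_body` flag, and all names are illustrative assumptions, and the outer (meta-initialization) loop and the ACUs of MAC are not shown.

```python
import numpy as np

def inner_loop(W, v, X, y, lr=0.1, steps=5, adapt_body=False):
    """Adapt to one task. adapt_body=False updates only the head v (ANIL-style);
    adapt_body=True also updates the feature extractor W (MAML-style)."""
    W, v = W.copy(), v.copy()
    n = len(y)
    for _ in range(steps):
        feats = X @ W                      # features from the body
        r = feats @ v - y                  # residuals of the squared loss
        grad_v = feats.T @ r / n           # gradient w.r.t. the head
        if adapt_body:
            grad_W = X.T @ np.outer(r, v) / n   # gradient w.r.t. the body
            W -= lr * grad_W
        v -= lr * grad_v
    return W, v

rng = np.random.default_rng(0)
X, y = rng.normal(size=(8, 3)), rng.normal(size=8)
W0, v0 = rng.normal(scale=0.5, size=(3, 2)), rng.normal(scale=0.5, size=2)

W_anil, v_anil = inner_loop(W0, v0, X, y, adapt_body=False)
W_maml, v_maml = inner_loop(W0, v0, X, y, adapt_body=True)
```

Under the head-only setting the features are reused unchanged, which is the regime the abstract argues can be insufficient for tasks drawn from a dissimilar distribution.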

artificial intelligence, machine learning, springer nature 2021, (16 more...)

2209.09613

Country: Asia > India > Uttar Pradesh (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)