AITopics | anil

Collaborating Authors

anil

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

Neural Information Processing Systems | Feb-9-2026, 05:14:45 GMT

However, the theoretical convergence of ANIL has not been studied yet.

artificial intelligence, inner loop, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Ohio (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

Neural Information Processing Systems | Dec-24-2025, 06:11:56 GMT

Although model-agnostic meta-learning (MAML) is a very successful algorithm in meta-learning practice, it can have a high computational cost because it updates all model parameters in both the inner loop of task-specific adaptation and the outer loop of meta-initialization training. A more efficient algorithm, ANIL (short for "almost no inner loop"), was recently proposed by Raghu et al. (2019); it adapts only a small subset of parameters in the inner loop and thus has a substantially lower computational cost than MAML, as demonstrated by extensive experiments. However, the theoretical convergence of ANIL has not been studied yet. In this paper, we characterize the convergence rate and the computational complexity of ANIL under two representative inner-loop loss geometries, i.e., strong convexity and nonconvexity. Our results show that such a geometric property can significantly affect the overall convergence performance of ANIL. For example, ANIL achieves a faster convergence rate for a strongly convex inner-loop loss as the number $N$ of inner-loop gradient descent steps increases, but a slower convergence rate for a nonconvex inner-loop loss as $N$ increases. Moreover, our complexity analysis provides a theoretical quantification of the improved efficiency of ANIL over MAML.
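The difference between the two inner loops described in this abstract can be illustrated with a minimal sketch. It assumes a toy two-layer linear model with a squared loss, which is not taken from the paper; the names `loss`, `grads`, `inner_loop`, and `adapt_body` are hypothetical. Only the inner loop is shown: the outer-loop meta-gradient is omitted.

```python
import numpy as np

def loss(W, w, X, y):
    # Squared loss of the toy model: prediction = (X @ W) @ w.
    residual = X @ W @ w - y
    return 0.5 * np.mean(residual ** 2)

def grads(W, w, X, y):
    # Gradients of the squared loss with respect to the body W and the head w.
    n = X.shape[0]
    residual = X @ W @ w - y
    grad_W = X.T @ np.outer(residual, w) / n
    grad_w = (X @ W).T @ residual / n
    return grad_W, grad_w

def inner_loop(W, w, X, y, num_steps, lr, adapt_body):
    # adapt_body=False: ANIL-style adaptation (head only, body frozen).
    # adapt_body=True: MAML-style adaptation (all parameters updated).
    W, w = W.copy(), w.copy()
    for _ in range(num_steps):
        grad_W, grad_w = grads(W, w, X, y)
        w = w - lr * grad_w
        if adapt_body:
            W = W - lr * grad_W
    return W, w

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = rng.normal(size=20)
W0 = rng.normal(size=(5, 3))   # shared body (assumed toy initialization)
w0 = rng.normal(size=3)        # task-specific head

# Step size 1/L for the head-only problem, which is quadratic in w.
features = X @ W0
step = 1.0 / np.linalg.eigvalsh(features.T @ features / len(y)).max()

W_anil, w_anil = inner_loop(W0, w0, X, y, num_steps=5, lr=step, adapt_body=False)
W_full, w_full = inner_loop(W0, w0, X, y, num_steps=5, lr=step, adapt_body=True)
```

In this toy setting the head-only inner loop is a convex quadratic problem in `w`, so each step with the chosen step size cannot increase the task loss, whereas adapting `W` and `w` jointly is nonconvex.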

convergence, meta-learning, task-specific adaptation, (11 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

84c578f202616448a2f80e6f56d5f16d-Paper.pdf

Neural Information Processing Systems | Aug-14-2025, 23:12:04 GMT

algorithm, anil, inner loop, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Ohio (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

Neural Information Processing Systems | May-27-2025, 04:47:11 GMT

artificial intelligence, machine learning, task-specific adaptation, (9 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Infinite Width Limits of Self Supervised Neural Networks

Fleissner, Maximilian, Anil, Gautham Govind, Ghoshdastidar, Debarghya

arXiv.org Artificial Intelligence | Nov-17-2024

The neural tangent kernel (NTK) is a widely used tool in the theoretical analysis of deep learning, allowing us to look at supervised deep neural networks through the lens of kernel regression. Recently, several works have investigated kernel models for self-supervised learning, hypothesizing that these also shed light on the behaviour of wide neural networks by virtue of the NTK. However, it remains an open question to what extent this connection is mathematically sound: it is a common misconception that the kernel behaviour of wide neural networks emerges irrespective of the loss function they are trained on. In this paper, we bridge the gap between the NTK and self-supervised learning, focusing on two-layer neural networks trained under the Barlow Twins loss. We prove that the NTK of Barlow Twins indeed becomes constant as the width of the network approaches infinity. Our analysis technique differs from previous works on the NTK and may be of independent interest. Overall, our work provides a first rigorous justification for the use of classic kernel theory to understand self-supervised learning of wide neural networks. Building on this result, we derive generalization error bounds for kernelized Barlow Twins and connect them to neural networks of finite width.
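For readers unfamiliar with the Barlow Twins loss mentioned in this abstract, the following is a minimal NumPy sketch of its commonly stated form: the cross-correlation matrix of two batches of embeddings is pushed towards the identity. The function name `barlow_twins_loss`, the weight `lam`, and the stabilizer `eps` are assumptions chosen for illustration; the sketch evaluates the loss on given embeddings and does not reproduce the two-layer network or the analysis of the paper.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lam=5e-3, eps=1e-12):
    # z_a, z_b: (batch, features) embeddings of two views of the same inputs.
    n = z_a.shape[0]
    # Standardize every feature over the batch (zero mean, unit variance).
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the features of the two views.
    c = z_a.T @ z_b / n
    # Invariance term: diagonal entries should be close to 1.
    on_diag = np.sum((1.0 - np.diag(c)) ** 2)
    # Redundancy-reduction term: off-diagonal entries should be close to 0.
    off_diag = np.sum(c ** 2) - np.sum(np.diag(c) ** 2)
    return on_diag + lam * off_diag

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 4))
loss_same = barlow_twins_loss(z, z)                            # identical views
loss_diff = barlow_twins_loss(z, rng.normal(size=(256, 4)))    # unrelated views
```

Identical views give a diagonal of ones, so only the small off-diagonal term remains; unrelated views leave the diagonal near zero and the loss is correspondingly larger.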

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

2411.11176

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

Neural Information Processing Systems | Oct-10-2024, 16:40:21 GMT

convergence rate, partial parameter, task-specific adaptation, (7 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Few-shot Multi-Task Learning of Linear Invariant Features with Meta Subspace Pursuit

Zhang, Chaozhi, Liu, Lin, Zhang, Xiaoqun

arXiv.org Artificial Intelligence | Sep-4-2024

Data scarcity poses a serious threat to modern machine learning and artificial intelligence, as their practical success typically relies on the availability of big datasets. One effective strategy to mitigate the issue of insufficient data is to first harness information from other data sources possessing certain similarities in the study design stage, and then employ the multi-task or meta learning framework in the analysis stage. In this paper, we focus on multi-task (or multi-source) linear models whose coefficients across tasks share an invariant low-rank component, a popular structural assumption considered in the recent multi-task or meta learning literature. Under this assumption, we propose a new algorithm, called Meta Subspace Pursuit (abbreviated as Meta-SP), that provably learns this invariant subspace shared by different tasks. Under this stylized setup for multi-task or meta learning, we establish both the algorithmic and statistical guarantees of the proposed method. Extensive numerical experiments are conducted, comparing Meta-SP against several competing methods, including popular, off-the-shelf model-agnostic meta learning algorithms such as ANIL. These experiments demonstrate that Meta-SP achieves superior performance over the competing methods in various aspects.
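The structural assumption in this abstract (linear tasks whose coefficients share a low-rank component) can be made concrete with a small simulation. The sketch below is not Meta-SP: it substitutes a plain baseline that fits each task by least squares and takes the top singular vectors of the stacked coefficients. It also assumes noise-free data and at least as many samples per task as dimensions, which is easier than the few-shot regime the paper targets. All function names are hypothetical.

```python
import numpy as np

def simulate_tasks(num_tasks, dim, rank, samples_per_task, rng):
    # Shared subspace: orthonormal basis of a random rank-dimensional subspace.
    basis, _ = np.linalg.qr(rng.normal(size=(dim, rank)))
    tasks = []
    for _ in range(num_tasks):
        beta = basis @ rng.normal(size=rank)   # task coefficients lie in the subspace
        X = rng.normal(size=(samples_per_task, dim))
        tasks.append((X, X @ beta))            # noise-free responses (assumption)
    return basis, tasks

def estimate_subspace(tasks, rank):
    # Baseline, not Meta-SP: per-task least squares, then SVD of the stacked fits.
    betas = [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in tasks]
    left, _, _ = np.linalg.svd(np.column_stack(betas), full_matrices=False)
    return left[:, :rank]

rng = np.random.default_rng(0)
basis, tasks = simulate_tasks(num_tasks=30, dim=10, rank=2, samples_per_task=15, rng=rng)
estimate = estimate_subspace(tasks, rank=2)

# Compare subspaces through their projection matrices, which do not depend
# on the particular choice of basis.
gap = np.linalg.norm(basis @ basis.T - estimate @ estimate.T)
```

With fewer samples per task than dimensions, the per-task least-squares fits no longer lie in the shared subspace, which is the regime where a dedicated method such as the one proposed in the paper becomes relevant.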

algorithm, iteration, meta-sp, (17 more...)

arXiv.org Artificial Intelligence

2409.02708

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > California (0.04)

Genre: Research Report > Experimental Study (0.34)

Industry:

Health & Medicine (0.68)
Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Exploring Active Learning in Meta-Learning: Enhancing Context Set Labeling

Bae, Wonho, Wang, Jing, Sutherland, Danica J.

arXiv.org Artificial Intelligence | Nov-6-2023

Most meta-learning methods assume that the (very small) context set used to establish a new task at test time is passively provided. In some settings, however, it is feasible to actively select which points to label; the potential gain from a careful choice is substantial, but the setting differs in major ways from typical active learning setups. We clarify the ways in which active meta-learning can be used to label a context set, depending on which parts of the meta-learning process use active learning. Within this framework, we propose a natural algorithm based on fitting Gaussian mixtures for selecting which points to label; though simple, the algorithm also has theoretical motivation. The proposed algorithm outperforms state-of-the-art active learning methods when used with various meta-learning algorithms across several benchmark datasets.
Meta-learning has gained significant prominence as a substitute for traditional "plain" supervised learning tasks, with the aim of adapting or generalizing to new tasks given extremely limited data. There has been enormous success compared to learning "from scratch" on each new problem, but could we do even better, with even less data? One major way to improve data efficiency in standard supervised learning settings is to move to an active learning paradigm, where typically a model can request a small number of labels from a pool of unlabeled data; these are collected and used to further train the model, and the process is repeated. Although each of these lines of research is quite developed, their combination, active meta-learning, has seen comparatively little research attention. Given that both focus on improving data efficiency, it seems very natural to investigate further. How can a meta-learner exploit an active learning setup to learn the best model possible, using only a very small number of labels in its context sets?
We are aware of two previous attempts at active selection of context sets in meta-learning: Müller et al. (2022) do so at meta-training time for text classification, while Boney & Ilin (2017) do it at meta-test time in semi-supervised few-shot image classification with ProtoNet (Snell et al., 2017). "Active meta-learning" thus means very different things in their procedures; these approaches are also entirely different from work on active selection of tasks during meta-training (as in Kaddour et al., 2020; Nikoloska & Simeone, 2022; Kumar et al., 2022). Our first contribution is therefore to clarify the different ways in which active learning can be applied to meta-learning, for differing purposes.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2311.02879

Country:

North America > Canada > British Columbia (0.04)
North America > United States > Virginia (0.04)
North America > United States > California (0.04)
(2 more...)

Genre:

Overview (0.87)
Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation

Wang, Haoxiang, Zhao, Han, Li, Bo

arXiv.org Machine Learning | Jun-16-2021

Multi-task learning (MTL) aims to improve the generalization of several related tasks by learning them jointly. By comparison, in addition to the joint training scheme, modern meta-learning allows for unseen tasks with limited labels during the test phase, with the goal of adapting to them quickly. Despite the subtle difference between MTL and meta-learning in the problem formulation, both learning paradigms share the insight that the shared structure between existing training tasks can lead to better generalization and adaptation. In this paper, we take one important step further to understand the close connection between these two learning paradigms, through both theoretical analysis and empirical investigation. Theoretically, we first demonstrate that MTL shares the same optimization formulation as a class of gradient-based meta-learning (GBML) algorithms. We then prove that for over-parameterized neural networks with sufficient depth, the learned predictive functions of MTL and GBML are close. In particular, this result implies that the predictions given by these two models are similar over the same unseen task. Empirically, we corroborate our theoretical findings by showing that, with proper implementation, MTL is competitive against state-of-the-art GBML algorithms on a set of few-shot image classification benchmarks. Since existing GBML algorithms often involve costly second-order bi-level optimization, our first-order MTL method is an order of magnitude faster on large-scale datasets such as mini-ImageNet. We believe this work could help bridge the gap between these two learning paradigms, and provide a computationally efficient alternative to GBML that also supports fast task adaptation.

anil, bridging multi-task learning, mtl, (14 more...)

arXiv.org Machine Learning

2106.09017

Country: