Extending Contextual Self-Modulation: Meta-Learning Across Modalities, Task Dimensionalities, and Data Regimes

Nzoyem, Roussel Desmond, Barton, David A. W., Deakin, Tom

arXiv.org Artificial Intelligence 

Contextual Self-Modulation (CSM) is a potent regularization mechanism for the Neural Context Flow (NCF) framework which demonstrates powerful meta-learning of physical systems. However, CSM has limitations in its applicability across different modalities and in high-data regimes. In this work, we introduce two extensions: iCSM, which expands CSM to infinite-dimensional tasks, and StochasticNCF, which improves scalability. These extensions are demonstrated through comprehensive experimentation on a range of tasks, including dynamical systems with parameter variations, computer vision challenges, and curve fitting problems. StochasticNCF enables the application of both CSM and iCSM to high-data scenarios by providing an unbiased approximation of meta-gradient updates through a sampled set of nearest environments. Additionally, we incorporate higher-order Taylor expansions via Taylor-Mode automatic differentiation, revealing that higher-order approximations do not necessarily enhance generalization. Finally, we demonstrate how CSM can be integrated into other meta-learning frameworks with FlashCAVIA, a computationally efficient extension of the CAVIA meta-learning framework (Zintgraf et al., 2019). FlashCAVIA outperforms its predecessor across various benchmarks and reinforces the utility of bi-level optimization techniques. Together, these contributions establish a robust framework for tackling an expanded spectrum of meta-learning tasks, offering practical insights for out-of-distribution generalization. Our open-source library, designed for flexible integration of self-modulation into contextual meta-learning workflows, is available at github.com/ddrous/self-mod.

Meta-learning has emerged as a powerful paradigm in machine learning, addressing the limitations of conventional approaches that train a single algorithm for a specific task. It aims to develop models capable of rapid adaptation to novel but related tasks with minimal data, a process often referred to as "learning to learn" (Wang et al., 2021). By leveraging information shared across multiple training environments (meta-knowledge), meta-learning algorithms can efficiently adapt to new scenarios without starting from scratch (Hospedales et al., 2021). The success of meta-learning has been demonstrated in various domains, including dynamical system reconstruction (Norcliffe et al., 2021), program induction (Devlin et al., 2017), out-of-distribution (OoD) generalization (Yao et al., 2021), and continual learning (Hurtado et al., 2021).