Sparse Gaussian Neural Processes

Rochussen, Tommy, Fortuin, Vincent

arXiv.org Machine Learning 

While many models have been developed that can produce probabilistic predictions, predictions are often required for multiple related tasks, such that it would be desirable to have a probabilistic model that can make rapid predictions on new tasks without the need for task-specific training. Such is the case in the probabilistic meta-learning paradigm. While meta-learning has received an abundance of attention from the research community over the last decade (Finn et al., 2017; Gordon et al., 2019; Hospedales et al., 2022), the most notable class of probabilistic meta-model is, without doubt, the neural process family (NP; Garnelo et al., 2018a,b; Dubois et al., 2020). Recent advances in NPs have led them to reach impressive heights in performance, representing the state of the art in data-driven approaches to weather and climate modeling (Bodnar et al., 2024; Allen et al., 2025; Ashman et al., 2024b), for example.

Despite such impressive performance, industry practitioners seldom opt for deep learning models owing to their inherent lack of interpretability (Li et al., 2022). Instead, they prefer more traditional approaches such as kernel methods (Hofmann et al., 2008), which are easier to explain to non-technical stakeholders even though they are incapable of meta-learning. Perhaps the most ubiquitous probabilistic model that practitioners turn to is the Gaussian process (GP; Rasmussen and Williams, 2005). With GPs, users can leverage their domain expertise to specify meaningful priors with which to bias predictions, any free parameters tend to have clear interpretations, and schemes such as automatic relevance determination can reveal which input dimensions are relevant to prediction.
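To make the interpretability argument concrete, the following is a minimal sketch (not from the paper) of exact GP regression with an ARD (automatic relevance determination) RBF kernel, using NumPy only. Each input dimension gets its own lengthscale: a large lengthscale means the corresponding dimension barely influences the prediction, which is the kind of directly interpretable free parameter the text refers to. The function names and toy data are illustrative assumptions, not from the source.

```python
# Minimal GP-with-ARD sketch (illustrative, NumPy only).
import numpy as np

def ard_rbf(X1, X2, lengthscales, variance=1.0):
    # RBF kernel with a separate lengthscale per input dimension (ARD).
    d = (X1[:, None, :] - X2[None, :, :]) / lengthscales
    return variance * np.exp(-0.5 * np.sum(d**2, axis=-1))

def gp_posterior(X, y, Xs, lengthscales, noise=1e-2):
    # Standard exact GP regression equations (Rasmussen and Williams, 2005, Ch. 2).
    K = ard_rbf(X, X, lengthscales) + noise * np.eye(len(X))
    Ks = ard_rbf(X, Xs, lengthscales)
    Kss = ard_rbf(Xs, Xs, lengthscales)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks.T @ alpha                # posterior mean at test inputs
    v = np.linalg.solve(L, Ks)
    cov = Kss - v.T @ v                # posterior covariance at test inputs
    return mean, cov

# Toy data: two input dimensions, but only the first one matters.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(30, 2))
y = np.sin(X[:, 0])
# Short lengthscale on the relevant dimension, long on the irrelevant one.
mean, cov = gp_posterior(X, y, X, lengthscales=np.array([0.5, 10.0]))
```

In practice the lengthscales would be learned by maximising the marginal likelihood, after which inspecting them tells a stakeholder which inputs the model considers relevant.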