ncf
Recommending Composite Items Using Multi-Level Preference Information: A Joint Interaction Modeling Approach
Bi, Xuan, Wang, Yaqiong, Adomavicius, Gediminas, Curley, Shawn
Recommender systems have become ubiquitous across a wide range of fields, such as ecommerce, media consumption (including movies, books, music, news, etc.), social networks, finance, and many others, due to their effectiveness in identifying relevant items or content among numerous choices [1, 2]. Traditionally, recommender systems, largely based on collaborative filtering techniques, have focused on recommending individual (or "atomic") items, such as movies or books, by understanding users' preferences for these individual items. However, in certain application domains, recommending "composite" items (i.e., combinations of atomic items) represents a very important capability. For illustration, consider a clothing/fashion recommender system, where we want to recommend "outfits" - combinations of tops (t-shirts, shirts, sweaters) and bottoms (pants, skirts, shorts) - to users. In such a case, multiple fashion items in a recommended outfit ideally have to match both functionally and stylistically, which may require domain expertise (e.g., on things like style compatibility) beyond individual preferences. Another key challenge for such recommender systems is that a given user's personal preference for a composite item may not directly translate to the user's personal preferences for the underlying atomic items and vice versa.
Fair Play for Individuals, Foul Play for Groups? Auditing Anonymization's Impact on ML Fairness
Arcolezi, Héber H., Alishahi, Mina, Bendoukha, Adda-Akram, Kaaniche, Nesrine
Machine learning (ML) algorithms are heavily based on the availability of training data, which, depending on the domain, often includes sensitive information about data providers. This raises critical privacy concerns. Anonymization techniques have emerged as a practical solution to address these issues by generalizing features or suppressing data to make it more difficult to accurately identify individuals. Although recent studies have shown that privacy-enhancing technologies can influence ML predictions across different subgroups, thus affecting fair decision-making, the specific effects of anonymization techniques, such as $k$-anonymity, $\ell$-diversity, and $t$-closeness, on ML fairness remain largely unexplored. In this work, we systematically audit the impact of anonymization techniques on ML fairness, evaluating both individual and group fairness. Our quantitative study reveals that anonymization can degrade group fairness metrics by up to fourfold. Conversely, similarity-based individual fairness metrics tend to improve under stronger anonymization, largely as a result of increased input homogeneity. By analyzing varying levels of anonymization across diverse privacy settings and data distributions, this study provides critical insights into the trade-offs between privacy, fairness, and utility, offering actionable guidelines for responsible AI development. Our code is publicly available at: https://github.com/hharcolezi/anonymity-impact-fairness.
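To make the notion of $k$-anonymity in the abstract above concrete, here is a minimal sketch in Python. It is not taken from the authors' repository: the toy records, the choice of `age` and `zip` as quasi-identifiers, and the helper names `k_anonymity` and `generalize_age` are all assumptions made for illustration. A table is $k$-anonymous when every combination of quasi-identifier values is shared by at least $k$ records, and generalizing a feature (here, bucketing ages) is one way to raise $k$.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same quasi-identifier values (the table's k)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

def generalize_age(record, width):
    """Replace an exact age with a coarse bucket, e.g. 34 -> '30-39' for width 10."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}

# Hypothetical toy table; "income" plays the role of the sensitive attribute.
records = [
    {"age": 31, "zip": "130", "income": "high"},
    {"age": 34, "zip": "130", "income": "low"},
    {"age": 38, "zip": "130", "income": "low"},
    {"age": 42, "zip": "148", "income": "high"},
    {"age": 47, "zip": "148", "income": "low"},
]

raw_k = k_anonymity(records, ["age", "zip"])
generalized = [generalize_age(r, 10) for r in records]
gen_k = k_anonymity(generalized, ["age", "zip"])
print(raw_k, gen_k)
```

After generalization, records within a group become indistinguishable on the quasi-identifiers. This is the kind of increased input homogeneity that the abstract links to improved similarity-based individual fairness scores.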
Learning Neural Networks by Neuron Pursuit
The first part of this paper studies the evolution of gradient flow for homogeneous neural networks near a class of saddle points exhibiting a sparsity structure. The choice of these saddle points is motivated by previous works on homogeneous networks, which identified the first saddle point encountered by gradient flow after escaping the origin. It is shown here that, when initialized sufficiently close to such saddle points, gradient flow remains near the saddle point for a sufficiently long time, during which the weights with small norm remain small but converge in direction. Furthermore, important empirical observations are made on the behavior of gradient descent after escaping these saddle points. The second part of the paper, motivated by these results, introduces a greedy algorithm to train deep neural networks called Neuron Pursuit (NP). It is an iterative procedure which alternates between expanding the network by adding neuron(s) with carefully chosen weights, and minimizing the training loss using this augmented network. The efficacy of the proposed algorithm is validated using numerical experiments.
Neural Conjugate Flows: Physics-informed architectures with flow structure
Bizzi, Arthur, Nissenbaum, Lucas, Pereira, João M.
We introduce Neural Conjugate Flows (NCF), a class of neural network architectures equipped with exact flow structure. By leveraging topological conjugation, we prove that these networks are not only naturally isomorphic to a continuous group, but are also universal approximators for flows of ordinary differential equations (ODEs). Furthermore, topological properties of these flows can be enforced by the architecture in an interpretable manner. We demonstrate in numerical experiments how this topological group structure leads to concrete computational gains over other physics-informed neural networks in estimating and extrapolating latent dynamics of ODEs, while training up to five times faster than other flow-based architectures.
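The idea of obtaining exact flow structure through topological conjugation can be illustrated with a small Python sketch. It is only a toy under stated assumptions, not the architecture from the paper: the invertible map `h` is a hand-picked function (`sinh`) rather than a trained invertible network, and the simple flow is the scalar linear flow `exp(A * t) * y` with an arbitrary rate `A`. Any map of the form `h_inv(psi_t(h(x)))` inherits the group property of `psi_t`, which the code checks numerically.

```python
import math

A = -0.7  # rate of the simple linear flow psi_t(y) = exp(A * t) * y (illustrative value)

def h(x):
    """Hand-picked invertible map standing in for a learned homeomorphism."""
    return math.sinh(x)

def h_inv(y):
    """Exact inverse of h."""
    return math.asinh(y)

def flow(t, x):
    """Conjugate flow: map into the simple coordinates, evolve, map back."""
    return h_inv(math.exp(A * t) * h(x))

x0, t, s = 1.3, 0.4, 1.1

# Group property: evolving for s and then for t equals evolving for t + s.
composed = flow(t, flow(s, x0))
direct = flow(t + s, x0)
group_gap = abs(composed - direct)

# Identity property: evolving for zero time leaves the state unchanged.
identity_gap = abs(flow(0.0, x0) - x0)
print(group_gap, identity_gap)
```

Both gaps are zero up to floating-point rounding, and they would stay so for any invertible `h`, which is why the flow structure holds by construction rather than being learned approximately.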
Extending Contextual Self-Modulation: Meta-Learning Across Modalities, Task Dimensionalities, and Data Regimes
Nzoyem, Roussel Desmond, Barton, David A. W., Deakin, Tom
Contextual Self-Modulation (CSM) is a potent regularization mechanism for the Neural Context Flow (NCF) framework which demonstrates powerful meta-learning of physical systems. However, CSM has limitations in its applicability across different modalities and in high-data regimes. In this work, we introduce two extensions: iCSM, which expands CSM to infinite-dimensional tasks, and StochasticNCF, which improves scalability. These extensions are demonstrated through comprehensive experimentation on a range of tasks, including dynamical systems with parameter variations, computer vision challenges, and curve fitting problems. StochasticNCF enables the application of both CSM and iCSM to high-data scenarios by providing an unbiased approximation of meta-gradient updates through a sampled set of nearest environments. Additionally, we incorporate higher-order Taylor expansions via Taylor-Mode automatic differentiation, revealing that higher-order approximations do not necessarily enhance generalization. Finally, we demonstrate how CSM can be integrated into other meta-learning frameworks with FlashCAVIA, a computationally efficient extension of the CAVIA meta-learning framework (Zintgraf et al., 2019). FlashCAVIA outperforms its predecessor across various benchmarks and reinforces the utility of bi-level optimization techniques. Together, these contributions establish a robust framework for tackling an expanded spectrum of meta-learning tasks, offering practical insights for out-of-distribution generalization. Our open-sourced library, designed for flexible integration of self-modulation into contextual meta-learning workflows, is available at github.com/ddrous/self-mod. Meta-learning has emerged as a powerful paradigm in machine learning, addressing the limitations of conventional approaches that train a single algorithm for a specific task.
This innovative technique aims to develop models capable of rapid adaptation to novel but related tasks with minimal data, a process often referred to as "learning to learn" (Wang et al., 2021). By leveraging common information across multiple training environments (or meta-knowledge), meta-learning algorithms can efficiently adapt to new scenarios without starting from scratch (Hospedales et al., 2021). The success of meta-learning has been demonstrated in various domains, including dynamical system reconstruction (Norcliffe et al., 2021), program induction (Devlin et al., 2017), out-of-distribution (OoD) generalization (Yao et al., 2021), and continual learning (Hurtado et al., 2021).
Neural Context Flows for Learning Generalizable Dynamical Systems
Nzoyem, Roussel Desmond, Barton, David A. W., Deakin, Tom
Neural Ordinary Differential Equations typically struggle to generalize to new dynamical behaviors created by parameter changes in the underlying system, even when the dynamics are close to previously seen behaviors. The issue gets worse when the changing parameters are unobserved, i.e., their value or influence is not directly measurable when collecting data. We introduce Neural Context Flow (NCF), a framework that encodes said unobserved parameters in a latent context vector as input to a vector field. NCFs leverage differentiability of the vector field with respect to the parameters, along with first-order Taylor expansion to allow any context vector to influence trajectories from other parameters. We validate our method and compare it to established Multi-Task and Meta-Learning alternatives, showing competitive performance in mean squared error for in-domain and out-of-distribution evaluation on the Lotka-Volterra, Glycolytic Oscillator, and Gray-Scott problems. This study holds practical implications for foundational models in science and related areas that benefit from conditional neural ODEs. Our code is openly available at https://github.com/ddrous/ncflow.
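The first-order Taylor expansion in the context vector mentioned above can be sketched as follows. This is a hedged toy in plain Python, not code from the ncflow repository: `vector_field` is a hypothetical scalar field chosen to be nonlinear in its context, the context is a single number rather than a latent vector, and the derivative is estimated by central finite differences where an automatic-differentiation library would normally be used. The sketch approximates the field at context `ctx_j` using only the field and its context-derivative at a nearby context `ctx_i`.

```python
import math

def vector_field(x, ctx):
    """Toy vector field dx/dt = f(x, ctx); hypothetical and nonlinear in ctx."""
    return -math.exp(ctx) * x + math.sin(ctx)

def d_ctx(f, x, ctx, eps=1e-5):
    """Central finite-difference estimate of the derivative of f with respect to ctx."""
    return (f(x, ctx + eps) - f(x, ctx - eps)) / (2 * eps)

def taylor_field(f, x, ctx_i, ctx_j):
    """First-order expansion of f(x, ctx_j) around the context ctx_i."""
    return f(x, ctx_i) + d_ctx(f, x, ctx_i) * (ctx_j - ctx_i)

x, ctx_i, ctx_j = 0.5, 0.2, 0.25

exact = vector_field(x, ctx_j)
first_order = taylor_field(vector_field, x, ctx_i, ctx_j)

taylor_err = abs(first_order - exact)                 # error with the correction term
zeroth_err = abs(vector_field(x, ctx_i) - exact)      # error when reusing ctx_i unchanged
print(taylor_err, zeroth_err)
```

For nearby contexts the first-order term removes most of the discrepancy, which is the mechanism that lets information from one environment's context inform trajectories in another. The approximation degrades as the two contexts move apart.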
Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Neural networks have achieved remarkable success across various tasks, yet the precise mechanism driving this success remains theoretically elusive. The training of neural networks involves optimizing a non-convex loss function, where the training algorithm is typically a first-order method such as gradient descent or its variants. A particularly puzzling aspect is how these training algorithms succeed in finding a solution with good generalization capabilities despite the non-convexity of the loss landscape. In addition to the choice of the training algorithm, the choice of initialization in these algorithms plays a crucial role in determining the neural network performance. Indeed, recent works have made increasingly clear the benefit of small initializations, revealing that neural networks trained using (stochastic) gradient descent with small initializations exhibit feature learning [2] and also generalize better for various tasks [3, 4, 5]; see Section 2 for more details on the impact of initialization scale. However, for small initializations, the training dynamics of neural networks are extremely non-linear and not well understood so far. Our focus in this paper is on understanding the effect of small initialization on the training dynamics of neural networks. In pursuit of a deeper understanding of the training mechanism for small initializations, researchers have uncovered the phenomenon of directional convergence in the neural network weights during early phases of training [6, 7]. In [6], the authors study the gradient flow dynamics of training two-layer Rectified Linear Unit (ReLU) neural networks.
Neural Contact Fields: Tracking Extrinsic Contact with Tactile Sensing
Higuera, Carolina, Dong, Siyuan, Boots, Byron, Mukadam, Mustafa
We present Neural Contact Fields, a method that brings together neural fields and tactile sensing to address the problem of tracking extrinsic contact between object and environment. Knowing where the external contact occurs is a first step towards methods that can actively control it in facilitating downstream manipulation tasks. Prior work for localizing environmental contacts typically assumes a contact type (e.g. point or line), does not capture contact/no-contact transitions, and only works with basic geometric-shaped objects. Neural Contact Fields are the first method that can track arbitrary multi-modal extrinsic contacts without making any assumptions about the contact type. Our key insight is to estimate the probability of contact for any 3D point in the latent space of object shapes, given vision-based tactile inputs that sense the local motion resulting from the external contact. In experiments, we find that Neural Contact Fields are able to localize multiple contact patches without making any assumptions about the geometry of the contact, and capture contact/no-contact transitions for known categories of objects with unseen shapes in unseen environment configurations. In addition to Neural Contact Fields, we also release our YCB-Extrinsic-Contact dataset of simulated extrinsic contact interactions to enable further research in this area. Project page: https://github.com/carolinahiguera/NCF