Goto

Collaborating Authors

 nfm


Demystifying Network Foundation Models

Beltiukov, Sylee, Guthula, Satyandra, Guo, Wenbo, Willinger, Walter, Gupta, Arpit

arXiv.org Artificial Intelligence

This work presents a systematic investigation into the latent knowledge encoded within Network Foundation Models (NFMs) that focuses on hidden representations analysis rather than pure downstream task performance. Different from existing efforts, we analyze the models through a three-part evaluation: Embedding Geometry Analysis to assess representation space utilization, Metric Alignment Assessment to measure correspondence with domain-expert features, and Causal Sensitivity Testing to evaluate robustness to protocol perturbations. Using five diverse network datasets spanning controlled and real-world environments, we evaluate four state-of-the-art NFMs, revealing that they all exhibit significant anisotropy, inconsistent feature sensitivity patterns, an inability to separate the high-level context, payload dependency, and other properties. Our work identifies numerous limitations across all models and demonstrates that addressing them can significantly improve model performance (by up to +0.35 $F_1$ score without architectural changes).


Neural Fourier Modelling: A Highly Compact Approach to Time-Series Analysis

Kim, Minjung, Hioka, Yusuke, Witbrock, Michael

arXiv.org Machine Learning

Neural time-series analysis has traditionally focused on modeling data in the time domain, often with some approaches incorporating equivalent Fourier domain representations as auxiliary spectral features. In this work, we shift the main focus to frequency representations, modeling time-series data fully and directly in the Fourier domain. We introduce Neural Fourier Modelling (NFM), a compact yet powerful solution for time-series analysis. NFM is grounded in two key properties of the Fourier transform (FT): (i) the ability to model finite-length time series as functions in the Fourier domain, treating them as continuous-time elements in function space, and (ii) the capacity for data manipulation (such as resampling and timespan extension) within the Fourier domain. We reinterpret Fourier-domain data manipulation as frequency extrapolation and interpolation, incorporating this as a core learning mechanism in NFM, applicable across various tasks. To support flexible frequency extension with spectral priors and effective modulation of frequency representations, we propose two learning modules: Learnable Frequency Tokens (LFT) and Implicit Neural Fourier Filters (INFF). These modules enable compact and expressive modeling in the Fourier domain. Extensive experiments demonstrate that NFM achieves state-of-the-art performance on a wide range of tasks (forecasting, anomaly detection, and classification), including challenging time-series scenarios with previously unseen sampling rates at test time. Moreover, NFM is highly compact, requiring fewer than 40K parameters in each task, with time-series lengths ranging from 100 to 16K.


Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features

Radhakrishnan, Adityanarayanan, Beaglehole, Daniel, Pandit, Parthe, Belkin, Mikhail

arXiv.org Artificial Intelligence

In recent years neural networks have achieved impressive results on many technological and scientific tasks. Yet, the mechanism through which these models automatically select features, or patterns in data, for prediction remains unclear. Identifying such a mechanism is key to advancing performance and interpretability of neural networks and promoting reliable adoption of these models in scientific applications. In this paper, we identify and characterize the mechanism through which deep fully connected neural networks learn features. We posit the Deep Neural Feature Ansatz, which states that neural feature learning occurs by implementing the average gradient outer product to up-weight features strongly related to model output. Our ansatz sheds light on various deep learning phenomena including emergence of spurious features and simplicity biases and how pruning networks can increase performance, the "lottery ticket hypothesis." Moreover, the mechanism identified in our work leads to a backpropagation-free method for feature learning with any machine learning model. To demonstrate the effectiveness of this feature learning mechanism, we use it to enable feature learning in classical, non-feature learning models known as kernel machines and show that the resulting models, which we refer to as Recursive Feature Machines, achieve state-of-the-art performance on tabular data.


AI helps with drug discovery

#artificialintelligence

Drug-target interaction is a prominent research area in drug discovery, which refers to the recognition of interactions between chemical compounds and the protein targets. Chemists estimate that 1060 compounds with drug-like properties could be made--that's more than the total number of atoms in the Solar System, as an article reported in the journal Nature in 2017. Drug development, on average, takes about 14 years and costs up to 1.5 billion dollars. During the journey of drug discovery in this vast "galaxy," it is apparent that traditional biological experiments for DTI detection are normally costly and time-consuming. Prof. Hou Tingjun is an expert in computer-aided drug design (CADD) at the Zhejiang University College of Pharmaceutical Sciences.


Noisy Feature Mixup

Lim, Soon Hoe, Erichson, N. Benjamin, Utrera, Francisco, Xu, Winnie, Mahoney, Michael W.

arXiv.org Machine Learning

We introduce Noisy Feature Mixup (NFM), an inexpensive yet effective method for data augmentation that combines the best of interpolation based training and noise injection schemes. Rather than training with convex combinations of pairs of examples and their labels, we use noise-perturbed convex combinations of pairs of data points in both input and feature space. This method includes mixup and manifold mixup as special cases, but it has additional advantages, including better smoothing of decision boundaries and enabling improved model robustness. We provide theory to understand this as well as the implicit regularization effects of NFM. Our theory is supported by empirical results, demonstrating the advantage of NFM, as compared to mixup and manifold mixup. We show that residual networks and vision transformers trained with NFM have favorable trade-offs between predictive accuracy on clean data and robustness with respect to various types of data perturbation across a range of computer vision benchmark datasets.


nfm2021

#artificialintelligence

The widespread use and increasing complexity of mission-critical and safety-critical systems at NASA and in the aerospace industry require advanced techniques that address these systems' specification, design, verification, validation, and certification requirements. The NASA Formal Methods Symposium (NFM) is a forum to foster collaboration between theoreticians and practitioners from NASA, academia, and industry. NFM's goals are to identify challenges and to provide solutions for achieving assurance for such critical systems. New developments and emerging applications like autonomous software for Unmanned Aerial Systems (UAS), UAS Traffic Management (UTM), advanced separation assurance algorithms for aircraft, and the need for system-wide fault detection, diagnosis, and prognostics provide new challenges for system specification, development, and verification approaches. Similar challenges need to be addressed during development and deployment of on-board software for both spacecraft and ground systems.


Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers

Lamb, Alex, Goyal, Anirudh, Słowik, Agnieszka, Mozer, Michael, Beaudoin, Philippe, Bengio, Yoshua

arXiv.org Machine Learning

Feed-forward neural networks consist of a sequence of layers, in which each layer performs some processing on the information from the previous layer. A downside to this approach is that each layer (or module, as multiple modules can operate in parallel) is tasked with processing the entire hidden state, rather than a particular part of the state which is most relevant for that module. Methods which only operate on a small number of input variables are an essential part of most programming languages, and they allow for improved modularity and code re-usability. Our proposed method, Neural Function Modules (NFM), aims to introduce the same structural capability into deep learning. Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems. The key contribution of our work is to combine attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm which, as we show, improves the results in standard classification, out-of-domain generalization, generative modeling, and learning representations in the context of reinforcement learning.