Goto

Collaborating Authors

 model identification


RoFL: Robust Fingerprinting of Language Models

Tsai, Yun-Yun, Guo, Chuan, Yang, Junfeng, van der Maaten, Laurens

arXiv.org Artificial Intelligence

AI developers are releasing large language models (LLMs) under a variety of different licenses. Many of these licenses restrict the ways in which the models or their outputs may be used. This raises the question how license violations may be recognized. In particular, how can we identify that an API or product uses (an adapted version of) a particular LLM? We present a new method that enable model developers to perform such identification via fingerprints: statistical patterns that are unique to the developer's model and robust to common alterations of that model. Our method permits model identification in a black-box setting using a limited number of queries, enabling identification of models that can only be accessed via an API or product. The fingerprints are non-invasive: our method does not require any changes to the model during training, hence by design, it does not impact model quality. Empirically, we find our method provides a high degree of robustness to common changes in the model or inference settings. In our experiments, it substantially outperforms prior art, including invasive methods that explicitly train watermarks into the model.


CGI: Identifying Conditional Generative Models with Example Images

Zhou, Zhi, Tan, Hao-Zhe, Song, Peng-Xiao, Guo, Lan-Zhe

arXiv.org Artificial Intelligence

Generative models have achieved remarkable performance recently, and thus model hubs have emerged. Existing model hubs typically assume basic text matching is sufficient to search for models. However, in reality, due to different abstractions and the large number of models in model hubs, it is not easy for users to review model descriptions and example images, choosing which model best meets their needs. Therefore, it is necessary to describe model functionality wisely so that future users can efficiently search for the most suitable model for their needs. Efforts to address this issue remain limited. In this paper, we propose Conditional Generative Model Identification (CGI), which aims to provide an effective way to identify the most suitable model using user-provided example images rather than requiring users to manually review a large number of models with example images. To address this problem, we propose the PromptBased Model Identification (PMI) , which can adequately describe model functionality and precisely match requirements with specifications. To evaluate PMI approach and promote related research, we provide a benchmark comprising 65 models and 9100 identification tasks. Extensive experimental and human evaluation results demonstrate that PMI is effective. For instance, 92% of models are correctly identified with significantly better FID scores when four example images are provided.


Data-driven discovery of mechanical models directly from MRI spectral data

Heesterbeek, D. G. J., van Riel, M. H. C., van Leeuwen, T., Berg, C. A. T. van den, Sbrizzi, A.

arXiv.org Artificial Intelligence

Finding interpretable biomechanical models can provide insight into the functionality of organs with regard to physiology and disease. However, identifying broadly applicable dynamical models for in vivo tissue remains challenging. In this proof of concept study we propose a reconstruction framework for data-driven discovery of dynamical models from experimentally obtained undersampled MRI spectral data. The method makes use of the previously developed spectro-dynamic framework which allows for reconstruction of displacement fields at high spatial and temporal resolution required for model identification. The proposed framework combines this method with data-driven discovery of interpretable models using Sparse Identification of Non-linear Dynamics (SINDy). The design of the reconstruction algorithm is such that a symbiotic relation between the reconstruction of the displacement fields and the model identification is created. Our method does not rely on periodicity of the motion. It is successfully validated using spectral data of a dynamic phantom gathered on a clinical MRI scanner. The dynamic phantom is programmed to perform motion adhering to 5 different (non-linear) ordinary differential equations. The proposed framework performed better than a 2-step approach where the displacement fields were first reconstructed from the undersampled data without any information on the model, followed by data-driven discovery of the model using the reconstructed displacement fields. This study serves as a first step in the direction of data-driven discovery of in vivo models.


Structural Constraints for Physics-augmented Learning

Kuang, Simon, Lin, Xinfan

arXiv.org Artificial Intelligence

When the physics is wrong, physics-informed machine learning becomes physics-misinformed machine learning. A powerful black-box model should not be able to conceal misconceived physics. We propose two criteria that can be used to assert integrity that a hybrid (physics plus black-box) model: 0) the black-box model should be unable to replicate the physical model, and 1) any best-fit hybrid model has the same physical parameter as a best-fit standalone physics model. We demonstrate them for a sample nonlinear mechanical system approximated by its small-signal linearization.


Peaking into the Black-box: Prediction Intervals Give Insight into Data-driven Quadrotor Model Reliability

van Beers, Jasper, de Visser, Coen

arXiv.org Artificial Intelligence

Ensuring the reliability and validity of data-driven quadrotor model predictions is essential for their accepted and practical use. This is especially true for grey- and black-box models wherein the mapping of inputs to predictions is not transparent and subsequent reliability notoriously difficult to ascertain. Nonetheless, such techniques are frequently and successfully used to identify quadrotor models. Prediction intervals (PIs) may be employed to provide insight into the consistency and accuracy of model predictions. This paper estimates such PIs for polynomial and Artificial Neural Network (ANN) quadrotor aerodynamic models. Two existing ANN PI estimation techniques - the bootstrap method and the quality driven method - are validated numerically for quadrotor aerodynamic models using an existing high-fidelity quadrotor simulation. Quadrotor aerodynamic models are then identified on real quadrotor flight data to demonstrate their utility and explore their sensitivity to model interpolation and extrapolation. It is found that the ANN-based PIs widen considerably when extrapolating and remain constant, or shrink, when interpolating. While this behaviour also occurs for the polynomial PIs, it is of lower magnitude. The estimated PIs establish probabilistic bounds within which the quadrotor model outputs will likely lie, subject to modelling and measurement uncertainties that are reflected through the PI widths.


A Novel Hierarchical-Classification-Block Based Convolutional Neural Network for Source Camera Model Identification

Zunaed, Mohammad, Fattah, Shaikh Anowarul

arXiv.org Artificial Intelligence

Digital security has been an active area of research interest due to the rapid adaptation of internet infrastructure, the increasing popularity of social media, and digital cameras. Due to inherent differences in working principles to generate an image, different camera brands left behind different intrinsic processing noises which can be used to identify the camera brand. In the last decade, many signal processing and deep learning-based methods have been proposed to identify and isolate this noise from the scene details in an image to detect the source camera brand. One prominent solution is to utilize a hierarchical classification system rather than the traditional single-classifier approach. Different individual networks are used for brand-level and model-level source camera identification. This approach allows for better scaling and requires minimal modifications for adding a new camera brand/model to the solution. However, using different full-fledged networks for both brand and model-level classification substantially increases memory consumption and training complexity. Moreover, extracted low-level features from the different network's initial layers often coincide, resulting in redundant weights. To mitigate the training and memory complexity, we propose a classifier-block-level hierarchical system instead of a network-level one for source camera model classification. Our proposed approach not only results in significantly fewer parameters but also retains the capability to add a new camera model with minimal modification. Thorough experimentation on the publicly available Dresden dataset shows that our proposed approach can achieve the same level of state-of-the-art performance but requires fewer parameters compared to a state-of-the-art network-level hierarchical-based system.


Sensing Theorems for Unsupervised Learning in Linear Inverse Problems

Tachella, Julián, Chen, Dongdong, Davies, Mike

arXiv.org Artificial Intelligence

Solving an ill-posed linear inverse problem requires knowledge about the underlying signal model. In many applications, this model is a priori unknown and has to be learned from data. However, it is impossible to learn the model using observations obtained via a single incomplete measurement operator, as there is no information about the signal model in the nullspace of the operator, resulting in a chicken-and-egg problem: to learn the model we need reconstructed signals, but to reconstruct the signals we need to know the model. Two ways to overcome this limitation are using multiple measurement operators or assuming that the signal model is invariant to a certain group action. In this paper, we present necessary and sufficient sensing conditions for learning the signal model from measurement data alone which only depend on the dimension of the model and the number of operators or properties of the group action that the model is invariant to. As our results are agnostic of the learning algorithm, they shed light into the fundamental limitations of learning from incomplete data and have implications in a wide range set of practical algorithms, such as dictionary learning, matrix completion and deep neural networks.


An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification

Bao, Runxue, Gu, Bin, Huang, Heng

arXiv.org Machine Learning

Sparsity regularized loss minimization problems play an important role in various fields including machine learning, data mining, and modern statistics. Proximal gradient descent method and coordinate descent method are the most popular approaches to solving the minimization problem. Although existing methods can achieve implicit model identification, aka support set identification, in a finite number of iterations, these methods still suffer from huge computational costs and memory burdens in high-dimensional scenarios. The reason is that the support set identification in these methods is implicit and thus cannot explicitly identify the low-complexity structure in practice, namely, they cannot discard useless coefficients of the associated features to achieve algorithmic acceleration via dimension reduction. To address this challenge, we propose a novel accelerated doubly stochastic gradient descent (ADSGD) method for sparsity regularized loss minimization problems, which can reduce the number of block iterations by eliminating inactive coefficients during the optimization process and eventually achieve faster explicit model identification and improve the algorithm efficiency. Theoretically, we first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity. More importantly, we prove that ADSGD can achieve a linear rate of explicit model identification. Numerically, experimental results on benchmark datasets confirm the efficiency of our proposed method.


Robust Federated Learning for execution time-based device model identification under label-flipping attack

Sánchez, Pedro Miguel Sánchez, Celdrán, Alberto Huertas, Rubio, José Rafael Buendía, Bovet, Gérôme, Pérez, Gregorio Martínez

arXiv.org Artificial Intelligence

The computing device deployment explosion experienced in recent years, motivated by the advances of technologies such as Internet-of-Things (IoT) and 5G, has led to a global scenario with increasing cybersecurity risks and threats. Among them, device spoofing and impersonation cyberattacks stand out due to their impact and, usually, low complexity required to be launched. To solve this issue, several solutions have emerged to identify device models and types based on the combination of behavioral fingerprinting and Machine/Deep Learning (ML/DL) techniques. However, these solutions are not appropriated for scenarios where data privacy and protection is a must, as they require data centralization for processing. In this context, newer approaches such as Federated Learning (FL) have not been fully explored yet, especially when malicious clients are present in the scenario setup. The present work analyzes and compares the device model identification performance of a centralized DL model with an FL one while using execution time-based events. For experimental purposes, a dataset containing execution-time features of 55 Raspberry Pis belonging to four different models has been collected and published. Using this dataset, the proposed solution achieved 0.9999 accuracy in both setups, centralized and federated, showing no performance decrease while preserving data privacy. Later, the impact of a label-flipping attack during the federated model training is evaluated, using several aggregation mechanisms as countermeasure. Zeno and coordinate-wise median aggregation show the best performance, although their performance greatly degrades when the percentage of fully malicious clients (all training samples poisoned) grows over 50%.


Model identification and local linear convergence of coordinate descent

Klopfenstein, Quentin, Bertrand, Quentin, Gramfort, Alexandre, Salmon, Joseph, Vaiter, Samuel

arXiv.org Machine Learning

For composite nonsmooth optimization problems, Forward-Backward algorithm achieves model identification (e.g. support identification for the Lasso) after a finite number of iterations, provided the objective function is regular enough. Results concerning coordinate descent are scarcer and model identification has only been shown for specific estimators, the support-vector machine for instance. In this work, we show that cyclic coordinate descent achieves model identification in finite time for a wide class of functions. In addition, we prove explicit local linear convergence rates for coordinate descent. Extensive experiments on various estimators and on real datasets demonstrate that these rates match well empirical results.