
Collaborating Authors

 Rusu, Cristian


Detecting and Mitigating DDoS Attacks with AI: A Survey

arXiv.org Artificial Intelligence

Distributed Denial of Service (DDoS) attacks represent an active cybersecurity research problem. Recent research has shifted from static rule-based defenses towards AI-based detection and mitigation. This comprehensive survey covers several key topics. First, state-of-the-art AI detection methods are discussed. An in-depth taxonomy based on manual expert hierarchies and an AI-generated dendrogram are provided, settling DDoS categorization ambiguities. A discussion of available datasets follows, covering data format options and their role in training AI detection methods, together with adversarial training and example augmentation. Beyond detection, AI-based mitigation techniques are surveyed as well. Finally, multiple open research directions are proposed.


Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook

arXiv.org Artificial Intelligence

With the recent advancements in generative modeling, the realism of deepfake content has been increasing at a steady pace, even reaching the point where people often fail to detect manipulated media content online and are thus deceived into various kinds of scams. In this paper, we survey deepfake generation and detection techniques, including the most recent developments in the field, such as diffusion models and Neural Radiance Fields. Our literature review covers all deepfake media types, comprising image, video, audio and multimodal (audio-visual) content. We identify various kinds of deepfakes according to the procedure used to alter or generate the fake content. We further construct a taxonomy of deepfake generation and detection methods, illustrating the important groups of methods and the domains where these methods are applied. Next, we gather datasets used for deepfake detection and provide updated rankings of the best-performing deepfake detectors on the most popular datasets. In addition, we develop a novel multimodal benchmark to evaluate deepfake detectors on out-of-distribution content. The results indicate that state-of-the-art detectors fail to generalize to deepfake content generated by unseen deepfake generators. Finally, we propose future directions for obtaining robust and powerful deepfake detectors. Our project page and new benchmark are available at https://github.com/CroitoruAlin/biodeep.


Learning Explicitly Conditioned Sparsifying Transforms

arXiv.org Artificial Intelligence

Over the last decades, sparsifying transforms have become widely known tools for finding structured sparse representations of signals in certain transform domains. Despite the popularity of classical transforms such as the DCT and wavelets, learning optimal transforms that guarantee good representations of the data in the sparse domain has recently been analyzed in a series of papers. Typically, the condition number and the representation ability are complementary key features of learning square transforms, and they may not be explicitly controlled in a given optimization model. Unlike the existing approaches from the literature, in our paper we consider a new sparsifying transform model that enforces explicit control over the data representation quality and the condition number of the learned transforms. We confirm through numerical experiments that our model presents better numerical behavior than the state of the art.
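
To make the trade-off concrete, one generic formulation of a square sparsifying transform model with explicit conditioning control is sketched below; this is only an illustration, and the exact constraints and penalties used in the paper may differ.

```latex
% A hedged sketch of a conditioned sparsifying transform model (illustrative only).
\min_{W,\,Z}\ \|W X - Z\|_F^2
\quad \text{s.t.}\quad
\|z_i\|_0 \le s \ \ \forall i,
\qquad
\kappa(W) = \frac{\sigma_{\max}(W)}{\sigma_{\min}(W)} \le \kappa_0 .
```

Here X collects the training signals as its columns, Z holds their sparse codes z_i, s is the target sparsity level, and kappa_0 is the prescribed bound on the condition number of the learned transform W; all of these symbols are introduced here for illustration.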


Kernel t-distributed stochastic neighbor embedding

arXiv.org Artificial Intelligence

This paper presents a kernelized version of the t-SNE algorithm, capable of mapping high-dimensional data to a low-dimensional space while preserving the pairwise distances between the data points in a non-Euclidean metric. This can be achieved by using a kernel trick either only in the high-dimensional space or in both spaces, leading to an end-to-end kernelized version. The proposed kernelized version of the t-SNE algorithm can offer new views on the relationships between data points, which can improve performance and accuracy in particular applications, such as classification problems involving kernel methods. The differences between t-SNE and its kernelized version are illustrated for several datasets, showing a neater clustering of points belonging to different classes.
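
As a concrete illustration of the first variant (kernel trick applied only in the high-dimensional space), the sketch below replaces the Euclidean distances used by standard t-SNE with distances induced by a kernel; the kernel choice, dataset, and pipeline are assumptions made for this example and not necessarily those of the paper.

```python
# A minimal sketch: feed kernel-induced distances to standard t-SNE.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.metrics.pairwise import rbf_kernel

X, y = load_digits(return_X_y=True)

# Kernel matrix K[i, j] = k(x_i, x_j); an RBF kernel is used here as an example.
K = rbf_kernel(X, gamma=1e-3)

# Squared distances in the feature space induced by the kernel:
# d^2(x_i, x_j) = k(x_i, x_i) + k(x_j, x_j) - 2 k(x_i, x_j).
diag = np.diag(K)
D2 = np.clip(diag[:, None] + diag[None, :] - 2.0 * K, 0.0, None)
D = np.sqrt(D2)

# Standard t-SNE operating on the precomputed, kernel-induced distances.
emb = TSNE(metric="precomputed", init="random", random_state=0).fit_transform(D)
print(emb.shape)  # (n_samples, 2)
```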


Efficient and Parallel Separable Dictionary Learning

arXiv.org Machine Learning

Separable, or Kronecker product, dictionaries provide natural decompositions for 2D signals, such as images. In this paper, we describe an algorithm to learn such dictionaries which is highly parallelizable and which reaches sparse representations competitive with previous state-of-the-art dictionary learning algorithms from the literature. We highlight the performance of the proposed method in sparsely representing image data and in image denoising applications.
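
The computational appeal of the separable structure comes from a standard Kronecker identity, illustrated in the sketch below; this is background on the structure the method exploits, not the paper's learning algorithm, and the matrix sizes are chosen arbitrarily.

```python
# A minimal sketch: applying D = kron(A, B) to a vectorized 2D patch reduces to
# two small matrix products, since (A kron B) vec(Z) = vec(B Z A^T) for
# column-major vectorization.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((8, 8))   # left factor (acts on the rows of the patch)
A = rng.standard_normal((8, 8))   # right factor (acts on the columns of the patch)
Z = rng.standard_normal((8, 8))   # 2D coefficient array (dense here, just for the check)

# Full separable dictionary acting on the column-major vectorization of Z.
D = np.kron(A, B)
y_full = D @ Z.flatten(order="F")

# Equivalent separable computation: two small products instead of one large one.
y_sep = (B @ Z @ A.T).flatten(order="F")

print(np.allclose(y_full, y_sep))  # True
```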


Fast approximation of orthogonal matrices and application to PCA

arXiv.org Machine Learning

We study the problem of approximating orthogonal matrices so that their application is numerically fast and yet accurate. We find an approximation by solving an optimization problem over a set of structured matrices that we call Givens transformations, which include Givens rotations as a special case. We propose an efficient greedy algorithm to solve this problem and show that it strikes a balance between approximation accuracy and speed of computation. The proposed approach is relevant in spectral methods, and we illustrate its application to PCA.
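
The speed-up comes from the fact that each structured factor touches only a pair of coordinates. The sketch below applies a short sequence of ordinary Givens rotations (the special case mentioned above) to a vector at O(1) cost per rotation, rather than the O(d^2) cost of a dense orthogonal matrix; it illustrates the structure only, not the paper's greedy approximation algorithm.

```python
# A minimal sketch: applying g Givens rotations costs O(g), not O(d^2).
import numpy as np

def apply_givens_sequence(x, rotations):
    """Apply rotations [(i, j, theta), ...] in order to a copy of x."""
    y = x.copy()
    for i, j, theta in rotations:
        c, s = np.cos(theta), np.sin(theta)
        yi, yj = y[i], y[j]
        y[i] = c * yi - s * yj   # each rotation modifies only two coordinates
        y[j] = s * yi + c * yj
    return y

d = 512
rng = np.random.default_rng(0)
x = rng.standard_normal(d)
rots = [(rng.integers(d), rng.integers(d), rng.uniform(0, 2 * np.pi)) for _ in range(64)]
rots = [(i, j, t) for i, j, t in rots if i != j]   # drop degenerate index pairs

y = apply_givens_sequence(x, rots)
print(np.allclose(np.linalg.norm(y), np.linalg.norm(x)))  # rotations preserve the norm
```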


Learning Multiplication-free Linear Transformations

arXiv.org Machine Learning

In this paper, we propose several dictionary learning algorithms for sparse representations that also impose specific structures on the learned dictionaries so that they are numerically efficient to use: a reduced number of additions/multiplications, or even no multiplications at all. We base our work on factorizations of the dictionary into highly structured basic building blocks (binary orthonormal, scaling and shear transformations) for which we can write closed-form solutions to the optimization problems that we consider. We show the effectiveness of our methods on image data, where we can compare against well-known numerically efficient transforms such as the fast Fourier and the fast discrete cosine transforms.

I. INTRODUCTION

In many situations, the success of theoretical concepts in signal processing applications depends on the existence of an accompanying algorithmic implementation that is numerically efficient, e.g., Fourier analysis and the fast Fourier transform (FFT), or wavelet theory and the fast wavelet transform (FWT). Unfortunately, in a machine learning scenario where linear transformations are learned, they do not in general exhibit advantageous numerical properties like the examples just mentioned, unless we explicitly search for such solutions.
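
To ground the idea of multiplication-free building blocks, the sketch below applies a unit shear using a single addition; the parametrization is an assumed illustration, not necessarily the exact set of building blocks used in the paper.

```python
# A minimal illustrative sketch: a unit shear S = I + sign * e_i e_j^T can be
# applied with one addition (or subtraction) and no general multiplications.
import numpy as np

def apply_unit_shear(x, i, j, sign=+1):
    """Return S x for S = I + sign * e_i e_j^T, with sign in {+1, -1}."""
    y = x.copy()
    y[i] = y[i] + x[j] if sign > 0 else y[i] - x[j]   # pure add/subtract
    return y

d = 6
rng = np.random.default_rng(0)
x = rng.standard_normal(d)

i, j, sign = 2, 4, -1
S = np.eye(d)
S[i, j] = sign

print(np.allclose(S @ x, apply_unit_shear(x, i, j, sign)))  # True
# Cascading such shears with sign flips and dyadic scalings (bit shifts) yields
# transforms that avoid general multiplications altogether.
```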


On learning with shift-invariant structures

arXiv.org Machine Learning

We describe new results and algorithms for two different, but related, problems that deal with circulant matrices: learning shift-invariant components from training data and calculating the shift (or alignment) between two given signals. In the first instance, we deal with the shift-invariant dictionary learning problem, while the latter bears the name of (compressive) shift retrieval. We formulate these problems using circulant and convolutional matrices (including unions of such matrices), define optimization problems that describe our goals, and propose efficient ways to solve them. Based on these findings, we also show how to learn a wavelet-like dictionary from training data. We connect our work with various previous results from the literature and show the effectiveness of our proposed algorithms using synthetic signals, ECG signals, and images.
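
Much of the efficiency of circulant formulations rests on the classical convolution theorem, sketched below; this is background on the structure being exploited, not the paper's learning or shift-retrieval algorithms.

```python
# A minimal sketch: multiplying by an n x n circulant matrix is a circular
# convolution, which the FFT computes in O(n log n) instead of O(n^2).
import numpy as np
from scipy.linalg import circulant

n = 256
rng = np.random.default_rng(0)
c = rng.standard_normal(n)    # first column: the shift-invariant component
x = rng.standard_normal(n)    # input signal

# Dense O(n^2) product versus FFT-based O(n log n) product.
y_dense = circulant(c) @ x
y_fft = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

print(np.allclose(y_dense, y_fft))  # True
```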


Approximate Eigenvalue Decompositions of Linear Transformations with a Few Householder Reflectors

arXiv.org Machine Learning

The ability to decompose a signal in an orthonormal basis (a set of orthogonal components, each normalized to have unit length) using a fast numerical procedure lies at the heart of many signal processing methods and applications. The classic examples are the Fourier and wavelet transforms, which enjoy numerically efficient implementations (FFT and FWT, respectively). Unfortunately, orthonormal transformations are in general unstructured, and therefore they do not enjoy low computational complexity properties. In this paper, based on Householder reflectors, we introduce a class of orthonormal matrices that are numerically efficient to manipulate: we control the complexity of matrix-vector multiplications with these matrices using a given parameter. We provide numerical algorithms that approximate any orthonormal or symmetric transform with a new orthonormal or symmetric structure made up of products of a given number of Householder reflectors. We show analyses and numerical evidence to highlight the accuracy of the proposed approximations and provide an application to the case of learning fast Mahalanobis distance metric transformations.
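
The complexity control comes from how cheaply a short product of reflectors can be applied. The sketch below illustrates that a product of h Householder reflectors acts on a vector in O(h d) operations, versus O(d^2) for a dense orthonormal matrix; it illustrates the structure only, not the paper's approximation algorithms, and the reflector vectors here are random.

```python
# A minimal sketch: applying h Householder reflectors costs O(h * d).
import numpy as np

def apply_reflectors(x, V):
    """Apply the reflectors H_k = I - 2 v_k v_k^T (unit-norm columns of V) in sequence."""
    y = x.copy()
    for k in range(V.shape[1]):
        v = V[:, k]
        y = y - 2.0 * v * (v @ y)     # each reflector costs O(d)
    return y

d, h = 128, 8
rng = np.random.default_rng(0)
V = rng.standard_normal((d, h))
V /= np.linalg.norm(V, axis=0)        # normalize each reflector vector

x = rng.standard_normal(d)
y = apply_reflectors(x, V)

# Check against the explicit dense product of the same reflectors.
U = np.eye(d)
for k in range(h):
    v = V[:, [k]]
    U = (np.eye(d) - 2.0 * v @ v.T) @ U
print(np.allclose(U @ x, y), np.allclose(np.linalg.norm(y), np.linalg.norm(x)))
```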


Fast Orthonormal Sparsifying Transforms Based on Householder Reflectors

arXiv.org Machine Learning

Dictionary learning is the task of determining a data-dependent transform that yields a sparse representation of some observed data. The dictionary learning problem is non-convex and usually solved via computationally complex iterative algorithms. Furthermore, the resulting transforms generally lack structure that would permit their fast application to data. To address this issue, this paper develops a framework for learning orthonormal dictionaries which are built from products of a few Householder reflectors. Two algorithms are proposed to learn the reflector coefficients: one that considers a sequential update of the reflectors and one with a simultaneous update of all reflectors that imposes an additional internal orthogonal constraint. The proposed methods have low computational complexity and are shown to converge to local minimum points that can be described in terms of the spectral properties of the matrices involved. The resulting dictionaries balance computational complexity and the quality of the sparse representations by controlling the number of Householder reflectors in their product. Simulations of the proposed algorithms are shown in the image processing setting, where well-known fast transforms are available for comparison. The proposed algorithms have favorable reconstruction error and the advantage of a fast implementation relative to classical, unstructured dictionaries.
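
A useful background fact for orthonormal dictionary learning (an illustration, not the paper's two learning algorithms) is that sparse coding with an orthonormal dictionary is exact and cheap: it amounts to hard-thresholding the analysis coefficients, as sketched below.

```python
# A minimal sketch: with an orthonormal dictionary Q, the best s-sparse code of
# a signal y keeps the s largest-magnitude entries of Q.T @ y.
import numpy as np

def sparse_code_orthonormal(Q, y, s):
    """Best s-sparse code of y in the orthonormal dictionary Q (columns are atoms)."""
    z = Q.T @ y                            # analysis coefficients
    keep = np.argsort(np.abs(z))[-s:]      # indices of the s largest magnitudes
    x = np.zeros_like(z)
    x[keep] = z[keep]
    return x

d, s = 64, 8
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # a random orthonormal dictionary
y = rng.standard_normal(d)

x = sparse_code_orthonormal(Q, y, s)
print(np.count_nonzero(x), np.linalg.norm(y - Q @ x))  # sparsity and residual norm
```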