AITopics | dual certificate

dual certificate

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Gaussian Mixture Model with unknown diagonal covariances via continuous sparse regularization

Giard, Romane, de Castro, Yohann, Marteau, Clément

arXiv.org Machine Learning, Oct-1-2025

This paper addresses the statistical estimation of Gaussian Mixture Models (GMMs) with unknown diagonal covariances from independent and identically distributed samples. We employ the Beurling-LASSO (BLASSO), a convex optimization framework that promotes sparsity in the space of measures, to simultaneously estimate the number of components and their parameters. Our main contribution extends the BLASSO methodology to multivariate GMMs with component-specific unknown diagonal covariance matrices, a significantly more flexible setting than previous approaches requiring known and identical covariances. We establish non-asymptotic recovery guarantees with nearly parametric convergence rates for component means, diagonal covariances, and weights, as well as for density prediction. A key theoretical contribution is the identification of an explicit separation condition on mixture components that enables the construction of non-degenerate dual certificates, which are essential tools for establishing statistical guarantees for the BLASSO. Our analysis leverages the Fisher-Rao geometry of the statistical model and introduces a novel semi-distance adapted to our framework, providing new insights into the interplay between component separation, parameter space geometry, and achievable statistical recovery.

certificate, ndsc, proof, (17 more...)

arXiv.org Machine Learning

2509.12889

Country:

Europe > France (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Efficient Online Large-Margin Classification via Dual Certificates

Ho-Nguyen, Nam, Kılınç-Karzan, Fatma, Nguyen, Ellie, Shen, Lingqing

arXiv.org Artificial Intelligence, Sep-25-2025

Online classification is a central problem in optimization, statistical learning and data science. Classical algorithms such as the perceptron offer efficient updates and finite mistake guarantees on linearly separable data, but they do not exploit the underlying geometric structure of the classification problem. We study the offline maximum margin problem through its dual formulation and use the resulting geometric insights to design a principled and efficient algorithm for the online setting. A key feature of our method is its translation invariance, inherited from the offline formulation, which plays a central role in its performance analysis. Our theoretical analysis yields improved mistake and margin bounds that depend only on translation-invariant quantities, offering stronger guarantees than existing algorithms under the same assumptions in favorable settings. In particular, we identify a parameter regime where our algorithm makes at most two mistakes per sequence, whereas the perceptron can be forced to make arbitrarily many mistakes. Our numerical study on real data further demonstrates that our method matches the computational efficiency of existing online algorithms, while significantly outperforming them in accuracy.
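For readers unfamiliar with the baseline mentioned in this abstract, the following is a minimal sketch of the classical perceptron, not the authors' dual-certificate algorithm. The function name `perceptron`, the `max_epochs` cap, and the toy data are assumptions made purely for illustration.

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Classical perceptron with a bias term; labels y must be in {-1, +1}.

    Returns the weight vector, the bias, and the total number of mistakes.
    Illustrative sketch only: names and defaults are hypothetical.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    mistakes = 0
    for _ in range(max_epochs):
        errors_this_pass = 0
        for xi, yi in zip(X, y):
            # A point is misclassified (or on the boundary) when its signed margin is <= 0.
            if yi * (xi @ w + b) <= 0:
                w += yi * xi   # move the hyperplane toward the misclassified point
                b += yi
                mistakes += 1
                errors_this_pass += 1
        if errors_this_pass == 0:
            break  # a full pass without mistakes: the data are separated
    return w, b, mistakes

# Toy linearly separable data (hypothetical).
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b, mistakes = perceptron(X, y)
```

Each update costs time linear in the dimension, which is the "efficient update" the abstract refers to; on linearly separable data the loop stops after finitely many mistakes.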

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2509.19670

Country: North America > United States (0.93)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (0.88)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.90)

Effective regions and kernels in continuous sparse regularisation, with application to sketched mixtures

De Castro, Yohann, Gribonval, Rémi, Jouvin, Nicolas

arXiv.org Machine Learning, Jul-18-2025

The Beurling-LASSO (BLASSO), a TV-regularized convex program on the space of measures, allows one to recover a sparse measure from a noisy observation obtained through an appropriate measurement operator. While previous works have uncovered the central role played by this operator and its associated kernel in obtaining estimation error bounds, these bounds require a technical local positive curvature (LPC) assumption to be verified on a case-by-case basis. In practice, this condition has been proved for only a few kernels, called LPC-kernels. At the heart of our contribution lies the kernel switch, which uncouples the model kernel from the LPC assumption: it makes it possible to leverage any known LPC-kernel as a pivot kernel to prove error bounds, provided embedding conditions are verified between the model and pivot RKHS. We extend the list of LPC-kernels, proving that the "sinc-4" kernel, used for signal recovery and mixture problems, does satisfy the LPC assumption. Furthermore, we show that the BLASSO localisation error around the true support decreases with the noise level, leading to effective near regions. This improves on known results where this error is fixed by parameters depending on the model kernel. We illustrate the interest of our results in the case of translation-invariant mixture model estimation, using bandlimiting smoothing and sketching techniques to reduce the computational burden of BLASSO.

artificial intelligence, kernel, machine learning, (18 more...)

arXiv.org Machine Learning

2507.08444

Country:

Europe > France (0.04)
Asia > Middle East > Jordan (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Data Science (0.66)

How Does Gradient Descent Learn Features -- A Local Analysis for Regularized Two-Layer Neural Networks

Zhou, Mo, Ge, Rong

arXiv.org Artificial Intelligence, Jun-3-2024

Feature learning has long been considered to be a major advantage of neural networks. However, how gradient-based training algorithms can learn useful features is not well understood. In particular, the most widely applied analysis for overparametrized neural networks is the neural tangent kernel (NTK) (Jacot et al., 2018; Du et al., 2019; Allen-Zhu et al., 2019b). In this setting, the neurons don't move far from their initialization and the features are determined by the network architecture and random initialization (Chizat et al., 2019). While there is empirical and theoretical evidence on the limitations of the NTK regime (Chizat et al., 2019; Arora et al., 2019), extending the analysis beyond the NTK regime has been challenging. For 2-layer networks, an alternative framework for analyzing overparametrized neural networks, called mean-field analysis, was introduced. Earlier mean-field analyses (e.g., Chizat and Bach, 2018; Mei et al., 2018) require either infinitely or exponentially many neurons. Later works (e.g., Li et al., 2020; Ge et al., 2021; Bietti et al., 2022; Mahankali et al., 2024) can analyze the training dynamics of mildly overparametrized networks with polynomially many neurons, under stronger assumptions on the ground-truth function.

gradient descent, neural network, neuron, (13 more...)

arXiv.org Artificial Intelligence

2406.01766

Country: Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.64)

How robust is randomized blind deconvolution via nuclear norm minimization against adversarial noise?

Kostin, Julia, Krahmer, Felix, Stöger, Dominik

arXiv.org Artificial Intelligence, Mar-17-2023

In this paper, we study the problem of recovering two unknown signals from their convolution, which is commonly referred to as blind deconvolution. Reformulation of blind deconvolution as a low-rank recovery problem has led to multiple theoretical recovery guarantees in the past decade due to the success of the nuclear norm minimization heuristic. In particular, in the absence of noise, exact recovery has been established for sufficiently incoherent signals contained in lower-dimensional subspaces. However, if the convolution is corrupted by additive bounded noise, the stability of the recovery problem remains much less understood. In particular, existing reconstruction bounds involve large dimension factors and therefore fail to explain the empirical evidence for dimension-independent robustness of nuclear norm minimization. Recently, theoretical evidence has emerged for ill-posed behavior of low-rank matrix recovery for sufficiently small noise levels. In this work, we develop improved recovery guarantees for blind deconvolution with adversarial noise which exhibit square-root scaling in the noise level. Hence, our results are consistent with existing counterexamples which speak against linear scaling in the noise level as demonstrated for related low-rank matrix recovery problems.

artificial intelligence, blind deconvolution, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2303.10030

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Exact nuclear norm, completion and decomposition for random overcomplete tensors via degree-4 SOS

Kivva, Bohdan, Potechin, Aaron

arXiv.org Machine Learning, Nov-18-2020

In this paper we show that simple semidefinite programs inspired by degree $4$ SOS can exactly solve the tensor nuclear norm, tensor decomposition, and tensor completion problems on tensors with random asymmetric components. More precisely, for tensor nuclear norm and tensor decomposition, we show that w.h.p. these semidefinite programs can exactly find the nuclear norm and components of an $(n\times n\times n)$-tensor $\mathcal{T}$ with $m\leq n^{3/2}/\mathrm{polylog}(n)$ random asymmetric components. For tensor completion, we show that w.h.p. the semidefinite program introduced by Potechin & Steurer (2017) can exactly recover an $(n\times n\times n)$-tensor $\mathcal{T}$ with $m$ random asymmetric components from only $n^{3/2}m\,\mathrm{polylog}(n)$ randomly observed entries. This gives the first theoretical guarantees for exact tensor completion in the overcomplete regime. This matches the best known results for approximate versions of these problems given by Barak & Moitra (2015) for tensor completion, and Ma, Shi & Steurer (2016) for tensor decomposition.

diagram, matrix, matrix diagram, (15 more...)

arXiv.org Machine Learning

2011.09416

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
Asia > China > Jiangsu Province > Yancheng (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre:

Research Report (0.63)
Workflow (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.45)

Sparse Regularization for Mixture Problems

de Castro, Yohann, Gadat, Sébastien, Marteau, Clément, Maugis-Rabusseau, Cathy

arXiv.org Machine Learning, Jul-23-2019

This paper investigates the statistical estimation of a discrete mixing measure $\mu^0$ involved in a kernel mixture model. Using some recent advances in $\ell_1$-regularization over the space of measures, we introduce a "data fitting + regularization" convex program for estimating $\mu^0$ in a grid-less manner; this method is referred to as Beurling-LASSO. Our contribution is two-fold: we derive a lower bound on the bandwidth of our data fitting term, depending only on the support of $\mu^0$ and its so-called "minimum separation", to ensure quantitative support localization error bounds; and under a so-called "non-degenerate source condition" we derive a non-asymptotic support stability property. The latter shows that for sufficiently large sample size $n$, our estimator has exactly as many weighted Dirac masses as the target $\mu^0$, converging in amplitude and localization towards the true ones. The statistical performance of this estimator is investigated by designing a so-called "dual certificate" appropriate to our setting. Some classical situations, such as Gaussian or ordinary smooth mixtures (e.g., Laplace distributions), are discussed at the end of the paper. We stress in particular that our method is completely adaptive w.r.t. the number of components involved in the mixture.

artificial intelligence, deduce, machine learning, (19 more...)

arXiv.org Machine Learning

1907.10592

Country: Europe > France (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

A Dictionary Based Generalization of Robust PCA

Rambhatla, Sirisha, Li, Xingguo, Haupt, Jarvis

arXiv.org Machine Learning, Feb-21-2019

ABSTRACT We analyze the decomposition of a data matrix, assumed to be a superposition of a low-rank component and a component which is sparse in a known dictionary, using a convex demixing method. We provide a unified analysis, encompassing both undercomplete and overcomplete dictionary cases, and show that the constituent components can be successfully recovered under some relatively mild assumptions up to a certain global sparsity level. Further, we corroborate our theoretical results by presenting empirical evaluations in terms of phase transitions in rank and sparsity for various dictionary sizes. Index Terms: Low-rank, dictionary sparse, Robust PCA. 1. INTRODUCTION Exploiting the inherent structure of data for the recovery of relevant information is at the heart of data analysis. R. A wide range of problems can be expressed in the form described above. Perhaps the most celebrated of these is principal component analysis (PCA) [1], which can be viewed as a special case of eq. (1), with the matrix X, the problem reduces to that of sparse recovery [2-4]; see [5] and references therein for an overview of related works.

matrix, recovery, sparsity, (13 more...)

arXiv.org Machine Learning

doi: 10.1109/GlobalSIP.2016.7906054

1902.08171

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence (0.94)
Information Technology > Data Science > Data Mining (0.34)
