AITopics | regularization problem

Reduced-Space Iteratively Reweighted Second-Order Methods for Nonconvex Sparse Regularization

Wang, Hao, Yang, Xiangyu, Zhu, Yichen

arXiv.org Artificial Intelligence, Aug-17-2024

This paper explores a class of nonconvex sparsity-promoting regularization problems, namely those involving $\ell_p$-norm regularization in conjunction with a twice continuously differentiable loss function. We propose a novel second-order algorithm designed to effectively address this class of challenging nonconvex and nonsmooth problems, showcasing several innovative features: (i) The use of an alternating strategy to solve a reweighted $\ell_1$ regularized subproblem and the subspace approximate Newton step. (ii) The reweighted $\ell_1$ regularized subproblem relies on a convex approximation to the nonconvex regularization term, enabling a closed-form solution characterized by the soft-thresholding operator. This feature allows our method to be applied to various nonconvex regularization problems. (iii) Our algorithm ensures that the iterates maintain their sign values and that nonzero components are kept away from 0 for a sufficient number of iterations, eventually transitioning to a perturbed Newton method. (iv) We provide theoretical guarantees of global convergence, local superlinear convergence in the presence of the Kurdyka-Łojasiewicz (KL) property, and local quadratic convergence when employing the exact Newton step in our algorithm. We also showcase the effectiveness of our approach through experiments on a diverse set of model prediction problems.
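The reweighted $\ell_1$ idea in (ii) can be illustrated with a minimal sketch. It is not the paper's second-order algorithm: it assumes the simplest possible loss, $\frac{1}{2}\|x - b\|^2$, and an $\epsilon$-smoothed $\ell_p$ term, so each reweighted subproblem is solved exactly by soft-thresholding. The function names and the parameters `lam`, `p`, `eps`, and `iters` are hypothetical choices made for this illustration.

```python
def soft_threshold(v, t):
    """Scalar soft-thresholding: sign(v) * max(|v| - t, 0)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def reweighted_l1_denoise(b, lam=0.5, p=0.5, eps=1e-2, iters=50):
    """Approximately minimize 0.5*||x - b||^2 + lam * sum((|x_i| + eps)^p).

    Illustrative only: each pass replaces the nonconvex term by a weighted
    l1 term (its linearization in |x_i| at the current iterate) and solves
    the resulting convex subproblem in closed form.
    """
    x = list(b)
    for _ in range(iters):
        # Weights are the derivative of (|x_i| + eps)^p with respect to |x_i|.
        weights = [p * (abs(xi) + eps) ** (p - 1.0) for xi in x]
        # Weighted l1 subproblem with this quadratic loss: soft-threshold b.
        x = [soft_threshold(bi, lam * wi) for bi, wi in zip(b, weights)]
    return x


x_hat = reweighted_l1_denoise([3.0, 0.1, -2.0, 0.05])
```

In this sketch small entries of `b` receive large weights and are set exactly to zero, while large entries are shrunk only slightly, which is the sparsity-promoting behavior that motivates $\ell_p$ regularization with $p < 1$.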

convergence, soirl 1, subproblem, (16 more...)

arXiv.org Artificial Intelligence

2407.17216

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

The representer theorem for Hilbert spaces: a necessary and sufficient condition

Neural Information Processing Systems, Mar-14-2024, 20:33:51 GMT

The representer theorem is a property that lies at the foundation of regularization theory and kernel methods. A class of regularization functionals is said to admit a linear representer theorem if every member of the class admits minimizers that lie in the finite dimensional subspace spanned by the representers of the data. A recent characterization states that certain classes of regularization functionals with differentiable regularization term admit a linear representer theorem for any choice of the data if and only if the regularization term is a radial nondecreasing function. In this paper, we extend this result by weakening the assumptions on the regularization term. In particular, the main result of this paper implies that, for a sufficiently large family of regularization functionals, radial nondecreasing functions are the only lower semicontinuous regularization terms that guarantee the existence of a representer theorem for any choice of the data.
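The statement can be written out as follows. The notation ($\mathcal{H}$, $w_i$, $f$, $\Omega$, $h$, $m$) is assumed here for illustration and is not taken from the listing.

```latex
% Regularization functional on a Hilbert space H; w_1, ..., w_m are the
% representers of the data (notation assumed for this sketch).
\[
  J(w) = f\bigl(\langle w, w_1\rangle, \dots, \langle w, w_m\rangle\bigr) + \Omega(w),
  \qquad w \in \mathcal{H}.
\]
% Linear representer theorem: some minimizer lies in the span of the representers.
\[
  \hat{w} = \sum_{i=1}^{m} c_i\, w_i, \qquad c_i \in \mathbb{R}.
\]
% Radial nondecreasing regularization term.
\[
  \Omega(w) = h(\|w\|), \qquad h : [0,\infty) \to \mathbb{R} \cup \{+\infty\}
  \ \text{nondecreasing}.
\]
```

For example, $\Omega(w) = \lambda \|w\|^2$ with $\lambda > 0$ is radial and nondecreasing, which recovers the familiar kernel ridge regression setting when $f$ is a squared error.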

linear representer theorem, representer theorem, theorem, (16 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

A New Convex Relaxation for Tensor Completion

Neural Information Processing Systems, Mar-13-2024, 20:02:20 GMT

We study the problem of learning a tensor from a set of linear measurements. A prominent methodology for this problem is based on a generalization of trace norm regularization, which has been used extensively for learning low rank matrices, to the tensor setting. In this paper, we highlight some limitations of this approach and propose an alternative convex relaxation on the Euclidean ball. We then describe a technique to solve the associated regularization problem, which builds upon the alternating direction method of multipliers. Experiments on one synthetic dataset and two real datasets indicate that the proposed method improves significantly over tensor trace norm regularization in terms of estimation error, while remaining computationally tractable.
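For orientation, one commonly used generalization of trace norm regularization to tensors averages the trace norms of the mode-$n$ matricizations. The listing does not define the norm, so the form below and all of its symbols ($\mathcal{W}$, $W_{(n)}$, $\mathcal{I}$, $y$, $\gamma$) are assumptions made for this sketch.

```latex
% W is an N-way tensor and W_{(n)} its mode-n matricization (unfolding).
\[
  \|\mathcal{W}\|_{\mathrm{tr}} = \frac{1}{N} \sum_{n=1}^{N} \bigl\| W_{(n)} \bigr\|_{\mathrm{tr}},
  \qquad
  \|M\|_{\mathrm{tr}} = \sum_{j} \sigma_j(M).
\]
% Associated regularization problem for linear measurements y of W.
\[
  \min_{\mathcal{W}} \ \tfrac{1}{2}\, \bigl\| \mathcal{I}(\mathcal{W}) - y \bigr\|_2^2
  + \gamma\, \|\mathcal{W}\|_{\mathrm{tr}}, \qquad \gamma > 0.
\]
```

Here $\sigma_j(M)$ denotes the singular values of the matrix $M$ and $\mathcal{I}$ is the linear measurement operator.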

regularizer, tensor, trace norm, (12 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hypothesis Spaces for Deep Learning

Wang, Rui, Xu, Yuesheng, Yan, Mingsong

arXiv.org Machine Learning, Mar-11-2024

Deep learning has been a huge success in applications. Mathematically, its success is due to the use of deep neural networks (DNNs), neural networks of multiple layers, to describe decision functions. Various mathematical aspects of DNNs as an approximation tool were investigated recently in a number of studies [9, 11, 13, 16, 20, 27, 28, 31]. As pointed out in [8], learning processes do not take place in a vacuum. Classical learning methods took place in a reproducing kernel Hilbert space (RKHS) [1], which leads to representation of learning solutions in terms of a combination of a finite number of kernel sessions [19] of a universal kernel [17]. Reproducing kernel Hilbert spaces as appropriate hypothesis spaces for classical learning methods provide a foundation for mathematical analysis of the learning methods. A natural and imperative question is what are appropriate hypothesis spaces for deep learning. Although hypothesis spaces for learning with shallow neural networks (networks of one hidden layer) were investigated recently in a number of studies, (e.g.

banach space, mni problem, representer theorem, (15 more...)

arXiv.org Machine Learning

2403.03353

Country:

Asia > China (0.04)
North America > United States > Virginia > Norfolk City County > Norfolk (0.04)
North America > United States > New York > Onondaga County > Syracuse (0.04)

Genre: Research Report (0.63)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sparse Representer Theorems for Learning in Reproducing Kernel Banach Spaces

Wang, Rui, Xu, Yuesheng, Yan, Mingsong

arXiv.org Artificial Intelligence, May-21-2023

Sparsity of a learning solution is a desirable feature in machine learning. Certain reproducing kernel Banach spaces (RKBSs) are appropriate hypothesis spaces for sparse learning methods. The goal of this paper is to understand what kind of RKBSs can promote sparsity for learning solutions. We consider two typical learning models in an RKBS: the minimum norm interpolation (MNI) problem and the regularization problem. We first establish an explicit representer theorem for solutions of these problems, which represents the extreme points of the solution set by a linear combination of the extreme points of the subdifferential set, of the norm function, which is data-dependent. We then propose sufficient conditions on the RKBS that can transform the explicit representation of the solutions to a sparse kernel representation having fewer terms than the number of the observed data. Under the proposed sufficient conditions, we investigate the role of the regularization parameter on sparsity of the regularized solutions. We further show that two specific RKBSs: the sequence space $\ell_1(\mathbb{N})$ and the measure space can have sparse representer theorems for both MNI and regularization models.
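The two learning models named above can be sketched in generic form. The notation ($\mathcal{B}$ for the RKBS, $\mathcal{L}$ for the sampling operator, $y$ for the observed data, $Q_y$ for the loss, $\lambda$ for the regularization parameter) is assumed here for illustration and may differ from the paper's.

```latex
% Minimum norm interpolation (MNI): the smallest-norm element matching the data.
\[
  \min \bigl\{ \|f\|_{\mathcal{B}} \;:\; f \in \mathcal{B},\ \mathcal{L}(f) = y \bigr\}
\]
% Regularization problem: trade data fidelity against the norm.
\[
  \min_{f \in \mathcal{B}} \ Q_y\bigl(\mathcal{L}(f)\bigr) + \lambda\, \|f\|_{\mathcal{B}},
  \qquad \lambda > 0.
\]
```

In this form, a larger $\lambda$ puts more weight on the norm term, which is why the regularization parameter can influence how sparse the regularized solution is.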

artificial intelligence, machine learning, mni problem, (18 more...)

arXiv.org Artificial Intelligence

2305.12584

Country:

Asia > China (0.04)
North America > United States > Virginia > Norfolk City County > Norfolk (0.04)
North America > United States > New York > Onondaga County > Syracuse (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Education (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

On Generalization and Regularization via Wasserstein Distributionally Robust Optimization

Wu, Qinyu, Li, Jonathan Yu-Meng, Mao, Tiantian

arXiv.org Artificial Intelligence, Dec-12-2022

Wasserstein distributionally robust optimization (DRO) has found success in operations research and machine learning applications as a powerful means to obtain solutions with favourable out-of-sample performances. Two compelling explanations for the success are the generalization bounds derived from Wasserstein DRO and the equivalency between Wasserstein DRO and the regularization scheme commonly applied in machine learning. Existing results on generalization bounds and the equivalency to regularization are largely limited to the setting where the Wasserstein ball is of a certain type and the decision criterion takes certain forms of an expected function. In this paper, we show that by focusing on Wasserstein DRO problems with affine decision rules, it is possible to obtain generalization bounds and the equivalency to regularization in a significantly broader setting where the Wasserstein ball can be of a general type and the decision criterion can be a general measure of risk, i.e., nonlinear in distributions. This allows for accommodating many important classification, regression, and risk minimization applications that have not been addressed to date using Wasserstein DRO. Our results are strong in that the generalization bounds do not suffer from the curse of dimensionality and the equivalency to regularization is exact. As a byproduct, our regularization results broaden considerably the class of Wasserstein DRO models that can be solved efficiently via regularization formulations.
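A well-known special case from the earlier literature shows what "equivalency to regularization" means. It is not the paper's general result and is stated here under assumptions: a type-1 Wasserstein ball of radius $\varepsilon$ around the empirical distribution $\hat{P}_n$, a transport cost given by a norm $\|\cdot\|$ on an unbounded support $\mathbb{R}^d$, a linear decision rule $\beta^\top \xi$, and a convex Lipschitz loss $\ell$. All symbols are introduced for this sketch.

```latex
% Worst-case expected loss equals empirical loss plus a dual-norm penalty.
\[
  \sup_{Q \,:\, W_1(Q, \hat{P}_n) \le \varepsilon}
  \mathbb{E}_{Q}\bigl[\ell(\beta^\top \xi)\bigr]
  \;=\;
  \mathbb{E}_{\hat{P}_n}\bigl[\ell(\beta^\top \xi)\bigr]
  + \varepsilon\, \mathrm{Lip}(\ell)\, \|\beta\|_{*}
\]
```

Here $\mathrm{Lip}(\ell)$ is the Lipschitz constant of $\ell$ and $\|\cdot\|_{*}$ is the dual norm, so the radius $\varepsilon$ plays the role of a regularization parameter.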

artificial intelligence, machine learning, wasserstein ball, (13 more...)

arXiv.org Artificial Intelligence

2212.05716

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.14)
Asia > China (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.30)

The Geometry of Adversarial Training in Binary Classification

Bungert, Leon, Trillos, Nicolás García, Murray, Ryan

arXiv.org Machine Learning, Nov-26-2021

We establish an equivalence between a family of adversarial training problems for non-parametric binary classification and a family of regularized risk minimization problems where the regularizer is a nonlocal perimeter functional. The resulting regularized risk minimization problems admit exact convex relaxations of the type $L^1+$ (nonlocal) $\operatorname{TV}$, a form frequently studied in image analysis and graph-based learning. A rich geometric structure is revealed by this reformulation which in turn allows us to establish a series of properties of optimal solutions of the original problem, including the existence of minimal and maximal solutions (interpreted in a suitable sense), and the existence of regular solutions (also interpreted in a suitable sense). In addition, we highlight how the connection between adversarial training and perimeter minimization problems provides a novel, directly interpretable, statistical motivation for a family of regularized risk minimization problems involving perimeter/total variation. The majority of our theoretical results are independent of the distance used to define adversarial attacks.

minimizer, perimeter, proposition 3, (14 more...)

arXiv.org Machine Learning

2111.13613

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

On implicit regularization: Morse functions and applications to matrix factorization

Belabbas, Mohamed Ali

arXiv.org Machine Learning, Jan-31-2020

In this paper, we revisit implicit regularization from the ground up using notions from dynamical systems and invariant subspaces of Morse functions. The key contributions are a new criterion for implicit regularization (a leading contender to explain the generalization power of deep models such as neural networks) and a general blueprint to study it. We apply these techniques to settle a conjecture on implicit regularization in matrix factorization.

implicit regularization, matrix, regularization problem, (14 more...)

arXiv.org Machine Learning

2001.04264

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Sparse estimation via $\ell_q$ optimization method in high-dimensional linear regression

Li, Xin, Hu, Yaohua, Li, Chong, Yang, Xiaoqi, Jiang, Tianzi

arXiv.org Machine Learning, Nov-11-2019

In this paper, we discuss the statistical properties of the $\ell_q$ optimization methods $(0

probability, recovery, regularization problem, (15 more...)

arXiv.org Machine Learning

1911.05073

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)

A brief introduction to the Grey Machine Learning

Ma, Xin

arXiv.org Machine Learning, May-4-2018

This paper presents a brief introduction to the key points of kernel-based Grey Machine Learning (GML). The general formulation of grey system models is first summarized, and the nonlinear extension of the grey models is then developed, also in a general formulation. The kernel implicit mapping is used to estimate the nonlinear function of the GML model: by extending the nonparametric formulation of the LSSVM, the estimate of the nonlinear function of the GML model can also be expressed in terms of kernels. The paper also briefly discusses the advantages of this new framework over the existing grey models and the LSSVM, and presents perspectives and future directions for the framework.

deep learning, formulation, upstream oil & gas, (20 more...)

arXiv.org Machine Learning

1805.01745

Country: Asia > China (0.29)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
