AITopics | subgradient descent

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
Oceania > Australia (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsAug-17-2025, 07:07:14 GMT

c559da2ba967eb820766939a658022c8-Supplemental.pdf

artificial intelligence, machine learning, neural network, (14 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
Oceania > Australia (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Neural Information Processing SystemsAug-17-2025, 07:07:10 GMT

On the Equivalence between Neural Network and Support Vector Machine

Recent research has made some progress towards deep learning theory from the perspective of infinite-width NN.

artificial intelligence, machine learning, neural network, (12 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
Oceania > Australia (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsOct-8-2024, 01:43:36 GMT

Reviews: Training Deep Networks without Learning Rates Through Coin Betting

Summary: This paper is based on the notion (established in existing works) that subgradient descent can be interpreted as betting on a coin. It generalizes existing results to data-dependent bets and, consequently, data-dependent convergence bounds that improve upon the existing ones. The algorithm is then adapted for the training of neural networks and evaluated on this task experimentally. Quality: The derivations are mathematically sound. I verified the proof of Theorem 1. The changes made to adapt COCOB for the training of neural networks (Algo 1 -- Algo 2) are sensible and intuitive.

algorithm, experimental evaluation, training deep network, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.44)

arXiv.org Artificial IntelligenceNov-9-2023

Primal Estimated Subgradient Solver for SVM for Imbalanced Classification

Sun, John

We aim to demonstrate in experiments that our cost sensitive PEGASOS SVM achieves good performance on imbalanced data sets with a Majority to Minority Ratio ranging from 8.6:1 to 130:1 and to ascertain whether the including intercept (bias), regularization and parameters affects performance on our selection of datasets. Although many resort to SMOTE methods, we aim for a less computationally intensive method. We evaluate the performance by examining the learning curves. These curves diagnose whether we overfit or underfit or whether the random sample of data chosen during the process was not random enough or diverse enough in dependent variable class for the algorithm to generalized to unseen examples. We will also see the background of the hyperparameters versus the test and train error in validation curves. We benchmark our PEGASOS Cost-Sensitive SVM's results of Ding's LINEAR SVM DECIDL method. He obtained an ROC-AUC of .5 in one dataset. Our work will extend the work of Ding by incorporating kernels into SVM. We will use Python rather than MATLAB as python has dictionaries for storing mixed data types during multi-parameter cross-validation.

algorithm, dataset, roc-auc, (14 more...)

arXiv.org Artificial Intelligence

2206.09311

Country: North America > United States > California > Alameda County > Livermore (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.51)

Dhingra, Atul, Shen, Jie, Kleene, Nicholas

Learning Large Scale Sparse Models

arXiv.org Artificial IntelligenceJan-26-2023

In this work, we consider learning sparse models in large scale settings, where the number of samples and the feature dimension can grow as large as millions or billions. Two immediate issues occur under such challenging scenario: (i) computational cost; (ii) memory overhead. In particular, the memory issue precludes a large volume of prior algorithms that are based on batch optimization technique. To remedy the problem, we propose to learn sparse models such as Lasso in an online manner where in each iteration, only one randomly chosen sample is revealed to update a sparse iterate. Thereby, the memory cost is independent of the sample size and gradient evaluation for one sample is efficient. Perhaps amazingly, we find that with the same parameter, sparsity promoted by batch methods is not preserved in online fashion. We analyze such interesting phenomenon and illustrate some effective variants including mini-batch methods and a hard thresholding based stochastic gradient algorithm. Extensive experiments are carried out on a public dataset which supports our findings and algorithms.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2301.10958

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.51)

arXiv.org Machine LearningNov-11-2021

On the Equivalence between Neural Network and Support Vector Machine

Chen, Yilan, Huang, Wei, Nguyen, Lam M., Weng, Tsui-Wei

Recent research shows that the dynamics of an infinitely wide neural network (NN) trained by gradient descent can be characterized by Neural Tangent Kernel (NTK) \citep{jacot2018neural}. Under the squared loss, the infinite-width NN trained by gradient descent with an infinitely small learning rate is equivalent to kernel regression with NTK \citep{arora2019exact}. However, the equivalence is only known for ridge regression currently \citep{arora2019harnessing}, while the equivalence between NN and other kernel machines (KMs), e.g. support vector machine (SVM), remains unknown. Therefore, in this work, we propose to establish the equivalence between NN and SVM, and specifically, the infinitely wide NN trained by soft margin loss and the standard soft margin SVM with NTK trained by subgradient descent. Our main theoretical results include establishing the equivalence between NN and a broad family of $\ell_2$ regularized KMs with finite-width bounds, which cannot be handled by prior work, and showing that every finite-width NN trained by such regularized loss functions is approximately a KM. Furthermore, we demonstrate our theory can enable three practical applications, including (i) \textit{non-vacuous} generalization bound of NN via the corresponding KM; (ii) \textit{non-trivial} robustness certificate for the infinite-width NN (while existing robustness verification methods would provide vacuous bounds); (iii) intrinsically more robust infinite-width NNs than those from previous kernel regression. Our code for the experiments are available at \url{https://github.com/leslie-CH/equiv-nn-svm}.

neural network, subgradient descent, svm, (11 more...)

2111.06063

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
Oceania > Australia (0.04)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Halabi, Marwa El, Jegelka, Stefanie

Minimizing approximately submodular functions

arXiv.org Machine LearningMay-28-2019

The problem of minimizing a submodular function is well studied; several polynomial-time algorithms have been developed to solve it exactly or up to arbitrary accuracy. However, in many applications, the objective functions are not exactly submodular. In this paper, we show that a classical algorithm used for submodular minimization performs well even for a class of non-submodular functions, namely weakly DR-submodular functions. We provide the first approximation guarantee for non-submodular minimization. This broadly expands the range of applications of submodular minimization techniques.

algorithm, artificial intelligence, machine learning, (18 more...)

1905.12145

Country: North America > United States (0.93)

Genre: Research Report (0.82)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine LearningMay-11-2018

Randomized Smoothing SVRG for Large-scale Nonsmooth Convex Optimization

Huang, Wenjie

In this paper, we consider the problem of minimizing the average of a large number of nonsmooth and convex functions. Such problems often arise in typical machine learning problems as empirical risk minimization, but are computationally very challenging. We develop and analyze a new algorithm that achieves robust linear convergence rate, and both its time complexity and gradient complexity are superior than state-of-art nonsmooth algorithms and subgradient-based schemes. Besides, our algorithm works without any extra error bound conditions on the objective function as well as the common strongly-convex condition. We show that our algorithm has wide applications in optimization and machine learning problems, and demonstrate experimentally that it performs well on a large-scale ranking problem.

algorithm, artificial intelligence, machine learning, (17 more...)

1805.05189

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.82)

Industry: Education > Focused Education > Special Education (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.31)

Houthooft, Rein, De Turck, Filip

Integrated Inference and Learning of Neural Factors in Structural Support Vector Machines

arXiv.org Machine LearningMar-3-2016

Tackling pattern recognition problems in areas such as computer vision, bioinformatics, speech or text recognition is often done best by taking into account task-specific statistical relations between output variables. In structured prediction, this internal structure is used to predict multiple outputs simultaneously, leading to more accurate and coherent predictions. Structural support vector machines (SSVMs) are nonprobabilistic models that optimize a joint input-output function through margin-based learning. Because SSVMs generally disregard the interplay between unary and interaction factors during the training phase, final parameters are suboptimal. Moreover, its factors are often restricted to linear combinations of input features, limiting its generalization power. To improve prediction accuracy, this paper proposes: (i) Joint inference and learning by integration of back-propagation and loss-augmented inference in SSVM subgradient descent; (ii) Extending SSVM factors to neural networks that form highly nonlinear functions of input features. Image segmentation benchmark results demonstrate improvements over conventional SSVM training methods in terms of accuracy, highlighting the feasibility of end-to-end SSVM training with neural factors.

artificial intelligence, interaction factor, machine learning, (15 more...)

doi: 10.1016/j.patcog.2016.03.014

1508.00451

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
Europe > Belgium > Flanders (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)