Volgushev, Stanislav
Universality of Benign Overfitting in Binary Linear Classification
Hashimoto, Ichiro, Volgushev, Stanislav, Zwiernik, Piotr
The practical success of deep learning has led to the discovery of several surprising phenomena. One of these phenomena, which has spurred intense theoretical research, is ``benign overfitting'': deep neural networks seem to generalize well in the over-parametrized regime even though they fit noisy training data perfectly. It is now known that benign overfitting also occurs in various classical statistical models. For linear maximum margin classifiers, benign overfitting has been established theoretically in a class of mixture models with very strong assumptions on the covariate distribution. However, even in this simple setting, many questions remain open. For instance, most of the existing literature focuses on the noiseless case where all true class labels are observed without errors, whereas the more interesting noisy case remains poorly understood. We provide a comprehensive study of benign overfitting for linear maximum margin classifiers. We discover a previously unknown phase transition in test error bounds for the noisy model and provide some geometric intuition behind it. We further considerably relax the required covariate assumptions in both the noisy and the noiseless cases. Our results demonstrate that benign overfitting of maximum margin classifiers holds in a much wider range of scenarios than was previously known and provide new insights into the underlying mechanisms.
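The following is a minimal, hypothetical Python simulation (not the setup or estimator from the paper) of the phenomenon the abstract describes: in a high-dimensional Gaussian mixture with a fraction of flipped labels, a large-C linear SVM interpolates the noisy training labels yet can still classify new, clean data well. The dimensions, signal strength, and noise rate below are arbitrary illustrative choices.

```python
# Illustrative sketch of benign overfitting for a max-margin linear classifier.
# All parameter values are hypothetical; this is not the paper's mixture model.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, d, noise_rate = 50, 2000, 0.1            # samples, dimension >> n, label-noise rate
mu = np.zeros(d)
mu[0] = 3.0                                  # class-mean direction / signal strength

y = rng.choice([-1, 1], size=n)              # true labels
X = y[:, None] * mu + rng.standard_normal((n, d))
y_noisy = y * np.where(rng.random(n) < noise_rate, -1, 1)   # flip a fraction of labels

clf = SVC(kernel="linear", C=1e6)            # large C approximates the hard-margin (max-margin) solution
clf.fit(X, y_noisy)
print("train accuracy on noisy labels:", clf.score(X, y_noisy))   # typically 1.0: the fit interpolates

y_test = rng.choice([-1, 1], size=2000)      # evaluate against clean labels
X_test = y_test[:, None] * mu + rng.standard_normal((2000, d))
print("test accuracy on clean labels:", clf.score(X_test, y_test))
```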
Group structure estimation for panel data -- a general approach
Yu, Lu, Gu, Jiaying, Volgushev, Stanislav
Panel data models are a standard empirical tool in statistics, economics, marketing, and financial research. The conventional modeling approach assumes that all individual heterogeneity can be summarized by an individual-specific intercept, often known as a fixed effect, while all covariates have a common effect across individuals, so that information can be pooled across individuals to estimate these common parameters efficiently. However, heterogeneous responses to observed control variables are often better supported by empirical evidence, especially as detailed individual-level data become more available. An increasingly popular approach to modeling unobserved heterogeneity in the effects of covariates on individual responses is to assume the existence of a finite number of homogeneous groups.
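As an illustration only (a generic two-step heuristic, not the estimator proposed in the paper), the sketch below simulates panel data whose slope coefficients are shared within latent groups, estimates an individual slope for each unit after removing its fixed effect, and clusters these estimates to recover the group structure. All dimensions and parameter values are hypothetical.

```python
# Minimal sketch of group-structured panel data: individuals share covariate
# effects only within latent groups. A simple two-step approach runs individual
# within-transformed regressions and clusters the estimated slopes.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
N, T, G = 200, 40, 3                            # individuals, time periods, groups (hypothetical)
true_betas = np.array([[1.0], [0.0], [-1.0]])   # group-specific slopes (hypothetical)
groups = rng.integers(G, size=N)

beta_hat = np.empty((N, 1))
for i in range(N):
    x = rng.standard_normal((T, 1))
    alpha_i = rng.standard_normal()             # individual fixed effect
    y = alpha_i + x @ true_betas[groups[i]] + 0.5 * rng.standard_normal(T)
    x_c, y_c = x - x.mean(0), y - y.mean()      # within transformation removes alpha_i
    beta_hat[i] = np.linalg.lstsq(x_c, y_c, rcond=None)[0]

labels = KMeans(n_clusters=G, n_init=10, random_state=0).fit_predict(beta_hat)
print("estimated group sizes:", np.bincount(labels))
```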
An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias
Yu, Lu, Balasubramanian, Krishnakumar, Volgushev, Stanislav, Erdogdu, Murat A.
Structured non-convex learning problems, for which critical points have favorable statistical properties, arise frequently in statistical machine learning. Algorithmic convergence and statistical estimation rates are well understood for such problems. However, quantifying the uncertainty associated with the underlying training algorithm is not well studied in the non-convex setting. To address this shortcoming, in this work we establish an asymptotic normality result for constant step size stochastic gradient descent (SGD), an algorithm widely used in practice. Specifically, based on the relationship between SGD and Markov chains [DDB19], we show that the average of the SGD iterates is asymptotically normally distributed around the expected value of their unique invariant distribution, as long as the non-convex and non-smooth objective function satisfies a dissipativity property. We also characterize the bias between this expected value and the critical points of the objective function under various local regularity conditions. Together, these two results can be leveraged to construct confidence intervals for non-convex problems trained using SGD.
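The sketch below (a toy one-dimensional example, not the paper's setting or assumptions) illustrates constant step size SGD with Polyak-Ruppert style iterate averaging: over many independent runs, the averaged iterate concentrates around the mean of the stationary distribution of the SGD chain, which in general sits at a step-size-dependent distance from the critical point of the objective.

```python
# Toy illustration of constant step size SGD with iterate averaging.
# Objective, noise model, and all tuning constants are hypothetical.
import numpy as np

rng = np.random.default_rng(2)

def stochastic_grad(theta):
    # gradient of 0.5 * (theta - 1)^2 plus zero-mean noise (a simple dissipative objective)
    return (theta - 1.0) + rng.standard_normal()

def averaged_sgd(step=0.05, n_iter=5000, burn_in=1000):
    theta, running_sum = 0.0, 0.0
    for t in range(n_iter):
        theta -= step * stochastic_grad(theta)   # constant step size update
        if t >= burn_in:
            running_sum += theta                 # Polyak-Ruppert style averaging
    return running_sum / (n_iter - burn_in)

# Repeat the run to inspect the approximately normal spread of the averaged iterate.
averages = np.array([averaged_sgd() for _ in range(500)])
print("mean of averaged iterates:", averages.mean())
print("std  of averaged iterates:", averages.std())
```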