mgf
- North America > Canada > Alberta (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- (5 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Law (0.67)
- Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.67)
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- (2 more...)
Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks
Papazov, Hristo, Pesme, Scott, Flammarion, Nicolas
In this work, we investigate the effect of momentum on the optimisation trajectory of gradient descent. We leverage a continuous-time approach in the analysis of momentum gradient descent with step size $\gamma$ and momentum parameter $\beta$ that allows us to identify an intrinsic quantity $\lambda = \frac{ \gamma }{ (1 - \beta)^2 }$ which uniquely defines the optimisation path and provides a simple acceleration rule. When training a $2$-layer diagonal linear network in an overparametrised regression setting, we characterise the recovered solution through an implicit regularisation problem. We then prove that small values of $\lambda$ help to recover sparse solutions. Finally, we give similar but weaker results for stochastic momentum gradient descent. We provide numerical experiments which support our claims.
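A minimal NumPy sketch of the setting described in the abstract: heavy-ball momentum gradient descent on a $2$-layer diagonal linear network (predictions use the element-wise product $u \odot v$), with the intrinsic quantity $\lambda = \frac{\gamma}{(1-\beta)^2}$ computed from the chosen hyperparameters. The problem sizes, initialisation scale, and hyperparameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparametrised sparse regression problem (illustrative sizes, not from the paper).
n, d = 20, 50
X = rng.standard_normal((n, d))
w_star = np.zeros(d)
w_star[:3] = [2.0, -1.5, 1.0]           # sparse ground truth
y = X @ w_star

# 2-layer diagonal linear network: predictions use w = u * v (element-wise).
alpha = 0.1                              # initialisation scale (assumed)
u = alpha * np.ones(d)
v = alpha * np.ones(d)

gamma, beta = 1e-3, 0.9                  # step size and momentum parameter (assumed)
lam = gamma / (1.0 - beta) ** 2          # intrinsic quantity lambda = gamma / (1 - beta)^2
print(f"lambda = {lam:.4f}")

def grads(u, v):
    """Gradients of the squared loss 0.5/n * ||X (u*v) - y||^2 w.r.t. u and v."""
    r = X @ (u * v) - y
    g = X.T @ r / n
    return g * v, g * u                  # chain rule through the product u * v

# Heavy-ball (momentum) gradient descent on (u, v).
u_prev, v_prev = u.copy(), v.copy()
for _ in range(50_000):
    gu, gv = grads(u, v)
    u_new = u - gamma * gu + beta * (u - u_prev)
    v_new = v - gamma * gv + beta * (v - v_prev)
    u_prev, v_prev, u, v = u, v, u_new, v_new

w = u * v
print("support of recovered w:", np.nonzero(np.abs(w) > 1e-2)[0])
```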
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (6 more...)
Concentration of the Langevin Algorithm's Stationary Distribution
Altschuler, Jason M., Talwar, Kunal
A canonical algorithm for log-concave sampling is the Langevin Algorithm, aka the Langevin Diffusion run with some discretization stepsize $\eta > 0$. This discretization leads the Langevin Algorithm to have a stationary distribution $\pi_{\eta}$ which differs from the stationary distribution $\pi$ of the Langevin Diffusion, and it is an important challenge to understand whether the well-known properties of $\pi$ extend to $\pi_{\eta}$. In particular, while concentration properties such as isoperimetry and rapidly decaying tails are classically known for $\pi$, the analogous properties for $\pi_{\eta}$ are open questions with direct algorithmic implications. This note provides a first step in this direction by establishing concentration results for $\pi_{\eta}$ that mirror classical results for $\pi$. Specifically, we show that for any nontrivial stepsize $\eta > 0$, $\pi_{\eta}$ is sub-exponential (respectively, sub-Gaussian) when the potential is convex (respectively, strongly convex). Moreover, the concentration bounds we show are essentially tight. Key to our analysis is the use of a rotation-invariant moment generating function (aka Bessel function) to study the stationary dynamics of the Langevin Algorithm. This technique may be of independent interest because it enables directly analyzing the discrete-time stationary distribution $\pi_{\eta}$ without going through the continuous-time stationary distribution $\pi$ as an intermediary.
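For reference, a minimal sketch of the Langevin Algorithm itself: the iteration $x_{k+1} = x_k - \eta \nabla f(x_k) + \sqrt{2\eta}\,\xi_k$ with $\xi_k \sim \mathcal{N}(0, I)$, run on a strongly convex (quadratic) potential, followed by a crude empirical look at the tails of long-run samples from $\pi_{\eta}$. The potential, step size, and sample counts are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Strongly convex potential f(x) = 0.5 * ||x||^2, so pi ∝ exp(-f) is standard Gaussian.
def grad_f(x):
    return x

d = 2
eta = 0.1                                # discretisation step size (illustrative value)
x = np.zeros(d)

# Langevin Algorithm: x_{k+1} = x_k - eta * grad f(x_k) + sqrt(2 * eta) * xi_k.
burn_in, n_samples = 10_000, 100_000
samples = np.empty((n_samples, d))
for k in range(burn_in + n_samples):
    x = x - eta * grad_f(x) + np.sqrt(2 * eta) * rng.standard_normal(d)
    if k >= burn_in:
        samples[k - burn_in] = x

# Crude check of the tails of pi_eta: empirical tail probabilities should decay
# roughly like exp(-c * t^2) in the threshold t for this strongly convex potential.
norms = np.linalg.norm(samples, axis=1)
for t in [1.0, 2.0, 3.0, 4.0]:
    print(f"P(||x|| > {t}) ~ {np.mean(norms > t):.4f}")
```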
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Middle East > Jordan (0.04)
The Implicit Regularization of Momentum Gradient Descent with Early Stopping
Wang, Li, Zhou, Yingcong, Fu, Zhiguo
The study of the implicit regularization induced by gradient-based optimization is a longstanding pursuit. In the present paper, we characterize the implicit regularization of momentum gradient descent (MGD) with early stopping by comparing it with explicit $\ell_2$-regularization (ridge). In detail, we study MGD in the continuous-time view, the so-called momentum gradient flow (MGF), and show that it stays closer to ridge than gradient descent (GD) does [Ali et al., 2019] for least squares regression. Moreover, we prove that, under the calibration $t=\sqrt{2/\lambda}$, where $t$ is the time parameter in MGF and $\lambda$ is the tuning parameter in ridge regression, the risk of MGF is no more than 1.54 times that of ridge. In particular, the relative Bayes risk of MGF to ridge lies between 1 and 1.035 under optimal tuning. Numerical experiments strongly support our theoretical results.
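An illustrative sketch of this comparison, assuming a simple semi-implicit Euler discretisation of a momentum gradient flow of the form $\ddot{b}(t) + \dot{b}(t) = -\nabla L(b(t))$ (the paper's exact normalisation of MGF may differ): stop the flow at time $t$ and compare the iterate with the ridge estimate calibrated by $\lambda = 2/t^2$. Problem sizes, noise level, and the stopping time are made-up values, and the paper's result concerns risk rather than coincidence of the two estimates.

```python
import numpy as np

rng = np.random.default_rng(2)

# Least squares problem (illustrative sizes).
n, d = 100, 20
X = rng.standard_normal((n, d))
beta_true = rng.standard_normal(d)
y = X @ beta_true + 0.5 * rng.standard_normal(n)

def grad(b):
    return X.T @ (X @ b - y) / n

# Momentum gradient flow  b''(t) + b'(t) = -grad L(b(t)),  integrated with a small
# semi-implicit Euler step; started at zero, stopped early at time t_stop.
h = 1e-3
b, vel = np.zeros(d), np.zeros(d)
t_stop = 5.0                             # early-stopping time t (assumed)
t = 0.0
while t < t_stop:
    vel += h * (-vel - grad(b))
    b += h * vel
    t += h

# Ridge estimate under the calibration t = sqrt(2 / lambda), i.e. lambda = 2 / t^2.
lam = 2.0 / t_stop ** 2
b_ridge = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

print("||b_mgf - b_ridge|| / ||b_ridge|| =",
      np.linalg.norm(b - b_ridge) / np.linalg.norm(b_ridge))
```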
Moment Generating Function Tutorial
We generally use moments in statistics, machine learning, mathematics, and other fields to describe the characteristics of a distribution. If the variable of interest is X, its moments are the expected values of its powers, E[X^n]. The most familiar are the first moment (the mean) and the second central moment (the variance). The standardised third moment is the skewness and the standardised fourth moment is the kurtosis: skewness measures the asymmetry of the distribution, while kurtosis measures how heavy its tails are.
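A short sketch of these four summaries on a skewed sample, using NumPy and SciPy (scipy.stats.kurtosis returns excess kurtosis by default); the exponential sample and its size are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.exponential(scale=2.0, size=100_000)    # a skewed sample

print("mean     :", np.mean(x))                 # first moment
print("variance :", np.var(x))                  # second central moment
print("skewness :", stats.skew(x))              # third standardised moment
print("kurtosis :", stats.kurtosis(x))          # fourth standardised moment (excess)
```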
Concentration Inequalities for Statistical Inference
This paper reviews concentration inequalities, which are widely employed in mathematical statistics across a wide range of settings: from distribution-free to distribution-dependent, from sub-Gaussian to sub-exponential, sub-Gamma, and sub-Weibull random variables, and from concentration of the mean to concentration of the maximum. The review collects results in these settings together with some new ones. Given the increasing popularity of high-dimensional data and inference, results for high-dimensional linear and Poisson regressions are also provided. We aim to illustrate the concentration inequalities with known constants and to improve existing bounds with sharper constants.
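As a reminder of how the moment generating function drives such bounds, here is the standard Chernoff argument for a sub-Gaussian tail (a textbook derivation, not a result specific to this paper):

```latex
% Chernoff's method: concentration from the moment generating function.
% For any $t > 0$, Markov's inequality applied to $e^{tX}$ gives
\[
  \mathbb{P}(X \ge a)
  = \mathbb{P}\!\left(e^{tX} \ge e^{ta}\right)
  \le e^{-ta}\,\mathbb{E}\!\left[e^{tX}\right]
  = e^{-ta}\, M_X(t).
\]
% If $X$ is sub-Gaussian with mean zero and variance proxy $\sigma^2$,
% i.e. $M_X(t) \le e^{\sigma^2 t^2 / 2}$ for all $t$, optimising over $t$
% (take $t = a/\sigma^2$) yields the Gaussian-type tail bound
\[
  \mathbb{P}(X \ge a) \le \exp\!\left(-\frac{a^2}{2\sigma^2}\right),
  \qquad a > 0 .
\]
```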
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (5 more...)
Moment Generating Function for Probability Distribution with Python
This tutorial's code is available on Github and its full implementation as well on Google Colab. Check out our editorial suggestions on the best data science books. We generally use moments in statistics, machine learning, mathematics, and other fields to describe the characteristics of a distribution. Let's say the variable of our interest is X then, moments are X's expected values. Now we are very familiar with the first moment(mean) and the second moment(variance).