
Collaborating Authors: GLM-tron


Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron

Wu, Jingfeng, Zou, Difan, Chen, Zixiang, Braverman, Vladimir, Gu, Quanquan, Kakade, Sham M.

arXiv.org Artificial Intelligence

This paper considers the problem of learning a single ReLU neuron with squared loss (a.k.a. ReLU regression) in the overparameterized regime, where the input dimension can exceed the number of samples. We analyze a Perceptron-type algorithm called GLM-tron (Kakade et al., 2011) and provide dimension-free risk upper bounds for it in high-dimensional ReLU regression, in both well-specified and misspecified settings. Our risk bounds recover several existing results as special cases. Moreover, in the well-specified setting, we provide an instance-wise matching risk lower bound for GLM-tron. Our upper and lower risk bounds together give a sharp characterization of the high-dimensional ReLU regression problems that can be learned via GLM-tron. On the other hand, we provide some negative results for stochastic gradient descent (SGD) for ReLU regression with symmetric Bernoulli data: if the model is well-specified, the excess risk of SGD is provably no better than that of GLM-tron, up to constant factors, for each problem instance; and in the noiseless case, GLM-tron can achieve a small risk while SGD unavoidably suffers a constant risk in expectation. These results together suggest that GLM-tron might be preferable to SGD for high-dimensional ReLU regression.
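The GLM-tron update referenced above is a Perceptron-style step that uses the raw residual rather than the gradient of the loss, so the ReLU's nondifferentiability at zero never enters the update. A minimal sketch, on a toy well-specified noiseless instance (the step size, iteration count, and problem sizes here are illustrative choices, not values from the paper):

```python
import numpy as np

def glm_tron(X, y, eta=0.5, T=500):
    # GLM-tron sketch for ReLU regression: Perceptron-type update
    # w <- w + eta * (1/n) * sum_i (y_i - relu(w . x_i)) x_i.
    # Unlike gradient descent, no derivative of the activation appears.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(T):
        residual = y - np.maximum(X @ w, 0.0)
        w = w + eta * (X.T @ residual) / n
    return w

# Toy well-specified, noiseless instance: y_i = relu(w_star . x_i).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
w_star = np.array([1.0, -0.5, 0.0, 2.0, 0.3])
y = np.maximum(X @ w_star, 0.0)

w_hat = glm_tron(X, y)
risk = np.mean((np.maximum(X @ w_hat, 0.0) - y) ** 2)  # empirical squared risk
```

In this noiseless realizable setting the residuals vanish at `w_star`, so the iteration drives the empirical risk toward zero; this is the regime in which the abstract contrasts GLM-tron favorably with SGD.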


The Reflectron: Exploiting geometry for learning generalized linear models

Boffi, Nicholas M., Slotine, Jean-Jacques E.

arXiv.org Machine Learning

Generalized linear models (GLMs) extend linear regression by generating the dependent variables through a nonlinear function of a predictor in a Reproducing Kernel Hilbert Space. Despite the nonconvexity of the underlying optimization problem, the GLM-tron algorithm of Kakade et al. (2011) provably learns GLMs with guarantees of computational and statistical efficiency. We present an extension of the GLM-tron to a mirror descent or natural gradient-like setting, which we call the Reflectron. The Reflectron enjoys the same statistical guarantees as the GLM-tron for any choice of potential function $\psi$. We show that $\psi$ can be used to exploit the underlying optimization geometry and improve statistical guarantees, or to define an optimization geometry and thereby implicitly regularize the model. The implicit bias of the algorithm can be used to impose advantageous priors, such as sparsity-promoting ones, on the learned weights. Our results extend to the case of multiple outputs with or without weight sharing, and we further show that the Reflectron can be used for online learning of GLMs in the realizable or bounded-noise settings. We primarily perform our analysis in continuous time, which leads to simple derivations, and subsequently prove matching guarantees for a discrete implementation. We supplement our theoretical analysis with simulations on real and synthetic datasets demonstrating the validity of our theoretical results.
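A mirror-descent variant of the GLM-tron update can be sketched by taking the residual step in dual coordinates $\theta = \nabla\psi(w)$ and mapping back with $\nabla\psi^*$. The sketch below uses the identity maps, which recovers plain GLM-tron; other potentials (e.g. an entropy or $p$-norm potential) would change the implicit bias, as the abstract describes. All names and parameter values here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def reflectron(X, y, sigma, grad_psi, grad_psi_star, eta=1.0, T=400):
    # Mirror-descent sketch of a GLM-tron-style update:
    #   theta <- theta + eta * (1/n) * sum_i (y_i - sigma(w . x_i)) x_i,
    #   w = grad_psi_star(theta).
    # grad_psi = grad_psi_star = identity recovers plain GLM-tron.
    n, d = X.shape
    theta = grad_psi(np.zeros(d))
    for _ in range(T):
        w = grad_psi_star(theta)
        theta = theta + eta * X.T @ (y - sigma(X @ w)) / n
    return grad_psi_star(theta)

identity = lambda w: w
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Noiseless GLM with a logistic link.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
w_star = np.array([0.8, -1.2, 0.0, 0.5])
y = sigmoid(X @ w_star)

w_hat = reflectron(X, y, sigmoid, identity, identity)
risk = np.mean((sigmoid(X @ w_hat) - y) ** 2)
```

Swapping in a sparsity-promoting potential changes only `grad_psi` / `grad_psi_star`; the residual step itself is unchanged, which is why the statistical guarantees carry over for any valid $\psi$.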


Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression

Kakade, Sham, Kalai, Adam Tauman, Kanade, Varun, Shamir, Ohad

arXiv.org Artificial Intelligence

Generalized Linear Models (GLMs) and Single Index Models (SIMs) provide powerful generalizations of linear regression, where the target variable is assumed to be a (possibly unknown) 1-dimensional function of a linear predictor. In general, these problems entail non-convex estimation procedures, and, in practice, iterative local search heuristics are often used. Kalai and Sastry (2009) recently provided the first provably efficient method for learning SIMs and GLMs, under the assumptions that the data are in fact generated under a GLM and under certain monotonicity and Lipschitz constraints. However, to obtain provable performance, the method requires a fresh sample every iteration. In this paper, we provide algorithms for learning GLMs and SIMs, which are both computationally and statistically efficient. We also provide an empirical study, demonstrating their feasibility in practice.
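When the link function is unknown, the classic way to estimate it is isotonic regression via the Pool Adjacent Violators (PAV) algorithm, alternated with a GLM-tron-like residual step, as in the Isotron of Kalai and Sastry (2009). A minimal sketch of that alternation (note the paper above uses a Lipschitz-constrained isotonic fit for its statistical guarantees; plain PAV is shown here for simplicity, and the function names are illustrative):

```python
import numpy as np

def pav(v):
    # Pool Adjacent Violators: least-squares fit of a nondecreasing
    # sequence to v, maintained as a stack of (level, count) blocks.
    levels, counts = [], []
    for x in v:
        levels.append(float(x))
        counts.append(1)
        while len(levels) > 1 and levels[-2] > levels[-1]:
            l2, c2 = levels.pop(), counts.pop()
            l1, c1 = levels.pop(), counts.pop()
            levels.append((l1 * c1 + l2 * c2) / (c1 + c2))  # pool the blocks
            counts.append(c1 + c2)
    return np.concatenate([np.full(c, l) for l, c in zip(levels, counts)])

def isotron_step(X, y, w):
    # One Isotron-style iteration: estimate the unknown monotone link by
    # isotonic regression of y against the current scores X @ w, then
    # take a GLM-tron-like residual step on the weights.
    order = np.argsort(X @ w)
    u = np.empty(len(y))
    u[order] = pav(y[order])
    return w + X.T @ (y - u) / len(y)

fit = pav([1.0, 3.0, 2.0, 5.0, 4.0])  # adjacent violators get averaged
```

The PAV fit pools each decreasing pair into its weighted mean, so `[1, 3, 2, 5, 4]` becomes `[1, 2.5, 2.5, 4.5, 4.5]`; the monotone fit then serves as the current estimate of the link inside each residual step.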