
Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals

Goel, Surbhi, Karmalkar, Sushrut, Klivans, Adam

Neural Information Processing Systems

We consider the problem of computing the best-fitting ReLU with respect to square-loss on a training set when the examples have been drawn according to a spherical Gaussian distribution (the labels can be arbitrary). Let $\opt < 1$ be the population loss of the best-fitting ReLU. We prove: (i) finding a ReLU with square-loss $\opt + \epsilon$ is as hard as learning sparse parities with noise, a problem widely thought to be computationally intractable; this is the first hardness result for learning a ReLU with respect to Gaussian marginals, and our results imply, {\em unconditionally}, that gradient descent cannot converge to the global minimum in polynomial time; (ii) there is an approximation algorithm that outputs a ReLU with square-loss $\opt^{2/3} + \epsilon$ in time $\mathrm{poly}(d, 1/\epsilon)$, via a novel reduction to noisy halfspace learning with respect to $0/1$ loss.
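As a concrete illustration of the setup (not the paper's algorithm), the following minimal numpy sketch draws examples from a spherical Gaussian, corrupts a fraction of the labels so the problem is genuinely agnostic, and evaluates the empirical square loss of a candidate ReLU hypothesis. The target direction `w_star` and the 10% corruption rate are hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 20000

# Hypothetical target direction; a fraction of labels is then corrupted,
# so the best-fitting ReLU has nonzero population loss (the agnostic setting).
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)

X = rng.normal(size=(n, d))               # spherical Gaussian marginals
y = np.maximum(X @ w_star, 0.0)           # clean ReLU labels
y[: n // 10] = rng.normal(size=n // 10)   # corrupt 10% of labels arbitrarily

def square_loss(w):
    """Empirical square loss of the ReLU hypothesis x -> max(0, w . x)."""
    return float(np.mean((np.maximum(X @ w, 0.0) - y) ** 2))

# The target direction is near-optimal: only the corrupted labels contribute
# to its loss, while a wrong direction pays on the clean examples as well.
print(square_loss(w_star))
print(square_loss(-w_star))
```

The hardness results above concern closing the remaining gap: finding a hypothesis whose loss is within $\epsilon$ of the best such `w`, not merely evaluating candidates.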


Near-Optimal SQ Lower Bounds for Agnostically Learning Halfspaces and ReLUs under Gaussian Marginals

Neural Information Processing Systems

We study the fundamental problems of agnostically learning halfspaces and ReLUs under Gaussian marginals. In the former problem, given labeled examples $(\bx, y)$ from an unknown distribution on $\R^d \times \{ \pm 1\}$, whose marginal distribution on $\bx$ is the standard Gaussian and the labels $y$ can be arbitrary, the goal is to output a hypothesis with 0-1 loss $\opt+\eps$, where $\opt$ is the 0-1 loss of the best-fitting halfspace. In the latter problem, given labeled examples $(\bx, y)$ from an unknown distribution on $\R^d \times \R$, whose marginal distribution on $\bx$ is the standard Gaussian and the labels $y$ can be arbitrary, the goal is to output a hypothesis with square loss $\opt+\eps$, where $\opt$ is the square loss of the best-fitting ReLU. We prove Statistical Query (SQ) lower bounds of $d^{\poly(1/\eps)}$ for both of these problems. Our SQ lower bounds provide strong evidence that current upper bounds for these tasks are essentially best possible.


Proximal Approximate Inference in State-Space Models

Abdulsamad, Hany, García-Fernández, Ángel F., Särkkä, Simo

arXiv.org Artificial Intelligence

We present a class of algorithms for state estimation in nonlinear, non-Gaussian state-space models. Our approach is based on a variational Lagrangian formulation that casts Bayesian inference as a sequence of entropic trust-region updates subject to dynamic constraints. This framework gives rise to a family of forward-backward algorithms, whose structure is determined by the chosen factorization of the variational posterior. By focusing on Gauss--Markov approximations, we derive recursive schemes with favorable computational complexity. For general nonlinear, non-Gaussian models we close the recursions using generalized statistical linear regression and Fourier--Hermite moment matching.
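The statistical-linear-regression step mentioned above can be illustrated in miniature: replace a nonlinear function $h$ by the affine map $Ax + b$ that minimizes the mean squared error under the current Gaussian approximation. Below is a minimal scalar sketch using plain Monte Carlo for the expectations; this is an illustrative stand-in, not the paper's scheme, which uses constructions such as Fourier--Hermite moment matching.

```python
import numpy as np

def statistical_linearization(h, m, P, n=200_000, seed=0):
    """Best affine fit h(x) ~ A*x + b in mean squared error for x ~ N(m, P).

    The minimizer is the linear-regression solution
        A = Cov(x, h(x)) / Var(x),    b = E[h(x)] - A * m,
    with expectations estimated here by plain Monte Carlo.
    """
    rng = np.random.default_rng(seed)
    x = m + np.sqrt(P) * rng.standard_normal(n)
    hx = h(x)
    A = np.mean((x - m) * (hx - hx.mean())) / P
    b = hx.mean() - A * m
    return float(A), float(b)

# Linearize a nonlinear measurement function around the Gaussian N(0.3, 0.04).
A, b = statistical_linearization(np.sin, m=0.3, P=0.04)
```

For $h = \sin$ the expectations are known in closed form ($\mathbb{E}[\sin x] = \sin(m)e^{-P/2}$, $\mathrm{Cov}(x, \sin x) = P\cos(m)e^{-P/2}$), which makes the sketch easy to sanity-check.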



Reliable Learning of Halfspaces under Gaussian Marginals

Neural Information Processing Systems

We study the problem of PAC learning halfspaces in the reliable agnostic model of Kalai et al. (2012). The reliable PAC model captures learning scenarios where one type of error is costlier than the others. We complement our upper bound with a Statistical Query lower bound suggesting that the $d^{\Omega(\log(1/\alpha))}$ dependence is best possible. Conceptually, our results imply a strong computational separation between reliable agnostic learning and standard agnostic learning of halfspaces in the Gaussian setting.


Reviews: Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals

Neural Information Processing Systems

This paper studies the computational complexity of learning a single ReLU with respect to Gaussian examples. Since ReLUs are now the standard choice of nonlinearity in deep neural networks, the computational complexity of learning them is clearly of interest. Of course, the computational complexity of learning a ReLU may depend substantially on the specific setting assumed; it is interesting to understand the range of such assumptions and their implications for complexity. This paper studies the following setting: given independent samples $(x_1, y_1), \ldots, (x_n, y_n)$ where $x$ is spherical Gaussian in $d$ dimensions and $y \in \R$ is arbitrary, find a ReLU function $f_w(x) = \max(0, w \cdot x)$ for some vector $w$ with minimal mean squared error $\sum_i (y_i - f_w(x_i))^2$. (This is agnostic learning since the $y$'s are arbitrary.) The main results are as follows: 1) There is no algorithm to learn a single ReLU with respect to Gaussian examples to additive error $\epsilon$ in time $d^{o(\log 1/\epsilon)}$ unless $k$-sparse parities with noise can be learned in time $d^{o(k)}$. 2) If $\opt = \min_w$ (mean squared error of $f_w$), then (with normalization such that $\opt \in [0,1]$) there is an algorithm which agnostically learns a ReLU to error $\opt^{2/3} + \epsilon$ in time $\mathrm{poly}(d, 1/\epsilon)$. The proof of (1) goes via Hermite analysis (i.e., expansion of functions in the orthogonal Hermite polynomial basis for the Gaussian measure).
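The gradient-descent obstruction in result (1) concerns the agnostic case; in the realizable case (labels exactly a ReLU of a Gaussian example), plain gradient descent on the empirical square loss does drive the error down. A minimal numpy sketch of the setting the review describes, with hypothetical data and an initialization flipped to have positive correlation with the target so the iterates stay in the benign region:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 5000

w_true = rng.normal(size=d)
w_true /= np.linalg.norm(w_true)

X = rng.normal(size=(n, d))          # spherical Gaussian examples
y = np.maximum(X @ w_true, 0.0)      # realizable labels: y = max(0, w_true . x)

w = 0.1 * rng.normal(size=d)
if w @ w_true < 0:                   # start positively correlated with the target
    w = -w

for _ in range(2000):
    pred = np.maximum(X @ w, 0.0)
    # (sub)gradient of the empirical mean squared error with respect to w
    grad = (2.0 / n) * X.T @ ((pred - y) * (X @ w > 0))
    w -= 0.2 * grad

mse = float(np.mean((np.maximum(X @ w, 0.0) - y) ** 2))
```

With arbitrary labels the picture changes entirely: the paper's hardness result implies no such descent procedure can reach square-loss $\opt + \epsilon$ in polynomial time.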




Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces and ReLU Regression under Gaussian Marginals

Diakonikolas, Ilias, Kane, Daniel M., Ren, Lisheng

arXiv.org Artificial Intelligence

We study the task of agnostically learning halfspaces under the Gaussian distribution. Specifically, given labeled examples $(\mathbf{x},y)$ from an unknown distribution on $\mathbb{R}^n \times \{ \pm 1\}$, whose marginal distribution on $\mathbf{x}$ is the standard Gaussian and the labels $y$ can be arbitrary, the goal is to output a hypothesis with 0-1 loss $\mathrm{OPT}+\epsilon$, where $\mathrm{OPT}$ is the 0-1 loss of the best-fitting halfspace. We prove a near-optimal computational hardness result for this task, under the widely believed sub-exponential time hardness of the Learning with Errors (LWE) problem. Prior hardness results are either qualitatively suboptimal or apply to restricted families of algorithms. Our techniques extend to yield near-optimal lower bounds for related problems, including ReLU regression.

