Reviews: Generalization Bounds for Neural Networks via Approximate Description Length
In this paper the authors establish upper bounds on the generalization error of classes of norm-bounded neural networks. There is a long line of literature on this exact question, and this paper claims to resolve an interesting open question in the area (at least when the depth of the network is viewed as a constant). In particular, the paper considers generalization bounds for a class of fully-connected networks of constant depth whose weight matrices have bounded norm. Work by Bartlett et al. ("Spectrally-normalized margin bounds for neural networks", ref [4] in the paper) proved an upper bound on generalization error that contains a factor growing as the (1,2)-matrix norm of each layer. If one further assumes that the depth as well as all the spectral norms are constants, then this factor is the dominant term (up to logarithmic factors) in their generalization bound.
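To make the quantitative gap between the two results concrete, here is a minimal Python sketch (ours, not from the paper or the review) that evaluates the dominant terms of the prior $\tilde O(d^2R^2/\epsilon^2)$ rate and the paper's $\tilde O(dR^2/\epsilon^2)$ rate side by side, with depth, spectral norms, log factors, and constants all treated as $O(1)$ as the review suggests; the function names and the specific values of d, R, and eps are illustrative assumptions.

```python
# Dominant terms of the two sample-complexity bounds, with depth, spectral
# norms, log factors, and constants all treated as O(1).

def samples_prior(d, R, eps):
    # Previous state of the art: ~ d^2 R^2 / eps^2
    return d**2 * R**2 / eps**2

def samples_this_paper(d, R, eps):
    # Bound established in the reviewed paper: ~ d R^2 / eps^2
    return d * R**2 / eps**2

# Illustrative values: input dimension d, Frobenius-norm bound R, accuracy eps.
for d in (10**2, 10**4, 10**6):
    old = samples_prior(d, R=10.0, eps=0.1)
    new = samples_this_paper(d, R=10.0, eps=0.1)
    print(f"d={d:>9,}: prior ~{old:.1e} examples, new ~{new:.1e} (ratio = d = {old/new:.0e})")
```

The entire difference between the two rates is the single factor of d visible in the ratio; everything else in this comparison is held constant.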
Reviews: Generalization Bounds for Neural Networks via Approximate Description Length
This paper proposes a new framework for bounding the generalization error of fully connected neural nets. The authors are able to show that, for sufficiently smooth activation functions, the number of examples required to achieve a good generalization error scales sublinearly with the total number of parameters in the network. This is a significantly better bound than the previous state-of-the-art results. The analytical tools based on description length are very interesting, and could be applicable to the analysis of other multi-layer non-convex models. All three reviewers are uniformly enthusiastic about this work, which is guaranteed to attract a great deal of attention and to catalyze further research activity.
Approximate Description Length, Covering Numbers, and VC Dimension
Amit Daniely, Gal Katzhendler
Neural networks are a widely used tool nowadays, despite the lack of theoretical background supporting their ability to generalize well. Classical notions of learning guarantee generalization only if there are more examples than parameters. It is clear that a stronger assumption is needed to achieve tighter bounds, and indeed, different types of assumptions have been used to fill this empirical-theoretical gap, including assumptions on robustness to noise [2], the bias of the learning algorithm [5, 10], and norm bounds on the weight matrices [8, 9]. The idea of Approximate Description Length [4] was conceived as part of the line of research working under assumptions that bound the magnitude of the network's weight matrices.
Generalization Bounds for Neural Networks via Approximate Description Length
We investigate the sample complexity of networks with bounds on the magnitudes of their weights. In particular, we consider the class \[ H=\left\{W_t\circ\rho\circ \ldots\circ\rho\circ W_{1} : W_1,\ldots,W_{t-1}\in M_{d, d}, W_t\in M_{1,d}\right\} \] where the spectral norm of each $W_i$ is bounded by $O(1)$, the Frobenius norm is bounded by $R$, and $\rho$ is the sigmoid function $\frac{e^x}{1+e^x}$ or the smoothened ReLU function $\ln(1+e^x)$. We show that for any depth $t$, if the inputs are in $[-1,1]^d$, the sample complexity of $H$ is $\tilde O\left(\frac{dR^2}{\epsilon^2}\right)$. This bound is optimal up to log-factors, and substantially improves over the previous state of the art of $\tilde O\left(\frac{d^2R^2}{\epsilon^2}\right)$. We furthermore show that this bound remains valid if, instead of considering the magnitude of the $W_i$'s, we consider the magnitude of $W_i - W_i^0$, where the $W_i^0$ are some reference matrices with spectral norm $O(1)$. By taking the $W_i^0$ to be the matrices at the onset of the training process, we get sample complexity bounds that are sub-linear in the number of parameters in many typical regimes of parameters. To establish our results we develop a new technique to analyze the sample complexity of families $H$ of predictors. We start by defining a new notion of a randomized approximate description of functions $f:X\to\mathbb{R}^d$. We then show that if there is a way to approximately describe functions in a class $H$ using $d$ bits, then $d/\epsilon^2$ examples suffice to guarantee uniform convergence, namely, that the empirical loss of all the functions in the class is $\epsilon$-close to the true loss. Finally, we develop a set of tools for calculating the approximate description length of classes of functions that can be presented as a composition of linear function classes and non-linear functions.
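To ground the definitions, here is a minimal, self-contained Python sketch (ours, not the paper's code; every name in it is an illustrative assumption, and the spectral bound of 1.0 stands in for the $O(1)$ constraint) that builds a member of the class $H$ with the smoothened ReLU activation $\rho(x)=\ln(1+e^x)$, checks the spectral and Frobenius norm constraints that define membership, and evaluates the $\tilde O(dR^2/\epsilon^2)$ rate with log factors and constants dropped.

```python
import numpy as np

def smoothened_relu(x):
    # rho(x) = ln(1 + e^x), one of the two activations allowed in H.
    return np.logaddexp(0.0, x)

def make_network(weights, rho=smoothened_relu):
    # weights = [W_1, ..., W_t]: W_1..W_{t-1} are d x d, W_t is 1 x d.
    # Returns f(x) = W_t . rho( ... rho(W_1 x) ... ), as in the definition of H.
    def f(x):
        h = np.asarray(x, dtype=float)
        for W in weights[:-1]:
            h = rho(W @ h)
        return weights[-1] @ h
    return f

def in_class_H(weights, R, spectral_bound=1.0):
    # Membership test: spectral norm of each layer O(1) (here: <= spectral_bound)
    # and Frobenius norm <= R.
    return all(np.linalg.norm(W, ord=2) <= spectral_bound
               and np.linalg.norm(W, ord='fro') <= R
               for W in weights)

def sample_complexity(d, R, eps):
    # The paper's rate, up to log factors and constants: ~ d R^2 / eps^2.
    return d * R**2 / eps**2

# Toy usage: a depth-3 network on inputs in [-1, 1]^d.
d, t, R = 64, 3, 4.0
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((d, d)) * 0.05 for _ in range(t - 1)]
Ws.append(rng.standard_normal((1, d)) * 0.05)
f = make_network(Ws)
x = rng.uniform(-1.0, 1.0, size=d)
print("f(x) =", f(x), "| in H:", in_class_H(Ws, R),
      "| m ~", sample_complexity(d, R, eps=0.1))
```

The randomized compression scheme that actually certifies a short approximate description of such a network is the technical core of the paper and is not reproduced in this toy sketch.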