What Bigfoot hunters get right (and very wrong)
'Bigfooters' often employ credible scientific methods in their searches. Bigfoot remains firmly in the realm of cryptozoology, alongside the likes of the Loch Ness Monster. However, its pursuers are often not the stereotypical crackpots depicted across pop culture. According to two social scientists, they frequently rely on widely accepted, reliable methods and tools in their search for the elusive Sasquatch.
Provably data-driven projection method for quadratic programming
Anh Tuan Nguyen, Viet Anh Nguyen
Projection methods aim to reduce the dimensionality of an optimization instance, thereby improving the scalability of high-dimensional problems. Recently, Sakaue and Oki proposed a data-driven approach for linear programs (LPs), where the projection matrix is learned from observed problem instances drawn from an application-specific distribution of problems. We analyze the generalization guarantee of data-driven projection matrix learning for convex quadratic programs (QPs). Unlike in LPs, the optimal solutions of convex QPs are not confined to the vertices of the feasible polyhedron, which complicates the analysis of the optimal value function. To overcome this challenge, we use Carathéodory's theorem to show that the solutions of convex QPs can be localized within a feasible region corresponding to a special active set. Building on this observation, we propose the unrolled active set method, which models the computation of the optimal value as a Goldberg-Jerrum (GJ) algorithm with bounded complexity, thereby establishing learning guarantees. We then extend our analysis to other settings, including learning to match the optimal solution and the input-aware setting, where we learn a mapping from QP problem instances to projection matrices.
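The core projection idea in the abstract can be illustrated with a minimal sketch: substitute x = Py into the QP and solve the smaller problem over y. The instance data, the random choice of P, and the use of `scipy.optimize.minimize` are all illustrative assumptions; in the paper, P is learned from a distribution of problem instances rather than drawn at random.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Illustrative QP: min 0.5 x'Qx + c'x  s.t.  Ax <= b, in n dimensions.
n, m, k = 50, 30, 5            # original dim, #constraints, projected dim
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)    # positive definite, so the QP is convex
c = rng.standard_normal(n)
A = rng.standard_normal((m, n))
b = np.ones(m)                 # x = 0 (hence y = 0) is strictly feasible

# Projection matrix: random here for illustration; learned in the paper.
P = rng.standard_normal((n, k))

# Reduced QP over y in R^k, obtained by substituting x = P y.
Qr, cr, Ar = P.T @ Q @ P, P.T @ c, A @ P

obj = lambda y: 0.5 * y @ Qr @ y + cr @ y
cons = {"type": "ineq", "fun": lambda y: b - Ar @ y}  # enforces Ar y <= b
res = minimize(obj, np.zeros(k), jac=lambda y: Qr @ y + cr, constraints=cons)

x_hat = P @ res.x              # feasible (generally suboptimal) point in R^n
assert np.all(A @ x_hat <= b + 1e-6)
```

Because the reduced feasible set {Py : APy <= b} is contained in the original one, the reduced optimal value upper-bounds the true optimum; the paper's learning guarantees control how much is lost by this restriction on average over the instance distribution.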
Reviewer
We thank all reviewers for their insightful comments. We subsequently use this characterization to bound the train and test loss in Lemma F.2.

"Is mnist necessarily less complex than cifar10?" [R8] use spectrally normalized margin distributions to show

"...if drop-out or batch-norm have any influence on the obtained results" We will add another section on the effect of batch-norm in the revised version.
Spectrally-normalized margin bounds for neural networks
Peter L. Bartlett, Dylan J. Foster, Matus J. Telgarsky
This paper presents a margin-based multiclass generalization bound for neural networks that scales with their margin-normalized spectral complexity: their Lipschitz constant, meaning the product of the spectral norms of the weight matrices, times a certain correction factor. This bound is empirically investigated for a standard AlexNet network trained with SGD on the mnist and cifar10 datasets, with both original and random labels; the bound, the Lipschitz constants, and the excess risks are all in direct correlation, suggesting both that SGD selects predictors whose complexity scales with the difficulty of the learning task, and that the presented bound is sensitive to this complexity.
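The "product of spectral norms times a correction factor" quantity from the abstract can be sketched numerically. This is a minimal illustration, assuming random weights for a hypothetical three-layer network; the correction-factor form below (a (2,1)-norm ratio raised to the 2/3 power, summed, then raised to 3/2) follows the shape of the paper's bound, but the paper's exact statement also involves reference matrices and Lipschitz constants of the activations, omitted here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-layer network weights (illustrative shapes and init).
weights = [rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
           for d_in, d_out in [(784, 256), (256, 128), (128, 10)]]

def spectral_complexity(Ws):
    """Product of per-layer spectral norms (the Lipschitz constant of the
    linear parts) times a (2,1)-norm correction factor, in the spirit of
    the spectrally-normalized margin bound."""
    spec = [np.linalg.norm(W, 2) for W in Ws]   # largest singular values
    lip = np.prod(spec)                          # Lipschitz constant
    # (2,1) norm: sum of Euclidean norms of the columns of each W.
    corr = sum((np.linalg.norm(W, axis=0).sum() / s) ** (2 / 3)
               for W, s in zip(Ws, spec)) ** (3 / 2)
    return lip * corr
```

One useful sanity check on this form: scaling every weight matrix by a constant t multiplies the Lipschitz product by t^L (here t^3 for 3 layers) while leaving the correction factor unchanged, since it is a ratio of norms.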