layer neural network
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- Europe > France (0.05)
- North America > United States > California (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Switzerland (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Response for paper ID8808
We thank all reviewers for their thoughtful feedback. Please find detailed responses to your comments below. Thank you very much for carefully reading our paper and for your supportive comments. Theorem 2 is a lot to unpack.
Supplementary Material: Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time
In the main text, many algorithmic details were omitted or discussed only briefly.
A.1 Dataset Details
We expand upon the seven datasets used for our experiments in this section. The task is multi-class classification with a heavy class imbalance. It has 8 features, including price, day of the week, and units transferred. We discard instances with missing values.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
Uncovering Critical Sets of Deep Neural Networks via Sample-Independent Critical Lifting
Zhang, Leyang; Zhang, Yaoyu; Luo, Tao
This paper investigates the sample dependence of critical points for neural networks. We introduce a sample-independent critical lifting operator that associates a parameter of one network with a set of parameters of another, thus defining sample-dependent and sample-independent lifted critical points. We then show by example that previously studied critical embeddings do not capture all sample-independent lifted critical points. Finally, we demonstrate the existence of sample-dependent lifted critical points for sufficiently large sample sizes and prove that saddles appear among them.
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > Middle East > Jordan (0.04)