Generalizing in the Real World with Representation Learning
arXiv.org Artificial Intelligence
Machine learning (ML) formalizes the problem of getting computers to learn from experience as optimization of performance according to some metric(s) on a set of data examples. This is in contrast to requiring behaviour specified in advance (e.g. by hard-coded rules). Formalization of this problem has enabled great progress in many applications with large real-world impact, including translation, speech recognition, self-driving cars, and drug discovery. But practical instantiations of this formalism make many assumptions (for example, that data are i.i.d., independent and identically distributed) whose soundness is seldom investigated. And in making great progress in such a short time, the field has developed many norms and ad hoc standards, focused on a relatively small range of problem settings. As applications of ML, particularly in artificial intelligence (AI) systems, become more pervasive in the real world, we need to critically examine these assumptions, norms, and problem settings, as well as the methods that have become de facto standards. There is much we still do not understand about how and why deep networks trained with stochastic gradient descent are able to generalize as well as they do, why they fail when they do, and how they will perform on out-of-distribution data. In this thesis I cover some of my work towards better understanding deep net generalization, identify several ways assumptions and problem settings fail to generalize to the real world, and propose ways to address those failures in practice.
Oct-18-2022
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe
- Germany > North Rhine-Westphalia
- Upper Bavaria > Munich (0.04)
- Italy > Sardinia (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- North America
- Canada
- United States
- California
- San Francisco County > San Francisco (0.13)
- Santa Clara County > Palo Alto (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.13)
- New Jersey > Hudson County
- Secaucus (0.04)
- New York > New York County
- New York City (0.04)
- Washington (0.04)
- Genre:
- Overview (1.00)
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)
- Industry:
- Education (1.00)
- Government > Regional Government
- Health & Medicine
- Epidemiology (1.00)
- Pharmaceuticals & Biotechnology (1.00)
- Therapeutic Area
- Immunology (1.00)
- Infections and Infectious Diseases (1.00)
- Pulmonary/Respiratory Diseases (0.92)
- Information Technology > Security & Privacy (1.00)
- Law (0.92)
- Leisure & Entertainment (1.00)
- Media > Film (0.92)
- Technology:
- Information Technology > Artificial Intelligence
- Cognitive Science (1.00)
- Machine Learning
- Learning Graphical Models > Directed Networks
- Bayesian Learning (0.67)
- Neural Networks > Deep Learning (1.00)
- Performance Analysis > Accuracy (1.00)
- Statistical Learning (1.00)
- Representation & Reasoning
- Agents (1.00)
- Rule-Based Reasoning (0.92)
- Uncertainty > Bayesian Inference (0.67)