The Limits of Learning with Missing Data

Brian Bullins, Elad Hazan, Tomer Koren

Neural Information Processing Systems 

The primary objective of linear regression is to determine the relationships between multiple variables and how they may affect a certain outcome. A standard example is that of medical diagnosis, whereby the data gathered for a given patient provides information about their susceptibility to certain illnesses. A major drawback to this process is the work necessary to collect the data, as it requires running numerous tests for each person, some of which may be discomforting. In such cases it may be necessary to impose limitations on the amount of data available for each example.