Characterizing the robustness of Bayesian adaptive experimental designs to active learning bias

Sloman, Sabina J., Oppenheimer, Daniel M., Broomell, Stephen B., Shalizi, Cosma Rohilla

arXiv.org Machine Learning 

Bayesian adaptive experimental design is a form of active learning, which chooses samples to maximize the information they give about uncertain parameters. Prior work has shown that other forms of active learning can suffer from active learning bias, where unrepresentative sampling leads to inconsistent parameter estimates. We show that active learning bias can also afflict Bayesian adaptive experimental design, depending on model misspecification. We analyze the case of estimating a linear model, and show that worse misspecification implies more severe active learning bias. At the same time, model classes incorporating more "noise" -- i.e., specifying higher inherent variance in observations -- suffer less from active learning bias. Finally, we demonstrate empirically that insights from the linear model can predict the presence and degree of active learning bias in nonlinear contexts, namely in a (simulated) preference learning experiment.

Statistical theory often assumes learners' access to large amounts of representative training data, drawn from the distribution which is the target of inference or prediction. Nonetheless, such access is not feasible for many applications. Training data may be scarce (e.g., learning to identify a rare medical condition; Henry, Hager, Pronovost, and Saria (2015)), difficult or expensive to obtain (e.g., requiring human coders for text; Chen, Lasko, Mei, Denny, and Xu (2015)), or time-consuming to collect (e.g., obtaining user preferences online; Cavagnaro, Gonzalez, Myung, and Pitt (2013); Golovin, Krause, and Ray (2010)). One response is to abandon random sampling for adaptive sampling methods, choosing data points in sequence to be as informative as possible.
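To make the idea of choosing data points in sequence concrete, the following is a minimal sketch of adaptive design for a linear model. It is not the authors' implementation. It assumes a conjugate Gaussian prior on the weights, a known noise variance specified by the model, a finite grid of candidate designs, and features (1, x). Under those assumptions the expected information gain of a design has the closed form 0.5 * log(1 + x' S x / sigma^2), where S is the current posterior covariance. All function names and the quadratic `true_fn` used in the usage note are hypothetical.

```python
import numpy as np


def expected_information_gain(x, cov, noise_var):
    """Expected information gain (nats) about the weights from observing
    y at feature vector x, for a Gaussian linear model with known noise."""
    return 0.5 * np.log1p(x @ cov @ x / noise_var)


def posterior_update(mean, cov, x, y, noise_var):
    """Conjugate Gaussian update of the weight posterior after one (x, y)."""
    cov_x = cov @ x
    gain = cov_x / (noise_var + x @ cov_x)
    new_mean = mean + gain * (y - x @ mean)
    new_cov = cov - np.outer(gain, cov_x)
    return new_mean, new_cov


def run_adaptive_design(candidates, true_fn, noise_var, n_steps, rng):
    """Greedily pick the candidate with the largest expected information
    gain, observe a noisy response from true_fn, and update the posterior.

    Assumption: the same noise_var is used by the model and to simulate
    data; true_fn need not be linear, which is how misspecification enters.
    """
    features = np.column_stack([np.ones_like(candidates), candidates])
    mean = np.zeros(2)   # prior mean over (intercept, slope)
    cov = np.eye(2)      # prior covariance (assumed, for illustration)
    chosen = []
    for _ in range(n_steps):
        gains = [expected_information_gain(f, cov, noise_var) for f in features]
        i = int(np.argmax(gains))
        y = true_fn(candidates[i]) + rng.normal(scale=np.sqrt(noise_var))
        mean, cov = posterior_update(mean, cov, features[i], y, noise_var)
        chosen.append(candidates[i])
    return mean, cov, np.array(chosen)
```

For example, `run_adaptive_design(np.linspace(-1.0, 1.0, 21), lambda x: x**2, 1.0, 10, np.random.default_rng(0))` runs ten adaptive steps against a quadratic response that the linear model cannot represent. In this sketch the predictive variance x' S x is a convex quadratic in x, so the greedy rule always selects an endpoint of the candidate range. That is one simple way in which adaptively chosen samples can be unrepresentative of the design space. Note also that a larger `noise_var` lowers the expected information gain of every design.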
