Semi-supervised Active Linear Regression
Neural Information Processing Systems
Labeled data often comes at a high cost, as it may require recruiting human labelers or running costly experiments. At the same time, in many practical scenarios, one already has access to a partially labeled, potentially biased dataset that can help with the learning task at hand. Motivated by such settings, we formally initiate a study of "semi-supervised active learning" through the frame of linear regression. In this paper, we introduce an instance-dependent parameter called the reduced rank, denoted $R_X$, and propose an efficient algorithm with query complexity $O(R_X/\epsilon)$. This result directly implies improved upper bounds for two important special cases: (i) active ridge regression, and (ii) active kernel ridge regression, where the reduced rank equates to the "statistical dimension" $\mathsf{sd}_\lambda$ and the "effective dimension" $d_\lambda$ of the problem, respectively, and $\lambda \ge 0$ denotes the regularization parameter. Finally, we introduce a distributional version of the problem as a special case of the agnostic formulation we consider earlier; here, for every $X$, we prove a matching instance-wise lower bound of $\Omega(R_X/\epsilon)$ on the query complexity of any algorithm.
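The abstract does not define $\mathsf{sd}_\lambda$ and $d_\lambda$; for context, the standard definitions from the ridge-regression literature (stated here as background, not as the paper's own notation) are, for a design matrix $X$ with singular values $\sigma_1, \dots, \sigma_d$ and a kernel matrix $K$:

$$
\mathsf{sd}_\lambda(X) = \sum_{i=1}^{d} \frac{\sigma_i^2}{\sigma_i^2 + \lambda},
\qquad
d_\lambda(K) = \operatorname{tr}\!\left( K (K + \lambda I)^{-1} \right).
$$

Both quantities interpolate between the full rank at $\lambda = 0$ and $0$ as $\lambda \to \infty$, which is why regularization can reduce the query complexity below the ambient dimension.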