Collaborating Authors

 von Kleist, Henrik


Evaluation of Active Feature Acquisition Methods for Static Feature Settings

arXiv.org Machine Learning

Machine learning (ML) methods generally assume the ready availability of the complete set of input features at deployment, typically incurring little to no cost. However, this assumption does not hold universally, especially in scenarios where feature acquisitions are associated with substantial costs. In contexts like medical diagnostics, the cost of acquiring certain features, such as X-rays or biopsies, encompasses not only financial costs but also potential risks to patient well-being. In such cases, the cost or harm of a feature acquisition should be balanced against its predictive value. Active feature acquisition (AFA) addresses this problem by training two AI components: i) the "AFA agent," an AI system tasked with determining which features should be observed, and ii) an ML prediction model that undertakes the prediction task based on the acquired feature set. While missingness in the retrospective dataset was effectively determined by, for example, a physician, missingness at deployment is determined by the AFA agent, thereby leading to a missingness distribution shift. In our companion paper [1], we formulate the problem of active feature acquisition performance evaluation (AFAPE), which addresses the task of estimating, from the retrospective dataset, the performance an AFA agent would have at deployment. Consequently, upon solving the AFAPE problem, the physician will be well informed about the expected rates of incorrect diagnoses and the average costs associated with feature acquisitions when the AFA system is put into operation.
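
For intuition on the AFA setup described in the abstract, the following sketch illustrates the two components: an agent that decides which features to acquire (each at a cost) and a prediction model that then predicts from the partially observed feature vector. The class names, costs, and the greedy value-versus-cost acquisition rule are hypothetical illustrations, not the paper's method.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic "patient": 4 candidate features with acquisition costs.
true_features = rng.normal(size=4)                  # values revealed only if acquired
costs = np.array([1.0, 5.0, 20.0, 50.0])            # e.g. blood test < X-ray < biopsy
expected_value = np.array([0.5, 4.0, 25.0, 10.0])   # agent's estimate of predictive value

acquired = np.full(4, np.nan)                       # NaN marks "not acquired"

# Toy AFA agent: acquire a feature only if its estimated predictive value
# exceeds its acquisition cost (illustrative rule, not the paper's agent).
for j in range(4):
    if expected_value[j] > costs[j]:
        acquired[j] = true_features[j]

def predict(x_partial):
    """Toy prediction model operating on partially observed features,
    mean-imputing (with 0) the features that were not acquired."""
    x_imputed = np.where(np.isnan(x_partial), 0.0, x_partial)
    return float(1 / (1 + np.exp(-x_imputed.sum())))   # logistic score

print("acquired features:", acquired)
print("total acquisition cost:", costs[~np.isnan(acquired)].sum())
print("predicted risk:", predict(acquired))

Note how the missingness pattern in `acquired` is produced by the agent's own rule; at deployment this pattern generally differs from the physician-driven missingness in the retrospective data, which is exactly the distribution shift the AFAPE problem addresses.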


Evaluation of Active Feature Acquisition Methods for Time-varying Feature Settings

arXiv.org Machine Learning

Machine learning methods often assume that input features are available at no cost. However, in domains like healthcare, where acquiring features can be expensive or harmful, it is necessary to balance a feature's acquisition cost against its predictive value. The task of training an AI agent to decide which features to acquire is called active feature acquisition (AFA). By deploying an AFA agent, we effectively alter the acquisition strategy and trigger a distribution shift. To safely deploy AFA agents under this distribution shift, we present the problem of active feature acquisition performance evaluation (AFAPE). We examine AFAPE under i) a no direct effect (NDE) assumption, stating that acquisitions do not affect the underlying feature values, and ii) a no unobserved confounding (NUC) assumption, stating that retrospective feature acquisition decisions were based only on observed features. We show that one can apply offline reinforcement learning under the NUC assumption and missing data methods under the NDE assumption. When both NUC and NDE hold, we propose a novel semi-offline reinforcement learning framework, which requires a weaker positivity assumption and yields more data-efficient estimators. We introduce three novel estimators: a direct method (DM), an inverse probability weighting (IPW), and a double reinforcement learning (DRL) estimator.
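
As a rough illustration of how an IPW-style evaluation can work, here is a minimal sketch for a static, single-acquisition setting under the NUC assumption: retrospective samples whose acquisition decision matches the AFA agent's decision are reweighted by the inverse of the retrospective (e.g., physician-driven) acquisition propensity. The simulated data, the single-feature simplification, and all function names are assumptions for illustration; the paper's estimators cover the general time-varying case.

import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Synthetic retrospective dataset (all names and values are illustrative).
x = rng.normal(size=n)                       # always-observed covariate
p_retro = 1 / (1 + np.exp(-x))               # retrospective acquisition propensity P(A=1 | x)
a = rng.binomial(1, p_retro)                 # retrospective acquisition decisions (NUC holds)
y = (x + rng.normal(scale=0.5, size=n) > 0).astype(float)   # outcome to predict

acq_cost = 2.0                               # cost of acquiring the feature

def loss(x_obs, a_obs, y_obs):
    """Prediction loss plus acquisition cost for one sample.
    Toy rule: with the feature acquired, predict y = 1{x > 0}; otherwise guess 1."""
    pred = float(x_obs > 0) if a_obs == 1 else 1.0
    return float(pred != y_obs) + acq_cost * a_obs

def afa_policy(x_obs):
    """Hypothetical AFA agent: acquire the feature only when |x| is small,
    i.e. when the always-available covariate alone is not informative enough."""
    return int(abs(x_obs) < 1.0)

# IPW estimate of the cost the AFA agent would incur at deployment:
# keep retrospective samples whose acquisition decision agrees with the agent,
# and reweight them by the inverse retrospective propensity of that decision.
a_new = np.array([afa_policy(xi) for xi in x])
prop = np.where(a == 1, p_retro, 1 - p_retro)            # P(A = a | x) retrospectively
weights = (a == a_new) / prop
sample_costs = np.array([loss(xi, ai, yi) for xi, ai, yi in zip(x, a, y)])
ipw_estimate = np.mean(weights * sample_costs)

print("Estimated deployment cost of the AFA agent (IPW):", round(ipw_estimate, 3))

This toy estimator requires positivity (the retrospective propensity must be bounded away from 0 and 1 for every acquisition pattern the agent can choose); the semi-offline framework proposed in the paper is motivated by the fact that such positivity requirements can be weakened and the resulting estimators made more data efficient.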