Active Regression by Stratification

Neural Information Processing Systems

We propose a new active learning algorithm for parametric linear regression with random design. We provide finite-sample convergence guarantees for general distributions in the misspecified model. This is the first active learner for this setting that can provably improve over passive learning. Unlike in other learning settings (such as classification), in regression the passive learning rate of O(1/epsilon) cannot in general be improved upon. Nonetheless, the so-called 'constant' in the rate of convergence, which is characterized by a distribution-dependent risk, can be improved in many cases.

Reducing health inequities and increasing access to care using AI and blockchain


The Palmerston North-based Health Hub Project in New Zealand aims to reduce health inequities and increase access to care with the help of artificial intelligence, machine learning and blockchain. Project co-founder David Hill is a GP at the Health Hub Project, which runs four general practices with around 9000 patients. Hill says clinically trained people are a diminishing resource in healthcare and the system cannot rely on them alone to stay sustainable, so technology needs to be used to "balance that inequity of supply and demand". "The whole point of what we are doing is trying to make sure that we use IT in a way that allows or permits greater equity of access to patients and starts to reduce the reliance on the ever-dwindling resource of healthcare workers," he says. "Also, to advance the value proposition that we give to patients."

Redox stratification of an ancient lake in Gale crater, Mars


The observed variations in the chemical index of alteration (CIA) are consistent with modest short-term fluctuations in the ancient climate between cold, dry conditions and relatively warmer, wetter conditions. These changes occurred during the deposition of lake-bed mudstones in an environment that was conducive to the presence of a long-lived lake in Gale crater. We propose that the distinct properties of the two Murray facies were developed as a result of (i) fractionation of river-borne detritus into coarser, denser materials in shallow water close to shore and finer, lower density materials offshore in deeper water as a result of deceleration of river flow as it entered the lake; and (ii) redox stratification of the lake water body, caused by depth-dependent variations in the concentration of atmospheric oxidants and dissolved, groundwater-derived solutes, resulting in oxidizing conditions in shallow water and anoxia in deeper water. The addition of saline minerals during a later phase of brine migration through the section records longer-term changes in martian climate at Gale crater, perhaps driven by global atmospheric escape processes. The recognition of redox stratification in the lake in Gale crater adds new detail to our understanding of ancient martian aquatic environments.

Artificial Intelligence in Patient Risk Stratification and Care Coordination


Healthcare professionals, please help fuel the research. I know you have a lot to share about these new developments in artificial intelligence, so I would welcome your comments. A few years back, I served as a board member of a newly formed accountable care entity. This multi-hospital, multi-county entity spent significant time and effort developing a risk stratification and care coordination model. The goal was a model that would not only provide efficient and effective care but would also be based on evidence-based medicine.

Classifier Risk Estimation under Limited Labeling Resources

Machine Learning

In this paper we propose strategies for estimating the performance of a classifier when labels cannot be obtained for the whole test set. The number of test instances that can be labeled is very small compared to the size of the whole test set. The goal is then to obtain a precise estimate of classifier performance using as few labels as possible. Specifically, we ask how to select a subset of the large test set for labeling such that the classifier performance estimated on this subset is as close as possible to the performance on the whole test set. We propose strategies based on stratified sampling for selecting this subset. We show that these strategies can significantly reduce the variance of the estimated classifier accuracy compared to simple random sampling (by over 65% in several cases). Hence, our proposed methods are much more precise than random sampling for accuracy estimation under restricted labeling resources. The reduction in the number of samples required (compared to random sampling) to estimate the classifier accuracy to within 1% error is as high as 60% in some cases.
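The idea of stratified accuracy estimation can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not say how strata are formed or how labels are allocated, so the sketch assumes strata built from equal-width bins of a hypothetical classifier confidence score, proportional allocation of the labeling budget, and fully synthetic data. All names (`stratified_accuracy`, `simple_random_accuracy`, `label_fn`, `scores`) are illustrative.

```python
import random
import statistics


def stratified_accuracy(strata, budget, label_fn, rng):
    """Estimate accuracy from roughly `budget` labels, allocated proportionally.

    strata: list of lists of item ids (one list per stratum).
    label_fn: hypothetical labeling oracle; returns 1 if the classifier
              is correct on the item, else 0.
    """
    total = sum(len(s) for s in strata)
    estimate = 0.0
    for stratum in strata:
        if not stratum:
            continue
        # Proportional allocation; rounding means the number of labels
        # actually used can differ slightly from `budget`.
        n_h = max(1, round(budget * len(stratum) / total))
        n_h = min(n_h, len(stratum))
        sample = rng.sample(stratum, n_h)
        acc_h = sum(label_fn(i) for i in sample) / n_h
        # Weight each stratum's accuracy by its share of the test set.
        estimate += (len(stratum) / total) * acc_h
    return estimate


def simple_random_accuracy(items, budget, label_fn, rng):
    """Baseline: label a simple random sample and average."""
    sample = rng.sample(items, budget)
    return sum(label_fn(i) for i in sample) / budget


# Synthetic test set (an assumption for illustration only): the classifier
# is far more accurate on high-confidence items than on low-confidence ones.
rng = random.Random(0)
N = 10_000
scores = [rng.random() for _ in range(N)]
correct = [
    1 if rng.random() < (0.98 if s >= 0.5 else 0.55) else 0 for s in scores
]
items = list(range(N))
true_acc = sum(correct) / N


def label_fn(i):
    return correct[i]


# Four equal-width confidence bins serve as strata.
strata = [[] for _ in range(4)]
for i, s in enumerate(scores):
    strata[min(int(s * 4), 3)].append(i)

# Repeat both estimators many times to compare their spread.
trials, budget = 2000, 100
strat_estimates = [
    stratified_accuracy(strata, budget, label_fn, rng) for _ in range(trials)
]
srs_estimates = [
    simple_random_accuracy(items, budget, label_fn, rng) for _ in range(trials)
]

print("true accuracy:      ", round(true_acc, 4))
print("stratified variance:", statistics.variance(strat_estimates))
print("random variance:    ", statistics.variance(srs_estimates))
```

When accuracy differs strongly between strata, as in this synthetic setup, the stratified estimator is expected to show the smaller variance. The size of the gain depends entirely on how well the strata separate correct from incorrect predictions, so these numbers say nothing about the reductions reported in the paper.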