IBM Watson Health has formed a medical imaging collaborative with more than 15 leading healthcare organizations. The goal: To take on some of the most deadly diseases. The collaborative, which includes health systems, academic medical centers, ambulatory radiology providers and imaging technology companies, aims to help doctors address breast, lung, and other cancers; diabetes; eye health; brain disease; and heart disease and related conditions, such as stroke. Watson will mine insights from what IBM calls previously invisible unstructured imaging data and combine it with a broad variety of data from other sources, such as data from electronic health records, radiology and pathology reports, lab results, doctors' progress notes, medical journals, clinical care guidelines and published outcomes studies. As the work of the collaborative evolves, Watson's rationale and insights will evolve, informed by the latest combined thinking of the participating organizations.
Predicated on the increasing abundance of electronic health records, we investigate the problem of inferring individualized treatment effects using observational data. Stemming from the potential outcomes model, we propose a novel multi-task learning framework in which factual and counterfactual outcomes are modeled as the outputs of a function in a vector-valued reproducing kernel Hilbert space (vvRKHS). We develop a nonparametric Bayesian method for learning the treatment effects using a multi-task Gaussian process (GP) with a linear coregionalization kernel as a prior over the vvRKHS. The Bayesian approach allows us to compute individualized measures of confidence in our estimates via pointwise credible intervals, which are crucial for realizing the full potential of precision medicine. The impact of selection bias is alleviated via a risk-based empirical Bayes method for adapting the multi-task GP prior, which jointly minimizes the empirical error in factual outcomes and the uncertainty in (unobserved) counterfactual outcomes. We conduct experiments on observational datasets for an interventional social program applied to premature infants, and a left ventricular assist device applied to cardiac patients wait-listed for a heart transplant. In both experiments, we show that our method significantly outperforms the state-of-the-art.
We propose a "soft greedy" learning algorithm for building small conjunctions of simple threshold functions, called rays, defined on single real-valued attributes. We also propose a PAC-Bayes risk bound which is minimized for classifiers achieving a nontrivial tradeoff between sparsity (the number of rays used) and the magnitude ofthe separating margin of each ray. Finally, we test the soft greedy algorithm on four DNA micro-array data sets.
Finite mixture model is an important branch of clustering methods and can be applied on data sets with mixed types of variables. However, challenges exist in its applications. First, it typically relies on the EM algorithm which could be sensitive to the choice of initial values. Second, biomarkers subject to limits of detection (LOD) are common to encounter in clinical data, which brings censored variables into finite mixture model. Additionally, researchers are recently getting more interest in variable importance due to the increasing number of variables that become available for clustering. To address these challenges, we propose a Bayesian finite mixture model to simultaneously conduct variable selection, account for biomarker LOD and obtain clustering results. We took a Bayesian approach to obtain parameter estimates and the cluster membership to bypass the limitation of the EM algorithm. To account for LOD, we added one more step in Gibbs sampling to iteratively fill in biomarker values below or above LODs. In addition, we put a spike-and-slab type of prior on each variable to obtain variable importance. Simulations across various scenarios were conducted to examine the performance of this method. Real data application on electronic health records was also conducted.
Tandy J. Warnow Department of Computer Science University of Arizona Tucson AZ USA email: tandy cs, arizona, edu Abstract In an earlier paper, we described a new method for phylogenetic tree reconstruction called the Disk Covering Method, or DCM. This is a general method which can be used with an)' existing phylogenetic method in order to improve its performance, lCre showed analytically and experimentally that when DCM is used in conjunction with polynomial time distance-based methods, it improves the accuracy of the trees reconstructed. In this paper, we discuss a variant on DCM, that we call DCM2. DCM2 is designed to be used with phylogenetic methods whose objective is the solution of NPhard optimization problems. We also motivate the need for solutions to NPhard optimization problems by showing that on some very large and important datasets, the most popular (and presumably best performing) polynomial time distance methods have poor accuracy. Introduction 118 HUSON The accurate recovery of the phylogenetic branching order from molecular sequence data is fundamental to many problems in biology. Multiple sequence alignment, gene function prediction, protein structure, and drug design all depend on phylogenetic inference. Although many methods exist for the inference of phylogenetic trees, biologists who specialize in systematics typically compute Maximum Parsimony (MP) or Maximum Likelihood (ML) trees because they are thought to be the best predictors of accurate branching order. Unfortunately, MP and ML optimization problems are NPhard, and typical heuristics use hill-climbing techniques to search through an exponentially large space. When large numbers of taxa are involved, the computational cost of MP and ML methods is so great that it may take years of computation for a local minimum to be obtained on a single dataset (Chase et al. 1993; Rice, Donoghue, & Olmstead 1997). It is because of this computational cost that many biologists resort to distance-based calculations, such as Neighbor-Joining (NJ) (Saitou & Nei 1987), even though these may poor accuracy when the diameter of the tree is large (Huson et al. 1998). As DNA sequencing methods advance, large, divergent, biological datasets are becoming commonplace. For example, the February, 1999 issue of Molecular Biology and Evolution contained five distinct datascts of more than 50 taxa, and two others that had been pruned below that.