Big data, artificial intelligence and machine learning now underpin the technology stack of most industries. We all know how Amazon combines a customer's historical data with other customers' data to power recommendations. Likewise, Google has little difficulty predicting our preferences and interests. Both companies use big data, analytics and machine learning to process huge amounts of data, identify and analyze patterns, and then make predictions. Treatment of the most complicated diseases of the brain, the body's most important organ, is a clear beneficiary of this AI approach.
The task of decision-making under uncertainty is daunting, especially for problems of significant complexity. Healthcare policy makers across the globe face problems under challenging constraints, with limited tools to help them make data-driven decisions. In this work we frame the process of finding an optimal malaria policy as a stochastic multi-armed bandit problem, and implement three agent-based strategies to explore the policy space. We apply Gaussian Process regression to the findings of each agent, both for comparison and to account for the stochastic results of simulating the spread of malaria in a fixed population. The generated policy spaces are compared with published results to give a direct reference to human expert decisions for the same simulated population. Our novel approach provides a powerful resource for policy makers, and a platform that can be readily extended to capture more nuanced policy spaces in the future.
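To make the bandit framing concrete, here is a minimal sketch of one possible exploration strategy, epsilon-greedy. It is an assumption chosen for illustration and is not necessarily one of the three agent-based strategies in the work. The function `simulate_policy` and its reward values are hypothetical stand-ins for a stochastic malaria simulation, and the Gaussian Process regression step is left out.

```python
import random


def simulate_policy(arm, rng):
    """Hypothetical stand-in for one stochastic simulation run.

    Each arm is a candidate policy; the mean rewards below are
    illustrative values only, not results from any malaria model.
    """
    true_means = [0.2, 0.5, 0.8, 0.4]
    return true_means[arm] + rng.gauss(0.0, 0.1)


def epsilon_greedy(n_arms, n_rounds, epsilon=0.1, seed=0):
    """Explore the policy space with an epsilon-greedy agent.

    Returns the number of pulls per arm and the running mean reward
    observed for each arm.
    """
    rng = random.Random(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for _ in range(n_rounds):
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            arm = untried[0]              # try every policy once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)   # explore a random policy
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit
        reward = simulate_policy(arm, rng)
        counts[arm] += 1
        # Incremental update of the running mean for this arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    counts, means = epsilon_greedy(n_arms=4, n_rounds=2000)
    print("pulls per policy:", counts)
    print("estimated rewards:", [round(m, 2) for m in means])
```

Because each simulation run is noisy, the agent keeps a running mean per policy rather than trusting any single outcome. A regression model fitted over these observations could then smooth the estimates across the policy space.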
Inductive generalization rules for structured attributes have been developed that take into consideration the type of nodes in the domain hierarchy (anchor or non-anchor) and the type of decision rules to be generated (characteristic, discriminant or minimum complexity). These generalization rules enhance the ability of the knowledge discovery system INLEN-2 to exploit the semantic content of the domain knowledge in the process of generating hypotheses. If the dependent attribute (e.g., a decision attribute) is structured, the system generates a system of hierarchically organized rules representing relationships between the values of this attribute and independent attributes. Such a situation often occurs in practice when the decision to be assigned to a situation can be at different levels of abstraction (e.g., this is a liver disease, or this is a liver cancer). Continuous attributes (e.g., physical measurements) are quantized into a hierarchy of values (ranges of values arranged into different levels).
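The quantization of a continuous attribute into a hierarchy of ranges can be sketched as follows. This is only an illustration of the general idea, not the system's actual procedure: the attribute (body temperature in degrees Celsius), the cut points and the labels are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical two-level hierarchy for body temperature in degrees Celsius.
# Each level is (cut points, labels), with len(labels) == len(cuts) + 1.
# The first level is coarse; the second refines it into narrower ranges.
HIERARCHY = [
    ([36.0, 37.5], ["low", "normal", "high"]),
    ([35.0, 36.0, 37.5, 39.0],
     ["very low", "slightly low", "normal", "slightly high", "very high"]),
]


def quantize(value, hierarchy=HIERARCHY):
    """Map a continuous value to one label per level, coarse to fine."""
    return [labels[bisect_right(cuts, value)] for cuts, labels in hierarchy]


if __name__ == "__main__":
    for reading in (34.0, 36.6, 38.2):
        print(reading, "->", quantize(reading))
```

A rule learner could then state a condition at whichever level of abstraction fits the data, for example "high" at the coarse level or "slightly high" at the finer one.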
We've just added several publicly available healthcare datasets to the collection of public datasets on Google BigQuery (the cloud-native data warehouse for analytics at petabyte scale), including RxNorm (maintained by NLM) and the Healthcare Common Procedure Coding System (HCPCS) Level II. While it's not technically a healthcare dataset, we also added the 2000 and 2010 Decennial census counts broken down by age, gender and ZIP Code Tabulation Area, which we hope will assist healthcare utilization and population health analysis (as we'll discuss below). Anyone with a Google Cloud Platform (GCP) account can explore these datasets. RxNorm was created by the U.S. National Library of Medicine (NLM) to provide a normalized naming system for clinical drugs and to supply structured information such as brand names, ingredients and so on for each drug. Drug information is made available as a single "concepts" table, while the relationships that map entities to each other (ingredient to brand name, for example) are made available as a separate "relationships" table.
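The split between a "concepts" table and a "relationships" table means that most lookups involve a join. The sketch below shows that pattern using Python's built-in `sqlite3` module in place of BigQuery, so it runs locally without an account. The column names, identifiers, relation label and sample rows are all hypothetical and do not reflect the actual schema of the public dataset.

```python
import sqlite3

# In-memory SQLite database standing in for the two-table layout.
# Column names and rows are illustrative assumptions, not the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concepts (concept_id INTEGER PRIMARY KEY, name TEXT, kind TEXT);
CREATE TABLE relationships (from_id INTEGER, to_id INTEGER, relation TEXT);
INSERT INTO concepts VALUES
    (1, 'acetaminophen', 'ingredient'),
    (2, 'Tylenol', 'brand_name'),
    (3, 'ibuprofen', 'ingredient'),
    (4, 'Advil', 'brand_name');
INSERT INTO relationships VALUES
    (1, 2, 'has_brand_name'),
    (3, 4, 'has_brand_name');
""")


def brand_names_for(ingredient):
    """Return brand names linked to an ingredient through the relationships table."""
    rows = conn.execute(
        """
        SELECT b.name
        FROM concepts AS i
        JOIN relationships AS r ON r.from_id = i.concept_id
        JOIN concepts AS b ON b.concept_id = r.to_id
        WHERE i.name = ? AND r.relation = 'has_brand_name'
        ORDER BY b.name
        """,
        (ingredient,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(brand_names_for("acetaminophen"))
```

The concepts table is joined to itself through the relationships table: once as the ingredient and once as the brand name. The same shape of query should carry over to a data warehouse once the real table and column names are substituted.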
I've written before about the apparent shift in healthcare whereby making sense of the vast quantities of data produced within the system is key to the successful treatment of patients. Nowhere is this more so than in cancer care. For instance, a team from UCL used deep learning earlier this year to identify cancer cells more accurately. A second study continues this trend by aiming to make sense of the cancer data currently sitting in the cancer registry program coordinated by the National Cancer Institute (NCI) and the Centers for Disease Control and Prevention. This database holds records of cancer incidence across the US, but curating it can be a hugely time-intensive process, as each file requires manual editing and annotation by experts.