New healthcare and population datasets now available in Google BigQuery


We've just added several publicly available healthcare datasets to the collection of public datasets on Google BigQuery (the cloud-native data warehouse for analytics at petabyte scale), including RxNorm (maintained by the U.S. National Library of Medicine, NLM) and the Healthcare Common Procedure Coding System (HCPCS) Level II. While it's not technically a healthcare dataset, we also added the 2000 and 2010 Decennial Census counts broken down by age, gender and ZIP Code Tabulation Area (ZCTA), which we hope will assist healthcare utilization and population health analysis (as we'll discuss below). Anyone with a Google Cloud Platform (GCP) account can explore these datasets.

RxNorm was created by NLM to provide a normalized naming system for clinical drugs, along with structured information such as brand names and ingredients for each drug. Drug information is made available as a single "concepts" table, while the relationships that map entities to each other (ingredient to brand name, for example) are made available as a separate "relationships" table.
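To make the concepts/relationships split more concrete, here is a minimal sketch of how a lookup across the two tables might work. It runs against a small in-memory SQLite database rather than BigQuery, and everything about the schema is an assumption made for illustration: the table names, the column names (concept_id, name, term_type, source_id, target_id, relation), the relation label and the numeric IDs are all hypothetical and are not the actual layout of the public dataset or real RxNorm identifiers.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a toy concepts/relationships layout.

    The schema and IDs are hypothetical; they only mirror the idea of
    one table of drug concepts plus one table linking concepts together.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE concepts (
            concept_id INTEGER PRIMARY KEY,
            name       TEXT,
            term_type  TEXT
        );
        CREATE TABLE relationships (
            source_id INTEGER,
            target_id INTEGER,
            relation  TEXT
        );
        """
    )
    # Placeholder IDs; the drug names are only familiar examples.
    conn.executemany(
        "INSERT INTO concepts VALUES (?, ?, ?)",
        [
            (1, "acetaminophen", "ingredient"),
            (2, "Tylenol", "brand_name"),
            (3, "ibuprofen", "ingredient"),
            (4, "Advil", "brand_name"),
            (5, "Motrin", "brand_name"),
        ],
    )
    # Each row links an ingredient concept to one of its brand-name concepts.
    conn.executemany(
        "INSERT INTO relationships VALUES (?, ?, ?)",
        [
            (1, 2, "has_brand_name"),
            (3, 4, "has_brand_name"),
            (3, 5, "has_brand_name"),
        ],
    )
    return conn


def brand_names_for(conn, ingredient):
    """Return the brand names linked to an ingredient, sorted by name."""
    rows = conn.execute(
        """
        SELECT brand.name
        FROM concepts AS ing
        JOIN relationships AS rel ON rel.source_id = ing.concept_id
        JOIN concepts AS brand ON brand.concept_id = rel.target_id
        WHERE ing.name = ?
          AND ing.term_type = 'ingredient'
          AND rel.relation = 'has_brand_name'
        ORDER BY brand.name
        """,
        (ingredient,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(brand_names_for(demo, "ibuprofen"))
```

The point of the sketch is the shape of the query: the concepts table is joined to itself through the relationships table, once for the ingredient and once for the brand name. A query against the real dataset would follow the same pattern, but with the table and column names that the dataset actually uses.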