A compendium of data sources for data science, machine learning, and artificial intelligence