Learn techniques for acquiring data programmatically and for extracting that data. Finding your first dataset(s) to investigate might be the most important step toward achieving your goal of answering your questions. As mentioned earlier, you should first spend some time refining your question until it is specific enough to identify good data about but broad enough to be interesting to you and others. Alternatively, you might have a dataset you already find interesting, but no compelling question. If you don't already know and trust the data source, you should spend some time investigating it.
AzureML is the cloud-hosted machine learning platform built on top of Microsoft's cloud platform. Readers of Data Science Central may recall that AzureML has hosted a few webinars about the platform. This tutorial will walk you through integrating Python with AzureML. You are planning to move out of the place where you currently live and are looking for a new place that is similar to it. How will you decide where to go?
Python is a great language for data science. When working with large datasets that don't fit entirely in memory, we may need to use different approaches. In this talk we will discuss Python libraries that are well suited to working with large time series datasets in a pandas-like way, including dask and vaex. We shall also explore how to parallelize computation in Python, covering the differences between threading and multiprocessing, and wrappers like concurrent.futures. Finally, we shall talk about using Celery to distribute tasks.
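As a rough illustration of the concurrent.futures wrapper mentioned above, here is a minimal sketch that splits a sequence into chunks and processes the chunks in a worker pool. The helper names (`chunked`, `chunk_sum`, `parallel_total`) and the chunk size are hypothetical and not taken from the talk. The per-chunk work is a plain sum standing in for whatever computation is actually needed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def chunked(values: Sequence[float], size: int) -> List[Sequence[float]]:
    """Split a sequence into consecutive chunks of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def chunk_sum(chunk: Sequence[float]) -> float:
    """Hypothetical per-chunk work; a stand-in for a real computation."""
    return sum(chunk)


def parallel_total(values: Sequence[float], size: int,
                   executor_cls=ThreadPoolExecutor,
                   max_workers: int = 4) -> float:
    """Process each chunk in a worker pool, then combine the partial results.

    `executor_cls` is an assumption of this sketch: any class exposing the
    concurrent.futures.Executor interface can be passed in.
    """
    chunks = chunked(values, size)
    with executor_cls(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        partials = list(pool.map(chunk_sum, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_total(list(range(1000)), size=100))
```

Because both pool types share the same Executor interface, passing `ProcessPoolExecutor` as `executor_cls` should switch the sketch from threads to processes. In standard CPython, threads tend to help with I/O-bound work while processes tend to help with CPU-bound work, since the global interpreter lock keeps threads from running Python bytecode at the same time. With processes, the worker function must be picklable and the entry point should stay under the `if __name__ == "__main__":` guard.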