
Guide For Data Exploration In Python Using NumPy, Matplotlib, Pandas


Exploring data sets and developing a deep understanding of the data is one of the most important skills every data scientist should possess. People estimate that the time spent on these activities can reach as much as 80% of the project time in some cases.
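As a rough illustration of what a first exploration pass can look like, here is a minimal sketch using NumPy and pandas. The data frame and its column names (`age`, `income`, `segment`) are invented for this example and do not come from the guide itself.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are invented for illustration.
df = pd.DataFrame({
    "age": [23, 35, 41, np.nan, 29],
    "income": [32000, 54000, 61000, 48000, np.nan],
    "segment": ["a", "b", "b", "a", "c"],
})

# Shape and column types give a first overview of the data set.
n_rows, n_cols = df.shape
dtypes = df.dtypes

# Count missing values per column.
missing = df.isna().sum()

# Summary statistics for the numeric columns (NaN values are skipped).
summary = df.describe()

# Frequency of each category in a categorical column.
segment_counts = df["segment"].value_counts()

# Mean of a numeric column within each category.
income_by_segment = df.groupby("segment")["income"].mean()

print(n_rows, n_cols)
print(missing)
print(summary)
```

Checking shape, missing values, and per-group summaries first tends to surface data quality problems before any modelling starts.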

Deploying Machine Learning apps in production (x-post from r/python) • r/MachineLearning


Reposting because I didn't get any replies. Can anyone point me to a tutorial/book which teaches you how to deploy ML-based services in production using Flask/Django? All I have found is a bunch of blogs which don't really dive deep into the nitty-gritty aspects of server-side development. It gets very difficult for web-dev noobs like myself who need to move into the intermediate aspects of the field after getting familiar with the algorithms. I'd like something that explains scalable server-side development from a machine learning perspective and doesn't shy away from the details.

Python Scripting


The Python scripting extension provides an "Execute Python" operator that lets you seamlessly execute Python code within a RapidMiner process. Data made available as input to the operator is transferred to Python, the specified Python code is executed, and any outputs specified in the Python script are made available again in RapidMiner. Any Python object can be returned, stored within RapidMiner repositories, and made available for later use by another "Execute Python" operator.
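A script for such an operator might look roughly like the sketch below. It assumes that the operator calls an entry-point function named `rm_main` and passes the input data as a pandas DataFrame; both points are assumptions here and should be checked against the extension's documentation. The `_z` column suffix is invented for this example.

```python
import pandas as pd


def rm_main(data):
    # Assumption: 'data' arrives as a pandas DataFrame from the input port.
    result = data.copy()

    # Add a standardized (z-score) copy of every numeric column.
    numeric_cols = result.select_dtypes(include="number").columns
    for col in numeric_cols:
        std = result[col].std()
        # Skip constant or single-row columns, where the z-score is undefined.
        if pd.notna(std) and std > 0:
            result[col + "_z"] = (result[col] - result[col].mean()) / std

    # The returned object is what would be handed back to the process.
    return result
```

Outside RapidMiner the function can be tried directly, for example with `rm_main(pd.DataFrame({"x": [1.0, 2.0, 3.0]}))`.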

Announcing Support for Native Editing of Jupyter Notebooks in VS Code Python


With today's October release of the Python extension, we're excited to announce the support of native editing of Jupyter notebooks inside Visual Studio Code! You can now directly edit .ipynb files. You can manage source control, open multiple files, and leverage productivity features like IntelliSense, Git integration, and multi-file management, offering a brand-new way for data scientists and developers to experiment and work with data efficiently. You can try out this experience today by downloading the latest version of the Python extension and creating/opening a Jupyter Notebook inside VS Code. Since the initial release of our data science experience in VS Code, one of the top features that users have requested has been a more notebook-like layout to edit their Jupyter notebooks inside VS Code.



This guide is a collection of distributed training examples (that can act as boilerplate code) and a tutorial of basic distributed TensorFlow. Many of the examples focus on implementing well-known distributed training schemes, such as those available in Distributed Keras, which were discussed in the author's blog post. Almost all the examples can be run on a single machine with a CPU, and all the examples use only data parallelism. The motivation for this guide stems from the current state of distributed deep learning. Deep learning papers typically demonstrate successful new architectures on some benchmark, but rarely show how these models can be trained with 1000x the data, which is usually the requirement in industry.
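To make the idea of data parallelism concrete, here is a minimal single-process sketch in plain Python. It is not taken from the guide and does not use TensorFlow: the "workers" are simulated by a loop, the model is a one-parameter linear fit, and the function names are invented for illustration. Each worker computes a gradient on its own shard of the data with the same parameter value, and the averaged gradient is applied once per step.

```python
def gradient(w, shard):
    """Gradient of the mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def data_parallel_step(w, shards, lr):
    # Each simulated worker computes a gradient on its own shard,
    # starting from the same copy of the parameter w.
    grads = [gradient(w, shard) for shard in shards]
    # Synchronous update: average the worker gradients, then apply once.
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad


# Toy data generated from y = 3 * x, split into two equal-sized shards.
data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]
shards = [data[:2], data[2:]]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)

print(round(w, 3))
```

With equal-sized shards, the averaged gradient matches the gradient over the full batch, so this synchronous scheme follows the same trajectory as single-machine training while splitting the per-step work across workers.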