Stone Soup: Cooking Up Custom Solutions with SQL Server Machine Learning

#artificialintelligence 

This article describes the machine learning services provided in SQL Server 2017, which support in-database use of the Python and R languages. The integration of SQL Server with open source languages popular for machine learning makes it easier to use the appropriate tool--SQL, Python, or R--for data exploration and modeling. R and Python scripts can also be used in T-SQL scripts or Integration Services packages, expanding the capabilities of ETL and database scripting. What has this to do with stone soup, you ask? It's a metaphor, of course, but one that captures the essence of why SQL Server works so well with Python and R. To illustrate the point, I'll provide a simple walkthrough of data exploration and modeling combining SQL and Python, using a food and nutrition analysis dataset from the US Department of Agriculture. You might have heard that data science is more of a craft than a science. Many ingredients have to come together efficiently, to process intake data and generate models and predictions that can be consumed by business users and end customers. However, what works well at the level of "craftsmanship" often has to change at commercial scale. Much like the home cook who has ventured out of the kitchen into a restaurant or food factory, big changes are required in the roles, ingredients, and processes. Moreover, cooking can no longer be a "one-man show;" you need the help of professionals with different specializations and their own tools to create a successful product or make the process more efficient. These specialists include data scientists, data developers and taxonomists, SQL developers, DBAS, application developers, and the domain specialists or end users who consume the results. Any kitchen would soon be chaos if the tools used by each professional were incompatible with each other, or if processes had to be duplicated and slightly changed at each step. What restaurant would survive if carrots chopped up at one station were unusable at the next?