How to automate machine learning on SQL Server 2019 big data clusters

#artificialintelligence

In this post, we will explore how to use automated machine learning (AutoML) to create new machine learning models over your data in SQL Server 2019 big data clusters. SQL Server 2019 big data clusters make it possible to use the software of your choice to fit machine learning models on big data and use those models to perform scoring. In fact, Apache Spark™, the popular open source big data framework, is now built in! Apache Spark™ includes the MLlib Machine Learning Library, and the open source community has developed a wealth of additional packages that integrate with and extend Apache Spark™ and MLlib. Manually selecting and tuning machine learning models requires familiarity with a variety of model types and can be laborious and time-consuming.


MLlib: Machine Learning in Apache Spark

arXiv.org Machine Learning

Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark's open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shipped with Spark, MLlib supports several languages and provides a high-level API that leverages Spark's rich ecosystem to simplify the development of end-to-end machine learning pipelines. MLlib has experienced rapid growth due to its vibrant open-source community of over 140 contributors, and includes extensive documentation to support further growth and to let users quickly get up to speed.


What's Driving Apache Spark Growth? SQL, Streaming and Machine Learning -- ADTmag

#artificialintelligence

Databricks Inc., the primary commercial steward behind the popular open source Apache Spark data processing framework for Big Data analytics, published a new report indicating the technology is still red-hot, driven by more use of SQL, streaming analytics and machine learning. The company this summer polled more than 900 organizations and solicited data from 1,615 respondents -- mostly Spark users -- coming from the ranks of data scientists, data engineers, architects and others, and last week published the results in the Apache Spark Survey 2016 Report (free download upon providing registration info). The report follows up on a similar survey last year, confirming the technology's widespread popularity as the most active open source project in the Big Data space. "As in 2015, which was a tremendous year in growth for Apache Spark, this year, too, its growth remains unabated -- not only in areas like the public cloud, but also with the increased use of Spark Streaming and the use of machine learning," the report states. "2016 also shows Spark's robust adoption across a variety of organizations and users from many functional roles to build complex solutions, using multiple Spark components."


H2O.ai Melds Machine Learning with Spark, Via Sparkling Water 2.0

#artificialintelligence

Jul. 01, 2016 -- In recent interviews here on OStatic, we have explored the efforts of H2O.ai, formerly known as 0xdata, which has steadily been carving out a niche with its open source software for big data analysis and machine learning. You can get the main H2O platform and Sparkling Water, a package that works with Apache Spark, by simply downloading them. You can run them on clusters powered by Amazon Web Services (AWS) and others for just a few hundred dollars, putting powerful artificial intelligence muscle in reach of everyone. Now, H2O.ai has announced the availability of Sparkling Water 2.0. Sparkling Water 2.0 builds off the popularity of Sparkling Water, H2O.ai's API for Apache Spark, with additional features and functionality.

