What is Azure Databricks?
Azure Databricks (documentation and user guide) was announced at Microsoft Connect, and in this post I'll try to explain its use case. At a high level, think of it as a tool for curating and processing massive amounts of data; developing, training, and deploying models on that data; and managing the whole workflow throughout the project. It is aimed at those who are comfortable with Apache Spark: it is built entirely on Spark and is extensible, with support for Scala, Java, R, and Python alongside Spark SQL, GraphX, Streaming, and the Machine Learning Library (MLlib). It has built-in integration with Azure Blob Storage, Azure Data Lake Storage (ADLS), Azure SQL Data Warehouse (SQL DW), Cosmos DB, Azure Event Hubs, Apache Kafka for HDInsight, and Power BI (see Spark Data Sources). Think of it as an alternative to HDInsight (HDI) and Azure Data Lake Analytics (ADLA).
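To make the Blob Storage integration a little more concrete, here is a minimal sketch in Python of how a Spark job typically addresses data in a storage account. It assumes the commonly used `wasbs://` URI scheme and the `fs.azure.account.key.<account>.blob.core.windows.net` Spark configuration key. The helper names (`blob_uri`, `account_key_conf`) and the account, container, and path values are hypothetical, chosen only for illustration.

```python
def blob_uri(container: str, account: str, path: str = "") -> str:
    """Build a wasbs:// URI for a path inside an Azure Blob Storage container.

    Hypothetical helper: it only formats the string and contacts no service.
    """
    return f"wasbs://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


def account_key_conf(account: str) -> str:
    """Return the Spark configuration key that holds the storage account key."""
    return f"fs.azure.account.key.{account}.blob.core.windows.net"


# Assumed usage inside a notebook, where a SparkSession named `spark` already exists.
# The account name, container, path, and key below are placeholders.
# spark.conf.set(account_key_conf("mystorageacct"), "<storage-account-key>")
# df = spark.read.csv(blob_uri("raw", "mystorageacct", "events/2017/"), header=True)
```

The helpers are kept free of any Spark import so the addressing convention can be read on its own; the commented lines show where they would plug into an ordinary `spark.read` call.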
Nov-21-2017, 08:45:17 GMT