AITopics | apache spark 2

apache spark 2

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Apache Spark Machine Learning Tutorial

#artificialintelligence Feb-22-2019, 03:02:17 GMT

Editor's Note: Download this Free eBook: Getting Started with Apache Spark 2.x – from Inception to Production. In this blog post, we will give an introduction to machine learning and deep learning, and we will go over the main Spark machine learning algorithms and techniques with some real-world use cases. The goal is to give you a better understanding of what you can do with machine learning. Machine learning is becoming more accessible to developers, and data scientists work with domain experts, architects, developers, and data engineers, so it is important for everyone to have a better understanding of the possibilities. Every piece of information that your business generates has the potential to add value. This overview is meant to provoke a review of your own data to identify new opportunities.

algorithm, learning, spark 2, (13 more...)

#artificialintelligence

Genre: Overview (0.89)

Industry: Banking & Finance (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.38)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)

Beginning Apache Spark 2 - Programmer Books

#artificialintelligence Feb-2-2019, 20:45:58 GMT

Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you'll discover resilient distributed datasets (RDDs), use Spark SQL for structured data, learn stream processing, and build real-time applications with Spark Structured Streaming. Furthermore, you'll learn the fundamentals of Spark ML for machine learning and much more.

artificial intelligence, data mining, machine learning, (5 more...)

#artificialintelligence

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.87)
Information Technology > Artificial Intelligence > Machine Learning (0.58)

Machine Learning with Apache Spark 2: 2-in-1 Udemy

@machinelearnbot May-26-2018, 00:30:38 GMT

Apache Spark lets you apply machine learning techniques to data in real time, giving users immediate machine-learning-based insights into what's happening right now. It's used to create machine learning models and programs that are distributed and much faster than standard machine learning toolkits such as R or Python. If you're a data professional who is familiar with machine learning and wants to use Apache Spark for developing efficient and fast machine learning systems, then this learning path is for you. This comprehensive 2-in-1 course teaches you to build machine learning systems, perform analytics, and make predictions with Apache Spark. You'll learn through practical demonstrations of use cases, clear explanations, and interesting real-world applications. Each section briefly establishes the theoretical basis for the topic under discussion and then cements your understanding with practical use cases.

apache spark, artificial intelligence, machine learning, (5 more...)

@machinelearnbot

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.55)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)

Learning Path: Data Science With Apache Spark 2

@machinelearnbot May-23-2018, 03:05:32 GMT

The real power and value proposition of Apache Spark is its speed and its platform for executing data processing and data science tasks. Let's see how easy it is! Packt's Video Learning Paths are a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it. Spark is one of the most widely used large-scale data processing engines and runs extremely fast. It is a framework whose tools are equally useful for application developers and data scientists.

artificial intelligence, data mining, machine learning, (10 more...)

@machinelearnbot

Genre:

Instructional Material > Course Syllabus & Notes (0.52)
Instructional Material > Online (0.40)

Industry:

Information Technology (0.70)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)
Information Technology > Artificial Intelligence > Machine Learning (0.37)
Information Technology > Data Science > Data Mining > Big Data (0.35)

Improve performance of ML pipelines for wide DataFrames in Apache Spark 2.3

#artificialintelligence Apr-15-2018, 01:46:55 GMT

Apache Spark MLlib's DataFrame-based API provides a simple, yet flexible and elegant framework for creating end-to-end machine learning pipelines. Leveraging the power of Spark's DataFrames and SQL engine, Spark ML pipelines make it easy to link together the phases of the machine learning workflow, from data processing, to feature extraction and engineering, to model training and evaluation. However, while Spark SQL can provide significant performance gains to some parts of the ML workflow, in other areas there are important shortcomings. One of these is that many of the most commonly used Spark ML components operate on a single column at a time. This particularly impacts the common use case of "wide" datasets, where there are many variables or features that typically need to be processed in the same manner (for example, encoding many categorical feature columns or discretizing many numerical feature columns).
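The single-column limitation described above can be illustrated with a small sketch in plain Python. It does not use Spark or any Spark API: the dataset, the bucket boundaries, and the two helper functions are all hypothetical and exist only to contrast one pass per column with a single pass over all columns when discretizing numeric features.

```python
from bisect import bisect_right

# Toy "wide" dataset: every row carries several numeric features.
rows = [
    {"age": 23, "income": 18_000, "visits": 2},
    {"age": 47, "income": 64_000, "visits": 9},
    {"age": 65, "income": 31_000, "visits": 5},
]

# Hypothetical bucket boundaries per column (illustrative values only).
splits = {
    "age": [30, 50],
    "income": [20_000, 50_000],
    "visits": [3, 6],
}


def bucketize_one_column(data, column, boundaries):
    """Single-column style: each call makes one full pass over the data."""
    return [
        {**row, f"{column}_bucket": bisect_right(boundaries, row[column])}
        for row in data
    ]


def bucketize_many_columns(data, boundaries_by_column):
    """Multi-column style: every column is handled in a single pass."""
    out = []
    for row in data:
        new_row = dict(row)
        for column, boundaries in boundaries_by_column.items():
            new_row[f"{column}_bucket"] = bisect_right(boundaries, row[column])
        out.append(new_row)
    return out


# One stage (and one pass over the rows) per column.
per_column = rows
for name, bounds in splits.items():
    per_column = bucketize_one_column(per_column, name, bounds)

# One stage for all columns.
single_pass = bucketize_many_columns(rows, splits)

# Both approaches produce the same rows; only the number of passes differs.
print(per_column == single_pass)
```

With three columns the difference is negligible, but for a dataset with hundreds of feature columns the per-column approach means hundreds of stages, which is the overhead the article is concerned with.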

apache spark 2, pipeline, transformer, (14 more...)

#artificialintelligence

Industry: Information Technology (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Feature Extraction (0.37)

Learning Path: Data Science With Apache Spark 2

@machinelearnbot Jan-17-2018, 02:33:32 GMT

artificial intelligence, data mining, machine learning, (10 more...)

@machinelearnbot

Genre:

Instructional Material > Course Syllabus & Notes (0.52)
Instructional Material > Online (0.40)

Industry:

Information Technology (0.70)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)
Information Technology > Artificial Intelligence > Machine Learning (0.37)
Information Technology > Data Science > Data Mining > Big Data (0.35)

Spark Summit Europe 2017 Keynote: Deep Learning and Streaming in Apache Spark 2.x

@machinelearnbot Oct-26-2017, 16:30:04 GMT

apache spark 2, deep learning and streaming, spark summit europe 2017, (1 more...)

@machinelearnbot

Country: Europe (0.69)

Industry: Media > News (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Deep Learning and Streaming in Apache Spark 2.x – Matei Zaharia & Sue Ann Hong

@machinelearnbot Oct-26-2017, 04:35:12 GMT

"2017 continues to be an exciting year for Apache Spark. I will talk about new updates in two major areas in the Spark community this year: stream processing with Structured Streaming, and deep learning with high-level libraries such as Deep Learning Pipelines and TensorFlowOnSpark. In both areas, the community is making powerful new functionality available in the same high-level APIs used in the rest of the Spark ecosystem (e.g., DataFrames and ML Pipelines), and improving both the scalability and ease of use of stream processing and machine learning."

artificial intelligence, machine learning, learning, (10 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > e-Commerce > Financial Technology (0.53)

Apache Spark 2 for Beginners - Udemy

@machinelearnbot Sep-23-2017, 15:56:08 GMT

No matter where you are in your coding journey, this course will get you up and running with Apache Spark, from installation and configuration to power user, with 5.5 hours of top-quality video tutorials. The first chapters are a step-by-step guide through the fundamentals of Spark programming, covering data frames, aggregations, and data sets. Next, you'll dive into what you can do with all the data you collect using Spark, filter results with R, and expose your data to Python for deeper processing and presentation using charts and graphs. After that, you go further into the capabilities of Spark's stream processing, machine learning, and graph processing libraries. The last chapter combines all the skills you learned from the preceding chapters to develop a real-world Spark application. By the end of this video, you will be able to consolidate data processing, stream processing, machine learning, and graph processing into one unified and highly interoperable framework with a uniform API using Scala or Python.

artificial intelligence, data mining, machine learning, (10 more...)

@machinelearnbot

Country:

North America > United States (0.07)
Europe > United Kingdom (0.07)
Asia > Singapore (0.07)
Asia > India (0.07)

Genre:

Instructional Material > Training Manual (0.59)
Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.50)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)
Information Technology > Data Science > Data Mining > Big Data (0.36)

Cost Based Optimizer in Apache Spark 2.2 - The Databricks Blog

@machinelearnbot Sep-5-2017, 08:35:03 GMT

This is a joint engineering effort between Databricks' Apache Spark engineering team (Sameer Agarwal and Wenchen Fan) and Huawei's engineering team (Ron Hu and Zhenhua Wang). Apache Spark 2.2 recently shipped with a state-of-the-art cost-based optimization framework that collects and leverages a variety of per-column data statistics (e.g., cardinality, number of distinct values, NULL values, max/min, average/max length, etc.) to improve the quality of query execution plans. Leveraging these statistics helps Spark make better decisions when picking the optimal query plan. Examples of these optimizations include selecting the correct build side in a hash-join, choosing the right join type (broadcast hash-join vs. shuffled hash-join), or adjusting a multi-way join order, among others. In this blog, we'll take a deep dive into Spark's Cost Based Optimizer (CBO), discuss how Spark collects and stores these statistics and optimizes queries, and show its performance impact on TPC-DS benchmark queries. At its core, Spark's Catalyst optimizer is a general library for representing query plans as trees and sequentially applying a number of optimization rules to manipulate them.
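To make the idea of statistics-driven join planning concrete, here is a minimal sketch in plain Python. It is not Spark's implementation: the `TableStats` class, the `plan_join` function, and the table figures are hypothetical. The only value taken from Spark itself is the 10 MB default of `spark.sql.autoBroadcastJoinThreshold`, used here as the cutoff for broadcasting the build side.

```python
from dataclasses import dataclass

# Default value of spark.sql.autoBroadcastJoinThreshold (10 MB).
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass(frozen=True)
class TableStats:
    """Hypothetical container for table-level statistics."""
    name: str
    row_count: int
    avg_row_bytes: int

    @property
    def size_bytes(self) -> int:
        # Estimated table size derived from the collected statistics.
        return self.row_count * self.avg_row_bytes


def plan_join(left, right, threshold=BROADCAST_THRESHOLD_BYTES):
    """Pick a join type and build side from estimated table sizes."""
    # The smaller input becomes the build side of the hash join.
    build = left if left.size_bytes <= right.size_bytes else right
    # Broadcast the build side only if it fits under the threshold.
    if build.size_bytes <= threshold:
        join_type = "broadcast hash-join"
    else:
        join_type = "shuffled hash-join"
    return join_type, build.name


# Illustrative statistics only; not taken from any benchmark.
sales = TableStats("sales", row_count=50_000_000, avg_row_bytes=120)
stores = TableStats("stores", row_count=2_000, avg_row_bytes=200)
customers = TableStats("customers", row_count=5_000_000, avg_row_bytes=150)

print(plan_join(sales, stores))
print(plan_join(sales, customers))
```

In this toy model, the small `stores` table is broadcast, while the join with `customers` falls back to a shuffled hash-join because even the smaller side exceeds the threshold. A real optimizer also accounts for filters, per-column statistics, and join ordering, which this sketch leaves out.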

artificial intelligence, information retrieval query processing, natural language, (16 more...)

@machinelearnbot

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
