First Steps in Machine Learning with Apache Spark
Apache Spark is one of the main tools for data processing and analysis in the Big Data context. It's a very complete (and complex) data processing framework, with functionality that can be roughly divided into four groups: Spark SQL & DataFrames, for general-purpose data processing; Spark Structured Streaming, for handling data streams; Spark MLlib, for machine learning and data science; and GraphX, the graph processing API. I've already featured the first two in other posts: creating an ETL process for a Data Warehouse and integrating Spark and Kafka for stream processing. Today it's time for the third one: let's play with machine learning using Spark MLlib. Machine Learning has a special place in my heart, because it was my entrance door to the data science field and, like probably many of you, I started with the classic Scikit-Learn library.
Jan-5-2023, 19:05:21 GMT