
Build and Deploy Scalable Machine Learning in Production with Kafka - DZone AI


Intelligent real-time applications are a game changer in any industry. Machine learning and its sub-topic, deep learning, are gaining momentum because machine learning allows computers to find hidden insights without being explicitly programmed where to look. This capability is needed for analyzing unstructured data, image recognition, speech recognition, and intelligent decision making. It is an important difference from traditional programming with Java, .NET, or Python. While the concepts behind machine learning are not new, the availability of big data sets and processing power allows every enterprise to build powerful analytic models.

1. Intro to Streams Apache Kafka Streams API


The Streams API of Apache Kafka is the easiest way to write mission-critical real-time applications and microservices with all the benefits of Kafka's server-side cluster technology. It allows you to build standard Java or Scala applications that are elastic, highly scalable, and fault-tolerant, and don't require a separate processing cluster technology. Applications can be deployed on containers, VMs, or bare-metal hardware, to the cloud or on-premises. The Confluent Platform manages the barrage of stream data and makes it available throughout an organization. It provides various industries, from retail, logistics and manufacturing, to financial services and online social networking, a scalable, unified, real-time data pipeline that enables applications ranging from large volume data integration to big data analysis with Hadoop to real-time stream processing.

Streaming data, simplified: Kafka Streams reaches GA


Wouldn't it be great if working with streaming data were just as simple as working with data at rest? And imagine if the two could be modeled, processed and coded against similarly; that would let organizations working with analytics broaden the scope of their work to do real-time streaming analytics too. We're not quite there yet, but Kafka Streams, a lightweight Java library that works with the Apache Kafka stream data platform, gets us closer by empowering mainstream Java developers. And today, with the release of Confluent Data Platform 3.0, Kafka Streams has reached general availability (it had been released in preview form in Confluent Data Platform 2.0).

How it works; where it's useful

At the risk of oversimplifying things, Kafka Streams makes streaming data look like a conventional table of key-value pairs (the data structure is called a KTable).
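The table view of a stream can be illustrated without Kafka at all. The following is a minimal sketch in plain Python, not the Kafka Streams API: it assumes a stream of (key, value) records in which a later record for a key replaces the earlier one and a `None` value removes the key. The names `materialize_table` and `changelog` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

# A record is (key, value); a value of None is treated as a delete marker.
Record = Tuple[str, Optional[str]]


def materialize_table(changelog: Iterable[Record]) -> Dict[str, str]:
    """Fold a stream of updates into a table holding the latest value per key."""
    table: Dict[str, str] = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)   # delete marker: remove the key if present
        else:
            table[key] = value     # upsert: a newer value replaces the older one
    return table


# Hypothetical stream of updates, in arrival order.
changelog = [
    ("alice", "Berlin"),
    ("bob", "Lima"),
    ("alice", "Rome"),   # replaces alice's earlier value
    ("bob", None),       # removes bob from the table
]

table = materialize_table(changelog)
print(table)  # {'alice': 'Rome'}
```

Reading the same records as a stream yields four events, while reading them as a table yields one row, which is the sense in which streaming data can "look like" a table.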



This project contains examples which demonstrate how to deploy analytic models to mission-critical, scalable production environments using Apache Kafka and its Streams API. Examples will include analytic models built with TensorFlow, Keras, H2O, Python, DeepLearning4J and other technologies. More sophisticated use cases around Kafka Streams and other technologies will be added over time. The code is developed and tested on Mac and Linux operating systems. Because Kafka is not well supported on Windows, the code is not tested there at all.

Apache Kafka Online Training Kafka Certification Course Edureka


You have to build a system that is consistent by nature. For example, if you receive product feeds through flat files or an event stream, you have to make sure you don't lose any product events, especially those for inventory and price. Price and availability should always be consistent, because a product may have sold out, the seller may no longer want to sell it, or something else may have changed. However, attributes like name and description do little harm if they are not updated on time. John wants to build an e-commerce portal like Amazon, Flipkart or Paytm.
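The distinction between updates that must not be lost (price, inventory) and updates that can tolerate delay (name, description) can be sketched as follows. This is an illustrative sketch in plain Python under stated assumptions, not a Kafka delivery mechanism: the `ProductEvent` type, the `apply_events` function, the `write` callback, and the retry count are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Assumption: these are the attributes whose updates must never be lost.
CRITICAL_ATTRIBUTES = {"price", "inventory"}


@dataclass(frozen=True)
class ProductEvent:
    product_id: str
    attribute: str
    value: str


def apply_events(
    events: Iterable[ProductEvent],
    write: Callable[[ProductEvent], None],
    max_attempts: int = 3,
) -> List[ProductEvent]:
    """Apply events through `write`, retrying only the critical ones.

    Returns the best-effort events that were dropped after a failed write.
    Raises RuntimeError if a critical event cannot be written at all.
    """
    dropped: List[ProductEvent] = []
    for event in events:
        critical = event.attribute in CRITICAL_ATTRIBUTES
        attempts = max_attempts if critical else 1
        for _ in range(attempts):
            try:
                write(event)
                break  # written successfully, move on to the next event
            except OSError:
                continue  # transient failure, retry if attempts remain
        else:
            # Every attempt failed for this event.
            if critical:
                raise RuntimeError(f"lost critical update: {event}")
            dropped.append(event)
    return dropped


# Usage with an in-memory dictionary standing in for the product catalog.
catalog = {}


def write_to_catalog(event: ProductEvent) -> None:
    catalog[(event.product_id, event.attribute)] = event.value


dropped = apply_events(
    [
        ProductEvent("p1", "price", "9.99"),
        ProductEvent("p1", "description", "Blue mug"),
    ],
    write_to_catalog,
)
```

Failing loudly on a lost price or inventory update, while merely recording a dropped description, mirrors the priority described above; a real system would more likely rely on the guarantees of its messaging layer than on a hand-written retry loop.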