With the release of Apache Kafka 1.0 this week, an eight-year journey is finally coming to a temporary end. Temporary, because the project will continue to evolve, with near-term bug fixes and long-term feature updates. But for Neha Narkhede, Chief Technology Officer of Confluent, this release is the culmination of work towards a vision she and a team of engineers first laid out in 2009. Back then, a team at LinkedIn decided it had the solution to a major data stream processing problem. Narkhede said the originators of Kafka began the project by sitting down and trying to understand why stream processing companies founded in the 1990s and 2000s had failed.
The relationship between Apache Kafka and machine learning (ML) is an interesting one that I've written about quite a bit in How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka and Using Apache Kafka to Drive Cutting-Edge Machine Learning. This blog post addresses a specific part of building a machine learning infrastructure: the deployment of an analytic model in a Kafka application for real-time predictions. Model training and model deployment can be two separate processes. However, you can also use many of the same steps for integration and data preprocessing because you often need to perform the same integration, filter, enrichment, and aggregation of data for model training and model inference. We will discuss and compare two different options for model deployment: model servers with remote procedure calls (RPCs), and natively embedding models into Kafka client applications.
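The two deployment options can be sketched in a few lines. This is a minimal illustration in plain Python with no Kafka dependency: `embedded_model`, `ModelServerClient`, and `score_stream` are hypothetical names invented for this sketch, the "model" is a toy linear scorer, and the RPC client only simulates a remote call by counting round trips. It is not how any particular model server or Kafka client is implemented.

```python
from typing import Callable, Iterable, Iterator

Features = dict[str, float]
Model = Callable[[Features], float]


def embedded_model(features: Features) -> float:
    """Hypothetical model loaded directly into the streaming application."""
    # Toy linear scorer standing in for a trained analytic model.
    return 0.5 * features["amount"] + 0.5 * features["velocity"]


class ModelServerClient:
    """Stand-in for an RPC client; a real one would make a network call."""

    def __init__(self, remote_model: Model) -> None:
        self._remote_model = remote_model
        self.calls = 0

    def predict(self, features: Features) -> float:
        self.calls += 1  # every prediction costs one simulated round trip
        return self._remote_model(features)


def score_stream(events: Iterable[Features], predict: Model) -> Iterator[float]:
    """Apply the shared preprocessing, then score each event with `predict`."""
    for event in events:
        if event["amount"] <= 0:  # filter step used by both deployment options
            continue
        yield predict(event)


events = [
    {"amount": 100.0, "velocity": 5.0},
    {"amount": 0.0, "velocity": 1.0},  # dropped by the filter
    {"amount": 50.0, "velocity": 10.0},
]

client = ModelServerClient(embedded_model)
embedded_scores = list(score_stream(events, embedded_model))  # in-process inference
rpc_scores = list(score_stream(events, client.predict))       # one call per event
print(embedded_scores, rpc_scores, client.calls)
```

Both paths share the same filtering step and produce the same scores. The difference the sketch makes visible is that the RPC path pays one remote call per event, while the embedded path runs inference inside the application process.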
Intelligent real-time applications are a game changer in any industry. Machine learning and its sub-topic, deep learning, are gaining momentum because machine learning allows computers to find hidden insights without being explicitly programmed where to look. This capability is needed for analyzing unstructured data, image recognition, speech recognition, and intelligent decision making. It is an important difference from traditional programming with Java, .NET, or Python. While the concepts behind machine learning are not new, the availability of big data sets and processing power allows every enterprise to build powerful analytic models.
In our last Apache Kafka Tutorial, we discussed Kafka features. Today, in this Kafka Tutorial, we will look at 5 well-known Apache Kafka books. We have picked what we consider the 5 best Apache Kafka books, especially for big data professionals, and organized them to take you from complete novice to expert user. If you are looking for a career as a Kafka developer or Kafka professional, we are sure that these Apache Kafka books will help you a lot.
Stream Analytics helps you develop and deploy solutions that gain real-time insights from devices, sensors, and applications through real-time stream processing in the cloud. It lets you perform real-time analytics for Internet of Things solutions, stream millions of events per second, achieve mission-critical reliability and performance, deliver real-time dashboards and alerts over data from devices and applications, correlate across multiple streams of data, and develop in a SQL-based language. Stream Analytics customers deploy and monitor streaming jobs. Applications of stream analytics include personalized, real-time stock-trading analysis and alerts offered by financial services companies; real-time fraud detection; data and identity protection services; analysis of data generated by sensors and actuators; web clickstream analytics; customer relationship management (CRM) alerts; supply chain alerts; and transportation alerts. Apache Flink is an open source platform for distributed stream and batch data processing.
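To make the idea of real-time alerts over device data concrete, here is a minimal sketch of a tumbling-window aggregation in plain Python. It does not use Stream Analytics, Flink, or any streaming engine: the function name `tumbling_window_alerts`, the event layout `(timestamp_seconds, device_id, value)`, and the threshold are all assumptions made for illustration. A real engine would process events incrementally and handle late or out-of-order data, which this sketch ignores.

```python
from collections import defaultdict
from typing import Iterable

# Assumed event layout: (timestamp in seconds, device id, sensor reading).
Event = tuple[int, str, float]
Alert = tuple[int, str, float]


def tumbling_window_alerts(
    events: Iterable[Event], window_seconds: int, threshold: float
) -> list[Alert]:
    """Average readings per device in fixed, non-overlapping windows and
    return (window_start, device, average) where the average exceeds threshold."""
    sums: dict[tuple[int, str], float] = defaultdict(float)
    counts: dict[tuple[int, str], int] = defaultdict(int)

    for timestamp, device, value in events:
        # Align each event to the start of its window.
        window_start = timestamp - timestamp % window_seconds
        key = (window_start, device)
        sums[key] += value
        counts[key] += 1

    alerts: list[Alert] = []
    for key in sorted(sums):
        average = sums[key] / counts[key]
        if average > threshold:
            alerts.append((key[0], key[1], average))
    return alerts


readings = [
    (0, "sensor-a", 20.0),
    (30, "sensor-a", 30.0),
    (45, "sensor-b", 90.0),
    (70, "sensor-a", 80.0),
    (110, "sensor-a", 100.0),
]

alerts = tumbling_window_alerts(readings, window_seconds=60, threshold=50.0)
print(alerts)
```

With these sample readings, `sensor-b` triggers an alert in the first 60-second window and `sensor-a` triggers one in the second. Conceptually this is the same grouping a SQL-based streaming language expresses as an aggregate grouped by device and time window.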