The day when armies of business analysts can query incoming data in real time may be drawing closer. Supporting such continuous interactive queries is a goal of KSQL, software put forward this week by the Kafka data-streaming software originators at Confluent Inc. KSQL is a SQL engine that directly handles Apache Kafka data streams. Confluent also said KSQL is intended to broaden the use of Kafka beyond Java and Python, opening up Kafka programming to developers familiar with SQL. The form of SQL Confluent is using here is a dialect, however, one the company has developed to deal with the unique architecture of Kafka streaming. The software is appearing first as a developer preview, and it will be available under an Apache 2.0 license, according to the company. Created at LinkedIn, Kafka began life as a publish-and-subscribe messaging system that focused on handling log files as system events.
With the release of Apache Kafka 1.0 this week, an eight-year journey is finally coming to a temporary end. Temporary, because the project will continue to evolve, with bug fixes in the near term and feature updates in the long term. But for Neha Narkhede, Chief Technology Officer of Confluent, this release is the culmination of work towards a vision she and a team of engineers first laid out in 2009. Back then, a team at LinkedIn decided it had the solution to a major data stream processing problem. Narkhede said the originators of Kafka began building the project by sitting down and trying to understand why stream processing companies founded in the 1990s and 2000s had failed.
Confluent cofounders Neha Narkhede, CEO Jay Kreps and Jun Rao want to help companies use Kafka in the cloud. High-flying startup Confluent is bringing its open-source technology Apache Kafka to the cloud. In the years since its founders devised Kafka while at LinkedIn in 2010, the database streaming software has become one of tech's most popular ways to manage large amounts of data when it's needed fast. Investors have poured $80 million into Confluent, the company launched by Kafka's creators, valuing the buzzy startup at more than $530 million, according to data from PitchBook. As with any tech company built off an open-source project (just ask Docker, another high-flyer that recently brought on a third CEO), scaling a lasting and lucrative business off Kafka has trailed behind the popularity of its free version.
Wouldn't it be great if working with streaming data were just as simple as working with data at rest? And imagine if the two could be modeled, processed and coded against similarly; that would let organizations working with analytics broaden the scope of their work to do real-time streaming analytics too. We're not quite there yet, but Kafka Streams, a lightweight Java library that works with the Apache Kafka stream data platform, gets us closer by empowering mainstream Java developers. And today, with the release of Confluent Data Platform 3.0, Kafka Streams has reached general availability (it had been released in preview form in Confluent Data Platform 2.0).

How it works; where it's useful

At the risk of oversimplifying things, Kafka Streams makes streaming data look like a conventional table of key-value pairs (the data structure is called a KTable).
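
To make the table idea concrete, here is a minimal sketch in plain Java of how a stream of keyed updates can be folded into a table that holds the latest value for each key. It deliberately does not use the Kafka Streams API: the class `ChangelogTable`, the `Update` type and the `materialize` method are hypothetical names invented for this illustration, and treating a null value as a delete is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: plain Java, not the Kafka Streams API.
// All names here are hypothetical and chosen for this sketch.
final class ChangelogTable {

    // One keyed record arriving on a stream.
    static final class Update {
        final String key;
        final Long value; // null is treated as "delete this key" (assumption)

        Update(String key, Long value) {
            this.key = key;
            this.value = value;
        }
    }

    // Fold the stream into a table: each new record for a key
    // replaces the previous value, and a null value removes the key.
    static Map<String, Long> materialize(List<Update> stream) {
        Map<String, Long> table = new LinkedHashMap<>();
        for (Update u : stream) {
            if (u.value == null) {
                table.remove(u.key);
            } else {
                table.put(u.key, u.value);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<Update> stream = Arrays.asList(
            new Update("alice", 1L),
            new Update("bob", 5L),
            new Update("alice", 3L),  // overwrites alice's earlier value
            new Update("bob", null)); // removes bob from the table
        System.out.println(materialize(stream)); // prints {alice=3}
    }
}
```

The point of the sketch is the shape of the data: the stream is an ordered log of changes, while the table is the current state you get by replaying that log, which is why the two can be queried in similar ways.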
Apache Kafka, the open source streaming data platform, has gained immense momentum over the past few years, both in terms of its core technology and as an API standard for proprietary streaming data solutions. Yesterday saw the release of version 2.0.0 of this streaming data juggernaut and today Confluent, the company founded by Kafka's creators, is releasing its enterprise distribution of that release, in the form of Confluent Platform 5.0. In a briefing last week, Confluent Co-Founder and CTO, Neha Narkhede, explained to me how Confluent Platform is much more than the open source code and a support contract.