But it will also likely enable the new company to compete better not only against fellow Hadoop pioneer MapR Technologies, the other remaining independent vendor, but also against Amazon EMR and the Google Cloud Dataproc managed service in the cloud. Microsoft also offers a Hadoop-based managed service in its Azure cloud, although the Azure HDInsight technology is based on the Hortonworks platform. The plan, which was approved by the boards of both Cloudera and Hortonworks, calls for Cloudera shareowners to hold about 60% of the combined company. Still, both Cloudera CEO Tom Reilly and Hortonworks CEO Rob Bearden suggested the Cloudera-Hortonworks merger should be seen as a combination of equals. Reilly will serve as CEO after the merger is finalized, while Bearden will sit on the board of directors but will not have an operational role.
In some ways, Hortonworks is old-fashioned in that it still clings to the stretch goal of managing half of the world's data in an era when cloud object stores and bespoke analytic services are adding more alternatives to the mix. Hortonworks' aspirational goal may not be realistic, but never mind; there are bigger fish to fry. The underlying message from this year's North American DataWorks Summit and analyst briefings is that the company is competing in, and facing the challenges of navigating, a multipolar cloud world. My Big on Data bro Andrew Brust reported the headlines earlier in the week: Hortonworks is releasing the 3.0 version of its data platform, which, confusingly, is based on Hadoop 3.1. As we reported back at the start of the year, the 3.x generation of Apache Hadoop will mark a watershed in containerization and storage.
For data platform providers, Amazon is the ultimate frenemy. If you're trying to have a major cloud market presence, the Amazon cloud is almost impossible to avoid. So it's not surprising that Hadoop providers are increasingly getting friendly with Amazon Web Services and Microsoft Azure. For Hortonworks, roughly a quarter of its customers are deploying in the cloud for some or all of their workloads. Until now, its primary cloud presence has been as the Hadoop engine of Azure's HDInsight big data service.