Hadoop is the operating system for big data in the enterprise. So when Cloudera and Hortonworks, the two leading Hadoop distributions and vendors, merged, that was big news in and of itself. Last week's DataWorks Summit Europe was the first big public event for the new Cloudera after the merger, and it was not short of interesting news on both the technology and the business fronts. Cloudera is the name the new company will go by, and there is a new-ish logo and branding to go with it.
A few weeks ago, two giants of the big data Hadoop era, Cloudera and Hortonworks, announced they would be merging. The announcement claimed it would be a "merger of equals." It is fascinating to see these two groundbreaking pioneers coming together. I remember several years ago when they burst onto the technology scene. They promised to help leaders re-architect their data centers and information frameworks, and to substantially lower the cost of storing and processing data from the many thousands of dollars per terabyte that used to be common.
Big data means big business. Countless companies are digging into data acquisition, storage, analysis and trend-spotting on a scope and scale unlike anything ever undertaken before. Along with those companies, a new generation of software platforms, analysis tools and related professional skills and knowledge is presenting unparalleled opportunities for interesting and high-paying work for IT professionals with the "right stuff" to play on the big data field. Cloudera remains on our list as one of the top big data certification providers, and Hadoop as one of the top four big data platforms in use today. Cloudera is a company that specializes in megadata collections built around the Apache Hadoop platform to create what it calls "enterprise data hubs."
The merger also likely will enable the new company to compete better not only against fellow Hadoop pioneer MapR Technologies, the other remaining independent vendor, but also against Amazon EMR and the Google Cloud Dataproc managed service in the cloud. Microsoft also offers a Hadoop-based managed service in its Azure cloud, although the Azure HDInsight technology is based on the Hortonworks platform. The plan, which was approved by the boards of both Cloudera and Hortonworks, calls for Cloudera shareowners to hold about 60% of the combined company. Still, both Cloudera CEO Tom Reilly and Hortonworks CEO Rob Bearden suggested the Cloudera-Hortonworks merger should be seen as a combination of equals. Reilly will serve as CEO after the merger is finalized, while Bearden will be on the board of directors but will not have an operational role.
In the midst of a transitional year, Cloudera is announcing at the Strata London conference this week the general availability of version 6.0 of its platform after an extended beta. For the new release, Hadoop 3.0 is the star of the show. We reviewed Hadoop 3.0 at the beginning of the year. To recap, the Apache Hadoop 3.0 release marks a major watershed for the platform, as it starts to address the information lifecycle with a feature that has a very geeky name: erasure coding. Erasure coding is a key feature of established RAID technologies.
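For readers unfamiliar with the term, the idea behind erasure coding can be sketched with the simplest RAID-style scheme, a single XOR parity block. This is an illustration only, not how Hadoop implements the feature: production systems typically use stronger codes that survive more than one lost block. The function names and the sample data below are hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column, one byte from each.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors plus parity."""
    # XOR-ing the parity with every surviving block cancels them out,
    # leaving only the bytes of the block that was lost.
    return xor_parity(list(surviving_blocks) + [parity])


# Three hypothetical data blocks of equal size.
data = [b"hado", b"op 3", b".0!!"]
parity = xor_parity(data)

# Simulate losing the second block, then rebuild it from the rest.
rebuilt = recover_missing([data[0], data[2]], parity)
```

In this sketch, three blocks of data are protected by one extra parity block, so four blocks are stored in total, whereas keeping a full second copy of the same data would require six. That trade of extra computation for less storage is what makes the technique attractive for colder data.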