Coming in the midst of a transitional year, Cloudera is announcing at the Strata London conference this week the general release of version 6.0 of its platform after an extended beta. For the new release, Hadoop 3.0 is the star of the show. We reviewed Hadoop 3.0 at the beginning of the year. To recap, the Apache Hadoop 3.0 release marks a major watershed for the platform, as it starts to address the information lifecycle through a feature with a very geeky name: erasure coding. Erasure coding is a key feature of established RAID technologies.
Big data means big business. Countless companies are digging into data acquisition, storage, analysis and trend-spotting on a scope and scale unlike anything ever undertaken before. And along with those companies, a new generation of software platforms, analysis tools and related professional skills and knowledge is presenting unparalleled opportunities for interesting and high-paying work for IT professionals with the "right stuff" to play on the big data field. Cloudera remains on our list as one of the top big data certification providers, and Hadoop as one of the top four big data platforms in use today. Cloudera is a company that specializes in megadata collections built around the Apache Hadoop platform to create what it calls "enterprise data hubs."
A distributed file system, a MapReduce programming framework, and an extended family of tools for processing huge data sets on large clusters of commodity hardware, Hadoop has been synonymous with "big data" for more than a decade. But no technology can hold the spotlight forever. While Hadoop remains an essential part of the big data platforms, the major Hadoop vendors, namely Cloudera, Hortonworks, and MapR, have changed their platforms dramatically. Once-peripheral projects like Apache Spark and Apache Kafka have become the new stars, and the focus has turned to other ways to drill into data and extract insight. Let's take a brief tour of the three leading big data platforms, what each adds to the mix of Hadoop technologies to set it apart, and how they are evolving to embrace a new era of containers, Kubernetes, machine learning, and deep learning.