ETL, ELT and Data Hub: Where Is Hadoop the Right Fit?

@machinelearnbot 

DMX-h, Syncsort's ETL/DI product for Hadoop, runs natively on Hadoop and integrates closely with the MapReduce paradigm to perform high-volume batch ETL operations such as large joins and aggregations. Users therefore do not need to rip the data out of Hadoop, do the ETL, and put it back into Hadoop, as you described. DMX-h's ETL engine plugs in via Syncsort's contribution to the Apache open source community, patch MAPREDUCE-2454, which added a feature to the Hadoop MapReduce framework that allows alternative implementations of the sort phase. This is the same ETL engine Syncsort offers outside of Hadoop, and it uses the same graphical UI, so existing ETL developers and architects can move to ETL in Hadoop/MapReduce without needing Java or Pig expertise. The same lightweight DMX-h engine can extract data from disparate source systems (mainframe, RDBMS, files, etc.), pre-process, cleanse and validate it, load it into HDFS, and then run efficient, high-speed MapReduce ETL in Hadoop.
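To make the "alternative implementations of the sort phase" idea concrete, here is a toy sketch in plain Python. It is not Hadoop code and not DMX-h: `run_job`, `default_sorter`, `bucket_sorter`, `parse_sale` and `total` are all hypothetical names invented for illustration. It only shows the shape of the design, where the map and reduce logic stay untouched while the sort step between them is swapped out.

```python
from itertools import groupby
from operator import itemgetter
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, int]
Sorter = Callable[[Iterable[Pair]], List[Pair]]


def default_sorter(pairs: Iterable[Pair]) -> List[Pair]:
    """Stand-in for the framework's built-in sort: order pairs by key."""
    return sorted(pairs, key=itemgetter(0))


def bucket_sorter(pairs: Iterable[Pair]) -> List[Pair]:
    """Hypothetical alternative sort: bucket values by key, then emit keys in order.

    It stands in for an external engine; the only contract it must honor is
    that pairs sharing a key come out next to each other.
    """
    buckets: Dict[str, List[int]] = {}
    for key, value in pairs:
        buckets.setdefault(key, []).append(value)
    return [(key, value) for key in sorted(buckets) for value in buckets[key]]


def run_job(
    records: Iterable[str],
    mapper: Callable[[str], Iterable[Pair]],
    reducer: Callable[[str, List[int]], int],
    sorter: Sorter = default_sorter,
) -> Dict[str, int]:
    """Toy map -> sort -> reduce pipeline with a pluggable sort step."""
    mapped = [pair for record in records for pair in mapper(record)]
    ordered = sorter(mapped)  # the only step that changes between runs
    return {
        key: reducer(key, [value for _, value in group])
        for key, group in groupby(ordered, key=itemgetter(0))
    }


def parse_sale(record: str) -> Iterable[Pair]:
    """Map step: turn a 'region,amount' line into a (region, amount) pair."""
    region, amount = record.split(",")
    return [(region, int(amount))]


def total(_region: str, amounts: List[int]) -> int:
    """Reduce step: aggregate all amounts for one region."""
    return sum(amounts)


if __name__ == "__main__":
    sales = ["east,10", "west,5", "east,7", "west,1", "north,3"]
    print(run_job(sales, parse_sale, total))
    print(run_job(sales, parse_sale, total, sorter=bucket_sorter))
```

Both calls give the same per-region totals, which is the point: the job definition does not change when the sort implementation does. In Hadoop itself the equivalent plug-in points are selected through job configuration rather than a function argument; how DMX-h wires itself in is not shown here.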
