Reducers: Workhorses of Parallel Programming - DZone Big Data
The secret to big data is, of course, the ability to do work in parallel. Modern big data engines like Hadoop don't rely on clever new algorithms or artificial intelligence to produce impressive results; instead, they take lots of inputs, work on small pieces of them in many places at the same time, and then bring the results together. Usually the results are much smaller than the inputs, small enough for human beings to look at directly. To work on lots of small pieces at the same time, the task we're performing has to be structured to run in parallel. Many algorithms that work well sequentially have to be tweaked at least a little to work in parallel, and some have to be discarded as unusable in parallel.
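The split, work, and combine pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Hadoop itself is implemented: the word-count task, the function names (`chunk`, `count_words`, `merge_counts`, `parallel_word_count`), and the use of a local thread pool in place of a cluster of machines are all hypothetical choices made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def chunk(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(lines):
    """Work on one small piece: count the words in a few lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reducer: combine two partial results into one.

    Adding counters is associative, so the partial results can be
    merged in any grouping without changing the final answer.
    """
    return left + right


def parallel_word_count(lines, chunk_size=2, workers=4):
    """Count words by splitting the input, counting each piece, then reducing."""
    pieces = chunk(lines, chunk_size)
    # Assumption: threads stand in for the many machines of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, pieces))
    # Bring the partial results together into one small summary.
    return reduce(merge_counts, partials, Counter())
```

Calling `parallel_word_count` on a list of text lines should give the same counts as running `count_words` on the whole list sequentially. That equivalence is what makes a task safe to restructure this way: the combining step must not care how the input was divided.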
Apr-27-2016, 06:00:29 GMT