Parallel Graph Analytics

Communications of the ACM 

Most existing abstractions and implementations for parallel computing were developed for computational science applications in which the main parallelism pattern is data parallelism; an example of a data-parallel operation is "map," which applies a function to each element of a set, producing a new set. Systems like Hadoop enable programmers to express and exploit data parallelism without having to write low-level parallel code. Some graph analytics algorithms have data parallelism, but others exhibit a more complex parallelism pattern called "amorphous" data-parallelism,29 described later in this article. Unlike in data-parallel programs, tasks in amorphous data-parallel programs may or may not be able to run in parallel, depending on the graph structure and values known only at runtime. To exploit this parallelism pattern, systems need to find opportunities for parallel execution while executing the program in parallel.