Introducing Social Hash Partitioner, a scalable distributed hypergraph partitioner


Because a single host has limited storage and compute resources, our storage systems shard data items across multiple hosts, and our batch jobs run on clusters of thousands of workers to scale and speed up computation. Our VLDB'17 paper, Social Hash Partitioner: A Scalable Distributed Hypergraph Partitioner, describes a new method for partitioning bipartite graphs while minimizing fan-out. We call the resulting framework Social Hash Partitioner (SHP) because it can serve as the hypergraph partitioning component of the Social Hash framework introduced in our earlier NSDI'16 paper. The fan-out reduction model applies to many infrastructure optimization problems at Facebook, such as data sharding, query routing, and index compression.
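
To make the fan-out objective concrete, here is a minimal sketch in Python. It assumes the usual reading of fan-out: a query that touches several data items has a fan-out equal to the number of distinct buckets (hosts) holding those items. The names `fanout`, `average_fanout`, and the toy data are hypothetical and chosen for illustration only; this is not the SHP implementation, only a way to evaluate a given partition.

```python
from typing import Dict, Hashable, Iterable


def fanout(query_items: Iterable[Hashable], assignment: Dict[Hashable, int]) -> int:
    """Number of distinct buckets that hold the items one query touches."""
    return len({assignment[item] for item in query_items})


def average_fanout(
    queries: Dict[Hashable, Iterable[Hashable]],
    assignment: Dict[Hashable, int],
) -> float:
    """Mean fan-out over all queries (0.0 when there are no queries)."""
    if not queries:
        return 0.0
    total = sum(fanout(items, assignment) for items in queries.values())
    return total / len(queries)


# Toy bipartite graph (illustrative data): each query is linked to the items it reads.
queries = {
    "q1": ["a", "b"],
    "q2": ["b", "c"],
    "q3": ["c", "d"],
}

# Two candidate placements of the four items onto two buckets.
scattered = {"a": 0, "b": 1, "c": 0, "d": 1}  # co-accessed items land on different buckets
grouped = {"a": 0, "b": 0, "c": 1, "d": 1}    # co-accessed items mostly share a bucket

print(average_fanout(queries, scattered))
print(average_fanout(queries, grouped))
```

In this toy example every query under the `scattered` placement has to contact both buckets, while under `grouped` only `q2` does. A partitioner that minimizes fan-out searches for placements of the second kind, usually subject to a balance constraint on bucket sizes that this sketch does not model.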