Who has not heard that Bayesian statistics are difficult, computationally slow, cannot scale-up to big data, the results are subjective; and we don't need it at all? Do we really need to learn a lot of math and a lot of classical statistics first before approaching Bayesian techniques. Why do the most popular books about Bayesian statistics have over 500 pages? Bayesian nightmare is real or myth? Someone once compared Bayesian approach to the kitchen of a Michelin star chef with high-quality chef knife, a stockpot and an expensive sautee pan; while Frequentism is like your ordinary kitchen, with banana slicers and pasta pots. People talk about Bayesianism and Frequentism as if they were two different religions. Does Bayes really put more burden on the data scientist to use her brain at the outset because Bayesianism is a religion for the brightest of the brightest?

Bayesian Statistics continues to remain incomprehensible in the ignited minds of many analysts. Being amazed by the incredible power of machine learning, a lot of us have become unfaithful to statistics. Our focus has narrowed down to exploring machine learning. We fail to understand that machine learning is only one way to solve real world problems. In several situations, it does not help us solve business problems, even though there is data involved in these problems. To say the least, knowledge of statistics will allow you to work on complex analytical problems, irrespective of the size of data. In 1770s, Thomas Bayes introduced'Bayes Theorem'.

Mussmann, Stephen (Purdue University) | Moore, John (Purdue University) | Pfeiffer, Joseph John (Purdue University) | Neville, Jennifer (Purdue University)

Due to the recent availability of large complex networks, considerable analysis has focused on understanding and characterizing the properties of these networks. Scalable generative graph models focus on modeling distributions of graphs that match real world network properties and scale to large datasets. Much work has focused on modeling networks with a power law degree distribution, clustering, and small diameter. In network analysis, the assortativity statistic is defined as the correlation between the degrees of linked nodes in the network. The assortativity measure can distinguish between types of networks---social networks commonly exhibit positive assortativity, in contrast to biological or technological networks that are typically disassortative. Despite this, little work has focused on scalable graph models that capture assortativity in networks. The contributions of our work are twofold. First, we prove that an unbounded number of pairs of networks exist with the same degree distribution and assortativity, yet different joint degree distributions. Thus, assortativity as a network measure cannot distinguish between graphs with complex (non-linear) dependence in their joint degree distributions. Motivated by this finding, we introduce a generative graph model that explicitly estimates and models the joint degree distribution. Our Binned Chung Lu method accurately captures both the joint degree distribution and assortativity, while still matching characteristics such as the degree distribution and clustering coefficients. Further, our method has subquadratic learning and sampling methods that enable scaling to large, real world networks. We evaluate performance compared to other scalable graph models on six real world networks, including a citation network with over 14 million edges.