Batch kernel SOM and related Laplacian methods for social network analysis

Boulet, Romain, Jouve, Bertrand, Rossi, Fabrice, Villa, Nathalie

arXiv.org Machine Learning 

Institut de Mathématiques, Université de Toulouse et CNRS (UMR 5219), 118 route de Narbonne, 31062 Toulouse cedex 9, France

Abstract

Large graphs are natural mathematical models for describing the structure of data in a wide variety of fields, such as web mining, social networks, information retrieval and biological networks. For all these applications, automatic tools are required to get a synthetic view of the graph and to reach a good understanding of the underlying problem. In particular, discovering groups of tightly connected vertices and understanding the relations between those groups is very important in practice. This paper shows how a kernel version of the batch Self Organizing Map can be used to achieve these goals via kernels derived from the Laplacian matrix of the graph, especially when it is used in conjunction with more classical methods based on the spectral analysis of the graph. The proposed method is used to explore the structure of a medieval social network modeled through a weighted graph that has been directly built from a large corpus of agrarian contracts.

This work was partially supported by ANR Project "Graph-Comp".

Preprint submitted to Neurocomputing, 19 March 2018

1 Introduction

Complex networks are large graphs with a nontrivial organization. They arise naturally in numerous contexts [7]: to name a few, the World Wide Web (a perfect example of how large and complex such a network may grow), metabolic pathways, citation networks between scientific articles, and more general social networks that model interactions between individuals and/or organizations. Complex networks share common properties that have allowed the emergence of mathematical descriptions such as small world graphs or power law graphs. The structure of these graphs often provides keys to understanding the underlying complex network.
To study such a structure, one often begins with a metrology process applied to the graph, which describes the degree distribution, the number of components, the density, etc. However, it should be noted that dealing with very large graphs (millions of vertices) remains an open problem (see [9] for an example of an efficient algorithm to explore that kind of data set). Several approaches have been explored to cluster the vertices of the graph into communities [43], and some of them share the use of the Laplacian matrix. Indeed, there are important relationships between the spectrum of the Laplacian and the graph invariants that characterize its structure (see, e.g., [...]). These properties can be used for building, from the eigen-decomposition of the Laplacian, a similarity measure or a metric space such that the induced dissimilarities between vertices of the graph are related to its community structure (see [13], among others).
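
As an illustration of the link between the Laplacian spectrum and graph structure, the following minimal sketch builds the unnormalized Laplacian L = D - W of a small weighted graph, counts connected components from the multiplicity of the eigenvalue 0, and derives a kernel and the induced dissimilarities from the eigen-decomposition. The text above does not specify which Laplacian-based kernel is meant; the heat kernel exp(-beta L) is assumed here as one common choice, and the function names, the toy graph and the value of beta are all hypothetical.

```python
import numpy as np

def laplacian(weights):
    """Unnormalized Laplacian L = D - W of a symmetric weight matrix W."""
    weights = np.asarray(weights, dtype=float)
    degrees = weights.sum(axis=1)          # weighted degree of each vertex
    return np.diag(degrees) - weights

def count_components(lap, tol=1e-9):
    """Number of connected components = multiplicity of eigenvalue 0 of L."""
    eigenvalues = np.linalg.eigvalsh(lap)  # L is symmetric, eigenvalues are real
    return int(np.sum(eigenvalues < tol))

def heat_kernel(lap, beta=1.0):
    """Assumed kernel: K = exp(-beta * L), computed from the eigen-decomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(lap)
    return (eigenvectors * np.exp(-beta * eigenvalues)) @ eigenvectors.T

def kernel_distances(kernel):
    """Dissimilarities induced by a kernel: d(i, j)^2 = K_ii + K_jj - 2 K_ij."""
    diag = np.diag(kernel)
    squared = diag[:, None] + diag[None, :] - 2.0 * kernel
    return np.sqrt(np.maximum(squared, 0.0))  # clip tiny negative round-off

# Hypothetical toy graph: two triangles {0, 1, 2} and {3, 4, 5}
# joined by a single weak edge between vertices 2 and 3.
W = np.zeros((6, 6))
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 0.1)]
for i, j, w in edges:
    W[i, j] = W[j, i] = w

L = laplacian(W)
K = heat_kernel(L, beta=1.0)
D = kernel_distances(K)

print("components:", count_components(L))
print("distance within a triangle (0, 1):", D[0, 1])
print("distance across triangles (0, 4):", D[0, 4])
```

In this toy example, vertices of the same triangle end up closer to each other than to vertices of the other triangle, which is the kind of community-related dissimilarity the paragraph above refers to. A dense eigen-decomposition is only practical for small graphs; it would not scale to the very large graphs mentioned earlier.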
