Stability of maximum-likelihood-based clustering methods: exploring the backbone of classifications
(a.k.a. Who is keeping you in that community?)

Muhittin Mungan1,2 and José J. Ramasco3
1Department of Physics, Boğaziçi University, 34342 Istanbul, Turkey.
2The Feza Gürsey Institute, 34680 Istanbul, Turkey.
3Complex Networks Lagrange Laboratory, ISI Foundation, Turin I-10133, Italy.

(April 2010)

Components of complex systems are often classified according to the way they interact with each other. In graph theory such groups are known as clusters or communities. Many different techniques have recently been proposed to detect them, some of which involve inference methods using either Bayesian or maximum likelihood approaches. In this paper, we study a statistical model designed for detecting clusters based on connection similarity. The basic assumption of the model is that the graph was generated by a certain grouping of the nodes, and an expectation-maximization algorithm is employed to infer that grouping. We show that the method admits further development to yield a stability analysis of the groupings that quantifies the extent to which each node influences its neighbors' group membership. Our approach naturally allows for the identification of the key elements responsible for the grouping and their resilience to changes in the network. Given the generality of the assumptions underlying the statistical model, such nodes are likely to play special roles in the original system. We illustrate this point by analyzing several empirical networks for which further information about the properties of the nodes is available. The search for and identification of stabilizing nodes thus constitutes a novel technique to characterize the relevance of nodes in complex networks.