Inversion method for content-based networks

José J. Ramasco (1) and Muhittin Mungan (2,3)
(1) Complex Networks Lagrange Laboratory, ISI Foundation, Turin I-10133, Italy.
(2) Department of Physics, Boğaziçi University, 34342 Istanbul, Turkey.
(3) The Feza Gürsey Institute, 34680 Istanbul, Turkey.

(November 2007)

In this paper, we generalize a recently introduced Expectation Maximization (EM) method for graphs and apply it to content-based networks. The EM method provides a classification of the nodes of a graph and allows one to infer relations between the different classes. Content-based networks are ideal models for graphs displaying any kind of community and/or multipartite structure. We show both numerically and analytically that the generalized EM method is able to recover the process that led to the generation of such networks. We also investigate the conditions under which our generalized EM method can recover the underlying content-based structure in the presence of randomness in the connections. Two entropies, S_q and S_c, are defined to measure the quality of the node classification and the extent to which the connectivity of a given network is content-based. S_q and S_c are also useful in determining the number of classes for which the classification is optimal.
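As a rough illustration of what an EM node classification of a graph can look like, the following minimal Python sketch assumes a simple link-mixture model: each class r has a prior pi[r] and a probability theta[r][j] that a link leaving a class-r node points to node j. This is an assumed baseline form, not the generalized method developed in this paper. The names em_classify and assignment_entropy are hypothetical, and the entropy computed here is a plain Shannon entropy of the soft assignments, shown only to suggest how a classification-quality measure might work; it should not be read as the paper's definition of S_q or S_c.

```python
import math
import random


def em_classify(adj, n_classes, n_iter=50, seed=0):
    """Soft-classify nodes with EM under an assumed link-mixture model.

    adj[i] is the list of nodes that node i links to.
    Returns (q, pi, theta), where q[i][r] is the probability that node i
    belongs to class r; pi and theta come from the last M-step.
    """
    rng = random.Random(seed)
    n = len(adj)
    eps = 1e-12  # floor that avoids log(0) and division by zero

    # Random initial soft assignment, normalised per node.
    q = []
    for _ in range(n):
        row = [rng.random() + 0.1 for _ in range(n_classes)]
        total = sum(row)
        q.append([x / total for x in row])

    pi, theta = [], []
    for _ in range(n_iter):
        # M-step: class priors and per-class link-target probabilities.
        pi = [sum(q[i][r] for i in range(n)) / n for r in range(n_classes)]
        theta = []
        for r in range(n_classes):
            weight = [0.0] * n
            for i in range(n):
                for j in adj[i]:
                    weight[j] += q[i][r]
            total = sum(weight)
            theta.append([w / (total + eps) for w in weight])

        # E-step: posterior class probabilities, computed in log space.
        for i in range(n):
            logp = [
                math.log(pi[r] + eps)
                + sum(math.log(theta[r][j] + eps) for j in adj[i])
                for r in range(n_classes)
            ]
            top = max(logp)
            unnorm = [math.exp(x - top) for x in logp]
            total = sum(unnorm)
            q[i] = [x / total for x in unnorm]
    return q, pi, theta


def assignment_entropy(q):
    """Mean Shannon entropy of the soft assignments (0 = fully decided)."""
    n = len(q)
    return -sum(x * math.log(x) for row in q for x in row if x > 0.0) / n


# Toy bipartite graph: nodes 0-2 link to nodes 3-5 and vice versa.
toy = [[3, 4, 5], [3, 4, 5], [3, 4, 5], [0, 1, 2], [0, 1, 2], [0, 1, 2]]
q, pi, theta = em_classify(toy, n_classes=2)
print([[round(x, 3) for x in row] for row in q])
print(assignment_entropy(q))
```

In this sketch, nodes with identical link patterns necessarily receive identical class probabilities, since the E-step depends only on a node's own links. Running it for several values of n_classes and comparing the resulting entropies is one way to explore how such a measure could guide the choice of the number of classes.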