Wikipedia information flow analysis reveals the scale-free architecture of the Semantic Space

  • IFISC Seminar

  • Adolfo Paolo Masucci
  • IFISC
  • 27 January 2011 at 15:00
  • IFISC Seminar Room
  • Announcement file

We extract the topology of the semantic space in its encyclopedic
sense by measuring the semantic flow between the different entries
of the largest modern encyclopedia, Wikipedia, thereby creating a
directed complex network of semantic flows.
Notably, at the percolation threshold the semantic space is
characterised by scale-free behaviour at different levels of
complexity, which relates the semantic space to a wide range of
biological, social and linguistic phenomena. In particular, we find
that the cluster size distribution, representing the size of different
semantic areas, is scale-free. Moreover, the topology of the resulting
semantic space is scale-free in the connectivity distribution and
displays small-world properties. However, its statistical properties do
not allow a classical interpretation via a generative model based on a
simple multiplicative process.
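
As a rough illustration of what a cluster size distribution means for a thresholded flow network, the sketch below keeps only the edges whose weight reaches a cutoff and then measures the sizes of the resulting weakly connected clusters. It is a minimal sketch under stated assumptions, not the procedure used in the talk: the edge weights, the `threshold` parameter and the function name `cluster_sizes` are all hypothetical, and the abstract does not say how semantic flow is measured.

```python
from collections import Counter


def cluster_sizes(edges, threshold):
    """Return cluster sizes (largest first) after dropping weak edges.

    `edges` is an iterable of (source, target, weight) triples; the weight
    stands in for a flow measure and is an assumption of this sketch.
    Edge direction is ignored when grouping nodes into clusters.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving; unseen nodes start as roots.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for source, target, weight in edges:
        # Both endpoints are registered even when the edge is dropped,
        # so isolated entries still count as clusters of size one.
        root_a, root_b = find(source), find(target)
        if weight >= threshold and root_a != root_b:
            parent[root_a] = root_b

    sizes = Counter(find(node) for node in list(parent))
    return sorted(sizes.values(), reverse=True)


# Hypothetical toy data: five weighted links among six entries.
toy_edges = [
    ("A", "B", 0.9),
    ("B", "C", 0.8),
    ("C", "D", 0.2),
    ("D", "E", 0.7),
    ("F", "A", 0.1),
]
print(cluster_sizes(toy_edges, threshold=0.5))
```

Sweeping `threshold` from low to high values and recording the size list at each step is one way to look for the point where a single giant cluster breaks apart, which is where a broad, scale-free spread of sizes would be expected to show up.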
After giving a detailed description and interpretation of the
topological properties of the semantic space, we introduce a
stochastic model of a content-based network, based on a copy-and-mutation
algorithm and on Heaps' law, that is able to capture the
main statistical properties of the analysed semantic space, including
Zipf's law for the word frequency distribution.
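
The two laws named above can be checked empirically on any token sequence. Zipf's law concerns how word frequency falls off with frequency rank, and Heaps' law concerns how the number of distinct words grows with text length. The sketch below computes both raw curves; it is only an illustration of the quantities involved, not the stochastic model from the talk, and the function names and the toy sentence are hypothetical.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first.

    Under Zipf's law the count decays roughly as a power of the rank.
    """
    counts = Counter(tokens)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each token.

    Under Heaps' law this curve grows sublinearly with text length.
    """
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth


# Hypothetical toy text; a real check needs a corpus of realistic size.
sample = "the cat sat on the mat the end".split()
print(rank_frequency(sample)[:3])
print(vocabulary_growth(sample))
```

Plotting both outputs on log-log axes for a large corpus is the usual way to judge whether the power-law forms hold; a text as short as the toy sentence cannot show either trend.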


Contact details:

Ernesto M. Nicola

Contact form

