Big Data to characterize the spatial integration of immigrant communities

March 15, 2018

A research team analyzes the spatial integration of immigrant communities using language detection and tweet geolocation

Driven by globalization, the world's major metropolises have increasingly heterogeneous populations. Newly arrived immigrants face social, economic and legal challenges that can lead to exclusion. Integration has many facets: education, employment, health, the bureaucracy associated with obtaining residence and work permits, etc. These are all complex issues that are difficult to quantify, and they have traditionally been studied with information collected from surveys. In contrast, spatial integration in the place of residence can be observed directly, and it carries indirect information about integration more broadly: the formation of ghettos is generally related to a lack of integration.

A team of researchers from the Institute of Cross-Disciplinary Physics and Complex Systems IFISC (CSIC-UIB) has developed a method that uses Twitter data to analyze the degree of spatial segregation of immigrant communities. A user's cultural background is inferred from the language of their tweets. If all the messages are in the local language, the user is considered a local resident. If some messages are in a language specific to an immigrant community, it can be assumed that the user knows that language and is connected to that community.
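The classification rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `classify_user`, the use of ISO language codes and the choice to label a user by their most frequent non-local language are all assumptions made here for illustration.

```python
from collections import Counter

def classify_user(tweet_languages, local_language):
    """Label a user from the detected languages of their tweets.

    Returns "local" if every tweet is in the local language, otherwise
    the most frequent non-local language (an assumed tie-breaking rule,
    not taken from the study). Returns None if there are no tweets.
    """
    if not tweet_languages:
        return None
    # Count only the tweets written in a language other than the local one.
    non_local = Counter(
        lang for lang in tweet_languages if lang != local_language
    )
    if not non_local:
        return "local"
    language, _count = non_local.most_common(1)[0]
    return language

# Hypothetical users in a Spanish-speaking city.
print(classify_user(["es", "es", "es"], "es"))
print(classify_user(["es", "zh", "es", "zh", "ar"], "es"))
```

In practice the language codes would come from an automatic language detector applied to each tweet, and a threshold on the number of non-local tweets could be added to reduce noise from detection errors.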

Language, together with the geolocation of messages, makes it possible to find the typical residential areas of the different communities and to determine whether they are more or less concentrated in space than the local population. The method was applied to immigrant communities in 53 of the world's largest cities. These cities fall into three major categories according to their integration capacity: those with high levels of integration; those with few immigrant communities or with communities that are highly segregated spatially; and an intermediate category between the two extremes. The first group (high integration) includes cities such as London, San Francisco, Tokyo and Los Angeles, while the other extreme (low integration) includes cities such as Detroit, Miami, Toronto and Amsterdam.
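One simple way to compare how concentrated a community is relative to the local population is to assign each user a home grid cell and compute a normalized entropy over those cells. The Python sketch below is only an illustration under stated assumptions: the grid size, the "most visited cell" proxy for residence and the entropy measure are choices made here, and may differ from the metric used in the published study.

```python
import math
from collections import Counter

def home_cell(points, cell_size=0.01):
    """Return the most visited grid cell for a list of (lat, lon) points.

    The most frequent cell is used as a rough proxy for the place of
    residence; cell_size is in degrees and is an arbitrary assumption.
    """
    cells = Counter(
        (math.floor(lat / cell_size), math.floor(lon / cell_size))
        for lat, lon in points
    )
    return cells.most_common(1)[0][0]

def normalized_entropy(cells, n_cells):
    """Shannon entropy of users over cells, divided by log(n_cells).

    Values near 0 mean users are concentrated in a single cell; values
    near 1 mean they are spread evenly. Requires n_cells > 1.
    """
    counts = Counter(cells)
    total = sum(counts.values())
    entropy = -sum(
        (c / total) * math.log(c / total) for c in counts.values()
    )
    return entropy / math.log(n_cells)

# Hypothetical home cells in a city divided into four cells.
community = ["a", "a", "a", "b"]
locals_ = ["a", "b", "c", "d"]
print(normalized_entropy(community, 4) < normalized_entropy(locals_, 4))
```

A community whose entropy is much lower than that of the local population occupies fewer areas of the city, which is the kind of spatial concentration the article associates with lower integration.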

In addition, the method makes it possible to study how different cultures, characterized by language, integrate in each country. The best integration is found among culturally close groups: for example, speakers of Portuguese and Italian in South American (Spanish-speaking) countries, or Europeans in the United Kingdom. Greater segregation occurs between cultures that are further apart.

The proposed method lays the foundations for using online data to analyze the spatial integration of immigrant communities. This type of data is massive and constantly updated, so results can be obtained in near real time. Studies can cover a global scale rather than a single country, and they cost much less than traditional surveys. This work opens up the possibility of using online data as a valuable source of information for public managers in charge of immigration policies.

Fabio Lamanna, Maxime Lenormand, María Henar Salas-Olmedo, Gustavo Romanillos, Bruno Gonçalves, José J. Ramasco. PLOS ONE. DOI: 10.1371/journal.pone.0191612

