This project aims to demonstrate empirically that genetic diversity, predominantly determined during the prehistoric "out of Africa" migration of humans, is an underlying cause of various existing manifestations of ethnolinguistic heterogeneity, which in turn are related to the stability and prosperity of nations. A wide range of measures of contemporary ethnolinguistic heterogeneity at the country level is considered, and the analysis also accounts for a large vector of geographical covariates.
The data consists of a wide range of measures of contemporary ethnolinguistic heterogeneity at the country level, drawn from different sources in the literature. These include (i) the log number of ethnic groups (EG), compiled by Fearon (2003); (ii) two distinct measures of ethnic fractionalization (EF-F and EF-A), constructed by Fearon (2003) and Alesina et al. (2003), respectively; (iii) indices of ethnolinguistic fractionalization (ELF-D) and polarization (POL-D), based on deeply rooted ancestral cleavages among linguistic groups in the population (i.e., level 1), developed by Desmet, Ortuño-Ortín and Wacziarg (2012); and (iv) measures of ethnolinguistic polarization, based on the methodologies of Esteban-Ray (POL-ER) and Reynal-Querol (POL-RQ), constructed by Esteban, Mayoral and Ray (2012). The analysis also accounts for a large vector of geographical covariates. The sample comprises 143 countries for which data on all employed variables are available. The definitions of all variables and their respective sources can be found in the Appendix of the publication (see related resources).