Aug 13 2008

A Different Kind of Gene Mapping: Comparing Genetic and Geographic Structure in Europe

Published by chris at 9:58 am under big questions

By Chris Gignoux and Mike Macpherson

It should be no surprise that, in general, we are more genetically similar to our neighbors than to people living far away. The reason is simple: until recently in human history, it was rare for people from widely separated geographic regions to even meet, much less reproduce.

This pattern, known as isolation-by-distance, has been observed in a number of studies over the past several decades. This week, it has been confirmed in Europe by the largest study of its kind to date.
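To see how isolation-by-distance can arise from nothing more than neighbors mixing with neighbors, here is a minimal simulation sketch. It is not taken from any of the studies discussed here: the row of ten villages, the population size, the migration rate, and the number of markers are all made-up assumptions, and random drift is approximated with a normal distribution for speed.

```python
import random


def simulate_stepping_stone(n_demes=10, n_loci=200, pop_size=100,
                            migration=0.1, generations=200, seed=42):
    """Toy model: villages in a row that only exchange migrants with neighbors.

    Returns freqs[locus][village], the allele frequency of each marker in
    each village after the given number of generations.
    """
    rng = random.Random(seed)
    freqs = [[0.5] * n_demes for _ in range(n_loci)]
    for _ in range(generations):
        for locus in freqs:
            # Migration: each village takes a fraction of its gene pool from
            # its two neighbors. For a village at either end of the row, the
            # missing neighbor is replaced by the village itself.
            mixed = []
            for d, p in enumerate(locus):
                left = locus[d - 1] if d > 0 else p
                right = locus[d + 1] if d < n_demes - 1 else p
                mixed.append((1 - migration) * p + migration * (left + right) / 2)
            # Drift: random sampling noise, using a normal approximation to
            # binomial sampling of 2 * pop_size gene copies (an assumption).
            for d, p in enumerate(mixed):
                sd = (max(0.0, p * (1 - p)) / (2 * pop_size)) ** 0.5
                locus[d] = min(1.0, max(0.0, rng.gauss(p, sd)))
    return freqs


def pairwise_distances(freqs):
    """Geographic and genetic distance for every pair of villages."""
    n_demes = len(freqs[0])
    geo, gen = [], []
    for a in range(n_demes):
        for b in range(a + 1, n_demes):
            geo.append(b - a)  # villages apart along the row
            # Mean squared allele-frequency difference across markers.
            gen.append(sum((l[a] - l[b]) ** 2 for l in freqs) / len(freqs))
    return geo, gen


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


freqs = simulate_stepping_stone()
geo, gen = pairwise_distances(freqs)
r = pearson(geo, gen)
print(f"correlation between geographic and genetic distance: {r:.2f}")
```

In this toy setup every village starts out genetically identical, so any relationship between distance and genetic difference at the end comes purely from limited mixing over the generations.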

The researchers produced a two-dimensional map, like the one below, that preserves the genetic similarities between individuals as far as possible; in other words, the closer two dots (people) are on the map, the more closely related they are genetically.

Two-dimensional genetic similarity map of Europeans showing the northern and southern clusters. Each colored symbol in the plot on the left represents a single person’s genotype. Note the similar placement of symbols on the plot on the left and the geographic legend on the right. Adapted from Tian et al., PLoS Genetics (2008).

In the figure above, each individual was labeled with their country of origin after the mapmaking procedure was run. If Europe were genetically homogeneous, you would expect the different nationalities to appear in a jumble. Instead, they separate into clusters that, remarkably, roughly recapitulate the geography of Europe.
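For readers curious how such a map can fall out of genotype data alone, here is a rough sketch using principal component analysis, one common technique for this kind of plot. Whether it matches the published procedure exactly is an assumption. Everything in it is simulated: people sit on a hypothetical 8-by-8 grid, and each made-up SNP has an allele frequency that drifts gently with latitude and longitude, with the north-south trend set stronger than the east-west one. The code never sees the coordinates when building the map; they are only used afterwards to check the result.

```python
import random


def simulate_genotypes(grid=8, n_snps=600, seed=1):
    """People on a grid; allele frequencies change smoothly with position."""
    rng = random.Random(seed)
    coords = [(r / (grid - 1), c / (grid - 1))  # (latitude, longitude) in 0..1
              for r in range(grid) for c in range(grid)]
    # Per SNP: baseline frequency, north-south slope, east-west slope.
    snps = [(rng.uniform(0.3, 0.7), rng.gauss(0, 0.4), rng.gauss(0, 0.3))
            for _ in range(n_snps)]
    genotypes = []
    for lat, lon in coords:
        row = []
        for base, ns, ew in snps:
            p = min(0.95, max(0.05, base + ns * (lat - 0.5) + ew * (lon - 0.5)))
            # Two gene copies per person: genotype is 0, 1, or 2.
            row.append((rng.random() < p) + (rng.random() < p))
        genotypes.append(row)
    return coords, genotypes


def center_columns(matrix):
    """Subtract each SNP's mean so only differences between people remain."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]


def gram(matrix):
    """Person-by-person similarity matrix (dot products of genotype rows)."""
    n = len(matrix)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            g[i][j] = g[j][i] = sum(a * b for a, b in zip(matrix[i], matrix[j]))
    return g


def top_components(g, k=2, iters=100, seed=0):
    """Leading k eigenvectors via power iteration with deflation."""
    rng = random.Random(seed)
    n = len(g)
    g = [row[:] for row in g]
    vecs = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(n)]
        for _ in range(iters):
            w = [sum(a * b for a, b in zip(row, v)) for row in g]
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        lam = sum(vi * sum(a * b for a, b in zip(row, v)) for vi, row in zip(v, g))
        vecs.append(v)
        for i in range(n):  # remove this component before finding the next
            for j in range(n):
                g[i][j] -= lam * v[i] * v[j]
    return vecs


def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


coords, genotypes = simulate_genotypes()
pc1, pc2 = top_components(gram(center_columns(genotypes)))
r_ns = abs(pearson(pc1, [lat for lat, _ in coords]))
r_ew = abs(pearson(pc2, [lon for _, lon in coords]))
print(f"|corr(PC1, latitude)| = {r_ns:.2f}, |corr(PC2, longitude)| = {r_ew:.2f}")
```

The power iteration here just keeps the sketch free of dependencies; a real analysis would use a dedicated linear algebra or population genetics package. The sign of each axis is arbitrary, which is why the check uses absolute correlations.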

Northern vs. Southern Europe

Even though Europe has been occupied for only a relatively short time compared to other parts of the world, different populations within the continent have had time to differentiate from one another. Scientists have known for a long time that certain traits, like lactase persistence and light-colored eyes and hair, are more common in northern than in southern Europe. Likewise, certain diseases, such as sickle cell anemia, are rare across Europe but more common in the south than in the north. Height and skin color also vary from northern to southern Europe, and both change gradually with latitude rather than in quick jumps.

Early genetic studies (such as those in the landmark population genetics text The History and Geography of Human Genes) showed that this north-south cline was also a genetic one: even though Europeans of different nationalities did not fit into simple clusters, there was an overarching north-south difference. Newer studies have increased both the number of people typed and the number of markers, approaching the genome-wide level of hundreds of thousands of SNPs we use here at 23andMe. That brings us to this week’s paper.

A summary of genome-wide findings

The Lao et al. study out this week obtained genotypes from more than 2,500 individuals of known European ancestry. Each genotype consists of about half a million SNPs typed on the Affymetrix 500K, a chip similar in size to the Illumina 550K used here at 23andMe. The authors confirm the findings of several recent but smaller European studies (Seldin et al., PLoS Genetics (2006); Bauchet et al., AJHG (2007); Tian et al., PLoS Genetics (2008); Price et al., PLoS Genetics (2008); Paschou et al., PLoS Genetics (2008)), namely:

  • Over all SNPs, Europeans are very genetically similar.
  • There is a small set of SNPs that does allow European populations to be distinguished, at least among people whose ancestors all come from the same part of Europe, and these SNPs are surprisingly effective.
  • Most of the genetic variation in Europe is found along the north-south axis, which is consistent with archaeological knowledge. The next most prominent axis of genetic variation runs roughly east-west.
  • More isolated populations tend to sit at the extremes of these plots. In this paper, the Finns are the only nationality completely distinct from the rest of the European samples. Finnish is a Uralic language, unlike the Indo-European languages spoken across much of the rest of Europe, and the Finns are the only Scandinavian population represented.
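A toy calculation shows how populations that look nearly identical at any single SNP can still be told apart once evidence is pooled across many SNPs. The sketch below is purely illustrative: the two populations labeled "north" and "south", the 5% allele-frequency gap at every marker, and the assumption that the true frequencies are known in advance are all made up. Real studies have to estimate frequencies from data.

```python
import math
import random


def make_markers(n_snps, delta=0.05, seed=7):
    """Each marker gets (north_freq, south_freq), differing by only delta."""
    rng = random.Random(seed)
    markers = []
    for _ in range(n_snps):
        base = rng.uniform(0.2, 0.8)
        sign = rng.choice((-1, 1))  # which population has the higher frequency
        markers.append((base + sign * delta / 2, base - sign * delta / 2))
    return markers


def sample_person(markers, idx, rng):
    """Draw a genotype (0, 1, or 2 copies per marker) from population idx."""
    return [(rng.random() < m[idx]) + (rng.random() < m[idx]) for m in markers]


def log_likelihood(genotype, markers, idx):
    """Log-probability of the genotype under population idx's frequencies.

    The binomial coefficient is omitted because it is the same for both
    populations and cancels when the two likelihoods are compared.
    """
    total = 0.0
    for g, m in zip(genotype, markers):
        p = m[idx]
        total += g * math.log(p) + (2 - g) * math.log(1 - p)
    return total


def accuracy(markers, n_used, n_people=200, seed=11):
    """Fraction of simulated people assigned to the correct population."""
    rng = random.Random(seed)
    used = markers[:n_used]
    correct = 0
    for i in range(n_people):
        truth = i % 2  # alternate: 0 = north, 1 = south
        genotype = sample_person(used, truth, rng)
        north = log_likelihood(genotype, used, 0)
        south = log_likelihood(genotype, used, 1)
        guess = 0 if north > south else 1
        correct += guess == truth
    return correct / n_people


markers = make_markers(2000)
acc_one = accuracy(markers, 1)
acc_many = accuracy(markers, 2000)
print(f"accuracy with 1 SNP: {acc_one:.2f}, with 2000 SNPs: {acc_many:.2f}")
```

With a single marker the guess is barely better than a coin flip, while pooling thousands of equally weak markers makes the assignment nearly certain in this simulation.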

There’s plenty of action in the blogosphere on this one. For more discussion, check out Dienekes’ Anthropology Blog, Anthropology.net, Gene Expression, and Genetic Future.

2 Responses to “A Different Kind of Gene Mapping: Comparing Genetic and Geographic Structure in Europe”

  1. rogers on 10 Dec 2008 at 8:48 pm

    I have not read the paper (yet!), but based on the map I find it somewhat misleading, as there are genetic similarities between populations that are not even remotely close geographically.

    Take for example the Y-chromosome haplogroup I. Its frequencies among the male populations of some regions of southern Europe and northern Europe are remarkably similar. More specifically, in the northwest Balkans the frequency of the “I” haplogroup matches or exceeds the frequencies found in areas of far northern Europe. Perhaps this is why this region of Europe was left out of the study?

  2. chris on 11 Dec 2008 at 5:02 pm

    This paper, and the others like it, use data from across the entire genome to determine genetic distances, then apply associated techniques to find the best fit to the data across multiple dimensions. The dimensions that come out are not an attempt on the authors’ part to create a map out of genes: they are the first two dimensions of variation in the data. They just happen to line up well with the N/S and E/W axes.

    It is sometimes hard to believe how good the concordance is between genes and geography in Europe, but it’s worth noting that this is what you would expect: neighbors are more likely to mate with each other than people on opposite sides of the continent. Thus, over time, neighbors will be more likely to be similar to one another than to people from far away.

    Your example of Y-chromosome haplogroup I is worth bringing up. However, on more precise examination there are different lineages within haplogroup I in different parts of Europe. I1 is found further north, on average, and I2 is found at higher frequencies in the Balkans. Most markers across the genome provide very little information (less than the Y-chromosome haplogroup I SNPs, for example) about ancestry, but combined, hundreds of thousands of markers do tell quite a lot. If you are a customer at 23andMe or have a demo account, you can see it for yourself. The new Advanced Global Similarity feature puts you in this same sort of display:
    https://www.23andme.com/you/globalsim/advanced/
    Check it out!
