john hawks weblog

paleoanthropology, genetics and evolution

genetic ancestry

  • "Ancestry is complicated and very messy"

    Tue, 2013-02-26 12:33 -- John Hawks

    Mark Thomas has a Guardian piece reacting to some recent genetics promotion in the UK: "To claim someone has 'Viking ancestors' is no better than astrology". It is a good article to share with students because it separates the real possibilities of genetic genealogy from the hype.

    My colleague Prof David Balding and I wrote to the BBC and to the two main scientists at BritainsDNA – both of whom we knew – expressing our concerns about the claims being made. Our expressions of concern over accuracy were met with threats of legal action for defamation by Mr Moffat's solicitors.

    Perhaps it is harmless fun to speculate beyond the facts, armed with exciting new DNA technologies? Not really. It costs unwitting customers of the genetic ancestry industry a substantial amount of hard-earned cash, and it disillusions them about science and scientists when they learn the truth, which is almost always disappointing relative to the story they were told.

    My advice is consistently: Don't spend money you need for something else, and don't assume that the "interpretation" of your genetics will last more than two months.

  • A problem with communicating human genetic history

    Thu, 2013-01-03 19:02 -- John Hawks

    Vincent Plagnol in Genomes Unzipped last month wrote about a bad example of public communication of population genetics and DNA ancestry testing: "Exaggerations and errors in the promotion of genetic ancestry testing".

    One thing we have done in Genomes Unzipped is to report on what is on the market for consumers interested in getting information about their genetic data. While we have found generally positive things to say about this market, there are also many exaggerated claims especially when it comes to making inferences about an individual’s ancestors from direct-to-consumer genetics companies. An example came up last summer with a BBC radio 4 interview of Alistair Moffat of Britain’s DNA. This post will discuss the scientific basis of some of the claims made in the interview.

    Now, Genomes Unzipped has published a response from Jim Wilson, chief scientist of BritainsDNA: "Response to 'Exaggerations and errors in the promotion of genetic ancestry testing'".

    The two posts are a useful example of the problems communicating human population history and human variation. We know that 10-year-old descriptions of human mtDNA phylogeography are wrong. But those descriptions are still out there, with people assuming they are close to correct, and companies selling the "information" about where their customers' mtDNA came from 50,000 years ago.

  • Mailbag: North and South China

    Mon, 2012-11-19 09:08 -- John Hawks

    I read with interest your post on:
    http://johnhawks.net/weblog/reviews/neandertals/pigmentation/neandertal-...

    in particular:
    "People of Han Chinese ethnicity sampled in Beijing appear to have on
    average a half percent more Neandertal ancestry than people of the
    same ethnicity sampled in southern China."

    Apologies if you know this already but Han Chinese civilization
    started in the Yellow River area and only later expanded south. The
    original people in the south of China are Viet people and have more in
    common with modern Vietnamese. They all became "Han" people after
    their kingdoms were conquered by the north and are really Han in name
    only. Northern and Southern Chinese people look different and their
    spoken dialects (languages) are mutually incomprehensible to each
    other.

    Chinese people from the province of Shantung have the reputation of
    being the biggest in size, always attributed to their diet of wheat,
    but they are probably the last purest reservoir of Neandertal genes in
    the East. Shantung people generally have big noses, fair skin and big
    bones.

    Yes indeed, these are very deep differences, at least as great as between northern and southern Europe genetically, and maybe more. That's why we find the contrast so useful in comparison with the archaic human genomes. The current samples are not ideal because the "South Chinese" were sampled in Beijing based on ancestry, and so are a diverse set. We are hoping soon to have data from many more Southeast and Northeast Asian populations, which will give us some resolution on when things changed.

  • Postmodernists are genetic determinists?

    Tue, 2012-05-01 19:37 -- John Hawks

    An article in The Awl by Russell Brandom sighs disappointedly about commercially available personal genome testing ("Everything I Didn't Learn From Taking A Personal Genome Test"). Misha Angrist, early personal genomics adopter, reacts to the piece on his GenomeBoy blog, "Of hairballs and long hauls".

    I agree with most of Misha's post, and I especially started cheering when I read his final point:

    Some of my postmodernist friends tend to look down their noses at genetic ancestry testing. I would argue that they are genetic determinists. Why assume that genetic information is so omnipotent as to irrevocably unravel one’s identity? Why must one narrative trump another? “Because it’s TECHNOLOGY! It’s GENETICS! It is ALL POWERFUL!” Please. It’s just another way of looking at one’s ancestry. And learning about genetics: I would argue that Henry Louis Gates has done as much to stir public interest in genetics as anyone or anything since the Human Genome Project. For realz.

    Data demystifies. The ancestry determinations aren't great, but you know what? People -- including nonprofessionals with an avocation in genetics -- are improving them every day, both showing their limits and inventing new ways to sift information out of genomes.

  • Ancestry perspective from 23andMe

    Sun, 2012-03-04 13:42 -- John Hawks

    Stanford geneticist Joanna Mountain recounts some of the experience she brings to 23andMe in her role as Senior Director of Research: "Solving mysteries via DNA". Many of her interests lie in the anthropological aspects of DNA and ancestry.

    Now that we know how DNA aligns with prehistoric migrations, we can trace the DNA of individuals to northern Europe or Central Asia, South America or the Near East, western Africa or Oceania. That information about where DNA is from can, in turn, answer questions about our ancestors. Were they struggling to feed their children through hunting red deer in northern Europe, harvesting shellfish in southeastern Asia, raising alpacas in the highland plateaus of western South America, or digging for tubers in eastern Africa? DNA shows that some of us have ancestors who faced the challenge of survival using several of these strategies.

    A new round of "Finding Your Roots With Henry Louis Gates, Jr" is going to begin on PBS this month, using 23andMe services as part of the program.

  • Open data genomics

    Fri, 2010-12-17 14:17 -- John Hawks

    Nature this week carries a story by Ewen Callaway titled, "The rise of the genome bloggers". The main subject is Dienekes Pontikos, whose "Dodecad Ancestry Project" seeks to illuminate population structure in undersampled populations by using SNP data supplied by volunteers with results from commercial testing companies.

    I like this quote:

    But Pontikos sees little point in formally publishing his findings. "I can bypass them entirely, and have the entire world review what I write," he wrote in an e-mail. Indeed, comments on his blog — "could you please provide the eigenvalues for the principal component analysis", for instance — read like the niggling recommendations of a manuscript reviewer.

    I've often found that the best reviews of my work come from blogs and readers, not from peer review itself. With a project like this, the most critical readings will come from the most interested community, which may be a broader public than the scientific community.

    The article also points to other worthwhile efforts such as Eurogenes and Genomes Unzipped, which are beginning to make connections between academic and public research.

  • Genomes unzipped, ancestry revealed

    Mon, 2010-11-01 11:25 -- John Hawks

    Last week I linked to Genomes Unzipped participant Joe Pickrell ("Ancestry unzipped"), who was working through the ancestry calculations that made his genome appear to have been partially Ashkenazi Jewish in origin ("Am I partly Jewish?").

    Now, Pickrell updates the story ("Am I partly Jewish? An unexpected turn of events"):

    As I was mulling over these sorts of issues, I sent the link to my previous analysis to a family member. I didn’t really expect this person to find it that interesting, but hey, you never know. I then got a phone call. I’ll summarize a couple days worth of moderate confusion, second-hand reports of conversations with distant relatives, and family intrigue with this: as it turns out, one of my great-grandparents was indeed a Polish Ashkenazi Jew who immigrated to the United States around the turn of the century. I, obviously, was completely unaware of this.

    Before coming to this deeper genealogical discovery, Pickrell summarizes some additional comparisons that made it difficult to explain the genetic results by errors in assumptions of the methods. This is the kind of outcome many people are hoping they will get from their genetic information -- a prompt for distant relatives to uncover family histories that, in many cases, they didn't know would be interesting. Or that in the past some wanted to forget.

    Razib ponders the question "Do ancestors matter?" Obviously they matter deeply to some, to others not so much. Unexpected discoveries from genetic information are not new; they're at least as old as blood typing. There is already a large community of people who find meaning from genealogical research and comparisons with others -- research that ultimately can illuminate only a very small part of any individual's genealogical history. Genetics doesn't necessarily offer any more than this. Any one person's genes come from only a small fraction of her ancestors.

    But the DNA may inform about a different part of one's genealogy than oral and written history. And it may give rise to a deeper idea of ancestry than cultural reckoning -- which in a mere fifty years can drive people to forget some of their ancestors and promote fictive ones.

    Synopsis: A researcher gets a genealogical surprise after making his genome public.

  • Ancestry unzipped

    Fri, 2010-10-22 08:30 -- John Hawks

    One of the incredible benefits of the open source approach to genomics is that non-practitioners have a chance to see how interpretations are built. Sometimes it's a real "warts and all" picture of science, as statistical and historical details come into conflict with each other.

    Genomes Unzipped is written by a group of forward-thinking geneticists and related professionals who have made their 23andMe genotype data public. Soon after their data release, some other folks went to work on the data. Dienekes Pontikos applied an ancestry prediction algorithm, finding that the Genomes Unzipped authors were, no surprise, mostly European -- but two of them were predicted to have a high component of Ashkenazi Jewish ancestry.

    Genomes Unzipped participant Joe Pickrell was surprised to discover he might have a high fraction of ancestry tracing back to Ashkenazi Jews. So he did some investigation of his own:

    Several hours after we released our data, however, I was pointed to a post where Dienekes Pontikos wrote about the results of running all our data through his ancestry prediction program. While just about everyone was quite confidently predicted to be almost entirely of northwestern European descent, this analysis gave me a point estimate of 20% Ashkenazi Jewish ancestry. Within hours, several people had asked me about this, and I had no real response. So I decided to take a look at the data myself; some basic analyses are below.

    The post is a great summary of some basic methods, including the strengths and weaknesses of the assumptions that underlie them.

    I have found over the last several years that this "surprised to discover" reaction is very common among people who have ancestry testing or other genotyping done. Sometimes the surprises end up being well supported by other historical evidence, of which the subject may not have been aware. But more often, the "surprising genealogy" is just an artificial result of applying erroneous or simplified assumptions in the course of the analysis. I think it is tremendously important to write up case studies where the process leading to a result is explicated, where the sensitivity of the analysis to various assumptions can be probed.

  • Good grief, the Neandertal test kits have been sent

    Wed, 2010-07-14 00:13 -- John Hawks

    Blaine Bettinger (the Genetic Genealogist) writes that some commercial test offerings are trying to sort out a way to tell you how Neandertal you are:

    Once [the Max Planck] study came out, I knew it was only a matter of time before companies began offering tests that examined the percent of Neanderthal contribution to a test-taker’s genome.

    This is one of the stickiest places to be a blogger. Bettinger links to a testing company's information on its product (including promotion of "Neandertal themed art" for the customer, sold at their Las Vegas gallery). Others have linked to Bettinger, drawing more attention.

    I think that as a scientist, more promotion is the last thing I should be giving this company. So I won't be naming or linking to their advertising.

    Ironically, the promotional material does not make any false statements of fact. The material makes it perfectly clear that the product does not test any gene variants that scientific research has shown may have come from Neandertals. Instead, the product reports on gene variants that we don't know about from Neandertals.

    Huh?

    You may wonder how a company can market such a product as a "Neanderthal Index". Since "Neanderthal Index" is not a scientific concept, a company can claim whatever it wants.

    So what is it? According to the material, the Neanderthal Index is computed from (a very few) STR alleles shared with "archaic" populations. Those "archaic" populations aren't Neandertals, they're Basques, Turks, Syrians, and other living people. Anthropologists do not call these people "archaic", so this is not a scientific concept either. Nobody has demonstrated that the listed populations are more or less Neandertal-like than any other living people. Most of the differences between these living populations emerged during the last 10,000 years.

    You'd do better putting calipers on your skull and measuring your cephalic index. At least that would tell you whether some real phenotype is Neandertal-like.

    I don't imagine that customers are beating down the doors for this product. I think it exists as a way of bringing attention -- Neandertals are in the headlines. That's a big reason to not give them any attention. The test has nothing whatsoever to do with Neandertals as we scientifically understand them.

    Can you tell that I'm disgusted by this?

    Here in my lab, we're in a very good position to say that no test today can accurately report on your individual proportion of Neandertal ancestry. Until we have characterized a broader set of gene trees than we have so far, we are really not able to give any answer about how similar any person's genome is to Neandertals. We can't say yet how heterogeneous the human population is today in its ancestry from different parts of the world during the Late Pleistocene. For the past thirty years most working geneticists completely ignored the possibility of such heterogeneity; we are only just beginning to investigate it seriously.

    This kind of thing may not be why the FDA is looking to regulate personal genomics. Neandertal ancestry is not directly relevant to health. But if customers buy tests like this thinking that they are learning about Uncle Thag, just how much misinformation will they accept from other tests that purport to tell them something more important?

  • African-American mtDNA and regional populations of Africa

    Fri, 2010-03-12 12:04 -- John Hawks

    I'm attending a symposium on genetics and genealogy of the African Diaspora this morning. Fatimah Jackson is here giving a very interesting talk about her genetic work in Africa and African-Americans, and in particular her idea of "ethnogenetic layering" (Jackson 2008), which is basically a strategy for describing the fine-scale makeup of present-day populations by examining their genetic ancestry from different regions of the Old World.

    Part of her research has involved characterizing the regional distribution of mtDNA haplotypes within African populations. She shared some newer data with us, but I thought it worth pointing people to an earlier publication by Bert Ely, Jackson and others (2006), which gave rise to some strong insights about the poverty of current sampling of African populations.

    The study reports on a sample of 3725 mtDNA sequences (HVS-I) from a diversity of sub-Saharan African populations. That's quite a massive sample of sequences, certainly on the scale that had been available earlier. It is substantially more numerous than

    When a sample of 74 Gullah/Geechee mtDNA sequences were compared with the sub-Saharan database, approximately half of the mtDNAs were identical to two or more mtDNAs in the database and only seven mtDNAs matched mtDNAs from a single ethnic group. The remaining 28 mtDNAs were not identical to any sequence in the expanded database.

    Similar results were obtained when the 97 African-American AFDIL mtDNAs were compared with the databases. Approximately half (49) of the mtDNAs were identical to multiple sequences in the original database. As with the Gullah/Geechee sample, fewer than 10% of the sequences matched a sequence from a single ethnic group, and 40% of the sequences did not have any perfect match in the database (Ely et al. 2006:3).

    There are two aspects worth noting in those results. On the one hand, the common haplotypes -- the ones that the African-American samples were likely to have a match to -- were not regionally specific within Africa. They are shared by many ethnic groups, distributed across the continent.

    On the other hand, 40% of the African-American sequences have no match among the nearly 4000 sequences taken from continental Africa. That's astounding to me, just from the standpoint of sampling. Most of the common haplotypes will emerge within a relatively small sample, so to find something you haven't already seen, you have to sample disproportionately more -- in fact, exponentially more -- individuals. You can just imagine how many tens or hundreds of thousands of sequences you would have to gather to have an adequate representation of African mtDNA for this purpose -- the purpose of finding matches for a large fraction (say, more than 90 percent) of African-American mtDNA haplotypes that originated in Africa (there is of course a substantial fraction whose recent maternal ancestry originated somewhere else).
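
    The sampling argument can be sketched numerically. What follows is a toy model, not an analysis of the Ely et al. data. It assumes that haplotype frequencies follow a Zipf-like distribution (a hypothetical choice), that the database and the query sequences are drawn at random from the same pool, and that a "match" means the query haplotype appears at least once in the database. The function names and the 50,000-haplotype pool are illustrative assumptions.

```python
def zipf_frequencies(num_haplotypes, exponent=1.0):
    """Hypothetical skewed frequency distribution: a few common haplotypes, many rare ones."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_haplotypes + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_match_fraction(frequencies, database_size):
    """Expected fraction of query sequences whose haplotype is present in the database.

    A haplotype with frequency p is missing from a random database of size n
    with probability (1 - p) ** n, and a query carries that haplotype with
    probability p, so the expected matched fraction is sum(p * (1 - (1 - p) ** n)).
    """
    return sum(p * (1.0 - (1.0 - p) ** database_size) for p in frequencies)


def database_size_for_coverage(frequencies, target, max_size=10_000_000):
    """Smallest power-of-two database size whose expected match fraction reaches target.

    Returns None if the target is not reached by max_size.
    """
    size = 1
    while expected_match_fraction(frequencies, size) < target:
        size *= 2
        if size > max_size:
            return None
    return size


# Assumed pool of 50,000 distinct haplotypes (illustrative only).
freqs = zipf_frequencies(50_000)
for n in (500, 4_000, 32_000, 256_000):
    print(f"database of {n:>7} sequences: "
          f"expected match fraction {expected_match_fraction(freqs, n):.2f}")
```

    In this toy model the matched fraction grows only slowly as the database gets many times larger. Real populations are geographically structured rather than randomly mixed, so the true sampling requirement could well be larger.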

    One of the features of the symposium is a discussion of the relevance of ancestry testing. Jackson is an expert in this field and well-recognized -- she appeared in several of the "African-American Lives" episodes, for example.

    With several companies and organizations now offering various kinds of ancestry tests, these have become increasingly affordable. But the results are often confusing; people don't know how to interpret them. Some of that confusion was evidenced in questions here at the symposium -- as part of a year-long discussion group, several local people submitted cheek swabs for ancestry interpretation. The results are often poor, because the sampling of recent populations is inadequate to really answer many questions. Where were today's populations 300 years ago? Have we adequately sampled the variation of present populations?

    Research like Jackson's has shown that even widespread and numerous samples provide a real poverty of information about mtDNA diversity. The situation is vastly worse if we turn to autosomal variation, because the samples are smaller and more scattered.

    Of course, for many anthropological purposes, the samples we have today are tremendously useful. My work on recent selection, for example, has made leaps and bounds on samples of a few hundred individuals.

    But the converse case -- you take a person and ask whether you can diagnose their origin -- that task requires much larger samples to gain any statistical confidence in the general case. There may be specific haplotypes that are highly specific as to their present distribution -- but then, all of those are rare haplotypes, and you have to be lucky enough to have one of them within the comparative sample that the organization or company has gathered.
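
    The "lucky enough" condition can be put in rough numbers. The sketch below assumes that the comparative sample is a set of independent random draws from a population in which the haplotype has frequency f. That is a simplification, and the function names are hypothetical. Under it, the chance that a sample of n sequences contains the haplotype at least once is 1 - (1 - f)^n.

```python
import math


def probability_in_sample(frequency, sample_size):
    """Chance that a haplotype at the given frequency appears at least once in the sample."""
    return 1.0 - (1.0 - frequency) ** sample_size


def sample_size_to_observe(frequency, confidence=0.95):
    """Smallest sample size giving at least the stated chance of containing the haplotype."""
    if not 0.0 < frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - (1 - f) ** n >= confidence for n and round up.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


# Illustrative frequencies only; none of these values come from the cited studies.
for frequency in (0.01, 0.001, 0.0001):
    n = sample_size_to_observe(frequency)
    print(f"frequency {frequency}: about {n} sequences for a 95% chance of inclusion")
```

    Under this simple model the required sample size scales roughly as 1/f, so a haplotype ten times rarer needs about ten times as many comparative sequences.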

    I'm still listening here and some of the later presentations will touch on the issues of genetic ancestry testing more directly. But I thought I would share a quote I really liked, with which Jackson ended her comments:

    I'm not against genetic ancestry testing. It's fun. But in the final analysis, you have to look in the mirror, and you decide who you are.

    Related posts:

    Skip Gates discovers that genetic tests don't mean what he thought they meant.

    Anne Wojcicki from 23andMe comments on genomics and race

    Unintended consequences of genetic ancestry tests

    References:

    Ely B, Wilson JL, Jackson F, Jackson BA. 2006. African-American mitochondrial DNAs often match mtDNAs found in multiple African ethnic groups. BMC Biology 4:34. doi:10.1186/1741-7007-4-34

    Jackson FLC. 2008. Ethnogenetic layering (EL): an alternative to the traditional race model in human variation and health disparity studies. Ann Hum Biol 35:121-144. doi:10.1080/03014460801941752
