What do you do with all these genomes?

I’m teaching a class right now in which the students are tackling this very issue: We’re getting an awful lot of new genomic data from living humans. What can we learn about human prehistory from these people’s genes? And where will the Neandertal genome fit in when we have it?

Let’s consider the genomes from the last two weeks. One comes from an ancient individual from Greenland. One complete genome comes from a Namibian Tuu-speaking man named !Gubi, another from Archbishop Desmond Tutu. Exomes, covering the protein-coding fraction of the genome, were obtained from three additional Bushman men, two Ju/’hoansi and one !Kung speaker. We can add these to existing complete genome sequences from some 8 individuals, and draft genomes from chimpanzee, gorilla, orangutan, and macaque.

The data quality varies substantially among these genomes. Some have been sequenced at 20x coverage or higher; others at much lower coverage. In fairly short order, these few will be joined by a thousand more.
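For readers who haven't met the term, "20x coverage" means that each position in the genome is covered by about 20 sequencing reads on average. Here is a minimal sketch of that arithmetic in Python. The function name, the 100 bp read length, and the round 3 billion bp genome size are my own illustrative assumptions, not figures from any of these sequencing projects.

```python
def mean_coverage(n_reads: int, read_length: int, genome_length: int) -> float:
    """Expected mean depth: total sequenced bases divided by genome length."""
    return n_reads * read_length / genome_length


# Illustrative assumptions only, not values from any real project.
GENOME_LENGTH = 3_000_000_000  # rough human haploid genome size, in base pairs
READ_LENGTH = 100              # hypothetical read length, in base pairs

# Number of reads needed to average 20 reads over every position.
reads_for_20x = 20 * GENOME_LENGTH // READ_LENGTH

print(reads_for_20x)                                             # 600000000
print(mean_coverage(reads_for_20x, READ_LENGTH, GENOME_LENGTH))  # 20.0
```

This is only an average. Real depth is uneven across the genome, so even a 20x genome has stretches that are thinly covered, which is part of why quality varies so much from one genome to the next.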

The density of coverage of these genomes makes them unique resources, but we know a lot more about the variability of genes in different populations from SNP genotype data. Connecting the two kinds of data – finding the actual nucleotide changes that explain regional signatures of selection, for example – will be an important research goal in the near future.

I’m going to do my best over the next few weeks to post new analyses involving these human genomes. A lot of answers are fairly trivial to get, once you have the sequences. Heck, just getting the things is the barrier for most of us – how do you get a genome, and once you have it, how do you read it?

These won’t be tutorials, exactly; they’ll be case studies of a sort. Research fragments, some of which will be part of papers coming out of my lab. I’ll tag these posts with the category, “DIY genomics”.