Long-time reader of your blog, non-paleo/anthro/genetics person, here. But please read on:
Just a couple of brief questions.
(i) It seems that it would make sense to look at pairwise comparisons (of shared derived Neanderthal SNP alleles) both within a population (e.g., Asians, or CEU) and between them, and build a histogram of how often they overlap.
(ii) Then one could remove from the data set all such SNPs that are also shared with Africans, assuming that most of them reflect incomplete lineage sorting and that Africa held the initial superset of alleles before ooA. (I know some are likely due to West Asian or European admixture, so this would reduce the data set slightly more than necessary.) One could then repeat (i) and similar diagnostics. Is the typical unmodified genome chunk length around such sites much longer than in (i), and can one date this? Can one now better quantify the actual admixture percentage outside of Africa?
Wouldn't such a procedure give more insight into how Neanderthal introgression is distributed, when it occurred, and perhaps where it occurred?
I am sure you are already working on similar ideas - just wanted to know if you agree that these may be low-hanging fruit to pursue.
Hi -- thanks for writing!
I started with exactly the approach you describe, when we were working exclusively with SNP data in the spring. For example:
We were using linked haplotypes rather than single SNPs but the filtering process was the same.
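For readers who want something concrete, here is a rough Python sketch of steps (i) and (ii) from the question. It is only an illustration, not the pipeline we actually ran: it assumes each individual is reduced to a set of site IDs where they carry the derived, Neanderthal-matching allele, and the function names and toy data are made up for the example.

```python
from collections import Counter
from itertools import combinations, product


def sharing_counts(group_a, group_b=None):
    """Count shared derived sites for every pair of individuals.

    With one group, pairs are drawn within it; with two groups,
    every individual in group_a is paired with every one in group_b.
    """
    if group_b is None:
        pairs = combinations(group_a.values(), 2)
    else:
        pairs = product(group_a.values(), group_b.values())
    return [len(x & y) for x, y in pairs]


def histogram(counts):
    """Map each sharing count to the number of pairs that show it."""
    return dict(sorted(Counter(counts).items()))


def remove_african_sites(group, african):
    """Step (ii): drop every site seen in any African individual."""
    african_sites = set().union(*african.values())
    return {name: sites - african_sites for name, sites in group.items()}


# Toy data (hypothetical): site IDs carrying the derived allele.
ceu = {"ceu1": {1, 2, 3, 5}, "ceu2": {2, 3, 5, 8}, "ceu3": {1, 3, 8}}
asn = {"asn1": {2, 3, 7}, "asn2": {3, 5, 7, 9}}
yri = {"yri1": {3}, "yri2": {3, 9}}

within_ceu = histogram(sharing_counts(ceu))
between = histogram(sharing_counts(ceu, asn))
filtered_ceu = remove_african_sites(ceu, yri)
within_ceu_filtered = histogram(sharing_counts(filtered_ceu))
```

Comparing `within_ceu` against `within_ceu_filtered` shows how much of the pairwise sharing disappears once African-shared sites are removed. Working with linked haplotypes instead of single sites would need positions and phase, which this sketch leaves out.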
Now I am hopeful that we will have decent age estimates for the introgressing SNPs from a different technique. I would rather find these ages independently of filtering by geographic location, because having this information will greatly simplify testing models of ancient population dynamics. If we succeed at this, we will also have a test of selection based on the same allele ages.
I will keep posting updates, and you'll see these results not long after we get them!