How many genotypes does it take to nab you?


I was reading this story about “genetic surveillance” by law enforcement. I’ll blog about it later.

In the meantime, I had this idle thought. Suppose you got a bunch of genotypes from some testing company and started writing about them on your blog. How many would it take for law enforcement to be able to identify you?

It happens on CSI all the time – they find some rare chemical or pathogen or gene in a sample from a crime scene, and that’s the crucial clue that leads them to the killer. So, if you happen to be a chimera, if your body has absorbed your fraternal twin, or if you’ve had a bone marrow transplant, you should expect that Grissom is going to catch you.

The principle of DNA fingerprinting is that you put together a bunch of relatively common alleles, and the combination of them is so rare that it identifies you uniquely. You don’t have to depend on a vanishingly rare allele; you just have to depend on Mendel’s Law of Independent Assortment, which lets you multiply the frequencies together. CODIS uses 13 STR markers, for which each individual allele is relatively rare, giving a probability of one in billions that two people would share the same markers.

But if you just wanted something probative (maybe the investigators have a DNA sample, and they are just checking your blog to see if they should look at you further), you don’t need anywhere near 13 STRs. If you’re blabbing about your genotypes, the frequency of a common genotype in the population is going to be anywhere between 10 and 50 percent. Let’s say the average is 30 percent, like the frequency of blood type A, or lactase heterozygotes in the US. Well, five independent genotypes like that multiply out to 0.3 to the fifth power, or about 0.0024, which gives a chance of less than 1 in 400 that you match the suspect by coincidence.
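For anyone who wants to check the arithmetic, here is a minimal sketch of that back-of-the-envelope calculation. It assumes the genotypes are independent and uses the made-up 30 percent average from above; the function name `match_probability` is just something I picked for illustration, and none of this reflects how a forensic lab actually computes match statistics.

```python
from math import prod

def match_probability(frequencies):
    """Chance that a random person shares every listed genotype.

    Assumes the genotypes are inherited independently, so their
    population frequencies can simply be multiplied together.
    """
    return prod(frequencies)

# Five genotypes, each assumed to be carried by 30 percent of people.
five = match_probability([0.30] * 5)
print(f"probability: {five:.5f}")   # probability: 0.00243
print(f"about 1 in {1 / five:.0f}")  # about 1 in 412
```

Passing a list of different frequencies works the same way, so you can swap in real numbers for any genotypes you happen to know.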

Good enough for a warrant? If you’re a type AB positive, lactase heterozygote, blue-eyed redhead who blogs about her APOE status, that’s seven, and some are pretty rare. Definitely less than one in 4000. That would be good enough for CSI!

Unless it turns out you’re secretly a man.