john hawks weblog

paleoanthropology, genetics and evolution

genetics

  • Alzheimer's long read

    Sun, 2012-06-10 09:06 -- John Hawks

    The New York Times has a powerful story about the genetics of early onset Alzheimer's disease, by Gina Kolata: "An Alzheimer's gene: one family's saga".

    Gary was pretty sure it was his family whose gene had been found.

    He got a copy of Science and turned to the article, which included a family tree with members who had the gene represented by black diamonds. Those who did not have the gene were represented by white diamonds.

    It was scary even to look. Gary knew every person in that diagram, and he knew he was there too. Would he be a black diamond or a white one?

    The story focuses on one very large family in which onset at age 50 was clearly caused by a Mendelian dominant. But in addition to this, it gives a perspective on how the science works, where unraveling rare Mendelian causes for the disorder helps identify the pathways by which more common -- and more complex -- multigenic cases of the disorder may work.

  • Making Big Data work in genetics

    Tue, 2012-05-15 15:33 -- John Hawks

    Laura Clarke and colleagues report on the data access and management practices of the 1000 Genomes Project [1].

    The larger data volumes and shorter read lengths of high-throughput sequencing technologies created substantial new requirements for bioinformatics, analysis and data-distribution methods. The initial plan for the 1000 Genomes Project was to collect 2× whole genome coverage for 1,000 individuals, representing ~6 giga–base pairs of sequence per individual and ~6 tera–base pairs (Tbp) of sequence in total. Increasing sequencing capacity led to repeated revisions of these plans to the current project scale of collecting low-coverage, ~4× whole-genome and ~20× whole-exome sequence for ~2,500 individuals plus high-coverage, ~40× whole-genome sequence for 500 individuals in total (~25-fold increase in sequence generation over original estimates). In fact, the 1000 Genomes Pilot Project collected 5 Tbp of sequence data, resulting in 38,000 files and over 12 terabytes of data being available to the community. In March 2012 the still-growing project resources include more than 260 terabytes of data in more than 250,000 publicly accessible files.

    The paper acknowledges that this large-scale genetic sequencing project nevertheless generates far less data than physics and astronomy projects. The Large Synoptic Survey Telescope, for example, will generate 20 terabytes each night of operation, while the Large Hadron Collider will generate roughly 15 petabytes per year. The 1000 Genomes Project data to date add up to around two weeks of LSST operation. Still, it's not hard to see how high-coverage sequencing will start to catch up in data storage and transfer requirements.
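
    The comparisons above are simple arithmetic, and a short sketch makes the scale easier to check. The figures come from the excerpt and the paragraph above, except the genome size of roughly 3 giga-base pairs, which is an assumption added here for illustration.

```python
# Figures quoted in the post; GENOME_GBP is an assumption (about 3 Gbp per haploid genome).
GENOME_GBP = 3.0
PLANNED_COVERAGE = 2          # original plan: 2x whole-genome coverage
PLANNED_INDIVIDUALS = 1000

per_individual_gbp = GENOME_GBP * PLANNED_COVERAGE               # Gbp per person
total_tbp = per_individual_gbp * PLANNED_INDIVIDUALS / 1000      # Tbp overall

PROJECT_TB = 260              # project data as of March 2012 ("more than 260")
LSST_TB_PER_NIGHT = 20        # expected LSST output per night

lsst_nights = PROJECT_TB / LSST_TB_PER_NIGHT

print(f"{per_individual_gbp:.0f} Gbp per individual, {total_tbp:.0f} Tbp in total")
print(f"{lsst_nights:.0f} nights of LSST operation")
```

    Thirteen nights is the basis for the "around two weeks" comparison.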

    We are now in a golden age of data centralization. But five years from now, we may return to a second era of disposable data, as gene expression and whole-genome resequencing studies will generate far more data than any central repository can store. We will need curation practices to identify and preserve data that have value beyond the project for which they were collected.

    The beautiful thing about this is that when data are abundant, they don't all have to work together. There is a real role for a new generation of curators to facilitate the mashups of the future.


    References

  • A story of methemoglobinemia

    Wed, 2012-02-22 19:57 -- John Hawks

    A story by Susan Donaldson James of a unique genetic disorder and the social stigma of inbreeding in Appalachia: "Fugates of Kentucky: Skin Bluer than Lake Louise".

    By the time reports appeared in the media on the disorder, the Stacy family was upset with insinuations about in-breeding that fed into stereotypes of backwoods Appalachia.

    "There was a pain not seen in lab tests," wrote Trost. "That was the pain of being blue in a world that is mostly shades of white to black."

    The disorder involves an excess of methemoglobin in the blood, and it is related to the examples I've been discussing in my Anthropology 105 course for the last week or so.

  • Is Nature Genetics something more than the GWAS Catalog?

    Tue, 2012-01-03 23:03 -- John Hawks

    I always look through the table of contents of Nature Genetics, which I have delivered to my inbox. Over the last couple of years, the journal has included a high fraction of papers that are either original genome-wide association studies or meta-analyses of multiple studies. These are substantial studies that have dozens of authors, on conditions of broad interest -- for example, this month there is a meta-analysis paper about type 2 diabetes. So I have no criticism of the journal; these studies need to be published somewhere.

    But others might be impatient with this course of research. The studies are formulaic: put together a large set of cases and controls, run them across a genotyping chip, and report the results. In the current issue, the journal's editorial board enters an op/ed suggesting that the current situation will not continue forever, because GWAS studies just aren't that interesting anymore [1]:

    Which Mendelian variants produce results suitable for publication in the journal? Our general principles are and have always been to select papers for review by the amount of new data and new ideas and the resource value contained within. Papers must meet current field-specific standards set by our latest benchmark papers and referee advice. Finally, we consider the value of the paper as a research tool, prioritizing those that will motivate larger numbers of scientists to do their research differently as a consequence. In principle it should be possible to find a phenotype for each of the tens of thousands of genetic elements in the human genome, but not all such results will be equally informative. However, if, say, 50 other labs will drop everything and instead use the results of your work, that paper is certainly suitable for this journal!

    Well, there you go. The editorial also addresses pedigree research, stating that new identifications of Mendelian disorders in single families will not be sent for review.

    I think this is all appropriate. It's just interesting that research has advanced to the point that finding a genetic cause for a disorder is no longer a sufficient reason for publication. If you look through the GWAS Catalog, you find study after study published in Nature Genetics. Those days are probably numbered.


    References

    1. Anonymous. Full spectrum genetics. Nature Genetics. 2011;44(1):1.
  • Genetic variation and the Hardy-Weinberg proportions

    Mon, 2011-10-03 09:54 -- John Hawks
    Synopsis: 
    Allele frequencies and genotype frequencies are connected by math

    The fundamental information about genetics for any individual is her genotype — the alleles that she has. But genes in populations can be considered in other ways as well.

    For instance, a population consists of individuals, so a geneticist may count the number of individuals with every possible genotype. Comparing these numbers with the total number of individuals, the geneticist may calculate the genotype frequencies, the proportion of individuals who have each possible genotype.

    Cystic fibrosis (CF) is a very rare disorder. Among Americans of European ancestry, only around 1 in 2500 people will develop CF during their lifetimes. The disorder is even rarer among people of non-European origin. Geneticists have surveyed many people to discover how many of them carry the disorder, and today a number of states screen newborns for cystic fibrosis as a way of directing affected children to early medical treatment (AAP Newborn Screening Task Force 2000). From this information, geneticists have determined that while only around 0.04 percent of the population is affected by cystic fibrosis, with a genotype of ff, approximately 4 percent of people are carriers of the allele, with a genotype of Ff. This leaves nearly 96 percent of the population with the genotype FF. These proportions are the frequencies of the three possible genotypes for this gene: 0.04% ff, about 4% Ff and nearly 96% FF.

    The frequency of an allele is the proportion of copies of that allele compared to the total number of copies of all alleles in a population.

    When geneticists know the frequencies of genotypes in a population, they can estimate how many copies of each allele are in the population as a whole. One recent study used genetic techniques to assess the genotypes of a group of colorectal cancer patients for the two possible alleles (A and B) of the p73 gene on chromosome 1 (Pfeifer et al. 2004). In this sample, 113 people (63 percent) were AA, 54 people (30 percent) were AB, and 12 people (7 percent) were BB. Each AA person has two copies of the A allele and each AB person has one copy, so the total sample included 280 copies of the A allele and 78 copies of the B allele. The allele frequency of the A allele is then 280/358, or 78 percent. The frequency of the B allele in this sample is 22 percent.
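
    The counting in the p73 example can be sketched in a few lines. The function name `allele_frequencies` is made up for this illustration; the genotype counts are the ones quoted above.

```python
def allele_frequencies(n_aa, n_ab, n_bb):
    """Return (freq_A, freq_B) from counts of AA, AB and BB individuals."""
    copies_a = 2 * n_aa + n_ab      # each AA person carries two A copies, each AB one
    copies_b = 2 * n_bb + n_ab      # each BB person carries two B copies, each AB one
    total = copies_a + copies_b     # two gene copies per person
    return copies_a / total, copies_b / total


# Genotype counts from the colorectal cancer sample described above.
freq_a, freq_b = allele_frequencies(113, 54, 12)
print(f"A: {freq_a:.2f}, B: {freq_b:.2f}")
```

    With these counts the function tallies 280 copies of A and 78 copies of B out of 358, matching the figures in the text.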

    Hardy-Weinberg proportions

    While geneticists can determine the frequency of an allele from the proportions of genotypes in a population, they may also do the same calculation in reverse, figuring out the expected proportions of genotypes from the frequencies of alleles. For example, if the cystic fibrosis f allele is found at a frequency of 5 percent in a population, then the chance that an individual will have two copies of this allele is simply the 5 percent chance for each of the two alleles, multiplied by each other. Five percent times five percent is 0.0025, which is 0.25 percent, or twenty-five chances out of 10,000.

    The Hardy-Weinberg genotype proportions are $p^2 + 2pq + q^2$ for a two-allele gene, where $p$ and $q$ are the frequencies of the two alleles.

    The proportion of both homozygotes in the population may be determined in the same way. The probability that an individual will be a homozygote for either allele equals the frequency of the allele times itself, or squared. Thus, in the example above, the chance of f homozygotes is equal to the frequency of f squared. The proportion of heterozygotes is equal to the chance that one allele is F times the chance that the other allele is f, plus the chance that the first allele is f times the chance that the other allele is F. Since these likelihoods are the same, the total chance of an individual being a heterozygote is 2 times the frequency of one allele times the frequency of the other allele. Thus, the proportions of genotypes are $p^2$ and $q^2$ homozygotes, and $2pq$ heterozygotes.
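
    As a minimal sketch of the reverse calculation, the function below turns one allele frequency into the three expected genotype proportions. The name `hardy_weinberg` is hypothetical, and the 5 percent figure is the illustrative value used above.

```python
def hardy_weinberg(p):
    """Expected genotype proportions for a two-allele gene.

    p is the frequency of one allele; the other allele has frequency q = 1 - p.
    Returns (p*p, 2*p*q, q*q): the two homozygotes with the heterozygote between them.
    """
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


# F allele at 95 percent, f allele at 5 percent.
prop_FF, prop_Ff, prop_ff = hardy_weinberg(0.95)
print(f"FF: {prop_FF:.4f}, Ff: {prop_Ff:.4f}, ff: {prop_ff:.4f}")
```

    The three proportions always sum to 1, because they are the expansion of (p + q) squared and p + q = 1.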

    Figure: Geometric presentation of the Hardy-Weinberg proportions

    These proportions are called the Hardy-Weinberg proportions, after the British mathematician G. H. Hardy and German physician Wilhelm Weinberg, who independently formulated the relation in 1908. The proportions come from simple probability of sampling copies from a population with given allele frequencies. These proportions are expected to form an equilibrium. That is, they should stay the same over time, as long as the allele frequencies stay the same. When individuals mate without regard to the alleles they carry, every generation of a population should have genotype frequencies in approximately the Hardy-Weinberg proportions.

    The Hardy-Weinberg proportions are important for two reasons. First, the proportions of genotypes in a population may diverge from the expected ones for many reasons, including natural selection, division of populations into different subgroups, or mating that is not entirely random. Comparing the expected and observed proportions of genotypes allows biologists to determine whether these evolutionary forces may be acting on a population.
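
    One common way to make that comparison is a chi-square goodness-of-fit statistic. The post does not name a specific test, so treat the choice as an assumption of this sketch; the function names are hypothetical and the counts are those of the p73 sample described earlier.

```python
def hw_expected_counts(n_aa, n_ab, n_bb):
    """Expected AA, AB, BB counts under Hardy-Weinberg, given observed counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of A estimated from the sample
    q = 1.0 - p
    return p * p * n, 2 * p * q * n, q * q * n


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over the genotype classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = (113, 54, 12)              # AA, AB, BB in the p73 sample
expected = hw_expected_counts(*observed)
stat = chi_square(observed, expected)

print("expected counts:", [round(e, 1) for e in expected])
print("chi-square statistic:", round(stat, 2))
```

    With three genotype classes and one allele frequency estimated from the data, the statistic has one degree of freedom, so values below about 3.84 would not count as a departure from the expected proportions at the conventional 5 percent level.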

    The heterozygosity of a population is the expected proportion of heterozygotes, given the allele frequencies in the population.

    Second, the proportions lead to a natural definition of genetic variation in a population: the heterozygosity. A population's heterozygosity is the expected proportion of heterozygotes from the Hardy-Weinberg formula, 2pq. Two populations may be compared by their heterozygosities: the one with the higher heterozygosity has a higher chance that any single individual will have two different alleles, which means the population is genetically more variable. Variation is a consequence of evolutionary history, including the patterns of selection and genetic drift, and the amount that individuals have moved from one population to another in the past. Thus, the Hardy-Weinberg proportions give an important way to study the evolution of populations over time.
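
    A short sketch of the comparison, using allele frequencies that already appear in this post (0.78 from the p73 sample and 0.95 from the cystic fibrosis illustration). Treating them as two "populations" is purely for illustration.

```python
def heterozygosity(p):
    """Expected proportion of heterozygotes, 2pq, for a two-allele gene."""
    q = 1.0 - p
    return 2 * p * q


h_p73 = heterozygosity(0.78)   # A allele frequency in the p73 sample
h_cf = heterozygosity(0.95)    # F allele frequency in the illustrative CF case

print(f"p73 example: {h_p73:.4f}")
print(f"CF example:  {h_cf:.4f}")
```

    The gene with the more even allele frequencies has the higher heterozygosity, which is the sense in which it is more variable.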

    Study questions: 
    1. Suppose a population has two alleles, with frequencies of 70% and 30%. What are the Hardy-Weinberg proportions expected for the three genotypes of these alleles?
    2. Mendelian recessive disorders are rare in most populations, but their allele frequencies may be surprisingly high. Why?
    3. For a gene with two alleles, what is the highest possible value of heterozygosity? What is the lowest?
  • Genetic drift

    Fri, 2011-08-05 01:24 -- John Hawks
    Synopsis: 
    Many changes in gene frequencies are caused by random chance differences in reproduction.

    If everyone in a population lived a long life, mated, and reproduced absolutely equally (two offspring per person), then the population size would never change. There would always be approximately the same number of individuals, allowing for variations in when people are born or die. In this population, every gene has an equal chance of being passed into the next generation. Natural selection depends on differences in the chance that genes will survive and reproduce, so this population would not evolve by natural selection.

    But the population would still evolve by random chance. A single chromosome can illustrate this potential for evolution. The Y chromosome determines whether humans will be male or female: males have one X chromosome and one Y, females have no Y and two X chromosomes. Mendelian genetics predicts that if a father has two offspring, each of these children has a 50 percent chance of inheriting his Y chromosome and thereby being a son. But these odds mean that the man has a substantial chance of having no sons at all: 25 percent of the time, both children will be daughters. If the man has no sons, then his Y chromosome is simply lost from the next generation. Genes disappear due to chance, even if everyone mates and reproduces equally.
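
    The 25 percent figure follows from multiplying independent 50 percent chances. A small sketch, with a hypothetical function name and the assumption that each child's sex is independent of the others:

```python
def prob_no_sons(n_children, p_son=0.5):
    """Chance that none of n_children inherits the father's Y chromosome."""
    return (1.0 - p_son) ** n_children


for n in (1, 2, 3, 4):
    print(f"{n} children: {prob_no_sons(n):.4f}")
```

    With two children the chance is 0.25, as in the text; larger families lower the chance of loss but never bring it to zero.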

    Genetic drift is a random change in allele frequencies.

    These random changes in allele frequency can accumulate over time. Across many generations, the frequency of an allele can gradually increase, gradually decrease, or fluctuate back and forth. In other words, the frequencies of different alleles seem to "drift" up and down, without any direction. This is why the random change in allele frequencies is called genetic drift. Over time, genetic drift can make once rare alleles common, or eliminate alleles altogether.

    Genetic drift is stronger in small populations.

    Figure: Genetic variation under genetic drift as a function of population size. The expected amount of genetic variation increases as a linear function of the size of the population, when genetic drift and mutation are the only causes of evolution. Larger populations are more variable; smaller populations are less variable.

    The most obvious factor affecting the rate of genetic drift is the size of the population. If the population is small, then a small sample is taken of the gametic population in every generation. Small samples can vary more markedly from the larger sets from which they are selected than larger samples, so genetic drift is more powerful in smaller populations. For example, in a population of five individuals, an allele that exists in a single copy in one individual has a frequency of ten percent. Nevertheless, this allele is in constant jeopardy of being eliminated from the population: it needs to fail to be passed on only once to be lost for good. Likewise, it is quite possible that within a few generations this allele might increase from one copy to ten, eliminating all other alleles. In contrast, in a population of a thousand individuals, an allele with a frequency of ten percent exists in 200 copies. While random sampling of gametes will cause this number to fluctuate over time, it is extremely unlikely that chance alone would allow no copy of this allele to be passed on in any given generation. Indeed, it would likely take many hundreds of generations for random events to either eliminate this allele or all the others.
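
    The contrast between the two population sizes can be explored with a small simulation. The sketch below uses a Wright-Fisher style resampling scheme, in which every gene copy in the next generation is drawn at random from the current allele frequency. That model is an assumption of this example, not something the text specifies, and the function name and seed are arbitrary.

```python
import random


def simulate_drift(n_individuals, start_copies, generations, rng):
    """Track one allele's frequency under random sampling of gene copies.

    Each generation, all 2 * n_individuals gene copies are redrawn
    independently, each carrying the allele with probability equal
    to the allele's current frequency.
    """
    total = 2 * n_individuals
    copies = start_copies
    trajectory = [copies / total]
    for _generation in range(generations):
        freq = copies / total
        copies = sum(1 for _copy in range(total) if rng.random() < freq)
        trajectory.append(copies / total)
    return trajectory


# Five individuals, one copy of the allele (10 percent), followed for 1000 generations.
small = simulate_drift(5, 1, 1000, random.Random(1))
# A thousand individuals, 200 copies (10 percent), followed for 10 generations.
large = simulate_drift(1000, 200, 10, random.Random(1))

print("small population, final frequency:", small[-1])
print("large population, final frequency:", large[-1])
```

    In the small population the allele ends up either lost or fixed, while over the same kind of sampling the large population's frequency only wanders a little around its starting value. Changing the seed changes the details but not that overall pattern.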

    Study questions: 
    1. Can you think of other human populations that have undergone founder effects?
  • What does it mean for a trait to be heritable?

    Fri, 2011-08-05 00:57 -- John Hawks
    Synopsis: 
    Heritability is the proportion of variation in a phenotype that can be explained by variation in genes.

    Tall people tend to have tall parents. The height of the body, called stature, is one of the most obvious phenotypic traits in human populations. Anyone who knows very many families can see that some families tend to have a lot of tall members, and others tend to be relatively short.

    Still, there is more to being tall than having tall parents. A child may grow to be taller than either of his parents, sometimes much taller. A very tall father and short mother don't always have children that are in between their statures: sometimes a daughter may match her mother's height, or even be shorter. Relatives resemble each other in height, but we cannot say that a woman's stature is determined by the statures of her parents.

    There are two key reasons why most phenotypes are not inherited as simple Mendelian traits. First, many traits are affected not by one gene, but by multiple genes. At present, more than 300 genes are known to influence the variation in stature within populations of European ancestry. Other genes may be responsible for differences within other populations and between very different populations such as the short-statured Biaka Pygmies and tall-statured Dinka of equatorial Africa. The combination of so many genes helps to explain why stature is not a Mendelian trait with only two or three distinct forms. Instead, stature is a continuous trait, in which a person may be 62 inches tall, 61.5 inches, 61.25 inches, 61.0003 inches, or any measurable value in between.

    Another reason for variation is that different people may have experienced different environments. The effects of environmental variation are plain in many fields of corn. The corn plants in a single field generally have very little genetic variation: they are hybrids bred for their high yields with adequate fertilizer, pesticide, and water. But many fields have low spots where water pools, or drier spots on the edge of the field. Here, the plants do not grow as tall or yield as much, because their environment is not ideal for high yield by the planted strain. The variation in plant height visible in such fields is entirely the result of environmental variation, because the plants are genetically uniform.

    Environmental variation may come in at many points during an individual's life, from early embryonic development to childhood nutrition and disease. At all stages, this environmental variation can have a major effect on some phenotypes, sometimes more extensive than variation in genes.

    Heritability is the proportion of phenotypic variation that is explained by genetic variation.

    Figure: Relationship of student stature to midparent stature in a sample of 100 female college students. The midparent stature is the average of the mother's and father's heights. Taller parents tend to have taller offspring, although some women are taller than the average of their parents, and others are shorter. The slope of the line relating the two reflects the heritability of stature in the sample, which for these students is estimated at 0.89.

    For a phenotype like stature, it can be important to know how much of the variation results from genetic variation, and how much of the variation results from the environment. This is important because genetic changes can affect a phenotype only to the degree that the phenotypic variation is influenced by those genes. Geneticists also want to discover if phenotypic similarities among people are the result of shared alleles, or whether instead they are the effect of shared environments. The concept that relates genetic variation and phenotypic variation is called heritability, which is the proportion of phenotypic variation that can be explained by genetic variation in the population (Falconer and Mackay 1996). Because it is a proportion, heritability varies between 0 and 1.
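
    The stature figure estimates heritability from the slope of offspring height against midparent height. A minimal sketch of that slope calculation follows, using ordinary least squares. The five data points are hypothetical values invented for this example, not the student sample, and the function name is made up.

```python
def regression_slope(xs, ys):
    """Least-squares slope of ys on xs: covariance(x, y) / variance(x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


# Hypothetical heights in inches, for illustration only.
midparent = [64.0, 65.0, 66.0, 67.0, 68.0]
daughter = [63.5, 65.5, 65.0, 67.5, 67.0]

slope = regression_slope(midparent, daughter)
print(f"estimated slope: {slope:.2f}")
```

    A slope near 1 means daughters differ from the average about as much as their parents do; a slope near 0 means parental height predicts almost nothing about the daughter's height.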

    Phenotypic traits with heritabilities near zero are very weakly affected by genetic variation, while heritabilities near 1 indicate a very strong genetic influence. Very different traits can have similar heritabilities. For example, near-sighted, or myopic, parents are substantially more likely to have children with myopia. In Western populations, genetic variation accounts for more than 80 percent of the variation in juvenile-onset myopia (Gilmartin 2004). Likewise, variation in the many genes that influence stature accounts for approximately 80 percent of the phenotypic variation in stature, at least in Western societies with relatively stable childhood nutrition (Silventoinen 2003). Some human traits have heritabilities very near 1. One example is the mass of the brain, which has a heritability of approximately 0.94 (Bartley et al. 1997).

    Study questions: 
    1. Tall parents don't always have tall children. Why not?
    2. What does the heritability of a trait enable geneticists to predict?
    3. What explains the regression to the mean?
  • Genetic variation and the Hardy-Weinberg formula

    Thu, 2011-08-04 15:32 -- John Hawks
    Synopsis: 
    We can predict the proportions of genotypes in a population from the allele frequencies, which gives us a way to measure variation.

    The fundamental information about genetics for any individual is her genotype. The genotype for a single genetic locus is simply a list of two alleles, whether they're the same or different. Within a population, we can count alleles in other ways also, giving us ways to measure the genetic variation among many individuals.

    A population consists of individuals, so a geneticist may count the number of individuals with every possible genotype. Comparing these numbers with the total number of individuals, the geneticist may calculate the genotype frequencies, the proportion of individuals who have each possible genotype.

    For example, cystic fibrosis (CF) is a very rare disorder. Among Americans of European ancestry, only around 1 in 2500 people will develop CF during their lifetimes. The disorder is even rarer among people of non-European origin. Geneticists have surveyed many people to discover how many of them carry the disorder, and today a number of states screen newborns for cystic fibrosis as a way of directing affected children to early medical treatment (AAP Newborn Screening Task Force 2000). From this information, geneticists have determined that while only around 0.04 percent of the population is affected by cystic fibrosis, with a genotype of ff, approximately 4 percent of people are carriers of the allele, with a genotype of Ff. This leaves nearly 96 percent of the population with the genotype FF. These proportions are the frequencies of the three possible genotypes for this gene.

    The frequency of an allele is the proportion of the total number of gene copies in the population that are that allele.

    When geneticists know the frequencies of genotypes in a population, they can determine how many copies of each allele are in the population as a whole. For example, one recent study used genetic techniques to assess the genotypes of a group of colorectal cancer patients for the two possible alleles (A and B) of the p73 gene on chromosome 1 (Pfeifer et al. 2004). In this sample, 113 people (63 percent) were AA, 54 people (30 percent) were AB, and 12 people (7 percent) were BB. Considering that each AA person has two copies of the A allele and each AB person has one copy, the total sample includes 280 copies of the A allele and 78 copies of the B allele. The allele frequency of the A allele is then 280/358, or 78 percent. The frequency of the B allele in this sample is 22 percent.

    While geneticists can determine the frequency of an allele from the proportions of genotypes in a population, they may also do the same calculation in reverse — figuring out the expected proportions of genotypes from the frequencies of alleles. For example, if the cystic fibrosis f allele is found at a frequency of 5 percent in a population, then the chance that an individual will have two copies of this allele is simply the 5 percent chance for each of the two alleles, multiplied by each other. Five percent times five percent is 0.0025, or twenty-five chances out of 10,000.

    The Hardy-Weinberg genotype proportions are $p^2 + 2pq + q^2$ for a two-allele gene, where $p$ and $q$ are the frequencies of the two alleles.

    The proportion of both homozygotes in the population may be determined in the same way. The probability that an individual will be a homozygote for either allele equals the frequency of the allele times itself, or squared. Thus, in the example above, the chance of f homozygotes is equal to the frequency of f squared. The proportion of heterozygotes is equal to the chance that one allele is F times the chance that the other allele is f, plus the chance that the first allele is f times the chance that the other allele is F. Since these likelihoods are the same, the total chance of an individual being a heterozygote is 2 times the frequency of one allele times the frequency of the other allele. Thus, the proportions of genotypes are $p^2$ and $q^2$ homozygotes, and $2pq$ heterozygotes.
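
    The same relation can be run backward from the proportion of affected people. The sketch below assumes the population is in Hardy-Weinberg proportions and starts from the 1 in 2500 incidence quoted earlier in this post. The function name is hypothetical.

```python
import math


def genotype_proportions_from_incidence(affected_fraction):
    """Infer allele and genotype proportions from the recessive homozygote fraction.

    Assumes Hardy-Weinberg proportions, so the affected fraction equals q squared.
    """
    q = math.sqrt(affected_fraction)
    p = 1.0 - q
    return {"q": q, "FF": p * p, "Ff": 2 * p * q, "ff": q * q}


cf = genotype_proportions_from_incidence(1 / 2500)
print(f"f allele frequency: {cf['q']:.3f}")
print(f"carriers (Ff): {cf['Ff']:.2%}")
print(f"non-carriers (FF): {cf['FF']:.2%}")
```

    Under that assumption the f allele has a frequency of about 2 percent, which gives roughly 4 percent carriers and about 96 percent FF, in line with the survey figures quoted above.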

    These proportions are called the Hardy-Weinberg proportions, after the British mathematician G. H. Hardy and German physician Wilhelm Weinberg, who independently formulated the relation in 1908. The proportions come from simple probability of sampling copies from a population with given allele frequencies. These proportions are expected to form an equilibrium — that is, they should stay the same over time — as long as the allele frequencies stay the same. When individuals mate without regard to the alleles they carry, every generation of a population should have genotype frequencies in approximately the Hardy-Weinberg proportions.

    The Hardy-Weinberg proportions are important for two reasons. First, the proportions of genotypes in a population may diverge from the expected ones for many reasons, including natural selection, division of populations into different subgroups, or mating that is not entirely random. Comparing the expected and observed proportions of genotypes allows biologists to determine whether these evolutionary forces may be acting on a population.

    The heterozygosity of a population is the expected proportion of heterozygotes, given the allele frequencies in the population.

    Second, the proportions lead to a natural definition of genetic variation in a population: the heterozygosity. A population's heterozygosity is the expected proportion of heterozygotes from the Hardy-Weinberg formula, 2pq. Two populations may be compared by their heterozygosities: the one with the higher heterozygosity has a higher chance that any single individual will have two different alleles, which means the population is genetically more variable. Variation is a consequence of evolutionary history, including the patterns of selection and genetic drift, and the amount that individuals have moved from one population to another in the past. Thus, the Hardy-Weinberg proportions give an important way to study the evolution of populations over time.

    Study questions: 
    1. Suppose that the frequency of one allele is 0.3 (30%). Use the Hardy-Weinberg formula to predict the proportion of homozygotes in the population with two copies of that allele.
    2. Consider a population with two alleles, with frequencies 0.4 and 0.6. What is the heterozygosity of the population?
    3. How could a population have many different alleles but still have a low heterozygosity?
  • Blueprints and recipes

    Tue, 2011-05-17 08:30 -- John Hawks

    Greg Mayer has a post on preformationism and epigenesis on the Why Evolution Is True blog: "Development is epigenetic".

    He later quotes Richard Dawkins in a similar light, but I'm linking because of Mayer's own useful synopsis of the blueprint analogy versus the recipe analogy for development.

    Preformationism, though wrong, is frequently reinforced by the common (though badly mistaken) practice of referring to DNA or the genome as a “blueprint” for the organism. It is of course no such thing. A blueprint is a two dimensional representation of a three dimensional object. There is, in a blueprint, a scaled representation of all the parts of the object. We can tell, for example, that the window on the second floor is 4 m above and 2m to the left of the door. There is nothing like that in your DNA: there isn’t a gene for your left eye, which is a scaled distance away from the gene for your right eye. Your DNA (and your development) is much more akin to a recipe. In a raisin cake recipe, there isn’t a line in the recipe that says place a raisin 2 cm in from the upper left hand corner (there would be, if we had a blueprint for the cake). Rather, if you combine the right ingredients, in the right sequence, in the right environment, the result is a cake with raisins distributed through it at a certain density.

    In the end both these analogies entail some mechanism. A blueprint needs some past mechanism capable of producing an iconic representation of the final object. A recipe needs some mechanism capable of recording a sequence of steps. Neither of those is impossible to evolve (Mayer briefly mentions the iconic nature of the arrangement of Hox genes), but it's pretty clear that the blueprint analogy does not apply to most developmental processes.

    I was thinking about this issue in light of the nativist and learning theoretic views of language development. In that problem, the question is about the locus of the recipe -- did evolution lay down special instructions for language learning, or does the language environment contain most of the structure necessary for children to learn without special instructions beyond those used for learning many other kinds of behavior? Chomsky argued that language environments cannot in principle supply the necessary structure, so biology must have done so ("Language and spandrels"). But he was essentially preformationist in this position, even to the extent of denying that language could have evolved. He instead preferred to see language as a side-effect of other evolutionary processes, or emerging as a physical principle from humanlike brains.

    Anyway, I'll return to this later. I just wanted to register a note on preformationism and epigenesis in relation to the issue.

  • A large mystery in historical genetics

    Thu, 2011-01-06 21:30 -- John Hawks

    Gina Kolata writes an interesting story about the genetics of a pituitary giant ("New Story Writ by a Giant's DNA"). The individual in question is a man known as the "Irish Giant" who lived in 18th century England. His skeleton was preserved as a curiosity and remains in the collection of the Hunterian Museum.

    [Korbonits] enlisted the help of an expert on ancient DNA, Joachim Burger of Gutenberg University in Mainz, Germany, to extract DNA from the giant’s teeth. She was worried that the DNA might be too degraded to analyze — after all, the giant’s corpse had been boiled in acid and then displayed in a museum for a couple of centuries.

    It turns out to be a relatively low-penetrance Mendelian pituitary tumor, caused by a mutation shared by a few hundred living Irish people. It may occur elsewhere, and it would be interesting to figure out what elements of the genetic background may affect the trait's incidence.
