john hawks weblog

paleoanthropology, genetics and evolution

gene regulation

  • Micro-RNA 941

    Sun, 2012-11-25 19:39 -- John Hawks

    John Timmer covers the story of miR-941, a micro-RNA that may influence the expression of genes in human brains, and which appears to have taken on a novel role in our lineage compared to other primates:

    Looking at the region in the human genome that contains miR-941 showed it's an area with a series of repeats of the same sequence, arranged in tandem. Chimps and macaques have similar sequences, but the duplications aren't arranged in a way that allows the production of a hairpin structure. Somewhere after we split off from chimps 6 million years ago, a rearrangement in the area (an event that's common in areas with duplicated sequences) created the human form of miR-941. It was already in place a million years ago, when the Denisovan population branched off.

    But the rearrangements didn't end there, as there have been a series of duplications that created as many as 11 extra copies of miR-941 (the numbers vary in different populations, but average is about six or seven copies in most). The extra copies should help ensure it's expressed at higher levels than it would be otherwise.

    The research was carried out by Hai Yang Hu and colleagues [1] in an open access paper ("Evolution of the human-specific microRNA miR-941"). It deserves a bit more attention than I can give it at the moment, as it is one of a series of recent papers demonstrating human-specific duplications that affect gene expression. It is one of the first cases in which RNA structure and function have been investigated in an ancient genome. The number of copies of miR-941 varies substantially both within and among human populations.

    This passage from the paper is provocative:

    Humans display both increased longevity and increased occurrence of certain forms of cancer compared with both chimpanzees and macaques (ref. 39). It is, therefore, appealing to speculate that emergence of miR-941 enhanced the maintenance of adult stem cell populations, thus supporting longer human lifespan, but rendering human cells more prone to malignant transformation. The role of miR-941 in the regulation of insulin signaling adds support to this notion. The insulin-signaling pathway was consistently implicated in lifespan regulation in many species, including humans. Notably, experimentally verified targets of miR-941 within this pathway include genes directly shown to be involved in lifespan extension in model organisms: IRS1, PPARGC1A and FOXO1 (ref. 40). Furthermore, FOXO1 was linked to extended human longevity.

    Still, I am skeptical of the idea that this molecule had a strong effect on the human phenotype. The greater the network of genes influenced by this micro-RNA, the less likely a massive up-regulation or down-regulation will have a simple phenotypic effect. Most genes that were duplicated or deleted during our evolutionary history probably were free to change because of a lack of fitness effect. Maybe this micro-RNA is an exception -- with a new effect on the human lineage, and extensive variation in copy number within humans. But it seems more likely to me that the variation in miR-941 dosage leads to a minor phenotypic effect across the network of affected genes, not a major directional effect.


  • The ENCODE project and function in the human genome

    Wed, 2012-09-05 15:13 -- John Hawks

    I wanted to find out more about today's publication of the ENCODE catalog and data, and so I turned right away to lead bioinformatician Ewan Birney, who has an excellent blog post about it: "ENCODE: My own thoughts".

    I recommend the whole thing, which is written in an extended Q-and-A format like the one I often use. For people reading science news stories, the most interesting part will probably be the discussion of the claim that a very large proportion of the genome (up to 80%) is functional. Birney's comments put that number into context:

    Q. So remind me which one do you think is “functional”?

    A. Back to that word “functional”: There is no easy answer to this. In ENCODE we present this hierarchy of assays with cumulative coverage percentages, ending up with 80%. As I’ve pointed out in presentations, you shouldn’t be surprised by the 80% figure. After all, 60% of the genome with the new detailed manually reviewed (GenCode) annotation is either exonic or intronic, and a number of our assays (such as PolyA- RNA, and H3K36me3/H3K79me2) are expected to mark all active transcription. So seeing an additional 20% over this expected 60% is not so surprising.

    However, on the other end of the scale – using very strict, classical definitions of “functional” like bound motifs and DNaseI footprints; places where we are very confident that there is a specific DNA:protein contact, such as a transcription factor binding site to the actual bases – we see a cumulative occupation of 8% of the genome. With the exons (which most people would always classify as “functional” by intuition) that number goes up to 9%. Given what most people thought earlier this decade, that the regulatory elements might account for perhaps a similar amount of bases as exons, this is surprisingly high for many people – certainly it was to me!

    Even at 8%, the amount of potential regulatory activity in the genome is very large, and this should factor into the way we study recent human evolution. Birney discusses purifying ("negative") selection as one criterion for identifying functional DNA, but of course substantial functional variation might emerge under random genetic drift of such elements in human populations.

    Also, he writes about the process of inventing a new kind of publication -- "threads" -- which highlight related tracks across a large set of publications. With 30 papers in the current ENCODE publication release, in multiple journals, tracking a single subject would be complicated for anyone. So they tried to help out:

    Threads offer an alternative, lighting up a path through the assembled papers, pointing out the figures and paragraphs most relevant to any of 13 topics and taking you all the way through to the original data. The threads are there to help you discover more about the science we’ve done, and about the ENCODE data. Interestingly, this is something that’s only achievable in the digital form, and for the first time I found myself being far more interested in how the digital components work than in the print components.

    The post has a lot of interesting background information about the ENCODE project, the process of coordinating a project with hundreds of scientists, and the conflicts that arose between ENCODE and groups targeting smaller, narrower subjects related to DNA function.

    UPDATE (2012-09-05): Dan MacArthur has further thoughts about the influence of the publication model in the paper, with its innovative threaded e-structure and the inclusion of a virtual machine which archives many of the computational approaches: "The ENCODE project: lessons for scientific publication". But he adds an additional note related to openness:

    At the same time, it is worth noting the constraints that the standard embargo model of scientific publication have still imposed on the project. Much of the ENCODE data was mature and ready for use 12 months ago, and for those in the know has been a valuable component of functional annotation pipelines. Many of us in the genomics community were aware of the progress the project had been making via conference presentations and hallway conversations with participants. However, many other researchers who might have benefited from early access to the ENCODE data simply weren’t aware of its existence until today’s dramatic announcement – and as a result, these people are 6-12 months behind in their analyses.

    Even though the ENCODE project followed very open data release policies, we still have much progress to achieve on dispersing information rapidly enough to make a difference to researchers outside these big projects.

    Synopsis: A giant project for cataloguing functional gene variation publishes its results.

  • A speaking gene?

    Thu, 2009-10-22 19:00 -- John Hawks

    I'm just going to quote from a press release that fell into my inbox. It's about a talk being given at the American Society of Human Genetics meeting by Raymond Clarke, who identified a gene disrupted in a family sharing a disorder of the vocal tract. They call the gene tospeak:

    The most exciting breakthrough in their research came when Clarke’s group discovered that the tospeak gene was unique to primates. Most of the human genome contains genes that are older (i.e., conserved over generations) and can also be found in other mammals, including the mouse. However, the tospeak gene is a relatively young gene that is only found in primates. Further excitement came when the group discovered that the tospeak gene has a special control region, known as a promoter, which is only found in humans.

    “The discovery that a unique and more powerful human gene/promoter was disrupted in this vocally impaired family is of particular interest to the field of evolutionary genetics, since humans are the only creatures that have developed the capacity to speak,” said Dr. Clarke.

    Clarke provided the following example as a comparison to help explain this new discovery: “Unlike GDF6, a bone protein gene which has existed since the dawn of vertebrate evolution, the tospeak gene is only found in primates. The best indication of the role of tospeak in human vocal development is that it was the only gene disrupted in a large family with a severe vocal disorder, altered composition of the vocal cords, and malformation of the voice box.”

    That is a good story, as described, and I'd say it points strongly to the hypothesis that this gene was a target of selection in Homo for its role in vocal development. But this gene can't be alone, and the appearance of the promoter in humans doesn't necessarily suggest a change in its vocal-specific function. I'd like to know more about the gene's variation in humans, and whether there are other functional polymorphisms of the gene in primates. Vocal anatomy is quite variable, with a few very distinctive outliers.

    At the same time, let's see some expression data -- the gene probably does other stuff, too, and that might be the target of selection.

  • Gene regulation and its evolution

    Mon, 2009-09-21 13:20 -- John Hawks

    I wrote early this week about Hopi Hoekstra's work on pigmentation evolution in mice ("The colors of mice"). The linked article focusing on this empirical work didn't mention her interesting involvement in the debate over the nature and importance of gene regulation as a target of selection.

    I wanted to point out some articles on the topic, by Hoekstra and others, because more and more, paleoanthropological hypotheses are being found to involve the evolution of gene regulation. I'll just note a couple of examples (diet and pigmentation), and hint that there are more coming in the next year.

    If the topic of gene regulatory evolution, or cis-regulation in particular, is obscure, let me recommend a 2008 primer article by Wisconsin geneticist Sean B. Carroll ("Evo-Devo and an Expanding Evolutionary Synthesis: A Genetic Theory of Morphological Evolution"). Carroll is probably the best-known advocate of an "evo-devo" perspective on morphological evolution, and in particular the hypothesis that most morphological evolution may be explained by changes to cis-regulation -- "self-acting" sequence elements that affect gene transcription, such as promoter or enhancer elements.

    In a 2007 commentary, Hoekstra and Jerry Coyne presented a critique of the idea that cis-regulation is a central mechanism of adaptation. Here's a quote from their conclusion (Hoekstra and Coyne 2007:1006):

    While the study of cis-regulatory evolution is an important endeavor, justifiably championed by [evo-devotee Sean B.] Carroll and others, our survey of the theory and empirical data shows that the widespread enthusiasm for the importance of cis-regulatory change in evolution is at best premature. Analyzing the verbal theory, one finds no compelling reason to draw a distinction between the genetic basis of anatomical versus physiological evolution. Nor is there good reason to accept the a priori argument that—for either anatomy or physiology—changes in cis-regulatory genes are more likely to be fixed in evolution than are changes in the coding region of genes.

    Everyone agrees that changes in cis-regulation, trans-regulation and good old-fashioned changes to protein sequences may all be selected, and there are examples of each being involved in the evolution of new adaptive phenotypes. So at that level, the theoretical disagreement is relatively sterile -- all of them are possible and cases are known for each.

    The question is whether any of them accounts for a preponderance of adaptive evolution. Is there anything special about cis-regulation, or any other kind of change? Are they coequal? Do they occur in proportion to the number of regulatory elements, amino acid-coding positions, or gene duplications? Do any of them release constraints on adaptive changes, allowing more rapid evolution?

    Why does anybody care? Well, there is a mercenary answer: They all have their own empirical research agendas to look out for, and some of them work mainly with experimental models and techniques effective for studying cis-regulation, others on trans-regulation and still others on classical polymorphisms. To me, these are totally boring topics, since I'm not hoping for any funding to do molecular work on gene regulation. Hopefully, the funding conflict will become less important as genomic methods get cheaper. Of course, when it's no longer difficult to find out the answers, we'll have a decent survey of empirical cases!

    Search strategies. A second explanation for why we should care is a practical one. Now that we are able to get genomes from any species we like, the question arises: what is a sensible strategy for forming hypotheses about adaptive (and non-adaptive) change? What should we be looking for?

    Coyne and Hoekstra wrote a 2007 perspective on an article about amylase adaptive evolution in human populations. They returned to the issue of whether we should expect a predominant target of adaptive change: cis-regulatory or so-called "structural" mutations to coding sequences.

    The amylase results [showing adaptive change in gene regulation by duplication] follow a related study on the genetics of human dietary differences. In 2006, Tishkoff and colleagues [11] identified a mutation in the upstream regulatory region of the gene for lactase, an enzyme important for digesting milk, in pastoral African populations. Using an in vitro system, they showed that this mutation could increase gene expression. The relevant mutation, however, is not a duplication, but probably a change in cis-regulation. (An independent cis-regulatory mutation at this locus, also conferring lactose tolerance, was identified earlier in European populations [12].)

    Even in the simplest cases of adaptation, then — increased enzyme production to handle new diets — evolution works in multiple ways. Obviously, no amount of a priori speculation will tell us which sorts of mutations will be important; the answer, unfortunately, requires meticulous, case-by-case analysis of putative adaptations.

    Humans may be a poor model organism for considering this question. For one thing, we have a large store of loss-of-function mutations that have been selected for resistance to disease. The same thing probably occurs in other species, but the exceptional number of new diseases in humans may tilt the scales in favor of "structural" mutations.

    Still, these diet-related examples show pretty clearly that multiple mechanisms of gene regulation may be targets of recent selection. That's also evident when we consider human pigmentation variation, a system that is relatively well-understood now from a genetic perspective, for the same reason that Hoekstra's deer mice pigment variations are tractable. In a genome-wide context, it now looks like cis-regulation has been a frequent target of recent adaptive evolution (Kudaravalli et al. 2009), but most of the well-studied examples of recent adaptive change are amino-acid coding.

    The breadth of pleiotropy. Pleiotropy ought to impede adaptation. If genes are solving multiple problems -- by interacting with distinct functional networks -- then changes that make one function better may often make others worse. If genes interact widely enough, then optimization becomes extremely difficult or impossible -- what Stuart Kauffman (1993) called a "complexity catastrophe." Make a system complicated enough, and the probability falls to nil that a random change might improve it.
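    A toy simulation can make this intuition concrete. The sketch below uses Fisher's geometric model, which is a stand-in chosen for brevity and is not Kauffman's NK model: an organism sits at distance 1 from an optimum in a space of `n_traits` dimensions, each mutation is a step of fixed length in a random direction, and a mutation counts as beneficial if it lands closer to the optimum. The function name, step size, and trial count are all assumptions made for illustration.

```python
import math
import random


def fraction_beneficial(n_traits, step=0.5, trials=5000, seed=0):
    """Estimate the share of random mutations that move closer to the optimum.

    Assumed toy setup (Fisher's geometric model): the optimum is the origin,
    the starting phenotype is at distance 1, and every mutation has length
    `step` in a uniformly random direction across all `n_traits` traits.
    """
    rng = random.Random(seed)
    start = [1.0] + [0.0] * (n_traits - 1)
    better = 0
    for _ in range(trials):
        # A vector of independent normal draws, once normalized,
        # points in a uniformly random direction.
        direction = [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
        norm = math.sqrt(sum(d * d for d in direction))
        mutant = [s + step * d / norm for s, d in zip(start, direction)]
        # Beneficial if the mutant is closer to the optimum than the start.
        if sum(m * m for m in mutant) < 1.0:
            better += 1
    return better / trials


if __name__ == "__main__":
    for n in (1, 2, 10, 50):
        print(n, fraction_beneficial(n))
```

    Under these assumptions, the beneficial fraction is about one half when a mutation touches a single trait and shrinks steadily as every mutation is forced to affect more traits at once.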

    If evolution were generally mutation-limited in this way, you might well expect to see a highly modularized system of gene regulation evolve, and that's precisely the argument for the cis-regulatory evo-devo model. I don't have an opinion on the general question of how often adaptations may make use of this modular system of regulation as opposed to trans-regulation, duplication, or straight-on coding substitutions. It seems like toolkit genes, which are both highly conserved and strongly pleiotropic, may have evolved by altering cis-regulation more often than other means. Those genes have highly modularized "cassettes" of cis-regulatory elements that control their expression in different contexts. One regulatory element can change without necessarily impeding the function of the gene in other contexts.

    Empirical pigmentation research helps to illuminate the dispersal (and limits to dispersal) of recently selected mutations. That makes it a very relevant model system for understanding recent human evolution. Many human (and Neandertal) mutations to MC1R are trans-regulatory -- by altering the sequence of the hormone receptor, these mutations downregulate the pathway that converts pheomelanin to eumelanin. The nature of this regulatory change is structural -- it's an actual change in the gene product that affects pigmentation.

    From the deeper perspective of paleoanthropology, the evolution of form is the central topic. How did the distinctive conformation of the bipedal pelvis evolve? Some paleoanthropologists have already laid out scenarios in which morphological evolution took a small number of very broad changes -- pelvis, spine, and femora as integrated units that may have had correlated effects on arms and other morphological structures. Such hypotheses make the background assumption of strong pleiotropy on a hard-to-explore adaptive landscape. If human evolution was a product of a few hard-to-get mutations, which might not have happened at all, then our emergence was contingent on a series of unlikely events.

    Pleiotropic constraints may help to explain why we see a pulse of rapid adaptive evolution in humans along with population growth. Adaptive mutations rarely appeared during the earlier Pleistocene, so many possible changes were mutation-limited. If certain kinds of regulatory changes were very easily evolvable, then we might expect to see a different pattern of recent evolution.

    Still, most folks don't think very much about pleiotropy as a constraint on human evolutionary change. The "one gene, one trait" model is really universal out there. Certainly, if you ask people, they'll give you the textbook answer -- genes have more than one function; phenotypes are influenced by many genes. But I can't tell you how many times I've heard people refer to "skin color genes" as if they did nothing else.

    References:

    Carroll SB. 2008. Evo-Devo and an Expanding Evolutionary Synthesis: A Genetic Theory of Morphological Evolution. Cell 134:25-36. doi:10.1016/j.cell.2008.06.030

    Coyne JA, Hoekstra HE. 2007. Evolution of protein expression: New genes for a new diet. Curr Biol 17:R1014-R1016. doi:10.1016/j.cub.2007.10.009

    Hoekstra HE, Coyne JA. 2007. The locus of evolution: Evo devo and the genetics of adaptation. Evolution 61:995-1016. doi:10.1111/j.1558-5646.2007.00105.x

    Kauffman SA. 1993. The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, New York.

    Kudaravalli S, Veyrieras J-B, Stranger BE, Dermitzakis ET, Pritchard JK. 2009. Gene expression levels are a target of recent natural selection in the human genome. Mol Biol Evol 26:649-658. doi:10.1093/molbev/msn289

  • The colors of mice

    Mon, 2009-09-14 14:28 -- John Hawks

    Science this week features an article by Elizabeth Pennisi about the research of evolutionary biologist Hopi Hoekstra. She studies pigment variations in wild mice.

    Pigment clines have become an interesting model field study, because we now understand the molecular pathway of melanin production. With a dozen or so genes influencing natural pigment variations in mammals, it's a complex enough system that selection on color will lead to different genetic outcomes. That means we can look at parallelism in many natural cases to understand the evolutionary dynamics.

    Hoekstra and her team are part of a genomics explosion in natural history studies. "This is an example of work ... merging the ‘green’ and ‘white’ side of biology, in which we learn about trait evolution from the biochemical levels within cells to how those traits are selected for or against in natural populations," says Hans Ellegren, an evolutionary biologist at Uppsala University in Sweden. Mark McKone, a biologist at Carleton College in Northfield, Minnesota, agrees: The work "could be a model for how to approach evolution in the postgenomic period," when genetic information and tools are more readily available.

    A couple of weeks ago, Hoekstra's lab had a research paper illuminating pigment evolution among the deer mice of the Nebraska Sand Hills: "On the origin and spread of an adaptive allele in deer mice." I like this example a lot, because the color variation in the Sand Hills is clearly postglacial -- this was not an agreeable Like the evolution of stickleback varieties in British Columbia, it's a good example of rapid selection on new colonists.

    Multiple lines of evidence suggest that the wideband allele arose de novo and was not a preexisting allele. First, the haplotype carrying the deletion has greatly reduced variation relative to the wild type (Fig. 4A), which is inconsistent with a model in which the causative mutation was neutrally segregating in the population before any selective pressure (38). Second, the U-shaped SFS [site frequency spectrum] is most consistent with a model in which selection acts on a newly arising mutation (fig. S3). If the beneficial mutation existed on multiple haplotypes before the selective pressure, we would expect to see an excess of intermediate frequency mutations, which was not observed (38). Finally, the posterior probability density of the allele age falls entirely within the estimated age of the Sand Hills (Fig. 4B).

    Taken together, our results demonstrate that variation at the Agouti locus is responsible for adaptive coloration in deer mice living on the Nebraska Sand Hills.
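    For readers unfamiliar with the term, a site frequency spectrum simply counts how many variable sites carry the derived allele in 1, 2, ..., n-1 of the n sampled chromosomes. Here is a minimal sketch with made-up haplotypes; the data and the function name are hypothetical and are not taken from Linnen et al.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Return the unfolded SFS for equal-length strings of '0' and '1'.

    '0' marks the ancestral allele and '1' the derived allele. Entry k-1 of
    the result is the number of sites where the derived allele appears on
    exactly k of the n haplotypes, for k = 1 .. n-1. Sites that are fixed
    (all '0' or all '1') are not variable and are skipped.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    counts = Counter()
    for site in range(n_sites):
        derived = sum(h[site] == "1" for h in haplotypes)
        if 0 < derived < n:
            counts[derived] += 1
    return [counts[k] for k in range(1, n)]


# Hypothetical sample: 5 chromosomes, 6 sites.
toy_haplotypes = [
    "101001",
    "011101",
    "001101",
    "001101",
    "000111",
]

print(site_frequency_spectrum(toy_haplotypes))  # prints [3, 0, 0, 2]
```

    In this toy sample the counts pile up at the two ends, with rare and nearly fixed derived alleles and nothing in between, which is the U shape the authors describe.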

    There are still details to be worked out in this example -- at the biochemical level, how does this allele cause lighter color? Can we say more about the spatial dynamics of the allele after it originated? But what I really like is that it shows obvious parallels to adaptive variations in humans that have recently been selected. It stands out in mice because Hoekstra is out there looking for pigment variations. Postglacial environments are one example in nature where recent environmental changes have generated new selection pressures. Of course, humans have induced new pressures on ourselves by means of massive cultural change.

    Anyway, I thought it was worth pointing out that the articles go together as a set. I'll follow up with a second post on Hoekstra's take on developmental biology.

    References:

    Linnen CR, Kingsley EP, Jensen JD, Hoekstra HE. 2009. On the origin and spread of an adaptive allele in deer mice. Science 325:1095-1098. doi:10.1126/science.1175826

    Pennisi E. 2009. How beach life favors blond mice. Science 325:1330-1333. doi:10.1126/science.325_1330
