genome structure

The history of junk DNA explored

T. Ryan Gregory (Genomicron) has been writing a long series of posts looking into the history of junk DNA. He's focusing on what research articles were saying about repetitive and noncoding elements like Alu, LINEs, SINEs, minisatellites and the rest -- both at the time they were discovered and since then.

The series arises from Gregory's irritation about the oft-heard claim that biologists are "discarding the long-held hypothesis that non-coding DNA has no function." For example, here is the conclusion of a post about functional analysis of non-coding DNA in the 1980s:

In other words, there was no real period in which noncoding DNA was dismissed by the scientific community, though there was a much-needed shift away from strictly adaptive interpretations in the 1980s. Some individual researchers ignored noncoding regions, but there is no gap in the literature other than limits on what could be done in a methodological capacity. The "new" view of noncoding DNA as potentially important has been proclaimed regularly for at least as long as the claimed period of neglect between 1980 and 1994.
One wonders just how long we will be told that we have long been neglecting noncoding DNA.

The idea that junk DNA has function, contrary to evolutionists' claims, is also a staple of intelligent design creationists. As Gregory points out in one of his comments, biologists seem to be "getting their information from textbooks rather from the primary literature." As long as they remain ignorant of the history, they will be susceptible to junk claims.

Too many scientists fail to realize that good literature review is just as important as good research design.

The series is called "Quotes of Interest." I really like the idea -- many posts, grouped together, presenting a shotgun view of the literature on a single question. I have a couple of topics that would benefit from this kind of treatment -- and it's a very bloggy way to write!

From 100,000 to 25,000, a tale

Larry Moran has summarized a long history of the changing estimates of human gene number over the last fifty years. The post was prompted by the supposed "surprise" at the current low estimate of human gene number -- only around 25,000 genes, genome-wide.

People who learned about human genetics around the time I did often heard that the total human gene number was estimated at 100,000. Of course, there was no real evidence for that number, only various limiting assumptions. Moran reviews several of the ways that people tried to estimate total gene number, ranging from genetic load arguments to hybridization experiments that attempted to find "unique" versus repetitive DNA fractions.

Here's a sample:

It was about this time that Walter Gilbert made his famous back-of-the-envelope calculation of 100,000 genes in the human genome. This was the estimate that became widely quoted when the human genome project was first proposed. It's interesting to note that Gilbert's estimate was not based on any experimental evidence; indeed, it conflicted with most of the available evidence suggesting far fewer genes. The larger number seemed less threatening to scientists who were worried that we might not have more genes than a fruit fly.

If you ever find yourself needing to tell this story, Moran provides a good starting point.

UPDATE (3/22/2006): Carl Zimmer notes a recent estimate that places the human gene number just above 18,000. Zimmer's post is also highly recommended, especially for its consideration of just what all those genes do:

Today scientists still don't know the function of 5898 genes in the human genome. In other words, over the past six years about 7,000 genes either have been figured out or have vanished into the land of nevermind. That's progress, of a sort. But unknown genes still represent a major slice of the human genome, because the total number of genes has fallen as well. The blue slice in the pie above represents 32.2% of all our known genes. For all the work that has poured into the genome, for all the grand announcements, we still don't have the faintest idea of what about a third of our genes are for.

That's a bit generous; working with functional categories you soon realize that the "function" of most genes is only "known" by observing structural similarities with other known genes. For instance, a gene in humans might share part of its amino acid sequence (a "motif") with a gene in Drosophila that has known effects when mutated. That's pretty indirect knowledge of function, but something like this is all we have for many inferred human genes.

That's what makes life interesting.
