Synthetic DNA with extra ‘unnatural’ bases

Yesterday at the Radcliffe Symposium on the Present and Future of DNA, I got to hear a lecture by Floyd Romesberg, whose lab has been working to create DNA with two novel “unnatural” bases.

The main idea is that natural DNA, with its four bases read three at a time, is limited to a genetic code of only 4 × 4 × 4 = 64 possible codons. In nature, these 64 include substantial redundancy, so that only 20 amino acids are used to build proteins. But many additional amino acids are chemically possible that organisms do not use in protein manufacture. If biochemists had an easy way of synthesizing proteins using these additional amino acids, they might find therapeutically useful polypeptides or proteins that could never exist in nature. In other words, it would be like taking a standard Lego kit and adding hundreds of new blocks.

Romesberg’s solution is to add two new bases to DNA to enlarge the number of possible codons. Then, synthetic transfer RNA that binds to the new codons would enable the incorporation into protein sequences of any amino acid that can be synthesized.
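To make the arithmetic concrete, here is a minimal Python sketch that enumerates the codons for a four-letter and a six-letter alphabet. The letters X and Y are placeholder labels I have made up for the two unnatural bases, and the count is only a combinatorial ceiling: it says nothing about which of the new codons a real cell could actually use.

```python
from itertools import product

NATURAL_BASES = "ACGT"
# X and Y are hypothetical placeholder labels for the two unnatural bases.
EXPANDED_BASES = NATURAL_BASES + "XY"


def codons(alphabet, length=3):
    """Return every possible codon (triplet by default) over the alphabet."""
    return ["".join(triplet) for triplet in product(alphabet, repeat=length)]


natural = codons(NATURAL_BASES)
expanded = codons(EXPANDED_BASES)

# Codons that contain at least one of the new bases.
new_codons = [c for c in expanded if set(c) - set(NATURAL_BASES)]

print(len(natural))     # 4**3 = 64
print(len(expanded))    # 6**3 = 216
print(len(new_codons))  # 216 - 64 = 152
```

So going from four letters to six more than triples the number of possible codons, and every one of the 152 additions is, in principle, a free slot that a synthetic transfer RNA could be designed to read.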

But there’s a problem: The Watson-Crick base pairing mechanism functions in a way that makes it hard to add new variations on the same theme. Previous attempts to work in possible synthetic nucleotides that use the same manner of hydrogen bonding have failed, mainly because DNA polymerase doesn’t distinguish them well enough from the existing nucleotides.

So Romesberg turned to large-scale systematic testing of hydrophobic (oily) molecules to see if they could be turned into synthetic nucleotides that would work within a natural DNA sequence and still allow DNA polymerase to function. As he explained, he took a “medical biochemistry” approach of systematically testing hundreds of molecules using cheap assays to find the ones that might work. It was a real scientific detective story, and the product is a new pair of synthetic DNA bases that bind in a fundamentally different way from the A-T and G-C pairings of natural DNA, but that still allow DNA polymerase to do its job.

Last year the News-Medical website did a nice interview with Romesberg that helps to give perspective on the research: “Unnatural DNA bases: an interview with Professor Floyd E. Romesberg, The Scripps Research Institute”.

[I]f you look around nature anywhere in the world, in all the diversity that you see, from the lowest, simplest single-celled organisms all the way up to the most complex organisms like you and me, all of the information is encoded in a four-letter alphabet. That’s all the information that nature has to draw upon. That’s all that evolution has to draw upon.
Evolutionary biologists have a way of looking back in time and it appears that, all the way back to the last common ancestor of all life on Earth, it had a four letter alphabet. Going to six letters and showing that it’s possible has conceptual implications for our understanding of information storage in a cell and, therefore, our understanding of what life can be, because the information stored in a genome defines what life can be. I hope it impacts people thinking in that way, and I hope that some evolutionary biologists and people who think about evolution will consider it.
For people who think about the origins of life or why life evolved the way it evolved, this is a rare piece of experimental data that actually addresses that question.

The most recent phase of the research has involved making bacterial cells work with the synthetic DNA sequences. The lab has overcome many challenges along the way, and the story is a fascinating illustration of how science works and of how different the products of evolution are from the products of systematic experimentation by human synthetic biologists.