Unnatural base pairs, their unnatural organisms, and the future

Cody Richardson (richcr02@students.ipfw.edu) 
Major: BIOL

Alyssa Robison (alyssarobison@aol.com) 
Major: PHYS



A research team at the Scripps Research Institute in California has engineered a functional organism coded by a genetic alphabet that includes unnatural base pairs. This review calls into question the functionality and scientific importance of this research in relation to its efficiency, cost-effectiveness, and application to the central goal of scientists: to explain and understand the natural world.


Recent research has aimed to test the limiting factor of the genetic code: its four-nucleotide alphabet (Primer 1). If four nucleotides coding for 20 amino acids are capable of producing life as we know it, what will additional nucleotide bases offer to living organisms? The idea of introducing synthetic nucleotides was first suggested by Alex Rich in 1962, though the topic did not gain much popularity until the early 21st century (Rich, 1962). Having mastered the ability to organize natural nucleotides into man-made sequences and create a functional bacterial cell in 2010, scientists are now moving toward incorporating synthetic nucleotides (Gibson et al., 2010). Within the past decade, multiple experiments have succeeded in introducing unnatural base pairs (UBPs) with structures similar to those of natural base pairs, but many have run into issues, mostly related to replication and functionality (Kimoto et al., 2013; Hwang & Romesberg, 2006; Leconte et al., 2006; Matsuda et al., 2007; Malyshev et al., 2014; Hirao et al., 2011; Li et al., 2013; Seo et al., 2009). Only one group has overcome these obstacles and produced a living organism that incorporates UBPs into its genome (Malyshev et al., 2014). Their research, along with its possible future implications, is the main topic of this review.



Analysis of Major Article:

Malyshev et al., 2014: A semi-synthetic organism with an expanded genetic alphabet

Over 30 different UBPs have been successfully added to DNA strands and have survived the PCR process, making them excellent candidates for expanding the genetic alphabet (Malyshev et al., 2014; Hirao, Kimoto, & Yamashige, 2011; Xie & Schultz, 2006; Davis & Chin, 2012) (Figure 1). The two best-performing and most studied unnatural nucleotides are called dNaM and d5SICS, and the pair they form has been successfully retained within a functional, living organism created by researchers in California (Malyshev et al., 2014) (Figure 2). The two bases fit within the double helix much as a natural pair such as Guanine and Cytosine does, but unlike the natural pairs they associate through hydrophobic and packing forces rather than hydrogen bonds. This compatibility with the structure of natural base pairs is absolutely necessary, as is their stability in various cell environments, and was based upon research explained in Primer 2. The synthesized dNaM-d5SICS base pair is successfully transcribed in vitro, and the unnatural nucleotide bases are retained in the DNA even after excision and proofreading mechanisms take place (Malyshev et al., 2014).


In this particular study, the researchers focused on utilizing and maintaining the unnatural nucleotide bases in a living organism, Escherichia coli, as opposed to finding or describing UBPs. More information on this process is discussed in Primer 2. Their research centered on the obstacles researchers face when introducing these synthetic letters of the genomic alphabet, and on finding solutions to them: how to make unnatural nucleotides available to the organism, how to ensure that endogenous polymerases recognize and recruit UBPs, and how to make the base pairs stable enough to survive in all conditions of the E. coli bacterium.

The team highlighted the need for nucleotide triphosphate transporters (NTTs) to introduce the unnatural nucleotides into the cytoplasm of E. coli so that the unnatural bases could be successfully recruited during replication. NTTs allow researchers to provide the synthetic nucleotides in the medium surrounding the organism so that they are available to the organism at all times. Eight NTTs collected from various bacteria and algae were tested as possible transporters for the unnatural nucleotides because of their level of activity and their broad specificity. The two that were able not only to bind the unnatural nucleotides but also to carry them across membranes were from Phaeodactylum tricornutum (PtNTT2) and Thalassiosira pseudonana (TpNTT2) (Malyshev et al., 2014). These NTTs made it possible for the unnatural nucleotides to be brought into the cell so long as they were stable in the medium around the cell. To ensure that stability, the group added potassium phosphate to the extracellular environment. This approach was hypothesized after finding that the natural nucleotides, A, G, C, and T, were more stable in the presence of excess phosphate. The researchers found the half-lives of the unnatural triphosphates to be about 9 hours in the presence of excess phosphate, as opposed to 3 hours without it. After 30 minutes, intracellular concentrations of the unnatural triphosphates were determined to be well above their KM values for DNA polymerases, which indicates that, if DNA polymerase is capable of recognizing and recruiting them at this concentration, the UBP could be replicated in a living bacterial cell (Malyshev et al., 2014).
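
The practical effect of the two half-lives reported above can be illustrated with a short calculation. This is a minimal sketch that assumes the unnatural triphosphates break down by simple first-order (exponential) decay, which is our assumption and not something stated in the study as summarized here; the function name `fraction_remaining` is also ours.

```python
def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of the starting material left, assuming first-order decay."""
    return 0.5 ** (hours_elapsed / half_life_hours)


# Half-lives taken from the text: about 3 h without and about 9 h with excess phosphate.
conditions = [("without added phosphate", 3.0), ("with excess phosphate", 9.0)]

for label, half_life in conditions:
    left = fraction_remaining(9.0, half_life)
    print(f"After 9 h {label}: {left:.1%} of the triphosphate remains")
```

Under this assumption, half of the triphosphate is still available after 9 hours with excess phosphate, compared with one eighth without it.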

Now that the unnatural nucleotides were available and stable, the researchers had to ensure that the UBP was being successfully replicated. They had previously replicated the UBP in vitro using DNA polymerase I, but most of the E. coli genome is replicated by DNA polymerase III. To compensate, the team engineered a plasmid housing a DNA sequence coding for polymerase I, strategically placed downstream of the ColE1 origin of replication so that polymerase I would be the main mediator of leading- and lagging-strand synthesis. An informational plasmid (pINF) incorporating d5SICS was also synthesized to help determine whether the pINF recovered from the cells had been replicated in vivo or was the original synthetic material.

The researchers measured UBP levels over fifteen hours of growth and showed that the base pairs were retained at levels of up to 99.8%, meaning that they were not excised by DNA proofreading functions. The experiments were repeated with growth continued for 6 days. The UBPs were still retained at high levels, and when they were finally lost, it was due to replication-mediated mis-pairing that replaced them with dA-dT and not to the repair pathway. This work demonstrates the first organism that can function with three sets of base pairs, opening the door for future scientists to explore synthetic organisms that use more than one unnatural base pair.
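
To see why a retention figure close to 100% still matters over long growth periods, consider how small losses compound. The sketch below assumes, purely for illustration, that the 99.8% figure applies to each cell doubling and that losses are independent from one doubling to the next; the text above does not specify this, and the function name `cumulative_retention` and the doubling counts are hypothetical.

```python
def cumulative_retention(per_doubling_retention, doublings):
    """Fraction of plasmids still carrying the UBP after a number of doublings,
    assuming the same independent retention probability at every doubling."""
    return per_doubling_retention ** doublings


# 0.998 is the figure quoted in the text; the doubling counts are illustrative only.
for doublings in (1, 10, 24):
    retained = cumulative_retention(0.998, doublings)
    print(f"{doublings:>2} doublings: {retained:.3f} of plasmids retain the UBP")
```

Even a small per-doubling loss accumulates, which is consistent with the observation that the UBP was eventually lost during extended growth.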

Implications for the future

A compilation of literature in the field of UBPs has offered a host of possibilities that stem from the successful use and incorporation of UBPs. Based on our understanding of the research gathered, we believe that the most exciting possibilities come from the increased stability found in UBPs, their potential role in incorporating new amino acids into proteins, and the insight they offer into the origins of nucleotides in the primordial soup of early Earth. After much more research has been performed, consideration should be given to the potential energetic efficiency of synthetic proteins as well as how synthetic proteins may be used to treat certain human diseases.

UBP Stability

As a product of the increased research surrounding UBPs, scientists have found that increased size, dipole moment, and polarity are associated with greater stability of UBPs (Hwang & Romesberg, 2006). Finding UBPs with all of these qualities that still meet the requirements for replication and protection from excision could mean that scientists will find UBPs whose stability far surpasses that of natural base pairs. This could mean that strands of this synthetic DNA may be more resistant to damage in extreme conditions than natural DNA, and that organisms with synthetic DNA may be better informational warehouses than their natural counterparts. In addition, research has shown that, in some cases, UBPs are sufficiently stable even when paired with the incorrect base, meaning that in the event that a UBP is mis-paired, it will not be lost from the desired template strand (Hwang & Romesberg, 2006; Seo, Matsuda, & Romesberg, 2009; Malyshev et al., 2014). These findings may have major implications for researchers working on storing information in the form of DNA. DNA that is not only more stable but also offers additional alphabetic combinations may outperform the DNA structures currently in use by storing more information in a more resilient structure (De Silva & Ganegoda, 2016). Another finding regarding UBP stability shows that they are more stable in various media than their natural base pair equivalents (Malyshev et al., 2014). This could mean that synthetic organisms would be able to live in conditions as extreme as, if not more extreme than, those tolerated by the extremophiles we are currently familiar with, so long as they have access to their unnatural base pairs. This could also mean that incorporating UBPs into currently known genomes may increase the tolerance of certain organisms to different environmental extremes.
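
The claim that a larger alphabet can store more information per position can be made concrete. The following is a rough sketch of the theoretical upper bound only: it assumes every base is equally usable at every position and ignores error correction and any sequence constraints a real storage scheme would impose. The function names `bits_per_base` and `bases_needed` are ours.

```python
import math


def bits_per_base(alphabet_size):
    """Maximum information carried by one position of the alphabet, in bits."""
    return math.log2(alphabet_size)


def bases_needed(n_bits, alphabet_size):
    """Fewest positions that can encode n_bits under the ideal bound."""
    return math.ceil(n_bits / bits_per_base(alphabet_size))


payload_bits = 8192  # one kibibyte of data, chosen only as an example

for size in (4, 6):
    print(f"{size}-letter alphabet: {bits_per_base(size):.3f} bits per base, "
          f"{bases_needed(payload_bits, size)} bases for {payload_bits} bits")
```

Under these idealized assumptions, a six-letter alphabet carries about 2.58 bits per base instead of 2, so the same payload needs roughly a quarter fewer bases.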

UBPs and Unnatural Amino Acids

The second important implication of the compiled research seems to be limitless: UBPs allow for hundreds of opportunities for unnatural amino acid inclusion in proteins (Xie & Schultz, 2006). The incorporation of UBPs creates new codon combinations that can be used in conjunction with synthetic tRNA synthetases to add unnatural amino acids into functional proteins (Figure 3). Unnatural amino acids have been shown to serve multiple functions for scientists and for the organisms themselves, but they currently depend on the natural amino acid sequence made from the native DNA strand and must often be placed near the N or C terminus of a protein (Xie & Schultz, 2006; Antonczak, Morris, & Tippmann, 2011; Davis & Chin, 2012; Kimoto, Hikida, & Hirao, 2013; Li et al., 2013). Now that UBPs are being replicated and maintained within living organisms, additional codon combinations could ease these restrictions and allow for more in-depth study of proteins and their functions.

Unnatural amino acids have been used as spectroscopic probes, posttranslational modifications, metal chelators, and photo-affinity labels. The use of these unnatural amino acids has allowed scientists to study cofactors and posttranslational modifications of proteins by improving visibility for tracking (Xie & Schultz, 2006; Antonczak, Morris, & Tippmann, 2011; Davis & Chin, 2012; Kimoto, Hikida, & Hirao, 2013; Li et al., 2013). These unnatural amino acids have been incorporated using the codon UAG, the least used stop codon in E. coli, and a modified native tRNA structure. Scientists have also attempted to use four-nucleotide codons as another route to unnatural amino acid inclusion, with some success (Xie & Schultz, 2006). However, both of these mechanisms limit the cell to using only one codon for unnatural amino acids, meaning that only one type of unnatural amino acid may be encoded in any given DNA strand.


The inclusion of UBPs allows each of the desired unnatural amino acids to be paired with a unique codon not previously associated with a native amino acid, meaning that the native codons and tRNA do not need to be modified to “forget” the native amino acid in order for the unnatural amino acid to be incorporated. This also allows multiple types of unnatural amino acids to be added to a protein at the same time. Scientists may also find ways to include these unnatural codons in functional areas of proteins, as opposed to being restricted to their N and C termini.
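
The number of new codons a third base pair makes available can be counted directly. The sketch below enumerates all three-letter codons over the natural DNA alphabet and over a six-letter alphabet; the letters "X" and "Y" are placeholders we chose to stand in for dNaM and d5SICS. The counts are combinatorial upper bounds only and say nothing about which codons a cell could actually decode.

```python
from itertools import product

NATURAL = "ACGT"
UNNATURAL = "XY"  # placeholder letters standing in for dNaM and d5SICS
EXPANDED = NATURAL + UNNATURAL


def codons(alphabet, length=3):
    """All possible codons of the given length over the alphabet."""
    return ["".join(letters) for letters in product(alphabet, repeat=length)]


natural_codons = codons(NATURAL)
expanded_codons = codons(EXPANDED)

# Codons containing at least one unnatural base have no native amino acid assigned.
new_codons = [c for c in expanded_codons if set(c) & set(UNNATURAL)]

print(len(natural_codons))   # 64
print(len(expanded_codons))  # 216
print(len(new_codons))       # 152
```

In principle, each of the 152 codons containing an unnatural base could be assigned without reassigning any of the 64 native codons.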

UBPs and the Origins of Natural Base Pairs

Researchers have already used UBPs and the information gained from synthesizing them to describe the possible origins of the natural base pairs (Hirao, Kimoto, & Yamashige, 2012). By finding what does and does not work when producing synthetic UBPs, scientists are able to hypothesize what could and could not have happened to create nucleotides and their complementary base pairs in the primordial soup of early Earth, as well as whether our current genetic alphabet is advantageous over a 6- or 8-base system.

Scientists have a large gap to fill when it comes to predicting how RNA and DNA were synthesized in the primordial soup. A large question remains unanswered: how did nucleotides come about? The research conducted on UBPs showed that to make a complementary base pair, it was almost necessary to make both bases at the same time and with complete precision. This suggests that it was unlikely that nitrogenous bases floating around the primordial soup just so happened to be complementary to each other, so researchers have suggested a hypoxanthine alternative (Figure 4) (Hirao, Kimoto, & Yamashige, 2012). Researchers studying the primordial soup of early Earth found a naturally occurring nitrogenous compound, hypoxanthine, with a structure similar to that of the nucleotide bases Adenine and Guanine. Via a simple amination reaction under relatively neutral conditions, hypoxanthine is transformed to Adenine. In addition, hypoxanthine has the ability to hydrogen bond to both Thymine and Cytosine by tautomerism between the keto and enol forms, suggesting that hypoxanthine may have been a primer for Adenine and Guanine to become their complementary bases, a hypothesis largely ignored until research in UBPs flourished.

Various research teams have theorized that the 4-nucleotide alphabet has persisted for millions of years because it is the most efficient and allows for enough diversity with the fewest necessary bases (Hirao, Kimoto, & Yamashige, 2012). This smaller number may increase processing speed, as there are not as many bases to sort through, and may also save space within cells for other necessary molecules. Yet other researchers disagree, stating that, according to complex mathematical and computational analyses of processing speed and genetic diversity, it would have been more beneficial to have a genetic code based on 6 or 8 nucleotides (Hirao, Kimoto, & Yamashige, 2012). The first organism synthesized using UBPs could provide insight into which hypothesis is correct.

UBPs and Enzymes

Though we did not review literature that presents this theory, we believe that synthetic nucleotides, amino acids, and proteins should also be studied with regard to their potential for more energetically favorable reactions via enzymes and the possibility of enzyme replacement in humans with enzyme-based disorders.

If synthetic proteins were designed to carry out the function of natural enzymes but their unnatural amino acids allowed for increased stability of the transition state of the reaction, a synthetic enzyme could be even more energetically efficient than the enzymes used currently. Even if the enzyme were only slightly more efficient, multiple efficient reactions taking place in a small area may create energetic changes that impact the efficiency of the entire cell or organism.

UBPs and Synthetic Proteins for Treating Disease

Another possibility not discussed in the reviewed literature is the creation of proteins modified for the treatment of specific diseases. Disorders that arise from a deficiency in enzymes could be treated by introducing a more efficient or more stable synthetic version of the missing natural enzyme. In addition, disorders caused by the buildup of certain compounds may be treated with a synthetic enzyme that is capable of degrading the buildup while leaving the rest of the body unharmed. Both of these applications would require an in-depth understanding of the chemical reactions required and of the unnatural amino acids capable of performing these reactions. They would also require an extremely specific structure to encase the reactive amino acids before they reach their desired destination.

More efficient synthetic proteins may also be used to supplement enzymes used during basic metabolism. Those interested in a new weight-loss drug may be able to create a synthetic protein that burns through nutrients at an increased rate. Such a protein might also be used to reduce the storage of excess energy as fat.

Unfortunately, these types of applications will likely require much more research on the role of UBPs, unnatural amino acids, and unnatural proteins, as they would require scientists to create extremely specific structures with perfectly placed reactive functional groups.


Unnatural base pairs were first hypothesized over 50 years ago and have only been made possible after chemists successfully added functional group tags to nucleotides that withstood replication, the excision process, and various cellular environments, suggesting that bases with excess functional groups could be incorporated into the sequence. Through trial and error, scientists found what was required of a UBP: aromatic structures with functional groups that are capable of hydrogen bonding to their complementary base. Once scientists were able to make UBPs available to cells during replication and ensure that the UBPs weren’t removed during the excision process, synthetic life was born. Synthetic life is just one of the possibilities that arise from UBPs, and the others seem just as promising.

Works Cited

  1. Malyshev, D. A., Dhami, K., Lavergne, T., Chen, T., Dai, N., Foster, J. M., Corrêa, I. R., and Romesberg, F. E. (2014). A semi-synthetic organism with an expanded genetic alphabet. Nature, 509, 385-401.
  2. Rich, A. (1962). On the problems of evolution and biochemical information transfer. In: Kasha, M. and Pullman, B., editors. Horizons in Biochemistry. Academic Press; New York, NY, USA: pp. 103–126.
  3. Gibson, D. G., Glass, J. I., Lartigue, C., Noskov, V. N., Chuang, R. Y., Algire, M. A., Benders, A., Montague, M. G., Ma, L., Moodie, M. M., Merryman, C., Vashee, S., Krishnakumar, R., Assad-Garcia, N., Andrews-Pfannkoch, C., Denisova, E. A., Young, L., Qi, Z. Q., Segall-Shapiro, T., Calvey, C. H., Parmar, P. P., Hutchison, C. A., Smith, H. O., and Venter, J. C. (2010). Creation of a Bacterial Cell Controlled by a Chemically Synthesized Genome. Science, 329, 52-56.
  4. Kimoto, M., Hikida, Y., and Hirao, I. (2013). Site-Specific Functional Labeling of Nucleic Acids by In Vitro Replication and Transcription using Unnatural Base Pair Systems. Israel Journal of Chemistry, 53, 450-468.
  5. Hwang, G. T. and Romesberg, F. E. (2006). Substituent effects on the pairing and polymerase recognition of simple unnatural base pairs. Nucleic Acids Research, 34, 2037-2045.
  6. Xie, J. and Schultz, P. G. (2006). A chemical toolkit for proteins- an expanded genetic code. Molecular Cell Biology: Perspective. 7, 775-782.
  7. Leconte, A. M., Matsuda, S., and Romesberg, F. E. (2006). An Efficiently Extended Class of Unnatural Base Pairs. Journal of American Chemical Society, 128, 6780-6781.
  8. Matsuda, S., Fillo, J. D., Henry, A. A., Rai, P., Wilkens, S. J., Dwyer, T. J., Geierstanger, B. H., Wemmer, D. E., Schultz, P. G., Spraggon, G., and Romesberg, F. E. (2007). Efforts toward Expansion of the Genetic Alphabet: Structure and Replication of Unnatural Base Pairs. Journal of American Chemical Society, 129, 10466-10473.
  9. Seo, Y. J., Matsuda, S., and Romesberg, F. E. (2009). Transcription of an Expanded Genetic Alphabet. Journal of American Chemical Society, 131, 5046-5047.
  10. Li, L., Degardin, M., Lavergne, T., Malyshev, D. A., Dhami, K., Ordoukhanian, P., and Romesberg, F. E. (2013). Natural-like Replication of an Unnatural Base Pair for the Expansion of the Genetic Alphabet and Biotechnology Applications. Journal of American Chemical Society, 136, 826-829.
  11. Hirao, I., Kimoto, M., and Yamashige, R. (2011). Natural versus Artificial Creation of Base Pairs in DNA: Origin of Nucleobases from the Perspectives of Unnatural Base Pairs. Accounts of Chemical Research, 45, 2055-2065.
  12. Antonczak, A. K., Morris, J., and Tippmann, E. M. (2011). Advances in the mechanism and understanding of site-selective noncanonical amino acid incorporation. Current Opinion in Structural Biology, 21, 481-487.
  13. Davis, L. and Chin, J. W. (2012). Designer proteins: applications of genetic code expansion in cell biology. Nature: Reviews. 13, 168-182.
  14. De Silva, P. Y. and Ganegoda, G. U. (2016). Review Article: New Trends of Digital Data Storage in DNA. Hindawi, 2016, 1-14.
  15. Sample, I. (24 Jan 2017). Organisms created with synthetic DNA pave way for entirely new life forms. The Guardian. Retrieved from: https://www.theguardian.com/science/2017/jan/23/organisms-created-with-synthetic-dna-pave-way-for-new-entirely-new-life-forms?CMP=oth_b-aplnews_d-3
  16. Crew, B. (24 Jan 2017). New Organisms Have Been Formed Using The First Ever 6-Letter Genetic Code. Science Alert. Retrieved from: https://www.sciencealert.com/new-organisms-have-been-formed-using-the-first-ever-6-letter-genetic-code
