Two more letters, thousands of opportunities

The 'genetic alphabet' of DNA consists of two base pairs (A–T and G–C). In a major breakthrough in synthetic biology, Synthorx's scientific founder Dr. Romesberg and his team have crafted a third base pair, formed by d5SICSTP and dNaMTP (abbreviated X–Y), that can be replicated in vivo. Incorporating X–Y into DNA and replicating it in vivo expands the genetic alphabet, allowing more information to be stored and multiple non-natural amino acids to be incorporated into new, unique proteins.

Our technology supports adding multiple non-natural amino acids to proteins of any size

The limited combinations of the DNA bases A, T, G and C have restricted the types of new proteins, RNA and DNA that we could make. Genetic information in DNA is used to create proteins in a two-step process. First, in transcription, the protein-coding regions of the DNA sequence guide the synthesis of messenger RNA (mRNA) molecules. Then, in translation, the mRNA molecules are read in three-letter words (such as A U C), called codons, each of which specifies the amino acid to be added to the growing protein.

The four RNA bases (A, U, G and C) can form 64 codons (4 × 4 × 4). However, the system contains redundancy: more than one codon can specify the same amino acid, and thus the natural codon alphabet encodes only 20 amino acids.
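The codon count above can be checked with a short Python sketch. It simply enumerates every three-letter word over the four RNA bases; the leucine set is included only as one familiar illustration of redundancy, and the variable names are illustrative assumptions, not part of any Synthorx tooling.

```python
from itertools import product

# The four natural RNA bases.
RNA_BASES = "AUGC"

# Every possible three-letter codon: 4 * 4 * 4 combinations.
all_codons = ["".join(triplet) for triplet in product(RNA_BASES, repeat=3)]

# One example of redundancy: six different codons all specify leucine.
LEUCINE_CODONS = {"UUA", "UUG", "CUU", "CUC", "CUA", "CUG"}

print(len(all_codons))                      # 64
print(LEUCINE_CODONS.issubset(all_codons))  # True
```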

Synthorx’s base pair, X and Y, creates an expanded genetic alphabet with novel triplet codons (e.g. A X C). With six letters instead of four, the expanded vocabulary contains a total of 216 codons (6 × 6 × 6), which can be used to incorporate multiple non-natural amino acids into new, unique protein therapeutics without re-appropriating the cell’s natural codons.
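As a rough illustration of the arithmetic, the following Python sketch enumerates triplets over a six-letter alphabet and separates out those that contain X or Y. The letters X and Y are used here purely as placeholders for the synthetic bases, the names are illustrative assumptions, and the sketch counts combinations only; it says nothing about which of the novel codons are usable in practice.

```python
from itertools import product

NATURAL_BASES = "AUGC"
EXPANDED_BASES = NATURAL_BASES + "XY"  # X and Y stand in for the synthetic bases


def enumerate_codons(alphabet):
    """Return every three-letter codon that can be written with `alphabet`."""
    return ["".join(triplet) for triplet in product(alphabet, repeat=3)]


natural_codons = set(enumerate_codons(NATURAL_BASES))
expanded_codons = enumerate_codons(EXPANDED_BASES)

# Novel codons are those that contain at least one X or Y.
novel_codons = [codon for codon in expanded_codons if codon not in natural_codons]

print(len(expanded_codons))   # 216
print(len(novel_codons))      # 152
print("AXC" in novel_codons)  # True
```

Of the 216 combinations, 64 are the natural codons and the remaining 152 contain at least one synthetic base, which is why new amino acids can be assigned without reusing a natural codon.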

Safeguards are built in

We have two very important and inseparable controls over the technology. The first is that the synthetic bases can only get into the cell if we give the cell the “base transporter” protein. The second, and most important, is that the synthetic bases do not occur in nature: they can only be created in the lab and must be supplied to the cells. If we do not give the cell the new bases, the cell reverts to A, T, G and C, and X and Y disappear from the genome.


The expanded genetic alphabet (Review)
D.A. Malyshev, F.E. Romesberg, Angew. Chem. Int. Ed. Engl. (2015) 54:11930-11944.

A semi-synthetic organism with an expanded genetic alphabet
D.A. Malyshev, K. Dhami, T. Lavergne, T. Chen, N. Dai, J.M. Foster, I.R. Corrêa, F.E. Romesberg, Nature (2014) 509:385-388.

Natural-like Replication of an Unnatural Base Pair for the Expansion of the Genetic Alphabet and Biotechnology Applications
L. Li, M. Degardin, T. Lavergne, D. Malyshev, K. Dhami, P. Ordoukhanian, F.E. Romesberg, J. Am. Chem. Soc. (2014) 136:826-829.

Efficient and sequence-independent replication of DNA containing a third base pair establishes a functional six-letter genetic alphabet
D.A. Malyshev, K. Dhami, H.T. Quach, T. Lavergne, P. Ordoukhanian, A. Torkamani, F.E. Romesberg, Proc. Natl. Acad. Sci. USA (2012) 109:12005-12010.

KlenTaq polymerase replicates unnatural base pairs by inducing a Watson-Crick geometry
K. Betz, D.A. Malyshev, T. Lavergne, W. Welte, K. Diederichs, T.J. Dwyer, P. Ordoukhanian, F.E. Romesberg, A. Marx, Nat. Chem. Biol. (2012) 8:612-614.

Site-specific labeling of DNA and RNA using an efficiently replicated and transcribed class of unnatural base pairs
Y.J. Seo, D.A. Malyshev, T. Lavergne, P. Ordoukhanian, F.E. Romesberg, J. Am. Chem. Soc. (2011) 133:19878-19888.

Discovery, characterization, and optimization of an unnatural base pair for expansion of the genetic alphabet
A.M. Leconte, G.T. Hwang, S. Matsuda, P. Capek, Y. Hari, F.E. Romesberg, J. Am. Chem. Soc. (2008) 130:2336-2343.