Should we reprogram the genetic code?

rosalindwithhash
Oct 4, 2021
6 min read

Updated: Oct 9, 2021

DNA is a wonderful thing; the genetic blueprint translates into the brilliant buzz of life. All that the eye can see, from stronger-than-steel spider silk to the fluttering of hummingbirds, is the beautiful result of DNA transcription and translation.

For all its power, the genetic code is highly redundant. Multiple three-nucleotide DNA codes, called codons, code for the same amino acid. Although redundancy is certainly a good thing (we’ll see why later), researchers have been exploiting it to reprogram DNA to do different things.

By rewriting the genetic code, we can use cells to synthesize designer proteins, experiment with artificial life, study how cellular machinery works and so much more!

But first, a refresher on what DNA is and how it works:

Essentially every living organism has a genetic blueprint made up of DNA (some viruses use RNA instead), which codes for the proteins that make up nearly everything we are! DNA is composed of 4 nucleotides: Adenine (A), Thymine (T), Cytosine (C) and Guanine (G). In RNA, Uracil (U) takes the place of Thymine. The precise order of these nucleotides determines the structure of the protein to be formed, which in turn determines its function. A certain arrangement of nucleotides might code for collagen, which makes up skin, or something wildly different like haemoglobin, which carries oxygen in our body.

So how does DNA lead to the fashioning of a brand-new shiny protein, all ready to do its very important job? The DNA is unzipped, and a copy of the genetic code is made in the form of messenger RNA or mRNA that uses the nucleotide U instead of T. The strand of mRNA is then slotted into a ribosome, the factory of the cell.

The ribosome reads the mRNA three nucleotides at a time. Each triplet is a codon, and each codon corresponds to a particular amino acid, the building blocks of proteins. Another molecule, known as tRNA, is charged with that particular amino acid and brings it to the ribosome. This continues until a stop codon is hit, by which point the amino acids have been strung together to form a polypeptide chain.

The journey from DNA to polypeptide chain

Whew!

By this process of transcription and translation cells can turn twenty amino acids into the proteins that make us, us.
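The journey from DNA to polypeptide can be sketched in a few lines of Python. This is only an illustration, not a model of real cell biology: it assumes we are handed the coding strand, that translation starts at the first AUG, and it uses one-letter amino acid symbols with `*` for a stop codon. The function names `transcribe` and `translate` are my own.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Copy the coding strand into mRNA by swapping T for U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon ('*') is hit."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGAAGATTGGTAA")
print(mrna)             # AUGGAAGAUUGGUAA
print(translate(mrna))  # MEDW
```

Here the five codons AUG, GAA, GAU, UGG and UAA are read as methionine, glutamic acid, aspartic acid, tryptophan and stop.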

We have 20 amino acids and three stop codons, but 64 possible codons (4 nucleotides in each of 3 positions gives 4 × 4 × 4 = 64). That is, inarguably, a lot more than 20. We only really need 21 codes: one for each of the 20 amino acids and one stop codon.

For instance, both the codons GAA and GAG code for glutamic acid, and the amino acids arginine, leucine and serine are coded for by 6 different triplet codons each! When we say the genetic code is redundant, this is what we mean.
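To see the redundancy directly, we can count how many codons map to each amino acid in the standard codon table. This is a small sketch using one-letter amino acid symbols (`L` leucine, `R` arginine, `S` serine, `E` glutamic acid, `*` stop); the variable names are my own.

```python
from collections import Counter
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

codons_per_symbol = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))            # 64 codons in total
print(codons_per_symbol["*"])      # 3 of them are stop codons
print(len(codons_per_symbol) - 1)  # 20 amino acids share the other 61

# Amino acids encoded by six different codons.
six_fold = sorted(aa for aa, n in codons_per_symbol.items() if n == 6)
print(six_fold)                    # ['L', 'R', 'S']

# The codons for glutamic acid.
print(sorted(c for c, aa in CODON_TABLE.items() if aa == "E"))  # ['GAA', 'GAG']
```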

Redundancy, or degeneracy, is actually quite important, as it reduces the likelihood that a point mutation or a mistake in translation changes the protein. Take glutamic acid again: if a mutation changes the codon GAA to GAG it wouldn’t matter, as GAG still codes for glutamic acid. This is really neat: it allows for a margin of error and ensures that mistakes don’t always end in a different amino acid being added to the protein chain, which could have disastrous consequences.
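The margin of error can be made concrete by asking which single-nucleotide changes to a codon leave the amino acid untouched. The sketch below assumes the standard codon table; `is_silent` and `silent_neighbours` are hypothetical helper names of my own.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_silent(before, after):
    """True if both codons encode the same amino acid (or both are stops)."""
    return CODON_TABLE[before] == CODON_TABLE[after]


def silent_neighbours(codon):
    """Codons one substitution away that encode the same amino acid."""
    neighbours = []
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if is_silent(codon, mutant):
                    neighbours.append(mutant)
    return neighbours


print(is_silent("GAA", "GAG"))   # True: still glutamic acid
print(is_silent("GAA", "GAU"))   # False: GAU is aspartic acid
print(silent_neighbours("GAA"))  # ['GAG']
print(silent_neighbours("CUU"))  # ['CUC', 'CUA', 'CUG'], all leucine
```

Most of the tolerance sits in the third position of the codon, which is why a leucine codon like CUU can absorb any change there.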

The twenty amino acids naturally coded for by DNA are called the canonical amino acids. But there isn’t anything special about them. After all, an amino acid is just a central carbon atom attached to a hydrogen, an amine group, a carboxyl group and a variable side chain. Chemically we could have thousands of different amino acids which, although not naturally coded for by living organisms, could still be used to make proteins: new proteins, with shiny new features.

These other “unnatural” amino acids are called, with equal fanfare, the non-canonical amino acids.

The redundancy of the genetic code is exactly what we can exploit! We don’t really need six different codons coding for the same amino acid, so we could repurpose one of them to code for a brand-new amino acid. That is at the heart of expanding the genetic code.

Reprogramming our genetic code to make use of these non-canonical amino acids can open up so many new opportunities to modify or create new proteins that don’t exist in nature and change life processes, which is really exciting! This is not just fiction; it has been done successfully in bacteria and even in mice.

For example, metA is a crucial enzyme in the synthesis of the amino acid methionine. Normal metA begins to unfold and break down at temperatures above 40 °C, so methionine cannot be synthesized and the bacteria die. In 2018, researchers at the Scripps Research Institute modified metA in E. coli by incorporating a non-canonical amino acid into its structure, which conferred much higher thermal stability. This raised the temperature the enzyme can withstand by 21 °C, letting the bacteria survive heat that would normally kill them!

But how do scientists reprogram the genetic code of organisms to include non-canonical amino acids (NCAAs)?

There are some prerequisites before we can do something like this:

An unused codon that can code for our non-canonical amino acid

A modified tRNA that can recognize this codon

A tRNA synthetase, the enzyme that joins a tRNA to its corresponding amino acid, that recognizes both the new tRNA and the new amino acid

A supply of the amino acid: either the cell makes it or we provide it

To this effect, we will need to change cellular machinery.

Provided all that is done, there are two ways of changing the code.

The first way is to, rather cheekily, add more nucleotides to the genome. Remember how we have A, T, C, G, U as nucleotides? Well, we could add more, and these novel combinations of nucleotides could theoretically code for our NCAA.

In 2014, researchers expanded the genetic alphabet of E. coli to contain two additional DNA nucleotides, X and Y. And in 2019 another team of scientists built a synthetic DNA and RNA system with 8 nucleotides, called hachimoji DNA! More nucleotides mean more possible codons and therefore more NCAAs that can be coded for, which opens the door to a whole new world of novel proteins. This approach goes back as far as 1992, when researchers used a modified form of cytosine as a distinct new nucleotide.
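A quick count shows how fast the codon space grows with the alphabet. This sketch assumes codons stay three letters long and that every combination is usable, so the numbers are upper bounds rather than what any experiment achieved. The extra letters Z and W in the eight-letter alphabet are placeholders of my own, not the names used in the research.

```python
from itertools import product


def codon_count(alphabet, codon_length=3):
    """Number of distinct codons of the given length over an alphabet."""
    return len(alphabet) ** codon_length


natural = "ACGU"
six_letter = natural + "XY"      # two added letters
eight_letter = natural + "XYZW"  # four added letters (placeholder names)

print(codon_count(natural))       # 64
print(codon_count(six_letter))    # 216
print(codon_count(eight_letter))  # 512

# Codons that contain at least one new letter are the ones free for NCAAs.
new_codons = [
    "".join(c) for c in product(six_letter, repeat=3) if set(c) & set("XY")
]
print(len(new_codons))            # 152, that is 216 - 64
```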

The second way is repurposing existing codons to code for an NCAA. For example, one of the six codons for serine could be reassigned to an NCAA, since the other five would still cover serine. This way we could code for multiple NCAAs within the existing framework of the genetic code, without needing extra nucleotides.

Stop codons are the ones most commonly repurposed. When the ribosome reads a stop codon, it stops translating the mRNA strand. Let’s take the stop codon UAG as an example. First, we need to free it up for our reprogramming purposes, and this is done by replacing all instances of UAG with another stop codon like UGA. This is important because wherever UAG appears, translation is meant to terminate; without another stop codon in its place, we would disrupt the synthesis of vital proteins.

Then we need to teach the cell that UAG is now a codon for an amino acid rather than a stop signal. After adding a tRNA that reads UAG and a synthetase that charges it, UAG can be used to code for an NCAA.
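The two steps, freeing up UAG and then giving it a new meaning, can be sketched as follows. This is a toy model under stated assumptions: each gene is given as an in-frame mRNA reading frame, `X` is a placeholder symbol for a hypothetical NCAA, and the function names and example sequences are invented for illustration.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Step 2: the expanded code reads UAG as the placeholder NCAA 'X'.
EXPANDED = dict(STANDARD, UAG="X")


def free_up_uag(genes):
    """Step 1: swap a terminal UAG for UGA so each gene still ends on a stop."""
    return [g[:-3] + "UGA" if g.endswith("UAG") else g for g in genes]


def translate(gene, table):
    """Translate an in-frame gene until the table reports a stop ('*')."""
    protein = []
    for i in range(0, len(gene) - 2, 3):
        amino_acid = table[gene[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


native_genes = ["AUGGAAUGGUAG", "AUGCUUUAA"]
recoded = free_up_uag(native_genes)
print(recoded)  # ['AUGGAAUGGUGA', 'AUGCUUUAA']

# The cell's own proteins are unchanged under the expanded code.
print([translate(g, EXPANDED) for g in recoded])  # ['MEW', 'ML']

# A designer gene can now place the NCAA wherever it writes UAG.
designer = "AUGGAAUAGUGGUGA"
print(translate(designer, STANDARD))  # ME   (UAG still means stop)
print(translate(designer, EXPANDED))  # MEXW (UAG now inserts the NCAA)
```

Skipping step 1 would be the failure the text warns about: under the expanded code, a native gene still ending in UAG would get an NCAA added and run on past its intended end.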

A 2014 study concluded that an expanded genetic code led to greater evolutionary fitness in bacteriophages. And just this year, a team of scientists led by Wesley Robertson created viral resistance in E. coli by freeing up codons. The team removed two serine codons and one stop codon from the genome, along with the machinery that reads them, before assigning any NCAAs to the freed-up codons. Viruses that tried to infect the modified E. coli sent their genetic material to the bacterial ribosomes so they could make new viruses, as happens in a lytic cycle. But when the ribosome reached the removed codons in the viral code, nothing could decode them, so no amino acid could be added and the virus couldn’t replicate.
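Why freeing up codons blocks a virus can be shown with a toy model: a host whose decoding table simply has no entry for the removed codons. Everything specific here is an assumption for illustration. The choice of UCG, UCA and UAG as the removed codons, the example sequences and the function name are mine, and a real ribosome stalls rather than returning a value.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumed for illustration: two serine codons and one stop codon removed.
REMOVED = {"UCG", "UCA", "UAG"}
RECODED_HOST = {c: aa for c, aa in STANDARD.items() if c not in REMOVED}


def translate(gene, table):
    """Return the protein, or None if a codon cannot be decoded."""
    protein = []
    for i in range(0, len(gene) - 2, 3):
        codon = gene[i:i + 3]
        if codon not in table:
            return None  # nothing in the cell reads this codon
        if table[codon] == "*":
            break
        protein.append(table[codon])
    return "".join(protein)


host_gene = "AUGAGCGAAUAA"   # host writes serine as AGC
viral_gene = "AUGUCGGAAUAA"  # virus writes serine as UCG

print(translate(host_gene, RECODED_HOST))   # MSE
print(translate(viral_gene, STANDARD))      # MSE in an ordinary cell
print(translate(viral_gene, RECODED_HOST))  # None: the viral protein is never made
```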

We have even used NCAAs to control protein function with light, create better proteins, increase immunogenicity, explore the secret inner workings of the cell and so much more.

Even though redundancy has its place in biology, I do not think tweaking the genetic code to expand it is necessarily a bad thing; this technology can be used to do so much. This is just the beginning of what we can do, and what we already have done, by building non-canonical amino acids into the very blueprint of life.

It will still be a long time before we have humans walking the earth sporting eight nucleotides rather than four.

References:

Robertson et al. 2021 “Sense codon reassignment enables viral resistance and encoded polymer synthesis” https://www.science.org/doi/abs/10.1126/science.abg3029
Hoshika et al. 2019 “Hachimoji DNA and RNA: A genetic system with eight building blocks” https://www.science.org/doi/abs/10.1126/science.aat0971
Khan Academy “Intro to gene expression (central dogma)” https://www.khanacademy.org/science/high-school-biology/hs-molecular-genetics/hs-rna-and-protein-synthesis/a/intro-to-gene-expression-central-dogma
Spencer and Barral 2012 “Genetic code redundancy and its influence on the encoded polypeptides” https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3962081/
Li et al. 2018 “Enhancing protein stability with genetically encoded noncanonical amino acids” https://pubs.acs.org/doi/abs/10.1021/jacs.8b07157

Attributions:

Forluvoft, Public domain, via Wikimedia Commons
Mouagip, Public domain, via Wikimedia Commons
Overview of Transcription, Khan Academy
