Codon Wheel

The Two Sides of Codon Optimisation

Translating mRNA into a functional protein depends on the interplay of codons, tRNAs and ribosomes. Once an mRNA has been transcribed, amino acids are assembled in the order specified by its codons. A codon is a three-base sequence that specifies an amino acid or a stop signal; there are 64 codons in total, three of which are stop codons. Because the remaining 61 codons encode only the 20 standard amino acids, the genetic code is degenerate: most amino acids can be encoded by more than one codon. Amino acids that are generally more abundant in proteins tend to have more synonymous codons, and those used less frequently tend to have fewer. For example, leucine and serine can each be encoded by six different codons, whereas tryptophan and methionine are each encoded by a single codon.
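The degeneracy described above can be checked directly. The sketch below builds the standard genetic code as a Python dictionary and counts how many codons map to each amino acid. The variable names are illustrative, and the one-letter amino acid string follows the usual ordering of the standard code with bases taken in the order U, C, A, G; "*" marks a stop codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code, RNA alphabet. Codons are generated in the order
# UUU, UUC, UUA, UUG, UCU, ... and paired with one-letter amino acid codes.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid ("*" counts the stop codons).
degeneracy = Counter(CODON_TABLE.values())

# Codons that encode an amino acid rather than a stop signal.
sense_codons = sum(n for aa, n in degeneracy.items() if aa != "*")

# Print amino acids from most to least degenerate.
for aa, n in sorted(degeneracy.items(), key=lambda item: (-item[1], item[0])):
    print(aa, n)
```

Leucine (L) appears at the top of the listing with six codons, while methionine (M) and tryptophan (W) sit at the bottom with one each.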

Figure: Protein synthesis, in which ribosomes assemble protein molecules.

A notable advantage of the degenerate nature of the genetic code is that it reduces the harmful effects of point mutations. Although an organism has ample choice of codons, different organisms exhibit codon usage bias, meaning that they prefer certain synonymous codons over others. In bacterial systems in particular, codon preference has been linked to tRNA availability and to more efficient protein production, as faster growth rates have been associated with more abundant but less diverse tRNAs (Rocha, 2004). Consequently, rare codons could significantly limit the rate of translation elongation and therefore protein synthesis (Mauro & Chappell, 2014). Understanding codon usage bias has therefore proven useful for enhancing protein expression in different expression systems. In contrast, higher eukaryotic systems such as mammalian cells appear to show much less codon bias. Based on the codon bias studies in mammalian cells to date, codon optimisation per se will not necessarily enhance protein expression levels, as other factors such as mRNA secondary structure also influence translation rates.
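Codon usage bias is usually quantified by counting how often each synonymous codon is used for a given amino acid. The following is a minimal sketch of that calculation for a single in-frame coding sequence. The function name `codon_usage` and the toy sequence in the usage note are illustrative assumptions; a real analysis would pool many genes from the organism of interest.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, RNA alphabet ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_usage(cds):
    """Return {amino_acid: {codon: fraction}} for an in-frame coding sequence.

    Accepts DNA or RNA letters; any trailing partial codon is ignored.
    Raises KeyError if the sequence contains letters other than A, C, G, U/T.
    """
    cds = cds.upper().replace("T", "U")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(codons)

    # Total number of codons observed for each amino acid.
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n

    # Fraction of each amino acid's occurrences encoded by each codon.
    usage = defaultdict(dict)
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        usage[aa][codon] = n / totals[aa]
    return dict(usage)
```

For the toy sequence `ATGCTGCTGCTGTTATAA`, leucine is encoded three times by CUG and once by UUA, so `codon_usage(...)["L"]` reports 0.75 for CUG and 0.25 for UUA.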

A common approach in protein expression is to codon optimise gene sequences to match the preferences of the expression host, such as bacteria, yeast or mammalian cells. Data from studies mainly in bacterial and yeast systems suggest that optimal codons increase protein output, whereas non-optimal codons decrease the rate of ribosome translocation (Presnyak et al., 2015). There are several other reasons why codon optimisation is widely used. One is to minimise or remove secondary RNA structures such as hairpins, which can negatively affect how ribosomes associate with mRNA. Numerous studies have shown that removing secondary mRNA structures results in faster association of the 30S ribosomal subunit with the mRNA and higher protein expression (Gaspar et al., 2013). Codon optimisation is also useful for preserving wanted restriction sites, or removing restriction sites that would cause problems when building constructs by restriction cloning. In principle, codon optimisation is an incredibly useful tool for the successful cloning and production of recombinant proteins.
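The simplest form of codon optimisation can be sketched as a back-translation that picks one preferred codon per amino acid, followed by a synonymous swap to remove an unwanted restriction site. In the sketch below, the `PREFERRED` table is a hypothetical placeholder rather than measured usage data for any real host, and the helper names are illustrative. Dedicated optimisation tools also weigh factors such as mRNA secondary structure, which this sketch ignores. The EcoRI recognition sequence GAATTC is used as the example site.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, DNA alphabet ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)

# HYPOTHETICAL preferred codon per amino acid: placeholder values only,
# not measured codon usage for any real expression host.
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC", "Q": "CAG",
    "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT", "L": "CTG", "K": "AAA",
    "M": "ATG", "F": "TTC", "P": "CCG", "S": "AGC", "T": "ACC", "W": "TGG",
    "Y": "TAT", "V": "GTG", "*": "TAA",
}


def back_translate(protein):
    """Return a list of codons using the preferred codon for each residue."""
    return [PREFERRED[aa] for aa in protein.upper()]


def site_positions(dna, site):
    """All start positions of `site` in `dna`, overlapping matches included."""
    return [i for i in range(len(dna) - len(site) + 1) if dna.startswith(site, i)]


def remove_site(codons, site):
    """Swap synonymous codons until `site` no longer occurs in the sequence."""
    codons = list(codons)
    while True:
        hits = site_positions("".join(codons), site)
        if not hits:
            return codons
        # Codons that overlap the first occurrence of the site.
        first, last = hits[0] // 3, (hits[0] + len(site) - 1) // 3
        for idx in range(first, last + 1):
            for alt in SYNONYMS[CODON_TABLE[codons[idx]]]:
                trial = codons[:idx] + [alt] + codons[idx + 1:]
                # Accept a swap only if it reduces the number of sites.
                if len(site_positions("".join(trial), site)) < len(hits):
                    codons = trial
                    break
            else:
                continue  # no useful swap at this codon, try the next one
            break
        else:
            raise ValueError("no synonymous swap removes the site")
```

With these placeholder preferences, back-translating the peptide MEFK gives ATGGAATTCAAA, which contains an EcoRI site; `remove_site` replaces GAA with the synonymous GAG, so the protein is unchanged and the site is gone. The swap takes the first synonymous codon that works rather than the next most preferred one, which a real tool would want to refine.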

However, one reason not to codon optimise deserves mention: the incorporation of the 21st amino acid, selenocysteine, which is encoded by the stop codon UGA and found in a small number of proteins in organisms ranging from bacteria to eukaryotes. Incorporation of selenocysteine requires either a stem-loop structure in the 3′-untranslated region of the selenoprotein mRNA in eukaryotes, or secondary structure in the RNA immediately adjacent to the UGA codon in bacteria. These structures recruit a complex of proteins that distinguishes a UGA meant for selenocysteine incorporation from a UGA meant for termination. In this case, using the native rather than the optimised sequence is key to maintaining the required secondary structure and correct amino acid incorporation.

With vaccines increasingly in the spotlight, especially following the successful rollout of RNA vaccines, there is an interesting alternative strategy that uses codon deoptimisation to create live attenuated viruses. It is based on the finding that protein coding sequences can also show codon pair bias, where some codon pairs occur significantly more or less often than would be expected. Deoptimising viral genes with sub-optimal codon pairs increases mRNA decay and reduces translation efficiency, so fewer viral proteins are made and overall virulence is reduced (Groenke et al., 2020). This approach has been used successfully to create a live attenuated influenza virus capable of initiating a robust immune response in animal studies (Kaplan et al., 2018). Codon pair deoptimisation could therefore be a promising way to produce live attenuated vaccines faster and more cheaply than the traditional method of attenuating viruses by passage in eggs.
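Codon pair bias can be made concrete by comparing how often a pair of adjacent codons is observed with how often it would be expected from the usage of its individual codons and of the corresponding amino acid pair. The sketch below assumes one commonly used formulation, the codon pair score, defined as the natural log of observed over expected counts; the article itself does not give a formula, and the function name is illustrative. A negative score marks an under-represented pair, which is the kind of pair a deoptimised gene would be enriched in.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, RNA alphabet ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_pair_scores(cds_list):
    """Return {(codon_a, codon_b): score} for every adjacent pair observed.

    Assumed definition: score = ln(observed / expected), where
    expected = N_a * N_b / (N_x * N_y) * N_xy, with a, b the codons and
    x, y the amino acids they encode. Unobserved pairs are not scored.
    """
    codon_n, aa_n, pair_n, aa_pair_n = Counter(), Counter(), Counter(), Counter()
    for cds in cds_list:
        cds = cds.upper().replace("T", "U")
        codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
        for codon in codons:
            codon_n[codon] += 1
            aa_n[CODON_TABLE[codon]] += 1
        # Adjacent codon pairs within the same coding sequence.
        for a, b in zip(codons, codons[1:]):
            pair_n[a, b] += 1
            aa_pair_n[CODON_TABLE[a], CODON_TABLE[b]] += 1

    scores = {}
    for (a, b), observed in pair_n.items():
        x, y = CODON_TABLE[a], CODON_TABLE[b]
        expected = codon_n[a] * codon_n[b] / (aa_n[x] * aa_n[y]) * aa_pair_n[x, y]
        scores[a, b] = math.log(observed / expected)
    return scores
```

In practice the scores would be computed over all coding sequences of the host, and a candidate viral gene would then be rated by averaging the scores of its own codon pairs.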

Written by Naimah Begum
