The genetic code is the basis for all life, allowing the information present in DNA to be translated into the proteins that perform most of a cell's functions. And yet it's … kind of a mess. Life typically uses a suite of about 20 amino acids, while the genetic code has 64 possible combinations. That mismatch means that redundancy is rampant, and a lot of species have evolved variations on what would otherwise be a universal genetic code.
So is the code itself significant, or is it something of a historic accident, locked in place by events in the distant evolutionary past? Answering that question has not been an option until recently, since individual codes appear in hundreds of thousands of places in the genome of even the simplest organisms. But as our ability to make DNA has scaled up, it has become possible to synthesize entire genomes from scratch, allowing a wholesale rewrite of the genetic code.
Now, researchers are announcing that they have redone the genome of the bacterium E. coli to get rid of some of the genetic code's redundancy. The genetic code is spelled out in sets of three DNA bases. Each of the three positions can hold any of the four bases, meaning there are 4 x 4 x 4 possible combinations, or 64. By contrast, there are only 20 amino acids, and at least one of the remaining codes has to be used to tell the cell to stop translating. That leaves 43 codes that aren't strictly needed. Cells use those extra codes as redundancy; instead of one stop code, most genomes use three. Eighteen of the amino acids are coded by more than one set of three bases; a few have as many as six possible codes.
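The arithmetic above is easy to check for yourself. The following is a minimal Python sketch that builds the standard genetic code table and counts the redundancy; the variable names are our own, and the table is the textbook standard code rather than anything specific to the new work.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
# Single-letter amino acid codes; "*" marks a stop code.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each three-base code to the amino acid (or stop) it specifies.
CODON_TABLE = {
    "".join(bases): aa
    for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codes point at each amino acid (or at "stop").
degeneracy = Counter(CODON_TABLE.values())

total_codes = len(CODON_TABLE)                      # 4 x 4 x 4
amino_acid_count = len(degeneracy) - 1              # everything except "*"
spare_codes = total_codes - (amino_acid_count + 1)  # one stop is the minimum
redundant = sorted(aa for aa, n in degeneracy.items() if aa != "*" and n > 1)
six_fold = sorted(aa for aa, n in degeneracy.items() if n == 6)

print(f"{total_codes} codes, {amino_acid_count} amino acids, {degeneracy['*']} stops")
print(f"{spare_codes} codes beyond the bare minimum")
print(f"{len(redundant)} amino acids have more than one code")
print(f"Six codes each: {', '.join(six_fold)}")
```

Only methionine (M) and tryptophan (W) get a single code each in the standard table; every other amino acid has at least two.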
Is this redundancy useful? The answer is "sometimes." For example, many DNA sequences do double-duty, encoding both protein and regulatory information. The flexibility of redundancy makes it easier for one sequence to serve two purposes. The redundancy can also allow fine-tuning of gene activity, as some codes are translated into proteins more efficiently than others. These factors suggest that the genetic code's redundancy could have become essential for an organism.
Testing whether that is the case, however, is a bit of a nightmare. Even the most compact genomes have hundreds of genes (E. coli strains have between 4,000 and 5,500), and all of the individual codes can occur multiple times within each. Editing each of these is possible but would be phenomenally time consuming.
So the researchers simply recoded things on a computer. Focusing on one of the amino acids that has multiple redundant codes, they tweaked things so that more than 1
This is easier said than done, according to one of the researchers involved (and regular Ars reader) Wolfgang Schmied. With a project like that, where you ask questions about the rules of the genetic code, "you have to at some point commit to ordering a genome worth of synthetic DNA," he told Ars, "which is a rather large financial commitment and not an easy button to press." Yet press it they did.
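To give a sense of what recoding on a computer involves, here is a minimal Python sketch of swapping codes for synonyms within a single gene. It is not the researchers' software. The swap table is an assumption for illustration (two serine codes and one stop code exchanged for equivalents), and the `recode` function and the toy gene are hypothetical.

```python
# Illustrative swap table (an assumption, not taken from the text above):
# two serine codes and one stop code, each replaced by a synonym.
SWAPS = {
    "TCG": "AGC",  # serine -> serine
    "TCA": "AGT",  # serine -> serine
    "TAG": "TAA",  # stop   -> stop
}


def recode(gene: str, swaps: dict = SWAPS) -> str:
    """Replace targeted codes with synonyms, reading the gene in frame.

    `gene` is a protein-coding sequence that starts at its first code,
    so its length must be a multiple of three.
    """
    if len(gene) % 3 != 0:
        raise ValueError("gene length must be a multiple of three")
    # Split into three-base codes so only in-frame matches get replaced.
    codes = [gene[i:i + 3] for i in range(0, len(gene), 3)]
    return "".join(swaps.get(code, code) for code in codes)


toy_gene = "ATGTCGTCAGGTTAG"  # ATG TCG TCA GGT TAG
recoded = recode(toy_gene)
print(recoded)                # ATG AGC AGT GGT TAA, printed without spaces

# "TCG" appears here only out of frame (A-TCG-AA), so nothing changes.
untouched = recode("ATGATCGAATAA")
```

The sketch reads each gene in its own frame, which is why a plain find-and-replace across the genome would not do. It also treats every gene in isolation, so sequences that serve more than one purpose would need handling that is not shown here.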
Some assembly required
Unfortunately, there is a big gap between what a DNA synthesis machine can output and the multi-million-base-long genome. The group had an entire assembly process, stitching together small pieces into a large segment in one cell and then bringing in a different cell that had an overlapping large segment. "Personally, my biggest surprise was really how the assembly process worked," Schmied said. "The success rate at each stage was very high, meaning that we could do the majority of the work with standard bench techniques."
During the process, there were a couple of spots where the synthetic genome ended up with problems; in at least one case, this was where two essential genes overlapped. But the researchers were able to tweak their version to get around the problems that they identified. The final genome also had a handful of errors that popped up during the assembly process, but none of these altered the three-base codes that were targeted.
In the end, it worked. Rather than using 61 of the 64 potential codes for amino acids, the new organism, dubbed Syn61, only used 59. The researchers were then able to delete the genes that normally allow E. coli to use the redirected codes. Normally, these genes are essential; in Syn61, they could be deleted without issue. That's not to say the Syn61 strain is fine; it grew more slowly than its normal peers. But this is probably the result of all the cases described earlier, where DNA sequences were performing more than one function. It is possible that, over time, the strain could evolve back to a normal growth rate.
Aside from answering questions about basic biology, the Syn61 strain may ultimately be useful. There are far more amino acids out there than the 20 life uses, and many of these have interesting chemical properties. To use them, however, we need spare genetic codes that can be redirected to the artificial amino acids, which is exactly what this new work has provided.
Nature, 2019. DOI: 10.1038/s41586-019-1192-5