Let’s now turn our attention to real cells. The entire genetic sequence of an organism is usually referred to by biologists as its “genome.” The genome of an organism contains the blueprints for proteins, and these blueprints are stored in an encoded form within a DNA molecule. When the cell needs to make a particular protein, the blueprint for it is transcribed from DNA into a strand of mRNA, and then a machine called a ribosome reads this strand and translates it into a chain of amino acids. This chain is the protein, and it usually folds up into a shape that is useful to the cell. Apart from water, much of an organism’s body is made up of different proteins.
Now, let’s imagine for a moment that a smart and enterprising young ribosome happened to stumble upon an idea for a better protein or a more efficient ribosome. Could it pass the idea on to others? Not really, because it is just one of several thousand ribosomes in a single cell. Even if it could somehow convince its peers to adopt the new process, the invention wouldn’t be taken up by the next generation.
Our inventive ribosome would somehow need to convince the DNA bosses, sitting in their plush cellular offices, to write the blueprint for a better protein or ribosome into the DNA itself, so that the update could be pushed out to future generations. In the case of sexually reproducing organisms, it would have to be written into the DNA of sperm or egg cells, because this is what is passed on from generation to generation.
In other words, from an evolutionary point of view, the only changes that matter in the long run are those affecting the specific DNA inherited by future generations of cells.
Now, when the entire genome of a cell is copied, the sequence of nucleotides it passes on to the next generation might turn out to be a little different from what the original cell inherited. A mutation can happen, which may be the result of a copying error, or damage to the sequence from sources such as radiation.
A common form of mutation is when a letter gets switched for another letter, like a scribe getting a letter wrong while copying a document. Sometimes letters can also get accidentally removed or added. Less common mutations include duplication of one or more letters, like copying and pasting in a word processor. The DNA copying machine has a proofreading system to make sure these errors are minimized, but mistakes still sometimes get through.
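The kinds of mutation described above can be sketched in a few lines of Python. This is only an illustration, treating DNA as a string of the four letters A, C, G, and T; the function names and the example sequence are made up for this sketch and are not taken from any biology library.

```python
import random

ALPHABET = "ACGT"  # the four DNA "letters"

def substitute(seq, pos, rng):
    """Switch the letter at pos for a different letter."""
    choices = [c for c in ALPHABET if c != seq[pos]]
    return seq[:pos] + rng.choice(choices) + seq[pos + 1:]

def delete(seq, pos):
    """Remove the letter at pos."""
    return seq[:pos] + seq[pos + 1:]

def insert(seq, pos, rng):
    """Add a random letter in front of position pos."""
    return seq[:pos] + rng.choice(ALPHABET) + seq[pos:]

def duplicate(seq, start, length):
    """Copy a run of letters and paste it right after itself."""
    segment = seq[start:start + length]
    return seq[:start + length] + segment + seq[start + length:]

rng = random.Random(0)        # fixed seed so the run is repeatable
original = "ATGGCCATTG"       # hypothetical example sequence
print(substitute(original, 3, rng))  # same length, one letter changed
print(delete(original, 3))           # ATGCCATTG
print(insert(original, 3, rng))      # one letter longer
print(duplicate(original, 1, 2))     # ATGTGGCCATTG
```

Each function returns a new string rather than changing the original, which mirrors the idea that the parent sequence stays as it was while the copy carries the change.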
Changes to the genome can also be made in more organized ways. For example, in a process called “horizontal gene transfer,” bacteria are able to take up DNA from their environment or receive it from other bacteria. This helps asexually reproducing organisms to avoid something called “Muller’s ratchet,” where harmful mutations build up in a population and can’t be reversed. Sexually reproducing organisms avoid this through a process of genetic recombination, where similar sections of DNA from father and mother are exchanged, creating variety while also preserving essential information.
Most mutations are either neutral, meaning they don’t have any effect, or they are harmful. According to evolutionary theorists, mutations to the genome provide the raw materials for evolution. Sometimes a beneficial mutation takes place, which gives one particular organism a survival advantage over its fellow organisms. As a result, its offspring are more likely to survive and become predominant within the population. This, in essence, is what is meant by “natural selection.” Nature tends to preserve the fittest and sift out the less fit, although “fitness” in biological terms isn’t about whether creatures go to the gym or not, but is simply about survival or reproductive advantage.
Any advantage that might help an organism survive or reproduce better must ultimately be the product of a change to the DNA sequence in its genome, because this is what is passed on to future cells. For this reason, we could think of life as a game, where each organism comes with the equivalent of a barcode, the DNA sequence it is able to pass on to its offspring, who will inherit it with perhaps a small tweak here and there. Evolution is really just a game of nucleotides being changed a little bit from generation to generation. I’ll call this game the “Nucleotide Shuffle.”
As a result, some aspects of evolution simply become a math problem, because evolution is based on changes to sequences of information, and in some cases these changes can be modeled mathematically. All we need to do is look at sequences, and ask how we can get from one sequence to another through the shuffling of letters, and over what time frame. This allows us to see what evolution is truly capable of, and what its practical limits are, if any.
If we wanted to evolve, say, a sequence of random English letters into a specific line from a Shakespeare play, how could we do it? We could write a computer program to change one letter at a time. The program could do it very quickly if it knew the line of the play we wished to arrive at, and it was allowed to keep a letter once it matched up with the same letter used by Shakespeare.
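A program of this kind might look something like the sketch below. It is a minimal illustration, not a model of biology: the target line ("TO BE OR NOT TO BE," from Hamlet), the 27-character alphabet of capital letters plus the space, and the function name are all assumptions chosen for the example.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "  # 26 capital letters plus the space

def evolve_with_target(target, rng):
    """Change one letter at a time, keeping every letter that matches.

    The program is told the target line in advance.
    Returns the final line and the number of letter changes made.
    """
    if any(c not in ALPHABET for c in target):
        raise ValueError("target must use only capital letters and spaces")
    current = [rng.choice(ALPHABET) for _ in target]  # random starting text
    changes = 0
    while True:
        # Positions that do not yet match the target line.
        unmatched = [i for i, c in enumerate(current) if c != target[i]]
        if not unmatched:
            return "".join(current), changes
        # Change one unmatched letter; matched letters are never touched.
        i = rng.choice(unmatched)
        current[i] = rng.choice(ALPHABET)
        changes += 1

rng = random.Random(42)  # fixed seed so the run is repeatable
line, changes = evolve_with_target("TO BE OR NOT TO BE", rng)
print(line, "reached after", changes, "letter changes")
print("possible lines of this length:", len(ALPHABET) ** len(line))
```

Because each wrong position needs about 27 tries on average to hit the right letter, a run like this typically finishes after a few hundred changes, which is tiny compared with the number of possible lines printed at the end. The speed comes entirely from the comparison with `target` inside the loop.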
But this isn’t how evolution works. Evolution doesn’t know the outcome beforehand. If it did, this would be directed evolution or intelligent design. Therefore, we would be cheating if we told the program what to find. Natural selection also assumes there are small, cumulative steps along the way, and that each step gives the organism a survival or reproductive advantage, or at least doesn’t kill it off. But what advantage does a line of random text with a few altered letters have? What function does it serve?
Changing a random sequence of text into a line from Shakespeare using a computer program is perhaps a reasonable analogy for what evolution might look like with the benefit of hindsight, but we can’t use it to show evolution is easy, since the initial random text has no function, the transitional sentences along the way have no function, and the program knew the outcome in advance, which evolution does not.
In the real world, a “sentence” in a genetic sequence used by the cell still has to make sense after a mutation, and each mutation still has to allow the sequence to maintain some kind of function. If it loses its function along the way, natural selection will no longer help the sequence to evolve and the sentence could mutate into gibberish; and there are vastly more potential lines of gibberish than there are lines from Shakespeare.
Now, in order to evolve a useful function from scratch, or de novo as biologists call it (from the Latin, meaning “of new”), nature has to perform at least some “brute force” natural searches. These are searches over the entire range of possibilities.
For example, suppose you were to buy a combination lock, manufactured with a random three-digit unlock code, but then you carelessly lose the piece of paper with the code on it. You could take the lock back to the store, but instead, you decide it would be quicker to find the correct code yourself, since it’s only three digits long.
There are 1,000 possible sequences to try, ranging from 000 to 999. You decide to start with the lowest number and work your way up. What you are doing is a “brute force” search. Searches involving longer sequences are normally done with computers because of the sheer number of permutations involved, but in this simple example, you are doing it manually.¹
You might get lucky. The manufacturers might have set the default unlock code to 000, meaning you would find the correct sequence right away. But they might have set it to 999, laughing to themselves in their plush offices as they did so; in which case, you would have to go through 999 previous possibilities before you found the right one. If they set the default code to a random number, it would be impossible to say exactly how many tries it would take you to find the right code for any one lock. All we could say is that it would take between 1 and 1,000 tries.
However, if you happened to enjoy the process, and decided to spend the rest of your life finding the code to newly purchased combination locks by brute force searches, we could say that, on average, it would take you about 500 tries before you found the right combination for a lock. About half the time it would take you more than the average, and half the time it would take you less.
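The figures in the lock example can be checked with a short simulation. The sketch below assumes, as in the story, that codes are tried in order from 000 upward and that each lock's code is equally likely to be any of the 1,000 values; the function names are made up for the illustration.

```python
import random

def tries_to_open(secret, n_codes=1000):
    """Try codes 0, 1, 2, ... in order and count the tries needed."""
    for tries, guess in enumerate(range(n_codes), start=1):
        if guess == secret:
            return tries
    raise ValueError("secret code is outside the search range")

def average_tries(n_locks, rng, n_codes=1000):
    """Average number of tries over many locks with random codes."""
    total = sum(tries_to_open(rng.randrange(n_codes), n_codes)
                for _ in range(n_locks))
    return total / n_locks

print(tries_to_open(0))    # best case: 1 try
print(tries_to_open(999))  # worst case: 1000 tries

# Exact average: every count from 1 to 1,000 is equally likely.
exact = sum(range(1, 1001)) / 1000
print(exact)  # 500.5

rng = random.Random(1)  # fixed seed so the run is repeatable
print(average_tries(2000, rng))  # should land close to 500.5
```

The exact average works out to 500.5 tries, which is where the figure of "about 500" comes from. The simulated average will wobble around that value from run to run, settling closer to it as more locks are tried.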
The point here is this: to find a beneficial mutation that gives an organism an advantage, nature has to “search” through lots of neutral or damaging mutations, much like searching for the correct sequence to a combination lock. When we understand this principle, we can estimate how long a particular change can take, or how many tries it will take nature. Let’s now test this out, as we evolve our very own protein from scratch.
1 In mathematics, “permutations” are used when sequence order matters, and “combinations” when it doesn’t. Even though we are talking about a “combination lock,” what actually matters here is the number of permutations (strictly speaking, permutations with repetition, since a digit can appear more than once). Each digit has 10 possibilities, so there are 10 × 10 × 10 = 10³ = 1,000 permutations.