Genes Are Like Sentences

I admit that I sometimes lose track of how the common genetic terms relate to each other. What’s the difference between a chromosome and a strand of DNA, for example? Between a gene and a genome? Is each of those three-letter sets in DNA a gene? I’m not a scientist, but I was an English teacher, so comparisons between genetic terms and units of written language (words, sentences, and so on) are helpful. Maybe they will be for you also.

Let’s start small. Diagrams of DNA include four letters: A, C, G, and T. These letters and the letters we write with are similar in some ways but not in others. In both cases, they are the smallest units of their respective languages. But the four DNA letters stand for the four nucleotides (Adenine, Cytosine, Guanine, and Thymine) that are present in DNA, while the letters of our language stand mainly for the sounds we would pronounce if we were reading aloud.

In DNA, those four nucleotides, abbreviated as letters, make up the three-letter codons that are the DNA version of words. A difference is that the letters and words I am writing with don’t do the whole job. I also use punctuation marks, spaces, and capital letters to show where words and sentences begin and end.

In DNA, however, the three-letter codon-words are more efficient. They represent all the content (amino acids) and all the necessary divisions and instructions. No actual spaces separate the codons in a gene. Since all codons are three letters long, where each one begins and ends follows automatically once the starting point is known. (In fact, early writing in the ancient world also lacked spaces between words. As long as one could read slowly and figurethewordsoutspacesweren’tessential.)
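To see how a fixed word length can stand in for spaces, here is a small Python sketch. The function name `split_codons` and the sample string are made up for illustration, and the sketch assumes reading starts at the very first letter.

```python
def split_codons(dna: str) -> list[str]:
    """Cut a DNA string into consecutive three-letter codons."""
    # Ignore any leftover letters at the end that do not fill a whole codon.
    usable = len(dna) - len(dna) % 3
    # Step through the string three letters at a time; no spaces are needed.
    return [dna[i:i + 3] for i in range(0, usable, 3)]


# A made-up nine-letter stretch of DNA, used only as an example.
print(split_codons("ATGGCATAA"))  # ['ATG', 'GCA', 'TAA']
```

Starting one letter later would give a completely different set of "words," which is why knowing where to begin matters so much.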

Groups of these codons make up a gene, which can be compared to a written sentence. These gene-sentences say something like, “The hair will be red.” The sentence can also be read as a recipe: “Put this together with this and this to get red hair.” The codons of this and every gene include a start codon that marks where reading begins and a stop codon that says, “Stop here; gene complete.”

Sometimes a spelling error occurs in one of the letters-nucleotides in a codon. Such a mutation may change the meaning of the gene to, “The hair will be white.”
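A one-letter spelling error can be sketched in Python as well. The codon meanings below are a tiny subset of the standard genetic code; the toy gene, the function names `read_gene` and `mutate`, and the position chosen are assumptions made up for illustration, not a real hair-color gene.

```python
# A few entries from the standard genetic code (DNA letters -> amino acid).
CODON_TABLE = {
    "ATG": "Met",   # also serves as the start codon
    "CCT": "Pro",
    "GAG": "Glu",
    "GTG": "Val",
    "TAA": "STOP",  # one of the stop codons
}


def read_gene(dna: str) -> list[str]:
    """Read codons from the first letter until a stop codon appears."""
    amino_acids = []
    for i in range(0, len(dna) - 2, 3):
        # Codons missing from this tiny table would raise a KeyError.
        meaning = CODON_TABLE[dna[i:i + 3]]
        if meaning == "STOP":
            break
        amino_acids.append(meaning)
    return amino_acids


def mutate(dna: str, position: int, new_letter: str) -> str:
    """Return a copy of the DNA with one letter replaced (0-based position)."""
    return dna[:position] + new_letter + dna[position + 1:]


gene = "ATGCCTGAGTAA"            # hypothetical toy gene
mutant = mutate(gene, 7, "T")    # GAG becomes GTG

print(read_gene(gene))    # ['Met', 'Pro', 'Glu']
print(read_gene(mutant))  # ['Met', 'Pro', 'Val']
```

Changing a single letter swaps one amino acid for another, which is the molecular version of a typo changing the meaning of a sentence.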

To recap: the four nucleotides are the basic components, much as the letters of our alphabet are; codons are DNA words; and a group of codons forms a gene that is a sentence/recipe.

Now we come to chromosomes, genomes, and DNA itself.

The 23 pairs of chromosomes in each of our cells are like chapters in a strange book in which each chapter appears twice. The number of genes in a chromosome runs from a couple of hundred to over a thousand, many of them about similar matters, like sentences in a chapter. In the 23rd chromosome pair, which determines sex, the two chromosomes are very different from each other about half the time: females have two X chromosomes, but males have an X and a much shorter Y chromosome.

Next, your genome is like the book itself, the totality of all your genes on all your chromosomes. It holds 20,000 or so genes. Your genome book is almost exactly like mine, since we are both humans: only about one tenth of one percent of the DNA letters differ between any two people. That’s similar to two copies of the same long book that differ only in a word here and there.

Finally, DNA itself. I think of DNA as similar to our writing system as a whole with all its symbols, spaces, and graphic conventions. DNA is not a unit as these other genetic terms are; it is the stuff that all these units are composed of. The same goes for our writing system.

And it’s interesting that writing itself, developed a few thousand years ago, is structured roughly like the genetic code that appeared with the first cells over three billion years ago. In communicating information over distance and time, whether it’s an email about a meeting next month or the genetic instructions for building a baby organism, it seems the key is a method to preserve the necessary groupings and sequencing of components.