Genes Are Like Sentences, Genomes Are Like Books

It’s a little embarrassing to admit but sometimes I lose track of just how the common terms in genetics all fit together. I learned them late in life and never used them to make a living, and now I pay the price when they don’t stick. What’s the difference between a chromosome and a strand of DNA? A gene and a genome? What do you call those three-letter sets in a DNA diagram, and what do they do? As I said, embarrassing. So here I’m going to pull out the English teacher deep in my bones and connect some of the units of written language—words, sentences, books—to the names of genetic units. Maybe my genetic picture will stay a little clearer a little longer. Maybe for the reader also.

Let’s start small.  The spiraling rungs on diagrams of a DNA (deoxyribonucleic acid) molecule are each marked with two of four specific letters: A, C, G, and T.  The four DNA letters stand for the four nucleotides—Adenine, Cytosine, Guanine, and Thymine—that make up DNA. Like the letters of the full alphabet, these letters–or rather the four molecules they indicate–are the smallest building blocks of their language.

[Figure: codons. Source: moodle.clsd.k12.pa.us]

In DNA, combinations of the letters for the four nucleotides make up the three-letter codons that are DNA’s version of words. Almost every three-letter codon/word specifies one amino acid. And most codons are “synonyms” in that several different codons refer to the same amino acid, because there are many more codons (64) than there are amino acids (20). The codons are first copied into a messenger RNA molecule and then “read” by a ribosome, a cellular reader/assembly-machine that takes the required amino acid, delivered by a transfer RNA, and attaches it to the chain of amino acids that will form a protein.
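For readers who like to see such things spelled out, here is a minimal Python sketch of the "synonym" idea. It counts every possible three-letter word and looks up the codons that share a meaning. The names `CODON_EXCERPT` and `synonyms` are my own inventions for illustration, and the table is only a small excerpt of the standard genetic code, not the full 64-entry version.

```python
from itertools import product

NUCLEOTIDES = "ACGT"

# Every possible three-letter "word" built from the four DNA letters.
all_codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]
print(len(all_codons))  # 4 * 4 * 4 = 64

# A small excerpt of the standard genetic code (DNA spelling).
# Hypothetical name; a complete table would list all 64 codons.
CODON_EXCERPT = {
    "GGA": "Glycine", "GGC": "Glycine", "GGG": "Glycine", "GGT": "Glycine",
    "AAA": "Lysine", "AAG": "Lysine",
    "ATG": "Methionine",
    "TGG": "Tryptophan",
}

def synonyms(amino_acid):
    """Return the codons in the excerpt that spell the given amino acid."""
    return sorted(c for c, aa in CODON_EXCERPT.items() if aa == amino_acid)

print(synonyms("Glycine"))     # four different spellings, one meaning
print(synonyms("Methionine"))  # a word with no synonyms
```

Glycine has four spellings here while methionine has only one, which is roughly how the real code behaves: some amino acids have many synonyms, a few have none.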

Groups of these codons make up a gene, much as words make up a sentence. The genes/sentences are long because most proteins are complex; most human proteins consist of anywhere from about a hundred to a few thousand amino acid molecules. The gene/sentence for red hair says something like “Put this together with that and that and that….”

Genes also include a codon at the start that says “Start the gene here” and another at the end that says “Stop here; gene complete.” Within the gene, no actual spaces separate the codons, but since all codons are triplets counted off from the start codon, it’s clear where each codon begins and ends. (We leave spaces between words when we write, but we didn’t always. Writing in the ancient world often lacked such spaces. As long as one could read slowly and figurethewordsoutspacesweren’tessential.)
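The start-and-stop idea can be sketched in a few lines of Python. This is a toy reader, not a model of a real cell: the DNA string is made up, `read_gene` and `CODON_EXCERPT` are hypothetical names, and the table holds only a handful of entries from the standard genetic code (ATG as the usual start signal, and TAA, TAG, and TGA as the stop signals).

```python
# Excerpt of the standard genetic code; a real table has all 64 codons.
CODON_EXCERPT = {
    "ATG": "Met",  # also the usual "start here" signal
    "GGC": "Gly",
    "AAA": "Lys",
    "TGG": "Trp",
    "TAA": "STOP", "TAG": "STOP", "TGA": "STOP",
}

def read_gene(dna):
    """Find the first ATG, then read three letters at a time until a stop codon."""
    start = dna.find("ATG")
    if start == -1:
        return []  # no start signal, nothing to read
    amino_acids = []
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        word = CODON_EXCERPT.get(codon, "?")  # "?" = codon missing from this excerpt
        if word == "STOP":
            break
        amino_acids.append(word)
    return amino_acids

# A made-up stretch of DNA with two stray letters on either side of the "gene".
print(read_gene("CCATGGGCAAATGGTAACC"))  # ['Met', 'Gly', 'Lys', 'Trp']
```

Notice that the reader never needs spaces. Once it has found the start codon, counting by threes is enough to tell where each word begins and ends.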

[Figure: chromosome. Source: mayoclinic.org]

So, to recap. The four nucleotides are basic components much like the letters of our alphabet. Groups of three nucleotides spell out codons that can be thought of as words, each of which stands for an amino acid molecule. And a sequence of codons forms a gene that resembles a sentence in a recipe for a protein that shapes some aspect of the organism.

Finally there are chromosomes and genomes.

A molecule of DNA is very long, a continuous strand of anywhere from several dozen to a couple of thousand genes. Each DNA molecule is called a chromosome, which can be compared to a chapter in a book, though its genes do not necessarily concern related aspects of the body the way a chapter’s sentences share a topic. And it is a strange book in that each chapter appears twice, one copy inherited from each parent. Nearly every human cell contains 23 such pairs of chromosomes, two slightly different copies of the assembly instructions for an entire human being. Only the pair that determines sex may contain chromosomes that are clearly different from each other: females have two X chromosomes while males carry one X and one Y chromosome.

Finally, our genome is like the book itself, the totality of all the DNA on all our chromosomes, genes included. The book might be called Me And Us. Your genome book is almost exactly like mine: only about one tenth of one percent of the letters differ, scattered across our 20,000 or so genes and the long stretches of DNA between them. That’s similar to two copies of the same long book that differ only in an occasional letter.

Simplified though the comparison is, it’s startling what genetics and written language have in common. Keep in mind that writing is a recent human invention while DNA and other units of genetics have been forming life for almost four billion years. Yet both are composed of the smallest building blocks, then the groupings created from the building blocks, then the meaningful statements/instructions/recipes coded in the groupings, and finally the conversion of the code into organic construction/action/speech.