## Monday, July 15, 2013

### Science Clock Series: Part IV

Today's number comes from biology, but before we get to it, I just want to say “sorry” for the long delay between posts. I usually try to discipline myself to write more frequently, and although I've been a bit busy, that's a petty excuse. I also had some difficulty deciding how much to write for this post (given its subject), and began a lengthy dissertation before eventually deciding to cut back for conciseness. Anyway, without further ado:

Today's number is given by:

$\text{number of bases in DNA}$

First of all, what does the word “base” even mean in this context? It does not (as I at first naively assumed) have anything to do with the use of the word “base” in mathematics (specifically in exponentiation, where it refers to the number $b$ in the expression $b^{\,n}$). It is actually a contraction of the word “nucleobase”, and its use is mainly historical: it comes from the chemical sense of “base”, meaning a substance that neutralizes acids, and reflects how nucleobases behave in acid-base reactions.

Although the use of the word “base” in this instance doesn't come from math, it does have a curious appropriateness. Going back to the mathematical side of things for a moment, numeral systems can be specified as “base-X”, where X is the number of distinct symbols used to write out any natural number. Thus the decimal system in use throughout most of the world today (which uses the symbols 0, 1, 2, 3, 4, 5, 6, 7, 8, 9) is a base-10 system. Binary, the system used by computers, is base-2, because it uses only 0 and 1. Any whole number greater than 1 can serve as the base of a numeral system, and many different bases have been used by various peoples throughout history. Anyway, the point of this diversion is that DNA can be considered a form of base-4 system, since it uses a collection of four different (nucleo-)bases to encode genetic information.
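To make the “base-X” idea concrete, here is a minimal Python sketch that writes a natural number out in any base by repeatedly dividing by the base and collecting remainders. The function name `to_base` is just something I made up for illustration, not a standard library call.

```python
def to_base(n, base):
    """Return the digits of the natural number n in the given base,
    most significant digit first. (Hypothetical helper for illustration.)"""
    if base < 2:
        raise ValueError("base must be at least 2")
    if n < 0:
        raise ValueError("n must be a natural number")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)  # peel off the lowest digit
        digits.append(remainder)
    digits.reverse()  # remainders come out least significant first
    return digits


print(to_base(2013, 10))  # [2, 0, 1, 3]
print(to_base(5, 2))      # [1, 0, 1]
print(to_base(27, 4))     # [1, 2, 3], since 27 = 1*16 + 2*4 + 3
```

The same number simply needs more digits as the base gets smaller: 27 takes two digits in base-10, three in base-4, and five in base-2.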

Just what are these mysterious nucleobases, however? They're four small molecules (containing between 13 and 16 atoms each, hydrogens included) known as adenine, guanine, cytosine, and thymine, and abbreviated A, G, C, and T. (There's also a fifth molecule, uracil, that substitutes for thymine in RNA, but we're only concerned with DNA here.)

Just as information can be converted to base-2 and transmitted and stored digitally as a long string of 0’s and 1’s, the information in a creature's genetic code is stored in base-4, which we could represent using long strings of four numerals such as 0-3 (or, as they actually do in genetics, as long strings of A’s, G’s, C’s, and T’s).
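As a rough illustration of the base-4 analogy, the Python sketch below turns a string of A’s, G’s, C’s, and T’s into base-4 digits and then into binary. The particular assignment of letters to digits (A=0, G=1, C=2, T=3) is an arbitrary assumption of mine, as are the function names; any one-to-one mapping would work just as well.

```python
# Arbitrary, hypothetical mapping of the four nucleobases to base-4 digits.
BASE_TO_DIGIT = {"A": 0, "G": 1, "C": 2, "T": 3}
DIGIT_TO_BASE = {digit: base for base, digit in BASE_TO_DIGIT.items()}


def dna_to_digits(sequence):
    """Convert a DNA string such as 'GATTACA' to a list of base-4 digits."""
    return [BASE_TO_DIGIT[letter] for letter in sequence.upper()]


def digits_to_dna(digits):
    """Inverse of dna_to_digits: turn base-4 digits back into letters."""
    return "".join(DIGIT_TO_BASE[d] for d in digits)


def digits_to_bits(digits):
    """Each base-4 digit carries exactly two bits, so write it as two binary digits."""
    return "".join(format(d, "02b") for d in digits)


digits = dna_to_digits("GATTACA")
print(digits)                  # [1, 0, 3, 3, 0, 2, 0]
print(digits_to_bits(digits))  # 01001111001000
print(digits_to_dna(digits))   # GATTACA
```

Since 4 is 2 squared, every nucleobase corresponds to exactly two binary digits, so a stretch of seven bases holds the same amount of information as 14 bits.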

The details of how exactly ordered strings of tiny molecules are used by certain proteins to create all the other proteins in a living creature are truly fascinating, absolutely mind-boggling, and far, far too vast for me to get into in this post. Suffice it to say, you should go read up on it on your own.

Anyway, tune in next time for a number from chemistry!