It wasn’t a typical biology lab, not by today’s standards and not by the standards of the time, 1910. The first thing a visitor would notice, long before he entered, was the smell. That you could smell the lab wasn’t so unusual; many biology labs emanate odd odors of various sorts. But this smell wasn’t like any of the typical biological lab smells; it was distinctly funky, like the smell of a metal garbage bin baking in the sun behind a supermarket.
Visually the lab was no less unprepossessing; it was small and filthy. An impressive layer of detritus had accumulated on the floor, home to a flourishing population of cockroaches. This lab was as notable for what it lacked as for what was uncharacteristically present: no flasks or beakers or test tubes or pipettes. The only observable glassware was used milk bottles scattered haphazardly throughout. Also missing were microscopes, even the simplest low-power kind. Functioning in their stead was an assortment of hand lenses, like those that the elderly used to use for reading before the advent of reading glasses.
Also lacking was any sense of formality or hierarchy. This seemed particularly odd to visitors from European universities, most of which were modeled after those in Germany. There, the most informal way of addressing the person who headed the lab was “Herr Doktor Professor.” Moreover, a German professor’s office was always closed and he was available only by appointment. Here, the door to the professor’s office, located at one end of the lab, was always open, and those working in the lab approached it seemingly at the merest whim and without deference. Moreover, they addressed the professor by his first name, a practice that is now commonplace in American universities but was not then, and certainly not elsewhere in the world.
Yet in this humble lab, the infant science of genetics was nurtured to a degree unrivaled in the rest of the world. On any given day you could find at their labors two future Nobel Prize laureates, as well as several other scientists who were to shape the course of genetics. First among them was the man who occupied the lone office, Thomas Hunt Morgan, whose significance in the history of genetics is second only to that of the Moravian monk Gregor Mendel.1 Morgan’s goal was to determine the location of Mendel’s “hereditary factors” (now called genes) on particular chromosomes. Morgan’s gene mapping was very different from today’s. The technology to locate genes directly on chromosomes was not then available, so he had to take a much more indirect route. He could identify a gene only through a mutation that caused some observable change in the appearance (phenotype) of his subjects. If this mutation tended to be inherited together with a different trait, he could assume that the genes for the two traits were located on the same chromosome. The more often the two were inherited together, the closer together the genes must be.
Morgan was a scion of southern gentry whose immersion in science had helped him successfully navigate a culturally jarring transition to New York City. His lab at Columbia University was on the uppermost floor of Schermerhorn Hall, up-atmosphere from the more traditional biology labs and their more traditional odors. Morgan’s lab was called the Fly Room by its inhabitants, not because of the numerous houseflies that competed with the roaches for the lab’s effluvium, but for the much smaller critters that occupied all of those empty milk bottles: fruit flies. Though the fruit flies themselves don’t stink, they were responsible for the lab’s odors and dishevelment. For fruit flies, as their name suggests, eat fruit; they deposit their eggs in fruit as well. They prefer their fruit overripe—rotten by human standards—and to keep them happy there was plenty of rotting fruit, especially bananas, strewn about the lab. In fact, according to lab lore, it was someone’s lunch banana left inadvertently on a window sill that initially attracted fruit flies to the lab.
The utility of fruit flies was not immediately apparent to Morgan, though. Originally he wanted to use mice for his experiments. But mice had their limitations, given Morgan’s goal. Morgan needed subjects that have short life spans and can produce numerous generations per year. As mammals go, mice are pretty good in that respect, but all mammals, including mice, are relatively slow breeders compared with insects and many other invertebrate animals. So it was fortunate, in retrospect, that Morgan’s early mouse-based grant applications were turned down and he was forced to “go outside the mammal box”—way outside. He settled, fatefully, on fruit flies, which were readily available, easy to maintain in the milk bottles, and could produce in one year a whopping fifty generations. Though he couldn’t know it at the time, fruit flies would remain the animal of choice for many genetic investigations to this day.
Initially, though, it wasn’t at all obvious that Morgan had made the right choice. Notwithstanding the many generations of fruit flies bred in the lab, after two full years nary a mutation could be found. Morgan was close to despair at having wasted so much valuable research money and time on this increasingly quixotic-seeming quest. It wasn’t that the fruit flies weren’t mutating: all living things experience mutation; it is a fact of life. But the primary virtue of fruit flies, their short generations, came at a price: small size. As such, the only fruit fly mutations that the Fly Room scientists could identify were those that caused an alteration in a fruit fly’s appearance gross enough to be spotted through one of those hand lenses, yet were not lethal. Such mutations are extremely rare.
Finally, though, in the third year of their quest came their first success—a fruit fly born with white eyes. Normal fruit flies have wine-red eyes. Morgan referred to the red-eyed flies as the wild type. The white-eyed mutants were in fact blind, but they could still breed under the right conditions. After numerous crosses between white-eyed mutants and wild-type red-eyed individuals, Morgan and his coworkers were able to trace the white-eye mutation to a particular chromosome—a sex chromosome, as it happens.
At this point, some important terminological distinctions are in order. Let’s start with the notion of a chromosome. Chromosomes (“colored bodies”) are so called because they look purplish-brown under a microscope. By Morgan’s time, most scientists were convinced that genes resided on chromosomes. Though they did not know what chromosomes physically consisted of, chromosomes were assumed to be linear. One common conception was that genes were like beads on a chromosomal string. That simile will do for now. A gene, then, has a particular location on a chromosome, an address called a locus (plural loci). Sometimes there is only one genetic variant at a locus, but more often two or more. These variants are called alleles. Think of alleles as the different colors of beads found at a locus. Some loci have beads of only one color (that is, one type of allele), whereas most loci have two or more types of alleles and hence two or more colors.
Morgan found a mutation in a gene that affects eye development. This mutation caused the formation of a new allele, a differently colored bead, at that locus. It was this new allele that Morgan labeled “white.” There is no white-eye locus, only a white-eye allele at a locus that influences eye development. We humans have a couple of gene loci that influence eye color, one of which has two variant alleles that largely determine whether our eyes are brown or blue.
There are actually two distinct senses of the term allele in genetics. In the foregoing, I have treated alleles as types. But there is another sense of allele—as tokens. By token, I mean the particular instance of a certain type. We inherit two allele tokens at each locus, one from each parent. If the two allele tokens are of the same type, we are homozygous for that locus. If the two allele tokens are of different types, we are heterozygous for that locus. Let’s consider the trait of human eye color. I will assume, for the sake of exposition, that just one locus and two allele types are involved in the determination of eye color.
If we are homozygous for the “brown” allele, our eyes will be brown; if we are homozygous for the “blue” allele, our eyes will be blue. If we are heterozygous, however, things are more complicated. The two alleles could have equal influence, in which case our eye color would be intermediate (some shade of green). Often, though, one allele type has a greater effect on a trait than the other; sometimes the “stronger” allele completely masks the “weaker” allele in the heterozygous condition. In those cases in which there are pronounced differences between stronger and weaker alleles, the stronger allele is said to be dominant and the weaker allele recessive. For human eye color, the brown allele tends to be dominant and the blue allele recessive. The convention is to label the dominant allele with an uppercase letter and the recessive allele with a lowercase letter. The brown allele would be denoted with a “B,” the blue allele with a “b.” That is all the classical genetics you need to know for the purposes of this book.
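For readers who find a few lines of code easier to follow than a paragraph, the eye color example can be sketched as a toy model. It rests on the same simplifying assumptions made above (one locus, two allele types) and adds one more, that brown is completely dominant. The function names are invented for this illustration.

```python
# Toy model of the single-locus eye color example.
# Assumptions: one locus, two allele types, and "B" (brown) completely
# dominant over "b" (blue). Real eye color is more complicated.

def eye_color(allele_from_mother: str, allele_from_father: str) -> str:
    """Return the phenotype for a pair of allele tokens, each "B" or "b"."""
    tokens = {allele_from_mother, allele_from_father}
    if not tokens <= {"B", "b"}:
        raise ValueError("alleles must be 'B' or 'b'")
    # A single dominant token is enough to mask the recessive allele.
    return "brown" if "B" in tokens else "blue"


def is_homozygous(allele_from_mother: str, allele_from_father: str) -> bool:
    """True when both allele tokens are of the same type."""
    return allele_from_mother == allele_from_father


for genotype in ("BB", "Bb", "bB", "bb"):
    print(genotype, eye_color(*genotype), is_homozygous(*genotype))
# BB brown True
# Bb brown False
# bB brown False
# bb blue True
```

Only the homozygous "bb" genotype yields blue eyes in this sketch, which is what it means for the blue allele to be recessive.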
The Gene Incarnate
Morgan, following Mendel, defined genes with respect to traits, such as eye color. He assumed, again following Mendel, that one gene (locus) corresponded to one trait, and that the gene variants (alleles) corresponded, in some straightforward way, to the variant traits: red or white eyes in fruit flies, brown or blue eyes in humans. That was a reasonable place to start. But it soon became apparent to all but a few unreconstructed Mendelians that most traits weren’t like eye color, but were more like height. That is, most traits vary quantitatively (continuously), not qualitatively (discretely). Moreover, differences in height must result from the contributions of many genes, as well as a host of environmental factors.
Morgan himself was not at all interested in what genes were physically, that is, with the material nature of genehood. For his purposes it was sufficient to know that they were units of inheritance that resided on chromosomes, and that they often came in more than one flavor. It was left to others to discover the physical (biochemical) gene.
The first step in this quest was to figure out what chromosomes consisted of. Chromosomes, it turned out, were composed of two distinct kinds of biochemicals: DNA and proteins. The question then became whether it was the DNA or the proteins that acted as the hereditary material. The issue was decided in favor of DNA through a series of groundbreaking experiments. But now there was a new problem. Though proteins could not be the hereditary material, they, and not DNA, were clearly the primary physiological actors within cells. Some proteins are enzymes that catalyze biochemical reactions, others serve to bind and transport essential elements and chemicals, and still others make up the structural elements in muscles, skin, and cartilage. Somehow, these essential proteins must be made from DNA. But proteins come in multitudinous varieties while all DNA seemed similar. How, then, were all of these proteins made from seemingly unvarying DNA? To answer this question, scientists had to take a closer look at DNA.
It was discovered that the DNA molecule generally has two strands that coil in a helical fashion, the double helix. The “D” in DNA stands for the sugar deoxyribose (NA = nucleic acid). The deoxyribose sugar groups, each separated from the next by a phosphate group, make up the backbone of the DNA molecule. Attached to each sugar is a chemical called a base (as in basic, as opposed to acidic). The bases come in four varieties (adenine, cytosine, guanine, and thymine), which are generally denoted by their first letters: A, C, G, and T, respectively. Each base on one strand is bound to a base on the other, connecting the two strands like steps on a ladder. But A can bind only to T (and vice versa), and C only to G (and vice versa). So there are four types of steps: A-T, T-A, C-G, and G-C.
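The pairing rule means that the sequence of one strand fully determines the sequence of the other. A minimal sketch of that idea follows; the function name is invented for illustration, and the sketch pairs bases position by position, ignoring the fact that the two strands of a real double helix run in opposite directions.

```python
# Base-pairing rule: A binds only to T, and C binds only to G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the bases that sit opposite each base of the given strand."""
    return "".join(PAIRS[base] for base in strand)


print(complementary_strand("ATCG"))  # TAGC
```

Applying the function twice returns the original strand, which is one way of seeing that each strand carries the same information as its partner.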
Francis Crick and James Watson are celebrated for their characterization of DNA as just described, and for their suggestion that the base sequence might relate to the composition of proteins.2 Soon thereafter came the discovery of the genetic code. This code concerns the mapping of the DNA base sequence to amino acids, which are the building blocks of proteins. This mapping is not nearly as precise as human-devised codes, such as Morse code.
The genetic code implied that genes must consist of linear base sequences. But where are the boundaries? How can you tell where one gene stops and another begins? By the late 1950s these questions seemed answered. Morgan’s “one gene (locus) = one trait” became “one gene (locus) = one protein.”3 This formulation provided a straightforward way to delineate genes on a chromosome. Geneticists simply had to find the place on the chromosome where coding started and stopped. But this wonderfully simple formula of “one gene = one protein” was, it turned out, too simple. The relationship between genes and proteins is not nearly that straightforward.
What Genes Actually Do
The process whereby proteins are constructed from genes is called protein synthesis. Protein synthesis is a two-stage process. During the first stage, called transcription, one strand of the double helix serves as a template for the creation of a molecule called messenger RNA (mRNA). The term transcription is meant to connote the transfer of information from one medium to another, as in musical transcription from, say, piano to guitar. In this case, the transcription is from DNA to RNA.
During the second stage, called translation, the mRNA serves as a template for the creation of a protoprotein. The term translation is meant to connote a larger transformation of this information, like that which occurs when one language is translated into another. In protein synthesis, the translation is from the language of the base sequence of the RNA to the amino acid sequence of the protoprotein. The protoprotein is generally not functional. It must be further transformed into a functional protein through a process called posttranslational processing. Posttranslational processing can render the functional protein quite different from what would be predicted given the original DNA sequence alone.4
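The two stages can be sketched in a few lines of code, with some caveats. The sketch relies on two details not spelled out above: RNA uses the base uracil (U) where DNA uses thymine, and the mRNA is read three bases at a time, each triplet (a codon) specifying one amino acid or a stop signal. The codon table here is deliberately partial, the example sequence is made up, strand direction is ignored, and posttranslational processing is left out entirely.

```python
# Transcription: each base on the DNA template strand is paired with its
# RNA partner (U takes the place of T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# A small subset of the standard genetic code, just enough for the
# example below. None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "UAA": None,
}


def transcribe(template_strand: str) -> str:
    """Build the mRNA sequence templated by a DNA strand."""
    return "".join(DNA_TO_RNA[base] for base in template_strand)


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon appears."""
    amino_acids = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon: the protoprotein ends here
            break
        amino_acids.append(amino_acid)
    return amino_acids


mrna = transcribe("TACAAACCGATT")  # made-up template sequence
print(mrna)             # AUGUUUGGCUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly']
```

The output is only the amino acid sequence of the protoprotein. Nothing in the code decides when, or whether, transcription happens at all, a point that matters for what follows.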
It is tempting to view a gene as the instructor of protein synthesis, to attribute to the gene an executive function. Here is a metaphor that may illuminate this notion of the executive gene. Think of the cell as a theatrical production, a play. On this view, the gene functions as the director of the play, the proteins are the actors, and all of the other biochemicals in the cell function as stagehands. Genes direct the construction of proteins, through which they control cellular activities, including the construction of all other biochemicals (such as lipids and carbohydrates), which, in their turn, labor to the genes’ ends.
The problem with this view is that it gives the gene too much credit for what goes on during protein synthesis, and in the cell generally. The role of the gene in protein synthesis is to serve as an indirect template for a protoprotein. This templating function is crucial but does not render a gene an executive, any more than the particular plate from which this page was printed functioned as an executive during the printing process.
An alternative to the executive gene is what I will refer to as the “executive cell.” From this perspective, genes look more like members of an ensemble cast of biochemicals, the interactions of which constitute a cell. The executive function resides at the cellular level; it cannot be localized in its parts.5 Genes function as material resources for the cell. On this view, each stage of protein synthesis is guided at the cellular level.6 But most fundamentally, the “decisions” as to which genes will engage in protein synthesis at any point in time are a function of the cell, not of the genes themselves. That is, gene regulation is a cellular activity. This is true for both garden-variety gene regulation and epigenetic gene regulation. Epigenetics is one form of cellular control over gene activity on this view.
Genes and Traits
Genes influence our traits through the proteins constructed from them. The eye color locus of fruit flies codes for a protein that transports red and brown pigments across cell membranes. Morgan’s mutant white-eye allele has a different base sequence than the wild-type allele, and hence codes for a protein that is deficient in that respect. As a result, the fly ends up with eyes that have no pigment and thus appear white. A defect of a similar transport gene in humans often results in cystic fibrosis. Much work on the genetics of human diseases follows a similar script: a mutant allele causes a specific defect in development. Some of the more topical of such defects are obesity, diabetes, breast cancer, depression, schizophrenia, and substance abuse. And invariably, one or more such mutant alleles have been discovered. Hence the talk of genes for obesity, breast cancer, schizophrenia, substance abuse, etc. It is important to note, however, that the mutant alleles discovered to date account for only a small fraction of these disease states.7
Prior to the advent of epigenetics, this quest for mutant alleles (altered base sequences) dominated biological research on these diseases. But disease researchers have become quite cognizant of epigenetics of late, hence the mutant allele approach is now augmented with a search for mutant epialleles, that is, alleles with abnormal epigenetic marks.8
There is another, quite different sort of explanation in which genes figure, which concerns the normal course of development. With respect to Morgan’s fruit flies, the issue would be: how do they usually come to have red eyes? For a fly to have red eyes, it must first of course have eyes. To have eyes, it must first have a nervous system, and so on. In explaining these increasingly generic features of normal development, biologists typically zoom out a bit and shift from the executive gene to the “executive genome.” In essence, on the traditional view, normal development results from the coordinated activities of executive genes, which collectively constitute a genetic program.
On the executive cell account, development is a function of coordinated gene actions as well. But this coordination is not programmed into the DNA sequence; rather, it emerges as a result of the cells’ interactions with their environment, most notably other cells. We will explore these two views later in the book; for now, it is sufficient to note that epigenetic processes are at the heart of both views.
The Elusive Gene
There used to be a consensus as to what constitutes a gene, but no longer.9 In the 1960s, there was one gene concept, embodied in the “one gene = one protein” rule. I will call this “the canonical gene.” The canonical gene was soon stretched, but not unduly, by the discovery that many genes code for more than one protein. More recent developments, though, have stretched the gene concept beyond recognition. Now pieces of DNA that don’t code for proteins at all are called genes.10
For the purposes of this book, a gene has two components: a protein-coding sequence and a control panel.11 The latter is a regulatory region to which proteins and other chemicals bind, either inhibiting or promoting transcription. This is where all garden-variety gene regulation and some epigenetic gene regulation occur. The control panel is not usually considered part of the gene proper because it is not transcribed. But the control panel and the coding sequence constitute a functional unit, so the two will be combined under the gene rubric here.
Much, perhaps most, epigenetic gene regulation occurs via chemical attachments outside of the gene proper, even on this expanded definition of genehood. That is, the attachments occur outside of the gene that is epigenetically regulated. In fact, epigenetic attachments can affect genes quite distant from the point of attachment. So it is best to think of epigenetic processes as modifications of DNA, not just of individual genes.
We can think of epigenetics as a new way of looking at DNA that goes beyond the base sequence. The linear base sequence constitutes one dimension of the physical gene, albeit the primary one, but DNA is a three-dimensional molecule. Epigenetics is a science that extends the study of genes from one to three dimensions. These extra dimensions are particularly important for understanding gene regulation, which is where the epigenetic action is. First, though, let’s consider garden-variety gene regulation.