Glossary of Terms

Many of these terms are taken from the public “Talking Glossary of Genetic Terms” available from the National Human Genome Research Institute at http://genome.gov/glossary.cfm

ACGT: The acronym for the four types of bases found in a DNA molecule: adenine (A), cytosine (C), guanine (G), and thymine (T). A DNA molecule consists of two strands wound around each other, with the two strands held together by bonds between the bases. Adenine pairs with thymine, and cytosine pairs with guanine. The sequence of bases in a portion of a DNA molecule, called a gene, carries the instructions needed to assemble a protein.

Allele: One of two or more versions of a gene. An individual inherits two alleles for each gene, one from each parent. If the two alleles are the same, the individual is homozygous for that gene. If the alleles are different, the individual is heterozygous. Though the term “allele” was originally used to describe variation among genes, it now also refers to variation among non-coding DNA sequences.

Amino Acid: 20 different molecules used to build proteins. Proteins consist of one or more chains of amino acids called polypeptides. The sequence of the amino acid chain causes the polypeptide to fold into a shape that is biologically active. The amino acid sequences of proteins are encoded in the genes.

Antisense: The non-coding DNA strand of a gene. A cell uses the antisense DNA strand as a template for producing messenger RNA (mRNA) that directs the synthesis of a protein. Antisense can also refer to a method for silencing genes. To silence a target gene, a second gene is introduced that produces an mRNA complementary to that produced from the target gene. These two mRNAs can interact to form a double-stranded structure that cannot be used to direct protein synthesis.

Autosomal dominance: A pattern of inheritance characteristic of some genetic diseases. “Autosomal” means that the gene in question is located on one of the numbered, or non-sex, chromosomes. “Dominant” means that a single copy of the disease-associated mutation is enough to cause the disease. This is in contrast to a recessive disorder, where two copies of the mutation are needed to cause the disease. Huntington’s disease is a common example of an autosomal dominant genetic disorder.

Bacteria: Small single-celled organisms. Bacteria are found almost everywhere on Earth and are vital to the planet’s ecosystems. Some species can live under extreme conditions of temperature and pressure. The human body is full of bacteria, and in fact is estimated to contain more bacterial cells than human cells. Most bacteria in the body are harmless, and some are even helpful. A relatively small number of species cause disease.

Base pair: The two chemical bases bonded to one another forming a “rung of the DNA ladder.” The DNA molecule consists of two strands that wind around each other like a twisted ladder. Each strand has a backbone made of alternating sugar (deoxyribose) and phosphate groups. Attached to each sugar is one of four bases--adenine (A), cytosine (C), guanine (G), or thymine (T). The two strands are held together by hydrogen bonds between the bases, with adenine forming a base pair with thymine, and cytosine forming a base pair with guanine.
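
The pairing rule in this entry can be sketched in a few lines of Python. This is an illustrative sketch only: the names `PAIRING`, `complement`, and `reverse_complement` are chosen here and do not come from the glossary, and the reversal step assumes the usual convention that the two strands run in opposite directions.

```python
# Pairing rule from the entry: A pairs with T, and C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base that pairs with each base of `strand`, position by position."""
    return "".join(PAIRING[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own reading direction (assumed antiparallel)."""
    return complement(strand)[::-1]

print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Applying `complement` twice returns the original strand, which reflects the fact that each base has exactly one pairing partner.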

BRCA1 and BRCA2: The first two genes found to be associated with an inherited form of cancer. Both genes normally act as tumor suppressors, meaning that they help regulate cell division. When these genes are rendered inactive due to mutation, uncontrolled cell growth results, leading to breast cancer. Women with mutations in either gene have a much higher risk for developing breast cancer than women without mutations in the genes.

Carcinogen: An agent with the capacity to cause cancer in humans. Carcinogens may be natural, such as aflatoxin, which is produced by a fungus and sometimes found on stored grains, or manmade, such as asbestos or tobacco smoke. Carcinogens work by interacting with a cell’s DNA and inducing genetic mutations.

Carrier: An individual who carries and is capable of passing on a genetic mutation associated with a disease and may or may not display disease symptoms. Carriers are associated with diseases inherited as recessive traits. In order to have the disease, an individual must have inherited mutated alleles from both parents. An individual having one normal allele and one mutated allele does not have the disease. Two carriers may produce children with the disease.
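
The statement that two carriers may produce children with the disease can be worked through as a small Punnett-square calculation. The sketch below assumes each parent passes on either of its two alleles with equal chance; the allele labels "A" (normal) and "a" (mutated) and the function name `offspring_genotypes` are illustrative choices.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def offspring_genotypes(parent1: str, parent2: str) -> dict:
    """Probability of each offspring genotype, assuming each parent
    passes on either of its two alleles with equal chance."""
    crosses = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(crosses.values())
    return {genotype: Fraction(count, total) for genotype, count in crosses.items()}

# Both parents are carriers: one normal allele "A" and one mutated allele "a".
result = offspring_genotypes("Aa", "Aa")
for genotype, probability in sorted(result.items()):
    print(genotype, probability)
# AA 1/4
# Aa 1/2
# aa 1/4
```

Under these assumptions, each child of two carriers has a one-in-four chance of inheriting two mutated alleles and a one-in-two chance of being a carrier.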

Cell: The basic building block of living things. All cells can be sorted into one of two groups: eukaryotes and prokaryotes. A eukaryote has a nucleus and membrane-bound organelles, while a prokaryote does not. Plants and animals are made of numerous eukaryotic cells, while many microbes, such as bacteria, consist of single prokaryote cells. An adult human body is estimated to contain between 10 and 100 trillion cells.

Cell membrane (also called plasma membrane): A membrane found in all cells that separates the interior of the cell from the outside environment. The cell membrane consists of a lipid bilayer that is semipermeable. The cell membrane regulates the transport of materials entering and exiting the cell.

Chromosome: An organized package of DNA found in the nucleus of the cell. Different organisms have different numbers of chromosomes. Humans have 23 pairs of chromosomes--22 pairs of numbered chromosomes, called autosomes, and one pair of sex chromosomes, XX (female) or XY (male). Each parent contributes one chromosome to each pair so that offspring get half of their chromosomes from their mother and half from their father.

Codon: A trinucleotide sequence of DNA or RNA that corresponds to a specific amino acid. The genetic code describes the relationship between the sequence of DNA bases (A, C, G, and T) in a gene and the corresponding protein sequence that it encodes. The cell reads the sequence of the gene in groups of three bases. There are 64 different codons: 61 specify amino acids while the remaining three are used as stop signals.
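
The count of 64 codons follows from four possible bases at each of three positions (4 x 4 x 4). A short Python sketch makes the arithmetic concrete; the three stop codons listed are those of the standard genetic code in the DNA alphabet, and the variable names are illustrative.

```python
from itertools import product

BASES = "ACGT"
# The three stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Every ordered combination of three bases.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
coding_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons))     # 64
print(len(coding_codons))  # 61
```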

Congenital conditions: Those present from birth. Birth defects are described as being congenital. They can be caused by a genetic mutation, an unfavorable environment in the uterus, or a combination of both factors.

Copy number variation (CNV): When the number of copies of a particular gene varies from one individual to the next. Following the completion of the Human Genome Project, it became apparent that the genome experiences gains and losses of genetic material. The extent to which copy number variation contributes to human disease is not yet known. It has long been recognized that some cancers are associated with elevated copy numbers of particular genes.

Cystic fibrosis: A hereditary disease characterized by faulty digestion, breathing problems, respiratory infections from mucus buildup, and the loss of salt in sweat. The disease is caused by mutations in a single gene and is inherited as an autosomal recessive trait, meaning that an affected individual must inherit two mutated copies of the gene to get the disease. In the past, cystic fibrosis was almost always fatal in childhood. Today, however, patients commonly live to be 30 years or older.

Cytoplasm: The gelatinous liquid that fills the inside of a cell. It is composed of water, salts, and various organic molecules. Some intracellular organelles, such as the nucleus and mitochondria, are enclosed by membranes that separate them from the cytoplasm.

Diabetes mellitus: A disease characterized by an inability to make or use the hormone insulin. Insulin is needed by cells to metabolize glucose, the body’s main source of chemical energy. Type I diabetes, also called insulin-dependent diabetes mellitus, is usually caused by an autoimmune destruction of insulin-producing cells. Type II diabetes, also called non-insulin-dependent diabetes mellitus, occurs when cells become resistant to the effects of insulin.

Diploid: A cell or organism that has paired chromosomes, one from each parent. In humans, cells other than sex cells are diploid and have 23 pairs of chromosomes. Human sex cells (egg and sperm cells) contain a single set of chromosomes and are known as haploid.

DNA (Deoxyribonucleic Acid): The chemical name for the molecule that carries genetic instructions in all living things. The DNA molecule consists of two strands that wind around one another to form a shape known as a double helix. Each strand has a backbone made of alternating sugar (deoxyribose) and phosphate groups. Attached to each sugar is one of four bases--adenine (A), cytosine (C), guanine (G), or thymine (T). The two strands are held together by bonds between the bases; adenine bonds with thymine, and cytosine bonds with guanine. The sequence of the bases along the backbones serves as instructions for assembling protein and RNA molecules.

DNA sequencing: A laboratory technique used to determine the exact sequence of bases (A, C, G, and T) in a DNA molecule. The DNA base sequence carries the information a cell needs to assemble protein and RNA molecules. DNA sequence information is important to scientists investigating the functions of genes. The technology of DNA sequencing was made faster and less expensive as a part of the Human Genome Project.

Duplication: A type of mutation that involves the production of one or more copies of a gene or region of a chromosome. Gene and chromosome duplications occur in all organisms, though they are especially prominent among plants. Gene duplication is an important mechanism by which evolution occurs.

Enzyme: A biological catalyst, almost always a protein. It speeds up the rate of a specific chemical reaction in the cell. The enzyme is not destroyed during the reaction and is used over and over again. A cell contains thousands of different types of enzyme molecules, each specific to a particular chemical reaction.

Exon: The portion of a gene that codes for amino acids. In the cells of plants and animals, most gene sequences are broken up by one or more DNA sequences called introns. The parts of the gene sequence that are expressed in the protein are called exons, because they are expressed, while the parts of the gene sequence that are not expressed in the protein are called introns, because they come in between--or interfere with--the exons.

Founder effect: The reduction in genetic variation that results when a small subset of a large population is used to establish a new colony. The new population may be very different from the original population, both in terms of its genotypes and phenotypes. In some cases, the founder effect plays a role in the emergence of new species.

Frameshift mutation: A type of mutation involving the insertion or deletion of nucleotides in which the number of inserted or deleted base pairs is not divisible by three. “Divisible by three” is important because the cell reads a gene in groups of three bases. Each group of three bases corresponds to one of 20 different amino acids used to build a protein. If a mutation disrupts this reading frame, then the entire DNA sequence following the mutation will be read incorrectly.
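
The effect of the "divisible by three" rule can be seen by splitting a sequence into three-base groups before and after a deletion. The sketch below uses a made-up 12-base sequence; the helper names `codons` and `is_frameshift` are illustrative.

```python
def codons(sequence: str) -> list:
    """Split a sequence into consecutive complete three-base groups."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]

def is_frameshift(indel_length: int) -> bool:
    """An insertion or deletion shifts the reading frame unless its length is a multiple of three."""
    return indel_length % 3 != 0

original = "ATGGCCAAGTTT"                    # made-up example sequence
one_deleted = original[:3] + original[4:]    # delete one base after ATG
three_deleted = original[:3] + original[6:]  # delete one whole codon

print(codons(original))       # ['ATG', 'GCC', 'AAG', 'TTT']
print(codons(one_deleted))    # ['ATG', 'CCA', 'AGT']
print(codons(three_deleted))  # ['ATG', 'AAG', 'TTT']
print(is_frameshift(1), is_frameshift(3))  # True False
```

Deleting one base changes every group downstream of the deletion, while deleting three bases removes one group and leaves the rest of the reading frame intact.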

Gene: The basic physical unit of inheritance. Genes are passed from parents to offspring and contain the information needed to specify traits. Genes are arranged, one after another, on structures called chromosomes. A chromosome contains a single, long DNA molecule, only a portion of which corresponds to a single gene. Humans have approximately 25,000 genes arranged on their chromosomes.

Gene expression: The process by which the information encoded in a gene is used to direct the assembly of a protein molecule. The cell reads the sequence of the gene in groups of three bases. Each group of three bases (codon) corresponds to one of 20 different amino acids used to build the protein.

Gene mapping: The process of establishing the locations of genes on the chromosomes. Early gene maps used linkage analysis. The closer two genes are to each other on the chromosome, the more likely it is that they will be inherited together. By following inheritance patterns, the relative positions of genes can be determined. More recently, scientists have used recombinant DNA (rDNA) techniques to establish the actual physical locations of genes on the chromosomes.

Genome: The entire set of genetic instructions found in a cell. In humans, the genome consists of 23 pairs of chromosomes, found in the nucleus, as well as a small chromosome found in the cells’ mitochondria. These chromosomes, taken together, contain approximately 3 billion bases of DNA sequence.

Genome-wide association study (GWAS): An approach used in genetics research to associate specific genetic variations with particular diseases. The method involves scanning the genomes from many different people and looking for genetic markers that can be used to predict the presence of a disease. Once such genetic markers are identified, they can be used to understand how genes contribute to the disease and develop better prevention and treatment strategies.

Genomics: Refers to the study of the entire genome of an organism, whereas genetics refers to the study of a particular gene.

Genotype: An individual’s collection of genes. The term also can refer to the two alleles inherited for a particular gene. The genotype is expressed when the information encoded in the genes’ DNA is used to make protein and RNA molecules. The expression of the genotype contributes to the individual’s observable traits, called the phenotype.

Heterozygous: Where an individual inherits different forms of a particular gene from each parent. A heterozygous genotype contrasts with a homozygous genotype, where an individual inherits identical forms of a particular gene from each parent.

Homozygous: Where an individual inherits the same alleles for a particular gene from both parents.

The Human Genome Project: An international project that mapped and sequenced the entire human genome. The project was completed in April 2003, and its data are freely available to researchers and others interested in genetics and human health.

Intron: A portion of a gene that does not code for amino acids. In the cells of plants and animals, most gene sequences are broken up by one or more introns. The parts of the gene sequence that are expressed in the protein are called exons, because they are expressed, while the parts of the gene sequence that are not expressed in the protein are called introns, because they come in between the exons.

Messenger RNA (mRNA): A single-stranded RNA molecule that is complementary to one of the DNA strands of a gene. The mRNA is an RNA version of the gene that leaves the cell nucleus and moves to the cytoplasm where proteins are made. During protein synthesis, an organelle called a ribosome moves along the mRNA, reads its base sequence, and uses the genetic code to translate each three-base triplet, or codon, into its corresponding amino acid.

Mitochondria: Membrane-bound cell organelles (mitochondrion, singular) that generate most of the chemical energy needed to power the cell’s biochemical reactions. Mitochondria contain their own small chromosomes. Generally, mitochondria, and therefore mitochondrial DNA, are inherited only from the mother.

Monomer: The simplest repeating unit of a polymer, which is a chain of similar units linked together.

Mutation: A change in a DNA sequence. Mutations can result from DNA copying mistakes made during cell division, exposure to ionizing radiation, exposure to chemicals called mutagens, or infection by viruses. Germ line mutations occur in the eggs and sperm and can be passed on to offspring, while somatic mutations occur in body cells and are not passed on.

Nonsense mutation: The substitution of a single base pair that leads to the appearance of a stop codon where previously there was a codon specifying an amino acid. The presence of this premature stop codon results in the production of a shortened, and likely nonfunctional, protein.

Nucleotide: The basic building block of nucleic acids. RNA and DNA are polymers made of long chains of nucleotides. A nucleotide consists of a sugar molecule (either ribose in RNA or deoxyribose in DNA) attached to a phosphate group and a nitrogen-containing base. The bases used in DNA are adenine (A), cytosine (C), guanine (G), and thymine (T). In RNA, the base uracil (U) takes the place of thymine.

Oligomer: A molecule that consists of a relatively small and specifiable number of monomers.

Oligonucleotide: Any molecule that contains a small number of nucleotide units connected by phosphodiester linkages between (usually) the 3’ position of one nucleotide and the 5’ position of the adjacent one. The number of nucleotide units in these small single-stranded nucleic acids (usually DNA) is variable but often in the range of 6 to 24 (hexamer to 24mer).

Oncogene: A mutated gene that contributes to the development of a cancer. In their normal, unmutated state, oncogenes are called proto-oncogenes, and they play roles in the regulation of cell division. Some oncogenes work like putting your foot down on the accelerator of a car, pushing a cell to divide. Other oncogenes work like removing your foot from the brake while parked on a hill, also causing the cell to divide.

Open reading frame: A portion of a DNA molecule that, when translated into amino acids, contains no stop codons. The genetic code reads DNA sequences in groups of three base pairs (each group being a codon).
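
Following this entry's definition (a stretch with no stop codons), a scan for the longest open stretch in one reading frame might look like the sketch below. It is a simplification: gene-finding tools typically also require a start codon, which this sketch does not, and the function name `longest_open_stretch` and the example sequence are made up for illustration.

```python
# The three stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_open_stretch(dna: str, frame: int = 0) -> str:
    """Longest run of consecutive codons in the given frame (0, 1, or 2)
    that contains no stop codon."""
    best, current = "", ""
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            current = ""  # a stop codon closes the current stretch
        else:
            current += codon
            if len(current) > len(best):
                best = current
    return best

seq = "ATGGCCTAAATGAAACCCGGGTGA"  # made-up example sequence
print(longest_open_stretch(seq))  # ATGAAACCCGGG
```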

Peptide: Two or more amino acids linked by chemical bonds. The term also refers to the type of chemical bond that joins the amino acids together. A series of linked amino acids is a polypeptide. The cell’s proteins are made from one or more polypeptides.

Personalized medicine: An emerging practice of medicine that uses an individual’s genetic profile to guide decisions made in regard to the prevention, diagnosis, and treatment of disease. Knowledge of a patient’s genetic profile can help doctors select the proper medication or therapy and administer it using the proper dose or regimen. Personalized medicine is being advanced through data from the Human Genome Project.

Pharmacogenomics: A branch of pharmacology concerned with using DNA and amino acid sequence data to inform drug development and testing. An important application of pharmacogenomics is correlating individual genetic variation with drug responses.

Phenotype: An individual’s observable traits, such as height, eye color, and blood type. The genetic contribution to the phenotype is called the genotype. Some traits are largely determined by the genotype, while other traits are largely determined by environmental factors.

Plasma membrane: See cell membrane

Point mutation: When a single base pair is altered. Point mutations can have one of three effects. First, the base substitution can be a silent mutation where the altered codon corresponds to the same amino acid. Second, the base substitution can be a missense mutation where the altered codon corresponds to a different amino acid. Or third, the base substitution can be a nonsense mutation where the altered codon corresponds to a stop signal.
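
The three outcomes described in this entry can be told apart by looking up the codon before and after the substitution. The sketch below builds the standard genetic code table from a compact 64-letter string (codons ordered with bases T, C, A, G; "*" marks a stop). The function name `classify_point_mutation` is illustrative, and the original codon is assumed to specify an amino acid rather than a stop.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G; "*" marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

def classify_point_mutation(original: str, mutated: str) -> str:
    """Classify a single-base substitution within one codon.

    Assumes `original` codes for an amino acid (not a stop signal).
    """
    if len(original) != 3 or len(mutated) != 3:
        raise ValueError("codons must be three bases long")
    if sum(a != b for a, b in zip(original, mutated)) != 1:
        raise ValueError("expected exactly one substituted base")
    before, after = CODON_TABLE[original], CODON_TABLE[mutated]
    if after == before:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"

print(classify_point_mutation("GAA", "GAG"))  # silent   (Glu -> Glu)
print(classify_point_mutation("GAG", "GTG"))  # missense (Glu -> Val)
print(classify_point_mutation("TAC", "TAA"))  # nonsense (Tyr -> stop)
```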

Proteins: An important class of molecules found in all living cells. A protein is composed of one or more long chains of amino acids, the sequence of which corresponds to the DNA sequence of the gene that encodes it. Proteins play a variety of roles in the cell, including structural (cytoskeleton), mechanical (muscle), biochemical (enzymes), and cell signaling (hormones). Proteins are also an essential part of the diet.

Recessive: A quality found in the relationship between two versions of a gene. Individuals receive one version of a gene, called an allele, from each parent. If the alleles are different, the dominant allele will be expressed, while the effect of the other allele, called recessive, is masked. In the case of a recessive genetic disorder, an individual must inherit two copies of the mutated allele in order for the disease to be present.

Ribosome: A cellular particle made of RNA and protein that serves as the site for protein synthesis in the cell. The ribosome reads the sequence of the messenger RNA (mRNA) and, using the genetic code, translates the sequence of RNA bases into a sequence of amino acids.

Ribonucleic acid (RNA): A molecule similar to DNA. Unlike DNA, RNA is single-stranded. An RNA strand has a backbone made of alternating sugar (ribose) and phosphate groups. Attached to each sugar is one of four bases--adenine (A), uracil (U), cytosine (C), or guanine (G). Different types of RNA exist in the cell: messenger RNA (mRNA), ribosomal RNA (rRNA), and transfer RNA (tRNA). More recently microRNA has been found to be involved in regulating gene expression.

Sex linked: A trait in which a gene is located on a sex chromosome. In humans, the term generally refers to traits that are influenced by genes on the X chromosome. This is because the X chromosome is large and contains many more genes than the smaller Y chromosome. In a sex-linked disease, such as Duchenne muscular dystrophy, it is usually males who are affected because they have a single copy of the X chromosome that carries the mutation. In females the effect of the mutation may be masked by the second healthy copy of the X chromosome.

Single nucleotide polymorphisms (SNPs): A type of polymorphism involving variation of a single base pair. Scientists are studying how single nucleotide polymorphisms, or SNPs (pronounced “snips”), in the human genome correlate with disease, drug response, and other phenotypes.

Transcription: The process of making an RNA copy of a gene sequence. This copy, called a messenger RNA (mRNA) molecule, leaves the cell nucleus and enters the cytoplasm, where it directs the synthesis of the protein that it encodes.
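
As a rough sketch of the copying step, the mRNA can be built base by base from the DNA template strand, with uracil (U) taking the place of thymine. The code assumes the template is written 5'-to-3' and returns the mRNA 5'-to-3', so the complemented bases are reversed; the name `transcribe` is illustrative, and processing steps such as intron removal are ignored.

```python
# DNA template base -> RNA base that pairs with it (U replaces T in RNA).
PAIRING = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_strand: str) -> str:
    """Build the mRNA complementary to a DNA template strand.

    The template is given 5'-to-3'; the mRNA is returned 5'-to-3' as well,
    so the complemented bases are emitted in reverse order.
    """
    return "".join(PAIRING[base] for base in reversed(template_strand.upper()))

print(transcribe("GGCCAT"))  # AUGGCC
```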

Translation: The process of translating the sequence of a messenger RNA (mRNA) molecule to a sequence of amino acids during protein synthesis. The genetic code describes the relationship between the sequence of base pairs in a gene and the corresponding amino acid sequence that it encodes. In the cell cytoplasm, the ribosome reads the sequence of the mRNA in groups of three bases to assemble the protein.
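
Reading an mRNA in groups of three bases can be sketched as a table lookup. The example below builds the standard genetic code in the RNA alphabet from a compact 64-letter string of one-letter amino acid codes ("*" marks a stop) and reads from the first base of the input; the name `translate` is illustrative, and the search for a start codon is left out for brevity.

```python
from itertools import product

# Standard genetic code in the RNA alphabet, codons ordered with bases U, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Read an mRNA three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop signal: the chain is complete
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGGCCAAGUAA"))  # MAK
```

Here AUG, GCC, and AAG are read as methionine (M), alanine (A), and lysine (K), and UAA ends the chain.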

X-linked: See sex linked