Systems biology attempts to understand the structure, function, regulation, or development of biological systems by combining experimental and computational approaches. We must acknowledge that different parts of cellular organization are studied and understood in different ways and to different extents. This is related to the diverse experimental techniques that can be used to measure the abundance of metabolites, proteins, mRNA, or other types of compounds. For example, enzyme kinetic measurements have been performed for more than a century, while mRNA measurements (e.g., RNAseq or microarray data) or protein measurements (e.g., by mass spectrometry) have been developed more recently. Not all data can be provided with the same accuracy and reproducibility. These and other complications in studying life have caused nonuniform progress in modeling different parts of cellular life. Moreover, the diversity of scientific questions and the availability of computational tools to tackle them have driven the development of very different types of models for different biological processes. A recent development supporting the simulation and modeling of biochemical networks is the assembly of parts lists for various networks, such as genome-scale metabolic reconstructions, compilations of all compounds in signaling networks, lists of all transcription factors, and many more. These can often be found in specialized databases such as Reactome, KEGG, and BioModels (see Chapter 16). In this chapter, we will introduce a number of classical and more recent areas of systems biological research. Specifically, we discuss modeling of metabolic systems, signaling pathways, cell-cycle regulation, and development and differentiation, primarily with ODE systems.
Metabolic networks are, on the one side, defined by the enzymes that convert substrates into products in a reversible or irreversible manner; without enzymes, those reactions are essentially impossible or far too slow. On the other side, the networks are characterized by the metabolites that are converted by the various enzymes. Biochemical studies have revealed a number of important pathways, including catabolic pathways and pathways of energy metabolism, such as glycolysis, the pentose-phosphate pathway, the tricarboxylic acid (TCA) cycle, and oxidative phosphorylation, as well as anabolic pathways, including gluconeogenesis, amino acid synthesis pathways, synthesis of fatty acids, synthesis of nucleic acids, and synthesis and degradation of more complex molecules. Databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/pathway.html) provide a comprehensive overview of the composition of pathways in various organisms [3].
Here, we will focus on their characteristics that are essential for modeling. Chapter 2 provides a summary of the first steps of building a model. First, we sketch the metabolites and the converting reactions in a cartoon to get an overview and an intuitive understanding (Figure 12.1). Based on that cartoon and on further information, we set the limits of our model. To this end, we consider what kind of question we want to answer with the model, what information in terms of qualitative and quantitative data is available, and how we can make the model as simple as possible but as comprehensive as necessary. Then, for every compound that is part of the system, we formulate the balance equation (see also Section 3.1), summing up all reactions that produce the compound (with a positive sign) and all reactions that degrade it (with a negative sign). At this stage, the model is suited for a network analysis, such as stoichiometric analysis (Section 3.1) or, with some additional information, flux balance analysis (Section 3.2). In order to study the dynamics of the system, we must add kinetic descriptions to the individual reactions (Chapter 4). Keep in mind that the reaction kinetics may depend on the concentrations of substrates, products, and regulatory effectors, and that the corresponding kinetic parameters must be taken from experiments or databases.
In the following, we will discuss in more detail models for three example pathways: the upper glycolysis, the full glycolysis, and the threonine synthesis.
A first model of the upper part of glycolysis is depicted in Figure 12.2. It comprises seven reactions and six metabolites. Note that we neglect the formation of inorganic phosphate Pi here.
The corresponding ODE system reads
With mass action kinetics, each reaction rate v1 to v7 is proportional to the product of the concentrations of its substrates (Section 4.1). Given the values of the parameters and the initial concentrations, one may simulate the time behavior of the network as depicted in Figure 12.3.
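The following minimal sketch illustrates how such a simulation could be set up. The reaction scheme (glucose supply, hexokinase, phosphoglucoisomerase, phosphofructokinase, consumption of Fruc1,6P2, ATP consumption, and ATP regeneration) follows the general structure described above, but the assignment of reaction numbers, all rate constants, and the initial values are illustrative assumptions rather than the values behind Figures 12.2 and 12.3.

```python
# Minimal sketch: mass-action ODE model of upper glycolysis.
# All rate constants and initial values are illustrative assumptions.
import numpy as np
from scipy.integrate import solve_ivp

k = dict(k1=0.5, k2=0.1, k3=0.2, k3r=0.05, k4=0.1, k5=0.05, k6=0.02, k7=1.0)

def rates(S):
    glc, g6p, f6p, fbp, atp, adp = S
    return [
        k['k1'],                         # v1: constant glucose supply
        k['k2'] * glc * atp,             # v2: hexokinase, Glc + ATP -> Gluc6P + ADP
        k['k3'] * g6p - k['k3r'] * f6p,  # v3: phosphoglucoisomerase (reversible)
        k['k4'] * f6p * atp,             # v4: phosphofructokinase, Fruc6P + ATP -> Fruc1,6P2 + ADP
        k['k5'] * fbp,                   # v5: consumption of Fruc1,6P2 by lower glycolysis
        k['k6'] * atp,                   # v6: ATP consumption by other processes
        k['k7'] * adp,                   # v7: ATP regeneration
    ]

def ode(t, S):
    v1, v2, v3, v4, v5, v6, v7 = rates(S)
    return [v1 - v2,             # Glucose
            v2 - v3,             # Gluc6P
            v3 - v4,             # Fruc6P
            v4 - v5,             # Fruc1,6P2
            -v2 - v4 - v6 + v7,  # ATP
            v2 + v4 + v6 - v7]   # ADP (mirror of ATP: adenine nucleotides are conserved)

S0 = [0.0, 0.0, 0.0, 0.0, 2.0, 1.0]   # hexoses start at zero; ATP + ADP = 3 (arbitrary units)
sol = solve_ivp(ode, (0, 500), S0, max_step=1.0)
print("approximate steady state:", np.round(sol.y[:, -1], 3))
```

With these hypothetical parameters, the hexose concentrations rise from zero toward a steady state while the sum of ATP and ADP stays constant, reproducing the qualitative behavior described below.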
We see that, starting with zero concentrations of all hexoses (here, glucose, glucose 6-phosphate, and the fructose phosphates), they accumulate until production and degradation are balanced; they approach a steady state. ATP rises and falls to the same extent as ADP falls and rises, and their total remains constant. This is due to the conservation of adenine nucleotides, which can be revealed by stoichiometric analysis (Section 3.1).
For this upper glycolysis model, the concentration vector is S = (Glucose, Gluc6P, Fruc6P, Fruc1,6P2, ATP, ADP)T, the vector of reaction rates is v = (v1, v2, v3, v4, v5, v6, v7)T, and the stoichiometric matrix N reads
It comprises r = 7 reactions and has rank(N) = 5. Thus, the kernel matrix K (see Chapter 3) has two linearly independent columns; a possible representation can be obtained from the null space of N.
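The kernel itself is not reproduced here; the following sketch shows how it could be computed numerically. The matrix N corresponds to the illustrative seven-reaction scheme of the previous sketch, so the reaction numbering is an assumption and may differ from that of Figure 12.2.

```python
# Sketch: kernel (right null space) of a stoichiometric matrix, N * v = 0.
# N corresponds to the illustrative reaction scheme used above.
import numpy as np
from scipy.linalg import null_space

#              v1  v2  v3  v4  v5  v6  v7
N = np.array([[ 1, -1,  0,  0,  0,  0,  0],   # Glucose
              [ 0,  1, -1,  0,  0,  0,  0],   # Gluc6P
              [ 0,  0,  1, -1,  0,  0,  0],   # Fruc6P
              [ 0,  0,  0,  1, -1,  0,  0],   # Fruc1,6P2
              [ 0, -1,  0, -1,  0, -1,  1],   # ATP
              [ 0,  1,  0,  1,  0,  1, -1]])  # ADP

K = null_space(N)   # columns span all steady-state flux distributions
print("rank(N) =", np.linalg.matrix_rank(N), " kernel dimension =", K.shape[1])
```

null_space returns an orthonormal basis; any pair of linearly independent columns spanning the same space, for example one running through the glycolytic chain and one through the ATP consumption/regeneration cycle, is an equally valid representation.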
Figure 12.4 shows the flux and concentration control coefficients (see Section 4.2) for the model of upper glycolysis in gray scale (see scaling bar). Reaction v1 has a flux control of 1 over all steady-state fluxes, reactions v2, v3, v4, and v7 have no control over fluxes; they are determined by v1. Reactions v5 and v6 have positive or negative control over J4, J5, and J7, respectively, since they control the turnover of fructose phosphates.
The concentration control shows a more interesting pattern. As a rule of thumb, it holds that producing reactions have a positive control and degrading reactions have a negative control, such as v1 and v2 for glucose. But also distant reactions can exert concentration control, such as v4 to v6 over Gluc6P.
More comprehensive models of glycolysis can be used to study details of the dynamics, such as the occurrence of oscillations or the effect of perturbations. Examples are the models of Hynne et al. [4] or the Reuss group [5,6] or a model based on genome-scale reconstructions [3]. An overview of the most important reactions in glycolysis is given in Figure 12.5.
Threonine is an amino acid that is essential for birds and mammals. The synthesis pathway from aspartate involves five steps. It has been known for a long time and has attracted interest with respect to the economic industrial production of threonine for a variety of uses. The kinetics of all five enzymes from E. coli have been studied extensively, the complete genome sequence of this organism is known, and an extensive range of genetic tools is available. The intensive study and the availability of kinetic information make this pathway a good example for metabolic modeling (Figure 12.6).
The reaction system can be described with the following set of differential equations:
with
This system has no nontrivial steady state, that is, no steady state with nonzero flux, since aspartate is only consumed, while threonine is only produced. The same imbalance holds for the pairs ATP/ADP and NADPH + H+/NADP+. The dynamics is shown in Figure 12.7.
Threonine exerts feedback inhibition on the pathway producing it. This is illustrated in Figure 12.8. The higher the threonine concentration, the lower the rates of the inhibited reactions. The effect is that production of threonine is downregulated as long as its level is sufficient, thereby saving aspartate and energy.
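A common way to express such end-product inhibition in a rate law is to multiply a Michaelis–Menten term by an inhibition factor. The following small sketch is generic and uses hypothetical parameter values; it is not the rate law fitted for the E. coli enzymes.

```python
# Sketch: Michaelis-Menten rate with feedback inhibition by the end product (threonine).
# All parameter values are hypothetical.
def v_inhibited(s, thr, vmax=1.0, km=0.5, ki=0.2, n=2):
    """Rate of an early pathway step, inhibited by the end product concentration thr."""
    return vmax * s / (km + s) / (1.0 + (thr / ki) ** n)

for thr in (0.0, 0.2, 1.0):
    print(f"Thr = {thr:.1f}  ->  v = {v_inhibited(s=1.0, thr=thr):.3f}")
```

The higher the threonine level, the smaller the inhibition factor and hence the rate, which is exactly the downregulation described above.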
On the molecular level, signaling involves the same type of processes as metabolism: production or degradation of substances, molecular modifications (mainly phosphorylation, but also methylation and acetylation), localization of compounds to specific cell compartments, and activation or inhibition of reactions. From a modeling point of view, there are some important differences between signaling and metabolism. (i) Signaling pathways serve for the processing and transfer of information, while metabolism mainly provides mass transfer. (ii) The metabolic network is determined by the set of enzymes present that catalyze the reactions. Signaling pathways involve compounds of different types; they may form highly organized complexes and may dynamically assemble upon occurrence of the signal. (iii) The quantity of converted material is high in metabolism (amounts are usually given as concentrations in the order of µM or mM) compared to the number of molecules involved in signaling processes (in yeast or E. coli cells, the typical abundance of proteins in signaling cascades is in the order of 10–10^4 molecules per cell). (iv) The different amounts of components affect the concentration ratio of catalysts and substrates. In metabolism, this ratio is usually low, that is, the enzyme concentration is much lower than the substrate concentration, which gives rise to the quasi-steady-state assumption used in Michaelis–Menten kinetics (Section 4.1). In signaling processes, the amounts of catalysts and their substrates are frequently of the same order of magnitude.
Cells have a broad spectrum of mechanisms for receiving and processing signals; not all of them can be considered here. A typical sequence of events in signaling pathways is shown in Figure 12.9 and proceeds as follows.
The signal – a substance acting as a ligand, such as a hormone or growth factor, or a physical or chemical stimulus, such as oxidative or osmotic stress – approaches the cell surface and is perceived by a transmembrane receptor. The receptor changes its state from susceptible to active and triggers subsequent processes within the cell. The active receptor stimulates an internal signaling cascade.
In order to distinguish between fast and slow processes and to characterize the main location of events, we discriminate between responses on levels I–IV. Level I response comprises all processes that take place at the membrane due to receptor activation. These processes may include the recruitment of a number of proteins to the vicinity of the receptor that were previously located either at the membrane or in the cytosol. These proteins may form large heteromeric complexes, modify each other and the receptor, and thereby form a local environment of information transmission. They may also serve as anchors for cell structures such as actin cables.
Level II response includes all downstream interactions of signaling molecules in the cytoplasm. Different types of protein–protein interactions occur, such as complex formation, phosphorylation and dephosphorylation, or phosphotransfer, as well as targeting for degradation. The effects include forward signaling as well as positive and negative feedback. Crosstalk is also observed between signaling pathways that are primarily considered responsible for the transmission of specific signals. Note that the concept of crosstalk is strongly related to how the investigator perceives signaling. If we consider separate signaling pathways that transmit the information about a specific cue, we can also consider their crosstalk, that is, interactions between components of diverse signaling pathways that lead to a divergence of the information to targets other than those of the specific pathway (measures for crosstalk are discussed below). If we consider all signaling processes as one large signaling network, then we simply observe various interactions in this network.
Level III response comprises all direct effects of active signaling molecules on other cellular networks such as phosphorylation of cell-cycle components or metabolic enzymes. These effects proceed on fast time scales and may lead to immediate cellular responses.
Level IV response is comparatively slow. The sequence of state changes within the signaling pathway crosses the nuclear membrane. Eventually, transcription factors are activated or deactivated. Such a transcription factor changes its binding properties to regulatory regions on the DNA upstream of a set of genes, and the transcription rate of these genes is altered (typically increased). The newly produced proteins, or the changes in protein concentrations, cause an adaptation response of the cell to the signal. Also on level IV, signaling pathways are regulated by a number of control mechanisms, including feedback and feed-forward modulation.
This is the typical picture; however, many pathways work in a completely different manner. As an example, an overview of signaling pathways that are stimulated in the yeast stress response is given in Figure 12.10. More extended versions of the yeast signaling network, with a focus on the precise types of interactions, can be found in Ref. [7].
With respect to the activating or inhibiting effect of an external cue on cellular response features, signaling pathways may exhibit different types of logical behavior. Positive and negative logic of signaling systems can be found:
These examples are explained below in more detail along with the general structure of signal pathway building blocks (Section 12.2.3).
Many receptors are transmembrane proteins. They receive the signal and transmit it. Upon signal sensing they change their conformation. In the active form, they are able to initiate a downstream process within the cell (Figure 12.11).
The simplest concept of the interaction between receptor R and ligand L is reversible binding to form the active complex LR:
The dissociation constant is calculated as KD = [L][R]/[LR].
Typical values for KD range from the picomolar to the micromolar scale, depending on the receptor–ligand pair.
Cells have the ability to regulate the number and the activity of specific receptors, for example in order to weaken the signal transmission during long-term stimulation. Balancing production and degradation regulates the number of receptors. Phosphorylation of serine, threonine or tyrosine residues of the cytosolic domain by protein kinases mainly regulates the activity.
Hence, a more realistic scenario for ligand–receptor interaction is depicted in Figure 12.12.
We assume that the receptor may be present in an inactive state Ri or in a susceptible state Rs. The susceptible form can interact with the ligand to form the active state Ra. The inactive and the susceptible forms are produced from precursors, and all three forms may be degraded. The rates of the production and degradation processes, as well as the equilibria between the different states, are influenced by the cell state, for example, by the cell-cycle stage. In general, the dynamics of this scenario can be described by the following set of differential equations:
For the production terms, we may either assume constant values or (as mentioned above) rates that depend on the actual cell state. The degradation terms might be assumed to depend linearly on the concentrations of their substrates (e.g., vd,i = kd,i · Ri); this may also be a first guess for the state changes of the receptor (e.g., vis = kis · Ri). The receptor activation depends on the ligand concentration L (or any other value related to the signal); a linear approximation of the respective rate is va = ka · L · Rs. If the receptor is a dimer or oligomer, it might be sensible to include this information in the rate expression by using a Hill-type function of L, with a binding constant for the monomer and the Hill coefficient n (Eq. (4.44)).
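A minimal numerical sketch of this receptor module is shown below. It assumes constant production terms, linear degradation, a Hill-type activation by the ligand, and, additionally, a return of the active receptor to the susceptible state; all parameter values and this recycling step are assumptions for illustration only.

```python
# Sketch: dynamics of inactive (Ri), susceptible (Rs), and active (Ra) receptor states.
# Parameter values (and the recycling of Ra back to Rs) are illustrative assumptions.
from scipy.integrate import solve_ivp

p = dict(prod_i=0.1, prod_s=0.1,   # production of Ri and Rs from precursors
         kd=0.02,                  # first-order degradation of all three forms
         k_is=0.05, k_si=0.05,     # interconversion Ri <-> Rs
         k_act=1.0, K=0.5, n=2,    # ligand-dependent activation (Hill-type)
         k_inact=0.1)              # return of the active state to the susceptible state

def ligand(t):
    return 1.0 if 50 <= t <= 150 else 0.0          # step stimulus

def ode(t, y):
    Ri, Rs, Ra = y
    L = ligand(t)
    v_act = p['k_act'] * Rs * L**p['n'] / (p['K']**p['n'] + L**p['n'])
    dRi = p['prod_i'] - p['kd']*Ri - p['k_is']*Ri + p['k_si']*Rs
    dRs = p['prod_s'] - p['kd']*Rs + p['k_is']*Ri - p['k_si']*Rs - v_act + p['k_inact']*Ra
    dRa = v_act - p['k_inact']*Ra - p['kd']*Ra
    return [dRi, dRs, dRa]

sol = solve_ivp(ode, (0, 300), [1.0, 1.0, 0.0], max_step=1.0)
print("Ri, Rs, Ra at the end of the simulation:", sol.y[:, -1].round(3))
```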
Signaling pathways often constitute highly complex networks, but it has been observed that they are frequently composed of typical building blocks. These components include Ras proteins, G protein cycles, phosphorelay systems, and MAP kinase cascades. In this chapter, we will discuss their general composition and function as well as modeling approaches.
G proteins are essential parts of many signaling pathways. The reason for their name is that they bind the guanine nucleotides GDP and GTP. They are heterotrimers, that is, consist of three different subunits. G proteins are associated to cell surface receptors with a heptahelical transmembrane structure, the G protein-coupled receptors (GPCR). Signal transduction cascades involving (i) such a transmembrane surface receptor, (ii) an associated G protein, and (iii) an intracellular effector that produces a second messenger play an important role in cellular communication and are well studied [9,10]. In humans, G protein-coupled receptors mediate responses to light, flavors, odors, numerous hormones, neurotransmitters, and other signals [11–13]. In unicellular eukaryotes, receptors of this type mediate signals that affect such basic processes as cell division, cell–cell fusion (mating), morphogenesis, and chemotaxis [11,14–16].
The cycle of G protein activation and inactivation is shown in Figure 12.13. When GDP is bound, the G protein α subunit (Gα) is associated with the G protein βγ heterodimer (Gβγ) and is inactive. Agonist binding to a receptor promotes guanine nucleotide exchange; Gα releases GDP, binds GTP, and dissociates from Gβγ. The dissociated subunits Gα or Gβγ, or both, are then free to activate target proteins (downstream effectors), which initiates signaling. When GTP is hydrolyzed, the subunits are able to reassociate. Gβγ antagonizes receptor action by inhibiting guanine nucleotide exchange. RGS (regulator of G protein signaling) proteins bind to Gα, stimulate GTP hydrolysis, and thereby reverse G protein activation. This general scheme can also be applied to the regulation of small monomeric Ras-like GTPases, such as Rho. In this case, the receptor, Gβγ, and RGS are replaced by GEF and GAP (see below).
Direct targets include different types of effectors, such as adenylyl cyclase, phospholipase C, exchange factors for small GTPases, some calcium and potassium channels, plasma membrane Na+/H+ exchangers, and certain protein kinases [10,17–19]. Typically, these effectors produce second messengers or other biochemical changes that lead to stimulation of a protein kinase or a protein kinase cascade (or, as mentioned, are themselves a protein kinase). Signaling persists until GTP is hydrolyzed to GDP and the Gα and Gβγ subunits reassociate, completing the cycle of activation. The strength of the G protein-initiated signal depends on (i) the rate of nucleotide exchange, (ii) the rate of spontaneous GTP hydrolysis, (iii) the rate of RGS-supported GTP hydrolysis, and (iv) the rate of subunit reassociation. RGS proteins act as GTPase-activating proteins (GAPs) for a variety of different Gα classes. Thereby, they shorten the lifetime of the activated state of a G protein, and contribute to signal cessation. Furthermore, they may contain additional modular domains with signaling functions and contribute to diversity and complexity of the cellular signaling networks [20–23].
Small G proteins are monomeric G proteins with a molecular weight of 20–40 kDa. Like heterotrimeric G proteins, their activity depends on the binding of GTP. More than 100 small G proteins have been identified. They belong to five families: Ras, Rho, Rab, Ran, and Arf. They regulate a wide variety of cell functions, acting as biological timers that initiate and terminate specific processes and determine their duration [24].
The transition from the GDP-bound to the GTP-bound state is catalyzed by a guanine nucleotide exchange factor (GEF), which induces the exchange of the bound GDP for cellular GTP. The reverse process is facilitated by a GTPase-activating protein (GAP), which induces hydrolysis of the bound GTP. The dynamics of the GTP-bound form can be described with the following equation, with an appropriate choice of the rates vGEF and vGAP:
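The equation is not reproduced here; a plausible form, writing RGTP for the concentration of the GTP-bound form and assuming a conserved total Rtot = RGDP + RGTP, is

```latex
\frac{dR_{GTP}}{dt} = v_{GEF} - v_{GAP},
\qquad
v_{GEF} = k_{GEF}\,[\mathrm{GEF}]\,R_{GDP},
\qquad
v_{GAP} = k_{GAP}\,[\mathrm{GAP}]\,R_{GTP}.
```

With these linear rate laws, the steady-state fraction of active protein is RGTP/Rtot = kGEF[GEF]/(kGEF[GEF] + kGAP[GAP]), which makes explicit the dependence on the GEF-to-GAP ratio illustrated in Figure 12.14.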
Figure 12.14 illustrates the wiring of a Ras protein and the dependence of its activity on concentration ratio of the activating GEF and the deactivating GAP.
Mutations of the Ras protooncogenes (H-Ras, N-Ras, and K-Ras) are found in many human tumors. Most of these mutations result in the abolishment of normal GTPase activity of Ras. The Ras mutants can still bind to GAP, but cannot catalyze GTP hydrolysis. Therefore, they stay active for a long time.
Most phosphorylation events in signaling pathways take place under consumption of ATP. Phosphorelay (or phosphotransfer) systems employ another mechanism: after an initial phosphorylation using ATP (or another phosphate donor), the phosphate group is transferred directly from one protein to the next without further consumption of ATP (or external donation of phosphate). Examples are the bacterial phosphoenolpyruvate:carbohydrate phosphotransferase [25–28], the two-component system of E. coli, or the Sln1 pathway involved in osmoresponse of yeast [29].
Figure 12.15 shows a scheme of a phosphorelay system from the high osmolarity glycerol (HOG) signaling pathway in yeast as well as a schematic representation of a phosphorelay system.
This pathway is organized as follows [30]. It involves the transmembrane protein Sln1, which is present as a dimer. Under normal conditions, the pathway is active, since Sln1 continuously autophosphorylates at a histidine residue, Sln1H-P, under consumption of ATP. Subsequently, this phosphate group is transferred to an aspartate group of Sln1 (resulting in Sln1A-P), then to a histidine residue of Ypd1 and finally to an aspartate residue of Ssk1. Ssk1 is continuously dephosphorylated by a phosphatase. Without stress, the proteins are mainly present in their phosphorylated form. The pathway is blocked by an increase in the external osmolarity and a concomitant loss of turgor pressure in the cell. The phosphorylation of Sln1 stops, the pathway runs out of transferable phosphate groups, and the concentration of Ssk1 rises. This constitutes the downstream signal. The temporal behavior of a generalized phosphorelay system (Figure 12.15) can be described with the following set of ODEs:
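The ODE system of Eq. (12.12) is not reproduced here; the following sketch implements a generic three-protein phosphorelay in the spirit of Figure 12.15, tracking the phosphorylated forms AP, BP, and CP and obtaining the unphosphorylated forms A, B, and C from the (assumed) conserved totals. All rate constants, totals, and the timing of the stimulus are illustrative assumptions.

```python
# Sketch: generalized phosphorelay with proteins A, B, C (cf. Figs. 12.15 and 12.16).
# The stimulus blocks the initial autophosphorylation; unphosphorylated forms accumulate.
# All parameter values are illustrative assumptions.
import numpy as np
from scipy.integrate import solve_ivp

Atot, Btot, Ctot = 1.0, 1.0, 1.0
k1, k2, k3, k4 = 2.0, 5.0, 5.0, 0.2    # input, two phosphotransfers, terminal dephosphorylation

def k_input(t):
    return 0.0 if 100 <= t <= 300 else k1     # stimulus switches the input off

def ode(t, y):
    AP, BP, CP = y
    A, B, C = Atot - AP, Btot - BP, Ctot - CP
    v1 = k_input(t) * A     # ATP-dependent autophosphorylation of A
    v2 = k2 * AP * B        # phosphotransfer: A~P + B -> A + B~P
    v3 = k3 * BP * C        # phosphotransfer: B~P + C -> B + C~P
    v4 = k4 * CP            # terminal dephosphorylation of C~P
    return [v1 - v2, v2 - v3, v3 - v4]

t_eval = np.linspace(0, 500, 1001)
sol = solve_ivp(ode, (0, 500), [0.9, 0.9, 0.9], t_eval=t_eval, max_step=0.5)
C = Ctot - sol.y[2]
print("C before stimulus:", round(C[199], 3), "  C at the end of the stimulus:", round(C[600], 3))
```

Blocking the input drains the relay through the terminal dephosphorylation, so A, B, and C rise one after the other toward their total concentrations, in line with the behavior described for Figure 12.16.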
For the ODE system in Eq. (12.12), conservation relations hold: the total concentration of each of the three proteins, that is, the sum of its phosphorylated and unphosphorylated forms, remains constant.
The existence of conservation relations is in agreement with the assumption that production and degradation of the proteins occur on a longer time scale than the phosphorylation events.
The temporal behavior of a phosphorelay system upon external stimulation is shown in Figure 12.16. Before the stimulus, the concentrations of A, B, and C assume low, but nonzero, levels due to continuous flow of phosphate groups through the network. During stimulation, they increase one after the other up to a maximal level that is determined by the total concentration of each protein. After removal of stimulus, all three concentrations return quickly to their initial values.
Figure 12.16b illustrates how the sensitivity of the phosphorelay system depends on the rate constant of the terminal dephosphorylation. For a low value of this rate constant, the concentration C remains low, while for a high value, C is almost always maximal; only in an intermediate range of the dephosphorylation rate constant does the level of C respond to changes in the input.
This system is an example for a case where we can draw initial conclusions about feasible parameter values just from the network structure and the task of the module.
Mitogen-activated protein kinases (MAPKs) are a family of serine/threonine kinases that transduce signals from the cell membrane to the nucleus in response to a wide range of stimuli. Independent or coupled kinase cascades participate in many different intracellular signaling pathways that control a spectrum of cellular processes, including cell growth, differentiation, transformation, and apoptosis. MAPK cascades are widely involved in eukaryotic signal transduction, and these pathways are conserved from yeast to mammals.
A general scheme of a MAPK cascade is depicted in Figure 12.17. This pathway consists of several levels (usually three to four), where the activated kinase at each level phosphorylates the kinase at the next level down the cascade. The MAP kinase (MAPK) is at the terminal level of the cascade. It is activated by the MAPK kinase (MAPKK) by phosphorylation of two sites, conserved threonine and tyrosine residues. The MAPKK is itself phosphorylated at serine and threonine residues by the MAPKK kinase (MAPKKK). Several mechanisms are known to activate MAPKKKs by phosphorylation of a tyrosine residue. In some cases, the upstream kinase may be considered as a MAPKKK kinase (MAPKKKK). Dephosphorylation of either residue is thought to inactivate the kinases, and mutants lacking either residue are almost inactive. At each cascade level, protein phosphatases can inactivate the corresponding kinase, although it is in some cases a matter of debate whether this reaction is performed by an independent protein or by the kinase itself as autodephosphorylation. Also ubiquitin-dependent degradation of phosphorylated proteins is reported.
Although they are conserved across species, the elements of the MAPK cascade have been given different names in the various systems studied. Some examples are listed in Table 12.1 (see also [31]).
Table 12.1 Names of the components of MAP kinase pathways in different organisms and different pathways.
Organism | Budding yeast | Budding yeast | Xenopus oocytes | Human | Human | Human
Pathway | HOG pathway | Pheromone pathway | | Cell-cycle regulation | p38 pathway | JNK pathway
MAPKKK | Ssk2/Ssk22 | Ste11 | Mos | Rafs (c-, A-, and B-) | Tak1 | MEKKs |
MAPKK | Pbs2 | Ste7 | MEK1 | MEK1/2 | MKK3/6 | MKK4/7 |
MAPK | Hog1 | Fus3 | p42 MAPK | ERK1/2 | p38 | JNK1/2 |
In the following, we will present typical modeling approaches and then discuss functional properties of signaling cascades. The dynamics of a MAPK cascade may be represented by the following ODE system:
The variables in the ODE system fulfill a set of moiety conservation relations, irrespective of the concrete choice of expressions for the rates: for each level of the cascade, the sum of the concentrations of the unphosphorylated and phosphorylated forms of the respective kinase remains constant.
The conservation relations reflect that we do not consider production or degradation of the involved proteins in this model. This is justified by the supposition that protein production and degradation take place on a different time scale than signal transduction.
The choice of the expressions for the rates depends on the level of detail of the available experimental knowledge and on modeling taste. We will discuss different possibilities here. Assuming only mass action results in linear and bilinear expressions such as
The kinetic constants are first- (i even) and second-order (i odd) rate constants. In these expressions, the concentrations of the donor and acceptor of the transferred phosphate group, ATP and ADP, are not explicitly considered but are included in the rate constants. Considering ATP and ADP explicitly results in
In addition, we have to take care of the ATP–ADP balance and add three more differential equations
Here, we find two more conservation relations: the conservation of adenine nucleotides and the conservation of phosphate groups
One may take into account that enzymes catalyze all steps [32] and therefore consider Michaelis–Menten kinetics for the individual steps [33]. Taking again the first and second reactions as examples for kinase and phosphatase steps, we get
where the kinase step contains a first-order rate constant and a Michaelis constant, and the phosphatase step contains a maximal enzyme rate and a Michaelis constant. Reported values for Michaelis constants are 15 nM [33], 46 nM and 159 nM [34], and 300 nM [32]. Values for the maximal rates used in models are taken, for example, from Ref. [33].
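As an illustration of such a cascade model, the sketch below simulates a three-level cascade with Michaelis–Menten kinetics for all kinase and phosphatase steps. The kinetic constants and total protein concentrations are hypothetical values in the range quoted above; sequestration of substrates in enzyme–substrate complexes is neglected.

```python
# Sketch: three-level MAPK cascade with Michaelis-Menten kinase/phosphatase steps.
# Parameter values are hypothetical; concentrations in uM, time in s.
from scipy.integrate import solve_ivp

Km, Vp, a = 0.05, 0.05, 5.0              # Michaelis constant, phosphatase Vmax, kinase constant
signal = 0.01                            # concentration of the upstream activator
KKK_tot, KK_tot, K_tot = 0.1, 0.3, 0.3   # total MAPKKK, MAPKK, MAPK

def kin(kinase, sub):                    # Michaelis-Menten kinase rate
    return a * kinase * sub / (Km + sub)

def phos(sub):                           # Michaelis-Menten phosphatase rate
    return Vp * sub / (Km + sub)

def ode(t, y):
    KKKa, KK_P, KK_PP, K_P, K_PP = y
    KKK = KKK_tot - KKKa
    KK  = KK_tot - KK_P - KK_PP
    K   = K_tot - K_P - K_PP
    dKKKa  = kin(signal, KKK) - phos(KKKa)
    dKK_P  = kin(KKKa, KK) - kin(KKKa, KK_P) + phos(KK_PP) - phos(KK_P)
    dKK_PP = kin(KKKa, KK_P) - phos(KK_PP)
    dK_P   = kin(KK_PP, K) - kin(KK_PP, K_P) + phos(K_PP) - phos(K_P)
    dK_PP  = kin(KK_PP, K_P) - phos(K_PP)
    return [dKKKa, dKK_P, dKK_PP, dK_P, dK_PP]

sol = solve_ivp(ode, (0, 1000), [0, 0, 0, 0, 0], max_step=1.0)
print("double-phosphorylated MAPK (uM):", round(sol.y[-1, -1], 3))
```

With these assumed values, an input of 0.01 µM activates a large fraction of the 0.3 µM MAPK pool, illustrating the amplification discussed in the next paragraph; weakening the kinases or strengthening the phosphatases shifts this balance as described there.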
The performance of MAPK cascades, that is, their ability to amplify the signal, to enhance the concentration of the double-phosphorylated MAPK notably, and the speed of activation, depends crucially on the kinetic constants of the kinases and phosphatases (Eq. (12.19)) and, moreover, on their ratio (see Figure 12.18). If the kinase-to-phosphatase ratio is low (phosphatases stronger than kinases), then the amplification is high, but at very low absolute concentrations of phosphorylated MAPK. A high ratio ensures high absolute concentrations of MAPK-P2, but with negligible amplification. High values of both the kinase and the phosphatase constants ensure fast activation of downstream targets.
Frequently, the proteins of MAPK cascades interact with scaffold proteins. In this case, a reversible assembly of oligomeric protein complexes that includes both enzymatic proteins and proteins without known enzymatic activity precedes the signal transduction. These nonenzymatic components can serve as scaffolds or anchors to the plasma membrane and regulate the efficiency, specificity, and localization of the signaling pathway.
The Wnt/β-catenin signaling pathway is an important regulatory pathway for a multitude of processes in higher eukaryotic systems. It was discovered through its role in tumor growth, especially through the use of mouse models of cancer and oncogenic retroviruses. However, it is also important in normal embryonic development: it is relevant for cell fate specification, cell proliferation, cell migration, and the formation of the body axis. Recently, its role in stem cell differentiation and reprogramming has been discussed.
In fact, Wnt signaling comprises not just one pathway but at least three, that is, the canonical Wnt pathway, the noncanonical planar cell polarity pathway, and the noncanonical Wnt/calcium pathway. All three pathways employ binding of the Wnt ligand to Frizzled receptors, but have different downstream architectures and effects (Figure 12.19). Wnt denotes a family of 19 genes in vertebrates (other numbers hold for other branches of the animal kingdom).
The major function of the canonical pathway is to stimulate the accumulation of β-catenin, which is normally kept at a low concentration through balanced continuous production and degradation by the destruction complex comprising adenomatous polyposis coli (APC), axin (axin1 or axin2), glycogen synthase kinase 3 (GSK3), and casein kinase 1α (CK1). This complex targets β-catenin for ubiquitination and proteasomal digestion. The G protein-coupled receptor Frizzled often functions together with coreceptors, such as the lipoprotein receptor-related protein LRP or receptor tyrosine kinases. When the ligand Wnt binds to the receptor and its coreceptors, they recruit cytoplasmic proteins to the membrane (level I response, see Section 12.2.1): the phosphoprotein Dsh is recruited to the receptor, and axin binds to LRP. The destruction complex is reorganized and the GSK3 activity is inhibited, so that β-catenin can accumulate (level II response). Among other consequences, it enters the nucleus and activates transcription factors, leading to gene expression changes (level IV response).
An early experiment-based model of the dynamics of the destruction complex and its regulation upon Wnt signaling investigated the quantitative roles of APC and axin (present in low amounts) [35]. They investigated which reaction steps would have high control (see concentration control coefficients introduced in Chapter 4) over the β-catenin concentration and found that strong negative control is exerted by the assembly of the destruction complex and the binding of β-catenin to the destruction complex. Positive control is exerted by reactions leading to the disassembly of the destruction complex, the dissociation of β-catenin from the destruction complex, the axin degradation, and the β-catenin synthesis.
Another model has analyzed the robustness of β-catenin levels in response to Wnt signaling [36]. They found that while the absolute levels of β-catenin are very sensitive to many types of perturbations, the fold changes in response to Wnt are not. Here fold change means the level of β-catenin after Wnt-stimulation divided by the level before Wnt-stimulation. An incoherent feed-forward loop can cause such a behavior [37].
The following minimal model is suited to demonstrate major features of the canonical Wnt signaling pathway in wild-type and some mutant conditions [38]. The wiring of the model is shown in Figure 12.20. The ODE system reads
Conservation relations:
Rate equations:
The parameter values are given in Ref. [38].
The dynamics and steady-state behavior of the model are shown in Figure 12.21.
The existence of positive or negative feedback cycles in complex signaling networks that are described with ODE systems can be derived from an analysis of the Jacobian matrix M (see Chapter 4, Chapter 15, Eq. (15.39)) [39,40]. With dxi/dt = fi(x), i = 1, …, n, being the differential equations describing the dynamics of the components xi and mij = ∂fi/∂xj being the terms of the Jacobian M, we can classify single nonzero terms of the Jacobian or sequences thereof as follows:
Obviously, a feedback loop should not contain more than n elements in a sequence (otherwise, it would be repetitive). Zero elements of the Jacobian in a sequence interrupt a feedback loop. An odd number of negative elements in the sequence causes negative feedback, while an even (or zero) number of negative terms leads to positive feedback.
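The following small sketch applies this criterion to a hypothetical two-component system with mutual interaction; the equations are purely illustrative.

```python
# Sketch: identifying the sign of a feedback loop from the Jacobian.
# The two-variable system is purely illustrative.
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
f1 = 1/(1 + x2**2) - x1     # production of x1 is inhibited by x2; linear degradation
f2 = x1 - x2                # x1 activates production of x2; linear degradation

J = sp.Matrix([f1, f2]).jacobian([x1, x2])
loop = J[0, 1] * J[1, 0]    # product of the terms along the cycle x1 -> x2 -> x1
print(J)
print(sp.simplify(loop))    # negative for all x2 > 0: one negative element => negative feedback
```

The loop product contains exactly one negative term (the inhibition of x1 by x2), so the cycle is a negative feedback loop; a zero entry anywhere along the cycle would interrupt it.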
Signaling pathways, like other regulatory pathways, can exhibit interesting dynamic and regulatory features. Dynamic motifs are discussed in detail in Section 8.2.
Among the various regulatory features of signaling pathways, negative feedback has attracted outstanding interest. It also plays an important role in metabolic pathways, for example, in amino acid synthesis pathways (Section 12.1.3), where a negative feedback signal from the amino acid at the end of the pathway to reactions at its beginning prevents overproduction of this amino acid. The implementation of the feedback and the respective dynamic behavior show a wide variation. Feedback can bring about limit cycle-type oscillations in cell-cycle models. In signaling pathways, negative feedback may cause an adjustment of the response or damped oscillations [41].
The dynamic behavior of signaling pathways can be quantitatively characterized by a number of measures [42]. Let $X_i(t)$ be the time-dependent concentration of the active kinase i (or another compound of interest). The signaling time describes the average time to activate the kinase i. The signal duration gives the average time during which the kinase i remains activated. The signal amplitude is a measure for the average concentration of activated kinase i. The following definitions have been introduced. The quantity
$$I_i = \int_0^\infty X_i(t)\,dt$$
is the total amount of active kinase i generated during the signaling period, that is, the integrated response of $X_i$ (the area covered by a plot of $X_i$ versus time). Further measures are the integrals $\int_0^\infty t\,X_i(t)\,dt$ and $\int_0^\infty t^2 X_i(t)\,dt$.
The signaling time can now be defined as
$$\tau_i = \frac{1}{I_i}\int_0^\infty t\,X_i(t)\,dt,$$
that is, as an average time, analogous to the mean value of a statistical distribution. Note that other definitions for characteristic times have been introduced in Section 6.3. The signal duration
$$\vartheta_i = \sqrt{\frac{1}{I_i}\int_0^\infty t^2 X_i(t)\,dt - \tau_i^2}$$
gives a measure of how extended the signaling response is around the mean time (comparable to a standard deviation). The signal amplitude is defined as
$$A_i = \frac{I_i}{2\,\vartheta_i}.$$
In a geometric representation, this is the height of a rectangle whose length is $2\vartheta_i$ and whose area equals the area under the curve $X_i(t)$. Note that this measure might be different from the maximal value that $X_i$ assumes during the time course.
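These three measures are easy to evaluate numerically from a sampled time course, as the following sketch shows for a synthetic activation profile (the symbols match the definitions above).

```python
# Sketch: signaling time, signal duration, and amplitude from a sampled time course X_i(t).
# The time course below is synthetic.
import numpy as np

t = np.linspace(0, 20, 2001)
X = t * np.exp(-t)                                     # synthetic activation profile X_i(t)

I     = np.trapz(X, t)                                 # integrated response I_i
tau   = np.trapz(t * X, t) / I                         # signaling time tau_i
theta = np.sqrt(np.trapz(t**2 * X, t) / I - tau**2)    # signal duration theta_i
A     = I / (2 * theta)                                # signal amplitude A_i

print(f"I = {I:.3f}, tau = {tau:.3f}, theta = {theta:.3f}, A = {A:.3f}")
```

For this profile, the numerical values approach the analytical results tau = 2 and theta = sqrt(2).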
Figure 12.22 shows a signaling pathway with successive activation of compounds and the respective time courses. The characteristic quantities are given in Table 12.2 and are illustrated in the figure.
Table 12.2 Dynamic characteristics of the signaling cascade shown in Figure 12.22.
Compound | Integral, Ii | Maximum, Xi,max | Time of maximum, tmax,i | Characteristic time, τi | Signal duration, ϑi | Signal amplitude, Ai
X1 | 0.797 | 0.288 | 0.904 | 2.008 | 1.458 | 0.273 |
X2 | 0.695 | 0.180 | 1.871 | 3.015 | 1.811 | 0.192 |
X3 | 0.629 | 0.133 | 2.855 | 4.020 | 2.109 | 0.149 |
Signal transmission in the cellular context is often not unique in the sense that an activated protein would have high specificity for exactly one target protein. Instead, there might be crosstalk, that is, proteins of one signaling pathway interact with proteins assigned to another pathway. Strictly speaking, the assignment of proteins to one pathway is often arbitrary and may result, for example, from the history of the discovery of their function. Frequently, protein interactions form a network with various binding, activation, and inhibition events, as illustrated in Figure 12.10.
In order to arrive at quantitative measures for crosstalk, let us consider the simplified scheme in Figure 12.23: the external signal α binds to receptor RA, which activates target TA via a series of interactions. In the same way, the external signal β binds to receptor RB, which activates target TB. In addition, there are events that mediate an effect of receptor RB on target TA.
Let us concentrate on pathway A and define all measures from its perspective. Signaling from α via RA to TA shall be called intrinsic, while signaling from β to TA is extrinsic. Further, in order to quantify crosstalk, we need a quantitative measure for the effect of an external stimulus on the target. If we are interested in the level of activation, such a measure might be the integral over the time course of TA (Eq. (12.27)), its maximal value, or its amplitude (Eq. (12.31)). If we are interested in the response timing, we can consider the time of the maximal value or the characteristic time (for an overview of measures, see Table 12.2). Whatever measure we choose will be denoted by X in the following.
The crosstalk C is the activation of pathway A by the extrinsic stimulus β relative to the activation by the intrinsic stimulus α, that is, C = X(β)/X(α), where X(β) and X(α) denote the chosen measure evaluated for the response of TA to β and to α, respectively.
The fidelity F [43] is viewed as the output due to the intrinsic signal divided by the output in response to the extrinsic signal and reads in our notation F = X(α)/X(β), that is, the reciprocal of the crosstalk measure C when the same measure X is used.
In addition, the intrinsic sensitivity Si expresses how an extrinsic signal modifies the intrinsic signal when both act in parallel, while the extrinsic sensitivity Se quantifies the corresponding effect of the intrinsic signal on the extrinsic signal [44].
Table 12.3 shows how different sensitivity values can be interpreted.
Table 12.3 Effect of crosstalk on signaling.
Se > 1 | Se < 1 | |
Si > 1 | Mutual signal inhibition | Dominance of intrinsic signal |
Si < 1 | Dominance of extrinsic signal | Mutual signal amplification |
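As a numerical illustration, the sketch below computes the crosstalk C and the fidelity F for pathway A, using the integrated response as the measure X (cf. Eq. (12.27)); the two time courses of the target TA are synthetic examples, not data for any specific pathway.

```python
# Sketch: crosstalk C and fidelity F for pathway A from time courses of its target TA.
# The responses to the intrinsic (alpha) and extrinsic (beta) stimuli are synthetic.
import numpy as np

t = np.linspace(0, 20, 2001)
TA_alpha = 1.0 * t * np.exp(-t)    # response of TA to the intrinsic stimulus alpha
TA_beta  = 0.2 * t * np.exp(-t)    # response of TA to the extrinsic stimulus beta

X_alpha = np.trapz(TA_alpha, t)    # measure X: integrated response (Eq. (12.27))
X_beta  = np.trapz(TA_beta, t)

C = X_beta / X_alpha               # crosstalk: extrinsic relative to intrinsic activation
F = X_alpha / X_beta               # fidelity: intrinsic relative to extrinsic activation
print(f"C = {C:.2f}, F = {F:.2f}")
```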
Growth and reproduction are major characteristics of life. Crucial for these is cell division, by which one cell divides into two and all constituents of the mother cell are distributed between the two daughter cells. This requires that the genome be duplicated in advance, which is accomplished by the DNA polymerase, an enzyme that utilizes deoxynucleotide triphosphates (dNTPs) for the synthesis of two identical DNA double strands from one parent double strand; each single strand acts as a template for one of the new double strands. Several types of DNA polymerases have been found in prokaryotic and eukaryotic cells, but all of them synthesize DNA only in the 5′ → 3′ direction. In addition to DNA polymerase, several further proteins are involved in DNA replication: proteins responsible for the unwinding and opening of the mother strand (the template double strand), proteins that bind the opened single-stranded DNA and prevent it from rewinding during synthesis, an enzyme called primase that is responsible for the synthesis of the short RNA primers that the DNA polymerase requires for the initiation of DNA polymerization, and a DNA ligase that links the DNA fragments that are synthesized discontinuously on one of the two template strands because of the limitation to 5′ → 3′ synthesis. Like the DNA, other cellular organelles also have to be doubled, such as the centrosome, which is involved in the organization of the mitotic spindle.
The cell cycle is divided into two major phases: the interphase and the M phase (Figure 12.25). The interphase is often a relatively long period between two subsequent cell divisions. Cell division itself takes place during M-phase and consists of two steps: first, the nuclear division in which the duplicated genome is separated into two parts, and second, the cytoplasmic division or cytokinesis, where the cell divides into two cells. The latter not only allocates the two separated genomes to each of the newly developing cells, but also distributes the cytoplasmic organelles and substances between them. Finally, the centrosome is replicated and divided between both cells as well.
DNA replication takes place during interphase in the so-called S phase (S = synthesis) of the cell cycle (Figure 12.25). This phase is usually preceded by a gap phase, G1, and followed by another gap phase, G2. From G1 phase, cells can also leave the cell cycle and enter a rest phase, G0. The interphase normally represents 90% of the cell cycle length. During interphase, the chromosomes are dispersed as chromatin in the nucleus. Cell division occurs during M phase, which follows the G2 phase, and consists of mitosis and cytokinesis. Mitosis is divided into different stages. During the first stage – the prophase – chromosomes condense into their compact form and the two centrosomes of a cell begin recruiting microtubules for the formation of the mitotic spindle. In later stages of mitosis, this spindle is used for the equal segregation of the chromatids of each chromosome to opposite cellular poles. During the following prometaphase, the nuclear envelope dissolves and the microtubules of the mitotic spindle attach to protein structures, called kinetochores, at the centromeres of each chromosome. In the following metaphase, all chromosomes line up in the middle of the spindle and form the metaphase plate. During anaphase, the proteins holding together both sister chromatids are degraded and each chromatid of a chromosome segregates into opposite directions. Finally, during telophase, new nuclear envelopes are recreated around the separated genetic materials and form two new nuclei. The chromosomes unfold again into chromatin. The mitotic reaction is often followed by a cytokinesis where the cellular membrane pinches off between the two newly separated nuclei and two new cells are formed.
The cell cycle is strictly controlled by specific proteins. Passing a certain checkpoint in the G1 phase, the restriction point, triggers a series of specific steps that end in cell division. At this point, the cell checks whether it has achieved a sufficient size and whether the external conditions are suitable for reproduction. The control system ensures that a new phase of the cycle is only entered if the preceding phase has been finished successfully. For instance, to enter a new M phase, it has to be ensured that DNA replication during S phase has been correctly completed.
Passage through the eukaryotic cell cycle is strictly regulated by the periodic synthesis and destruction of cyclins that bind and activate cyclin-dependent kinases (CDKs). The term “kinase” expresses that their function is the phosphorylation of proteins with regulatory roles. The contrary function is carried out by a “phosphatase”: it dephosphorylates a previously phosphorylated protein and thereby toggles its activity. Cyclin-dependent kinase inhibitors (CKIs) also play important roles in cell-cycle control by coordinating internal and external signals and impeding proliferation at several key checkpoints.
The general scheme of the cell cycle is conserved from yeast to mammals. The levels of cyclins rise and fall during the stages of the cell cycle. The levels of CDKs appear to remain constant during the cell cycle, but the individual molecules are either unbound or bound to cyclins. In budding yeast, one CDK (Cdc28) and nine different cyclins (Cln1–Cln3, Clb1–Clb6) are found, which seem to be at least partially redundant. In contrast, mammals employ a variety of different cyclins and CDKs. The cyclins include a G1 cyclin (cyclin D), S phase cyclins (A and E), and mitotic cyclins (A and B). Mammals have nine different CDKs (referred to as CDK1–9) that are important in different phases of the cell cycle. The anaphase-promoting complex (APC) triggers the events leading to the destruction of the cohesins, thus allowing the sister chromatids to separate, and degrades the mitotic cyclins. A comprehensive map of cell cycle-related protein–protein interactions can be found in large-scale reconstructions, for example, for budding yeast [45].
Let us take a course through the mammalian cell cycle starting in G1 phase. As the level of G1 cyclins rises, they bind to their CDKs and signal the cell to prepare the chromosomes for replication. When the level of the S phase-promoting factor (SPF), which includes cyclin A bound to CDK2, rises, it enters the nucleus and prepares the cell to duplicate its DNA (and its centrosomes). As DNA replication continues, cyclin E is destroyed, and the level of mitotic cyclins begins to increase (in G2). The M phase-promoting factor (the complex of mitotic cyclins with the M phase CDK) initiates (1) the assembly of the mitotic spindle, (2) the breakdown of the nuclear envelope, and (3) the condensation of the chromosomes. These events take the cell to the metaphase of mitosis. At this point, the M phase-promoting factor activates the APC, which allows the sister chromatids at the metaphase plate to separate and move to the poles (anaphase), thereby completing mitosis. The APC destroys the mitotic cyclins by coupling them to ubiquitin, which targets them for destruction by proteasomes. The APC also turns on the synthesis of G1 cyclins for the next turn of the cycle, and it degrades geminin, a protein that has kept the freshly synthesized DNA in S phase from being rereplicated before mitosis.
A number of checkpoints ensure that all processes connected with cell cycle progression, DNA doubling and separation, and cell division occur correctly. At these checkpoints, the cell cycle can be aborted or arrested. They involve checks on completion of S phase, on DNA damage, and on failure of spindle behavior. If the damage is irreparable, apoptosis is triggered. An important checkpoint in G1 has been identified in both yeast and mammalian cells. Referred to as “Start” in yeast and as “restriction point” in mammalian cells, this is the point at which the cell becomes committed to DNA replication and completing a cell cycle [46–49]. All the checkpoints require the services of complexes of proteins. Mutations in the genes encoding some of these proteins have been associated with cancer. These genes are regarded as oncogenes. Failures in checkpoints permit the cell to continue dividing despite damage to its integrity. Understanding how the proteins interact to regulate the cell cycle has become increasingly important to researchers and clinicians when it was discovered that many of the genes that encode cell cycle regulatory activities are targets for alterations that underlie the development of cancer. Several therapeutic agents, such as DNA-damaging drugs, microtubule inhibitors, antimetabolites, and topoisomerase inhibitors, take advantage of this disruption in normal cell cycle regulation to target checkpoint controls and ultimately induce growth arrest or apoptosis of neoplastic cells.
For the presentation of modeling approaches, we will focus on the yeast cell cycle since intensive experimental and computational studies have been carried out using different types of yeast as model organisms. Mathematical models of the cell cycle can be used to tackle, for example, the following relevant problems:
One of the first genes to be identified as being an important regulator of the cell cycle in yeast was cdc2/cdc28 [50], where cdc2 refers to fission yeast and cdc28 to budding yeast. Activation of the cdc2/cdc28 kinase requires association with a regulatory subunit referred to as a cyclin.
A minimal model for the mitotic oscillator involving a cyclin and the Cdc2 kinase has been presented by Goldbeter [51]. It covers the cascade of posttranslational modifications that modulate the activity of Cdc2 kinase during cell cycle. In the first cycle of the bicyclic cascade model, the cyclin promotes the activation of the Cdc2 kinase by reversible dephosphorylation. In the second cycle, the Cdc2 kinase activates a cyclin protease by reversible phosphorylation. The model was used to test the hypothesis that cell-cycle oscillations may arise from a negative feedback loop with delay, that is, that cyclin activates the Cdc2 kinase, while the Cdc2 kinase eventually triggers the degradation of the cyclin.
The minimal cascade model is represented in Figure 12.26. It involves only two main actors, cyclin and the cyclin-dependent kinase. Cyclin is synthesized at a constant rate, vi, and triggers the transformation of inactive (M+) into active (M) Cdc2 kinase by enhancing the rate, V1, of the activating phosphatase. A kinase with rate V2 reverts this modification. In the lower cycle, the Cdc2 kinase phosphorylates a protease (rate V3), shifting it from its inactive (X+) to its active (X) form. The activation of the cyclin protease is reverted by a further phosphatase with rate V4. The dynamics is governed by the following ODE system:
where C denotes the cyclin concentration; M and X represent the fractional concentrations of active Cdc2 kinase and active cyclin protease, respectively, while (1 − M) and (1 − X) are the fractions of inactive kinase and inactive protease, respectively. The Ki values are Michaelis constants (Chapter 4), and V1 and V3 are effective maximal rates that depend on the cyclin concentration and on the active kinase, respectively (Chapter 4). Note that the use of Michaelis–Menten kinetics in the differential equations for the changes of M and X leads to the so-called Goldbeter–Koshland switch or zero-order ultrasensitivity [52], that is, a sharp switch in the activity of the downstream component when the activating rate crosses a threshold value.
This model involves only Michaelis–Menten-type kinetics, but no form of positive cooperativity. It can be used to test whether oscillations can arise solely as a result of the negative feedback provided by the Cdc2-induced cyclin degradation and of the thresholds and time delay involved in the cascade. The time delay is implemented by considering posttranslational modifications (the phosphorylation/dephosphorylation cycles of M and X). For certain parameters, they lead to a threshold in the dependence of the steady-state value of M on C and of X on M (Figure 12.26b). Provided that this threshold exists, the evolution of the bicyclic cascade proceeds in a periodic manner (Figure 12.26c). Starting from a low initial cyclin concentration, cyclin accumulates at a constant rate, while M and X stay low. As soon as C crosses the activation threshold, M rises. When M crosses its threshold, X starts to increase sharply. X in turn accelerates cyclin degradation, and consequently C, M, and X drop rapidly. The resulting oscillations are of the limit cycle type; the respective limit cycle is shown in a phase-plane representation in Figure 12.26d.
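A compact implementation of this minimal model is sketched below. The equations follow the verbal description above (cyclin synthesis and protease-dependent degradation, plus two Goldbeter–Koshland modification cycles); the parameter values are illustrative choices in the range used by Goldbeter [51] and may differ from those behind Figure 12.26.

```python
# Sketch: Goldbeter's minimal cascade model of the mitotic oscillator (cyclin C,
# active Cdc2 fraction M, active protease fraction X). Parameter values are illustrative.
from scipy.integrate import solve_ivp

p = dict(vi=0.025, kd=0.01, vd=0.25, Kd=0.02,   # cyclin synthesis and degradation
         VM1=3.0, V2=1.5, VM3=1.0, V4=0.5,      # maximal rates of the modification cycles
         Kc=0.5, K1=0.005, K2=0.005, K3=0.005, K4=0.005)

def ode(t, y):
    C, M, X = y
    V1 = p['VM1'] * C / (p['Kc'] + C)            # cyclin-dependent activation of Cdc2
    V3 = p['VM3'] * M                            # Cdc2-dependent activation of the protease
    dC = p['vi'] - p['vd'] * X * C / (p['Kd'] + C) - p['kd'] * C
    dM = V1 * (1 - M) / (p['K1'] + 1 - M) - p['V2'] * M / (p['K2'] + M)
    dX = V3 * (1 - X) / (p['K3'] + 1 - X) - p['V4'] * X / (p['K4'] + X)
    return [dC, dM, dX]

sol = solve_ivp(ode, (0, 300), [0.01, 0.01, 0.01], max_step=0.1)
C = sol.y[0]
print("cyclin oscillates between", round(C[-1500:].min(), 3), "and", round(C[-1500:].max(), 3))
```

With these assumed values, the model settles onto sustained limit-cycle oscillations in which C, M, and X peak one after the other, as described in the paragraph above.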
Tyson, Novak, Chen and colleagues have developed a series of models describing the cell cycle of budding yeast in great detail [53–56]. These comprehensive models employ a set of assumptions that are summarized in the following.
The cell cycle is an alternating sequence of the transition from G1 phase to S/M phase, called “Start” (in mammalian cells it is called “restriction point”), and the transition from S/M to G1, called “Finish.” An overview is given in Figure 12.27.
The CDK (Cdc28) forms complexes with the cyclins Cln1 to Cln3 and Clb1 to Clb6, and these complexes control the major cell-cycle events in budding yeast cells. The complexes Cln1–2/Cdc28 control budding, the complex Cln3/Cdc28 governs the execution of the checkpoint “Start,” Clb5–6/Cdc28 ensures timely DNA replication, Clb3–4/Cdc28 assists DNA replication and spindle formation, and Clb1–2/Cdc28 is necessary for completion of mitosis.
The cyclin–CDK complexes are in turn regulated by the synthesis and degradation of cyclins and by the Clb-dependent kinase inhibitor (CKI) Sic1. The expression of the gene for Cln2 is controlled by the transcription factor SBF, while the expression of the gene for Clb5 is controlled by the transcription factor MBF. Both transcription factors are regulated by CDKs. All cyclins are degraded by proteasomes following ubiquitination; the APC is one of the complexes triggering the ubiquitination of cyclins.
For the implementation of these processes in a mathematical model, the following points are important. Activation of cyclins and cyclin-dependent kinases occurs in principle by the negative feedback loop presented in Goldbeter's minimal model (see Section 12.3.2). Furthermore, these models are based on the assumption that the cells grow exponentially, that is, the dynamics of the cell mass M is governed by dM/dt = µM. At the instant of cell division, M is replaced by M/2; in some cases, uneven division is considered. Cell growth implies an adaptation of the negative feedback model to growing cells.
The transitions “Start” and “Finish” characterize the wild-type cell cycle. At “Start,” the transcription factor SBF is turned on and the levels of the cyclins Cln2 and Clb5 increase. They form complexes with Cdc28. The boost in Cln2/Cdc28 has three main consequences: it initiates bud formation, it phosphorylates the CKI Sic1 promoting its disappearance, and it inactivates Hct1, which in conjunction with APC was responsible for Clb2 degradation in G1 phase. Hence, DNA synthesis takes place and the bud emerges. Subsequently, the level of Clb2 increases and the spindle starts to form. Clb2/Cdc28 inactivates SBF and Cln2 decreases. Inactivation of MBF causes Clb5 to decrease. Clb2/Cdc28 induces progression through mitosis. Cdc20 and Hct1, which target proteins to APC for ubiquitination, regulate the metaphase–anaphase transition. Cdc20 has several tasks in the anaphase. It activates Hct1, promoting degradation of Clb2, and it activates the transcription factor of Sic1. Thus, at “Finish,” Clb2 is destroyed and Sic1 reappears.
The dynamics of some key players in cell cycle according to the model given in Chen et al. [54] is shown in Figure 12.28 for two successive cycles. At “Start,” Cln2 and Clb5 levels rise and Sic1 is degraded, while at “Finish,” Clb2 vanishes and Sic1 is newly produced.
Recent models consider that cell growth is determined by nutrient uptake, nutrient conversion, and the resulting metabolic capacity to synthesize proteins, and is hence dependent on the cellular surface-to-volume ratio [57]. This results in individual growth dynamics between linear and exponential volume increase over time. In a population context, this can explain why small young daughter cells need longer growth periods before division, while big mother cells divide again quickly, and yet the population retains a stable average cell size. This flexibility in cell-cycle length makes dispensable some specific assumptions about how all cell-cycle steps are accomplished in the given period, allowing the formulation of self-oscillating models for the dynamics of the important components without artificial events at the end of one period (see Figure 12.29, [58]).
The cell cycle is strongly regulated by signaling pathways conveying information about external stresses (such as osmotic stress, transmitted by the HOG pathway discussed in Section 12.2), about the nutritional status, or about the presence of mating partners (sensed via the pheromone pathway). Osmotic stress leads to a pause of the cell cycle, providing time for adaptation, such as the production of osmotically active compounds, while the pheromone pathway triggers a complete cell-cycle arrest in early G1 phase, allowing the cell to prepare for mating. Detailed models and experimentation have helped to clarify the differential roles of transcriptional and posttranslational regulation by Hog1 in cell-cycle regulation upon osmotic stress [59].
Aging is a complex biological phenomenon that affects practically all multicellular eukaryotes. It is manifested by an ever-increasing mortality risk, which finally leads to the death of the organism. Modern hygiene and medicine have led to an amazing increase in the average life expectancy over the last 150 years, but the underlying biochemical mechanisms of the aging process are still poorly understood. However, a better comprehension of these mechanisms is increasingly important, since the growing fraction of elderly people in the human population confronts our society with completely new and challenging problems. The aim of this chapter is to provide an overview of the aging process, to discuss how it relates to systems biology concepts, and to present some specific examples that make use of interesting modeling techniques such as delay differential equations and stochastic simulations.
Looking at the enormous rise of average human life span over the last 150 years, one could get the impression that modern research actually has identified the relevant biochemical pathways involved in aging and has successfully reduced the pace of aging. Oeppen and Vaupel [61] collected data on world-wide life expectancy from studies going back to 1840. Figure 12.30 shows the life expectancy for males (squares) and females (circles) for the country that had the highest life expectancy at a given year. Two points are remarkable. First, there is an amazingly linear trend in life expectancy that corresponds to an increase of 3 months per year (!), and second, this trend is unbroken right to the end of the analyzed data (year 2000).
These impressive data strongly suggest that life span will continue to rise in the coming years, but they are not suitable for deciding whether the actual aging rate has fallen during the last century. Aging can best be described as a gradual functional decline, leading to a continuously rising risk of dying within the next time interval (mortality). Already in the nineteenth century, Gompertz and Makeham studied the rise of human mortality with age and found that it can be described extremely well by an exponential function [62,63]. The Gompertz–Makeham equation, μ(t) = μ0·exp(βt) + γ, models the increase of mortality with age depending on three parameters called intrinsic vulnerability (μ0), actuarial aging rate (β), and environmental risk (γ). From this equation, an expression for the survivorship function can be derived (see Section 12.4.1 and [64]), which shows that the number of remaining survivors depends on all three parameters; consequently, a change of the median life expectancy (the time until 50% of the population has died) can be caused by a modification of any of those parameters. Analyzing the survivorship data of the last 100 years more closely, it becomes clear that the remarkable increase in life expectancy was achieved exclusively by changes of intrinsic vulnerability and environmental risk. The aging rate, β, however, remained constant all the time. This is remarkable and important because it is actually β that most strongly controls life expectancy. Because of the drastic social, economic, and political consequences that are brought about by the demographic changes of the age structure of the population, it is now more important than ever to understand what constitutes the biochemical basis for a nonzero aging rate, β. Systems biology might help to achieve this goal.
Evolutionary theories of the aging process explain why aging has evolved, but unfortunately they do not predict which specific mechanisms are involved in aging. As a consequence, more than 300 mechanistic ideas exist [65], each centered around different biochemical processes. This is partly due to the fact that even the simplest multicellular organisms are such complex systems that many components have the potential to cause deterioration of the whole system in case of a malfunction. Figure 12.31 shows a small collection of the most popular mechanistic theories. The spatial arrangement of the diagram is intended to reflect the various connections between the different theories. It is exactly this large number of interactions that makes aging so difficult to investigate experimentally and renders it an ideal subject for systems biology. To understand this, we will look at a few examples.
The Telomere Shortening Theory is an important idea that has gained considerable support in the last 15–20 years. Telomeres are the physical ends of linear eukaryotic chromosomes and are vital for the functioning of the cell [66,67]. It has been recognized for a long time that linear DNA has a replication problem, since DNA polymerases can only replicate in the 5′–3′ direction and cannot start DNA synthesis de novo [68,69]. This inability leads to a gradual loss of DNA, which was confirmed experimentally for human fibroblasts in 1990 and consequently proposed as being responsible for aging [70,71]. Telomere shortening provides the explanation of, and connection to, the Hayflick Limit, the long-known phenomenon that most cultured cell types have only a limited division potential [72]. This in turn can be interpreted as an evolutionarily selected trait that acts as a cancer prevention system by preventing unlimited cell division. While the direct cause of telomere shortening is the lack of the enzyme telomerase, there is an interaction with oxygen radicals that complicates the mechanism. Oxidative stress was shown to increase the rate of telomere shortening and thus to modulate telomere attrition [73]. As Figure 12.31 indicates, free oxygen radicals are also the central hub connecting several other prominent phenomena affecting the aging process. They damage all kinds of macromolecules, leading to cross-linking of proteins and the generation of indestructible waste products (i.e., lipofuscin) that accumulate in slowly dividing cells [74]. Degraded mitochondria are supposed to be a major fraction of these waste products, and certain oxygen radicals are known to damage the mitochondrial membrane. It seems, however, that radical-induced somatic mutations of the mitochondrial DNA (mtDNA) are the main route to defective mitochondria that produce less energy but generate more radicals [75–77]. These mutants are capable of taking over the mitochondrial population of the cell, causing a chronic energy deficiency and maybe aging. It is still unclear what selection pressure and mechanism lead to the accumulation of defective mitochondria, but several suggestions have been made [78–80].
Supporting evidence has been found for all the above-mentioned ideas, and the many interdependencies make aging a phenomenon that is very difficult to study experimentally. If a single mechanism is studied in isolation, the results are hard to interpret because the contributions of the other mechanisms are missing. If, on the other hand, a complex system is studied with all involved pathways, the experiments are expensive and technically demanding, and the results are difficult to interpret because of the large number of factors that might have influenced them.
This is exactly the situation where a systems biological approach is useful. Systems biology aims at investigating the components of complex biochemical networks and their interactions, applying experimental high-throughput and whole-genome methods, and integrating computational and mathematical methods with experimental efforts. The growing number of high-throughput techniques developed in recent years is a major driving force behind the wish to use computational methods to manage and interpret the large data output. Modelers, on the other hand, are keen to use the generated data to develop quantitative models of systems with complex interactions. Because of the large number of parameters, such models would not be meaningful without sufficient experimental measurements.
In addition, quantitative modeling of complex systems has several benefits. First of all, it requires that each aspect of a verbal hypothesis is made specific. Before a computational model can be developed, the researcher has to define each component and how it interacts with all the other components. This is a very useful exercise to identify gaps in current knowledge and in the verbal model. It helps to complete the conceptual model or motivates experiments to collect the missing experimental information. To understand complex systems with components that produce opposing effects, it is essential to have a model with quantitative predictions. Purely qualitative models (such as verbal arguments) are not sufficient to decide how a system develops over time if it contains nonlinear opposing subcomponents. Computational models are also a convenient way to explore, easily and cheaply, “what if” scenarios that would be difficult or impossible to test experimentally. What if a certain reaction did not exist? What if a certain interaction were ten times stronger? What if we are interested in time spans too short or too long to observe in an experiment? First, we will study a model that deals with the evolution of the aging process (Section 12.4.1), followed by models that make use of stochastic modeling (Section 12.4.2) or delay differential equations (Section 12.4.3) to understand the accumulation of damaged mitochondria.
It is not obvious why organisms should grow old and die. What is the selective advantage of this trait? And if aging is advantageous, why have different species such widely differing life spans?
The first attempt to explain the evolution of aging was made by Weismann [81]. He proposed that aging is beneficial by removing crippled and worn-out individuals from the population and thus making space and resources available for the next generation. This type of reasoning is very similar to other suggestions like the prevention of overcrowding or the acceleration of evolution by decreasing the generation time.
These ideas suggest that aging itself confers a selective advantage and that the evolution of genes that bring life to an end is an adaptive response to selective forces. All these theories have in common that they rely on group selection, the selection of a trait that is beneficial for the group, but detrimental to the individual. However, group selection only works under very special circumstances like small patch size and low migration rates [82].
The weaknesses of adaptive theories have been recognized for some time and newer theories are no longer based on group selection, but on the declining force of natural selection with time. This important concept refers to the fact that, even in the absence of aging, individuals in a population are always at risk of death due to accidents, predators, and diseases. For a given cohort, this leads to an exponential decline over time in the fraction of individuals that are still alive. Events (e.g., biochemical processes) that occur only in chronologically old individuals will therefore only affect a small proportion of the whole population. The later the onset of the events, the smaller the involved fraction of the population.
Medawar [83] was the first to present a theory for the evolution of the aging process based on this idea. His “Mutation Accumulation” theory states that aging might be caused by an accumulation of deleterious genes that are only expressed late in life. Because of the declining force of natural selection, only a small part of the population would be affected by this type of mutation and the resulting selection pressure to remove them would only be very weak. Mutations with a small selection pressure to be removed can persist in a mutation–selection balance [84] and thus explain the emergence of an aging phenotype.
Another theory of this kind is the “Antagonistic Pleiotropy” theory [85]. Genes that affect two or more traits are called pleiotropic genes, and effects that increase fitness through one trait at the expense of a reduced fitness of another trait are antagonistic. Now, consider a gene that improves the reproductive success of younger organisms at the expense of the survival of older individuals. Because of the declining force of natural selection, such a gene will be favored by selection and aging will occur as a side effect of the antagonistic pleiotropy property of this gene. Possible candidate genes might be found in males and females. Prostate cancer appears frequently in males at advanced ages, but can be prevented by administration of female hormones or castration. It seems to be a consequence of long-term exposure to testosterone, which is necessary for male sexual and thus reproductive success. In older females, osteoporosis is mediated by estrogens that are essential for reproduction in younger women. In both cases, gene effects that are beneficial at younger ages have negative consequences later in life.
Genes that trade long-term survival against short-term benefit are probably the strongest candidates to explain the aging process. A specific version of this hypothesis that connects evolutionary concepts with molecular mechanisms is the “Disposable Soma” theory [86,87]. The theory recognizes that organisms have a finite energy budget (food resources) that must be distributed among different tasks like growth, maintenance, and reproduction. Energy spent on one task is not available for another. Organisms have to solve this resource allocation problem such that evolutionary fitness is maximized. On the basis of quite general assumptions, a mathematical model can be constructed that describes the relationship between investment in maintenance and fitness. We are going to have a closer look at this model as an example of how to formulate a mathematical description of such a qualitative idea.
To get started, we need a mathematical concept of fitness. A standard measure that is often used in population genetics is the intrinsic rate of natural increase, r (also called the Malthusian parameter), which can be calculated by numerically solving the Euler–Lotka equation (12.36). To calculate r for a given genotype, the survivorship function, l(t), and the fertility function, m(t), have to be known. l(t) denotes the probability that an individual survives to age t, and m(t) is the expected number of offspring produced by an individual of age t.
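In its continuous form, the Euler–Lotka equation states that the lifetime reproduction, discounted by population growth, must equal one:
\[
1 = \int_{0}^{\infty} e^{-r t}\, l(t)\, m(t)\,\mathrm{d}t .
\]
For a given pair of functions l(t) and m(t), the integral decreases monotonically with r, so the equation has a unique real root, which is found numerically.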
If the value of r that solves this equation is negative, it implies a shrinking population; if it is positive, the population grows. Thus, the larger r, the higher the fitness. An exact derivation of the Euler–Lotka equation is outside the scope of this book, but can be found in Maynard Smith [88] or Stearns [89]. Investment in somatic maintenance and repair will affect both survivorship and fertility, and the question remains whether there is an optimal level of maintenance that maximizes fitness. Unfortunately, the precise physiological trade-offs are unknown, so we have to develop some qualitative relationship. It was already mentioned that in many species mortality increases exponentially according to the Gompertz–Makeham equation (12.37) [63]. μ0, β, and γ represent basal vulnerability, actuarial aging rate, and age-independent environmental mortality, respectively.
Mortality and survivorship are connected via the relation μ(t) = −d ln l(t)/dt. By solving this equation, we obtain an expression for l(t) that depends on two parameters that are influenced by the level of maintenance, μ0 and β (12.38). We now define the variable ρ to be the fraction of resources that are allocated to maintenance and repair, with ρ = 0 corresponding to zero repair and ρ = 1 corresponding to the maximum that is physiologically possible. We also make the assumption that above a critical level of repair, ρ*, damage does not accumulate and the organism has reached a nonaging state. The rationale for this postulate is the idea that aging is caused by the accumulation of some kind of damage and that investing more in repair slows down the accumulation rate until finally the incidence rate equals the removal rate, in which case the physiological steady state can be maintained indefinitely. The modifications to μ0 (12.39) and β (12.40) are only one way to implement the desired trade-off (decreasing μ0 and β with increasing ρ), but in qualitative models like this, the main results are often very robust with regard to the exact mathematical expression used.
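Carrying out this integration with the Gompertz–Makeham mortality of (12.37) and the initial condition l(0) = 1 yields the standard Gompertz–Makeham survivorship, which, up to notation, corresponds to (12.38):
\[
l(t) = \exp\!\left[-\gamma t \;-\; \frac{\mu_0}{\beta}\bigl(e^{\beta t} - 1\bigr)\right].
\]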
The level of maintenance also influences fertility, m(t). It is assumed that the age at maturation, a, increases with rising ρ and that the initial reproductive rate, f, is a decreasing function of ρ. We furthermore assume that fertility declines due to age-related deterioration with the same Gompertzian rate term as survivorship. From these conditions, equations can be derived for fertility (12.41), age at maturation (12.42), and initial reproductive rate (12.43).
Now we have all the information necessary to solve the Euler–Lotka equation for different values of repair, ρ (Figure 12.32). The calculations confirm that there exists an optimal level of maintenance, ρopt, which results in a maximal fitness. Depending on the ecological niche (represented by the environmental mortality, γ), the optimal amount of maintenance varies. In a risky environment, the optimum is shifted toward lower maintenance, while a niche with low external mortality selects for individuals with a relatively high investment in maintenance.
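To illustrate how r can be computed numerically for a given level of repair, the following is a minimal Java sketch (not the implementation behind Figure 12.32): it evaluates the Euler–Lotka integral with the trapezoidal rule and finds r by bisection. The example survivorship and fertility functions, the bracketing interval, and all parameter values are illustrative assumptions and do not reproduce equations (12.38)–(12.43).

```java
import java.util.function.DoubleUnaryOperator;

public class EulerLotka {

    // f(r) = Integral_0^tMax exp(-r t) l(t) m(t) dt - 1, evaluated with the trapezoidal rule.
    static double eulerLotka(double r, DoubleUnaryOperator l, DoubleUnaryOperator m,
                             double tMax, int steps) {
        double h = tMax / steps, sum = 0.0;
        for (int i = 0; i <= steps; i++) {
            double t = i * h;
            double f = Math.exp(-r * t) * l.applyAsDouble(t) * m.applyAsDouble(t);
            sum += (i == 0 || i == steps) ? 0.5 * f : f;
        }
        return sum * h - 1.0;
    }

    // Find the root of f(r) = 0 by bisection; assumes f decreases in r and changes sign on [lo, hi].
    static double solveR(DoubleUnaryOperator l, DoubleUnaryOperator m,
                         double lo, double hi, double tMax, int steps) {
        for (int i = 0; i < 100; i++) {
            double mid = 0.5 * (lo + hi);
            if (eulerLotka(mid, l, m, tMax, steps) > 0) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        // Purely illustrative choices: Gompertz-Makeham survivorship and a constant
        // fertility after the age at maturation a (time unit: years).
        double mu0 = 1e-4, beta = 0.1, gamma = 1e-3, a = 15.0, f0 = 0.5;
        DoubleUnaryOperator l = t -> Math.exp(-gamma * t - mu0 / beta * (Math.exp(beta * t) - 1));
        DoubleUnaryOperator m = t -> (t < a) ? 0.0 : f0;
        double r = solveR(l, m, -0.5, 0.5, 150.0, 15000);
        System.out.println("Intrinsic rate of natural increase r = " + r);
    }
}
```

Repeating this calculation for a grid of ρ values (which enter through l(t) and m(t)) traces out the fitness curve whose maximum defines ρopt.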
This fits quite nicely with biological observation. Species like mice or rabbits live in a high-risk environment and, as predicted, they invest heavily in offspring but have little left for maintenance. Consequently, their aging rate is high and life expectancy low (even under risk-free laboratory conditions). Humans or elephants, by contrast, inhabit a low-risk environment, spend fewer resources for offspring, and invest more in repair. Especially instructive are birds, which live two to three times as long as mammals of comparable body weight. Again, this long life span can be predicted by the enormous reduction of external mortality that accompanies the ability to fly and thus escape predators or starvation.
Another important result of the model is that the optimal level of maintenance and repair is always below the critical level, ρ*, that is required for the evolution of a nonaging organism. This result can also be understood intuitively. Since all species have an external mortality that is above zero, they have a finite life expectancy even without aging. This means that, even though it might be physiologically possible to have such an efficient repair system that damage does not accumulate with time, resulting in a potentially immortal organism, this never results in a maximal fitness value. Only in the limit γ → 0 does ρopt approach ρ*.
It was already mentioned that defective mitochondria accumulate in old cells. These organelles developed roughly two billion years ago from free-living prokaryotic ancestors [90]. This endosymbiotic origin also explains the fact that mitochondria still contain their own genetic material in the form of a small circular DNA, named mtDNA. During the course of evolution, most mitochondrial genes were transferred to the nucleus, but the genes for 13 proteins are still located on the human mtDNA [91]. Mitochondria are involved in several essential processes like apoptosis, calcium homeostasis, and fatty acid degradation, but their most important task is the generation of ATP via oxidative phosphorylation at the inner mitochondrial membrane. All of the proteins encoded on the mtDNA are involved in this process. Since ATP is essential for thousands of biochemical reactions, it could very well be that damage to the “powerhouses of the cell” is a major contributor to the aging process.
Many studies have shown that damage to mtDNA accumulates with age in various mammalian species such as rats, monkeys, and humans [92–97]. The most prominent type of damage seems to be deletion mutations, which have lost up to 50% of the total mtDNA sequence. These mutants then accumulate in a cell, overtaking the wild-type population of mitochondria. Cells containing damaged mitochondria can be identified by staining for enzymatic activity, and Figure 12.33 shows a characteristic result for a muscle cross-section of a 38-month-old rat. Affected cells are typically negative for the mitochondrially encoded enzyme cytochrome-c oxidase (COX) and overexpress the nuclear-encoded enzyme succinate dehydrogenase (SDH). It can be seen that the cross-section displays a mosaic pattern: most cells are healthy, interspersed with a few cells that have effectively no COX activity at all. Furthermore, the experimental studies have shown that in each COX negative cell a single type of deletion mutant is clonally expanded, and different COX negative cells are taken over by different clonally expanded deletion mutants.
The challenge is now to identify a biochemical mechanism that can explain these experimental findings. That is, it has to explain (i) why deletion mutants accumulate in the presence of wild-type mtDNA, (ii) why it is always a single mutant and not several that are present in an affected cell (low heteroplasmy), and (iii) how such a mechanism can work for species with life spans ranging from 3 to 100 years. An interesting hypothesis that has been put forward, and that we want to study in more detail, is the idea that pure random drift might be sufficient to explain this clonal expansion [98,99]. Elson et al. [98] simulated a population of 1000 mtDNAs with a 10 day half-life and a deletion probability between 10−6 and 10−4 per replication event. They defined COX negative cells as those that contain more than 60% mutant mtDNA at the end of the simulation and showed that, after 120 years, between 1 and 10% of the cells had reached a degree of heteroplasmy that makes them COX negative, which is compatible with experimental observations. However, can such a process also work for short-lived animals like rodents, which show a similar pattern of accumulation to humans, but on a greatly accelerated time scale [92,96]?
How can such an idea be modeled mathematically? This is an interesting problem, since the standard approaches are not applicable in this special case. Most often, the time course of the biological entities to be modeled (molecules, cells, organisms) is described by differential equations, which are then solved analytically or, more often, numerically by a suitable software tool. However, we cannot use differential equations to describe the fate of wild-type and mutant mtDNAs, because the hypothesis assumes that all mtDNAs have identical parameter values for growth and degradation. Thus, in the deterministic case, the number of mutant mtDNAs would always remain at its initial value after the mutation event, which is 1. So, we have here a situation where the system must be modeled stochastically, since the proposed explanation relies completely on random effects. Normally we would use one of the simulation tools that are capable of performing stochastic simulations, such as Copasi or VirtualCell, but it turns out that this, too, is not possible in our situation. The problem is that we have an undetermined number of variables. Each time a deletion event happens, it creates a new type of mutant that is, by definition, different from all currently existing ones in the cell. This effectively creates a new variable in our model, whose time course we need to follow. Furthermore, there is no way to know how many of these new variables will be created during the course of the simulation. This behavior cannot be handled by classical tools for stochastic simulations.
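The deterministic argument can be made explicit. If wild-type (w) and mutant (m) mtDNAs are replicated and degraded with identical per-molecule rate constants b(t) and d(t) (which may vary in time through population-level feedback), then
\[
\frac{dw}{dt} = \bigl(b(t)-d(t)\bigr)\,w, \qquad
\frac{dm}{dt} = \bigl(b(t)-d(t)\bigr)\,m
\quad\Longrightarrow\quad
\frac{d}{dt}\!\left(\frac{m}{w+m}\right) = 0,
\]
so a single newly arisen mutant would deterministically remain at a frequency of 1/1000 forever; only stochastic fluctuations can drive it to clonal expansion or loss.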
The only solution in such a case is to write a computer program that is tailor-made for the specific problem. Exactly this has been done by Kowald and Kirkwood [100], and in the following we will have a closer look at the source code. Stochastic simulations are always computationally demanding, and thus a suitable programming language such as Java or C/C++ should be chosen. In our case, a Java program was developed within the Eclipse IDE. We cannot discuss the whole program, but will instead concentrate and comment on the core number-crunching routine.
Stochastic simulations only generate single trajectories, and therefore it is necessary to perform many repetitions to calculate meaningful statistics (e.g., the fraction of COX negative cells). At line 121, this outer loop starts, which performs Nrepeats repetitions. In the next lines, important program variables are initialized for the next trajectory simulation, which starts at line 130. A central step during program development is the choice of suitable data structures to hold the state variables of the system. In this example, mitoL is a simple list of integers that holds the number of wild-type mtDNAs followed by the numbers of one or more mutant mtDNAs. The simulation progresses in fixed time intervals of 1 h until the simulation time is reached or all wild-type mtDNAs are lost (line 130). This is similar to the τ-leap method (see Section 7.2.3), with the difference that the time interval is much shorter than the average time interval of the modeled reactions (mtDNA degradation and replication). In lines 131–137, the program collects and stores information about the number and types of mutants in ergL. This happens only every 30 days to keep the amount of data at a manageable size.
In the “for loop” starting at line 141, the program iterates through all different types of mtDNA; the small loop in lines 145–147 calculates how many molecules should be degraded during this time step, and at line 148 the number of existing mtDNAs is reduced accordingly. This is the place where stochasticity enters the simulation. The “if clause” starting at line 151 checks if all wild-type mtDNAs have been lost and performs some necessary bookkeeping in that case. Finally, line 164 removes elements from mitoL that are zero. These stem from entries for mutant mtDNAs that have been completely eradicated by degradation.
The model implements a very simple mechanism for replacing degraded mtDNAs: exactly the same number that is degraded is newly synthesized via replication, such that the total number of mtDNAs is always equal to mitoNumber. This is a typical example of how certain aspects of biology are simplified for mathematical modeling, because they are not relevant for the modeled problem or because more details are not known. Within the “for loop” starting at line 168, the new mtDNAs are created by making a copy of a randomly chosen (line 169) parent mtDNA. Since mtDNAs are arranged by type within mitoL, some lines of code (171–175) are necessary to identify the parent type. During the replication process, a deletion can happen, creating a new mutant type. This depends on the parameter mutProb and is implemented in lines 177–181. That completes the computations for this time step, and in line 185 the number of simulated hours is increased by one until simTimeH has been reached.
If the wild type has been lost or fewer than 40% wild-type mtDNAs remain (line 189), we keep track of this by incrementing some counter variables (lines 190–199). Finally, we write the data that were collected on a monthly basis to an output file (lines 204–209). Thus, for each repetition, one long line of text is generated with the monthly count of the total number of mutants and another long line with the monthly count of the number of different mutant types. This concludes our inspection of the simulation code, and in the next section we look at some of the results that can be obtained from the simulation data.
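For readers without access to the original source code, the following compact Java sketch (our own simplified reconstruction, not the program of [100]) captures the per-time-step logic described above: stochastic degradation of each mtDNA type, replacement of exactly the degraded molecules by copies of randomly chosen parents, creation of a brand-new deletion type with probability mutProb per replication, and removal of extinct mutant types. The names mitoNumber, mutProb, simTimeH, and mitoL follow the text; the 10-day half-life and all remaining details are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class MtDnaDrift {

    public static void main(String[] args) {
        int mitoNumber = 1000;          // total number of mtDNAs per cell (as in [98])
        double halfLifeDays = 10.0;     // assumed mtDNA half-life
        double mutProb = 1e-5;          // deletion probability per replication event
        int simTimeH = 120 * 365 * 24;  // simulated life span in hours
        // per-molecule degradation probability per hour (rate used as a small-step probability)
        double degPerHour = Math.log(2) / (halfLifeDays * 24.0);

        Random rnd = new Random();
        // mitoL.get(0) = wild-type copy number; further entries = copy numbers of the mutant types
        List<Integer> mitoL = new ArrayList<>();
        mitoL.add(mitoNumber);

        for (int hour = 0; hour < simTimeH && mitoL.get(0) > 0; hour++) {
            // 1) Degradation: each molecule of each type is lost with probability degPerHour
            //    (per-molecule Bernoulli draws; a binomial draw per type would be faster)
            int degradedTotal = 0;
            for (int type = 0; type < mitoL.size(); type++) {
                int n = mitoL.get(type), lost = 0;
                for (int k = 0; k < n; k++) if (rnd.nextDouble() < degPerHour) lost++;
                mitoL.set(type, n - lost);
                degradedTotal += lost;
            }
            // 2) Replication: replace exactly the degraded molecules by copying randomly chosen parents
            for (int k = 0; k < degradedTotal; k++) {
                int total = mitoL.stream().mapToInt(Integer::intValue).sum();
                if (total == 0) break;
                int pick = rnd.nextInt(total), parentType = 0;
                while (pick >= mitoL.get(parentType)) { pick -= mitoL.get(parentType); parentType++; }
                if (rnd.nextDouble() < mutProb) {
                    mitoL.add(1);                                        // deletion creates a new mutant type
                } else {
                    mitoL.set(parentType, mitoL.get(parentType) + 1);    // faithful copy of the parent type
                }
            }
            // 3) Bookkeeping: drop mutant types that have been completely degraded
            for (int type = mitoL.size() - 1; type >= 1; type--)
                if (mitoL.get(type) == 0) mitoL.remove(type);
        }

        int total = mitoL.stream().mapToInt(Integer::intValue).sum();
        double mutantFraction = 1.0 - (double) mitoL.get(0) / Math.max(total, 1);
        System.out.printf("mutant fraction %.2f, distinct mutant types %d%n",
                          mutantFraction, mitoL.size() - 1);
    }
}
```

Running many such trajectories and recording, for example, how often the mutant fraction exceeds 60% gives the kind of statistics discussed in the next section.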
The output of the simulation program is a collection of trajectories describing how many mutant mtDNAs accumulate with time and to how many different mutant types they belong. Figure 12.34(a) shows three typical trajectories for a simulation lasting 120 years. During one simulation (black), some mutant mtDNAs appear after approximately 20 years and reach roughly 200 copies, which corresponds to 20% of the total mtDNA population. However, random fluctuations cause these mutants to disappear again, and after 120 years this simulation ends with a healthy cell containing no mutant mtDNAs. The simulation shown in red displays a completely different behavior. After about 60 years, there is a sharp rise in mutant mtDNAs, which completely replace the wild type around year 95. Once all wild-type mtDNAs are lost, the simulation has reached an absorbing state, since there is no way to recreate wild-type mtDNAs.
By calculating a large number of trajectories, interesting statistical insights can be obtained. Figure 12.34(b) illustrates the increase in COX negative cells calculated from 3000 trajectories (Pmut = 10−4 and 10−5) and 15 000 trajectories (Pmut = 10−6), respectively. The larger number of simulations for the low mutation rate is necessary to obtain a “smooth” curve. Experimentally, it is known that around 4% of postmitotic human cells are COX negative after 80 years [93,101], which means that a mutation rate of 10−5–10−4 per replication is compatible with these results.
As mentioned earlier, an important experimental observation is the low degree of mtDNA heteroplasmy found in species with widely different life spans. To study this phenomenon, stochastic simulations were performed for different life spans (3, 10, 40, 80, and 120 years), and the average number of different mtDNA types per COX negative cell is shown in Figure 12.35. Since the level of COX negative cells in old individuals of species like rat and man is similar, the mutation rate was adjusted such that at the end of the simulation time the 3000 trajectories yielded 10% COX negative cells.
From the diagram, it is immediately obvious that the number of different mtDNA types per COX negative cell depends strongly on the simulated life span. If random drift can act over a time of 120 years, there are on average only 1.46 different mutant types per COX negative cell, a value that is in good agreement with experimental results obtained from humans. However, already for a life span of 40 years (e.g., rhesus monkeys), there are 2.8 mutant types per cell, hardly compatible with observation. And finally, for a life span of 3 years (e.g., rats), the simulations predict more than 35 different mtDNA mutants per cell, completely incompatible with experimental studies. Thus, the simulation results are very clear and rule out random drift as a possible explanation for the accumulation of mtDNA deletion mutants in short-lived species.
If random drift is not a likely explanation for the accumulation of mitochondrial deletion mutants, what else could be the mechanism? An interesting possibility that has been proposed is related to the reduced genome size of the deletion mutants [102,103]. If the reduced size directly translates into a reduced replication time, this should provide the mutants with a selection advantage. However, replication normally only occurs to replace mtDNAs that have been degraded via mitophagy, and since the time necessary for replication (1–2 h) [104] is much shorter than the half-life (1–3 weeks) [105,106], it is unclear whether this idea can work. Figure 12.36 provides an overview of the model system. mtDNAs can be either in a “free” or in a “busy” state. In the free state, they can respond to cellular signals and enter the busy state by starting a round of replication. After a certain delay, Δt, the original molecule plus a copy return into the free pool. Wild-type and mutant forms respond with the same probability to replication signals, but deletion mutants spend a shorter time in the busy state and hence return earlier into the free pool than wild-type mtDNAs. This is the source of the selection advantage.
Kowald et al. [107] have investigated this problem using delay differential equations as well as stochastic simulations, but we will concentrate here on the development of the system of DDEs. The model contains four variables, the numbers of wild-type and mutant mtDNAs in the free and in the busy state, which require four differential equations (12.44)–(12.47).
These variables undergo a set of reactions describing the transfer from one pool into the other, the increase in molecule number caused by replication, and the removal of mtDNAs by degradation.
Differential equations are a standard approach to model biochemical reactions. While ordinary differential equations (ODE) only refer to variable values at one time point, x(t), delay differential equations (DDE) also refer to variable values at some time point in the past, x(t−Δt). Because in our model replication needs a certain time span after which molecules reappear from the busy state, DDEs are required instead of ODEs.
Each of the four Eqs. (12.44)–(12.47) consists of three terms describing (i) the transfer of mtDNAs to or from the busy pool, (ii) the transfer of mtDNAs to or from the free pool, and (iii) the degradation process. Let us inspect the second term of (12.44) more closely. It describes the flux of wild-type mtDNAs from the free pool into the busy pool. This depends on the current amount of free mtDNAs, wtF(t), and on how strongly mtDNAs respond to cellular replication signals. It is assumed that the replication probability declines with the total amount of existing mtDNAs. The rationale is that some form of negative feedback exists that limits the production of new mtDNAs if many already exist. The mathematical expression used to model this behavior is of the hyperbolic form replS·carryC/(carryC + N(t)), where N(t) denotes the total number of mtDNAs. This construct has the property that the replication probability is equal to the parameter replS if only very few mtDNAs are present and drops to replS/2 when the number of mtDNAs has reached carryC. Now, we can also understand the first term, which quantifies the flux of mtDNAs from the busy pool back into the free pool at the end of the replication process. This depends on the amount of free wild-type mtDNAs that started replication at t−Δt, that is, on wtF(t−Δt) multiplied by the replication probability evaluated at t−Δt, doubled, and diminished by an exponential factor. The factor 2 reflects the fact that each mtDNA that starts replication is doubled in that process. The origin of the exponential expression, however, is not that obvious. It stems from the fact that mtDNAs in the busy state are also subject to degradation. As can be seen from the third term of the equation, a first-order decay is assumed, leading to an exponential decline. Therefore, the quantity of returning mtDNAs has to be diminished by this factor. The only difference between the equations for mutants and wild type is a shorter replication time, obtained by scaling Δt with the relative genome size of the deletion mutant (a factor between 0 and 1).
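Putting these pieces together, the balance equation for the free wild-type pool should, up to notation, have the following structure (our reconstruction from the verbal description above, not a verbatim copy of (12.44); d denotes the first-order degradation rate constant and N(t) the total number of mtDNAs):
\[
\frac{d\,\mathrm{wtF}(t)}{dt}
= 2\,\frac{\mathrm{replS}\cdot\mathrm{carryC}}{\mathrm{carryC}+N(t-\Delta t)}\,\mathrm{wtF}(t-\Delta t)\,e^{-d\,\Delta t}
\;-\;\frac{\mathrm{replS}\cdot\mathrm{carryC}}{\mathrm{carryC}+N(t)}\,\mathrm{wtF}(t)
\;-\;d\,\mathrm{wtF}(t),
\]
with the first term describing the return from the busy pool, the second the entry into the busy pool, and the third degradation. The mutant equations have the same structure, but with the shorter replication delay.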
Please note that the DDE model does not contain de novo mutations; it describes the competition between existing mutant and wild-type molecules. The original publication [107] also contains a stochastic model to take mutations into account, but we will restrict ourselves to the results of the deterministic DDE model. The software Mathematica was used to solve the equations numerically, and Figure 12.37(a) shows the time course of wild-type (wt) and deletion mutant (mt) mtDNAs for a set of parameters based on literature values. The simulation was started with 999 wild-type molecules and a single mutant molecule, reflecting the situation immediately after a mutation event. The main conclusions that can be drawn are that the shorter replication time does indeed provide a selection advantage for the deletion mutant, but that it takes almost 100 years before the wild type goes extinct (drops below one copy).
To better understand the model behavior, it is necessary to see how extinction times depend on parameters such as the replication time or the half-life of mtDNAs. To make the results comparable between simulations, it is important that the steady-state level of the total mtDNA population remains the same, since it is very likely that extinction times also depend on the population size. One way to achieve this is to adjust the free parameter carryC, which controls the replication probability, accordingly. The relationship between the mtDNA steady-state level, Mss, and the model parameters can be obtained by setting the left-hand side of the DDE system to zero and then solving it for the steady-state values of mtDNAs in the free and busy states. For this, it is important to realize that under steady-state conditions x(t) = x(t−Δt) holds. The sum of these steady-state values is Mss, from which the value of carryC required for a desired population size can be computed; a sketch of this calculation is given below.
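Under the simplifying assumption that only wild-type mtDNAs are present, the reconstructed free-pool equation above yields this relationship directly (our sketch; the exact expressions are given in [107]). Writing P for the replication probability, setting dwtF/dt = 0, and using wtF(t) = wtF(t−Δt) gives
\[
2P\,e^{-d\,\Delta t} = P + d
\quad\Longrightarrow\quad
P = \frac{d}{2e^{-d\,\Delta t}-1},
\qquad\text{with}\quad
P = \frac{\mathrm{replS}\cdot\mathrm{carryC}}{\mathrm{carryC}+M_{ss}},
\]
and therefore
\[
M_{ss} = \mathrm{carryC}\left(\frac{\mathrm{replS}\,(2e^{-d\,\Delta t}-1)}{d}-1\right),
\qquad
\mathrm{carryC} = \frac{M_{ss}\,d}{\mathrm{replS}\,(2e^{-d\,\Delta t}-1)-d}.
\]
Note that this requires dΔt < ln 2, that is, a replication time shorter than the mtDNA half-life, which is fulfilled by the literature values quoted above.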
Figure 12.38 shows the effects of varying the half-life or the replication time while maintaining a constant total population of mtDNAs. The simulations reveal that either shortening the half-life or increasing the time required for replication leads to a drastic reduction of the extinction time. Thus, the larger the ratio between half-life and replication time, the longer the resulting extinction time. The reduced size of the mutant is only an advantage in the busy state (via a faster exit time), and consequently the selection advantage dwindles if the total fraction of replicating mtDNAs decreases. As can be seen from Figure 12.38, increasing the half-life or decreasing the replication time diminishes this fraction.
Thus, mathematical modeling confirms that it is very unlikely that mitochondrial deletion mutants gain a selection advantage from a faster replication time based on their reduced size.