CHAPTER 11

Mammalian Methods for Detecting and Assessing Endocrine-Active Compounds

M. SUE MARTY

Toxicology & Environmental Research and Consulting, The Dow Chemical Company, Midland, Michigan, United States

11.1 INTRODUCTION

11.1.1 Battery of Assays

11.2 MAMMALIAN TIER 1 SCREENING ASSAYS

11.2.1 Rat Uterotrophic Assay

11.2.2 Hershberger Assay

11.2.3 Male and Female Pubertal Assays

11.2.4 Enhanced OECD 407 28-Day Study

11.2.5 15-Day Intact Male Assay

11.2.6 Using In Vitro Data in Conjunction with In Vivo Assays

11.3 TIER 2 TESTS

11.4 HUMAN AND WILDLIFE RELEVANCE OF ESTROGEN, ANDROGEN, AND THYROID SCREENING ASSAYS

11.5 POTENTIAL FUTURE ASSAYS FOR ENDOCRINE SCREENING

REFERENCES

11.1 INTRODUCTION

In the early 1990s, the potential of environmental chemicals to affect the endocrine system came under scrutiny. With books like Our Stolen Future [1] and reports of declining sperm counts, altered reproductive systems in wildlife, and potential synergistic responses to endocrine-active compounds, the scene was set for greater regulatory involvement by both the United States and the Organisation for Economic Cooperation and Development (OECD). In 1996, passage of the Food Quality Protection Act and an amendment to the Safe Drinking Water Act mandated the U.S. Environmental Protection Agency (EPA) to establish a screening program to identify compounds that have the potential to interact with the endocrine system. Initially, EPA convened the Endocrine Disruptor Screening and Testing Advisory Committee (EDSTAC), a multistakeholder committee that proposed a two-tier system for evaluating endocrine activity. Tier 1, which included both in vitro and in vivo assays, was designed to identify compounds with the potential to interact with the estrogen, androgen, or thyroid systems. Tier 2 was designed to evaluate adverse effects for potential endocrine-active compounds identified in Tier 1 as well as to generate dose-response data for use in risk assessment.

For over a decade, expert panels in both the European Union and the United States identified potential assays to screen for endocrine activity; reviewed assay data for sensitivity and specificity; established and monitored validation programs; and reevaluated, refined, and standardized potential endocrine screening assays. In 2002, OECD released its conceptual framework for the evaluation of potential endocrine-active compounds; this framework was basically a tool box with five levels for evaluating existing data and, as warranted, performing screening assays and/or tests with increasing complexity and strength of evidence to evaluate endocrine activity. In 2009, EPA launched the Endocrine Disruptor Screening Program (EDSP) to identify potential endocrine-active compounds. While both EPA and OECD have programs and frameworks for evaluating potential endocrine-active compounds, only EPA has issued test orders at this time. As a result, some aspects of this chapter are specific to EPA’s implementation of the EDSP.

11.1.1 Battery of Assays

Tier 1 assays in the EDSP are designed to determine whether a test compound has the potential to interact with the estrogen, androgen, or thyroid systems. Tier 1 was designed to minimize false negative results, while recognizing that there would be some corresponding increase in false positive findings. To enhance its sensitivity and specificity, the Tier 1 battery was designed to have some redundancy across assays. This allows regulators to apply a weight-of-evidence (WoE) approach to determine whether a compound has potential endocrine activity. If a compound is deemed positive in Tier 1 endocrine screening, it will undergo Tier 2 testing to confirm or refute endocrine activity.

This chapter describes the mammalian assays developed as part of the Tier 1 EDSP. It includes a description of each assay and its strengths and weaknesses and explains how these assays, coupled with other Tier 1 assays, can be used in a WoE assessment to determine potential endocrine activity. Tier 2 mammalian endocrine tests are briefly described. Last, the applicability of these data to humans and potential next steps for the EDSP are discussed. Additional reviews on EDSP Tier 1 screening also are available [2–4].

11.2 MAMMALIAN TIER 1 SCREENING ASSAYS

11.2.1 Rat Uterotrophic Assay

The rodent uterotrophic assay is a short-term, in vivo Tier 1 endocrine screening assay designed to detect estrogenic activity by measuring a compound’s ability to produce an increase in rodent uterine weight. The premise of the assay is based on the transient changes in uterine weight that occur during the estrous cycle; that is, the increase (or decrease) in uterine weights in response to increases (or decreases) in endogenous estrogen levels [5–7]. While this assay can detect both estrogen agonists and antagonists, the EDSP requires registrants to use the uterotrophic assay only to screen for estrogenic (agonist) activity. Test guidelines (TGs) [8,9] are available that describe the conduct, interpretation, and performance specifications for this assay.

Prior to its inclusion in the EDSP, the uterotrophic assay underwent an extensive validation program coordinated by OECD [10–15]. The uterotrophic assay has been shown to reliably detect estrogenic activity across numerous laboratories using different routes of exposure in either immature rats or adult ovariectomized rats [10]. Overall, the assay shows good reproducibility both within and between laboratories and relatively good specificity (e.g., see Table 1 in [13]). The discussion here is focused primarily on the conduct and utility of the uterotrophic assay for endocrine screening.

The uterotrophic assay (see Figure 11.1) typically uses immature female rats or ovariectomized adult rats (or mice), which have low levels of endogenous estrogens and are not cycling [6]. The lack of endogenous estrogen means that baseline uterine weights are low, making the assay sensitive to estrogen-induced increases in uterine weight. Animals (6/dose group) are administered the test compound, by either oral gavage or subcutaneous (sc) injection, daily for three days. On test day (TD) 4, approximately 24 hours after the last dose, animals are examined for vaginal patency (if immature rats are used), weighed, and euthanized. The uteri are excised and trimmed, and both wet (imbibed) and blotted uterine weights are recorded. If a loss of fluid is noted, the wet weight for that sample is excluded. For blotted weights, each uterine horn is nicked and blotted to remove luminal fluid (methods for collecting blotted weights are described in [16]). Uterine handling must be sufficiently rapid to avoid desiccation of the tissues [17]. Increases in uterine weights are typically due to interaction of a compound with estrogen receptor α (ERα), which can result in uterine hypertrophy, hyperplasia, and fluid imbibition. Uterine weights from compound-treated animals are compared with uterine weights in the vehicle-treated control group; a compound that causes a significant increase in wet and/or blotted uterine weights is considered positive for estrogenic activity. EPA and OECD TGs for the uterotrophic assay specify a limit dose of 1,000 mg/kg body weight/day (mg/kg/day) for test compounds. If the limit dose approach is not applicable, the assay typically requires two dose levels and a vehicle control group, although more dose levels may be included if better characterization of the dose-response curve is desired. An untreated control group also may be included to ensure that the vehicle has no impact on uterine weights as this would alter assay sensitivity.

FIGURE 11.1 Schematic drawing of the uterotrophic assay design (see the text for study design details). The inset shows an ethynylestradiol (EE)-stimulated increase in uterine size as well as an EE-dose response curve used to demonstrate laboratory proficiency for the uterotrophic assay. VO = vaginal opening; MTD = maximum tolerated dose.


OECD and EPA differ in the parameters they prefer for conducting the uterotrophic assay, including the route of exposure and which animal model to use. Several factors should be considered when selecting a route of exposure for the uterotrophic assay, including the relevant route of exposure to the test compound, bioavailability by the oral route, and metabolism to an active or inactive metabolite. The typical route of administration for the uterotrophic assay is either oral gavage or sc injection. OECD TG states that the most relevant route of exposure should be used with consideration to avoid first pass metabolism [8], whereas EPA TG favors sc dosing for direct entry of a compound into the general circulation, thereby avoiding gut metabolism and slowing the rate of liver metabolism [9]. The sc route may warrant consideration if the compound has a positive in vitro estrogen receptor (ER) binding assay, particularly if uterotrophic results by the oral route were negative. These data may be useful to characterize hazard but may be of little relevance if the germane exposure to a compound is via the oral route and the substance is rapidly deactivated by first-pass hepatic metabolism. For many compounds, the route of exposure does not impact the ability to detect a positive uterotrophic response if dose levels are selected appropriately, although the minimum effective dose may be an order of magnitude higher with oral gavage compared with sc injection [16]. Of course, the precise difference between effective sc and oral doses depends on the specific compound in question.

When selecting the animal model, OECD favors the immature female rat, whereas EPA prefers the ovariectomized young adult model. Both the immature and ovariectomized adult models are viewed as equivalent for reliability and assay sensitivity [18,10]. There are advantages and disadvantages to each animal model, which should be considered prior to model selection. The immature model may be preferred for animal welfare concerns as surgery is avoided; however, careful planning is needed to avoid the use of excess numbers of animals (i.e., due to the young age of the immature females, litters of animals must be ordered, yet dams and male littermates are not needed for this assay). If the immature model is selected, female rats must be used prior to puberty and the corresponding increase in endogenous estrogen production (i.e., postnatal day [PND] 18–25 with necropsy no later than PND 25; PND 0 is defined as the day of birth). This coincides with the period of maximum sensitivity of the uterus to exogenous estrogens [19–21]. Due to the need to necropsy these animals by PND 25, the dosing period is limited with the immature model, whereas the dosing period can be extended with the ovariectomized adult model if desired. Furthermore, immature rats are more sensitive to dietary phytoestrogen content due to the consumption of greater amounts of feed per kilogram of body weight, and uterine weights are more sensitive to body weight-mediated effects in this model, which increases the need to carefully consider group body weights during the study. The immature model is slightly less specific than the adult ovariectomized model; positive uterotrophic results with the immature model can be caused by interaction of a test substance with the hypothalamic-pituitary-gonadal (HPG) axis (e.g., aromatizable androgens can be positive in this assay).

If the ovariectomized model is selected, animals must undergo surgery, then be allowed time for the uterus to regress (i.e., approximately two weeks). During this interval, uterine weights will decrease markedly; however, these weights may not reach a stable baseline [22,23]. In addition, incomplete ovariectomy can lead to marked increases in uterine weights [24]; thus, vaginal smears are included over the last five days of the recovery period to verify that ovariectomy was complete, and laboratory personnel must look for ovarian remnants at necropsy if this animal model is used. One advantage of the ovariectomized adult is increased specificity for compounds that interact with the ER compared to the immature model. The adult ovariectomized model allows for a longer dosing period if desired; assay responsiveness for some weak estrogenic compounds (e.g., o,p′-DDT) improved with seven days of sc dosing, although assay sensitivity (i.e., the ability to detect a positive response with a true estrogenic compound) was not affected with either three or seven days of dosing [25].

In addition to the control and treated groups, a positive control group exposed daily to 17α-ethynylestradiol (EE) may be included in the assay to verify assay sensitivity and performance. In fact, OECD and EPA TGs require the inclusion of an EE-treated positive control group or a study demonstrating laboratory proficiency by generating a dose-response curve for EE-induced uterotrophic responses when conducting uterotrophic assays for regulatory purposes. If a positive control group is included, the EE dose should approximately equal the effective dose (ED) that induces 70 to 80 percent of the maximum uterine weight increase observed with EE in the dose-response study (i.e., ED70 or ED80). Inclusion of an EE positive control group allows the laboratory to compare assay responsiveness with historical control data and thereby verify that there are no shifts in assay sensitivity.
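The ED70 and ED80 can be read from a fitted EE dose-response curve. The following is a minimal sketch that assumes the proficiency data were fitted with a simple Hill (sigmoidal) model; the model choice, the function name effective_dose, and the parameter values are illustrative assumptions and are not taken from the TGs.

```python
def effective_dose(fraction, ed50, hill_slope=1.0):
    """Dose producing `fraction` of the maximal response under a Hill model.

    Assumed model (not from the test guidelines):
        response / max_response = d**h / (ED50**h + d**h)
    Solving for d gives d = ED50 * (f / (1 - f)) ** (1 / h).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    if ed50 <= 0 or hill_slope <= 0:
        raise ValueError("ed50 and hill_slope must be positive")
    return ed50 * (fraction / (1.0 - fraction)) ** (1.0 / hill_slope)


# Hypothetical fit: ED50 expressed as 1.0 dose unit, Hill slope of 1.
ed70 = effective_dose(0.70, ed50=1.0)
ed80 = effective_dose(0.80, ed50=1.0)
print(f"ED70 = {ed70:.2f} x ED50, ED80 = {ed80:.2f} x ED50")
```

With a Hill slope of 1, the ED80 is four times the ED50, which shows how far up the curve the positive control dose sits relative to the midpoint. A laboratory's own fitted parameters would replace the placeholder values.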

To ensure assay sensitivity, steps should be taken to verify that baseline uterine weights are not artificially elevated due to alternate sources of estrogens. Consequently, test animals are given a low phytoestrogen rodent diet (genistein equivalents ≤350 μg/g diet), as higher phytoestrogen content may increase baseline uterine weights [16]. Criteria for acceptable baseline uterine weights have been defined by both OECD and EPA. OECD states that baseline uterine weights generally range from 20 to 35 mg for immature rats and 80 to 110 mg in the ovariectomized young adults [16]. EPA states that the mean blotted uterine weight for the vehicle control group should be <0.04 percent of terminal body weight (ovariectomized model) or <0.09 percent of terminal body weight (immature model) to yield sufficient assay sensitivity [9]. Furthermore, immature blotted uterine weights of 40 to 45 mg in the control group may warrant concern for assay sensitivity, and weights greater than 45 mg may require the assay to be rerun in the event of a negative or equivocal result. Thus, these baseline data, coupled with EE response data, indicate that the assay is performing as expected and would detect treatment-related increases in uterine weight if these increases were present.
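The baseline criteria quoted above can be expressed as a simple screening check. This is a sketch only: the function name check_baseline, the flag wording, and the example weights are hypothetical, and the thresholds are those stated in the preceding paragraph rather than a complete rendering of either TG.

```python
def check_baseline(mean_blotted_uterus_mg, mean_terminal_bw_g, model):
    """Screen vehicle-control uterine weights against the criteria quoted above.

    Returns (relative weight as percent of body weight, list of flags).
    """
    # Ceilings for mean blotted uterine weight, percent of terminal body weight
    limits_pct = {"ovariectomized": 0.04, "immature": 0.09}
    if model not in limits_pct:
        raise ValueError("model must be 'ovariectomized' or 'immature'")

    # Convert body weight from g to mg so both weights share a unit
    rel_pct = mean_blotted_uterus_mg / (mean_terminal_bw_g * 1000.0) * 100.0

    flags = []
    if rel_pct >= limits_pct[model]:
        flags.append("relative uterine weight at or above the stated ceiling")
    if model == "immature":
        if mean_blotted_uterus_mg > 45:
            flags.append("rerun may be needed if result is negative or equivocal")
        elif mean_blotted_uterus_mg >= 40:
            flags.append("possible concern for assay sensitivity")
    return rel_pct, flags


# Hypothetical immature-model control group: 30 mg uterus, 55 g body weight
rel, flags = check_baseline(30.0, 55.0, "immature")
print(f"{rel:.3f} percent of body weight; flags: {flags}")
```

An empty flag list means none of the quoted thresholds was exceeded; it does not replace comparison with the laboratory's historical control data.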

Variability also is an area of concern. If variability in control uterine weights is excessive, the ability of the uterotrophic assay to detect weakly acting estrogenic compounds will be diminished. In the OECD validation study [10], statistical power calculations for the uterotrophic assay are presented; for example, there was reasonable power (81 percent probability) of detecting a 35 percent increase in uterine weight with six animals per group if the coefficients of variation (CVs) remained low (i.e., 15.0 percent). Blotted uterine weights generally have less variability than imbibed uterine weights [10]. Each laboratory should establish its own historical control data for baseline uterine weights to determine whether uterine weights in the vehicle control group are atypically high and whether a repeat of the assay should be considered.
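As a worked illustration of the CV referred to above, the sketch below computes the CV for one control group. The six uterine weights are invented for illustration, and using 15 percent as a flagging threshold is an assumption drawn from the power example, not a guideline criterion.

```python
from statistics import mean, stdev


def coefficient_of_variation(weights_mg):
    """CV in percent: sample standard deviation divided by the mean."""
    if len(weights_mg) < 2:
        raise ValueError("at least two animals are needed to estimate a CV")
    return stdev(weights_mg) / mean(weights_mg) * 100.0


# Hypothetical blotted uterine weights (mg) for six vehicle-control animals
control_weights = [28.0, 31.0, 25.0, 30.0, 27.0, 33.0]
cv = coefficient_of_variation(control_weights)

# 15 percent is the CV used in the power example above (assumed threshold)
status = "within" if cv <= 15.0 else "above"
print(f"CV = {cv:.1f} percent ({status} the 15 percent used in the power example)")
```

The same calculation can be applied to both wet and blotted weights to see which end point is less variable in a given laboratory.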

In the uterotrophic assay, issues with reproducibility may arise when increases in uterine weights are in the lower portion of the dose-response curve and, therefore, may or may not achieve statistical significance [16] and/or baseline uterine weights are outside the normal range, altering assay sensitivity. Careful dose selection, using range-finding studies if needed, and evaluation of baseline uterine weights within the context of historical control data can help clarify uterotrophic results. Suspect results may warrant assay replication. Thus, modest increases in uterine weight should be interpreted carefully using a WoE approach.

While relatively specific for estrogenic compounds, the uterotrophic assay also may yield a positive response with androgens, progestins, and other growth factors [6,26,27]. While measurement of uterine weight increases has been the most consistent and sensitive indicator of estrogenic activity [28], uterine histopathology may confirm an estrogenic response (e.g., testosterone can increase uterine weight but produces different histopathology from estrogen [19]). In addition, data from ER binding assays, ER transactivation assays, or other related end points (e.g., estrogen-sensitive molecular markers) may help confirm estrogenicity [29]. Results of previous toxicity studies also should be reviewed for signs of estrogen perturbations.

While not included as a requirement of the EDSP, the estrogen antagonist portion of the uterotrophic assay also can be used to detect anti-estrogens by measuring a test compound-mediated attenuation in uterine weight increases when co-administered with estrogen [7,30]. OECD has issued a guidance document for laboratories interested in using the uterotrophic assay to test for anti-estrogenicity [31]. However, most environmental agents with estrogenic activity are mixed agonists/antagonists [10] and can be detected in the uterotrophic assay without including the anti-estrogenic portion of the assay. With the anti-estrogenic assay, it is important to note that the induction of liver enzymes by a test compound may enhance the rate of estradiol clearance and reduce the increases in uterine weight [32–35] via a non-ER-mediated mechanism.

11.2.2 Hershberger Assay

The rodent Hershberger assay is a short-term, in vivo Tier 1 endocrine screening assay designed to detect androgenic or anti-androgenic activity by measuring a compound’s ability to produce an increase in accessory sex tissue (AST) weights or inhibit the testosterone-induced increase in AST weights in castrated (orchidoepididymectomized) peripubertal rats. AST weights are androgen dependent (i.e., testosterone and dihydrotestosterone [DHT]); thus, the Hershberger assay was designed to detect chemicals that potentially act as androgen receptor (AR) agonists, antagonists, or 5α-reductase inhibitors [36–39] (5α-reductase converts testosterone to DHT). Because animals are castrated, the HPG axis is disrupted, so the assay does not detect agents that act directly on the hypothalamus or pituitary or via a non–receptor-mediated mode of action (e.g., altered steroidogenesis). The Hershberger assay is included in EPA’s EDSP, and TGs are available that describe the conduct, interpretation, and performance specifications for this assay [40,41].

As with the uterotrophic assay, the Hershberger assay underwent an extensive validation program coordinated by OECD [15,38,39,42–45]. The Hershberger assay has been shown to reliably detect androgenic and anti-androgenic activity across numerous laboratories using different routes of exposure in castrated rats. Overall, the assay shows good reproducibility both within and between laboratories and relatively good specificity (e.g., see Tables 4 and 5 in [39]). The following discussion is focused primarily on the conduct and utility of the Hershberger assay for EDSP screening.

To conduct the Hershberger assay (see Figure 11.2), male rats are castrated at 42 days of age or shortly thereafter and allowed 7 days to recover from surgery. During this time, the ASTs regress. Castrated male rats (6/dose group) receive test material for 10 days by gavage or sc injection in the presence or absence of testosterone propionate (TP). The EPA TG for the Hershberger assay specifies a limit dose of 1,000 mg/kg/day [41]. If the limit dose approach is not applicable, a minimum of two treatment groups is required for the androgenic portion of the assay and three treatment groups for the anti-androgenic portion of the assay. Additional treatment groups can be included if needed. As with the uterotrophic assay, the test material may be given either orally or subcutaneously, depending on the relevant route (oral, dermal, or inhalation), toxicological considerations, and the desire to avoid first-pass metabolism. Animals are euthanized 24 hours after the final dose, and organ weights are collected for the ASTs (i.e., ventral prostate, seminal vesicles with coagulating glands and fluid, levator ani-bulbocavernosus muscles, glans penis [if preputial separation has occurred], and Cowper’s [bulbourethral] glands) and other organs (i.e., liver, kidneys, and adrenals). Compounds that increase AST weights in the absence of TP are considered positive for androgenic activity, whereas compounds that decrease AST weights in the presence of TP are considered positive for anti-androgenicity. On the designated necropsy day, blood may be collected for possible future measurement of serum testosterone, luteinizing hormone (LH), or follicle stimulating hormone (FSH). Significant alterations in two or more AST weights are required for a positive assay outcome.
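The decision rule described above (two or more ASTs significantly altered, with increases read without TP and decreases read with TP) can be sketched as follows. The function name hershberger_call and the "up"/"down" encoding are hypothetical, and the sketch assumes that statistical significance for each tissue has already been determined elsewhere.

```python
# The five accessory sex tissues weighed in the assay, as listed above
AST_TISSUES = (
    "ventral prostate",
    "seminal vesicles",
    "levator ani-bulbocavernosus",
    "glans penis",
    "Cowper's glands",
)


def hershberger_call(significant_changes, with_tp):
    """Apply the two-or-more-tissues rule to one treatment group.

    significant_changes maps tissue name -> "up", "down", or None, where
    None (or a missing tissue) means no significant change from control.
    Without TP the relevant direction is an increase (androgenic);
    with TP it is a decrease (anti-androgenic).
    """
    wanted = "down" if with_tp else "up"
    hits = [t for t in AST_TISSUES if significant_changes.get(t) == wanted]
    return len(hits) >= 2, hits


# Hypothetical result for a group co-administered TP
observed = {
    "ventral prostate": "down",
    "seminal vesicles": "down",
    "levator ani-bulbocavernosus": None,
}
positive, tissues = hershberger_call(observed, with_tp=True)
print("anti-androgenic positive:", positive, "| tissues:", tissues)
```

A positive call from this rule is only a screening outcome; interpreting it still requires the WoE considerations discussed in the rest of this section.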

FIGURE 11.2 Schematic drawing of the Hershberger assay design (see the text for study design details). The inset shows a testosterone-stimulated increase in seminal vesicle weights (top) and illustrates the relative size of accessory sex tissues (LABC, Cowper’s glands, seminal vesicles) in non-androgen-treated castrated adult rats (bottom). The lowest tissue shown in the bottom photo is the LABC from a testosterone-treated male.


During the validation work, numerous factors were examined for effects on assay specificity. The assay is not sensitive to rat (mouse) strain used, diet, bedding, caging, light cycles, or animal room conditions (temperature, humidity) as long as the animals in a particular study are maintained under the same conditions [36,38]. Furthermore, the assay is relatively insensitive to body weight-mediated changes in AST weights [46].

Because the Hershberger assay can detect both androgenic and anti-androgenic responses, concurrent controls are an important component of the assay protocol. For the androgenic portion of the assay, animals given 0.2 or 0.4 mg/kg/day TP sc are included as a positive control for comparison with vehicle control animals to verify sensitivity to androgenic responses. For the anti-androgenic portion of the assay, animals are given a combination of 3 mg/kg/day flutamide orally and TP subcutaneously as a positive control for comparison with the TP-only animals to verify sensitivity to anti-androgenic responses. Thus, for each assay, a vehicle control, positive androgenic control (TP), and a positive anti-androgenic control (flutamide with TP) are needed along with groups exposed to the test compound with and without TP.

In the Hershberger assay, as with the uterotrophic assay, a number of nonrequired optional end points may be measured, including liver, kidney, and adrenal weights, and hormone levels (testosterone, LH, FSH, triiodothyronine [T3], and thyroxine [T4]). Liver, kidney, and adrenal weights can provide additional information on systemic toxicity and/or metabolic enzyme induction. Measurement of testosterone, LH, and FSH may provide additional information about mode of action. Some investigators have assessed thyroid weight and/or histopathology as part of the Hershberger assay [47–49]. The value of measuring thyroid hormone (TH) levels is questionable because TSH levels are not measured, and results may be very difficult to interpret in the absence of thyroid weights and thyroid histology, which are not measured in this assay. Furthermore, the sample sizes specified in the Hershberger assay (n = 6) are generally considered too small for accurate assessment of TH levels due to variability among animals; thus, increasing group size should be considered if these optional end points are to be included. In general, the optional end points that have the greatest potential for enhancing assay interpretation are liver weights and, if warranted, hepatic enzyme evaluation [7].

While not technically difficult, the Hershberger assay requires some practice to conduct consistently. Dissection of ASTs from animals not exposed to TP is particularly demanding as these tissues are very small (e.g., paired Cowper’s gland weights are often 5–6 mg). Due to the small size, variability may increase in these end points, which may confound assay results [50]. In addition, there may be a limited sample size for glans penis weights if all animals have not undergone preputial separation. If male Sprague-Dawley-derived rats are castrated at 42 days of age and the mean age for preputial separation is 41.8 to 45.9 days of age [51], some animals may not have completed preputial separation at the time of the assay. Laboratories are required to report and statistically analyze the number of animals that failed to complete preputial separation in the control and treated groups; however, given the 7-day recovery period after castration (i.e., animals are at least 49 days of age at the time of dosing), it is questionable whether differences in preputial separation can be attributed to treatment. Thus, differences in the number of animals achieving preputial separation, and therefore sample sizes for glans penis weights, do not necessarily reflect androgenic or anti-androgenic potential of the test material.

While EPA and OECD provide performance criteria for assay results, laboratories must be cautious in their use of those criteria. Per the TGs, CVs for the control and high-dose groups must meet the performance criteria when assay outcomes are negative; if the maximum CV criteria have been exceeded, the assay may lack sufficient sensitivity to detect AST weight changes. Therefore, the assay may need to be repeated in the event of a negative or equivocal finding. CV criteria are generally applied to the control group; it would not be unusual for the high-dose group to have greater variance than the controls due to greater interanimal variability in the response to test material treatment. High CVs can result in the need to repeat the assay. To assist interpretation, laboratories should maintain historical control data for vehicle controls, testosterone-treated controls, and testosterone-plus-flutamide-treated controls to verify that assay results are consistent with previous findings.

Within the various ASTs, there are differential responses to androgens, which can provide some information on mode of action. For example, the weight of the ventral prostate is more sensitive to DHT than to testosterone and so is more reflective of 5α-reductase activity. In contrast, the size of the levator ani muscle is testosterone dependent, with little capacity to convert testosterone to DHT, thereby serving as a more specific marker of anabolic (myotrophic) androgens. With the 5α-reductase inhibitor finasteride, effects on ventral prostate weights were seen at lower doses than those affecting levator ani-bulbocavernosus muscle weights [52]. Although differences in response magnitude may occur across tissues, AST weights should exhibit a trend in the same direction [39]. Thus, differential effects on organ weights may provide some clues about a chemical’s mode of action; however, additional data would be required for a definitive determination.

It is important to use a WoE approach when evaluating Hershberger assay results because the weights of the target tissues may be altered by agents other than androgen agonists or antagonists (e.g., estrogens can increase seminal vesicle weights). Aside from testosterone and DHT, AST weights can be altered by TH, growth hormone, prolactin, and/or epithelial growth factor as well as estrogens.

Compounds that induce hepatic enzymes may enhance the rate of testosterone clearance and thereby produce a positive anti-androgenic response without interaction with either ARs or 5α-reductase. This possibility was recognized during the assay validation process; hence, the inclusion of liver weights and optional histopathology as part of the Hershberger assay. One example of this possibility, identified in our laboratory, involves a dinitroaniline herbicide. When tested in a Hershberger assay, this herbicide markedly increased liver weights and significantly decreased AST weights in testosterone-treated animals. Analysis of serum hormone samples collected 24 hours after the last dose indicated that testosterone levels were decreased 29 percent with compound treatment. A subsequent clearance study indicated that treated animals eliminated 14C-testosterone in the blood four times faster than control animals, resulting in plasma area under the curve (AUC) values significantly lower than the controls after 10 days of treatment. Previous toxicity studies have demonstrated that this compound activates enzymes involved in thyroid and steroid hormone clearance. Furthermore, the available data do not support anti-androgenicity via altered 5α-reductase activity or AR binding. When comparing organ weight decrements in the Hershberger study, the levator ani-bulbocavernosus muscle was more sensitive than ventral prostate, which suggests that an effect on 5α-reductase activity is less likely. Furthermore, this dinitroaniline herbicide was negative for AR binding at concentrations up to 1,000 μM, as demonstrated using a competitive binding assay with ARs from rat prostate cytosols (M. LeBaron, personal communication). Thus, these data suggest that enzyme-inducing compounds can produce positive Hershberger responses through a mode of action that does not directly involve the endocrine system. 
Optional end points (e.g., liver weights/histopathology and serum testosterone levels in terminal blood samples) can aid in determining the mode of action for anti-androgenic responses. Another alternate (or complementary) approach is to examine hepatic enzyme induction directly [37,53]. After collecting organ weights, the liver can be frozen for possible enzyme evaluation if anti-androgenic effects are observed. It also may be useful to monitor testosterone clearance in satellite animals when hepatic enzyme induction is suspected.

While the Hershberger assay is relatively specific for detecting compounds interacting with the AR or inhibiting 5α-reductase, it is still possible to obtain positive results through other modes of action. Thus, this assay, as with all of the Tier 1 EDSP assays, should be used in a WoE approach that considers all available toxicity data and the results of other Tier 1 assays, particularly results for AR binding, the male pubertal assay, and the fish short-term reproduction assay.

11.2.3 Male and Female Pubertal Assays

The male and female pubertal assays are longer term (21–31 days), repeat-dose in vivo assays designed to detect potential estrogenic/anti-estrogenic effects (primarily the female assay), androgen/anti-androgen effects (primarily the male assay), steroid biosynthesis inhibitors, alterations in the HPG axis, and thyroid perturbations by evaluating a compound’s ability to alter age at puberty onset, estrous cycles (female only), or reproductive/AST/thyroid weights or histopathology. Serum hormone measurements for testosterone (males only), T4, and TSH are included to provide additional information on possible endocrine activity. One of the strengths of the pubertal assays is that endocrine end points are examined in young animals during a dynamic period when integrated function of the endocrine system is required. These assays are multimodal, capable of detecting endocrine-active chemicals that operate through a variety of modes of action, some of which are not detected by in vitro, uterotrophic, or Hershberger components of the EDSP. Thus, the male and female pubertal assays are included in EPA’s EDSP Tier 1 assays. TGs are available that describe the conduct, interpretation, and performance specifications for these assays [54,55].

The male and female pubertal assays underwent a validation program coordinated by EPA [56,57]; however, the validation work for these assays was not as extensive and did not include as many laboratories as the uterotrophic or Hershberger validation efforts. Overall, the male and female pubertal assays have been shown to reliably detect multiple endocrine modes of action. For endocrine-active compounds used in the validation program, the overall pattern of effects was reproducible across laboratories, although sometimes effects on specific end points were not reproducible. Furthermore, these assays showed good sensitivity to detect endocrine-active compounds, but there is some concern that these assays may lack specificity due to their reliance on apical end points. The next discussion is focused primarily on the conduct and utility of the pubertal assays for EDSP screening.

To conduct the male and female pubertal assays (see Figure 11.3), male or female weanling rats are randomly assigned to treatment groups in a manner that yields similar mean body weights and variances across groups; littermates are not assigned to the same group. Rats are exposed to the test compound by oral gavage from PND 23 to 53 (males) or 22 to 42 (females). Beginning on PND 30 (males) or PND 22 (females), animals are evaluated daily for puberty onset, which is indicated by preputial separation in the males and vaginal opening in the females. When puberty onset is achieved, the animal’s age and body weight are recorded. Once vaginal opening is complete, daily vaginal smears are collected from female rats to monitor age at first estrus and to evaluate the pattern and regularity of the estrous cycle. Males and females are necropsied on PND 53 and 42, respectively. A terminal blood sample is collected for clinical chemistry and serum hormone analyses (TSH and T4 in both the males and females and testosterone in the males). The liver, kidneys, adrenals, thyroid, and pituitary are weighed in both sexes. Sex-specific organ weights include the ovaries and uterus (with and without fluid) in the females and the testes, epididymides, ventral prostate, dorsolateral prostate, seminal vesicles with coagulating glands (with and without fluid), and levator ani-bulbocavernosus muscles in the males. Tissues examined histopathologically include the ovary, uterus, kidney, and thyroid for the females and the testis, epididymis, kidney, and thyroid for the males. Many of the end points evaluated in the pubertal assays (e.g., puberty onset, estrous cyclicity, some organ weights, and histopathology) overlap with end points in the EDSP Tier 2 multigeneration rat reproduction study, which allows confirmation of these screening results under more robust test conditions.

FIGURE 11.3 Schematic drawing of the male and female pubertal assay designs (see the text for study design details). VO = vaginal opening; PPS = preputial separation; MTD = maximum tolerated dose.

The inclusion of numerous apical end points in the male and female pubertal assays creates greater inherent biological variability in the end points evaluated and may contribute to difficulties in identifying the mode of action for endocrine-active compounds. For example, in EPA’s prevalidation and validation studies, mean age at preputial separation in control animals varied from 39.6 to 43.9 days of age and mean age at vaginal opening in control animals ranged from 31.5 to 34.9 days of age. The basis for this interanimal variability in age at puberty onset is poorly understood, although puberty onset can be influenced by a number of factors, including alterations in higher brain function [58–60], growth hormone [61], melatonin [62], body weight/composition [63], animal husbandry practices [64–66], and interlaboratory differences in recording these landmarks. Similarly, estrous cycle length is variable across animals, particularly with the onset of cycling, and it is subject to influence by body weight gain and stress [67,68]. In the female pubertal assay, estrous cycle must be monitored after vaginal opening; however, the duration of time for estrous cycle monitoring (∼ 10 days if vaginal opening occurs at 32 days of age) is limited, particularly when one considers that females have four- to five-day cycles and that the first days after vaginal opening may be acyclic. In one of the prevalidation studies conducted for the female pubertal assay, 12 of 14 control animals failed to achieve regular cycles during the monitoring period after vaginal opening [56]. Furthermore, organ weights can be affected by multiple endocrine modes of action, by the stage of estrous cycle at necropsy, or by nonspecific effects, such as systemic toxicity or stress, which can cause changes in body weight gain. 
Thus, given that multiple factors can affect end points like puberty onset, estrous cyclicity, and organ weights, examining patterns of effects across end points is more useful than examining individual end points when determining a chemical’s mode of action.
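The limited window for estrous cycle monitoring can be illustrated with simple arithmetic. The sketch below assumes only what the text states: necropsy of females on PND 42 and cycle lengths of four to five days. The function name and its parameters are illustrative and are not taken from the TG.

```python
def estrous_monitoring_window(vo_age_pnd, necropsy_pnd=42, cycle_days=(4, 5)):
    """Return (days available, complete 4-day cycles, complete 5-day cycles).

    vo_age_pnd: postnatal day on which vaginal opening is complete.
    The count ignores any acyclic days immediately after vaginal opening,
    so it is an upper bound on the number of observable cycles.
    """
    days = max(necropsy_pnd - vo_age_pnd, 0)
    short_cycle, long_cycle = cycle_days
    return days, days // short_cycle, days // long_cycle


# Vaginal opening at 32 days of age, as in the example in the text.
print(estrous_monitoring_window(32))  # (10, 2, 2)
```

Even in this best case, no more than two complete cycles can be observed, which is consistent with the difficulty of establishing regular cyclicity during the assay.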

Results with known endocrine-active compounds have raised questions regarding the sensitivity of the pubertal assays and their ability to differentiate modes of action. With respect to sensitivity (i.e., the proportion of active substances that are correctly identified by a new test), the available data indicate that the male and female pubertal assays are relatively sensitive with respect to alterations in estrogen and androgen function. However, there remains some question as to the ability of the female pubertal assay to detect weak aromatase inhibitors, particularly those with mixed endocrine activities. The female pubertal assay did not detect δ-testolactone, a moderately specific aromatase inhibitor, at doses high enough to cause anti-androgenicity in the male pubertal assay [69]. The Integrated Summary Report for the female pubertal assay [56] notes that minimal effects were seen with the weak aromatase inhibitor fenarimol at doses that produced a significant decrease in terminal body weight (9.8 percent). Ultimately, fenarimol was detected as an endocrine-active compound due to thyroid changes, which would complicate mode of action determinations. In contrast, the potent aromatase inhibitor fadrozole was readily detected in the female pubertal assay [69].

Similar issues were reported for the male pubertal assay with respect to some weak thyroid-active compounds. Phenobarbital was used as a weak thyroid agent during validation of several EDSP assays; however, phenobarbital did not alter thyroid weights, thyroid histopathology, or serum T4 or TSH levels in the male pubertal assay [57]. Instead, phenobarbital delayed preputial separation and decreased reproductive and AST weights, likely via a central nervous system effect. Thus, phenobarbital produced the same pattern of effects as the anti-androgens linuron and flutamide. If phenobarbital were an unknown compound, it is unclear whether it would have been identified as a primary thyroid toxicant when a variety of anti-androgenic effects were seen at lower dose levels.

Specificity also has been recognized as a potential issue for the male and female pubertal assays. The Interagency Coordinating Committee on the Validation of Alternative Methods defines “specificity” as the proportion of inactive substances that are correctly identified. Initially, 2-chloronitrobenzene was used as a negative control chemical for the male and female pubertal assay validation; however, it was later determined that this was a poor choice for a negative control compound as multiple end points were altered in these assays [56,57]. EPA recently reported that hydroxyatrazine (OH-ATR) and 2,4-dichlorophenoxyacetic acid (2,4-D) were negative in the male and female pubertal assays [70]. The maximum tolerated doses (MTDs) for both OH-ATR and 2,4-D were identified through renal toxicity rather than a significant change in body weight/body weight gains. For OH-ATR, renal toxicity seen at 45 mg/kg/day included pyelonephritis (inflammation of the renal pelvis), whereas for 2,4-D, renal toxicity (3 and 30 mg/kg/day) was based on minimal to slight renal changes (mineralization, tubular regeneration, and the presence of protein casts). Thus, the renal changes caused by OH-ATR and 2,4-D span a spectrum of severity. With only a few negative compounds included in the pubertal assay validation, the question of assay specificity remains open and may be further examined as additional data are developed.
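The definitions of sensitivity and specificity used in this section reduce to two simple proportions. The following minimal sketch shows the calculation. The counts in the example are hypothetical and do not represent validation results; they are chosen only to show how coarse a specificity estimate becomes when few negative compounds are tested.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of active substances that are correctly identified."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no active substances were tested")
    return true_positives / total


def specificity(true_negatives, false_positives):
    """Proportion of inactive substances that are correctly identified."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no inactive substances were tested")
    return true_negatives / total


# Hypothetical counts: with only three inactive substances, a single
# misclassified compound moves the specificity estimate by one third.
print(round(specificity(true_negatives=2, false_positives=1), 2))
print(round(specificity(true_negatives=3, false_positives=0), 2))
```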

The negative control examples further emphasize the importance of careful selection of the high dose level (MTD) when conducting the male and female pubertal assays. According to the pubertal TGs, these dose criteria may constitute an MTD:

1. The limit dose of 1,000 mg/kg/day is sufficient for a high-dose level.
2. A dose is considered an MTD if it causes a statistically significant reduction in the terminal body weight gain in treated animals versus controls, provided that the reduction is not greater than approximately 10 percent of the mean terminal body weight for the controls and there are no clinical signs of toxicity throughout the study.
3. Abnormal blood clinical chemistry parameters at termination, particularly creatinine and blood urea nitrogen, may indicate that the MTD has been exceeded.
4. Histopathology of the kidney or other organ for which gross observations indicate damage may indicate that the MTD has been reached or exceeded.
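The four criteria above can be sketched as a simple screening check. This is an illustration under stated assumptions, not the TG's decision procedure. The class and function names are hypothetical, the "approximately 10 percent" bound is treated as a hard cutoff, and a body weight decrease beyond that bound is read here as a possible exceedance of the MTD.

```python
from dataclasses import dataclass

LIMIT_DOSE_MG_KG_DAY = 1000.0   # criterion 1 in the text
MAX_BW_DECREASE_PCT = 10.0      # "approximately 10 percent" in criterion 2


@dataclass
class HighDoseFindings:
    """Summary of high-dose group findings (hypothetical structure)."""
    dose_mg_kg_day: float
    control_mean_terminal_bw_g: float
    treated_mean_terminal_bw_g: float
    bw_decrease_significant: bool             # from the study's own statistics
    clinical_signs: bool = False
    abnormal_clinical_chemistry: bool = False  # e.g., creatinine, BUN
    organ_histopathology: bool = False         # e.g., kidney lesions


def percent_bw_decrease(f):
    """Decrease in mean terminal body weight as a percentage of control."""
    c, t = f.control_mean_terminal_bw_g, f.treated_mean_terminal_bw_g
    return 100.0 * (c - t) / c


def assess_high_dose(f):
    """Map the findings onto the four MTD criteria listed in the text."""
    decrease = percent_bw_decrease(f)
    within_bounds = decrease <= MAX_BW_DECREASE_PCT and not f.clinical_signs
    return {
        "percent_bw_decrease": decrease,
        "limit_dose_reached": f.dose_mg_kg_day >= LIMIT_DOSE_MG_KG_DAY,
        "bw_criterion_met": f.bw_decrease_significant and within_bounds,
        # Assumption: exceeding the body weight bound, showing clinical
        # signs, or abnormal clinical chemistry flags possible exceedance.
        "possibly_exceeded": (not within_bounds) or f.abnormal_clinical_chemistry,
        "reached_or_exceeded": f.organ_histopathology,
    }


# Hypothetical example: 300 mg/kg/day with a significant 8 percent decrease.
result = assess_high_dose(HighDoseFindings(300.0, 200.0, 184.0, True))
print(result["bw_criterion_met"], result["possibly_exceeded"])
```

Such a check only organizes the findings; the final judgment on dose adequacy remains a matter of expert evaluation.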

MTD dose selection for the pubertal assays is critical to avoid nonspecific outcomes. Thus, a relatively inclusive range-finding study may be useful to select dose levels.

Male and female pubertal assay results may be difficult to interpret in the presence of changes in growth rate and terminal body weight. Interpreting delays in puberty onset can be problematic in studies where endocrine-mediated effects must be distinguished from generalized delays in growth. While Laws et al. [71] reported that 20 to 21 percent decreases in body weight did not significantly affect age at puberty onset in male or female rats, other data suggest that age at puberty onset and body weight function as a continuum [72], and body weight alterations of approximately 10 to 15 percent could alter puberty onset [73,74]. Organ weight end points also may be affected by changes in body weight. Feed restriction studies have demonstrated that female organ weights were not altered with a 5 percent change in terminal body weight; however, the next level of feed restriction (12 percent) altered pituitary, adrenal, liver, kidney, and ovarian weights [71]. In a separate study, an 8.6 percent change in terminal body weight altered female pituitary and kidney weights and the number of four- to five-day estrous cycles [75]. Similar results have been reported in the male pubertal assay. Terminal body weight decreases of 4 percent altered adrenal and liver weights in the males, whereas adrenal, pituitary, liver, and kidney weights were altered at the next level of feed restriction (12.5 percent) [71]. Marty et al. [46] reported that an 11 percent decrease in terminal body weight altered epididymal, prostate, ventral prostate, seminal vesicle, and liver weights, and a reanalysis of these data indicated a significant decrease in dorsolateral prostate weights as well. These findings are similar to feed restriction results reported by Stoker et al. [73], wherein a 15 percent difference in terminal body weight resulted in significant decreases in ventral prostate, seminal vesicle, and epididymal weights.
There are no data as to whether the weights of the levator ani-bulbocavernosus muscles and seminal vesicles without fluid are influenced by body weight changes as neither organ weight has been measured in pubertal feed restriction studies. Data also indicate that TH levels can be altered with feed restriction [71]. Although the precise magnitude of body weight change that results in significant differences in assay end points is unclear, both male and female pubertal assay results must be interpreted with caution if a 10 percent change in terminal body weight is observed. EPA has recognized this point; for the male pubertal assay, the TG cautions that a 6 percent decrease in terminal body weight should be interpreted with caution using a WoE approach and in these cases, additional studies may be needed to determine endocrine activity. According to the female pubertal TG, terminal body weight decreases up to 10 percent are consistent with the MTD description.

While body weight may affect pubertal assay end points, the statistical analysis (analysis of covariance, ANCOVA) for these data does not consider terminal body weight differences across groups [54,55]. As a premise of ANCOVA, the covariate should be independent of treatment; using terminal body weight as the covariate often violates this statistical principle. Consequently, body weight at weaning (PND 21) was selected as the appropriate covariate. Furthermore, endocrine-active substances may affect overall body weight gain and terminal body weight; therefore, there was some concern that adjustments for terminal body weight may mask endocrine-related effects. In some cases, compounds that affect organ weights secondary to alterations in growth rate/body weight gain may be identified by the pattern of effects observed and WoE. For example, environmental estrogens can decrease rate of growth; however, estrogenic compounds typically accelerate age at vaginal opening, which occurs at a lower body weight (e.g., methoxychlor, ethynylestradiol [56]). A positive uterotrophic assay in this situation would add to the WoE. Furthermore, thyroid-active agents can decrease rate of growth/body weight gain; however, these agents also decrease T3 and T4, increase TSH, increase thyroid weights, and produce characteristic changes in thyroid histopathology (e.g., propylthiouracil, DE-71; [56,57]). Anti-androgens also may affect growth rate; however, such agents may be identified in the Hershberger assay, which is designed to be sensitive to this mode of action [76]. Thus, while these endocrine modes of action can affect growth rate, interpretation of assay results would not be confused with systemic toxicity when evaluating all available data. Identifying the effects of steroidogenesis inhibitors in the presence of decreased body weights may be more problematic.
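The role of the weaning body weight covariate can be illustrated with covariate-adjusted group means. The sketch below uses the classical pooled within-group slope and is not the TG's prescribed statistical procedure; it computes adjusted means only and omits the accompanying significance test. The function name and all data values are hypothetical.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """Return (pooled within-group slope, covariate-adjusted group means).

    groups maps a group name to a list of (covariate, response) pairs,
    e.g., (body weight at weaning, organ weight at necropsy).
    """
    grand_x = mean(x for obs in groups.values() for x, _ in obs)
    sxx = sxy = 0.0
    group_means = {}
    for name, obs in groups.items():
        xs = [x for x, _ in obs]
        ys = [y for _, y in obs]
        xbar, ybar = mean(xs), mean(ys)
        group_means[name] = (xbar, ybar)
        sxx += sum((x - xbar) ** 2 for x in xs)
        sxy += sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("covariate does not vary within groups")
    slope = sxy / sxx
    adjusted = {
        name: ybar - slope * (xbar - grand_x)
        for name, (xbar, ybar) in group_means.items()
    }
    return slope, adjusted


# Hypothetical data: the treated group happened to be heavier at weaning,
# so its raw mean response (116) exceeds the control mean (114).
data = {
    "control": [(50.0, 110.0), (52.0, 114.0), (54.0, 118.0)],
    "treated": [(54.0, 112.0), (56.0, 116.0), (58.0, 120.0)],
}
slope, adjusted = ancova_adjusted_means(data)
print(slope, adjusted)
```

In this constructed example, adjusting to a common weaning body weight reverses the apparent direction of the group difference. This is why a covariate measured before treatment begins can clarify, rather than mask, treatment-related effects.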

To ensure laboratory proficiency and assay sensitivity, EPA has outlined performance criteria in the pubertal assay TGs. These performance criteria include acceptable mean ranges and CVs for many assay end points. These data may be particularly useful for laboratories conducting the assays for the first time and will ensure some consistency across laboratories in assay performance; however, it is important to note that laboratories may not meet all of these performance criteria. During the pubertal assay validation programs, interlaboratory comparison studies were conducted. At the conclusion, none of the laboratories met all of the mean range and CV performance criteria for either the male or female pubertal assays. In the female pubertal assay, only one of three laboratories had mean values within the acceptable range for all end points, and none of the laboratories met acceptable CV values for all end points (laboratories met the CV criteria for 7 to 9 end points of the 11 end points measured in the female pubertal assay). In the male pubertal assay, none of the three laboratories had mean values within the acceptable range for all end points, and none met acceptable CV values for all end points (laboratories met the CV criteria for 11 to 13 end points of the 17 end points measured in the male pubertal assay). The TG states that if laboratories miss one or two performance criteria, it will not be regarded as fatal to the study. Evaluation of the acceptability of studies will depend on which performance criteria were missed and the magnitude of the deviations. As pubertal assay data are submitted, it is possible that the performance criteria may be further refined.
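A laboratory's control data can be compared against mean-range and CV criteria with a few lines of code. The sketch below is illustrative only: the acceptance range and maximum CV in the example are hypothetical placeholders, not the values published in the TGs, and the function names are assumptions.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV as a percentage: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def check_endpoint(values, mean_range, max_cv):
    """Compare one control end point against mean-range and CV criteria."""
    low, high = mean_range
    m = mean(values)
    cv = coefficient_of_variation(values)
    return {
        "mean": m,
        "cv": cv,
        "mean_ok": low <= m <= high,
        "cv_ok": cv <= max_cv,
    }


# Hypothetical control data: age at preputial separation (days).
control_pps_age = [40.0, 41.0, 42.0, 43.0, 44.0]
# Hypothetical criteria; consult the TG for the actual values.
print(check_endpoint(control_pps_age, mean_range=(39.0, 44.0), max_cv=5.0))
```

Running the check for every end point and counting the failures gives the kind of tally reported in the interlaboratory comparisons (e.g., CV criteria met for 7 to 9 of 11 end points).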

In conclusion, the male and female pubertal assays are sensitive assays that are capable of detecting multiple modes of action for endocrine-active compounds. However, there is still a question regarding assay specificity; therefore, results of these assays are best interpreted in a WoE context with other assays in the Tier 1 EDSP battery and previous toxicity data.

11.2.4 Enhanced OECD 407 28-Day Study

The OECD 407 repeated-dose 28-day oral toxicity study has been in use for many years to characterize toxicity in rodents following repeated exposure to a test compound. In 2008, the OECD revised the OECD 407 TG to incorporate endocrine-sensitive end points [77], thus allowing this study type to be included as an in vivo Tier 1 endocrine screening assay. This study examines endocrine-sensitive apical end points and, therefore, is capable of detecting a wide variety of endocrine-active materials, including estrogens/anti-estrogens, androgens/anti-androgens, agents that alter steroidogenesis, agents that affect the HPG axis, and agents that perturb thyroid function. Unlike the other EDSP screening assays, the Enhanced OECD 407 study can be used for hazard identification and/or risk assessment, if three dose levels plus a control are used.

Despite extensive experience with the 28-day study design, the Enhanced OECD 407 underwent an extensive validation effort; the OECD examined the relevance and practicability of the additional endocrine-sensitive end points, the potential interference of these new parameters with those required by the prior version of the TG, and intra- and interlaboratory reproducibility [77]. End points added to the 28-day study included endocrine-sensitive organ weights (thyroid optional), histopathological evaluation of endocrine-sensitive organs, and optional serum hormone measurements (T3, T4, and TSH). Overall, the assay shows good reproducibility both within and between laboratories and relatively good specificity. The Enhanced OECD 407 assay is included in the OECD endocrine screening program but is not included in EPA’s EDSP. A TG is available that describes the conduct of this assay [77]. The next discussion is focused primarily on the validation, conduct, and utility of the OECD 407 study for EDSP screening.

To conduct the OECD 407 toxicity study, the test substance is administered 7 days/week to young adult male and female rats (≥5/sex/dose) for 28 days via gavage, diet, or drinking water. Animals are <9 weeks old at the initiation of dosing and are assigned to three dose groups and a control group. Dose levels are selected so that the high dose induces some toxicity without death or severe suffering, up to the limit dose of 1,000 mg/kg/day. In-life end points include clinical observations, functional observational battery, body weights, and feed/water consumption. Urinalysis evaluation during the final week is optional. At the time of necropsy, blood samples are collected for hematology, clinical pathology, and optional thyroid hormone assessments. Organ weights are collected for the liver, kidneys, adrenals, testes, epididymides, prostate with seminal vesicles and coagulating glands, thymus, spleen, brain, and heart. Histopathological examination of these tissues is performed, plus gross lesions, brain, spinal cord, eye, stomach, intestines, thyroid, trachea, lungs, ovaries, uterus/cervix/vagina, urinary bladder, lymph nodes, peripheral nerve, skeletal muscle, and bone with bone marrow. An additional satellite group, which is exposed for 28 days and then maintained for ≥14 days without treatment, may be included to evaluate delayed toxicity or the reversibility of any effects noted during the 28-day dosing period. The 28-day study is designed to detect a broad variety of potential toxicities (e.g., reproductive, endocrine, immune, neuro-, etc.); therefore, additional studies may be needed to better characterize any toxicity observed.

As with the pubertal assays, numerous modes of endocrine action may be detected with the Enhanced OECD 407 study; however, the end points assessed are often apical, which may make it difficult to determine the precise mode of action by which a test compound alters endocrine end points. One advantage of the 407 study is the inclusion of more systemic toxicity end points, which allows greater opportunity to put altered endocrine end points in the context of systemic toxicity. The 407 TG cautions that the interpretation of endocrine end points may be problematic if there is systemic toxicity (e.g., reduced body weights, liver, heart, lung or kidney effects, etc.) or other changes that may not represent a toxic response (e.g., reduced food intake, liver enlargement).

While the Enhanced OECD 407 study can detect both moderate and strong endocrine-active agents, it is generally recognized that it does not have the sensitivity of some of the other EDSP screening assays because this study is not performed in developing animals (i.e., the life stage most sensitive to endocrine perturbations). Thus, the Enhanced OECD 407 may not identify all endocrine-active compounds, particularly those having weak interactions with ER and AR. The inclusion of systemic toxicity end points may limit the dose levels used, which further limits the opportunity to detect weak-acting endocrine compounds. Because of this decreased sensitivity, the Enhanced OECD 407 is not included as a screening assay in EPA’s EDSP. Thus, while this study contributes to the WoE for endocrine activity, negative results in this study alone are not sufficient to discount effects on the endocrine system.

The addition of new endocrine-sensitive end points to the OECD 407 study design has raised concern that sample sizes may be insufficient to ensure specificity and sensitivity for these end points. In particular, prostate weights and TH levels (if included) are recognized as variable end points [50,54,55,78,79]. To address this issue, some laboratories have increased sample sizes to 10 animals/sex/dose level.

During the validation of the enhanced 407 study, several parameters that lacked sufficient data for inclusion in the assay or with questionable ability to enhance the detection of endocrine-active chemicals were included as optional end points. These end points included determination of THs (T3, T4, and TSH), estrous cyclicity, and weights of the uterus, ovaries, and thyroid. Additional endocrine-sensitive tissues that may warrant histopathological evaluation include pituitary gland and male mammary glands. If THs are measured, samples should be collected at the same time each day to avoid diurnal variations in TH levels, and care must be taken to avoid stress to the extent possible as this may alter TH levels [80,81]. With numerous factors capable of altering TH measurements, OECD concluded that thyroid histopathology provides a more definitive identification of thyroid-active substances. Thus, the assessment of serum TH levels should occur only with a concurrent evaluation of thyroid histopathology.

11.2.5 15-Day Intact Male Assay

The 15-day intact male assay was developed by industry in support of product registration. This assay was designed to screen endocrine-active compounds while providing information on the mode of action for the test compound. The 15-day intact male assay evaluates reproductive, accessory sex gland (ASG), and thyroid weights, as well as hormone levels, in young male rats after 15 days of oral gavage dosing. The assay is designed to detect estrogen agonists/antagonists, androgen agonists/antagonists, progesterone agonists/antagonists, steroid biosynthesis inhibitors, thyroid-active compounds, prolactin modulators, and agents affecting the HPG/thyroid axes. This assay is included in the OECD framework for EDSP assays but is not included in EPA’s EDSP due to inconsistent results with weak anti-androgens.

As with other EDSP Tier 1 assays, the intact male assay has undergone extensive validation [82]. Twenty-nine endocrine-active chemicals, operating by several different modes of action, have been evaluated using the intact male assay. Overall, the assay has been both sensitive and specific. One of the strengths of the intact male assay is its ability to generate mode of action data (i.e., the assay can differentiate between receptor and non–receptor-mediated effects based on the profile of hormonal changes and organ weight effects). O’Connor et al. [83] used a series of positive control compounds operating via different modes of action to develop “fingerprints” for different endocrine modes of action. Thus, by comparing the effects on organ weights, histopathology, and serum hormone changes to these fingerprints, investigators gain information on a potential mode of action for the test compound [84]. These mode of action data can be used to tailor the design of Tier 2 tests to fully evaluate relevant end points.

To conduct the intact male assay (see Figure 11.4), adult male rats (15/dose group) are dosed daily for 15 days with the test compound from 70 to 85 days of age. Rats are necropsied approximately 2 hours after the last dose on test day 15. At necropsy, blood is collected for hormone measurements and organ weights are collected (i.e., liver, thyroid gland, testes, epididymides, prostate, seminal vesicles with fluid, and ASG unit, which is a combined weight of the prostate and the seminal vesicles with fluid). The testes, epididymides, and thyroid gland are examined histopathologically. Serum from terminal blood samples is used for the measurement of nine hormones: testosterone, estradiol, DHT, LH, FSH, prolactin, T3, T4, and TSH. As an optional end point, liver samples can be frozen in the event that future enzyme analyses are desirable. A positive assay involves significant changes in weights of the thyroid, reproductive organs, or ASGs, and/or histopathology changes; hormone data are used to provide information on the mode of action of the test compound. Hormone data alone are generally insufficient to be deemed positive in the assay, although follow-up work may be considered.

FIGURE 11.4 Schematic drawing of the 15-day intact male assay design (see the text for study design details). MTD = maximum tolerated dose; T = testosterone; E2 = estradiol; PRL = prolactin.

Initially, the reliance on serum hormone data was seen as a weakness for the 15-day intact male assay because hormones are not routinely measured in toxicology laboratories and are known to exhibit interanimal variability. To minimize variability, laboratories are advised to collect blood samples within a limited time frame using appropriate anesthesia and blood collection methods. The sensitivity of hormone measurements has been verified by power analyses using hormone data from the intact male assay. With 15 animals/dose group, there is a >99 percent chance of detecting a significant, 50 percent change in serum concentrations of each of the nine hormones measured [82]. If a treatment-related difference exists, there is a 92 to 100 percent chance of detecting a significant, 25 percent change in LH, FSH, TSH, T3, and T4 and a 61 to 83 percent chance of detecting a significant change for testosterone, DHT, estradiol, and prolactin. Furthermore, toxicology laboratories are gaining additional experience evaluating serum hormone levels, because these end points (i.e., testosterone, T4, and TSH) are required in the pubertal assays.
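The dependence of statistical power on group size, end point variability, and the size of the change can be sketched with a normal approximation for a two-group comparison. This is not the method used in the cited power analyses [82], and the CV in the example is a hypothetical value, so the output should not be expected to reproduce the percentages quoted above. The function name and parameters are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(pct_change, cv_pct, n_per_group, alpha=0.05):
    """Approximate two-sided power to detect a percent change from control.

    Assumes normally distributed data, equal group sizes, and the same
    standard deviation (cv_pct percent of the control mean) in both groups.
    """
    std_normal = NormalDist()
    effect = abs(pct_change) / cv_pct          # standardized difference
    shift = effect * sqrt(n_per_group / 2.0)   # expected test statistic
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical end point with a 30 percent CV and 15 animals per group.
for change in (25, 50):
    print(change, round(approx_power(change, cv_pct=30.0, n_per_group=15), 3))
```

Under these assumptions, power rises steeply with the size of the change and falls as the CV increases, which is the qualitative pattern behind the higher detection rates reported for the less variable hormones.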

As mentioned previously, there has been some inconsistency detecting weak anti-androgens with the intact male assay, specifically, p,p′-DDE (dichlorodiphenyldichloroethylene) and linuron. Difficulties detecting the anti-androgenic effects of p,p′-DDE were attributed to differences in sensitivity between the strains of rats used to evaluate its endocrine activity; that is, the intact male assay readily detected p,p′-DDE when conducted in Long-Evans rats, but not in Sprague-Dawley rats [76]. This strain difference is due to pharmacokinetic differences that make Sprague-Dawley rats less responsive to p,p′-DDE [85]. In the case of linuron, anti-androgenic effects (e.g., decreased epididymal, prostate, and ASG weights, and retained spermatids) were detected when dose levels produced a greater than 10 percent change in terminal body weight [83,86]. However, the MTD for the intact male assay is defined as the dose producing a 10 percent decrease in terminal body weight. Because dose levels of linuron exceeded the defined MTD, a pair-fed control group was included to differentiate endocrine-mediated effects from nonspecific, body weight–mediated effects.

The 15-day intact male assay is not conducted in growing animals, and therefore it is less sensitive to body weight–mediated changes in assay end points. Feed restriction studies conducted by O’Connor et al. [87,88] determined that body weight changes of <26 percent do not alter assay end points with the exception of thyroid hormone levels. The specificity of the 15-day intact male assay has been verified by testing a known hepatotoxicant, allyl alcohol [89]. At the MTD, allyl alcohol showed minimal effects on some serum hormone levels but did not alter thyroid, reproductive, or ASG weights or histopathology. Thus, allyl alcohol was negative in the 15-day intact male assay, a result that has been confirmed by other laboratories during interlaboratory validation, provided that the MTD is not exceeded [90].

The 15-day intact male assay has good sensitivity for the detection of thyroid-active compounds. Each thyroid-active compound evaluated in the intact male assay has been positively identified [87,91]. Furthermore, the comprehensive thyroid assessment (thyroid weight, histopathology, and serum T3, T4, and TSH levels) coupled with an evaluation of liver weights and optional liver enzyme analysis (uridine diphosphate-glucuronosyltransferase (UDPGT) and/or 5′-deiodinase) allows investigators to differentiate primary thyroid effects from effects due to enhanced TH clearance secondary to hepatic enzyme induction.

11.2.6 Using In Vitro Data in Conjunction with In Vivo Assays

The EDSP Tier 1 screening assays comprise a combination of in vitro and in vivo assays. There are several advantages to this approach because in vitro assays can improve the efficiency and specificity of the Tier 1 EDSP battery as well as provide additional information on mode of action for endocrine-active compounds. To improve efficiency, in vitro assays can be coupled with qualitative structure-activity relationship models to prioritize compounds for the EDSP based on in silico/in vitro results, the potency of positive responses, and human or wildlife exposure levels.

In vitro assays also contribute to the interpretability of Tier 1 screening assays. For example, if a compound exhibits ER binding in vitro yet does not elicit a uterotrophic response when administered to immature rats by the oral route, one possibility is that the compound is metabolized to an inactive compound. To test this, the compound could be administered by sc injection to avoid first-pass metabolism. Conversely, a negative or weakly positive ER binding assay coupled with a positive uterotrophic assay via oral administration may indicate that the test compound is metabolized to an estrogenic compound with higher potency (e.g., methoxychlor to 2,2-bis(p-hydroxyphenyl)-1,1,1-trichloroethane [HPTE]). Alternatively, with the male pubertal assay, delays in age at puberty onset coupled with decreases in reproductive/AST weights may indicate that a compound is an anti-androgen or a steroid biosynthesis inhibitor, or that the compound generated a nonspecific response due to systemic toxicity. Again, a positive or negative steroidogenesis or AR binding assay, particularly one that incorporates a metabolic activation system, may provide context for the pubertal assay findings.

Last, in vitro assays coupled with in vivo assays provide data for the WoE to determine whether compounds have potential endocrine activity and should proceed to Tier 2 testing. As described, the results of in vitro assays may facilitate interpretation of positive or negative in vivo assays to allow an accurate determination of endocrine potential. As shown in Table 11.1, an anti-estrogen operating via ER binding could produce positive results in the ER binding assay, uterotrophic assay, the pubertal female assay, and the fish short-term reproduction assay, whereas an anti-estrogenic response mediated via altered aromatase activity may generate positive results in the steroidogenesis assay, aromatase assay, the pubertal female assay, and the fish short-term reproduction assay. In these cases, the profile of responses across EDSP assays would offer assurance that the battery did not produce false positive results and that assays that rely on apical end points (e.g., the pubertal female assay) were positive due to an endocrine mode of action.
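The idea of comparing a compound's response profile with the pattern expected for a given mode of action can be sketched as follows. The two expected profiles are transcribed from the paragraph above; the short assay labels, the function name, and the use of Jaccard overlap as a similarity measure are assumptions made for illustration and are not an EDSP procedure.

```python
# Assays expected to be positive for two modes of action, as listed in the text.
EXPECTED_PROFILES = {
    "anti-estrogen via ER binding": {
        "ER binding", "uterotrophic", "pubertal female",
        "fish short-term reproduction",
    },
    "anti-estrogen via altered aromatase activity": {
        "steroidogenesis", "aromatase", "pubertal female",
        "fish short-term reproduction",
    },
}


def rank_modes(observed_positives):
    """Rank candidate modes by Jaccard overlap with the observed positive assays."""
    observed = set(observed_positives)
    scores = {}
    for mode, expected in EXPECTED_PROFILES.items():
        union = observed | expected
        # Jaccard index: shared positives divided by all positives in either set.
        scores[mode] = len(observed & expected) / len(union) if union else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A compound positive only in the steroidogenesis, aromatase, pubertal female, and fish assays would match the aromatase-mediated profile exactly, whereas an isolated positive in an apical assay would match neither profile well and would warrant closer scrutiny for systemic toxicity.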

Table 11.1 Modes of endocrine action detected by EDSP Tier 1 assays

11.3 TIER 2 TESTS

Using a WoE approach, compounds that are identified as positive after Tier 1 EDSP screening or, perhaps, compounds for which data already indicate potential endocrine activity would undergo Tier 2 testing. The current Tier 2 mammalian test is the two-generation reproductive toxicity study; the extended one-generation reproductive toxicity study is also included as an option in EDSP Tier 2.

The two-generation reproductive toxicity study has been used to evaluate chemical hazards for decades, but the TGs (OPPTS 870.3800 and OECD 416, “Reproduction and Fertility Effects”) underwent revision in 1998 (870.3800; [92]) and 2001 (OECD 416; [93]) to include more end points that are sensitive to reproductive and endocrine-mediated effects.

To conduct a two-generation reproductive toxicity study (see Figure 11.5), male and female rats (∼25/sex/dose group, P1 generation) are administered test compound for 10 weeks, typically via the diet (although other dosing scenarios, such as drinking water or inhalation, can be used if relevant). The 10-week prebreeding period was selected because it encompasses an entire spermatogenic cycle in the male rat. Typically, the study design includes three dose levels of the test compound and a control group. After the prebreeding exposure, male and female rats continue on test diet and are cohoused for a two-week mating period. Once there is evidence of mating (e.g., a sperm-positive vaginal lavage sample) or the two weeks have elapsed, the males and females are separated. P1 males continue on test diet until necropsied. P1 females also continue exposure throughout gestation and lactation, including during delivery and rearing of their offspring (F1 generation). Reproductive indices are determined for the P1 adults, and F1 litter size, pup growth, and development are monitored. Beginning at weaning, F1 pups are exposed to test compound for an additional 10-week period, then these F1/P2 adults are bred to produce a second generation of offspring (F2 generation). The second-generation mating and offspring are monitored in the same manner as the first generation (reproductive indices, litter sizes, pup growth, survival, etc.), and the study is concluded when the F2 offspring are weaned. End points that were added when the TGs were revised include estrous cycle evaluation, postimplantation loss, anogenital distance in F2 offspring (triggered), puberty onset in F1 offspring, weanling organ weights and histopathology, additional reproductive organ weights, expanded histopathology, sperm analysis (motility, morphology, and counts), and ovarian follicle counts. Interlaboratory control data for many multigeneration study end points have been published [79].
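The sequence of study phases described above can be laid out as a rough timeline. In the sketch below, the 10-week premating and two-week mating periods are taken from the text, whereas the three-week gestation and three-week lactation periods are assumed round figures for the rat; the phase labels and function name are hypothetical.

```python
# (phase label, duration in weeks); entries marked "assumed" are not from the text.
PHASES = [
    ("P1 premating exposure", 10),
    ("P1 mating", 2),
    ("P1 gestation (assumed)", 3),
    ("P1 lactation to F1 weaning (assumed)", 3),
    ("F1 premating exposure", 10),
    ("F1/P2 mating", 2),
    ("F1/P2 gestation (assumed)", 3),
    ("F1/P2 lactation to F2 weaning (assumed)", 3),
]


def schedule(phases):
    """Return (label, start_week, end_week) tuples with cumulative timing."""
    rows = []
    start = 0
    for label, weeks in phases:
        rows.append((label, start, start + weeks))
        start += weeks
    return rows


for label, start, end in schedule(PHASES):
    print(f"weeks {start:2d}-{end:2d}: {label}")
```

Under these assumptions the in-life phase runs to roughly 36 weeks when the full two-week mating period is used in both generations; actual durations depend on when mating occurs and on the specific TG followed.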

FIGURE 11.5 Schematic drawing of the two-generation reproductive toxicity study design (see the text for study design details). The two-generation study is a mammalian Tier 2 test suitable for risk assessment purposes.

Due to the extensive assessment of reproductive- and endocrine-sensitive end points, the two-generation study provides a reliable test design to characterize reproductive/endocrine hazards and determine the dose-response relationship of these hazards for risk assessment. However, in its current form, the two-generation reproduction study does not include specific end points to evaluate thyroid function. TH measurements, thyroid weight, and thyroid histopathology are not included in the standard study design but should be considered as additional end points if thyroid assessment is desirable.

In 2011, OECD approved a new TG (No. 443) for the evaluation of reproductive and endocrine-sensitive end points, the extended one-generation reproductive toxicity study (EOGRTS) [94]. The EOGRTS, originally described by Cooper et al. [95], is an integrated study design that not only evaluates reproductive and endocrine-sensitive end points but also examines developmental immunotoxicity, developmental neurotoxicity, and systemic toxicity. Several feasibility studies have been conducted by industry laboratories (vinclozolin by BASF [96]; methimazole by Bayer CropScience; and lead acetate by Syngenta), and an EOGRTS was conducted to fulfill regulatory requirements for 2,4-D in response to a data call-in by EPA and the Canadian Pest Management Regulatory Agency (PMRA) [97–99]. EPA recently indicated that the 2,4-D EOGRTS data were sufficient to replace required mammalian Tier 1 screening assays for EDSP screening.

To conduct an EOGRTS (see Figure 11.6), male and female rats (∼25/sex/dose group, P1 generation) are exposed to test compound for two to four weeks prior to breeding. Again, the study typically includes three dose levels and a control group, and dietary exposures are most common, although other dosing scenarios are possible. Dosing is continued throughout the study as P1 males and females are given a two-week mating period, followed by gestation and lactation in mated females. P1 males are necropsied after a 10-week exposure period to once again cover a full spermatogenic cycle. Because subsequent assessments of the male reproductive system (histopathology and sperm assessments) are generally more sensitive than mating for the detection of chemically induced perturbations [100,101], it was deemed unnecessary to expose males to test compound for 10 weeks prior to mating. P1 females are allowed to naturally deliver their F1 offspring. In this study, additional endocrine-sensitive end points are required, including thyroid assessments (hormones, weight, and/or histopathology in P1 adults and F1 offspring), F1 anogenital distance (not triggered as in the two-generation study), F1 nipple/areolae retention, and puberty onset evaluated in three pups/sex/litter as opposed to one pup/sex/litter in the two-generation study.

FIGURE 11.6 Schematic drawing of the extended one-generation reproductive toxicity study design (EOGRTS) (see the text for study design details). The EOGRTS also is accepted as a mammalian Tier 2 test suitable for risk assessment purposes. Numbers of animals examined and ages at necropsy are indicated for each F1 cohort. X = mating; AGD = anogenital distance; TDAR = T-cell dependent antibody response.

One advantage of this study design is that it makes better use of the F1 offspring, which allows the EOGRTS to use fewer animals than the two-generation study. This is accomplished by evaluating multiple systems for toxicity in three pups/sex/litter as opposed to examining growth, development, and reproductive toxicity in only one pup/sex/litter. Thus, not only are more offspring from each litter evaluated, but the quality of data generated from these offspring is enhanced. A large reduction in animal use is also achieved because a second generation is not routinely bred (unless certain trigger criteria are met). If separate studies were conducted (i.e., a two-generation study, developmental neurotoxicity study, and developmental immunotoxicity study), it is estimated that >3,000 additional animals would be needed over those used in the EOGRTS. Note that the EOGRTS includes a preliminary evaluation of developmental neurotoxicity. Depending on the results, a full developmental neurotoxicity study or assessment of other end points of concern may be required [95]. A retrospective analysis of 498 existing two-generation studies [102] concluded that the second-generation (F2) offspring very rarely impact risk assessment and/or chemical classification/hazard labeling. Furthermore, the EOGRTS allows more flexibility than the two-generation study to evaluate development and function of multiple systems (neurotoxicity, immunotoxicity, other organ systems) in animals exposed during critical windows of development from the in utero period into adulthood.

Most of the end points included in the EOGRTS are commonly used in laboratories currently conducting regulatory toxicity studies; however, the EOGRTS presents some challenges for the logistical management of F1 offspring assigned to multiple groups and undergoing different evaluations at different ages. This is particularly noticeable at the time of weaning, when some animals are sent to necropsy (PND 21–22), whereas others undergo collection of neuropathology samples (PND 21–22), and others are assigned to the various cohorts for assessment of reproductive/endocrine toxicity, developmental neurotoxicity, or developmental immunotoxicity. Furthermore, developmental neurotoxicity assessments begin on PND 24. This high activity period can span two weeks, as this is the permissible mating period for P1 adults; therefore, litters can be born over this same two-week period. In addition, laboratories must remain current with their data collection and analysis during the EOGRTS as this information is needed to determine whether production of a second generation (F2 offspring) is needed. Triggers for the EOGRTS have not been finalized but may include end points such as P1/F1 estrous cyclicity, F1 litter parameters, and F1 developmental landmarks (AGD, nipple/areolae retention, puberty onset). Last, with more end points assessed, the EOGRTS is more expensive to conduct than the two-generation study.
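The weaning-period workload described above can be made concrete with a short date calculation. The sketch below is illustrative only: the function names and example birth dates are hypothetical, and counting the day of birth as PND 0 is an assumption, since conventions differ between laboratories and guidelines.

```python
from datetime import date, timedelta


def f1_milestones(birth):
    """Return key F1 dates for one litter (day of birth counted as PND 0, an assumption)."""
    return {
        "PND 21": birth + timedelta(days=21),  # necropsy/neuropathology window opens
        "PND 22": birth + timedelta(days=22),  # necropsy/neuropathology window closes
        "PND 24": birth + timedelta(days=24),  # developmental neurotoxicity assessments begin
    }


def busy_period(births):
    """Return (first PND 21 date, last PND 24 date) across all litters."""
    milestones = [f1_milestones(b) for b in births]
    first = min(m["PND 21"] for m in milestones)
    last = max(m["PND 24"] for m in milestones)
    return first, last


# Hypothetical litters born at the two ends of a two-week window.
start, end = busy_period([date(2024, 3, 1), date(2024, 3, 14)])
print(start, end, (end - start).days)
```

For these example dates, weaning-related activities would run from 22 March to 7 April, which illustrates how a two-week spread in birth dates stretches the period of peak activity.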

11.4 HUMAN AND WILDLIFE RELEVANCE OF ESTROGEN, ANDROGEN, AND THYROID SCREENING ASSAYS

Many facets of the estrogen, androgen, and thyroid systems are conserved across species. For example, the HPG axis is basically conserved across vertebrate animals [103]. Processes for TH synthesis, its regulation, and release (hypothalamic-pituitary-thyroid axis) also appear to be conserved in humans and rodents. Furthermore, the amino acid sequences of the TH binding proteins show a high degree of sequence homology (between 70 and 90 percent) between humans and animals [104]. The EDSP is designed to take advantage of these similarities as the WoE assessments rely on in vitro, in vivo mammalian, and in vivo nonmammalian assays in fish and frogs to determine potential endocrine activity. Once deemed positive in Tier 1, the hazard of potential endocrine-active compounds will be confirmed and, if applicable, better characterized in Tier 2 endocrine testing, where data would be generated for risk assessment. As with Tier 1 assays, many end points in Tier 2 tests are apical and must be evaluated in the context of overall systemic toxicity to discern specific, endocrine-mediated effects.

Ultimately, the relevance of the EDSP screening assays will be tied to the outcome and interpretation of Tier 2 testing. How many of the compounds deemed positive in Tier 1 screening were verified to have an endocrine activity of concern in Tier 2 studies? How did this alter the risk assessment for these chemicals? In performing these risk assessments, regulators also must consider the complexity of the endocrine system, exposure versus potency, and species differences between rats and humans with respect to sensitivity to endocrine-active compounds. For example, while studies indicate that the anatomy and function of the thyroid gland are qualitatively similar in rats and humans, evidence indicates that humans have lower sensitivity to thyroid perturbations than rodent models due to differences in TH pharmacokinetics and metabolism as well as thyroid gland morphology, particularly with respect to enhanced thyroid hormone clearance [104–106]. Clearly, diethylstilbestrol (DES) has demonstrated that potent estrogens can cause adverse health effects in humans, particularly with perinatal exposures [107]. However, some researchers argue that there is a threshold for the adverse effects of DES and that the potency of DES makes it less comparable to environmental xenoestrogens [108], which have considerably lower binding affinity for the ER (1/1,000th to 1/1,000,000th the affinity of DES; [109]). Thus, the issue is whether humans are at risk for adverse health effects due to exposures to environmentally relevant concentrations of xenoestrogens. The complexity of this question is increased when one considers the susceptibility of sensitive life stages to environmental exposures of endocrine-active compounds.

In some ways, the expectation that the EDSP can accurately screen for endocrine-active agents affecting the estrogen, androgen, and thyroid systems is daunting. The endocrine system is complex, operating at the organ, tissue, cellular, and molecular levels. Estrogen signaling alone illustrates the challenge: it is influenced by plasma binding proteins, ER subtypes, selective ER modulators, orphan receptors, different coregulators, cross-talk between signaling pathways, and alternate modes of estrogen action such as non-receptor-mediated effects. This picture is further complicated by differences in endogenous background levels of estrogen, other environmental exposures to estrogenic compounds (e.g., phytoestrogens), life stage, and so on. Furthermore, the endocrine system responds to nonspecific perturbations in homeostasis, which can be secondary to systemic toxicity. Ultimately, time will determine whether the EDSP is suitable to adequately evaluate perturbations in the endocrine system or whether regulatory efforts have outpaced the science [110]. In any event, the EDSP is certain to drive a greater understanding of the endocrine system in both humans and wildlife.

11.5 POTENTIAL FUTURE ASSAYS FOR ENDOCRINE SCREENING

Given the current design of the EDSP, some assays seem poised for inclusion if the program is expanded, provided that assay validation is successful. The EDSP has two in vitro assays (steroidogenesis and aromatase assays) that provide information on the activity of aromatase, the enzyme that converts androgens to estrogens; however, there is no assay to evaluate the conversion of testosterone to DHT via 5α-reductase. The activity of this enzyme is evaluated only indirectly using apical end points in the in vivo mammalian screens. While the Tier 1 EDSP assesses both ER and AR binding, it lacks an assay to evaluate thyroid receptor binding. Addition of a thyroid receptor binding or thyroid transactivation assay could fill this gap [111]. In addition, the next generation of Tier 1 EDSP screening could mandate the incorporation of metabolic activation systems into the in vitro screening assays to improve assay predictiveness.

Alternatively, the Tier 1 EDSP could incorporate nonreceptor targets that may lead to alterations in estrogen, androgen, or thyroid function. Competitive binding assays could be used to assess the interactions of compounds with hormone transport proteins [24,112]. One critical protein for steroid transport is sex hormone-binding globulin. Rodents lack this transport protein, but it plays a central role in hormone transport, bioavailability, and metabolic clearance rate for sex steroids in humans [113]. Another transport protein of concern is thyroxine-binding globulin, as displacement of TH from this transport protein could result in enhanced hormone clearance [111].

If the EDSP expands to evaluate new endocrine activities beyond estrogen, androgen, and thyroid, a number of additional endocrine pathways could become targets for screening. With the increase in obesity and type 2 diabetes, one can envision some focus on altered insulin levels in the coming years. Furthermore, a number of nonreproductive hormones can affect the reproductive endocrine system. For example, glucocorticoids affect the pituitary-gonadal axis in both sexes in a dose-dependent manner such that normal levels are generally beneficial whereas excessive levels may be detrimental to reproductive function and fetal development.

With the advent of the ToxCast chemical screening program in the United States, hundreds of compounds can be screened quickly using high-throughput in vitro screening assays to assess pathway toxicity. Where available, the results of the in vitro assays can be compared with in vivo toxicity data contained in the Toxicity Reference Database to look for correlations between in vitro pathways and in vivo toxicological outcomes. Currently, the ToxCast database can be used to generate relative priority scores (Toxicological Priority Indices, ToxPi) for compounds in the ToxCast library, focusing on estrogen-, androgen-, and thyroid-related pathways [114]. These ToxPi values can be used to help prioritize compounds for endocrine screening. This approach could ensure that many of the most active chemicals enter the EDSP first and could limit the amount of animal testing that is required for compounds showing little or no endocrine activity. One limitation of this approach is the difficulty of adequately evaluating metabolites for endocrine activity. This can be partially addressed by the addition of metabolic activation systems to in vitro screening assays. In the future, there is an opportunity to identify target pathways for chemical interactions with the endocrine system. These identified pathways may serve as targets for future endocrine screening programs.
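The general idea of a relative priority score can be illustrated with a simplified weighted sum of normalized pathway activities. This is a sketch under stated assumptions and not the ToxPi implementation described in [114]: the function names, the min-max scaling, the equal weights, and all chemical names and activity values are hypothetical.

```python
def minmax(values):
    """Scale values to the 0-1 range; return all zeros if they are identical."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def priority_scores(activity, weights):
    """Rank chemicals by a weighted sum of min-max scaled domain activities.

    activity: {chemical: {domain: raw activity}} (hypothetical inputs)
    weights:  {domain: relative weight}
    """
    chemicals = list(activity)
    total_weight = sum(weights.values())
    scores = {chem: 0.0 for chem in chemicals}
    for domain, weight in weights.items():
        # Scale each domain across chemicals so that domains are comparable.
        scaled = minmax([activity[chem][domain] for chem in chemicals])
        for chem, value in zip(chemicals, scaled):
            scores[chem] += weight * value / total_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical activity values for three placeholder chemicals.
example = {
    "chem_A": {"estrogen": 0.9, "androgen": 0.1, "thyroid": 0.0},
    "chem_B": {"estrogen": 0.2, "androgen": 0.4, "thyroid": 0.1},
    "chem_C": {"estrogen": 0.0, "androgen": 0.0, "thyroid": 0.0},
}
ranking = priority_scores(example, {"estrogen": 1, "androgen": 1, "thyroid": 1})
```

Because each domain is scaled relative to the other chemicals in the set, the resulting scores are only meaningful for ranking within that set, which is consistent with their use for prioritization rather than hazard characterization.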

REFERENCES

1. Colborn, T., Dumanoski, D., Myers, J. P. (1996). Our Stolen Future. Are We Threatening Our Fertility, Intelligence and Survival?—A Scientific Detective Story. Penguin Books, New York.

2. Gelbke, H. P., Kayser, M., Poole, A. (2004). OECD test strategies and methods for endocrine disruptors. Toxicology 205: 17–25.

3. Clode, S. A. (2006). Assessment of in vivo assays for endocrine disruption. Best Practice & Research Clinical Endocrinology & Metabolism 20: 35–43.

4. Borgert, C. J., Mihaich, E. M., Quill, T. F., Marty, M. S., Levine, S. L., Becker, R. A. (2011). Evaluation of EPA’s tier 1 endocrine screening battery and recommendations for improving the interpretation of screening results. Regulatory Toxicology and Pharmacology 59: 397–411.

5. O’Connor, J. C., Cook, J. C., Craven, S. C., Van Pelt, C. S., Obourn, J. D. (1996). An in vivo battery for identifying endocrine modulators that are estrogenic or dopamine regulators. Fundamental and Applied Toxicology 33: 182–195.

6. Reel, J. R., Lamb, J. C. IV, Neal, B. H. (1996). Survey and assessment of mammalian estrogen biological assays for hazard characterization. Fundamental and Applied Toxicology 34: 288–305.

7. Owens, J. W., Ashby, J. (2002). Critical review and evaluation of the uterotrophic bioassay for the identification of possible estrogen agonists and antagonists: in support of the validation of the OECD uterotrophic protocols for the laboratory rodent. Critical Reviews in Toxicology 32: 445–520.

8. Organisation for Economic Cooperation and Development. (2007). OECD Test Guideline 440 Uterotrophic Bioassay in Rodents: A short-term screening test for oestrogenic properties. Available at: http://titania.sourceoecd.org/vl=782802/cl=17/nw=1/rpsv/ij/oecdjournals/1607310x/v1n4/s34/p1.

9. U.S. Environmental Protection Agency. (2009). Series 890-Endocrine Disruptor Screening Test OPPTS 890.1600: Uterotrophic Assay. Available at: www.regulations.gov/#!documentDetail;D=EPA-HQ-OPPT-2009-0576-0012.

10. Kanno, J., Onyon, L., Haseman, J., Fenner-Crisp, P., Ashby, J., Owens, W. (2001). The OECD program to validate the rat uterotrophic bioassay to screen compounds for in vivo estrogenic responses: Phase 1. Environmental Health Perspectives 109: 785–794.

11. Kanno, J., Onyon, L., Peddada, S., Ashby, J., Jacob, E., Owens, W. (2003). The OECD program to validate the rat uterotrophic bioassay. Phase 2: Coded single-dose studies. Environmental Health Perspectives 111: 1550–1558.

12. Kim, H. S., Kang, T. S., Kang, I. H., Kim, T. S., Moon, H. J., Kim, I. Y., Ki, H., Park, K. L., Lee, B. M., Yoo, S. D., Han, S. Y. (2005). Validation study of OECD rodent uterotrophic assay for the assessment of estrogenic activity in Sprague-Dawley immature female rats. Journal of Toxicology and Environmental Health, Part A 68: 2249–2262.

13. Owens, W., Koëter, H. B. (2003). The OECD program to validate the rat uterotrophic bioassay: An overview. Environmental Health Perspectives 111: 1527–1529.

14. Owens, W., Ashby, J., Odum, J., Onyon, L. (2003). The OECD program to validate the rat uterotrophic bioassay. Phase 2: Dietary phytoestrogen analyses. Environmental Health Perspectives 111: 1559–1567.

15. Yamasaki, K., Takeyoshi, M., Sawaki, M., Imatanaka, N., Shinoda, K., Takatsuki, M. (2003). Immature rat uterotrophic assay of 18 chemicals and Hershberger assay of 30 chemicals. Toxicology 183: 93–115.

16. Organisation for Economic Cooperation and Development. (2004). Detailed background review of the uterotrophic bioassay. Available at: www.oecd-ilibrary.org/environment/detailed-background-review-of-the-uterotrophic-bioassay_9789264078857-en.

17. Thigpen, J. E., Li, L.-A., Richter, C. B., Lebetkin, E. H., Jameson, C. W. (1987). The mouse assay for the detection of estrogenic activity in rodent diets: I. A standardized method for conducting the mouse assay. Laboratory Animal Science 37: 596–601.

18. Ashby, J., Odum, J., Foster, J. R. (1997). Activity of raloxifene in immature and ovariectomized rat uterotrophic bioassays. Regulatory Toxicology and Pharmacology 25: 226–231.

19. Price, D., Ortiz, E. (1944). The relation of age to reactivity in the reproductive system of the rat. Endocrinology 34: 215–239.

20. Katzenellenbogen, B. S., Greger, N. G. (1974). Ontogeny of uterine responsiveness to estrogen during early development in the rat. Molecular and Cellular Endocrinology 2: 31–42.

21. Branham, W. S., Sheehan, D. M., Zehr, D. R., Ridlon, E., Nelson, C. J. (1985). The postnatal ontogeny of rat uterine glands and age-related effects of 17β-estradiol. Endocrinology 117: 2229–2237.

22. Sheehan, D. M., Branham, W. S., Gutierrez-Cernosek, R., Cernosek, S. F. Jr. (1984). Effects of continuous estradiol administration of polydimethylsiloxane and paraffin implants on serum hormone levels and uterine responses. Journal of the American College of Toxicology 3: 303–316.

23. Langston, W. C., Robinson, B. L. (1935). Castration atrophy. A chronological study of uterine changes following bilateral ovariectomy in the albino rat. Endocrinology 19: 51–62.

24. Zacharewski, T. R., Meek, M. D., Clemons, J. H., Wu, Z. F., Fielden, M. R., Matthews, J. B. (1998). Examination of the in vitro and in vivo estrogenic activities of eight commercial phthalate esters. Toxicological Sciences 46: 282–293.

25. Kanno, J., Onyon, L., Peddada, S., Ashby, J., Jacob, E., Owens, W. (2003). The OECD program to validate the rat uterotrophic bioassay. Phase 2: Dose-response studies. Environmental Health Perspectives 111: 1530–1549.

26. Lerner, L. J., Holthaus, F. J., Thompson, C. R. (1958). A non-steroidal oestrogen antagonist 1-(p-2-diethylaminoethoxyphenyl)-1-phenyl-2-p-methoxyphenylethanol. Endocrinology 63: 295–318.

27. Gray, L. E. Jr., Kelce, W. R., Wiese, T., Tyl, R., Gaido, K., Cook, J., Klinefelter, G., Desaulniers, D., Wilson, E., Zacharewski, T., Waller, C., Foster, P., Laskey, J., Reel, J., Giesy, J., Laws, S., McLachlan, J., Breslin, W., Cooper, R., Di Giulio, R., Johnson, R., Purdy, R., Mihaich, E., Safe, S., Sonnenschein, C., Welshons, W., Miller, R., McMaster, S., Colborn, T., et al. (1997). Endocrine screening methods workshop report: Detection of estrogenic and androgenic hormonal and antihormonal activity for chemicals that act via receptor or steroidogenic enzyme mechanisms. Reproductive Toxicology 11: 719–750.

28. Laws, S. C., Carey, S. A., Ferrell, J. M., Bodman, G. J., Cooper, R. L. (2000). Estrogenic activity of octylphenol, nonylphenol, bisphenol A and methoxychlor in rats. Toxicological Sciences 54: 154–167.

29. Coldham, N. G., Dave, M., Sivapathasundaram, S., McDonnell, D. P., Connor, C., Sauer, M. J. (1997). Evaluation of a recombinant yeast cell estrogen screening assay. Environmental Health Perspectives 105: 734–742.

30. Odum, J., Lefevre, P. A., Tittensor, S., Paton, D., Routledge, E. J., Beresford, N. A., Sumpter, J. P., Ashby, J. (1997). The rodent uterotrophic assay: critical protocol features, studies with nonyl phenols, and comparison with a yeast estrogenicity assay. Regulatory Toxicology and Pharmacology 25: 176–188.

31. Organisation for Economic Cooperation and Development. (2010). Guidance Document Uterotrophic Bioassay—Procedure to Test for Antioestrogenicity. Available from: www.oecd.org/dataoecd/38/19/37773994.pdf.

32. Welch, R. M., Levin, W., Conney, A. H. (1967). Insecticide inhibition and stimulation of steroid hydroxylases in rat liver. Journal of Pharmacology and Experimental Therapeutics 155: 167–173.

33. Welch, R. M., Levin, W., Kuntzman, R., Jacobson, M., Conney, A. H. (1971). Effect of halogenated hydrocarbon insecticides on the metabolism and uterotropic action of estrogens in rats and mice. Toxicology and Applied Pharmacology 19: 234–246.

34. Levin, W., Welch, R. M., Conney, A. H. (1967). Effect of chronic phenobarbital treatment on the liver microsomal metabolism and uterotropic action of 17β-estradiol. Endocrinology 80: 135–140.

35. Levin, W., Welch, R. M., Conney, A. H. (1967). Effect of phenobarbital and other drugs on the metabolism and uterotropic action of estradiol-17β and estrone. Journal of Pharmacology and Experimental Therapeutics 159: 362–371.

36. Ashby, J., Lefevre, P. A. (2000). Preliminary evaluation of the major protocol variables for the Hershberger castrated male rat assay for the detection of androgens, antiandrogens, and metabolic modulators. Regulatory Toxicology and Pharmacology 31: 92–105.

37. Freyberger, A., Ellinger-Ziegelbauer, H., Krötlinger, F. (2007). Evaluation of the rodent Hershberger bioassay: Testing of coded chemicals and supplementary molecular-biological and biochemical investigations. Toxicology 239: 77–88.

38. Owens, W., Zeiger, E., Walker, M., Ashby, J., Onyon, L., Gray, L. E. (2006). The OECD program to validate the rat Hershberger bioassay to screen compounds for in vivo androgen and antiandrogen responses. Phase 1: Use of a potent agonist and a potent antagonist to test the standardized protocol. Environmental Health Perspectives 114: 1259–1265.

39. Owens, W., Gray, L. E., Zeiger, E., Walker, M., Yamasaki, K., Ashby, J., Jacob, E. (2007). The OECD program to validate the rat Hershberger bioassay to screen compounds for in vivo androgen and antiandrogen responses: Phase 2 dose-response studies. Environmental Health Perspectives 115: 671–678.

40. Organisation for Economic Cooperation and Development. (2009). OECD Test Guideline 441 Hershberger Bioassay in Rats: A Short-term Screening Assay for (Anti) Androgenic Properties. Available at: http://titania.sourceoecd.org/vl=23730795/cl=21/nw=1/rpsv/ij/oecdjournals/1607310x/v1n4/s56/p1.

41. U.S. Environmental Protection Agency. (2009). Series 890-Endocrine Disruptor Screening Test OPPTS 890.1400: Hershberger Bioassay. Available at: www.regulations.gov/#!documentDetail;D=EPA-HQ-OPPT-2009-0576-0008.

42. Moon, H. J., Kang, T. S., Kim, T. S., Kang, I. H., Ki, H. Y., Kim, S. H., Han, S. Y. (2009). OECD validation of phase 3 Hershberger assay in Korea using surgically castrated male rats with coded chemicals. Journal of Applied Toxicology 29: 350–355.

43. Shin, J. H., Moon, H. J., Kang, I. H., Kim, T. S., Lee, S. J., Ahn, J. Y., Bae, H., Jeung, E. B., Han, S. Y. (2007). OECD validation of the rodent Hershberger assay using three reference chemicals; 17alpha-methyltestosterone, procymidone, and p,p′-DDE. Archives of Toxicology 81: 309–318.

44. Yamasaki, K., Sawaki, M., Ohta, R., Okuda, H., Katayama, S., Yamada, T., Ohta, T., Kosaka, T., Owens, W. (2003). OECD validation of the Hershberger assay in Japan: Phase 2 dose response of methyltestosterone, vinclozolin, and p,p′-DDE. Environmental Health Perspectives 111: 1912–1919.

45. Yamasaki, K., Ohta, R., Okuda, H. (2006). OECD validation of the Hershberger assay in Japan: Phase 3. Blind study using coded chemicals. Toxicology Letters 163: 121–129.

46. Marty, M. S., Johnson, K. A., Carney, E. W. (2003). Effect of feed restriction on Hershberger and pubertal male assay endpoints. Birth Defects Research (Part B) 68: 363–374.

47. Yamada, T., Kunimatsu, T., Miyata, K., Yabushita, S., Sukata, T., Kawamura, S., Seki, T., Okuno, Y., Mikami, N. (2004). Enhanced rat Hershberger assay appears reliable for detection of not only (anti-) androgenic chemicals but also thyroid hormone modulators. Toxicological Sciences 79: 64–74.

48. Noda, S., Muro, T., Takakura, S., Sakamoto, S., Takatsuki, M., Yamasaki, K., Tateyama, S., Yamaguchi, R. (2005). Ability of the Hershberger assay protocol to detect thyroid function. Archives of Toxicology 79: 627–635.

49. Chowdury, A., Gautam, A. K., Chatterjee, B. B. (1984). Thyroid-testis interrelationship during development and sexual maturity. Archives of Andrology 13: 233–239.

50. Ashby, J., Tinwell, H., Odum, J., Lefevre, P. (2004). Natural variability and the influence of concurrent control values on the detection and interpretation of low-dose or weak endocrine toxicities. Environmental Health Perspectives 112: 847–853.

51. Clark, R. L. (1999). Endpoints of reproductive system development. In: Daston, G., Kimmel, C. (eds.), An Evaluation and Interpretation of Reproductive Endpoints for Human Risk Assessment. International Life Sciences Institute, Health and Environmental Science Institute, Washington, DC. pp. 27–62.

52. Kennel, P. F., Pallen, C. T., Bars, R. G. (2004). Evaluation of the rodent Hershberger assay using three reference endocrine disrupters (androgen and antiandrogens). Reproductive Toxicology 18: 63–73.

53. Freyberger, A., Schladt, L. (2009). Evaluation of the rodent Hershberger bioassay on intact juvenile males—Testing of coded chemicals and supplementary biochemical investigations. Toxicology 262: 114–120.

54. U.S. Environmental Protection Agency. (2009). Series 890-Endocrine Disruptor Screening Test Number OPPTS 890.1450: Pubertal Development and Thyroid Function in Intact Juvenile/Peripubertal Female Rats. Available from: www.regulations.gov/#!documentDetail;D=EPA-HQ-OPPT-2009-0576-0009.

55. U.S. Environmental Protection Agency. (2009). Series 890-Endocrine Disruptor Screening Test OPPTS 890.1500: Pubertal Development and Thyroid Function in Intact Juvenile/Peripubertal Male Rats. Available from: www.regulations.gov/#!documentDetail;D=EPA-HQ-OPPT-2009-0576-0010.

56. U.S. Environmental Protection Agency. (2007). Integrated summary report for validation of a test method for assessment of pubertal developmental and thyroid function in juvenile female rats as a potential screen in the Endocrine Disruptor Screening Program Tier-1 Battery. Available from: www.epa.gov/endo/pubs/female_isr_v4.1c.pdf.

57. U.S. Environmental Protection Agency. (2007). Integrated summary report for validation of a test method for assessment of pubertal development and thyroid function in juvenile male rats as a potential screen in the Endocrine Disruptor Screening Program Tier-1 Battery. Available from: www.epa.gov/scipoly/oscpendo/pubs/male_pubertal_isr.pdf.

58. Cicero, T. J., Adams, M. L., O’Connor, L., Nock, B., Meyer, E. R., Wozniak, D. (1990). Influence of chronic alcohol administration on representative indices of puberty and sexual maturation in male rats and the development of their progeny. Journal of Pharmacology and Experimental Therapeutics 255: 707–715.

59. Cicero, T. J., Adams, M. L., Giordano, A., Miller, B. T., O’Connor, L., Nock, B. (1991). Influence of morphine exposure during adolescence on the sexual maturation of male rats and the development of their offspring. Journal of Pharmacology and Experimental Therapeutics 256: 1086–1093.

60. Collu, R., Letarte, J., Leboeuf, F., Ducharme, J. R. (1975). Endocrine effects of chronic administration of psychoactive drugs to prepubertal male rats. I. Delta9-tetrahydrocannabinol. Life Sciences 16: 533–542.

61. Zipf, W. B., Payne, S. H., Kelch, R. P. (1978). Prolactin, growth hormone, and luteinizing hormone in the maintenance of testicular luteinizing hormone receptors. Endocrinology 103: 595–600.

62. Lang, U., Rivest, R. W., Schlaepfer, L. V., Bradtke, J. C., Aubert, M. L., Sizonenko, P. C. (1984). Diurnal rhythm of melatonin action on sexual maturation. Neuroendocrinology 38: 261–268.

63. Frisch, R. E., Hegsted, D. M., Yoshinaga, K. (1975). Body weight and food intake at early estrus of rats on a high-fat diet. Proceedings of the National Academy of Sciences USA 72: 4172–4176.

64. Odum, J., Tinwell, H., Jones, K., VanMiller, J. P., Joiner, R. L., Tobin, G., Kawasaki, H., Ashby, J. (2001). Effect of rodent diets on the sexual development of the rat. Toxicological Sciences 61: 115–127.

65. Smith, S. S., Neuringer, M., Ojeda, S. R. (1989). Essential fatty acid deficiency delays the onset of puberty in the female rat. Endocrinology 125: 1650–1659.

66. Grota, L. J. (1971). Effects of age and experience on plasma testosterone. Neuroendocrinology 8: 136–143.

67. Matysek, M. (1989). Studies on the effect of stress on the estrus cycle in rats. Annales Universitatis Mariae Curie Skłodowska Sectio D Medicina 44: 143–149.

68. Roozendaal, M. M., Swarts, H. J., Wiegant, V. M., Mattheij, J. A. (1995). Effect of restraint stress on the preovulatory luteinizing hormone profile and ovulation in the rat. European Journal of Endocrinology 133: 347–353.

69. Marty, M. S., Crissman, J. W., Carney, E. W. (1999). Evaluation of the EDSTAC female pubertal assay in CD rats using 17β-estradiol, steroid biosynthesis inhibitors, and a thyroid inhibitor. Toxicological Sciences 52: 269–277.

70. Stoker, T. E., Zorrilla, L. M. (2010). The effects of endocrine disrupting chemicals on pubertal development in the rat: use of the EDSP pubertal assays as a screen. In: Eldridge, J. C., Stevens, J. T. (eds.), Endocrine Toxicology. Informa Healthcare, New York. pp. 27–81.

71. Laws, S. C., Stoker, T. E., Ferrell, J. M., Hotchkiss, M. G., Cooper, R. L. (2007). Effects of altered food intake during pubertal development in male and female Wistar rats. Toxicological Sciences 100: 194–202.

72. Ashby, J., Lefevre, P. A. (2000). The peripubertal male rat assay as an alternative to the Hershberger castrated male rat assay for the detection of anti-androgens, oestrogens and metabolic modulators. Journal of Applied Toxicology 20: 35–47.

73. Stoker, T. E., Laws, S. C., Guidici, D. L., Cooper, R. L. (2000). The effect of atrazine on puberty in male Wistar rats: An evaluation in the protocol for the assessment of pubertal development and thyroid function. Toxicological Sciences 58: 50–59.

74. Carney, E. W., Zablotny, C. L., Marty, M. S., Crissman, J., Anderson, P., Woolhiser, M., Holsapple, M. (2004). The effects of feed restriction during in utero and postnatal development in CD rats. Toxicological Sciences 82: 237–249.

75. Laws, S. C., Ferrell, J. M., Stoker, T. E., Schmid, J., Cooper, R. L. (2000). The effects of atrazine on female Wistar rats: An evaluation of the protocol for assessing pubertal development and thyroid function. Toxicological Sciences 58: 366–376.

76. O’Connor, J. C., Frame, S. R., Davis, L. G., Cook, J. C. (1999). Detection of the environmental antiandrogen p,p′-DDE in CD and Long-Evans rats using a Tier 1 screening battery and a Hershberger assay. Toxicological Sciences 51: 44–53.

77. Organisation for Economic Cooperation and Development. (2008). OECD Test Guideline 407 Repeated dose 28-day oral toxicity study in rodents. Available from: http://puck.sourceoecd.org/vl=34075725/cl=27/nw=1/rpsv/ij/oecdjournals/1607310x/v1n4/s7/p1.

78. Elswick, B. A., Welsch, F., Janszen, D. B. (2000). Effect of different sampling designs on outcome of endocrine disruptor studies. Reproductive Toxicology 14: 359–367.

79. Marty, M. S., Allen, B., Chapin, R. E., Cooper, R., Daston, G. P., Flaws, J. A., Foster, P. M. D., Makris, S. L., Mylchreest, E., Sandler, D., Tyl, R. W. (2009). Interlaboratory Control Data for Reproductive Endpoints Required in the OPPTS 870.3800/OECD 416 Reproduction and Fertility Test. Birth Defects Research (Part B) 86: 470–489.

80. Döhler, K.-D., Wong, C. C., Gaudssuhn, D., von zur Mühlen, A., Gärtner, K., Döhler, U. (1978). Site of blood sampling in rats as a possible source of error in hormone determinations. Journal of Endocrinology 79: 141–142.

81. Döhler, K.-D., Wong, C. C., von zur Mühlen, A. (1979). The rat as model for the study of drug effects on thyroid function: Considerations of methodological problems. Pharmacology & Therapeutics 5: 305–318.

82. U.S. Environmental Protection Agency. (2007). Integrated summary report for validation of 15-day Intact Adult Male Rat Assay as a potential screen in the Endocrine Disruptor Screening Program Tier-1 Battery. Available from: www.epa.gov/scipoly/oscpendo/pubs/isr_adultmalerat.pdf.

83. O’Connor, J. C., Frame, S. R., Ladics, G. S. (2002). Evaluation of a 15-day screening assay using intact male rats for identifying anti-androgens. Toxicological Sciences 69: 92–108.

84. O’Connor, J. C., Cook, J. C., Marty, M. S., Davis, L. G., Kaplan, A. M., Carney, E. W. (2002). Evaluation of Tier 1 screening approaches for detecting endocrine-active compounds (EACs). Critical Reviews in Toxicology 32: 521–549.

85. You, L., Casanova, M., Archibeque-Engle, S., Sar, M., Fan, L.-Q., Heck, H. d’A. (1998). Impaired male sexual development in perinatal Sprague-Dawley and Long-Evans hooded rats exposed in utero and lactationally to p,p′-DDE. Toxicological Sciences 45: 162–173.

86. Cook, J. C., Mullin, L. S., Frame, S. R., Biegel, L. B. (1993). Investigation of a mechanism for Leydig cell tumorigenesis by linuron in rats. Toxicology and Applied Pharmacology 119: 195–204.

87. O’Connor, J. C., Frame, S. R., Cook, J. C. (1999). Detection of thyroid toxicants in a Tier 1 screening battery and alterations in thyroid endpoints over 28 days of exposure. Toxicological Sciences 51: 54–70.

88. O’Connor, J. C., Davis, L. G., Frame, S. R., Cook, J. C. (2000). Evaluation of a Tier 1 screening battery for detecting endocrine-active compounds (EACs) using the positive controls testosterone, coumestrol, progesterone, and RU486. Toxicological Sciences 54: 338–354.

89. O’Connor, J. C., Marty, M. S., Becker, R. A., Snajdr, S., Kaplan, A. M. (2008). Results of the negative control chemical allyl alcohol in the 15-day intact adult male rat screening assay for endocrine activity. Birth Defects Research (Part B) 83: 117–122.

90. Becker, R. A., Bergfelt, D. R., Borghoff, S., Davis, J. P., Hamby, B. T., O’Connor, J. C., Kaplan, A. M., Sloan, C. S., Tyl, R. W., Wade, M., Marty, M. S. (2012). Interlaboratory study comparison of the 15-day intact adult male rat screening assay: Evaluation of an antithyroid chemical and a negative control chemical. Birth Defects Research (Part B) 95(2): 95–193.

91. O’Connor, J. C., Frame, S. R., Ladics, G. S. (2002). Evaluation of a 15-day screening assay using intact male rats for identifying steroid biosynthesis inhibitors and thyroid modulators. Toxicological Sciences 69: 79–91.

92. U.S. Environmental Protection Agency. (1998). Series 870-Health Effects Test Guidelines OPPTS 870.3800: Reproduction and Fertility Effects. Available from: www.regulations.gov/#!documentDetail;D=EPA-HQ-OPPT-2009-0156-0018.

93. Organisation for Economic Cooperation and Development. (2001). OECD Test Guideline 416: Two Generation Reproduction Toxicity Study. Available from: www.oecd-ilibrary.org/environment/test-no-416-two-generation-reproduction-toxicity_9789264070868-en.

94. Organisation for Economic Cooperation and Development. (2011). OECD Test Guideline 443 Extended One-generation Reproductive Toxicity Study. Available from: www.oecd-ilibrary.org/environment/test-no-443-extended-one-generation-reproductive-toxicity-study_9789264122550-en.

95. Cooper, R. L., Lamb, J. C., Barlow, S. M., Bentley, K., Brady, A. M., Doerrer, N. G., Eisenbrandt, D. L., Fenner-Crisp, P. A., Hines, R. N., Irvine, L. F. H., Kimmel, C. A., Koeter, H., Li, A. A., Makris, S. L., Sheets, L. P., Speijers, G. J. A., Whitby, K. E. (2006). A tiered approach to life stages testing for agricultural chemical safety assessment. Critical Reviews in Toxicology 36: 69–98.

96. Schneider, S., Kaufmann, W., Strauss, V., van Ravenzwaay, B. (2011). Vinclozolin: A feasibility and sensitivity study of the ILSI-HESI F1-extended one-generation rat reproduction protocol. Regulatory Toxicology and Pharmacology 59: 91–100.

97. Andrus, A. K., Woolhiser, M., Boverhof, D., Bus, J. S., Neal, B. H., Marty, M. S. (2010). 2,4-Dichlorophenoxyacetic Acid (2,4-D): Evaluation of developmental neurotoxicity (DNT) and developmental immunotoxicity (DIT) in a dietary extended one-generation study in Crl:CD(SD) rats. Abstract No. 421. Itinerary Planner. Society of Toxicology, Salt Lake City, Utah.

98. Bus, J. S., Neal, B. H., Zablotny, C. L., Yano, B. L., Saghir, S., Marty, M. S. (2010). 2,4-Dichlorophenoxyacetic Acid (2,4-D): Evaluation of systemic toxicity in a dietary extended one-generation study in Crl:CD(SD) rats. Abstract No. 419. Itinerary Planner. Society of Toxicology, Salt Lake City, Utah.

99. Neal, B., Bus, J. S., Zablotny, C. L., Yano, B. L., Passage, J., Marty, M. S. (2010). 2,4-Dichlorophenoxyacetic Acid (2,4-D): Evaluation of reproductive/endocrine endpoints in a dietary extended one-generation study in Crl:CD(SD) rats. Abstract No. 420. Itinerary Planner. Society of Toxicology, Salt Lake City, Utah.

100. Mangelsdorf, I., Buschmann, J., Orthen, B. (2003). Some aspects relating to the evaluation of the effects of chemicals on male fertility. Regulatory Toxicology and Pharmacology 37: 356–369.

101. Ulbrich, B., Palmer, A. K. (1995). Detection of effects on male reproduction – a literature survey. Journal of the American College of Toxicology 14: 293–327.

102. Piersma, A. H., Rorije, E., Beekhuijzen, M. E., Cooper, R., Dix, D. J., Heinrich-Hirsch, B., Martin, M. T., Mendez, E., Muller, A., Paparella, M., Ramsingh, D., Reaves, E., Ridgway, P., Schenk, E., Stachiw, L., Ulbrich, B., Hakkert, B. C. (2010). Combined retrospective analysis of 498 rat multi-generation reproductive toxicity studies: On the impact of parameters related to F1 mating and F2 offspring. Reproductive Toxicology 31(4): 392–401.

103. Ankley, G. T., Johnson, R. D. (2004). Small fish models for identifying and assessing the effects of endocrine-disrupting chemicals. ILAR Journal 45: 469–483.

104. Choksi, N. Y., Jahnke, G. D., St. Hilaire, C., Shelby, M. (2003). Role of thyroid hormones in human and laboratory animal reproductive health. Birth Defects Research (Part B) 68: 479–491.

105. McClain, R. M. (1995). Mechanistic considerations for the relevance of animal data on thyroid neoplasia to human risk assessment. Mutation Research 333: 131–142.

106. Jahnke, G. D., Choksi, N. Y., Moore, J. A., Shelby, M. D. (2004). Thyroid toxicants: assessing reproductive health effects. Environmental Health Perspectives 112: 363–368.

107. Newbold, R. R. (2004). Lessons learned from perinatal exposure to diethylstilbestrol. Toxicology and Applied Pharmacology 199: 142–150.

108. Golden, R. J., Noller, K. L., Titus-Ernstoff, L., Kaufman, R. H., Mittendorf, R., Hatch, E. E., Stillman, R., Reese, E. A. (1998). Environmental endocrine modulators and human health: An assessment of the biological evidence. Critical Reviews in Toxicology 28: 109–227.

109. Witorsch, R. J. (2002). Endocrine disruptors: Can biological effects and environmental risks be predicted? Regulatory Toxicology and Pharmacology 36: 118–130.

110. Marty, M. S., Carney, E. W., Rowlands, J. C. (2011). Endocrine disruption: Historical perspectives and its impact on the future of toxicology testing. Toxicological Sciences 120(S1): S93–S108.

111. Zoeller, R. T., Tan, S. W. (2007). Implications of research on assays to characterize thyroid toxicants. Critical Reviews in Toxicology 37: 195–210.

112. Zacharewski, T. (1998). Identification and assessment of endocrine disruptors: limitations of in vivo and in vitro assays. Environmental Health Perspectives 106(Suppl. 2): 577–582.

113. Rosner, W., Hryb, D. J., Khan, M. S., Nakhla, A. M., Romas, N. A. (1992). Sex-hormone binding globulin binding to cell membranes and generation of a second messenger. Journal of Andrology 13: 101–106.

114. Reif, D. M., Martin, M. T., Tan, S. W., Houck, K. A., Judson, R. S., Richard, A. M., Knudsen, T. B., Dix, D. J., Kavlock, R. J. (2010). Endocrine profiling and prioritization of environmental chemicals using ToxCast data. Environmental Health Perspectives 118: 1714–1720.