The number of breeding lines discarded in a plant breeding programme is inversely proportional to the number of selection rounds. In the early stages there are many thousands of lines evaluated, with a low proportion selected for testing at the intermediate stage. At the advanced stage a few surviving lines are tested with great intensity, with only a few lines being discarded after each selection stage.
A number of researchers have shown that selection in the initial stages (the period in which the greatest proportion of genotypes is discarded, and hence the greatest genotypic variation lost) is the least effective at identifying the most desirable lines. At the early generation stage, selection has been shown to result in, at best, only a random reduction in the number of genotypes within the breeding scheme. Some studies (particularly of qualitatively inherited traits) have shown a response to initial selection, although others have shown a negative response, in which the best phenotypes under conditions of early generation selection were those least likely to become commercial cultivars.
Therefore, the early generation selection stage of a plant breeding programme is often very ineffective in terms of selection. This inefficiency is in part simply due to low heritability in performance characters in the early selection stages, compared with those in more advanced levels. Low efficiency is, in part, the result of:
Selection in the early stages is most ineffective for quantitatively inherited traits such as yield, quality and durable disease resistance. If early generation selection is purely a random (or near-random) reduction in the number of lines that are to be tested at the intermediate stage, then it is questionable whether this operation merits the time and resources needed to complete the task. It has therefore been suggested that a more effective protocol would result from growing fewer breeding lines and doing no selection at the earliest stages. The effort and resources saved at the early generation stage could then be used to screen more genotypes at the intermediate stage (where selection is more effective because of replication and larger plots).
An alternative method of reducing the numbers involved in early to intermediate selection is available. This procedure involves the identification of the most attractive cross combinations from the many that would be possible, assuming that there is greater probability of obtaining a successful cultivar from the most desirable cross combinations. Having identified the ‘best’ crosses, then maximum effort and resource can be directed to screening individual recombinants from within these specific crosses, while the ‘poorer’ crosses are completely discarded. This process is called cross prediction.
As we noted briefly earlier, methods of predicting the properties and distribution of recombinant inbred lines (derived by inter-mating homozygous parents) using early generations of crosses have been proposed by Jinks and Pooni.
They showed that, for any continuously varying character, the expected mean and variance of all possible inbred lines, derived by inbreeding following an initial cross between two homozygous parents, can be specified in terms of the components of means and variances as specified by biometrical genetics. For example, if an additive–dominant genetic model of inheritance proves adequate, the expected mean is $m$, the mid-parent value, and the expected variance of the inbred sample is $V_A$, the additive genetic variance.
From the predicted mean and variance, we can determine many of the properties of the recombinant inbred lines that can be derived in a pure-line breeding programme, based upon the performance of generations in the early generation stages. In addition, the relative probabilities with which different pairwise crosses will produce inbred lines with particular properties can also be predicted and hence used as a selection criterion for reducing the number of breeding lines in a plant breeding programme.
The crosses that show highest probability of producing desirable recombinants can therefore be identified from those with a lesser chance of producing desirable lines. Rather than selecting individual genotypes at the early generation stage, the number of surviving lines can be reduced by selection of the superior cross combinations. Similarly, if the probability of a desirable recombinant is known from a particular cross, then this value can be used to determine the number of recombinants that need to be evaluated to ensure that ‘at least one’ is found. When a single trait is examined, this procedure of estimating and using genetic parameters is called univariate cross prediction.
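The 'at least one' calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming that recombinants are sampled independently and that each has the same probability of being desirable; the function name `lines_needed` and the default 95% level of assurance are illustrative choices, not values taken from the text.

```python
import math


def lines_needed(p_desirable: float, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one desirable line among n) >= confidence.

    Assumes independent sampling with a constant probability p_desirable.
    """
    if not 0.0 < p_desirable < 1.0:
        raise ValueError("p_desirable must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # P(no desirable line in n draws) = (1 - p)^n, so we need
    # (1 - p)^n <= 1 - confidence, i.e. n >= ln(1 - confidence) / ln(1 - p).
    n = math.log(1.0 - confidence) / math.log(1.0 - p_desirable)
    return math.ceil(n)


if __name__ == "__main__":
    for p in (0.30, 0.10, 0.01):
        print(f"p = {p:.2f}: evaluate at least {lines_needed(p)} lines")
```

The number of lines required rises steeply as the probability of a desirable recombinant falls, which is one reason for concentrating effort on the crosses with the highest predicted probabilities.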
Univariate cross prediction has been applied to a number of inbreeding species based on the initial work of Jinks and Pooni with Nicotiana rustica. Predictions of the proportion of recombinant inbred lines that will transgress a predefined target value $T$ are based on the evaluation of the integral:
$$\int_{T}^{\infty} f(x)\,dx$$
where the trait of interest is normally distributed and the function $f(x)$ is based on $m$, the mean of all possible inbreds for a character, and $V_A$, the additive genetic variance for the character (Figure 7.12).
Figure 7.12 Illustration of univariate cross prediction technique.
The additive genetic components of the expected variance can be estimated from a number of different sources initiated by a cross between two pure-breeding lines. Methods that have proven reliable include:
If a sample of inbred lines from a number of different crosses is grown in properly designed assessment trials, it is possible to estimate $m$, the average performance of all possible inbred lines from each cross, and $V_A$, the additive genetic variation for each cross. The average performance of the inbreds is a direct estimate of $m$, and the variance between inbreds in the sample is a direct estimate of $V_A$ after error variance has been removed.
It is possible to calculate the proportion of lines expected to transgress a predefined target value by using:
$$\frac{T - m}{\sigma_A} \quad \text{or} \quad \frac{m - T}{\sigma_A}$$
depending on whether the predictions are for values greater than (or equal to), or less than (or equal to), the target value set, where $T$ is the target value, $m$ is the mean of all inbred lines and $\sigma_A = \sqrt{V_A}$ is the genetic standard deviation.
Following the calculation of the probability integral from the predicted equations, the expected proportions of transgressive segregants can be obtained from tables of the normal probability integral.
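In place of printed tables, the normal probability integral can be evaluated directly. The sketch below is one way of doing this using only the Python standard library; the function names and the numbers in the usage example (mean 50, genetic variance 25, target 55) are hypothetical and chosen purely for illustration.

```python
import math


def normal_cdf(z: float) -> float:
    """Area under the unit normal curve from minus infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_at_or_above(target: float, mean: float, genetic_variance: float) -> float:
    """Expected proportion of inbred lines at or above the target value."""
    sd = math.sqrt(genetic_variance)  # genetic standard deviation
    return 1.0 - normal_cdf((target - mean) / sd)


def prob_at_or_below(target: float, mean: float, genetic_variance: float) -> float:
    """Expected proportion of inbred lines at or below the target value."""
    sd = math.sqrt(genetic_variance)
    return normal_cdf((target - mean) / sd)


if __name__ == "__main__":
    # Hypothetical cross: mean 50, genetic variance 25, target 55 (one SD above the mean).
    print(round(prob_at_or_above(55.0, 50.0, 25.0), 4))
    print(round(prob_at_or_below(55.0, 50.0, 25.0), 4))
```

Because the trait is treated as continuous, 'greater than' and 'greater than or equal to' give the same proportion.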
In cases where it is easy to obtain a sample of inbred (or near-homozygous) lines from a large number of crosses (i.e. by using doubled haploid techniques or rapid-cycle single seed descent), this method will produce excellent predictions as there are no dominance effects to complicate estimation.
In many instances, however, it is not possible to produce inbred lines quickly and cheaply on a practical level in a breeding programme, and so cross prediction will involve estimating genetic means and variances from early generations of crossing designs.
Although the triple test cross will provide breeders with the best estimate of additive genetic means and variances, it requires a great deal of time and effort to complete. A similar effort will be required to obtain these estimates using the standard and
method. Both these mating designs, therefore, have merit for genetic investigation but may have limited use in a practical plant breeding situation where many hundreds of cross combinations need to be screened.
Evaluation of a random sample of $F_3$ families from each cross under investigation offers a more practical approach. Approximate genetic parameters can be estimated from the mean of a random sample of $F_3$ families and from variation between $F_3$ families derived from a common cross.
For example, consider a single cross $P_1 \times P_2$. Then $F_2$ seed would be grown to produce $F_2$ single plants. A random sample of these would be harvested and a single plot grown from each of the (say 20 to 25) single plants harvested. These plots would be evaluated to obtain the average performance of the $F_3$ families, and variation between $F_3$ families would also be estimated.
The mean (average performance of the $F_3$ families) of the plots would be:
$$\bar{F}_3 = m + \tfrac{1}{4}[d]$$
and the true variance of the $F_3$ family means $\sigma^2_{\bar{F}_3}$ would be:
$$\sigma^2_{\bar{F}_3} = \tfrac{1}{2}V_A + \tfrac{1}{16}V_D$$
assuming that the additive–dominance model of inheritance is adequate to describe the character of interest, where $[d]$ is the dominance component of the generation mean and $V_D$ the dominance component of the variance. Therefore the following approximations can be made:
$$m \approx \bar{F}_3 \quad \text{and} \quad V_A \approx 2\,\sigma^2_{\bar{F}_3}$$
Both estimates will of course be accurate only if dominance effects are relatively small in comparison to additive effects. In cases where $[d]$ is large, the average of all possible inbred lines can be estimated by growing the parental lines in the prediction trial and estimating $m$ as $(\bar{P}_1 + \bar{P}_2)/2$. Alternatively, when $[d]$ and $V_D$ are large, they can be estimated by including a bulk sample of an earlier generation of the same cross in the prediction trial. This latter option will of course also offer a better estimate of $V_A$.
If the parental lines are included in the cross prediction trial in which the families are evaluated, then it is also possible to carry out a crude scaling test to determine the dominance components and effect from:
Using these predictions of the additive genetic mean and the additive genetic variance, we can predict the probability that a single inbred line selected at random will equal or exceed a predefined target value $T$ from the standardized deviate:
$$\frac{T - m}{\sigma_A}, \qquad \sigma_A = \sqrt{V_A}$$
Similarly, the probability that a single inbred line, taken at random, would be equal to or less than a predefined target value would be obtained from:
$$\frac{m - T}{\sigma_A}$$
Following the calculation of the probability integral from the predicted equations, the expected proportions of transgressive segregants can, as above, be obtained from tables of the normal probability integral.
A number of different options are available in setting target values upon which the predictions are based. These include:
When a sample of inbred lines is produced, then a second option for determining the frequency of particular recombinants has also shown promise. This method simply involves counting the number of inbreds within the sample which exceed the given target value.
Obviously, the accuracy of this method will depend on the size of the sample of inbreds on which the counts are made, with larger samples giving more reliable frequencies. Several researchers have shown, however, that even relatively small samples (around 25 lines) can still provide useful prediction results.
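The counting approach is simple enough to sketch in a few lines. The binomial standard error attached to the observed proportion is an addition made here to illustrate the effect of sample size; it is not part of the method as described in the text, and the sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def observed_proportion(values: Sequence[float], target: float) -> Tuple[int, float, float]:
    """Count the lines at or above the target; return (count, proportion, standard error)."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one observation is required")
    count = sum(1 for v in values if v >= target)
    p = count / n
    # Binomial standard error of a proportion: sqrt(p(1 - p) / n).
    se = math.sqrt(p * (1.0 - p) / n)
    return count, p, se


if __name__ == "__main__":
    # Hypothetical scores for a sample of 25 inbred lines from one cross.
    sample = list(range(1, 26))
    count, p, se = observed_proportion(sample, target=21)
    print(f"{count} of {len(sample)} lines reach the target: p = {p:.2f} (SE {se:.2f})")
```

With 25 lines the standard error of an observed proportion of 0.20 is about 0.08, which gives a feel for the precision available from samples of this size.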
Initially cross prediction was not considered as a tool in the breeding of clonal crops. This may perhaps have been due to the fact that heterozygosity is not a problem in the selection procedure. In other words, although the initial seedlings in, say, a potato breeding programme are all genetically unique and highly heterozygous, they are subsequently multiplied clonally and so are fixed in the sense that they are the genotypes that can be commercially exploited. However, it has been shown in a number of clonal crops that early generation selection suffers from all the inefficiencies found in inbred cultivar development.
In clonal crops there are just as many difficulties incurred in trying to identify desirable lines in the early generation stages (i.e. seedling stage and first clonal year stage) where only single plants are evaluated.
At the Scottish Crop Research Institute, research has shown severe inefficiencies in the traditional method of selection used in the early generations. Work resulting from this prompted an examination of cross prediction methods which might prove an effective alternative to the recurrent phenotypic selection used. A sample of 25 seedlings from the 200 grown from each of 8 cross families was evaluated for breeders' preference (a visual assessment of commercial worth on a 1 to 9 scale with increasing value attributed to increasing commercial worth) by four breeders independently. Table 7.9 shows the progeny means and within-progeny standard deviations ($\sigma$) used to determine the frequency of clones that would exceed a preference score of 5 on the 1 to 9 scale.
Table 7.9 Progeny means of breeders' preference ratings and within-progeny standard deviations ($\sigma$), estimated on 25 progeny from each of eight potato crosses (C1 to C8), and the univariate probability that a genotype chosen at random from each cross will exceed a breeder's preference rating of 5 on the 1 to 9 scale.
Cross | Mean | $\sigma$ | Predicted $P(x > 5)$ |
C1 | 4.36 | 1.52 | 0.337 |
C2 | 4.01 | 1.65 | 0.274 |
C3 | 3.61 | 1.50 | 0.176 |
C4 | 4.17 | 1.23 | 0.251 |
C5 | 3.04 | 0.91 | 0.015 |
C6 | 3.68 | 1.52 | 0.192 |
C7 | 4.21 | 1.36 | 0.281 |
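The predicted column of Table 7.9 can be checked with a short calculation. The sketch below assumes that the tabulated measure of within-progeny spread is used as a standard deviation; under that assumption the computed probabilities agree with the tabulated ones to within about 0.002, which is consistent with rounding of the inputs. Only the seven crosses listed in the table are included.

```python
import math

# (mean, within-progeny spread, tabulated probability) for the crosses listed in Table 7.9.
TABLE_7_9 = {
    "C1": (4.36, 1.52, 0.337),
    "C2": (4.01, 1.65, 0.274),
    "C3": (3.61, 1.50, 0.176),
    "C4": (4.17, 1.23, 0.251),
    "C5": (3.04, 0.91, 0.015),
    "C6": (3.68, 1.52, 0.192),
    "C7": (4.21, 1.36, 0.281),
}


def prob_exceeding(target: float, mean: float, sd: float) -> float:
    """Upper-tail normal probability, treating sd as a standard deviation (assumption)."""
    z = (target - mean) / sd
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


predicted = {cross: prob_exceeding(5.0, mean, sd) for cross, (mean, sd, _) in TABLE_7_9.items()}
ranking = sorted(predicted, key=predicted.get, reverse=True)

if __name__ == "__main__":
    for cross in ranking:
        print(f"{cross}: computed {predicted[cross]:.3f}, tabulated {TABLE_7_9[cross][2]:.3f}")
```

The resulting order (C1, C7, C2, C4, C6, C3, C5) is the basis for ranking the crosses by predicted worth.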
Seed tubers from a sample of 200 clones from each cross were increased without selection to a stage where a large amount of field-grown seed tubers were available and the 1,600 genotypes were evaluated for three years in a breeding programme. Selection of the populations was based on all the characters (yield, quality and appearance) that are normally assessed in the breeding scheme. Table 7.10 shows the number of selected clones that survived four, five and six rounds of selection from each cross. Also shown is the rank of each cross based upon the 25 seedlings and cross prediction of breeders' preference.
Table 7.10 Relative ranking of eight potato crosses (C1 to C8) based upon cross prediction of breeder's preference that genotypes chosen at random from each cross will exceed a breeder's preference rating greater than 5, on the 1 to 9 scale, and the number of selected breeding lines from each cross that was selected in the 4th, 5th and 6th selection stages in the breeding programme at the Scottish Crop Research Institute.
Cross | Rank | Selected to stage four | Selected to stage five | Selected to stage six |
C1 | 1 | 15 | 3 | 2 |
C2 | 3 | 9 | 3 | 2 |
C3 | 6 | 1 | 0 | 0 |
C4 | 4 | 2 | 0 | 0 |
C5 | 8 | 1 | 0 | 0 |
C6 | 5 | 11 | 6 | 1 |
C7 | 2 | 12 | 7 | 3 |
C8 | 7 | 0 | 0 | 0 |
Obviously, there were more highly desirable clones from crosses C1, C7 and C2, which were ranked first, second and third, respectively, on the univariate cross prediction of glasshouse-grown seedlings.
In the potato, cross prediction was investigated with higher numbers of crosses (204) in a similar way as a result of this first study. Results from the larger study were in agreement with those shown above. This has prompted several other breeding organizations to change the means by which they reduce clonal numbers in the early generations of potato and sugarcane breeding schemes.
The area under a unit normal distribution (a normal distribution with mean of zero and variance of one) is frequently tabulated in statistical tables. It is common for the area to be given from $-\infty$ to the required target value $T$. The whole area from $-\infty$ to $+\infty$ is of course equal to 1, so the area from $T$ to $+\infty$ can be obtained by subtracting the table value from 1.
For example, from a given cross estimates were obtained of the mean ($m = 12$) and genetic variance ($V_A = 16$, therefore $\sigma_A = 4$). What would be the probability that a recombinant will exceed a set target value of 14? To solve this we have:
$$\frac{T - m}{\sigma_A} = \frac{14 - 12}{4} = 0.50$$
Using the unit normal distribution tables and a value of 0.50 we have the probability of $-\infty$ to $T$ to be equal to 0.6915. The actual probability we want is $1 - 0.6915 = 0.3085$.
Therefore given the above genetic parameters we would expect that 30.85% of all possible recombinants from the cross will have a greater (or equal) value than the target value.
Consider the same set of parameters but now with a target value of 11 (i.e. a target value less than the progeny mean). We now have:
$$\frac{T - m}{\sigma_A} = \frac{11 - 12}{4} = -0.25$$
Looking up 0.25 in the tables we have a probability value of 0.5987. In the example above we then subtracted this value from one to obtain the correct probability. In this case, however, the deviate has a negative value, and so our required probability is $1 - (1 - 0.5987) = 0.5987$, which is in fact simply the value obtained from tables.
In summary, four possibilities exist:
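The four cases arise from combining the direction of the prediction (at or above, or at or below, the target) with the sign of the standardized deviate, given that printed tables usually list areas only for non-negative deviates. The sketch below mimics such a table lookup; the function names are illustrative, and the cross mean of 12 and genetic standard deviation of 4 are the values that give deviates of 0.50 and −0.25 for targets of 14 and 11.

```python
import math


def table_area(z_abs: float) -> float:
    """Area from minus infinity to z for a non-negative z, as listed in printed tables."""
    if z_abs < 0:
        raise ValueError("printed tables are entered with a non-negative deviate")
    return 0.5 * (1.0 + math.erf(z_abs / math.sqrt(2.0)))


def proportion(target: float, mean: float, sd: float, exceed: bool = True) -> float:
    """Proportion of lines at or above (exceed=True) or at or below the target."""
    z = (target - mean) / sd
    area = table_area(abs(z))
    if exceed:
        # Target above the mean: 1 - table value; target below the mean: table value.
        return 1.0 - area if z >= 0 else area
    # Target above the mean: table value; target below the mean: 1 - table value.
    return area if z >= 0 else 1.0 - area


if __name__ == "__main__":
    for target in (14.0, 11.0):
        for exceed in (True, False):
            label = "at or above" if exceed else "at or below"
            print(f"target {target}: {label} = {proportion(target, 12.0, 4.0, exceed):.4f}")
```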
To consider further possible problems involved in selection of the ‘best’ cross combinations, consider the following example.
Below are shown the means and genetic variances of crop yield (t/ha) of four barley crosses (A, B, C and D). Also grown in this prediction trial were five control lines. These controls are all commercially grown cultivars predominating in the region where new varieties are to be grown. The average yield of the controls was 21 t/ha and the variance of the controls was 7.5 (t/ha)². Which of the crosses should have greatest emphasis in a breeding programme where high yield is the major selection criterion?
Cross | Mean | Genetic variance |
A | 20.0 | 24.135 |
B | 22.0 | 8.111 |
C | 21.5 | 19.245 |
D | 18.0 | 26.051 |
First it should be decided whether selection is based only on the mean performance of the crosses. If this is the case then the answer is quite simple. Greatest emphasis should be placed on cross B, followed by cross C. The remaining two crosses perhaps should be discarded as their average performance is less than the average of the control cultivars.
However, it should be noted that not only do the four crosses have different mean yield values, but there are also large differences in the genetic variance ($V_A$). Would our decision now change if we consider the ‘best’ cross based on the mean and variance?
First it is necessary to set a target value upon which the prediction is to be based. As there were a number of commercial cultivars included in the cross prediction trial, it may be useful to use as the target value the average performance of the controls (21 t/ha).
When this target value is used, crosses A, B, C and D were estimated to have 42.4%, 63.7%, 54.4% and 27.8% of their progeny, respectively, greater than (or equal to) the mean of the control entries. Again, if these were the criteria used then the greatest emphasis should be put on cross B, followed by cross C, A, and lastly D (the same order as when only the means were used).
If this were an actual breeding scheme, however, it may be several years before a selected genotype from any of these crosses would become a commercial cultivar in agriculture. It would therefore be wise to set our target higher than the controls, as it might be expected that in several years' time newer and higher-yielding lines would be available. As the variance of the controls is available, we can use this to set a target value which is the mean of the controls plus the standard deviation of the controls (i.e. $21 + \sqrt{7.5} \approx 23.7$), which would be approximately 24 t/ha.
With this target value we have 20.9%, 24.2%, 28.4% and 11.9% for crosses A, B, C and D, respectively. Now there has been a change in the cross rankings (in parentheses in Table 7.11), with cross C now giving the highest probability of lines exceeding 24 t/ha. Cross B is now ranked second, but there is little difference between the probabilities of cross B and cross A.
Table 7.11 Progeny mean, genetic variance and genetic standard deviation of progeny from four different parent cross combinations, and the probability that genotypes chosen at random from each cross will exceed a specific target yield.
Cross | Mean | $V_A$ | $\sigma_A$ | $P(x \ge 21)$ | $P(x \ge 24)$ | $P(x \ge 26)$ |
A | 20.0 (3) | 24.13 | 4.91 | 0.424 (3) | 0.209 (3) | 0.111 (2) |
B | 22.0 (1) | 8.11 | 2.85 | 0.637 (1) | 0.242 (2) | 0.081 (3) |
C | 21.5 (2) | 19.24 | 4.38 | 0.544 (2) | 0.284 (1) | 0.153 (1) |
D | 18.0 (4) | 26.05 | 5.10 | 0.278 (4) | 0.119 (4) | 0.058 (4) |
If the target value is further increased (say to the control mean plus twice the control standard deviation), we would have a target value approximately equal to 26. With this target value the ranking of the four crosses is C, A, B and D. Now, while cross C has the highest probability, cross A has a higher probability of producing a genotype exceeding the target value (11.12%) than cross B (8.08%).
In conclusion, therefore, it is clearly important that univariate cross prediction is based upon the mean and genetic variance of a cross. When target values are relatively close to the progeny mean values, then not surprisingly the mean of each cross will be a large factor in the cross prediction. As target values are increased then the genetic variance becomes a more important factor in determining the probability of desirable recombinants. Finally, in the above example it should be noted that it is the genetic standard deviation ($\sigma_A$) that is used in the estimate and not the genetic variance ($V_A$). Even when there are large differences in genetic variance (compare cross B and cross D), the cross with highest mean value was always the better choice for further breeding work, despite the high variance of cross D. However, if the target was taken to a greater extreme, then the relationship would cease to hold true, and, of course, it is extremely ‘good’ genotypes that breeders are usually trying to identify.
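The effect of the target value on the ranking of the four barley crosses can be reproduced as a sketch, using the means and genetic variances listed for crosses A to D and the three targets discussed (21, 24 and 26 t/ha). The computed probabilities are close to those tabulated (the largest difference is about 0.005); small differences presumably reflect rounding of the inputs.

```python
import math

# Cross: (mean yield, genetic variance), as listed for the four barley crosses.
CROSSES = {
    "A": (20.0, 24.135),
    "B": (22.0, 8.111),
    "C": (21.5, 19.245),
    "D": (18.0, 26.051),
}


def prob_at_or_above(target: float, mean: float, variance: float) -> float:
    """Upper-tail normal probability using the genetic standard deviation."""
    z = (target - mean) / math.sqrt(variance)
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rank_crosses(target: float):
    """Return (crosses ordered best first, dict of probabilities) for one target."""
    probs = {name: prob_at_or_above(target, m, v) for name, (m, v) in CROSSES.items()}
    return sorted(probs, key=probs.get, reverse=True), probs


if __name__ == "__main__":
    for target in (21.0, 24.0, 26.0):
        order, probs = rank_crosses(target)
        summary = ", ".join(f"{name} {probs[name]:.3f}" for name in order)
        print(f"target {target:.0f} t/ha: {summary}")
```

As the target moves further above the cross means, the crosses with larger genetic variance (C, then A) overtake cross B, even though B has the highest mean.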
Despite the usefulness of univariate cross prediction in determining the frequency of desirable recombinants that would transgress a given target value, its use is limited because, by definition, only a single character is being evaluated. As we noted many times already, a new cultivar will not be successful because of high expression in a single character, but rather it needs to express an overall improvement in a number of morphological, pathological and quality characters combined with high productivity.
The problem of selecting the most desirable cross combinations can partially be overcome by considering a trait such as breeders' preference, which is based on a visual assessment of several characters simultaneously by a breeder. Indeed breeders' preference scores have been shown to give very similar results to multivariate index selection schemes.
Visual inspection of several characters simultaneously, to result in a single overall rating for each individual, has several limitations, although in potatoes this form of assessment has been shown to have advantageous features when used in a plant breeding selection scheme. Breeders' preference scores in potato breeding are highly related to actual yield, number of tubers per plant, tuber size, tuber conformity, tuber disease and freedom from defects. However, it has been shown that this type of evaluation does not agree so well with other important characters such as seed size, disease resistance, and so on. Similarly it is not possible to combine characters that are expressed at different times in the growth cycle. For example, it is difficult to consider pre-harvest characters such as flowering time, plant height or maturity if preference scores are recorded at harvest. In addition, it is difficult to combine characters such as yield, assessed in the field, with quality characters that may be assessed in a laboratory at a later stage. Thus it is usually necessary to consider selection for more than a single trait.
If more than one trait is to be considered in cross prediction studies, it is possible to treat each independently, carry out univariate cross prediction on each character, and examine the probabilities obtained to make decisions on the ‘best’ crosses. This would of course ignore the fact that the different traits are interrelated (correlated), and would assume that the relationship between the traits is constant over all crosses involved. This may cause problems, and so it may be necessary to expand the univariate procedure to cover several different traits simultaneously.
Univariate cross prediction is based upon evaluation of the normal distribution function determined by the mean and genetic variance of each cross and a chosen target value $T$:
$$\int_{T}^{\infty} f(x)\,dx$$
Suppose that two characters are to be considered. The bivariate normal distribution of the data from these two traits can be described by the mean of each character ($m_1$ and $m_2$), the genetic variance of each character ($V_{A1}$ and $V_{A2}$) along with the correlation between the characters ($r_{12}$). Given these five parameters it is possible to estimate the proportion of recombinants from the cross that will transgress a given target value for character 1 ($T_1$), and simultaneously transgress a second target value ($T_2$) for the other trait. This probability is given by:
$$\int_{T_1}^{\infty}\int_{T_2}^{\infty} f(x_1, x_2)\,dx_2\,dx_1$$
where the function $f(x_1, x_2)$ is a bivariate normal distribution function based on the mean of both traits, the variance of both traits and the correlation between traits.
It is easy to extend this to cover $n$ different traits by evaluation of the integral:
$$\int_{T_1}^{\infty}\cdots\int_{T_n}^{\infty} f(x_1, \ldots, x_n)\,dx_n \cdots dx_1$$
In this case the function is a multinormal distribution function based on the means ($m_i$) of all $n$ traits, the genetic variances ($V_{Ai}$) of all $n$ traits and the genetic correlations ($r_{ij}$) between all $n$ traits.
Given the various means, variances and correlations, it is possible to obtain bivariate, trivariate and multivariate probability estimates from statistical tables. These are, however, not commonly included in standard sets of statistical tables (for example those that usually show the unit normal distribution function, $t$-tables, $\chi^2$ tables or F-tables). In addition, use of the tables that do exist can be complex and would require detailed description.
Parameters used in multivariate prediction are estimated using the same design types (i.e. triple test cross, prediction) that were explained previously for univariate predictions.
When it is necessary to estimate multivariate probabilities, computers offer an easier alternative. Computer software is available (although not widely on a commercial basis) which returns a probability value when the means, variances, correlations and target values are entered. To our knowledge there is software that can handle up to seven traits simultaneously – how the software manages this need not detain us here!
Similarly, it is beyond the scope of this book to try to explain in more detail the theory of estimating these probabilities. It is sufficient to understand the basic concept and to be aware of the usefulness of the procedure as applied in cross prediction techniques. You should, however, be aware that the procedure exists and that multivariate cross prediction can offer a powerful tool for selection in plant breeding.
The eight crosses that were evaluated for breeders' preference (see earlier in univariate prediction) also had tuber yield, tuber size and the number of tubers recorded for the 25 progeny from each cross. Tuber shape was also visually assessed. The means and variances of each trait were estimated along with the correlations between traits for each cross. Based on these statistics the probability that genotypes would exceed target values for each character simultaneously was estimated using a computer software package called POTSTAT. The relative rankings of the multivariate predicted values (MV.rank) are shown in Table 7.12 along with the rankings of the univariate cross prediction of breeders' preference (UV.rank) and the frequency of desirable clones selected from a large sample in the fourth, fifth and sixth round of selection.
Table 7.12 Relative rankings of eight potato crosses (C1 to C8) based upon multivariate cross prediction (MV.rank) and univariate cross prediction (UV.rank) of breeder's preference that a genotype chosen at random from each cross will exceed a breeder's preference rating greater than 5, on the 1 to 9 scale, and the number of selected breeding lines from each cross that was selected in the 4th, 5th and 6th selection stage in the breeding programme at the Scottish Crop Research Institute.
Cross | MV.rank | UV.rank | Selected to stage four | Selected to stage five | Selected to stage six |
C1 | 2= | 1 | 15 | 3 | 2 |
C2 | 2= | 3 | 9 | 3 | 2 |
C3 | 6 | 6 | 1 | 0 | 0 |
C4 | 4 | 4 | 2 | 0 | 0 |
C5 | 8 | 8 | 1 | 0 | 0 |
C6 | 5 | 5 | 11 | 6 | 1 |
C7 | 1 | 2 | 12 | 7 | 3 |
C8 | 7 | 7 | 0 | 0 | 0 |
There is good agreement between the multivariate predictions, based on four traits, and the univariate prediction based on breeders' preference. The preference scores were therefore highly related to yield, number of tubers, tuber size and tuber shape. There was also very good agreement between the predicted worth of each cross and the number of clones that did indeed show commercial value in the advanced selection stages.
It is possible to obtain good multivariate probability estimates by observing the frequency of individuals in a small sample that exceed given target values. The difficulty in using observed frequencies is related to sample size. The accuracy of the predictions will be directly related to the sample size examined. When the frequency of desirable recombinants is low (i.e. when high target values are used), then larger samples will need to be examined.
Similarly, if there are low correlations between traits of interest, sample sizes will need to be relatively large to carry out prediction effectively.
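The observed-frequency approach for several traits amounts to counting the individuals that meet every target at once. A minimal sketch follows, assuming that a higher value is desirable for every trait; the trait names, targets and records are hypothetical.

```python
from typing import Dict, List


def joint_frequency(records: List[Dict[str, float]], targets: Dict[str, float]) -> float:
    """Proportion of individuals meeting or exceeding the target for every trait."""
    if not records:
        raise ValueError("at least one individual is required")
    hits = sum(
        1 for rec in records
        if all(rec[trait] >= target for trait, target in targets.items())
    )
    return hits / len(records)


if __name__ == "__main__":
    # Hypothetical sample of five clones scored for two traits.
    sample = [
        {"yield": 42.0, "tuber_size": 6.1},
        {"yield": 38.5, "tuber_size": 7.0},
        {"yield": 45.2, "tuber_size": 5.2},
        {"yield": 47.9, "tuber_size": 6.8},
        {"yield": 35.0, "tuber_size": 4.9},
    ]
    targets = {"yield": 40.0, "tuber_size": 6.0}
    print(joint_frequency(sample, targets))
```

Because every additional trait or higher target removes more individuals from the count, the joint frequency quickly becomes small, which is why larger samples are needed in these situations.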
It has been noted above in the univariate case that the relative importance of the different selection pressures imposed can affect the results of prediction. In multivariate prediction, three types of parameter are used: means, variances and correlations. In cases where the progeny means predominate in the prediction equations, which is often the case, then very good estimation of progeny worth can be obtained by summing the relative rankings (based on the phenotypic mean of the cross) for each of several traits.
Consider the potato example shown above, where four traits, yield, tuber size, tuber number and shape, were used to assess progeny worth of eight potato crosses. The ranking of the eight crosses based on the multivariate normal probability (MVP) and those obtained by summing the relative rankings of each character were:
Cross | Sum rank | MVP |
C1 | 1 | 2= |
C2 | 2= | 2= |
C3 | 5 | 6 |
C4 | 4 | 4 |
C5 | 8 | 8 |
C6 | 6 | 5 |
C7 | 2= | 1 |
C8 | 7 | 7 |
As can be clearly seen, ranking each individual trait and then summing the rankings of each cross can be a good estimate of the commercial worth of different cross combinations.
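The rank-summation procedure can be sketched as follows. The sketch assumes that a higher progeny mean is better for every trait and gives tied crosses the average of the ranks they share; both choices, together with the cross names and trait means in the example, are illustrative assumptions.

```python
from typing import Dict


def average_ranks(scores: Dict[str, float]) -> Dict[str, float]:
    """Rank from largest (rank 1) to smallest; ties share the average rank."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[ordered[k]] = shared
        i = j + 1
    return ranks


def sum_rank(trait_means: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Rank crosses on the sum of their per-trait ranks (rank 1 = best overall)."""
    totals: Dict[str, float] = {}
    for means in trait_means.values():
        for cross, rank in average_ranks(means).items():
            totals[cross] = totals.get(cross, 0.0) + rank
    # A smaller rank sum is better, so negate before ranking largest first.
    return average_ranks({cross: -total for cross, total in totals.items()})


if __name__ == "__main__":
    # Hypothetical progeny means for three crosses and two traits.
    means = {
        "yield": {"X1": 30.0, "X2": 25.0, "X3": 20.0},
        "tuber_size": {"X1": 5.0, "X2": 7.0, "X3": 6.0},
    }
    print(sum_rank(means))
```

Traits for which a low value is desirable would need their means negated before ranking.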
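Selection in a plant breeding programme takes two forms: selection of the parents to be crossed, and selection of desirable recombinants from among their progeny.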
Selection of the desirable recombinants has been covered in the foregoing sections of this book. We will now consider parental selection.
Parents used in plant breeding programmes are chosen from a wide range of possible genetic material. In general, however, parents are of three different types:
It may appear strange that recombinant selection was discussed prior to parental selection (i.e. putting the cart before the horse). In actual practice there is no definite order of either selecting parents or selecting offspring. A large majority of parents used in plant breeding schemes are derived from selections within the breeding programme.
Parental selection is therefore a cyclic operation where parents are selected, inter-mated, recombinants screened from segregating populations and these, in turn, are used as parents in the next round of the scheme.
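In deciding which parents are to be used in a breeding scheme, there are basically two types of evaluation possible: phenotypic evaluation (the performance of the potential parent itself) and genotypic evaluation (the performance of its progeny).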
This information could have been derived from experiments or assessment trials carried out within the breeding scheme or by other organizations (e.g. available in germplasm databases).
Phenotypic evaluation is often the first stage of parental selection. New genetic material is continually being added to the available parental lines within a plant breeding programme.
It can be of great benefit to a breeder, and will add increasing knowledge of possible new parental lines, to grow parental evaluation trials. When a potential new parent is made known, often the information of commercial worth is lacking. Information may be available from a database management scheme, although often these data are related to performance in different geographical regions to the target region of the breeding programme.
Phenotypic parent evaluation trials can be carried out at relatively low cost. When many new parents are to be assessed then specific trials can be arranged. These trials should be organized with the same criteria of good experimental design that any evaluation requires. In cases where only one or two new parents are to be considered, it is often useful to include these genotypes as controls in one of the breeding trials.
Although it is often only after a new parent has proven to have some merit on its phenotypic performance that the more time-consuming and detailed examination of genotypic worth is carried out, it must be pointed out that the possibility that a valuable genotype (in terms of becoming a parent) might hide within a poor phenotype still exists. Nevertheless, because of limited resources and a lesser probability of a poor phenotype proving to be a good genotype, most effort is devoted to further evaluating proven material to determine the true value of the parent in cross combination.
The most common means to determine the genetic potential of new parental lines is to examine a series of progeny in which the new parent features as one of the parents. From these studies it is possible to determine the general combining ability of a genotype and to use this information to select the most desirable parental lines.
General combining ability indicates, from the performance of its progeny, how a particular genotype performs when crossed with a range of other genotypes. The most effective means of determining general combining ability is by diallel crossing designs, where the variation observed in the diallel table is divided into general combining ability of the parents used, and specific combining ability (all variation that cannot be explained by an additive model of parental values). But as noted before, this does limit the number of lines that can be examined.
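As a rough illustration of how a diallel table is partitioned, the sketch below estimates general and specific combining ability from a half diallel that includes selfs. It assumes Griffing's Method 2 fixed-effects estimators, which the text does not spell out; the function name, the parent labels and all the numbers are hypothetical.

```python
def combining_ability(lower):
    """Estimate GCA and SCA from a half diallel with selfs.

    lower[i][j] (j <= i) is the mean of cross i x j; lower[i][i] is the self.
    Assumes Griffing's Method 2 (parents plus one set of crosses), fixed effects.
    """
    n = len(lower)
    # Rebuild the full symmetric table from the lower triangle.
    y = [[lower[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]
    # Grand total over the n(n+1)/2 distinct entries.
    grand = sum(lower[i][j] for i in range(n) for j in range(i + 1))
    # Array total for each parent, with its self counted twice.
    t = [sum(y[i]) + y[i][i] for i in range(n)]
    gca = [(t[i] - 2.0 * grand / n) / (n + 2) for i in range(n)]
    sca = {
        (i, j): y[i][j]
        - (t[i] + t[j]) / (n + 2)
        + 2.0 * grand / ((n + 1) * (n + 2))
        for i in range(n)
        for j in range(i + 1)
    }
    return gca, sca


# Hypothetical plant heights (cm) for four parents, lower triangle only.
parents = ["P1", "P2", "P3", "P4"]
heights = [
    [110.0],
    [75.0, 55.0],
    [100.0, 66.0, 98.0],
    [58.0, 42.0, 64.0, 50.0],
]
gca, sca = combining_ability(heights)
for name, g in zip(parents, gca):
    print(f"{name}: GCA = {g:+.2f}")
for (i, j), s in sca.items():
    if i != j:
        print(f"{parents[i]} x {parents[j]}: SCA = {s:+.2f}")
```

Under these assumptions the GCA estimates sum to zero, and a table that is purely additive in the parental values returns SCA values of zero, which is the sense in which specific combining ability collects the variation an additive model cannot explain.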
General combining ability can be estimated from other crossing designs. The simplest of these involves evaluating the progeny that are produced by crossing the potential parent with one or more tester lines. Tester lines are chosen because of past experience in producing worthwhile results. For example, a new parent may be crossed to a genetically productive genotype and also to one with little genetic worth. The contribution of the parent can be observed by examination of the offspring from the crosses.
General combining ability can also be estimated using North Carolina crossing designs. These, as noted earlier, are of two forms:
In addition to the statistical analysis of diallels and other crossing designs, more information can be obtained from genetic analysis. The most common means to achieve this is from a Hayman and Jinks' analysis where within-array variances and between-array covariances are used to estimate the proportion of dominant to recessive alleles for a given character. Hayman and Jinks' analysis can be used therefore to choose parents with high phenotypic performance and with a known degree of dominant alleles.
When a suitable cross prediction scheme is employed in the early generations of a plant breeding scheme, it is possible to use the cross prediction data to indicate which specific parents have the highest probability of producing desirable recombinants. A potential new parent is hybridized to a number of different genotypes and the progeny are examined to estimate the mean of all crosses in which the parent is used and the genetic variance of all crosses in which the parent appears. These data can be used in the same way as illustrated earlier in cross prediction.
Similar probabilities based on several traits simultaneously can provide useful indicators of the exact worth of a new parent without waiting several years to determine this potential from survivors in a selection scheme.
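The calculation described above can be sketched as follows. This is a minimal example that assumes the progeny of a parent are normally distributed around the mean of its crosses, with the genetic variance of those crosses as the spread; the parent names, means, variances and target are hypothetical, as are the function names.

```python
from math import ceil, log, sqrt
from statistics import NormalDist


def prob_exceeding(mean, genetic_variance, target):
    """Probability that a random line from the progeny exceeds the target.

    Assumes a normal distribution of progeny values (an assumption of this sketch).
    """
    return 1.0 - NormalDist(mean, sqrt(genetic_variance)).cdf(target)


def lines_needed(p, confidence=0.9):
    """Smallest number of lines giving at least one line above target
    with the stated confidence, when each line exceeds it with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return 1
    return ceil(log(1.0 - confidence) / log(1.0 - p))


# Hypothetical parents: (mean of all crosses, genetic variance of all crosses).
parents = {
    "Parent A": (24.0, 9.0),
    "Parent B": (22.0, 36.0),
    "Parent C": (26.0, 4.0),
}
target = 28.0  # hypothetical target value set by the breeder

ranked = sorted(
    ((name, prob_exceeding(m, v, target)) for name, (m, v) in parents.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, p in ranked:
    print(f"{name}: P(> {target}) = {p:.3f}, lines for 90% certainty = {lines_needed(p)}")
```

A parent with a modest mean but a large genetic variance can outrank one with a higher mean and little variance, which is why both quantities are estimated.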
At the Scottish Crop Research Institute, cross prediction at the seedling stage of the potato breeding programme became standard practice. Each year between 200 and 300 crosses were evaluated in cross prediction trials. The hybrid combinations with the highest probability of producing a new cultivar were increased, while the less desirable cross combinations were discarded.
This scheme, in addition to providing information on the commercial potential of each cross combination, was also used to determine the suitability of individual parents. The progeny mean and genetic variance of each parent were used in the prediction estimation. Shown in Table 7.13 are the rankings of nine parents based on this system along with the number of desirable recombinant lines that resulted from crosses involving the parents. Despite one or two changes in rank order, there was good agreement between the predicted and observed indicators. The differences that were observed could be explained by morphological characters (e.g. Cara has a pink eye and there was positive emphasis to select these types) or pest preferences (e.g. Maris Piper has nematode resistance and only clones that possessed the resistance were continued, irrespective of other characters).
Table 7.13 Univariate probability that a genotype taken at random from a segregating progeny with a common parent will have a breeders' preference greater than 4, on a 1 to 9 scale, relative ranking of that probability, along with the proportion and ranking of genotypes that survive the 4th selection stage at the Scottish Crop Research Institute.
Clone | Cross prediction of preference > 4 | Rank | Percentage of year 4 clones selected in year 7 | Rank |
Maris Peer | 69.17 | 1 | 17.69 | 1 |
3683.A.2 | 62.57 | 2 | 11.76 | 2 |
Pentland Ivory | 60.40 | 3 | 7.11 | 4 |
G.6755.1 | 59.74 | 4 | 6.29 | 5 |
Cara | 57.34 | 5 | 10.95 | 3 |
8204.A.4 | 54.42 | 6 | 5.13 | 6 |
Pentland Squire | 49.25 | 7 | 3.18 | 7 |
Dr Macintosh | 47.37 | 8 | 0.00 | 8= |
Self crosses | 37.99 | 9 | 0.00 | 8= |
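One way to put a number on the agreement between the two rankings in Table 7.13 is a Spearman rank correlation. This is not a statistic reported in the text; it is offered only as a sketch. The two entries with 0.00% selected are left out because they tie, and the simple formula used here assumes no ties.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two untied rankings of equal length."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Ranks copied from Table 7.13 for the seven parents without ties.
clones = [
    "Maris Peer", "3683.A.2", "Pentland Ivory", "G.6755.1",
    "Cara", "8204.A.4", "Pentland Squire",
]
predicted_rank = [1, 2, 3, 4, 5, 6, 7]  # from cross prediction
observed_rank = [1, 2, 4, 5, 3, 6, 7]   # from clones selected in year 7

rho = spearman_rho(predicted_rank, observed_rank)
print(f"Spearman rho over {len(clones)} parents = {rho:.3f}")  # prints 0.893
```

The only sizeable departure is Cara, which ranked two places higher on survival than on prediction.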
Having decided on a set of parental lines, the next decisions are how many crosses should be made and which combinations will yield the best results.
If there is a means by which large numbers of crosses can be evaluated, then many crosses will yield better results than if only a few are tried. However, it should be noted that there is little to be gained by making more crosses than can be screened in an effective manner.
In a straightforward commercial context with a short-term objective, where only a limited number of crosses can be considered, one simple and effective strategy is to cross the best with the best: identify the phenotypically and genetically best parents, intercross these and select amongst their progeny.
Many breeders use the strategy of combining complementary parents, for example, to inter-mate a high-yielding, poor-quality line with a low-yielding but high-quality line. In theory this type of combination could allow the selection of a high-yield, high-quality recombinant. However, what is often achieved is an average yield with average quality. It is usually necessary to use some form of pre-breeding where the high-yielding line is first crossed (or backcrossed) to a high-quality line, parents are selected several times, and these are in turn used in final cross combinations.
Similarly when a character is introduced from a wild or unadapted genotype, it may take many rounds of backcrossing to get the desired character into a commercial background before the trait is introduced into a new cultivar. This is, of course, where some of the newer techniques of genetic transformation and marker-assisted selection offer other alternatives to the breeder.
Germplasm is the basic raw material of any plant breeding programme. It is important that genetic diversity is maintained if crop development is to continue, so that new characters can be introduced into already existing cultivated genotypic backgrounds.
Why is genetic variability so important? Well, it has been continually stated in this book that without genetic variability, there can be no gain from selection. A further need is related to the appearance of new forms of pest or disease or new husbandry techniques, or new environmental challenges. If a new disease became important in an agricultural area to which all known cultivars were susceptible, then it may be possible to identify new sources of disease resistance from closely related wild or weedy species.
There is a growing awareness of reduced germplasm resources throughout modern agriculture. The greater use of monoculture crops and homozygous cultivars has greatly reduced the genetic variability within our agricultural crop species. For example, at the turn of the twentieth century, farmers growing cereal crops were propagating land races, each a collection of genetically different types grown in mixture. Land races have been replaced, in most countries, by homozygous lines or hybrids, and much of the variability that existed has already been lost. Disease epidemics can also greatly reduce genetic variability within a crop species. The potato blight that affected Western Europe (not just Ireland) in the nineteenth century greatly reduced the genetic variability within European potato lines, in addition to triggering a famine that killed around one million people in Ireland alone. Worldwide organizations have been formed with the specific aim of conserving germplasm and making it accessible to breeders searching for new traits that are not available within the cultivated crops. Bioversity International is one such organization, which coordinates germplasm collection activities on an international level. Bioversity International is part of the Consultative Group on International Agricultural Research (CGIAR) Consortium.
In addition to the national germplasm collections and Bioversity International, other centres in the CGIAR Consortium, such as the International Potato Center (CIP, Peru), the International Maize and Wheat Improvement Center (CIMMYT, Mexico), the International Rice Research Institute (IRRI, Philippines) and the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT, India), have remits to maintain germplasm collections on specific crop species.
Germplasm is available within the US from the Plant Introduction System. Genotypes are made available from the location that maintains plant introduction material, or from one of the regional stations. Some of the major crop responsibilities of each station are as follows:
Germplasm in itself is of little use to a plant breeder unless there is information regarding the attributes or defects of different genotypes. Most germplasm collections have associated data banks detailing and classifying material within the collection. For example, the Germplasm Resources Information Network (GRIN) is a computerized database containing information on the location, characteristics and availability of accessions within the plant introduction scheme. This information is available to any breeder through the Database Management Unit of the Agricultural Research Service, Plant Genetics and Germplasm Institute, Beltsville, Maryland.
Parent | Mean | ![]() |
![]() |
1 | 32.3 | 234.0 | 215.2 |
2 | 15.2 | 45.2 | 19.2 |
3 | 21.3 | 150.4 | 298.1 |
4 | 24.5 | 17.3 | 19.2 |
5 | 29.3 | 100.1 | 90.1 |
6 | 17.4 | 210.9 | 250.3 |
7 | 16.3 | 199.0 | 99.1 |
8 | 19.1 | 26.9 | 15.6 |
9 | 17.1 | 292.8 | 211.2 |
10 | 22.3 | 379.5 | 403.1 |
A cross is made between two homozygous parents where parent 1 is dwarf with round beans (ttrr) while parent 2 is tall with oval beans (TTRR). A number of plants are selfed to produce seeds from which 1,600 plants are grown. Assuming independent assortment of genes, outline a selection scheme that will result in harvesting seeds that are homozygous for oval beans and dwarf stature. Indicate the number of plants selected at each selection stage.
Mean | ![]() |
|
Cross 1 | 25.60 | 27.34 |
Cross 2 | 19.33 | 19.40 |
Cross 3 | 27.71 | 13.31 |
Cross 4 | 12.06 | 10.39 |
Cross 5 | 13.11 | 15.63 |
Cross 6 | 26.56 | 14.21 |
Cross 7 | 27.45 | 25.69 |
Cross 8 | 19.21 | 15.21 |
Cross 9 | 23.21 | 39.13 |
Cross 10 | 19.32 | 17.31 |
Also grown in the same trial were ten commercial cultivars. The average performance of the commercial cultivars was 20.14 and the standard deviation was 3.26. Using univariate cross prediction procedures, determine which three crosses should be used in breeding for cultivars that would have high yields. Rank your choices as first, second and third and include the probabilities used for your decision, under the following criteria:
Using the data presented above, how many clonal lines would need to be raised from Cross 6 to be 90% certain of having one line that would have a yield potential exceeding 3 standard deviations from the control mean?
Also from this trial, the phenotypic variance of the families was found to be
, and the average yield of all 435
lines was 301.5 kg. What would be the expected gain from selection if the
families were selected at the 10% level (i.e. discard 90% of families), and what would be the expected yield of these 43 selected lines at
, according to yield performance? Would you expect the same response to selection if the 43 selected
families were further selected for yield the following year? (Explain your answer).
and indicate the importance of these terms and the use of Griffing analysis in selecting superior parental lines.
Character | Year | 95.BAR.31, Moscow | 95.BAR.31, Boise | 95.BAR.69, Moscow | 95.BAR.69, Boise |
Plant height | 1993 | 0.20 | 0.22 | 0.83 | 0.86 |
Plant height | 1994 | 0.01 | 0.21 | 0.31 | 0.74 |
Plant height | 1995 | 0.21 | 0.30 | 0.52 | 0.79 |
Plant yield | 1993 | 0.11 | 0.25 | 0.52 | 0.57 |
Plant yield | 1994 | 0.02 | 0.15 | 0.12 | 0.46 |
Plant yield | 1995 | 0.21 | 0.24 | 0.21 | 0.48 |
Which family is likely to give better responses in a breeding programme, given equivalent selection intensities, and why? In such a breeding programme, which of the two characters (plant height or plant yield) is likely to give the better response to equivalent selection intensities, and why? Consistently higher average values of heritability were obtained at the farm near Boise. Which site, if either, is likely to provide the more accurate estimate of heritability, and why? The heritability estimates (particularly those for plant height for 95.BAR.69 in Moscow) varied greatly over the 3-year period. What could be the cause?
Cross code | MEAN | VAR | ![]() |
DF.33.111 | 22.3 | 16.7 | 0.45 |
DF.66.123 | 24.6 | 14.3 | 0.51 |
DF.97.37 | 28.1 | 6.3 | 0.47 |
DF.97.332 | 26.1 | 15.3 | 0.11 |
DF.99.1 | 22.5 | 10.2 | 0.84 |
DF.99.131 | 18.9 | 26.1 | 0.75 |
Mean | ![]() |
|
92.WW.46 | 236.60 | 127.34 |
92.WW.53 | 199.33 | 191.40 |
92.WW.54 | 241.82 | 91.43 |
92.WW.61 | 142.00 | 119.10 |
92.WW.71 | 133.11 | 125.46 |
92.WW.74 | 236.73 | 102.14 |
92.WW.93 | 233.55 | 281.77 |
92.WW.108 | 201.22 | 106.63 |
92.WW.111 | 229.37 | 299.39 |
92.WW.116 | 169.11 | 119.32 |
Also grown in the same trial were five commercial cultivars. The average performance of the commercial cultivars was 211.10 kg and the standard deviation was 41.83 kg. Using univariate cross prediction procedures, determine which three crosses should be used in breeding for cultivars that would need to produce yields:
Rank your choices as first, second and third and include the probabilities used for your decision.
In this same study, the following phenotypic variances and site means for yield (over all 500 lines) were:
Hillside | Nethertown | |
Phenotypic variance | ![]() |
![]() |
Site mean | 27 kg | 29 kg |
Given these data and the heritability estimated above, determine the expected response to selection at 10% level.
The following are average plant heights (over four replicates) of a half diallel (including selfs) between sweet cherry cultivars.
 | Golden Glory | Early Crimson | Sweet Delight | Dwarf Evens | Giant Red |
Golden Glory | 112 | | | | |
Early Crimson | 72 | 53 | | | |
Sweet Delight | 102 | 64 | 99 | | |
Dwarf Evens | 56 | 41 | 65 | 49 | |
Giant Red | 130 | 100 | 109 | 107 | 115 |
Estimate the general combining ability of each parent and calculate the specific combining ability of each cross. Which parents would be ‘best’ in a breeding programme to develop cultivars with short heights?
Eight yellow mustard breeding lines were grown at 20 locations throughout the Pacific Northwest region. Significant genotype × environment interactions were detected for yield, both by analysis of variance and by a joint regression analysis. The following are line means and environmental sensitivity values from the analysis.
Line | Overall mean | Environmental sensitivities |
90.EW.34.5 | 1,234 | 0.55 |
90.JB.456 | 1,890 | 1.05 |
90.JB.562 | 2,345 | 1.34 |
91.HG.12 | 1,897 | 0.76 |
91.HH.145 | 1,976 | 0.52 |
92.22.12 | 2,567 | 1.42 |
92.AE.1 | 2,156 | 0.83 |
92.HK.134 | 2,152 | 1.72 |
Select the two ‘best’ breeding lines that show general environmental adaptability. Select the two ‘best’ breeding lines that show specific environmental adaptability.