4.5.1 Model Selection: Illustration I, Loading Times

In heavy construction operations, variations in the time required for various steps, for example, in the loading, the hauling, the dumping, and the return of individual dump trucks, affect the interaction between pieces of equipment. For example, owing to fluctuations in truck-arrival times, a power shovel loading the trucks may have to wait for the next truck to return, or several trucks may have to wait in a queue before being filled. An engineer is developing a large computer program to simulate such operations. It will be used by project engineers to plan future construction operations more effectively. It could be used, for example, to predict the best number of trucks to use with a shovel in order to minimize the expected losses due to waiting equipment.

As just one part of this program the engineer must decide what distribution to use for the time required to load a truck. Of particular concern to him is the type of distribution to be used. The specific values of the distribution’s parameters for a particular job will be input by future users of the computer program.

Although he recognizes that the total time of each loading operation is made up of the sum of the times required for each of a (small) number of buckets of material, there is no strong prior physical justification for a particular mathematical model. On the other hand, facility in model description and manipulation favors one of the common distributions. The engineer intends to choose from among several contenders on the basis of a small amount of data available to him from a previous job.

The data, a total of 18 values, are presented in Table 4.5.1 and in Fig. 4.5.1. The predominant characteristics are the shift away from the origin and the apparent skew toward higher values. The sample mean, standard deviation, and skewness coefficient are (Table 4.5.1):

image

Uniform model The simplest probability density function to use is the uniform distribution

image

image

Fig. 4.5.1 Loading-time data. (a) Histogram; (b) cumulative frequency polygon.

To compare this distribution with data, one must obtain estimates of the parameters a and b. These could be based on the method of moments (Sec. 4.1.1) or, more simply in this case, on the maximum-likelihood estimators (Prob. 4.20)

image

This density function and the corresponding CDF are plotted in Fig. 4.5.2 along with the observed histogram and cumulative frequency polygon. This simple distribution does not appear to be a reasonable representation of the phenomenon. It does not reproduce either the peak around 1.8 min or the relatively infrequent nature of the longer times.
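
In present-day practice this fit is easily carried out numerically. The sketch below (Python) is a minimal illustration of the maximum-likelihood estimators just cited; the array of loading times is hypothetical and merely stands in for the 18 observations of Table 4.5.1.

```python
import numpy as np

# Hypothetical loading times, in minutes (stand-ins for Table 4.5.1).
t = np.array([1.6, 1.7, 1.7, 1.75, 1.8, 1.8, 1.85, 1.9, 1.9,
              1.95, 2.0, 2.0, 2.1, 2.15, 2.25, 2.3, 2.5, 2.8])

# Maximum-likelihood estimates of the uniform limits (see Prob. 4.20):
a_hat, b_hat = t.min(), t.max()

def uniform_cdf(x):
    """Fitted uniform CDF, for comparison with the cumulative frequency polygon."""
    return np.clip((x - a_hat) / (b_hat - a_hat), 0.0, 1.0)

print(a_hat, b_hat, uniform_cdf(2.0))
```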

Table 4.5.1 Loading-time data and sample moments

image

image

Notice that for the uniform distribution, simple arithmetic graph paper is the appropriate probability paper. A straight line, marked “alternate CDF” in Fig. 4.5.2b, can be fit by eye on this probability paper to represent very well all but the relatively large observations, which it predicts to be impossible. If these tail values were unimportant, the uniform model might be justified solely on the basis of its simplicity. These occasional longer loading times may, however, have a significant influence on the performance of the operation. The presence or absence of these larger times may influence an engineer’s equipment-allocation decision, and consequently some reasonable attempt should be made to reflect them in the model chosen.

image

Fig. 4.5.2 Uniform model of loading time. (a) Uniform distribution, PDF; (b) uniform distribution, CDF.

Normal model A Gaussian distribution can also be considered. It is a very tractable model analytically and a very convenient distribution for hand calculations or computer simulations. Parameter estimates are:

image

The corresponding density function and CDF are shown in Figs. 4.5.3 and 4.5.4. The peak in the data is reproduced by the normal PDF, although it is shifted somewhat to the right. The values away from the mean are represented perhaps less well. Once again, on probability paper (Fig. 4.5.4, alternate model), it is clear that if the larger values could be ignored, a very good representation could be produced by a normal model with a smaller mean and variance. In other words, except for these larger observations, either a uniform distribution or a normal distribution would appear to represent the data very well. This is true despite the marked differences in the shapes of these two distributions. When the amount of data available is small, the data may seem to be fit very well by several dissimilar distributions. This is an inherent difficulty in empirical mathematical modeling (whether the models are probabilistic or not). Small samples simply do not contain sufficient information to permit fine discrimination among models.
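
A corresponding sketch for the normal model: the parameters are estimated by the sample mean and sample standard deviation, and the data are placed on normal probability paper by plotting each ordered observation against the standard normal quantile of its plotting position. The data array is again a hypothetical stand-in for Table 4.5.1.

```python
import numpy as np
from scipy import stats

# Hypothetical loading times, in minutes (stand-ins for Table 4.5.1).
t = np.array([1.6, 1.7, 1.7, 1.75, 1.8, 1.8, 1.85, 1.9, 1.9,
              1.95, 2.0, 2.0, 2.1, 2.15, 2.25, 2.3, 2.5, 2.8])

m_hat, s_hat = t.mean(), t.std(ddof=1)    # method-of-moments estimates

# Normal probability paper: ordered observations vs. standard normal quantiles
# of the plotting positions i/(n + 1); a straight line indicates a good fit.
n = len(t)
z = stats.norm.ppf(np.arange(1, n + 1) / (n + 1))
paper_points = list(zip(np.sort(t), z))

print(m_hat, s_hat, paper_points[0])
```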

image

Fig. 4.5.3 Normal model of loading time. (a) Normal distribution, PDF; (b) normal distribution, CDF.

image

Fig. 4.5.4 Normal model of loading time on normal probability paper.

The intended use of this model again dictates that a more determined attempt be made to represent the skew of the data. Several convenient distributions capable of representing positive skew will be considered: the gamma distribution, the lognormal distribution, and the beta distribution.

Gamma model The gamma distribution, Sec. 3.2.3, has positive skewness and might be considered as a possible way to represent the data here. The PDF is [Eq. (3.2.18)]:

image

with

image

image

and

image

If the method of moments is used to estimate parameters k and λ, we must solve two equations for the estimates image and image:

image

The solution is

image

The implication is that the skewness is

image

which is significantly less than the observed sample value of 0.97. In fact, with k so large, this gamma distribution is not very different from a normal distribution (Sec. 3.3.1); hence the same weaknesses will be encountered in the use of the gamma distribution as found with the normal distribution, but without the relative convenience of the latter.
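
A sketch of the arithmetic, using the standard gamma moment relations (mean k/λ, variance k/λ²) and hypothetical sample moments in place of the Table 4.5.1 values:

```python
# Gamma model by the method of moments: mean = k/lambda, variance = k/lambda**2.
# Hypothetical sample moments; substitute the values from Table 4.5.1.
m_bar, s = 1.93, 0.24

k_hat = (m_bar / s) ** 2        # solve the two moment equations for k ...
lam_hat = m_bar / s ** 2        # ... and for lambda

skew_implied = 2.0 / k_hat ** 0.5   # gamma skewness coefficient, 2/sqrt(k)
print(k_hat, lam_hat, skew_implied)
```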

Shifted gamma model The major characteristics of the observed shape can be reproduced, however, if a shifted gamma distribution is used. Such a distribution will have the following form (Sec. 3.5.1):

image

in which a is an added, shifting parameter. Since T – a has the same distribution as T in Eq. (4.5.2), it is clear that the mean will become

image

while the variance, standard deviation, and skewness coefficient, all being related to central moments, will remain unchanged in their relationship to the parameters.

Again using the method of moments, the parameter estimates become the solution of the three equations

images

The solutions are

images

This shifted gamma distribution is plotted in Fig. 4.5.5. It is clear that this distribution is capable of reproducing all the pertinent properties observed in the data. This ability, combined with mathematical tractability and the wide availability of tables and computer routines for evaluating gamma functions, makes the shifted gamma distribution a strong contender to represent loading time, at least with the data available.
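
A numerical sketch of the three-equation solution (mean a + k/λ, variance k/λ², skewness 2/√k), again with hypothetical sample moments standing in for those of Table 4.5.1:

```python
# Shifted gamma by the method of moments:
#   mean = a + k/lambda,  variance = k/lambda**2,  skewness = 2/sqrt(k).
# Hypothetical sample moments; substitute the Table 4.5.1 values.
m_bar, s, g1 = 1.93, 0.24, 0.97

k_hat = 4.0 / g1 ** 2              # from the skewness equation
lam_hat = k_hat ** 0.5 / s         # from the variance equation
a_hat = m_bar - k_hat / lam_hat    # from the mean equation

print(a_hat, k_hat, lam_hat)
```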

Shifted lognormal model A second distribution that can successfully be fitted to data of this shape is a shifted lognormal distribution:

images

in which σln Y is the standard deviation of the natural logarithm of Y = T – a and mln Y is the mean of ln Y.

images

Fig. 4.5.5 Shifted gamma model of loading time. (a) Shifted gamma distribution, PDF; (b) shifted gamma distribution, CDF.

Method-of-moments estimation of the parameters is somewhat more awkward, but it is straightforward. Since the standard deviation and skewness coefficient of Y equal those of T, which are in turn estimated by the corresponding sample moments, we can solve Eq. (3.3.38) for the mean of Y. That equation is

images

implying

images

With estimates of σY and γ1 this cubic equation can be easily solved by trial and error for an estimate of mY. Then the coefficients σln Y and mln Y can be estimated as in Sec. 3.3.2 [Eqs. (3.3.35) and (3.3.36)], and the shift parameter is simply

images

The results are

images
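
The trial-and-error step is easily carried out (or avoided by solving the cubic directly). The sketch below follows the procedure just described with hypothetical sample moments, and it assumes the standard lognormal moment relations, σ²ln Y = ln(1 + V²) and mln Y = ln mY − σ²ln Y/2, as the forms of Eqs. (3.3.35) and (3.3.36).

```python
import numpy as np

# Shifted lognormal: Y = T - a is lognormal.  Hypothetical sample moments of T;
# substitute the Table 4.5.1 values.
m_T, s_T, g1 = 1.93, 0.24, 0.97

# Skewness relation (Eq. 3.3.38): g1 = 3*V + V**3, with V = sigma_Y / m_Y.
# The cubic has one real root, which is positive; the complex pair has
# negative real part, so the maximum of the real parts is the root wanted.
V = max(np.roots([1.0, 0.0, 3.0, -g1]).real)

m_Y = s_T / V                                # since sigma_Y = s_T
sig_lnY = np.sqrt(np.log(1.0 + V ** 2))      # assumed form of Eq. (3.3.35)
m_lnY = np.log(m_Y) - 0.5 * sig_lnY ** 2     # assumed form of Eq. (3.3.36)
a_hat = m_T - m_Y                            # shift parameter

print(a_hat, m_lnY, sig_lnY)
```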

images

Fig. 4.5.6 Shifted lognormal model of loading time. (a) Shifted lognormal distribution, PDF; (b) shifted lognormal distribution, CDF.

images

Fig. 4.5.7 Shifted lognormal model of loading time on probability paper.

The shifted lognormal distribution with these parameters is plotted in Figs. 4.5.6 and 4.5.7. Everything said above for the shifted gamma distribution also holds for this shifted lognormal distribution. It is capable of reflecting the observed shift and skew, and it is a reasonably tractable distribution. Without further information it must be considered as good an empirical model as the shifted gamma distribution.

Beta model As a final consideration, a beta distribution (Sec. 3.4.2) will be fit. In general this distribution has four parameters [Eq. (3.4.17)], here denoted p, r, a, and b.

images

With the wide variety of shapes possible (Fig. 3.4.1) and with a total of four parameters (one more than either of the previous two distributions) free to be used in the fitting process, the beta distribution can be made to reproduce very closely virtually any observed histogram. If the method of moments is used for estimation, it is clear that at least the first four moments of the data would be matched identically. In practice, however, other factors may influence the values of the parameters used. The four equations which must be solved to obtain the estimates can only be solved by arduous trial and error in several variables simultaneously. Tables are not complete for a wide range of the parameters. The estimates of the limits a and b may turn out to be such that an observed value lies outside of them. This failure in the model may or may not be important, depending on its intended use.

In this specific case it would be desirable to use a value of r greater than unity in order to have the density function zero at the lower limit (Sec. 3.4.2). In fact, the reader can verify that a beta distribution with a = 1.58, b = 3.44, r = 1.5, and p = 8.0 has mean and standard deviation equal to the observed values, a skewness coefficient of 0.96 (which is very close to the observed value of 0.97), and a shape very much like the shifted gamma distribution. Lack of convenient tables for noninteger r suggests, however, the use of r = 1.0 and p = 5.0, which yield a skewness coefficient [Eq. (3.4.14)] of 1.05. Solving Eqs. (3.4.18) and (3.4.19) for a and b estimates gives â = 1.63 and images.
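
These figures can be checked numerically. The sketch below assumes that the text's parameters map to the standard beta shape parameters as α = r and β = p − r (an assumed reading of the parameterization of Eq. (3.4.17), under which r = 1 gives a nonzero density at the lower limit); with that reading both quoted skewness coefficients are reproduced.

```python
from scipy import stats

def beta_skewness(r, p):
    """Skewness coefficient of a beta distribution, assuming the shape
    parameters are alpha = r and beta = p - r (independent of the limits a, b)."""
    return float(stats.beta(a=r, b=p - r).stats(moments='s'))

print(beta_skewness(1.5, 8.0))   # close to the quoted 0.96
print(beta_skewness(1.0, 5.0))   # close to the quoted 1.05
```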

This beta distribution is plotted in Fig. 4.5.8. The cumulative distribution seems to fit well, but one might object to the fact that the PDF does not start from a zero value at the lower limit. In addition, the estimated value of a is larger than the smallest observed value, 1.58. These facts, combined with the rather unsatisfactory parameter-estimation procedure, lead to the judgment that the beta distribution may not be as useful a model for the simulation program as either the shifted gamma or the shifted lognormal.

In the light of the available data there seems little evidence to dictate the choice of the shifted gamma over the shifted lognormal or vice versa. The final choice should be made on the basis of ease of use in the intended program. Consequently no general recommendations can be made here.

Closeness-of-fit statistics If the engineer needs for some reason to make a final selection between the remaining contending models in an “objective” way, influenced only by the data, a closeness-of-fit statistic such as the χ2 statistic (Sec. 4.4.2) can be used in a somewhat heuristic manner to accomplish this choice. It should be emphasized once again that goodness-of-fit hypothesis tests (Sec. 4.4.2) are not designed to choose among contending models but rather to suggest that a given proposed model should or should not be retained.

images

Fig. 4.5.8 A beta model of loading time. (a) Beta distribution, PDF; (b) beta distribution, CDF.

To use closeness-of-fit statistics to aid in the model-selection process we adopt an attitude analogous to that used in maximum-likelihood parameter estimation: We shall select that model for which the likelihood of the observed value of the corresponding closeness-of-fit statistic is the largest. (We do not, notice, select the model with the smallest value of the statistic.)

Consider the use of the χ2 statistic for choosing between the shifted gamma and shifted lognormal models. In this illustration we chose to lump the data into six intervals (of equal likelihood if the assumed model holds). The expected and observed numbers of observations are listed in Table 4.5.2. The observed numbers can be read directly from cumulative plots (Fig. 4.5.9).

In terms of this small number of intervals the shifted gamma distribution happened to yield a “perfect fit.” The statistic is zero. This may not, however, be as likely an outcome as the value ⅔ observed with the shifted lognormal model. To determine the more likely of the two observed values of the statistics, however, their distributions must be known. In this example, both models contain three data-estimated parameters. In general, of course, the contending models may have different numbers of data-estimated parameters. In both cases, then, the statistics have a χ2 distribution with ν = 6 – 1 – 3 = 2 degrees of freedom (Sec. 4.4.2). The expected value of a χ2 random variable is ν, or 2 in this case. The lognormal value of ⅔ is in fact closer to the expected value than is the gamma value of 0. But a maximum-likelihood approach is preferred here. The χ2 distribution with ν = 2 is sketched in Fig. 3.4.3 and happens to coincide with an exponential distribution with parameter ½. Clearly with a distribution of this shape, smaller values are more likely than larger ones. The observed value of the shifted gamma’s statistic, 0, is more likely than that of the shifted lognormal’s statistic, ⅔. The former distribution is therefore to be preferred on this basis.
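
A computational sketch of this comparison: the six equally likely intervals are handled most simply by transforming each observation through the fitted CDF, so that under the assumed model the transformed values are uniform on [0, 1]. Both the data and the fitted shifted gamma parameters below are hypothetical stand-ins.

```python
import numpy as np
from scipy import stats

# Hypothetical data and hypothetical shifted gamma fit (shape k, shift a, scale 1/lambda).
t = np.array([1.6, 1.7, 1.7, 1.75, 1.8, 1.8, 1.85, 1.9, 1.9,
              1.95, 2.0, 2.0, 2.1, 2.15, 2.25, 2.3, 2.5, 2.8])
model = stats.gamma(a=4.25, loc=1.43, scale=1.0 / 8.6)

# Six intervals, equally likely under the fitted model.
u = model.cdf(t)                                     # probability-integral transform
observed = np.histogram(u, bins=np.linspace(0.0, 1.0, 7))[0]
expected = len(t) / 6.0
d1 = np.sum((observed - expected) ** 2 / expected)   # chi-square statistic

# nu = 6 - 1 - 3 = 2; the preferred model is the one whose observed statistic
# has the larger chi-square(2) density (not necessarily the smaller statistic).
print(d1, stats.chi2(df=2).pdf(d1))
```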

Table 4.5.2 χ2 statistic evaluation

images

images

Fig. 4.5.9 Evaluation of χ2 statistic, loading time. (a) Shifted gamma; (b) shifted lognormal.

Discussion It is important to distinguish between the more familiar curve fitting associated with deterministic functional relationships between variables and the process here of selecting a probabilistic model on the basis of a comparison between mathematical functions (PDF’s and CDF’s) and observed values or shapes. In the latter case the closest fit cannot be said to be the best fit. To a point, as in the earlier part of this illustration, closeness of fit of a shape of distribution to the observed shape of data can be used to select from among models. This is reasonable for the gross characteristics (skewness, shift, etc.) which are more reliably represented by the data. At a finer scale, however, between distributions which reproduce the grosser characteristics equally well, the expected variation must be recognized and the closeness-of-fit criterion dropped. In its place either one can state that the data are insufficient to discriminate, in which case professional factors (convenience, etc.) will govern, or one may elect to choose on the basis of a maximum-likelihood criterion using a convenient statistic such as the χ2 statistic.

Without further evidence, no extrapolations should be made or physical significance placed on the distribution chosen to represent the observed data. In particular, in this illustration, the estimated lower limits of the shifted distributions should not be interpreted as values which will always be exceeded. One should place no great confidence in the predictions based on the tails of empirically adopted models. In the same sense that we expect the observed sample mean and sample standard deviation to be close but not perfect estimates of the variable’s moments, so the shape in the central portion and other gross characteristics of the fitted model, including its mean and standard deviation (which, of course, are usually chosen to equal the corresponding sample values), are probably close but not perfect representations of the same characteristics of the physical variable. As with the parameters, the larger the sample upon which it is based the more confidence one can have that these characteristics are close to the true ones.

If for some reason loading time proves to be a critical component in the total simulation of construction operations, the engineer would undoubtedly review his choice of distribution. More data from the same or other jobs would help, although combining the data from situations where the parameters are different is not a simple task. More data must be expected to alter the estimated moments, including the coefficients of variation and skewness; further, the additional data may even indicate that some of the gross shape characteristics of the adopted model are incorrect. We cannot stress too strongly the variation inherent in samples of small size and hence the limited confidence one can have in conclusions based on an observation of such a sample.

On the other hand, the engineer may prefer to study the loading operation more carefully in order to justify the adoption of a particular distribution as a reasonable model of the physical mechanism generating the variation. He might be guided in this investigation by knowledge of the distributions which he found to fit the data well and the mechanisms known to generate them (Chap. 3).

For example, in this case the fact that a gamma variable is the sum of other independent gamma variables (including the exponential variable) might suggest that the engineer investigate the validity of a model which recognizes that the total time is the sum of the times required to load a (small) number of power-shovel buckets into a truck. Each bucket load requires a fixed time plus a random time. If, for example, the number of buckets required was k, if the fixed time was a/k, and if the random times were exponential with common parameter λ, then the total loading time would have a shifted gamma distribution with parameters a, k, and λ, where k is an integer. This physical model could be verified by collecting appropriate data and testing goodness-of-fit hypotheses (Sec. 4.4.2) on the shifted gamma distribution and on the individual exponential distributions. Questions as to the validity of the assumptions of independence of these latter times, of a fixed number of buckets per load, etc., could be raised, but more elaborate models without these simplifying assumptions could always be constructed (with correspondingly more complicated distributions for T). The point is simply that the distribution suggested empirically by the data may be a starting point for a more detailed, more physically based investigation, if necessary.
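
Such a physical model is also easy to examine by simulation. The sketch below generates totals of the form “fixed time plus k exponential bucket times” and checks their first two moments against the shifted gamma values; all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical values: fixed time a, k buckets, exponential parameter lam.
a, k, lam = 1.4, 4, 8.0
n_sims = 100_000

# Total loading time = a + sum of k independent exponential bucket times.
t_sim = a + rng.exponential(scale=1.0 / lam, size=(n_sims, k)).sum(axis=1)

# Moments of the simulated totals vs. the shifted gamma values a + k/lam and k/lam**2.
print(t_sim.mean(), a + k / lam)
print(t_sim.var(), k / lam ** 2)
```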

4.5.2 Model Selection: Illustration II, Maximum Annual Flows

In this illustration it is desired to select a model to describe the observations of annual maximum flood flows on the Feather River at Oroville, California. The data for 1902 to 1960 are presented in Table 4.5.3. A histogram of this data appears in Fig. 4.5.10. The model will be used for economic studies in the design of small flood-control projects (dikes, channels, etc.) in the area.

Physical models Some engineers have given arguments that such floods, as results of a chain of influencing factors, should be governed by the lognormal distribution (Sec. 3.3.2); others have argued that, as the maximum of daily or weekly flows, such flows should be governed by an extreme-value distribution (Sec. 3.3.3). There are reasons for questioning either of these physical models. Both are used in practice, however, and in this case the engineer wishes to choose between these two models after looking at the data.

Table 4.5.3 Maximum annual flows, Feather River, Oroville, California

images

Qualitative comparisons Consider first the gross characteristics of the two models as compared to the data. The lognormal distribution is limited to positive values, as the data are, while the Type I extreme-value distribution is not. To overcome this limitation one might quite reasonably consider a Type II extreme-value distribution limited at zero; there is no evidence as to the form of the distribution of the underlying random variables whose maximum this flow represents, and hence there is no fundamental reason to prefer Type I to Type II. We chose here to continue to consider the Type I distribution, however, for two reasons. First, civil engineers who have used extreme-value distributions to describe floods have commonly used Type I only, probably simply for computational convenience. Our example here will retain this tradition because it will permit those familiar with the problem to compare the conclusions given here with similar studies. Second, and more important, it permits us to reinforce the idea that the use to which a model will be put strongly influences its choice. In this case, since the interest in flood prevention dictates a focus on the upper tail of the distribution, we need not be overly concerned with agreement in the lower tail. In this case the future engineering use permits a wider freedom in model selection than would have been the case had we been seeking a “scientific” description of all the observed data. In short we shall not reject the Type I extreme-value distribution in favor of the lognormal simply because one predicts no negative values while the other does.

images

Fig. 4.5.10 Annual maximum flood discharge, Oroville, California, 1902–1960.

The histogram of data indicates a decided skew to the right. Such skew can have a critical influence on upper-tail probabilities. Observing Figs. 3.3.3 and 3.2.3, it is clear that both models demonstrate skew in this same direction.

Probability paper plots The data are plotted on the appropriate probability paper for each of the models in Figs. 4.5.11 and 4.5.12. Straight lines have been fit by eye in both cases. The indicated bands, defined by the value that the Kolmogorov-Smirnov statistic would exceed with probability 20 percent, suggest that in neither case can the sample be considered unlikely, if in fact the model holds. Such bands provide a rough, but sample-size-dependent, initial screening. Smaller sample sizes would lead to even more widely spread curves.

Except for the anticipated discrepancies between the extreme-value model and the very small observed values, both models seem to be capable of representing the data adequately in the sense that systematic, major deviations of the data away from straight lines are not evident in either case. (This is true despite the fact that if either cumulative distribution were plotted on the probability paper of the other, a gentle curve would result, as the reader can easily verify.)
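
The width of the 20 percent Kolmogorov-Smirnov bands referred to above depends only on the sample size. A sketch of the computation, using the common large-sample approximation to the critical value (the coefficients below should be treated as approximate):

```python
import math

# Approximate large-sample Kolmogorov-Smirnov critical values, c_alpha / sqrt(n).
c_alpha = {0.20: 1.07, 0.10: 1.22, 0.05: 1.36, 0.01: 1.63}

n = 59                                     # annual maxima, 1902 through 1960
half_width = c_alpha[0.20] / math.sqrt(n)
print(half_width)   # vertical band (in cumulative probability) about the fitted line
```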

Use of a goodness-of-fit statistic The χ2 statistic can also be used to compare the two distributions in the manner suggested in the last section. In each case, a histogram with intervals chosen to be equally likely (if the particular model were true) was constructed. The results are tabulated in Table 4.5.4. A χ2 random variable with ν = 10 – 1 – 2 = 7 degrees of freedom has a mean of 7 and a mode, or most likely value, of ν – 2 or 7 – 2 = 5. Thus, the observed value of 6.6 associated with the lognormal assumption is quite close to the most likely value. On the other hand, the standard deviation of the χ2 variable is √2ν = √14 ≈ 3.7, implying that even the large value of 14.6 observed under the extreme-value assumption is only 2 standard deviations from the mean. At the same time the major contribution to this large value comes from a single interval in the lower tail, where, as discussed above, the failure to fit was anticipated and is not considered important. In short this statistic cannot be used successfully in this example to help in choosing between the distributions, as it cannot properly account for the engineering judgment that these smaller values are not of concern.
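
The arithmetic behind these statements is elementary; a short check of the two observed statistics against the χ2 distribution with ν = 7:

```python
from scipy import stats

nu = 7
chi2_7 = stats.chi2(df=nu)

print(chi2_7.mean(), nu - 2, chi2_7.std())    # mean 7, mode 5, sd = sqrt(14) = 3.74
print((14.6 - chi2_7.mean()) / chi2_7.std())  # extreme-value statistic: about 2 sd out
print(chi2_7.pdf(6.6), chi2_7.pdf(14.6))      # relative likelihoods of the observed values
```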

images

Fig. 4.5.11 Annual-floods illustration: extreme-value paper.

images

Fig. 4.5.12 Annual-floods illustration: lognormal paper.

Table 4.5.4

images

Discussion In summary, the available tools provide no strong reason to favor one model over the other as providing a “better” description of the observed data. The data are simply insufficient to permit a clear-cut choice between them.

The final choice must necessarily remain a professional one. The engineer must consider the validity of the physical justifications for the models, the implications of the two models, and the consequences associated with an improper choice. In this case, the authors favor the extreme-value physical argument over the multiplicative one. The implications in terms of relative computational convenience depend upon such factors as the availability of electronic computational facilities and the characteristics of a possible subsequent, larger model, of which this distribution will become a part. (For example, if the flood level is going to be the divisor in a ratio of capacity to demand, representing a “safety factor,” computations with the lognormal distribution may very well be simpler since X^-1 is also lognormal if X is.)

The consequences of an improper model choice will lie, in this case, in a nonoptimal design for the flood-control facilities. Notice that larger floods are predicted to be more frequent by the lognormal model than by the extreme-value model. For example, the 50-year flood in the former case is 300,000 cfs, while in the latter it is only 220,000 cfs. If, for example, the lognormal model is used when the extreme-value model is in fact more accurate, overconservative capacities will be designed. The extra money spent on the initial investment in these structures would no longer be available to construct similar flood facilities in another location. On the other hand, if the extreme-value model were used as a basis for design when the lognormal model was a better representation, smaller than optimal capacities might result with a subsequently higher risk of flood damage than appropriate. If time and design costs permit it, the engineer might develop preliminary designs on the basis of both models to assess the sensitivity of the final decision, namely the facility’s capacity and cost, to his choice of model. In many diverse problems this sensitivity has been found to be significantly smaller than one might expect. Finally, of course, the engineer must simply make what appears to him at the time, with the evidence available, to be the better choice.
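
For reference, the T-year flood under any fitted model is simply the flow with annual exceedance probability 1/T, i.e., the (1 − 1/T) quantile of the fitted distribution. The parameter values in the sketch below are hypothetical placeholders, not the fits of Figs. 4.5.11 and 4.5.12.

```python
from scipy import stats

p = 1.0 - 1.0 / 50.0          # cumulative probability for the 50-year flood

# Hypothetical fitted parameters (cfs); substitute values read from the
# probability-paper fits.
lognormal = stats.lognorm(s=0.8, scale=60_000)         # scale = exp(mean of ln X)
type_one = stats.gumbel_r(loc=55_000, scale=35_000)    # Type I, largest values

print(lognormal.ppf(p), type_one.ppf(p))   # predicted 50-year floods under each model
```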

4.5.3 Summary

In summary, the selection of a convenient distribution to represent a random variable when observed empirical data are the only source of information is a two-part process:

1. The first step is an initial qualitative screening of contending distributions to eliminate those that are not capable of reproducing the important, gross characteristics of the data. In this gross sense closeness of fit is the criterion governing the choices.

2. The final selection step is one in which, if professional reasons have not already dictated the choice, a quantitative “model estimation” procedure may be used. Here maximum likelihood rather than closeness of fit is a more reasonable criterion.

    In both phases professional judgment will influence the choices. Such judgment is necessary to weigh the relative importance of computational tractability, accuracy of representation, and failure of a model to reproduce certain aspects displayed in data (limits, skew, etc.). In all such considerations the ultimate use of the model will be the single critical factor. If, for example, the model will be used in part of an engineering decision-making process, two models are equally accurate if the same decision is reached in either case. If they are equally accurate in this sense, the distribution which is easier to use is to be preferred.

Caution must be urged in any form of extrapolation of empirically fit distributions. The data are seldom sufficient to discriminate among at least two or three acceptable distributions, yet each may predict quite different probabilities for particular events. The discrepancies may be particularly large in the tails of these distributions; here there may exist order-of-magnitude differences in the small probabilities based on extrapolations of different distributions, even though these distributions appear to fit central observed data equally well.

With small samples, two distributions of quite dissimilar density functions may both seem to fit the data, particularly if viewed on graphs as cumulative distributions. Histograms of the same small set of data may take on different shapes by merely changing interval lengths and starting points. Density functions of one shape may, simply by the nature of random variables, yield a histogram of quite dissimilar shape (Sec. 4.4.1). Such an event is more likely with smaller samples.

In short, there may be need for and convenience in looking at data, choosing some mathematically defined distribution, and attributing to the physical variable the properties of this model, but the engineer must at all times be aware of the limitations and potential inaccuracies of the process. He must keep these difficulties in mind when assessing engineering predictions and decisions based on the adopted model.

4.6 SUMMARY OF CHAPTER 4

This chapter presents the conventional statistical methods available for dealing with the uncertainty in the parameters of a probabilistic model and with the uncertainty in the form of the model itself. In both cases real data, representing observations of a random phenomenon, must be available. The degree of uncertainty in the parameters and in the model is inversely related to the amount of data available.

Section 4.1 considers parameter estimation. Two methods, the method of moments and the method of maximum likelihood, are introduced for generating rules for extracting estimates from data. These rules or estimators are functions of the random variables X1, X2, . . ., Xn in the sample, and therefore are themselves random variables. Study of the moments and distributions of estimators permits comparison among contending estimators or rules, and permits quantitative statements to be made about the relationship between the estimator and the parameter it estimates (e.g., bias, mean square error, and confidence limits). Confidence limits, although not strictly probabilities, represent quantitative statements about the degree of uncertainty in a parameter.

A significance (or hypothesis) test is a scheme for extracting significance from dispersed statistical data. It provides a quantitative comparison between observed deviations and the deviations expected in light of the probabilistic nature of the phenomenon. If the deviation of the estimate (or sample statistic) from the hypothesized value (or values) is larger than considered “likely,” it is said to be statistically significant and the hypothesis is rejected. The measure of what is “likely” or not is the significance level α.

The methods of estimation and hypothesis testing are applied in Sec. 4.3 to a simple multivariate linear model of the form

images

More general models of the linear form are also discussed. They form a widely used class of models of which the reader should be aware.

The form of the model itself is evaluated in Sec. 4.4. The “shape” of the data is compared with that of the model. Proper evaluation requires an appreciation of the inherent variability of the histogram or cumulative histogram of the data and the factors upon which that variability depends. Again, significant differences between data and model can be evaluated by conventional hypothesis tests, namely by the χ2 (chi-square) or Kolmogorov-Smirnov goodness-of-fit tests. These tests are designed to evaluate a particular model, not to compare several models.

Finally, the problem of choosing from among several empirical models given statistical data is discussed (Sec. 4.5). Professional as well as closeness-of-fit criteria are important. Quantitative statistics can be used in a maximum-likelihood manner, but it must be recognized that the smallest value of, say, a χ2 statistic is not necessarily the most likely value.

REFERENCES

General

Bowker, A. H. and G. J. Lieberman [1959]: “Engineering Statistics,” Prentice-Hall, Inc., Englewood Cliffs, N.J.

Brownlee, K. A. [1960]: “Statistical Theory and Methodology in Science and Engineering,” John Wiley & Sons, Inc., New York.

Brunk, H. D. [1960]: “An Introduction to Mathematical Statistics,” Ginn and Company, Boston.

Cramer, Harald [1946]: “Mathematical Methods of Statistics,” Princeton University Press, Princeton, N.J.

Freeman, H. [1963]: “Introduction to Statistical Inference,” Addison-Wesley Publishing Company, Inc., Reading, Mass.

Fisz, M. [1963]: “Probability Theory and Mathematical Statistics,” 3d ed., John Wiley & Sons, Inc., New York.

Graybill, F. A. [1961]: “An Introduction to Linear Statistical Models,” vol. I, McGraw-Hill Book Company, New York.

Hald, A. [1952]: “Statistical Theory with Engineering Applications,” John Wiley & Sons, Inc., New York.

Lindgren, B. W. and McElrath, G. W. [1966]: “Introduction to Probability and Statistics,” 2d ed., The MacMillan Company, New York.

Lindley, D. V. [1965]: “Introduction to Probability and Statistics,” part 1, “Probability,” part 2, “Inference,” Cambridge University Press, Cambridge, England.

Neville, A. M. and J. B. Kennedy [1964]: “Basic Statistical Methods for Engineers and Scientists,” International Textbook Company, Scranton, Pa.

Rao, C. R. [1965]: “Linear Statistical Inference and Its Applications,” John Wiley & Sons, Inc., New York.

Specific text references

Alexander, B. A. [1965]: Use of Micro-Concrete Models to Predict Flexural Behavior of Reinforced Concrete Structures Under Static Loads, M. I. T. Dept. Civil Engr. Res. Rept. R65–04, Cambridge, March.

Allen, C. R., P. St. Amand, C. F. Richter, and J. M. Nordquist [1965]: Relationship between Seismicity and Geologic Structure in the Southern California Region, Bull. Seismol. Soc. Am., vol. 55, pp. 753–797.

Beard, L. R. [1962]: “Statistical Methods in Hydrology,” U.S. Engineer District, Army Corps of Engineers, Sacramento, Calif., January.

Beaton, J. L. [1967]: Statistical Quality Control in Highway Construction, ASCE Conf. Preprint 513, Seattle, Washington, May.

Bendat, J. S. and A. G. Piersol [1966]: “Measurement and Analysis of Random Data,” John Wiley & Sons, Inc., New York.

Birnbaum, Z. E. [1952]: Numerical Tabulation of the Distribution of Kolmogorov’s Statistic for Finite Sample Size, J. Am. Statist. Assoc., vol. 47, pp. 425–441.

Chow, V. T. and S. Ramaseshan [1965]: Sequential Generation of Rainfall and Runoff Data, J. Hydraulics Div., Proc. ASCE, vol. 91, HY4, pp. 205–223, July.

Gerlough, D. L. [1955]: “The Use of Poisson Distribution in Highway Traffic,” The Eno Foundation for Highway Traffic Control, Saugatuck, Conn.

Granholm, H. [1965]: “A General Flexural Theory of Reinforced Concrete,” John Wiley & Sons, Inc., New York.

Grant, E. L. [1938]: Rainfall Intensities and Frequencies, ASCE Trans., vol. 103, pp. 384–388.

Greenshields, B. D. and F. M. Weida [1952]: “Statistics with Applications to Highway Traffic Analysis,” The Eno Foundation for Highway Control, Saugatuck, Conn.

Gumbel, E. J. [1958]: “Statistics of Extremes,” Columbia University Press, New York.

Haight, F. A. [1963]: “Mathematical Theory of Traffic Flow,” Academic Press, Inc., New York.

Johnson, A. I. [1953]: Strength, Safety, and Economical Dimensions of Structures, Roy. Inst. Technol. Div. Bldg. Statics Structural Eng. (Stockholm), Bull. 12.

Langejan, A. [1965]: Some Aspects of the Safety Factor in Soil Mechanics, Considered as a Problem of Probability, Proc. 6th Intern. Conf. Soil Mech. Foundation Des., Montreal.

Liu, T. K. and T. H. Thornburn [1965]: Statistically Controlled Engineering Soil Survey, Univ. Illinois Civil Eng. Studies Soil Mech. Ser., no. 9, January.

Mann, H. B. and A. Wald [1942]: On the Choice of the Number of Class Intervals in the Application of the Chi-square Tests, Ann. Math. Statist., vol. 13, p. 306.

Markovic, R. D. [1965]: Probability Function of Best Fit to Distributions of Annual Precipitation and Rainfall, Colorado State Univ. Hydrology Paper No. 8, August.

Massey, F. J. [1951]: The Kolmogorov Test for Goodness of Fit, J. Am. Statist. Assoc., vol. 46, pp. 68–78.

Mills, W. H. [1965]: Development of Procedures for Using Statistical Methods for Process Control and Acceptability of Hot Mix Asphalt Surface, Binder, and Base Course, Proc. Highway Conf. Res. Develop. Quality Control Acceptance Specifications Using Advan. Technol., Washington, D.C., pp. 304–378, April.

Natrella, M. G. [1963]: “Experimental Statistics,” National Bureau of Standards Handbook 91.

Pattison, A. [1964]: Synthesis of Rainfall Data, Stanford Univ. Dept. Civil Eng. Tech. Rept. No. 40, July.

Pieruschka, E. [1963]: “Principles of Reliability,” Prentice-Hall, Inc., Englewood Cliffs, N.J.

Shook, J. F. [1963]: Problems Related to Specification Compliance on Asphalt Concrete Construction, Proc. Highway Conf. Res. Develop. Quality Control Acceptance Specifications Using Advan. Technol., vol. 1, Washington, D.C., April.

Venuti, W. J. [1965]: A Statistical Approach to the Analysis of Fatigue Failure of Prestressed Concrete Beams, J. Am. Concrete Inst., ACI Proc., vol. 62, pp. 1375–1394, November.

Vesserau, A. [1958]: Sur les conditions d’application du critère χ2 de Pearson, Rev. Statist. Appl., vol. 6, p. 83.

Vorlicek, M. [1963]: The Effect of the Extent of Stressed Zone upon the Strength of Material, Acta Tech. Csav., no. 2, pp. 149–176.

Wald, A. and J. Wolfowitz [1943]: An Exact Test for Randomness in the Nonparametric Case Based on Serial Correlation, Ann. Math. Statist., vol. 14, pp. 378–388.

Williams, C. A. [1950]: On the Choice of the Number and Width of Classes for the Chi-square Test of Goodness of Fit, J. Am. Statist. Assoc., vol. 45, p. 77.

Tables

Burington, R. S. and D. C. May, Jr. [1953]: “Handbook of Probability and Statistics with Tables,” McGraw-Hill Book Company, New York.

Hald, A. [1952]: “Statistical Tables and Formulas,” John Wiley & Sons, Inc., New York.

Owen, D. B. [1962]: “Handbook of Statistical Tables,” Addison-Wesley Publishing Company, Inc., Reading, Mass.

Pearson, E. S. and H. O. Hartley [1966]: “Biometrika Tables for Statisticians,” Cambridge University Press, Cambridge, England.

PROBLEMS

4.1. Use a table of random numbers to take five independent samples, each of size 10 observations, of a random variable uniformly distributed on the interval 0 to 100. From each of the five samples estimate the mean and standard deviation of the distribution.

All the estimates produced by the class will be plotted in a histogram and their mean and standard deviation will be estimated. What are the true values of these moments of the estimator of the mean?

4.2. The results of testing samples from 15 very large lots of concrete pipe for in-place porosity were as follows. Each sample included 100 sections of pipe:

Lot no. i No. failing to meet standard, ni
1 1
2 7
3 3
4 5
5 4
6 2
7 6
8 9
9 1
10 3
11 4
12 8
13 0
14 1
15 6

(a) Using all 1500 tests, estimate the fraction of failures, p, by the observed fraction of failures.

(b) Using a normal approximation for the distribution of the estimator in part (a), form and compute 95 percent confidence limits on p. You will need an estimate of the variance of the estimator. Determine this variance as a function of p and estimate the variance by substituting the estimate of p.

(c) Of what form is the true distribution of Ni/100, the estimator, for a sample size of 100? Hint: What is the distribution of Ni?

(d) Plot a histogram of these 15 observed values of the estimator images Compare this to a bar chart of the true distribution of the estimator given that p = 0.04. Compare, too, the variance of this estimator with the observed sample variance of the 15 values of images.

4.3. The following 31 values of annual maximum runoff (in cfs) were observed in the Weldon River at Mill Grove, Missouri, from 1930 to 1960: 108.0, 53.6, 585.0, 98.1, 40.6, 472.0, 96.5, 217.0, 42.7, 208.0, 143.0, 93.7, 398.0, 298.0, 248.0, 441.0, 386.0, 567.0, 122.0, 151.0, 244.0, 400.0, 245.0, 114.0, 659.0, 132.0, 44.0, 72.5, 135.0, 635.0, 508.0.

(a) Using these data, estimate the parameters of a normal distribution model.

images

(b) What are the “95 percent two-sided confidence limits” on mX assuming that σ is known with certainty to equal 180 cfs?

(c) Estimate the parameters of a lognormal distribution model.

images

4.4. Simulation sample-size determination. When designing a simulation study, it is important to determine how many times to sample the system behavior. Suppose it is desired to estimate the probability pi that some event occurs (e.g., maximum system response exceeds some critical value). The obvious estimator of pi is

images

in which Ni is the observed number of occurrences of the event in a sample of size n. Ni has a binomial distribution, but for large enough values of npi, Ni is approximately normal. Use this approximation to

(a) Find (1 – α) 100 percent two-sided confidence limits images on the true Pi given n and an observed value of the estimator images Replace pi by images where necessary.

(b) Show that the necessary sample size to insure that the (1 – α) 100 percent confidence limits are within γ100 percent of the true value of pi (that is, that cα ≤ γpi) is

images

where kα/2 is that value which a standardized normal random variable exceeds with probability α/2. The answer is a function of pi which is unknown before the experiment is performed. This implies that the engineer must estimate the value of pi before the experiment.

(c) In a rainfall generation study (Chow and Ramaseshan [1965]) it was desired to estimate pi (which was thought to be about 0.15) to within γ100 percent = 20 percent of its true value with confidence (1 – α) 100 percent = 90 percent; find the required n.

4.5. Data from an AASHO road test revealed that the deviations from the planned thickness of pavement layers were very nearly normally distributed with

images

The sample size was 4217.

(a) What are the 95 percent confidence limits on m assuming

(i) σ = 0.195 in.

(ii) σ is not known with certainty.

(b) Adopting the normal model using images and s as point estimates of m and σ, what is the probability that a particular specimen will exceed the tolerances – 0.02 ft and +0.04 ft?

4.6. Control charts. To aid in controlling manufacturing processes, control charts on the sample mean are maintained by engineers responsible for quality control. If the mean m and standard deviation σ of the strength, say, of the material being produced are known, one can compute limits of a control band, images outside of which the average of a sample of size n should fall only very rarely (k usually equals 3) unless something has gone “wrong” with the process. By periodically (say, daily) taking a sample of size n and observing the average, the engineer can ascertain whether the process is remaining stable, i.e., within the limits of natural variation. The values of m and σ are presumably known from experience or are estimated during a stable period.

One control chart for a plant producing binder mix for a highway surface appeared as shown on page 505. The characteristic involved was the percent of the aggregate passing a no. 4 sieve. The parameters, estimated over a long stable period, are m = 40.0 and σ = 4.76. Each day a sample of size five was taken. The control band is images. Corrective action to the process is indicated by the observed value of images in the fourteenth lot. [Conventional (nonstatistical) specifications tolerances were, incidentally, 40 ± 4.0 percent.] Some control schemes call for additional testing if an observation lies outside, say, images
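
A sketch of the band computation for the binder-mix example just described (m = 40.0, σ = 4.76, samples of size five, k = 3):

```python
import math

# Control band m +/- k * sigma / sqrt(n) for the binder-mix example.
m, sigma, n, k = 40.0, 4.76, 5, 3
half_width = k * sigma / math.sqrt(n)
print(m - half_width, m + half_width)   # limits on the daily sample average
```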

images

Fig. P4.6 Control-chart illustration.

Construct a control chart with k = 3 for the following reported concrete test data. Averages are of samples of size 10.

Lot no. Average ultimate compressive strength, psi
1 4004
2 3845
3 4195
4 4170
5 4043
6 3695
7 3900
8 3730
9 3670
10 3667

The mean has been estimated to be 3868 psi, and σ is 318 psi. (The specified strength was 3500 psi.) Is process correction indicated at any point? Under the assumption that the sample mean is normally distributed, what is the likelihood that for any single lot process-control correction will be initiated unnecessarily, i.e., that the excursion of images beyond the control band is simply a rare random deviation? What is the probability of at least one unnecessary correction during a job of 100 days or lots?

4.7. Specifications have been written to control concrete pavement thickness during construction which state that the proportion of slab less than 6 in. thick must not exceed 10 percent.

(a) To control his production the contractor periodically measures the thickness just as it is poured. The contractor sets up a null hypothesis that states that by his present operating procedure the true proportion p of such thicknesses is just 0.1. He must go to the expense of changing his method of operation if forced to reject this hypothesis, and so he wants to set α at least as low as 10 percent. With a sample size of five, what should the value of c be for a test of the form: accept H0 if R, the number of specimens less than 6 in. thick, is less than c?

(b) The state inspector, on the other hand, must take the position that the major loss will occur if he mistakenly accepts a poor-quality job. He has cores cut from the slab after it is completed. He sets up the null hypothesis that p, the proportion of specimens less than 6 in. thick is 0.2, that is, that the job is inadequate. He will only reject this hypothesis for the alternative that the proportion is only 0.1 if more than c specimens out of n are greater than 6 in. thick. To keep the risk of accepting a poor job low, he wishes to make the significance level less than 5 percent. He recognizes, however, that if he rejects the job (accepts the null hypothesis), the contractor will make many additional tests to attempt to demonstrate a good job. If the contractor proves that p is in fact 0.1, the state must bear the expense of the added testing and delay. Therefore the inspector wants to keep the type II error at least as low as 20 percent. How many specimens must he take to satisfy both criteria? (Assume in this part that the number is large enough that the binomial variable R is approximately normally distributed with the same mean and variance.)

Remark. The problem here is illustrative of the common statistical decision problems called “quality control” and “acceptance sampling.” Note that one is carried out by the manufacturer and the other by the purchaser, not necessarily with the same α and β probabilities. More commonly the problem is set up with composite rather than simple hypotheses, for example, p ≥ 0.1 versus p < 0.1. Many interesting problems can be explored. The contractor will usually carry out a two-sided test, for example, p = 0.1 versus p ≠ 0.1, to avoid wasting material and profit. (See Prob. 4.6 on quality control charts.) To arrive at his proper sample size, the contractor must ask himself what is his acceptable β probability for failing to catch an out-of-adjustment operation. If p is too high and the job is rejected, he may have to redo it at a loss. Both parties should ask how expensive the observations are. (It is suggested in the problem statement that they are cheaper for the contractor than the inspector. This factor should in principle influence the error probabilities. The interested reader should reconsider this problem after studying Chap. 6.)

4.8. Since, during a slip of the soil, small strengths in parts of a slope of generally homogeneous soil will be compensated for by larger strengths elsewhere, it has been suggested that the stability of the slope is related directly to the spatial average strength of the material. If the strength value (say, the shear strength) used in design is the observed mean images divided by a safety factor, find the probability that the design value used will be less than the true mean if a sample size of three is used and safety factor = 1.5. Assume that the underlying variable X is normally distributed with unknown mean and variance. Hint: images has, within some constants, the t distribution.

4.9. In a test of the hypothesis that vehicle speeds are normally distributed, a χ2 test was applied to a sample size of 100. The estimated mean and standard deviation were 40.8 and 9.31 mph, respectively. The number of intervals used in the test was seven; the observed statistic was d1 = 4.506. Is the evidence such as to reject the hypothesis at the 10 percent level? If the hypothesis is true, what is the likelihood that a similar sample and test would lead to a test statistic greater than 4.506? (That is, what is the “p level” of the observed statistic?)

4.10. Triangular distribution probability paper. (a) Construct a probability paper for distributions with symmetrical triangular PDF’s on the interval a to b. Hint: Use a graphical construction.

(b) A number of individuals measured the same item twice and each reported his average reading. If each of the two reading errors is uniformly distributed between –0.5 and +0.5 unit (the smallest division on a fine scale) and the two readings are independent, what is the distribution of each individual’s average reading? (See Sec. 3.3.1.)

(c) The following readings were reported by eight individuals: 14.9, 15.0, 15.1, 15.15, 15.2, 15.25, 15.4, 15.6. Use the probability paper of part (a) to compare this data and the model of part (b). The true value is 15.2.

4.11. Using Table A.5 to first construct Type I extreme-value paper, make Type III extreme-value paper for smallest values (with a lower bound w equal to zero). This is also known as the Weibull distribution and is commonly used in reliability studies. Hint: Recall (Sec. 3.3.3) the logarithmic relationship between Types I and III.

Based on a probability paper plot inspection, is a Weibull model appropriate for the following data on the modulus of rupture strength of concrete? The model is considered because the brittle nature of concrete in tension suggests that the weakest of many microscopic elements determines the strength of any specimen. Observed data: (1 × 1 × 11 in. specimens) (psi): 1047, 986, 922, 955, 940, 1005, 1093, 1074, 1074.

4.12. Is the following set of data from Southern California consistent with the assumption that the occurrences of earthquakes (of magnitude greater than 3.0) are Poisson arrivals? That is, is the number per year in this region Poisson-distributed? (Consider using a normal approximation to the Poisson distribution.) Use probability paper, and a graphical (10 percent level) Kolmogorov-Smirnov test.

images

4.13. Among 29,531 drivers the following numbers of accidents per driver were observed (in Connecticut, 1931–1936):

images

The overall average is 0.24 accidents per operator. If the accident rate is the same for all operators, the number of accidents for any driver has a Poisson distribution. Test the hypothesis that there is no accident “proneness” in certain drivers by comparing these numbers with a Poisson distribution. Use α = 10 percent. (Proneness includes here, of course, the factor that a driver simply may drive many more than the average number of miles and hence be exposed to more accident possibilities.)

4.14. The following data on annual sediment load in the Colorado River at the Grand Canyon have been recorded (in millions of tons): 49, 50, 50, 66, 70, 75, 84, 85, 98, 118, 122, 135, 143, 146, 157, 172, 177, 190, 225, 235, 265, 270, 400, 480.

(a) Plot this data on lognormal probability paper and use a straight-line fit by eye to estimate the parameters.

(b) Does the model pass a Kolmogorov-Smirnov goodness-of-fit test at the 10 percent level of significance?

4.15. It is common in the study of fatigue failures to deal with the logarithm of the number of cycles to failure rather than with the number itself. One reason lies in the effect upon the distribution of observations (under a given level of cyclic loading). Compare the shapes of histograms of the following data. Choose between the use of two models. One says that the number N is normally distributed and the other that the log of the number is normally distributed. Use probability paper.

images

4.16. Two-population Kolmogorov-Smirnov test and simulation-model evaluation.

When a stochastic simulation model has been constructed (Sec. 2.3.3) and must be evaluated, a valuable test of the model is to compare the distribution of some output generated by the model to the observed distribution of the corresponding historical data. A test is available based on the Kolmogorov-Smirnov statistic which permits such a comparison even though the true underlying distribution is not known. In other words the hypothesis is simply that the data are from the same (but unknown) distribution.

The test is accomplished by using the statistic images the maximum value of the difference between images and images the observed cumulative histograms [Eq. (4.4.17)] of the two samples:

images

images

Fig. P4.16 Comparison of observed histograms.

The distribution of this statistic is the same as that of D2 [Eq. (4.4.18)]. The value of n used should be (Fisz [1963])

images

where n1 and n2 are the sizes of the two samples. The distribution of D2 is given in Table A.7. The hypothesis that the two samples came from the same distribution (i.e., that the simulation model is a good one) should be rejected at the α percent level of significance if the observed value images of the statistic images is greater than the critical value, i.e., that value which the random variable images exceeds only with probability α percent.
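
A sketch of the statistic’s computation for two arbitrary samples (the samples generated below are hypothetical stand-ins for the historic and synthesized flood records):

```python
import numpy as np

def two_sample_ks(x1, x2):
    """Maximum absolute difference between the two observed cumulative
    histograms, and the effective n = n1*n2/(n1 + n2) used with Table A.7."""
    x1, x2 = np.sort(x1), np.sort(x2)
    grid = np.concatenate([x1, x2])
    f1 = np.searchsorted(x1, grid, side='right') / len(x1)
    f2 = np.searchsorted(x2, grid, side='right') / len(x2)
    return np.max(np.abs(f1 - f2)), len(x1) * len(x2) / (len(x1) + len(x2))

rng = np.random.default_rng(0)
d_obs, n_eff = two_sample_ks(rng.gumbel(100.0, 30.0, 40), rng.gumbel(105.0, 32.0, 40))
print(d_obs, n_eff)
```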

Use this test to evaluate a model of rainfall proposed by Pattison [1964]. The cumulative histograms of historic annual floods and the synthesized annual floods are shown in the figure on page 509 (plotted on extreme-value paper). In both cases the sample size is 40. Use α = 0.05.

4.17. Using the data of Prob. 4.3, make plots on uniform, normal, and lognormal probability paper. Fit straight lines and compare the forecast runoffs with 10-, 30-, and 50-year return periods, according to each model. Discuss the variability of the forecasts. What values would you recommend for flood-control studies assuming that a major flood would involve (a) minor damage or (b) major damage?

4.18. Soil sampling sizes. Based on a pedologic soil survey map, an engineer has an initial idea as to types of soil which will be found along a proposed highway alignment. He wishes to distribute the total number of anticipated tests n among the r different soil types in a manner which accounts for their relative variability. Show that in order to achieve uniform “accuracy” (i.e., equal coefficients of variation) for the estimator images of mi, the mean of the ith soil type, the number of independent tests in the region with this soil type should be

images

in which Vi = σi/mi. It is assumed that previous tests and experience make initial estimates of the Vi possible.

In a road through Will County, Illinois, 11 soil types were encountered with the following preliminary estimates of coefficients of variation of the plasticity index. Use the values to determine how to proportion 100 samples among the various soil types:

Soil type Coefficient of variation of plasticity index, %
Lisbon silt loam 26.4
Harpster silty clay loam 62.0
Saybrook silt loam 27.9
Elliot silt loam 23.3
Drummer silty clay loam 64.6
Ashkum silty clay loam 32.4
Andres silt loam 32.4
Symerton silt loam 15.7
Lorenzo silt loam 69.4
Dresden silt loam 23.8
Homer silt loam 20.2

4.19. The contingency table: test for independence. A popular method of testing the independence of two events A and B is to form a contingency table:

images

in which a, b, c, and d are the observed numbers of observations in which the corresponding attributes are observed; for example, b = number of tests in which B and Ac were observed.

Consider the logical estimates of p1 = P[A] and p2 = P[B]; they are (a + c) /n and (a + b) /n. Use them to compute the probability of all four possible events (such as (BAc), (AB), etc.) under the hypothesis that A and B are independent, and finally apply the χ2 goodness-of-fit test to show that the D1 statistic reduces to

D1 = n(ad − bc)² / (n1n2n3n4)

which is approximately χ2-distributed with 1 degree of freedom, where n = a + b + c + d, n1 = a + b, n2 = c + d, n3 = a + c, and n4 = b + d.
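As a sketch (not from the text), the statistic and its p level can be computed directly from the four cell counts once the data below have been classified into a two-way table:

```python
from scipy.stats import chi2

def contingency_d1(a, b, c, d):
    """D1 statistic for a 2 x 2 contingency table; 1 degree of freedom."""
    n = a + b + c + d
    n1, n2, n3, n4 = a + b, c + d, a + c, b + d
    d1 = n * (a * d - b * c) ** 2 / (n1 * n2 * n3 * n4)
    p_level = 1.0 - chi2.cdf(d1, df=1)   # probability of a larger value under independence
    return d1, p_level
```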

The contingency table is often used to test whether there is a relationship (dependence) between two factors. The two-way classification is often forced in order to simplify the test. For example, to test whether the type of failure (tension T or compression C) is related to the fatigue life (number of cycles to failure) of similar specimens of prestressed concrete beams, the accompanying data can be simplified to a two-way classification contingency table.

Type of failure No. of cycles to failure
C 550
C 603
C 5,000
C 40,500
T 45,360
C 52,020
T 55,420
T 68,040
T 94,540
T 110,200
T 114,600
T 121,200
T 126,400
T 133,100
T 134,600
T 174,400
T 187,900
T 236,500

images

Test at the 5 percent significance level the hypothesis that the type of failure is unrelated to the number of cycles to failure.

4.20. For a particular type of heavy construction equipment, the breakdown time X is believed to be uniformly distributed on the interval 0 to θ, but θ is unknown. A sample of n values, x1, x2, . . ., xn, is observed (n > 1). The largest is xj.

(a) Find the likelihood function L(θ | x1,x2, . . ., xn). What are the limits over which the function is nonzero? In particular, given the observations, what is the smallest value that θ can be?

(b) What is the maximum-likelihood estimator of the parameter? That is, for what value of θ is L(θ | x1,x2, . . ., xn) a maximum? A sketch may help.

(c) Show that the method-of-moments estimator is θ̂ = 2X̄.

(d) What are the two estimates of θ if the observed sample is 1, 1, 1.5, 2, 4? Is the method-of-moments estimate reasonable?
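A quick numerical check of part (d), assuming the estimators turn out to be the sample maximum [part (b)] and twice the sample mean [part (c)]:

```python
sample = [1, 1, 1.5, 2, 4]

theta_mle = max(sample)                      # candidate maximum-likelihood estimate
theta_mom = 2 * sum(sample) / len(sample)    # method-of-moments estimate, 2 * sample mean

print(theta_mle, theta_mom)  # note which estimate falls below the largest observation
```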

4.21. The timber-beam data of Prob. 1.1 is to be used to make probability statements about strength and deflection.

(a) Based on the histograms and professional judgment, select two possible models for each set of data. Check for fit with plots on probability paper and comparative χ2 statistic likelihoods.

(b) Assuming that normal distributions provide satisfactory fits, state 90 percent confidence limits on the means and standard deviations.

4.22. The B-C Construction Company has found the following data for the last 20 jobs.

images

“Actual” costs were adjusted to eliminate the effects of change orders, etc. Make a study of the data with the objective of arriving at probability statements for use in bidding the next job. Where is the area of primary interest? Check for fit to a normal distribution using the χ2 test. Is it reasonable to compare figures for jobs of different sizes?

4.23. The Neosho River originates in Kansas and flows into Oklahoma. Recorded mean-annual-flow Q data (in cfs) near the state border are:

images

Make an analysis of the data. Try the normal and lognormal models using probability paper and compare the χ2 statistic likelihoods. Which model seems the most reasonable on a physical basis?

A rule for allocating the use of the water between the two states is desired. Study the following rule using both probability models: A fixed level x0 for Kansas and an equal value on the average for Oklahoma (that is, 2x0 = mean flow).

On the average, how often will Oklahoma have zero allocation under this rule?

4.24. The concrete pavement thickness for a series of county roads placed by one contractor was measured by taking cores. The numbers of cores at each thickness were as follows. (There is one core for each section for which a payment is made.)

images

(a) Make an analysis of the data to determine a reasonable model.

(b) The contractor assumes that the materials cost is 40 percent of the bid price, and the penalty clause in the contract is as follows.

images

What percent of the total payment can the contractor expect to receive on similar jobs using the same construction practice?

(c) Is the penalty clause effective in making the thickness at least 6 in. (the design value)? Reduce the mean thickness and estimate the contractor’s gain or loss.

4.26. Ratios of jack to anchor force for prestressed concrete cables were measured and reported in Prob. 1.3. The objectives of the study were to provide measures of mean and coefficient of variation and to obtain some idea as to the source of the variability.

(a) Make conventional two-sided 95 percent confidence interval statements on the mean and variance. Assume that the ratio is approximately normal.

(b) Fit the data to the normal probability distribution using the method of moments and probability paper. Check numerically for fit using the χ2 and Kolmogorov-Smirnov goodness-of-fit tests.

(c) Using the concepts of statistical inference or significance, what tests would be useful in making statements about the source of variability between the two sets of data?

4.27. A bridge is to be constructed at Waukell Creek (Prob. 1.4). The soils engineer must recommend allowable soil pressures for the footings based on the reported data.

(a) Is a normal distribution a satisfactory model considering both the histogram and professional considerations? Make such a fit.

(b) Fit a Type I extreme value distribution of smallest values to the data. Construct appropriate probability paper and plot.

(c) Compare the results of (a) and (b) as to fit and probability of finding a strength specimen equal to or less than 0.20 tons/ft2.

4.28. A study of the relationship between floor (or column) loading in pounds per square foot (psf) and total area loaded can be made using the data of Table 1.2.1. Consider a particular bay. By working from the ninth floor down and assuming that all bay sizes are 400 ft2, one can obtain total loads and then loads in psf for 400, 800, 1200, . . ., 3600 square feet of area.

(a) Make a plot of observed live load in psf against total area. Do this for each of three bays on one graph. Assuming that the loads on different floors are independent, discuss the applicability of the simplest linear regression model to this problem. In particular, is σ independent of x (the area), and are the observations of total loads independent of each other? Is a Markov model appropriate?

(b) Compute the mean and variance of live load in psf, for 400, 800, 1200, . . ., 3600 ft2 of floor area, using for each total area the data from 22 bays. Plot the mean and 95 percent probability limits on the graph, assuming that a normal distribution applies and that the sample moments are satisfactory population estimates. Compare the resulting plot with the code reduction of 0.08 percent/ft2 of tributary area greater than 150 ft2 (up to a maximum reduction of 60 percent). Assume a code-design live load of 100 psf.

(c) There is interest in the possible occurrence of extremely large loads, equal to or greater than 100, 150, and 200 psf, on small areas, say the “basic” floor area of 400 ft2. Consider the data from all (9) (22) = 198 bays together. What are the estimated mean and variance? Compare probability statements for such magnitudes using normal, lognormal, and gamma models.

4.29. The discharge of the Ogden Valley artesian aquifer (Prob. 1.6) is to be allocated between two major users each of whom wants 5000 acre-ft.

(a) What probability models appear satisfactory considering the histogram and professional knowledge of such systems? Fit the normal model and one other model with a range from zero to infinity. For the two models compare probabilities of finding a discharge equal to or less than 10,000 acre-ft. Are the differences important? What is the probability of finding unsatisfactory discharges in both of the first two years? Assume that annual discharges are independent. (Excess water flows away.)

(b) If the two users share all the water equally, does the probability of each user obtaining less than 5000 acre-ft change?

(c) If one user has first chance at the available supply and takes all water up to 5000 acre-ft, leaving only the excess for the second user, what changes in the study must be made to yield probability statements on the water available to both users?

(d) Suppose a small storage dam is built to store any excess water for use in following drier years. Does the simple model of independence of annual water available remain valid?

4.30. A spillway is being designed for a dam on the Feather River at Oroville, California (Prob. 1.7). The problem is to make the most reliable forecasts of future maximum annual floods. Compare the fit of the data to the normal, lognormal, and Type I extreme-value model. Use probability paper.

(a) Compare forecasts of 10-, 50- and 100-year floods.

(b) What model appears to give the best fit?

(c) For each model, what is the most likely (i.e., the modal value of the) maximum annual flood to be found in any one year? What is the probability of finding a flood equal to or greater than double this magnitude? Are these magnitudes particularly sensitive to the choice of model?

4.31. Make a study of the design situation of Prob. 1.9 using probability models of Chap. 3. Is the decision sensitive to choice of model between those with similar degrees of fit? Consider at least two models consistent with the histogram.

4.32. Correlated samples. In an attempt to estimate the mean density of soil in a certain shallow layer under a building site, it has been proposed to do one of two things:

 

(i) Drill five holes at widely separated intervals and take single soil cores from each. Measure their densities and average them to estimate the mean at the site.

(ii) Drill a single hole, take five cores from it, and average their densities to estimate the mean at the site.

Experimental error in these tests is negligible. Soil density varies, however, from place to place within the layer owing to variations in the geological processes.

(a) Why, in words, is the latter method a less desirable way to estimate the mean density at the site?

(b) Why, in statistical terms, is the latter method less reliable? Hint: Interpret Eq. (4.1.12).

(c) If the core densities can be assumed uncorrelated by the former method, but by the second method there is a correlation coefficient of ρ = 0.9 between all pairs of variables, find the relative magnitudes of the mean square errors of the two testing techniques. How many independent samples are the five specimens in the latter method “equivalent” to? Assume that all cores have the same mean and standard deviation.
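A sketch of the arithmetic for part (c), using the standard result that the variance of the mean of n equicorrelated variables is (σ2/n)[1 + (n − 1)ρ] [cf. Eq. (4.1.12)]:

```python
n, rho = 5, 0.9

mse_independent = 1.0 / n                      # Var[mean] / sigma^2, five separate holes
mse_correlated = (1.0 + (n - 1) * rho) / n     # Var[mean] / sigma^2, five cores from one hole

print(mse_correlated / mse_independent)        # relative magnitude of the mean square errors
print(1.0 / mse_correlated)                    # "equivalent" number of independent samples
```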

4.33. Reconsider Prob. 2.41. The project engineer is concerned with developing an operating rule by which the foreman can decide, from his observation of a single roundtrip time, whether to report that this particular process is functioning properly or to instigate a more careful inspection of the process. The significance test is of the form

images

(In Prob. 2.41, c = 10 min.) Find c such that the probability of making the inspection when it is unwarranted is only 10 percent.

4.34. Consider the water demands in Prob. 1.17. In previous years the air station was populated only by service personnel. The mean daily demand was 4,550,000 gal/day. In 1965 families replaced a portion of the service personnel. Is the daily mean demand in 1965 significantly greater than in the past? Assume that daily demands are normally distributed. Notice that the sample size is quite large.

4.35. Assuming normality of the beam deflections in Prob. 1.1, should the wood-products engineer report that the observed deflections indicate that the mean deflection is different from the calculated deflection? Use the actual dimensions (1.63 by 3.50 in.) when calculating the deflection. Note that the sample size is small.

4.36. Consider the aquifer recharge data in Prob. 1.6. There is some reason to believe that the recharge may have changed in time. Assuming normality and equal variances, does there appear to be a significant difference between the mean recharge before 1943 and that after (and including) 1943? (Under the null hypothesis of equality, the best estimate of the common variance is simply the sample variance about the sample average of all 17 years. Use this estimate as an assumed known value of σ2.)

4.37. In a traffic-counting study the numbers of cars arriving during n intervals, each of 1-min duration, were counted, with observed values x1, x2, . . ., xn. Assume that the cars are Poisson arrivals with unknown average arrival rate ν per minute. Then X, the number in any minute, has a Poisson distribution with mean ν.

(a) Show that the likelihood function of ν given the observations is

L(ν | x1, x2, . . ., xn) = ∏i (ν^xi e^−ν / xi!) = ν^(x1 + x2 + · · · + xn) e^−nν / (x1! x2! · · · xn!)

(b) What is the maximum-likelihood estimator of ν? How does this compare with the method-of-moments estimator?
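A minimal numerical sketch (with hypothetical counts, not from the problem) for locating the maximum of the likelihood; comparing the result with the sample mean bears on part (b).

```python
import numpy as np

counts = np.array([3, 5, 2, 4, 6, 3])       # hypothetical 1-min arrival counts

# Log-likelihood of the Poisson model, up to a constant not involving nu
nus = np.linspace(0.5, 10.0, 2000)
loglik = counts.sum() * np.log(nus) - len(counts) * nus

print(counts.mean(), nus[np.argmax(loglik)])   # the two values should essentially agree
```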

4.38. An engineer must report as to the kind of soil-pile behavior that exists at a site. The choice will be based on the indications from the blow counts from five preliminary piles. The null hypothesis is that the behavior is type A which is known from experience to exhibit a mean blow count of 20/ft, while the alternate hypothesis is that the behavior is type B, which gives rise to a mean blow count of 30/ft. In both cases the standard deviation of the number of blows per foot is 5.

(a) Propose a logical hypothesis test, i.e., a logical statistic and a reasonable form for the operating rule.

(b) Find the specific operating rule which gives a type I error probability of 25 percent. Assume an appropriate, convenient distribution for the sample statistic adopted.

(c) What is the β probability for the test?

(d) What should the engineer report if the blow counts observed are 31, 22, 28, 25, 32?
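The following sketch (Python; not part of the original problem) illustrates one possible reading of parts (a) through (d), under the assumption that the statistic is the sample mean of the five blow counts and that its sampling distribution is taken as normal; other choices in part (a) are equally defensible.

```python
from scipy.stats import norm

m_A, m_B, sigma, n = 20.0, 30.0, 5.0, 5
se = sigma / n ** 0.5                       # standard deviation of the sample mean

# (b) operating rule: reject H0 (type A) if the sample mean exceeds c, with alpha = 0.25
c = m_A + norm.ppf(1 - 0.25) * se

# (c) beta = probability of accepting type A when type B is in fact true
beta = norm.cdf((c - m_B) / se)

# (d) observed blow counts
xbar = sum([31, 22, 28, 25, 32]) / 5
print(c, beta, xbar > c)
```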

4.39. The elevator in a high-rise apartment is presumed to behave like a simple Markov process in the following sense. The elevator is observed each time it stops at or returns to the street floor. Its state on arrival is either empty, 0; partially full, 1; or full, 2.

Consecutive observations of the elevator totaling 101 were made. Therefore, 100 transitions were observed. In 35 of them the car was initially empty; 5 of these were followed by full states, 15 by partially full states. In 45 of the transition observations the car was initially partially full; 20 were followed by an empty car and 20 by a partially full car. Of the 20 initially full cars, none was followed by an empty car, and 10 were followed by full cars.

Use the obvious fractions to estimate the nine transition probabilities for this (homogeneous) Markov chain. Are these estimators independent random variables? Would estimators based on 100 nonconsecutive transition observations (e.g., observe one transition a day for 100 days) be “better” estimators (i.e., have smaller variance)? Would these nine estimators be independent?
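A sketch of the estimation step, with the unstated transition counts filled in from the given row totals (35, 45, and 20):

```python
import numpy as np

# Rows: initial state (0 empty, 1 partially full, 2 full); columns: following state
counts = np.array([[15, 15, 5],    # 35 initially empty
                   [20, 20, 5],    # 45 initially partially full
                   [0, 10, 10]])   # 20 initially full

p_hat = counts / counts.sum(axis=1, keepdims=True)   # "obvious" row-fraction estimates
print(p_hat)
```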

4.40. A proposed estimator of the unknown median of a lognormally distributed random variable X is the "geometric mean" of the random sample, X1, X2, . . ., Xn,

(X1 X2 · · · Xn)^(1/n)

Show that the geometric mean is not an unbiased estimator of the median of X. Does it become approximately unbiased as the sample size increases?
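A small simulation sketch (with an arbitrary choice of parameters for ln X) can be used to see the bias numerically; it does not replace the analytical demonstration the problem asks for.

```python
import numpy as np

rng = np.random.default_rng(1)
m, s, n = 0.0, 1.0, 5                 # parameters of ln X; the median of X is exp(m) = 1

samples = rng.lognormal(mean=m, sigma=s, size=(200_000, n))
geo_means = np.exp(np.log(samples).mean(axis=1))

# Average of the estimator versus the true median; analytically the expectation is
# exp(m + s**2 / (2 n)), which approaches exp(m) only as n grows.
print(geo_means.mean(), np.exp(m))
```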

4.41. The depth to a bedrock layer under a site will be estimated by the sample average X̄ of n independent readings of a sonic instrument. Any reading is a random variable with a mean equal to the true (but unknown) depth mX and with a known coefficient of variation VX (i.e., the standard deviation increases with depth).

(a) What are the mean and variance of the estimator X̄ in terms of mX and VX?

(b) Make a reasonable assumption for the distribution of X̄ when n = 16, and write an expression for the 90 percent confidence interval of the true depth when VX = 0.2.

(c) What is the 90 percent confidence interval estimate on true depth if the 16 tests are run and their average is 102 ft?
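One way parts (b) and (c) might be evaluated, assuming X̄ is treated as normal with standard deviation mX VX/√n (so the interval must be solved for the unknown depth):

```python
from scipy.stats import norm

n, V, xbar = 16, 0.2, 102.0
k = norm.ppf(0.95) * V / n ** 0.5     # half-width factor: |Xbar - m| <= k*m with prob. 0.90

# Inverting the inequality gives the interval on the true depth m
lower, upper = xbar / (1 + k), xbar / (1 - k)
print(lower, upper)
```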

4.42. A building code’s requirements are that a particular type of foundation can be used only if a particular type of soil exists below the site. The soil is identified by a mean “index” value of 10. At higher or lower values some other foundation type must be considered.

(a) Can the engineer accept at the 10 percent significance level the hypothesis that this mean value is 10 if he observes four soil specimens at different points around the site with index values of 6, 9, 10, and 7? Assume that these values are observations of independent identically distributed normal random variables with standard deviation 2 and mean equal to the (unknown) mean index.

(b) Suppose, instead, that the soil specimens whose index values are listed above were taken quite close together on the site, so that they are not independent samples but rather have a common correlation coefficient, 0.5, between all pairs. What moment of the sample statistic changes from the problem in part (a)? Assuming that the sample mean is normally distributed, can the engineer accept the hypothesis that the (unknown) mean is 10 (at the 10 percent significance level)?

(c) Reconsider the situation in case (a) if the code states that the foundation can be used if the soil has an index of 10 or more.

4.43. A large school system is planning a renovation of a number of schools, all built prior to 1940. While making estimates for the budget, it was necessary to forecast the likely cost of renovation based on accumulated experience in past similar jobs. The cost data on old jobs has been updated to current conditions using the Engineering News-Record index. Data are as follows:

images

(a) Assume that the expected cost of renovation is linear with the number of rooms. Estimate the regression line and show the relationship on a plot with the data. Estimate σ and plot lines 1 standard deviation each side of the mean value function. Estimate and plot 1-standard-deviation confidence limits on the mean value.

(b) If the first job to be studied contains 32 rooms, use the results of (a) to make statements about its expected cost and the standard deviation of this figure. Note that a single forecast of the random variable, i.e., cost, is desired.

(c) Data on the number of stories are given. Repeat the analysis using number of stories to predict the cost. The job contains three stories. Which independent variable—number of rooms or number of stories—gives a “more reliable” prediction of cost?

4.44. Algae-fixed organic matter will be produced in a clear-water sample with zero chemical oxygen demand (COD) and biochemical oxygen demand (BOD) through the addition of controlled amounts of nitrogen. A laboratory study using 15-day specimens and sensibly constant phosphorus yields:

COD, mg/l    Nitrogen, mg/l
1780 27.5
1470 41.0
2080 33.5
1110 20.6
1050 20.4
745 15.1
1220 20.6
1020 16.0
830 31.0
183 0.18
1100 16.0

(a) Determine the regression line for COD versus nitrogen. Estimate α, β, and σ. (A computational sketch follows part (f).)

(b) Test if the slope estimate is significantly different from zero at the 5 percent significance level.

(c) Is the intercept zero? Test the hypothesis at the 5 percent level.

(d) What is a 90 percent upper confidence limit on the variance of COD for any value of nitrogen?

(e) Plot the residuals on normal probability paper. Is the normality assumption reasonable?

(f) Plot the residuals versus the independent variable. Does the data appear to satisfy the condition that σ is constant? Given a large amount of data, how might you test the hypothesis?
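A sketch of the least-squares arithmetic for parts (a) and (b), using the tabulated data; the critical value for the slope test would still come from the t distribution with n − 2 degrees of freedom.

```python
import numpy as np

nitrogen = np.array([27.5, 41.0, 33.5, 20.6, 20.4, 15.1, 20.6, 16.0, 31.0, 0.18, 16.0])
cod = np.array([1780, 1470, 2080, 1110, 1050, 745, 1220, 1020, 830, 183, 1100])

n = len(cod)
Sxx = np.sum((nitrogen - nitrogen.mean()) ** 2)
b = np.sum((nitrogen - nitrogen.mean()) * (cod - cod.mean())) / Sxx   # slope estimate
a = cod.mean() - b * nitrogen.mean()                                  # intercept estimate

resid = cod - (a + b * nitrogen)
s = np.sqrt(np.sum(resid ** 2) / (n - 2))      # estimate of sigma

t_slope = b / (s / np.sqrt(Sxx))               # statistic for testing beta = 0
print(a, b, s, t_slope)
```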

4.45. The tensile strength of concrete can be measured by the splitting test, in which a concrete cylinder is placed on its side in the testing machine and subjected to diametral compression (ASTM C496–66). The compressive and splitting tensile strengths of lightweight and ordinary concrete as a function of age of test for continuous moist cure specimens were reported by J. A. Hanson for a particular mix design as follows:

images

The compressive strengths are the average of two specimens and the splitting tensile strengths, the average of four specimens.

(a) Make a regression-analysis study of the growth of compressive strength with time. Use a model of the form:

images

in which time is the controlled variable. Note that the data are incomplete; that is, only the average value of strength of two specimens is given for each time. Thus the information is incomplete for the purpose intended. This is a common situation in the reporting of data. An approximate adjustment is to use a value of n equal to two (or four) times the given value of 8. Is the estimate of β significantly less than 1? Find 95 percent confidence limits on α and β. Are the β's significantly different?

(b) Statements of tensile strength are desired based on observed compressive strength. How can such a relationship be studied? What significance tests would be of importance?

4.46. Regression analysis has been used in forecasting runoff from precipitation records. The 13-station (average) precipitation statistics from 1923 to 1966 for October, November, December, January, February, March, and April are given in the accompanying table for the Colorado River Basin along with the adjusted April–July runoff at the Grand Canyon. Assume that the expected April–July runoff is linear in the sum of the monthly precipitations and determine the regression equation. How would such an analysis be used? Is the estimate of the correlation coefficient between average monthly rainfall (October through April) and April–July runoff significantly greater than 0?

Colorado River Basin precipitation statistics

images

4.47. The elastic limit and ultimate strength of reinforcing steel are often stated to be a function of bar size owing to differences in the rates of cooling when rolled. The following data were obtained by random sampling of the delivered reinforcing steel for a particular job.

images

The steel was classed as "intermediate grade," and the allowable design working stress was 20,000 psi. Bar number equals the diameter in eighths of an inch.

(a) Make a regression analysis of the elastic limit data on bar size. The elastic limit may vary linearly with bar diameter or with bar area. Make a preliminary plot to determine which is the best measure for use in a regression analysis. Test the assumption that the elastic limit is a function of bar size.

(b) Repeat (a) for ultimate strength.

(c) Are elastic limit and ultimate strength correlated?

4.48. Plot the OC curve for the test using statistic X(1), page 413, and compare it with that for X̄, Fig. 4.2.2. Which test is to be preferred on this basis? In field implementation, what other factors might influence the choice?

Recommended first references in this field include Hald [1952] and Bowker and Lieberman [1959], two applied books with industrial engineering flavor, and Lindgren and McElrath [1966], Freeman [1963], and Brunk [1960], three books of increasing difficulty dealing in an introductory way with the mathematical theory of statistics.

There are, in fact, several weaknesses in this argument. See Johnson [1953] and Vorlicek [1963] for presentation and discussion of such theories.

The sample coefficient of variation, images percent, is typical of steel strength variation within bars from a common lot. When many lots are sampled, a value of 7 to 8 percent is more common. See Granholm [1965].

In general, both the estimator and the estimate of an unknown parameter b will be denoted by a "hat," thus, b̂.

§ A case where this is not possible except by trial and error occurs in reliability testing with the Weibull or type III extreme-value distribution (Sec. 3.3.3).

Note that in principle when considering the sampling or repeated observing of a random variable X, we are dealing with a simple random process, X1, X2, . . ., with “time” parameter i = 1, 2, . . . . This process interpretation of sampling will be fully discussed in Chap. 6. At any “time” i, the distribution of the value of the process is the same, fX (x).

This is not to say that nonrandom sampling is unimportant. Note, for example, that if the sampling can be carried out so that there is negative correlation between the pairs of outcomes, the variance of X̄ can be reduced from σ2/n, the value associated with random sampling.

Notice, however, that if n is reasonably large, there is little numerical difference between σ2 and [(n – 1)/n]σ2.

Notice the dependence of the mean square error of an estimator on n. Characteristically, it is approximately inversely proportional to n.

In the face of this dilemma the authors recommend the use of S2* as an estimator of σ2 if only for the sake of promoting uniformity among reports of experimental data.

A number of such properties will be briefly defined in Sec. 4.1.4.

The study of the distributions of sample statistics is central to the field of mathematical statistics. Clearly these problems are ones of deriving distributions of functions of random variables and hence are problems of pure probability theory. The studies retain the name of statistics, however, because their motivation is statistical applications.

The reader can convince himself of this by recognizing that the sum X1 + X2 + · · · + Xn is G(n,λ) (see Sec. 3.2.3) and then finding the moments and parameters of X̄.

Treatment of the situation in which σ2 cannot be assumed to be known will follow.

Given the engineer’s hypothesis that X is normally distributed, this statement is exactly correct. Under other conditions it remains approximately true (for n large).

The more natural consideration of the parameter as a random variable will be discussed in Chap. 6. This intuitively more satisfactory treatment will remove the misgivings that many engineers feel when they are told by a large class of statisticians that they “must” not make a probability statement about an unknown parameter such as m.

This choice gives the shortest possible interval in this case.

These two statistics are perfectly correlated random variables, since g2 equals simply g1 plus a constant.

See Sec. 2.4.4. This approximation will also be justified in Sec. 4.1.4.

A second similar example arises in studying the reliability of systems where one wants to have a lower-bound estimate on the reliability, i.e., on the probability that the system will perform satisfactorily for a time t0 when the system’s time to failure T, is a random variable.

The needed and somewhat unexpected independence of nS22 and image is discussed further in the next illustration.

This symbol is the rather unfortunate common notation. Caution: it does not denote a random variable, but a value which a random variable with a χ2 distribution with n – 1 degrees of freedom will exceed with probability 1 – α.

We choose here, for historical reasons, S2* over S2. The results using S2 are parallel. The statistic (X̄ – m)/(S*/√n) is T(n – 1) distributed. S* is the square root of S2*. Proof is available in Brunk [1960], Freeman [1963], or Hald [1953], to name but a few locations.

This section may be omitted on first reading.

Owing to the monotonic, one-to-one relationship between the likelihood function and its logarithm, they have a maximum at the same value θ̂.

See, for example, Hald [1952] or Brunk [1960].

For example, the mean of the exponential distribution is a function of λ, g(λ) = 1/λ. The maximum-likelihood estimator of the mean of the exponential distribution can be found to be X̄, while also λ̂ = 1/X̄.

For instance, the maximum-likelihood estimator of the variance of a normal distribution is S2 [Eq. (4.1.70)]. The properties of this estimator have been compared in Sec. 4.1.2 with S2*. S2 was, recall, a biased estimator for small n.

In modern literature this is called a decision rule, but we prefer to restrict the word decision to analyses which explicitly account for the consequences of actions and outcomes, as will be discussed in Chaps. 5 and 6.

It should not be said that the hypothesis is false with probability 90 percent. Like confidence levels, significance levels cannot be treated as probabilities, despite engineers’ common desire to do so. This restriction is lifted in Sec. 6.2.2.

The reader will have noted by now many parallel features in confidence-interval estimates and hypothesis tests on parameters. In fact, one can easily show, for example, that if the (1 – α) percent confidence-interval estimate of the mean contains the value of the mean associated with the null hypothesis H0, then H0 should be accepted at the α percent significance level. The relationship between confidence-interval and hypothesis testing is more computational than fundamental, but similar attitudes with respect to not treating unknown parameters as random variables exist in both.

The complement of the OC curve, i.e., the plot of 1 – β, is called the power function. We would like a test whose power function is highest everywhere.

This distribution is tabulated, however, in various places (see Owen [1962]).

The new material introduced in this section is not necessary for subsequent sections of the text. Sec. 4.3 can be bypassed at first reading.

The sampling and testing procedure may also introduce a bias, say by weakening the soil while (unavoidably) disturbing it. If known, the bias can be subtracted out. In this case we avoid the issue by defining Y to be the laboratory test quantity: unconfined compressive strength of a soil specimen.

It should be emphasized that presence of a value of β significantly different from 0 does not imply any causal relationship between the variables. It only suggests that one is useful in predicting the other.

Note that the use of dependent and independent here is in the functional sense, not the stochastic sense.

Note that transformations permit treatment of multiplicative models (for example, E[Y]x = αx^β, which becomes linear upon taking logarithms) and polynomial models (for example, E[Y]z = α + β1z + β2z^2, the latter becoming E[Y]x = α + β1x1 + β2x2 with x1 = z and x2 = z^2).

The implication is that there are r + 1 different linear prediction models that the engineer could construct. Which of these is appropriate depends in part on the future use, i.e., upon which factor the engineer will want to predict, given observations of the others.

§ See for example, Prob. 2.67.

Other factors such as population and distance may also be important of course. We shall see that they can be considered too.

See, for example Freeman [1963], Hald [1952], Brownlee [1965], or Graybill [1961].

Note that continuous variables and discrete “treatments” can be mixed; e.g., the traffic-demand model might include the continuous variables suburb population and distance, as well as socioeconomic class.

§ Under more restrictive assumptions, the maximum-likelihood method, Sec. 4.1.4, will suggest the same estimators. See, for example, Brownlee [1965] or Graybill [1961].

Compare this form with Eq. (4.3.6) for jointly distributed random variables. Recall from Chap. 1 that these sample averages can be computed whether the x's and y's represent observations of random variables or not. In particular, in a model 1 case (Sec. 4.3.1), the x's are preselected, nonrandom variables, but x̄ and sX can still be computed.

Compare this form with Eq. (2.4.98b) for the conditional variance of Y given X when X and Y are jointly normal.

In the case of model 2 we assume that the Xi’s have been observed, but the Yi’s have not been, as yet.

There are important practical cases when this condition does not hold. For example, if the Yi’s are creep strains of compressed concrete cylinders and the xi’s are times since load was applied, the Yi’s taken from the same cylinder will not be uncorrelated. Such problems should be treated by time-series analysis (e.g., see Bendat and Piersol [1966]), which is beyond the scope of this book.

Do not confuse α, the intercept, with α, the significance level. They are both common notation.

Recall from Sec. 4.2 that we can in fact determine the outcome of the significance test by simply observing that the hypothesized value 0 does not fall in the 1 – α confidence limit {1.50,4.58} found above.

A fundamentally more important simplification is that with this origin, the estimators A and B are independent random variables. In general, they are not.

Higher than those used to estimate the model’s parameters, Sec. 4.1.

Depending on his purposes, the engineer may be prepared to accept such discrepancies in regions of less interest, for example, upper tails. An illustration appears in Sec. 4.5.2.

There is a degree of correlation among the interval proportions, Sec. 3.6.1. It is small if k is moderately large.

The reader is undoubtedly familiar with the use of semilog paper, which provides a scale such that a relationship like y = ab^cx will plot linearly and hence facilitates the comparison between experimental data and a proposed formula of such a form.

Notice the unusual nature of this estimating situation. Instead of using a sample statistic, say X̄, to estimate a specified parameter, say m, we are using a specified number [i/n or i/(n + 1)] to estimate FX(X(i)), where X(i), the ith largest observation, is a sample statistic (called an order statistic).

See Gumbel [1958].

See, for example, Gumbel [1958] or Pieruschka [1963] on “control curves,” the locus of confidence limits on the value of FX(x).

How might such experiments be conducted? (See Lindley [1965], part II, page 167, Vessereau [1958], and Williams [1950].)

This number, unlike α, is computable only after the data are observed, and is sometimes called the p level (Sec. 4.2). It is the highest significance level which would lead to acceptance of H0.

See Mann and Wald [1942].

It also simplifies computations, since the pi are all equal. The statistic in Eq. (4.4.14) now becomes (k/n) Σ(Ni − n/k)².

The cumulative histogram differs little from the cumulative frequency polygon (Sec. 1.1). The former is a stair-step function, while the latter has continuous straight-line segments connecting the “toes” of the steps.

See, for example, Fisz [1963] page 447.

These points were plotted at i/(n + 1) to agree with the procedure outlined in Sec. 4.4.1. The Kolmogorov-Smirnov test is designed to compare values i/n. Strictly, each point should be moved up by a factor (n + 1)/n before carrying out this test. Some of these moved points are shown circled in Fig. 4.4.16. The conclusion is unaltered in this case.

We will make further use of this observation in Sec. 4.5 where we use this statistic in a maximum-likelihood manner to help choose among several contending distributions.

Maximum-likelihood estimators (Sec. 4.1.4) can be found, but they require an iterative simultaneous solution of three coupled, nonlinear equations (Markovic [1965]).

It is not possible to construct a general probability paper for gamma distributions.

Notice that the plot on lognormal probability paper is possible only for a fixed value of the shift parameter; in other words, Y = T – 1.18 is actually plotted on standard lognormal paper.

The same is true of the maximum-likelihood estimates.

The same event can also occur, of course, with the method-of-moments estimates of the shift parameter in the shifted gamma or lognormal distributions. Maximum-likelihood estimates can, however, be constrained to avoid this problem, for example, â ≤ min [Xi] and images, although this may complicate their evaluation.

This particular division is unnecessary, of course, but it avoids arbitrary divisions and is easier to apply. The expected number of observations per interval, three, is undesirably small (Sec. 4.4.2).

In order to compare likelihoods directly, one should choose the number of histogram intervals such that all statistics have the same number of degrees of freedom.

It should be pointed out that the χ2 statistic is often used incorrectly in model comparisons, the model with the smallest observed statistic being selected no matter how many degrees of freedom are involved. For ν greater than 2, zero is not the most likely value of the statistic. In this case the PDF of the χ2 distribution (Sec. 3.4.3) must be evaluated to determine which model has yielded the most likely value of the statistic and inversely, therefore, which is the “most likely” model.

Presumably it is just a fast way of excluding models whose χ2 statistics would be large and unlikely.

For example, a compound model (Sec. 3.5.3) in which the number of buckets k was treated as a discrete random variable.

But, as mentioned, the question of which type of extreme-value distribution is left open.

This comparison emphasizes the earlier statement that tail probabilities of the adopted empirical model should not be given great confidence. Unfortunately it is all too often the civil engineer’s plight that he must make design decisions related to events with small probabilities when insufficient data is available to provide reliable estimates of these probabilities. Since the engineer must make some decision, he often is forced to base it upon unreliable estimates for lack of acceptable substitutes. Lacking reasons to the contrary, extrapolation is probably a reasonable approach, since it is systematic and consistent.

Source: Markovic [1965].

Source: Shook [1963].

Adapted from Mills [1965].

Data provided by the Master Builders Co., Cleveland, Ohio, for an office building constructed in New York.

Source: Langejan [1965].

Source: Greenshields and Weida [1952], page 172.

Data source: Alexander [1965].

Data source: C. R. Allen, P. St. Amand, C. F. Richter, and J. M. Nordquist [1965].

Source: Greenshields and Weida [1952], page 207.

Data source: Beard [1962], page 42.

The stream flows in both cases were simulated by the Stanford watershed model. Rainfall, the input to this model, was either historic or simulated by Pattison’s model.

Adapted from Liu and Thornburn [1965], who make the additional suggestion that a factor proportional to the length of the region, Li/Lj, also be included to account for the relative consequences should an error in estimation be made.

The test can be extended to more than two classifications. See, for example, Lindley [1965], Sec. 7.6.

Source: Venuti [1965].

J. A. Hanson, Effects of Curing and Drying Environments on Splitting Tensile Strength of Concrete, ACI J., Table 3, July, 1968.