1 Introduction
The choice between parametric and nonparametric density estimates is a question frequently faced by practitioners. The parametric (maximum likelihood, ML) approach is a natural first choice when there is strong evidence about the form of the underlying density. However, estimating a normal mixture density with an unknown number of mixture components can become very complicated: misspecification of the number of components greatly impairs the performance of the ML estimate and compounds the usual convergence issues of the technique, e.g., [11]. A robust nonparametric alternative, immune to these problems, is the classical kernel density estimate (kde).
The purpose of this work is to investigate under which circumstances one would prefer the ML estimate or the kde. A goodness-of-fit test is introduced based on the Integrated Squared Error (ISE), which measures the distance between the true curve and the proposed parametric model. Section 2 introduces the necessary notation and formulates the goodness-of-fit test. Its asymptotic distribution is discussed in Sect. 3, together with the associated criteria for acceptance or rejection of the null. An example is provided in Sect. 4. All proofs are deferred to the final section.
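To make the ISE criterion concrete, the following is a minimal numerical sketch, not the exact estimator studied in this paper: the Gaussian kernel, the single-normal null model, Silverman's rule-of-thumb bandwidth, and the Riemann integration grid are all illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def kde(x, data, h):
    """Classical kernel density estimate at the points x with bandwidth h."""
    return gaussian_kernel((x[:, None] - data[None, :]) / h).mean(axis=1) / h

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def ise(data, h, grid_size=512):
    """ISE between the ML normal fit and the kde, via a Riemann sum."""
    mu, sigma = data.mean(), data.std(ddof=0)  # ML estimates under normality
    x = np.linspace(data.min() - 3 * h, data.max() + 3 * h, grid_size)
    diff = normal_pdf(x, mu, sigma) - kde(x, data, h)
    return float(np.sum(diff**2) * (x[1] - x[0]))

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=200)
h = 1.06 * sample.std(ddof=0) * len(sample) ** (-1 / 5)  # Silverman's rule
print(ise(sample, h))  # small when the parametric null holds
```

Under the null the two curves estimate the same density and the ISE is close to zero; when the parametric model is wrong (e.g., a single normal fitted to well-separated mixture data) the ISE is markedly larger.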
2 Setup and Notation
3 Distribution of the Test Statistic Under the Null
- 1. The bandwidth satisfies $h \to 0$ and $nh \to \infty$ as $n \to \infty$.
- 2. The density $f$ and its parametric estimate are bounded, and their first two derivatives exist and are bounded and uniformly continuous on the real line.
- 3. Let $\theta$ be any of the parameter vectors being estimated and let $\hat{\theta}$ denote its estimate. Then, there exists a $\theta_0$ such that $\hat{\theta} \to \theta_0$ almost surely and
$$\sqrt{n}\,(\hat{\theta} - \theta_0) = O_p(1), \qquad f(x; \hat{\theta}) - f(x; \theta_0) = (\hat{\theta} - \theta_0)^{\mathrm{T}} \nabla f(x; \theta_0) + R_n(x),$$
where $\nabla f(x; \theta_0)$ is a vector of the first derivatives of $f(x; \theta)$ with respect to $\theta$ and evaluated at $\theta_0$, while $R_n(x)$ is a remainder term of smaller order, uniformly in $x$.
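Bandwidth conditions of the form $h \to 0$ with $nh \to \infty$ as $n \to \infty$ are the standard requirements in kde asymptotics, and the familiar choice $h = c\,n^{-1/5}$ satisfies both. A quick numerical check (the constant $c = 1.06$ is an arbitrary illustrative choice):

```python
import numpy as np

# For h_n = c * n**(-1/5): h_n -> 0 while n * h_n = c * n**(4/5) -> infinity,
# so both bandwidth conditions hold simultaneously.
c = 1.06
n = np.array([1e2, 1e4, 1e6, 1e8])
h = c * n ** (-1 / 5)
print(h)      # shrinks toward 0
print(n * h)  # grows without bound
```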
4 An Example
As an illustrative example, the Galaxies data of [14] are used. The data represent velocities in km/sec of 82 galaxies from 6 well-separated conic sections of an unfilled survey of the Corona Borealis region. Multimodality in such surveys is evidence for voids and superclusters in the far universe.
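The galaxy velocities themselves (available as the `galaxies` dataset in R's MASS package) are not reproduced here; the sketch below uses a synthetic stand-in with three well-separated clumps, loosely mimicking the survey's structure, to show how multimodality surfaces in a kde while any single-normal ML fit necessarily smooths it away. All numeric settings are illustrative assumptions.

```python
import numpy as np

def kde(x, data, h):
    # Gaussian-kernel density estimate (sketch)
    u = (x[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

# Synthetic stand-in for the 82 galaxy velocities (km/sec): three
# well-separated clumps, not the actual survey measurements.
rng = np.random.default_rng(1)
velocities = np.concatenate([
    rng.normal(9_700, 400, 7),
    rng.normal(21_000, 1_500, 70),
    rng.normal(33_000, 800, 5),
])
h = 1.06 * velocities.std(ddof=0) * velocities.size ** (-1 / 5)
grid = np.linspace(velocities.min() - 3 * h, velocities.max() + 3 * h, 400)
f_hat = kde(grid, velocities, h)

# Count local modes of the kde: more than one indicates the sample is
# poorly described by a single-normal ML fit.
modes = np.sum((f_hat[1:-1] > f_hat[:-2]) & (f_hat[1:-1] > f_hat[2:]))
print(modes)
```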