1 Introduction
we assume continuity for the usually unknown mean function m;
for any fixed pair of distinct instants, the two corresponding random variables of the process are assumed to be independent.
This assumption is introduced in order to apply the Rajchman theorem (see [5]) or the classical results on the Strong Law of Large Numbers (SLLN) (see [4]). Strictly speaking, only pairwise uncorrelation of the two variables is required but, as can easily be checked, in this case uncorrelation implies independence. Furthermore, independence is a very mild condition here: we may suppose that the total number of white and black balls in the urn is large enough that knowledge of the outcome observed at one instant does not meaningfully modify the probability distribution of the variable observed at the other instant.
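The claim that uncorrelation implies independence for two-valued variables can be checked directly. As a sketch, write X and Y for the two indicator variables observed at the two instants; these names, and p, q for their success probabilities, are notation introduced here for illustration only.

```latex
% X, Y take values in {0,1}; p = P(X=1), q = P(Y=1) (illustrative notation)
\begin{align*}
\operatorname{Cov}(X,Y) &= E(XY) - E(X)\,E(Y) = P(X=1,\,Y=1) - pq ,\\
\operatorname{Cov}(X,Y) = 0 \;&\Longrightarrow\; P(X=1,\,Y=1) = pq ,\\
P(X=1,\,Y=0) &= p - pq = p(1-q) ,\\
P(X=0,\,Y=1) &= q - pq = (1-p)\,q ,\\
P(X=0,\,Y=0) &= 1 - p - q + pq = (1-p)(1-q) .
\end{align*}
```

All four joint probabilities factorize, so the two variables are independent.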
- 1. to study the asymptotic behaviour of the relative frequency in a nonstationary context;
- 2. to estimate the unknown function m, i.e. the mean function of the nonstationary process (1), which is an arbitrary continuous map defined on the interval [0, T].
The evolution of the urn affects sampling: if the number n of observations is large, a non-negligible time interval is needed to collect them, so the n observations are certainly not values taken by the same random variable. For the sake of simplicity, we therefore assume that any r.v. Y(t) can be observed at most once. The point of view adopted here is thus characterized by strong nonstationarity, and consistent estimation of the mean at a fixed time may appear to be a very hard objective.
An approach to estimating the mean function of a nonstationary process was given by M. B. Priestley (see [10], p. 587, and [11], p. 140) for the case where the form of m is known; a polynomial function in t is suggested as an example. Conversely, with no information on the form of m, we obviously cannot construct a consistent estimate of it. The approach adopted here is quite different from the classical methods of time series analysis: the only information available on m is its continuity over [0, T], and no approximation of m by continuous functions of a known form is introduced. The estimation technique involves the process (1), which is a specific case of nonstationarity, but the theoretical results given in the last section hold for a general nonstationary process. The case (1) is only a concrete example of a process with no regularity properties; nevertheless, continuity of the mean function m is a reasonable and unrestrictive assumption, compatible with an arbitrary but not abrupt evolution of the composition of the urn.
Concerning the estimation of the mean function of a nonstationary process, several well-known approaches are available in the literature, for instance smoothing spline estimation [13] or nonparametric regression estimation as in [7] and [9]. These classical approaches, which follow the sieves technique, need the first functions of a basis of a vector space, and the usual assumptions on the smooth function concern its derivatives. The estimation procedure developed in this paper may thus be seen as an alternative method: only continuity is assumed for m, and the sieves technique is not used.
Nevertheless, applying the usual SLLN to study the asymptotic behaviour of (2) is not a trivial step, and several problems arise concerning the process (1). The family of r.v.’s is not a stationary process, so the classical ergodic theory (see, for instance, Chap. 3 in [2]), which is based on a stationary probability distribution and on a measure-preserving transformation, cannot be applied. Likewise, generalizations of ergodic theory such as the Dunford and Schwartz pointwise ergodic theorem (see [6], p. 675) or the Chacon and Ornstein theorem [3] cannot be applied to our problem. Laws of large numbers for random functions cannot be adapted to the problem either. Take, for instance, the Ranga Rao law for D[0, 1]-valued r.v.’s [12]: its main argument rests on observable trajectories in the Skorohod space D[0, 1] of functions with discontinuities of the first kind only, whereas the trajectories of process (1), which include arbitrary functions taking only the values 0 and 1, are not random elements of D[0, T]. Moreover, because of the discontinuity at every point t, observing a trajectory over the whole interval [0, T], and hence applying any law of large numbers based on trajectories, is out of reach. Consequently, the asymptotic arguments concern the sequence (2), where the number of observed r.v.’s tends to infinity.
- (I) the sequence of observation times;
- (II) the permutation.
2 Convergence Elements
3 A General SLLN via Permutations
- (a) The final goal is not only to construct a permutation that makes the measures a weakly convergent sequence, but also to drive the convergence to a chosen limit measure belonging to a prescribed class.
- (b) The definition of this class is, of course, a central and rather technical matter: for details and a rigorous treatment, see the construction leading to Definition 6 in [8].
- (c) The main theorem may be seen as an analogue of the well-known Riemann-Dini theorem for convergent real series: both proofs clearly involve permutations, but the technique adopted in proving the main result above is constructive.
- (d) The above result is a generalization of the classical SLLN for a sequence of r.v.’s having a common finite expectation. Using an elementary equality, and provided the convergence holds, an easy direct comparison is possible:
- 1. In the standard case of a common expectation, for each n the weight 1 is assigned to that common value; the probability measures are therefore invariant with respect to any given permutation, and they converge weakly to the measure concentrated at that value.
- 2. In the general case, when the expectations are arbitrarily different values, the measure depends on the sequence of observation times and on the permutation, and the technique based on weak convergence of the measures is a generalization of the standard case.
Moreover, the limit in the SLLN is written as an integral, i.e. as an expectation with respect to the probability measure P which is the weak limit of the measures; thus P is determined independently of the probability distributions of the r.v.’s.
Finally, let us observe that the main theorem cannot be applied directly to the estimation problem, because the proof technique relies entirely on knowledge of the expectation values, which are precisely the object of estimation.
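The dependence of the limit on the ordering, noted in remarks (c) and (d), can be illustrated numerically. The following is a minimal sketch and not the construction used in the proof: it assumes a hypothetical mean function m(t) = t² on [0, 1] and uses the base-2 van der Corput sequence as a dense set of observation times. Both choices, and all function names, are illustrative assumptions. Only the expectations are averaged, so no randomness is involved.

```python
from itertools import count, islice


def van_der_corput(k, base=2):
    """k-th term of the van der Corput sequence, dense and equidistributed in [0, 1)."""
    x, denom = 0.0, 1.0
    while k > 0:
        k, r = divmod(k, base)
        denom *= base
        x += r / denom
    return x


def m(t):
    # Hypothetical continuous mean function (an assumption for this sketch).
    return t * t


def natural_order():
    # Times in their natural order: equidistributed in [0, 1).
    for k in count():
        yield van_der_corput(k)


def rearranged_order():
    # A permutation of the same times: two points from [0, 1/2)
    # are taken for every point from [1/2, 1).
    low = (van_der_corput(2 * j) for j in count())       # even indices fall in [0, 1/2)
    high = (van_der_corput(2 * j + 1) for j in count())  # odd indices fall in [1/2, 1)
    while True:
        yield next(low)
        yield next(low)
        yield next(high)


def mean_of_m(times, n):
    # Average of the first n expectations m(t_i) in the given order.
    return sum(m(t) for t in islice(times, n)) / n


n = 30000
avg_natural = mean_of_m(natural_order(), n)
avg_rearranged = mean_of_m(rearranged_order(), n)
print(avg_natural, avg_rearranged)
```

In the natural order the times are equidistributed and the average approaches the integral of m over [0, 1], that is 1/3. In the rearranged order two thirds of the mass falls in [0, 1/2), and the average approaches 1/4 instead, although exactly the same times are used.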
4 Estimating E(Y(t))
5 Remarks
- 1. Theorem (2) may be applied simultaneously to several different subintervals of [0, T]; for instance, to all the subintervals belonging to a finite partition of [0, T].
- 2. The policy of choosing the observation times as a dense subset of [0, T] is a technique common to several areas of statistical inference. In this context it can easily be checked that
- (a) this choice derives directly from the evolution of the nonstationary process: at most one observation is possible for any r.v. Y(t). Thus increasing the number of observations means choosing new observation times, and their density in [0, T] ensures a good knowledge of the process.
- (b) the density of the observation times makes the use of permutations necessary: the sequence of observations has no meaning unless a permutation is assigned for choosing the observation times. But the choice of the permutation, as was shown above, has a deep effect in terms of measures and of convergence.
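Remarks 1 and 2 can be illustrated by a small simulation. The sketch below is not claimed to be the estimator of the paper. It assumes a hypothetical linear mean function m(t) = 0.2 + 0.6 t on [0, 1], van der Corput observation times, exactly one 0/1 observation per time, and a partition into four equal subintervals; all names and parameter values are illustrative.

```python
import random


def van_der_corput(k, base=2):
    """k-th term of the van der Corput sequence, dense in [0, 1)."""
    x, denom = 0.0, 1.0
    while k > 0:
        k, r = divmod(k, base)
        denom *= base
        x += r / denom
    return x


def m(t):
    # Hypothetical continuous mean function with values in [0, 1].
    return 0.2 + 0.6 * t


def simulate(n, seed=0):
    # One observation Y(t_i) per time t_i, drawn as Bernoulli(m(t_i)).
    rng = random.Random(seed)
    times = [van_der_corput(k) for k in range(n)]
    obs = [1 if rng.random() < m(t) else 0 for t in times]
    return times, obs


def bin_frequencies(times, obs, n_bins):
    # Relative frequency of ones among the observations in each subinterval.
    sums = [0] * n_bins
    counts = [0] * n_bins
    for t, y in zip(times, obs):
        j = min(int(t * n_bins), n_bins - 1)
        sums[j] += y
        counts[j] += 1
    return [s / c for s, c in zip(sums, counts)]


n_bins = 4
times, obs = simulate(40000)
freqs = bin_frequencies(times, obs, n_bins)
# For a linear m, the average of m over a subinterval is its value at the midpoint.
targets = [m((j + 0.5) / n_bins) for j in range(n_bins)]
for f, g in zip(freqs, targets):
    print(round(f, 3), round(g, 3))
```

Each relative frequency is computed from different random variables, none observed twice, yet it settles near the average of m over its subinterval. Refining the partition trades a smaller approximation error for fewer observations per subinterval.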