To give a further illustration of why invariance is important, we now focus on a particular instance of strongly invariant models: we consider a joint model for some random variables and we assume that this joint model is strongly invariant under permutations of them, so it represents the belief that the order of the variables does not matter. Such models are called exchangeable, and they were first studied in the precise case by Bruno de Finetti [216]. Exchangeability was later extended to the theory of coherent lower previsions by Walley [672, § 9.5]. Here, we follow the more detailed and extensive treatment by De Cooman et al. [213, 214].
By virtue of de Finetti's Representation Theorem [216], an exchangeable model can be seen as a convex mixture of multinomial models. This has given some ground [177, 216, 218] to the claim that aleatory probabilities and IID processes can be eliminated from statistics, and that we can restrict ourselves to considering exchangeable sequences instead.
Consider random variables $X_1$, …, $X_N$ taking values in the same non-empty and finite set $\mathcal{X}$. A subject's beliefs about the values that these random variables assume jointly in $\mathcal{X}^N$ are given by their (joint) distribution, which is a coherent lower prevision $\underline{P}^N$ defined on the set $\mathcal{L}(\mathcal{X}^N)$ of all gambles on $\mathcal{X}^N$.
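Let us denote by $\mathcal{P}_N$ the set of all permutations of the set of indices $\{1,\dots,N\}$. With any such permutation $\pi$ we can associate a permutation of $\mathcal{X}^N$, also denoted by $\pi$, that maps any sequence of observations $x=(x_1,\dots,x_N)$ in $\mathcal{X}^N$ to the permuted sequence $\pi x := (x_{\pi(1)},\dots,x_{\pi(N)})$. Similarly, with any gamble $f$ on $\mathcal{X}^N$, we can consider the permuted gamble $\pi^t f := f\circ\pi$, with in other words $(\pi^t f)(x) = f(\pi x)$ for all $x\in\mathcal{X}^N$, in the manner described in Section 3.3.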
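A subject judges the random variables $X_1$, …, $X_N$ to be exchangeable when she is disposed to exchange any gamble $f$ for the permuted gamble $\pi^t f$ in return for any strictly positive price, meaning that $\underline{P}^N(\pi^t f - f)\geq 0$, for any permutation $\pi\in\mathcal{P}_N$. Taking into account the properties of coherence, this means that:

$$\underline{P}^N(\pi^t f - f) = \overline{P}^N(\pi^t f - f) = 0 \quad\text{for all gambles } f \text{ on } \mathcal{X}^N \text{ and all } \pi\in\mathcal{P}_N.$$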
A subject will make an assumption of exchangeability when she has evidence that the processes generating the values of the random variables are (physically) similar [672, § 9.5.2], and consequently the order in which the variables are observed is not important.
When $\underline{P}^N$ is in particular a linear prevision $P^N$, exchangeability is equivalent to having $P^N(\pi^t f) = P^N(f)$ for all gambles $f$ and all permutations $\pi$. In terms of the (probability) mass function $p^N$ of $P^N$, defined by $p^N(x) := P^N(\{x\})$, this is equivalent to having $p^N(\pi x) = p^N(x)$ for all $x$ in $\mathcal{X}^N$ and all $\pi\in\mathcal{P}_N$; in other words, the mass function $p^N$ should be invariant under any permutation of the indices. This is essentially de Finetti's [216] definition for the exchangeability of a prevision. The following proposition, mentioned in [672, § 9.5], establishes an even stronger link between Walley's and de Finetti's notions of exchangeability.
This result is an immediate consequence of Theorem 3.3: exchangeability means strong invariance with respect to the set $\mathcal{P}_N$ of all permutations of the index set $\{1,\dots,N\}$, and as such it is equivalent to the invariance of all the dominating models.
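The mass-function condition for the precise case can be checked mechanically on small examples. The following is only a minimal sketch: the helper names `permute` and `is_exchangeable` and the two example mass functions are illustrative assumptions, not part of the source.

```python
from itertools import permutations

def permute(x, pi):
    """Return the permuted sequence (x[pi[0]], ..., x[pi[N-1]])."""
    return tuple(x[i] for i in pi)

def is_exchangeable(mass, N, tol=1e-12):
    """Check p(pi x) == p(x) for every sequence x and every index permutation pi."""
    return all(
        abs(mass[permute(x, pi)] - mass[x]) <= tol
        for x in mass
        for pi in permutations(range(N))
    )

# Hypothetical mass functions for two binary variables (N = 2).
N = 2
symmetric = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
asymmetric = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.0, (1, 1): 0.4}

print(is_exchangeable(symmetric, N))   # True
print(is_exchangeable(asymmetric, N))  # False
```

The second mass function fails the check because swapping the two variables changes the mass of the sequence (0, 1).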
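Consider any $x\in\mathcal{X}^N$, then the so-called (permutation) invariant atom $[x] := \{\pi x : \pi\in\mathcal{P}_N\}$ is the smallest subset of $\mathcal{X}^N$ that contains $x$ and is invariant under all permutations $\pi$ in $\mathcal{P}_N$. We denote the partition of all permutation invariant atoms of $\mathcal{X}^N$ by $\mathcal{A}^N$. We can characterize the invariant atoms using the counting maps $T_z^N\colon\mathcal{X}^N\to\mathbb{N}_0$ defined for all $z$ in $\mathcal{X}$ in such a way that $T_z^N(x) := |\{k\in\{1,\dots,N\} : x_k = z\}|$ is the number of components of the $N$-tuple $x$ that assume the value $z$.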
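We let $T^N$ denote the map from $\mathcal{X}^N$ to $\mathbb{N}_0^{\mathcal{X}}$ whose component maps are the $T_z^N$, $z\in\mathcal{X}$. Observe that $T^N$ actually assumes values in the set of count vectors

$$\mathcal{N}^N := \Big\{ m\in\mathbb{N}_0^{\mathcal{X}} : \sum_{z\in\mathcal{X}} m_z = N \Big\}.$$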
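Since permuting the components of a sequence leaves the counts invariant, meaning that $T^N(\pi x) = T^N(x)$ for all $x\in\mathcal{X}^N$ and $\pi\in\mathcal{P}_N$, we see that for all $x$ and $y$ in $\mathcal{X}^N$:

$$y\in[x] \iff T^N(x) = T^N(y).$$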
The counting map $T^N$ can therefore be interpreted as a bijection between the set of invariant atoms $\mathcal{A}^N$ and the set of count vectors $\mathcal{N}^N$, and we can identify any invariant atom $[x]$ by the count vector $m = T^N(x)$ of any (and therefore all) of its elements. We therefore also denote this atom by $[m]$; and clearly $y\in[m]$ if and only if $T^N(y) = m$. The number of elements $\nu(m)$ in any invariant atom $[m]$ is given by the number of different ways in which the components of any $x$ in $[m]$ can be permuted:

$$\nu(m) = \binom{N}{m} := \frac{N!}{\prod_{z\in\mathcal{X}} m_z!}.$$
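The correspondence between invariant atoms and count vectors can be made concrete by enumeration. The sketch below assumes a hypothetical three-element set and $N = 3$; the names `count_vector`, `atom_size` and `atoms` are illustrative choices. It groups all sequences by their count vector and confirms that each group has the multinomial-coefficient size.

```python
from collections import Counter
from itertools import product
from math import factorial

def count_vector(x, values):
    """Counting map: how many components of x equal each z in values."""
    c = Counter(x)
    return tuple(c[z] for z in values)

def atom_size(m):
    """Multinomial coefficient N! / prod_z m_z! for a count vector m."""
    size = factorial(sum(m))
    for m_z in m:
        size //= factorial(m_z)
    return size

values = ("a", "b", "c")   # hypothetical finite set of possible values
N = 3

# Group all sequences of length N by count vector: each group is one invariant atom.
atoms = {}
for x in product(values, repeat=N):
    atoms.setdefault(count_vector(x, values), []).append(x)

# Every atom has exactly as many elements as the multinomial coefficient predicts.
for m, members in atoms.items():
    assert len(members) == atom_size(m)

print(len(atoms))        # 10 count vectors
print(atoms[(2, 1, 0)])  # [('a', 'a', 'b'), ('a', 'b', 'a'), ('b', 'a', 'a')]
```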
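If the random variable $X = (X_1,\dots,X_N)$ assumes the value $x$ in $\mathcal{X}^N$, then the corresponding count vector assumes the value $T^N(x)$ in $\mathcal{N}^N$: we can see $T^N(X)$ as a random variable in $\mathcal{N}^N$. If the available information about the values that $X$ assumes in $\mathcal{X}^N$ is given by the coherent exchangeable lower prevision $\underline{P}^N$, then the corresponding uncertainty model for the values that $T^N(X)$ assumes in $\mathcal{N}^N$ is given by the coherent induced lower prevision $\underline{Q}^N$ on $\mathcal{L}(\mathcal{N}^N)$ (the distribution of $T^N(X)$), given by

$$\underline{Q}^N(g) := \underline{P}^N(g\circ T^N) \quad\text{for all gambles } g \text{ on } \mathcal{N}^N.$$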
Conversely, any exchangeable coherent lower prevision $\underline{P}^N$ is in fact completely determined by the corresponding distribution $\underline{Q}^N$ of the count vectors, also called its count distribution. This establishes a relationship between exchangeability and sampling without replacement.
In order to make this clear, we must first introduce the linear prevision $\mathrm{MuHy}^N(\cdot|m)$ in the following way:

$$\mathrm{MuHy}^N(f|m) := \frac{1}{\nu(m)}\sum_{x\in[m]} f(x) \quad\text{for all gambles } f \text{ on } \mathcal{X}^N.$$

This is the expectation operator for the uniform distribution on the invariant atom $[m]$, and also the expectation operator associated with a multiple hyper-geometric distribution [383, § 39], corresponding to sampling without replacement from an urn with $N$ balls, whose possible types correspond to the elements of $\mathcal{X}$, and whose composition is determined by the count vector $m$: there are $m_z$ balls of type $z$ for any $z\in\mathcal{X}$.
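The urn interpretation can be illustrated by brute-force enumeration. This is only a sketch under stated assumptions: the function name `muhy`, the two-colour urn, and the example gambles are hypothetical, and gambles are assumed to be integer-valued so that exact fractions can be used.

```python
from fractions import Fraction
from itertools import permutations

def muhy(f, m, values):
    """Uniform expectation of an integer-valued gamble f over the atom with count vector m."""
    # Urn contents: m_z balls of type z.
    urn = [z for z, m_z in zip(values, m) for _ in range(m_z)]
    # Distinct orderings of the urn contents form the invariant atom.
    atom = set(permutations(urn))
    return Fraction(sum(f(x) for x in atom), len(atom))

values = ("red", "blue")
m = (2, 1)   # hypothetical urn: two red balls, one blue ball

first_is_red = lambda x: 1 if x[0] == "red" else 0
first_two_red = lambda x: 1 if x[0] == "red" and x[1] == "red" else 0

print(muhy(first_is_red, m, values))   # 2/3
print(muhy(first_two_red, m, values))  # 1/3
```

The second value agrees with drawing without replacement: 2/3 for the first red ball times 1/2 for the second.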
The following theorem implies that any exchangeable coherent lower prevision on $\mathcal{L}(\mathcal{X}^N)$ can be associated with (or equivalently, that any collection of $N$ exchangeable random variables in $\mathcal{X}$ can be seen as the result of) $N$ random draws without replacement from an urn with $N$ balls whose types are characterized by the elements $z$ of $\mathcal{X}$, and whose composition $m$ is unknown (a random variable in $\mathcal{N}^N$), but for which the available information about this composition is modelled by a coherent lower prevision on $\mathcal{L}(\mathcal{N}^N)$. It concerns the linear transformation
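$\mathrm{ex}^N$ of the linear space $\mathcal{L}(\mathcal{X}^N)$, defined by letting

$$\mathrm{ex}^N(f) := \frac{1}{N!}\sum_{\pi\in\mathcal{P}_N}\pi^t f \quad\text{for all gambles } f \text{ on } \mathcal{X}^N.$$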
The lower prevision $\underline{Q}^N$ that represents the beliefs about the composition $m$ of the urn is unique, and is called the count distribution for the lower prevision $\underline{P}^N$.
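In the precise case, the role of the count distribution can be verified numerically: the expectation of a gamble under an exchangeable mass function equals the count-distribution expectation of its atom-wise uniform averages. The sketch below covers only this linear-prevision special case; the mass function `p`, built from hypothetical weights `w`, and the gamble `f` are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

values = (0, 1)
N = 3

def count_vector(x):
    """Count vector (number of zeros, number of ones) of the sequence x."""
    c = Counter(x)
    return tuple(c[z] for z in values)

# Hypothetical exchangeable mass function: p(x) depends only on the number of ones.
w = {0: Fraction(1, 4), 1: Fraction(1, 12), 2: Fraction(1, 12), 3: Fraction(1, 4)}
p = {x: w[sum(x)] for x in product(values, repeat=N)}
assert sum(p.values()) == 1

# Invariant atoms and the induced count distribution q(m) = P(T(X) = m).
atoms, q = {}, {}
for x, px in p.items():
    m = count_vector(x)
    atoms.setdefault(m, []).append(x)
    q[m] = q.get(m, 0) + px

def muhy(f, m):
    """Uniform average of the integer-valued gamble f over the atom [m]."""
    return Fraction(sum(f(x) for x in atoms[m]), len(atoms[m]))

f = lambda x: x[0] + 2 * x[2]   # a gamble that is not permutation invariant

direct = sum(p[x] * f(x) for x in p)
via_counts = sum(q[m] * muhy(f, m) for m in q)
print(direct, via_counts)   # 3/2 3/2
```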
What will usually happen in practice is that a subject makes an assessment that random variables $X_1$, …, $X_N$ taking values in a finite set $\mathcal{X}$ are exchangeable, and in addition specifies supremum acceptable buying prices $\underline{P}(f)$ for all gambles in some (typically finite, but not necessarily so) set of gambles $\mathcal{K}\subseteq\mathcal{L}(\mathcal{X}^N)$. The question then is: can we turn these assessments into an exchangeable coherent lower prevision defined on all of $\mathcal{L}(\mathcal{X}^N)$, that is furthermore as small (least-committal, conservative) as possible?
It turns out that it is possible to determine this coherent lower prevision, which we will call the exchangeable natural extension of $\underline{P}$. Let
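$\underline{Q}$ be the lower prevision on the set $\mathcal{H} := \{\mathrm{MuHy}^N(f|\cdot) : f\in\mathcal{K}\}$ given by

$$\underline{Q}(g) := \sup\{\underline{P}(f) : f\in\mathcal{K},\ \mathrm{MuHy}^N(f|\cdot) = g\} \quad\text{for all } g\in\mathcal{H}.$$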
Using the reasoning in Section 3.3.2 and [206, Theorem 16], we obtain the following:
Since there are quite efficient algorithms (see [679] and Section 16.2) for calculating the natural extension of a lower prevision based on a finite number of assessments, this theorem not only has intuitive appeal, but also provides us with an elegant and efficient manner to find the exchangeable natural extension, i.e., to combine (finitary) local assessments with the structural assessment of exchangeability.
We next extend the results from Section 3.4.1 from finite to countable exchangeable sequences. Consider a countable sequence $X_1$, …, $X_N$, … of random variables taking values in the same non-empty set $\mathcal{X}$. We can see them as a single random variable $X$ assuming values in the set $\mathcal{X}^{\mathbb{N}}$, where $\mathbb{N}$ is the set of the natural numbers (zero not included). In its simplest form, we can model the available information about the value that $X$ assumes in $\mathcal{X}^{\mathbb{N}}$ by a collection of coherent lower previsions $\underline{P}^N$ on $\mathcal{L}(\mathcal{X}^N)$ for all $N\in\mathbb{N}$, where each $\underline{P}^N$ models beliefs about the first $N$ variables $X_1$, …, $X_N$.
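Clearly, the family of coherent lower previsions $\underline{P}^N$, $N\in\mathbb{N}$, must satisfy the following ‘time consistency’ requirement:

$$\underline{P}^N(f) = \underline{P}^{N+k}(\tilde{f}) \quad\text{for all } N,k\in\mathbb{N} \text{ and all gambles } f \text{ on } \mathcal{X}^N,$$

where $\tilde{f}$ denotes the cylindrical extension of $f$ to $\mathcal{X}^{N+k}$, defined by $\tilde{f}(x_1,\dots,x_{N+k}) := f(x_1,\dots,x_N)$, meaning that $\underline{P}^N$ is the $\mathcal{X}^N$-marginal of $\underline{P}^{N+k}$.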
The following definition generalizes de Finetti's exchangeability condition:
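It turns out that exchangeable lower previsions for countable sequences also have a representation result, which generalizes de Finetti's famous Representation Theorem for countable sequences [216]. Consider the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$ of all (probability) mass functions $\theta$ on $\mathcal{X}$:

$$\Sigma_{\mathcal{X}} := \Big\{\theta\in\mathbb{R}^{\mathcal{X}} : \theta_z\geq 0 \text{ for all } z\in\mathcal{X} \text{ and } \sum_{z\in\mathcal{X}}\theta_z = 1\Big\}.$$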
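Given $\theta\in\Sigma_{\mathcal{X}}$, we can consider a sequence of $N$ independent and identically distributed variables, each with this mass function $\theta$. The so-called multinomial expectation operator for these variables is given by:

$$\mathrm{Mn}^N(f|\theta) := \mathrm{CoMn}^N(\mathrm{MuHy}^N(f|\cdot)|\theta) \quad\text{for all gambles } f \text{ on } \mathcal{X}^N,$$

where the (so-called multinomial count) linear prevision $\mathrm{CoMn}^N(\cdot|\theta)$ is defined by

$$\mathrm{CoMn}^N(g|\theta) := \sum_{m\in\mathcal{N}^N} g(m)\,\nu(m)\prod_{z\in\mathcal{X}}\theta_z^{m_z} \quad\text{for all gambles } g \text{ on } \mathcal{N}^N.$$

The corresponding probability mass for any count vector $m\in\mathcal{N}^N$ is given by

$$\mathrm{CoMn}^N(\{m\}|\theta) = B_m(\theta) := \nu(m)\prod_{z\in\mathcal{X}}\theta_z^{m_z},$$

and the polynomial function $B_m$ on the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$ is called a (multivariate) Bernstein basis polynomial of degree $N$ [71, 440, 518]. These basis polynomials $B_m$, $m\in\mathcal{N}^N$, constitute a basis for the linear space $\mathcal{V}^N(\Sigma_{\mathcal{X}})$ of all (multivariate) polynomials on $\Sigma_{\mathcal{X}}$ of degree up to $N$:

$$\mathcal{V}^N(\Sigma_{\mathcal{X}}) = \{\mathrm{CoMn}^N(g|\cdot) : g\in\mathcal{L}(\mathcal{N}^N)\},$$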
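where $\mathrm{CoMn}^N(g|\cdot)$ is the polynomial on $\Sigma_{\mathcal{X}}$ that assumes the value $\mathrm{CoMn}^N(g|\theta)$ in $\theta$. The linear space of all polynomials on $\Sigma_{\mathcal{X}}$, a subspace of $\mathcal{L}(\Sigma_{\mathcal{X}})$, is given by:

$$\mathcal{V}(\Sigma_{\mathcal{X}}) := \bigcup_{N\in\mathbb{N}}\mathcal{V}^N(\Sigma_{\mathcal{X}}).$$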
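The Bernstein basis polynomials and the multinomial count expectation are straightforward to evaluate numerically. The sketch below is illustrative only: the function names `count_vectors`, `bernstein` and `comn`, the mass function `theta` and the choice $N = 4$ are assumptions. It checks two standard facts, namely that the multinomial masses sum to one and that the expected count of a value is $N$ times its probability.

```python
from itertools import product
from math import factorial, isclose, prod

def count_vectors(N, k):
    """All k-tuples of non-negative integers that sum to N."""
    return [m for m in product(range(N + 1), repeat=k) if sum(m) == N]

def bernstein(m, theta):
    """Bernstein basis polynomial: multinomial coefficient times prod_z theta_z ** m_z."""
    nu = factorial(sum(m))
    for m_z in m:
        nu //= factorial(m_z)
    return nu * prod(t ** m_z for t, m_z in zip(theta, m))

def comn(g, N, theta):
    """Multinomial count expectation of a gamble g on count vectors."""
    return sum(g(m) * bernstein(m, theta) for m in count_vectors(N, len(theta)))

theta = (0.5, 0.3, 0.2)   # hypothetical mass function on a three-element set
N = 4

total = comn(lambda m: 1, N, theta)           # the masses B_m(theta) sum to one
mean_first = comn(lambda m: m[0], N, theta)   # expected count of the first value

print(isclose(total, 1.0))                 # True
print(isclose(mean_first, N * theta[0]))   # True
```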
Hence, the belief model for any countable exchangeable sequence of random variables in $\mathcal{X}$ can be completely and uniquely characterized by a coherent lower prevision on the linear space of all polynomial gambles on $\Sigma_{\mathcal{X}}$. In the particular case of a time consistent family of exchangeable linear previsions $P^N$ on $\mathcal{L}(\mathcal{X}^N)$, $N\in\mathbb{N}$, the frequency representation $\underline{R}$ will be a linear prevision $R$ on $\mathcal{V}(\Sigma_{\mathcal{X}})$, fully characterized by its values $R(B_m)$ on the Bernstein basis polynomials $B_m$, $m\in\mathcal{N}^N$, $N\in\mathbb{N}$. This is the essence of de Finetti's Representation Theorem [216].
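The easy direction of this representation, that a mixture of IID (multinomial) models is exchangeable and time consistent, can be checked on a small example. The sketch below is not the theorem itself: it assumes a hypothetical, finitely supported mixing distribution on the simplex (the list `mixture`) and binary variables.

```python
from itertools import permutations, product
from math import isclose, prod

values = (0, 1)
# Hypothetical mixing distribution on the simplex: (weight, theta) pairs.
mixture = [(0.3, (0.9, 0.1)), (0.7, (0.4, 0.6))]

def mass(x):
    """Mass of the sequence x under the mixture of IID models."""
    return sum(w * prod(theta[values.index(z)] for z in x) for w, theta in mixture)

N = 3

# Exchangeability: the mass is unchanged by any permutation of the indices.
exchangeable = all(
    isclose(mass(x), mass(tuple(x[i] for i in pi)))
    for x in product(values, repeat=N)
    for pi in permutations(range(N))
)

# Time consistency: summing out the last variable gives the mass for N - 1 variables.
consistent = all(
    isclose(mass(x), sum(mass(x + (z,)) for z in values))
    for x in product(values, repeat=N - 1)
)

print(exchangeable, consistent)   # True True
```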
A first detailed study of structural judgements in the context of imprecise probabilities was given by Walley [672, §§3 and 9]. In addition to some of the judgements summarised in this chapter, such as epistemic independence and exchangeability, Walley also discusses other types of structural judgements such as conditional independence [672, § 9.2.6] and robust Bernoulli models [672, § 9.6].
Below we discuss some additional references more specifically related to the structural judgements considered in this chapter: independence, invariance and exchangeability.
Besides the notions of independence for imprecise probabilities we have introduced in this chapter, there are other related concepts that may be of interest; see [158, 196] for an interesting overview. Of particular interest is the notion of conditional independence, only briefly touched upon in this chapter. This has been studied for instance in [179, 196, 210, 618]. There are results (see [210, § 4.4]) that indicate that conditional independence may be modelled by means of a collection of epistemic independence assumptions, allowing us to essentially reduce this notion to the one discussed in Section 3.2.
A more detailed study on the relationships between the epistemic and the formalist approaches to independence in terms of coherent lower previsions can be found in [210]. For the formalist approach based on the generalization of the factorization property to the imprecise case, other interesting possibilities are the type-1 or type-2 products introduced in [672, § 9].
Finally, there are also discussions of independence for some of the special cases of coherent lower previsions to be introduced in Chapter 4: see [55, 157] for a survey of independence concepts within evidence theory, and [121, 200] for the particular case of possibility measures. Further information on this matter will be provided in Section 1.3.3.
Our notion of a weakly invariant belief model corresponds to the notion of a ‘reasonable (or invariant) class of priors’ in [512], rather than to a ‘class of reasonable (or invariant) priors’, the latter being what our notion of strong invariance corresponds to. On the other hand, Walley [672, Def. 3.5.1] defines a $\mathcal{T}$-invariant lower prevision $\underline{P}$ as one for which $\underline{P}(T^t f) = \underline{P}(f)$ for all $T\in\mathcal{T}$ and all gambles $f$, so he requires equality, rather than the inequality we require here.
Most of the discussion in Section 3.3 can be found in a more detailed form in [206]. The interested reader can also find in this reference a number of additional results, such as a detailed account of the existence of strongly invariant coherent lower previsions or a connection with Choquet integration.
Finally, an interesting generalization of the notions of weak and strong invariance to a set of Markov operators can be found in [596].
The first detailed study of exchangeability was made by de Finetti [216] (with the terminology of ‘equivalent’ events). An overview of de Finetti's work can be found in [218, § 11.4] and [115]. Other important work on exchangeability can be found, amongst many others, in [234, 352, 360] and, in the context of the behavioural theory of imprecise probabilities, in [672]. We refer to [387, 388] for modern, measure-theoretic discussions of exchangeability. Most of the results we have summarised in this section can be found in [214] (for lower previsions) and [213] (for sets of desirable gambles). In [213] the reader can also find out how to update belief models while maintaining exchangeability.
Work supported by project MTM2010-17844 and by the SBO project 060043 of the IWT-Vlaanderen.