APPENDIX *iii 
On the Heuristic Use of the Classical Definition of Probability, Especially for Deriving the General Multiplication Theorem

The classical definition of probability as the number of favourable cases divided by the number of equally possible cases has considerable heuristic value. Its main drawback is that it is applicable to homogeneous or symmetrical dice, say, but not to biased dice; or in other words, that it does not make room for unequal weights of the possible cases. But in some special cases there are ways and means of getting over this difficulty; and it is in these cases that the old definition has its heuristic value: every satisfactory definition will have to agree with the old definition where the difficulty of assigning weights can be overcome, and therefore, a fortiori, in those cases in which the old definition turns out to be applicable.

(1) The classical definition will be applicable in all cases in which we conjecture that we are faced with equal weights, or equal possibilities, and therefore with equal probabilities.

(2) It will be applicable in all cases in which we can transform our problem so as to obtain equal weights or possibilities or probabilities.

(3) It will be applicable, with slight modifications, whenever we can assign a weight function to the various possibilities.

(4) It will be applicable, or it will be of heuristic value, in most cases where an over-simplified estimate that works with equal possibilities leads to a solution approaching the probabilities zero or one.

(5) It will be of great heuristic value in cases in which weights can be introduced in the form of probabilities. Take, for example, the following simple problem: we are to calculate the probability of throwing with a die an even number when the throws of the number six are not counted, but considered as ‘no throw’. The classical definition leads, of course, to 2/5. We may now assume that the die is biased, and that the (unequal) probabilities p(1), p(2), ..., p(6) of its sides are given. We can then still calculate the required probability as equal to

$$\frac{p(2) + p(4)}{p(1) + p(2) + p(3) + p(4) + p(5)} = \frac{p(2) + p(4)}{1 - p(6)}$$

That is to say, we can modify the classical definition so as to yield the following simple rule:

Given the probabilities of all the (mutually exclusive) possible cases, the required probability is the sum of the probabilities of all the (mutually exclusive) favourable cases, divided by the sum of the probabilities of all the (mutually exclusive) possible cases.

It is clear that we can also express this rule, for exclusive or non-exclusive cases, as follows.

The required probability is always equal to the probability of the disjunction of all the (exclusive or non-exclusive) favourable cases, divided by the probability of the disjunction of all the (exclusive or non-exclusive) possible cases.
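The rule for mutually exclusive cases can be tried out numerically. The following is only a sketch: the function name `weighted_classical` and the weights of the biased die are assumptions introduced for illustration, and do not come from the text.

```python
from fractions import Fraction

def weighted_classical(weights, favourable, possible):
    """Sum of the weights of the favourable cases, divided by the
    sum of the weights of the possible cases (cases mutually exclusive)."""
    possible = set(possible)
    favourable = set(favourable) & possible  # a favourable case must be possible
    total = sum(weights[case] for case in possible)
    if total == 0:
        raise ValueError("the possible cases have total weight zero")
    return sum(weights[case] for case in favourable) / total

# A fair die: every face has the same weight.
fair = {face: Fraction(1, 6) for face in range(1, 7)}

# A hypothetical biased die (illustrative weights, summing to 1).
biased = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
          4: Fraction(1, 5), 5: Fraction(1, 5), 6: Fraction(3, 10)}

not_six = [1, 2, 3, 4, 5]   # the possible cases: a six counts as 'no throw'
even_not_six = [2, 4]       # the favourable cases

print(weighted_classical(fair, even_not_six, not_six))    # 2/5
print(weighted_classical(biased, even_not_six, not_six))  # 3/7
```

With equal weights the function gives the classical value 2/5; with the assumed unequal weights it gives (p(2) + p(4))/(1 − p(6)) = (3/10)/(7/10) = 3/7.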

(6) These rules can be used for a heuristic derivation of the definition of relative probability, and of the general multiplication theorem.

For let us symbolize, in the last example, ‘even’ by ‘a’ and ‘other than a six’ by ‘b’. Then our problem of determining the probability of an even throw if we disregard throws of a six is clearly the same as the problem of determining p(a, b), that is to say, the probability of a, given b, or the probability of finding an a among the b’s.

The calculation can then proceed as follows. Instead of writing ‘p(2) + p(4)’ we can write, more generally, ‘p(ab)’, that is to say, the probability of an even throw other than a six. And instead of writing ‘p(1) + p(2) + ... + p(5)’ or, what amounts to the same, ‘1 − p(6)’, we can write ‘p(b)’, that is to say, the probability of throwing a number other than six. It is clear that these calculations are quite general, and assuming p(b) ≠ 0, we are led to the formula,

$$p(a, b) = \frac{p(ab)}{p(b)} \tag{1}$$

or to the formula (more general because it remains meaningful even if p(b) = 0),

$$p(ab) = p(a, b)\,p(b) \tag{2}$$

This is the general multiplication theorem for the absolute probability of a product ab.
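Why the product form remains meaningful when p(b) = 0 can be seen in a small numerical sketch. The die below, on which a six never turns up, and the candidate values for p(a, b) are assumptions made for illustration only. Since p(ab) cannot exceed p(b), it is zero whenever p(b) is zero, so that the product formula holds whatever value is given to p(a, b), while the quotient form would require a division by zero.

```python
from fractions import Fraction

# Hypothetical die on which a six never turns up (illustrative weights).
P = {1: Fraction(1, 5), 2: Fraction(1, 5), 3: Fraction(1, 5),
     4: Fraction(1, 5), 5: Fraction(1, 5), 6: Fraction(0)}

def p(event):
    """Absolute probability of an event, given as a set of faces."""
    return sum((P[face] for face in event), Fraction(0))

a = {2, 4, 6}  # an even throw
b = {6}        # a six, so that p(b) = 0

# The quotient p(ab) / p(b) is undefined here, but the product formula
# p(ab) = p(a, b) * p(b) holds for every candidate value x of p(a, b).
candidates = [Fraction(0), Fraction(1, 3), Fraction(1)]
holds = [p(a & b) == x * p(b) for x in candidates]
print(holds)  # [True, True, True]
```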

By substituting ‘bc’ for ‘b’, we obtain from (2):¹

$$p(abc) = p(a, bc)\,p(bc)$$

and therefore, by applying (2) to p(bc):

$$p(abc) = p(a, bc)\,p(b, c)\,p(c)$$

or, assuming p(c) ≠ 0,

$$\frac{p(abc)}{p(c)} = p(a, bc)\,p(b, c)$$

This, in view of (1), is the same as

$$p(ab, c) = p(a, bc)\,p(b, c) \tag{3}$$

This is the general multiplication theorem for the relative probability of a product ab.
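The theorem for relative probabilities can be checked on a concrete case. What follows is a sketch under stated assumptions: the weights of the biased die, the third event c (‘greater than two’), and the helper names `p` and `p_rel` are all introduced here for illustration. Relative probability is computed by the quotient of formula (1), so the check presupposes that the second argument has non-zero probability.

```python
from fractions import Fraction

# Hypothetical biased die (illustrative weights, summing to 1).
P = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
     4: Fraction(1, 5), 5: Fraction(1, 5), 6: Fraction(3, 10)}

def p(event):
    """Absolute probability of an event, given as a set of faces."""
    return sum((P[face] for face in event), Fraction(0))

def p_rel(x, y):
    """Relative probability p(x, y) = p(xy) / p(y), as in formula (1)."""
    if p(y) == 0:
        raise ZeroDivisionError("formula (1) presupposes p(y) != 0")
    return p(x & y) / p(y)

a = {2, 4, 6}        # even
b = {1, 2, 3, 4, 5}  # other than a six
c = {3, 4, 5, 6}     # greater than two (an extra event, assumed here)

# Formula (2): p(ab) = p(a, b) p(b)
print(p(a & b), p_rel(a, b) * p(b))   # 3/10 3/10

# Formula (3): p(ab, c) = p(a, bc) p(b, c)
lhs = p_rel(a & b, c)
rhs = p_rel(a, b & c) * p_rel(b, c)
print(lhs, rhs)                       # 1/4 1/4
```

The product ab is represented by set intersection, so that the omission of brackets round ‘bc’ is harmless here: intersection is associative.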

(7) The derivation here sketched can be easily formalized. The formalized proof will have to proceed from an axiom system rather than from a definition. This is a consequence of the fact that our heuristic use of the classical definition consisted in introducing weighted possibilities—which is practically the same as probabilities—into what was the classical definiens. The result of this modification cannot any longer be regarded as a proper definition; rather it must establish relations between various probabilities, and it therefore amounts to the construction of an axiom system. If we wish to formalize our derivation—which makes implicit use of the laws of association and of the addition of probabilities—then we must introduce rules for these operations into our axiom system. An example is our axiom system for absolute probabilities, as described in appendix *ii.

If we thus formalize our derivation of (3), we can get (3) at best only with the condition ‘provided p(c) ≠ 0’, as will be clear from our heuristic derivation.

But (3) may be meaningful even without this proviso, if we can construct an axiom system in which p(a, b) is generally meaningful, even if p(b) = 0. It is clear that we cannot, in a theory of this kind, derive (3) in the way here sketched; but we may instead adopt (3) itself as an axiom, and take the present derivation (see also formula (1) of my old appendix ii) as a heuristic justification for introducing this axiom. This has been done in the system described in the next appendix (appendix *iv).

¹ I omit brackets round ‘bc’ because my interest is here heuristic rather than formal, and because the problem of the law of association is dealt with at length in the next two appendices.