A RELVAR IN BCNF AND NOT 5NF

I’ll start with a revised version—I’ll call it SPJ—of our usual shipments relvar SP. The revisions consist of (a) dropping attribute QTY and (b) introducing a new attribute JNO (“project number”). The predicate is Supplier SNO supplies part PNO to project JNO, and a sample value is shown in Figure 9-1. Note that the relvar is “all key” and therefore certainly in BCNF.[92]

Relvar SPJ—sample value

Figure 9-1. Relvar SPJ—sample value

Now suppose the following business rule is in effect:

In slightly more concrete terms, this business rule says that if (for example) all three of the following are true propositions—

  1. Smith supplies monkey wrenches to some project.

  2. Somebody supplies monkey wrenches to the Manhattan project.

  3. Smith supplies something to the Manhattan project.

—then the following is a true proposition as well:

  1. Smith supplies monkey wrenches to the Manhattan project.

In other words, if relvar SPJ contains tuples representing propositions a., b., and c., it must also contain a tuple representing proposition d.[94] Note that this requirement is met in Figure 9-1 (take S1 to be Smith, P1 to be monkey wrenches, and J1 to be the Manhattan project).

Now, propositions a., b., and c. would normally not imply proposition d. To elaborate, if we know only that propositions a., b., and c. are true, then we know that Smith supplies monkey wrenches to some project j; we know that some supplier s supplies monkey wrenches to the Manhattan project; and we know that Smith supplies some part p to the Manhattan project—but we can’t validly infer that s is Smith, we can’t validly infer that p is monkey wrenches, and we can’t validly infer that j is the Manhattan project. False inferences such as these are examples of what’s sometimes called the connection trap. In the case at hand, however, the business rule tells us there is no trap; that is, we can validly infer proposition d. from propositions a., b., and c. in this particular case.

Now let’s consider the example more carefully. Let me use SP, PJ, and JS, just for the moment, to denote the projections of SPJ on {SNO,PNO}, {PNO,JNO}, and {JNO,SNO}, respectively. Then we have the following:

Thus (s,p,j) appears in SPJ if and only if it appears in the join of SP, PJ, and JS. It follows that every legal value of relvar SPJ is equal to the join of its projections on {SNO,PNO}, {PNO,JNO}, and {JNO,SNO}, and hence that the JD

      { { SNO , PNO } , { PNO , JNO } , { JNO , SNO } }

certainly holds in relvar SPJ.

Observe now that the foregoing JD is ternary—it has three components. What’s more, it isn’t implied by FDs.[95] Hence it certainly isn’t implied by keys (recall from Chapter 5 that a key constraint is just a special case of an FD). As a consequence, relvar SPJ, although it’s in BCNF (because it’s “all key”), isn’t in 5NF.

In order to understand this state of affairs a little better, it’s helpful to go back to the sample SPJ value shown in Figure 9-1. Figure 9-2 shows (a) values of the projections SP, PJ, and JS corresponding to that sample value, (b) the effect of joining the SP and PJ projections (on {PNO}), and (c) the effect of joining that result and the JS projection (on {JNO,SNO}). As you can see, joining the first two projections produces a copy of the original SPJ relation plus one additional (“spurious”) tuple; joining in the other projection then eliminates that additional tuple, thereby getting us back to the original SPJ relation. Moreover, the net effect is the same whatever pair of projections we choose for the first join, though the intermediate result is different in each case. Exercise: Check this claim.

To repeat, therefore, the JD {SP,PJ,JS}—if now you’ll let me use the names SP, PJ, and SJ to refer not to the projections as such but to the corresponding subsets of the heading—holds in relvar SPJ; in other words, that JD captures the essence (as it were) of the original business rule. As a consequence, relvar SPJ can be nonloss decomposed accordingly. What’s more, it probably should be, because it suffers from redundancy; to be specific, in terms of the sample value of Figure 9-1, the proposition that supplier S1 supplies part P1 to project J1 is represented both explicitly, by means of the tuple (S1,P1,J1), and implicitly as a logical consequence of the JD and the propositions represented by the other three tuples.

SPJ = the join of all three of its binary projections but not of any two

Figure 9-2. SPJ = the join of all three of its binary projections but not of any two

More terminology: We say a JD like the one that applies in the SPJ example is tuple forcing, because if certain tuples appear, it forces certain additional tuples to appear as well. In Figure 9-1, for example, the appearance of the three tuples (S1,P1,J2), (S1,P2,J1), and (S2,P1,J1) forces the appearance of the tuple (S1,P1,J1). Note carefully that not all JDs are tuple forcing; for example, the join dependency {{SNO,SNAME,CITY},{CITY,STATUS}} holds in relvar S, but there’s no question of it forcing additional tuples to appear. Note: To jump ahead of ourselves for a moment, it’ll turn out later that a relvar that’s subject to a tuple forcing JD can’t be in 5NF (though as the SPJ example shows, it can be in BCNF).



[92] As a matter of fact it’s in 4NF, too; however, it’s not in what Chapter 13 calls redundancy free normal form, RFNF, and thus not in 5NF either.

[93] It could be argued that, strictly speaking, “supplier s supplies part p” here should really be “supplier s supplies part p to some project” (and similarly for “part p is supplied to project j” and “project j is supplied by supplier s”). Whether such is in fact the case depends on the full semantics of the situation, which I’ve deliberately left a little underspecified in the interest of intuitive simplicity. I’ll come back to this issue in Chapter 15.

[94] I’m being a little sloppy once again. For example, consider proposition a. (“Smith supplies monkey wrenches to some project”). If “some project” here means “some unspecified project”—i.e., there exists such a project, but we don’t know what it is—then the proposition isn’t an instantiation of the predicate for SPJ, and no SPJ tuple can possibly represent it. But an SPJ tuple certainly can represent the proposition “Smith supplies monkey wrenches to some specific project”; what’s more, the proposition so represented then implies the proposition “Smith supplies monkey wrenches to some project” (i.e., “there exists a project j such that Smith supplies monkey wrenches to j”). I hope that’s clear!

[95] Proof: The only FDs that hold in relvar SPJ are trivial ones, and it’s certainly not the case that every relation satisfying those trivial FDs also satisfies the JD. For example, the relation containing the first three but not the fourth of the tuples as shown in Figure 9-1 doesn’t.