Chapter 13. Additional Normal Forms

Where’s it all going to end?

Now, this is not the end. It is not even the beginning of the end.

But it is, perhaps, the end of the beginning.

To paraphrase something I said in Chapter 9, I’ve assumed so far in this book that the only dependencies we care about[120] are ones that have to do with projection as the decomposition operator and join as the corresponding recomposition operator. I also said that, given that assumption, it followed that 5NF was the final normal form. However, I did also say, in a footnote, that there was something called “sixth” normal form or 6NF. In fact, it turns out that we can define, not just 6NF as such, but several other normal forms also, all without departing from those same assumptions regarding available decomposition and recomposition operators. Figure 13-1 (an extended version of Figure 3-3 from Chapter 3) shows how some of those additional normal forms—viz., RFNF, SKNF, and 6NF, shown in boldface italics in the figure—fit into the overall scheme of things, as it were. In this chapter, I’ll be describing those three normal forms as well as (briefly) a few more, for completeness.

Figure 13-1. The normal form hierarchy (II)

Before describing the various additional normal forms as such, I need to spend a little time on another preliminary matter. Recall from Chapter 3 the example in which relvar S was replaced by its projections SNC and CT on {SNO,SNAME,CITY} and {CITY,STATUS}, respectively. As part of my discussion of that example, I pointed out that the following constraint—

     CONSTRAINT ... SNC { CITY } = CT { CITY } ;

—holds (or at least might hold) in the result of the decomposition, and I mentioned that this constraint was in fact an equality dependency. Here’s a definition:

Actually, equality dependencies are an important special case of a more general phenomenon known as inclusion dependencies:

Points arising from this latter definition:

Now, we’re going to be seeing lots of examples of EQDs in particular, as opposed to INDs in general, in what follows. In fact, this state of affairs should be obvious: Nonloss decomposing a relvar into projections usually leads to INDs at least and often to EQDs, as we already know. However, it’s the EQDs that don’t arise as a result of nonloss decomposition that are the interesting ones, in a way. The reason is that the existence of such an EQD often turns out to be a mark of redundancy: If (as I put it in Chapter 3) some piece of information is recorded twice, an EQD might be what’s needed to keep the two representations in agreement.
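To make the difference between an EQD and a plain IND concrete, here is a minimal sketch in Python (not a language used elsewhere in this book). It models a relation as a list of dicts, decomposes a sample value of relvar S into SNC and CT, and checks the constraint SNC{CITY} = CT{CITY}. The helper names project, ind_holds, and eqd_holds, the sample tuples, and the extra Rome tuple are all assumptions made purely for illustration. The sketch treats an EQD as a pair of INDs, one in each direction.

```python
def project(relation, attrs):
    """Project a relation (a list of dicts) onto attrs, eliminating duplicates."""
    distinct = {tuple(row[a] for a in attrs) for row in relation}
    return [dict(zip(attrs, values)) for values in sorted(distinct)]


def ind_holds(r1, r2, attrs):
    """True if every tuple of r1{attrs} also appears in r2{attrs}."""
    p1 = {tuple(row[a] for a in attrs) for row in r1}
    p2 = {tuple(row[a] for a in attrs) for row in r2}
    return p1 <= p2


def eqd_holds(r1, r2, attrs):
    """True if r1{attrs} = r2{attrs}, i.e. the IND holds in both directions."""
    return ind_holds(r1, r2, attrs) and ind_holds(r2, r1, attrs)


# Illustrative sample value for relvar S (assumed data for this sketch).
S = [
    {"SNO": "S1", "SNAME": "Smith", "STATUS": 20, "CITY": "London"},
    {"SNO": "S2", "SNAME": "Jones", "STATUS": 30, "CITY": "Paris"},
    {"SNO": "S3", "SNAME": "Blake", "STATUS": 30, "CITY": "Paris"},
    {"SNO": "S4", "SNAME": "Clark", "STATUS": 20, "CITY": "London"},
    {"SNO": "S5", "SNAME": "Adams", "STATUS": 30, "CITY": "Athens"},
]

# Decompose S into its projections SNC and CT.
SNC = project(S, ["SNO", "SNAME", "CITY"])
CT = project(S, ["CITY", "STATUS"])

# Immediately after the decomposition, both projections mention the same cities.
eqd_after_decomposition = eqd_holds(SNC, CT, ["CITY"])

# If CT is then updated on its own (a city with no suppliers),
# the EQD no longer holds, although the IND from SNC to CT still does.
CT_updated = CT + [{"CITY": "Rome", "STATUS": 10}]
eqd_after_update = eqd_holds(SNC, CT_updated, ["CITY"])
ind_after_update = ind_holds(SNC, CT_updated, ["CITY"])

print(eqd_after_decomposition, eqd_after_update, ind_after_update)
# prints: True False True
```

The second half of the sketch is one way to read the remark that the constraint "might hold": whether the EQD is actually enforced after decomposition, or only the weaker IND, is a design decision that the code merely illustrates.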

By the way, if you haven’t heard much about EQDs before, you might be wondering why not, given their conceptual importance. In my opinion, the most likely reason for the omission is the SQL language ... As you’ll know if you’ve ever tried the exercise, EQDs are extremely awkward to formulate in SQL, because SQL has no direct way of expressing relational comparisons.[121] A striking example in support of this contention can be found in the discussion of Example 12 in the section of that name in Chapter 15.



[120] Apart from equality and inclusion dependencies, that is.

[121] By SQL here, I mean SQL as defined by the SQL standard. The situation is even worse in mainstream implementations, where most EQDs can’t be formulated at all, owing to the fact that the implementations in question don’t allow subqueries in constraints.