DOMAIN-KEY NORMAL FORM

Domain-key normal form (DK/NF) differs from all of the normal forms discussed in this book so far in that it’s not defined in terms of FDs, MVDs, and JDs, as such, at all.^[134] DK/NF is really a kind of “ideal” normal form: It’s desirable because, by definition, a relvar in DK/NF is guaranteed to be free of certain update anomalies; sadly, however, it’s not always achievable, nor has the question “Exactly when can it be achieved?” been answered. Be that as it may, let’s investigate.

DK/NF is defined in terms of domain constraints and key constraints. Key constraints are already familiar, of course (they were defined in Chapter 5). As for domain constraints, I remind you that domain is essentially just another word for type (see the answer to Exercise 2.4 in Appendix D). It follows that a domain constraint ought logically to be the same thing as a type constraint; in other words, it ought simply to be a specification of the set of values that constitute the type in question (see SQL and Relational Theory for further explanation of this concept). However, the term is being used in the present context in a slightly special sense. To be specific, a domain constraint, as that term is used here, is a constraint to the effect that values of a given attribute are taken from some prescribed set of values: for example, a constraint on the suppliers relvar S to the effect that STATUS values (which are integers, i.e., are of type INTEGER) must be in the range one to a hundred, inclusive.

Here then is a definition:

Definition: Relvar R is in domain-key normal form (DK/NF) if and only if every relvar constraint that holds in R is implied by the domain constraints and key constraints that hold in R.^[135]

Enforcing constraints on a DK/NF relvar is thus conceptually simple, since it is sufficient to enforce just the pertinent domain and key constraints, and all constraints—not just FDs, MVDs, and JDs, but all relvar constraints that apply to the relvar in question—on the relvar will then be enforced automatically.

DK/NF was first defined by Fagin in 1981, and it was the DK/NF paper that first gave precise definitions of the terms insertion anomaly and deletion anomaly. I defined these notions in Chapter 10, but there the definitions were framed in terms of JDs specifically. Here for the record are the general definitions (note that they refer to constraints in general, not just ones that happen to be FDs or MVDs or JDs):^[136]

Definition: Relvar R suffers from an insertion anomaly if and only if there exists a legal value r for R and a tuple t with the same heading as R such that the relation obtained by appending t to r satisfies R’s key constraints but is not a legal value for R (i.e., it violates some relvar constraint on R).
Definition: Relvar R suffers from a deletion anomaly if and only if there exists a legal value r for R and a tuple t of r such that the relation obtained by removing t from r is not a legal value for R (i.e., it violates some relvar constraint on R).

Finally, we have the following theorem:

Theorem: So long as every pertinent attribute can take at least two distinct values, DK/NF implies 5NF.

That is (speaking a trifle loosely), every DK/NF relvar is in 5NF, and therefore in RFNF (etc.) as well—though it’s not necessarily in 6NF, of course. In fact, DK/NF and 5NF coincide in the (probably unlikely) special case where the only constraints that hold are FDs and JDs specifically.

^[134] Well ... it’s defined in terms of key constraints, as we’ll see, and key constraints in turn are a special case of FDs, so this remark is perhaps a little economical with the truth.

^[135]A relvar constraint is any constraint that can be tested by examining the pertinent relvar in isolation. For further discussion, see SQL and Relational Theory.

^[136]These definitions, like the ones in Chapter 10, are slightly suspect, inasmuch as they talk about inserting or deleting individual tuples.