Curiously, for a practice that’s so widely advocated, there seems to be considerable confusion over what denormalization actually consists of. (The textbooks aren’t much help, either, even those that specialize in design topics; most of them don’t even mention it, and those that do rarely offer a definition and certainly don’t discuss the matter in much depth.) For example, a while back I had occasion to read a paper specifically devoted to the question of denormalization in commercial SQL products.[73] I’ll refer to that paper as “the denormalization paper” in what follows. Now, the author begins by arguing against denormalization. To quote:
I think the normalization principles should be treated as commandments ... unless you’re faced with performance problems that money, hardware scalability, current SQL technology, network optimization, parallelization, or other performance techniques can’t resolve [slightly reworded, boldface added].
I couldn’t agree more with this position. Indeed, I’m on record as saying very much the same thing myself: In a paper I wrote in 1990 on the use of SQL systems in practice,[74] I recommended denormalization as a performance tactic “only if all else fails.” Unfortunately, however, the rest of the denormalization paper tends to suggest that the author doesn’t really know what denormalization is; after the opening position statement quoted above, the paper goes on to give some eight examples of “designing for performance,” all but one of which have absolutely nothing to do with denormalization at all!
In the author’s defense, however, I say again that it does seem to be difficult to find a precise definition of denormalization in the literature. Of course, it could be argued that no such definition is needed, given that (a) denormalization, whatever else it might be, must surely be the inverse of normalization, and (b) normalization in turn certainly is precisely defined. For the record, though, I’ll give some idea in just a moment as to what a precise definition of denormalization might look like. Before I do, let me make it clear that I have no particular quarrel with the specific design tactics suggested in the denormalization paper; indeed, I suggested several of those same tactics in a paper I wrote myself back in 1982.[75] My quarrel is only with the fact that the paper refers to them as denormalization tactics specifically.
So here’s my own definition, for what it’s worth (and I apologize if it seems a little lengthy). I start with the observation that normalizing relvar R means decreasing redundancy by:
Replacing R by a set of projections R1, ..., Rn such that at least one of R1, ..., Rn is at a higher level of normalization than R, and such that also
For all possible values r of R, if the corresponding values r1, ..., rn of R1, ..., Rn (respectively) are joined back together again, then the result of that join is equal to r.
Hence the following definition:
Definition: Denormalizing a set of relvars R1, ..., Rn means increasing redundancy by:
Replacing R1, ..., Rn by their join R such that R is at a lower level of normalization than at least one of R1, ..., Rn, and such that also
For all possible values r1, ..., rn of R1, ..., Rn (respectively), the result of projecting the corresponding value r of R over the attributes of Ri is equal to ri (i = 1, ..., n).
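By way of illustration, here’s what the two definitions might look like in SQL terms. It’s only a rough sketch, using hypothetical tables R, R1, and R2 and an assumed pair of functional dependencies (SQL tables aren’t relvars, of course, and products differ in syntactic detail), but it does show the shape of the two operations:

   -- Assume the FDs A -> B and B -> C hold in R, so that R is in 2NF but not 3NF.
   CREATE TABLE R ( A INTEGER NOT NULL,
                    B INTEGER NOT NULL,
                    C INTEGER NOT NULL,
                    PRIMARY KEY ( A ) ) ;

   -- Normalizing R: replace it by two projections, each of which is in BCNF ...
   CREATE TABLE R1 ( A INTEGER NOT NULL, B INTEGER NOT NULL, PRIMARY KEY ( A ) ) ;
   CREATE TABLE R2 ( B INTEGER NOT NULL, C INTEGER NOT NULL, PRIMARY KEY ( B ) ) ;

   INSERT INTO R1 ( A, B ) SELECT DISTINCT A, B FROM R ;
   INSERT INTO R2 ( B, C ) SELECT DISTINCT B, C FROM R ;

   -- ... such that joining those projections back together reconstructs R:
   SELECT R1.A, R1.B, R2.C
   FROM   R1 JOIN R2 ON R1.B = R2.B ;

   -- Denormalizing { R1, R2 } is the inverse step: replace them by a single
   -- table holding that join, which is at a lower level of normalization
   -- (here 2NF) than either R1 or R2.

In practice, of course, the denormalization step would typically mean making the joined table the base table and either dropping R1 and R2 or retaining them merely as views defined over it.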
Points arising:
Observe that denormalization is a process that applies to a set of relvars, not to an individual relvar considered in isolation. For example, consider relvars SNC and CT, with headings {SNO,SNAME,CITY} and {CITY,STATUS}, respectively (see Figure 3-2 for some sample values). These two relvars are in BCNF. If we join them together, we get the suppliers relvar S (which is only in 2NF, not in 3NF, and therefore not in BCNF either), and so relvar S can be regarded as a denormalization of relvars SNC and CT. What’s more, of course, relvar S involves more redundancy than relvars SNC and CT do. (This example is sketched in SQL terms immediately following this list.)
If (a) R1, ..., Rn were obtained by taking projections of R in the first place—in other words, if the denormalization is really undoing an earlier normalization, so to speak, as in the suppliers example in the previous bullet item—and if also (b) that earlier normalization was done purely to decrease redundancy and not to fix a logically incorrect design (see the remarks in Chapter 3 on the difference between these two possibilities), then (c) the requirement that for all possible values r of R, projecting r over the attributes of Ri must yield ri (i = 1, ..., n) will be met automatically.
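Here, for what it’s worth, is a rough SQL rendering of the suppliers example from the first bullet above, together with one simple way of checking the projection condition in the definition. The column types are assumed purely for illustration, and the exact syntax will vary somewhat from product to product:

   CREATE TABLE SNC ( SNO    CHAR(5)  NOT NULL,
                      SNAME  CHAR(25) NOT NULL,
                      CITY   CHAR(20) NOT NULL,
                      PRIMARY KEY ( SNO ) ) ;

   CREATE TABLE CT  ( CITY   CHAR(20) NOT NULL,
                      STATUS INTEGER  NOT NULL,
                      PRIMARY KEY ( CITY ) ) ;

   -- The denormalization step: replace SNC and CT by their join S
   -- (S is only in 2NF, because the FD CITY -> STATUS still holds in it).
   CREATE TABLE S   ( SNO    CHAR(5)  NOT NULL,
                      SNAME  CHAR(25) NOT NULL,
                      CITY   CHAR(20) NOT NULL,
                      STATUS INTEGER  NOT NULL,
                      PRIMARY KEY ( SNO ) ) ;

   INSERT INTO S ( SNO, SNAME, CITY, STATUS )
      SELECT SNC.SNO, SNC.SNAME, SNC.CITY, CT.STATUS
      FROM   SNC JOIN CT ON SNC.CITY = CT.CITY ;

   -- The condition in the definition: projecting S over the heading of SNC
   -- must yield SNC, and projecting S over the heading of CT must yield CT.
   -- EXCEPT in both directions checks set equality; all four of these
   -- queries should return no rows.
   SELECT SNO, SNAME, CITY FROM S    EXCEPT SELECT SNO, SNAME, CITY FROM SNC ;
   SELECT SNO, SNAME, CITY FROM SNC  EXCEPT SELECT SNO, SNAME, CITY FROM S ;
   SELECT CITY, STATUS FROM S        EXCEPT SELECT CITY, STATUS FROM CT ;
   SELECT CITY, STATUS FROM CT       EXCEPT SELECT CITY, STATUS FROM S ;

If CT happened to contain a city in which no supplier is located, the last of these queries would return that city, and S would then fail to qualify as a denormalization of SNC and CT under the definition; that’s precisely the kind of situation in which the condition discussed in the second bullet above isn’t met automatically.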
The argument in favor of denormalization is basically that it makes retrievals easier to express and makes them perform better.[76] I’ll examine the extent to which this argument is valid in a later section. First, however, I’d like to point out that once we make the decision to denormalize, we’ve embarked on a very slippery slope. The question is: Where do we stop? The situation is different with normalization, where there are clear logical reasons for continuing the process until we reach the highest possible normal form. Do we then conclude that with denormalization we should proceed until we reach the lowest possible normal form? Surely not; yet there are no logical criteria for deciding exactly where the process should stop. In choosing to denormalize, in other words, we’ve backed off from a position that does at least have some solid science and logical theory behind it, and replaced it by one that’s purely pragmatic in nature (as well as being based, typically, on a somewhat narrow perspective on the overall problem).
[73] Sam Hamdan: “Denormalization and SQL-DBMS,” SQL Forum 4, No. 1 (January/February 1995).
[74] “SQL Dos and Don’ts,” in Relational Database Writings 1985-1989 (Addison-Wesley, 1990).
[75] “A Practical Approach to Database Design,” in Relational Database: Selected Writings (Addison-Wesley, 1986).
[76] It’s also sometimes claimed to make the database easier to understand. Exercise 8.2 addresses this particular issue.