Lesson 16. Creating types with “and” and “or”

After reading lesson 16, you’ll be able to

In this lesson, you’ll take a closer look at some of the types we’ve already covered. You’ll do this so you can learn more about what makes Haskell’s types unique and how to design programs using types. Most of the types you’ve seen so far are algebraic data types. Algebraic data types are any types that can be made by combining other types. The key to understanding algebraic data types is knowing exactly how to combine other types. Thankfully, there are only two ways. You can combine multiple types with an and (for example, a name is a String and another String), or you can combine types with an or (for example, a Bool is a True data constructor or a False data constructor). Types that are made by combining other types with an and are called product types. Types combined using or are called sum types.

Consider this

You’re writing code to help manage the breakfast menu at a local diner. Breakfast specials are made up of selections of one or more sides, a meat choice, and the main meal. Here are the data types for these options:

data BreakfastSide = Toast | Biscuit | Homefries | Fruit deriving Show
data BreakfastMeat = Sausage | Bacon | Ham deriving Show
data BreakfastMain = Egg | Pancake | Waffle deriving Show

You want to create a BreakfastSpecial type representing specific combinations of these items that the customer can choose. Here are the options:

  • Kids’ breakfast—One main and one side
  • Basic breakfast—One main, one meat, and one side
  • The lumberjack!—Two mains, two meats, and three sides

How can you create a single type that allows for these, and only these, possible selections from your other breakfast types?

16.1. Product types—combining types with “and”

Product types are created by combining two or more existing types with and. Here are some common examples:

Although the name product type might make this method of combining types sound sophisticated, this is the most common way in all programming languages to define types. Nearly all programming languages support product types. The simplest example is a struct from C. Here’s an example in C of a struct for a book and an author.

Listing 16.1. C structs are product types—an example with a book and author
struct author_name {
  char *first_name;
  char *last_name;
};

struct book {
  struct author_name author;
  char *isbn;
  char *title;
  int  year_published;
  double price;
};

In this example, you can see that the author_name type is made by combining two Strings (for those unfamiliar, char * in C is a pointer to a character and is the conventional way to represent a string). The book type is made by combining an author_name, two Strings, an Int, and a Double. Both author_name and book are made by combining other types with an and. C’s structs are the predecessor to similar types in nearly every language, including classes and JSON. In Haskell, our book example would look like this.

Listing 16.2. C’s author_name and book structs translated to Haskell
data AuthorName = AuthorName String String

data Book = Book AuthorName String String Int Double

Preferably, you’d use record syntax (lesson 12) to write a version of book even more reminiscent of the C struct.

Listing 16.3. Using record syntax for Book to show the similarity to a C struct
data Book = Book {
     author  :: AuthorName
   , isbn    :: String
   , title   :: String
   , year    :: Int
   , price   :: Double}

Book and AuthorName are examples of product types and have an analog in nearly every modern programming language. What’s fascinating is that in most programming languages, combining types with an and is the only way to make new types.
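
To make the record-syntax version of Book concrete, here’s a minimal sketch of building a value and reading its fields. The sample data, the sampleBook and discounted names, and the deriving Show clauses are illustrative assumptions rather than part of the listings above.

```haskell
data AuthorName = AuthorName String String
  deriving Show

data Book = Book
  { author :: AuthorName
  , isbn   :: String
  , title  :: String
  , year   :: Int
  , price  :: Double
  } deriving Show

-- Hypothetical sample value; every field here is placeholder data.
sampleBook :: Book
sampleBook = Book
  { author = AuthorName "Jane" "Doe"
  , isbn   = "0000000000"
  , title  = "A Sample Title"
  , year   = 2020
  , price  = 44.99
  }

-- Record update syntax builds a new Book with one field changed.
discounted :: Book
discounted = sampleBook { price = 39.99 }

main :: IO ()
main = do
  putStrLn (title sampleBook)  -- each field name doubles as a getter function
  print (year sampleBook)
  print (price discounted)
```

Each field name works as a function from Book to the field’s type, which is what makes the record version read so much like the C struct.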

Quick check 16.1

Q1:

Rewrite AuthorName by using record syntax.

QC 16.1 answer

1:

data AuthorName = AuthorName {
     firstName :: String
   , lastName :: String
}

 

16.1.1. The curse of product types: hierarchical design

Making new types only by combining existing types leads to an interesting model of designing software. Because of the restriction that you can expand an idea only by adding to it, you’re constrained to top-down design, starting with the most abstract representation of a type you can imagine. This is the basis for designing software in terms of class hierarchies.

As an example, suppose you’re writing Java and want to start modeling data for a bookstore. You start with the preceding Book example (assume that the Author class already exists).

Listing 16.4. A first pass at defining a Book class in Java
public class Book {
    Author author;
    String isbn;
    String title;
    int  yearPublished;
    double price;
}

This works great until you realize that you also want to sell vinyl records in the bookstore. Your default implementation of VinylRecord looks like this.

Listing 16.5. Expanding your selection by adding a Java class for VinylRecord
public class VinylRecord {
    String artist;
    String title;
    int  yearPublished;
    double price;
}

VinylRecord is similar to Book, but dissimilar enough that it causes trouble. For starters, you can’t reuse your Author type, because not all artists have names; sometimes the artist is a band rather than an individual. You could use the Author type for Elliott Smith but not for The Smiths, for example. In traditional hierarchical design, there’s no good answer to this issue regarding the Author and artist mismatch (in the next section, you’ll see how to solve this in Haskell). Another problem is that vinyl records don’t have an ISBN.

The big problem is that you want a single type that represents both vinyl records and books so you can make a searchable inventory. Because you can compose types only by and, you need to develop an abstraction that describes everything that records and books have in common. You’ll then implement only the differences in the separate classes. This is the fundamental idea behind inheritance. You’ll next create the class StoreItem, which is a superclass of both VinylRecord and Book. Here’s the refactored Java.

Listing 16.6. Creating a StoreItem superclass of Book and VinylRecord in Java
public class StoreItem {
    String title;
    int  yearPublished;
    double price;
}

public class Book extends StoreItem{
    Author author;
    String isbn;
}

public class VinylRecord extends StoreItem{
    String artist;
}

The solution works okay. You can now write all the rest of your code to work with StoreItems and then use conditional statements to handle Book and VinylRecord. But suppose you realize that you ordered a range of collectible toy figurines to sell as well. Here’s the basic CollectibleToy class.

Listing 16.7. A CollectibleToy class in Java
public class CollectibleToy {
    String name;
    String description;
    double price;
}

To make everything work, you’ve completely refactored all of your code again! Now StoreItem can have only a price attribute, because it’s the only value that all items share in common. The common attributes between VinylRecord and Book have to go back into those classes. Alternatively, you could make a new class that inherits from StoreItem and is a superclass of VinylRecord and Book. What about CollectibleToy’s name attribute? Is that different from title? Maybe you should make an interface for all of your items instead! The point is that even in relatively simple cases, designing strictly in product types can quickly get complex.

In theory, creating object hierarchies is elegant and captures an abstraction about how everything in the world is interrelated. In practice, creating even trivial object hierarchies is riddled with design challenges. The root of all these challenges is that the only way to combine types in most languages is with an and. This forces you to start from extreme abstraction and move downward. Unfortunately, real life is full of strange edge cases that make this much more complicated than you’d typically want.

Quick check 16.2

Q1:

Assume you have a Car type. How could you represent a SportsCar as a Car with a Spoiler? (Assume that you have a Spoiler type as well.)

QC 16.2 answer

1:

data SportsCar = SportsCar Car Spoiler
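
Here’s a hedged, self-contained sketch of this answer in use. The Car and Spoiler definitions, their fields, and the baseCar helper are hypothetical stand-ins, because the question only assumes that those types exist.

```haskell
-- Hypothetical stand-ins for the Car and Spoiler types the question assumes.
data Car = Car { make :: String, doors :: Int } deriving Show
data Spoiler = Spoiler { heightCm :: Int } deriving Show

-- A SportsCar is a Car *and* a Spoiler: a product type.
data SportsCar = SportsCar Car Spoiler deriving Show

-- Hypothetical helper: pattern matching recovers the underlying Car.
baseCar :: SportsCar -> Car
baseCar (SportsCar car _) = car

exampleSportsCar :: SportsCar
exampleSportsCar = SportsCar (Car "Example" 2) (Spoiler 15)

main :: IO ()
main = putStrLn (make (baseCar exampleSportsCar))
```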

 

16.2. Sum types—combining types with “or”

Sum types are a surprisingly powerful tool, given that they provide only the capability to combine two types with or. Here are examples of combining types with or:

The most straightforward sum type is Bool.

Listing 16.8. A common sum type: Bool
data Bool = False | True

A value of type Bool is either the False data constructor or the True data constructor. This can give the mistaken impression that sum types are just Haskell’s way of creating the enumerated types that exist in many other programming languages. But you’ve already seen a case in which sum types can be used for something more powerful, in lesson 12 when you defined two types of names.

Listing 16.9. Using a sum type to model names with and without middle names
type FirstName = String
type LastName = String
type MiddleName = String

data Name = Name FirstName LastName
   | NameWithMiddle FirstName MiddleName LastName

In this example, you can use two data constructors: a Name consisting of two Strings (a FirstName and a LastName), or a NameWithMiddle consisting of three Strings. Here, using or between the two allows you to be expressive about what types mean. Adding or to the tools you can use to combine types opens up worlds of possibility in Haskell that aren’t available in programming languages without sum types. To see how powerful sum types can be, let’s resolve some of the issues in the previous section.
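
A function that consumes a sum type usually has one equation per data constructor. The following is a minimal sketch of that pattern for the two-constructor Name type; the fullName function and the sample names are illustrative assumptions, not part of the original listing.

```haskell
type FirstName = String
type LastName = String
type MiddleName = String

data Name = Name FirstName LastName
          | NameWithMiddle FirstName MiddleName LastName

-- Hypothetical helper: one equation for each data constructor of Name.
fullName :: Name -> String
fullName (Name f l)             = f ++ " " ++ l
fullName (NameWithMiddle f m l) = f ++ " " ++ m ++ " " ++ l

main :: IO ()
main = do
  putStrLn (fullName (Name "Jane" "Doe"))
  putStrLn (fullName (NameWithMiddle "Jane" "Quinn" "Doe"))
```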

An interesting place to start is the difference between author and artist. In our example, you need two types because you assume that the name of each book author can be represented as a first and last name, whereas an artist making records can be represented as a person’s name or a band’s name. Resolving this problem with product types alone is tricky. But with sum types, you can tackle this problem rather easily. You can start with a Creator type that’s either an Author or an Artist (you’ll define these next).

Listing 16.10. A Creator type that’s either an Author or an Artist
data Creator = AuthorCreator Author | ArtistCreator Artist

You already have a Name type, so you can start by defining Author as a name.

Listing 16.11. Defining the Author type by using your existing Name type
data Author = Author Name

An artist is a bit trickier; as we already mentioned, Artist can be a person’s name or a band’s name. To solve this issue, you’ll use another sum type!

Listing 16.12. An artist can be either a Person or a Band
data Artist = Person Name | Band String

This is a good solution, but what about some of those tricky edge cases that pop up in real life all the time? For example, what about authors such as H.P. Lovecraft? You could force yourself to use Howard Phillips Lovecraft, but why force yourself to be constrained by your data model? It should be flexible. You can easily fix this by adding another data constructor to Name.

Listing 16.13. Expanding your Name type to work with H.P. Lovecraft
data Name = Name FirstName LastName
   | NameWithMiddle FirstName MiddleName LastName
   | TwoInitialsWithLast Char Char LastName

Notice that Artist, Author, and as a result, Creator all depend on the definition of Name. But you had to change only the definition of Name itself and didn’t need to worry at all about how any other types using Name are defined. At the same time, you still benefit from code reuse, as both Artist and Author types benefit from having Name defined in a single place. As an example of all of this, here’s a Creator value for H.P. Lovecraft.

Listing 16.14. Making a Creator value for H.P. Lovecraft
hpLovecraft :: Creator
hpLovecraft = AuthorCreator
               (Author
                 (TwoInitialsWithLast 'H' 'P' "Lovecraft"))

Although the data constructors in this example may be verbose, in practice you’d likely be using functions that would abstract out much of this. Now think of how this solution compares to one you could come up with using the hierarchical design that product types require. From the hierarchical design standpoint, you’d need to have a Name superclass with only a last-name attribute (because this is the only property that all three types of name share). Then you’d need separate subclasses for each of the three data constructors you use. But even then, a name such as Andrew W.K., whose last name is given only as initials, would completely break that model. This is an easy fix with sum types.

Listing 16.15. Easily expanding Name to work with Andrew W.K.
data Name = Name FirstName LastName
   | NameWithMiddle FirstName MiddleName LastName
   | TwoInitialsWithLast Char Char LastName
   | FirstNameWithTwoInits FirstName Char Char

The only solution for the product-type-only view is to create a Name class with a growing list of fields, most of which go unused for any given name:

public class Name {
    String firstName;
    String lastName;
    String middleName;
    char firstInitial;
    char middleInitial;
    char lastInitial;
}

This would require a lot of extra code to ensure that everything behaves correctly. Additionally, you have no guarantees about your Name being in a valid state. What if all these attributes had values? There’s nothing a type checker in Java could do to ensure that a Name object met the constraints you’ve specified for names. In Haskell, you can know that only the explicit types you’ve defined can exist.
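
To see what that guarantee buys you, here’s a sketch of functions that handle every Name and Creator case by pattern matching. The renderName and renderCreator helpers are hypothetical names introduced for illustration. If you later add a constructor to Name and forget to handle it, compiling with GHC’s -Wall option (which includes -Wincomplete-patterns) should warn about the missing case.

```haskell
type FirstName = String
type LastName = String
type MiddleName = String

data Name = Name FirstName LastName
          | NameWithMiddle FirstName MiddleName LastName
          | TwoInitialsWithLast Char Char LastName
          | FirstNameWithTwoInits FirstName Char Char

data Author = Author Name
data Artist = Person Name | Band String
data Creator = AuthorCreator Author | ArtistCreator Artist

-- Hypothetical helper: one equation per Name constructor.
renderName :: Name -> String
renderName (Name f l)                      = f ++ " " ++ l
renderName (NameWithMiddle f m l)          = f ++ " " ++ m ++ " " ++ l
renderName (TwoInitialsWithLast i1 i2 l)   = [i1, '.', i2, '.', ' '] ++ l
renderName (FirstNameWithTwoInits f i1 i2) = f ++ " " ++ [i1, '.', i2, '.']

-- Hypothetical helper: unwraps a Creator down to displayable text.
renderCreator :: Creator -> String
renderCreator (AuthorCreator (Author n)) = renderName n
renderCreator (ArtistCreator (Person n)) = renderName n
renderCreator (ArtistCreator (Band b))   = b

main :: IO ()
main = mapM_ (putStrLn . renderCreator)
  [ AuthorCreator (Author (TwoInitialsWithLast 'H' 'P' "Lovecraft"))
  , ArtistCreator (Person (FirstNameWithTwoInits "Andrew" 'W' 'K'))
  , ArtistCreator (Band "The Smiths")
  , ArtistCreator (Person (Name "Elliott" "Smith"))
  ]
```

There’s no way to build a Name with, say, both a middle name and a middle initial, so neither function needs code to check for that state.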

16.3. Putting together your bookstore

Now let’s revisit our bookstore problem and see how thinking with sum types can help. With your powerful Creator type in hand, you can now rewrite Book.

Listing 16.16. The Book type using Creator
data Book = Book {
     author    :: Creator
   , isbn      :: String
   , bookTitle :: String
   , bookYear  :: Int
   , bookPrice :: Double
   }

You can also define your VinylRecord type.

Listing 16.17. The VinylRecord type
data VinylRecord = VinylRecord {
     artist        :: Creator
   , recordTitle   :: String
   , recordYear    :: Int
   , recordPrice   :: Double
   }

Why not just price?

The careful reader may notice that Book and VinylRecord have their own unique name for price. Why not make working with these types more consistent and use the name price rather than bookPrice and recordPrice? The issue here has nothing to do with the limitation of sum types but rather a limitation of Haskell’s way of dealing with record syntax. You’ll recall that without record syntax, you’d define your book type as follows:

data Book = Book Creator String String Int Double

Record syntax automates creating a function like this:

price :: Book -> Double
price (Book _ _ _ _ val) = val

The problem is that using the same name for a property of both a Book and a VinylRecord means defining conflicting functions!

This is incredibly annoying, and a failing of Haskell I have a hard time forgiving. We’ll touch on workarounds later in the book. But if you think this is ridiculous, you’re not alone.

Now you can trivially create a StoreItem type.

Listing 16.18. A StoreItem type is either a Book or a VinylRecord
data StoreItem = BookItem Book | RecordItem VinylRecord

But once again, we’ve forgotten about the CollectibleToy. Because of sum types, it’s easy to add this data type and extend your StoreItem type to include it.

Listing 16.19. Adding a CollectibleToy type
data CollectibleToy = CollectibleToy {
     name :: String
   , description :: String
   , toyPrice :: Double
   }

Fixing StoreItem just means adding one more or.

Listing 16.20. Easily refactoring StoreItem to include CollectibleToy
data StoreItem = BookItem Book
 | RecordItem VinylRecord
 | ToyItem CollectibleToy

Finally, we’ll demonstrate how to build functions that work on all of these types by writing a price function that gets the price of any item.

Listing 16.21. An example of using the StoreItem type with a price function
price :: StoreItem -> Double
price (BookItem book) = bookPrice book
price (RecordItem record) = recordPrice record
price (ToyItem toy) = toyPrice toy

Sum types allow you to be dramatically more expressive with your types while still providing convenient ways to create groups of similar types.
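
Because every item is a StoreItem, a single list can hold books, records, and toys side by side. The sketch below sums the prices of such a list. The totalPrice function, the inventory value, and all of its data are made up for illustration, and Creator is replaced by a simplified stand-in to keep the example short.

```haskell
-- Simplified stand-in for the lesson's Creator type (an assumption made
-- only to keep this sketch short).
data Creator = Individual String | Group String

data Book = Book
  { author    :: Creator
  , isbn      :: String
  , bookTitle :: String
  , bookYear  :: Int
  , bookPrice :: Double
  }

data VinylRecord = VinylRecord
  { artist      :: Creator
  , recordTitle :: String
  , recordYear  :: Int
  , recordPrice :: Double
  }

data CollectibleToy = CollectibleToy
  { name        :: String
  , description :: String
  , toyPrice    :: Double
  }

data StoreItem = BookItem Book
               | RecordItem VinylRecord
               | ToyItem CollectibleToy

price :: StoreItem -> Double
price (BookItem book)     = bookPrice book
price (RecordItem record) = recordPrice record
price (ToyItem toy)       = toyPrice toy

-- Hypothetical helper: total price of a mixed list of items.
totalPrice :: [StoreItem] -> Double
totalPrice items = sum (map price items)

-- Made-up inventory; every title, year, and price is a placeholder.
inventory :: [StoreItem]
inventory =
  [ BookItem (Book (Individual "Jane Doe") "0000000000" "Sample Book" 2001 10.0)
  , RecordItem (VinylRecord (Group "Sample Band") "Sample Record" 1999 20.0)
  , ToyItem (CollectibleToy "Sample Toy" "A placeholder figurine" 5.5)
  ]

main :: IO ()
main = print (totalPrice inventory)
```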

Quick check 16.3

Q1:

Assume that Creator is an instance of Show. Write a madeBy function that has the type StoreItem -> String and does its best to determine who made the StoreItem.

QC 16.3 answer

1:

madeBy :: StoreItem -> String
madeBy (BookItem book) = show (author book)
madeBy (RecordItem record) = show (artist record)
madeBy _ = "unknown"
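
To try madeBy yourself, you need Show instances for Creator and the types it’s built from. One way to get them, sketched here as an assumption rather than the lesson’s own definitions, is to add deriving Show to each type. The record types are trimmed to the fields this example needs, and the sample values are placeholders.

```haskell
-- Trimmed-down versions of the lesson's types (assumption: only a few
-- fields are kept, and Show is derived for Creator and its parts).
data Name = Name String String deriving Show
data Author = Author Name deriving Show
data Artist = Person Name | Band String deriving Show
data Creator = AuthorCreator Author | ArtistCreator Artist deriving Show

data Book = Book { author :: Creator, bookTitle :: String }
data VinylRecord = VinylRecord { artist :: Creator, recordTitle :: String }
data CollectibleToy = CollectibleToy { name :: String, toyPrice :: Double }

data StoreItem = BookItem Book
               | RecordItem VinylRecord
               | ToyItem CollectibleToy

madeBy :: StoreItem -> String
madeBy (BookItem book)     = show (author book)
madeBy (RecordItem record) = show (artist record)
madeBy _                   = "unknown"  -- toys carry no Creator

-- Placeholder items used only for this demonstration.
sampleRecord :: StoreItem
sampleRecord = RecordItem (VinylRecord (ArtistCreator (Band "The Smiths")) "Sample Record")

sampleToy :: StoreItem
sampleToy = ToyItem (CollectibleToy "Sample Toy" 5.5)

main :: IO ()
main = do
  putStrLn (madeBy sampleRecord)
  putStrLn (madeBy sampleToy)
```

The derived Show output includes the constructor names, so a friendlier display would need a hand-written function instead.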

 

Summary

In this lesson, our objective was to teach you about the two ways to create types from existing types. The first way is with product types. Product types work by combining types using and, bundling two or more types together to define a new type. Nearly every programming language supports product types, even if not by that name. The other way to combine types is with or. Sum types are much less common than product types. The problem with product types alone is that you’re forced to think in hierarchical abstractions. Sum types are a powerful tool that allows you to be much more expressive in defining new types. Let’s see if you got this.

Q16.1

To further complicate the items in your store, you eventually keep an inventory of free pamphlets. Pamphlets have a title, a description, and a contact field for the organization that provides the pamphlet. Create the Pamphlet type and add it to StoreItem. Additionally, modify the price function so that it works with Pamphlet.

Q16.2

Create a Shape type that includes the following shapes: Circle, Square, and Rectangle. Then write a function to compute the perimeter of a Shape as well as its area.