I suggested that one important feature for a type system is how much it allows us to describe, so we’ll start looking at the language of Haskell types first. It’s perfectly accurate and appropriate to describe a type system as a language—it has words, grammar rules, and meanings. However, languages differ in what they can be used to express (or articulate), by virtue of the forms of sentences available.
Propositional logic can only talk about atomic propositions, like “Rubies are red,” and combinations of these via the logical operators—for example, “Rubies are red and not(Rails is written in C).” Predicate logic allows some more flexibility by allowing the ability to quantify over entities, hence enabling sentences like, “For all m, m is mortal -> father_of(m) likes Rails” (in other words, every mortal’s father likes Rails—probably untrue but we’re not talking truth here, just talking about what we can articulate). In this example, we speak of the variable m being bound by the for-all quantifier. We can have many other kinds of logic, such as modal logic where we talk about possibility vs. necessity, or second-order logic where we can quantify over properties of elements, or...well, whatever we want.
Simple type systems (such as Pascal) can be like propositional logic, only able to say basic things such as, “This method takes two Ints and returns a String.” Typically, we’d express this as (Int, Int) -> String, using the thin arrow symbol as a kind of implication, in the sense of, “If you give me two Ints, then you get a String back.”
Programmers can add new type names—for example, a Person record to contain name, age, and address—and use these type names among the ones supplied in the language.
C allows references to functions to be passed around, and these can be represented using nesting—for example, (Int, Int -> Int) -> String for taking an Int and a function (from Int to Int) and returning a String. This is OK for basic work, but clearly more flexibility is needed.
Haskell’s core type system is a restricted form of predicate logic. It allows variables in the type expressions to stand for arbitrary types, enabling a feature called parametric polymorphism. This kind of polymorphism (for there are several) is about using the same piece of code whatever the actual input types. For example, taking the length of a list has type [a] -> Int, and can be computed without considering the type of list elements: all it needs to do is walk through the list structure counting the nodes. It’s completely irrelevant what is in the list, and no choice of element type can affect how the counting works! Similarly, list reversal has type [a] -> [a]; again, it is parametrically polymorphic and able to work with any list. Notice that type variable ’a’ appears twice; this means that the input type is the same as the output list, so it can never, for example, convert a list of Ints to a list of Strings.
Stepping back a bit, this “it works for everything, and works consistently” view is often precisely what we want to articulate, and the benefits for checking code should be clear. For example, the mapping operation on lists has type
| (a -> b) -> [a] -> [b] |
meaning in English, “Give me a function from some a to some b, and a list of some ’a’s, and you’ll get back a list of some ’b’s.” Notice how it says something useful about what map does, that we no longer need to explain in the documentation? Furthermore, when we use map, the compiler can check automatically that its use is appropriate.
Similarly, function composition f . g = \x -> f (g x) has type (b -> c) -> (a -> b) -> a -> c, translated as “If we can convert ’b’s to ’c’s and ’a’s to ’b’s, then we can convert ’a’s to ’c’s, for whatever choice of a,b,c.” Again, the type clearly says what the requirements are and indicates what we can expect back.
We can use polymorphism when defining our own data types too; for example:
| data SamePair a = MkSamePair a a |
has a constructor that combines two values of the same type. You saw some other examples last time. Let’s now look at how the type language is used.