Some type about Sum types (and more)

Home

A computer stores bits, and 8 bits make a byte.

Some bytes are not like other bytes – bytes can mean different things.

Some bytes mean an integer, some bytes mean a list; we say that they have different types.

A type is a group of things

An integer is a group of integers, like 1, 2, 3, and so on, and so forth.

A boolean is a group that has 2 members, True and False.

But we’re getting ahead of ourselves, so let’s start from 0.

A void type has no members. It is void.

A unit type has only 1 member, and the member is called unit. You see it as () in many places.

What type has 2 members?

Perhaps a boolean!

True and False are both members of boolean. True is not False, and False is not True. But they are both boolean.

We can write it as such

type boolean = True | False

This reads as: “The type boolean has 2 members, it can either be True or False”.

(In OCaml, the type boolean is actually spelled bool.)

Sum types

Suppose you’re deciding if you should continue reading.

type decision =
  | Decided of boolean
  | Undecided

We can make a decision

do (Decided True), or
do not (Decided False), or
be Undecided (there is no try).

We observe that the type decision has 3 members.

In the Decided case, there are 2 members.

The number of members is the same as the number of members in a boolean, because the Decided case is made up entirely of boolean.

In the Undecided case, there is 1 member, similar to Unit.

How did we get 3? A decision can be Decided (2) or Undecided (1). 2 + 1 = 3.

See how or and + works in similar ways.

That’s why they call it a sum type!

Product types

Let’s make a Pair of booleans.

type pair = Pair of boolean * boolean;

How many members are in this type? 4!

True * True
True * False
False * True
False * False

How did we get 4?

For the first boolean, there are 2 members, True and False.

And for each of these 2 members, the second boolean can be True or False.

2 * 2 = 4

And that’s where the name product comes from.

(Notice how the OCaml syntax for defining a product type, or a tuple, is using the * operator, which in many cases mean product)

Exponential types

You’ve made a decision, and depending on the decision you will either continue reading, or not.

Here’s one particular way to encode this decision-to-action process.

let continue = function
    | Decided True -> True
    | Decided False -> False
    | Undecided -> True

If you have decided, you follow your decision. If you are undecided, you continue reading and see what happens.

continue is a function, its type is decision -> boolean.

Let’s ask the same question: how many members are there in the decision -> boolean type?

A function of this type has to, for each member of the input type, decision with 3 members, give a member of the output type, boolean with 2 members.

So the total number of unique functions of this type is 2 * 2 * 2 = 8, or equivalently, 2 ^ 3!

DT, DF, U -> True
DT, DF, U -> False
DT, DF -> True, U -> False
DT, U -> True, DF -> False
DF, U -> True, DT -> False
DT -> True, DF, U -> False
DF -> True, DT, U -> False
U -> True, DF, DT -> False

So you see, function types are also called exponential types.

Dependent sum types / Sigma type

A Pair boolean integer is a type, but a Pair boolean a', where a' depends on what boolean actually is, True or False, is called a dependent sum type.

Here’s an example (not actual valid OCaml code):

type pair = Pair of (b : boolean) * a'
where Pair of True * boolean
      Pair of False * integer

It says that that the pair depends on the first item in the pair, if it is True, we have a Pair of boolean * boolean, but if it is False, we have a Pair of boolean * integer.

To count how many members there are in this type pair, we can treat it like a sum type.

The first item in the pair can have 2 possible values

when True, the second item has to be a boolean, which has 2 possible value.
when False, the second item has to be a integer, which for simplicity sake, can have 256 values.

Therefore, the total number of members is 2 + 256 = 258.

Why is this called sigma? Because you are taking the sum of all possible types across all values of the type being depended on (first item in the pair).

Dependent product types / Pi type

A decision -> boolean is a function type, with an argument of type decision, and a return type boolean.

But a function decision -> a', where a' depends on what decision actually is, True or False, is called a dependent product type.

type continue = decision -> a'
where continue Decided True : integer
      continue Decided False : boolean
      continue Undecided : boolean

This says that when the type of continue depends on the value of the argument.

When decision is Decided True, continue returns an integer, when Decided False or Undecided, it returns a boolean.

Let’s count how many members type has. Recall how we counted the number of members a function type has previously, we multiplied the number of members in the output type n times, where n is the number of member in the input type.

In this case, since the number of members in the output type changes, we have to multiply them one by one instead of taking an exponent, to get 256 * 2 * 2 = 1024.

Why is this called pi? Because you are taking the product of all possible types across all values of the type being depended on (argument of the function).

And actually, the function space is a special case of the pi type, where the return type is constant, i.e. does not depend on the value of the argument.

A little theory behind this

We can think of types as a set of things. We say that some value is of that type when it is a member in that set.

For example, boolean is the set of things, which contains True and False.

A sum type is a disjoint union, or tagged union.

A product type is a cartesian product.

A sigma is the symbol to do a sum over a sequence.

A pi is the symbol to do a product over a sequence.

A lot of this comes from, or is related, to category theory.

Notes

I chose to say that a type has 2 members to invoke the idea of thinking about types as sets. Other words that can be used include value (rejected because it means too many things).

The real reason why the words sum, product, etc are used probably has less to do with counting members, and more to do with category theory. But it works well enough if you are trying to get an intuition about the meaning.

References

I found a lot of relevant and helpful resources while researching on sum types:

Manis Goregaokar. What Are Sum, Product, and Pi Types?
Chris Taylor. The Algebra of Algebraic Data Types, Part 1 (he goes into recursive types in the next post, check it out!)
Wikipedia page on Dependent Type
Idris, which has dependent types, for tips on the made up syntax.
Intuitionistic type theory