# Some type about Sum types (and more)

2017-03-22 19:00

A computer stores bits, and 8 bits make a byte.

Some bytes are not like other bytes – bytes can mean different things.

Some bytes mean an integer, some bytes mean a list; we say that they have different types.

## A type is a group of things

An `integer`

is a group of integers, like 1, 2, 3, and so on, and so forth.

A `boolean`

is a group that has 2 members, True and False.

But we’re getting ahead of ourselves, so let’s start from 0.

A `void`

type has no members. It is void.

A `unit`

type has only 1 member, and the member is called `unit`

. You see it as `()`

in many places.

What type has 2 members?

Perhaps a `boolean`

!

`True`

and `False`

are both members of `boolean`

. `True`

is not `False`

, and `False`

is not `True`

. But they are both `boolean`

.

We can write it as such

`type boolean = True | False`

This reads as: “The type boolean has 2 members, it can either be True or False”.

(In OCaml, the type `boolean`

is actually spelled `bool`

.)

## Sum types

Suppose you’re deciding if you should continue reading.

```
type decision =
| Decided of boolean
| Undecided
```

We can make a decision

- do (
`Decided True`

), or - do not (
`Decided False`

), or - be
`Undecided`

(there is no try).

We observe that the type `decision`

has 3 members.

In the `Decided`

case, there are 2 members.

The number of members is the same as the number of members in a `boolean`

, because the `Decided`

case is made up entirely of `boolean`

.

In the `Undecided`

case, there is 1 member, similar to `Unit`

.

How did we get 3? A `decision`

can be `Decided`

(2) **or** `Undecided`

(1). 2 **+** 1 = 3.

See how **or** and `+`

works in similar ways.

That’s why they call it a sum type!

## Product types

Let’s make a `Pair`

of `boolean`

s.

`type pair = Pair of boolean * boolean;`

How many members are in this type? 4!

```
True * True
True * False
False * True
False * False
```

How did we get 4?

For the first boolean, there are 2 members, True and False.

And for each of these 2 members, the second boolean can be True or False.

2 * 2 = 4

And that’s where the name **product** comes from.

(Notice how the OCaml syntax for defining a product type, or a tuple, is using the `*`

operator, which in many cases mean product)

## Exponential types

You’ve made a decision, and depending on the decision you will either continue reading, or not.

Here’s one particular way to encode this decision-to-action process.

```
let continue = function
| Decided True -> True
| Decided False -> False
| Undecided -> True
```

If you have decided, you follow your decision. If you are undecided, you continue reading and see what happens.

`continue`

is a function, its type is `decision -> boolean`

.

Let’s ask the same question: how many members are there in the `decision -> boolean`

type?

A function of this type has to, for each member of the input type, `decision`

with 3 members, give a member of the output type, `boolean`

with 2 members.

So the total number of unique functions of this type is 2 * 2 * 2 = 8, or equivalently, 2 ^ 3!

- DT, DF, U -> True
- DT, DF, U -> False
- DT, DF -> True, U -> False
- DT, U -> True, DF -> False
- DF, U -> True, DT -> False
- DT -> True, DF, U -> False
- DF -> True, DT, U -> False
- U -> True, DF, DT -> False

So you see, function types are also called **exponential** types.

## Dependent sum types / Sigma type

A `Pair boolean integer`

is a type, but a `Pair boolean a'`

, where `a'`

depends on what `boolean`

actually is, `True`

or `False`

, is called a dependent sum type.

Here’s an example (not actual valid OCaml code):

```
type pair = Pair of (b : boolean) * a'
where Pair of True * boolean
Pair of False * integer
```

It says that that the `pair`

depends on the first item in the pair, if it is `True`

, we have a `Pair of boolean * boolean`

, but if it is `False`

, we have a `Pair of boolean * integer`

.

To count how many members there are in this type `pair`

, we can treat it like a sum type.

The first item in the pair can have 2 possible values

- when
`True`

, the second item has to be a`boolean`

, which has 2 possible value. - when
`False`

, the second item has to be a`integer`

, which for simplicity sake, can have 256 values.

Therefore, the total number of members is 2 **+** 256 = 258.

Why is this called sigma? Because you are taking the sum of all possible types across all values of the type being depended on (first item in the pair).

## Dependent product types / Pi type

A `decision -> boolean`

is a function type, with an argument of type `decision`

, and a return type `boolean`

.

But a function `decision -> a'`

, where `a'`

depends on what `decision`

actually is, `True`

or `False`

, is called a dependent product type.

```
type continue = decision -> a'
where continue Decided True : integer
continue Decided False : boolean
continue Undecided : boolean
```

This says that when the type of `continue`

depends on the value of the argument.

When `decision`

is `Decided True`

, `continue`

returns an integer, when `Decided False`

or `Undecided`

, it returns a boolean.

Let’s count how many members type has. Recall how we counted the number of members a function type has previously, we multiplied the number of members in the output type n times, where n is the number of member in the input type.

In this case, since the number of members in the output type changes, we have to multiply them one by one instead of taking an exponent, to get 256 * 2 * 2 = 1024.

Why is this called pi? Because you are taking the product of all possible types across all values of the type being depended on (argument of the function).

And actually, the function space is a special case of the pi type, where the return type is constant, i.e. does not depend on the value of the argument.

## A little theory behind this

We can think of types as a set of things. We say that some value is of that type when it is a member in that set.

For example, `boolean`

is the set of things, which contains `True`

and `False`

.

A sum type is a *disjoint union*, or *tagged union*.

A product type is a *cartesian product*.

A sigma is the symbol to do a sum over a sequence.

A pi is the symbol to do a product over a sequence.

A lot of this comes from, or is related, to category theory.

## Notes

I chose to say that a type has 2 members to invoke the idea of thinking about types as sets. Other words that can be used include value (rejected because it means too many things).

The real reason why the words `sum`

, `product`

, etc are used probably has less to do with counting members, and more to do with category theory. But it works well enough if you are trying to get an intuition about the meaning.

## References

I found a lot of relevant and helpful resources while researching on sum types:

- Manis Goregaokar. What Are Sum, Product, and Pi Types?
- Chris Taylor. The Algebra of Algebraic Data Types, Part 1 (he goes into recursive types in the next post, check it out!)
- Wikipedia page on Dependent Type
- Idris, which has dependent types, for tips on the made up syntax.
- Intuitionistic type theory