SICP Exercise 2.6

2021-12-17

Question

In case representing pairs as procedures wasn’t mind-boggling enough, consider that, in a language that can manipulate procedures, we can get by without numbers (at least insofar as non-negative integers are concerned) by implementing 0 and the operation of adding 1 as

(define zero (λ (f) (λ (x) x)))

(define (add-1 n)
  (λ (f) (λ (x) (f ((n f) x)))))

This representation is known as Church numerals, after its inventor, Alonzo Church, the logician who invented the $λ$-calculus.

Define one and two directly (not in terms of zero and add-1). (Hint: Use substitution to evaluate (add-1 zero)). Give a direct definition of the addition procedure + (not in terms of repeated application of add-1).

Answer

Definition of One

So, let us do as the question says and use (add-1 zero) to arrive at a definition of one.

(add-1 zero)

We can use the substitution model to evaluate this. Let’s start with the definition of add-1.

(λ (f) (λ (x) (f ((zero f) x))))

Then plug in zero:

(λ (f) (λ (x) (f (((λ (f) (λ (x) x)) f) x))))

This is the fully expanded code for (add-1 zero), i.e. for one. It looks very complex, but let’s focus on just one part of this and try to simplify it. In particular, let’s look at the second half of the code, which looks like this:

(((λ (f) (λ (x) x)) f) x)

The first part of this, (λ (f) (λ (x) x)), evaluates to (λ (x) x). So we can substitute:

(((λ (x) x) f) x)

We can simplify this further. Because ((λ (x) x) f) simply returns f, the fully simplified piece of code looks like:

(f x)

That is much better than what we started with. Let us now plug this into our original equation and we receive the following:

(λ (f) (λ (x) (f x)))

At this point, we see where the difference is between zero and one. Let’s look at them side by side:

(λ (f) (λ (x) x))     ; zero
(λ (f) (λ (x) (f x))) ; one

One simply has one more function call in it.

Definition of Two

We can already guess what two might look like now, but let’s try deriving it using add-1, similar to the way we just did it for one:

(add-1 one) ; evaluate add-1
(λ (f) (λ (x) (f ((one f) x)))) ; evaluate one
(λ (f) (λ (x) (f (((λ (f) (λ (x) (f x))) f) x))))

; (((λ (f) (λ (x) (f x))) f) x) --> ((λ (x) (f x)) x)
(λ (f) (λ (x) (f ((λ (x) (f x)) x))))

; ((λ (x) (f x)) x) --> (f x)
(λ (f) (λ (x) (f (f x))))

So, just to confirm our previous intuition, let’s look at the three numbers we have now side by side:

(λ (f) (λ (x) x))             ; zero
(λ (f) (λ (x) (f x)))         ; one
(λ (f) (λ (x) (f (f x))))     ; two

As suspected, it’s just one more function call.

Definition of +

This should probably be similar to the add-1-procedure, except instead of adding one to n, we add any number m.

We can deduce from the add-1-procedure that, if n describes the number to add one to using (n f), maybe we can do something similar for the f before that, like so:

(define (church-plus n m)
  (λ (f) (λ (x) ((n f) ((m f) x)))))

Let’s test if this works by adding one and two.

Warning: the following code is admittedly close to incomprehensible. I tried my best to explain my thinking with the comments, but even so, there’s just a lot of λs next to each other, which makes it quite hard to read.

(church-plus one two)

; expand church-plus definition
(λ (f) (λ (x) ((one f) ((two f) x))))

; expand one and two
(λ (f)
  (λ (x)
    (((λ (f) (λ (x) (f x))) f)
     (((λ (f) (λ (x) (f (f x)))) f) x))))

; start simplifying
(λ (f)
  (λ (x)
    (((λ (f) (λ (x) (f x))) f) (((λ (f) (λ (x) (f (f x)))) f) x))))

; (((λ (f) (λ (x) (f x))) f) (((λ (f) (λ (x) (f (f x)))) f) x)) -->
; ((λ (x) (f x)) (((λ (f) (λ (x) (f (f x)))) f) x))
(λ (f)
  (λ (x)
    ((λ (x) (f x)) (((λ (f) (λ (x) (f (f x)))) f) x))))

; ((λ (x) (f x)) (((λ (f) (λ (x) (f (f x)))) f) x)) -->
; (f (((λ (f) (λ (x) (f (f x)))) f) x))
(λ (f)
  (λ (x)
    (f (((λ (f) (λ (x) (f (f x)))) f) x))))

; ((λ (f) (λ (x) (f (f x)))) f) -->
; (λ (x) (f (f x)))
(λ (f)
  (λ (x)
    (f ((λ (x) (f (f x))) x))))

; ((λ (x) (f (f x))) x) -->
; (f (f x))
(λ (f) (λ (x) (f (f (f x)))))

This is exactly the result we would expect, so it seems that this method of addition is valid.