Lambda Calculus: Intro (Part 1)

This is part 1 of a series of articles on lambda calculus.

Lambda calculus (or λ-calculus) is one of the coolest things I have ever seen. It sits right at the intersection of computer science and mathematical logic, and it will feel familiar to anyone with basic experience in functional programming.

Lambda calculus is a set of rules where you can play with lambda expressions. From a programmer’s perspective, these expressions represent anonymous functions — there is no name, since you call the functions with the definition itself. You may notice that this is precisely like lambda functions in programming languages, like arg => val in JavaScript, lambda arg : val in Python, and so on. One important difference is that lambda expressions do not have any side effects: they do not modify the “state of the program”, and their only effect is the value they return.

This article is going to be a quick summary on the computer science aspects of lambda calculus. It’s going to cover the basics of the untyped lambda calculus and a system of arithmetic in it called the Church encoding.

Lambda expressions

There are three types of lambda expressions:

1. $x$ — a variable that represents some kind of value.
2. $(\lambda x.M)$ — an abstraction, or an unnamed function that binds its argument to the variable $x$ in the function body $M$, where $M$ is another lambda term.
3. $(M~N)$ — an application of a function $M$, or “calling” $M$ with $N$, where $M$ and $N$ are lambda terms.

An application substitutes a bound variable in a lambda abstraction with the provided argument. An important thing to note is that all abstractions are essentially functions that take one argument. Through the technique of currying, one can convert a function that takes $n$ arguments into a sequence of $n$ functions that each take one argument. For example, the function that adds its two arguments and its curried version take the form of

$$\text{add1} := (x, y) \mapsto x + y$$

and

$$\text{add2} := x \mapsto y \mapsto x + y.$$

The $\mapsto$ (read “maps to”) separates the function’s argument(s) on the left and its return value on the right. Invoking these two versions of our add function looks like $\text{add1}(59,71)=\text{add2}(59)(71)=130$. Currying confers two main advantages: all functions take one argument, and functions can be partially applied by supplying some of the arguments. Partial application allows for some pretty creative patterns in programming.1
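Since the article already points at Python’s lambda syntax, here is a minimal sketch of the two forms of our add function in Python (the names add1 and add2 match the example above):

```python
def add1(x, y):          # uncurried: one function, two arguments
    return x + y

def add2(x):             # curried: a chain of one-argument functions
    return lambda y: x + y

print(add1(59, 71))      # 130
print(add2(59)(71))      # 130

# Partial application: supply only the first argument
inc = add2(1)
print(inc(41))           # 42
```

Note that `add2(59)` alone is a perfectly good value: a function waiting for its second argument, which is exactly what partial application means.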

By convention, application associates to the left when parentheses are absent: $a~b~c$ means $(a~b)~c$. Abstraction bodies stretch as far right as possible when the number of terms in the body is ambiguous. For example, $(\lambda x.x~b~c)~(y~z)=(\lambda x.(x~b)~c)~(y~z)$ which, after one β-reduction, is equivalent to $((y~z)~b)~c=y~z~b~c$.

Variables can either be free or bound. Bound variables are ones that have been declared in the current scope — they reference arguments in lambda abstractions. Free variables reference values that are not in the current scope.

Manipulation of lambda expressions

Lambda expressions can be manipulated to rearrange or simplify them. There are three main ways to manipulate, or reduce, lambda expressions:

1. Alpha/α-conversion — renaming a bound variable: $\lambda x.x~\Rarr~\lambda y.y$
2. Beta/β-reduction — replacing an applied abstraction with the argument substituted for the bound variable in the abstraction body. In other words, apply a function: $(\lambda x.M)~y\Rarr M[x:=y]$, where the syntax on the right represents substitution of $x$ with $y$ in the expression $M$.
3. Eta/η-conversion — this one is a bit more tricky. Two lambda expressions are extensionally/η-equivalent if they evaluate to equivalent expressions for all arguments. For example, $\lambda x.f~x$ and $f$ are η-equivalent, since no matter how they are used, they behave exactly the same. For more complicated expressions, determining whether two are η-equivalent is usually a much harder challenge.
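β-reduction is just substitution, which we can sketch in a few lines of Python. The representation below (classes Var, Abs, App and helpers substitute and beta_reduce) is my own illustration, not a standard library; the substitution is deliberately capture-naive, since a full version would α-rename bound variables to avoid capture.

```python
from dataclasses import dataclass

@dataclass
class Var:
    name: str

@dataclass
class Abs:
    param: str
    body: object

@dataclass
class App:
    fn: object
    arg: object

def substitute(term, name, value):
    # Capture-naive substitution M[name := value]
    if isinstance(term, Var):
        return value if term.name == name else term
    if isinstance(term, Abs):
        if term.param == name:        # name is re-bound here; stop
            return term
        return Abs(term.param, substitute(term.body, name, value))
    return App(substitute(term.fn, name, value),
               substitute(term.arg, name, value))

def beta_reduce(term):
    # One β-step at the root: (λx.M) N  ⇒  M[x := N]
    if isinstance(term, App) and isinstance(term.fn, Abs):
        return substitute(term.fn.body, term.fn.param, term.arg)
    return term

# (λx.x) y  ⇒  y
print(beta_reduce(App(Abs("x", Var("x")), Var("y"))))
```

The shadowing check in `substitute` is where the free/bound distinction from earlier earns its keep: a variable re-bound by an inner abstraction must not be replaced.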

Some syntactic sugar

Syntactic sugar is syntax that is shorthand for something longer or more complicated, often used to simplify or make something more clear. Common syntactic sugar in lambda calculus includes:

• Nested abstractions $\lambda a.\lambda b. \lambda c.M$ written as $\lambda abc.M$
• Application $(\lambda x.M)~a$ written as $\text{let }x=a\text{ in }M$ when the application is primarily intended to represent a scope or a variable/constant definition.

Wow, that all sounds really complicated. So…

What does this all mean?

Essentially, lambda calculus provides us with a set of rules for defining pure functions2 and applications of those functions.

Incredibly, lambda calculus is Turing complete, or roughly mathematically equivalent to a computer. Although this may initially sound absurd, we can give some credence to this idea by introducing Church numerals. These numerals encode the natural numbers and may be added, multiplied, exponentiated, etc.

Church encoding

The Church encoding is the standard way to represent data in lambda calculus, specifically the natural numbers. Each natural number is represented by a Church numeral.

For a number $n$, the corresponding Church numeral is a lambda expression that accepts the function $f$ and argument $x$, applying $f$ to $x$ $n$ times:

\begin{aligned} [n] :=& {\lambda f.\lambda x.\ {\underbrace{f \cdots f}_{n\text{ times}}\ x}} \\=&\;\,\,{\lambda f\ x.\ {\underbrace{f \cdots f}_{n\text{ times}}\ x}} \quad{\text{(using syntactic sugar)}}. \end{aligned}

For example, $0$ is encoded as $\lambda fx.x$, $1$ as $\lambda fx.fx$, $2$ as $\lambda fx.f(fx)$, and so on. We can take the successor of (add one to) any Church numeral quite easily just by introducing a wrapper that applies $f$ one more time to the number $n$:
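These encodings translate directly into Python lambdas. A minimal sketch (the decoder name to_int is a helper of my own, not part of the encoding):

```python
# Church numerals written directly as Python lambdas
zero = lambda f: lambda x: x            # λfx.x
one  = lambda f: lambda x: f(x)         # λfx.f x
two  = lambda f: lambda x: f(f(x))      # λfx.f (f x)

# Decode a numeral by counting: apply "+1" to 0 n times
to_int = lambda n: n(lambda k: k + 1)(0)

print(to_int(zero), to_int(one), to_int(two))  # 0 1 2
```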

$$\text{succ} := {\lambda n\ f\ x.f\ (n\ f\ x)}$$

For example (renaming the variables for clarity3),

\begin{aligned} &~{\text{succ}~2} \\=&~{(\lambda n\ f\ x.\ f\ (n\ f\ x))\ (\lambda g\ y.\ g\ (g\ y))} \\=&~{\lambda f\ x.\ f\ ((\lambda g\ y.\ g\ (g\ y))\ f\ x)} \\=&~{\lambda f\ x.\ f\ (f\ (f\ x))} \\=&~3 \end{aligned}
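The same computation can be checked in Python. Here succ is transcribed from the definition above, and to_int is a helper name of my own that decodes a numeral by counting:

```python
# succ := λn f x. f (n f x)
succ = lambda n: lambda f: lambda x: f(n(f)(x))

two = lambda f: lambda x: f(f(x))       # λfx.f (f x)
to_int = lambda n: n(lambda k: k + 1)(0)

three = succ(two)
print(to_int(three))  # 3
```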

One way to define addition is through the use of the successor function, $\text{succ}$. We can add two numbers $m$ and $n$ by applying $\text{succ}$ $m$ times to $n$:

$$\text{add} := {\lambda m\ n.\ m\ \text{succ}\ n}$$

although the more common form of $\text{add}$ is

$$\text{add} := {\lambda m\ n\ f\ x.\ m\ f\ (n\ f\ x)}.$$

Multiplication and exponentiation can be defined in a similar way, although these may be harder to understand:

\begin{aligned} \text{mul}&:= {\lambda m\ n\ f\ x.\ m\ (n\ f)\ x} \\ \text{exp}&:={\lambda m\ n.\ n\ m} \end{aligned}
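All three operations can be checked as Python lambdas, transcribed from the definitions above (the decoder to_int is a helper name of my own):

```python
two   = lambda f: lambda x: f(f(x))
three = lambda f: lambda x: f(f(f(x)))

add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: lambda x: m(n(f))(x)
exp = lambda m: lambda n: n(m)          # n m computes m^n

# Decode by applying "+1" to 0 n times
to_int = lambda n: n(lambda k: k + 1)(0)

print(to_int(add(two)(three)))  # 5
print(to_int(mul(two)(three)))  # 6
print(to_int(exp(two)(three)))  # 8
```

It is worth pausing on `mul`: $n~f$ is "apply $f$ $n$ times", so applying that $m$ times yields $m\cdot n$ applications of $f$.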

Church encoding goes much further than this, although subtraction is notoriously trickier to define. In lambda calculus, by defining pairs, you can create data structures and extend arithmetic to the negative numbers. Of critical importance is the famous Y combinator (or fixed-point combinator), which enables recursion by giving functions references to themselves. It satisfies the functional equation

$${Y\ f} = {f\ (Y\ f)}$$

and is defined as

$$Y := {\lambda f.\ (\lambda x.\ f\ (x\ x))\ (\lambda x.\ f\ (x\ x))}.$$
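One caveat when trying this in a real language: under eager (call-by-value) evaluation, which is what Python uses, the Y combinator as written loops forever, so the sketch below uses its η-expanded call-by-value variant, commonly called the Z combinator. The factorial definition is my own illustration:

```python
# Z combinator: the call-by-value Y, with x x η-expanded to λv.x x v
# so that eager evaluation does not recurse forever
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# A recursive factorial that never names itself: rec is supplied by Z
fact = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))

print(fact(4))  # 24
```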

Maybe I’m just easily impressed, but when I discovered this, I had to make a lambda calculus interpreter and evaluate $4!$ with a recursive factorial function. In the next post, I detail how I used the language features of Rust to create this interpreter.

1. Strictly speaking, currying converts functions of type ${(A\times B) \rarr C}$ to those of type ${A\rarr (B\rarr C)}$. The notation ${A\times B}$ is a Cartesian product: a combination of two types, like a tuple in programming or an ordered pair from middle school. A function that takes ${A\times B}$ as an argument essentially takes two arguments: one of type $A$ and another of type $B$. A function that takes $n$ arguments is equivalent to one that instead takes one argument: a tuple of size $n$. If we view an $n$-tuple as a series of nested ordered pairs, then the function can be curried repeatedly. For example, we can repeatedly curry a three-argument function of type ${((W\times X)\times Y)\rarr Z}={(W\times X\times Y)\rarr Z}$ to get a function of type ${(W\times X)\rarr (Y\rarr Z)}$ and finally ${W\rarr(X\rarr (Y\rarr Z))}$. If you think you can handle even more abstraction, we can define $\text{curry}$ as the function of type ${((A\times B) \rarr C)} \rarr {(A\rarr (B\rarr C))}$ such that ${f(x,y)} = {\text{curry}(f)(x)(y)}$.

[return]
2. A map from one thing to another. Pure functions don’t cause any side effects and give the same result every time they are invoked with the same arguments, unlike procedures in programming.

[return]
3. In a more formal context one would use α-conversion explicitly instead of just renaming the variables. As far as I can tell, however, they’re the same thing.

[return]