Lambda Calculus: Basic Interpreter in Rust (Part 2)
This is part 2 of a series of articles on lambda calculus.
This article assumes basic knowledge of lambda calculus, which is covered in the previous article.
EDIT 2019-07-25: Uploaded and hyperlinked repository, mentioned writing tests, and updated some information to match new code.
Summary
This article covers my design process, with a step-by-step review of the most important parts of my code, before running the interpreter to find $4! = 24$ with lambda calculus.
Finished code can be found on GitHub at tqn/rslambda.
Design
I knew I wanted to write a lambda calculus interpreter in Rust. Immediately, some design choices were clear.
- How do I represent lambda expressions? There are three variants of lambda expressions, so Rust’s `enum` type perfectly handles this. They can recursively contain other heap-allocated lambda expressions.
Next came the slightly more difficult problems.
- Where should I put the compiler? Separate it into a different file, where a `compile` function will run several helper functions to clean, tokenize, and parse the program.
- What operations should the lambda expression type provide to “run” the program? β-reduction, or function application, is an obvious choice. So is η-reduction, which converts `λx.f x` to `f`; this is critical in simplifying many lambda expressions. I can avoid other kinds of extensionally equivalent reductions for now.
Although I could spend more time carefully designing the interpreter, it was best to start now.
(As I am writing this in the future, I have the omniscient power of infinite foresight/hindsight. Therefore, I will accordingly retcon my programming experience so that I do not experience many of the frustrating hiccups I once did.)
Code
We’re just going to cover the most important parts of the interpreter, since there are 413+ lines of code.
Data structures
These are the declarations for lambda expressions and identifiers, used to represent variables.
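Sketched out, they look something like this (the exact derives and visibility are my best guesses, not copied from the repo):

```rust
/// A variable identifier: a human-readable tag plus an optional unique ID.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Ident<T>(pub T, pub Option<usize>);

/// A lambda expression: one of the three variants of the grammar.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Exp<T> {
    /// A free or bound variable.
    Var(Ident<T>),
    /// A lambda abstraction, λx.M.
    Abs(Ident<T>, Box<Exp<T>>),
    /// An application, or function call, (M N).
    App(Box<Exp<T>>, Box<Exp<T>>),
}
```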
For some reason, it felt better to create a whole `Ident<T>` rather than just an `Ident` with a hardcoded `String`, even though `T` only ever stands for `String`. (Hindsight once again reveals the tragic folly of the developer.)
Anyway, the `Box`es allow recursive types in Rust: every value must have a size known at compilation time, and a type that contained itself directly would have infinite size, so nested expressions live on the heap behind fixed-size `Box` pointers. All that `#[derive(...)]` stuff fills in boilerplate for hashing (maps!), making copies, testing equality, and printing in a nice format.
(Some of you who are smarter than me may have noticed that I am referring to variables with actual `String`s like some sort of naive teenager. However, someone far smarter than most people invented De Bruijn indices, which eliminate the need for annoying variable names in a quite elegant fashion. I could argue that using `String`s instead of highly efficient integers simply enriches our understanding of lambda calculus. It is up to the reader to decide how ridiculous of an oversight this is.)
Ident<T>
We’ll first focus on this data structure since it is relatively simple. The purpose of `Ident<T>` is to identify a variable with a human-readable name (`T` is usually a `String`) and a number to ensure the variable is unique. Even though the tuple entries are nameless, I refer to these parts as `tag` and `id` in the code.
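Again, roughly:

```rust
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Ident<T>(pub T, pub Option<usize>);
```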
`Ident` is a tuple struct. It contains the tag in the first field (`self.0`) and the optional unique ID in the second (`self.1`). `Option<V>`s have two variants: `Some(V)` and `None`. I allow the ID to be optional because free/unbound variables are represented with just the tag portion of the identifier. All bound variables (those that come first in lambda abstractions as the function argument) have IDs.
The `impl`ementation of `Ident` only contains one interesting method:
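A sketch of it, assuming a `HashMap<T, usize>` of counters (the exact signature may differ from the repo):

```rust
use std::collections::HashMap;
use std::hash::Hash;

impl<T: Clone + Eq + Hash> Ident<T> {
    /// Gives this identifier the next unused number for its tag,
    /// then bumps the counter for future calls.
    pub fn realloc(&mut self, counters: &mut HashMap<T, usize>) {
        let counter = counters.entry(self.0.clone()).or_insert(0);
        self.1 = Some(*counter);
        *counter += 1;
    }
}
```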
`realloc` accesses a `HashMap` which maps the `T` part of the `Ident` to the next unique number that we can use for our `Ident` to ensure it is unique. Then, it replaces the numerical part of the identifier and increments the number in the map for future use.
This is important for β-reduction to prevent identifiers from clashing. (As mentioned before, De Bruijn indices would solve this issue nicely, but we’re unfortunately not using that here.1)
Exp<T>
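Repeating the sketch from above:

```rust
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Exp<T> {
    Var(Ident<T>),
    Abs(Ident<T>, Box<Exp<T>>),
    App(Box<Exp<T>>, Box<Exp<T>>),
}
```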
The three variants of `Exp` represent the three kinds of lambda expressions:
- `Var` ↔ free or bound variables
- `Abs` ↔ lambda abstractions like `λx.M`
- `App` ↔ an application, or function call, like `(M N)`
Again, if this is not familiar to you, check out my last post on the basics of lambda calculus.
Critically, this is where most of the lambda calculus “meat” is stored, where the rubber meets the road. Now we actually encounter a concept that could prevent you from falling asleep reading this article: the evaluation strategy for lambda expressions.
The two main evaluation strategies are:
- applicative order — evaluate bottom-up: arguments before functions
- normal order — evaluate top-down: functions before arguments
A lambda expression is called normalizing if there is some sequence of reductions that produces a normal form, where no β- or η-reductions are possible.2 Applicative order is analogous to eager evaluation, like in most imperative languages, and normal order is analogous to lazy evaluation, like in most functional languages. A fundamental result is that applicative order often fails to find the normal form of normalizing expressions: it is not a normalizing strategy. However, normal order is a normalizing strategy: it always finds a normal form if it exists.
In my interpreter, I use normal order evaluation. A further way to improve efficiency is the strategy call by need, which is essentially normal order, except that instead of duplicating lambda terms, you produce a pointer to the term to be evaluated later when needed (a thunk). This ensures that the same term is only ever evaluated once, and it is closely related to how functional programming languages implement lazy evaluation. Since the test program I use computes $4!$, I suspect that call by need would speed up the evaluation tremendously.3
Anyway, the `Exp` methods are:
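In outline, with signatures reconstructed from memory rather than copied from the repo:

```rust
use std::collections::HashMap;

impl Exp<String> {
    /// Renames every variable, regardless of scope, to something unique.
    pub fn make_identifiers_unique(&mut self, counters: &mut HashMap<String, usize>) {
        todo!()
    }

    /// Replaces `ident` with `exp` at the top level only (no recursion).
    pub fn substitute(&mut self, ident: &Ident<String>, exp: &Exp<String>) {
        todo!()
    }

    /// Tries to read this expression as a Church numeral.
    pub fn as_numeral(&self) -> Option<usize> {
        todo!()
    }

    /// One pass of the respective reduction; true if anything changed.
    pub fn beta_reduce(&mut self) -> bool {
        todo!()
    }

    pub fn eta_reduce(&mut self) -> bool {
        todo!()
    }

    /// Reduces to normal form, i.e. runs the program. May not terminate.
    pub fn reduce(&mut self) {
        todo!()
    }
}
```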
- `make_identifiers_unique` is a helper method to rename all variables, regardless of scope, to something unique. This prepares an expression for substitution and reduction.
- `substitute` is another helper method that performs substitution in the top-level expression, but does not recurse to nested expressions. It replaces variables with lambda expressions.
- `as_numeral` attempts to read the expression as a Church numeral and output a corresponding `usize`, returning `None` if unsuccessful.
- `beta_reduce` and `eta_reduce` perform their respective operations before recursing to child expressions. They return `true` if the expression changed.
- `reduce` repeatedly calls `beta_reduce` and `eta_reduce` until the expression has reached normal form. This is equivalent to running the program. `reduce` fails to terminate when the program encounters an infinite loop (it is not normalizing).
Many of these methods exhibited a similar structure: a `match` statement in the root. I will cover the `beta_reduce` method since it is probably the one most people care about.
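Here is the gist of it, reconstructed in the same spirit as the original (not the repo’s exact code):

```rust
impl Exp<String> {
    /// One pass of β-reduction, outermost redex first (normal order).
    /// Returns true if the expression changed.
    pub fn beta_reduce(&mut self) -> bool {
        // First branch: the whole expression is a redex ((λx.M) N),
        // so substitute N for x in M and replace self with the result.
        if let Exp::App(func, arg) = self {
            if let Exp::Abs(param, body) = func.as_mut() {
                let mut result = (**body).clone();
                result.substitute(param, arg);
                *self = result;
                return true;
            }
        }
        // All other branches recurse into child expressions; the left
        // side goes first, and `reduce` loops until nothing changes.
        match self {
            Exp::App(func, arg) => {
                let changed = func.beta_reduce();
                changed || arg.beta_reduce()
            }
            Exp::Abs(_, body) => body.beta_reduce(),
            Exp::Var(_) => false,
        }
    }
}
```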
Since we are using the normal order evaluation strategy, we must evaluate from the outside in: attempt to reduce the top expression before recursing to child expressions. If the current expression is an application and the left side is a lambda abstraction, we can perform a β-reduction. This specific structure in lambda calculus is known as a redex and looks like `((λx.M) N)`. The first `match` branch (where the helper method `substitute` is called) attempts to match the current lambda expression to a redex form. All other branches of the function ensure that `beta_reduce` is called recursively on all child expressions. `eta_reduce` is similarly written, but instead of trying to match a redex, it matches lambda expressions specifically of the form `λx.f x` (where `x` does not appear free in `f`) and replaces them with `f`.
This concludes our horrifying foray into the code that represents and manipulates lambda expressions. If you made it this far, congratulations! Unfortunately, we still have to talk about the compiler. Luckily, it’s not that much of a compiler.
Compiler
My compiler sits in a separate file (`compiler.rs`) from the rest of the code above (`lambda.rs`). This is important since seeing both at once would probably grant some sort of eldritch Lovecraftian knowledge that is better left unknown (or else you get a headache).
It’s kind of generous calling it a compiler, since all it does is clean a string, match parentheses, and tokenize it, before finally doing some pattern matching, crashing really badly if there is a syntax error. (This project is supposed to be a kind of proof-of-concept experiment, I guess.) If you’re planning to write a “compiler” like mine just to get something done, or have a morbid fascination with unhealthy architecture4, you might find the rest of this section useful. If you’re not masochistic, you can safely skip to the next section.
My `compiler.rs` file contains six functions, making only `compile` public:
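Roughly, with the bodies elided (these signatures are a reconstruction, and I have parked the `// TEMP` comment on `lex`, where it most plausibly lived):

```rust
// compiler.rs: only `compile` is public.
use crate::lambda::Exp;

/// Compiles source text into a lambda expression.
/// Crashes really badly on a syntax error.
pub fn compile(code: &str) -> Exp<String> {
    todo!()
}

// TEMP
/// Cleans the source and splits it into single-character String tokens.
fn lex(code: &str) -> Vec<String> {
    todo!()
}

/// Wraps every lambda abstraction body in parentheses.
fn preprocess(tokens: Vec<String>) -> Vec<String> {
    todo!()
}

/// Builds the AST, then makes all identifiers unique.
fn parse(tokens: &[&str]) -> Exp<String> {
    todo!()
}

/// Recursively pattern-matches tokens into an abstract syntax tree.
fn generate_ast(tokens: &[&str]) -> Exp<String> {
    todo!()
}

/// Finds the index of the parenthesis matching the one at `start`.
fn find_matching_brace(tokens: &[String], start: usize) -> usize {
    todo!()
}
```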
(That `// TEMP` comment doesn’t seem so temporary after all.)
Anyway, `compile` calls three functions:
- `lex`, which cleans the code and splits it into `String` tokens,
- `preprocess`, which adds extra parentheses, and
- `parse`, which returns an AST with unique identifiers.
`compile` also converts the preprocessed tokens into a borrowed form before parsing. This was just a kind of separation-of-concerns thing I did, since `parse` should not take ownership of the tokens.
The `lex` function is so much of a placeholder in production that it simply splits the input into a `Vec` of characters and filters out the whitespace. All `preprocess` does is use my mess of a `find_matching_brace` function to wrap all lambda abstraction bodies in parentheses, so they become easier to parse. (Some may refer to this as a “hack.” I have no comment.)
`parse` consists of a call to `generate_ast`, followed by another to `make_identifiers_unique` on the resulting `Exp` tree. `generate_ast` is the longest function in this file. It uses a combination of a `match`, an ungodly `if` chain, and several recursive calls to generate the abstract syntax tree.
I (think I) wish I could cover this in more detail, but my brain is literally constructing a mental block when it comes to writing about the contents of this file. Even though signs are total BS, I’m pretty sure this is a sign to stop typing about it. It’s best if both you and I forget that such a pile of manure ever existed.5
Factorial in lambda calculus
Unfortunately, we have one more hurdle to cover: how the factorial actually works.
The classic recursive functional factorial function is
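$$\text{fact}(n) = \begin{cases} 1 & \text{if } n = 0 \\ n \times \text{fact}(n - 1) & \text{otherwise} \end{cases}$$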
The Y combinator $Y := {\lambda f.\ (\lambda x.\ f\ (x\ x))\ (\lambda x.\ f\ (x\ x))}$, which satisfies ${Y\ f} = {f\ (Y\ f)}$, lets us implement recursion, making it critical in our lambda calculus factorial function. If $f = \lambda y.M$, then
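$$Y\ f = f\ (Y\ f) = (\lambda y.M)\ (Y\ f) = M[y := Y\ f],$$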
where $M[y:=Y\ f]$ means to substitute $y$ with $Y\ f$ in $M$. Essentially, the bound variable $y$ becomes a reference to the function body $M$ — pretty neat concept!
In the last post, I wrote about the Church encoding for natural numbers. However, I didn’t mention how booleans are encoded. It’s rather elegant:
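$$\text{true} := \lambda ab.\ a \qquad \text{false} := \lambda ab.\ b$$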
This method of encoding true and false actually eliminates the need for an if-then-else statement, since a boolean can be “called” with the then-value and the else-value:
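$$\text{true}\ M\ N = M \qquad \text{false}\ M\ N = N$$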
To implement our branching factorial as illustrated above, we must use an $\text{isZero}$ function that returns a Church-encoded boolean. All Church-encoded numbers $[n] := \lambda fx.f^n\ x$ call the function $f$ at least one time, except for $0 := \lambda fx.x$. Therefore, a definition for $\text{isZero}$ is as follows:
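$$\text{isZero} := \lambda n.\ n\ (\lambda u.\ \text{false})\ \text{true}$$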
All nonzero numbers should call the constant function $\lambda u.\ \text{false}$ at least once. Otherwise, the expression returns $\text{true}$.
The last ingredient we need before we craft our factorial function is a predecessor function. The factorial definition above requires subtracting one from our number for recursive calls. As I mentioned in my last article, definitions for the predecessor function are notoriously difficult to understand. One definition is
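$$\text{pred} := \lambda nfx.\ n\ (\lambda gh.\ h\ (g\ f))\ (\lambda u.\ x)\ (\lambda u.\ u)$$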
As far as I can tell, this predecessor function works by wrapping the $f$ call to defer it outwards, using the constant function to remove the innermost $f$ application, and finally using the identity function to remove the wrapper. As I finish typing that sentence, I realize that I have probably written better-formed sentences on some of my more well-thought-out ideas. A detailed derivation is available on Wikipedia, but that is definitely not necessary to understand this program.
We are finally ready to construct the factorial. In pseudo-calculus (?):
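$$\text{fact} := \lambda n.\ (\text{isZero}\ n)\ 1\ (n \times \text{fact}\ (n - 1))$$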
Unfortunately, a function cannot refer to its own definition; these $:=$ definitions just declare the symbol on the left to be shorthand for whatever is on the right. This is where we use the Y combinator, to allow the function to call itself.
All that remains is to replace those binary arithmetic operators with their proper functions, and we are done!
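With $\text{mult} := \lambda mnf.\ m\ (n\ f)$ for multiplication, $\text{pred}$ for subtracting one, and $1 := \lambda fx.\ f\ x$, one final form is
$$\text{fact} := Y\ (\lambda r.\ \lambda n.\ (\text{isZero}\ n)\ 1\ (\text{mult}\ n\ (r\ (\text{pred}\ n)))).$$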
Running the interpreter
Finally! After all that reading (or skimming (or skipping)), we’re finally getting to the good part — the part where cool stuff happens.
My `main.rs` is as follows:
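A reconstruction of its shape (the exact terms and names differ from the repo, and `compile`’s signature is the guess from earlier):

```rust
mod compiler;
mod lambda;

use std::time::Instant;

fn main() {
    // Church encodings, named and spliced together with format!
    // so the final program stays (somewhat) readable.
    let tru = r"(\a.\b.a)";
    let fls = r"(\a.\b.b)";
    let one = r"(\f.\x.f x)";
    let is_zero = format!(r"(\n.n (\u.{fls}) {tru})");
    let pred = r"(\n.\f.\x.n (\g.\h.h (g f)) (\u.x) (\u.u))";
    let mult = r"(\m.\n.\f.m (n f))";
    let y = r"(\f.(\x.f (x x)) (\x.f (x x)))";
    let fact = format!(r"({y} (\r.\n.{is_zero} n {one} ({mult} n (r ({pred} n)))))");
    let n = r"(\f.\x.f (f (f (f x))))"; // N = 4

    // Compile the program, run it, time it, and display the output.
    let mut exp = compiler::compile(&format!("({fact} {n})"));
    let start = Instant::now();
    exp.reduce();
    println!("{:?} in {:?}", exp.as_numeral(), start.elapsed());
}
```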
I abuse the poor `format!` macro to keep track of which lambda terms mean what. You may also notice that throughout my program, I use `\` instead of an actual λ. Even though this requires me to prefix a few strings with `r` to indicate that they should not process escape sequences, it’s much better than actually using λ everywhere.6 The rest of `main` is pretty straightforward: compile the program, run it, time it, and display the output.
Running this right now correctly prints $4! = 24$, and at a glance the process takes at least 17 MB in task manager. Changing `N` from 4 to 5 likewise yields $5! = 120$, though noticeably more slowly. Bear in mind this is the unoptimized debug build, not to mention that the compiler itself could/will be greatly optimized.
Testing
A few weeks later, I finally pushed myself to write unit tests for `lambda.rs`.
It was an extremely unpleasant job, mostly because of the ridiculous amount of
paren-matching required to write out the abstract syntax tree explicitly.
The tests were mostly so I could feel confident publishing the code online, but also because I had left them half-finished.7 Above all else, I want to ensure that the code I am uploading works properly and is not a buggy piece of junk, so here is my code (to a reasonably bug-free degree)!
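For a flavor of that paren-matching, here is a hypothetical test in the same style, written against the sketched types from earlier rather than the repo’s exact tests:

```rust
#[cfg(test)]
mod tests {
    use super::*;

    /// (λx.x) y should β-reduce to the free variable y.
    #[test]
    fn beta_reduces_identity_application() {
        let x = Ident("x".to_string(), Some(0));
        let y = Ident("y".to_string(), None);
        let mut exp = Exp::App(
            Box::new(Exp::Abs(x.clone(), Box::new(Exp::Var(x)))),
            Box::new(Exp::Var(y.clone())),
        );
        exp.reduce();
        assert_eq!(exp, Exp::Var(y));
    }
}
```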
Conclusion
Doing this project opened up new ways for me to think about programming, logic, and math. It’s interesting how closely related lambda calculus is to functional programming in general. Even though imperative programming usually beats functional in performance, functional programming paradigms like stream combinators (map, filter, etc.), first-class/higher-order functions, and immutability have made their way into major languages like C++, Java, JavaScript, Go, and now Rust.
All programmers should learn about lambda calculus, since it fundamentally grounds functional programming. FP is extremely good at crafting intuitive solutions to problems, which then can be translated to a more imperative style for practical use. With the rise in popularity of Go and event-driven services, FP provides tools such as immutability of state and lazy evaluation to make development less painful.
If you are interested in anything at all in this article, I encourage you to read up on logic, lambda calculus, and perhaps the coolest result in this area: the Curry-Howard correspondence, which shows that mathematical proofs and computer programs can be put into a one-to-one correspondence. (You can “run” a proof and “prove” a program!) In fact, under C-H, a lambda application can be seen as an application of modus ponens, or implication elimination: “P implies Q, and P is asserted to be true; therefore Q must be true.”
I hope you learned something useful, and thanks for reading!
If we did, it would probably shave off like 100 lines of code. Maybe I’ll make a second version using De Bruijn indices? Stay tuned! (Please?) ↩︎
The actual definitions get technical, if you care to read more. ↩︎
Writing this article really makes me want to make a more efficient version. ↩︎
If you believe I’m being too harsh on myself and my code, you may be right. However, I believe this is one of the manifestations of my intention to write a better compiler (look out for that!), or (most likely) it could be a result of some deeper issue. The footnote of some blog is probably not the best place to explore that, or to spend my time anxiously writing and yours laboriously reading. ↩︎
Some other people use `^` instead of `\`. For some reason I can’t remember, I actually switched away from `^`. Maybe it was giving me some technical issue, possibly when I was taking command line input on Windows? Another likely explanation is that it is harder to type. ↩︎
Call it “perfectionism” if you want (which I do sometimes), but it’s certainly not healthy. ↩︎