WARNING: This article is an interesting diversion
and contains little practical value. It is an example of
mind-stretching language torture and is intended for
entertainment purposes only. If you use any of the C# or
VB code presented in
this article in production code, you will lose your job and
shame your family!
You have been warned.
Lately, I've been revisiting some fun, old bits of
Scheme
code to see how they might be implemented in C#. I find this to be a rewarding
exercise because it always stretches my mind a bit. Today, I'm playing with the
following theoretical
implementations of the three standard functions used to build data structures in
Scheme:
(define (cons x y)
(lambda (m) (m x y)))
(define (car z)
(z (lambda (p q) p)))
(define (cdr z)
(z (lambda (p q) q)))
If you've never seen Scheme code before, that probably looks like a bunch of
gobblety-gook. Don't worry, we'll dig into these shortly. First, here are a few
facts and ground rules for the uninitiated:
- Scheme is one of the two main dialects of Lisp
(the other is Common Lisp).
- Scheme is based heavily on the untyped
lambda calculus.
- Variables are dynamically typed, so you won't see any type annotations.
- There are no operator precedence rules. Instead, expressions are
explicitly delimited by
parentheses.
- Operators appear in prefix form instead of the infix form that we're all
used to (i.e. + x y instead of x + y).
- Functions are values. In other words, functions are treated as
first-class objects of the language in the same way that, say, numbers are.
- define is used to bind a value to a name.
- lambda is used to declare a function.
With these in mind, let's see how cons, car and cdr are actually used in
Scheme.
Cons, Car and Cdr: The List Builders
Cons has a long history in
functional programming and has even become part of the
jargon of that world. In
essence, cons constructs
(duh!) pairs of values. Consider the following use of
cons:
(cons 37 5)
;Value 11: (37 . 5)
The result of
(cons 37 5) is an ordered pair
holding 37 as the first value and 5 as the second. The comment that
appears on the line after the expression is output from
Edwin, the lovable
and interactive
Emacs-like editor distributed
with MIT/GNU Scheme.
If we want to do something with our pair, we need to bind it to a
name.
(define the-answer (cons 37 5))
;Value: the-answer
Now that our pair is named, we can view its contents.
the-answer
;Value 12: (37 . 5)
Retrieving the values from the
pair is the job of Car and
Cdr.
(car the-answer)
;Value: 37
(cdr the-answer)
;Value: 5
As you can see, car
retrieves the first element from the pair, and cdr
retrieves the second. Using car and cdr, we can add the values held by the pair.
(+ (car the-answer) (cdr the-answer))
;Value: 42
Neat!
It's a small step from constructing pairs to building singly-linked lists.
The idea is that the first element of a pair will hold a value from the list,
and the second element will hold the next pair.
It's convenient to think of the resulting structure like so:

While that may be easy to visualize, our list actually has the recursive structure illustrated below:

Since the structure is recursive, we can think of each cell as a list containing
a value and another list. Let me restate that. Each cell, by itself, is a
list. For example, the second cell is a list containing
the value 2 and the list represented by the third cell. The third cell is a list
containing the value three and the empty list—that is, a list
containing no elements.
Let's construct the list illustrated above using nested cons
calls.
(define my-list (cons 1 (cons 2 (cons 3 nil))))
;Value: my-list
my-list
;Value 13: (1 2 3)
Extracting values from the list is simple. We just have to combine calls to
car and cdr.
(car (cdr my-list))
;Value: 2
Using cons to build lists
is a bit awkward.
Thankfully, Scheme provides another function, list,
to "cons up" lists.
(define my-other-list (list 4 5 6))
;Value: my-other-list
my-other-list
;Value 14: (4 5 6)
(car (cdr my-other-list))
;Value: 5
Cons, Car and Cdr: In Theory
At this point, you should have a basic understanding of how cons, car and cdr
(along with the helper function, list) are used
to build lists. At runtime, these functions are efficiently implemented as
primitives. We can see this when we type cons into Edwin:
cons
;Value: #[primitive-procedure cons]
OK, hold onto your seats.
Let's drill into the definition of cons from the beginning of this article.
(define (cons x y)
(lambda (m) (m x y)))
Knowing a little bit more Scheme than we did before, it's obvious that the above code binds the name cons to something. In
fact, the name cons
is bound
to a function that takes two parameters, x and y.
The body of this function is (lambda (m) (m x
y)). What the heck is that? Well, it's an anonymous function—that
is, a function with no name (exactly like a lambda expression from C# 3.0 and VB
9). That's right, this
definition of cons
doesn't return some special data structure that holds two values—it returns
a function.
Is your mind warping yet? Yes? No? Maybe? Well, read on.
The anonymous function returned by
cons takes
one parameter. That parameter,
m,
is yet another function. The body of the function returned from
cons calls
m,
passing x and y as the arguments.
So, cons
really just returns a function that knows how to call functions passed to it
with x
and y.
Insane? A little.
Let's go ahead and define this version of cons
in Edwin.
(define (cons x y)
(lambda (m) (m x y)))
;Value: cons
(define the-answer (cons 37 5))
;Value: the-answer
So far, everything looks pretty much the same. However, the values of cons
and
the-answer
are quite different than before.
cons
;Value 11: #[compound-procedure 11 cons]
the-answer
;Value 12: #[compound-procedure 12]
cons
is no longer a primitive-procedure because we've rebound its name to our
function. Notice that
the-answer
is also a procedure. In fact, it's a procedure with no name because cons
now returns an anonymous function.
OK, let's poke this with a stick. We know that the anonymous function
represented by
the-answer
knows how to call functions passed to it with the values that were originally passed
into cons.
So, we should be able to call
the-answer
with any function, for instance, the addition operator (which is just another
function).
(the-answer +)
;Value: 42
Sweet! Passing the addition operator into
the-answer
caused the originally consed values to be added together. So, what happens if we
try to use car
to get the first value from
the-answer?
(car the-answer)
;The object #[compound-procedure 12], passed as the first argument to car, is not the correct type.
;To continue, call RESTART with an option number:
; (RESTART 2) => Specify an argument to use in its place.
; (RESTART 1) => Return to read-eval-print level 1.
;Start debugger? (y or n):
Holy crap! We broke Scheme! Seriously, we did. The current implementations of car
and cdr
don't know how to handle values returned from our newly defined cons.
We need to redefine those as well if we want them to work.
OK, take a deep breath and look at the definitions of car and cdr.
(define (car z)
(z (lambda (p q) p)))
(define (cdr z)
(z (lambda (p q) q)))
If it's not obvious how
car and
cdr
work, take another look. In essence,
car and
cdr know how to call the function returned by
cons with functions that retrieve the first and second elements
respectively. Defining these in Edwin makes everything work again.
(define (car z)
(z (lambda (p q) p)))
;Value: car
(car the-answer)
;Value: 37
(define (cdr z)
(z (lambda (p q) q)))
;Value: cdr
(cdr the-answer)
;Value: 5
I should reemphasize that the above definitions are not how
cons, car and cdr
are actually implemented in Scheme. They are theoretical. In fact, these
definitions are one way that cons, car and cdr can be implemented
in the lambda calculus.
If your head isn't spinning at this point, I haven't done my job. To be fair,
I've been demonstrating all of this in a language that's probably new to most of
you. To correct that,
here are working versions of cons, car and cdr,
based on the definitions above, in C# and VB. Both programs compile and run
properly with Visual Studio 2008. First, the C#:
using System;
namespace ConsCarCdr
{
class Program
{
delegate T ConsCell<T>(Func<T, T, T> f);
static ConsCell<T> Cons<T>(T x, T y)
{
return m => m(x, y);
}
static T Car<T>(ConsCell<T> z)
{
return z((p, q) => p);
}
static T Cdr<T>(ConsCell<T> z)
{
return z((p, q) => q);
}
static void Main()
{
var theAnswer = Cons(37, 5);
Console.WriteLine(Car(theAnswer) + Cdr(theAnswer));
}
}
}
And here's the VB version:
Module Program
Delegate Function ConsCell(Of T)(ByVal f As Func(Of T, T, T)) As T
Function Cons(Of T)(ByVal x As T, ByVal y As T) As ConsCell(Of T)
Return Function(m) m(x, y)
End Function
Function Car(Of T)(ByVal z As ConsCell(Of T)) As T
Return z(Function(p, q) p)
End Function
Function Cdr(Of T)(ByVal z As ConsCell(Of T)) As T
Return z(Function(p, q) q)
End Function
Sub Main()
Dim theAnswer = Cons(37, 5)
Console.WriteLine(Car(theAnswer) + Cdr(theAnswer))
End Sub
End Module
So, what's the big deal? The big deal is that we are building data out of thin air. Nowhere did we define a class or struct
with fields to hold
the data. The values passed to Cons
continue to exist entirely through the power of
closure.
(For more on closures in C#, see the article, "What's
in a Closure?".)
Now for the bad news. The C# and VB definitions of Cons, Car and Cdr
above can't be used to build lists—only ordered pairs. Remember that building a list with Cons
produces a recursive data structure. So, the second argument passed to Cons must
be another ConsCell<T>.
Let's try that.
delegate T ConsCell<T>(Func<T, ConsCell<T>,
T> f);
static ConsCell<T> Cons<T>(T x, ConsCell<T> y)
{
return m => m(x, y);
}
static T Car<T>(ConsCell<T> z)
{
return z((p, q) => p);
}
static ConsCell<T> Cdr<T>(ConsCell<T> z)
{
return z((p, q) => q);
}
Ugh! That doesn't compile. The problem is that Cdr's
lambda expression is expected to return a value of type
T
but returns a value of type ConsCell<T>
instead.
We can hack around this with a dirty trick.
static ConsCell<T> Cdr<T>(ConsCell<T> z)
{
ConsCell<T> result = null;
z((p, q) => { result = q; return p; });
return result;
}
I dislike this hack primarily because it requires a lambda expression with multiple
statements, and VB
doesn't support that. In addition, I don't like returning p from the lambda
expression and throwing its value away. However, this gives us the hint that we need to
find a cleaner approach.
delegate void Chooser<T1, T2>(T1 x, T2 y);
delegate void ConsCell<T>(Chooser<T, ConsCell<T>> f);
static ConsCell<T> Cons<T>(T x, ConsCell<T> y)
{
return m => m(x, y);
}
static T Car<T>(ConsCell<T> z)
{
T result = default(T);
z((p, q) => result = p);
return result;
}
static ConsCell<T> Cdr<T>(ConsCell<T> z)
{
ConsCell<T> result = null;
z((p, q) => result = q);
return result;
}
It requires more hacks to make this work in VB. In their current
incarnation, VB lambda expressions simply aren't all that robust. First, they don't
support multiple statements (that's a little tricky syntactically to achieve in
VB). The second (and more cumbersome) problem is that VB lambda expressions don't work very
well with delegates that return Void. VB lambdas must always return a value.
(These differences are mainly due to the fact
that C# and VB have very
different destinies.)
With hacks in place, the VB version looks like so:
Delegate Function Chooser(Of T1, T2)(ByVal x As T1, ByVal y As T2) As Integer
Delegate Sub ConsCell(Of T)(ByVal f As Chooser(Of T, ConsCell(Of T)))
Function Cons(Of T)(ByVal x As T, ByVal y As ConsCell(Of T)) As ConsCell(Of T)
Return Function(m) m(x, y)
End Function
Function Assign(Of T)(ByRef left As T, ByVal right As T) As Integer
left = right
Return 0
End Function
Function Car(Of T)(ByVal z As ConsCell(Of T)) As T
Dim result As T = Nothing
z(Function(p, q) Assign(result, p))
Return result
End Function
Function Cdr(Of T)(ByVal z As ConsCell(Of T)) As ConsCell(Of T)
Dim result As ConsCell(Of T) = Nothing
z(Function(p, q) Assign(result, q))
Return result
End Function
Now that Cons, Car and Cdr are defined, it's a simple matter to define a List
method to help "cons up" lists.
static ConsCell<T> List<T>(params T[] items)
{
ConsCell<T> result = null;
for (int i = items.Length - 1; i >= 0; i--)
result = Cons(items[i], result);
return result;
}
We can even write recursive methods that process our lists.
static void PrintList<T>(ConsCell<T> list)
{
if (list == null)
return;
else
{
Console.WriteLine(Car(list));
PrintList(Cdr(list));
}
}
Oh, and the client code? It's looking good.
var numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
PrintList(numbers);
Success! We've built meaningful data using only functions in both C# and VB.
It required some dirty tricks to get around static-typing and VB's skimpy lambda
expression support, but we did it. We built data out of air.
I need to reiterate that you should not use any of this in
production code. I have presented it for your enjoyment. If you learned
something from this article (i.e. you were able to wrap your head around how this works), you might consider taking steps into the world of
functional programming. Try a language like
ML,
F#,
Haskell,
Common Lisp or
Scheme.
Research the various
lambda calculi or
first-order logic
to see where the functional programming ideas were born. Mainstream development
has just started to test the academic waters of functional programming in
languages like C# and VB. It can't hurt to get a little wet.
I must make mention of my beautiful editor and trophy wife.
She took the time and shed the tears necessary to understand the concepts
presented here and to improve my clunky, caveman-like writing style. Remember,
if she gets it, you'd
better. 