Tuesday, August 16, 2011

Clojure Patterns: cons, car and cdr

My first look at Clojure came very close to being my last. Coming from a background of both Lisp and Java, Clojure sounded like a good way to get the benefits of functional programming in a JVM environment. As with a lot of people I have become increasingly frustrated with object-oriented programming. Too many things are still either too hard or simply too "boiler-plate", things that a functional paradigm addresses more cleanly.

As with any new language, it is good to poke it with a stick. Accordingly, I fired up a REPL and tried a few command just to see what happens. My first session went something like:

user> (+ 1 2 3)
6
user> (cons 1 2)
java.lang.IllegalArgumentException: Don't know how to create ISeq from: java.lang.Integer (NO_SOURCE_FILE:0)
at clojure.lang.Compiler.eval(Compiler.java:5440)
at clojure.lang.Compiler.eval(Compiler.java:5391)
at clojure.core$eval.invoke(core.clj:2382)

Seriously?, a Lisp that cannot cons? And with that I shut down the REPL and went on with my life.

Sometime later a friend of mine re-introduced me to Clojure and I took a deeper look at it. There are a lot of things to recommend the language, the "broken" cons not-with-standing. The concurrency model, the lazy model and sequences in general are both well thought out and implemented.

However, as a new language there are a lot of questions out there that boil down to "how do I implement X in Clojure"? Languages like Java have a lot of materials out there on that exact subject, often referred to as design patterns. By following the established patterns one can implement "better" code.

Some would say that design patterns are to overcome language defects. While that may be true in some cases, that is certainly not the driving force behinds these ideas. Design patterns have been around for a long time with lots of names. The Greeks had a golden ratio. Today we have the "16 inch on center" pattern. In construction, it is common to build walls with 2x4s 16 inches apart. Doing so is not to overcome a deficiency in lumber, but is a well-established technique for making sturdy walls. In addition, it makes the wiring, plumbing, insulation and sheet rock work together properly as well. The benefit is felt even years later when a person needs to work on the wall.

The same idea applies to software. If software is built using accepted patterns, that software has a better chance to being able to integrate into systems today and to be maintained in the future.

With that in mind, I will be presenting some patterns for use in Clojure. These patterns will be garnered from personal experience, from the Clojure core and from answers to questions which people pose in various places around the web.

The first pattern will be an anti-pattern: cons, car and cdr. Since these do not exist as expected in the language, what is the proper way to implement these ideas in Clojure? First a quick reference table.

Patterns Quick Reference
IdeaCONSCARCDR
A cons cell or a pair (def pair [1 2]) (first pair) or (pair 0) (second pair) or (pair 1)
A sequence or ISeq like list, vector, set or map (def sq (cons 4 [3 2 1])) (first sq) (rest sq)
An arbitrary data structure Use either a built in clojure data structure or a Java object Access depends on the structure Access depends on the structure
Cons Cell or Pair

As noted before the cons cell does not exist a generic data structure in Clojure. If you want to create a simple pair of values, one recommended way to do it is with a vector. A vector allows you to quickly create a structure to hold one or more values that can be quickly accessed by index, much like an array.

A vector can be created using the vector function or by surroundings the values in square brackets []. Using vectors is fast and also allows for expansion from just a pair of values to any n-tuple of values. Retrieving the values can be done in several ways including:

  • Functions first and second
  • Using the vector itself as a function and getting the values by index
  • Treating the vector as a sequence and using map, reduce and filter.
user=> (def pair [1 2])
#'user/pair
user=> (first pair)
1
user=> (pair 0)
1
user=> (second pair)
2
user=> (pair 1)
2
user=> (reduce + pair)
3
Sequence or ISeq

In Lisp the basic abstraction is the List is is simply a linked list of cons cells. The head of each cell is the value and the tail is a link to the next cons cell in the list. In Clojure, the basic abstration is the Sequence implemented with the ISeq interface. This allows Clojure to have multiple higher-level data structures with different semantics that implement this interface and can be accessed in similar ways. These data structures include lists, sets, vectors and maps. All have different uses yet can all be accessed in classic fashion using functions like map, reduce and filter.

The cons function in both Lisp and Clojure adds an element to the beginning of a list or sequence. In Lisp, the result is a new list while in Clojure the result is a new sequence. The concrete implementation of the sequence does not matter as much as the fact the it implements ISeq and is thus able to be processed as expected.

user=> (def sq (cons 4 [3 2 1]))
#'user/sq
user=> sq
(4 3 2 1)
user=> (first sq)
4
user=> (rest sq)
(3 2 1)
user=> (class [3 2 1])
clojure.lang.PersistentVector
user=> (class sq)
clojure.lang.Cons
Data Structures

In Lisp it is possible to build arbitrary data structures like trees using cons cells. Clojure is built on the JVM and therefore has access to any data structure via the java interop. There are also a set of data structures built in that are immutable and are therefore well suited to the language.

At this point there will not be any examples of using data structures as the general usage of data structures in the language is beyond the scope of a discussion of cons, car and cdr. As specific patterns are discussed examples appropriate to the pattern will be given.

Conclusion

While Clojure is missing a basic Lisp form, it turns out that by using higher abstractions the same tasks can be accomplished in a more efficient manner. It is unfortunate that the language is missing the cons cell just to avoid the principle of least astonishment for people coming from Lisp.

Next time will be a discussion of recursion in Clojure because there is an unexpected twist in how it is implemented.