Lisp's nested syntax

Syntactic sugar causes cancer of the semicolon. — Alan Perlis

The continued growth of Clojure has shown that parenthitis is not as widespread an ailment as some would have had us believe. Should we then conclude that we are past the decades-long flamewars around Lisp's syntax? What would become of the legacy built by lovers and haters alike: a full body of literature praising the power and elegance of sexp notation or, conversely, decrying the horror of a syntax as bare as a parse tree representation? Fear not. Palavering around syntax may be an idle pursuit, but it is one that we, as members of the Homo loquax species, are not quite ready to let go of.

Jean-Philippe Paradis tweeted that all objections to s-expression syntax are speculative problems. They disappear once one gets one's hands dirty. And indeed, lispy syntax is something one embraces in the name of the grand vision it stands for: homoiconicity, code is data, macros… Give it some time, and paredit will do the rest.

And yet, memories of old may assail the recent convert. In particular, the allure of method chaining, a technique that originated in Smalltalk and that is now found in many imperative languages, might come back and taunt him. Stuart Sierra made the observation after he moved from Perl to Lisp, back in 2007. Xah Lee, a polyglot never shy about stirring controversy, has turned the subject into a favorite pet peeve of his.

Here is a simple problem:

OK, I want to create a nested list in Lisp (always of only integers) from a text file, such that each line in the text file would be represented as a sublist in the 'imported' list.

Example of input:

3 10 2
4 1
11 18

Example of output:

((3 10 2) (4 1) (11 18))

This Ruby one-liner is used to demonstrate an elegant solution involving method chaining.

IO.readlines("blob.txt").map{|line| line.split.map{|s| s.to_i }}

Lisp languages, according to Xah Lee, offer only an unwieldy solution, as shown in this Emacs Lisp example.

(defun read-lines (file)
  "Return a list of lines in FILE."
  (with-temp-buffer
    (insert-file-contents file)
    (split-string
     (buffer-string) "\n" t)))

(mapcar
 (lambda (x)
   (mapcar
    (lambda (y) (string-to-number y))
    (split-string x " ")))
 (read-lines "blob.txt"))

The argument is that nested syntax somehow stands in the way of the function chaining constructs available elsewhere.

x | f | g | h      unix pipe
x // f // g // h   Mathematica 
h @ g @ f @ x      Mathematica 
x.f.g.h            various OOP langs, especially Ruby, JavaScript
h g f x            some functional langs, Haskell, OCaml

What stands in the way is probably just a psychological barrier. Think of x | f | g | h as (| x f g h), just like a + b + c + d is equivalent to (+ a b c d). And indeed, that's precisely how pipes look in scsh or Chicken Scheme. (Thank you, Rainer Joswig, the veteran Lisper who shared this insight with me.)
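To make the (| x f g h) reading concrete, here is a minimal sketch in Clojure. The name pipe is a hypothetical helper made up for illustration (it is not part of clojure.core, and it is not how scsh or Chicken Scheme implement their pipelines); it simply treats the pipeline as a variadic prefix function, the same way + is.

```clojure
;; `pipe` is a hypothetical helper, not part of clojure.core.
(defn pipe
  "Feeds x through each function in fs, from left to right."
  [x & fs]
  (reduce (fn [acc f] (f acc)) x fs))

;; x | f | g | h becomes (pipe x f g h), just as
;; a + b + c + d becomes (+ a b c d).
(pipe 3 inc #(* % 2) str) ;=> "8"
(+ 1 2 3 4)               ;=> 10
```

With no functions at all, (pipe x) just returns x, much like (+ x).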

Lisp is the result of taking syntax away, Perl is the result of taking syntax all the way. — Doug Hoyte

With this quote, Doug Hoyte was stressing the fact that Lisp languages are built on minimalistic sexps. But it would be a mistake to equate the two. Lisp expressions take many shapes and forms before they are reduced to internal data structures. Rainer Joswig pointed out that every macro, every special form implements syntax. Additionally, every user-defined macro introduces new syntax. And when available, read macros further coax non-lispy syntax into something that the reader can process.
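As a small illustration of a user-defined macro introducing syntax, here is a sketch of an unless form in Clojure. The macro name and its exact shape are my own assumptions for the sake of the example; clojure.core ships a similar construct under the name when-not.

```clojure
;; `unless` is a hypothetical macro defined here for illustration only.
(defmacro unless
  "Evaluates body only when test is falsey; returns nil otherwise."
  [test & body]
  `(if ~test nil (do ~@body)))

(unless (empty? [1 2 3]) :non-empty) ;=> :non-empty
(unless true :never)                 ;=> nil
```

Once defined, unless reads like any built-in conditional, even though the reader still only sees a plain list.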

Nested expressions are a byproduct of sexp syntax, but when they become too deeply entangled, they can be disentangled by… (Lisp) syntax. Here are different ways to solve the previous problem in Clojure, demonstrating syntactic variety and how nested expressions can be kept to a minimum.

First, let's require some functionality.

(require '[clojure.java.io :refer [reader]]
         '[clojure.string :refer [split]])

(def blob "/path/to/blob.txt")

Clojure's list comprehension, for, is a syntax-laden macro that builds specific sets out of general ones.

(for [line (line-seq (reader blob))
      :let [line (split line #"\s+")]]
  (map read-string line))
((3 10 2) (4 1) (11 18))

Nested syntax is mitigated with the use of higher-order functions such as comp, the classical function composition mechanism.

(map (comp (partial map read-string) #(split % #"\s+")) (line-seq (reader blob)))
((3 10 2) (4 1) (11 18))

The threading operator, sometimes loosely called the thrush combinator, is a Clojure macro that interweaves forms in a preset way, eliminating nested syntax. This makes way for expression chaining, much like the Unix pipeline or à la jQuery. Something of a syntactic innovation, the idea has by now spread across the Lisp horizon.

(->> (line-seq (reader blob))
     (map #(split % #"\s+"))
     (map (partial map read-string)))
((3 10 2) (4 1) (11 18))
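To see what the threading macro does behind the scenes, one can ask Clojure to expand it. The sketch below uses macroexpand-all from clojure.walk; the symbols lines and parse are placeholders of my own that never get evaluated, since the form is quoted.

```clojure
(require '[clojure.walk :refer [macroexpand-all]])

;; `lines` and `parse` are placeholder symbols; the form is quoted,
;; so it is rewritten by the macro but never evaluated.
(def expanded
  (macroexpand-all '(->> lines (map parse) (filter seq))))

expanded ;=> (filter seq (map parse lines))
```

The flat pipeline is rewritten into the familiar nested form before evaluation, so the chaining is purely a matter of surface syntax.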

Let's not forget the ubiquitous let, which executes a series of forms with bindings.

(let [lines (line-seq (reader blob))
      lines (map #(split % #"\s+") lines)]
  (map (partial map read-string) lines))
((3 10 2) (4 1) (11 18))

Finally, Lisp old-timers will tell you there's nothing wrong with properly formatted, nested, tree-like expressions.

(map #(map read-string %) 
     (map #(split % #"\s+") 
          (line-seq (reader blob))))
((3 10 2) (4 1) (11 18))

As Mike pointed out in the comments, the above can be made shorter like so:

(map #(map read-string (split % #"\s+"))       
     (line-seq (reader blob)))
((3 10 2) (4 1) (11 18))

Further reading:

The Semicolon Wars / PL Syntax / Extreme syntax / Syntax across languages / On holy wars and a plea for peace

Daniel Szmulewicz 15 January 2014