SinC — The Tiniest LISP Compiler (to Python)

Tim Lopez published an elegant and tiny LISP interpreter for Python on his blog. Inspired by Norvig’s Paradigms of Artificial Intelligence Programming. I developed an interpreter as well as a compiler for LISP in Python. Hence the interpreter is quite similar to Tim Lopez’ implementation, I will present sinC, the s-expression input compiler. SinC compiles my own proprietary LISP dialect to Python source code.

We lost the documentation on quantum mechanics.  You'll have to decode the regexes yourself.

The reader

def read(sexpr):
    grammar = r"(\()|(\))|([^()\s]+)|(\s+)"

    def sequenceBuilder(match):
        leftbracket, rightbracket, atom, whitespace = match.groups()
        if(leftbracket): return '['
        elif(rightbracket): return ']'
        elif(atom): return '"' + atom + '"'
        elif(whitespace): return ','
    return eval(re.sub(grammar, sequenceBuilder, sexpr), None, None)

The reader converts a s-expression in a corresponding Python list representing the syntax tree. For example:

(print (+ 1 2)) -> ["print", ["+", "1", "2"]]

Only alphanumeric characters are allowed in symbols. Especially no strings or brackets. They would break the reader.

The evaluator

def comp(tree):
    if(type(tree) == list):
        if(tree[0] in specialForms):
            return specialForms.get(tree[0])(tree[1:])
        else:
            args = map(lambda x: comp(x), tree[1:])
            return tree[0] + "(" + ",".join(args) + ")"
    else:
        return tree

The evaluator parses the syntax tree recursively. The first symbol in a list is always interpreted as a function and the remaining symbols as its paramters. If the symbol representing the function is found in specialForms, a Python function is called, returning the generated Python source code as a string.

My dialect has the following special forms:

set, first, rest, cons, quote, eq, cond

Additionally, I implemented the algebraic operators ‘+’, ‘-’, ‘*’ and ‘/’ as special forms for convenience reasons.

Except for the set function, every special form is side effect-free and can be written as a Python lambda-expression. In particular, the evaluator is just nesting calls to pure Python functions.

The transcription step and the set function

def set(rest):
    if(len(rest) == 2): # a variable
        globalScope.append((comp(rest[0]), comp(rest[1])))
    else: # a function
        globalScope.append((comp(rest[0]), "lambda " + comp(rest[1]).strip("[]") +": " + comp(rest[2])))
    return "None"

The set function takes care of global variable and function bindings. In Python, such a binding is not possible in a lambda expression. So we evaluate this function to None and store the binding for later use in a globalScope. After parsing the syntax tree, we first generate Python code for the global Scope to establish the bindings and after that append the function calls generated by the comp function.

The whole compiler looks like this:

#!/usr/bin/env python

import re
import sys

errfile = "<sinc>"

def read(sexpr):
    grammar = r"(\()|(\))|([^()\s]+)|(\s+)"

    def sequenceBuilder(match):
        leftbracket, rightbracket, atom, whitespace = match.groups()
        if(leftbracket): return '['
        elif(rightbracket): return ']'
        elif(atom): return '"' + atom + '"'
        elif(whitespace): return ','
    return eval(re.sub(grammar, sequenceBuilder, sexpr), None, None)

def set(rest):
    if(len(rest) == 2): # a variable
        globalScope.append((comp(rest[0]), comp(rest[1])))
    else: # a function
        globalScope.append((comp(rest[0]), "lambda " + comp(rest[1]).strip("[]") +": " + comp(rest[2])))
    return "None"

specialForms = {
    "set": set,
    "first": lambda rest: "(" + comp(rest[0]) + ")[0]",
    "rest": lambda rest: "(" + comp(rest[0]) + ")[1:]",
    "cons": lambda rest: "[" + comp(rest[0])+"] + " + comp(rest[1]),
    "quote": lambda rest: "[" + ", ".join(map(lambda x: comp(x), rest)) + "]",
    "eq": lambda rest: "("+comp(rest[0])+" == "+comp(rest[1])+")",
    "cond": lambda rest: "("+comp(rest[1])+" if "+comp(rest[0])+" else "+comp(rest[2])+")",
    "+": lambda rest: comp(rest[0])+" + "+comp(rest[1]),
    "-": lambda rest: comp(rest[0])+" - "+comp(rest[1]),
    "*": lambda rest: "("+comp(rest[0])+") * ("+comp(rest[1])+")",
    "/": lambda rest: "("+comp(rest[0])+") / ("+comp(rest[1])+")"
}

globalScope = []
def comp(tree):
    if(type(tree) == list):
        if(tree[0] in specialForms):
            return specialForms.get(tree[0])(tree[1:])
        else:
            args = map(lambda x: comp(x), tree[1:])
            return tree[0] + "(" + ",".join(args) + ")"
    else:
        return tree

def transcribe(sexpr):
    tree = read(sexpr)
    if(type(tree) == tuple):
        transcripts = []
        for node in tree:
            transcripts.append(comp(node))
        transcript = "\n".join(transcripts)
    else:
        transcript = comp(tree)

    initCode = []
    for expr in globalScope:
        initCode.append(str(expr[0]) + " = " + str(expr[1]) + "\n")
    return "".join(initCode) + transcript

def sinc(sexpr):
    transcript = transcribe(sexpr)
    print transcript

def repl():
    while(True):
        print "sinC> ",
        sys.stdout.flush()
        line = sys.stdin.readline().lstrip(">").strip()
        try:
            sinc(line)
        except:
            print "???"

if __name__ == "__main__":
    if(len(sys.argv) == 2):
        f = open(sys.argv[1])
        sinc(f.read())
        f.close()

    else:
        print "This is sinC-alpha. Interactice LISP to Python compiler."
        print
        repl()

Call it with a sourcefile as argument or with no arguments to compile s-expressions interactively (used this a lot for testing).

Demo program: Series summation

As a little demo we calculate the series summation of a range and print the result:

(set sum x (cond (eq x 0) 0 (+ (sum (- x 1)) x)))

(set range (quote begin end) (cond (eq begin end) (quote) (cons begin (range (+ begin 1) end))))
(set len list (cond (eq list (quote)) 0 (+ 1 (len (rest list)))))
(set map (quote fun list) (cond (eq list (quote)) (quote) (cons (fun (first list)) (map fun (rest list)))))

(set testrange (range 1 8))

(print (len testrange))
(print testrange)
(print (map sum testrange))

Call it

$ ./sinc sum.sinc  | python
7
[1, 2, 3, 4, 5, 6, 7]
[1, 3, 6, 10, 15, 21, 28]

and it works! The generated code looks like this:

sum = lambda x: (0 if (x == 0) else sum(x - 1) + x)
range = lambda begin, end: ([] if (begin == end) else [begin] + range(begin + 1,end))
len = lambda list: (0 if (list == []) else 1 + len((list)[1:]))
map = lambda fun, list: ([] if (list == []) else [fun((list)[0])] + map(fun,(list)[1:]))
testrange = range(1,8)
None
None
None
None
None
print(len(testrange))
print(testrange)
print(map(sum,testrange))
About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s