[FoRK] The World's Smallest Programming Language

Dave Long < dave.long at bluewin.ch > on > Thu Sep 14 08:40:47 PDT 2006

> Really?  I was just reading Peyton Jones' 1987 book on implementing
> functional languages, and there's a chapter on SK-combinators.  He's
> not very enthusiastic about them, but he doesn't say *exponentially*
> larger and slower.  The basic rewrite rules for compiling the lambda
> calculus into SK-combinators (here \ is lambda):
>   T[ \ x . x ] = I = SKK if you want to be really minimalist
>   T[ \ x . y ] = K y
>   T[ \ x . \ a . B ] = T[ \ x . T[ \ a . B ] ]
>   T[ \ x . A B ] = S (T[ \ x . A ]) T[ \x . B ]
> which is that you go in and tack on another level of S or K for each
> level of lambda-nesting, which takes O(n^2) at compile time in levels
> of nesting, and also means that every original expression gets this
> O(n) slowdown.  But that's not *exponentially* larger and slower, is
> it?

Well, by naively coding that same translation (and taking advantage of  
switching to Unlambda notation in which ` means application, and is  
equivalent to fully-parenthesizing the combinator expressions, so that  
translating becomes folding a snoc-list) we get:

def T(vs,e):
     """ abstract variables from an expression """
     def unlist(l): return "".join(l)
     def elim(v,e):
         """ eliminate a free variable from an expression """
             t, u = e[-1], elim(v,e[:-1])
             if t == "`":        return u + list("``S")
             elif t != v:        return u + list("`K%s" % t)
             else:               return u + list("I")
         except:                 return []
     try:        return unlist(elim(vs[-1], T(vs[:-1],e)))
     except:     return e

and \x . (+ x x), in this notation, becomes "``+xx", with two  
applications due to the currying.

print T("x","``+xx") produces "``S``S`K+II", which is what we want.

unfortunately, even relatively simple nested terms

     def s(vs):
         r = T(vs,vs[0])
         print "%d: %d %s" % (len(vs),len(r),r)


1: 1 I
2: 3 `KI
3: 9 ``S`KK`KI
4: 27 ``S``S`KS``S`KK`KK``S`KK`KI
5: 81  
6: 243  
7: 729  

which look exponential to me.  (but, again, this is a very naive imp)

>  And I have a hunch that you can eliminate that source of the
> slowdown entirely by first lambda-lifting, compiling to
> supercombinators, and then tupling their arguments.
> He explains that you can ameliorate this problem with B and C
> combinators ...

I certainly believe that, but will leave it up to someone else to fix  
up (or, more simply, find a better alternative to) the code above to  
perform Turner reductions, lambda-lifting, S' B* C' supercombination,  
etc.  Tupling sounds like the big win here; it looks like it's multiple  
arguments that cause the big blow ups.

Didn't SPJ explicitly say somewhere that he was using graph reduction  
in order to avoid duplication of combinators/reductions when one has  
large, redundant, expressions?


:: :: ::

> (I think there's an interesting connection here with strictness
> analysis and linearity, if you use the B and C combinators instead of
> S whenever you can, but I'm not sure I understand it yet.)

It makes sense that there should be.  Linearity analysis allows one to  
thread straight through, avoiding environment save/restores.  One  
method of saving and restoring an argument is to duplicate a copy for  
each reference, and B and C likewise seem to be ways of avoiding the  
duplication involved with a straight S.

>> Maybe I just need to read the speed prior stuff more closely, but it
>> seems that if algorithms jump drastically between big-O classes
>> depending upon whether they're expressed with naming/sharing or not,
>> it would affect which programs were chosen by a speed prior.  Or
>> does time expand pretty uniformly across codes?
> I'm not sure what you mean, exactly.  Can you give some examples?
>> Depending upon how much you care, this might be a question for  
>> kc-list.
> What's that?

The Kolmogorov-complexity (kolmogorov-chaitin?) list -- basically a  
measure by which one defines the complexity of a string to be the  
length of the shortest program which produces it as output.

Of course, it's usually much easier to produce proofs with these  
minimums than to actually find them for any given string/language pair.

I believe the "speed prior" idea is that rather than find the shortest  
program, we can look for the program which produces the correct output  
in the shortest amount of time.  Now it becomes a little easier to find  
some minima: run all possible programs, interleaving (according to  
length?) their executions, until one manages to generate the output.


More information about the FoRK mailing list