The Next 700 Programming Languages by P. J. Landin
2017-03-11 15:12
This is a series of posts where I jot down my learnings from computer science papers I’ve read. Previously, An Axiomatic Basis for Computer Programming
This blog post summarizes key points/learnings from the paper. I chose this paper after hearing a talk by John Hughes and Mary Sheeran. I thought it would be fun to examine what Landin thought about the programming language (PL) landscape 50 years ago, and contrast it with what we have now.
ISWIM
The paper describes a framework, called ISWIM (for If you See What I Mean), for programming languages. The framework defines rules around 2 features that all programming languages provide:
User-coined names
Conventions about characterizing functional relationships
ISWIM attemps to be a general purpose system, that can be made oriented for a specific problem by picking an appropriate set of primitives.
“Most programming languages are partly a way of expressing things in terms of other things and partly a basic set of given things”
Levels of ISWIM
Landin defines the 4 levels of abstraction for ISWIM:
Physical ISWIM
Logical ISWIM
Abstract ISWIM
Applicative Expressions (AEs)
Physical ISWIM is the actual text that is seen and written by programmers.
Logical ISWIM can be thought of lex tokens output by a Lexer (<=
is the less than equals operator).
The abstract ISWIM is the Abstract Syntax Tree (AST) for the program.
And the AEs is the lowest level bytecode/assembly that is executed.
This separation of layers gives ISWIM its flexibility. For example, “new” languages can be written that runs on the same Abstract Machine that executes AEs, as long as these “new” languages can be transformed in AEs. (This is similar to how various X-to-JavaScript languages, such as ClojureScript, BuckleScript, GopherJS, etc., works.)
For the abstract ISWIM, Landin defines a where-expression, which looks like how mathematical formulas are broken down:
f(b+2c) + f(2b-c) where f(x) = x * (x+1)
Characteristic equivalences
Landin describes in details the characteristic equivalences of ISWIM expressions, i.e. when are two expressions in ISWIM equal. Through these equivalence relations, different expressions at the abstract level can be transformed into the same expressions at the AEs level. The benefit is that now you can have a smaller core.
User-coined names (which is a major feature that ISWIM claims) introduced via let expressions, function definitions, have equivalent where-expressions.
So the benefit is obvious here: instead of separately supporting various ways of user-coined names, they are all equivalent to where-expressions, and at the AEs level, it only needs to know about where-expressions. I think this is now more commonly known as desugaring.
The basis of all these beautiful transformation are 3 things:
Expressions have nesting subexpression structure
Each subexpression denotes something (a number, string, etc.)
What the expression denotes depends only on the values of its subexpressions
These 3 properties make expressions easier to construct and understand, and Landin suggests that there is an
“evolutionary trend towards ‘bigger right hand sides’ in place of strings of small, explicitly sequenced assignments and jumps”.
From my experience with programming, we don’t actually have bigger right hand sides. Rather we bury more expressions within curly braces in classes and methods, at least in object-oriented languages like Java. In functional languages, I think this is quite true. I find myself using let
expressions more in OCaml. One explanation is that OCaml infers types well, I don’t have to specify types whenever I use let
, which reduces the friction.
Discussion on indentation-based syntax and declarative languages
At the end of the paper, there is a discussion section from various computer scientists about the indentation based syntax of ISWIM, and what declarative languages are.
The general consensus around not using indentation for structure is when you have a block that has multiple lines, spanning multiple pages. In that case it makes it harder, when on a new page, which containing block an indented block belongs to.
The definition of declarative language starts of as “languages which do not contain assignment statements or jumps”, as offered by Strachey. As the discussion progresses, the definition moves towards having a “situation where you have specified everything you want to know, though the exact sequence in which you evoke the various operations to cause the solution is left unspecified”, offered by Ingerman. This is closer to the current, common understanding of declarative languages, think MySQL.
Thoughts
This paper has been quite a challenging but interesting read. It is challenging because I have to try and put myself into the PL world 50 years ago, and it’s hard to imagine what it was like then, and the motivations behind this paper. It is interesting because I can spot a little of ISWIM in the languages I use now, both object-oriented and functional. It is also interesting that Landin advocates strongly for expressions with no side effects, but somehow it didn’t become the default way of doing things.