A programming language from scratch: 3 steps to an interpreter

		A programming language from scratch: 3 steps to an interpreter (francisstokes.wordpress.com)
		217 points by FrancisStokes on Aug 17, 2017 \| hide \| past \| favorite \| 33 comments

justinpombrio on Aug 17, 2017 | [–]

Shameless plug for the programming language course I'm TAing. It covers how to write a similar interpreter, with more detail about evaluation and less about parsing. The book and all of the assignments are online.

http://cs.brown.edu/courses/cs173/2016/index.html

http://cs.brown.edu/courses/cs173/2016/assignments/programmi...

http://cs.brown.edu/courses/cs173/2016/assignments.html

The "interpreter" assignment has a similar end-goal as this tutorial.

Many of the other assignments (the ones prefixed "ML-") are "mystery languages", where you try to tell the difference between two programming languages that have the same syntax but different semantics. It's a fun way to explore the consequences of language design, I think.

Next year we might do a MOOC.

Twisol on Aug 17, 2017 | | [–]

I can vouch for the mystery language assignments at least -- we worked through several at this year's Racket summer school (https://summer-school.racket-lang.org/2017/), and they were interesting and very engaging. (Hi Justin!)

totalperspectiv on Aug 17, 2017 | | | [–]

This course looks awesome! Do you (or those scrolling by) know of any other courses like this? By this I mean short but focused burst of learning done in person.

Twisol on Aug 17, 2017 | | | [–]

Yes, there seem to be a number of summer schools every year on a variety of topics, although the ones I'm familiar with are based more in academia than in industry. I think they tend to get announced on the Types mailing list (https://lists.seas.upenn.edu/mailman/listinfo/types-announce).

Here are a few that I'm aware of:

* Racket summer school: https://summer-school.racket-lang.org/2017/

* Oregon Programming Languages Summer School: https://www.cs.uoregon.edu/research/summerschool/summer17/

* DeepSpec summer school: https://deepspec.org/event/dsss17/

mlazos on Aug 17, 2017 | | | [–]

I took the course a few years ago and can't overstate how much I learned implementing the interpreter assignments. They entirely revamped my thinking about programming and even prepared me for the real world a bit (my team at work just built a custom scheme interpreter for a deep learning DSL)

userbinator on Aug 17, 2017 | | [–]

IMHO Lisp-like languages are good for showing off the "back-end" of a compiler/interpreter, because tokenisation/parsing is completely trivial. On the other hand, you miss out a lot of the fun of parsing a "real" language, with precedences and a more nontrivial grammar.

One of the most enlightening points about interpreters/compilers is that, to a first approximation, the only difference between an interpreter and compiler is what it does when it walks the tree --- an interpreter executes the actions immediately, while a compiler generates code which executes those actions.

An amazing demonstration of this is at https://news.ycombinator.com/item?id=8558822 and https://news.ycombinator.com/item?id=8746054 . It doesn't actually build an AST, but implicitly creates one and walks it since it does recursive-descent parsing.

(Aside: I happened to glance at the URL at the same time as I was reading the Tokenisation section, and thought "Francis Tokes? How appropriate..." before looking more closely at the page title.)

FrancisStokes on Aug 17, 2017 | | [–]

Hey - author here. You're completely right, but I hope the basic understanding will whet peoples appetites to try some more exotic stuff! Really I was just trying to lift the veil of what can be a quite intimidating subject if you haven't grazed past it before. Thanks for checking it out!

jeffhuys on Aug 17, 2017 | | | [–]

Hey Francis, I just wanted to take a moment to tell you that I really enjoyed reading your article. Good stuff!

FrancisStokes on Aug 17, 2017 | | | [–]

Thanks! I really appreciate you saying so, and I'm glad you enjoyed it

e12e on Aug 17, 2017 | | | [–]

> One of the most enlightening points about interpreters/compilers is that, to a first approximation, the only difference between an interpreter and compiler is what it does when it walks the tree --- an interpreter executes the actions immediately, while a compiler generates code which executes those actions.

For something else in a similar vein, I've always liked:

http://howistart.org/posts/nim/1/

I suppose implementing brainfuck is more like writing a vm, than a language - but I still think the article is a surprisingly straightforward implementation of a compiler.

agumonkey on Aug 17, 2017 | | | [–]

As a programmer I hated syntax from day one. Lisp focus on everything else was a breeze of fresh air.

atemerev on Aug 17, 2017 | | [–]

If you have an interpreter, you can obtain a native executable by attaching the interpreter code to the source file, and then removing all the unnecessary parts (optimizing). This is called the first Futamura projection.

Discovering the nature of second and third Futamura projections is left as an exercise to the reader ;)

Sohcahtoa82 on Aug 17, 2017 | | [–]

> This is called the first Futamura projection.

I read that as "Futurama" at first and was confused trying to figure out how that TV show was related to interpreters and compilers.

tzs on Aug 17, 2017 | | [–]

That looks like fun. Here's a series of HN comments that outline a similar thing, but going FORTH-like instead of Lisp-like [1].

I suppose Lisp and FORTH are the most common for these kind of projects, at least that I have seen, because of the simple syntax.

APL has a pretty simple syntax, but I have not seen anyone choose it for one of these build your own language from scratch projects. I wonder why not?

[1] https://news.ycombinator.com/item?id=13082825

agumonkey on Aug 17, 2017 | | [–]

They saw Whitney (J author) threaded interpreter in C and they ran away.

http://code.jsoftware.com/wiki/Essays/Incunabulum

brudgers on Aug 17, 2017 | | [–]

A couple of other resources that come to my mind are,

The somewhat more in depth Beautiful Racket, http://beautifulracket.com/

and the significantly more in depth Structure and Interpretation of Computer Programs, https://mitpress.mit.edu/sicp/full-text/book/book-Z-H-4.html...

zerr on Aug 17, 2017 | | [–]

I wish such content existed for ACID compliant databases.

boltn on Aug 17, 2017 | | [–]

'How Postgres Makes Transactions Atomic' https://news.ycombinator.com/item?id=15027870

gedrap on Aug 17, 2017 | | | [–]

Yes! I was looking for something similar to create a basic database in order to really understand the underlying concepts, but couldn't really find anything.

overcast on Aug 17, 2017 | | | [–]

Sounds like a good project!

d12bb on Aug 17, 2017 | | [–]

Am I completely misunderstanding something here or should (* 3 3 3) be actually 27 instead of 9?

Anyway, nice article, well illustrated, love to read stuff like this!

FrancisStokes on Aug 17, 2017 | | [–]

Oh my god what a glaring mistake! Thank's for pointing it out, I've corrected the article.

c0achmcguirk on Aug 17, 2017 | | [–]

Using a Lisp-like language is great for creating a new language because of the S-expressions that Lisp depends on.

S-expressions are very compact and are easy to parse. There was actually some support for using S-expressions over XML and JSON as a serialization language.[1]

Ron Rivest even proposed S-expressions as a standard.[2]

I still prefer JSON over them though. As the old joke goes, LISP stands for "Lots of Irritating Silly Parentheses."

1 - http://eli.thegreenplace.net/2012/03/04/some-thoughts-on-jso...

2 - http://people.csail.mit.edu/rivest/Sexp.txt

jwdunne on Aug 17, 2017 | | [–]

After grokking the tokenisation, parsing and eval of lisp like languages, what's the next step up?

I usually have a bit of fun mocking out a language I'd like but end up getting into details like type systems. I never really get to expressing as a CFG.

I'm wondering if the best course is to say right, keys now build a small, procedural language or some kind of barely typed C lookalike as the best step up?

mhh__ on Aug 17, 2017 | | [–]

At a glance, you could make a very unsafe not-really C without too much thought. I probably wouldn't make it too C like, just so I could avoid worrying about parsing etc.

As to type systems, that's a completely different can of worms (One which I do not have much expertise in). Type systems can range from the simple e.g C to the formal study of type systems (e.g. Hindley-Milner). Personally, I don't really know the formal side very well (I can't recall details from memory the same way I can with (say) Lagrangian mechanics), so I can't recommend one particular book.

simplify on Aug 17, 2017 | | | [–]

I found these resources invaluable when implementing my first type system:

- Understanding Hindley-Milner http://akgupta.ca/blog/2013/05/14/so-you-still-dont-understa... - Implementing a basic type system http://cs.brown.edu/courses/cs173/2012/book/types.html

jwdunne on Aug 18, 2017 | | | [–]

That first link was awesome. The notation can be intimidating but this article has defogged it. Thank you!

I'm gonna check out the next article and return to the first. Deserves a second reading.

mietek on Aug 17, 2017 | | | [–]

Why not give Pierce’s “Types and programming languages” a spin?

cardiffspaceman on Aug 17, 2017 | | | [–]

I agree, for sufficiently large values of step. This book could change your life.

But I think the C4 self-hosted C compiler (discussed at [1]) might be another next step in a different direction.

[1] https://news.ycombinator.com/item?id=8558822

jwdunne on Aug 18, 2017 | | | [–]

This intuitively makes a lot of sense. The short variable names with the legend at the top makes it a little harder to understand but you can see what's going on from a macroscopic view. Perhaps translating this to another language with clearer variable names would be a good exercise for me.

jwdunne on Aug 18, 2017 | | | | [–]

I've been working through this book. It's difficult to get through, I'll admit, at my stage. Is there anything you'd recommend as a supplement?

Sohcahtoa82 on Aug 17, 2017 | | | [–]

Try making an interpreter for a simple language like QBASIC.

novakinblood on Aug 17, 2017 | [–]

quote from article: "As a side note: if Lel ever becomes popular enough that people argue about tabs vs spaces, or if function bodies belong on a newline always, that will absolutely make my life."

That unexpectedly made me LOL.