Hacker News new | past | comments | ask | show | jobs | submit login
A programming language from scratch: 3 steps to an interpreter (francisstokes.wordpress.com)
217 points by FrancisStokes on Aug 17, 2017 | hide | past | favorite | 33 comments



Shameless plug for the programming language course I'm TAing. It covers how to write a similar interpreter, with more detail about evaluation and less about parsing. The book and all of the assignments are online.

http://cs.brown.edu/courses/cs173/2016/index.html

http://cs.brown.edu/courses/cs173/2016/assignments/programmi...

http://cs.brown.edu/courses/cs173/2016/assignments.html

The "interpreter" assignment has a similar end-goal as this tutorial.

Many of the other assignments (the ones prefixed "ML-") are "mystery languages", where you try to tell the difference between two programming languages that have the same syntax but different semantics. It's a fun way to explore the consequences of language design, I think.

Next year we might do a MOOC.


I can vouch for the mystery language assignments at least -- we worked through several at this year's Racket summer school (https://summer-school.racket-lang.org/2017/), and they were interesting and very engaging. (Hi Justin!)


This course looks awesome! Do you (or those scrolling by) know of any other courses like this? By this I mean short but focused burst of learning done in person.


Yes, there seem to be a number of summer schools every year on a variety of topics, although the ones I'm familiar with are based more in academia than in industry. I think they tend to get announced on the Types mailing list (https://lists.seas.upenn.edu/mailman/listinfo/types-announce).

Here are a few that I'm aware of:

* Racket summer school: https://summer-school.racket-lang.org/2017/

* Oregon Programming Languages Summer School: https://www.cs.uoregon.edu/research/summerschool/summer17/

* DeepSpec summer school: https://deepspec.org/event/dsss17/


I took the course a few years ago and can't overstate how much I learned implementing the interpreter assignments. They entirely revamped my thinking about programming and even prepared me for the real world a bit (my team at work just built a custom scheme interpreter for a deep learning DSL)


IMHO Lisp-like languages are good for showing off the "back-end" of a compiler/interpreter, because tokenisation/parsing is completely trivial. On the other hand, you miss out a lot of the fun of parsing a "real" language, with precedences and a more nontrivial grammar.

One of the most enlightening points about interpreters/compilers is that, to a first approximation, the only difference between an interpreter and compiler is what it does when it walks the tree --- an interpreter executes the actions immediately, while a compiler generates code which executes those actions.

An amazing demonstration of this is at https://news.ycombinator.com/item?id=8558822 and https://news.ycombinator.com/item?id=8746054 . It doesn't actually build an AST, but implicitly creates one and walks it since it does recursive-descent parsing.

(Aside: I happened to glance at the URL at the same time as I was reading the Tokenisation section, and thought "Francis Tokes? How appropriate..." before looking more closely at the page title.)


Hey - author here. You're completely right, but I hope the basic understanding will whet peoples appetites to try some more exotic stuff! Really I was just trying to lift the veil of what can be a quite intimidating subject if you haven't grazed past it before. Thanks for checking it out!


Hey Francis, I just wanted to take a moment to tell you that I really enjoyed reading your article. Good stuff!


Thanks! I really appreciate you saying so, and I'm glad you enjoyed it


> One of the most enlightening points about interpreters/compilers is that, to a first approximation, the only difference between an interpreter and compiler is what it does when it walks the tree --- an interpreter executes the actions immediately, while a compiler generates code which executes those actions.

For something else in a similar vein, I've always liked:

http://howistart.org/posts/nim/1/

I suppose implementing brainfuck is more like writing a vm, than a language - but I still think the article is a surprisingly straightforward implementation of a compiler.


As a programmer I hated syntax from day one. Lisp focus on everything else was a breeze of fresh air.


If you have an interpreter, you can obtain a native executable by attaching the interpreter code to the source file, and then removing all the unnecessary parts (optimizing). This is called the first Futamura projection.

Discovering the nature of second and third Futamura projections is left as an exercise to the reader ;)


> This is called the first Futamura projection.

I read that as "Futurama" at first and was confused trying to figure out how that TV show was related to interpreters and compilers.


That looks like fun. Here's a series of HN comments that outline a similar thing, but going FORTH-like instead of Lisp-like [1].

I suppose Lisp and FORTH are the most common for these kind of projects, at least that I have seen, because of the simple syntax.

APL has a pretty simple syntax, but I have not seen anyone choose it for one of these build your own language from scratch projects. I wonder why not?

[1] https://news.ycombinator.com/item?id=13082825


They saw Whitney (J author) threaded interpreter in C and they ran away.

http://code.jsoftware.com/wiki/Essays/Incunabulum


A couple of other resources that come to my mind are,

The somewhat more in depth Beautiful Racket, http://beautifulracket.com/

and the significantly more in depth Structure and Interpretation of Computer Programs, https://mitpress.mit.edu/sicp/full-text/book/book-Z-H-4.html...


I wish such content existed for ACID compliant databases.


'How Postgres Makes Transactions Atomic' https://news.ycombinator.com/item?id=15027870


Yes! I was looking for something similar to create a basic database in order to really understand the underlying concepts, but couldn't really find anything.


Sounds like a good project!


Am I completely misunderstanding something here or should (* 3 3 3) be actually 27 instead of 9?

Anyway, nice article, well illustrated, love to read stuff like this!


Oh my god what a glaring mistake! Thank's for pointing it out, I've corrected the article.


Using a Lisp-like language is great for creating a new language because of the S-expressions that Lisp depends on.

S-expressions are very compact and are easy to parse. There was actually some support for using S-expressions over XML and JSON as a serialization language.[1]

Ron Rivest even proposed S-expressions as a standard.[2]

I still prefer JSON over them though. As the old joke goes, LISP stands for "Lots of Irritating Silly Parentheses."

1 - http://eli.thegreenplace.net/2012/03/04/some-thoughts-on-jso...

2 - http://people.csail.mit.edu/rivest/Sexp.txt


After grokking the tokenisation, parsing and eval of lisp like languages, what's the next step up?

I usually have a bit of fun mocking out a language I'd like but end up getting into details like type systems. I never really get to expressing as a CFG.

I'm wondering if the best course is to say right, keys now build a small, procedural language or some kind of barely typed C lookalike as the best step up?


At a glance, you could make a very unsafe not-really C without too much thought. I probably wouldn't make it too C like, just so I could avoid worrying about parsing etc.

As to type systems, that's a completely different can of worms (One which I do not have much expertise in). Type systems can range from the simple e.g C to the formal study of type systems (e.g. Hindley-Milner). Personally, I don't really know the formal side very well (I can't recall details from memory the same way I can with (say) Lagrangian mechanics), so I can't recommend one particular book.


I found these resources invaluable when implementing my first type system:

- Understanding Hindley-Milner http://akgupta.ca/blog/2013/05/14/so-you-still-dont-understa... - Implementing a basic type system http://cs.brown.edu/courses/cs173/2012/book/types.html


That first link was awesome. The notation can be intimidating but this article has defogged it. Thank you!

I'm gonna check out the next article and return to the first. Deserves a second reading.


Why not give Pierce’s “Types and programming languages” a spin?


I agree, for sufficiently large values of step. This book could change your life.

But I think the C4 self-hosted C compiler (discussed at [1]) might be another next step in a different direction.

[1] https://news.ycombinator.com/item?id=8558822


This intuitively makes a lot of sense. The short variable names with the legend at the top makes it a little harder to understand but you can see what's going on from a macroscopic view. Perhaps translating this to another language with clearer variable names would be a good exercise for me.


I've been working through this book. It's difficult to get through, I'll admit, at my stage. Is there anything you'd recommend as a supplement?


Try making an interpreter for a simple language like QBASIC.


quote from article: "As a side note: if Lel ever becomes popular enough that people argue about tabs vs spaces, or if function bodies belong on a newline always, that will absolutely make my life."

That unexpectedly made me LOL.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: