I Wrote a Compiler (blog.singleton.io)
68 points by ingve 3 days ago | 34 comments
WalterBright 2 hours ago [-]
> It’s possible to write the lexer and parser entirely by hand

I write mine all by hand. It's the easiest part of a compiler to write, by far. It's also the least troublesome.

One advantage of doing them by hand is that better, more targeted error messages are easier to fold in.

sph 1 hour ago [-]
> I write mine all by hand. It's the easiest part of a compiler to write, by far. It's also the least troublesome.

It's also the most annoying if you're writing a new language. You want to iterate on its ideas, but can't do so until you have a parser done.

I've been designing a few language concepts over the past year, and it feels like 80% of this time has been spent writing and debugging parsers; by the time I get to the meat of language design (the shape of its AST, the semantics of it) any small syntactic change means going back to update the lexer and parser stages. Doesn't help that I can't settle on a syntax.

BTW I first started with PEGs, which are nice in theory, but I find the separation of the lexing and parsing stages very helpful for reducing boilerplate (handling whitespace in PEG is obnoxious). Later, I hand-wrote my parsers (in C), but it's gotten so repetitive that I've dedicated a weekend to just learning lex/yacc (actually flex/bison). Even if parsers are easy to write, it's good to have higher-level tools to reduce the tedium of it.

fjfaase 21 minutes ago [-]
There are other tools for prototyping the grammar for a new language. There are several online tools that can interactively parse input based on a given grammar. The one I developed is: https://fransfaase.github.io/MCH2022ParserWorkshop/IParseStu... The page does contain a sample grammar and sample input. For the grammar for the grammar see: https://fransfaase.github.io/MCH2022ParserWorkshop/index.htm...
fredrikholm 1 hour ago [-]
I suspect that this is the more common opinion, especially when the desired outcome is real world use.

Recursive descent is surprisingly ergonomic and clean if one gets the heuristics right. Personally I find it way easier than writing BNF and its derivatives, where you easily get into tricky edge cases, slow performance and opaque errors.
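
To make the recursive-descent idea concrete, here is a minimal sketch in Python, assuming a toy grammar of integers, `+ - * /` and parentheses. It is not code from the article; the names `tokenize`, `Parser`, `ParseError` and `evaluate` are made up for illustration, and it evaluates the expression directly instead of building an AST to stay short. Each grammar rule becomes one method, and the error messages can say exactly what was expected and where.

```python
class ParseError(Exception):
    """Carries a message naming the offending token and its column."""


def tokenize(text):
    """Split text into (kind, value, offset) tuples, dropping whitespace."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            start = i
            while i < len(text) and text[i].isdigit():
                i += 1
            tokens.append(("NUM", text[start:i], start))
        elif ch in "+-*/()":
            tokens.append((ch, ch, i))
            i += 1
        else:
            raise ParseError(f"unexpected character {ch!r} at column {i + 1}")
    tokens.append(("EOF", "", len(text)))
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expect(self, kind, context):
        tok = self.peek()
        if tok[0] != kind:
            found = tok[1] or "end of input"
            raise ParseError(
                f"expected {kind!r} {context}, found {found!r} at column {tok[2] + 1}"
            )
        return self.advance()

    # expr := term (("+" | "-") term)*
    def expr(self):
        value = self.term()
        while self.peek()[0] in ("+", "-"):
            op = self.advance()[0]
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    # term := factor (("*" | "/") factor)*
    def term(self):
        value = self.factor()
        while self.peek()[0] in ("*", "/"):
            op = self.advance()[0]
            rhs = self.factor()
            # Integer division, an assumption made for this toy grammar.
            value = value * rhs if op == "*" else value // rhs
        return value

    # factor := NUM | "(" expr ")"
    def factor(self):
        tok = self.peek()
        if tok[0] == "NUM":
            return int(self.advance()[1])
        if tok[0] == "(":
            self.advance()
            value = self.expr()
            self.expect(")", "to close '('")
            return value
        found = tok[1] or "end of input"
        raise ParseError(
            f"expected a number or '(', found {found!r} at column {tok[2] + 1}"
        )


def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    parser.expect("EOF", "after expression")
    return value
```

Calling `evaluate("(1 + 2) * 3")` walks `expr`, `term` and `factor` in precedence order, while an input such as `"(1 + 2"` raises a `ParseError` that mentions the unclosed parenthesis, which is the kind of targeted message a generated parser tends to make harder.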

WoodenChair 5 hours ago [-]
This is very similar to the project I have in Chapter 2 of my new book Computer Science from Scratch [0]. It's also Tiny BASIC without INPUT. I called it NanoBASIC. But it's an interpreter not a compiler. This tutorial is a nice starting point. The chapter is much more comprehensive, so if you want to get into the weeds, I can recommend my own chapter (of course). But it's in Python, not Go. The code is on GitHub[1]. But this tutorial is great too.

0: https://nostarch.com/computer-science-from-scratch

1: https://github.com/davecom/ComputerScienceFromScratch

npalli 8 hours ago [-]
>The original authors of yacc were Mike Lesk and Eric Schmidt - yes that Eric Schmidt.

Incorrect, they were authors of lex. yacc was authored by Stephen Johnson.

What surprises me is that all the authors are still around, even though the tools are over 50 years old! Shows how young the computer science field is.

musicale 4 hours ago [-]
Who coauthored "The C Programming Language" anyway? Oh right, prof. Kernighan.

https://www.cs.princeton.edu/~bwk/

tapirl 2 hours ago [-]
> ... yes that Eric Schmidt.

Not the one who was Google's CEO, right?

orthoxerox 3 minutes ago [-]
yes, that Eric Schmidt.
azhenley 8 hours ago [-]
TinyBASIC is fun and beautifully simple. I wrote a 3-part tutorial for making a TinyBASIC-to-C compiler using Python a few years ago.

Let’s make a Teeny Tiny compiler https://austinhenley.com/blog/teenytinycompiler1.html

tuveson 7 hours ago [-]
I know BASIC is kind of a “bad” language, but there’s something so delightful about it. If we’re plugging TinyBASIC projects that others might find interesting, I made an MMO TinyBASIC REPL the other day: http://10klob.com/
musicale 4 hours ago [-]
BASIC is an amazing language that computing novices (including humanities majors) could learn in an afternoon, that could be efficiently compiled or compactly interpreted, that was small enough to support dozens of interactive users on a mainframe or minicomputer, or to fit into a tiny 8-bit microcomputer – and yet was largely equivalent to FORTRAN in terms of its expressive power.

I think the closest modern equivalents might be Python (for easy onramp and scalability from microcontrollers to supercomputers) and JavaScript (for pure ubiquity in every device with a web browser.)

I wonder if there is a modern-ish (?) environment that can match Visual BASIC in terms of easy GUI app programming. Perhaps Python or Tcl with Tk (Qt seems harder) or maybe Delphi, or perhaps a modern Smalltalk.

pjmlp 3 hours ago [-]
Delphi, and naturally Visual Basic for .NET with Windows Forms, not forgetting C#; however, it has been suffering from a bit too much featuritis lately, and is most likely not what the BASIC target audience would like.
andsoitis 3 hours ago [-]
Delphi for sure. And while you have to run it on Windows, it can create binaries for Windows, macOS, Linux, and mobile.

https://www.embarcadero.com/products/delphi

pjmlp 3 hours ago [-]
People too often complain about original BASIC, and forget that most dialects moved away from line numbers and spaghetti GOTOs during the 16-bit days, with the widespread adoption of compilers and structured constructs.

I am really glad that I only got to learn C after getting through Turbo Basic, Quick Basic, and Turbo Pascal[0], doing exactly the same kind of stuff that urban myths say was only possible after C came to be.

[0] - On 16-bit systems; I started coding on an 8-bit Timex 2068.

fjfaase 3 hours ago [-]
Since 1990, I have developed software with C and later C++. Now that I am working on a C compiler, I am learning new things about the language. So, writing a compiler (or an interpreter) can really help to get a deep understanding of a programming language.
bastawhiz 8 hours ago [-]
>The original authors of yacc were Mike Lesk and Eric Schmidt - yes that Eric Schmidt.

I don't know if it's worth mentioning, but the author of the post is David Singleton, the former CTO of Stripe. I almost hadn't noticed until I saw the domain.

refulgentis 8 hours ago [-]
I worked ~4 layers underneath him when he led Android Wear at Google, and every year or two that happens to me, and it puts a smile on my face. Gotta have love of the game to do this at that level.

IIRC, and man, maybe I'm making it up, but, lore was he always made time on a regular schedule to hack.

Usually 1 layer from the bottom isn't coding so much anymore.

(oddly, I didn't realize he was *CTO* of Stripe until a few months back, when his new thing with Hugo Barra was announced)

kristopolous 38 minutes ago [-]
nice to see that the red dragon book is still a thing. I've heard people have stopped using the wizard book.
musicale 4 hours ago [-]
> Yes, this is what I do for fun.

Don't we all? ;-)

matthewmueller 7 hours ago [-]
Love reading these. Keep these blog posts coming!
teo_zero 3 hours ago [-]
Wait! What are == and != doing in a BASIC language? Heresy! :)
dps 3 hours ago [-]
Yeah, I really should have included <> :-)

Fun to see this post from the deep archive get some interest - thanks for reading!

TMWNN 8 hours ago [-]
I thought a compiler, with no adjective or caveat, should turn a HLL into machine language. Isn't what this describes—turning BASIC into Go—more accurately described as a "pseudocompiler" or "Go compiler" or somesuch? I know Emacs is always said to have a "bytecode compiler" that processes Elisp code, not a "compiler" per se. Am I mistaken?
vrighter 1 hour ago [-]
A compiler is a language translator. To me it makes no difference whether it generates machine code (one type of language, machine executable), asm (equivalent language but which needs additional preprocessing into machine code), or some other language (which also needs additional preprocessing into machine code, but the tool for this is another compiler). Or it might output Java bytecode or MSIL, which doesn't even target a real, physical machine at all.

They translate one language into another. The line between compiler/transpiler just doesn't make sense to me.

ethan_smith 1 hour ago [-]
A compiler is any program that translates from one language to another (source-to-source, source-to-bytecode, or source-to-machine code), so translating BASIC to Go is indeed a proper compiler, just as GCC translating C to LLVM IR before machine code is still a compiler.
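
As a rough illustration of "translating one language into another", here is a minimal sketch in Python that turns a tiny BASIC subset into Go source text. It is not the article's implementation: the function name `basic_to_go` is hypothetical, only `LET` and `PRINT` are handled, variables are assumed to be single uppercase letters holding integers, and expressions are passed through unchanged on the assumption that they are also valid Go.

```python
import re


def basic_to_go(source):
    """Translate a tiny BASIC subset (LET, PRINT) into Go source text."""
    variables = []  # variable names, in order of first assignment
    body = []       # Go statements for func main
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Drop an optional leading line number such as "10 ".
        line = re.sub(r"^\d+\s+", "", line)
        let = re.fullmatch(r"LET\s+([A-Z])\s*=\s*(.+)", line)
        prt = re.fullmatch(r"PRINT\s+(.+)", line)
        if let:
            name, expr = let.groups()
            if name not in variables:
                variables.append(name)
            body.append(f"\t{name} = {expr}")
        elif prt:
            body.append(f"\tfmt.Println({prt.group(1)})")
        else:
            raise ValueError(f"unsupported statement: {line!r}")

    out = ["package main", ""]
    # Go rejects unused imports, so only import fmt when PRINT was used.
    if any("fmt.Println(" in stmt for stmt in body):
        out += ['import "fmt"', ""]
    # Package-level declarations: Go rejects unused locals, but not unused globals.
    out += [f"var {name} int" for name in variables]
    out += ["", "func main() {"] + body + ["}"]
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(basic_to_go("10 LET A = 2\n20 PRINT A * 21\n"))
```

The output is just text that still has to go through the Go toolchain, which is the point of the comment: the front end does the same work whether the back end emits Go, C, bytecode, or machine code.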
tuveson 7 hours ago [-]
This kind of question winds up being the CS equivalent of “is a hotdog a sandwich”. I agree that transpiler is a more accurate term for it and that a hotdog is not a sandwich. But there are lots of languages that start life as compile-to-C things. Many compiled languages today output LLVM IR which is not machine language. Similarly people would probably call javac a compiler, even though it outputs bytecode.
fao_ 8 hours ago [-]
Strictly speaking it's a transpiler, but honestly the delta between the target language (Go) and the source language (BASIC) is very fluffy and woolly. From what I remember from my PL theory days, the distinction was always fuzzy enough that people used whatever term felt right to them.

An example off the top of my head: Chicken Scheme (call-cc.org) calls itself a compiler, but its target language is C.

shakna 1 hour ago [-]
"Transpile" is a shortening of the older term "trans-compile". [0]

It's a subset. All transpilers are compilers. Not all compilers are transpilers.

[0] Amiga BASIC called itself a transcompiler, from memory.

pxc 8 hours ago [-]
The standard term for this kind of compiler is "transpiler", afaik.

Here's the Wikipedia page for such things, which also taught me several other names for them:

https://en.m.wikipedia.org/wiki/Source-to-source_compiler

kragen 4 hours ago [-]
The standard term for "transpiler" is "compiler", though.
meisel 8 hours ago [-]
What would you call TypeScript’s tsc, which translates TS to JS? Microsoft would say it’s a compiler: https://code.visualstudio.com/docs/typescript/typescript-com...
andsoitis 3 hours ago [-]
They do, but that article also mixes “transpile” and “compile” often enough that it is near impossible to deduce what different meanings they might ascribe.
ratmice 8 hours ago [-]
So, if he had invoked go for you, would it be a compiler? Another definition is that it translates a source language into a target language.