Starting out, part 2: Now We Interpret

Through the fire and flames trial and error

Well, rather unsurprisingly in hindsight, when I actually started writing the compiler and had to think about everything in much more detail, the cracks started to show. Nevertheless, I pushed on and wrote a simple compiler. Recursive descent and quick glue to let LLVM handle most of the rest.

My AST was a mess, the codegen did run through LLVM so it was OK, I guess, but there were a lot of things I was quite disappointed with. The first compiler gave me experience, but I ended up scrapping it and rewriting nearly everything but the lexer.

And then I did it again.

At this point I knew I could make this work though, and I dug through a lot of references on the way, so it was a good exercise. Especially with how to handle user-defined operators, since that was an entire mess of its own. Which might be worth a blog post in itself.

The move away from the basic design

The biggest and most overarching change is that I felt the need to embed a virtual machine in the compiler. Which is what this blog post will mostly be talking about.

WAIT, Flower has an interpreter?

Yes. Currently, Flower is pretty much an interpreted language. I said I wanted a systems programming language as the first thing in the previous post. And that is still true. But for now, I have a bytecode interpreter (really a virtual machine) that I use to run nearly all of the code.

How did we get to this?

I knew early on that I needed some way to run code at compile-time to get the metaprogramming features I wanted. Both Zig and Circle, which have some things I like, have an embedded interpreter.

I was originally planning to just run an AST interpreter and add a virtual machine later if necessary. Kind of trying to push it as late as possible. But well, that decision is the reason I scrapped the second iteration of the compiler and started on the third. While adding the virtual machine and custom bytecode this early on does drive the implementation cost up by a lot, I am quite confident it'll be worth it.
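To make the difference between the two approaches concrete, here is a minimal sketch in Python (not Flower). Everything in it is an assumption for illustration: the `Num` and `BinOp` node types, the `PUSH`/`ADD`/`MUL` opcodes and the stack-based design are made up, and Flower's real AST and bytecode look nothing like this. It only shows the general shape: an AST interpreter recurses over the tree, while a bytecode VM first flattens the tree into linear instructions and then runs them in a loop.

```python
from dataclasses import dataclass

# Hypothetical AST node types; Flower's real AST is not shown in this post.
@dataclass
class Num:
    value: int

@dataclass
class BinOp:
    op: str        # "+" or "*"
    left: object
    right: object

def eval_ast(node):
    """Tree-walking interpreter: recurse over the AST directly."""
    if isinstance(node, Num):
        return node.value
    left, right = eval_ast(node.left), eval_ast(node.right)
    return left + right if node.op == "+" else left * right

def compile_ast(node, code=None):
    """Flatten the AST into a linear list of (opcode, operand) pairs."""
    if code is None:
        code = []
    if isinstance(node, Num):
        code.append(("PUSH", node.value))
    else:
        compile_ast(node.left, code)
        compile_ast(node.right, code)
        code.append(("ADD" if node.op == "+" else "MUL", None))
    return code

def run_bytecode(code):
    """Stack-based VM: a plain loop over instructions plus an operand stack."""
    stack = []
    for opcode, operand in code:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack.pop()

expr = BinOp("+", Num(2), BinOp("*", Num(3), Num(4)))  # 2 + 3 * 4
```

Both `eval_ast(expr)` and `run_bytecode(compile_ast(expr))` give the same answer. The extra compile step is where the implementation cost comes from, but the flat instruction list is also something you can inspect, dump and step through.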

First of all, with the virtual machine tacked on this early, I can design a lot of the language features around it. I was going to have it at some point anyway, and it enables a core feature of the language: I can run pretty much arbitrary code at compile-time and modify the program itself with it. That allows me to use it to implement a lot of the other stuff I want to have. It also helped me realise how to structure the compiler and how to handle the different compiler stages a lot better than if I had done it as an afterthought. Not least because the program can modify itself at compile-time, so the order of compilation stages might not be as simple a problem as it usually would be.

Second, it is pretty good at handling debugging -- at least unless the bug is in the VM itself. I can pretty much choose how I want the system to work, and for a hobby language like Flower, enabling good debugging early on makes a lot of things so much easier.

So, no native code?

I still design my IR and the bytecode so it can be passed on to LLVM, QBE or whatever backend. It might not be as simple as just using the AST for that, but still. I've redesigned the bytecode at least once already because I realised I was going way too low-level with it for there to be a reasonable chance to pass it on to LLVM without some serious problems.

So it will be there, just not immediately. Which is a bit of a shame. But since I am quite far from the "make it fast"-part of the "make it work, make it right, make it fast"-mantra, I'm fine with the tradeoff for now.

Although, there is an argument that I'll need the native code compilation for "make it right" and not "make it fast". But it's a problem I'll tackle at a later point in time.

The type system

Another thing that I've overhauled completely a couple of times is my type system. I built it way too complex, to the point of me lampshading it in a chat conversation in January. I had an extremely complex solution to something I admit is a complex problem.

After the second time I tried to wrap my head around it and failed, I decided that if I can't make any sense of it, and I'm the one who wrote it, I probably shouldn't subject other people to it.

Interestingly, plugging in the virtual machine, again, seemed to lead me in the right direction. In the current system, user-defined types are just functions that return a type, and a type is itself a primitive value as far as the virtual machine is concerned. This has proven to be quite elegant and efficient. It does lead to some problems with the syntax though, which I'm currently trying to figure out.
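As a rough analogy for the "types are functions that return a type" idea, here is a small sketch in Python (not Flower syntax, which I haven't shown yet). The `Pair` function and its field names are made up for illustration. Python types are ordinary runtime values, so it can show the shape of the idea; in Flower the equivalent call would be run by the VM at compile-time.

```python
from dataclasses import make_dataclass

def Pair(first_type, second_type):
    """An ordinary function that builds and returns a brand new type."""
    return make_dataclass(
        f"Pair_{first_type.__name__}_{second_type.__name__}",
        [("first", first_type), ("second", second_type)],
    )

# Calling the function yields a type, which can then be used like any other.
IntStrPair = Pair(int, str)
p = IntStrPair(1, "one")
```

Note that nothing here is special-cased for generics: `Pair` takes types as plain arguments, so a "generic type" is just a function call with different arguments.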

How are the other ideas faring?

Well, they are still ideas. But in a sense, plugging in the virtual machine early on clarified how to actualise some of them.

RAII, zero-cost abstractions and functional ideas benefit directly from the choice of plugging in the VM early on. While it might take a while longer for them to become reality, the pieces of the puzzle seem to be falling into place quite neatly now that the base ideas have had time to simmer a bit.

Generics seem pretty straightforward with the ideas already in place, C interop with some caveats already works (thank you libffi) and well, pretty much everything is currently a function in Flower.

Complexity

I was afraid the addition of the VM would make the language so much more complex, but actually, I think it has been the other way around.

There is no need for monstrosities such as the C preprocessor, weird macro tricks, or other stuff like that. It is quite possible (I'd even say likely) that a Flower compiler will not be much harder to implement than a C compiler, at least a standards-conforming one. (Though that is much harder than people realise.)

Also, the "just return a type" is pretty straightforward, and doesn't require too many leaps in thought. It is different, sure, but we'll see how it plays out. I am hopeful that it ends up being simple to wrap your head around as well.

What's next?

Before I start going into detail about some of the things I glossed over in this overview so far - and start showing more code - I still want to write a conclusion to this Flower overview trilogy, where I'll talk about the wild ideas for the far future, what the current roadmap is like, and how likely I feel it is that I ever get there.

Posted 2022-10-06 in overview design