Developing GHC for a Living: Interview with Vladislav Zavialov

Approximately a year ago, I wrote an article with a bold title: “Why Dependent Haskell is the Future of Software Development”. Here is its quick summary: Haskell is nice. Dependent types are also nice. Haskell with dependent types would be doubly nice. We at Serokell are doing our part to make this happen.

And a year later, I’m happy to report that I’ve made some good progress in this direction, with help and guidance from Simon Peyton Jones and Richard Eisenberg. But for me this is more than just work, it’s a personal journey. I always found compiler engineering interesting, and it’s extremely fulfilling to work on something I like.

That’s why I’m eager to share it with everyone who would listen, or, even better, get involved. For example, I put together https://ghc.dev/, a cheatsheet to help anyone get started with contributing to GHC. So if you ever feel like adding a GHC feature, especially one that is related to dependent types, and don’t know where to get started, one option is to start by dropping me a message on Twitter.

But this line of work is not trivial. Like other areas of software engineering, working on a compiler is challenging and requires thinking deeply about hard problems. That’s why it helps immensely to have the opportunity to immerse yourself into the problem space, to work on this full time at Serokell. What follows is a small dive into the thoughts and opinions of a randomly sampled GHC contributor, i.e. me :-) Read on if you’re interested in what sort of person ends up working on a Haskell compiler and to see some of the latest results of improving GHC.

– What does your work on GHC involve? What did you do in 2019 and what are your plans for 2020?

My GHC activities fall into two categories: social and technical. The social activities are about discussing language design, and going through the GHC proposal process to get my preferred design decisions accepted. In that regard, I (co-)wrote several proposals in 2019:

In general, I’m trying to nudge Haskell in a direction that brings it closer Dependent Haskell. I have outlined my vision in a meta-proposal, #236 Grand Class Unification. I also routinely participate in the discussion of proposals by other people, as long as they are relevant to my work. The technical activities involve implementing the designs that were accepted by the community and also internal improvements. That’s the part I prefer because I’m first and foremost a software engineer, not a community manager. Here are the things I managed to get upstream in 2019:

  1. GHC’s parser can now report multiple parse errors.
  2. The “happy” parser generator (used by GHC) now handles higher-rank non-terminals correctly. Also, shift/reduce conflicts can be resolved with a new %shiftpragma.
  3. forall is now always a keyword in types.
  4. (.) is now a valid type operator.
  5. A new parser architecture with a late validation step, applied to disambiguate between arrow commands, expressions, and patterns.
  6. Identical treatment for implicitly quantified type and kind variables.
  7. Guard CUSKs (complete user-supplied kind signatures) behind a language pragma.
  8. A new Template Haskell function, reifyType.
  9. Standalone kind signatures, a new feature that allows one to write kind signatures for data types, classes, type synonyms, and type families.
  10. Whitespace-based lexical rules for (!), (~), ($), ($$), and (@).
  11. Meaning-preserving parsing rules for {-# SCC ... #-} annotations.
  12. Conservative parenthesization of the * kind syntax in the pretty-printer.

Also, I made a few minor improvements to the test suite, removed dead code here and there, and did other small refactorings of the “while I’m in town” nature. You can scroll through my commits to see the exact changes: https://github.com/ghc/ghc/commits?author=int-index

As to the plans for 2020, I was happy to learn that Artem Kuztensov would join my team on a permanent basis. My most optimistic estimates are that in the third quarter of the year we will be done with the term/type unification in the parser and the renamer, and then we can focus on the type checker.

– Is there a dream project that you want to work on someday?

I’m an idealist, and I believe that a perfect programming language with perfect tooling (or, at least, Pareto-optimal language and tooling) can exist. This is in contrast to people who just see languages as tools to get the job done. For them, a language is probably as good as the libraries available for their tasks. For me, a programming language is an instrument of thought, and I’d like this instrument to be as versatile as possible. But, of course, I dare not claim to know what a perfect language would look like.

At this stage of my life, I’m just trying to figure out what makes a language good. It’s quite clear that static types are important to guarantee correctness. And it’s also clear that the type system must be very powerful (including dependent types) to express any interesting properties. However, it is less clear how to design a type system that not only has all the cool features, but also an efficient type checker. There are also many other concerns besides correctness, such as runtime performance, distributed computation, modularity, and so on.

That’s why I enjoy working on GHC so much. Not only I’m contributing to an important open source project, but I’m also constantly learning about the trade-offs of language design and production grade compiler engineering techniques. That’s also why my other project is a structure-based code editor: the goal is to find out if a new generation of programming languages can provide a better user experience by abandoning some core assumptions (e.g. that code is text), and what exactly this user experience should be.

My hope is that in a few decades this will culminate in a clear design document for the programming environment of the future. So, to summarize, my dream project is to revolutionize programming, to enable orders-of-magnitude higher developer productivity by means of superior language design and superior tooling.

– Does GHC need more consensus or technical talent to advance further?

GHC finds itself in a remarkably difficult position. Haskell is used by people with very different background and goals:

  • Teaching. Haskell is the natural starting point to teach functional programming to students. Algebraic data types, pattern matching, pure functions, higher order functions, lazy evaluation, and parametric polymorphism, are all very important concepts, and Haskell has them all. And you know the one thing that teachers dislike? It’s when their teaching materials go out of date and they can’t point students to a rich body of literature because all of it is subtly incorrect due to recent breaking changes.

  • Industry. Haskell is memory safe, compiles to native code, offers a good threaded runtime, and has a powerful type system. It encourages immutable data and pure functions. It makes it easy to express complicated control flow using lazy evaluation and higher order functions. Michael Snoyman goes into more detail in his article “What Makes Haskell Unique”. All in all, the features of Haskell make it a great choice for writing applications. And you know the one thing that industrial users dislike? It’s when their 100k LOC codebase takes ages to compile.

  • Research. Haskell attracts the sort of people who are open-minded about new features, even if there are sometimes loud opposing voices. Today’s dialect of Haskell implemented by GHC is light years ahead of standardized Haskell, featuring GADTs, data type promotion, type families, template metaprogramming, various deriving mechanisms, unboxed data types, multi-parameter type classes with functional dependencies, higher rank universal quantification, and more. Many of those features have imperfect design and imperfect implementation, but such is the nature of research. And mark my words, sooner or later Haskell will also get dependent types, linear types, row polymorphism, refinement types, and whatnot. And you know the one thing that researches dislike? It’s backwards compatibility, which makes it extremely hard to design and implement new features.

How can GHC developers balance these concerns? How do we add new features without breaking the teaching materials too much and without making the compiler too slow? For that, we need to be inventive. We must find solutions that are good for the community as a whole. And that’s why the GHC proposal process exists. Sure enough, it helps when more people are coming up with new ideas and find flaws in the existing ones. I have learned a lot just by reading the discussions there.

On the other hand, it puts proposal authors under a lot of stress. It is hard to present a feature in a compelling way. Sometimes good ideas get abandoned because their authors are not good at selling those ideas.

– Would Haskell benefit from a new standard?

Good question. As I said, GHC’s dialect of Haskell is much more advanced than the language standard. Why don’t we incorporate all of these features into the standard? As I also said, these features have imperfect design and imperfect implementation. It would be a shame to standardize something flawed. And I don’t think it would serve any purpose either. If there were competing Haskell implementations, then the standard could be useful to describe the portable subset of the various dialects. But there are no competing Haskell implementations, and there’s no market for them. The effort is better spent on improving GHC. Therefore, I don’t believe there’s a need for a new standard.

– Do you think that Haskell is closest to the perfect language that is out there right now?

There are definitely languages that are better than Haskell in some particular aspects. For example, PureScript has row polymorphism, which Haskell does not have. Agda has dependent types, and Haskell has not caught up yet. Mercury has linear types. However, I’m not aware of a language that would be better than Haskell in many ways rather than a few specific ones. So I’d say that overall, Haskell is the best general purpose language out there at the moment, but it must evolve continually to stay in this spot, or else it will be outcompeted.

––––––

Vladislav Zavialov is our Haskell developer and GHC contributor. If you are interested in other Vlad’s materials, you can read his article about Dependent Haskell and follow him on Twitter. Thank you, Vlad, for such a great interview and for working with us!

Haskell courses by Serokell
More from Serokell
Optimizing K FrameworkOptimizing K Framework
What Remote Workers Do in Their Free Time. Part IWhat Remote Workers Do in Their Free Time. Part I
bringing fun to fp interviewbringing fun to fp interview