In this month’s episode of Functional Futures, our guest is David Christiansen, the Executive Director of the Haskell Foundation, a contributor to a number of dependently typed languages, and a dependent types advocate who has introduced many people to the topic through his work, talks, and books.
We cover topics such as dependent types, theorem proving, metaprogramming, and many more. We also discuss the book he co-authored with Daniel P. Friedman, The Little Typer, and his current work in progress: Functional Programming in Lean.
You can watch the episode on YouTube or via your favorite podcast app.
Below is a lightly edited transcript of the episode.
Dependent types
Jonn: How would you define dependent types?
David: Dependent types are types that can depend on values. This is the usual way that it’s talked about.
So a way to think about it is that in languages like Haskell and Java, you can have a type which takes another type as an argument. So you might have a type called List, but List isn’t really a type until you tell it what the type of the things in the list is. So you could say List of Int or List of String or List of List of List of List of Int.
And this process of applying the type to another type gives you a type. So people will often talk about the argument type being like a parameter to the type. And in a dependently typed language, you can have parameters that are not types themselves.
So you could say, I have a List of 5 Ints, or I have a List of 20 Ints, or I have a List of 0 Ints. And you could say that the type of the appending operator that appends two lists isn’t List a to List a to List a, but it’s instead List of n a’s to List of k a’s to List of (n + k) a’s. And that’s valid for any natural numbers or any lengths n and k.
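As an illustration, here is a minimal sketch of such a length-indexed list and its append in Haskell, using GADTs and DataKinds to approximate what a fully dependently typed language provides directly (the names here are ours, not from the episode):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}

-- Natural numbers, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- A list whose length is part of its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Type-level addition, defined by recursion on the first argument.
type family (n :: Nat) + (k :: Nat) :: Nat where
  'Z   + k = k
  'S n + k = 'S (n + k)

-- Appending a length-n list to a length-k list yields length n + k,
-- for any lengths n and k.
append :: Vec n a -> Vec k a -> Vec (n + k) a
append VNil         ys = ys
append (VCons x xs) ys = VCons x (append xs ys)
```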
Dependent types are interesting from a number of perspectives. I think one of them is that they allow us to write very precise types down, another is that they allow us to write very flexible types down. So you could say that, you know, depending on the value of this Boolean that the user generates as input to the program, the answer is gonna be either an Int or a String, you know, the caller picks which thing I return.
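A minimal Haskell sketch of that Boolean example, borrowing the singletons-style encoding (SBool and Result are illustrative names, not from the episode):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

import Data.Kind (Type)

-- A singleton Boolean: matching on the constructor tells the
-- type checker which Boolean we have at runtime.
data SBool (b :: Bool) where
  STrue  :: SBool 'True
  SFalse :: SBool 'False

-- The result type is computed from the Boolean.
type family Result (b :: Bool) :: Type where
  Result 'True  = Int
  Result 'False = String

-- The caller's choice of Boolean picks the return type.
answer :: SBool b -> Result b
answer STrue  = 42
answer SFalse = "forty-two"
```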
That particular case isn’t the most useful thing, but you can build very useful things on top of that. So you could, for example, have a representation of a database schema as a data type, and from the representation of the schema, you could derive a type which represents the result of a query against that schema, and then arrange your database library to take care of all this stuff.
And aside from being sort of both more precise and more flexible than non-dependent types, dependent types are also interesting because, depending on the specifics of the system you’re building them in, they allow your type system to be used as a fully fledged logic for proving things about your program.
So on the one hand, you know, we like to have precise types that encode various invariants about our data. But on the other hand, sometimes we also just wanna sit down and crank out a mathematical proof that says that a program lives up to its specification, whatever that might be. And some people are also wanting to use the logic of the type theory just to do ordinary mathematics and not even sort of program verification tasks.
I’m an absolute garbage mathematician, so I don’t use them in that way. So I have very little to say about that, but it does exist.
Jonn: In your explanation of what dependent types are, you provided the iconic example, I think: a list indexed by length. Is it your favorite way to show off dependent types to beginners?
Or is there something else you would show? Because, you know, your other example with database schemas is very difficult for a person to envision if they don’t have a background in encoding something like this. Do you have something in the middle that you reach for to explain dependent types?
David: First, a little bit of defense for the length-indexed list. I went through a period of time where I would play length-indexed list bingo in a talk, you know, and mark off on my card whenever someone got to that example, cause everyone uses that example. And it’s kind of not the most interesting example in the world.
So I absolutely get the frustration with it. But on the other hand, it’s a classic example because it has a little bit of every kind of complexity you get with a data type. One of its arguments, the type of the elements of the list, stays the same no matter how long the list is; the other one, the number, changes.
So it’s got, you know, a parameter and an index, as they say in the business, and it has the ability to drive some interesting programmer-user-interface automation stuff while staying simple. So it’s actually a very good example in a lot of ways, but from a practical perspective – yeah, I get it.
The second example I like to use is having one data type that represents the types of some little language that you’re implementing. We have a data type, let’s call it T: it’ll have one constructor called Nat, which represents the numbers type, another constructor called Bool, which represents booleans in the language we’re implementing, and then it might have a data constructor called Arrow, which represents the function type.
Then we can define a type of expressions which are parameterized over a list of these types, which gives you the typing context and the type that the expression actually has, such that you cannot write down an expression that has the wrong type, and you cannot write down a variable that isn’t in scope.
And this is certainly an example that’s only compelling for people who think that lambda calculus is the easy thing they understand that they want to use to understand the new system. If you’re still learning lambda calculus, it’s not a good example, so it’s certainly gonna depend on the audience.
But for people who know that, you can then proceed to define a type of values which are also indexed by the type. And then you can write a little interpreter. And it turns out that you can write a function from the representation of your object-language types – you know, the Bool, the Nat, and the Arrow – to the actual types of your dependently typed programming language.
So you can have a function that sort of case splits on that data type. And for the Bool constructor, it returns the actual type of Booleans. For the Nat constructor, it returns the actual type of natural numbers. And for the Arrow constructor, it returns an actual function type whose argument and return types are found by recursion.
And then you can write a function which says, you know, given a suitable environment, so I can look up my variable, I’m gonna produce something with the right type. And it’s just a really cool example showing how you can – because, on the one hand, you’ve implemented a programming language, but on the other hand, you’ve also carved out some selection of the surrounding language that you’re working in. And that’s a really powerful tool to use for writing dependently typed programs.
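Here is a condensed Haskell approximation of that whole example (illustrative names; in a fully dependently typed language, Interp below would be an ordinary function from type codes to types, while Haskell needs a type family):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)

-- Codes for the object language's types.
data T = TNat | TBool | TArrow T T

-- Mapping codes to actual host-language types.
type family Interp (t :: T) :: Type where
  Interp 'TNat         = Int
  Interp 'TBool        = Bool
  Interp ('TArrow a b) = Interp a -> Interp b

-- A variable is evidence that its type occurs in the context,
-- so out-of-scope variables are unrepresentable.
data Elem (ctx :: [T]) (t :: T) where
  Here  :: Elem (t ': ctx) t
  There :: Elem ctx t -> Elem (s ': ctx) t

-- Expressions indexed by a typing context and a result type:
-- ill-typed terms are unrepresentable.
data Expr (ctx :: [T]) (t :: T) where
  NatLit  :: Int -> Expr ctx 'TNat
  BoolLit :: Bool -> Expr ctx 'TBool
  Var     :: Elem ctx t -> Expr ctx t
  Lam     :: Expr (a ': ctx) b -> Expr ctx ('TArrow a b)
  App     :: Expr ctx ('TArrow a b) -> Expr ctx a -> Expr ctx b

-- An environment carries an interpreted value for each context entry.
data Env (ctx :: [T]) where
  ENil  :: Env '[]
  ECons :: Interp t -> Env ctx -> Env (t ': ctx)

lookupEnv :: Elem ctx t -> Env ctx -> Interp t
lookupEnv Here      (ECons v _)  = v
lookupEnv (There i) (ECons _ vs) = lookupEnv i vs

-- The interpreter: given a suitable environment, it produces a
-- value of the right type.
eval :: Env ctx -> Expr ctx t -> Interp t
eval _   (NatLit n)  = n
eval _   (BoolLit b) = b
eval env (Var i)     = lookupEnv i env
eval env (Lam body)  = \x -> eval (ECons x env) body
eval env (App f x)   = eval env f (eval env x)
```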
Jonn: That’s really cool. And also, we have so many blog posts already about writing embedded DSLs in Haskell, for example. I think it’s very relatable, and these blog posts are out there for a reason, right? We write a lot of embedded DSLs, and it’s very nice to have these sorts of advanced facilities while we encode stuff like this.
You already mentioned that you aren’t an evangelist and don’t think that a hundred percent of programs should be written in a dependently typed way. But I’ll ask this question anyway, because it’s a very interesting question for me personally, as we at Serokell, and a lot of people in the FP industry, are looking into industrializing dependently typed systems. We, as a community, know really well how to sell well-typed systems to businesses.
David: So here you’re thinking of speedy refactoring and certain kinds of tests you don’t have to write anymore? This kind of stuff?
Jonn: Yeah. Stuff like this. And then, you know, our sales can have a very nice, like, you know, bingo card that they can reverse engineer into a sales pitch.
Do you think there’s a similar methodology for dependently typed systems, and can there even be one, given, as you mentioned, how different dependently typed systems have different implementation trade-offs, and stuff like this? What are your thoughts on this?
David: I mean, different programming languages always have different trade-offs, right?
Haskell and OCaml have pretty different implementation trade-offs. And there’s certainly things where I’d rather use OCaml and things where I’d rather use Haskell, and things where I’d rather use neither of them because a garbage collector would get in the way. And then I might look at Rust or C or something like that.
There’s no one programming language that is gonna solve every problem in the best way. And this is probably gonna reflect my complete lack of experience in sales, but if I’m gonna try to sell something to someone, first I want to understand what their needs actually are and find a way to actually meet them.
So, one context where I’ve used some dependently-typed-like features of Haskell at work – we had a large verification system, and we really wanted to avoid certain kinds of bugs. And so there, maintaining invariants through our data types was a pretty good argument. In other contexts, I think that the ability to do more interesting things with embedded DSLs could be a good way to sell something if that was relevant for the person.
I think that being able to model aspects of a program that you can’t, at least conveniently, model without dependent types could be an interesting thing. But for me, it would really depend entirely on who I’m talking to, and what their problems are, and seeing whether the dependent types actually do solve their problems.
And, you know, I don’t want to talk down the thing that I love because it is a thing that I love, but I also think it’s important to not try to – just because I’ve got this screwdriver that I’m a big fan of, you know, I don’t wanna use it to start pounding in nails. It’s not right for everything, and it’s gonna be right for more things as we learn more as a community, as researchers make more cool stuff, as implementations get better, as we get more resources invested in high-quality implementations and high-quality instructional materials and training, you know.
Today, the big thing I would worry about is: where do you find people who can maintain this? And that’s a real thing to worry about, because, you know, it’s certainly much more common than it used to be for people to have an idea of what dependent types are, or to have maybe done some dabbling with Agda, or an undergraduate verification course that used Software Foundations.
Like these things exist a lot more than they used to, but it’s still a pretty niche skill. So, you know, just because something isn’t super useful for all industrial problems yet, it doesn’t mean it’ll never be, it doesn’t mean it doesn’t have value. It doesn’t mean it isn’t wonderful and beautiful, and fantastic. But just because something is wonderful, beautiful, and fantastic, it doesn’t mean that we should necessarily inflict it on everyone around us at all times.
Jonn: My huge hypothesis is that – you mentioned database schemas, and the way you were describing this stuff made me think about Haskell’s servant library and its family. It even does a little bit of dependently-typed-ish things, right? With type-level strings and stuff, generating implementations for APIs automatically, and whatnot.
So from talking to you, I would say that maybe, you know, someone could port servant to some dependently typed system and then kind of sell it for high-assurance web servers. Because we sometimes really need to respond to certain requests. Like, for example, if we’re providing some emergency API for physical emergencies and stuff like this. Or maybe at our nuclear power plant the sensors are going off, and we really want to make sure that a particular type of sensor would trigger a particular type of response. Stuff like this.
So I think that whoever implements servant in Lean will at least be at the head of a project worth, like, hundreds of thousands of dollars.
David: Yeah. So, I haven’t ever used servant. My industrial Haskell experience has mostly been on either things that talk to text editors over a local socket or command line batch mode stuff.
And so I’ve read about servant, but I don’t have concrete experience. But based on everything I’ve seen, it’s basically using dependent types. I mean, based on the definition I gave earlier, which is what people would usually say – like full-spectrum dependent types, you know, types are first-class things: you can return them from functions, they can be passed to functions, they can be computed with.
We have a lot of that in Haskell, and we get more and more. Like, as soon as we got GADTs, you know, generalized algebraic data types. For those who may not know what that is, it’s where you have a data type in Haskell that takes a type argument, and then the choice of constructor can cause that type argument to have a certain value.
And as soon as you do that, you have a runtime value, which is that data constructor, which is determining a type. So that is one form of dependent types. Depending on exactly why you want to say whether a thing has dependent types or not – like, maybe you wanna do some mathematics and then you need a specific thing, and that’s great, you know, be specific with your definitions – but when it comes to the kind of things you can achieve with it, Haskell for a decade now has been able to do a lot of what you can do with many dependently typed systems.
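A minimal sketch of that in Haskell (illustrative, with a made-up Rep type):

```haskell
{-# LANGUAGE GADTs #-}

-- The constructor we match on refines the type variable:
-- a runtime value determines a compile-time type.
data Rep a where
  IntRep  :: Rep Int
  BoolRep :: Rep Bool

defaultVal :: Rep a -> a
defaultVal IntRep  = 0      -- here the checker knows a ~ Int
defaultVal BoolRep = False  -- here a ~ Bool
```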
And I’ve used those things and have gotten a lot of value out of them for the kinds of things you’re talking about. So it’s, it’s less of an either/or in my opinion, for most contexts.
Jonn: Yeah. That’s a very constructive way to put it. But for me, when, for example, I read servant code, or even code that uses values at the type level, like lifted values – I feel like the ergonomics of it – it feels a little bit like writing, you know, C++ Boost libraries.
David: In my experience, when writing Haskell in that style, which I did some back when I worked at Galois, I would essentially write some code in Idris in my head, and kind of feel like I was hand-compiling.
It’s kind of like how people used to write C code in their head and hand-compile it to assembly code when they were doing assembly. It was very much that kind of a feeling. I think that it’s certainly easier to use than it used to be. And we have better features for it now than we did in the past.
But you can definitely tell that Haskell as a language wasn’t designed for that style of programming to start with, and that it was added on later in a backwards-compatible way. I think they’ve done a great job under those constraints, but, you know, when you design something to do something from the start, you can certainly make it more convenient than when it’s added once the language is old enough to vote.
Jonn: At Serokell, we are very much on board with ergonomic dependent types in Haskell. Whereas there is, I feel, still a sentiment in the community that says types should be types, and values should be values. So you can have your little operators to lift one to the other, if it pleases you, but we shouldn’t put it all in one blob.
And there are a lot of arguments. One of the most popular ones is that, well, okay, now you lift your values to the type level, and now you write type-level bugs instead of runtime bugs. Like, if you kind of don’t get the model right, then you won’t get it right no matter which tools you use.
And there’s also some pedagogical remarks that, oh, it’s gonna be harder to teach or something like that. How do you feel about this stuff? Where do you stand? If it’s okay to ask a person in your position that sort of question.
David: I mean, I’m gonna give you my honest answer, but you probably won’t be satisfied, which is that I don’t think that with any programming-language-related question you ever get a useful answer from saying what should be done.
I think that you get useful answers by saying what should be done by these people, in this situation, with these resources and this background, and these goals, and these constraints. And, I think that there is absolutely a real risk that the Haskell culture, in particular, could evolve to be something that only experts can get involved in, and only people with a sort of deep experience and deep background.
And then, well, one thing about only letting in experts is that you stop making new experts. That would be a real problem for the community, if we sort of essentially isolate ourselves from the rest of programming and never let anyone new in, because we, you know, pull up the rope ladder behind us when we climb up to our type castle.
I do think it’s very important that whatever we’re doing, we maintain a way in for other people, and that we don’t make it impossible to do something that Haskell has been very good at for decades now, which is, you know, Hindley-Milner + higher-kinded types + type classes. That’s really a nice way to program, and it’s really effective. And I think it’s important that we don’t break that, because certain programs and certain contexts aren’t gonna gain anything from fancier types.
If we make everybody pay for it, but only a few people get to use it, that might be a problem, depending on what they have to pay. You know, if what they pay is that their compiler is a five-megabyte-bigger download than it otherwise would be, that’s probably fine. If what they pay is that their compiler takes an hour instead of 10 minutes to build their code base, and it’s got a whole bunch of obscure bugs that it wouldn’t have had because it got more complicated, that might be an issue.
So I think it’s important that we listen to a variety of people and take their perspectives and experiences seriously. And part of that variety are the people who want to do the fancy types. Like, I want to do fancy types a lot of the time, but I also think it’s important to not have to do them all the time.
Jonn: Do you feel like we’ll navigate this initiative, and Haskell will become a proper vessel for dependent types?
David: Isn’t it already?
Jonn: Oh, okay. That’s a very nice answer.
David: I can elaborate on that a little bit. I think that the design space for dependently typed languages is absolutely gigantic. You know, look at the languages in the broad ML family – Haskell was clearly in it until recently, and I guess it still is, depending on which flags you turn on.
Haskell 2010 is clearly in the ML family, Standard ML is clearly in the ML family, OCaml is clearly in the ML family, and they all make different design decisions in various ways, right? Like, you know, Haskell has type classes and polymorphic recursion, Standard ML has equality types and generative functors, and OCaml has all the features, you know, but applicative functors by default – here, applicative functors doesn’t mean what it means in Haskell; it has to do with the way the module system works.
Haskell almost doesn’t have a module system to speak of, it’s got files with code and you can hide some names, but nothing along the lines of what you see in Standard ML and OCaml. Backpack is a thing, but it’s not commonly used the way that modules are in ML.
And just in this area, you get really different practical trade-offs with respect to all these different features, and they all work differently. And they’re all interesting points in the design space, but once you get dependent types in a language, then your design space just absolutely explodes.
Here are some examples of decisions you can make that are in the sort of well-explored areas of dependently typed languages. Remember I was talking earlier about appending two length-indexed lists, where the return type says n plus k is the length. Well, what if I’d written k plus n there – should that be a type error or not?
And that right there exposes a huge gap in how we interpret the meaning of plus. Are we going to define it as a recursive function that’s working on some partially symbolic values, or are we going to be appealing to an SMT solver, or are we going to be doing proof search across the properties that somebody has proved about equality? All three of those exist in languages, and they all give very, very different user-experience trade-offs. And that’s just for that one little aspect.
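To make the first of those options concrete, here is a Haskell sketch (illustrative names, not from the episode): with plus defined by recursion on its first argument, 'Z + k reduces for free, but n + 'Z needs a hand-written proof object:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}

import Data.Type.Equality ((:~:)(..), gcastWith)

data Nat = Z | S Nat

-- Addition by recursion on the first argument:
-- 'Z + k reduces automatically, but k + 'Z does not.
type family (n :: Nat) + (k :: Nat) :: Nat where
  'Z   + k = k
  'S n + k = 'S (n + k)

-- A runtime witness of a type-level number.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- A proof object: evidence, checked by the compiler, that
-- n + 'Z equals n even though the type family can't reduce it.
plusZero :: SNat n -> (n + 'Z) :~: n
plusZero SZ     = Refl
plusZero (SS n) = gcastWith (plusZero n) Refl
```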
Another thing is, let’s say we write some proofs, or let’s say we have a data type, which includes a few proofs that it’s well-formed. Maybe you’ve got a list which contains a proof on every cons cell, which says that the thing you’re adding to the list doesn’t already exist in it.
What does it mean to compare two of those lists for equality? Well, that means we have to compare the proof objects for equality, and different design decisions about the way that proof objects are represented can make that easy or difficult, interesting or boring. And it just goes on and on and on, and there’s some interesting work going on there.
We have sort of the big dependently typed languages, like, Coq, and Agda, and Idris, and Lean, and they’re all fairly similar. Like, they make quite a few different decisions in those areas, but then if you go and you take a look at something like [inaudible] that Aaron Stump’s group is working on at the University of Iowa, or you look at Zombie [inaudible] that Stephanie Weirich and her students Vilhelm and Chris were working on before they graduated.
Like, they have very, very different answers to these kinds of questions. And all of them are super interesting. And as Haskell is growing dependent types, it’s growing them in a very different way from something like Agda or Idris. And that means that these trade offs are gonna be very, very different.
So for example, people look and they say, oh, well, the type of Type is Type in Haskell. Like, that’s a logical problem, but actually it’s not because what it means is that you have to be strict in your proof objects. We have to do that anyway, because that’s how GADTs work, and doing that allows a lot of other things in the language to be significantly simpler. And so Haskell + dependent types is never gonna be Agda, and that’s okay. It’s gonna be its own thing, and it’s gonna have different trade offs than those. And we’re all getting richer for it.
So that’s with my researcher hat on, then with my “I like to program and get things done” hat on, Haskell is the only sort of reasonably dependently typed language that has the kind of library ecosystem that means you can actually get stuff done.
You know, it’s got an advanced, mature compiler to a much greater extent than others, which are sort of less in that state. So there’s a lot to recommend it already today.
Do I wish that the user interface for doing dependently typed things was easier?
Absolutely. But to claim that Haskell is not a good spot for dependent types today, I think, is taking a bit too myopic a view of what dependent types are and can be, and the kinds of things that they’re good for and not good for, and where we get the value. So that was the short answer and the long answer.
The Little Typer
Jonn: You were very humble to say that you’re a proponent but not an evangelist [of dependent types]. But I mean, you co-authored The Little Typer book?
David: Yeah, me and Dan Friedman.
Jonn: As far as I understand, it’s an homage to The Little Schemer?
David: Right. I’d say that it’s a sequel rather than homage.
Jonn: Oh, wow.
David: Um, I mean, so The Little Schemer is based on The Little Lisper, which Dan wrote way back when, and then in later editions, his student Matthias got involved.
I forget which edition that was in, but certainly by the time The Little Lisper became The Little Schemer, Matthias was involved as well. But you know, Dan has been involved in a number of sequels, like The Seasoned Schemer, which talks about control operators and continuations, and all sorts of cool stuff like that.
The Reasoned Schemer, where there’s this logic language that he built together with Oleg Kiselyov and William Byrd, and in the second edition that just came out, his other student was also involved there. And you know, there’s The Little Prover, which is an ACL2-style system.
So, essentially, Dan’s got this thing going where he gets interested in something, and then he wants to understand it deeply. So he finds somebody who knows about it and works together with them to create a little book. And these little books are all written in dialogue form, and they all make a point of not using complicated mathematical examples.
Um, ideally the examples are all food, cause everyone can relate to food. You know, they try to cook things down to their simplest essence. They try to be short – although we failed at that with The Little Typer; it ended up longer than we wished.
Jonn: Hopefully not five megabytes longer?
David: [laughs] It ended up being, like, I think, 420-ish pages. I don’t know the exact number, but over 400 pages. I was kind of hoping we’d write around 250 pages, but I had an idea of what basic competence with the core type theory of one of these classical dependently typed languages entails.
I think it means that you can prove that equality of natural numbers is decidable. And that’s how long it took us to get there without skipping steps and leaving people behind. So that’s what we did.
But yeah, The Little Typer, the idea is really to help people who don’t have a deep background in formal mathematical logic to get a sort of intuitive understanding of the core theories that are used in languages like Coq and Agda and Lean and Idris, such that you could go on and maybe have an easier time learning the full-on version with all the Greek letters and also get an appreciation of what’s going on behind the scenes in those languages.
So we don’t really teach you how to do anything practical with dependent types. In that book, we try to help you get a feeling in your gut for how they work such that you can then maybe do a better job learning the practical things somewhere else.
Jonn: Could you talk about the programming language that you use there?
David: So we invented our own programming language for the book. And the reason we did this is because working in the core language of one of these type theories is extremely tedious. You have to write down a ton of details, then the implementation is gonna go look and say, you know, did you write the same thing in these eight places? Yes, you did, gold star.
And on the one hand, we wanted to have short readable examples. And on the other hand, we didn’t want the book to be 800 pages. So, that meant that we couldn’t really use the existing systems and turn off all the fancy features. On the other hand, we didn’t want to use all the fancy features because there’s other books that already do that.
If you want to learn how to do formalized mathematical proofs using dependent types, you can look at Software Foundations, Type Theory and Formal Proof, there’s a bunch of these books out there these days. And so any of my colleagues and friends whose books I didn’t just name, I apologize in advance, cause I didn’t think of a list ahead of time.
But we really wanted to just do something different with this book that we didn’t think that others were doing. And so we made our own language. And a lot of people think it’s a dependently typed lisp, but I think for a language to count as lisp, it should have a lot of these features.
Like, it should have parenthetical syntax, it should have a macro system, it should be dynamically typed, it should have first-class functions, at least. And you know, Pie, the language we use in our book – is it a Lisp? I don’t think so. Like, it has parenthetical syntax and it has first-class functions, but none of the other things.
We used parenthetical syntax partly because we like it. You know, I like writing Lisp. I’ve written a fair bit of Racket code and Emacs Lisp and Common Lisp back in the day. It’s one of the languages that I love in addition to Haskell and these dependently typed languages. Dan’s whole career has been built around Lisp and Scheme. So we both like that notation.
But also we wanted to have a dependently typed language that was simple enough that people who read the book could understand how the implementation worked without needing to learn a lot of new concepts. And I think we were partially successful at that. We really wanted to have the whole source code, the implementation, be at the end of the book, but it was too long.
So we sort of transcribed it down into mathematical rule notation. And then I wrote some supplementary tutorials and other materials to help understand it, plus two sort of pedagogical implementations, one in Racket and one in Haskell. And one nice feature of parenthetical syntax is that a parser is very easy to write compared to something with operator precedence and infix things.
So, you know, if you wanna sit down and implement Pie in JavaScript, that would be awesome. And hopefully we’ve made your life easier by using the parenthetical syntax.
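To make that point concrete, here is a hypothetical, minimal s-expression reader in Haskell – not the actual Pie parser – showing how little machinery parenthetical syntax needs:

```haskell
-- A bare-bones s-expression reader: atoms and parenthesized lists.
data SExpr = Atom String | List [SExpr]
  deriving Show

-- Parse one s-expression, returning the unconsumed rest of the input.
parseSExpr :: String -> Maybe (SExpr, String)
parseSExpr s = case dropWhile (== ' ') s of
  '(' : rest -> parseList rest []
  ')' : _    -> Nothing
  rest ->
    case span (`notElem` " ()") rest of
      ("", _)       -> Nothing
      (atom, rest') -> Just (Atom atom, rest')

-- Accumulate list elements until the closing parenthesis.
parseList :: String -> [SExpr] -> Maybe (SExpr, String)
parseList s acc = case dropWhile (== ' ') s of
  ')' : rest -> Just (List (reverse acc), rest)
  ""         -> Nothing
  rest -> do
    (e, rest') <- parseSExpr rest
    parseList rest' (e : acc)

-- parseSExpr "(add 1 (mul 2 3))"
--   ==> Just (List [Atom "add", Atom "1",
--                   List [Atom "mul", Atom "2", Atom "3"]], "")
```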
Jonn: That’s really cool. And also, the full implementation is a nice appendix for the collector’s edition.
Dependent types and proofs
Jonn: So, the first thing that I don’t have a very good understanding of – you mentioned proof objects that you would attach to statements, demonstrating how things of different types, depending on different values, combine under some function, let’s say. It would prove that the thing type checks, right?
David: I wouldn’t, no, it doesn’t prove that the thing type checks. A thing either type checks or it doesn’t.
Jonn: Sorry. Yeah, it would require me to submit something in addition to the values to demonstrate that, yeah.
David: You could formulate systems that way, for sure. And that the type checker will make sure that your supposed proof is in fact a proof.
Jonn: Yep. And is this why kind of formal verification or like proof assistants come hand in hand with dependent types?
David: A lot of the initial interest in dependent types was from the perspective of people wanting to work on the philosophical project of the foundations of mathematics.
If you read Per Martin-Löf’s early stuff, especially, you can really see a strong thread back to the early-twentieth-century foundations project, with some extra tempering from intuitionism and a bit of phenomenology – like, you know, there are some marks from Husserl if you know where to look.
And so from the very start, I’d say, it’s actually not been thought of so much as a programming tool, but rather as a tool for doing mathematics. And then eventually we get a machine where we can take this foundational theory and put it in and use it. And, you know, Coq from the start was a tool for doing math, not so much a programming language. You certainly write programs in it, but that’s not so much what it was really there for.
And if you read papers on programming languages from the nineties, people would say, like, oh, we can’t do that, cause that would require dependent types, which is clearly an absurd idea.
As opposed to today when you’re reading, people are like, we must do that because it requires dependent types.
One place where this started to change: people were certainly writing programs in Coq, but they were writing programs in a subset of Coq that’s very much like, you know, System F or ML, and then using the dependent types kind of on the side to prove properties about the program. And that’s a really effective way to do things – it works really well, and it’s scaled up to a full C compiler. But it wasn’t so convenient to have programs that were themselves dependently typed.
So where a lot of these things come from is a system called Epigram that was developed by Conor McBride and James McKinna. And you know, Edwin Brady worked on it, and a bunch of other people too. And Epigram was never more, I think, than an artistic statement or a dream.
I resurrected its code so you could build it with modern GHCs. You do need to have XEmacs installed to use its user interface, so that could be tough on newer computers. I haven’t had time to port the UI to GNU Emacs, but it’s essentially using XEmacs as a dumb terminal with mouse support.
And it’s implemented the entire editing thing behind the scenes. And it’s got this 2D syntax and all sorts of cool stuff. But what it also has is a convenient notation for programming with dependent types, using pattern matching and recursion, that’s nice to read and nice to use.
And so, once Epigram had sort of made its artistic statement, then we started getting things like the Equations package in Coq, and we got Agda 2, and Idris, and Lean, which take a lot of the techniques and a lot of the thoughts from Epigram and sort of build further on them.
And now we have convenient programming with dependent types – you know, depending on who you are and what you find convenient, but certainly things that I find convenient – in all of those languages. So it really came from math to programming rather than the other way around.
Jonn: That’s really interesting. If we circle back to ergonomic dependent types in Haskell, you mentioned figuring out trade-offs for proof objects there as well, right? The way I imagine it, and this is the continuation of this question.
For me, it’s very difficult to understand straight away how we would, for example, take some bytes from the wire and parse them into a particular dependently typed object. Now that I’ve used the term parsing – perhaps exactly like that. Perhaps we will have a parser which, as you mentioned, has branching.
And then we’ll be able to use the values of these types – let’s say, the handy List n – to then go to the pure world of our programs and use them as these dependently typed objects in arguments. But where do proof objects come into play here? I kind of can’t see them.
David: When you’re doing programming, proof objects typically show up in a context where the types don’t quite line up for reasons that the compiler can’t figure out on its own.
Anytime you design a type checker, there’s sort of two questions to figure out how to answer.
One of them is: when are two types considered to be equivalent to one another? And the other: when and how do I check that? And your answers to those will guide the design of the rest of your type checker. And it happens to be the case that any sort of sufficiently interesting type checker is gonna disappoint you when talking about when two things are considered the same.
And you’re gonna be disappointed because you want it to do sort of arbitrary mathematical reasoning for you, but any type checker that actually did that would be either slow or error prone, or tend to fall into loops. And there’s practical reasons why you can’t do it.
And also, depending on your language, mathematical reasons why you can’t – like, otherwise you could solve the halting problem. And so what you end up with is an equational theory of the values inside of the types that is somehow insufficient, but that the machine can check on its own. And then you have all the other things that you would like the type checker to consider to be equal to one another, but you’ve gotta give it some evidence that it can check.
And so on the one side, you’ve got the things that just are the same as each other. And then on the other, you’ve got the things that are equivalent in some way that requires you to do the work, and then the computer just checks that you did the work right. So equality proof objects are the evidence that is used to equate two things that the compiler can’t see are the same on its own.
Other kinds of proof objects you might have: you can start embedding your various logical operators. Conjunction – and – you get by having a pair of proofs of the two things that are being conjoined. Disjunction – or – you get by having essentially the Either type with the proofs in each branch. Implication you get from a function, and a dependent function gives you universal quantification.
The empty type, the thing we call Void in Haskell, is the false proposition, and saying that a thing implies false is negation. And so in some sense, there isn’t a formal notion of what constitutes a proof object. I’d say it’s a pragmatic thing – how are you using it in your program? In the same sense that it’s not always clear what is business logic and what is configuration. It’s kind of a fuzzy boundary.
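Written out in Haskell, that encoding of the connectives looks roughly like this (a sketch of the standard propositions-as-types reading, not code from the episode):

```haskell
import Data.Void (Void)

-- Propositions as types: a value of the type is a proof.
type And p q     = (p, q)       -- a proof of each conjunct
type Or  p q     = Either p q   -- a proof of one branch or the other
type Implies p q = p -> q       -- a function turning proofs into proofs
type Not p       = p -> Void    -- p implies the empty (false) type
```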
Some languages give you specific technical means of marking the proofs, like in Coq and in Lean, you have this notion of a type being a proposition. And then types that are propositions have some special effects related to them. But generally speaking, you might also want to write proofs that aren’t propositions in both those systems anyway.
So I think it’s really a pragmatic question. Like, are you using this for its logical power rather than just to hold data?
Jonn: That’s a very, very nice and insightful question. I think it’s familiar to most of our listeners who are Haskell professionals, right?
When GHC can’t figure something out, you even write something as simple as a :: type annotation somewhere, and then it’s like, oh, okay. So sometimes it can’t infer something. And this also applies to inference.
David: That’s a whole separate problem.
Jonn: Yeah, it is. But I think it’s also very interesting to observe that it has similar reasons behind it. So many of us have already experienced something similar to writing a proof object in dependent Haskell in our work, when dealing with inference.
Idris
Jonn: I kind of want to already start talking about Lean, but I don’t think that it’s fair to skip Idris because…
David: Idris is cool.
Jonn: Yeah, it is. It is very cool. And you have contributed to it a little bit, right?
David: I contributed, I’d say, a fair bit to the first version of Idris and almost nothing to the second. I don’t know if it’s still the case, but last I looked, I had the second most commits on Idris 1 after the main developer Edwin Brady.
I started off when I was at a summer school during my PhD program where Edwin was teaching a little tutorial on using Idris and there was no tab completion at the REPL and that bothered me, so I was like, all right, I’m gonna make a pull request.
I was like: “that was kind of fun”. And then I just kind of got carried away, and it ended up taking over my entire PhD project. Because I was working on a DSL thing and I was like, okay, I wanted dependent types in the DSL. How do I implement that? Maybe I can reuse bits of Idris. How do I reuse bits of Idris in a principled way? Hmm. Let’s do some meta programming – and it just kind of escalated from there.
Working on Idris, I had a lot of fun working on the interactive experience. Like the sort of Emacs-based IDE that I was contributing to. I did features like interactive error messages. So you could right-click on a term in an error message and normalize it in place without leaving the context of the error. That was really fun. And then I also did the meta programming system and, essentially, the ability to do unsafePerformIO in the type system was my master’s thesis.
Cause I was trying to do something kind of like F#’s type providers, but without having it be a code generation step. I don’t know that we have time to actually get into what those are, but type providers are a super cool feature in F#. So people listening who don’t know what it is, you can go check it out.
But yeah, I was working on other things, and then Idris 2 got developed, and it’s got a lot of really nice features, but I just haven’t had the time to work on it, unfortunately. So I also recommend checking out Idris 2.
Metaprogramming
Jonn: Before we go to Lean, I would like to ask a question about meta programming.
You mentioned metaprogramming related to Idris. And in general, when we look at Haskell, we see that there are mechanisms like quasi-quotation, whatever that means. And then we have some sort of compile-time code generation with Template Haskell. But then, we also just have deriving, which is also metaprogramming.
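For instance, even this everyday Haskell line asks the compiler to generate code at compile time:

```haskell
-- deriving is compiler-driven code generation: GHC writes the
-- Eq, Ord, and Show instances for this type on our behalf.
data Color = Red | Green | Blue
  deriving (Eq, Ord, Show)
```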
Maybe you can tell our listeners a little bit about how that would look in a language like Idris with dependent types and stuff. Is there a significant difference between what we have in Haskell and there?
David: Yes and no. The thing I did for Idris fulfills the same kind of a role that Template Haskell fulfills in Haskell.
In these dependently typed languages, there’s typically a small core language, kind of like the role System FC plays in GHC. When you are type checking the program that the user writes, you generate code in that core language, because it’s much easier to do everything for a small language than for a big one. And particularly if you’re using proofs, then you’re in a better position to trust that your proof-checking system is correct if it’s very small.
And so the system that does that translation and type checking at the same time is called an elaborator. And the Idris elaborator had a bunch of nice features.
And what I essentially did was make that available to Idris programs, so that you could use the features implemented to implement Idris to implement your own things. That was actually how I met the Lean developers, because they were working on a very similar design that ended up being the tactic language in Lean 3. We met up at POPL in Paris and said hello, and that’s how I got to know them.
We were sort of independently arriving at fairly similar ideas. Theirs ended up being fast and having a profiler and a debugger. So they definitely did really cool stuff there.
But meta programming in general is a very broad term. You’ve got the Lisp and Racket style approach where you’ve got macros and hygiene, and little bits of code that expand to other ones. And you’ve got the sort of Template Haskell style approach, where you expose a chunk of the compiler to the program and let it do what it wants at compile time. All sorts of stuff.
Generally, making programs that make other programs is metaprogramming. You know, C++ has its templates. There are a lot of ways to do all this. And if you’re interested in what we did with Idris, you may want to go look at my talk from ICFP 2016, where I’ve got a quick run-through of how that worked.
Jonn: When we write macros in C or we write macros in pure Erlang, or we write macros in languages like C++ or Rust, it’s horrible.
It’s not good. You want to stop. You want to go have a breath of fresh air. But when we write macros in Lisp or in Elixir, we just want to keep going because it’s fun.
Do you think it’s even feasible to have Lisp-style macros for languages like ours?
David: Yes, absolutely. Lean 4 has them. If you want to do it for Hindley-Milner-style languages, there’s an unsolved research question – back before I had a child, I had a little bit of time to try to work on this, and my collaborator Samuel Gélineau is still working on it.
The problem with Hindley-Milner-style languages is that type information flows in unexpected directions, and it’s unclear how to keep principal typing and have macros that can observe types. You know, Samuel had this idea that, essentially, when a macro tries to observe a type that happens to be an unsolved metavariable, it pauses and gets resumed when the type becomes known.
And we have an implementation of that, which is this little language called klister, but we haven’t done the formal math and the proofs yet to show that it does, in fact, still have principal typing and all these other nice features. So I think there’s some interesting research work still to be done, absolutely. But languages with types can absolutely have Lisp- and Scheme-style macros.
Jonn: That’s amazing. And when you’re distinguishing this approach to type inference between Haskell and Lean, you mean that Lean has a different approach to it? And what’s the difference, can you talk a little bit about it?
David: So, languages in the broad ML family are typically based around what people call the Hindley-Damas-Milner type system, which has a number of nice features. One of them is that you have principal typing, which means that every program has a single, most general type that you can assign – either that, or it has a type error.
And additionally, we have an efficient algorithm to discover said principal type. And dependently typed languages in the tradition of Martin-Löf type theory and the calculus of constructions just absolutely do not have principal typing. A given program could have lots of types, and there’s not necessarily one of them that’s better than the others.
And this means that type inference becomes a best-effort thing, rather than something you can always rely on solidly all the time. And this means that you lose some properties. Like, with a Hindley-Milner-style system, you can change the order in which the type checker traverses the program. Maybe it starts going left to right, and you make it go right to left. Or maybe depth-first instead of breadth-first.
And the error messages that it produces in the case of a type error might be different. But if the program doesn’t have type errors, then you’re still gonna get the same types out. And that’s not the case for the other systems.
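A tiny illustration of that property in Haskell (our example, not from the episode):

```haskell
-- In a Hindley-Milner system, every well-typed term has a single
-- most general type. GHC infers:
identity x = x
-- identity :: a -> a
-- Every other valid typing, such as Int -> Int, is an instance of
-- that principal type, so traversal order can't change the answer.
```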
So that means that you are essentially fixing the order that the type checker goes in, which means that metaprograms have a predictable model, kind of, which is the implementation of the type checker. Whereas, in a Hindley-Milner style system, you really want to preserve that nice property of being able to switch around the guts of the type checker without breaking people’s programs.
And so if you expose that information to the metaprogram, then you remove that ability. Given that we can only traverse the program in the expected direction when we’re doing these dependently typed languages, then that means that the particular metaprogramming issue doesn’t pop up.
Jonn: Do I understand correctly that the simplest illustration would be: let’s say I have a dependent type that says that something has at most n elements, and then I have something that has three elements. Now that something with three elements has an infinite number of types.
David: It has at most five elements. It has at most 300 elements.
Jonn: So that’s an example of what you’re talking about, right?
David: Yeah. [inaudible] about the specific way you encode it in all these different varied systems. But yeah, exactly.
Lean
Jonn: You were mentioning Lean 3 and Lean 4. And I only learned about Lean this year, I think, and it’s already Lean 4. How come? Was I living under a rock? Why are we not all hyped about it?
David: I don’t know why you’re not hyped about it. So, Lean 1 and Lean 2, I never used them. My understanding is that they didn’t have many users beyond Leo. Leo de Moura is the main developer of Lean, he’s previously one of the main developers of Z3. Lean kind of started its life being more of an automated proof tool and less of one of these proof assistant style programming languages.
And over time, it’s just evolved more in the direction of interactive proof while still trying to do good automation, but automation and type theory are just hard to combine.
I think Lean 3 was the first version that had a significant number of users outside of MSR [Microsoft Research]. And it’s still the mature choice. Lean 4 is particularly interesting to me because, first off, it has a macro system inspired by Racket’s, which is one of my deep loves of the programming world.
It’s also interesting because it’s self-hosting. So it’s implemented in itself, and that has forced them to think hard about a lot of implementation choices. Idris 2 is also implemented in itself, by the way. It’s got a nice interactive environment and all these things written in Lean, and one of my dreams is, you know, turtles all the way down.
Like I want to be able to tear apart my compiler and tweak it and mess with it in the language that I’m working on. Lean 4 seems like a nice step in that direction.
Jonn: And you’re writing a book about it, right?
David: Yeah. Microsoft Research is sponsoring me to write an introductory book on using Lean 4 as a programming language.
Lean has been sort of blessed for a long time with having mathematicians who are interested in using it. So, you know, Kevin Buzzard got into Lean 3. I’m not entirely sure why he picked that one instead of Coq or whatever, but he’s done a good job getting other mathematicians on board.
And so an interesting thing about Lean in contrast to these other systems is that it’s got a lot of mainstream mathematicians involved. You know, people who don’t think classical logic is dirty and in fact are kind of suspicious of why you would even consider not using classical logic. And that’s led to some interesting different tradeoffs in the design of Lean that weren’t made in other systems, which is good, cause as a community, we explore more of the design space for these languages.
I’d say Lean has been less of a programming language in the past, but once it became self-hosting, it suddenly had to be more of a programming language. And they wanted a resource so that programmers who didn’t have a background in, you know, high-level mathematics or type theory and all those sorts of things could get involved with Lean, because mathematicians sometimes need programmers to help them with metaprogramming – a big part of what you wanna do is automate your proofs so that you don’t have to sit down and type out every case by hand.
A lot of mathematicians have programming experience in Python or something like that. And you know, they’re not Haskellers, so learning materials that assume you know what a monad is from a programming perspective aren’t gonna be useful. At the same time, if you wanna recruit people to work on Lean, which is written in Lean, and they’re very, very good C++ programmers but don’t have a deep background in functional programming, there wasn’t a resource.
And so I’m trying to write a book for people who know how to program, but who haven’t done any functional programming in the past, to sort of learn functional programming in the context of Lean and writing programs with Lean.
Jonn: That’s really cool. And, also you are describing the goals that you have for the book. And I didn’t read the introduction, but I read the first and second chapters and I want to say that so far you’re hitting the nail on the head.
David: But you’re a functional programmer already. So you’re not part of the audience.
Jonn: Sure. But the thing is that I could infer your goals from reading. Like, how in the first chapter, for example, you tell people: okay, here’s how you make a structure. You might see these errors. This is what they mean. It doesn’t matter for the time being.
Yeah. It’s a very, very, very good read. I think it’s going to address the objectives that you set for it.
David: One thing that’s kind of fun about the process of writing is that I actually wrote Lean macros to test all of those error messages. So the Lean macro system is powerful enough that I wrote a little macro where I put a Lean declaration and then I write a string next to it, and that string says what the error message that it emits should be.
And then, if the thing doesn’t make an error message, then that becomes an error and it fails, if it makes a different error message, then it fails. But if it makes the same error message, then it’s as if I never wrote the declaration at all.
And the type checker will just continue. And, you know, this is like maybe a 10-line macro. That way, I can make sure that the text of the book keeps up with all the improvements the Lean team is making to the error messages. Every couple of weeks I’ll go and say, okay, use the nightly from today, and then all the error messages will break, and I’ll go and fix ’em.
Jonn: I love stuff like this. That’s a very smart thing to do. If you have this macro or like a blog post about it, please share. We’ll include it in the notes.
David: I don’t have the source code of the book publicly available. Partly because there’s some really hastily written Python code with some really, really impenetrable regular expressions that sort of assembles the various pieces of the book with the various example source files.
And I don’t have a lot of time to work on the book, because, you know, I’ve got a full-time job and I have a two-year-old, and that only leaves a few hours a week to actually do the writing work. And if I have to spend a lot of time doing tech support on the metaprograms, then I just won’t have any time to get any writing done.
So being secretive at this point is a bit of a self-defense mechanism, but at some point it will come out. Just not yet.
Jonn: But I absolutely love rolling releases. I don’t know if there’s RSS, but, like, it could be nice.
David: Yeah, I should see what it would take to put an RSS feed up for the book. Otherwise you can follow me on Twitter. I don’t actually post that often, but I do announce the book releases. And I announce them on the Lean Zulip.
Jonn: Okay. We’ll put the links in the description of the podcast.
One of the things you mentioned about the costs of dependent types is that they slow the compiler down.
David: They don’t necessarily slow down the compiler, but they can. Like, it’s certainly not the case that every program with the dependent types compiles slowly. However, the harder you make a compiler work, the slower it’s gonna get. And the ability to put programs in your types means that the type checker has to run your programs, and you can always write a slow program.
Maybe you want to call Ackermann of 30 as part of your type. There’s nothing that’s gonna make that fast.
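For instance, here is a hypothetical Haskell sketch that puts Ackermann’s function in a type; the type checker has to evaluate it, and nothing will make that fast:

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, UndecidableInstances #-}

data Nat = Z | S Nat

-- Ackermann's function at the type level: perfectly legal, and the
-- type checker must run it to check any type that mentions it.
type family Ack (m :: Nat) (n :: Nat) :: Nat where
  Ack 'Z     n      = 'S n
  Ack ('S m) 'Z     = Ack m ('S 'Z)
  Ack ('S m) ('S n) = Ack m (Ack m ('S n))
```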
Jonn: Yeah. And you can also always depend on the library that does this.
One more thing that people seem to care about a lot is performance. In Lean 4 in particular, how does it handle run time? And do you think that it’s feasible industrially from the performance standpoint?
David: People always ask: is this production ready? I think the answer with Lean 4 is definitely not, it’s not formally released yet. It’s in pre-release mode, the stable version is still Lean 3. I would not build a business around software written in Lean 4 today, simply because the cost of keeping up with the compiler changes would be way too high. Let alone all the other issues that you might have.
It’s just not done yet, and that’s okay. Like, it’ll be done, and the basic design is gonna be, I think, reasonably fast for a lot of purposes.
There’s a few challenges that come up. One thing is that type erasure is an interesting problem for dependently typed languages, because type erasure also means value erasure. And you have to think carefully about which parts of the program survive to run time and which ones don’t. Like, in Hindley-Milner, you just delete all the types, and the rest of the program survives.
But that’s not gonna work in a dependently typed language, because, you know, maybe the type is being passed as an argument and that matters, or, you know, you’ve got numbers that exist only to please the compiler. Do you want those numbers to get erased? All that stuff. And so I think Idris 2’s quantitative type theory is a very interesting answer to this, where you can say in the type signature which parts of the program survive to runtime and which don’t.
I think Coq and Lean have this notion of propositions, which can get erased, which is also useful. But at the end of the day, part of the answer is gonna be that it depends on the program you’re writing and on advances in compiler technology. Once you’ve figured out your erasure situation – and I think the research community has reasonable answers to this these days – then you’ve got a programming language where the problem is a lot like getting OCaml to run fast.
And plenty of businesses can use something like OCaml. And OCaml is, I think, a really good example here, because it’s got a nice, simple compilation mechanism that’s easy to understand. You can mostly look at a program and know what’s gonna happen at runtime.
There are certain programs that I wouldn’t write in OCaml – ones where a garbage collector isn’t gonna work. For those, I definitely want to use Rust. But I think that Lean, when you come right down to it, is essentially a strict functional programming language.
It’s got a kind of interesting allocator based on reference counting. And the reason why reference counting is interesting here is that one of the big drawbacks of reference counting is cyclic data: you get these cycles, and then you need a traditional collector to collect them. But you can’t actually write cyclic data in Lean – it’s impossible – so that’s no issue.
And that’s due to the termination checking that you get from a dependently typed language. Because you can’t write cyclic data, that big disadvantage goes away. And when you have reference counting, then you can actually check: is this reference count exactly one and about to go to zero?
And if it’s about to go to zero, then you can mutate the object in place instead of deleting it and allocating a new one. And that means that you can achieve runtime mutability with logical immutability and get maybe not the best of both worlds, but some good stuff from each.
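A minimal sketch of what that buys you in practice (my example, using the standard Array API):

```lean
-- Logically, `bump` returns a new array. At run time, when `xs`
-- is uniquely referenced (reference count = 1), Lean's reference
-- counting lets `Array.set!` reuse the existing memory instead
-- of copying the whole array.
def bump (xs : Array Nat) : Array Nat :=
  if xs.size > 0 then
    xs.set! 0 (xs.get! 0 + 1)
  else
    xs
```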
I don’t have any experience writing Lean code that has to go super fast. However, the Lean compiler is mostly written in Lean, I use it on a regular basis, and it’s plenty fast. So I suspect that once the dust settles and things are mature, Lean code is probably gonna go a fair bit faster than similar Python code and a fair bit slower than similar JavaScript code on V8, because probably less work will be put into Lean’s compiler.
And that’s gonna be fast enough for a great number of businesses and a great number of industrial production use cases.
Jonn: Absolutely. Yeah. That’s a very good and very inspiring assessment. And also, thank you very much for talking a little bit about the cool reference counting trick.
I guess my only follow-up is – and I’m a little bit confused here – I see how we can’t define cyclic structures or something like this, but we do have mutual recursion, right?
David: Sort of. You have to prove that your recursive functions terminate. And that’s part of keeping the logic consistent. And in a strict functional language, the only reasonable way to make an actual cycle is to have a mutable reference cell.
So you can do it with, you know, a ref in OCaml or something. In a lazy language, you can absolutely do all these knot-tying tricks, but in a strict language, you don’t get to do that.
You just have an infinite loop at runtime if you try to make something like that, and the infinite loop gets ruled out by the termination checker – if you’re staying in the safe subset, at least – and you don’t have reference cells. There is unsafePerformIO-type stuff in Lean, and that indeed lets you break the invariants that the language’s runtime depends on; that’s sort of in the name, unsafePerformIO. And if you break those invariants, you could probably do something interesting.
Jonn: I mean, we can write a socket accept, right? For example, we sometimes want to go infinite, which brings me to another question –
David: Infinite data, like cyclic data, is different from infinite programs.
Jonn: That’s true. Yeah. I asked about recursion, and you said, well, you need to prove termination. And my question now is: what if I don’t want to prove termination?
David: How do you write a web server? The classic question.
Okay. So, I’m gonna zoom out from Lean a little bit here, because different systems have different answers. One way you can do it is you build an escape hatch into the language, and you say: I’m gonna decorate this function and say that it’s partial.
And for the Haskellers listening – when people in Haskell talk about partial functions, they usually mean incomplete pattern matches, whereas in dependently typed settings, people usually mean functions that are not mathematically total. So infinite loops can also make you partial.
And so you say: I’m gonna declare this partial, allow it to be partial, and then I’ll write it the way I write any other program with an infinite loop in it, and I’m done. That’s what you’d have to do in Lean, because Lean doesn’t have the other thing you can do, which is to use what are called co-inductive data types. Co-inductive data types are infinite data where functions over them have to be able to observe a finite prefix in finite time.
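Here’s a minimal sketch of that `partial` escape hatch in Lean (the serve function and its handler argument are made up for illustration):

```lean
-- `partial` exempts the definition from termination checking,
-- so this event loop that serves requests forever is accepted.
partial def serve (handle : String → IO String) : IO Unit := do
  let stdin ← IO.getStdin
  let request ← stdin.getLine
  IO.println (← handle request)
  serve handle
```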
Jonn: As far as I understand, Haskellers in the audience will recognize their favorite list in your co-inductive explanation. So lists in Haskell are actually co-inductive, I think.
David: Yes, basically. I mean, Haskell doesn’t give you a co-inductive reasoning principle for lists. Ordinary data types in Haskell work very much like co-data types in Idris or Agda or Coq. Lean just doesn’t have them. I think Lean 3 has one of them, which is infinite streams. And you can do some tricks to encode various ones.
I’m actually not entirely sure what the state of the art is, but you can’t just sit down and declare “here’s my co-data type” – in Lean 4, at least. But in languages where you can do that, you can define a type which is essentially a stream of input/output actions that could go on infinitely or potentially infinitely.
Because for your web server, you don’t actually want it to fall into an infinite loop, right? You want it to serve an unbounded number of requests. But you don’t want it to just sit there, you know, incrementing an integer forever until it wraps around and then continuing to increment it.
This principle that you can observe a finite prefix in finite time is actually much nicer than writing arbitrarily looping stuff, assuming that the language is otherwise ergonomic to use – and that is a real question that you have to worry about.
Jonn: That’s really cool. I thought about coinductives for streams, but I didn’t think about the fact that it’s a good representation for a web server.
David: So there’s this paper by Anton – I think it was Anton Setzer and Pedro Dubia, but I could be misremembering, because it’s been a few years since I’ve read it.
So apologies if I got it wrong. It talks about how to write generally effectful programs. You can actually encode IO as a coinductive type, because you can see IO as a system where you have a data type that represents commands, and then you have a function from a command to the type of the response you get back from the system – so for getLine that might be String, and for printLine it might be Unit.
And then you can define a program that does IO as essentially an infinitely branching tree. Or you can see the world as an infinitely branching tree: you observe down one branch of that tree, the types tell you what you’ll get back as a value, and then you can continue depending on that. It’s really cool stuff.
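A hedged sketch of that encoding in Lean (the names are mine, and since Lean has no coinductive types, this inductive version only captures finite interaction trees):

```lean
-- Commands the program can issue to the outside world.
inductive Command where
  | getLine
  | printLine (s : String)

-- The type of the response depends on the command:
-- `getLine` yields a `String`, `printLine` yields `Unit`.
def Response : Command → Type
  | .getLine => String
  | .printLine _ => Unit

-- An interactive program: either finish with a result, or issue
-- a command and branch on its response. The tree branches
-- infinitely, since `k` accepts every possible response.
inductive Interact (α : Type) where
  | done (result : α)
  | step (c : Command) (k : Response c → Interact α)
```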
Jonn: Let’s talk about features of Lean 4. I played with it a little bit based on your first two chapters.
As a young adult, I tried to write a game in Haskell, and I wanted it to be an Infocom-like game, basically a text adventure. And I wanted to have rooms, and I wanted rooms to have different things in them. And I wanted to encode these things at the type system level.
And I didn’t know about existential quantification back then, so I failed to do it. Yeah, because you can’t put an a and a b in a list of a.
David: Well, you can, if you have a sum type, right? Like you could have a list of a and b.
Jonn: Yeah. But I wanted to do it in the face of the expression problem, about which I also didn’t know – I wanted to be able to make new rooms.
David: Right.
Jonn: And when I played with Lean a little bit, I got fascinated by how easy it is to represent essentially existential types with just a simple structure that takes a type in its field. But what I noticed is that when I do that, I get some error messages, because it says: I want Type, but you’re giving me Type 1. So what’s the story here?
Is it related to the thing that you said before, about how in Haskell the type of Type is Type? Tell our audience a little bit about this stuff.
David: So back at the dawn of Martin-Löf’s type theory, there was a version which had this rule that Type had type Type, and you could just sort of write generic things, have them as type arguments, and pass them around.
But it turns out that this is logically inconsistent, because by working hard enough, you can encode [inaudible] a loop. And this is very similar to Russell’s paradox, you know? The set of all sets that don’t contain themselves – does it contain itself?
Yes – but then it can’t. No – but then it must. It’s related to that. And just as the original type theory was invented to take care of that problem of impredicativity in set theory, we ended up with a way to rule out this sort of arbitrary impredicative quantification.
So impredicative versus predicative quantification asks: when I say “for all x in blah”, is “blah” allowed to be the same kind of thing that I’m in the process of defining, or does it have to be a smaller thing? Predicative means it has to be, in some sense, smaller. And impredicative says that I get to talk about the thing that I’m in from the perspective of myself. Impredicativity is great if you can make it work, but it comes with technical challenges that can make your system inconsistent and thus useless for math.
And so it turns out that the initial version of Martin-Löf’s type theory was impredicative and had a consistency problem. So we ended up with a system of what are called universes. This is another one of those areas where the design space varies quite a lot, and different choices are made in Coq, Agda, and Lean, all of them.
But the basic idea is that you can’t talk about something that’s your own size when you’re defining yourself. So in the case where you define a record type with one projection that is a type and another projection that is an inhabitant of that type – to get your existential type – then that record type itself has to live in a bigger universe.
It has to be part of a bigger notion of type, so that it doesn’t end up being contained in itself, and thus you avoid all these paradoxes. And normally in Lean, you actually wouldn’t say “this is in Type 1” or “in Type 2”. Normally what you do is you define a system of constraints.
So you’d say that this type field is in Type u, and my overall structure is in Type (u + 1). And then that gives you a kind of polymorphism, which allows you to use this in – well, not in Type 0, because it says u plus one, but in Type 1, Type 2, Type 5. So what you end up doing is essentially defining an infinite number of copies of this type simultaneously, across all the different universe levels at which it could live.
And that sounds like a big, scary thing, but it’s very much like when you define a polymorphic function in Haskell: the identity function is actually defined for every single type in Haskell, because it’s polymorphic. So it’s that basic idea, just for universes rather than for individual types.
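In Lean syntax, the structure under discussion might look like this (a sketch; AnyType is a hypothetical name):

```lean
universe u

-- A structure that hides a type together with an inhabitant.
-- Because it stores a `Type u`, the structure itself must live
-- one universe higher, in `Type (u + 1)`.
structure AnyType : Type (u + 1) where
  carrier : Type u
  value : carrier

-- Universe polymorphism lets the same definition be used at
-- every level; at level 0 it packages ordinary types:
def examples : List AnyType := [⟨Nat, 3⟩, ⟨String, "hi"⟩]
```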
Jonn: Yeah. And if you’re writing universe-polymorphic code, you can then put this structure into another universe-polymorphic structure as a type, for example. Let’s say I have the structure e, which has this associated type hidden inside and which, as you said, is in Type 1 – or in Type (u + 1), if my associated type is in universe u. Now I can make lists of those, because List is also defined in a universe-polymorphic way.
David: That’s right. On the other hand, this is not how I would ever write this code. Because normally, you actually don’t wanna be able to put any type in there – normally, you want to put some interesting subset of types in there.
So going back to our example of the interpreter for the programming language, where we have a data type representing the programming language’s types – I think it was Nat, Bool, and functions – and then a function mapping those to real types. Let’s say you want to existentially quantify that.
Then what you do is you have the first field be that data type, and you say that the second field’s type is computed from the code that you put in the first field. And once you do that, a lot of these size problems go away, because now you’re not storing a type, you’re storing a piece of data that represents a type – and that way you get rid of the universe lift.
And that doesn’t work in every case, but it works in quite a lot of them.
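A sketch of that pattern, with a deliberately tiny universe of codes (all names are mine):

```lean
-- Codes for the types we allow.
inductive Ty where
  | nat
  | bool

-- Interpret a code as a real Lean type.
def Ty.denote : Ty → Type
  | .nat => Nat
  | .bool => Bool

-- The second field's type is computed from the first field.
-- `Item` stores a code rather than a type, so it stays in plain
-- `Type` and needs no universe bump.
structure Item where
  ty : Ty
  value : ty.denote

def items : List Item := [⟨.nat, (3 : Nat)⟩, ⟨.bool, true⟩]
```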
Jonn: Yeah. Well, for me – I was very uneducated industrially, so I’ve always wanted to write general stuff. That’s why my life was hell before I discovered Wadler’s expression problem.
Because I thought that there must be a solution in any language I’m using. But it was relieving to understand that sometimes we just have to go for a closed type. And then we’ll extend our functionality by refactoring, and we have beautiful facilities for refactoring in Haskell and ML languages.
David: Especially in Haskell, I think, yeah. Though we don’t have beautiful facilities for automatic refactoring – [inaudible] and HLS do some cool things, but we’re still a way from what you can do with Kotlin and IntelliJ, and hopefully that will change. But manual refactoring, where [inaudible] helps you not screw it up – yes.
Jonn: But yeah, honestly, I want to say that for this particular thing – making rooms that can store different things while still being rooms, without type classes, because let’s say I don’t want to use type classes – it was so easy. I was very happy with how easy it is in Lean.
Okay. So, you mentioned, when you were talking about Haskell, how Haskell is really, really good for 1, 2, 3, 4, right? So you mentioned type classes there. And type classes are used and abused, and hated, and loved by many. But at the end of the day, if we look at any sufficiently large Haskell codebase – it’s full of type classes.
So what’s the situation in Lean as far as type classes go, and how is subtyping handled?
David: Yeah. So type classes exist in Lean. They’re different than in Haskell in a few ways. One is that the arguments to a type class need not be types – you could put a number in there as well, because why not? We have data at the type level. Or rather, I should say that there is no fundamental distinction between the levels in the way there is in something like Haskell. So the Lean equivalent of Haskell’s fromInteger is actually a type class called OfNat, and it takes the actual Nat as an argument.
Let’s say you have a data type that only represents the even numbers. You can just make odd literals be a type error, which is pretty cool.
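A sketch of how that can be encoded (my example, in the spirit of the even-literals exercise from Functional Programming in Lean): since OfNat takes the literal’s value as an argument, instances can be made to exist for even numerals only.

```lean
-- An even number is either zero or two more than an even number.
inductive Even where
  | zero
  | succ (n : Even)   -- `succ n` represents n + 2

instance : OfNat Even Nat.zero where
  ofNat := .zero

-- Recursive instance search peels off 2 at a time, so only
-- even literals (up to the search depth limit) find an instance.
instance [OfNat Even n] : OfNat Even (n + 2) where
  ofNat := .succ (OfNat.ofNat n)

def four : Even := 4     -- accepted
-- def five : Even := 5  -- a type error: no instance is found
```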
But Lean doesn’t have a notion of unique implementations of type classes: a given type class resolution problem might have many instances that satisfy it, and you can give the Lean compiler some priorities to pick between them. It has some rules that it’ll use to pick which one it thinks is the best or the most relevant. And so any code that assumes instances are unique is buggy in Lean, whereas that’s commonly assumed in Haskell code.
Jonn: Scary.
David: Well, I mean, you can put a proof in it that says it does the right thing.
Jonn: That’s true. Actually, it’s less scary, right? Because if you don’t put the proof in, then you write bugs. So it’s cool, in a way.
David: I mean, it’s just a different feature, it has a different trade-off, and you use it differently. You can’t just port Haskell code line for line to Lean. First off, it’s strict, but second off, a lot of the parts of the language just work differently, and that’s okay. There’s more than one programming language in the world.
Where Haskell has the defaulting rules, Lean has a notion of default instances. So you can essentially make your own defaulting rules by making your own type class instances. This worries me, but it’s used for reasonable things in the standard library so far.
They’ve got some nice work done on making instance resolution efficient. And let’s see, what else is there?
Jonn: And what about stuff like multi-parameter type classes? How are they handled in general? And are there facilities to say, okay, I have a functional dependency?
David: As far as fundeps go – because we don’t have this uniqueness property, talking about actual functional dependencies doesn’t make a lot of sense.
But there is a notion of something called an output parameter. Internally, when Lean sees a type class resolution problem, it won’t start searching until the types involved have been solved – if they’re just bare metavariables, it won’t go do it – unless those parameters are marked as outputs, in which case they’re intended to be found by doing instance search. So in that sense, they work kind of like a fundep, but there’s no global check that there in fact is a functional dependency – in other words, that you can treat the calculation from the inputs to the outputs as a real function, the way you can in Haskell.
So it’s used for similar things, but it’s not actually functional dependencies.
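A small sketch of an output parameter (Container and first? are hypothetical names): instance search runs even while β is an unsolved metavariable, and the chosen instance determines it.

```lean
-- `β` is an output: instance search may run while `β` is
-- unknown, and the instance that is found fills it in.
class Container (α : Type) (β : outParam Type) where
  first? : α → Option β

instance : Container (List Nat) Nat where
  first? := List.head?

#eval Container.first? ([1, 2, 3] : List Nat)  -- some 1
```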
Jonn: That’s very interesting, that’s a thing to think about. When I looked at how type classes are resolved, I thought: oh, it’s not me who is writing buggy code, it’s Lean that has an incomplete resolver for instances, and I just have to wait a little bit until it does global instance checks and stuff like this.
David: Nope. And what would it even mean to be globally unique, right? When you’re writing proofs about programs, you’d also need a notion of equality of instances in order to say whether two instances are the same, right?
Because talking about uniqueness is fundamentally talking about sameness – it says that if there exist two, then they are equal. But once you start talking about instance equality like that, that’s a global, non-modular property of a program.
And that’s problematic for various reasons. You’d also want that available as a reasoning principle in your language, right? To say that any two instances of this type class are equal. Because if you can’t reason using that, but you are restricted by it, then you’re not gonna have a good time. And, you know, I think a nice principle of these languages is that any time they keep you from doing a thing, they should use the fact that you can’t do that thing to make your life easier in some other way.
And it’s very unclear to me what that principle would actually look like in a language like this. That’s an open research problem. And one choice they’ve made in Lean – for example, part of why they don’t have co-data types – is that they want to be very, very conservative with their type theory and not innovate in the core theory, because they want to use tested, well-proven results to make the mathematicians feel safe.
Jonn: Deeply nested structures – how are they made and updated? Are there optics in Lean? Is there a starter library maybe? Or is there a third-party library that’s popular?
David: I don’t know. You should write one.
Jonn: So the way we drill into the data structures is with projections?
David: Yeah.
Jonn: And they are named or indexed, so I can say give me field number… Okay.
David: One thing about Lean is that there are a lot of ways to do each thing – it’s a very flexible notation. And that’s actually been one of the challenges in writing the book: not overwhelming people in chapter one while still giving them enough so that they have a reasonable chance of figuring out some code that they run into, and understanding the code that the compiler gives them back in error messages.
So there’s a real balancing act in which syntax to describe early and which syntax to wait with, and when to introduce it, those kinds of things. But yeah, for now it’s kind of like writing Haskell code pre-lens. I’ve seen people posting a little bit on the Zulip about lenses, but I haven’t seen any widely used library for them.
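For the record, projections plus Lean’s structure-update syntax cover the simple cases; a sketch with made-up structures:

```lean
structure Address where
  city : String

structure Person where
  name : String
  address : Address

-- Drilling in is projection; updating nests `with` expressions,
-- much like record update in pre-lens Haskell.
def move (p : Person) (newCity : String) : Person :=
  { p with address := { p.address with city := newCity } }
```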
Blitz questions
Jonn: That’s it for my questions, but while listening to you, I wrote down some things that I didn’t understand. So if we could do a very brief blitz round. Brief question number one: tactics – what are tactics?
David: So, tactics. When you have one of these languages where you wanna write your proof – we talked earlier about proof objects, and what that means is that typically a type represents a proposition, that is, a statement that might be true or might be false.
And an inhabitant of that type is evidence that the proposition corresponding to the type is true, and that evidence is what we call a proof object. And you can build these by hand if you want. So let’s say we wanna write a proof: I can say, okay, this type, what are its constructors – and I’m gonna sit down and write them in, and eventually build it.
However, that’s got a lot of downsides. One of them is that a big proof object can be incredibly tedious to construct. Oftentimes, in real mathematical proofs, you’ll have maybe five or six cases that are all completely obvious and easy, and two that are interesting.
And you might have very similar, but not identical reasoning in all of those obvious cases. So you’ll have a big bunch of almost copy-pasted code that’s kind of boring and then you’ll have the other thing, and it takes up like three pages and scrolling through it is unpleasant.
Another situation you run into: let’s say you change the way you phrase your theorem slightly. It turns out it was a lemma you were proving on the way to some bigger result, and actually, if its definition were tweaked a little bit, the bigger result would be easier to prove. Now you have to go change that whole proof again. So you have a real software maintenance problem.
Tactics are a solution to both of these problems. They are metaprograms in the sense that they are programs that write programs: you give the computer a set of high-level instructions, and then it goes and creates the proof object behind the scenes based on those instructions.
And you can write these instructions to be kind of general. Like, a lot of tactic languages have a notion of failure and recovery. So you can say: try this; if that doesn’t work, try this; if that doesn’t work, try this; and so on. And then you can package that up into a little blob of code that can solve a lot of related problems or a lot of related tasks all at once.
And then you can have a proof where you say, you know, take care of the easy cases, tactic, and then anything that fails to solve, give it back to me. And then you do more work. And then let’s say you add three more easy cases. You don’t have to change your proof at all because the generated code gets updated, but not the code that does the generating.
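A tiny Lean sketch of that style (my example): combinators like first try alternatives in order, so the easy goals get dispatched mechanically.

```lean
-- `constructor` splits the conjunction into two goals;
-- `all_goals` then tries, on each goal, `rfl` first and
-- falls back to `simp` when `rfl` fails.
example (n : Nat) : n + 0 = n ∧ 0 + n = n := by
  constructor
  all_goals first
    | rfl    -- closes `n + 0 = n` by computation
    | simp   -- closes `0 + n = n` via the simp set
```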
Tactics go way back – actually to the first version of ML, the ancestor of Haskell, Standard ML, and OCaml. ML is short for “metalanguage”, and it was the metalanguage – which is to say, the tactic language – for Edinburgh LCF, which was Robin Milner’s proof system. And so, you know, ML has this nice exception mechanism, and that goes all the way back to wanting to recover from failures in tactics.
And it has this nice way of doing type abstraction, because type abstraction was used to protect the trusted code of the internals of the proof system from the untrusted user tactic code, which could do arbitrary, weird stuff. And as long as all the primitives were correct, it was okay.
Tactic languages come in two flavors. One is languages specifically designed to be tactic languages, like Ltac in Coq. The other is the use of the language itself within some monad that has tactic- or proving-related effects. This is the approach taken in Lean 3 and Lean 4, and it’s one way to understand the metaprogramming stuff I did for Idris 1 as well.
And Agda has a similar design these days too. So like, if you wanna write a tactic in Agda, you write it in Agda. If you want a tactic in Lean, you write it in Lean. I don’t know if they’ve reimplemented that stuff in Idris 2, but in Idris 1, you would definitely write it in Idris as well.
Jonn: The final question is about something I saw when I was playing with Lean. I noticed something almost like kinds – I’m not sure. I noticed types that are indexed with universes, but I also noticed something called Sort. Is it kind of like Prop? What is it?
David: Yeah. So the term “kind” isn’t used in something like Lean. There, we instead usually talk about types and the universes they inhabit, because the arrow type in Lean is actually just a type – you know, you can apply the identity function to Nat and get Nat back, and there the identity function has the type Type → Type. That’s a particular thing that you have in Lean.
So in Lean, you have a universe called Prop, and the types that inhabit the Prop universe are propositions. They have an interesting property, which is that they are considered proof-irrelevant. What that means is that any two proofs of the same proposition are considered equivalent by the Lean type checker. So there’s no way to write a program that checks which proof of this proposition it got.
And Coq has a very similar notion, with some slightly different logical principles to govern it. And Agda has got a notion of Prop in its very newest versions, which is also a bit like this, but with some different restrictions on it.
But you’ve got this universe called Prop – remember, a universe is like a type that describes other types. Then you’ve got the universe Type 0, which describes ordinary runtime data types: natural numbers and those sorts of things that don’t contain any types of their own. And you’ve got Type 1, which contains Type 0 plus all the other things, and then Type 2, which contains Type 1 plus all the other things.
And Sort is basically numbered the same way Type is, except Sort 0 is Prop and Sort 1 is Type 0. You can think of Type as being like a Haskell type synonym: Type u equals Sort (u + 1).
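You can check that layout directly (a quick sketch):

```lean
#check Prop      -- Prop : Type
#check Type 0    -- Type : Type 1
#check Type 1    -- Type 1 : Type 2

-- `Prop` and `Type u` are notations for `Sort`:
example : Prop = Sort 0 := rfl
example : Type 1 = Sort 2 := rfl
```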
We want to use polymorphism to talk about the set of universes that a type can inhabit, but because Prop works so differently from the other universes, we often want to make a thing that only lives in Prop, or that is not allowed to live in Prop.
So when you wanna make that distinction, the Prop/Type distinction is convenient. And if we really don’t care where it lives – if it can be in Prop or in Type – then we say Sort.
Haskell Foundation
Jonn: It was really, really nice to talk to you. And it was very insightful. If you have anything to advertise, please, the floor is yours.
David: Sure. I would be remiss if I didn’t advertise the Haskell Foundation, which is where I’m the Executive Director. So I’m gonna give you a little spiel about it. The Haskell Foundation is a nonprofit organization, and we’re trying to make life better for people using Haskell.
And that can mean a lot of different things. And if you think you have a good way to make Haskell better, please get in touch with me. If you want to help out with making Haskell better, get in touch with me. We’re a very young, very small foundation. It’s not like we have a big staff or anything. If you send an email, it’ll be me reading it.
But, you know, if you’re running into challenges with Haskell, or if there are aspects of using it in your life that you love and you want to find better ways to share that love, then let me know if you see opportunities for the community where you think we can be useful.
We don’t have a gigantic budget, but we do have a budget. And we can hopefully coordinate volunteers and make things better for all of us.
Jonn: That’s very nice. As a co-founder of Serokell, if you need anything in particular, you can always come to us and we’ll see what we can do.
David: Thanks. Yeah. Yeah. I think being a nonprofit here is really useful because it allows us to think strategically in ways that other organizations can’t, and allows us to sort of build up an institution over time that can take care of things that are harder to take care of only by volunteers.
Like, you know, we’ll probably never have the funding to just pay people to do everything and have employees for everything, but at least we can have the role of making sure that there are volunteers, and raising the alarm if there aren’t for certain key topics. And, you know, remembering documentation over time, and these sorts of things.
I think that, you know, it’s great that we have companies like Serokell who are making Haskell better. And I hope that we have good – I guess now I’m in management, I have to use the word synergy.
Jonn: Yeah. I mean, absolutely. As far as I understand, if some company like Serokell or Tweag or anyone else would ask, “what do you want to get done?”, you’d be happy?
David: Yes, absolutely – it already happens today, and we’re very happy. We have a number of sponsors who support us financially, which is what allows us to have me work on the foundation full time; we also have a contractor who’s working full time on making the CI better for GHC.
But also, you know, for people who want to do work on making tooling better and making libraries better: if we can help coordinate that so we cut down on duplicated effort, and maybe find connections that didn’t exist before or share experiences so that people don’t run into the same problems – if we can do any of that, then that’s excellent.
Thanks to David for being a guest on Functional Futures!
If you want to hear more from David, you can follow him on Twitter.
To hear more from us, subscribe to Functional Futures on YouTube or on your favourite podcasting platform.