Why Dependent Haskell is the Future of Software Development

Monday, December 17th, 2018

Hi, I’m Vladislav Zavialov. For the last couple of months, my job at Serokell has been to lead our efforts to improve our primary tool – the Glasgow Haskell Compiler (GHC). I’m focusing on making the type system more powerful, inspired by Richard Eisenberg’s thesis that introduces Dependent Haskell.

The Current State of Affairs

Dependent types are the feature I’d most love to see in Haskell. Let’s talk about why. There’s a tension between writing code that is performant, code that is maintainable and easy to understand, and code that is correct by construction. With the available technology, we are in a “pick two” situation, but with direct support for dependent types in Haskell we might have our cake and eat it too.

Standard Haskell: Ergonomics + Performance

Haskell, at its core, is simple: it is just a polymorphic lambda calculus with lazy evaluation plus algebraic data types and type classes. This happens to be just the right combination of features to allow us to write clean, maintainable code that also runs fast. I will briefly compare it to more mainstream languages to substantiate this claim.

Memory-unsafe languages, such as C, lead to the worst sorts of bugs and security vulnerabilities (buffer overflows, use-after-free errors). There are specific circumstances in which one would want to use such a language, but most of the time it’s a horrible idea.

Memory safe languages form two groups: the ones that rely on a garbage collector, and Rust. Rust seems to be unique in that it offers memory safety without a GC (there’s also the now unmaintained Cyclone and other research languages in this group, but unlike them, Rust is paving its way towards mainstream use). The downside is that, while safe, memory management in Rust is still manual and non-trivial, and in applications that can afford to use a garbage collector, developer time is better spent on other issues.

This leaves us with garbage collected languages, which we’ll further break into two categories based on their type systems.

Dynamically typed (or, rather, unityped) languages, such as JavaScript or Clojure, forgo static analysis by design and therefore cannot offer the same level of confidence in code correctness (and no, tests cannot replace types – we need both!).

Statically typed languages, such as Java or Go, often come with severely limited type systems that force programmers to write boilerplate or fall back to unsafe programming techniques. A notable example is how the lack of generics in Go necessitates the use of interface{} and runtime casts. There’s also no separation between effectful (IO) and pure computations.

Finally, among memory-safe garbage-collected languages with powerful type systems, Haskell stands out by being lazy. Laziness is an important asset for writing composable, modular code, making it possible to factor out any parts of expressions, including control structures.
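As a minimal sketch of that point (the `myIf` name is mine, purely illustrative): because arguments are evaluated lazily, a branching construct can be factored out into an ordinary function, and the untaken branch is never evaluated.

```haskell
-- Laziness lets us define control structures as plain functions:
-- only the branch that is actually selected gets evaluated.
myIf :: Bool -> a -> a -> a
myIf True  t _ = t
myIf False _ e = e

main :: IO ()
main = print (myIf True (1 :: Int) (error "never evaluated"))
```

In a strict language, both branches would be evaluated before the call, and the `error` would fire.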

It almost seems to be the perfect language until you realize how far it is from reaching its full potential in terms of static verification when compared to proof assistants such as Agda.

As a simple example of where the Haskell type system falls short, consider the list indexing operator from Prelude (or array indexing from the primitive package):

(!!) :: [a] -> Int -> a
indexArray :: Array a -> Int -> a

Nothing in these type signatures asserts that the index is non-negative but smaller than the length of the collection. For high assurance software, this is immediately a no-go.
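To make the failure mode concrete, here is a small sketch (the error-handling scaffolding is mine): an out-of-bounds index typechecks just fine and blows up only at runtime.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- The expression below is accepted by the type checker even though
-- the index 42 is out of bounds; the mistake surfaces only at runtime.
main :: IO ()
main = do
  r <- try (evaluate (["x", "y", "z"] !! 42))
         :: IO (Either SomeException String)
  case r of
    Left _  -> putStrLn "runtime error: index too large"
    Right v -> putStrLn v
```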

Agda: Ergonomics + Correctness

Proof assistants (for example, Coq) are software tools that enable computer-assisted development of formal proofs of mathematical theorems. To a mathematician, using a proof assistant is a lot like writing a pen-and-paper proof, yet with the unprecedented rigour required by the computer to assert the correctness of such a proof.

To a programmer, however, a proof assistant isn’t that different from a compiler for an esoteric programming language that has an incredible type system (and perhaps an IDE, then it’s called an interactive theorem prover), and mediocre (or even lacking) everything else. A proof assistant is what happens if you spend all your time developing a type checker for your language and forget that programs also need to be run.

The holy grail of verified software development is a proof assistant that is also a good programming language with an industrial-strength code generator and runtime system. There are experiments in this direction, for instance Idris, but Idris is a language with eager (strict) evaluation and its implementation isn’t particularly stable at the moment.

The proof assistant closest to a Haskeller’s heart is Agda, as in many ways it feels a lot like Haskell with a more powerful type system. We use it to prove various properties of our software, and my colleague Danya Rogozin wrote a post series about it.

Here’s the type of the lookup function analogous to Haskell’s (!!):

lookup : ∀ (xs : List A) → Fin (length xs) → A

The first parameter here is of type List A which corresponds to [a] in Haskell. However, note that we also give it a name, xs, to refer to it later in the type signature. In Haskell, we can refer to function arguments only in the function body at the term level:

(!!) :: [a] -> Int -> a -- can't refer to xs here
(!!) = \xs i -> ...     -- can refer to xs here

In Agda, however, we can refer to this xs value at the type level as well, as we do in the second parameter of lookup, Fin (length xs). A function whose type refers to the value of its parameter is called a dependent function – one example of dependent types.

The second parameter of lookup is of type Fin n for n ~ length xs. A value of type Fin n corresponds to a number in the [0, n) range, so Fin (length xs) is a non-negative number smaller than the length of the input list, which is precisely what we need to represent a valid index into the list. Roughly speaking, this means that lookup ["x","y","z"] 2 is well-typed, but lookup ["x","y","z"] 42 is rejected by the type checker.

When it comes to running Agda, we can compile it to Haskell using the MAlonzo backend, and yet the produced code cannot offer competitive performance. This is no fault of MAlonzo, however – it has no choice but to insert numerous unsafeCoerce calls so that GHC will accept code that has already been checked by the Agda type checker, and those very unsafeCoerce calls are what destroy the performance.

This puts us in a difficult situation where we have to use Agda for modelling and formal verification, and then reimplement the same functionality in Haskell. With this setup, our Agda code serves as a machine-checked specification, which is, while an improvement over a natural language specification, not even close to the end-goal of “if it compiles, it works as specified”.

Haskell Extended: Correctness + Performance

Striving for the static guarantees of dependently typed languages, GHC has a long history of adding extensions that increase the expressive power of the type system. I started using Haskell when GHC 7.4 was the newest and shiniest Haskell compiler, and by then it already had the main extensions for advanced type-level programming in place: RankNTypes, GADTs, TypeFamilies, DataKinds, and PolyKinds.

And yet, we still don’t have proper dependent types in Haskell: no dependent functions (Π-types) or dependent pairs (Σ-types). We do have an encoding for them, though!

The state of the art is to encode functions on the type level as closed type families, use defunctionalization to allow unsaturated function application, and bridge the gap between terms and types using singleton types. This leads to a sizeable amount of boilerplate, but there’s the singletons library to generate it with Template Haskell.

This means that for the very brave and determined, there is an encoding of dependent types in today’s Haskell. As a proof of concept, here’s an implementation of the lookup function analogous to the Agda version:

{-# OPTIONS -Wall -Wno-unticked-promoted-constructors -Wno-missing-signatures #-}
{-# LANGUAGE LambdaCase, DataKinds, PolyKinds, TypeFamilies, GADTs,
             ScopedTypeVariables, EmptyCase, UndecidableInstances,
             TypeSynonymInstances, FlexibleInstances, TypeApplications,
             TemplateHaskell #-}

module ListLookup where

import Data.Singletons.TH
import Data.Singletons.Prelude

singletons
  [d|
    data N = Z | S N
    len :: [a] -> N
    len [] = Z
    len (_:xs) = S (len xs)
  |]

data Fin n where
  FZ :: Fin (S n)
  FS :: Fin n -> Fin (S n)

lookupS :: SingKind a => SList (xs :: [a]) -> Fin (Len xs) -> Demote a
lookupS SNil = \case{}
lookupS (SCons x xs) =
  \case
    FZ -> fromSing x
    FS i' -> lookupS xs i'

Here’s a GHCi session to demonstrate that lookupS indeed rejects indices that are too large:

GHCi, version 8.6.2: http://www.haskell.org/ghc/  :? for help
[1 of 1] Compiling ListLookup       ( ListLookup.hs, interpreted )
Ok, one module loaded.
*ListLookup> :set -XTypeApplications -XDataKinds 
*ListLookup> lookupS (sing @["x", "y", "z"]) FZ
"x"
*ListLookup> lookupS (sing @["x", "y", "z"]) (FS FZ)
"y"
*ListLookup> lookupS (sing @["x", "y", "z"]) (FS (FS FZ))
"z"
*ListLookup> lookupS (sing @["x", "y", "z"]) (FS (FS (FS FZ)))

<interactive>:5:34: error:
    • Couldn't match type ‘S n0’ with ‘Z’
      Expected type: Fin (Len '["x", "y", "z"])
        Actual type: Fin ('S ('S ('S ('S n0))))
    • In the second argument of ‘lookupS’, namely ‘(FS (FS (FS FZ)))’
      In the expression:
        lookupS (sing @["x", "y", "z"]) (FS (FS (FS FZ)))
      In an equation for ‘it’:
          it = lookupS (sing @["x", "y", "z"]) (FS (FS (FS FZ)))

This example demonstrates that “possible” does not mean “feasible”. I am glad that Haskell has the features needed to implement lookupS, but I’m also distressed by the accidental complexity it brings. I would never advise using this style of code in a non-research project.

In this particular example, we could’ve used length-indexed vectors to achieve a similar result at lesser complexity. However, the direct translation of the Agda code serves better to reveal the issues that we have to deal with in other circumstances. Here are some of them:

  • The typing relation a :: t and the kinding relation t :: k are different. 5 :: Integer holds in terms, but not in types. "hi" :: Symbol holds in types, but not in terms. This leads to the need for the Demote type family to map type-level encodings to their term-level counterparts.

  • The standard library uses Int to represent list indices (and singletons uses Nat in the promoted definitions). Int and Nat are non-inductive types, and while more efficient than the unary encoding of naturals, they don’t play well with inductive definitions such as Fin or lookupS. Thus we had to redefine length as len.

  • Functions are not promoted out of the box. singletons encodes them as closed type families, using defunctionalization to work around the limitation that type families may not be partially applied. This is a complex encoding, and we have to put our definition of len into a Template Haskell quote so that singletons will generate its type-level counterpart, Len.

  • Due to lack of built-in dependent functions, we have to use singleton types to bridge the gap between terms and types. This means we are using SList as input to lookupS instead of a regular list. This results not only in mental overhead of having multiple definitions for lists, but also in runtime overhead of doing the conversions back and forth (toSing, fromSing) and carrying evidence for the conversion procedure (the SingKind constraint).
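For contrast, here is a sketch of the length-indexed-vector alternative mentioned above. It avoids singletons entirely, at the cost of replacing ordinary lists with a dedicated Vec type:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data N = Z | S N

-- A list whose length is tracked in its type.
data Vec (n :: N) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

data Fin (n :: N) where
  FZ :: Fin ('S n)
  FS :: Fin n -> Fin ('S n)

-- Total lookup: a 'Fin n' cannot point past the end of a 'Vec n a',
-- so no equation for VNil is needed (Fin 'Z is uninhabited).
vlookup :: Vec n a -> Fin n -> a
vlookup (VCons x _)  FZ     = x
vlookup (VCons _ xs) (FS i) = vlookup xs i
```

Here vlookup (VCons "x" (VCons "y" VNil)) (FS FZ) evaluates to "y", while the index FS (FS FZ) is rejected by the type checker.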

Bad ergonomics are only a part of the problem. The other part is that none of this works reliably. For example, I reported Trac #12564 back in 2016, and there’s Trac #12088 from around the same time – both pose significant trouble when it comes to things more advanced than textbook examples such as list lookup. These GHC bugs are still not fixed, and the reason, I believe, is that GHC developers simply don’t have enough time, being busy with other issues. The number of people actively working on GHC is surprisingly small, and some areas are destined to be left unattended.

Summary

I said earlier that we’re in a “pick two” situation, so here’s a table to show what I mean by that:

|                                 | Standard Haskell | Agda | Haskell Extended |
| ------------------------------- | ---------------- | ---- | ---------------- |
| Runtime performance             | ✔️               |      | ✔️               |
| Correctness by construction     |                  | ✔️   | ✔️               |
| Ergonomics and maintainability  | ✔️               | ✔️   |                  |

The Bright Future

Out of the three options we have, each has a flaw. However, we can fix them:

  • We can take Standard Haskell and extend it with dependent types directly instead of the awkward encoding via singletons. (Easier said than done).
  • We can take Agda and implement an efficient code generator and RTS for it. (Easier said than done).
  • We can take Haskell Extended, fix its bugs and continue adding new extensions to make common programming patterns more ergonomic. (Easier said than done).

The good news is that all three options seem to converge at a single point (sort of). Imagine the tidiest possible extension to Standard Haskell that adds dependent types and, consequently, the ability to write code correct by construction. Agda could be compiled (transpiled) to this language without unsafeCoerce. And Haskell Extended is, in a sense, an unfinished prototype of that language – we need to enhance some things and deprecate others, but eventually we’ll arrive at the desired destination.

Demolition of singletons

A good indication of progress is the removal of special cases and workarounds from singletons. Bit by bit, we will make this package redundant. For instance, in 2016 I made use of the -XTypeInType extension to remove KProxy from SingKind and SomeSing. This change was possible due to an important milestone – unification of types and kinds. Compare the old and the new definitions:

class (kparam ~ 'KProxy) => SingKind (kparam :: KProxy k) where
  type DemoteRep kparam :: *
  fromSing :: Sing (a :: k) -> DemoteRep kparam
  toSing :: DemoteRep kparam -> SomeSing kparam

type Demote (a :: k) = DemoteRep ('KProxy :: KProxy k)

data SomeSing (kproxy :: KProxy k) where
  SomeSing :: Sing (a :: k) -> SomeSing ('KProxy :: KProxy k)

In the old definitions, k occurs in kind positions exclusively, on the right hand side of kind annotations t :: k. We use kparam :: KProxy k to carry k in types.

class SingKind k where
  type DemoteRep k :: *
  fromSing :: Sing (a :: k) -> DemoteRep k
  toSing :: DemoteRep k -> SomeSing k

type Demote (a :: k) = DemoteRep k

data SomeSing k where
  SomeSing :: Sing (a :: k) -> SomeSing k

In the new definitions, k travels freely between kind and type positions, so we don’t need the kind proxy anymore. The reason is that since GHC 8.0, types and kinds are the same syntactic category.

In Standard Haskell, we have three completely separate worlds: terms, types, and kinds. If we look at the GHC 7.10 sources, we can see a dedicated parser for kinds, and a dedicated checker. In GHC 8.0, these are gone: types and kinds share the same parser and checker.

In Haskell Extended, being a kind is just a role that a type fulfils:

f :: T z -> ...               -- 'z' is a type
g :: T (a :: z) -> ...        -- 'z' is a kind
h :: T z -> T (a :: z) -> ... -- 'z' is both

Unfortunately, that’s only 97% true. In GHC 8.0–8.4 there are still differences in renaming and scoping rules for types in type and kind positions. These remnants of the past are being removed, though. In GHC 8.6, I unified the renamer for types and kinds by adding the StarIsType extension and merging the functionality of TypeInType into PolyKinds. There’s an accepted proposal to unify the scoping rules, but its implementation is scheduled for GHC 8.10 to comply with the 3 Release Policy. I’m looking forward to this change and I added a warning to GHC 8.6, -Wimplicit-kind-vars, that you can use to write forward-compatible code today.
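As a hypothetical illustration (the name f is mine, assuming GHC 8.6 with -Wimplicit-kind-vars enabled): the warning fires when a signature mentions a kind variable without quantifying it, and quantifying it explicitly makes the code forward-compatible with the new scoping rules.

```haskell
{-# LANGUAGE PolyKinds, RankNTypes, KindSignatures #-}

import Data.Proxy (Proxy (..))

-- Warned about by -Wimplicit-kind-vars: 'k' is used but not quantified.
-- f :: forall a. Proxy (a :: k) -> ()

-- Forward-compatible version: quantify the kind variable explicitly.
f :: forall k (a :: k). Proxy a -> ()
f _ = ()
```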

Fast-forward a couple of GHC releases, when it will be true that kinds are types, no strings attached. What’s the next step? Let’s take a brief look at SingKind in the latest version of singletons:

class SingKind k where
  type Demote k = (r :: Type) | r -> k
  fromSing :: Sing (a :: k) -> Demote k
  toSing   :: Demote k -> SomeSing k

The Demote type family is needed to account for the discrepancy between the typing relation a :: t and the kinding relation t :: k. Most of the time (for ADTs), Demote is the identity function:

  • type Demote Bool = Bool
  • type Demote [a] = [Demote a]
  • type Demote (Either a b) = Either (Demote a) (Demote b)

Therefore, Demote (Either [Bool] Bool) = Either [Bool] Bool. This observation prompts us to make the following simplification:

class SingKind k where
  fromSing :: Sing (a :: k) -> k
  toSing   :: k -> SomeSing k

Look, ma, no Demote! And, in fact, that’d work for the Either [Bool] Bool case, and for other ADTs. In practice, though, there are other types besides ADTs: Integer, Natural, Char, Text, and so on. They are not inhabited when used as kinds: 1 :: Natural is true in terms, but not in types. That’s why we have these unfortunate lines:

type Demote Nat = Natural
type Demote Symbol = Text
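A quick sketch of what that means in practice: a literal like 3 at the type level has kind Nat, and natVal is the bridge that demotes it to an ordinary term-level number.

```haskell
{-# LANGUAGE DataKinds, TypeApplications #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (natVal)
import Numeric.Natural (Natural)

-- The '3' in 'Proxy @3' lives at the type level (kind 'Nat');
-- 'natVal' demotes it to a term-level Integer, which we then
-- convert to the term-level type 'Natural'.
three :: Natural
three = fromIntegral (natVal (Proxy @3))
```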

The solution to this issue is to promote all primitive types. For example, Text is defined as:

-- | A space efficient, packed, unboxed Unicode text type.
data Text = Text
    {-# UNPACK #-} !Array -- payload (Word16 elements)
    {-# UNPACK #-} !Int   -- offset (units of Word16, not Char)
    {-# UNPACK #-} !Int   -- length (units of Word16, not Char)

data Array = Array ByteArray#
data Int = I# Int#

If we properly promote ByteArray# and Int#, then we can use Text instead of Symbol. Do the same thing with Natural and perhaps a couple of other types, and we can get rid of Demote – right?

Not really. I’ve been ignoring the elephant in the room: functions. They, too, have a special Demote instance:

type Demote (k1 ~> k2) = Demote k1 -> Demote k2

type a ~> b = TyFun a b -> Type
data TyFun :: Type -> Type -> Type

The ~> type is the representation that singletons uses for type-level functions, based on closed type families and defunctionalization.

The initial instinct could be to unify ~> with ->, as both stand for the type (kind) of functions. The issue here is that (->) in a type position and (->) in a kind position mean different things. In terms, all functions from a to b have type a -> b. In types, on the other hand, only constructors from a to b have kind a -> b, but not type synonyms or type families. For reasons of type inference, GHC assumes that f a ~ g b implies f ~ g and a ~ b, and this holds for constructors, but not functions, hence the restriction.
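A small sketch of why that assumption is safe for constructors but not for type functions (the names F and unwrap are mine):

```haskell
{-# LANGUAGE GADTs, TypeFamilies #-}

-- A type-level function that is not injective: two different
-- arguments map to the same result.
type family F a
type instance F Int  = Bool
type instance F Char = Bool

-- Constructors are injective, so GHC may decompose the constraint
-- 'Maybe a ~ Maybe b' into 'a ~ b', which makes 'id' typecheck here.
unwrap :: Maybe a ~ Maybe b => a -> b
unwrap = id

-- The same decomposition for F would be unsound: 'F Int ~ F Char'
-- holds (both equal Bool), yet 'Int ~ Char' certainly does not.
```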

Therefore, in order to promote functions but keep type inference, we will have to move constructors into their own type, let’s call it a :-> b, which does indeed have the property that f a ~ g b implies f ~ g and a ~ b, and all other functions will continue to be of type a -> b. For example, Just :: a :-> Maybe a, but isJust :: Maybe a -> Bool.

Once Demote is dealt with, the last mile is to remove the need for Sing itself. For this, we will need a new quantifier, a hybrid between forall and ->. Let’s take a close look at the isJust function:

isJust :: forall a. Maybe a -> Bool
isJust =
  \x ->
    case x of
      Nothing -> False
      Just _  -> True

The isJust function is parametrized by a type a, and then a value x :: Maybe a. These two parameters exhibit different properties:

  • Visibility. When we call isJust (Just "hello"), we pass x = Just "hello" visibly, but a = String is inferred by the compiler. In modern Haskell, we can also use a visibility override and pass both parameters visibly: isJust @String (Just "hello").

  • Relevance. The value that we pass as input to isJust will be indeed passed at runtime, as we need to match on it using case to see if it’s Nothing or Just. It is thus considered relevant. On the other hand, the type itself is erased and cannot be matched on: the function will treat Maybe Int, Maybe String, Maybe Bool, etc, uniformly – therefore, it is considered irrelevant. This property is also called parametricity.

  • Dependency. In forall a. t, the type t may mention a and, therefore, depend on the concrete a that is passed. For example, isJust @String has type Maybe String -> Bool, but isJust @Int has type Maybe Int -> Bool. This means that forall is a dependent quantifier. Contrast that with the value parameter: it doesn’t matter if we call isJust Nothing or isJust (Just ...), the result is always Bool. Therefore, -> is not a dependent quantifier.

To subsume Sing, we will need a quantifier that is visible and relevant, just like a -> b, but dependent, like forall (a :: k). t. We will write it as foreach (a :: k) -> t. To subsume SingI, we shall also introduce an invisible relevant dependent quantifier, foreach (a :: k). t. At this point, we no longer need singletons, as we have just introduced dependent functions to the language.

A Glimpse of Dependent Haskell

With promotion of functions and the foreach quantifier, we will be able to rewrite lookupS as follows:

data N = Z | S N

len :: [a] -> N
len [] = Z
len (_:xs) = S (len xs)

data Fin n where
  FZ :: Fin (S n)
  FS :: Fin n -> Fin (S n)

lookupS :: foreach (xs :: [a]) -> Fin (len xs) -> a
lookupS [] = \case{}
lookupS (x:xs) =
  \case
    FZ -> x
    FS i' -> lookupS xs i'

This code isn’t any shorter – after all, singletons does a good job of hiding the boilerplate. However, it is much cleaner: there are no Demote, SingKind, SList, SNil, SCons, fromSing. No use of TemplateHaskell, as we now can use the len function directly instead of generating type family Len. The runtime performance will be better, too, as we no longer need the fromSing conversion.

We still had to redefine length as len to return the inductively defined N instead of Int. It is probably best to declare this issue out of scope for Dependent Haskell, as Agda uses an inductively defined N for its lookup function too.

There are ways in which Dependent Haskell is simpler than Standard Haskell. After all, it merges the worlds of terms, types, and kinds into a single, uniform language. I can easily imagine using this style of code in a production codebase to certify crucial components of the application. Many libraries will be able to offer safer APIs without the complexity cost of singletons.

Getting there will not be easy. There are many engineering challenges ahead of us in all of GHC components: the parser, the renamer, the typechecker, and even the Core language, all require non-trivial modification or even a redesign. Lately, my focus has been on parsing, and I’m going to publish a fairly large diff in the coming weeks.

Stay tuned for future posts where I will dive into technical details of the problems that I’m solving as part of my work on GHC at Serokell to make dependently typed programming practical!

Comments, discussions, and more on Reddit and Hacker News.