Research Programming Artificial Intelligence Interviews Other

Serokell’s Work on GHC: Dependent Types, Part 3

Article by Serokell GHC team

April 25th, 2024

9 min read

At Serokell, we are convinced that Haskell would greatly benefit from the addition of dependent types to its type system. We have a dedicated GHC team to make this happen. It currently consists of Vladislav Zavialov and Andrei Borzenkov.

In this report, we’d like to share our recent progress with you.

Summary

There are four topics that the report covers:

@-binders in lambdas and function equations
Namespace specifiers in fixity declarations and pragmas
Refactoring of exact-print annotations (EPA)
Pun-free syntax for list and tuple types

Here we go!

`@`-binders in lambdas and function equations

Let’s start with the most signifant achievement out of the four, namely @-binders in lambdas and function equations. This is a new feature that we implemented, and it looks like this:

{-# LANGUAGE TypeAbstractions #-}
f :: forall a. a -> a
f @a x = x :: a     -- @a-binder on the LHS of a function equation

Notice the @a on the LHS – that’s the new syntax that allows binding invisible type variables. In lambdas, it looks about the same: \ @a -> rhs. The name of the bound variable may or may not coincide in the signature and the equation, the following example is fully equivalent to the last one:

{-# LANGUAGE TypeAbstractions #-}
f :: forall a. a -> a
f @b x = x :: b        -- @b corresponds to forall a. in the signature

There are two reasons you might want to use @-binders.

The first use case is to replace ScopedTypeVariables. Consider this example written in the old style:

{-# LANGUAGE ScopedTypeVariables #-}
f :: forall a. a -> a
f x = x :: a

Notice how in x :: a on the RHS, the a comes into scope from the signature. This means that scoping is not contiguous, i.e. it can have gaps

{-# LANGUAGE ScopedTypeVariables #-}
f :: forall a. a -> a
g = 42   -- scope gap
f x = x :: a

The a is in scope in the signature, then not in scope in g = 42, then in scope again in f x = x :: a. Noncontiguous scope is unusual. Sometimes it is also unreliable, for example, with ScopedTypeVariables you’re not allowed to hide the forall behind a type synonym:

{-# LANGUAGE ScopedTypeVariables #-}
type T = forall a. a -> a
f :: T
f x = x :: a   -- oops! not working anymore

None of this is an issue with @-binders, which scope contiguously and interact harmoniously with other language features such as type synonyms.

The second use case for @-binders is binding type variables in higher-rank scenarios:

ex :: (Int8, Int16)
ex = higherRank (\ @a x -> maxBound @a - x )
                      -- @a-binder in a lambda pattern in an argument
                      -- to a higher-order function

higherRank :: (forall a. (Num a, Bounded a) => a -> a) -> (Int8, Int16)
higherRank f = (f 42, f 42)

In this example, we use an @a-binder in a lambda to bind a type variable. Then we pass the type variable a as a type argument to maxBound. If not for this new feature, we would have to rely on implicit binders in pattern signatures to bring such variables into scope:

\(x :: a) -> ...    -- old style with implicit binding

The problem with implicit binding is that it relies on unification, so if the variable is under a non-injective type constructor (a type family or a type synonym), there’s no way to bind it at all, leading API designers to introduce Proxy arguments (e.g. reify in reflection). Now it is a solved problem.

To understand the connection of @-binders to dependent types, remember that one of the proposed quantifiers in Dependent Haskell is foreach a. t, which is similar to forall a. t, except the argument is retained and passed at runtime. Retaining the argument enables dependent pattern matching. We wanted an elegant solution for binding invisible type arguments that would be appropriate for both of those quantifiers.

In their current form, @-binders can be used with the erased quantifier forall a. t. In the future, we can build upon this binding mechanism in the implementation of the retained invisible quantifier foreach a. t. It offers a much more solid foundation than ScopedTypeVariables, especially if we consider the potential interactions with pattern matching or nested quantification.

It is also quite satisfying that @-binders are a feature that is useful in and of itself, even if we put dependent types aside for a moment. This is thanks to the GHC proposal process, which ensures that each incremental change has a compelling motivation. In fact, @-binders have a long backstory, starting with GHC Proposal #155 (written in 2018, accepted in 2019). The proposal was somewhat underspecified, as our initial implementation attempts revealed. It turned out that we needed to modify the type checker to do skolemisation lazily. The paper “Seeking Stability by being Lazy and Shallow” (2021) added some clarity by formally specifying lazy skolemisaiton, though we later discovered new corner cases that needed to be addressed. GHC Proposal #448 (written 2021, accepted 2022) offered a more comprehensive spec, though this time, it was more of a package deal: there were some other features included in 448, so now we are talking about implementing a small section of it that corresponds to 155. After a few setbacks and about a year of inactivity, @-binders made it into GHC 9.10.

As always, adding a new feature means not only extending the language, but also updating the template-haskell library, the head.hackage compatibility layer, and documenting it in the User’s Guide (see the new section 6.4.17.2. Type Abstractions in Functions).

Namespace specifiers in fixity declarations and pragmas

The second item in our list is the implementation of GHC Proposal #65. Here is a code example:

data T = T
{-# DEPRECATED type T "Deprecation message" #-}

Notice how the name T is used both for a type constructor and a data constructor. But how could we deprecate only one of them? This is where namespace specifiers enter the picture: we can write type T or data T to refer to the type constructor or the data constructor respectively.

The same principle applies to fixity declarations:

f # a = f a

type f # a = f a

infix 1 type #
infix 2 data #

If there are both a term-level operator # and a type operator # in scope, we can write data # and type # respectvely to refer to them in a fixity declaration.

The reason we decided to implement namespace specifiers is that the lack of this feature was blocking progress on non-punning tuple constructors. We needed a way to mark the Solo data constuctor deprecated:

{-# DEPRECATED data Solo "The Solo constructor has been renamed to MkSolo to avoid punning." #-}
pattern Solo :: a -> Solo a
pattern Solo x = MkSolo x

More on that in the NoListTuplePuns section of this report (4th topic).

Compared to @-binders, namespace specifiers in pragmas felt like low-hanging fruit. It was quite clear what modifications to the compiler were necessary. The proposal was accepted back in 2017, and was laying around waiting for someone to come along and submit a patch. Our work on dependent types has finally forced that thunk.

Refactoring of exact-print annotations (EPA)

Our third topic is a refactoring of exact-print annotations. This is mostly an internal change, affecting only GHC developers and GHC API users.

The point is this: adding support for dependent types requires various syntactic additions and changes to the Haskell AST (e.g. extending the HsExpr type with new constructor for “types-in-terms”), so we better have a clear vision of how the AST should be structured. However, there is an ongoing effort (mostly driven by Alan Zimmerman) to extend GHC with capabilities for automatic refactoring. Part of this effort is the addition of exact-print annotations, allowing GHC to faithfully pretty-print a parsed program. The problem is that this too requires extensive modification to the AST.

So instead of stepping on each other’s toes, we discussed the design in #23447 “Where should tokens live in the abstract syntax tree?”", found a satisfying solution and collaborated on its implementation. There is still work to do, but now it is clear where we are heading and how new AST constructors should be structured.

This unblocks the work on those parts of Dependent Haskell that require AST modification, and, hopefully, it means that the new features will work correctly with exact-print.

Pun-free syntax for list and tuple types

Let us conclude the report with a feature that was designed and implemented in a spontaneous collaboration with Torsten Schmits (Tweag).

Originally, the design for pun-free syntax for list and tuple types was a part of GHC Proposal #281 “Visible forall”. You might be familiar with it from our previous reports where we talked about visible dependent quantification forall a -> t. However, the size and complexity of the specification quickly spiraled out of control, so non-punning syntax was factored out into a separate proposal, namely GHC Proposal #475 “Non-punning list and tuple syntax” (written 2021, accepted 2022). See the proposal for the motivation and specification.

The main payload of the proposal is to rename [] to List, (,) to Tuple<n>, and introduce the ListTuplePuns extension for the legacy syntax. Our team at Serokell started the implementation in small incremental steps, e.g. commits 02279a9c37 and a13affce1a. We were going to slowly chip away at it, but we were focusing on other tasks and the progress was slow. Thankfully, Torsten took the initiative and implemented the majority of the proposed changes in commit d91d00fccf.

There is one remaining issue with type inference that prevents us from defining Tuple and Constraints type families, and at the moment, it is not clear how to address it. Enthusiasts can already switch to non-punning syntax by specifyig NoListTuplePuns in GHC 9.10, though the majority of developers might want to wait until Tuple and Constraints are available.

Non-punning syntax is a positive change in its own right, as the proposal argues. Consider, for example:

single :: [Int]
single = [5]

just :: Maybe Int
just = Maybe 5

Can you spot the mistake? It is somehow valid to use brackets [] both for the list type and its value, yet the same can’t be said about Maybe or many other data types. This irregular design has been a source of confusion for newcomers, so we find it preferable to always use distinct syntax for types and terms:

single :: List Int
single = [5]

just :: Maybe Int
just = Just 5

As we move towards dependent types, the problem becomes more prominent. Extensions such as DataKinds and RequiredTypeArguments already allow terms to occur in types and vice versa, e.g.

f :: T (Just 10)  -- promoted term (Just 10) in a type
x = g (Maybe Int) -- type (Maybe Int) in a term

The eventual goal is to unify term and type syntax, and this means we want to avoid ambiguous/punned constructor names, e.g. [a] (type of a list) vs [a] (list of one element). The introduction of NoListTuplePuns is a big step towards this goal.

Features in GHC 9.10

We are proud to say that our GHC team continues to design and implement new Haskell features, both through our own efforts and in collaboration with other contributors, in the true spirit of open-source development. In particular, we would like to highlight two features shipped with GHC 9.10: TypeAbstractions and RequiredTypeArguments.

TypeAbstractions is a language extension that enables @-binders in type declarations, function equations, and other contexts. User’s Guide section 6.4.17. Type abstractions.
RequiredTypeArguments is a language extension that enables visible forall forall a -> t in types of terms. User’s Guide section 6.4.18. Required type arguments.

As with all new features, expect some rough edges, but we look forward to feedback from early adopters.

Previous updates on dependent types

Stay tuned

This was the third blog post of our series “Work on GHC: Dependent Types”. We are commited to our vision of dependently-typed programming in Haskell, so stay tuned for future updates!

tagged:

GHC Haskell

18 upvotes

Get new articles via email

No spam – you'll only receive stuff we’d like to read ourselves.

Serokell’s Work on GHC: Dependent Types, Part 3