DataKinds Are Not What You Think

The DataKinds language extension doesn’t work the way you think it does… probably. Of course I can’t possibly know what mental model you have for DataKinds. Perhaps you are one of the few who understand its true nature. But I’ve seen experts – yes, type-level Haskell wizards – fall into the trap that I’m about to expose.

How DataKinds are commonly explained

To set the stage, let’s take a look at how DataKinds are commonly explained.

Matt Parsons in his article “Basic Type Level Programming in Haskell” (2017) writes:

data Nat = Zero | Succ Nat

In plain Haskell, this definition introduces a new type Nat with two value constructors, Zero and Succ (which takes a value of type Nat). With the DataKinds extension, this also defines some extra new tidbits. We get a new kind Nat, which exists in a separate namespace. And we get two new types: a type constant 'Zero, which has the kind Nat, and a type constructor 'Succ, which accepts a type of kind Nat.

Sandy Maguire in his book “Thinking with Types” (2018) writes:

data Bool = False | True

… the data Bool definition above introduces the following things into scope (as usual):

  • A type constructor Bool of kind Type
  • A data constructor True of type Bool
  • A data constructor False of type Bool

However, when -XDataKinds is enabled, our definition of Bool also introduces the following into scope:

  • A new kind: Bool
  • A promoted data constructor 'True of kind Bool
  • A promoted data constructor 'False of kind Bool

Rebecca Skinner in her article “An Introduction to Type Level Programming” (2021) writes:

data ExampleColor = Red | Green | Blue

Here we’re creating a sum type with constructors Red, Green and Blue, but thanks to DataKinds we’re also creating a new Kind with types Red, Green and Blue.

The common theme in those explanations is that kinds are created from types, as if there was some duplication happening behind the scenes:

duplication of types from kinds

What’s wrong with this explanation?

Prior to GHC 8.0 (2016), this model used to work just fine because types and kinds were indeed entirely separate. But with the introduction of the type-in-type axiom, this picture became outdated and misleading. Here are a few lines of code that do not quite fit within this framework:

type family Sing :: k -> Type

class FromSing k where
  fromSing :: Sing (a :: k) -> k

At first glance, this does not even seem related to the picture above. And yet it reveals a fatal flaw in it. In the type signature for fromSing, is k a type or a kind?

fromSing :: Sing (a :: k) -> k
                   --  ↑     ↑
                   -- kind  type

Well, the first occurrence is a kind, and the second occurrence is a type! How can one variable be a type and a kind simultaneously? This paradox is resolved by the following mental shift: there are no kinds. You heard that right.

  • Pre-8.0: terms have types, types have kinds.
  • Post-8.0: terms have types, types have types.

We used to have terms, types, and kinds. Now we’re down to terms and types. Occasionally, however, the old terminology is still convenient. Saying “type of a type” every time is a bit awkward, so we may say “kind” instead. The important thing to keep in mind is that a kind isn’t a thing in itself; a type can be used as a kind.

So in the example above, k is a type variable. It’s used as a kind in Sing (a :: k), so we might also call it a kind variable. However, this is a statement about how it’s used, not about what it is.

This suggests the following modification to our understanding of DataKinds:

how modified understanding of datakinds looks

Compared to the previous model, we now have the type level and the term level, whereas kinds don’t get their own level and are merged into types. This is definitely an improvement, but it’s still incorrect in a subtle way. Since this interpretation of DataKinds is based on duplication, we end up with two separate Bools. But are they really distinct?

Building upon our definitions of Sing and FromSing, here are some instances for them:

data SBool (a :: Bool) where
  STrue :: SBool True
  SFalse :: SBool False

type instance Sing @Bool = SBool

instance FromSing Bool where
  fromSing :: Sing (a :: Bool) -> Bool
  fromSing STrue  = True
  fromSing SFalse = False

Haskell is no stranger to puns. I could declare data Proxy a = Proxy and have two proxies in scope: one is a data constructor and the other is a type constructor. So it wouldn’t be unreasonable to assume that the same thing is happening with Bool: perhaps one is a type and the other one is a kind, yet they are internally distinct. Let me mark type and kind occurrences with different numbers:

data SBool (a :: Bool1) where
  STrue :: SBool True
  SFalse :: SBool False

type instance Sing @Bool1 = SBool

instance FromSing Bool2 where
  fromSing :: Sing (a :: Bool1) -> Bool2
  fromSing STrue  = True
  fromSing SFalse = False

The hypothesis we need to test now is whether GHC thinks that Bool1 and Bool2 are actually distinct entities. The answer shall reveal itself if we compare the type signature of fromSing in the class declaration and in the instance:

fromSing :: Sing (a :: k) -> k
fromSing :: Sing (a :: Bool1) -> Bool2

We seem to instantiate k to Bool1 in one position and to Bool2 in another. So if we had Bool the type and Bool the kind, we would expect a type error of the following form:

• Couldn't match type 'Bool' with kind 'Bool'

… because GHC couldn’t decide what k stands for. And yet the code above compiles just fine, which leads us to the only possible conclusion: there’s just one Bool.

types and kinds are one!

One might wonder: are True and False actually duplicated if Bool isn’t? As long as terms and types are separated, we can’t really tell. A term-level variable can’t occur in a type-level context or vice versa, so we can’t ask the compiler if the term-level True and the type-level True are equal. There is currently no way to test this hypothesis. One could indeed argue that promoted data constructors are a mirror image of their term-level counterparts.

Dependent types, however, lead us to a more elegant model:

with dependent types, everything is an expression

For now this simplification is out of reach, but, personally, I find it to be a very attractive direction.

Conclusion

Enabling the DataKinds extension does not create new kinds. Instead, it allows existing types to be used as kinds. This objection isn’t based on vague philosophical grounds but has practical implications, as illustrated by a fairly basic use of singleton types that we covered.

Haskell courses by Serokell
More from Serokell
fintech functional programmingfintech functional programming
Learning Haskell: A Resource GuideLearning Haskell: A Resource Guide
Haskell in Production: CentralAppHaskell in Production: CentralApp