In our Haskell in Production series, we interview developers and technical leaders from companies that use Haskell for real-world tasks. We cover benefits, downsides, common pitfalls, and tips for building useful Haskell products.
This time, we have quite a special guest – Simon Marlow from Meta. He’s one of the co-authors of the Glasgow Haskell Compiler (GHC) and the author of Parallel and Concurrent Programming in Haskell. Currently, he’s working at Meta on Glean, a system for collecting, deriving and querying facts about source code.
Read further to learn where Meta uses Haskell and what Simon thinks about the language and its future adoption opportunities.
Interview with Simon Marlow
Hi, Simon! What’s your role at Meta, and what are you currently working on?
Hi! I’m an engineer in the Code Search and Indexing team at Meta, and I’m working on Glean, our system for indexing and querying information about source code.
Where is Haskell used at Meta?
Are Meta’s open-source Haskell projects – Haxl and Glean – used by other teams at Meta? If so, could you say where?
There are various teams using Sigma to fight different kinds of abuse on Facebook. Glean is used as a backend for various tools and systems that developers use in their day-to-day work: things like code browsers, IDEs, code search, analysis and documentation tools.
Was Sigma the first significant Haskell project at Meta? If so, how did the team go about introducing Haskell and convincing people in charge that it’s a good choice?
Yes, Sigma was the first significant Haskell project to be used in production at Meta. It was really a case of being in the right place at the right time! When I joined Meta in 2013 (it was still called Facebook at the time), Sigma already existed and was continuously identifying and removing vast amounts of spam and other kinds of abuse from Facebook. At the time, Sigma was based on a custom domain-specific language (DSL) called FXL, essentially a small purely-functional expression-based language whose key features were (1) automatic memoization, (2) efficient parallel data-fetching, and (3) hot code-swapping. The limitations of a custom language were beginning to be a bottleneck for its users though: the tooling was very limited for things like debugging, development and profiling. There were no libraries, facilities for abstraction in the language were almost non-existent, and performance was not great. Any improvements to FXL had to be made by the handful of people working on Sigma, who were already stretched running a platform of that kind at the scale needed.
When I suggested replacing FXL with Haskell, it seemed to be a potential solution to many of these problems. Haskell’s existing tooling was much more mature, there were plenty of libraries to use, and as a more powerful language Haskell would enable building more complex logic than was possible with FXL. Moreover, as a compiled language, Haskell should perform a lot better than a custom interpreter.
Still, we had to convince ourselves that it was actually going to work, so we identified the key pieces of technology we would have to demonstrate for Haskell to be a viable solution. The idea was to “fail fast”: if it wasn’t going to work, we wanted to know sooner rather than later. So we prototyped implementations of parallel data-fetching and memoization (which became Haxl), hot code-swapping (which we also open-sourced), and some benchmarks to validate performance. When these all worked, we set about building out all the functionality we would need to replace FXL and actually doing the migration, all while FXL was still being heavily used. The whole process took around 2 years before we had fully switched over to using Haskell.
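To make "parallel data-fetching and memoization" more concrete, here is a minimal sketch of the general idea in Haskell. It is not Haxl's actual API: `Fetch`, `fetchUser`, `runFetch` and the list-based cache are hypothetical names invented for illustration, and the "backend" is a stub that just records which keys it was asked for. The sketch shows how an `Applicative` instance can merge independent requests into one batch, while the runner skips keys it has already fetched.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)
import Data.List (nub)
import Data.Maybe (isNothing)

-- Hypothetical key and cache types, kept deliberately simple.
type Key = Int
type Cache = [(Key, String)]

-- A computation is either finished, or blocked on some keys together with
-- a continuation that resumes once the cache contains them.
data Fetch a
  = Done a
  | Blocked [Key] (Cache -> Fetch a)

instance Functor Fetch where
  fmap f (Done a) = Done (f a)
  fmap f (Blocked keys cont) = Blocked keys (fmap f . cont)

instance Applicative Fetch where
  pure = Done
  Done f <*> x = fmap f x
  Blocked keys cont <*> Done a =
    Blocked keys (\cache -> cont cache <*> Done a)
  -- Both sides are blocked: merge their requests into a single round.
  Blocked ks1 c1 <*> Blocked ks2 c2 =
    Blocked (ks1 ++ ks2) (\cache -> c1 cache <*> c2 cache)

instance Monad Fetch where
  Done a >>= f = f a
  Blocked keys cont >>= f = Blocked keys (\cache -> cont cache >>= f)

-- Request one value; the runner guarantees the key is cached on resume.
fetchUser :: Key -> Fetch String
fetchUser key =
  Blocked [key] (\cache -> maybe (error "key not fetched") Done (lookup key cache))

-- Run a computation round by round. Each round asks the backend only for
-- keys that are not cached yet (memoization), all in one call (batching).
runFetch :: ([Key] -> IO Cache) -> Fetch a -> IO a
runFetch backend = go []
  where
    go _ (Done a) = pure a
    go cache (Blocked keys cont) = do
      let missing = nub [k | k <- keys, isNothing (lookup k cache)]
      fetched <- if null missing then pure [] else backend missing
      let cache' = fetched ++ cache
      go cache' (cont cache')

example :: Fetch (String, String, String)
example = do
  (a, b) <- (,) <$> fetchUser 1 <*> fetchUser 2 -- independent: one batch
  c <- fetchUser 1                              -- already cached
  pure (a, b, c)

main :: IO ()
main = do
  logRef <- newIORef []
  -- Stub backend (an assumption): records each batch and fabricates values.
  let backend ks = do
        modifyIORef logRef (ks :)
        pure [(k, "user" ++ show k) | k <- ks]
  result <- runFetch backend example
  batches <- reverse <$> readIORef logRef
  print result
  print batches
```

In this sketch the stub backend is called once, with the keys `[1,2]`; the later `fetchUser 1` is served from the cache. Note that `<*>` deliberately does more than the monadic bind would (it batches), which is the trick that lets ordinary-looking code fetch data in parallel.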
Does Haskell have any benefits (relevant to the projects discussed) that can’t be replicated by other, more commonly used programming languages?
It’s difficult to find things that you can’t do in other languages, because typically everything is possible, it’s just a question of whether your language helps or makes it harder. But let me call out a few areas where Haskell has particularly helped us:
- Building EDSLs. For a number of reasons Haskell’s feature set makes building EDSLs quite satisfying. You can hide a lot of implementation details and provide a clean API to the EDSL author. This was key in our Haxl deployment, where code authors are writing code in Haskell but using a very stylised set of APIs and programming within a single Monad.
- Concurrency. Of course, most languages have concurrency of one kind or another, but Haskell’s concurrency is particularly easy to use. Not having to deal with explicit async-style concurrency is a big win, and high-level APIs like Control.Concurrent.Async are a joy to use. Multiple times we’ve found that adding parallelism to a program can be done with a one-line change.
- Asynchronous exceptions. I honestly don’t know how I would protect our servers against very large requests without the help of asynchronous exceptions. I wrote a blog post about it. Admittedly it can be a pain to write exception-safe IO code sometimes and there are a bunch of pitfalls that the compiler doesn’t help you find, but I don’t know of a better solution.
- Hiring people for Haskell teams is never a problem. There always seems to be a queue of passionate functional programmers lining up whenever we have spare headcount on our Haskell teams.
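The asynchronous-exceptions point can be illustrated with a minimal sketch. It uses `System.Timeout.timeout` from base, which cancels an action by throwing an asynchronous exception into its thread. This is a stand-in chosen for illustration, not a description of how Meta's servers limit large requests. `handleRequest` and `withBudget` are hypothetical names, and the `finally` clause shows the kind of exception-safe cleanup that has to be written by hand.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (finally)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.Timeout (timeout)

-- Hypothetical handler: simulates work by sleeping for the given number of
-- microseconds. The `finally` clause runs whether the handler finishes
-- normally or is interrupted by an asynchronous exception.
handleRequest :: IORef Bool -> Int -> IO String
handleRequest cleanedUp micros =
  (threadDelay micros >> pure "done")
    `finally` writeIORef cleanedUp True

-- Hypothetical wrapper: give an action a time budget in microseconds.
-- `timeout` returns Nothing if the budget runs out before the action ends.
withBudget :: Int -> IO a -> IO (Maybe a)
withBudget = timeout

main :: IO ()
main = do
  fastFlag <- newIORef False
  slowFlag <- newIORef False
  -- Finishes well within its budget.
  fast <- withBudget 1000000 (handleRequest fastFlag 1000)
  -- Exceeds its budget and is interrupted from outside.
  slow <- withBudget 50000 (handleRequest slowFlag 5000000)
  slowCleaned <- readIORef slowFlag
  print fast
  print slow
  print slowCleaned
```

Here the first request returns `Just "done"`, the second is cut off and yields `Nothing`, and the cleanup flag for the interrupted request is still set to `True`. The handler itself contains no cancellation checks: the interruption arrives from outside, which is what makes this style attractive for bounding the cost of a request.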
Are there any significant downsides of Haskell that you encountered while working on these projects?
Here are a few:
- Upgrading the toolchain and libraries is a huge overhead. It takes us about a man-year to upgrade GHC and the libraries at Meta. Given that we don’t have a dedicated team maintaining our Haskell language infrastructure, this is difficult to justify.
- As with any language at a big company, you have to build integrations with vast amounts of standard libraries and infrastructure to get anything done, and the surface area that you have to work with expands all the time. For example: integration with the build system, RPC system (Thrift at Meta), logging, testing, deployment, and so on.
- Debugging and profiling Haskell in production can be hard, mainly due to the lack of good runtime stack traces and integration with standard tools like perf.
Before working at Meta, you were one of the co-developers of GHC (Glasgow Haskell Compiler). How much of the success of these projects would you attribute to having people as experienced as you on Meta’s team? Could it be reproduced by people who don’t have such in-depth Haskell-specific knowledge?
For our project we really did need some deep expertise because we had to develop new technology (hot code-swapping at a large scale, for example), and we were breaking new ground in terms of the scale of a production Haskell deployment. But things have progressed in the 10 years since we started that project, and certainly if you’re following the well-trodden paths (deploying a web application, for example), things will be much smoother and require less specialist expertise.
What direction do you see for Haskell in big enterprises like Meta or Google in the next 10 years?
It’s tough for smaller languages in big enterprises. There is a huge mountain to climb just to be able to use a new language in a mature big-company environment due to the sheer number of existing systems, libraries and tools that your language has to integrate with. For example, to be able to deploy Haskell code in production at Meta we need Haskell to work with the build system, to interact with our RPC layers (Thrift), service discovery, logging, configuration, deployment systems, and so on. All of those integrations have to be built and maintained over time. Smaller companies will often be using widely-used open-source tooling and libraries for which integrations already exist, but in a big-company environment there is a lot more custom infrastructure. Moreover, the surface area that you need to integrate with tends to expand over time. All of this makes it hard to use a language that isn’t one of the main supported languages.
Smaller languages have had success in areas that don’t require integration with the full set of infrastructure. For example, OCaml is used in a few places at Meta (the Hack, Flow and Pyre typecheckers, for example), without having as deep an integration with the infrastructure as Haskell, because compiler/typechecker tools tend to be more standalone and don’t need the full range of production integrations.
Is there any other language that occupies the same niche as Haskell but does or could do it better?
There’s probably still room for a language that has the combination of strong typing, powerful abstraction and garbage-collection in the typical language portfolio. Rust doesn’t quite fill that niche because users complain about having to manage memory explicitly, which for many applications is more control than they want or need.
From your experience, what are the main blockers for Haskell adoption in large companies?
There are the barriers I mentioned earlier in terms of integration with all of the established tools, systems and infrastructure. Beyond that, the blockers are social and cultural. In my experience having a Haskell team tends to be beneficial from a staffing standpoint: the team is never short of passionate applicants to fill any open headcount. On the other hand, deciding to use Haskell in the first place is a risky proposition. Is the team going to have to support its own tooling? Are we going to have to devote a lot of time to upgrading the compiler and libraries? Don’t we have a smaller pool of people to draw from if we need to urgently replace team members? Is there a longer learning curve for new people joining the team? Do we have people with deep expertise in the compiler and tools if we have an urgent problem that needs to be solved?
In practice at Meta these haven’t been blockers, but they’re definitely considerations that you would be taking into account if you were in a leadership position making a choice about whether to use Haskell. Of course it’s easy to focus on the potential problems and forget to weigh those up against the benefits: in Sigma for example the benefit of Haskell was that we got a language that provided automatic parallelism and batching, good performance, safety, isolation, hot code-swapping, and powerful abstraction facilities amongst other benefits.
Are there any tips you’d give to people that want to introduce Haskell in their own company?
Have a clear understanding of the benefits that Haskell will bring for your project, especially if it’s the first usage of Haskell at the company, because you’ll be taking all the upfront costs of using a new language. You’ll want some passionate individuals to drive things, because without that it’s not likely to succeed.
Standardising things for our internal codebase has been useful. At Meta we have a standard set of extensions and libraries that are enabled everywhere and a standard coding style (consistency is more important than your personal preferences). That consistency is maintained by a mixture of tooling (warnings and HLint) and code review, something that I think is crucial to a healthy engineering culture whatever language you’re working in. The important thing with code review is not to treat it as a hurdle or a barrier (“I just have to get this past code review…”) but as an opportunity to build shared understanding of the code, to propagate best practices and to learn from each other. The code reviewer’s job is to help the author improve their code and get it ready for inclusion into the codebase.
It’s worth thinking about the balance of using powerful language features against the cost of lengthening the learning curve and making the codebase less accessible. It doesn’t have to be in the form of hard and fast rules (“these are the only extensions we allow…”) but it can be in the form of gentle push-back during code review (“do we really need to use lenses here or would ordinary records be OK?”). Sometimes those powerful language features really are warranted, but let’s make sure it’s worth it. Find the right balance for your team/codebase.
Understand the integration costs and ongoing maintenance burdens that you’re committing to. Using Haskell for a system that you’ll be deploying in production probably has more integration costs than a tool that you’ll be using at development time, or for standalone analysis or report generation, for example.
Big thanks to Simon for finding the time to chat with us! 💜
If you want to hear Simon talk more about his work at Meta, you can watch his talk from Haskell eXchange 2022.