Haskell in Production: FOSSA

In our Haskell in Production series, we interview developers and technical leaders from companies that use Haskell for real-world tasks. We cover benefits, downsides, common pitfalls, and tips for building useful Haskell products.

Today’s guest is Eliza Zhang, who works on the engineering team at FOSSA, a tool for open-source risk management.

In the interview, we talk about her experience using Haskell in FOSSA’s backend analysis services and CLI. Read further to learn about the benefits and downsides of Haskell she encountered while working on the project, as well as what tips she would give to teams starting Haskell projects like hers.

Interview with FOSSA

Could you give our readers a brief introduction to FOSSA and your role there?

When engineering teams use open source, it exposes them to legal (“is this a license like GPL v3 or AGPL?”), security (“does this have a vulnerability?”), and code quality (“is this still maintained?”) risks. At many large companies, this manifests in other functions (security, compliance, etc.) becoming gatekeepers in the engineering release process. At its worst, it becomes a 2-week fire drill that happens every time a major release gets done, as other teams pester engineers to figure out what dependencies are being used, which are actually okay to use, and how to remove the bad ones.

For large companies with hundreds of teams and thousands of services, each using different languages and build tools, this is an enormous effort. Today, it’s solved through the sheer willpower of spreadsheets, paperwork, tickets, and process.

FOSSA builds a tool that integrates with existing compiler, build, and CI systems to automatically determine, track, and analyze your dependencies. It works across all of your languages and all of your projects and integrates with zero configuration. It also provides a bunch of useful tools on top, aggregating risk information (like licensing and vulnerabilities), providing a policy engine to figure out which dependencies are compliant, providing integrations (like a GitHub PR check and an IDE integration) so developers get warned about non-compliant dependencies early, and automatically generating required reporting to help your team stay compliant. FOSSA automates away the checklist of extra compliance issues that engineering teams have to deal with when they use open source, so that engineers can spend their time actually building stuff.

I joined FOSSA in 2017 as an early engineer, and have played various roles in engineering since then. I currently lead our Platform team, working on improving the performance, stability, and correctness of our underlying data and analysis systems.

Where in your stack do you use Haskell?

You can roughly divide FOSSA into three pieces: a CLI that our customers use to analyze their builds, a set of backend build analysis services, and a web application that ties it all together.

Our web application is written in Node.js, our CLI is written in Haskell, and our backend analysis services are a bit of a mix. Our older services are in Node.js, and our newer services are in Rust, Haskell, and Go.

How did you decide to choose Haskell for the project?

FOSSA CLI v2 was sort of a perfect candidate for a Haskell project. We had built out the first version of the CLI in Go, and writing and testing parsers in Go turned out to be a real pain. When we began working on the second version of the CLI, we decided to give Haskell a shot because we had a team that had both interest and experience, and the domain of build analysis is very Haskell-friendly (it turns out to be very compiler-like: parse a bunch of data, think really hard, and then output a bunch of data).

Our new backend analysis service uses the CLI’s build analysis code as a library, and its team was staffed with folks pulled from the original CLI team, so picking Haskell was a quick decision there.

Are there any specific qualities of Haskell that make it well suited for the particular use case?

Even without a specific use case, Haskell is a pretty good general-purpose language. Static types and effect tracking make for a really good refactoring and testing story. The library ecosystem has most of what you need. You could get away with writing everything in the IO monad (just write JavaScript in Haskell), and you’d still have a language that feels imperative but has much better jump-to-definition support and static safety guarantees.

For us, the first use case that made us seriously consider adoption was parsing. We write a lot of parsers in the CLI, and a lot of those parsers have really complicated interactions with compilers and build tools. Our CLI also gets shipped on-premises, so we don’t have the freedom to do constant incremental updates; each release we make has to be rock solid.

Haskell was a particularly good fit for this use case because of its great parsing libraries and great testing story. Megaparsec allowed us to build parsers with very high confidence in their correctness, and effect tracking and our effect system made testing these parsers at various levels (unit, integration, end-to-end) really simple. Effect tracking also enabled us to build a great debugging feature called “replay logging” that allows us to replay an analysis run from debug logs, which makes debugging customer issues much easier.
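To make the testing point concrete, here is a minimal sketch of how tracking an effect in the types lets the same analysis code run against a real filesystem in production and an in-memory one in tests. It uses only `base` rather than FOSSA's actual effect system, and every name in it (`MonadReadFile`, `FakeFS`, `listDependencies`, the `name==version` file format) is a hypothetical chosen for illustration.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- Hypothetical effect: the only thing analysis code may do is read a file.
class Monad m => MonadReadFile m where
  readFileText :: FilePath -> m String

-- Production interpreter: really touches the filesystem.
instance MonadReadFile IO where
  readFileText = readFile

-- Test interpreter: a pure function of an in-memory "filesystem".
newtype FakeFS a = FakeFS { runFakeFS :: [(FilePath, String)] -> a }

instance Functor FakeFS where
  fmap f (FakeFS g) = FakeFS (f . g)

instance Applicative FakeFS where
  pure x = FakeFS (const x)
  FakeFS f <*> FakeFS g = FakeFS (\fs -> f fs (g fs))

instance Monad FakeFS where
  FakeFS g >>= k = FakeFS (\fs -> runFakeFS (k (g fs)) fs)

-- Missing files read as empty in this sketch.
instance MonadReadFile FakeFS where
  readFileText path = FakeFS (fromMaybe "" . lookup path)

-- Analysis code is written against the effect, not against IO.
-- Assumed format: one "name==version" entry per line; other lines are skipped.
listDependencies :: MonadReadFile m => FilePath -> m [(String, String)]
listDependencies path = do
  contents <- readFileText path
  pure
    [ (name, drop 2 rest)
    | line <- lines contents
    , let (name, rest) = break (== '=') line
    , take 2 rest == "=="
    ]

main :: IO ()
main =
  -- Runs the analysis with no real I/O at all.
  print
    ( runFakeFS
        (listDependencies "requirements.txt")
        [("requirements.txt", "aeson==2.1.0\nmegaparsec==9.3.0\n")]
    )
```

Because `listDependencies` only asks for `MonadReadFile`, the type checker guarantees it cannot perform any other I/O, which is what makes swapping in the pure interpreter safe.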

When people think about building startups, Haskell is usually not the first language they have in mind. Could you talk a little about your experience building a startup with Haskell?

Unless you have mission-critical constraints around performance (e.g. if you’re building a database, use a fast language) or domain (e.g. if you’re doing data science, use Python), language choice actually doesn’t matter that much. Pick the language you’re most comfortable in and can build fastest in. As long as your language has a reasonably mature ecosystem for what you want to do, almost any choice is reasonable.

And Haskell is already there! There are good web frameworks, good blog post explainers, good RPC libraries, and a good OAuth2 implementation. Once you have a decent OAuth2 implementation and an AWS API library, you know you’re in pretty good shape for a SaaS product.

Much more important than the language itself is the team that you’re working with, and their experience and interest. If your team has Haskell experience, including people who have written Haskell in anger, adopting Haskell will be much easier. Having someone who can decipher Cabal’s cryptic error messages makes a night-and-day difference when debugging. If you don’t have at least one experienced (or at least very passionate) Haskell developer, moving fast and delivering with Haskell will be hard. The self-teaching learning curve is quite steep.

We picked up Haskell primarily because it was a good fit for the interest and experience of the people on our team. Starting with two experienced Haskell programmers made a big difference in our ability to onboard the rest of the team. As a happy side effect, we got to reap the benefits of Haskell: our Haskell code is easier to understand, easier to debug, and easier to test than our other codebases.

The specific choice of language turns out to not be super important to building a startup. The important thing is to use whatever your team moves fastest with, because that’s what lets you best support the business. For us, that happened to be Haskell.

Did you run into any downsides of Haskell while developing the project? If so, could you describe those?

The Haskell developer experience still has a lot of rough edges.

  • The build tools aren’t as mature as those of other languages. Cabal still has occasional bugs (for example, using branch tags for git dependencies doesn’t always work), and is generally unintuitive (for example, there’s no built-in command for adding dependencies or upgrading their versions). M1 Mac support in general is also lacking, and we’ve had to figure out how to build Cabal from source in certain cases.
  • The developer tools aren’t as mature as those of other languages. Haskell Language Server is amazing, but still crashes on a regular basis on some codebases, and struggles with certain things (such as handling Template Haskell, renaming exported symbols, or jumping to definitions in the source code of third-party dependencies).
  • The library ecosystem isn’t as extensive as those of other languages. There’s less of a culture of documentation (popular libraries often have amazing docs, but some packages just link to a paper), certain needs don’t have a standard answer yet (there is no Sidekiq or Celery for Haskell), and some important libraries are unmaintained (connection hasn’t seen an update in 3 years despite open issues and pull requests). We found ourselves forking and upstreaming patches to dependencies more than in other languages, although still not a huge amount.
  • The blogging and article ecosystem isn’t quite there for beginners. The literature for intermediate users is actually quite good, but the quality of blog posts for beginners is relatively low. There are lots of monad tutorials, but very few of them are good, and the good ones are hard to find.

All of these issues are annoying, but not fatal. The instructions we needed to rebuild Cabal for M1s were readily Google-able. We’ve only had to patch a handful (4 or 5?) of our dependencies. We’ve been able to mitigate the lack of beginner articles by teaching in-house. But it’s still annoying that these are not available out of the box, and it often leaves a bad first impression on new adopters.

Could you list some Haskell libraries that your team found very useful while developing FOSSA and that you would like to feature?

We use most of the usual great Haskell libraries, as well as a couple of unusual ones.

Some of the usual popular libraries we use include:

  • megaparsec (parsing)
  • aeson (JSON, and JSON-ish serialization and deserialization)
  • path-io (type-safe filesystem path handling)
  • optparse-applicative (for CLI flag parsing)
  • servant (web server framework)
  • hspec (testing framework)
  • req (HTTP client)
  • conduit (streaming framework)

Some unusual choices that have turned out great include:

  • algebraic-graphs, which is an unusual graph data structure library that makes it really easy for us to stitch dependency graphs together.
  • fused-effects, an effects system that we were concerned would be overkill and difficult to teach but turned out to be surprisingly approachable. We’ve also written our own Simple effect carrier for easily defining first-order effects.
  • rel8, a relatively new Postgres library that has astoundingly good ergonomics and is much nicer than any other database library or ORM that I’ve ever seen.
  • Blammo, a really nice logging wrapper from the folks at Freckle.
  • faktory, a really nice client library for working with our Faktory job queue, also from the folks at Freckle. We’ve upstreamed a few patches for this, and still have a couple of major changes that we’re figuring out how to upstream.
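As a rough illustration of why an algebraic representation makes stitching dependency graphs easy, the sketch below reimplements a tiny graph type using only `base`, modeled on the four constructors of the `Graph` type in algebraic-graphs (`Empty`, `Vertex`, `Overlay`, `Connect`). It is not the library itself and not FOSSA's code; the `dependsOn`, `vertices`, and `edges` helpers and the package names are hypothetical.

```haskell
module Main where

import Data.List (nub, sort)

-- A tiny algebraic graph: graphs are built by combining smaller graphs.
data Graph a
  = Empty
  | Vertex a
  | Overlay (Graph a) (Graph a) -- union of vertices and edges
  | Connect (Graph a) (Graph a) -- union, plus an edge from every left vertex to every right vertex

-- Sorted, de-duplicated list of vertices.
vertices :: Ord a => Graph a -> [a]
vertices Empty = []
vertices (Vertex x) = [x]
vertices (Overlay l r) = sort (nub (vertices l ++ vertices r))
vertices (Connect l r) = sort (nub (vertices l ++ vertices r))

-- Sorted, de-duplicated list of edges.
edges :: Ord a => Graph a -> [(a, a)]
edges Empty = []
edges (Vertex _) = []
edges (Overlay l r) = sort (nub (edges l ++ edges r))
edges (Connect l r) =
  sort (nub (edges l ++ edges r ++ [(x, y) | x <- vertices l, y <- vertices r]))

-- Hypothetical helper: a package with edges to each of its direct dependencies.
dependsOn :: a -> [a] -> Graph a
dependsOn pkg deps = Connect (Vertex pkg) (foldr (Overlay . Vertex) Empty deps)

main :: IO ()
main = do
  -- Two graphs discovered separately (say, from two different manifests).
  let direct = "app" `dependsOn` ["express", "lodash"]
      transitive = "express" `dependsOn` ["lodash", "qs"]
      -- Stitching them together is a single Overlay; shared vertices merge.
      whole = Overlay direct transitive
  print (vertices whole)
  print (edges whole)
```

The point of the design is that partial graphs from different sources never need explicit node-ID bookkeeping: overlaying them merges shared vertices by value.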

In addition to our upstreamed patches, we’re also open sourcing pieces of our own system that we’ve found useful:

  • cgroup-rts-threads (for setting concurrency within containers).
  • Some other yet-to-be-released libraries (including a database migration tool and a resource pool with retry semantics).

Has your team run into any hiring difficulties when hiring Haskell developers?

This was a serious concern when we began adopting Haskell. Surprisingly, it turned out to not be a big deal! I think a couple of factors played into this.

First, there’s broad interest in Haskell as a technology. We were concerned that Haskell developers would be hard to find. In fact, we found that Haskell developers were quite starved for interesting job opportunities! We wound up often getting inbound interest because we were using Haskell. It helps that engineers interested in Haskell are often also interested in our company’s domain (compilers, build systems, and programming language analysis), and vice versa.

Second, we’ve had an easier-than-expected experience teaching Haskell to new engineers who are interested in Haskell but don’t have any experience. In fact, several of the engineers on our Haskell teams transferred over from teams writing in other languages and learned Haskell. We’ve even had candidates join one of our other teams writing a language they’re more experienced with, with the intention to learn Haskell-in-production from folks on our team.

Do you have any in-house training programs to teach or upskill Haskell developers?

While we don’t have any formal training programs, we have had a pretty good experience onboarding engineers to Haskell. In our experience, the time for a working programmer to go from zero to production-ready code review in Haskell is about 6 weeks. It’s actually not that much longer than most other languages. Haskell just feels scarier because the zero-to-side-project time (which is a couple of weeks) is much longer than in other languages (e.g. zero-to-side-project in Go is about half an hour).

I think the main factor in our success is the fact that we started with experienced Haskell engineers on our team who could help teach everybody else. We use a buddy system for onboarding new engineers. Having someone experienced to pair with makes an enormous difference in learning, since they can help interactively debug weirdness with your tools, and can help teach through concrete examples. It also really helps smooth out the rough edges of the developer experience. Is this weird type error because of a thing you’re fundamentally misunderstanding, or because GHC is just inferring something weird because you misplaced a parenthesis? This sort of question is much easier to answer with an experienced programmer.

What tips would you give to people that want to use Haskell in a startup-like environment?

The most important thing in a startup-like environment is velocity. You need to move fast, and you need to be able to change direction fast.

Having higher velocity lets you iterate faster, which helps you support the business better. Nothing is more important than this goal. Even when we talk about the importance of maintainability, what we’re really talking about is our ability to sustain high velocity in the face of changing business directives.

Focus on velocity. If you’re interested in adopting Haskell, build out a proof-of-concept and evaluate how your team’s velocity with Haskell feels. As you write code, pick language features (especially pragmas) that are easier to debug and refactor, and easier for new engineers on your team to understand. As you learn more about your problem domain, proactively revisit and redesign old primitives to better align them with the domain. Every team will make different style and design choices based on their individual preferences, but these choices should all come down to one question: “which approach helps our specific team move the fastest?” Smaller and more experienced teams may want to use more advanced features, while larger or less experienced teams may move fastest by sticking to the 80% solutions.

We found that we achieved good velocity by reducing the number of knobs in the language (by fixing our language pragmas and GHC flags across the codebase), using simpler design patterns for effects (focusing on testability and ease of writing application code rather than thoroughness in modeling effects), and refactoring more aggressively than is usual in other languages (because refactoring is cheaper and safer with Haskell). We also revisit these decisions whenever we feel like they’re dragging on our velocity to see if we can do better.

The exact practices that improve velocity the most will vary from team to team, and within the same team over time. The important thing is to not lose sight of velocity and business value as the goal.

Hope you enjoyed our interview with Eliza!

For more interviews with companies that use Haskell to solve real-world problems, check out our Haskell in Production series. Also, be sure to follow us on Twitter or subscribe to our mailing list to receive new Serokell articles via email.
