
Haskell in Production: Channable

Article by Gints Dreimanis

June 28th, 2022

9 min read

In our Haskell in Production series, we interview developers from companies that use Haskell for real-world tasks. We cover benefits, downsides, common pitfalls, and tips for new Haskellers.

Today’s guest is Fabian Thorand, a team lead at Channable. Read on to learn where Channable uses Haskell, why they chose it, and what they like and don’t like about it.

Interview with Fabian Thorand

Could you tell us a little bit about Channable and your role there?

Channable offers a channel management platform through which webshops can publish their products (or other offers like job postings or vacation deals) to a wide variety of marketplaces, ad platforms, or price comparison sites. The product information can be tweaked through user-defined rules following a simple “IF … THEN … (ELSE …)” structure (e.g. reducing prices during a sales period or removing products with missing information before sending them to a third-party platform).

I joined Channable about 4.5 years ago as a backend developer in the infrastructure team, where we work on the core applications powering the Channable tool. About a year ago, the team was split into two subteams due to our continuous growth, and I became the team lead of one of them.

Where in your stack do you use Haskell?

We use Haskell for a variety of backend services. The biggest one by far is our data processing system, which powers the import from our customers, manages the data storage, and applies the user-defined rules to the data before streaming it to other components in our backend which handle the actual connections to the third-party platforms.

Another big project is our job-scheduling system for running a set of jobs with dependencies between them in the right order on a cluster of worker machines (think of “downloading data from the client system” or “exporting products to a third-party system”).
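To illustrate what "running jobs with dependencies in the right order" means, here is a minimal sketch that uses Data.Graph from the containers package. It is not Channable's scheduler: the job names and the runOrder helper are hypothetical, and it only computes an order, without distributing work across machines.

```haskell
import Data.Graph (graphFromEdges, topSort)

-- Hypothetical jobs; each entry lists the jobs it depends on.
jobs :: [(String, [String])]
jobs =
  [ ("export-products", ["apply-rules"])
  , ("apply-rules",     ["download-data"])
  , ("download-data",   [])
  ]

-- Order jobs so that every job comes after all of its dependencies.
-- Edges point from a job to its dependencies, so the topological
-- order is reversed. Assumes the dependency graph has no cycles.
runOrder :: [(String, [String])] -> [String]
runOrder js =
  reverse [ name | v <- topSort graph, let (name, _, _) = fromVertex v ]
  where
    (graph, fromVertex, _) =
      graphFromEdges [ (name, name, deps) | (name, deps) <- js ]

main :: IO ()
main = mapM_ putStrLn (runOrder jobs)
```

For the chain above this prints download-data, apply-rules, and export-products, in that order.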

The third major part of our infrastructure using Haskell is an API gateway that handles routing (to our various backend services) and provides a common implementation for authentication, authorization, rate-limiting, etc. so that the backend services don’t have to concern themselves with that.

Last but not least, there are a few smaller utilities that we have open-sourced, such as vaultenv (fetching secrets from Hashicorp Vault and providing them via environment variables to a program) and icepeak (a JSON document store with support for push notifications via websockets).

Why did you decide to choose Haskell?

Haskell was first introduced as an experiment rewriting a component that was hitting the limits of what was possible in Python and the existing architecture (the full story is on our tech blog).

We needed a language that would complement Python well: one that compiles to fast code and provides safety rails through a strong type system while still allowing high productivity. Given that we also already had several people enthusiastic about functional programming, Haskell was the natural choice.

Since that project turned out to be very successful (and is still in use today – being continually developed since then), Haskell was also chosen for another greenfield project that was started shortly after, our API gateway.

When I joined Channable, Haskell was thus already in use for these two projects. At the time, we were also running into scaling and operational troubles with our existing feed processing application (written in Scala using Apache Spark), and so Haskell was chosen once more for its replacement. More details about that rewrite can be found here.

Are there any specific qualities of Haskell that made you decide in its favor?

The main selling point is Haskell’s strong type system. It eliminates many kinds of runtime errors that we regularly see crop up in our Python code base (though that has gotten better since we started using types in Python as well, via mypy). Additionally, it makes refactoring existing code a breeze: one can be sure that almost all required changes are caught by the compiler.

Another advantage is the focus on immutable (and persistent) data structures, allowing for cleaner designs and better testability and concurrency.

An underappreciated advantage of Haskell is its great runtime system. It makes it very easy to add both concurrency and parallelism to your programs using lightweight threads, something that is usually a lot harder to do in other programming languages if you didn’t design for it from the start.
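As a small illustration of lightweight threads, the sketch below uses only forkIO and MVar from base. The runAll helper is a hypothetical name, and it deliberately omits exception handling; the async package provides more robust versions of this pattern.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Run each action in its own lightweight thread and collect the
-- results in input order. Sketch only: if an action throws, its
-- MVar is never filled and the caller does not get a result.
runAll :: [IO a] -> IO [a]
runAll actions = do
  vars <- forM actions $ \act -> do
    var <- newEmptyMVar
    _ <- forkIO (act >>= putMVar var)
    pure var
  mapM takeMVar vars

main :: IO ()
main = do
  -- ($!) forces each sum inside its own thread.
  sums <- runAll [ pure $! sum [1 .. n] | n <- [1000, 2000, 3000 :: Int] ]
  print sums
```

To actually run the threads in parallel on several cores, the program has to be compiled with -threaded and started with +RTS -N.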

Are there any libraries that you found very useful in the development process and would like to feature?

First of all, there is conduit which we use for stream processing. In our experience it has been both fast and easy to use. The only downside is that code using it is quite prone to space leaks. Well-typed has a very informative blog post on that topic.
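For readers who have not seen conduit, a minimal pipeline looks roughly like this. The example assumes the conduit package and its Conduit module; the function name and the numbers are made up and merely stand in for a real product feed.

```haskell
import Conduit

-- Hypothetical pipeline: stream the numbers 1..n, keep the even ones,
-- halve them, and sum the result without building intermediate lists.
halvedEvenTotal :: Int -> Int
halvedEvenTotal n =
  runConduitPure $
       yieldMany [1 .. n]
    .| filterC even
    .| mapC (`div` 2)
    .| sumC

main :: IO ()
main = print (halvedEvenTotal 10)
```

Each stage is connected with the .| operator, and elements flow through one at a time, which is what makes the style attractive for large feeds.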

Besides that, we use the excellent warp server for both internal and external HTTP-based interfaces. In some projects, we used it directly, in others through an additional layer like scotty or servant.

And last but not least, there is ghc-compact, which has helped us tremendously to get good performance even when working with large datasets. Though it might not really qualify as a library since it’s shipped with GHC and just wraps a feature of the GHC runtime system.
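A minimal sketch of the GHC.Compact API from ghc-compact is shown below. The lookup table is a made-up example, not Channable's data model; the point is only that a fully evaluated structure is copied into a compact region, which the garbage collector then treats as a single object instead of traversing it.

```haskell
import GHC.Compact (compact, compactSize, getCompact)
import qualified Data.Map.Strict as Map

-- Hypothetical large lookup table.
buildTable :: Int -> Map.Map Int String
buildTable n = Map.fromList [ (i, show i) | i <- [1 .. n] ]

main :: IO ()
main = do
  -- compact fully evaluates the value and copies it into a new region.
  region <- compact (buildTable 100000)
  -- getCompact returns an ordinary value backed by the region.
  let table = getCompact region
  size <- compactSize region
  print (Map.lookup 42 table)
  putStrLn ("compact region size in bytes: " ++ show size)
```

Only plain, immutable data can be compacted; values containing functions or mutable references are rejected at runtime.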

What kind of effect system do you use: RIO, mtl, fused-effects, Polysemy, or something else?

Some of the older projects use mtl-style MonadXYZ classes and corresponding deep monad transformer stacks. Those turned out to be quite unwieldy to work with for various reasons.

Firstly, for M monad (transformer) types in your stack and N classes, you need about N * M instance declarations, which adds a lot of noise to the codebase. Additionally, deep monad stacks hurt performance quite a bit, so whenever we have performance critical code, we just use plain IO.

A secondary issue is that some monad transformer combinations look tempting at first glance, but turn out to be a bad idea in practice, like the (in)famous “ExceptT over IO”: not only is it incurring a performance penalty due to the repeated packing and unpacking of the inner Either, but it also makes the code more complex instead of simpler, as there are now two failure paths to consider (with different handling functions), since IO can always potentially throw an exception.
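The "two failure paths" problem can be seen in a small sketch. It uses ExceptT from the transformers package; loadConfig, classify, and the file path are hypothetical names chosen for this illustration.

```haskell
import Control.Exception (IOException, try)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)

-- Hypothetical loader. Its type advertises String errors only, but
-- readFile can still throw an IOException that bypasses ExceptT.
loadConfig :: FilePath -> ExceptT String IO String
loadConfig path = do
  contents <- liftIO (readFile path)
  if null contents then throwE "empty config" else pure contents

-- A caller that wants to handle every failure needs both handlers.
classify :: FilePath -> IO String
classify path = do
  outcome <- try (runExceptT (loadConfig path))
  pure $ case outcome of
    Left ioErr        -> "exception path: " ++ show (ioErr :: IOException)
    Right (Left msg)  -> "ExceptT path: " ++ msg
    Right (Right cfg) -> "loaded " ++ show (length cfg) ++ " characters"

main :: IO ()
main = classify "/nonexistent/example.conf" >>= putStrLn
```

For a missing file the result comes from the exception branch, even though the type of loadConfig suggests that all errors are reported as Left values.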

For those reasons, our newer projects usually use some form of ReaderT IO (or just IO-functions with explicit function arguments for dependencies). Occasionally, we also use free monads in places where there is a well-defined “language” of operations that we might want to mock in tests.
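A rough sketch of the ReaderT-over-IO style follows, using ReaderT from the transformers package. The Env record, its fields, and handleRequest are hypothetical and not taken from Channable's code.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical environment holding all dependencies of the application.
data Env = Env
  { envServiceName  :: String
  , envRequestCount :: IORef Int
  }

-- A single transformer layer over IO instead of a deep stack.
type App = ReaderT Env IO

handleRequest :: String -> App String
handleRequest path = do
  name    <- asks envServiceName
  counter <- asks envRequestCount
  -- Mutable state lives in an IORef inside the environment.
  liftIO (modifyIORef' counter (+ 1))
  pure (name ++ " handled " ++ path)

main :: IO ()
main = do
  counter <- newIORef 0
  replies <- runReaderT (mapM handleRequest ["/a", "/b"]) (Env "gateway" counter)
  mapM_ putStrLn replies
  readIORef counter >>= print
```

Errors are reported through ordinary IO exceptions in this style, so there is only one failure path to handle.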

Did you run into any downsides of Haskell while developing the project? If so, could you describe those?

One big downside of Haskell that is noticeable in the daily development workflow is the lack of tooling. Fortunately, at least the “IDE” side of it improved quite a bit with the haskell-language-server project, which was a game-changer in terms of development convenience.

The remaining areas that are still lacking compared to other languages are debugging (Debug.Trace-debugging is often the only viable approach) and profiling. The latter probably deserves some explanation as GHC does come with built-in profiling capabilities. Unfortunately, those are very intrusive and sometimes drastically change the performance and allocation behavior of the generated code. External profilers that don’t require special builds (except for debugging info) like perf don’t work with Haskell code. On the upside, we did have good experiences with the eventlog-based profiling, which has a lower overhead than a full-blown profiling build.
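For reference, Debug.Trace-style debugging amounts to wrapping an expression so that a message is written to stderr when the expression is evaluated. The function below is a made-up example.

```haskell
import Debug.Trace (trace)

-- Hypothetical pure function instrumented with trace. The message is
-- emitted on stderr at the moment the result is forced, so with lazy
-- evaluation the order of messages can be surprising.
applyDiscount :: Int -> Int -> Int
applyDiscount percent price =
  trace ("applyDiscount " ++ show percent ++ " " ++ show price
         ++ " = " ++ show result) result
  where
    result = price * (100 - percent) `div` 100

main :: IO ()
main = print (map (applyDiscount 20) [1000, 250])
```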

In terms of the language itself, by far our biggest issue was the GC, which performs really badly on large heaps (upwards of several gigabytes) unless you take precautions like using compact regions (see our blog for more details). With heaps growing even to the 100+ GB region, we also encountered some algorithms with quadratic runtime in the memory allocator of the RTS, and even compact regions didn’t help then. We documented that particular bug and have since provided a fix for it as well.

While there are a few tuning parameters, there is nothing that would alleviate these particular problems without requiring invasive code changes.

How easy is it to hire Haskell engineers for the team? Do you do any in-house training programs to get non-Haskell engineers up to speed with Haskell?

So far we haven’t had trouble finding Haskell engineers to fill our vacancies. That can probably be attributed in part to the fact that our headquarters are located in Utrecht, since Utrecht University has a strong Haskell-focused track in the Computing Science Master’s programme.

We have no explicit in-house training program, but we have hired and successfully onboarded people of varying Haskell skill levels – and an important part of the culture at Channable is to grow and learn.

As an aside, you also use Nix, which seems like a popular choice for Haskell projects nowadays. How did you decide to use it, and how was your experience using these two technologies together?

Some colleagues were using Nix for their side projects or running NixOS on their personal machines. At the time, we also had some trouble with our existing build and deployment infrastructure and were searching for solutions, and it turned out that Nix fit the bill perfectly.

We started by using Nix just as a convenient manner to bring various development tools into scope (such as stack), but not yet using it for building our code (so stack – while installed via nix – was not run in Nix-mode yet at that time). That way, we already prevented some of the issues, like having incompatible stack versions installed across developer machines.

The next step was then to port some of our build pipelines, starting with the Haskell projects. The main advantages were the pre-built Nix cache for Haskell packages (no more rebuilding the entire stackage snapshot with every minor upgrade) and the reproducibility of the builds.

The full story can be found on our tech blog.

Your tech blog posts mention languages like the aforementioned Nix and Idris, working with which is definitely on some people’s bucket lists. Is experimenting with technologies common in Channable? If so, what benefits do you see in this?

We have a regular Hackathon day in the development department where people can just try out ideas they have, or investigate new technologies that look interesting. The tool written in Idris is dbcritic and was the product of one of those days.

Having this room for experimentation is definitely an important part of the engineering culture at Channable. There are a few things that people came up with during those Hackathons that are now regularly used – either in our production software, as a productivity tool, or as part of the CI pipeline.

We probably wouldn’t use Idris for any larger project, but since this particular program solves a concrete problem we had and was easy enough to package as it was, we decided to adopt it in our CI workflow.

What tips would you give for a new Haskell team starting a project in an area similar to yours?

My first recommendation would be to keep things simple. Haskell provides a lot of language-level complexity at your fingertips (just enable a few language extensions), but one needs to carefully consider whether that complexity is worth the cost, both in terms of syntactic overhead and in terms of the increased burden of onboarding future colleagues.

The other tip is more specific: if you plan to work with large amounts of data (that isn’t just plain bytes), be sure to take Haskell’s GC into account from the start. When working with a large heap, there will be GC troubles. The only mitigation (while still being able to work with regular Haskell data structures) is compact regions. As using those has an impact on the overall architectural and algorithmic design, rewriting an existing program to use them might not be straightforward.

Where can people learn more about Channable?

About what we do on the technical side: https://www.channable.com/tech/
About the company in general: https://www.channable.com/
About our open-source projects: https://github.com/channable/

Hope you enjoyed our interview with Fabian!

To read more interviews with companies that use Haskell to solve real-world problems, head to our interview section. Also, be sure to follow us on Twitter or subscribe to our mailing list (via form below) to get updates whenever we release new articles.
