In our Functional Futures podcast episodes, we interview technical leaders from companies that use Haskell for real-world tasks. This season will be dedicated to the business side of development with functional programming.
Our guest is Max Tagher, the co-founder and CTO of Mercury, the fintech that startups use for banking* and all their financial workflows. Mercury has been using Haskell from the company's very start, and our conversation is dedicated to that.
*Mercury is a financial technology company, not a bank. Banking services provided by Choice Financial Group, Column N.A., and Evolve Bank & Trust, Members FDIC.
What motivated the decision to use Haskell from the very beginning of Mercury?
My co-founders and I all worked at the same previous company, and we used Ruby on Rails there. And there are a lot of things I like about Ruby on Rails, but we just had a lot of runtime errors, like bugs in the code. It was kind of hard to reason about the code, and that’s something you really don’t want for banking services.
It’s really important that customers have the correct total of their money, and that they have access to that money all the time. But even something as innocuous as a dashboard bug could reduce customers’ trust: if they can’t get this right, how do I know they’re getting the other stuff right? So that made a statically typed language really important.
I used Haskell previously for open source work. And it just has a lot of really great features to prevent bugs from happening because the type system is so powerful. So that’s kind of like the primary reason. And then, kind of like a secondary reason that I only suspected at the time, is that it would help to recruit.
So there’s this blog post by Paul Graham called the Python Paradox, and it’s about how, at the time, if you were hiring Python programmers, it was really easy to get great programmers, because there were all these people who wanted to use Python. They were these exploratory programmers who were willing to go outside the norm.
And it’s kind of the same for Haskell. We’ve hired two people who have written books about Haskell at Mercury, and it would have been very hard to recruit those people before Mercury had the kind of reputation it does today.
How has using Haskell from the start influenced the company, the development culture?
I think a lot of things about Haskell interplay well with the other parts of Mercury culture. Haskell definitely has a culture of correctness around it. Many Haskell libraries use features like newtypes to prevent bugs; something similar might be possible in other languages, but the ecosystem there doesn’t push you toward it. And that culture of correctness mixes well with the rest of the company, too.
The people on our risk and compliance teams care a lot about not making mistakes as well. Haskell’s emphasis on correctness reinforces that, and for a banking application, correctness is key.
Did it affect your time to market strategy?
The first thing I want to say is it’s really hard to know. Banking is very conservative compared to other applications. If you think about what the table stakes are for a bank, there are so many features someone needs. They need to send wires and paper checks. They need to deposit checks. They need debit cards. All this stuff is quite a lot, and you can’t just go out and build that stuff. You need to build partnerships with banking partners.
We initially went with one partner, and they didn’t work out, so we had to totally rebuild on another partner. It took us a year and a half to launch Mercury. There are parts of Haskell where it’s slower to get ramped up. Probably the clearest example is that in other languages you have things like Passport.js or Ruby’s Devise, and I think they make it a lot easier to get user authentication set up in a really nice way, down to sending emails for password resets. All that stuff is out of the box.
And in Haskell, sometimes all the pieces are there, but you have to put them together yourself a little bit more. So that can slow you down a little in the beginning. Those frameworks are a benefit early on, but I think it tapers off, because once you get to any sort of scale, you kind of have to build that stuff from scratch anyway.
Did it benefit you in the long term, since these central systems could be more customized for your needs?
Yeah, I think so. For one, you understand clearly what’s going on, and you’re making intentional decisions about it. Security is a core competency of a banking application, down to choosing the right cost for the bcrypt password hashing algorithm and making a deliberate decision there, so for us that definitely made sense. I think it’s easier to start from that and then adjust as you go, even though that has all sorts of complexity to it.
We heard that you joined the honorable ranks of companies who are not only using Haskell, but also contribute to its compiler and tooling. Where do your contributions fit in?
I think we’ve mostly gotten past the initial issues we ran into, like linker errors on a new version of macOS once the project got too big. Now we’re mostly supporting additional performance improvements to the compiler; that’s definitely our biggest area of focus. We have a project that is about 1.2 million lines of code.
It’s got a huge number of modules; it’s one monolithic codebase. So we’re doing some work right now to, for example, support Buck2 for compilation. Buck2 is a build system Facebook develops as an alternative to Bazel, so we’re taking GHC and using it with Buck2. It’ll allow us to get much better per-module caching and, overall, a big speed-up of builds across our CI systems and people’s local development environments, since they can share a cache.
So the primary area of focus for our contributions is the performance of the compiler. A secondary area is things that support better development environments. In the open-source community there’s the Haskell Language Server, and we’ve had to build some of our own alternatives to it, because it hasn’t quite scaled up to our codebase. We’ve also had to make some contributions to the compiler itself, though we just have one full-time person who works on the compiler, and they work with some contractors from Tweag or Well-Typed to improve it.
Do you have some specific vision of a way you want to extend these contributions further?
I don’t think we have some grand vision of how things should change. Pretty transparently, we’d just like to help other companies using Haskell, maybe without them ever even realizing it. If this Buck2 stuff goes well, maybe we could try to get other companies using it and make sure it’s a well-supported flow across the ecosystem. Potentially it’s more of a specialization for bigger companies.
The way the Haskell Language Server works, it needs to do huge recompiles. And with other languages’ language servers, they often had to re-implement the compiler to get the different performance characteristics they needed. Running a language server that needs real-time responsiveness is just very different from running a compiler focused purely on the throughput of compiling code into a final executable. We’ve had to do some work with our own tools, for example, trying to use the static files the compiler generates rather than relying on recompilation. But there’s no grand vision beyond that.
Has using Haskell affected your hiring process and team growth? It’s kind of challenging to find a good Haskell developer, and your team is a pretty big one.
Yeah, so I think we are at about 200 engineers right now, with the vast majority doing some sort of Haskell development. Surprisingly, Haskell engineer is the easiest role to fill in all of Mercury, which is unusual, since engineers are generally harder to recruit. There are more programmers who want to work in Haskell than there are Haskell jobs available, which allows us to attract many talented developers.
Haskell is often not the first language one learns. This provides extra insight into the candidates’ skills. In contrast, with widely used languages like JavaScript, it’s challenging to gauge experience. Many people may have only dabbled in it for a short time.
We also draw attention from notable community figures like Gabriella Gonzalez, Matt Parsons, and Rebecca Skinner. They are well-known for their contributions to Haskell literature and blogs. Their involvement creates a flywheel effect that encourages others to join our team. Early hires are crucial for startups. They establish the company culture and quality standards, making it easier to attract more talent.
How do you address the bus factor issue with these early hires that you made and built your company with?
So the bus factor concept is how many people need to get hit by a bus before your company can no longer function. I don’t think the bus factor has ever really been an issue for us. Right from the beginning, it was easier to hire strong developers. Many people wanted to work for us because we use Haskell.
Early on, we hired people who had used Haskell before. There were plenty of strong developers who met all our other criteria and were focused on the product. The only case where it can be an issue is if there’s an existing team and one person on the team wants to use Haskell. If the other team members are committed to learning it, then I think it’s fine. This is especially true because there’s a lot of depth to the Haskell language, with many great features that make it ergonomic to write code and offer better type safety.
For most of the Haskell code we have, it is very similar to what you’d write in a language like Ruby on Rails. For example, if you look at our code for inviting a new user, we need to take in the HTTP request, parse the JSON, insert some rows in the database to represent this invite, and send out an email. All of this is quite imperative code. We intentionally do not use advanced Haskell language features, which keeps the code base more accessible. Recruiting has never been a challenge, and Haskell is actually a lot easier to learn than people think.
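The imperative shape described above can be sketched in plain Haskell. Everything below is hypothetical stand-in code (parseInviteRequest, insertInviteRow, and sendInviteEmail are invented names, not Mercury's API); the point is only that an ordinary handler reads as a straightforward sequence of steps.

```haskell
import Data.Char (isSpace)

newtype Email = Email String deriving (Show, Eq)

-- Stand-in for parsing the JSON body of the HTTP request.
parseInviteRequest :: String -> Either String Email
parseInviteRequest raw
  | '@' `elem` raw && not (any isSpace raw) = Right (Email raw)
  | otherwise                               = Left "invalid email"

-- Stand-in for inserting a row and getting back its id.
insertInviteRow :: Email -> IO Int
insertInviteRow _ = pure 42

-- Stand-in for the mailer.
sendInviteEmail :: Email -> IO ()
sendInviteEmail (Email addr) = putStrLn ("inviting " ++ addr)

inviteUser :: String -> IO (Either String Int)
inviteUser raw =
  case parseInviteRequest raw of          -- 1. parse the request
    Left err    -> pure (Left err)
    Right email -> do
      inviteId <- insertInviteRow email   -- 2. insert the invite row
      sendInviteEmail email               -- 3. send the email
      pure (Right inviteId)

main :: IO ()
main = inviteUser "ada@example.com" >>= print
```

Even though this is Haskell, the handler body is imperative, do-this-then-that code, which is what keeps the codebase approachable.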
How do you assess the impact of the bus factor in your project? You mentioned that many features in your startup are custom-built rather than using ready-made financial tools, which could lead to issues with tribal knowledge.
The libraries are really solid because the Haskell ecosystem has been around since the late 80s, even before Java, and the libraries themselves are very high quality. Sometimes you need to put those pieces together, but that’s not really a challenge for developers. The other thing is, as I talked about before, when you transition to a bigger company or a bigger scale, you want to build stuff with “Lego pieces” anyway. You don’t want the framework to dictate what your password reset looks like; you probably have your own email system or whatever. I think that reduces some of the tribal knowledge issues.
How has your experience with Haskell in production evolved over the years since the project’s inception?
I don’t feel like a ton has changed in how we expand a web application; we mainly add new HTTP endpoints and tables. For example, I recently added a feature to comment on transactions, which is not much different from the early coding days when we just added endpoints.
However, as we’ve developed deeper systems central to our banking layer, we had to build out a ledger with specific correctness guarantees. This necessitated more rewrites, which is normal in web applications as core logic evolves.
For the most part, though, not many changes have occurred. We now have improved type safety around database transactions. Previously, arbitrary actions, like making HTTP requests during a transaction, could lead to stability issues, such as deadlocks or exhausting the database transaction pool.
With Haskell’s type safety, if you’re in a database transaction, you cannot make arbitrary HTTP requests. Instead, you can only perform safe actions, like logging. This is a common challenge in production codebases, but Haskell helps prevent such issues at compile time, unlike other languages that require constant vigilance.
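One common way to get this guarantee, sketched here with invented names rather than Mercury's actual code, is a transaction monad that deliberately has no MonadIO instance, so only whitelisted actions can run inside a database transaction.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- A transaction monad wrapping IO. Because we do NOT provide a
-- MonadIO instance, `liftIO someHttpCall` inside a transaction is a
-- compile-time error; only the exported actions below are allowed.
newtype SqlTransaction a = SqlTransaction (IO a)
  deriving (Functor, Applicative, Monad)

-- Whitelisted action: logging is considered safe in a transaction.
logLine :: String -> SqlTransaction ()
logLine msg = SqlTransaction (putStrLn ("[tx] " ++ msg))

-- Stand-in for a real database query.
runQuery :: String -> SqlTransaction [String]
runQuery _sql = SqlTransaction (pure ["row1", "row2"])

-- The only escape hatch back to IO, at the transaction boundary.
runTransaction :: SqlTransaction a -> IO a
runTransaction (SqlTransaction io) = io

main :: IO ()
main = do
  rows <- runTransaction (logLine "selecting users" >> runQuery "SELECT ...")
  print (length rows)
```

The type system, not reviewer vigilance, is what keeps a slow HTTP call from holding a database transaction open.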
What tools or libraries have you found indispensable for developing Haskell in a production environment over this long period of time?
The web framework we use is called Yesod. It’s written by Michael Snoyman. Yesod is sort of like Ruby on Rails-ish. We don’t use nearly all the features of Yesod because we’re just returning JSON from our web server. So, the main libraries we use are Persistent, which is the database library, and then also Esqueleto, which is sort of just an extension to Persistent that lets you do more advanced queries, like joins across tables and common table expressions, and stuff like that.
So, those are by far the most common libraries. There are other libraries we use, but for the most part, Persistent and Esqueleto are the ones you’re dealing with when you’re writing a new feature.
How do you balance pure functional programming principles with practical business requirements?
I don’t think functional programming principles are in conflict with practical business requirements. From a business perspective, many benefits of functional programming align with their needs. A pure function is one that doesn’t have side effects; it takes, say, a map of certain keys and values and returns a new map. At my previous company, I encountered a function that was supposed to be pure but actually modified the map passed to it. This caused a hard-to-track bug, and I had to step through it against the production database on a staging server. The business doesn’t want us spending time debugging issues that never should have arisen from a simple function.
It’s great that you can extract parts of your code and simplify them. For instance, moving a function that takes a map and returns a map into its own chunk allows the main code, the part interacting with the database and parsing JSON, to be smaller and clearer. This makes the tricky parts easier to read, which is ideal for about 95% of applications.
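As a small illustration of that extraction, here is a sketch: the tricky transformation is a pure map-to-map function that can be read and tested on its own, away from the database and JSON code (the fee example itself is invented).

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- Pure core: apply a fee to every balance. The input map is never
-- mutated; a new map is returned.
applyFee :: Int -> Map String Int -> Map String Int
applyFee fee = Map.map (subtract fee)

main :: IO ()
main = do
  let balances = Map.fromList [("alice", 100), ("bob", 50)]
  print (applyFee 10 balances)  -- fromList [("alice",90),("bob",40)]
  print balances                -- original is untouched: immutability
```

The surrounding IO code that loads and stores the balances stays small, and this function can be unit-tested without a database.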
There can be trade-offs. For example, when you need to add metrics deep inside nested pure functions, which is rare, you might need to refactor code to accommodate them. We have a monad for metrics in our codebase, but that’s a quick refactor.
You can go as far as a free monad architecture that builds a pure tree of potential operations plus an interpreter for that tree, with your business logic on top. However, we rarely reach that complexity, so for us there’s no significant trade-off.
How do you balance the need for quick iterations in a startup with Haskell’s compile-time guarantees? Can these guarantees potentially slow down getting code into production?
That’s a good question. There are a lot of pros and cons to it. When a compiler does more work for you, it’s probably going to end up being slower. It’s like checking more guarantees than a dynamically typed language that gives you nothing, or a language like Go that provides a lot less than Haskell does. It’s not just important to get code into production fast; you need to get correct code into production quickly. If you deploy broken code and have to revert it, that’s terrible for customers and disrupts the development team. Also, if you push a bug into production, it’s often really hard to track it down. It’s much harder to investigate that than it is in a local environment.
That’s why I think it’s really nice to have these compile-time guarantees. It becomes even more important as you scale to a larger company. If the code base is a million lines long, you want assurances that making a change in one area of the system won’t break something in another. Haskell gives you a lot more ability to do that.
It’s been a while since I’ve done a completely clean build; I want to say it takes about 12 minutes. But usually we use an interpreted version of Haskell that doesn’t load stuff you don’t need. You can get a recompiling REPL set up for the code you’re looking at: GHCi gives you that functionality, and we use ghcid, an open-source tool built on top of it. Production compiles are pretty slow at this point; I think it takes about 30 minutes to build and run tests. That’s why we’re investing in this Buck2 system: we want better caching for our CI production builds, which need to be fully optimized.
Another big topic I want to discuss is the specificity of the financial domain. Have you found functional programming to be particularly advantageous compared to classical methods for developing such applications?
I don’t think there’s a big difference there. In our fraud code, we have guarantees from the type system that a request for something is made only once, even if it’s needed in two places. We can store all the inputs and results. For example, given a company with transactions and a specific risk tier, we can document the risk decision we made.
When an auditor comes in and asks, “What decision did you make, why, and when?” we can point to that data. I don’t see a purely functional programming aspect related to this. It’s really just the standard features of Haskell that we’re benefiting from, such as newtypes and generalized newtype deriving. This makes it easy to prevent errors, like mixing up a Social Security number with an EIN, which would otherwise be a pretty easy mistake.
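A minimal sketch of that newtype pattern follows; the identifier formats and function names are invented for illustration. At runtime both identifiers are just strings, but the compiler refuses to let one be passed where the other is expected.

```haskell
newtype SSN = SSN String deriving (Show, Eq)
newtype EIN = EIN String deriving (Show, Eq)

lookupPersonBySSN :: SSN -> String
lookupPersonBySSN (SSN s) = "person:" ++ s

lookupCompanyByEIN :: EIN -> String
lookupCompanyByEIN (EIN e) = "company:" ++ e

main :: IO ()
main = do
  let ssn = SSN "123-45-6789"
      ein = EIN "12-3456789"
  putStrLn (lookupPersonBySSN ssn)
  putStrLn (lookupCompanyByEIN ein)
  -- lookupPersonBySSN ein  -- would be rejected at compile time
```

With GeneralizedNewtypeDeriving you can also inherit instances (JSON encoding, database serialization, and so on) from the underlying type for free, so the safety costs almost no boilerplate.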
Is the main benefit in the type system rather than in the functional programming aspects of Haskell?
Yeah. If I were to use a language like Clojure, a functional but untyped language on the JVM, I don’t think that would work at all for what we’re going for. Now, there is some interplay here: Haskell being functional enables better type-safety features, and you get things like immutability. I think it depends a little bit on how you define a functional language.
What about domain-specific languages? How do you see them in relation to your needs?
My understanding of a domain-specific language is that you kind of build up a small set of actions that can occur in a system. Potentially, you’re building out a whole language that has a syntax to it and everything. One example of this is that I think there’s a Haskell company I’ve seen that built this mini-programming language for their semi-technical users to use to write queries or things like that. Or maybe you make a smaller one that has a more constrained set of actions and kind of build an interpreter around that.
I haven’t used them a ton, but my impression is that we mostly don’t use anything like that, and that the added complexity is just not really worth it. I’m not a big believer in them, at least with my current understanding of domain-specific languages. We do want to add constraints sometimes, but I usually don’t feel like I want a full domain-specific language.
In my understanding, domain-specific languages are popular in many applications to prevent users and developers from making decisions that could harm finances. Why don’t you find them useful in your context?
With admittedly limited experience working with domain-specific languages, I feel like part of this could be that most of the guarantees we need are at the database level, with things like locking. I guess it depends on how broadly you define a domain-specific language. You’d like to add money, or remove money, or transfer funds between accounts; if something as simple as that counts as a domain-specific language, then sure. I think of constraining those actions as a way to describe what a function can do.
In your experience, how does Haskell’s performance compare to other languages dealing with high-volume financial data?
I’d say our experience has been pretty great. Haskell has a green threading model, very similar to Go, where you can spin up a new thread, and when that thread is blocked, the GHC runtime will run another one. This allows you to process many things simultaneously. I really like this model because it’s quite simple. I find it simpler than something like Ruby on Rails.
I remember, with Rails, we’d run 64 processes on a machine, which added complexity. You could use threading, but there were a lot of limitations or gotchas. In Haskell, however, threading is straightforward, especially since Haskell gives you guarantees that make it easier to write concurrent or thread-safe code. We want performance, but we’re not dealing with the scale of something like a consumer social network, where millions of users might flood in. Our growth has been steadier.
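The green-threading model described above looks roughly like this in plain Haskell, using forkIO and MVars from base: each thread is cheap, and when one blocks, the GHC runtime schedules another.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

main :: IO ()
main = do
  -- Spawn ten lightweight threads; thousands would also be fine.
  boxes <- forM [1 .. 10 :: Int] $ \n -> do
    box <- newEmptyMVar
    _ <- forkIO $ do
      threadDelay 1000   -- simulate blocking I/O; others keep running
      putMVar box (n * n)
    pure box
  -- Collect each result; takeMVar blocks until the value is there.
  results <- mapM takeMVar boxes
  print (sum results)    -- 385
```

Compared to running dozens of OS processes, this is one process with one runtime doing the scheduling, which is much simpler to operate.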
The vast majority of performance issues we deal with are related to databases, which is probably true for most web applications. It’s easy to add a new web server, but it’s critical to avoid slow database transactions or unnecessary I/O. Haskell helps us write guarantees around that, making it easier to spot potential issues when handling database queries.
For example, in Ruby on Rails, you might write something like organization.users.permissions, and it would load everything behind the scenes. While that’s convenient, it can make it hard to see if you’re accidentally running an N+1 query. In Haskell, it’s much easier to look at the code and ensure you avoid such issues. You can clearly see when and how database queries are happening, which helps prevent performance bottlenecks.
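The batched alternative to an N+1 query can be sketched with an in-memory stand-in for the database (all names here are hypothetical): the single round trip is visible right in the code, rather than hidden behind lazy loading.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)
import qualified Data.Set as Set

type UserId = Int

-- One "query" that fetches permissions for many users at once,
-- standing in for a single SQL round trip with a WHERE id IN (...).
fetchPermissionsBatch :: Map UserId [String] -> [UserId] -> IO (Map UserId [String])
fetchPermissionsBatch db uids = pure (Map.restrictKeys db (Set.fromList uids))

main :: IO ()
main = do
  let db = Map.fromList [(1, ["read"]), (2, ["read", "write"])]
  -- An N+1 version would call a per-user query inside a loop;
  -- here the batching is explicit and easy to spot in review.
  perms <- fetchPermissionsBatch db [1, 2]
  print (Map.toList perms)
```

Because queries are explicit function calls rather than lazy attribute access, a loop that issues one query per row stands out immediately.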
How do you handle observability in a functional architecture with a large codebase, given that effects like input/output are organized into monads with their own constraints?
So you’re kind of asking what sort of monads Mercury uses to organize its code?
I’m interested not just in the specifics, but in the whole concept of observability in such a big system.
We use the OpenTelemetry library, which was written for Haskell by one of the people who works at Mercury. I think OpenTelemetry is a fairly standard choice at this point. It essentially inserts markers into the code whenever we do a database transaction or something like that. Since most of our code operates in an IO-based context, we’re able to get a clear structure of where time is being spent. The vast majority of the time, when something is slow, we’re just figuring out, like, hey, this database query is slow, or we’re doing this query multiple times, so we need to go fix that.
The other observability tools we use involve sending metrics to Prometheus. We have something like a MonadMetrics setup where you’re able to trigger an explicit metric if you need it. For example, in the case of requests or similar events, we have this functionality built into the middleware of Yesod. So, it doesn’t seem too different from using a regular programming language in that respect.
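A MonadMetrics-style setup along the lines described might look like the following; the class and instance are an invented sketch, not Mercury's real interface, and the IO instance prints where production code would forward to Prometheus.

```haskell
-- A typeclass for emitting metrics: any monad with this constraint
-- can trigger an explicit counter at the call site.
class Monad m => MonadMetrics m where
  incrementCounter :: String -> m ()

-- In production this instance would forward to a Prometheus client;
-- here it just prints, as a stand-in.
instance MonadMetrics IO where
  incrementCounter name = putStrLn ("metric+1: " ++ name)

handleRequest :: MonadMetrics m => String -> m String
handleRequest path = do
  incrementCounter ("requests." ++ path)  -- explicit metric
  pure ("handled " ++ path)

main :: IO ()
main = handleRequest "invite" >>= putStrLn
```

Business logic only states that it needs metrics via the constraint; tests can supply a pure instance that records counters in memory instead of talking to Prometheus.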
I’m also curious about your infrastructure choices, especially your use of Nix and declarative approaches. How does this impact your operations? What are the benefits and challenges?
I’ll caveat by saying that I’m most familiar with our backend by far. I used to write a decent amount of front-end code, but I don’t do that anymore, and I’ve almost never written any Nix. That said, there are two main ways we use Nix.
One is for our development environments, to make sure every environment has the same version of libraries and tools. It makes auditing and troubleshooting easier — like if only one machine is using a specific Java dependency, we can spot that quickly.
On the other hand, even though Nix has a huge package collection — maybe the largest out there — we definitely run into cases where there isn’t a package we need, and we have to package it ourselves or ask a vendor for support. This, however, is for local development, not production servers. Nix does make rollbacks incredibly simple. You can instantly revert to the previous version, and that rollback covers everything.
Nix has a lot of great pros, but also some cons. It’s made harder because most people only rarely interact with Nix, so it always feels a bit unfamiliar if you don’t use it frequently enough to build proficiency. Overall, I’d say it’s the architecture choice I’m least certain about, compared to others like PostgreSQL, Haskell, or TypeScript, which feel pretty rock-solid.
Do you benefit from Haskell on the security side?
I think we absolutely benefit. When I used Ruby on Rails at a previous company, there were multiple arbitrary code execution vulnerabilities just from deserializing JSON. Python might have had similar problems with its pickle library. In Haskell, this kind of issue just doesn’t come up. When you look at the JSON parsing code, it’s really hard to see how you could ever reach arbitrary code execution. So that’s been a definite benefit for us.
Other ways Haskell helps with security are tied to the type system. The most important thing to prevent security issues is to address them at the root, preventing problems before they arise. The last thing you want to rely on is training people to write secure code — because that’s always error-prone. You’re constantly onboarding new people, and mistakes happen. The type system gives you ways to block specific security errors from happening in the first place. When you have a powerful, flexible type system, you start looking for ways to use it to prevent those errors.
If you could start over, would you make the same decisions about the tech stack and development culture?
I would definitely use Haskell again; I think that has worked wonderfully. And I’ll say, honestly, it’s hard to say you’d do anything differently. Mercury’s been very, very successful, so how much do I really want to risk changing? But Haskell’s been fantastic, a perfect fit for us. In a lot of ways, it only gets better as you grow to a larger team and want more of this type safety on a sprawling codebase.
React and TypeScript – we didn’t talk about them much, but I’m very happy with how they worked out. We use native Kotlin and Swift on our Android and iOS apps, respectively, and I think those have both been really great. I like them because they give us a native application that fits in seamlessly with the surrounding platform.
Postgres, by far, is my biggest recommendation to any startup. You don’t need anything else. You don’t need another caching layer, another job queue layer – until you’re very, very big.
Nix is the only thing I’m somewhat unsure about. It really has strong benefits and strong cons, and those are difficult to weigh against each other. For me, there’s more of a “fog of war” there, since I don’t interact with it that much.
What are the things that you would have done differently?
That third-party tool we ended up using was really difficult to work with. We really learned our lesson there: you think you can just buy a system and it works, but it’s actually been easier to build things ourselves. That’s by far the area where I’d go back and reverse the decision; that clearly was a bad move. The other stuff mostly feels pretty great.
Thanks a lot. I really appreciate this conversation, Max.
Yeah, nice to meet you as well. Thank you, Daniel.
- Video version: https://youtu.be/NgI5OgfERU0?si=GeJg6oUGUrtQdQnC
- Audio version: https://podcasters.spotify.com/pod/show/functionalfutures/episodes/Interview-with-Max-Tagher-e2q8d6n
- Mercury LinkedIn: https://www.linkedin.com/company/mercuryhq
- Mercury X: https://x.com/mercury
- Mercury YouTube: https://www.youtube.com/c/mercuryfi
- Max X: https://x.com/maxtagher
- Max LinkedIn: https://www.linkedin.com/in/maximilian-tagher-641ba147