Our journey to a mono repo in a microservices environment

Microservices! The promise of scalability, independence, and a developer’s dream – I truly agree. But what if that dream turns into a bit of a nightmare? That’s exactly what happened to us a year and a half ago. Like any product, we started small, with just a couple of repositories for our microservices and libraries. Then, as our library collection grew, so did the number of repos. We soon found ourselves tangled in a complex web of dependencies, and navigating it felt like hacking through a jungle.

A bit about the situation we were in

Look at this dependency tree!




*All the services update the same database (we have more microservices in the product updating other databases, but they are not relevant to this story)

Seems cool, right? Not really. Let me describe the main pains, where the fun (and frustration) began:

  • Branch Tree updating: Imagine working on a feature in a library buried deep in the dependency tree. Every change required a domino effect of updates across all the service branches that depended on it.
  • Leaf Level updating: Sometimes a change in a common library (it doesn’t have to be an infrastructure or deep library; even a logic library) needed to be reflected across multiple services – online services, offline services, the whole gang. Updating them all felt like an endless game, feature after feature.
  • Build errors: Skipping leaf-level updates for some services might seem tempting, but it came back to haunt us later. Developers would encounter mysterious build errors stemming from changes made weeks or months ago, with no clear path to a fix. Not exactly a recipe for developer happiness.
  • E2E Earthquake: And then there were the E2E tests. Updating a service late in the game could trigger cascading failures, leaving us scrambling to identify the culprit. It could take days of detective work, involving multiple developers and a deep dive into commit history. All this while, our shiny new service remained unreleased, gathering dust.

We knew something had to change. We needed a solution that streamlined development, fostered collaboration, and didn’t require spelunking through a code cavern.

A small note

Luckily for us, the dependency jungle and the relevant microservices were under our team’s ownership, so we could control the path to a solution without affecting other teams. Other microservices in the product, and even in the team, were out of the picture; only the libraries and services inside the dependency jungle interested us.

Exploring Escape Routes

We didn’t just dive headfirst into a mono repo. We meticulously evaluated our options:

  • Automatic merge request for updating dependencies: This involved the CI/CD automatically opening merge requests in dependent applications/libraries whenever new code was merged into a library. While easy and fast to implement, it wasn’t atomic, required manual intervention, and left some issues like build errors and the E2E unaddressed.
  • Common logic libraries as-a-service: This classic approach aligns with microservice principles and addressed most of our issues. However, it meant a significant architectural change, which we weren’t keen on due to potential performance drawbacks for our big data processing Spark applications.
  • Mono Repo: This emerged as the most promising solution, offering the potential to fix all our woes.

Mono Repo

We opted for the mono repo solution because we craved a fully automated approach that addressed all our pain points without altering our product architecture. Here’s how it transformed our development workflow:

  • Instantaneous Impact: Updates to a library are instantly reflected across all dependent branches and services. No more time-wasting branch updates, just a faster, more efficient workflow.
  • Build Error Bonanza: Identifying and fixing build errors becomes a breeze. Developers can pinpoint issues right away, preventing them from becoming time bombs down the line.
  • E2E Efficiency: Our CI/CD pipeline acts as a safety net for E2E tests. Any change that breaks a service’s E2E test simply won’t get merged. This saves us from those multi-day debugging marathons and keeps our releases rolling smoothly.
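To make the safety net concrete, here is a rough sketch of what a merge-gating pipeline for this setup could look like (GitLab CI syntax; the stage, job, and script names are illustrative assumptions, not our actual configuration):

```yaml
# Sketch of a merge-request pipeline that blocks merging on E2E failures.
# Job names, the umbrella chart path, and run-e2e.sh are hypothetical.
stages:
  - build
  - e2e

build-and-unit-test:
  stage: build
  script:
    - sbt compile test          # every module in the mono repo builds together

e2e-tests:
  stage: e2e
  script:
    - helm install ci-release ./umbrella --wait   # deploy all services at once
    - ./run-e2e.sh                                # hypothetical E2E test runner
```

Because the pipeline runs on every merge request, a change that breaks any dependent service fails here instead of surfacing weeks later.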

Implementing the Mono Magic: A Technical Deep Dive

Initially, we explored tools like Bazel (by Google) that could identify changes in specific apps/libraries upon every merge and trigger independent CI/CD pipelines for them. This approach, while maintaining separate pipelines, would still utilize the single codebase of a mono-repo.

Then we came up with a few points that led us to adjust the mono repo implementation a bit:

  1. Version Visibility: Maintaining separate version tags for each application within the mono-repo would make it difficult to ensure all components were aligned and in sync.
  2. Interdependency Woes: Since most changes would likely impact the core common logic library upon which all applications depend, independent pipelines wouldn’t capture the need for a unified installation process. We envisioned a single installation command for all interconnected applications, similar to how a single microservice is deployed.
  3. Selective Migration: We weren’t aiming to migrate all repositories into the mono-repo, only the tightly coupled applications and libraries that formed the tangled dependency tree within our team.

What did we learn? We need an atomic installation and a single version.

These considerations led us to the Helm umbrella chart pattern, which leverages Helm’s dependency feature (https://helm.sh/docs/helm/helm_dependency/). In essence, this pattern allows us to create a single installation command for all applications within the mono-repo.
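To illustrate the pattern, a minimal umbrella Chart.yaml could look like the following (chart, service, and version names are hypothetical; only the dependencies mechanism comes from the Helm docs linked above):

```yaml
# Chart.yaml of the umbrella chart (all names and versions are illustrative)
apiVersion: v2
name: team-umbrella
description: Single installation unit for all services in the mono repo
version: 1.4.0                 # one version for the whole codebase
dependencies:
  - name: online-service
    version: 1.4.0
    repository: "file://../online-service"    # sub-chart kept in the same repo
  - name: offline-service
    version: 1.4.0
    repository: "file://../offline-service"
```

With this in place, `helm dependency update` followed by a single `helm install my-release ./team-umbrella` resolves all sub-charts and installs every application together, pinned to the same version.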

Here’s a breakdown of the key technical details:

  • Unified Codebase: We consolidated all relevant libraries and services into a single, well-organized Git repository, forming the foundation of the mono-repo.
  • Centralized CI/CD Pipeline: A single CI/CD pipeline was implemented to govern the entire mono-repo. This pipeline streamlines the development process by managing builds, tests, and deployments for all interconnected components.
  • Helm Umbrella Chart: We utilized Helm’s dependency feature to define the relationships between the various applications within the mono-repo. This umbrella chart acts as a master control panel, specifying the individual charts (applications) and their dependencies. When deploying the umbrella chart, Helm automatically resolves all dependencies and installs the applications in the desired order.
  • JVM Build Tool Integration: For building and managing our JVM-based applications within the mono-repo, we leveraged sbt (https://www.scala-sbt.org/1.x/docs/Multi-Project.html), a popular build tool for Scala projects that also excels at handling multi-project builds. This integration ensures efficient compilation, testing, and packaging of all our libraries and services within the mono-repo framework.

This approach ensures a single version/tag for the entire codebase and facilitates a streamlined single-command installation process, all while maintaining our microservice architecture. The changes are primarily confined to the development and CI/CD phases; the production environment continues to leverage independent, containerized microservices running on Kubernetes.


Hold on, isn’t this a monolith?

Not quite! A monolith is a giant single service (like we did a decade ago), whereas our microservices remain independent and containerized, running separately on Kubernetes in production. The mono-repo only affects the development and CI/CD phases, bringing all the tightly coupled services and libraries together for a smoother development experience. It’s like having all your tools neatly organized in one toolbox, but each tool is still used independently for its specific job.

It’s all about velocity (and developer experience)

Transitioning to a mono-repo wasn’t without its challenges, but the impact on our team has been undeniably positive. Streamlined workflows and instant dependency updates have fostered a more efficient development process. While it’s difficult to pinpoint an exact increase in velocity, the anecdotal evidence is clear: developers are demonstrably happier. They spend less time wrestling with dependency issues and have more confidence in the stability of their code, thanks to the robust CI/CD pipeline ensuring E2E test success. The velocity numbers might not be readily available, but the positive feedback from the team speaks volumes about the success of this approach.

