Sagas — Part 2b: Sagas in Distributed Systems Continued

Pragmatism and Boundaries

Sagas — Part 1: An Introduction
Sagas — Part 2: Sagas in Distributed Systems
Sagas — Part 2b: Sagas in Distributed Systems Continued
Sagas — Part 3: Choreography Instead?
Sagas — Part 4: Design Considerations

“Pragmatic” Persistence

In part two of this series, I discussed the difference between Sagas and Process Managers and said that Sagas do not persist their state: they use information from messages alone to decide what to do next. That’s true; however, what persistence means here, in my opinion, is that they don’t use persisted state to make decisions. Does that mean they can save state as long as it’s not used for decision-making? A purist would probably say no.
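The "messages alone" rule can be sketched in a few lines of Python. This is only an illustration: the message types (`OrderPlaced`, `StockReserved`, and so on), the `NEXT_COMMAND` table, and the `next_command` function are hypothetical names, not taken from any particular framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    type: str
    order_id: str


# Hypothetical routing table: incoming event type -> next command type.
NEXT_COMMAND = {
    "OrderPlaced": "ReserveStock",
    "StockReserved": "TakePayment",
    "PaymentTaken": "ShipOrder",
    "PaymentFailed": "ReleaseStock",  # compensating step
}


def next_command(event: Message) -> Optional[Message]:
    """Decide the next step from the incoming message alone (no stored state)."""
    command_type = NEXT_COMMAND.get(event.type)
    if command_type is None:
        return None  # no follow-up command for this event
    return Message(type=command_type, order_id=event.order_id)
```

Everything the function needs arrives in the event itself, so the decision never depends on anything the Saga has saved.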

However, consider a situation where the service hosting the Saga crashes. As the steps in the Saga are idempotent (I will discuss this in the next post), we can restart the Saga from the very first step, redoing any steps that had already completed. But what happens when one of those already completed steps now fails because the corresponding service is down for an extended period, or has a bug awaiting a fix? The Saga grinds to a halt unnecessarily. We have experienced this in production systems. Had we persisted the Saga’s progress, we could have restarted it exactly where it left off before the crash, instead of stalling on steps that had already been completed. Sometimes we need to pick the pragmatic option if we think it makes our system more resilient.
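A minimal sketch of that pragmatic option, in Python, might look like the following. It assumes a Saga modelled as an ordered list of callables and a small file-backed checkpoint; `ProgressStore`, `run_saga`, and the JSON file layout are hypothetical names invented for this example. The saved progress is used only to choose where to resume, not to decide which step comes next.

```python
import json
import os
from typing import Callable, List

Step = Callable[[], None]


class ProgressStore:
    """Hypothetical file-backed checkpoint: remembers how many steps have completed."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> int:
        if not os.path.exists(self.path):
            return 0  # no checkpoint yet: start from the first step
        with open(self.path) as f:
            return json.load(f)["completed"]

    def save(self, completed: int) -> None:
        # Write to a temp file, then rename, so a crash never leaves
        # a half-written checkpoint behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"completed": completed}, f)
        os.replace(tmp, self.path)


def run_saga(steps: List[Step], store: ProgressStore) -> None:
    """Run steps in order, skipping those a previous run already completed."""
    start = store.load()
    for index in range(start, len(steps)):
        steps[index]()         # may raise; the checkpoint still points at this step
        store.save(index + 1)  # record progress only after the step succeeds
```

Because the checkpoint is written after each step succeeds, a crash between the step and the save means that one step runs again on restart, so the steps still need to be idempotent. What the checkpoint buys us is that steps completed earlier are not retried against a service that may since have gone down.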

Too Many Sagas?

If you rely on Sagas in many of your use cases, your service boundaries are most likely incorrect. Microservice boundaries are transactional boundaries too, and if your transactions continually cross those boundaries, you need to go back to the drawing board.

The overreliance on Sagas could be because your microservices are:

  • Entity Services
    The service boundaries are incorrectly set around entities, creating too much coupling between the services for any of them to carry out a task on its own.
  • Nanoservices
    Services are too small to achieve anything meaningful on their own. They rely on other services to carry out even the most straightforward use cases, if not every use case.
  • Layered Services
    These result from decomposing systems horizontally into front-end, middle-tier, and persistence services. Again, these services cannot exist independently and must communicate with one another on almost every transaction.

Microservices should be self-contained, autonomous, and loosely coupled. Sagas and orchestration, in general, increase coupling between microservices, so it is worth re-examining our service boundaries to see if we can model them differently. Techniques such as Value Stream Mapping and EventStorming could help with this.