Saga Pattern
In this blog we will discover what the Saga pattern is and how it works, and we will build two different implementations using two different approaches: choreography as well as orchestration.
Introduction
The Saga pattern is a design pattern used to manage distributed transactions in a microservices architecture. Unlike traditional transaction management methods that require locking resources across multiple services, the Saga pattern breaks a transaction into a series of smaller, independent sub-transactions. Each sub-transaction is executed by a different service and, upon completion, triggers the next sub-transaction. If a sub-transaction fails, the pattern invokes compensating actions to undo the changes made by previous sub-transactions, ensuring the system returns to a consistent state. This approach allows for more scalable and resilient transaction management, avoiding the performance bottlenecks and blocking issues associated with protocols like two-phase commit (2PC).
Sagas can be implemented in two main ways: orchestration and choreography. In orchestration, a central coordinator controls the sequence of sub-transactions, while in choreography, each service reacts to events and triggers the next step in the process. The Saga pattern thus provides a robust framework for maintaining data integrity and consistency in distributed systems without compromising on the benefits of microservices architecture.
Choreography
Overview
Let’s now take a deep dive into these Saga types, starting with choreography. Choreography is a decentralized approach to managing workflows in a microservices architecture: each microservice performs its tasks independently and communicates with other services through events. This event-driven approach allows services to react to changes and trigger subsequent actions without requiring a central coordinator.
Let’s see a concrete diagram:
Components of the Diagram
Event Broker:
The event broker is a central component in an event-driven architecture. It is responsible for distributing events to various services. Examples of event brokers include message queues like Apache Kafka, RabbitMQ, or cloud-based solutions like AWS SNS/SQS.
In the diagram, the event broker acts as an intermediary that receives events from services and forwards them to interested subscribers.
Microservices:
Each microservice is represented by a black box with a gear icon, indicating that it performs specific business logic or tasks.
These services are loosely coupled and operate independently. They subscribe to events from the event broker and publish events based on their own processing logic.
Databases:
Each microservice has its own private database, ensuring data encapsulation and independence from other services.
The databases store the data necessary for each service to operate and maintain consistency within the scope of the service’s responsibilities.
Client Applications:
The diagram shows client applications (such as web, mobile, or desktop apps) that interact with the microservices architecture.
Clients may initiate actions that generate events or query the services for information.
How Choreography Works:
Event Generation:
When a client initiates an action (e.g., placing an order), the corresponding microservice performs its task (e.g., processing the order) and publishes an event (e.g., OrderPlaced) to the event broker.
Event Distribution:
The event broker receives the OrderPlaced event and distributes it to all subscribed microservices that are interested in this event.
Event Handling:
Subscribed microservices react to the event based on their own business logic. For example, an inventory service may listen for the OrderPlaced event and update stock levels, then publish an InventoryUpdated event.
A shipment service might listen for the InventoryUpdated event to prepare the shipment, and so on.
Compensating Actions:
In case of a failure, services can publish compensating events to roll back or mitigate the impact of the failure. For example, if the payment service fails, it might publish a PaymentFailed event, prompting the order service to revert the order status.
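The event chain and its compensating action can be sketched in plain Java. This is a minimal, broker-agnostic sketch with hypothetical event and status names; a real system would publish to Kafka or RabbitMQ rather than an in-memory map:

```java
import java.util.*;
import java.util.function.Consumer;

class ChoreographySketch {
    // Minimal in-memory event broker: topic name -> list of subscribers.
    static final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    static void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    static void publish(String topic, String payload) {
        for (Consumer<String> h : subscribers.getOrDefault(topic, List.of())) h.accept(payload);
    }

    static final Map<String, String> orderStatus = new HashMap<>();

    public static void main(String[] args) {
        // Order service: compensating action, reverts the order when payment fails.
        subscribe("PaymentFailed", orderId -> orderStatus.put(orderId, "REVERTED"));
        // Payment service: in this sketch it always fails and publishes a compensating event.
        subscribe("OrderPlaced", orderId -> publish("PaymentFailed", orderId));

        orderStatus.put("order-1", "PROCESSING");
        publish("OrderPlaced", "order-1"); // kicks off the event chain
        System.out.println(orderStatus.get("order-1")); // REVERTED
    }
}
```

No service calls another directly: each one only reacts to events it subscribed to, which is the essence of choreography.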
Here is now an image that illustrates an event-driven interaction between two microservices: the Order Service and the Product Service. This interaction leverages topics for event publishing and subscribing, demonstrating how these services communicate asynchronously:
This event-driven interaction allows the Order Service and Product Service to communicate asynchronously through events, enhancing the decoupling and scalability of the system. Each service operates independently, reacting to events published to shared topics. This design enables flexible and resilient interactions, as services can continue to function and process events even if other parts of the system are temporarily unavailable.
By using topics for event publishing and subscribing, this architecture supports a scalable and maintainable microservices ecosystem, where each service can evolve independently while still maintaining robust inter-service communication.
Demo
Now, we will proceed to the code and demonstrate the same example with two microservices: the Product and Order services.
You can find the source code for this blog here
In this demo, we will use Spring Cloud Stream with Kafka as the event broker, along with Spring WebFlux.
Spring Cloud Stream
Spring Cloud Stream is a framework for building message-driven microservices. It abstracts the details of connecting to and interacting with message brokers like Kafka, allowing developers to focus on business logic. Key features include:
- Binder Abstraction: Simplifies switching between different messaging systems.
- Event-Driven Architecture: Enables real-time data processing by reacting to events.
- Declarative Bindings: Separates business logic from infrastructure concerns through easy configuration of input and output channels.
- Stream Processing: Supports both continuous and batch processing of data streams.
- Integration with Spring Boot: Facilitates easy setup and configuration.
Spring Flux
Flux comes from Project Reactor, the reactive library behind Spring WebFlux, a reactive web framework for building non-blocking, asynchronous applications. Key concepts include:
- Reactive Streams: Implements the Reactive Streams specification for asynchronous stream processing with non-blocking backpressure.
- Mono and Flux: Core data types representing single (0..1) and multiple (0..N) items, respectively, used for handling reactive data streams.
- Non-blocking I/O: Built on non-blocking runtime engines like Netty, allowing efficient handling of many concurrent requests.
- Functional and Annotation-based Endpoints: Provides flexibility in defining routes and handlers through functional programming or traditional annotations.

The demo project consists of two microservices, Order and Product, along with a Gateway and Eureka Discovery. This is a basic microservices example; what we are interested in here is how the Saga choreography is implemented.
The saga starts with a POST request to the OrderController with the productId and the desired quantity of this product.
After the order is saved to the database with status Processing, the saga begins by emitting an OrderEvent to the order-event topic.
This is the configuration for the Kafka publisher, which handles publishing the event to the specified topic order-event using the orderSupplier method, as defined in the application.yaml:
application.yaml :
Now, we will go to the Product service. Here, we need to check whether the quantity requested by the Order service is available, to determine if the product is out of stock or not.
All this is managed by the event responder, which takes the OrderEvent as an argument from the order-event topic and returns a ProductEvent to the product-event topic, stating whether the product is out of stock or available.
application.yaml
Finally, we return to the Order service, where there is a consumer listening to any event in the product-event topic that states the status of the product's availability. If the product is available, we update the order status from "processing" to "created." If the product is out of stock, we update the order status to "failed."
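The consumer side can be sketched the same way (plain Java with hypothetical field names; in the project this is a `Consumer<ProductEvent>` bound to the product-event topic). It simply maps the product's availability onto the order status:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

class OrderStatusConsumerSketch {
    record ProductEvent(long orderId, boolean available) {}

    static final Map<Long, String> orderStatus = new HashMap<>();

    // Consumer listening on product-event: flips the order from PROCESSING
    // to CREATED when the product is available, or FAILED when it is not.
    static final Consumer<ProductEvent> onProductEvent = ev ->
        orderStatus.put(ev.orderId(), ev.available() ? "CREATED" : "FAILED");

    public static void main(String[] args) {
        orderStatus.put(1L, "PROCESSING");
        orderStatus.put(2L, "PROCESSING");
        onProductEvent.accept(new ProductEvent(1L, true));
        onProductEvent.accept(new ProductEvent(2L, false));
        System.out.println(orderStatus.get(1L)); // CREATED
        System.out.println(orderStatus.get(2L)); // FAILED
    }
}
```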
Testing
Now, for the tests, we need to have Kafka installed either locally, in the cloud, or in a Kafka Docker container. In this case, we will run Kafka locally.
We start with Zookeeper:
bin/zookeeper-server-start.sh config/zookeeper.properties
Then, we start the Kafka server. After that, we can run the application:
bin/kafka-server-start.sh config/server.properties
Here are our products added when launching the application.
Let’s try ordering 8 units of the HTC Electric Shaver product with id 1:
curl -X POST \
http://localhost:8093/orders/ \
-H 'Content-Type: application/json' \
-d '{
"productId": 1,
"qnt": 8
}'
The order is processed, and the OrderEvent is emitted.
We use this command to start listening to the specified topic and display the events as they are produced. You should see the events printed in the terminal window in real time.
bin/kafka-console-consumer.sh --topic order-event --bootstrap-server localhost:9092
Like this, for the OrderEvent:
After verification in the product service, we received a product event stating that the product is available.
And this, for the ProductEvent:
bin/kafka-console-consumer.sh --topic product-event --bootstrap-server localhost:9092
Then, the order is updated to the created state.
We now have just 2 units of the product with id 1.
Let’s repeat the same POST request.
Now the response tells us that the product is out of stock, because we ordered 8 units of this product while there are only 2 in stock.
And the order failed.
This process embodies the choreography pattern using events and Kafka topics. Through this choreography, events trigger actions across microservices, allowing for decentralized coordination and seamless communication within the system.
Orchestration
Overview
Now that we have seen how to handle the Saga pattern within a microservices architecture using choreography, let us look at the other approach we can use: orchestration.
So what exactly is orchestration?
It is a widely used term in distributed systems: a centralized element called the orchestrator handles and coordinates all the other elements to ensure the flow of the operations.
Well, I hope this little introduction has helped you get the general idea of how orchestration works within transactions.
Instead of having the microservices communicate with each other directly, we introduce an orchestrator that handles and takes care of that communication.
Here is an overview of how orchestration works:
As you can clearly see, we have an orchestrator that takes care of all the transactions coming and going, which is very beneficial because it removes a lot of the complexity we unfortunately face when dealing with choreography. As for the communication between the orchestrator and the microservices, it really depends on the needs and the use case: we could use a message broker like Kafka or make direct HTTP requests.
Now let us dive even deeper with a more concrete example.
It may feel a little overwhelming at first glance, but now that we are somewhat familiar with how transactions are handled within a microservices architecture, we can explain it in detail.
First, we have three microservices (order-service, payment-service, inventory-service), and each one of them has its own database. Furthermore, we have an orchestrator taking care of the transaction.
Last, we have two Kafka topics flowing between the order-service and the orchestrator. As for the communication with the two other services, it is done using HTTP requests.
Let us now explain the transaction happening here in detail:
1/The request:
First, a request is made to the /order endpoint, which starts the creation of the order.
2/Sending an event to the orchestrator:
Once the first steps happen in the order service, we need to proceed with the order: check whether the ordered quantity is available and whether the payment can be completed. As a consequence, a Kafka event is published, carrying all the necessary information about the newly created order to the orchestrator.
3/The orchestrator:
Once the orchestrator receives the order, it defines the workflow needed to make sure the order gets created. In this case, we should also prepare for a worst-case scenario, by which I mean a reverting scenario in case the completion of the order fails.
So at first, the orchestrator sends two HTTP requests to the two other services to proceed with the payment and the retrieval of the desired quantity.
4/The response :
If everything goes well, which is the ideal scenario, the requests succeed and the orchestrator sends back another Kafka event carrying the order with a completed status. Otherwise, if something goes wrong, the order is updated to failed (in an ideal implementation, the order would be deleted instead).
Demo
Now that we have dealt with the explanation of the transaction, let us get our hands a little dirty by diving deeper into the code.
You can find the source code for this blog here
Let us start by explaining some mandatory functionalities within the code.
Here is the global architecture of the project: each service is represented by a microservice, alongside the orchestrator and another useful module that holds the DTOs (data transfer objects) shared among the different services, in order to reduce repetitive code.
You may notice that we have a docker-compose file holding the Kafka and Zookeeper images.
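Since the docker-compose file itself is only shown as a screenshot, here is a minimal sketch of what such a file typically looks like (image tags, ports, and settings are illustrative assumptions, not the project's exact file):

```yaml
version: "3"
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.4.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.4.0
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
```

With this in place, `docker compose up` replaces the manual Zookeeper and Kafka startup scripts used in the choreography demo.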
As for the dependencies, there are some common dependencies shared among the microservices:
This dependency adds all the core functionality of reactive programming, including WebFlux, Sinks, etc.
These dependencies ensure the communication with the Kafka message broker.
Now let us follow a little bit the workflow of the transaction and have a global look at the code itself.
1/The Order-service
Here is the method that handles the creation of the order. It returns a Mono (a reactive type in Project Reactor representing a single value or nothing) and takes an order request in its body.
This object is handled in the service layer, which transforms the DTO into a real entity persisted in the database and sends the order request to the Kafka topic.
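The mapping step can be sketched in plain Java (field and status names are hypothetical; the reactive persistence and the Kafka send are omitted):

```java
class OrderMappingSketch {
    // Incoming DTO, shared via the common DTO module.
    record OrderRequest(long productId, int qnt) {}
    // Entity persisted by the order-service; it starts life as PROCESSING.
    record Order(long productId, int qnt, String status) {}

    static Order toEntity(OrderRequest req) {
        return new Order(req.productId(), req.qnt(), "PROCESSING");
    }

    public static void main(String[] args) {
        Order order = toEntity(new OrderRequest(1L, 8));
        System.out.println(order.status()); // PROCESSING
        // In the real service the entity is saved reactively, and the
        // OrderRequest is then published to the order-creation Kafka topic.
    }
}
```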
Let us take a look at the configuration file:
Here are the topics created in Kafka: one responsible for the order creation and the other responsible for the order update.
2/The Orchestrator
Now that we saw how the order is created, let us take a deeper look on how the order is handled in the orchestrator.
Once we receive the order request, the orchestrator starts its workflow.
This is the most important method: first, we define the list of steps to follow.
We have already defined the workflow of this transaction by making requests to the payment service as well as the inventory service.
For each service, we have defined a process method and a revert method in case something goes wrong:
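The process/revert contract can be sketched as a plain-Java interface, together with the orchestrator loop that reverts already-completed steps in reverse order when a later step fails (names are hypothetical; the real project makes these calls over WebClient and reacts through Kafka):

```java
import java.util.ArrayList;
import java.util.List;

class OrchestratorSketch {
    interface SagaStep {
        boolean process(); // forward action; false means the step failed
        void revert();     // compensating action
    }

    // Runs steps in order; on failure, reverts the completed ones in reverse.
    static boolean run(List<SagaStep> steps) {
        List<SagaStep> done = new ArrayList<>();
        for (SagaStep step : steps) {
            if (!step.process()) {
                for (int i = done.size() - 1; i >= 0; i--) done.get(i).revert();
                return false; // the order will be marked failed
            }
            done.add(step);
        }
        return true; // the order will be marked completed
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        SagaStep payment = new SagaStep() {
            public boolean process() { log.add("payment"); return true; }
            public void revert() { log.add("refund"); }
        };
        SagaStep inventory = new SagaStep() {
            public boolean process() { log.add("inventory"); return false; } // simulate failure
            public void revert() { log.add("restock"); }
        };
        System.out.println(run(List.of(payment, inventory))); // false
        System.out.println(log); // [payment, inventory, refund]
    }
}
```

Note that only the payment is refunded: the inventory step never completed, so there is nothing to revert for it.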
For the HTTP requests, we are using WebClient.
But wait, what exactly is WebClient?
You can think of WebClient as an alternative to OpenFeign, but designed for reactive programming, with support for asynchronous, non-blocking requests.
3/The payment and the inventory services
Now that we have seen the general workflow of the transaction, let us see what happens in the remaining services.
In the payment service:
The same goes for the inventory service:
4/The order response
Now that we have seen most of the workflow, we need to see how the order is updated. If you paid attention to the main method of the orchestrator, there are two outcomes: either the order succeeds or it fails. Either way, the result is sent back to the order-service through the Kafka topic:
Then it makes a call to the service to update the order based on the response sent through Kafka.
Testing
Now that we have gotten a clear glimpse of the actual code, which may seem a bit difficult due to its reactive nature, we can look at the test cases that might occur.
We send our first request to the /order endpoint:
To see whether the order was completed or canceled, let us check the other endpoint:
Well that is a relief , the order was completed.
Let us now run a series of tests and see when an order gets canceled, since the inventory has a limited stock.
Now you can clearly see that the third order was canceled, simply because there are not enough products left.
Wrap up : Choreography vs Orchestration
Whoa, a lot of code was explained in the past few paragraphs, but luckily we managed to get a good grasp of the two flavors of the Saga pattern: orchestration and choreography. But hold on, we still have an important question to ask: which is better, choreography or orchestration?
It would be easy to say that one is simply better than the other, but in reality it depends on the situation and the nature of your infrastructure. All we can give right now is a glimpse of the weaknesses and strengths of each one and how to handle them.
In this table, we tried to cover some major advantages and disadvantages of each approach.
Starting with choreography: it can be a good solution when the focus is on decentralizing the business logic as much as possible, and it is efficient at staying resilient and overcoming failures. However, there is the problem of complexity: with choreography, each service has to implement the events it emits and receives, which adds a lot of code and, consequently, a lot of difficulty. Furthermore, it is extremely difficult to debug and monitor those services, simply because everything is decentralized, which means tracking down errors takes a lot of time.
As for orchestration, it is easier to implement thanks to the orchestrator, where the workflow is defined in one place, which makes the flow more straightforward to follow. However, the main problem with orchestration is the centralized orchestrator itself: it is a single point of failure, once it fails everything fails, and a lot of workload is concentrated on that one component.
In conclusion, we can say that choosing between these two approaches depends on your business needs and also on your strategy.