
An Alternative to the Transactional Outbox Pattern

By Karol Bafia, Senior Java Developer


Microservices architecture is Kitopi's way of modularizing the software and organizing the teams that support it. A consequence of modularization, however, is the need for integration. In some scenarios this integration must be consistent 100% of the time. This is when we begin to consider the Transactional Outbox design pattern. However, we quickly realize how expensive it can be to introduce and maintain¹, so we may also wonder what alternatives are available.


The first alternative is to manually reconcile data after a failure². This may seem like a poor plan, but check what your statistics show. How often do data problems caused by DB-Messaging inconsistencies occur? What is the impact on customers, the cost of manual data correction, or possible financial penalties? How does this compare to the cost of introducing and maintaining a Transactional Outbox?


If you have to decide between manual reconciliation and rolling out the Outbox pattern in your organization, this article may be for you.


I will discuss alternatives to the Transactional Outbox pattern that provide the same guarantees but are less expensive and easier to introduce.

(I assume that Two-Phase Commit and Event Store are not options for us; otherwise, we would not be considering the Transactional Outbox pattern.)


The Transactional Outbox Pattern.

Let's begin by recalling the pattern itself. Here's how it is defined in the context of microservices:



The diagram used to visualize the pattern shows a use case in which an order is updated and the resulting event is emitted with the help of the separate “Message Relay” process/service.


That outgoing event is important for us. It exposes the integration between microservices. What does it mean?


This means that we are inside a microservices landscape: we are not dealing with a single microservice. There has to be a subscriber to the message we emit, possibly another microservice. That microservice is invoked asynchronously because it subscribes to the Message Broker.


The fact that a Message Broker is used is crucial to everything that will be discussed here because it is the mechanism that ensures Eventual Consistency.

(From a technical point of view, Eventual Consistency is achieved through “durable retries”, that is, messaging or timers)³.


There are two common real-world integration cases that I know of in which alternatives to the Outbox pattern work just as well:

  • A service reacts to an event from another service.

  • Caller awaits consistent updates from 2 or more microservices.


Let me share the details in a dedicated section for each of those.


A service reacts to an event from another service.

[Order Service] –✉️→ [Delivery Service]


The Order Service cannot exist without the Delivery Service, so let's extend our example to include it. The Delivery Service subscribes to the Order Service’s event. The Delivery Service also emits its own event. Let's assume DB-Messaging consistency is also a must inside the Delivery Service.


With the pattern

First, we'll use the Outbox pattern in the Delivery Service: 



Everything will be consistent.

Now let's look for alternative ways to ensure DB-Messaging consistency. Let's start with a little experiment. What will happen if we simply remove the pattern?


Without the pattern


Now the Delivery Service first updates the Delivery record (5), and then publishes the message (6) directly, without a transaction.


Outgoing message-related consistency

But what about consistency? What if the service dies⁴ after updating the Delivery record (5) and before publishing the message (6)? Then the delivery of the incoming message to the consumer (4) will be retried because there was no ACK⁵.


Retries (4) will make the DB (5) and the messaging (6) consistent. However, there is a condition that we must meet for this to work: the message’s consumer must be idempotent. Sometimes the Delivery entity is naturally idempotent; otherwise, the consumer needs a deduplication and/or ordering mechanism. Idempotency also means producing the same outcome if the message (4) is received more than once⁶ᵃ, which includes resending the outgoing event (6)⁶ᵇ. After all, we don’t know whether the outgoing event (6) was published while handling the original event (4). This might create a duplicated event (6).
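To make the retry argument concrete, here is a minimal in-memory sketch of such a consumer. It is an illustration under assumptions, not the actual Delivery Service: the map and the list stand in for the Delivery DB (5) and the broker (6), and the class name, the crashBeforePublish flag, and the event naming are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-ins for the Delivery DB and the message broker.
final class DeliveryHandlerSketch {
    final Map<String, String> deliveries = new HashMap<>(); // orderId -> delivery status (5)
    final List<String> published = new ArrayList<>();        // outgoing events (6)
    boolean crashBeforePublish = false;                      // simulates the service dying

    // Handles the incoming Order event (4). Throwing means "no ACK", so the broker redelivers.
    void handle(String orderId) {
        // (5) Naturally idempotent write: the same key always yields the same record.
        deliveries.putIfAbsent(orderId, "SCHEDULED");
        if (crashBeforePublish) {
            throw new IllegalStateException("service died before publishing");
        }
        // (6) Always (re)publish: we cannot know whether an earlier attempt got this far.
        published.add("DeliveryScheduled:" + orderId);
    }

    public static void main(String[] args) {
        DeliveryHandlerSketch service = new DeliveryHandlerSketch();
        service.crashBeforePublish = true;
        try {
            service.handle("order-1");          // first delivery: DB updated, then crash
        } catch (IllegalStateException noAck) {
            service.crashBeforePublish = false; // the pod has been restarted
        }
        service.handle("order-1");              // redelivery because there was no ACK
        System.out.println(service.deliveries + " " + service.published);
    }
}
```

After the redelivery, the DB holds exactly one Delivery record and the outgoing event has been published. If the same message arrives yet again, the record is unchanged, but the outgoing event is published a second time, which is the duplicate the downstream consumer has to tolerate.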


But is this in any way different from when we use the Outbox pattern? No.

A duplicated outgoing event (6) can also occur when the Outbox pattern is used, because Message Relays can produce duplicates⁷. So the service that consumes the Delivery Service’s event (7) must also be idempotent.


Incoming message-related consistency

To complete our comparison, we need to return to the logic for handling the incoming event from the Order Service (4). I wrote that this logic must be idempotent for the approach without the Outbox pattern. So maybe, when applying the pattern, we don’t need to worry about idempotency and can write less code? Here also, the answer is: “no”.


Even if we are applying the Outbox pattern, the Delivery Service may die after the DB transaction commits and before the incoming message’s ACK (4). This will result in a redelivery of the message (4). A consumer needs to be prepared for this in both approaches. (Another reason for idempotency is that a duplicate message can also be sent by the Message Relay during its recovery⁷). 
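When the entity is not naturally idempotent, deduplication by message ID is one common way to prepare for such redeliveries. The following is only a sketch under assumptions: the set of processed IDs and the counter are in-memory stand-ins, and all names are hypothetical. In a real service the ID check and the state change would have to be committed in the same DB transaction.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class DedupConsumerSketch {
    // Stand-in for a "processed messages" table kept in the same DB as the entity.
    final Set<String> processedMessageIds = new HashSet<>();
    // Deliberately non-idempotent state: a counter that a blind replay would corrupt.
    final Map<String, Integer> deliveryAttempts = new HashMap<>();

    // Returns true if the message changed state, false if it was a duplicate.
    boolean handle(String messageId, String orderId) {
        // Set.add returns false when the ID is already present.
        if (!processedMessageIds.add(messageId)) {
            // Duplicate: skip the state change. The message is still ACKed, and an
            // outgoing event, if the handler emits one, would still be re-sent.
            return false;
        }
        deliveryAttempts.merge(orderId, 1, Integer::sum);
        return true;
    }

    public static void main(String[] args) {
        DedupConsumerSketch consumer = new DedupConsumerSketch();
        consumer.handle("msg-1", "order-1");
        consumer.handle("msg-1", "order-1"); // redelivery after a missing ACK
        System.out.println(consumer.deliveryAttempts);
    }
}
```

The same check is needed with and without the Outbox pattern, because the redelivery after a missing ACK happens in both approaches.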


In both approaches, we have very similar requirements for the consumption of an incoming message. The only difference is that without the pattern we skip the outbox insertion and publish directly, but that doesn't matter because the recipient of outgoing messages (7) has to be ready for duplicates anyway, as discussed above.


Retries are not only for missing ACK

In addition to the issue of DB-Messaging’s consistency within a service, we also need to address data consistency between microservices. Only then will the consistency topic be fully resolved. 

With message handling, unlike with REST commands, we can't return an error to the caller when we fail to update the database. We have to retry message handling until the DB becomes available (within a reasonable time range). This means that we need to configure the message subscriber (4) to retry after DB exceptions. And we need to do this for scenarios both with and without the Transactional Outbox.


If we don't use the Outbox pattern, then after we configure retrying for DB exceptions, we must also configure retrying for message publishing exceptions. But since retrying will be configured anyway, we just need to extend it to message broker-specific exceptions (for Kafka’s, see here).
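The idea of one retry configuration covering both kinds of failure can be sketched in plain Java. This is not the API of any particular broker client: the two exception types, the RETRYABLE set, and handleWithRetry are hypothetical names, and a real project would normally use the retry facilities of its messaging framework instead.

```java
import java.util.Set;

final class RetrySketch {
    // Hypothetical exception types standing in for DB and broker failures.
    static class DbUnavailableException extends RuntimeException {}
    static class BrokerUnavailableException extends RuntimeException {}

    // Without the Outbox, the broker exception is simply added to the retryable set.
    static final Set<Class<? extends RuntimeException>> RETRYABLE =
            Set.of(DbUnavailableException.class, BrokerUnavailableException.class);

    // Runs the handler until it succeeds or maxAttempts is reached; returns attempts used.
    static int handleWithRetry(Runnable handler, int maxAttempts, long backoffMillis) {
        for (int attempt = 1; ; attempt++) {
            try {
                handler.run();
                return attempt; // success: only now would the message be ACKed
            } catch (RuntimeException e) {
                boolean retryable = RETRYABLE.contains(e.getClass());
                if (!retryable || attempt >= maxAttempts) {
                    throw e; // give up and let the caller decide what to do
                }
                try {
                    Thread.sleep(backoffMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        int attempts = handleWithRetry(() -> {
            if (++calls[0] < 3) throw new BrokerUnavailableException();
        }, 5, 10);
        System.out.println("succeeded after " + attempts + " attempts");
    }
}
```

Because the whole handler is re-run on every attempt, this only works together with the idempotency discussed earlier.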


With vs without the pattern

It turns out that we can achieve similar DB-Messaging atomicity with and without the Transactional Outbox pattern if the logic is initiated from a message handler. Additionally, idempotency and retries must be implemented properly.


Caller awaits consistent updates from 2 or more microservices.

[Order Service] –✉️→ [Customer Service] –✉️→ [Order Service]


Let’s now consider another case of microservices integration. Here is the example scenario from microservices.io we will use to illustrate it:

“Customers have a credit limit. The application must ensure that a new order will not exceed the customer’s credit limit. Since Orders and Customers are in different databases owned by different services the application cannot simply use a local ACID transaction.”

Customer Service has just been added to our landscape.   


First, I will show what the integration would look like if we could manually deal with data inconsistencies. I will use pseudo-code:


Order Service (if inconsistencies can be handled manually)

@POST
OrderController.createOrder(request) {
 Order order = Order.pending(generateId(), request)
 orderRepo.save(order) // for supporting idempotency
 response = restClientCustomerService.reserveCredit(
                             commandFrom(request))
 if (response.orderExceedsCreditLimit()) {
  order.reject()
 } else {
  order.approve()
 }
 orderRepo.save(order)
}

The code doesn't meet consistency requirements, but its big advantage is its compactness, simplicity, and expressiveness. It is synchronous, which makes front-end development straightforward.

If the Order service dies just before or after the REST call to Customer Service (reserveCredit()) we would end up with inconsistencies between services. These would need to be handled manually. However, this is not the only drawback of such integration. Less obvious problems include:

  1. Scalability issues, because:

     a. HTTP threads and related resources, such as memory, are symmetrically blocked in both services during the reserveCredit() REST call (if nonreactive).

     b. Since the DB is usually the bottleneck in e-commerce and similar web applications, scaling up by adding new pods will not help in case of performance issues.

  2. Resiliency issues. If you want to configure timeouts for REST calls between services, you will face the challenges described here.


Replacing the REST connection with a message broker is the optimal way⁸ to solve the consistency problem in a microservices architecture. Improved scalability and resilience come as part of the package. But the message broker is just a tool. What will ensure data consistency between services is the Saga Pattern. The pattern is explained here, and the Order-Customer integration example comes from there: https://microservices.io/patterns/data/saga.html. So if you need to, you can jump to this site and learn more. I have also copied the flow of interactions here (choreography variant, but everything applies to orchestration as well):



  1. REST (synchronous) call for creating an Order in a PENDING state. (The synchronous part ends afterward.)

  2. Once an order is persisted, an Order Created event is emitted. (The asynchronous part starts with this step)

  3. The Customer Service’s DB is updated.

  4. Then the event is emitted.

  5. The OrderService’s event handler either approves or rejects the Order and persists the change.

Steps 1 and 2, as well as steps 3 and 4, additionally require DB-Messaging consistency.

This is where we can use the Outbox pattern, and then look for alternatives.


Let's try to write it down using pseudo-code (for simplicity we cover only the Order Service, but the logic in the Customer Service is similar).


With the pattern

 Order Service (saga + outbox to handle inconsistencies)

@POST
@Transactional
OrderController.createOrder(request) {
 orderId = generateId()
 Order order = Order.pending(orderId, request)
 orderRepo.save(order)
 outboxRepo.save(order.getPendingEvents()) // OrderCreatedEvent
 return orderId; // for polling
}
@GET
OrderController.getOrder(orderId) {
 return orderRepo.findById(orderId)
}

Order Service cont. (saga + outbox)

@MessageHandler
CustomerServiceEventHandler.handle(CreditReservedEvent event) {
 order = orderRepo.load(event.getOrderId())
 order.approve()
 orderRepo.save(order)
}

@MessageHandler
CustomerServiceEventHandler.handle(CreditLimitExceededEvent) {
  …  order.reject()
  …
}

We can see that the only synchronous part is the logic for saving the order in the PENDING state. After that, the REST call ends, and the subsequent flow, starting from the Message Relay, is asynchronous. So how can the client determine the outcome of the call? Or how can the client be notified of possible errors from the asynchronous part? Most likely that client is our front end, so it will not subscribe to the message broker or expose a webhook.

Our front end will need to use one of the following approaches:

  • Periodic polling (GET /orders/{orderId}) (BTW the application should have a polling limit in case something goes wrong),

  • Listening to the saga’s completion events using web sockets.
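The polling option with a limit can be sketched as follows. A real front end would not be written in Java; the sketch only illustrates the loop. The Supplier stands in for the GET call, and the method name, the status strings APPROVED and REJECTED, and the parameters are assumptions made for this example.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class OrderPollingSketch {
    // Polls until the order reaches a final state or the polling limit is hit.
    // An empty result means "limit reached, outcome still unknown".
    static Optional<String> pollForOutcome(Supplier<String> getOrderStatus,
                                           int maxPolls, long intervalMillis) {
        for (int poll = 1; poll <= maxPolls; poll++) {
            String status = getOrderStatus.get(); // stands in for GET /orders/{orderId}
            if ("APPROVED".equals(status) || "REJECTED".equals(status)) {
                return Optional.of(status);
            }
            if (poll < maxPolls) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake endpoint: the order stays PENDING for the first two calls.
        Supplier<String> fakeGet = () -> ++calls[0] < 3 ? "PENDING" : "APPROVED";
        System.out.println(pollForOutcome(fakeGet, 10, 50));
    }
}
```

Any status other than the two final ones keeps the loop going, so the same loop also tolerates an order that is not yet visible at all.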


Without the pattern

Saga + Outbox is our baseline. The complete create order call is asynchronous. The outcome of the call (either the order’s approval, rejection, or errors) is later polled.


So now let’s try to remove the Outbox pattern. As we learned from the implementation of the Delivery Service without the Outbox pattern, calls need to start from the message broker to ensure eventual consistency. Let’s make a small tweak to the outbox approach.


We will post the createOrder REST request to the message broker. In other words, we will turn a synchronous command into an asynchronous one. Then we will remove the Outbox part. The idea is inspired by the Listen to Yourself Pattern.

(It is not literally that pattern, because of its drawbacks, discussed for example here.)


The pseudo-code will illustrate this best:


Order Service (saga + “listen to yourself”)

@POST
OrderController.createOrder(request) {
 orderId = generateId()
 messageBroker.publish(
                 CreateOrderInternalCommand.from(orderId, request))
 return orderId; // for polling
}
 
@GET
OrderController.getOrder(orderId) {
 return orderRepo.findById(orderId).orElse(KEEP_POLLING)
}

Notice the change in the POST order method. If the business case requires some validation before creating the pending order, it can’t be done here as it could be in the approach with the Outbox pattern. This may be irrelevant, like in our example, but it can also make this approach inapplicable.


The next difference can be seen inside the GET order method. It can happen that the method will return KEEP_POLLING to the client. This is the scenario in which the internal command is not yet processed, and the PENDING order is not yet in DB (we say the client can’t “read its own writes”).


The logic continues as follows:

Order Service cont. (saga + “listen to yourself”)

@MessageHandler // the only subscriber
InternalMessagesHandler.handle(CreateOrderInternalCommand cmd) {
 Order order = Order.pending(cmd)
 orderRepo.save(order)
 messageBroker.publish(order.getPendingEvents())//OrderCreatedEvent
}
 
CreditReservedEvent & CreditLimitExceededEvent handlers are the same as for Saga+Outbox
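The whole “listen to yourself” flow can be replayed in memory to see where KEEP_POLLING comes from. This is a sketch under assumptions, not the real Order Service: the queue stands in for the broker topic carrying the internal command, the map stands in for the orders table, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class ListenToYourselfSketch {
    static final String KEEP_POLLING = "KEEP_POLLING";

    final Queue<String> internalCommands = new ArrayDeque<>();  // broker: CreateOrderInternalCommand
    final List<String> orderCreatedEvents = new ArrayList<>();  // broker: OrderCreatedEvent
    final Map<String, String> orders = new HashMap<>();         // DB: orderId -> status
    private int nextId = 1;

    // POST: nothing is written to the DB here; the command is only published.
    String createOrder() {
        String orderId = "order-" + nextId++;
        internalCommands.add(orderId);
        return orderId; // for polling
    }

    // GET: the order may not exist yet if the command has not been processed.
    String getOrder(String orderId) {
        return orders.getOrDefault(orderId, KEEP_POLLING);
    }

    // Message handler: the only subscriber of the internal command.
    void handleNextCommand() {
        String orderId = internalCommands.peek();
        if (orderId == null) {
            return; // nothing to process
        }
        orders.putIfAbsent(orderId, "PENDING");            // idempotent on redelivery
        orderCreatedEvents.add("OrderCreated:" + orderId); // may be duplicated on redelivery
        internalCommands.remove();                         // ACK only after both steps
    }

    public static void main(String[] args) {
        ListenToYourselfSketch service = new ListenToYourselfSketch();
        String orderId = service.createOrder();
        System.out.println(service.getOrder(orderId)); // command not processed yet
        service.handleNextCommand();
        System.out.println(service.getOrder(orderId)); // order is now in the DB
    }
}
```

The command is removed from the queue only after the DB write and the publish have both succeeded, which mirrors sending the ACK at the end of the handler.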

 

With vs without the pattern.

It turns out that:

We can achieve similar DB-Messaging atomicity with and without the Transactional Outbox pattern if the Saga Pattern is used. However, if synchronous validation of the request vs the DB’s state is needed, this alternative may not work.


Conclusions

You should not need the Transactional Outbox pattern if:

  • The logic is initiated from a message handler.

  • A transaction that spans multiple services (a saga) is required and the initial request can be turned into an asynchronous command (not applicable to all business scenarios).

Idempotency and retries must be carefully implemented regardless of whether the Outbox pattern is used.


Further (dive in) reading:

Comparison of all alternatives for dealing with distributed transactions:


References

1 Regarding only the Message Relay: to get an impression of the complexity, check, for example, the documentation of Kafka Connect and Debezium for Postgres.

"Implementing DDD", Making Aggregates Work Together, section: “Lack of the technical mechanisms”

4 “Dies” (of a service): in practice, nowadays this means an OutOfMemoryError (OOM) or the pod being killed.

5 ACK = acknowledgment sent to the Message Broker that a message was consumed. See how it works for Spring-Kafka, but it should be similar for other brokers.

6 “Life beyond Distributed Transactions: an Apostate’s Opinion” (“physics” for no-2PC and DBs scaling)

a “To ensure idempotence, the recipient entity is typically designed to remember that the message has been processed. Once it has been, the repeated message will typically generate a new response (or outgoing message) which mimics the behavior of the earlier processed message.”

b Section 5, “Activities: Coping with Messy Messages”

7 Debezium documentation, section “Kafka Connect process crashes”: “Because there is a chance that some events might be duplicated during recovery from failure, consumers should always anticipate some duplicate events”. This is true for any Outbox implementation.

8 “Software Architecture: The Hard Parts - Neal Ford” (synchronous sagas, although they exist, are impractical for a microservices architecture)


 

 



