Introduction:
Microservices have become a de facto standard in software development, offering benefits such as independent scalability, rapid deployments, smaller codebases, and low coupling. However, there is a common pitfall: unintentionally building a distributed monolith, in which services depend on each other so tightly that the failure of one cascades to the others. To avoid this, we need a communication strategy that keeps microservices independent of each other.
Communication Challenges:
In a microservices ecosystem, services talk to each other for two main reasons: to retrieve data they need for processing, and to notify other services of events. These are two sides of the same coin, and both can create dependencies between microservices. To break these dependencies, each service needs to store locally the portion of other microservices' data that it relies on.
Communication Approaches for Microservices:
- Synchronous Communication (HTTP or gRPC Call): When Microservice A needs to communicate with Microservice B, it calls B directly and waits for a response. This couples the two services at runtime: if Microservice B is failing or temporarily down, the error propagates back to Microservice A. Likewise, an event that Microservice A sends while Microservice B is unavailable is never received unless A retries it. Given these drawbacks, synchronous communication between services is often avoided in favor of more resilient alternatives.
- Asynchronous Communication Using a Queue: To avoid the pitfalls of synchronous communication, Microservice A can publish messages to a message queue that Microservice B consumes at its own pace. This removes the direct runtime dependency on Microservice B, but problems remain in the interaction between Microservice A and the queue. The drawbacks include:
  - Queue Broker Failures: If the broker is down and cannot accept messages, Microservice A must either roll back its own transaction, which reintroduces coupling, or handle the publish failure carefully during request processing.
  - Incomplete Rollback: Microservice A publishes an event, and then a database issue forces its transaction to roll back. The event cannot be recalled and may already have been consumed by Microservice B, leading to data inconsistencies.
  - Stale Data Consumption: Microservice B might consume the event before Microservice A has committed its transaction. If B then reads the corresponding data from A, it sees stale data, which compromises data integrity.
While asynchronous communication via a queue offers advantages, it’s crucial to address these potential challenges to ensure the robustness and consistency of microservices interactions.
The underlying problem is that we cannot atomically both perform an external call (to the message broker, another service, etc.) and commit a local ACID transaction. This is often called the dual-write problem. On the happy path both operations succeed, but problems start when one of them fails for any reason. I will explain how we can overcome these issues with the transactional outbox pattern.
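To make the failure mode concrete, here is a minimal sketch of the naive "publish, then commit" flow. It assumes an in-memory SQLite database, and every name in it (`FakeBroker`, the `orders` table, `place_order_naive`) is hypothetical and chosen only for illustration; a real broker client would look different.

```python
import sqlite3


class FakeBroker:
    """Hypothetical stand-in for a message broker; it only records what was published."""

    def __init__(self):
        self.published = []

    def publish(self, topic, payload):
        self.published.append((topic, payload))


def place_order_naive(conn, broker, order_id, fail_before_commit=False):
    """Publish the event, then commit. The two steps are not atomic."""
    try:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'NEW')", (order_id,))
        broker.publish("order.created", {"order_id": order_id})
        if fail_before_commit:
            raise RuntimeError("simulated database failure")
        conn.commit()
    except RuntimeError:
        conn.rollback()  # the row disappears, but the event is already out


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.commit()
broker = FakeBroker()

place_order_naive(conn, broker, order_id=1, fail_before_commit=True)

orders_in_db = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
events_published = len(broker.published)
print(orders_in_db, events_published)  # 0 1
```

The database holds no order, yet one `order.created` event has been published. Swapping the order of the two steps only moves the problem: the commit can succeed and the publish can then fail, leaving a change that no other service ever hears about.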
The Transactional Outbox Pattern:
Step 1: Create an Outbox Message Table
- A table named ‘outbox_message’ is created in Microservice A’s database to store the data required for publishing events to a broker.
- Entries are inserted into this table within the same database transaction that makes the business change in Microservice A.
- If the transaction fails, no entry is created, so events for changes that never took effect are not published.
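Step 1 can be sketched as follows. This is a minimal illustration using SQLite, not a definitive schema: the `orders` table, the column names of `outbox_message`, and the `place_order` function are all assumptions made for the example.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE outbox_message (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    topic      TEXT NOT NULL,
    payload    TEXT NOT NULL,               -- JSON-encoded event body
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    sent_at    TEXT                         -- NULL until the message has been published
);
"""


def place_order(conn, order_id, fail=False):
    """Write the business row and the outbox row in one local transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'NEW')", (order_id,))
        conn.execute(
            "INSERT INTO outbox_message (topic, payload) VALUES (?, ?)",
            ("order.created", json.dumps({"order_id": order_id})),
        )
        if fail:
            raise RuntimeError("simulated failure before commit")


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

place_order(conn, 1)
try:
    place_order(conn, 2, fail=True)  # both inserts are rolled back together
except RuntimeError:
    pass

orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
pending = conn.execute(
    "SELECT COUNT(*) FROM outbox_message WHERE sent_at IS NULL"
).fetchone()[0]
print(orders, pending)  # 1 1
```

Because both inserts share one transaction, the failed order leaves neither a business row nor an outbox row behind. No broker is contacted at this point, so the request path does not depend on the broker being available.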
Step 2: Implement a Cron Job
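- A cron job (or any other periodic background worker) reads unsent rows from the ‘outbox_message’ table and sends them to the broker.
- Upon successful delivery and acknowledgment by the broker, the message is marked as sent.
- If publishing fails, the row stays unsent and the cron job retries it on its next run.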
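The relay in Step 2 can be sketched like this. It is a simplified, single-worker illustration under stated assumptions: the `relay_pending` function, the `FlakyBroker` stub, and the table columns are hypothetical, and the `publish` callable is assumed to raise an exception whenever the broker does not acknowledge the message.

```python
import sqlite3


def relay_pending(conn, publish, batch_size=100):
    """Publish unsent outbox rows in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox_message "
        "WHERE sent_at IS NULL ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for msg_id, topic, payload in rows:
        try:
            publish(topic, payload)  # assumed to raise if the broker does not acknowledge
        except Exception:
            break  # leave this row and the rest unsent; the next run retries them
        with conn:
            conn.execute(
                "UPDATE outbox_message SET sent_at = CURRENT_TIMESTAMP WHERE id = ?",
                (msg_id,),
            )
        sent += 1
    return sent


class FlakyBroker:
    """Hypothetical broker stub that can be switched on and off."""

    def __init__(self):
        self.up = False
        self.delivered = []

    def publish(self, topic, payload):
        if not self.up:
            raise ConnectionError("broker unavailable")
        self.delivered.append((topic, payload))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox_message ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, topic TEXT NOT NULL, "
    "payload TEXT NOT NULL, sent_at TEXT)"
)
with conn:
    conn.executemany(
        "INSERT INTO outbox_message (topic, payload) VALUES (?, ?)",
        [("order.created", '{"order_id": 1}'), ("order.created", '{"order_id": 2}')],
    )

broker = FlakyBroker()
first_run = relay_pending(conn, broker.publish)   # broker down: nothing is marked as sent
broker.up = True
second_run = relay_pending(conn, broker.publish)  # the retry delivers both rows
print(first_run, second_run, len(broker.delivered))  # 0 2 2
```

Note the gap between `publish` and the `UPDATE`: if the worker crashes there, the row is still unsent and will be published again on the next run. That is why the pattern provides at-least-once rather than exactly-once delivery.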
Benefits of the Transactional Outbox Pattern:
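- Guarantees at-least-once message delivery: every committed change eventually produces its event, although the same event may be delivered more than once if the cron job fails after publishing but before marking the message as sent.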
Implementation Considerations:
- Keep the ‘outbox_message’ table from growing without bound by archiving or deleting processed rows.
- Ensure that consumers are idempotent, since at-least-once delivery means the same message can arrive more than once.
- Tune the cron job frequency to balance delivery latency against the polling load on the database.
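One common way to make a consumer idempotent is to record the ids of messages it has already handled. The sketch below assumes each message carries a unique id and that the consumer keeps its own database; `handle_once`, `processed_message`, and the `inventory` table are hypothetical names used only for this example.

```python
import sqlite3


def handle_once(conn, message_id, apply_change):
    """Apply a message's effect only the first time its id is seen."""
    with conn:  # the dedup record and the business change commit together
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_message (message_id) VALUES (?)",
            (message_id,),
        )
        if cur.rowcount == 0:
            return False  # id already recorded: duplicate delivery, skip it
        apply_change(conn)
    return True


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE processed_message (message_id INTEGER PRIMARY KEY);
    CREATE TABLE inventory (sku TEXT PRIMARY KEY, reserved INTEGER NOT NULL);
    INSERT INTO inventory VALUES ('widget', 0);
    """
)


def reserve_one(c):
    c.execute("UPDATE inventory SET reserved = reserved + 1 WHERE sku = 'widget'")


first = handle_once(conn, 42, reserve_one)
second = handle_once(conn, 42, reserve_one)  # the same message delivered again
reserved = conn.execute(
    "SELECT reserved FROM inventory WHERE sku = 'widget'"
).fetchone()[0]
print(first, second, reserved)  # True False 1
```

Recording the id in the same transaction as the business change matters here: if the change fails, the id is rolled back too, so a later redelivery can still be processed.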
Conclusion:
The Transactional Outbox Pattern offers a reliable way to decouple microservices effectively. Because each event is recorded in the same transaction as the change it describes and is delivered to the broker afterwards, the failure of the broker or of another microservice does not jeopardize the integrity of the whole system. Keep the implementation considerations above in mind to get the benefits of a resilient microservices architecture.
References:
- https://microservices.io/patterns/data/transactional-outbox.html
- https://medium.com/design-microservices-architecture-with-patterns/outbox-pattern-for-microservices-architectures-1b8648dfaa27