Event-driven microservices with Apache Kafka

Monolithic applications often struggle with low availability and with handling service disruptions. This is where using Apache Kafka for asynchronous communication between microservices can help you avoid the bottlenecks that monolithic architectures built on relational databases are likely to run into.

With Kafka, not only do you get high availability, but outages become less of a concern and failures are handled gracefully with minimal service interruption. The key is Kafka's ability to retain data for a configured amount of time, which gives you the option to rewind and replay events as needed.

So what is an event?

Let's take a page out of the English dictionary. By definition, an event is a thing that happens at a given time, especially one of some importance. Bringing that to the software side of things, events are things that happen within a software system or, more broadly, during the operation of a business or other human process that interacts with a system. An event-driven application is one organized around reacting to these events. Examples of events might include:

  • A new user requesting login credentials.

  • A shopping customer arriving at check out.

  • Your pizza delivery getting to your front door.

  • An email reaching the recipient.

  • A video starting playback on request.

When such an event takes place, a corresponding action or process is initiated in response. The response doesn't have to be an action; it can be as simple as logging the event.

It should come as no surprise that the way we see events in software systems is no different from how we perceive events in our daily lives.

This means that while events can take any shape or form, there is no fixed guide on how they should be built. What matters is how events are handled, and that depends on the specific architecture used.

There are a lot of advantages to designing applications this way. From a business perspective, it brings the development of systems more in line with real-world processes. There is also the added advantage of being able to insert new events into the system as needed. A quick example: due to updated regulations, a user might be required to agree to a new term before a purchase can go through, instead of completing the checkout in one step.

This means that the service handling that action can be triggered by an existing event, rather than rewriting application logic across multiple, tightly coupled components to accommodate the new business process. Most importantly, with systems that are organized around events, we can cut down the number of one-to-one connections within a distributed system and improve code reuse, increasing the value of the microservices you build.

How Kafka comes into play

Now let's take a moment to look more closely at Apache Kafka itself. At its heart sits a distributed log. The log-structured approach is very straightforward: the log is a collection of messages, appended sequentially to a file. If an application or service wishes to read messages from Kafka, it first locates the position of the last message it read, then scans forward, reading messages in order, while periodically recording its new position in the log.
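
To make that read pattern concrete, here is a minimal sketch using Kafka's Java consumer API. The topic name, partition, and stored offset are hypothetical placeholders; a real service would persist its position (or rely on Kafka's committed offsets) rather than hard-coding it:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class LogReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "log-reader"); // hypothetical consumer group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false"); // we record our position explicitly

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition partition = new TopicPartition("events", 0); // hypothetical topic
            consumer.assign(Collections.singletonList(partition));
            consumer.seek(partition, 42L); // resume from the last position we recorded

            while (true) {
                // Scan forward sequentially from that position.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
                consumer.commitSync(); // periodically record our new position in the log
            }
        }
    }
}
```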

There is an underlying advantage to the log-structured approach taken by Kafka: since both reads and writes are sequential operations, they are sympathetic to the underlying media, leveraging pre-fetching, the various layers of caching, and the natural batching of operations.

This increases the efficiency of the operation: when you read messages from Kafka, the server doesn't need to copy them into the JVM; data is copied directly from the disk buffer to the network buffer. This opportunity is afforded by the simplicity of both the contract and the underlying data structure.

Hence a lack of complexity allows for greater performance. But the advantages don't end there: this approach also makes the system well suited to storing messages for the longer term. Most traditional message brokers are built using index structures, such as hash tables or B-trees, which are used to manage acknowledgements, filter message headers, and remove messages once they have been read.

While these are good solutions in and of themselves, the downside is that such indexes must be maintained and serviced constantly, and that does not come without cost. Chief among the costs is that the indexes must be kept in memory to achieve good performance, which significantly limits retention. The log, by contrast, is O(1) when either reading or writing messages to a partition, so whether the data is on disk or cached in memory matters far less.

From a services perspective, this log-structured approach has a few implications. If a service suffers an outage and doesn't read messages for a long time, the backlog won't cause the infrastructure to slow down significantly; that is a common problem with traditional brokers, which tend to slow down as they get full. Being log-structured also makes Kafka well suited to performing the role of an Event Store, for those who like to apply Event Sourcing within their services.

The case for using Kafka

Apache Kafka is ideal for an event-driven system, but to understand just how valuable it can be, we need to dive a little deeper. Traditional message queues such as RabbitMQ still have plenty of use cases, including in more loosely coupled event-driven models, such as event notification, which don't necessarily require a full-featured data platform.

But one reason Kafka is so popular right now is that it has a lot of good things going for it, including its pedigree.

A quick look at its history will show that Kafka started its life over at LinkedIn, where one of the design requirements was that it could sustain operations on a massive scale.

It has since become an open-source project maintained under the Apache Software Foundation.

There is also an enterprise-grade platform based on Kafka which is offered by Confluent, a company founded by some of Kafka’s original developers.

At its core, Kafka has characteristics that set it apart from traditional enterprise message queues:

  • For one, it is a distributed platform, meaning data can be replicated across a cluster of servers for fault tolerance, including geo-replication across data centers.

  • As compared to the traditional queueing and publish-subscribe models, Kafka offers a hybrid model that combines the advantages of both: message-processing remains scalable even when messages are received by multiple consumers.

  • It offers strong ordering guarantees: within a partition, messages are received in the order in which they were published.

  • Kafka’s storage system can efficiently scale to terabytes of data, with a configurable retention period. This means that even after an outage lasting whole days, once service is restored, the event stream picks up where it left off, in order.

  • Configurable retention also means Kafka is equally suitable for real-time streaming applications and periodic batch data pipelines.

  • In addition, Kafka offers a stream-processing API that allows for complex transformations of data as it’s passed between service endpoints (see the sketch after this list).
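
As a rough sketch of that stream-processing API (Kafka Streams), here is a minimal topology that transforms events as they flow from one topic to another. The topic names, application id, and the trivial uppercase transformation are hypothetical placeholders:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseStream {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo"); // hypothetical id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Read each event from one topic, transform it, and write it to another.
        KStream<String, String> source = builder.stream("orders"); // hypothetical topic
        source.mapValues(value -> value.toUpperCase())
              .to("orders-uppercased"); // hypothetical output topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```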

These and other features make Kafka an attractive fit for more advanced event-driven patterns, such as event-sourcing, where message queues are not a good fit.

Implementation: Patterns of event-driven architecture

Even once you have weighed both sides of the equation and decided Kafka is the way to go for your application, you still need to decide how it will be implemented. There are multiple ways to implement event-driven architecture within a system; some are easier to implement, while others adapt better to complex needs. Which one is best for a given use case will depend on a number of factors, including how many microservices are in play, how tightly coupled they should be, and how much “data chatter” the system can accommodate.

Below is a quick list of the more popular patterns for event-driven architecture:

Event notification

This is a rather simple and straightforward model: a microservice sends events simply to notify other systems of a change in its domain. A user account service, for example, might send a notification event when a new user is created. Other services can use this information or ignore it; either way, the notifying service is unaffected. Notification events usually don’t carry much data with them, resulting in a loosely coupled system with minimal network traffic spent on messaging.
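
A minimal sketch of such a notification, using Kafka's Java producer API, might look like the following. The topic name `user-created` and the payload are hypothetical; note how little data the event carries:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class UserCreatedNotifier {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The event only announces that something happened; interested
            // services can fetch details themselves, and others can ignore it.
            ProducerRecord<String, String> event =
                new ProducerRecord<>("user-created", "user-1234", "{\"userId\":\"user-1234\"}");
            producer.send(event);
            producer.flush();
        }
    }
}
```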

Event-carried state transfer

This model goes one step further than simple notification: the recipient of an event also receives the data it needs to perform further actions on that data. The aforementioned user account service, for example, might issue an event that includes a data packet containing the new user’s login ID, full name, hashed password, and other pertinent details. This model can be appealing to developers familiar with RESTful interfaces, but depending on the complexity of the system, it can lead to a lot of data traffic on the network and to data duplication in storage.
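
Contrast this with the notification sketch above: in this hedged sketch of event-carried state transfer (field names are hypothetical, and in practice you would likely use a schema-managed format such as Avro rather than a hand-assembled JSON string), the event bundles the state itself:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class UserCreatedStateTransfer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Unlike a bare notification, this event carries the state itself, so
        // downstream services can act on it without calling back to the source.
        String payload = "{"
            + "\"userId\":\"user-1234\","
            + "\"loginId\":\"jdoe\","
            + "\"fullName\":\"Jane Doe\","
            + "\"passwordHash\":\"<hashed-password>\"" // hashed, never the raw password
            + "}";

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("user-created", "user-1234", payload));
            producer.flush();
        }
    }
}
```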

Event-sourcing

The goal of this model is to represent every change of state in a system as an event, each recorded in chronological order. In so doing, the event stream itself becomes the principal source of truth for the system. For example, it should be possible to “replay” a sequence of events to recreate the state of a SQL database at a given point in time. This model presents a lot of intriguing possibilities, but it can be challenging to get right, particularly when events require participation from external systems.
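
To illustrate the replay idea, the Java consumer can rewind a topic to the beginning and fold every event back into in-memory state. This is a sketch under assumptions: the `account-events` topic, its Long-valued balance deltas, and the single partition are all hypothetical:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class EventReplayer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.LongDeserializer");

        Map<String, Long> balances = new HashMap<>(); // state rebuilt purely from events

        try (KafkaConsumer<String, Long> consumer = new KafkaConsumer<>(props)) {
            TopicPartition partition = new TopicPartition("account-events", 0); // hypothetical topic
            consumer.assign(Collections.singletonList(partition));
            consumer.seekToBeginning(Collections.singletonList(partition)); // rewind to event zero

            long end = consumer.endOffsets(Collections.singletonList(partition)).get(partition);
            while (consumer.position(partition) < end) {
                ConsumerRecords<String, Long> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, Long> record : records) {
                    // Replay each event in order: here, apply a balance delta per account.
                    balances.merge(record.key(), record.value(), Long::sum);
                }
            }
        }
        System.out.println("Rebuilt state: " + balances);
    }
}
```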

A myriad of other event-driven patterns exist, with more being developed every day, so don't feel limited to what is listed above.

The pros and cons

While this approach makes a lot of sense from a business perspective, it is still worth taking a hard look at the pros and cons of using such a system.

Pros

  • Services are decoupled, fungible, and independent.

  • Messages are buffered and consumed when resources are available.

  • Services scale easily for high-volume event processing.

  • Kafka is highly available and fault-tolerant.

  • Consumers/producers can be implemented in many different languages.

Cons

  • This architecture introduces some operational complexity and requires Kafka knowledge.

  • Handling partial failures can be challenging.

Conclusion

Event-driven architecture is gaining popularity with each passing day, and this isn't coincidental. From a developer's perspective, it provides an effective method of connecting microservices that can help build future-proof systems. Moreover, when coupled with modern streaming data tools like Apache Kafka, event-driven architectures become more versatile, resilient, and reliable than was possible with earlier messaging methods.

All this still takes a back seat to the fact that event-driven patterns mirror how businesses operate in the real world. This makes the approach practical for solving existing problems, while also allowing organizations to map out the events in their workflow before diving into development.

When system development starts to mimic real-world events, the gap between business-line managers and application development groups narrows.

This is where event-driven architecture really shines: not only is it powerful, but it also bridges the gap between business objectives and IT.