03/05/24

Pub/Sub for Event-Driven applications

Pub/Sub concepts, cloud differences, and tooling

8 Min Read

Pub/Sub, short for Publish/Subscribe, is a common messaging pattern in modern software architecture. It's especially common in event-driven applications, where it lets components in a distributed system communicate asynchronously.

This article introduces Pub/Sub, its common use cases, and related development challenges, and compares the managed services offered by cloud providers like Google Cloud Platform (GCP) and Amazon Web Services (AWS).

Understanding Pub/Sub

At its core, Pub/Sub involves two types of actors: publishers and subscribers. Publishers emit messages without needing to know who will consume them, while subscribers receive messages based on their subscription criteria.

This decoupling of producers and consumers enables scalable and flexible system designs, often in the form of an event-driven architecture. Pub/Sub based systems have several benefits:

  • Asynchronous Messaging: Allows services to exchange messages without waiting for a direct response, reducing bottlenecks.
  • Scalability: Easily scales to accommodate growing data volumes and user demands.
  • Durability and Reliability: Ensures messages are not lost, often through at-least-once delivery guarantees.
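To make the publisher/subscriber decoupling concrete, here is a minimal in-memory sketch in Go. The `Broker` type and its methods are hypothetical names invented for illustration, and delivery is synchronous for simplicity; a managed service delivers asynchronously and persists messages.

```go
package main

import (
	"fmt"
	"sync"
)

// Broker is a toy in-memory Pub/Sub broker (hypothetical, for illustration only).
type Broker struct {
	mu   sync.RWMutex
	subs map[string][]func(msg string) // topic name -> subscriber handlers
}

// NewBroker returns an empty broker.
func NewBroker() *Broker {
	return &Broker{subs: make(map[string][]func(string))}
}

// Subscribe registers a handler for a topic.
func (b *Broker) Subscribe(topic string, handler func(msg string)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[topic] = append(b.subs[topic], handler)
}

// Publish fans the message out to every subscriber of the topic and
// returns how many handlers were invoked. The publisher never learns
// who the subscribers are.
func (b *Broker) Publish(topic, msg string) int {
	b.mu.RLock()
	handlers := append([]func(string){}, b.subs[topic]...)
	b.mu.RUnlock()
	for _, h := range handlers {
		h(msg)
	}
	return len(handlers)
}

func main() {
	b := NewBroker()
	b.Subscribe("orders", func(msg string) { fmt.Println("billing got:", msg) })
	b.Subscribe("orders", func(msg string) { fmt.Println("shipping got:", msg) })
	b.Publish("orders", "order-123 placed")
}
```

Adding a third subscriber requires no change to the publishing code, which is the property that makes the pattern attractive for loosely coupled systems.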

Common Use Cases

Pub/Sub is an effective tool for solving many types of problems. Some of the most common are:

  • Event-Driven Systems: Trigger actions in other parts of the system when an event occurs, such as updating a database when a new order is placed.
  • Microservices Communication: Enable loosely coupled service architectures by allowing services to communicate without direct dependencies.
  • Workflow Automation: Coordinate complex workflows across multiple systems and services.

Pub/Sub in the Cloud: GCP and AWS Offerings

Using a managed service for Pub/Sub is often practical for operational efficiency and scalability reasons. However, no two clouds offer exactly the same Pub/Sub implementation, and it's good to have an idea of how they differ.

First, we'll introduce the different offerings from Google Cloud Platform (GCP) and Amazon Web Services (AWS). Then, in each section of the article, we'll highlight important distinctions.

  • Google Cloud Pub/Sub: A fully managed real-time messaging service that allows you to send and receive messages between independent applications. It supports at-least-once delivery, ensuring messages are delivered reliably.
  • Amazon SNS (Simple Notification Service): A fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication. It excels at broadcasting messages to a wide array of subscribers.
  • Amazon SQS (Simple Queue Service): Offers a queue-based messaging service to decouple and scale microservices, distributed systems, and serverless applications. SQS supports at-least-once delivery and, with FIFO queues, exactly-once processing.

Key Pub/Sub concepts

At-least-once delivery

At-least-once delivery means configuring a Pub/Sub topic to ensure that, for each subscription, events will be delivered at least once. This means that if the topic believes the event was not processed, it will attempt to deliver the message again.
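The redelivery behavior can be sketched roughly as follows. This is a simplified model under stated assumptions, not how any particular cloud provider implements it: `DeliverAtLeastOnce`, `Handler`, and `ErrTemporary` are hypothetical names, a nil return value stands in for an acknowledgement, and real services add backoff between attempts.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTemporary is a hypothetical error a handler returns when processing fails.
var ErrTemporary = errors.New("temporary failure")

// Handler processes one message; returning nil counts as an acknowledgement.
type Handler func(msg string) error

// DeliverAtLeastOnce redelivers msg until the handler acknowledges it or
// maxAttempts is reached. It returns the number of attempts made and
// whether the message was acknowledged.
func DeliverAtLeastOnce(msg string, h Handler, maxAttempts int) (attempts int, acked bool) {
	for attempts < maxAttempts {
		attempts++
		if err := h(msg); err == nil {
			return attempts, true
		}
	}
	// A real service would typically move the message to a dead-letter queue here.
	return attempts, false
}

func main() {
	calls := 0
	// flaky fails twice before succeeding, so the message is delivered three times.
	flaky := func(msg string) error {
		calls++
		if calls < 3 {
			return ErrTemporary
		}
		return nil
	}
	attempts, acked := DeliverAtLeastOnce("order-123", flaky, 5)
	fmt.Println("attempts:", attempts, "acked:", acked)
}
```

Note that the handler runs more than once for the same message, which is why the handler itself has to tolerate repeats.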

Keep in mind: All subscription handlers should be idempotent. This helps ensure that if the handler is called two or more times, from the outside there's no difference compared to calling it once. This can be achieved using a database to track if you have already performed the action that the event is meant to trigger, or ensuring that the action being performed is also idempotent in nature.
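One way to approach the tracking strategy described above is to record each handled event ID and skip repeats. The sketch below assumes every event carries a unique ID set by the publisher; `Event`, `ProcessedStore`, and `IdempotentHandler` are hypothetical names, and the in-memory map stands in for what would normally be a database table with a unique constraint.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical message carrying a unique ID assigned by the publisher.
type Event struct {
	ID      string
	OrderID string
}

// ProcessedStore remembers which event IDs have already been handled.
// Assumption: a production system would use a database instead of a map.
type ProcessedStore struct {
	mu   sync.Mutex
	seen map[string]bool
}

// NewProcessedStore returns an empty store.
func NewProcessedStore() *ProcessedStore {
	return &ProcessedStore{seen: make(map[string]bool)}
}

// MarkIfNew records id and reports whether this is the first time it was seen.
func (s *ProcessedStore) MarkIfNew(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.seen[id] {
		return false
	}
	s.seen[id] = true
	return true
}

// IdempotentHandler wraps a side effect so that redeliveries become no-ops.
func IdempotentHandler(store *ProcessedStore, effect func(Event)) func(Event) {
	return func(e Event) {
		if store.MarkIfNew(e.ID) {
			effect(e)
		}
	}
}

func main() {
	emailsSent := 0
	handle := IdempotentHandler(NewProcessedStore(), func(e Event) { emailsSent++ })

	evt := Event{ID: "evt-1", OrderID: "order-123"}
	handle(evt)
	handle(evt) // redelivery of the same event: ignored
	fmt.Println("emails sent:", emailsSent)
}
```

This sketch marks the event as processed before running the side effect. If the side effect can fail, the marker and the effect should be committed in the same database transaction so a failed attempt can still be retried.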

Exactly-once delivery

Pub/Sub topics can often also be configured to deliver events exactly once. This means creating stronger guarantees on the infrastructure level to minimize the likelihood of message re-delivery.

However, there are still some rare circumstances in which a message might be redelivered. For example, a networking issue can cause the acknowledgement of successfully processing the message to be lost before the cloud provider receives it (the Two Generals' Problem). Therefore, if correctness is critical under all circumstances, it's still advisable to design your subscription handlers to be idempotent.

Differences per cloud provider

When you enable exactly-once delivery on a topic, the cloud provider enforces certain throughput limitations:

  • AWS SQS: 300 messages per second for the topic (see AWS SQS Quotas).
  • GCP Pub/Sub: At least 3,000 messages per second across all topics in the region (can be higher depending on the region; see GCP Pub/Sub Quotas).

Ordered Topics

Pub/Sub Topics can be configured to be ordered or unordered:

  • Ordered: Messages with the same ordering key aren't delivered until the earliest message is processed (or dead-lettered — i.e. put outside the queue), potentially causing delays due to head-of-line blocking.
  • Unordered: Messages can be delivered in any order.

Generally, unordered topics allow for better throughput, as messages can be processed in parallel. This makes unordered a good default. However, in some cases you will need messages for a given entity to be delivered in the order they were published. In these cases, it's important to keep the throughput limitations in mind.
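The trade-off can be illustrated with a small sketch: messages sharing an ordering key are handled one at a time in publish order, while different keys proceed in parallel. `Message` and `ProcessOrdered` are hypothetical names, and this models only the consumer-side behavior, not a cloud provider's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a hypothetical message with an ordering key (e.g. a user ID).
type Message struct {
	OrderingKey string
	Body        string
}

// ProcessOrdered handles messages sequentially per ordering key, while
// different keys are processed in parallel. It returns, per key, the
// bodies in the order they were processed.
func ProcessOrdered(msgs []Message) map[string][]string {
	// Group by key, preserving publish order within each key.
	byKey := make(map[string][]Message)
	for _, m := range msgs {
		byKey[m.OrderingKey] = append(byKey[m.OrderingKey], m)
	}

	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		result = make(map[string][]string)
	)
	for key, group := range byKey {
		wg.Add(1)
		go func(key string, group []Message) {
			defer wg.Done()
			// One message at a time per key: a slow message delays the
			// rest of its key (head-of-line blocking), but not other keys.
			for _, m := range group {
				mu.Lock()
				result[key] = append(result[key], m.Body)
				mu.Unlock()
			}
		}(key, group)
	}
	wg.Wait()
	return result
}

func main() {
	msgs := []Message{
		{OrderingKey: "user-1", Body: "created"},
		{OrderingKey: "user-2", Body: "created"},
		{OrderingKey: "user-1", Body: "updated"},
		{OrderingKey: "user-1", Body: "deleted"},
	}
	fmt.Println(ProcessOrdered(msgs))
}
```

Choosing a fine-grained ordering key, such as one per entity rather than one per topic, keeps as much parallelism as possible while still preserving per-entity order.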

Throughput limitations per cloud provider

Each cloud provider enforces certain throughput limitations for ordered topics:

Challenges working with Pub/Sub: Local Development and Testing

One of the primary hurdles when developing Pub/Sub systems that rely on cloud services like Google Cloud Pub/Sub or Amazon SNS/SQS is replicating the production environment locally for development and testing. Understanding and addressing these challenges is important for maintaining an ergonomic and efficient development workflow.

Some of the practical issues developers regularly face:

  • Local Environment Setup: Mimicking a cloud-based Pub/Sub service locally can be complex, since there's no simple 1:1 local equivalent of the services running in the cloud. This can complicate and slow down the development process and increase the effort required for initial setup.
  • Testing Limitations: While unit and integration tests can be designed to simulate Pub/Sub behavior, it's challenging to fully replicate the cloud services' behavior, because much of it depends on configuration options set on the services running in the cloud. This can lead to discrepancies between how the application behaves in testing versus production environments.
  • Dependency on Internet Connectivity: Developing against cloud-based Pub/Sub services necessitates a constant internet connection, which can be a problem in environments with unstable connectivity.

Overcoming development challenges

Developers and teams can use several strategies to mitigate these challenges and ensure a more efficient development and testing process:

  • Use of Mocks and Emulators: Many cloud providers offer local emulators for their Pub/Sub services (e.g., the Google Cloud Pub/Sub emulator). These tools simulate the cloud service's API and allow developers to test their applications locally without internet connectivity. Additionally, custom mocks can be implemented within test suites to mimic the behavior of Pub/Sub services. The drawback of this approach is the overhead of setting up and maintaining tools, as well as potential costs involved with using managed services for emulation.
  • Cloud-Based Development Environments: Leveraging cloud-based development environments or dev/test environments in the cloud can provide a closer approximation to production settings. This approach requires internet connectivity for development, and will require an initial setup investment.
  • Use a purpose-built Development Platform: Encore is an example of a Development Platform designed to provide a seamless workflow for building event-driven and distributed systems, using Pub/Sub, from local development and testing to running in the cloud on AWS/GCP.
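As a rough sketch of the custom mock approach from the first strategy above, application code can depend on a small interface, with an in-memory fake swapped in during tests. `Publisher`, `FakePublisher`, and `PlaceOrder` are hypothetical names for illustration and do not correspond to any cloud SDK's API.

```go
package main

import "fmt"

// Publisher is a hypothetical abstraction over whichever Pub/Sub client
// the application uses in production.
type Publisher interface {
	Publish(topic string, data []byte) error
}

// FakePublisher records messages in memory instead of calling a cloud service.
type FakePublisher struct {
	Published map[string][][]byte // topic name -> published payloads
}

// NewFakePublisher returns a fake with no recorded messages.
func NewFakePublisher() *FakePublisher {
	return &FakePublisher{Published: make(map[string][][]byte)}
}

// Publish stores the payload so tests can assert on it later.
func (f *FakePublisher) Publish(topic string, data []byte) error {
	f.Published[topic] = append(f.Published[topic], data)
	return nil
}

// PlaceOrder is example application code that depends only on the interface.
func PlaceOrder(p Publisher, orderID string) error {
	return p.Publish("orders", []byte(orderID))
}

func main() {
	fake := NewFakePublisher()
	if err := PlaceOrder(fake, "order-123"); err != nil {
		fmt.Println("publish failed:", err)
		return
	}
	fmt.Println("messages on orders topic:", len(fake.Published["orders"]))
}
```

A fake like this verifies that the right messages are published, but it cannot reproduce provider-specific behavior such as redelivery, ordering, or quotas, which is the maintenance and fidelity gap described above.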

Encore as a solution for efficient Pub/Sub development

Encore provides a fully type-safe implementation of Pub/Sub via an Open Source Infrastructure SDK, available for Go and TypeScript. It lets you define the common distributed systems resources like services, databases, cron jobs, and Pub/Sub, as type-safe objects in your application code.

With the SDK you only define infrastructure semantics, meaning the things that matter to your application's behavior, not configuration for specific cloud services. Encore parses your application and builds a graph of both its logical architecture and its infrastructure requirements. It then automatically generates boilerplate and orchestrates the relevant infrastructure for each environment. This means the same application code can run locally, be tested in preview environments, and be provisioned and deployed to cloud environments on AWS and GCP.

This approach completely removes the need for mocks and emulators, and allows you to work offline and run your application locally. You also avoid maintaining a separate infrastructure configuration like Terraform, since the application code becomes the source of truth for your application's infrastructure requirements.

When your application is deployed to your cloud, there are no runtime dependencies on Encore and there is no proprietary code running in your cloud.

Example: Creating a Pub/Sub topic

If you want a Pub/Sub Topic, you declare it directly in your application code, like so:

    import "encore.dev/pubsub"

    type User struct { /* fields... */ }

    var Signup = pubsub.NewTopic[*User]("signup", pubsub.TopicConfig{
        DeliveryGuarantee: pubsub.AtLeastOnce,
    })

    // Publish messages by calling a method
    Signup.Publish(ctx, &User{...})

To run your application, you simply use encore run. Encore will automatically set up the local infrastructure and generate the boilerplate code necessary. You also get a local development dashboard with distributed tracing to help you understand and debug application behavior with ease.

Your code doesn't change when you want to deploy to the cloud. Encore will generate the necessary boilerplate and provision the necessary infrastructure in all environments:

  • NSQ for local development
  • GCP Pub/Sub for environments on GCP
  • SNS/SQS for environments on AWS

Conclusion

Despite some challenges, the benefits of Pub/Sub systems in building scalable, decoupled, and efficient applications are undeniable. By using the appropriate tools for development and testing, organizations can overcome the complexities of working with these systems and establish an effective local development workflow.
