Building manageable, loosely-coupled serverless solutions with Azure Event Grid

This post, which I've been working on for a while, comes off the back of a talk I gave at the Sydney Serverless online meetup in April 2020.

Over the last few years I've spent a lot of time focused on the serverless space, having built one of the first Azure Functions backends in Australia, which runs a consumer mobile app you might have used.

Despite the promise of quicker development and easier ops management, I've been thinking about the many downsides of serverless (or microservice-style) solutions.

Then I saw this Tweet and it really brought home how some people feel about "simple" serverless solutions. It highlights the challenge we really face with longer-term maintainability (yes, this example is on AWS, but it could apply just as easily on other platforms).

Then you get into microservices at scale... and your head basically goes 🤯

Imagine trying to determine inter-dependence between services in either of these scenarios... Sure, in Twitter's case they are most likely connected via a standard protocol (HTTP). But with a stateless protocol like HTTP, if you aren't baking the right telemetry into your services (or using the right tooling), you end up right back at the problem we've always had with distributed systems: having no idea which services call (or rely on) which other services.

In the enterprise we had the idea of the Enterprise Service Bus (ESB), which was designed to facilitate consistent inter-service communication and gave us (theoretically) one place to go to determine relationships between services. The reality tended to be very different, so while ESBs exist and have been used with varying levels of success, they haven't really come up as part of modern microservices designs. Ideally, by combining ESBs with Service-Oriented Architecture (SOA), we were supposed to have solved the problem of identifying service dependencies over a decade ago.

I have another TLA for you in answer to that suggestion: LOL.

How can Azure Event Grid help?

Event Grid can massively simplify how you connect services in any solution without the need for a lot of configuration or management ceremony.

Let's look at a few ways in which it can help.

Service interdependence

Firstly, how about a way to discover what subscribers you have to a particular event-emitting service in your architecture?

I'm going to use a built-in Azure event source: an Azure Storage Account. In this scenario the Account is part of a service similar to the "File Upload" microservice in the architecture diagram in the first tweet in this post. So how can we find anything that relies on this Storage Account?

We can use the Azure CLI and query the Resource subscriptions like this:

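# Look up the full resource ID of the Storage Account
resourceid=$(az resource show -n mystorageaccount -g myresourcegroup --resource-type "Microsoft.Storage/storageAccounts" --query id --output tsv)
# List every Event Grid subscription that uses that resource as its event source
az eventgrid event-subscription list --source-resource-id "$resourceid"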

This results in a list of Event Subscribers to this Storage Account which means no more guessing or having to look at config files, documentation or log files!
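
If you capture that CLI output as JSON, a few lines of Python can turn it into a quick dependency summary. This is only a minimal sketch: the sample payload is made up for illustration, the function name is my own, and the field names (`name`, `destination.endpointType`, `filter.includedEventTypes`) are what I'd expect to see in the CLI's JSON output, so check them against yours.

```python
import json
from typing import List

# Illustrative sample only. Real input would come from something like
# `az eventgrid event-subscription list ... --output json`.
SAMPLE_OUTPUT = """
[
  {
    "name": "thumbnail-generator",
    "destination": {"endpointType": "AzureFunction"},
    "filter": {"includedEventTypes": ["Microsoft.Storage.BlobCreated"]}
  },
  {
    "name": "audit-webhook",
    "destination": {"endpointType": "WebHook"},
    "filter": {"includedEventTypes": ["Microsoft.Storage.BlobCreated",
                                      "Microsoft.Storage.BlobDeleted"]}
  }
]
"""


def summarise_subscriptions(cli_json: str) -> List[str]:
    """Return one human-readable line per event subscription."""
    lines = []
    for sub in json.loads(cli_json):
        # Fall back to placeholders so a missing field doesn't raise.
        name = sub.get("name", "<unnamed>")
        endpoint = (sub.get("destination") or {}).get("endpointType", "<unknown>")
        event_types = (sub.get("filter") or {}).get("includedEventTypes") or ["<all>"]
        lines.append(f"{name} ({endpoint}): {', '.join(event_types)}")
    return lines


if __name__ == "__main__":
    for line in summarise_subscriptions(SAMPLE_OUTPUT):
        print(line)
```

Each line names a subscriber, how it is wired up, and which events it listens for, which is most of what you need for a dependency map.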

We can also use the same approach for custom topics we may stand up for our solution.

Health and reliability

Azure Event Grid also provides options for handling transient failure and events that are potentially malformed, problematic or otherwise never consumed.

When we create Event Subscriptions we can set the retry and "dead letter" policies. These policies are set per Subscription, so we can apply different policies to the same event source, which is super handy when we want different behaviours for different subscribers.

The defaults for Event Grid Subscriptions are the maximum values you can use: retry delivery up to 30 times and keep events on a Topic for up to 24 hours (1,440 minutes). You can lower both settings. The retry mechanism also has a bunch of smarts baked in, such as backing off exponentially between attempts, which saves you from writing that code yourself.
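
To make the per-Subscription policy idea concrete, here's a rough sketch that builds the Azure CLI arguments for creating a subscription with its own retry and dead-letter settings. The helper name and the example values are hypothetical; the flags (`--max-delivery-attempts`, `--event-ttl`, `--deadletter-endpoint`) are the ones I'd expect on `az eventgrid event-subscription create`, but confirm them against your CLI version before relying on this.

```python
from typing import List, Optional


def build_subscription_args(
    name: str,
    source_resource_id: str,
    endpoint: str,
    max_delivery_attempts: int = 30,   # the default, which is also the maximum
    event_ttl_minutes: int = 1440,     # the default, which is also the maximum
    deadletter_endpoint: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `az eventgrid event-subscription create`."""
    if not 1 <= max_delivery_attempts <= 30:
        raise ValueError("max_delivery_attempts must be between 1 and 30")
    if not 1 <= event_ttl_minutes <= 1440:
        raise ValueError("event_ttl_minutes must be between 1 and 1440")

    args = [
        "az", "eventgrid", "event-subscription", "create",
        "--name", name,
        "--source-resource-id", source_resource_id,
        "--endpoint", endpoint,
        "--max-delivery-attempts", str(max_delivery_attempts),
        "--event-ttl", str(event_ttl_minutes),
    ]
    if deadletter_endpoint is not None:
        # Assumed to be the resource ID of a blob container for undelivered events.
        args += ["--deadletter-endpoint", deadletter_endpoint]
    return args


if __name__ == "__main__":
    # Hypothetical subscriber that gives up quickly: 5 attempts within an hour.
    print(" ".join(build_subscription_args(
        name="impatient-subscriber",
        source_resource_id="/subscriptions/<sub-id>/resourceGroups/myresourcegroup"
                           "/providers/Microsoft.Storage/storageAccounts/mystorageaccount",
        endpoint="https://example.com/api/events",
        max_delivery_attempts=5,
        event_ttl_minutes=60,
    )))
```

Once you're logged in to the CLI you could pass the returned list to `subprocess.run(args, check=True)`. A second subscriber on the same Storage Account can keep the defaults, which is the "different policies for the same event source" point in practice.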

Extensibility and openness

As I touched on above, you can create custom topics that have no relationship with events generated by Azure infrastructure. At Build 2020, Partner Topics were also announced, with the first coming from identity provider Auth0.

While latency might be a challenge, nothing stops you from using Event Grid even if you aren't running on Azure. Additionally, the most basic way to subscribe to a Topic on Azure Event Grid is to use a webhook, which means any application or service that can expose a webhook endpoint can use Event Grid as its eventing backplane.
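
The one wrinkle with webhooks is that Event Grid checks you own the endpoint before it starts delivering. With the default Event Grid schema, it posts a `Microsoft.EventGrid.SubscriptionValidationEvent` and expects the `validationCode` echoed back as `validationResponse`. Below is a framework-free sketch of that handshake; the function name and the shape of the non-validation response are my own choices, not part of any SDK.

```python
import json
from typing import Any, Dict, Tuple

VALIDATION_EVENT_TYPE = "Microsoft.EventGrid.SubscriptionValidationEvent"


def handle_event_grid_post(body: bytes) -> Tuple[int, Dict[str, Any]]:
    """Handle the body of an Event Grid webhook POST.

    Returns (http_status, json_response) so it can sit behind any web framework.
    """
    events = json.loads(body)  # the Event Grid schema delivers a JSON array

    for event in events:
        if event.get("eventType") == VALIDATION_EVENT_TYPE:
            # Echo the code back to prove we control this endpoint.
            return 200, {"validationResponse": event["data"]["validationCode"]}

    # Normal delivery: real processing would go here. For illustration we
    # just report which event types arrived.
    return 200, {"received": [event.get("eventType") for event in events]}


if __name__ == "__main__":
    handshake = json.dumps([{
        "eventType": VALIDATION_EVENT_TYPE,
        "data": {"validationCode": "example-code"},
    }]).encode()
    print(handle_event_grid_post(handshake))
```

Because the handler is a plain function over the request body, the same logic drops into Node.js, Flask, or whatever your non-Azure service happens to be written in.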

Finally, through the work of the team, Azure Event Grid supports the CNCF CloudEvents specification, giving developers a platform-agnostic way to bolt in their services without locking in to Event Grid's schema or service.
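
For a feel of what "platform-agnostic" means here, this is a small sketch of a CloudEvents 1.0 envelope built with nothing but the Python standard library. The required attributes are `specversion`, `id`, `source` and `type`; the event type, source path and payload below are hypothetical examples rather than anything from the demo.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def make_cloud_event(
    event_type: str,
    source: str,
    data: Dict[str, Any],
    subject: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a CloudEvents 1.0 envelope as a plain dictionary."""
    event: Dict[str, Any] = {
        # Required attributes
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional attributes
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }
    if subject is not None:
        event["subject"] = subject
    return event


if __name__ == "__main__":
    event = make_cloud_event(
        event_type="com.example.file.uploaded",  # hypothetical event type
        source="/myapp/uploads",                 # hypothetical source
        data={"fileName": "photo.jpg"},
        subject="photo.jpg",
    )
    print(json.dumps(event, indent=2))
```

Nothing in that envelope mentions Azure, which is the point: a consumer that understands CloudEvents doesn't need to know Event Grid is the thing delivering it.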

A small demo

Quick wins are always good wins, and I think Event Grid can deliver them to any team building solutions that require extensibility.

To prove how quickly you can add value with Event Grid, I built a simple demo that reuses most of another demo, showing how you can extend existing implementations using Event Grid. The architecture diagram is shown below.

Demo serverless architecture diagram

The main components are mostly from Microsoft Ignite The Tour 2019-2020, which had a great set of demos that used Azure Logic Apps and Azure Functions to manipulate files uploaded to Azure Storage. You can find these pieces on GitHub as part of the learning modules.

I built a simple Node.js web application that runs on Azure App Service's free tier which you can give a try at: (if you get a 'service unavailable' response it means the Free tier limit has been hit for the day).

The source for this web application is also on GitHub if you want to see my terrible Node skills. 😜

The point I want to make with this demo, though, is that because I've standardised on Event Grid I could easily add further processing based on this file upload. I could, in fact, improve the current basic web application so it automatically displays processing updates by using a custom Topic and a webhook subscription, similar to this demo from the Azure team.

Want to see what's using the Storage Account as an event source? I just run my query above! Want to add more processing logic based on the file upload? Add another subscriber to the existing Event.

Happy Days 😎

As a special added bonus for getting this far here's a video of a session I did aligned with this blog.