Easy Filtering of IoT Data Streams with Azure Stream Analytics and JSON reference data

I am currently working on an next-gen widget dispenser solution that is gradually being rolled out to trial sites across Australia. The dispenser hardware is a modern platform that provides telemetry data that can be used for various purposes by the locations at which the dispenser is deployed and potentially by other third parties.

In addition to these next-gen dispensers we already have existing dispenser hardware at the locations that emits telemetry that we already use for other purposes in our solution. To our benefit both the new and existing hardware emits the same format telemetry data 🙂

A sample telemetry entry is shown below.

We take all of the telemetry data from new and old hardware at all our sites and feed it into an Azure Event Hub which allows us to perform multiple actions, such as archival of the data to Blob Storage using Azure Event Hub Capture and processing the streaming data using Azure Stream Analytics.

We decided we wanted to do some additional early stage testing with some of the next-gen hardware at a few sites. As part of this testing we also wanted to push the data for just specific hardware to a partner organisation we are working with. So how did we achieve this?

The first step was to setup another Event Hub. We knew this partner would not have any issues consuming event data from a Hub and it made the use of Stream Analytics an obvious way to process the incoming complete stream and ensure only the data for dispensers and sites we specify is sent to the partner.

Stream Analytics has the concept of Reference Data which takes the form of slow-moving (or static) data that can be read from a blob storage account in Azure.

We identified our site and dispensers and created our simple Reference Data JSON file – sample below.

The benefit of this format is that we can manage additional sites and dispenser by simply editing this file and uploading to blob storage! Stream Analytics even helps us by providing a useful naming scheme for files so you don’t even need to stop your Stream Analytics Job to update it! We uploaded our first file to a location that had the path of

/siterefdata/2018-01-09/11-40/sitedispensers.json

In future when we want to update the file, we edit it and then upload to blob storage at, say

/siterefdata/2018-02-01/00-00/sitedispensers.json

When the Job hits this date / time (UTC) it will simply pick up the new reference data – how cool is that?!

In order to use the Reference Data auto-update capability you need to set up the path naming scheme when you define the reference data as an input into the Stream Analytics Job. If you don’t need the above capability your can simply hard code the path to, say, a single file.

The final piece of the puzzle was to write a Stream Analytics Job that used the Reference Data JSON as one input and read the site identifier and dispensers from the included integer array. Thankfully, the in-built GetArrayElements Function came in handy, along with CROSS APPLY which gives us the ability to iterate over the elements and use them handily in the WHERE clause of the query!

The resulting solution now handily carves off the telemetry data for just the dispensers we want at the sites we list and writes them to an Event Hub the partner organisation can use.

I commented online that this sort of solution, and certainly one that scales as easily as this will, would have been something unachievable for most organisations even just a few years ago.

The cloud, and specifically Azure, has changed all of that!

Happy Days 😎

Tagged , , , ,

Use Azure Health to track active incidents in your Subscriptions

Yesterday afternoon while doing some work I ran into an issue in Azure. Initially I thought this issue was due to a bug in my (new) code and went to my usual debugging helper Application Insights to review what was going on.

The below graphs a little old, but you can see a clear spike on the left of the graphs which is where we started seeing issues and which gave me a clue that something was not right!

App Insights views

Initially I thought this was a compute issue as the graphs are for a VM-hosted REST API (left) and a Functions-based backend (right).

At this point there was no service status indicating an issue so I dug a little deeper and reviewed the detailed Exception information from Application Insights and realised that the source of the problem was the underlying Service Bus and Event Hub features that we use to glue together our services.

You can see the increased error rate from the Service Bus Metrics view below.

Service Bus Metrics

While I was doing this an alert popped up in the Portal advising a service incident and directed me to the Azure Service Health feature in order to view the full incident details and also to track it.

On the Azure Health page I could see an active incident and decided to try out the alerting feature to track this during a commute home.

I clicked on the Add Alert option and configured a new email-based alert. You can also push alerts into your preferred IT Service Management (ITSM) solution, but we aren’t yet using an ITSM platform for this solution, but this would be our choice in future!

In Services I chose Service Bus and Event Hubs and for Regions I selected the two Australian Regions. Note that I had to set up an Action Group as I hadn’t used the feature previously – in the screenshot below I am just reusing the one I previously setup.

Alerts Setup

A short while after saving the Alert configuration the recipients in the Action Group started to receive update emails containing the most recent status of the incident. A sample is shown below.

Notice Email

About 45 minutes after this alert we received a resolution notification.

The amount of time saved for our team with the ease of this setup is pretty amazing, and if you’re not using this feature already you should go and explore it in the Portal and set it up for you key Azure components.

What a great early Christmas present!

😎

Tagged , , , , ,

Understanding Azure’s Container PaaS Capabilities

If you’ve been using Azure over the past twelve months, you can’t but have the feeling that it’s become a bit like this…

Containers... Containers Everywhere

.. and you’d be right.

To be fair, though, Containers have been one of the hot topics in computing in general and certainly one that’s been getting the most interest in my recent Azure Open Source Roadshows.

One thing that has struck me though is that people are not clear on the purpose of all the services in Azure that have ‘Containers’ listed as a capability, so in this post I am going to try and review the Azure Platform-as-a-Service offerings that have Container capabilities and cover what the services can be used for.

First, before we begin, let’s quickly get some fundamentals under our belts.

What is a Container?

Containers provide encapsulation and isolation for workloads and remove the need for a complete Operating System image to be deployed in order to manage resource allocation.

They have proven popular because they typically have smaller footprints than Virtual Machines, boot much faster as a result and have a modern build process based on composition that gels well with software development.

A Container still needs to “run” somewhere – this “somewhere” is what I will call a “Container Host” through the rest of this post.

So where does Docker fit into all of this? Docker provides tooling for the creation, running and management of Containers and is by far the best known tech in this space. Microsoft has worked with Docker to ensure the Docker tooling supports Windows and Windows Containers.

Our most basic Container workload setup then would be: one Container Host running one Container.

What is a Container Orchestrator?

A big part of running Containers at scale is their management which is where technologies like Kubernetes (k8s), Docker Swarm and DC/OS come into play. These technologies allow you to manage multiple Containers and their workloads, performing orchestration of deployments and controlling connectivity between Container instances running on Container Hosts.

An Orchestrator typically runs more than one node to ensure availability, but nothing stops us from running a simple single node setup like Minikube to start to learn about them.

Right, now we have some fundamentals in place, let’s take a look at what Azure offers.

Azure’s Container Offerings

Note that we are going to focus on PaaS services – you can of course still run Containers on Virtual Machines, or deploy something like OpenShift in Azure if you wish.

Please note: any service listed as ‘Preview’ should not be used for Production deployments!


Azure Container Registry (ACR)

What is it? ACR is an Azure-hosted Container Registry based on the open-source Docker Registry v2 spec. This is a turnkey part of Azure’s Container story.

Why use it? When you build a Container Image you need a place to store it. Docker Hub is the Registry where you pull all your public Images from and which is run by Docker. ACR provides you with a private Docker-compatible Registry that you can push Images to and use as a deployment source.

Benefits:

  • Private Registry you configure that is not published on a well-known endpoint that is a source for public Images
  • Provides a unique *.azurecr.io Registry endpoint which can be used to store Images that can be deployed *anywhere* (not just Azure)
  • Webhook support that can be used for Continuous Deployment, particularly with Azure’s Web Apps for Containers (see below)
  • Control access to the Registry using Azure Active Directory Credentials
  • ACR provides seamless authentication (i.e. no configuration) with other Azure services like Azure Container Instances, Azure Container Services, App Service and Batch.
  • Geo-replication is the hotness! (requires Premium level) * Preview

Restrictions:

  • Cooler features (like geo-replication) are at higher price point only
  • I’m struggling here for others! 🙂

> ACR Documentation.


Azure Container Instances (ACI) * Preview

What is it? An Individual Container Host that can run one or more Container. No need for you to manage the Host.

Why use it? These are probably the easiest way to get going running Container-based workloads in Azure. If you have a simple workload that needs a public IP and which can talk to various Azure PaaS services then consider ACI over Web Apps for Containers or Azure Container Service.

Benefits:

  • No need for you to manage the Container Host – tell it which Containers to run and that’s it!
  • Pay per-second for use with customisable CPU Core and memory options
  • Supports multiple Containers in single ACI Host
  • “Whole of Azure” scale: deploy ACI workloads in any Azure Region (where ACI is available).

Restrictions:

  • No production Orchestrator support: there is an experimental Kubernetes Connector, but apart from this you cannot bolt an ACI Host into an Orchestrated environment
  • No VNet support: you can’t connect an ACI Host to Azure VNets.

> ACI Documentation.


Web Apps for Containers (App Service on Linux)

What is it? A Container Host that runs on a Linux-based variant of Azure’s App Service that is aimed at web-centric workloads (hence the name). Like ACI, you still don’t need to manage the Host.

Why use it? If you have a website or HTTP API workload you traditionally host on Linux and that you can (or have) containerised, then this is a good spot to start as it limits your workloads exposure to HTTP (80) and HTTPS (443) ports.

Additionally, even if you haven’t containerised your solution, you can still use this service to host it. When you select a framework to use to host your solution (like Java or PHP) the framework is deployed to Web Apps for Containers as a Docker Image!

Benefits:

  • Get access to standard App Service features like Autoscale, Custom Domains, SSL and Continuous Deployment
  • No need for you to manage the Container Host – tell the Web Apps Instance which Container to run and it will do that
  • Deploy existing Docker images from Docker Hub, Azure Container Registry or from Azure’s pre-built framework images
  • Troubleshoot containers using SSH from Kudu.

Restrictions:

  • No Orchestrator support: what you gain through using App Service you give up in not being able to bolt the Container Host into an Orchestrator like Kubernetes
  • Multi-Container deployments are not supported
  • Not all Windows-based App Service features are supported (yet)
  • Not currently supported in App Service Environments.

> Web Apps for Containers Documentation.


Azure Container Services (ACS)

What is it? A service that allows you to run Container Hosts that are managed by an Orchestrator of your choice (Kubernetes, Docker or DC/OS).

Why use it? If you already run Container workloads on VMs (regardless of hosting location) that use an Orchestrator, or you’d like to start using Containers at scale and need an Orchestrator, then this is the service to use.

Benefits:

  • No need for you to manage the underlying Virtual Machine infrastructure
  • Orchestrator and Container Host setup is managed for you by the ACS engine (which is open source)
  • Container Host scalability is supported via use of Virtual Machine Scale Sets (VMSSs)
  • All hosts (Orchestrator and Container) are vanilla instances – this is not a special “Azure release”
  • Orchestrators have Azure extensions allowing them to perform actions such as creating Azure Load Balancers when you specify load balanced workloads in your setup.
  • Integration with Azure Container Registry for Image deployments.

Restrictions:

  • You pay for both the Container Hosts and Orchestrator Nodes (they are just VMs after all)
  • You can’t increase the number of Orchestrator / Cluster Masters after you have initially created an ACS cluster
  • You can’t upgrade the Orchestrator once you have created an ACS cluster – you need to create a new ACS cluster to gain access to a newer release.

> ACS Documentation: Kubernetes | Docker | DC/OS.


Azure Container Services – Managed Kubernetes (AKS) * Preview

What is it? This service is similar to ACS (above), however in this service (which only supports Kubernetes) the Orchestrator Nodes are managed on your behalf by Microsoft.

Why use it? If you are invested in Kubernetes (or intend to use it as your Orchestrator) and would prefer not to have to manage the Orchestator Nodes then you should select this over standard ACS with “unmanaged” Kubernetes. If you are using the ACS Kubernetes offering already then this is a logical place to migrate to once AKS is Generally Available.

Benefits:

  • You don’t pay for the Orchestrator Nodes running Kubernetes
  • Orchestrator Node availability, patching and upgrading is managed by Microsoft
  • No need to create a new ACS cluster to pick up new Kubernetes releases
  • Will support 100% of the standard Kubernetes API
  • All other ACS features remain in place!

Restrictions:

  • Only supports Kubernetes!
  • During preview does not support all Kubernetes features.

> AKS Documentation.

On a side note: AKS + ACI (with its Kubernetes connector) + ACR will be an amazing PaaS Container story once all these components are all Generally Available! 😎


Azure Service Fabric

What is it? Service Fabric is both a cluster Orchestrator and a development framework for delivery of highly available, distributed applications. It pre-dates the current Container hype cycle and is used to deliver services in Azure such as CosmosDB.

Why use it? If you want to leverage the Reliable Actor and Service patterns offered by the Service Fabric development framework. Also worth considering if you haven’t yet started with an Orchestrator like Kubernetes.

Benefits:

  • Mature product that underpins key Microsoft cloud-scale services
  • Runs in Azure or on-premises.

Restrictions:

  • Container workloads can’t benefit from the development framework as they run as ‘guest executables’ on cluster nodes (this will change in future as you will be able to Containerise Reliable Actors and Services)
  • Developers using Windows 10 can’t deploy Container-based solutions to local Service Fabric clusters.

> Service Fabric Container Documentation.


Azure Batch

What is it? The name says it all really – you can use Azure Batch to run compute workloads that can be broken into lots of concurrent processes. Examples include payroll runs, animation rendering or research modelling. Batch sits well within the High Performance Compute (HPC) landscape.

Why use it? If you have a batch-style workload with processors that can be Containerised then this is a service you should seriously be considering. More-so if you wish to consider a hybrid scenario where you run some of your workload in-house and burst to Azure as required. As you have Containerised workload you can ship dependencies in a single bundle.

Benefits:

  • You can use Docker Hub as a source for Images (yes, you could pull tensorflow and run it in Azure 😉 ), in addition to ACR and any other compatible Registry
  • Use Singularity in addition to Docker Containers with Batch
  • Run processes on low-priority VMs to reduce the cost (best for non-time sensitive operations).

Restrictions:

  • RDMA (high performance networking) support only available for Containers running on Linux.

> Azure Batch Container Documentation.


So there we are – hopefully the Azure Container story now makes much more sense and you can pick between the services to understand those that would be most appropriate to your use case.

Happy days! 😎

Tagged , , , , , ,

Azure API Management: 200 OK response but no backend traffic

I’m noting this post down in the “if only someone had already made a big noise about this I might have saved some time” category.

The work I’m doing at present involves fronting some APIs with Azure API Management and then exposing them securely.

When I hit the moment I thought I was done today I was doing some testing, and no matter what I did I couldn’t get my backend service to respond, and I could clearly see no traffic hitting the backend.

After double-checking my policies and doing a few more tests (only a couple of hours) I then happened across this Stack Overflow question and its answer

It turns out that I had, somewhere along the line, removed the “forward-request” policy from the Policy applying to all APIs published via API Management.

So how to fix? As Darrell says, find the offending Policy and add the missing item back.

Edit Policy

When done it should like the image below.

API Policy

… and your API calls will now work as expected and not just give you back 200 OK! 😎

Tagged , , , , ,

Continuous Deployment for Docker with VSTS and Azure Container Registry

I’ve been watching with interest the growing maturity of Containers, and in particular their increasing penetration as a hosting and deployment artefact in Azure. While I’ve long believed them to be the next logical step for many developers, until recently they have had limited appeal to many every-day developers as the tooling hasn’t been there, particularly in the Microsoft ecosystem.

Starting with Visual Studio 2015, and with the support of Docker for Windows I started to see this stack as viable for many.

In my current engagement we are starting on new features and decided that we’d look to ASP.Net Core 2.0 to deliver our REST services and host them in Docker containers running in Azure’s Web App for Containers offering. We’re heavy uses of Visual Studio Team Services and given Microsoft’s focus on Docker we didn’t see that there would be any blockers.

Our flow at high level is shown below.

Build Pipeline

1. Developer with Visual Studio 2017 and Docker for Windows for local dev/test
2. Checked into VSTS and built using VSTS Build
3. Container Image stored in Azure Container Registry (ACR)
4. Continuously deployed to Web Apps for Containers.

We hit a few sharp edges along the way, so I thought I’d cover off how we worked around them.

Pre-requisites

There are a few things you need to have in place before you can start to use the process covered in this blog. Rather than reproduce them here in detail, go and have a read of the following items and come back when you’re done.

  • Setting up a Service Principal to allow your VSTS environment to have access to your Azure Subscription(s), as documented by Donovan Brown.
  • Create an Azure Container Registry (ACR), from the official Azure Documentation. Hint here: don’t use the “Classic” option as it does not support Webhooks which are required for Continuous Deployment from ACR.

See you back here soon 🙂

Setting up your Visual Studio project

Before I dive into this, one cool item to note, is that you can add Docker support to existing Visual Studio projects, so if you’re interested in trying this out you can take a look at how you can add support to your current solution (note that it doesn’t magically support all project types… so if you’ve got that cool XAML or WinForms project… you’re out of luck for now).

Let’s get started!

In Visual Studio do a File > New > Project. As mentioned above, we’re building an ASP.Net Core REST API, so I went ahead and selected .Net Core and ASP.Net Core Web Application.

New Project - .Net Core

Once you’ve done this you get a selection of templates you can choose from – we selected Web API and ensured that we left Docker support on, and that it was on Linux (just saying that almost makes my head explode with how cool it is 😉 )

Web API with Docker support

At this stage we now have baseline REST API with Docker support already available. You can run and debug locally via IIS Express or via Docker – give it a try :).

If you’ve not used this template before you might notice that there is an additional project in the solution that contains a series of Docker-related YAML files – for our purposes we aren’t going to touch these, but we do need to modify a couple of files included in our ASP.Net Core solution.

If we try to run a Docker build on VSTS using the supplied Dockerfile it will fail with an error similar to:

COPY failed: stat /var/lib/docker/tmp/docker-builder613328056/obj/Docker/publish: no such file or directory
/usr/bin/docker failed with return code: 1

Let’s fix this.

Add a new file to the project and name it “Dockerfile.CI” (or something similar) – it will appear as a sub-item of the existing Dockerfile. In this new file add the following, ensuring you update the ENTRYPOINT to point at your DLL.

This Dockerfile is based on a sample from Docker’s official documentation and uses a Docker Container to run the build, before copying the results to the actual final Docker Image that contains your app code and the .Net Core runtime.

We have one more change to make. If we do just the above, the project will fail to build because the default dockerignore file is stopping the copying of pretty much all files to the Container we are using for build. Let’s fix this one by updating the file to contain the following 🙂

Now we have the necessary bits to get this up and running in VSTS.

VSTS build

This stage is pretty easy to get up and running now we have the updated files in our solution.

In VSTS create a new Build and select the Container template (right now it’s in preview, but works well).

Docker Build 01

On the next screen, select the “Hosted Linux” build agent (also now in preview, but works a treat). You need to select this so that you build a Linux-based Image, otherwise you will get a Windows Container which may limit your deployment options.

build container 02

We then need to update the Build Tasks to have the right details for the target ACR and to build the solution using the “Dockerfile.CI” file we created earlier, rather than the default Dockerfile. I also set a fixed name for the Image Name, primarily because the default selected by VSTS typically tends to be invalid. You could also consider changing the tag from $(Build.BuildId) to be $(Build.BuildNumber) which is much easier to directly track in VSTS.

build container 03

Finally, update the Publish Image Task with the same ACR and Image naming scheme.

Running your build should generate an image that is registered in the target ACR as shown below.

ACR

Deploy to Web Apps for Containers

Once the Container Image is registered in ACR, you can theoretically deploy it to any container host (Azure Container Instances, Web Apps for Containers, Azure Container Services), but for this blog we’ll look at Web Apps for Containers.

When you create your new Web App for Containers instance, ensure you select Azure Container Registry as the source and that you select the correct Repository. If you have added the ‘latest’ tag to your built Images you can select that at setup, and later enable Continuous Deployment.

webappscontainers

The result will be that your custom Image is deployed into your Web Apps for Containers instance and which will be available on ports 80 and 443 for the world to use.

Happy days!

I’ve uploaded the sample project I used for this blog to Github – you can find it at: https://github.com/sjwaight/docker-dotnetcore-vsts-demo

Also, please feel free to leave any comments you have, and I am certainly interested in other ways to achieve this outcome as we considered Docker Compose with the YAML files but ran into issues at build time.

Tagged , , ,

Microsoft Open Source Roadshow – Free training on Azure – Canberra and Sydney

Microsoft Open Source Roadshow

I’m excited to have the opportunity to share Azure’s powerful Open Source support with more developers in November.

Our first run of these sessions in August proved popular, so if you, or someone you know, wants to learn more they can sign up below.

We’ll cover language support (application and Azure SDK), OS support (Linux, BSD), Datastores (MySQL, PostreSQL, MongoDB), Continuous Deployment and, last, but not least, Containers (Docker, Container Registry, Kubernetes, et al).

We’re starting off in Canberra this time, then Sydney, so if you’re interested here are the links to register:

  • Canberra – Wednesday 1 November 2017: Register
  • Sydney – Friday 10 November 2017: Register

We’re also running two days in New Zealand in November too if you know anyone who might want to come along.

If you have any questions you’d like to see answered at the days feel free to leave a comment.

I hope to see you there!

Tagged , ,

Microsoft Open Source Roadshow – Free training on Azure – Auckland and Wellington!

Microsoft Open Source Roadshow

Hello New Zealand friends!

I’m really happy to share that we are bringing the Open Source Roadshows to Auckland and Wellington in November 2017!

We’ll cover language support (application and Azure SDK), OS support (Linux, BSD), Datastores (MySQL, PostreSQL, MongoDB, SQL Server on Linux), Continuous Deployment and, last, but not least, Containers (Docker, Container Registry, Kubernetes, et al).

If you’re interested in learning more here are the links to register:

  • Auckland – Tuesday 7 November 2017: Register
  • Wellington – Wednesday 8 November 2017: Register

If you have any questions you’d like to see answered at the days feel free to leave a comment.

I hope to see you there!

Tagged , ,

Speaking: Azure Functions at MUG Strasbourg – 28 September

I’m really excited about this opportunity to share the power of Azure with the developer and IT Pro community in France that is soon to gain local Azure Regions in which to build their solutions.

If you live in the surrounding areas I’d love to see you there. More details available via Meetup.

Tagged , ,

What happened with my Azure VM Scale Set Instance IDs?

In my last couple of posts (here and here) I've covered moving from Azure VMs to running workloads on VM Scale Sets (VMSS).

One item you use need to adjust to is the instance naming scheme that is used by VMSS.

I was looking at the two VM Scale Sets I have setup in my environment and was surprised to see the following.

The first VM Scale Set I setup has Instances with these IDs (0 and 1):

VMSS 01

and my second has Instances with IDs of 2 and 3:

VMSS 02

I thought this was a bit weird that two unrelated VM Scale Sets seem to be sharing a common Instance ID source – maybe something to do with the underlying provisioning engine continuing on Instance IDs across VMSS on a VNet, or something similar?

As it turns out, this is entirely coincidental.

When you provision a VMSS and you set the "overprovision" flag to "true" the provisioning engine will build more Instances than required (you can see this if you watch in the portal while the VMSS is being provisioned) and then delete the excess above what is actually required. This is by design and is described by Microsoft on their VMSS design considerations page.

Here's a snippet of what your ARM template will look like:

"properties": {
"overprovision": "true",
"singlePlacementGroup": "false",
"upgradePolicy": {
"mode": "Manual"
}
}

So, for my scenario, it just happens that for the first VMSS that the engine deleted the Instances above Instance 1, and for the second VMSS the engine deleted Instances 0 and 1!

Too Easy! 🙂

Tagged , ,

Moving from Azure VMs to Azure VM Scale Sets – Runtime Instance Configuration

In my previous post I covered how you can move from deploying a solution to pre-provisioned Virtual Machines (VMs) in Azure to a process that allows you to create a custom VM Image that you deploy into VM Scale Sets (VMSS) in Azure.

As I alluded to in that post, one item we will need to take care of in order to truly move to a VMSS approach using a VM image is to remove any local static configuration data we might bake into our solution.

There are a range of options you can move to when going down this path, from solutions you custom build to running services such as Hashicorp’s Consul.

The environment I’m running in is fairly simple, so I decided to focus on a simple custom build. The remainder of this post is covering the approach I’ve used to build a solution that works for me, and perhaps might inspire you.

I am using an ASP.Net Web API as my example, but I am also using a similar pattern for Windows Services running on other VMSS instances – just the location your startup code goes will be different.

The Starting Point

Back in February I blogged about how I was managing configuration of a Web API I was deploying using VSTS Release Management. In that post I covered how you can use the excellent Tokenization Task to create a Web Deploy Parameters file that can be used to replace placeholders on deployment in the web.config of an application.

My sample web.config is shown below.

The problem with this approach when we shift to VM Images is that these values are baked into the VM Image which is the build output, which in turn can be deployed to any environment. I could work around this by building VM Images for each environment to which I deploy, but frankly that is less than ideal and breaks the idea of “one binary (immutable VM), many environments”.

The Solution

I didn’t really want to go down the route of service discovery using something like Consul, and I really only wanted to use Azure’s default networking setup. This networking requirement meant no custom private DNS I could use in some form of configuration service discovery based on hostname lookup.

…and…. to be honest, with the PaaS services I have in Azure, I can build my own solution pretty easily.

The solution I did land on looks similar to the below.

  • Store runtime configuration in Cosmos DB and geo-replicate this information so it is highly available. Each VMSS setup gets its own configuration document which is identified by a key-service pair as the document ID.
  • Leverage a read-only Access Key for Cosmos DB because we won’t ever ask clients to update their own config!
  • Use Azure Key Vault as to store the Cosmos DB Account and Access Key that can be used to read the actual configuration. Key Vault is Regionally available by default so we’re good there too.
  • Configure an Azure AD Service Principal with access to Key Vault to allow our solution to connect to Key Vault.

I used a conventions-based approach to configuration, so that the whole process works based on the VMSS instance name and the service type requesting configuration. You can see this in the code below based on the URL being used to access Key Vault and the Cosmos DB document ID that uses the same approach.

The resulting changes to my Web API code (based on the earlier web.config sample) are shown below. This all occurs at application startup time.

I have also defined a default Application Insights account into which any instance can log should it have problems (which includes not being able to read its expected Application Insights key). This is important as it allows us to troubleshoot issues without needing to get access to the VMSS instances.

Here’s how we authorise our calls to Key Vault to retrieve our initial configuration Secrets (called on line 51 of the above sample code).

My goal was to make configuration easily manageable across multiple VMSS instances which requires some knowledge around how VMSS instance names are created.

The basic details are that they consist of a hostname prefix (based on what you input at VMSS creation time) that is appended with a base-36 (hexatrigesimal) value representing the actual instance. There’s a great blog from Guy Bowerman from Microsoft that covers this in detail so I won’t reproduce it here.

The final piece of the puzzle is the Cosmos DB configuration entry which I show below.

The ‘id’ field maps to the VMSS instance prefix that is determined at runtime based on the name you used when creating the VMSS. We strip the trailing 6 characters to remove the unique component of each VMSS instance hostname.

The outcome of the three components (code changes, Key Vault and Cosmos DB) is that I can quickly add or remove VMSS groups in configuration, change where their configuration data is stored by updating the Key Vault Secrets, and even update running VMSS instances by changing the configuration settings and then forcing a restart on the VMSS instances, causing them to re-read configuration.

Is the above the only or best way to do this? Absolutely not 🙂

I’d like to think it’s a good way that might inspire you to build something similar or better 🙂

Interestingly, getting to this stage as well, I’ve also realised there might be some value in considering moving this solution to Service Fabric in future, though I am more inclined to shift to Containers running under the control an orchestrator like Kubernetes.

What are you thoughts?

Until the next post!

Tagged , , , , , ,