Moving from Azure VMs to Azure VM Scale Sets – Runtime Instance Configuration

In my previous post I covered how you can move from deploying a solution to pre-provisioned Virtual Machines (VMs) in Azure to a process that allows you to create a custom VM Image that you deploy into VM Scale Sets (VMSS) in Azure.

As I alluded to in that post, one item we will need to take care of in order to truly move to a VMSS approach using a VM image is to remove any local static configuration data we might bake into our solution.

There is a range of options you can move to when going down this path, from solutions you custom build to running services such as HashiCorp’s Consul.

The environment I’m running in is fairly simple, so I decided to focus on a simple custom build. The remainder of this post covers the approach I’ve used to build a solution that works for me, and perhaps might inspire you.

I am using an ASP.NET Web API as my example, but I am also using a similar pattern for Windows Services running on other VMSS instances – only the location of your startup code will be different.

The Starting Point

Back in February I blogged about how I was managing configuration of a Web API I was deploying using VSTS Release Management. In that post I covered how you can use the excellent Tokenization Task to create a Web Deploy Parameters file that can be used to replace placeholders on deployment in the web.config of an application.

My sample web.config is shown below.

<configuration>
  <appSettings>
    <add key="webpages:Version" value="3.0.0.0" />
    <add key="webpages:Enabled" value="false" />
    <add key="ClientValidationEnabled" value="true" />
    <add key="UnobtrusiveJavaScriptEnabled" value="true" />
    <add key="LoggingDatabaseAccount" value="__docdburi__" />
    <add key="LoggingDatabaseKey" value="__docdbkey__" />
    <add key="LoggingDatabase" value="__loggingdb__" />
    <add key="LoggingDatabaseCollection" value="__loggingdbcollection__" />
  </appSettings>
</configuration>

The problem with this approach when we shift to VM Images is that these values are baked into the VM Image. The image is the build output, and it should be deployable to any environment. I could work around this by building VM Images for each environment to which I deploy, but frankly that is less than ideal and breaks the idea of “one binary (immutable VM), many environments”.

The Solution

I didn’t really want to go down the route of service discovery using something like Consul, and I really only wanted to use Azure’s default networking setup. That networking constraint ruled out a custom private DNS, which I might otherwise have used for some form of configuration service discovery based on hostname lookup.

And, to be honest, with the PaaS services I have in Azure, I can build my own solution pretty easily.

The solution I did land on looks similar to the below.

  • Store runtime configuration in Cosmos DB and geo-replicate this information so it is highly available. Each VMSS setup gets its own configuration document which is identified by a key-service pair as the document ID.
  • Leverage a read-only Access Key for Cosmos DB because we won’t ever ask clients to update their own config!
  • Use Azure Key Vault to store the Cosmos DB Account and Access Key that can be used to read the actual configuration. Key Vault is regionally available by default so we’re good there too.
  • Configure an Azure AD Service Principal with access to Key Vault to allow our solution to connect to Key Vault.

Update July 2018: Microsoft has released Managed Service Identities as a way to do delegated permissions between resources in Azure – I would strongly advise wrapping your head around this and leveraging MSI as the way to connect your VMSS instances to Key Vault (and potentially other resources).
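
If you do go the MSI route, the Key Vault client can be built without any stored credentials. The following is a minimal sketch rather than what I run in production: it assumes the VMSS has a managed identity enabled with read access to Key Vault secrets, and that the Microsoft.Azure.Services.AppAuthentication NuGet package is installed. The class name ManagedIdentityKeyVault is hypothetical.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.KeyVault;
using Microsoft.Azure.Services.AppAuthentication;

// Hypothetical helper: builds a Key Vault client that authenticates with the
// managed identity of the VM it runs on, so no client secret is stored in code.
public static class ManagedIdentityKeyVault
{
    public static KeyVaultClient CreateClient()
    {
        // With no connection string supplied, the provider uses the managed
        // identity when running in Azure (assumption: one is enabled on the VMSS).
        var tokenProvider = new AzureServiceTokenProvider();
        return new KeyVaultClient(
            new KeyVaultClient.AuthenticationCallback(tokenProvider.KeyVaultTokenCallback));
    }

    // Reads a single secret value from its full secret URI.
    public static async Task<string> ReadSecretAsync(KeyVaultClient client, string secretUri)
    {
        var secret = await client.GetSecretAsync(secretUri).ConfigureAwait(false);
        return secret.Value;
    }
}
```

With this in place the hardcoded Service Principal ID and key are no longer needed in the solution.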

I used a conventions-based approach to configuration, so that the whole process works based on the VMSS instance name and the service type requesting configuration. You can see this in the code below based on the URL being used to access Key Vault and the Cosmos DB document ID that uses the same approach.
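
To make the convention concrete, here is a small sketch of how the names can be derived from an instance key and a service identifier. The class and method names (ConfigConventions, DocumentId, SecretUri) are hypothetical and only for illustration; the naming pattern itself is the one used in this post.

```csharp
// Hypothetical helper illustrating the naming convention:
// everything is derived from the VMSS prefix and the service type.
public static class ConfigConventions
{
    // Cosmos DB document ID, e.g. "swtst01-webapi".
    public static string DocumentId(string instanceKey, string service)
    {
        return $"{instanceKey}-{service}";
    }

    // Key Vault secret URI, e.g. ".../secrets/swtst01-webapi-configdb-account/".
    // secretSuffix is "configdb-account" or "configdb-key" in this post.
    public static string SecretUri(string vaultBaseUrl, string instanceKey, string service, string secretSuffix)
    {
        return $"{vaultBaseUrl.TrimEnd('/')}/secrets/{instanceKey}-{service}-{secretSuffix}/";
    }
}
```

For a VMSS created with the prefix swtst01 and the service identifier webapi, this gives the document ID swtst01-webapi and a secret named swtst01-webapi-configdb-account.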

The resulting changes to my Web API code (based on the earlier web.config sample) are shown below. This all occurs at application startup time.

I have also defined a default Application Insights account into which any instance can log should it have problems (which includes not being able to read its expected Application Insights key). This is important as it allows us to troubleshoot issues without needing to get access to the VMSS instances.

using System;
using System.Collections.Generic;
using System.Configuration;
using System.Linq;
using System.Reflection;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.KeyVault;
using MyDemo.Website.WebApi.Helpers;

namespace MyDemo.Website.WebApi
{
    public class WebApiApplication : System.Web.HttpApplication
    {
        // This value is used if for some reason the configuration doesn't have a defined
        // application insights instrumentation key
        private const string DefaultAppInsightsKey = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx";
        // Yes, these are hardcoded – need to change? Recompile and build VM Image and redeploy.
        private const string ConfigurationDbId = "apiconfigs";
        private const string ConfigurationDbCollection = "apiconfigentries";
        private const string ServiceIdentifier = "webapi";
        private static Dictionary<string, string> instanceTelemetry;
        internal static string LoggingDatabaseAccount;
        internal static string LoggingDatabaseKey;
        internal static string LoggingDatabase;
        internal static string LoggingDatabaseCollection;
        internal static string ConfigDatabaseAccount;
        internal static string ConfigDatabaseKey;

        protected void Application_Start()
        {
            ApplyConfiguration();
            // removed other initialisation code
        }

        private void ApplyConfiguration()
        {
            // Assumes you correctly version your assemblies… you do, don't you? 🙂
            instanceTelemetry = new Dictionary<string, string>
            {
                { "Version", Assembly.GetExecutingAssembly().GetName().Version.ToString() },
                { "Hostname", Environment.MachineName }
            };
            // Assumption that VM is an instance in an Azure VM Scale Set where
            // instance names end with a 6 character hexatrigesimal value to uniquely identify them.
            var instanceKey = Environment.MachineName.Substring(0, Environment.MachineName.Length - 6); // strip last 6 characters
            // Secret URI in keyvault that will contain config DB information.
            // Yes, the KeyVault instance is hardcoded which requires the solution to be re-compiled and re-deployed if
            // changed. This is on purpose to stop people editing configuration on running instances. They
            // should be editing the source code and pushing through the CD pipeline to re-bake the VM image.
            var configDbSecretUri = $"https://yourkeyvaultinstance.vault.azure.net/secrets/{instanceKey}-{ServiceIdentifier}-configdb-account/";
            var configDbKeySecretUri = $"https://yourkeyvaultinstance.vault.azure.net/secrets/{instanceKey}-{ServiceIdentifier}-configdb-key/";
            // Call Key Vault and retrieve secret.
            // Uses an Azure AD Service Principal to access
            var keyVaultClient = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(KevVaultUtils.GetToken));
            // Configuration DB account and key read from KeyVault.
            ConfigDatabaseAccount = keyVaultClient.GetSecretAsync(configDbSecretUri).Result.Value;
            ConfigDatabaseKey = keyVaultClient.GetSecretAsync(configDbKeySecretUri).Result.Value;
            var configResult = new ApiConfiguration();
            try
            {
                // DocumentDbClient is just a helper class that provides a singleton client class I can use.
                configResult = DocumentDbClient.ConfigInstance.CreateDocumentQuery<ApiConfiguration>(
                        UriFactory.CreateDocumentCollectionUri(ConfigurationDbId, ConfigurationDbCollection), new FeedOptions { MaxItemCount = 1 })
                    .Where(ac => ac.Id == $"{instanceKey}-{ServiceIdentifier}" && ac.AccessKey == ConfigDatabaseKey).AsEnumerable().FirstOrDefault();
                // FirstOrDefault returns null when no matching document exists, so check for that first.
                if (configResult != null && !string.IsNullOrWhiteSpace(configResult.Id))
                {
                    // Default fallback App Insights instance is listed above.
                    // This means even if we have a misconfiguration we should get telemetry *somewhere*
                    TelemetryConfiguration.Active.InstrumentationKey = string.IsNullOrWhiteSpace(configResult.AppInsightsKey) ? DefaultAppInsightsKey : configResult.AppInsightsKey;
                    // Our runtime values that can be used by our code
                    LoggingDatabaseAccount = configResult.LoggingDatabaseAccount;
                    LoggingDatabaseKey = configResult.LoggingDatabaseKey;
                    LoggingDatabase = configResult.LoggingDatabase;
                    LoggingDatabaseCollection = configResult.LoggingDatabaseCollection;
                }
                else
                {
                    throw new ConfigurationErrorsException("Service is not configured correctly.");
                }
            }
            catch (Exception ex)
            {
                AppInsightsClient.ClientInstance.TrackException(ex, instanceTelemetry);
                // Bubble Exception
                throw;
            }
        }
    }
}

Here’s how we authorise our calls to Key Vault to retrieve our initial configuration Secrets (this is the callback passed to the KeyVaultClient constructor in the above sample code).

using System;
using System.Threading.Tasks;
using Microsoft.IdentityModel.Clients.ActiveDirectory;

namespace MyDemo.Website.WebApi.Helpers
{
    public static class KevVaultUtils
    {
        // You could also utilise a certificate-based service principal as well
        // Yes, these are hardcoded 🙂
        private const string kvSP = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx";
        private const string kvSPkey = "NOTREALLYTHEPASSWORD";

        // Matches the KeyVaultClient.AuthenticationCallback signature.
        public static async Task<string> GetToken(string authority, string resource, string scope)
        {
            var authContext = new AuthenticationContext(authority);
            var clientCred = new ClientCredential(kvSP, kvSPkey);
            var result = await authContext.AcquireTokenAsync(resource, clientCred);
            if (result == null)
            {
                throw new InvalidOperationException("Failed to obtain the JWT token");
            }
            return result.AccessToken;
        }
    }
}

My goal was to make configuration easily manageable across multiple VMSS instances, which requires some knowledge of how VMSS instance names are created.

The basic details are that they consist of a hostname prefix (based on what you input at VMSS creation time) that is appended with a base-36 (hexatrigesimal) value representing the actual instance. There’s a great blog post from Guy Bowerman from Microsoft that covers this in detail so I won’t reproduce it here.
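
As a rough illustration of that naming scheme, the sketch below splits a hostname into its prefix and decodes the 6 character base-36 suffix back to a number. The class name VmssHostname and the sample hostnames are made up for this example, and it assumes the suffix is always exactly 6 characters as described above.

```csharp
using System;

// Hypothetical helper for working with VMSS instance hostnames of the form
// <prefix><6 character base-36 value>, e.g. "swtst0100000A".
public static class VmssHostname
{
    private const string Base36Digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private const int SuffixLength = 6;

    // Returns the shared prefix, e.g. "swtst01" for "swtst0100000A".
    public static string GetPrefix(string machineName)
    {
        EnsureLongEnough(machineName);
        return machineName.Substring(0, machineName.Length - SuffixLength);
    }

    // Decodes the base-36 suffix to a number, e.g. "00000A" gives 10.
    public static long GetInstanceNumber(string machineName)
    {
        EnsureLongEnough(machineName);
        var suffix = machineName.Substring(machineName.Length - SuffixLength).ToUpperInvariant();
        long value = 0;
        foreach (var c in suffix)
        {
            var digit = Base36Digits.IndexOf(c);
            if (digit < 0)
            {
                throw new FormatException($"'{c}' is not a base-36 digit.");
            }
            value = (value * 36) + digit;
        }
        return value;
    }

    private static void EnsureLongEnough(string machineName)
    {
        if (machineName == null || machineName.Length <= SuffixLength)
        {
            throw new ArgumentException("Hostname is too short to be a VMSS instance name.", nameof(machineName));
        }
    }
}
```

Only the prefix matters for configuration lookup; the decoded number is handy mainly when you want to tie telemetry back to a specific instance.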

The final piece of the puzzle is the Cosmos DB configuration entry which I show below.

{
  "id": "swtst01-webapi",
  "AccessKey": "MATCHES_KEY_OF_CONFIG_DB",
  "AppInsightsKey": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy",
  "LoggingDatabaseAccount": "https://myruntimedb.documents.azure.com:443/",
  "LoggingDatabaseKey": "COSMOS_DB_KEY_OF_DBACCOUNT",
  "LoggingDatabase": "runtimeitems",
  "LoggingDatabaseCollection": "runtimedatabases"
}

The ‘id’ field is the VMSS instance prefix followed by the service identifier (‘webapi’ in this case). The prefix is determined at runtime and is based on the name you used when creating the VMSS: we strip the trailing 6 characters to remove the unique component of each VMSS instance hostname.
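
I haven’t shown the ApiConfiguration class the startup code deserialises into, so here is a sketch of what a matching class could look like. This is an assumption based on the JSON document and the properties the startup code reads, not the exact class from my solution, and it assumes Newtonsoft.Json (which the DocumentDB SDK uses for serialisation). The ApiConfigurationExample helper is purely illustrative.

```csharp
using Newtonsoft.Json;

// Sketch of a POCO matching the configuration document.
public class ApiConfiguration
{
    // Cosmos DB requires the lowercase "id" property name.
    [JsonProperty("id")]
    public string Id { get; set; }
    public string AccessKey { get; set; }
    public string AppInsightsKey { get; set; }
    public string LoggingDatabaseAccount { get; set; }
    public string LoggingDatabaseKey { get; set; }
    public string LoggingDatabase { get; set; }
    public string LoggingDatabaseCollection { get; set; }
}

// Hypothetical helper showing the mapping from JSON to the class.
public static class ApiConfigurationExample
{
    public static ApiConfiguration Parse(string json)
    {
        return JsonConvert.DeserializeObject<ApiConfiguration>(json);
    }
}
```

The JsonProperty attribute on Id is the important part, as it lets the C# property keep its usual casing while still mapping to the document’s ‘id’ field.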

The outcome of the three components (code changes, Key Vault and Cosmos DB) is that I can quickly add or remove VMSS groups in configuration, change where their configuration data is stored by updating the Key Vault Secrets, and even update running VMSS instances by changing the configuration settings and then forcing a restart on the VMSS instances, causing them to re-read configuration.

Is the above the only or best way to do this? Absolutely not 🙂

I’d like to think it’s a good way that might inspire you to build something similar or better 🙂

Interestingly, having got to this stage, I’ve also realised there might be some value in considering moving this solution to Service Fabric in future, though I am more inclined to shift to Containers running under the control of an orchestrator like Kubernetes.

What are your thoughts?

Until the next post!
