Microsoft Application Insights – APM for Everyone

When you work as heavily as I have with a technology like Application Insights you do tend to forget the amazing power you have at your fingertips.

Over the last few years I’ve come to rely heavily on Application Insights as the primary Application Performance Management (APM) tool of choice for services I build, whether they are hosted in Azure or not.

In this post I am going to take a quick walk through features that I think every developer should know about in Application Insights so they can get maximum benefit from it too!

Your language has an SDK

Chances are pretty good that if you’re on a popular platform, Application Insights will have an SDK you can use. SDKs are great because adding them to a solution produces a bunch of default telemetry with nothing more than an Instrumentation Key required.

The Application Insights team maintains their SDK documentation and SDK code references on GitHub. Needless to say .Net has great support, but Java, JavaScript and Node.js also get first-party support, with community support for Go, Python and Ruby. Want to do APM that includes native mobile experiences? No problem, drop in the HockeyApp SDKs.
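
To give a sense of just how little code is involved, here’s a minimal .Net console-style sketch (the Instrumentation Key is a placeholder, and in an ASP.Net app the SDK wires most of this up for you at startup):

using System;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;

class Program
{
    static void Main()
    {
        // Placeholder - use the Instrumentation Key from your Application Insights resource.
        TelemetryConfiguration.Active.InstrumentationKey = "00000000-0000-0000-0000-000000000000";

        var telemetry = new TelemetryClient();
        telemetry.TrackTrace("Application started"); // shows up as Trace telemetry
        telemetry.Flush();                           // push telemetry before the process exits

        Console.WriteLine("Telemetry sent.");
    }
}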

Use it regardless of your hosting environment

Not using Azure to host your solution? Not a problem. If you can make outbound calls from your host to Application Insights then you can use Application Insights. 💯

Useful free tier

In an upcoming post I’ll talk more about the perceived and actual value of free services in the cloud, but let me say that for most basic scenarios the 5 GB of ingested Application Insights data per month will more than suffice. If not, you can manage your costs by moving to a sampling model, which means you can still glean useful insights about your application’s behaviours without breaking the bank.

No features are removed at the free pricing tier either – you can still do full analytics on the log information that is captured!

Dependency tracking

The out-of-the-box dependency tracking is super handy to diagnose performance issues that result from upstream calls.

The only downside here is that the default capabilities are good at tracking HTTP-based dependencies, SQL Server, and not much else (at the time of writing). Having said this, there is a published way to track other custom dependencies if needed, though it requires dedicated code – the out-of-the-box tracking requires no additional special code, which is amazing!
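
For reference, the custom route boils down to calling TrackDependency yourself – a rough sketch is below (the dependency and operation names are invented for illustration, not taken from a real integration):

using System;
using System.Diagnostics;
using Microsoft.ApplicationInsights;

public class LegacyQueueClient
{
    private readonly TelemetryClient _telemetry = new TelemetryClient();

    // Wraps a call to something the SDK doesn't track automatically and
    // reports it to Application Insights as a dependency call.
    public void Send(string message)
    {
        var startTime = DateTimeOffset.UtcNow;
        var timer = Stopwatch.StartNew();
        var success = false;
        try
        {
            // ... call the untracked dependency here ...
            success = true;
        }
        finally
        {
            timer.Stop();
            _telemetry.TrackDependency("LegacyQueue", "Send", message, startTime, timer.Elapsed, success);
        }
    }
}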

I have to say that HTTP dependency tracking has been exceptionally useful in a REST-heavy environment, even tracking HTTP calls to external service providers like SendGrid, Twilio and others, providing us with an easily accessible view of where our latency arises.

The sample below shows dependency behaviour for a single request to a caching service in an application. The very first request (at the bottom of the list) is a call to Cosmos DB which returns a 404 (Not Found) HTTP status code; this then triggers a lookup of some data via an HTTP call to an API, with the result then written to Cosmos DB for the next request. This is super useful information and I did precisely nothing to my code (other than add the Application Insights SDK to my solution) to capture this for every request!

Remote Dependencies

Track impact of releases

Application Insights has a REST API which allows you to add custom steps to your Continuous Deployment pipelines that publish a Release Annotation to your timeline in Application Insights, so you can see whether a release impacts your solution.

Visual Studio Team Services’ Release Management will do this for you automatically, but if you aren’t using VSTS then you can still leverage this capability. A sample is shown below (thankfully we had no negative impact with this release!)

Release Annotation

Insights to your inbox

Application Insights can email you a regular summary of your application’s key stats – super handy if you don’t want to go hunting for them, or if you want to share aggregated stats with stakeholders.

App Insights Email

Heavy duty analytics

If the default experiences in the Azure Portal aren’t enough, then you can leverage the power of Azure Log Analytics to perform more detailed queries, drill into your data, and build tables or graphs from the results.

A good example of this is the answer I provided to the following question from Troy on Twitter.

Each request will be captured along with useful metadata (in this case from the underlying .Net codebase) which allows us to do further querying and filtering on the data.

Here’s a sample of such a request (this one is an HTTP request to an API endpoint) with the metadata that’s needed to help solve Troy’s question.

Sample HTTP Request

The trick is then to head over to the Log Analytics environment…

Open Analytics

… and then drill into the data to get your desired answer.

Analytics query

You can then tabulate or graph the output. The above is a really simple query – trust me, you can do far more complicated things than this!
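
As a purely illustrative example (not Troy’s actual query), a simple Analytics query over request telemetry might look like the following – the requests table and its timestamp, duration and resultCode columns are part of the standard Application Insights schema:

requests
| where timestamp > ago(24h)
| summarize requestCount = count(), avgDurationMs = avg(duration) by resultCode
| order by requestCount desc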

Failure drill-in

This view has recently improved and become far more interactive – you can easily identify common reasons for failures and, in my experience, drill right in to identify the root cause within a matter of moments!

In HTTP applications you do get a bit of noise (things like expected 401, 403 and 404 responses) which can be annoying to sift through, particularly for REST-type APIs, but it’s a small price to pay for the power you get!

Failures View

Availability Checks, Health Alerts and Smart Detection

I’m not going to go into these in too much detail, but you can also set Alerts and health checks in Application Insights, and the service will analyse trends and alert you to items that may require your attention (even if you don’t have a specific rule set up).

Custom Events, User Journeys and Cohorts

As with health checks and alerts, I am not going to go through these in detail, but if this is the sort of insight you need then it is possible to access it here too. If you need to log custom data in Application Insights you can do that using Custom Events.
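
To illustrate the Custom Events piece, here’s a small .Net sketch – the event name, property and metric are invented for the example:

using System.Collections.Generic;
using Microsoft.ApplicationInsights;

public static class CheckoutTelemetry
{
    private static readonly TelemetryClient Telemetry = new TelemetryClient();

    // Logs a custom event with string properties (filterable in the Portal)
    // and numeric metrics (aggregatable) attached.
    public static void TrackCheckoutCompleted(string tierRating, double basketValue)
    {
        var properties = new Dictionary<string, string> { ["TierRating"] = tierRating };
        var metrics = new Dictionary<string, double> { ["BasketValue"] = basketValue };
        Telemetry.TrackEvent("CheckoutCompleted", properties, metrics);
    }
}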

What are you waiting for?!

I can honestly say I would be hard pressed these days to build anything without including Application Insights in it, particularly if I won’t have direct access to the hosting environment.

Troubleshooting runtime issues becomes much easier with the details you can glean from walking request stacks as presented by Application Insights. I’ve isolated and fixed more than my fair share of runtime issues (mostly configuration related) without ever needing to try to reproduce them locally, because I could quickly tell via the telemetry where things were going wrong.

Happy days! 😎


Provide non-admin users with read-only access to Service Endpoints in VSTS

I am currently transitioning some work to another team in our business. Part of this transition has been to pre-configure various Service Endpoints in Visual Studio Team Services (VSTS) to provide a way for the new team to deploy into target Azure environments without the team necessarily having direct or privileged access into those Azure environments.

In this post I am going to look at how you can grant users access to these Service Endpoints without them being able to modify them. This post will also be useful if you’ve configured Service Endpoints (as an admin) and then others on the team (who are non-admins) are unable to see them.

Note that this advice applies to any Service Endpoint – not just Azure!

By default only users who are members of the following groups can see Service Endpoints:

– Project Admins
– Endpoint Admins
– Endpoint Creators.

It’s unlikely that you want all your team members to hold these roles, so let’s see how we can grant rights to use Service Endpoints without being an admin!

We’re going to complete this task with an existing Service Endpoint, but you should hopefully see how you can do this at the same time you set up a new Endpoint in future.

Open up your Team Project and in the top navigation mouse over the settings (cog) icon, then from the context menu click “Services”.

Service Endpoints

Once the Endpoints page has loaded, select the Endpoint you wish to allow non-admin users to see.

Selected Endpoint

Now click on ‘Roles’ to display the currently assigned users and groups and their permissions (the current list will only contain users or groups at an ‘Administrators’ level).

Roles Screen

Now we’re in the right place to add our additional read-only users or groups!

Click on the ‘+ Add’ button and the Add user dialog is displayed. Ensure that the ‘Role’ is set to ‘User’ and then find the User or Group you want to assign this right to. In our demo below we are allowing the current project’s Contributors group to use Endpoints.

Add user dialog

Once you click the ‘Add’ button the user or group will be granted read-only rights to the Endpoint. This will allow them to find or use the Endpoint in Build or Release Management Definitions (like below).

Release Definition

Happy (secured) days! 😎


Azure AD B2C Custom Attributes: How to easily find their unique key value

When working with Azure Active Directory B2C you can create what are known as Custom Attributes which allow you to store data about users beyond the attributes (firstname, lastname, etc) that are available out-of-the-box.

When you want to work with these Custom Attributes in a solution you build you will need to know the unique key of the attribute in order to reference it.

What do I mean by this? Let’s take a quick look using an example.

Note that you will need to be a B2C Global Admin in order to perform some tasks covered in this post.

Creating Custom Attributes

These are created via the Azure Management Portal. In my sample I am going to add an attribute called “TierRating” to hold a tier rating for a user (say, Gold, Silver or Bronze).

The video below shows how you can do this.

Find Attribute’s Unique Key Value

Now that we have this Custom Attribute created we will want to use it in our solution. If you’re eagle-eyed you may find in the Portal that these Custom Attributes appear to be named ‘extension_AttributeName’ (i.e. ‘extension_TierRating’).

This won’t work in your solution though 🙂

When you create a Custom Attribute this is actually being done for you by a custom application called the “b2c-extensions-app” that is deployed to all B2C tenants at provisioning time.

Why am I telling you this? I am telling you this because it’s the key to determining the Custom Attribute’s unique key value 🙂

You will need the Application ID for the b2c-extensions-app, which you can find in the Portal as shown in the video below.

Using it in your code

Now we have this value (in our demo video the value is ‘bb10b272-0267-46f0-8b6f-4367e8b1b1e6’) we can start to interact with Custom Attributes in our code.

Firstly we need to drop the dashes so it becomes ‘bb10b272026746f08b6f4367e8b1b1e6’. We combine this with the “Name” value for the Attribute, along with a prefix of “extension_”.

So for our tier rating Custom Attribute the full key for it becomes ‘extension_bb10b272026746f08b6f4367e8b1b1e6_TierRating’.

A sample of how this key is used in our solution is shown below.
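
(The original embedded sample hasn’t survived here, so the following is a stand-in C# sketch showing the key being built and attached to a user payload – the attribute value and the Graph API reference in the comment are illustrative assumptions, not the exact code from our solution.)

using System;
using System.Collections.Generic;

class CustomAttributeKeyDemo
{
    static void Main()
    {
        // Application ID of the b2c-extensions-app with the dashes dropped.
        const string extensionsAppId = "bb10b272026746f08b6f4367e8b1b1e6";

        // Full unique key: extension_{extensionsAppId}_{AttributeName}
        string tierRatingKey = $"extension_{extensionsAppId}_TierRating";

        // The key is then used as the property name when reading or writing the
        // attribute on the user object (for example in a payload sent to the
        // Azure AD Graph API).
        var userUpdate = new Dictionary<string, object>
        {
            [tierRatingKey] = "Gold"
        };

        Console.WriteLine(tierRatingKey); // extension_bb10b272026746f08b6f4367e8b1b1e6_TierRating
    }
}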

This pattern is used for every Custom Attribute you create in this Directory.

So there we have it – the easiest way you can determine the actual unique key for a Custom Attribute!

Happy days 😎


Easy Release Versioning for .Net Projects using VSTS and TFS

Versioning. Here we are. Again.

Over the years I have always worked hard to make versioning a foundational piece of every CI / CD solution I’ve setup. Reliable, logical versioning becomes key to long-term maintenance and troubleshooting efforts, and whatever you can do to make it a “no-brainer” is worth it (your future self will thank you).

The move to .Net Core changed the way a few items work in the .Net world, including versioning, and besides, I am always looking for ways to make versioning easier.

So here’s my cheat-sheet for versioning your solutions. It won’t suit all application types, but for my use case (.Net Web Apps) it works just fine. It will work with VSTS and newer TFS versions too.

I haven’t tested on VB projects, but this should work for them just as easily as C#.

.Net Core: Set Up Your Project File

Versioning has been simplified in the .Net Core world. Edit your csproj and modify it as follows:

<PropertyGroup>
  <Version Condition=" '$(BUILD_BUILDNUMBER)' == '' ">1.0.0.0</Version>
  <Version Condition=" '$(BUILD_BUILDNUMBER)' != '' ">$(BUILD_BUILDNUMBER)</Version>
</PropertyGroup>

If your file doesn’t have a version node, add the above. This tip comes from Stack Overflow, but I’ve modified it slightly.

The above setup will mean debugging locally will give you a version of 1.0.0.0, and in the event you build in a non-VSTS / TFS environment you will also end up with a 1.0.0.0 version. $(BUILD_BUILDNUMBER) is an environment variable set by Team Build which will be updated at build time by VSTS or TFS.

.Net Framework: Add a Custom Task

In the “old” .Net world we have to update the properties of the AssemblyInfo file that is a part of the project, specifically targeting File Version and Assembly Version.

There isn’t an in-built build Task to do this for you, and rather than hack together a script, why not use a great custom task from the marketplace (which also supports TFS)?

I’m using the “Assembly Info” task from Bleddyn Richards, primarily because it has the most recent updated date out of the similar tasks available, which means it’s hopefully getting plenty of love and care from the owner 🙂

Add the above Task to your build definition (make sure to do it before you build the Solution / project) and then set the version numbering as shown below.

VSTS Task Config - versioning

Set Up Build Versioning

The above steps are great, but they will count for nothing (or cause a compile failure) if we don’t have a valid version number.

The default VSTS build number takes this format:

$(date:yyyyMMdd)$(rev:.r)

This results in a build number that looks like this:

20180201.1 (for the first build on February 1 2018).

This isn’t a valid .Net Version number, so we need to change it.

First, let’s add two Variables to our build definition: MajorVersion and MinorVersion.

You can set these to any valid integer value. These can be manually controlled over time as you determine the need to increment Major and Minor version numbers. Note you can make them whatever you like, keeping in mind the size restriction I mention below.

Build Variables

Now let’s change the Build Numbering scheme to use these variables, a specific date format, and the revision:

Number Format

$(MajorVersion).$(MinorVersion).$(date:yy)$(DayOfYear)$(rev:.r)

Which produces a build number that looks like this:

2.0.18037.1 (for the first build on February 6 2018, with Major Version 2, Minor Version 0).

You can choose a format that works for you, with one proviso: each version segment must be less than 65,000. That sounds like a lot, until you consider that 20180201 (Feb 1, 2018) is, as an integer (20,180,201), larger than 65,000. Hence my decision to drop to using YY (if you’re reading this in the year 2065 I apologise for my shortsightedness).

The result of these changes will mean that you’ll have a lovely version number automatically written into your solution at build time. An example from a .Net Framework solution is shown below.

Properties Dialog

Happy Days 😎


Twitter on Linux in Windows Subsystem for Linux

First of all, tip of the hat to Geoff Huntley for putting this in my timeline to start off with :).

So how to get Rainbow Stream to run on Windows Subsystem for Linux (WSL)? Easily!

I’m running on the Slow Ring Insiders (currently on 17074), but hopefully these instructions will work for you.

Crack open a bash shell by running ‘bash’ on your Windows machine and then enter

sudo apt-get install python-pip

sudo apt-get install python-dev libjpeg-dev libfreetype6 libfreetype6-dev zlib1g-dev

sudo pip install backports.functools_lru_cache

sudo pip install rainbowstream

rainbowstream -iot

Now you will enter an interactive console at which you will need to authorise Rainbow Stream to access your profile and act as a client.

The video below shows you the actions you need to take. Enjoy!


Recommendations on using Terraform to manage Azure resources

If you’ve been working in the cloud infrastructure space for the last few years you can’t have missed the buzz around HashiCorp’s Terraform product. Terraform provides a declarative model for infrastructure provisioning that spans multiple cloud providers as well as on-premises services from the likes of VMware.

I’ve recently had the opportunity to use Terraform to do some Azure infrastructure provisioning so I thought I’d share some recommendations on using Terraform with Azure (as at January 2018). I’ll also preface this post by saying that I have only been provisioning Azure PaaS services (App Service, Cosmos DB, Traffic Manager, Storage and Application Insights) and haven’t used any IaaS components at all.

In the beginning

I needed to provide an easy way to provision around 30 inter-related services that together constitute the hosting environment for a single customer solution. Ideally I wanted a way to make it easy to re-provision these services as required.

I’ve used Azure Resource Manager (ARM) templates heavily in the past, but thought I could get some additional value from Terraform as it provides you with additional capabilities that aren’t present in ARM templates. As an example, right now you can’t provision Azure Storage Containers with ARM, but you can with Terraform.

I began, as I do with these sorts of templates, by incrementally defining resources and building the Terraform definition as I went. I got to the point where I decided to refactor some of the Terraform definitions to modularise the solution to hopefully make it a bit easier to understand and manage going forward.

When I did this refactor I also changed a bunch of resource naming schemes to better match my customer’s preferred standard. The net result of all this change was that I had a substantial number of updates to be applied to the test lab I had been incrementally updating as I went.

Now the fun begins

I ran ‘terraform plan’ which generated my execution plan (always make sure to provide an “out” parameter so you know ‘apply’ will match the plan exactly). I then ran ‘apply’ and left it running while I went to lunch.

When I returned about 45 minutes later my ‘terraform apply’ was still running, seemingly stuck on destroying one of the resources.

A quick visual check in the Portal of the Azure Resource Group these resources were in suggested that everything I wanted provisioned had been provisioned successfully.

Given this state of affairs I Ctrl+C’d the job, to which Terraform advised me:

Interrupt received.
Please wait for Terraform to exit or data loss may occur.

So, I gave the job a few more minutes to gracefully exit, at which point I sent another Ctrl+C and the job exited with this heart-warming message:

Two interrupts received. Exiting immediately. Note that data loss may have occurred.

Out of interest I immediately ran ‘terraform plan’ to understand what Terraform thought was provisioned versus what actually was.

The net result? Terraform had no idea that anything was provisioned!

A look at the local state file showed it was effectively empty. I restored the backup state file which it turned out was actually of minimal use because the delta between the backup and what I had just applied was too great – the resulting plan looked like an Azure resource massacre about to happen!

What to do?

I thought at this point that I was using the tooling incorrectly – how could I so easily get into this state? If I was using this to manage a production environment I’d be dead in the water.

Through additional reading and speaking with others, I learned that this is a known long-term pain point with Terraform – lose your local state and you are in a world of pain. At this time, you can’t even easily rebuild this local state without having to write a bunch of Imports, which means you need to know what to import, and you lose tracking of elements like random string generation at the same time.

Recommendations and Observations

Out of this experience I have some recommendations and observations around how I see Terraform (in its current state) fitting into environmental management in Azure:

  • Use Resource / Resource Group locks (delete or read-only) always: this applies even outside of use of Terraform. This will stop you from accidentally changing important resources. While you can include the definition of resource locks in your Terraform definitions I’d recommend you leave them out. If you use a Contributor-level user to do your deployments Terraform will fail when it tries to lock Resource Groups.
  • Make smaller, more frequent changes: this equates to a smaller delta between what’s in your state, and what’s in the plan. This means if you do need to recover state from backup you will have less of a change to deal with.
  • Consider your use of Terraform features like the ‘random string provider’ – you could move these to be input parameters that you can generate outside of Terraform. This means you create a fixed set of inputs, so that even if you lose state you can be assured that creating resources with “random” name components will be consistent with your last successful execution.
  • Use Resource Groups with small sets of Resources: fewer resources to deal with in event of a failure.
  • Consider Terraform as an initial provisioning tool for production and a re-provisioning tool for all dev / test and low complexity environments.
  • Use Terraform to detect drift: if you deploy an environment with Terraform, then setup the same definition as a CI build that simply runs ‘terraform plan’ against the deployed environment, using the state you generated on initial deployment as an input. If you have any change (add / delete) as the result of the ‘plan’ then you can fail the build and alert your team to investigate accordingly.
  • Consider for Blue / Green Infrastructure deployments for production only: if you want to push completely fresh infrastructure each time then Terraform is a good tool to consider. The usability of this approach is determined by complexity of your environment and the mix of utility / non-utility services you are deploying. This can work well with a slower cadence of release (monthly or above), even if your environment is fairly complex.
  • Use Azure Storage account backing for your state file (key for Terraform Open Source users). You can do this by setting up an Azure Storage Account and then defining the following in each of your TF files:
    terraform {
      backend "azurerm" {
        storage_account_name = "myterraformstore"
        container_name       = "tfstate"
      }
    }
    

    and then when you execute the init step you provide the additional parameters:

    terraform init -backend-config="access_key=<STORAGE_ACCOUNT_KEY>" -backend-config="key=name.ofyour.tfstate"
    

    The shame here right now is you don’t get the versioning those who use AWS S3 buckets have access to.

  • Always write an ‘Import’ script once you’ve provisioned key environmental components you can’t afford to lose.
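
As a rough sketch of what such an import script might look like (the resource addresses and IDs below are placeholders, not from a real environment):

#!/bin/bash
# Re-associate already-provisioned Azure resources with a fresh Terraform state.
# The resource addresses must match those in your .tf files; the IDs come from
# the Azure Portal or the Azure CLI.
terraform import azurerm_resource_group.customer /subscriptions/<subscription-id>/resourceGroups/customer-rg
terraform import azurerm_app_service_plan.customer /subscriptions/<subscription-id>/resourceGroups/customer-rg/providers/Microsoft.Web/serverfarms/customer-plan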

As a side note I notice that there is now an Azure Go SDK dependency for the Terraform Azure provider which is being maintained by Microsoft. I do wonder if this means that Terraform loses some of its appeal because new Azure features for Terraform will invariably be tied to the cadence and capabilities of the Go SDK which is generated against the official Microsoft Azure API. Will this become the way to block provider features that violate the Azure API definition? I guess time will tell.

As with all tools, Terraform has its strengths and weaknesses – hopefully as the product continues to mature we’ll see key features like re-build / import become part of the core value proposition (and not simply appear in the Enterprise version as a paid value add).


Easy Filtering of IoT Data Streams with Azure Stream Analytics and JSON reference data

I am currently working on a next-gen widget dispenser solution that is gradually being rolled out to trial sites across Australia. The dispenser hardware is a modern platform that provides telemetry data that can be used for various purposes by the locations at which the dispenser is deployed and potentially by other third parties.

In addition to these next-gen dispensers we already have existing dispenser hardware at the locations that emits telemetry that we already use for other purposes in our solution. To our benefit both the new and existing hardware emits the same format telemetry data 🙂

A sample telemetry entry is shown below.

We take all of the telemetry data from new and old hardware at all our sites and feed it into an Azure Event Hub which allows us to perform multiple actions, such as archival of the data to Blob Storage using Azure Event Hub Capture and processing the streaming data using Azure Stream Analytics.

We decided we wanted to do some additional early stage testing with some of the next-gen hardware at a few sites. As part of this testing we also wanted to push the data for just specific hardware to a partner organisation we are working with. So how did we achieve this?

The first step was to set up another Event Hub. We knew this partner would not have any issues consuming event data from a Hub, and it made the use of Stream Analytics an obvious way to process the complete incoming stream and ensure only the data for the dispensers and sites we specify is sent to the partner.

Stream Analytics has the concept of Reference Data which takes the form of slow-moving (or static) data that can be read from a blob storage account in Azure.

We identified our site and dispensers and created our simple Reference Data JSON file – sample below.
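
The original sample isn’t reproduced here, but the shape is simple – each entry holds a site identifier and an integer array of the dispensers in scope. Something along these lines (field names and values invented for illustration):

[
  {
    "SiteId": "site-001",
    "Dispensers": [ 1001, 1002, 1005 ]
  },
  {
    "SiteId": "site-014",
    "Dispensers": [ 2001 ]
  }
]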

The benefit of this format is that we can manage additional sites and dispensers by simply editing this file and uploading it to blob storage! Stream Analytics even helps us by providing a useful naming scheme for files so you don’t need to stop your Stream Analytics Job to update it! We uploaded our first file to a location that had the path of

/siterefdata/2018-01-09/11-40/sitedispensers.json

In future when we want to update the file, we edit it and then upload to blob storage at, say

/siterefdata/2018-02-01/00-00/sitedispensers.json

When the Job hits this date / time (UTC) it will simply pick up the new reference data – how cool is that?!

In order to use the Reference Data auto-update capability you need to set up the path naming scheme when you define the reference data as an input into the Stream Analytics Job. If you don’t need the above capability you can simply hard-code the path to, say, a single file.

The final piece of the puzzle was to write a Stream Analytics Job that used the Reference Data JSON as one input and read the site identifier and dispensers from the included integer array. Thankfully, the in-built GetArrayElements Function came in handy, along with CROSS APPLY, which gives us the ability to iterate over the array elements and use them in the WHERE clause of the query!
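
A query along the following lines does the job – the field names match the illustrative reference data above and an assumed telemetry schema, so treat this as a sketch rather than our exact production query:

-- 'Telemetry' is the Event Hub input; 'SiteDispensers' is the blob Reference Data input
SELECT
    t.*
INTO
    PartnerEventHub
FROM
    Telemetry t
JOIN
    SiteDispensers s ON t.SiteId = s.SiteId
CROSS APPLY
    GetArrayElements(s.Dispensers) AS d
WHERE
    t.DispenserId = d.ArrayValue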

The resulting solution now neatly carves off the telemetry data for just the dispensers we want at the sites we list and writes it to an Event Hub the partner organisation can use.

I commented online that this sort of solution, and certainly one that scales as easily as this will, would have been something unachievable for most organisations even just a few years ago.

The cloud, and specifically Azure, has changed all of that!

Happy Days 😎


Use Azure Health to track active incidents in your Subscriptions

Yesterday afternoon while doing some work I ran into an issue in Azure. Initially I thought this issue was due to a bug in my (new) code and went to my usual debugging helper Application Insights to review what was going on.

The below graphs are a little old, but you can see a clear spike on the left of each graph, which is where we started seeing issues and which gave me a clue that something was not right!

App Insights views

Initially I thought this was a compute issue as the graphs are for a VM-hosted REST API (left) and a Functions-based backend (right).

At this point there was no service status indicating an issue so I dug a little deeper and reviewed the detailed Exception information from Application Insights and realised that the source of the problem was the underlying Service Bus and Event Hub features that we use to glue together our services.

You can see the increased error rate from the Service Bus Metrics view below.

Service Bus Metrics

While I was doing this an alert popped up in the Portal advising a service incident and directed me to the Azure Service Health feature in order to view the full incident details and also to track it.

On the Azure Health page I could see an active incident and decided to try out the alerting feature to track this during a commute home.

I clicked on the Add Alert option and configured a new email-based alert. You can also push alerts into your preferred IT Service Management (ITSM) solution; we aren’t yet using an ITSM platform for this solution, but it would be our choice in future!

In Services I chose Service Bus and Event Hubs and for Regions I selected the two Australian Regions. Note that I had to set up an Action Group as I hadn’t used the feature previously – in the screenshot below I am just reusing the one I previously setup.

Alerts Setup

A short while after saving the Alert configuration the recipients in the Action Group started to receive update emails containing the most recent status of the incident. A sample is shown below.

Notice Email

About 45 minutes after this alert we received a resolution notification.

The amount of time this easy setup saves our team is pretty amazing, and if you’re not using this feature already you should go and explore it in the Portal and set it up for your key Azure components.

What a great early Christmas present!

😎


Understanding Azure’s Container PaaS Capabilities

If you’ve been using Azure over the past twelve months, you can’t help but have the feeling that it’s become a bit like this…

Containers... Containers Everywhere

… and you’d be right.

To be fair, though, Containers have been one of the hot topics in computing in general and certainly one that’s been getting the most interest in my recent Azure Open Source Roadshows.

One thing that has struck me, though, is that people are not clear on the purpose of all the services in Azure that have ‘Containers’ listed as a capability, so in this post I am going to review the Azure Platform-as-a-Service offerings that have Container capabilities and cover what each service can be used for.

First, before we begin, let’s quickly get some fundamentals under our belts.

What is a Container?

Containers provide encapsulation and isolation for workloads and remove the need for a complete Operating System image to be deployed in order to manage resource allocation.

They have proven popular because they typically have smaller footprints than Virtual Machines, boot much faster as a result and have a modern build process based on composition that gels well with software development.

A Container still needs to “run” somewhere – this “somewhere” is what I will call a “Container Host” through the rest of this post.

So where does Docker fit into all of this? Docker provides tooling for the creation, running and management of Containers and is by far the best known tech in this space. Microsoft has worked with Docker to ensure the Docker tooling supports Windows and Windows Containers.

Our most basic Container workload setup then would be: one Container Host running one Container.

What is a Container Orchestrator?

A big part of running Containers at scale is their management which is where technologies like Kubernetes (k8s), Docker Swarm and DC/OS come into play. These technologies allow you to manage multiple Containers and their workloads, performing orchestration of deployments and controlling connectivity between Container instances running on Container Hosts.

An Orchestrator typically runs more than one node to ensure availability, but nothing stops us from running a simple single node setup like Minikube to start to learn about them.

Right, now we have some fundamentals in place, let’s take a look at what Azure offers.

Azure’s Container Offerings

Note that we are going to focus on PaaS services – you can of course still run Containers on Virtual Machines, or deploy something like OpenShift in Azure if you wish.

Please note: any service listed as ‘Preview’ should not be used for Production deployments!


Azure Container Registry (ACR)

What is it? ACR is an Azure-hosted Container Registry based on the open-source Docker Registry v2 spec. This is a turnkey part of Azure’s Container story.

Why use it? When you build a Container Image you need a place to store it. Docker Hub is the Registry where you pull all your public Images from and which is run by Docker. ACR provides you with a private Docker-compatible Registry that you can push Images to and use as a deployment source.

Benefits:

  • Private Registry you configure that is not published on a well-known public endpoint the way Docker Hub is
  • Provides a unique *.azurecr.io Registry endpoint which can be used to store Images that can be deployed *anywhere* (not just Azure)
  • Webhook support that can be used for Continuous Deployment, particularly with Azure’s Web Apps for Containers (see below)
  • Control access to the Registry using Azure Active Directory Credentials
  • ACR provides seamless authentication (i.e. no configuration) with other Azure services like Azure Container Instances, Azure Container Services, App Service and Batch.
  • Geo-replication is the hotness! (requires Premium level) * Preview

Restrictions:

  • Cooler features (like geo-replication) are at higher price point only
  • I’m struggling here for others! 🙂

> ACR Documentation.


Azure Container Instances (ACI) * Preview

What is it? An individual Container Host that can run one or more Containers. No need for you to manage the Host.

Why use it? These are probably the easiest way to get going running Container-based workloads in Azure. If you have a simple workload that needs a public IP and which can talk to various Azure PaaS services then consider ACI over Web Apps for Containers or Azure Container Service.

Benefits:

  • No need for you to manage the Container Host – tell it which Containers to run and that’s it!
  • Pay per-second for use with customisable CPU Core and memory options
  • Supports multiple Containers in single ACI Host
  • “Whole of Azure” scale: deploy ACI workloads in any Azure Region (where ACI is available).

Restrictions:

  • No production Orchestrator support: there is an experimental Kubernetes Connector, but apart from this you cannot bolt an ACI Host into an Orchestrated environment
  • No VNet support: you can’t connect an ACI Host to Azure VNets.

> ACI Documentation.


Web Apps for Containers (App Service on Linux)

What is it? A Container Host that runs on a Linux-based variant of Azure’s App Service that is aimed at web-centric workloads (hence the name). Like ACI, you still don’t need to manage the Host.

Why use it? If you have a website or HTTP API workload that you traditionally host on Linux and that you can (or have) containerised, then this is a good spot to start as it limits your workload’s exposure to the HTTP (80) and HTTPS (443) ports.

Additionally, even if you haven’t containerised your solution, you can still use this service to host it. When you select a framework to use to host your solution (like Java or PHP) the framework is deployed to Web Apps for Containers as a Docker Image!

Benefits:

  • Get access to standard App Service features like Autoscale, Custom Domains, SSL and Continuous Deployment
  • No need for you to manage the Container Host – tell the Web Apps Instance which Container to run and it will do that
  • Deploy existing Docker images from Docker Hub, Azure Container Registry or from Azure’s pre-built framework images
  • Troubleshoot containers using SSH from Kudu.

Restrictions:

  • No Orchestrator support: what you gain through using App Service you give up in not being able to bolt the Container Host into an Orchestrator like Kubernetes
  • Multi-Container deployments are not supported
  • Not all Windows-based App Service features are supported (yet)
  • Not currently supported in App Service Environments.

> Web Apps for Containers Documentation.


Azure Container Services (ACS)

What is it? A service that allows you to run Container Hosts that are managed by an Orchestrator of your choice (Kubernetes, Docker or DC/OS).

Why use it? If you already run Container workloads on VMs (regardless of hosting location) that use an Orchestrator, or you’d like to start using Containers at scale and need an Orchestrator, then this is the service to use.

Benefits:

  • No need for you to manage the underlying Virtual Machine infrastructure
  • Orchestrator and Container Host setup is managed for you by the ACS engine (which is open source)
  • Container Host scalability is supported via use of Virtual Machine Scale Sets (VMSSs)
  • All hosts (Orchestrator and Container) are vanilla instances – this is not a special “Azure release”
  • Orchestrators have Azure extensions allowing them to perform actions such as creating Azure Load Balancers when you specify load balanced workloads in your setup.
  • Integration with Azure Container Registry for Image deployments.

Restrictions:

  • You pay for both the Container Hosts and Orchestrator Nodes (they are just VMs after all)
  • You can’t increase the number of Orchestrator / Cluster Masters after you have initially created an ACS cluster
  • You can’t upgrade the Orchestrator once you have created an ACS cluster – you need to create a new ACS cluster to gain access to a newer release.

> ACS Documentation: Kubernetes | Docker | DC/OS.


Azure Container Services – Managed Kubernetes (AKS) * Preview

What is it? This service is similar to ACS (above), however in this service (which only supports Kubernetes) the Orchestrator Nodes are managed on your behalf by Microsoft.

Why use it? If you are invested in Kubernetes (or intend to use it as your Orchestrator) and would prefer not to have to manage the Orchestrator Nodes then you should select this over standard ACS with “unmanaged” Kubernetes. If you are using the ACS Kubernetes offering already then this is a logical place to migrate to once AKS is Generally Available.

Benefits:

  • You don’t pay for the Orchestrator Nodes running Kubernetes
  • Orchestrator Node availability, patching and upgrading is managed by Microsoft
  • No need to create a new ACS cluster to pick up new Kubernetes releases
  • Will support 100% of the standard Kubernetes API
  • All other ACS features remain in place!

Restrictions:

  • Only supports Kubernetes!
  • During preview does not support all Kubernetes features.

> AKS Documentation.

On a side note: AKS + ACI (with its Kubernetes connector) + ACR will be an amazing PaaS Container story once all these components are all Generally Available! 😎


Azure Service Fabric

What is it? Service Fabric is both a cluster Orchestrator and a development framework for delivery of highly available, distributed applications. It pre-dates the current Container hype cycle and is used to deliver services in Azure such as CosmosDB.

Why use it? If you want to leverage the Reliable Actor and Service patterns offered by the Service Fabric development framework. Also worth considering if you haven’t yet started with an Orchestrator like Kubernetes.

Benefits:

  • Mature product that underpins key Microsoft cloud-scale services
  • Runs in Azure or on-premises.

Restrictions:

  • Container workloads can’t benefit from the development framework as they run as ‘guest executables’ on cluster nodes (this will change in future as you will be able to Containerise Reliable Actors and Services)
  • Developers using Windows 10 can’t deploy Container-based solutions to local Service Fabric clusters.

> Service Fabric Container Documentation.


Azure Batch

What is it? The name says it all really – you can use Azure Batch to run compute workloads that can be broken into lots of concurrent processes. Examples include payroll runs, animation rendering or research modelling. Batch sits well within the High Performance Compute (HPC) landscape.

Why use it? If you have a batch-style workload with processing steps that can be Containerised then this is a service you should seriously be considering. More so if you wish to consider a hybrid scenario where you run some of your workload in-house and burst to Azure as required. As you have a Containerised workload you can ship dependencies in a single bundle.

Benefits:

  • You can use Docker Hub as a source for Images (yes, you could pull tensorflow and run it in Azure 😉 ), in addition to ACR and any other compatible Registry
  • Use Singularity in addition to Docker Containers with Batch
  • Run processes on low-priority VMs to reduce the cost (best for non-time sensitive operations).

Restrictions:

  • RDMA (high performance networking) support only available for Containers running on Linux.

> Azure Batch Container Documentation.


So there we are – hopefully the Azure Container story now makes much more sense and you can work out which services are most appropriate for your use case.

Happy days! 😎


Azure API Management: 200 OK response but no backend traffic

I’m noting this post down in the “if only someone had already made a big noise about this I might have saved some time” category.

The work I’m doing at present involves fronting some APIs with Azure API Management and then exposing them securely.

Today, when I hit the moment I thought I was done, I did some testing, and no matter what I did I couldn’t get my backend service to respond – and I could clearly see no traffic hitting the backend.

After double-checking my policies and doing a few more tests (only a couple of hours lost) I then happened across this Stack Overflow question and its answer.

It turns out that I had, somewhere along the line, removed the “forward-request” policy from the Policy applying to all APIs published via API Management.

So how to fix? As Darrell says, find the offending Policy and add the missing item back.

Edit Policy

When done it should look like the image below.

API Policy
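
For reference, the effective policy with forward-request restored looks something like the trimmed-down sketch below (your real policy will almost certainly contain more than this):

<policies>
    <inbound>
        <!-- rate limiting, rewriting and other inbound policies go here -->
    </inbound>
    <backend>
        <!-- without this element no traffic is forwarded to the backend service -->
        <forward-request />
    </backend>
    <outbound>
        <!-- outbound policies go here -->
    </outbound>
</policies>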

… and your API calls will now work as expected and not just give you back 200 OK! 😎
