The False Promise of Cloud Auto-scale

Go to the cloud because it has an ability to scale elastically.

You’ve read that bit before, right?

If you’ve been involved in anything remotely cloud related in the last few years, you will almost certainly have come across the concept of on-demand or elastic scaling. Azure has it, AWS has it, OpenStack has it.  It’s one of the cornerstone pieces of any public cloud platform.

Interestingly, it’s also one of the facets of the cloud that I often see misunderstood by developers and IT Pros alike.

In this post I’ll explore why auto-scale isn’t black magic and why your existing applications can’t always just automagically start using it.

What is “Elastic Scalability”?

Gartner (2009) defines it as follows:

“Scalable and Elastic: The service can scale capacity up or down as the consumer demands at the speed of full automation (which may be seconds for some services and hours for others). Elasticity is a trait of shared pools of resources. Scalability is a feature of the underlying infrastructure and software platforms. Elasticity is associated with not only scale but also an economic model that enables scaling in both directions in an automated fashion. This means that services scale on demand to add or remove resources as needed.”

Source: http://www.gartner.com/newsroom/id/1035013

Sounds pretty neat – based on demand you can utilise platform features to scale out your application automatically.  Notice I didn’t say scale up your application: scaling out adds more instances, while scaling up makes a single instance bigger.  Have a read of this Wikipedia article if you need to understand the difference.

On Microsoft Azure, for example, we have some cool sliders and thresholds we can use to determine how we can scale out our deployed solutions.

[Screenshot: Azure auto-scale settings]

Scale this service based on CPU or on a schedule.
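
To make those sliders a little less magical, here is a minimal sketch of the kind of decision a CPU-based rule boils down to. It is not how Azure (or any other platform) implements auto-scale; the `ScaleRule` class, its default numbers and the `desired_instances` function are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScaleRule:
    """Hypothetical rule, loosely mirroring the sliders a portal might expose."""
    min_instances: int = 2
    max_instances: int = 10
    scale_out_cpu: float = 70.0  # average CPU % above which one instance is added
    scale_in_cpu: float = 30.0   # average CPU % below which one instance is removed


def desired_instances(rule: ScaleRule, current: int, avg_cpu: float) -> int:
    """Return the instance count the rule asks for, moving one step at a time."""
    if avg_cpu > rule.scale_out_cpu:
        target = current + 1
    elif avg_cpu < rule.scale_in_cpu:
        target = current - 1
    else:
        target = current
    # Never go outside the configured floor and ceiling.
    return max(rule.min_instances, min(rule.max_instances, target))


rule = ScaleRule()
print(desired_instances(rule, current=3, avg_cpu=85.0))  # busy, so 4
print(desired_instances(rule, current=3, avg_cpu=50.0))  # steady, so 3
print(desired_instances(rule, current=2, avg_cpu=10.0))  # idle but already at the floor, so 2
```

The rule itself is trivial. The hard part is whether your application still works when the instance count changes underneath it.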

What’s Not to Understand?

In order to answer this we should examine how we’ve tended to build and operate applications in the on-premise world:

  • More instances of most software means more money for licenses. While you might get some cost relief for warm or cold standby you are going to have to pony up the cash if you want to run more than a single instance of most off-the-shelf software in warm or hot standby mode.
  • Organisations have shied away from multi-instance applications to avoid needing to patch and maintain additional operating systems and virtualisation hosts (how many “mega” web servers are running in your data centre that host many web sites?)
  • On-premise compute resources are finite (relative to the cloud).  Tight control of used resources leads to the outcome in the previous point – consolidation takes place because that hardware your company bought needs to handle the next 3 years of growth.
  • Designing and building an application that can run in a multi-instance configuration can be hard (how many web sites are you running that need “sticky session” support on a load balancer to work properly?)  Designing and building applications that are stateless at some level may be viewed by many as black magic!

The upshot of all the above points is that we have tended towards a “less is more” approach when building or operating solutions on premise.  The simplest mode of hosting the application in a way that meets business availability needs is typically the one that gets chosen. Anything more is a luxury (or a complete pain to operate!)
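
The “sticky session” point is easier to see in code. The sketch below is a toy, assuming an in-memory `SharedSessionStore` class as a stand-in for whatever external cache or database you would really use; all class names are hypothetical. It sends two requests from the same user to two different instances and compares what each design remembers.

```python
class SharedSessionStore:
    """Stand-in for an external store (cache or database) shared by all instances."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        # Return the session dict, creating an empty one on first use.
        return self._data.setdefault(session_id, {})


class StickyInstance:
    """Keeps sessions in its own memory, so a user must always come back here."""

    def __init__(self):
        self._sessions = {}

    def add_to_basket(self, session_id, item):
        basket = self._sessions.setdefault(session_id, {}).setdefault("basket", [])
        basket.append(item)
        return list(basket)


class StatelessInstance:
    """Holds no session data itself; every request reads the shared store."""

    def __init__(self, store):
        self._store = store

    def add_to_basket(self, session_id, item):
        basket = self._store.load(session_id).setdefault("basket", [])
        basket.append(item)
        return list(basket)


# Two requests from one user land on two different instances.
sticky_a, sticky_b = StickyInstance(), StickyInstance()
sticky_a.add_to_basket("user-1", "book")
sticky_result = sticky_b.add_to_basket("user-1", "pen")

store = SharedSessionStore()
stateless_a, stateless_b = StatelessInstance(store), StatelessInstance(store)
stateless_a.add_to_basket("user-1", "book")
stateless_result = stateless_b.add_to_basket("user-1", "pen")

print(sticky_result)     # ['pen'] - the book was left behind on the first instance
print(stateless_result)  # ['book', 'pen']
```

With the first design, any instance that auto-scale adds or removes takes its users’ sessions with it, which is why the load balancer has to pin users to instances.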

So, How to Realise the Promise?

In order to fully leverage auto-scale capabilities we need to:

  • Adopt off-the-shelf software that provides a consumption-based licensing model. Thankfully in many cases we are already here – we can run many enterprise operating system, application and database software solutions using a pay-as-you-go (PAYG) scheme.  I can bring my own license if I’ve already paid for one too.  Those vendors who don’t offer this flexibility will eventually be left behind as it will become a competitive advantage for others in their field.
  • Leverage programmable infrastructure via automation and a culture shift to “DevOps” within our organisations.  Automation removes the need for manual completion of many operational tasks thus enabling auto-scale scenarios.  The new collaborative structure of DevOps empowers our operational teams to be more agile and to deliver more innovative solutions than they perhaps have done in the past.
  • Be clever about measuring what our minimum and maximum thresholds are for acceptable user experience.  Be prepared to set those CPU sliders lower or higher than you might otherwise have if you were doing the same thing on-premise.  Does the potential performance benefit of auto-scaling at a lower CPU utilisation level outweigh the marginal extra cost you pay, given that the platform will scale back as necessary?
  • Start building applications for the cloud.  If you’ve designed and built applications with many stateless components already then you may have little work to do.  If you haven’t then be prepared to pay down the technical debt (or start over).  Treat as many of your application’s components as you can as cattle and minimise the pets (read a definition that hopefully clears up that analogy).
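
The threshold question in the third point is really just arithmetic, so here is a rough worked example. Everything in it is an assumption: the demand profile, the price per instance-hour and the simplification that the platform always runs exactly enough instances to keep average CPU at or below the threshold.

```python
import math


def instance_hours(hourly_demand, cpu_threshold, min_instances=1):
    """Total instance-hours needed to keep average CPU at or below the threshold.

    hourly_demand is the load for each hour in 'percent of one instance' units,
    so 240 means the work of 2.4 fully loaded instances.
    """
    total = 0
    for demand in hourly_demand:
        needed = math.ceil(demand / cpu_threshold)
        total += max(min_instances, needed)
    return total


# Hypothetical day: quiet overnight, a busy working day, a moderate evening.
demand = [40] * 8 + [240] * 8 + [120] * 8
price_per_instance_hour = 0.10  # made-up figure, not a real price

for threshold in (80, 60):
    hours = instance_hours(demand, threshold)
    cost = round(hours * price_per_instance_hour, 2)
    print(f"threshold {threshold}%: {hours} instance-hours, cost {cost}")
```

For this made-up day, dropping the threshold from 80% to 60% takes you from 48 to 56 instance-hours. The extra headroom only costs money during the busy hours, because the platform scales back the rest of the time.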

So there we have a few things we need to think about when trying to realise the potential value of elastic scalability.  The bottom line is your organisation or team will need to invest time before moving to the cloud to truly benefit from auto-scale once there.  You should also be prepared to accept that some of what you build or operate may never be suitable for auto-scale, but that it could easily benefit from manual scale up or out as required (for example at end of month / quarter / year for batch processing).

HTH


2 thoughts on “The False Promise of Cloud Auto-scale”

  1. Henry Senior says:

    I reckon it’s been a pretty good promise and keeps getting better, not without its problems though.

    We have autoscaling set up on http://fallenlondon.storynexus.com. Our architecture suited a multi-server setup, so no problems with sticky sessions.
    We are running on IIS and the problem we now face is the start-up time for new servers.
    The problem is not (just) that Windows takes longer to boot, but that on AWS (and Azure) the Windows machines need to be ‘sysprepped’ before they are properly instantiated.

    Apparently this requires an initial boot for sysprep and then a subsequent boot to run.

    This means we are waiting 10 mins or so before the server is doing anything useful, by which time the fan has been hit by something warm and smelly…

    • Simon says:

      Thanks for stopping by and sharing your experience. It sounds like you’re seeing the direct impact of TTSO, which I blogged about recently. You should look to set your scale trigger thresholds lower and find ways to simplify your auto-scale unit so that startup time is reduced as much as possible. Maybe also look at an N+1 approach, where you scale out more compute units than you need so that you don’t need to scale as often.
