Sunday, December 18, 2011

Cloud Controllers

The cloud is all about making new resources available to computing consumers. Not hardware consumers, but processing, memory, and storage consumers.  This distinction, I think, is sometimes overlooked by those of us in the industry.  These resources, ultimately, belong to the application — the customer doing the provisioning doesn't necessarily care about these things.  In the end, their job is to make sure the software they're administering runs effectively.

The availability of physical hardware, or in the cloud's case, virtual hardware, is how we're able to achieve optimal usage of deployed software systems.  Of course, as a consumer, I want to give my application everything it needs to perform.  Not just at the barely acceptable level, but at the screaming fast level.

If I'm to get any of these things from a cloud service, I've got to make sure that the resources my application needs are there ahead of time.  So maybe I'll be proactive and give it more memory than it actually needs?  The issue is that I'm looking at the application from outside the box.  I can peek my head inside to get a general idea of what's happening and how I can improve the situation.  But I get nothing more than a general idea.  The question is, how can I really know what the application is expecting based on current conditions?  What does it need?  And can I automate this process without writing any code?  Maybe, but I think we need to step back and look at cloud services and what they offer to the applications, not just the users who are initiating the environment.

The User Focus
Cloud infrastructure services have friendly control panels for customers.  Control panels, at least in the context of a cloud environment, should hide some of the ugliness involved with provisioning new resources.  A customer sees a limited set of choices in how they can deploy their application — select from a list of different memory profiles.  Select your required bandwidth.  How much storage will you need?  A form not all that different from those typical of any web application we're used to.
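To make the shape of that form concrete, here's a minimal sketch of the kind of request a control panel might assemble from those choices. Everything here (`ProvisionRequest`, `provision`, the field names) is hypothetical, not any real provider's API — the point is only that the customer's input reduces to a few resource numbers.

```python
# Hypothetical sketch: the control-panel form reduced to a data structure.
from dataclasses import dataclass

@dataclass
class ProvisionRequest:
    memory_mb: int       # selected from a list of memory profiles
    bandwidth_mbps: int  # required bandwidth
    storage_gb: int      # requested storage

def provision(request: ProvisionRequest) -> dict:
    """Validate the form choices and hand them off for scheduling."""
    if request.memory_mb <= 0 or request.storage_gb <= 0:
        raise ValueError("resource amounts must be positive")
    # The provider's scheduler would pick a physical node here, hidden
    # from the customer; we just echo back what was asked for.
    return {
        "memory_mb": request.memory_mb,
        "bandwidth_mbps": request.bandwidth_mbps,
        "storage_gb": request.storage_gb,
        "status": "queued",
    }
```

Notice that nothing in the request says anything about physical nodes — that's exactly the complexity the panel hides.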

The end result of this process?  The application is deployed — with the hardware it needs.  All the work involved with finding the right physical node on which to place this new virtual machine takes place under the covers.

As with any application, this is a sound principle — take a complex task such as provisioning new virtual machine resources and encapsulate the complexity behind a user interface.  This is what successful applications do well — they're successful because they're easy to use.  There is one broad category of user, and we're catering to them, making life easier in how they interact with the cloud.

The problem I see with this approach, or the oversight anyway, is one of priority. When we're designing software systems, one of the first activities is identifying who'll be using the system.  So who uses cloud service providers?  Well, folks who want to provision new applications without the overhead involved with allocating physical hardware to fit their needs.  An opportunity I see with the cloud is automation. Not just a simplified interface for system administrators, but a means for deployed applications to start making decisions on what needs to happen in order to perform optimally.

The Application Focus
Some environmental changes that take place are obvious — like the number of registered users jumping from one hundred thousand to five hundred thousand.  These changes are somewhat straightforward to handle by the administrator — they're not exactly critical to how the service will respond to demand over the next few hours.  This type of environmental change takes place over larger amounts of time — a duration suitable for humans to step in and relieve the situation.  If we're seeing a growth trend in terms of registered users, maybe we'd be smart to assume that we'll need a more robust collection of hardware in the near term.

Now, what about when the timeline of these events — the rising demand on our application's availability — is compressed into something much smaller, like under an hour, for example?  If all the resources our application has available to store, compute, and transfer aren't enough to handle the current state of usage, then we'll see a change in behavior.  Sorry, the users will see a change.  But, thankfully, advanced monitoring tools we deploy to the cloud beside our applications can easily tell us when the application is experiencing trouble and the cloud needs to send more virtual hardware to the rescue.

Even if this isn't an automated procedure, it's still something trivial for the application's monitoring utilities to notify the administrator to go and provision another instance of the application server to cope with the request spike.  In this scenario, there may only be a limited window in which users experience unacceptably poor response times.  But this is often automated too — it doesn't take a system administrator to determine that there aren't enough available resources to fulfill the application's requirements based on current conditions.
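The decision the monitoring utility makes here is simple enough to sketch. A hedged illustration, assuming purely external measurements (response time, CPU) with made-up thresholds — the function name and numbers are mine, not from any real monitoring product:

```python
# Illustrative sketch: the threshold check a monitoring tool might run
# before notifying an administrator or an automated provisioner.
def should_scale_up(avg_response_ms: float, cpu_utilization: float,
                    max_response_ms: float = 500, max_cpu: float = 0.85) -> bool:
    """Return True when external measurements suggest the current
    instances can't cope with the request spike."""
    return avg_response_ms > max_response_ms or cpu_utilization > max_cpu

# During a spike, response times climb past the acceptable window:
print(should_scale_up(avg_response_ms=820, cpu_utilization=0.91))  # True
print(should_scale_up(avg_response_ms=120, cpu_utilization=0.40))  # False
```

Note that both inputs are visible from outside the box — which is precisely the limitation the next sections pick at.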

Google App Engine is a good example of how something like this is automated. Each application deployed to App Engine has what are called serving instances. These are decoupled instances of the application that don't share state with one another.  As the load increases, so does the number of serving instances, to help cope with the peak.  Just as importantly, as the peak slowly winds down and the pattern of user behavior returns to normal, App Engine kills off serving instances that are no longer necessary.

There are many ways to automate application components to help cope with what users are doing — to prevent one user from sucking available CPU cycles away from others.  Provisioning new serving instances within the cloud environment, for example.  But does this really take into account what the application is really doing and what's likely to change in the imminent future?  To do that, code inside the application needs to take samples of internal state and propagate these changes outward — toward the outer shell of the application — perhaps even into the cloud operating environment in some circumstances.
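One way to picture that inside-out flow: code in the application samples its own state (say, the depth of an internal work queue) and publishes a rolling summary toward whatever sink the outer environment provides. The class and sink names below are hypothetical — a sketch of the pattern, not any provider's API:

```python
from collections import deque

class InternalSampler:
    """Samples internal application state and propagates a rolling
    average outward, toward the application's outer shell."""

    def __init__(self, sink, window: int = 60):
        self.sink = sink                        # wherever samples go
        self.queue_depths = deque(maxlen=window)

    def sample(self, work_queue_depth: int) -> float:
        """Record one sample and publish the rolling average."""
        self.queue_depths.append(work_queue_depth)
        avg = sum(self.queue_depths) / len(self.queue_depths)
        self.sink.publish("work_queue_depth_avg", avg)
        return avg

class PrintSink:
    """Stand-in for the cloud operating environment's metrics endpoint."""
    def publish(self, name: str, value: float) -> None:
        print(f"{name}={value:.1f}")
```

A sink backed by the cloud environment, rather than stdout, is exactly the kind of API the closing paragraph argues providers should be offering.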

The trouble isn't that it's not possible to take into account the inner workings of our applications — it's that it isn't a high priority for cloud service providers.  It's easy to alter applications deployed to the cloud — to take measurements and make them available to other services that could potentially react to those measurements.  The trouble is, there is simply too much code to write — too much of the burden is put on the customer and not the service provider to offer APIs that can help applications operate effectively in the cloud.