Tuesday, February 24, 2009

Optimistic provisioning in the cloud

One of the problems that cloud computing is supposed to solve is the lack of computing power when it is needed: computing on demand, so to speak. The elasticity of the cloud enables this.

The classic example of this is a web site that gets "slashdotted". If the site does not have the computing resources required to fulfil the requests, it dies and readers (soon to be ex-readers) will be disappointed. Luckily, your site is running in a cloud environment and has the ability to "expand" its computing capacity when demand requires it.

What happens when the actual expansion takes place? Generally, a new virtual machine is created and that machine's resources are now available to the process that requires them. The process in this context refers to the overall business requirement that caused the expansion event in the first place. The process that says "give me more computing power" may in fact result from a general discussion amongst several nodes in the cloud.

Here, we have a simple controlling process that handles requests. These may be client requests or requests from other nodes in the same cloud. The controlling process then forwards the request to a resource management process. It is the responsibility of the resource manager to ensure that computing resources are available to fulfil every request. This is where the bottleneck lies, in has_resources(). In the most common case, there are plenty of resources available and has_resources() has very little work to do. However, when resources start to dry up, it needs to make more resources. This is where the costly work of the resource manager lives. It would be great if there were some way to know ahead of time what the peak resource demand will be.
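The arrangement described above can be sketched roughly as follows. This is only an illustration under assumptions: apart from has_resources(), every name here (ResourceManager, provision_vm, allocate, handle_request, vm_size) is hypothetical, capacity is counted in abstract units, and starting a virtual machine is reduced to a counter increment.

```python
class ResourceManager:
    """Hypothetical resource manager; capacity is counted in abstract units."""

    def __init__(self, capacity, vm_size=4):
        self.capacity = capacity    # units currently provisioned
        self.in_use = 0             # units currently allocated to requests
        self.vm_size = vm_size      # units added by each new virtual machine (assumed)
        self.vms_created = 0        # how many times the costly path was taken

    def provision_vm(self):
        # Stand-in for the slow step: starting a new virtual machine.
        self.capacity += self.vm_size
        self.vms_created += 1

    def has_resources(self, needed):
        # Common case: enough free capacity, so the loop body never runs.
        while self.capacity - self.in_use < needed:
            self.provision_vm()     # costly path, taken only when resources run dry
        return True

    def allocate(self, needed):
        if self.has_resources(needed):
            self.in_use += needed


def handle_request(manager, needed):
    """Controlling process: forward each request to the resource manager."""
    manager.allocate(needed)
```

In this sketch the request that happens to arrive when capacity runs out pays the full cost of provisioning, which is exactly the situation proactive provisioning tries to avoid.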

Unfortunately, there is really no reliable way to do this. The best we can do in this situation is guesswork. The resource manager could monitor how much the resource demand changes from one time interval to the next. Certain thresholds could then be set and, once they are reached, resources could be provided based on what the probable demand will be in the near future.

For instance, let's say I have a simple blog running within a cloud environment. I post a new entry, "a ton of traffic". Before I post this entry, I have an average demand of 5 requests per hour. An hour after posting, the resource manager notices that my average has doubled to 10 requests per hour. This is something that could be handled very comfortably by my service. However, the suddenness of this relatively large change could put the resource manager on alert. In hour two after posting "a ton of traffic", the number of requests reaches 20 per hour. It seems that this trend of rising demand is continuing. The resource manager would then proceed to make more resources available.
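The reasoning in this example could be expressed as a small trend check. This is a minimal sketch, not a recommended policy: the function name should_provision and its parameters growth_threshold and consecutive are assumptions made for illustration, and the values 2.0 and 2 are chosen only to match the doubling over two hours described above.

```python
def should_provision(hourly_requests, growth_threshold=2.0, consecutive=2):
    """Return True when demand grew by at least growth_threshold
    in each of the last `consecutive` hours."""
    if len(hourly_requests) < consecutive + 1:
        return False  # not enough history to judge a trend
    recent = hourly_requests[-(consecutive + 1):]
    for previous, current in zip(recent, recent[1:]):
        # A single hour without sufficient growth breaks the trend.
        if previous <= 0 or current / previous < growth_threshold:
            return False
    return True


# Hour one after posting: one doubling puts the manager on alert, nothing more.
alert_only = should_provision([5, 10])
# Hour two: a second doubling in a row, so more resources are provisioned.
provision_now = should_provision([5, 10, 20])
```

With the figures from the example, the first call reports no need to provision yet and the second does, since the demand has doubled twice in succession.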

With this approach, there is always the risk of over-provisioning resources, because this type of data can be misleading. However, it does lend a guiding light toward proactive provisioning. Besides, if the statistical data is misleading, it is better to clean up over-provisioned resources afterwards than to attempt a huge provisioning job in the middle of high resource demand.