The Basics of Cloud Capacity

by Ofir Nachmani
December 26, 2012

The IT capacity plan is derived from the current and future utilization of the resources that hold, store and accommodate software services. It is widely accepted that average server utilization in the traditional data center is between 5% and 20%. By contrast, when planning capacity in the cloud, the basic working assumption is that utilization should match demand at all times while still supporting temporary demand peaks and future trends.
Capacity planning is described by Wikipedia as the “process of determining the production capacity needed by an organization to meet changing demands for its products.” Capacity itself is given by the following formula:
(number of machines or workers) × (number of shifts) × (utilization) × (efficiency)
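
As a rough illustration, the formula above can be evaluated directly. The following is a minimal sketch, not a sizing tool: the function name `production_capacity` and all of the input figures are hypothetical, with utilization and efficiency expressed as fractions between 0 and 1.

```python
def production_capacity(units, shifts, utilization, efficiency):
    """Evaluate (machines or workers) x (shifts) x (utilization) x (efficiency).

    utilization and efficiency are fractions in the range 0..1.
    """
    for name, value in (("utilization", utilization), ("efficiency", efficiency)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return units * shifts * utilization * efficiency


# Hypothetical figures: 10 servers, 3 shifts, 20% utilization, 90% efficiency.
low_utilization = production_capacity(10, 3, 0.20, 0.90)

# The same hypothetical fleet driven to 70% utilization.
high_utilization = production_capacity(10, 3, 0.70, 0.90)

print(f"Effective capacity at 20% utilization: {low_utilization:.1f}")
print(f"Effective capacity at 70% utilization: {high_utilization:.1f}")
```

Holding the other factors constant, effective capacity scales linearly with utilization, which is why the 5% to 20% utilization range quoted for traditional data centers matters so much.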

In his CIO article about cloud computing capacity, Bernard Golden wrote:

“…the assumption that the total number of applications is going to remain stable is tenuous at best. With lower costs, no need of application-level capital funding, and lower friction in obtaining resources, the total number of applications is undoubtedly going to skyrocket. So even if each application can only request a limited number of resources, if the number of applications grows dramatically, capacity planning becomes problematic.”

The following are the most basic factors that should be known or estimated, as they serve as the inputs for capacity planning.

List your resource type(s).
Understand how much of each resource type you:
- Currently use
- Expect to use in the future
Estimate your headroom:
- To support load peaks
- For variation over time
- Relative to your application usage
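
The inputs listed above can be combined into a simple estimate per resource type. This is only a sketch under stated assumptions: the function `required_capacity`, the resource names, the growth rates and the headroom percentages are all hypothetical, and headroom is modeled here as a plain additive margin on top of the projected usage.

```python
def required_capacity(current_usage, expected_growth, peak_headroom, variation_headroom):
    """Estimate how much of one resource type to plan for.

    current_usage      -- what is used today (any unit)
    expected_growth    -- expected future growth as a fraction (0.25 = 25%)
    peak_headroom      -- extra margin for load peaks, as a fraction
    variation_headroom -- extra margin for variation over time, as a fraction
    """
    projected = current_usage * (1 + expected_growth)
    return projected * (1 + peak_headroom + variation_headroom)


# Hypothetical inventory: resource type -> current usage and expected growth.
resources = {
    "vcpus": {"current": 40, "growth": 0.25},
    "storage_gb": {"current": 2000, "growth": 0.50},
}

# Hypothetical headroom: 30% for load peaks, 10% for variation over time.
plan = {
    name: required_capacity(r["current"], r["growth"], 0.30, 0.10)
    for name, r in resources.items()
}

for name, amount in plan.items():
    print(f"{name}: plan for about {amount:.0f}")
```

In the cloud, the headroom figures can usually be kept smaller than in a traditional data center, since extra resources can be provisioned on demand rather than purchased up front.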

Traditional capacity planning, in which new servers were purchased to meet the demand of a single application running at a load of 20% at most, has been made obsolete by cloud computing. The comparison below shows some of the basic differences between the traditional DC and the cloud:

Traditional DC	Cloud
Expensive	The cloud’s granular consumption model and the economies of scale achieved by IaaS providers make it possible to provision a resource by the hour at a cost of a few cents to a few dollars.
Takes time (weeks) to purchase/provision	You are just one click away from getting a new server online.
Comes in big chunks that can only grow and almost never shrink. Planning errors can lead to severe business implications due to the upfront investment. Systems are actual assets with a high TCO.	No minimum price. The cost of buying resources shifts from CAPEX to OPEX. Utility computing appears to bring TCO down.
Almost zero agility in selecting the deployment region.	You can select the deployment region (US, EU, etc.), although you don’t know exactly where your application is running.

Capacity planning – Is it the IaaS vendor’s problem?

In his article, Bernard Golden claims that using the public cloud turns capacity planning and maintenance into a simpler problem for the IT organization (i.e. the app provider or IT department), because the infrastructure is owned by the public cloud provider. At the same time, capacity planning has become extremely complex for the IaaS vendors and for IT organizations that want to maintain an in-house private cloud (using dedicated on-premises resources).
Golden notes,

“… forecasting total demand and planning for sufficient capacity to meet it is going to become much more difficult. And make no mistake about it, when this cloud demands gets going, and apps groups begin to assume that resources will be available immediately whenever they’re requested, total demand is going to explode.”

Cloud computing vendors are often compared to airlines, which focus on being efficient and on having the right amount of resources (planes, fuel, food, etc.) to meet demand, with reasonable headroom for demand peaks. Cloud providers need to protect their economies of scale. High utilization and high efficiency are a must when keeping sufficient capacity in place to meet demand; otherwise, providers will end up wasting an enormous amount of resources. Most of the cloud giants are still striving to deliver the basic features that enhance support for self-provisioning and extend their resource portfolios. Capacity and utilization must be addressed by the vendors in the near future, as they directly affect the vendors’ ability to offer strong scalability and competitive prices, and hence a better and more attractive cloud business.
Amazon AWS is a good example of a cloud vendor that offers tools for this purpose, such as spot and reserved instances. Golden suggests options for overcoming the (private) cloud capacity provisioning problem, such as requiring manual approval by the internal IT team, or motivating the app groups (his term for the cloud consumers) to increase their resource utilization by shifting the cost responsibility onto them. With regard to the latter, one should bear in mind the competition between the internal IT team (with its private/in-house cloud) and the attractive options offered by the public cloud providers, which have already entered the organization through its “back door”.

Overall, one can expect to see much more emphasis on increasing utilization rates in internal data centers, far beyond the levels achieved even by the best of today’s virtualized environments. The sea change brought by cloud computing and its assumptions of “infinite” capacity and on-demand elasticity, accompanied by pay-per-use pricing, will galvanize change in IT infrastructure and operations far beyond what most envision today.

Considering the above overview, and assuming that the capacity issue is shifted to the cloud operator, there is a need to check whether cloud elasticity makes capacity planning obsolete or merely simpler. What about the underutilized resources that the cloud consumer holds (and pays for)? What are the relationships between the cloud consumer (the “app group”), the IT team, the current capacity and the SLA? Does your cloud elasticity policy lead to optimal usage and utilization?