While I was speaking on a cloud computing panel recently for some government-centric customers, the impact of cloud consolidation came up. We explored the topic for a bit, and I believe the outcome is worth sharing.
One of the many advantages of cloud computing today is the consolidation of resources. Through consolidation we are able to attain a higher level of utilization in all areas of the physical and logical infrastructure. This is one of the main reasons virtualization took off the way it did about eight to ten years ago. By adding a cloud abstraction layer on top of a virtual infrastructure, we gain the ability to consolidate service offerings such as applications or IaaS (Infrastructure-as-a-Service). These service offerings can be metered and billed in a retail-like experience that many operations centers are starting to find very advantageous.
This may sound like a great advantage (and it is), but there is a potential downside to the consolidation of resources as well. What if you are mixing your development environment with your mission-critical applications and a developer accidentally creates a runaway application that impacts the entire system? This is known as the “noisy neighbor” problem, and it is common in improperly architected clouds.
Let’s run with the noisy neighbor example for a bit. If this happens, how will you identify the root cause, and how do you solve the problem long term? To do both, your cloud must address a few consolidation issues.
Visibility – Many in the cloud community subscribe to the “Black Box” philosophy of cloud computing: the infrastructure is just there, and I have no visibility into the operations of the Black Box. I don’t know and I don’t need to know. While I agree the infrastructure should just work, you need visibility into the operations layer to diagnose problems when they arise. And they will, trust me. You need to find the Noisy Neighbor in your cloud.
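To make the visibility point concrete, here is a minimal Python sketch of how an operations layer might flag a Noisy Neighbor from per-workload CPU samples. The function name `find_noisy_neighbors`, the workload names, the sample values, and the 50% share threshold are all hypothetical, chosen for illustration rather than taken from any particular cloud platform.

```python
from statistics import mean


def find_noisy_neighbors(cpu_samples, share_threshold=0.5):
    """Return (workload, share) pairs whose average share of total CPU
    exceeds share_threshold, sorted with the largest share first.

    cpu_samples maps a workload name to a non-empty list of CPU readings.
    """
    averages = {name: mean(samples) for name, samples in cpu_samples.items()}
    total = sum(averages.values())
    if total == 0:
        return []  # nothing is consuming CPU, so nothing is noisy
    noisy = [
        (name, avg / total)
        for name, avg in averages.items()
        if avg / total > share_threshold
    ]
    return sorted(noisy, key=lambda item: item[1], reverse=True)


# Hypothetical readings: a development workload sharing hosts with production.
samples = {
    "payroll-prod": [10, 12, 11],
    "dev-sandbox": [70, 85, 90],
    "web-frontend": [8, 9, 7],
}

print([name for name, _share in find_noisy_neighbors(samples)])
# prints ['dev-sandbox']
```

A real monitoring stack would look at more than CPU (disk I/O, network, memory) and over longer windows, but the idea is the same: without per-workload measurements there is nothing to rank.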
Prioritization – Now that you have the ability to pick out the Noisy Neighbor, how do you keep it from impacting other mission-critical services? Since multiple workloads have been consolidated, you need the ability to assign priorities to workloads. In the networking world this is often referred to as QoS (Quality of Service). The most important workloads get the most resources and, in the event of a shortfall, they have priority over other workloads. Mission-critical services need to go to the front of the line. You need to keep the Noisy Neighbor from impacting your mission-critical systems.
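The "front of the line" idea can be sketched in a few lines of Python. This is a simplified strict-priority allocator, assuming a single pooled resource and hypothetical workload names, priorities, and demands; real schedulers often use weighted shares or reservations instead of strict ordering.

```python
def allocate_by_priority(capacity, requests):
    """Grant capacity to workloads in priority order.

    requests is a list of (name, priority, demand) tuples, where a lower
    priority number means a more important workload. Each workload gets
    its full demand if capacity remains, otherwise whatever is left.
    """
    allocation = {}
    remaining = capacity
    for name, _priority, demand in sorted(requests, key=lambda r: r[1]):
        granted = min(demand, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


# Hypothetical shortfall: 170 units requested against 100 units of capacity.
requests = [
    ("dev-sandbox", 3, 80),
    ("payroll-prod", 1, 60),
    ("web-frontend", 2, 30),
]

print(allocate_by_priority(100, requests))
# prints {'payroll-prod': 60, 'web-frontend': 30, 'dev-sandbox': 10}
```

In this example the mission-critical workloads are fully served and the development workload absorbs the entire shortfall, which is the behavior you want when a Noisy Neighbor starts consuming the shared pool.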
From an operations standpoint, both visibility and prioritization are critical in providing your customers with what they expect: a cloud that just works.