I have become very dependent on Cloud Computing in my daily life. Let me start with an explanation of that statement and then I’ll explain the light bulb moment (panic attack) I had recently. I have developed a policy that any application I use must store my data somewhere in “the cloud”. I use as much common sense as possible (encrypt sensitive data, long passwords, etc.) but services like Evernote, iCloud, Dropbox, ShareFile, Google Docs, etc. are all part of my life. If I lost any of my devices today, I would be able to quickly and easily replace all of my data. The planning for this isn’t as easy as you would think. Let’s take a few scenarios:
- Offline access: I spend a lot of time on airplanes these days. While most airplanes in the United States have Internet access, I have run into a few situations where I’ll be offline for hours at a time. What do you do? Some would say read a book or watch a movie and enjoy life, but I’m sorry to say I can’t remember the last time I had that luxury. Because of this I insist on a “Hybrid Cloud” model: local copies of my data at all times.
- Your data goes away: This is something new I started based on the experience below. What if your data somehow goes away and is no longer “your data”? Everyone assumes their data will always be there. What if the service experiences an outage (most likely), or is raided by the government and the servers are confiscated (least likely), or the provider decides you have violated the terms of service and locks or erases your data? There are a number of scenarios where your data might go away. The chances are small, but you have to ask yourself: what happens when my “cloud” goes away? I insist on a “Cloud DR site” for my data.
Until recently, I was living by the first scenario only. I thought the worst that could happen was I might have an outage here or there but as long as I had a local copy I would be able to work through the outage and everything would be fine.
Then came a slap in the face in the name of a Google/Evernote Double Whammy. The two events came within a few days of each other. First, Google announced that Google Reader (which I still use every day) will be going offline on July 1st of this year. This was my first experience with “what if my cloud goes away”. I had been using local RSS reader applications for years, but Google Reader was great: access anytime, anywhere, from any device. I am now scrambling to find an alternative.
The second part of my Double Whammy was Evernote. I actually write most of my blogs in Evernote as I tend to be on an airplane when I write and it is very easy to just sync the raw text to my PC, edit and polish, then hit publish. This is exactly what happened when I wrote a few posts on three (yes, three) flights across the country from Portland to Raleigh. All of the data was on my iPad but since I didn’t have Internet access, the posts were ONLY on my iPad. Then the news came that Evernote was hacked. I thought, no big deal, I have my local copies, change my passwords, sync up, move on. Not so much…
Because of the password change, Evernote required you to log in on the iPhone and iPad clients (but not the Mac client) before you could even access your data. I tried to change my password but both the iPhone and iPad versions wouldn’t connect. This lasted for about 8 hours (according to everyone on Twitter, mine was an isolated incident). I couldn’t do a thing: my data was on the iPad but locked behind a login screen I couldn’t get past. Not good…
This is when I instituted my second rule about a “Cloud DR site”. It is easy to become dependent and lazy in our personal lives about cloud computing, but always remember: what if your data goes away? The standard rules of IT & Cloud Operations from the business side still apply to your personal life. You need to control access to your data, and it is your responsibility to back up your data.
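One way to picture a personal “Cloud DR site” is a small script that mirrors the local copies of your cloud data to a second, independent location. The sketch below is only an illustration under assumptions: the function names `file_digest` and `mirror` and the folder layout are hypothetical, and the second location is treated as an ordinary directory (an external drive or a folder synced by a different service).

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def mirror(source: Path, dr_site: Path) -> list[str]:
    """Copy new or changed files from source to dr_site.

    Returns the relative paths that were copied. Files deleted from
    source are deliberately left in dr_site, so a provider that wipes
    the local copy does not also wipe the backup.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = dr_site / rel
        if dst.exists() and file_digest(dst) == file_digest(src):
            continue  # already backed up and unchanged
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied
```

Calling `mirror(Path("~/Notes").expanduser(), Path("/Volumes/Backup/Notes"))` (both paths are made up) on a schedule would keep a second copy that does not depend on any single provider’s login screen.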
I’m often asked: if cloud computing provides so many great benefits, why aren’t more organizations taking advantage of cloud computing throughout the entire organization? To answer this question let’s step back for a moment and explore why and how most organizations develop and operate their infrastructure and applications at scale. I like to think of each version of an application as a “generation” of that software package.
Production-critical line of business applications tend to fall into one of two categories: applications that generate revenue for the company (think Netflix or the Amazon store front), and applications that support the business (think enterprise databases, ERP, CRM systems).
Let’s start with the more traditional applications that support a typical enterprise today. From an operations view the approach has been that slow is better. Change introduces risk and risk means potential downtime. Downtime means work halts. Think of an assembly line: if one piece of the supply chain breaks, all work comes to a screeching halt. I have worked in these types of environments and attended daily “change management” meetings for years. Every day brought an analysis of any changes that required approval from all departments (network, storage, servers, development, etc.) prior to any upgrades to systems. Why was all this necessary? The software was not designed to fail. It required an infrastructure that was available at all times, and the software was not aware of changes; it simply stopped working. New software packages were rolled out very slowly and often had multiple-year road maps and release cycles. Each release cycle was a “generation” of the software with unique characteristics, and each generation tended to live for a long time and evolve very slowly.
Is this type of development a bad thing? Not necessarily. There are many organizations that do not need to go fast. Data integrity and constant uptime may be “good enough”. They may be happy with their current generation of software and as they say “If it ain’t broke, don’t fix it”. This is something many technologists in our industry forget. Inertia is a powerful thing, especially if that object is at rest. These “go slow” systems also tend to be the systems that are locked away behind firewalls and store data but do not typically interact with an organization’s customers on a daily basis.
Now, let’s move on to the second type, applications that generate revenue for the business. Because the application brings home the bacon, the application gets lots of attention (i.e. development budget). The better the application, the faster the business can go and the more money the application can potentially generate for the organization. This type of application tends to be the layer the consumer interacts with at some level and can change drastically as the needs of the consumer change. Because these changes can be quick and at times unpredictable, methods to quickly develop and maintain these systems became necessary. This need gave rise to practices such as devops (combining development and operations in a flexible and robust way) and continuous development, which have led to multiple releases per day. Combining devops with cloud computing leads to flexible production operations that can pivot as the users require, and this combination has led to the current wave in cloud computing. As you can see, the concept of software generations is really no longer relevant here. When an application is written to take advantage of the unpredictable nature of the underlying infrastructure as well as to grow and shrink as needed, a new development method is needed. Some devops shops perform multiple releases per day, and a software generation may only last for hours or even minutes. Cloud computing is successful in this area because it is not only advantageous but required to survive and thrive in today’s environment.
Two methods are utilized to meet two different needs and both are relevant to the requirements of the applications. As many are starting to discover there is no Silver Bullet in cloud computing, there is only the right tool for the job.
A few news articles and a podcast that posted over the last few weeks got me thinking about how open clouds are today.
How “open” are clouds of any type these days? Let’s take a look at a few examples and see where they stand:
“Closed Clouds” – Amazon / Google / VMware - As put forward in this article from the VAR guy, all three are either already running or plan to run on proprietary technology that is not open source. They also all run on different technology, so they are the classic example of a locked-in environment. Would you like to move from Amazon to Google Compute Engine easily? Not gonna happen. Your only options will be the options they decide to provide to you. I’m not saying they aren’t (or won’t be) successful. We all know they will build successful products and serve a large customer base. But you, as a consumer, must accept this lock-in in exchange for the service. Amazon in particular is evolving and introducing new features at an amazing pace. They are much like Apple when they first became cool years ago in the music industry. They stayed ahead of the competition, innovated at an amazing rate, but made sure to lock in the user at every turn (DRM music to start, restrictions on devices and software, applications required approval, etc.). This walled garden experience appeals to some and not to others (think Apple vs. Android).
“Open Clouds” – OpenStack Clouds – The OpenStack Foundation has done an amazing job of putting out a message that equates OpenStack with “Open Clouds”. If you peel back the layers, you find something a little different today. We have many players in this space (Red Hat, HP, Rackspace, Piston, CloudScaling, etc.). Can you move a workload from HP to Rackspace? All initial signs point to no, and last month Rackspace finally admitted that they would only support their distribution. Now, before anybody thinks I’m talking bad about OpenStack, I’m actually not. Let me explain. I think they fully intend to get to a truly open model someday, but the codebase simply hasn’t caught up to the idea.
Mark Hinkle actually shed some light on how this could happen in a recent episode of the Cloudcast (.net) talking about Open Source Software. OpenStack clouds are incompatible today because the core projects of the software are currently too immature and require each vendor to provide additional “secret sauce” to make OpenStack function in a production environment. OpenStack developers need to focus on contributions that continue to move the core projects forward and provide compatibility of the minimum functions needed to allow for better upgrades and migration across clouds in the future. Too many companies are trying to differentiate their products (and therefore building incompatibility) when they need to focus on moving the project forward. As pointed out by Derrick Harris, this is easier said than done. Developing and selling commodity features isn’t sexy for startup companies. Many OpenStack companies have seen a shakeup and I predict this trend will continue throughout the rest of 2013.
While speaking on a Cloud Computing panel recently for some government-centric customers, the impact of cloud consolidation came up. We explored this topic for a bit and I believe the resulting outcome was worth sharing.
One of the many advantages of cloud computing today is the consolidation of resources. Through consolidation we are able to attain a higher level of utilization in all areas of the physical and logical infrastructure. This is one of the main reasons virtualization took off the way it did about eight to ten years ago. By adding a cloud abstraction layer on top of a virtual infrastructure we have the ability to consolidate service offerings such as applications or IaaS (Infrastructure-as-a-Service). These service offerings can be metered and billed in a retail-like experience that many operations centers are starting to find very advantageous.
This may sound like a great advantage (and it is), but there is a potential downside to the consolidation of resources as well. What if you are mixing your development environment with your mission critical applications and a developer accidentally creates a runaway application that impacts the entire system? This is known as the “noisy neighbor” and is a common problem in improperly architected clouds.
Let’s run with the noisy neighbor example for a bit. If this happens, how will you know the root cause and how do you solve this problem long term? To do this your cloud must address a few consolidation issues.
Visibility – There are many in the cloud community that subscribe to the “Black Box” philosophy of cloud computing. The infrastructure is just there and I have no visibility into the operations of the Black Box. I don’t know and I don’t need to know. While I agree the infrastructure should just work, you need visibility into the operations layer to diagnose problems when they arise. They will, trust me. You need to find the Noisy Neighbor in your cloud.
Prioritization – Now that you have the ability to pick out the Noisy Neighbor, how do you keep them from impacting other mission critical services? Since multiple workloads have been consolidated you need an ability to assign priorities to workloads. In the networking world this is often referred to as QoS (Quality of Service). The most important workloads get the most resources and in the event of a shortfall, they have priority over other workloads. Mission Critical Services need to go to the front of the line. You need to keep the Noisy Neighbor from impacting your mission critical systems.
From an operations standpoint both visibility and prioritization are critical in providing your customers with what they expect, a cloud that just works.
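To make the two ideas concrete, here is a minimal sketch of how visibility and prioritization might look in code. Everything in it is an assumption for illustration: the `Sample` record, the workload names, the 50% share threshold, and the strict priority ordering are hypothetical and not taken from any particular cloud platform.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    workload: str     # hypothetical workload or tenant name
    cpu_cores: float  # cores consumed during one sampling interval


def noisy_neighbors(samples, host_cores, share_limit=0.5):
    """Visibility: return workloads whose average CPU use exceeds
    share_limit of the host's cores."""
    totals, counts = {}, {}
    for s in samples:
        totals[s.workload] = totals.get(s.workload, 0.0) + s.cpu_cores
        counts[s.workload] = counts.get(s.workload, 0) + 1
    limit = share_limit * host_cores
    return sorted(w for w in totals if totals[w] / counts[w] > limit)


def allocate(capacity, requests):
    """Prioritization: grant capacity in priority order.

    requests is a list of (name, priority, requested_units) tuples,
    where a lower priority number means a more important workload.
    Returns a dict mapping name to granted units.
    """
    grants = {}
    remaining = capacity
    for name, _priority, wanted in sorted(requests, key=lambda r: r[1]):
        granted = min(wanted, remaining)
        grants[name] = granted
        remaining -= granted
    return grants


samples = [
    Sample("billing", 2.0), Sample("dev-build", 10.0),
    Sample("billing", 2.0), Sample("dev-build", 12.0),
]
offenders = noisy_neighbors(samples, host_cores=16)
grants = allocate(10, [("dev-build", 3, 8), ("billing", 1, 6), ("reporting", 2, 3)])
```

In this example the runaway `dev-build` workload is the one flagged, and when only 10 units are available it receives whatever is left after `billing` and `reporting` are served. Strict ordering like this can starve low-priority workloads entirely, so a production QoS scheme would likely add minimum guarantees or weighted shares.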