Uncharted Waters

Mar 25 2019   9:05AM GMT

The Cloud Bait and Switch

Profile: Matt Heusser

Tags:
cloud
EC2
Kubernetes

When we started to talk about the cloud, twenty years ago, it was a picture on a napkin. We would draw one physical, concrete system, then an arrow, then a cloud. The cloud represented “the internet.” Its great power was that it was not concrete.

We did not know where the internet was. We did not know how it worked — and that was a good thing.

For the most part, though, we rented servers in colocation facilities. These servers had IP addresses listed in a lookup table called the Domain Name System, or DNS. When you went to my website, www.xndev.com, your browser would do the IP lookup in DNS, translate www.xndev.com into 66.172.35.61, then go to that web page. To the casual Yahoo! searcher, the computers were in the cloud. The rest of us knew all about the Linux server in the data center. We had to choose between spending too much money on compute power or occasionally getting overwhelmed.

That changed ten years ago with EC2. Suddenly the vendor claimed we could spin up as many servers as we needed. Run one server most of the time, then auto-scale when your company hit the front page of The Wall Street Journal. That was the promise, at least.

Today we have Kubernetes, an open source cluster manager. Kubernetes takes the mystery out of auto-scaling – showing that it is more art than science.

The EC2 bait and switch

Imagine that you have a sudden burst of traffic. This could be temporary. If you spool up a new instance too soon, and keep it up too long, you will pay too much in server rental fees. On the other hand, if you wait too long to create a new server, your users may experience delays and timeouts. Knowing when to add compute power is a tradeoff.
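The tradeoff can be made concrete with a toy cost model. The function name and every number below are illustrative assumptions, not real pricing: rent paid for idle capacity on one side, revenue lost to slow or dropped requests on the other.

```python
def provisioning_cost(idle_instance_hours, hourly_rate,
                      timed_out_requests, cost_per_timeout):
    """Toy model of the scaling tradeoff: rent paid for idle capacity
    plus the business cost of requests that timed out.
    All inputs are illustrative assumptions."""
    return (idle_instance_hours * hourly_rate
            + timed_out_requests * cost_per_timeout)

# Spin up early: 20 idle instance-hours at $0.10/hr, no timeouts.
early = provisioning_cost(20, 0.10, 0, 0.05)
# Wait too long: no idle rent, but 500 requests time out.
late = provisioning_cost(0, 0.10, 500, 0.05)
print(early, late)
```

With these made-up numbers, waiting costs far more than over-provisioning, but flip the rates around and the answer flips too. That is why "when to add compute power" is a judgment call, not a formula.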

Here’s the current wording from the EC2 website on auto scaling. Read it carefully.

Amazon EC2 Auto Scaling enables you to follow the demand curve for your applications closely, reducing the need to manually provision Amazon EC2 capacity in advance. For example, you can use target tracking scaling policies to select a load metric for your application, such as CPU utilization. Or, you could set a target value using the new “Request Count Per Target” metric from Application Load Balancer, a load balancing option for the Elastic Load Balancing service. Amazon EC2 Auto Scaling will then automatically adjust the number of EC2 instances as needed to maintain your target.

In other words, the folks at Amazon don’t exactly know when to add compute power either.
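The target-tracking policy in the quote boils down to a proportional rule: scale the fleet by the ratio of the observed metric to the target. Here is a toy sketch of that idea, not Amazon's implementation; the function name, bounds, and numbers are all made up.

```python
import math

def desired_instances(current_instances, current_metric, target_metric,
                      min_instances=1, max_instances=10):
    """Toy target-tracking policy: scale the fleet in proportion to
    how far the observed metric (e.g. CPU %) is from the target.
    A real policy adds cooldowns and smoothing; this does not."""
    if current_metric <= 0:
        return min_instances
    raw = current_instances * (current_metric / target_metric)
    return max(min_instances, min(max_instances, math.ceil(raw)))

# Four instances running at 90% CPU, targeting 50%: scale out to 8.
print(desired_instances(4, 90.0, 50.0))  # 8
```

Notice the operator still has to pick the metric and the target value. The arithmetic is trivial; choosing the inputs is the hard part.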

Back when I was a college student, we used to talk about what was slowing down the computer. Sometimes the delay was just waiting for the computer to perform calculations, such as computing the value of Pi. We called this CPU-bound. Other times the CPU had to swap variables in and out of memory, which made the program memory-bound. Or we could be waiting for keyboard or disk input, which we called IO-bound. The passage above means the operator must specify how the application is bound, and at what point to create new EC2 instances. The simplest way to figure out how an application is bound is to performance test it, see when it “falls over” under load, and determine which metric is the key indicator.
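A crude version of that diagnosis can even be automated: after a load test, look at which resource saturated first. The classifier below is a sketch under stated assumptions; the 70% threshold and the function name are mine, not a standard.

```python
def likely_bottleneck(cpu_pct, mem_pct, io_wait_pct):
    """Crude classifier for what a load test suggests the app is
    bound by. The 70% saturation threshold is illustrative, not
    universal; real analysis looks at trends, not one sample."""
    metrics = {
        "CPU-bound": cpu_pct,
        "memory-bound": mem_pct,
        "IO-bound": io_wait_pct,
    }
    name, value = max(metrics.items(), key=lambda kv: kv[1])
    return name if value >= 70 else "not yet saturated"

# CPU pegged at 95% while memory and IO wait stay low.
print(likely_bottleneck(95, 40, 10))  # CPU-bound
```

The point stands either way: someone has to run the load test and read the numbers before auto-scaling can be configured sensibly.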

To summarize: We thought we were going to get rid of performance testing through scaling. It turns out to scale we need to performance test.

Scaling the cloud with Kubernetes

In Kubernetes the basic unit of work is a pod. The replica set functionality makes it possible to scale the number of pods from some minimum to some maximum. Note that there are still limits to scaling — set the maximum wrong, and you’ll still fall over in the event of a massive traffic spike. The only metric Kubernetes will auto-calculate for you is CPU percent. If you want an additional metric, you’ll need to do a significant amount of configuration. If the application is memory-bound, say an in-memory cache, you will need to do that configuration.
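Kubernetes' Horizontal Pod Autoscaler uses roughly the same proportional rule as EC2 target tracking: ceil(current replicas × current metric / target metric), clamped to the configured min and max. The real HPA adds a tolerance band and stabilization windows; the sketch below leaves those out, and the failure mode in the example (a max set too low for the spike) is the "set the maximum wrong" problem described above.

```python
import math

def hpa_replicas(current, current_cpu_pct, target_cpu_pct,
                 min_replicas, max_replicas):
    """Sketch of the HPA proportional rule:
    ceil(current * metric / target), clamped to [min, max].
    The real controller also applies a tolerance band and
    stabilization windows, omitted here."""
    desired = math.ceil(current * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, desired))

# Traffic spike: CPU at 4x the target wants 20 replicas, but the
# configured maximum caps us at 10 -- each pod keeps running hot.
print(hpa_replicas(5, 200, 50, min_replicas=2, max_replicas=10))  # 10
```

Set the maximum generously and you risk the bill; set it tightly and you risk the outage. The formula doesn't resolve that tension, the operator does.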

Kubernetes does provide its own software to generate and process load. Whether that load is realistic compared to real customer traffic is open to debate.

Once again, we have the cloud bait and switch. Yes, you can auto-scale. You could use the default scaling mechanism (CPU), but if your application is bound by something else, that will be no help at all. You’ll need to set up a max number of instances (limiting the “scale”), or else risk an exploding credit card bill on the public cloud. On the private cloud you risk consuming all your resources and losing the network.

Don’t get me wrong, I’m excited about Kubernetes and the reality of infrastructure as code. At the same time, the reality is we have a lot of work to do.

It’s time to roll up our sleeves and get to work.
