Microservices, containers, and APIs, oh my. They are the holy trinity that anoints cloud and mobile computing with unfathomable power and limitless scalability. APIs are the glue that holds the others together and makes possible a universe of interactions with services, applications, analytics, and data from, well, anywhere and everywhere.
It makes sense, therefore, to want an API to do as much as possible each time it is called. Do more with fewer calls and you maximize efficiency, right? Maybe, maybe not. Ok, let’s try the converse, an API that does one tiny, highly focused task when called upon. Do less with each call and you boost speed. Again, maybe, maybe not.
Think of two children of differing weights, each seeking the one golden spot on their side of a see-saw fulcrum that brings the whole system into magnificent equilibrium. APIs are not really any different. Find that magical balance point between too many small calls and too few big ones, and you’ve built a work of art. It’s the Goldilocks effect brought into the cloud age.
One of the world’s top API experts calls it API granularity. Manfred Bortenschlager, Red Hat’s director of business development for API-based integration solutions and API management, says developers need to get better at it.
“One thing that’s difficult to get right is API design — in particular, the right granularity. An API could give you a lot of data back; a lot of values, which are potentially unnecessary; or just too much data,” Bortenschlager says. “If you are serving a mobile app, this could be too much payload. On the other end of the scale, you could have an API that gives you too little back. This means that an API consumer would have to issue many API calls. Getting this balance right is tricky.”
APIs don’t exist in a vacuum. An application can easily encompass several dozen. Though each performs one task, all of them, when taken together, must be designed for optimal system performance, resulting in a speedy and enjoyable user experience. Many small API calls can get hung up on network latency that drives users crazy. Fewer, bigger calls require fewer round trips, but may return data or metadata that’s simply not needed, slowing down an application. On mobile devices, these small delays can be deadly, leading to session and user abandonment. Not too big. Not too small. Goldilocks.
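To make the trade-off concrete, here is a minimal Python sketch that counts round trips and payload bytes for three ways of fetching the same screen's worth of data. Everything in it is an assumption made for illustration: the DemoApi class, its method names, the sample records, and the oversized avatar field are hypothetical, and a real API would sit behind HTTP rather than in memory.

```python
import json

# Hypothetical sample data; the oversized avatar stands in for "too much payload".
USERS = {42: {"id": 42, "name": "Ada", "avatar": "x" * 50_000, "order_ids": [1, 2, 3]}}
ORDERS = {
    1: {"id": 1, "total": 9.99},
    2: {"id": 2, "total": 24.50},
    3: {"id": 3, "total": 5.00},
}


class DemoApi:
    """Toy in-memory API that tallies round trips and bytes returned."""

    def __init__(self):
        self.calls = 0
        self.bytes_sent = 0

    def _respond(self, payload):
        self.calls += 1
        self.bytes_sent += len(json.dumps(payload))
        return payload

    # Too small: one resource per call, so the client must call repeatedly.
    def get_user(self, user_id):
        user = USERS[user_id]
        return self._respond(
            {"id": user["id"], "name": user["name"], "order_ids": user["order_ids"]}
        )

    def get_order(self, order_id):
        return self._respond(ORDERS[order_id])

    # Too big: one call, but everything comes back whether needed or not.
    def get_user_full(self, user_id):
        user = dict(USERS[user_id])
        user["orders"] = [ORDERS[i] for i in user["order_ids"]]
        return self._respond(user)

    # In between: one call, and the caller names the fields it wants.
    def get_user_view(self, user_id, fields):
        user = dict(USERS[user_id])
        user["orders"] = [ORDERS[i] for i in user["order_ids"]]
        return self._respond({name: user[name] for name in fields})


def fine_grained(api):
    user = api.get_user(42)
    for order_id in user["order_ids"]:
        api.get_order(order_id)


def coarse(api):
    api.get_user_full(42)


def balanced(api):
    api.get_user_view(42, fields=["name", "orders"])


def measure(scenario):
    """Run one scenario against a fresh API and return (calls, bytes)."""
    api = DemoApi()
    scenario(api)
    return api.calls, api.bytes_sent


if __name__ == "__main__":
    for scenario in (fine_grained, coarse, balanced):
        calls, size = measure(scenario)
        print(f"{scenario.__name__}: {calls} calls, {size} bytes")
```

In this toy setup the fine-grained version needs four round trips, the coarse version needs one but drags the avatar along, and the field-selecting version gets the same screen in one small response. Whether field selection is the right answer for a given API is a design judgment; the point is only that both costs can be measured.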
A separate issue, but no less important, is that businesses publish APIs that allow their customers to gain access to data or services. It’s a fact of life that software gets updated and that you’ll never, ever get all users of an API to be on the same version at the same time. (How many are still running Windows XP?) Some may upgrade today, others not for months, still others not at all.
That means maintaining tight control over versions is essential. And it’s not easy, according to Bortenschlager. “It’s impossible to have an API with changes that never break. It’s impossible. That’s just the nature of it,” he told me. “What’s important is to communicate changes very well and far in advance.”
Through API management, Bortenschlager says, it’s possible to know exactly who is using each version of an API. “If you know that you are going to change a subset of your API to a new version, you can target the communication to those developers in advance,” he says. Good advice.
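As a rough illustration of that idea, the Python sketch below groups consumers by the API version they call, then lists who should hear about a retirement. The log format, consumer names, and function names are all hypothetical; an API management product would supply this data through its own reporting, not through a list of tuples.

```python
from collections import defaultdict

# Hypothetical access-log records: (consumer, api_version).
CALL_LOG = [
    ("acme-mobile", "v1"),
    ("acme-web", "v2"),
    ("globex-batch", "v1"),
    ("acme-mobile", "v1"),
    ("initech-app", "v2"),
]


def consumers_by_version(call_log):
    """Map each API version to the set of consumers seen calling it."""
    usage = defaultdict(set)
    for consumer, version in call_log:
        usage[version].add(consumer)
    return usage


def consumers_to_notify(call_log, retiring_version):
    """Return the consumers that still call the version being retired."""
    return sorted(consumers_by_version(call_log).get(retiring_version, set()))


if __name__ == "__main__":
    print(consumers_to_notify(CALL_LOG, "v1"))
```

With the sample log, only the two consumers still on v1 would get the advance notice, which is the targeted communication Bortenschlager describes.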
How does your company manage its API assets to ensure that those used in an application are efficient and compact? And how do you deal with the headache of having multiple versions of an API in use simultaneously? Share your thoughts; we’d like to hear from you.
The CloudExpo conference in New York is always a good take for developers, architects, and managers who want to understand where the technology of cloud computing is headed next. Serverless computing appears to be that destination.
As session presenter Doug Vanderweide from the Linux Academy — as entertaining a speaker as you’ll ever run into at a technology conference — puts it, the first thing you need to know about serverless computing is that, yes, there are servers. They’re just not yours.
Let’s back up a step and note that today’s cloud computing boils down to microservices and containers. Each offers profound benefits, though neither is perfect.
Containers are hot and it is microservices that make them great, Vanderweide says. Microservices break work into small steps with APIs to handle them. You can manage functionality independently, streamline development, and save time with reusable code. Microservices work best when running in small virtualized environments, namely, containers, which are quickly deployed, inexpensive to run, easily scaled and orchestrated, and offer version control.
But, beware, Vanderweide says. Containers exist in a cloud technology ecosystem that’s changing daily. They’re prone to sprawl, can suffer from broken dependencies, and they are at the mercy of networking woes. Serverless to the rescue.
As Vanderweide explains it, serverless computing consists of anonymous, generalized virtual machine instances that are managed by the cloud provider. They’re provisioned when needed and de-provisioned when you’re done. They’re billed based on executions and resource consumption, not at an hourly rate. With a focus on triggers, inputs, and outputs, along with high availability and superb scalability, serverless is a great match for microservices.
The base operating system (Linux or Windows) is a general configuration that supports multiple languages (Node.js, Python, .NET Core, Java, etc.). The key to this is the provider can quickly provision instances because they are all the same no matter the corporate user.
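For a feel of what "triggers, inputs, and outputs" means in code, here is a minimal Python sketch of a function in the handler(event, context) shape that AWS Lambda uses for Python. The event field ("name") and the response layout are assumptions chosen for the example, and the call at the bottom only simulates locally what the platform would do when a trigger fires.

```python
import json


def handler(event, context=None):
    """Run once per trigger; keeps no state between invocations."""
    # Input: whatever the trigger passed in (a hypothetical "name" field here).
    name = event.get("name", "world")
    # Output: a plain value handed back to the platform.
    return {"statusCode": 200, "body": json.dumps({"greeting": f"Hello, {name}"})}


if __name__ == "__main__":
    # Local simulation of a single trigger firing.
    print(handler({"name": "serverless"}))
```

Notice what is missing: no server setup, no port, no process management. That absence is the whole pitch.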
The real allure may be in the stellar TCO (total cost of ownership) that serverless delivers. When you look at VM vs. function-based pricing for 2 million executions per month, consuming 4 GB-seconds (4 GB of memory used for one second) per execution, the differences are clear. Vanderweide says that works out to $279.74 on AWS and $220.97 on Azure for VMs, but, in a serverless ecosystem, a paltry $121.80 for Azure Functions and $129.86 for AWS Lambda. Pretty impressive stuff.
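Function-based pricing is easy to model: multiply executions by GB-seconds, subtract any free grant, and add a per-request charge. The Python sketch below does that. The rates and free allowances in it are my assumptions, not figures from the talk; I picked them because they reproduce the $121.80 quoted for Azure. Treat them as placeholders and check the provider's current price sheet.

```python
def function_cost(executions, gb_seconds_per_execution,
                  price_per_gb_second, price_per_million_executions,
                  free_gb_seconds=0, free_executions=0):
    """Estimate a monthly bill for function-based (serverless) pricing."""
    billable_gb_seconds = max(
        0, executions * gb_seconds_per_execution - free_gb_seconds
    )
    billable_executions = max(0, executions - free_executions)
    compute = billable_gb_seconds * price_per_gb_second
    requests = billable_executions / 1_000_000 * price_per_million_executions
    return round(compute + requests, 2)


# Assumed rates and free grant (illustrative, not taken from the talk).
monthly_bill = function_cost(
    executions=2_000_000,
    gb_seconds_per_execution=4,
    price_per_gb_second=0.000016,
    price_per_million_executions=0.20,
    free_gb_seconds=400_000,
    free_executions=1_000_000,
)

if __name__ == "__main__":
    print(f"${monthly_bill:.2f} per month")
```

The useful part is the shape of the formula: the bill scales with work actually done, so an idle month costs next to nothing.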
Vanderweide calls this “the long tail of serverless.” For the cloud provider, the sameness of configurations for everyone greatly reduces the expense of providing them. That means each new instance is, well, instantly profitable.
Compared with containers, similar function workloads cost less to run and you never pay for capacity that’s sitting idle. Beyond that, automation, abstraction, and cloud vendor services can eliminate DevOps tasks (and possibly DevOps payroll, too). Infrastructure costs drop, the systems development lifecycle is simpler, server management is no longer your problem, and deployments are faster.
It’s not perfect, of course. Serverless computing can suffer from laggy startups of cold code. And it’s an immature technology that may leave you wedded to a specific cloud platform provider, at least for now.
Vanderweide sums up the advantages of serverless computing with a quotation from Greg DeMichillie, head of developer platform and infrastructure at Adobe: “In five years, every modern business will have a substantial portion of their systems running in the cloud. But that’s only the first step.” DeMichillie goes on to say, “The next step comes when you free your developers from the tedious work of configuring and deploying even virtual cloud-based servers.”
What’s your take on serverless computing? Are you still trying to catch your breath with (and I hesitate to use this word) “traditional” cloud computing? Too much too soon? Or are you ready to get out in front of the next wave and carve out a new career path — again? Share your thoughts about serverless computing; we’d like to hear from you.
Designed by engineers, comprehensible only by engineers. You’ve no doubt heard some variation of that old maxim. Let an engineer design a software or hardware product, and the average person will have a tough time figuring out how to use it, because the user interface is arcane, convoluted, circuitous, dense, indescribable, inexplicable — or worse.
What made me think of this is a new paper, published online today by Adobe, called “12 Tips for Mobile Design.” It’s a good read, and I suggest that developers, architects, and anyone else who touches the mobile app universe in any way invest some time. Building a beautiful-looking app that is a joy to use is a vastly different exercise than building efficient, error-free code. After all, as we move from DevOps into BizDevOps, which brings developers deeper into the business side — and closer to customers — than ever before, understanding design concepts (or at least being able to talk a good game) is useful.
Adobe says we need mobile apps that are not just “useful,” but “intuitive” as well.
And there’s the rub. Developers (we used to call them programmers) are good at developing. Good at thinking serially. In loops. In if-this-then-that (IFTTT) case structures. In writing tight, API-driven, containerized-as-microservices code. In stark contrast, interface designers (and it truly is a special discipline combining art and psychology) are good at UI/UX, designing the user interface and user experience, neither of which is a logic structure. Designers, for their part, can’t write a lick of code. I’m simply suggesting that a little cross-pollination is a good thing for everyone.
What are the 12 tips, you ask? Here’s the list. Read the paper to dive deeper into each one.
- De-clutter the user interface
- Design for interruption
- Make navigation self-evident
- Make a great first impression
- Align with device conventions
- Design finger-friendly tap-targets
- Design controls based on hand position
- Create a seamless experience
- Use subtle animation and micro-interactions
- Focus on readability
- Don’t interrupt users
- Refine the design based on testing
None of these has anything to do with platforms, infrastructures, or anything else “as a service.” It’s not about AWS vs. Azure vs. Google vs. Bluemix. It’s about you. Sure, you’re a great code jockey, but what about your interface, navigation, experience, color-palette, and typography skills? Where do you fit in? Share your thoughts; we’d like to hear from you.
NASA. Remember NASA? It’s the once-glorious government agency that put men on the moon, the agency whose Voyager 1 space probe was confirmed in 2013 to have crossed into interstellar space, bound for parts unknown, the agency that, in the immortal words of John F. Kennedy, did things not because they were easy, but because they were hard.
FORTRAN. Remember FORTRAN? Well, of course you don’t. And that’s precisely why, in mid-2017, maintaining ancient programs written in it isn’t easy. It’s hard. Really hard. It’s so hard, in fact, NASA is holding a contest featuring a prize purse of up to $55,000. It’s the sort of app dev challenge that would look good on any résumé — if you know FORTRAN, that is. And computational fluid dynamics, too.
According to NASA, all you need to do is “manipulate the agency’s FUN3D design software so it runs ten to 10,000 times faster on the Pleiades supercomputer without any decrease in accuracy.” It’s called the High Performance Fast Computing Challenge (HPFCC).
If you’re a U.S. citizen at least 18 years old, all you need do, NASA says, is download the FUN3D code, analyze the performance bottlenecks, and identify possible modifications that might lead to reducing overall computational time. “Examples of modifications would be simplifying a single subroutine so that it runs a few milliseconds faster. If this subroutine is called millions of times, this one change could dramatically speed up the entire program’s runtime.”
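The arithmetic behind that claim is worth a quick look. This Python sketch multiplies a per-call saving by a call count; the 2 milliseconds and 5 million calls are made-up numbers for illustration, not measurements of FUN3D.

```python
def total_seconds_saved(ms_saved_per_call, calls):
    """Total wall-clock time saved when every call gets a bit faster."""
    return ms_saved_per_call * calls / 1000


# Assumed numbers: 2 ms shaved off a subroutine that runs 5 million times.
saved = total_seconds_saved(ms_saved_per_call=2, calls=5_000_000)

if __name__ == "__main__":
    print(f"{saved:,.0f} seconds saved, about {saved / 3600:.1f} hours")
```

Under those assumptions a trivial-looking tweak buys back roughly 10,000 seconds, or close to three hours of supercomputer time per run.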
If you’ve ever asked what you can do for your country, this may be it.
It’s your chance to go far beyond mere cloud computing, your chance to do outer-space computing — perhaps to infinity and beyond.
Ok, let’s get serious… FORTRAN has suffered mightily from the same ignominious fate as assembler language and COBOL (the language that paid my bills for many years). No one cares about FORTRAN, no one wants to learn it, few institutions bother to teach it, and many who were expert in it are long-since deceased.
Physicist Daniel Elton, in a July 2015 personal blog entry, suggests that FORTRAN remains viable (at least among physicists) because of the enormous amount of legacy code still in production, its superior array-handling capabilities, little need to worry about pointers and memory allocation, and its ability to catch errors at compile time rather than run time. In a March 2015 post in the Intel Developer Zone, Intel’s Steve Lionel (self-anointed “Dr. FORTRAN” and recently retired) said a poll of FORTRAN users conducted at the November 2014 supercomputing conference indicated 100% of respondents would still be using the language five years later.
With good reason, we live in a world dominated by the likes of Java, C, C++, C#, Python, PHP, Ruby, Swift, R, Scala and scads of others. Visual Basic, Pascal, PL/I, ADA, APL, along with COBOL and FORTRAN have seen their day. The problem is that, to paraphrase Gen. Douglas MacArthur, old programming code never dies — and it doesn’t fade away, either.
How much ancient code from legacy languages do you come across in dealing with enterprise IT? Are you afraid to tinker with it? Does anyone know what those programs actually do? Has the documentation been lost to the ravages of time? Does the source code still exist? Tell us how you deal with it; we’d like to hear from you.
Cloud deployments of software often pose the most ticklish error detection and repair problems. Customers are constantly using a cloud app developer’s products, at all times of day and night and across geographies. Meanwhile, it’s a safe bet that something in those releases will be breaking, and error fixes will be needed, said Brian Rue, CEO of Rollbar, which provides real-time error monitoring services for developers. The trick is detecting errors quickly, rather than waiting for customers to report them.
“You’re releasing improvements, releasing bug fixes, and that constant state of change means that you need to have a constant state of monitoring,” said Rue. “If something is broken, and you don’t find out about it until a customer writes in days later, it could easily be days or weeks before you find a way to repeat the problem. The development team gets caught up in a constant state of firefighting.”
Rue shares some best practices for error handling and making code error fixes in this article. Rue co-founded Rollbar after experiencing the problems of error handling when developing gaming apps, at first on a kitchen table in a garage with three colleagues.
The vicious circle
“Imagine a circle starting from deployment,” explained Rue. From deployment, the next thing that happens, typically, is that an error occurs. Your team needs to discover if it’s a new error or a repeat error. A new one calls for alerting and prioritization. Once an error is prioritized, the developers can go explore the data for the error. They can discover which users the error affects, the values of the variables, and other information about the cause of the error.
“Usually, that’s enough data to enable writing and deploying error fixes,” said Rue. Then it’s on to the next problem. “That wheel of release, error monitoring and error fixes is constantly spinning,” he said.
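One step in that circle, deciding whether an error is new or one you have seen before, can be sketched in a few lines of Python. This is my own simplified illustration, not Rollbar's method: it fingerprints an exception by its type plus the function that raised it, and the function names (fingerprint, record, parse_age) are hypothetical.

```python
import hashlib
import traceback


def fingerprint(exc):
    """Group errors by exception type and the innermost frame that raised."""
    frames = traceback.extract_tb(exc.__traceback__)
    where = f"{frames[-1].filename}:{frames[-1].name}" if frames else "unknown"
    raw = f"{type(exc).__name__}|{where}"
    return hashlib.sha256(raw.encode()).hexdigest()[:12]


def record(exc, seen):
    """Count the occurrence and report whether this error is new or a repeat."""
    key = fingerprint(exc)
    seen[key] = seen.get(key, 0) + 1
    return "new" if seen[key] == 1 else "repeat"


def parse_age(text):
    return int(text)  # raises ValueError on non-numeric input


seen = {}
results = []
for value in ("abc", "xyz", "7"):
    try:
        parse_age(value)
    except ValueError as exc:
        results.append(record(exc, seen))

if __name__ == "__main__":
    print(results)
```

The first bad input is flagged as new and would trigger an alert; the second lands in the same group and only bumps a counter. Real services use richer grouping rules, so take the fingerprint recipe here as a starting point only.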
Structured data is good data
The better the data structure is, the more the developer can discover about each code error. “Data really should be structured in terms of keys and values, as opposed to just raw strings,” said Rue. For example, let’s say there’s an error message that says: “This user tried to log in and it failed.” That might be something that the cloud developer wants to log. It should be logged as, say, “User login failed,” with the user ID attached as metadata. That way, the events are easier to group, because there is just one message saying “Login failed.” The cloud developer can then see all of those events together and is closer to making error fixes.
“Once you have that structure, you can easily query data forward to see which logins failed. You can figure out how that correlates against other problems, and so on,” Rue said.
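Here is a small Python sketch of the login example using only the standard library's logging and json modules. The JsonFormatter class, the "data" key passed through extra, and the helper names are my own choices for illustration; a logging or monitoring product would have its own conventions.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Emit each record as one JSON object: fixed message plus key/value data."""

    def format(self, record):
        entry = {"level": record.levelname, "message": record.getMessage()}
        entry.update(getattr(record, "data", {}))  # metadata passed via extra=
        return json.dumps(entry, sort_keys=True)


def build_logger(stream):
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("structured-demo")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def failed_login_user_ids(log_lines):
    """Query the structured log: which user IDs had a failed login?"""
    rows = [json.loads(line) for line in log_lines]
    return [row["user_id"] for row in rows if row["message"] == "Login failed"]


buffer = io.StringIO()  # stands in for a log file or log pipeline
log = build_logger(buffer)
log.warning("Login failed", extra={"data": {"user_id": 1017}})
log.info("Login succeeded", extra={"data": {"user_id": 3001}})
log.warning("Login failed", extra={"data": {"user_id": 2042}})

if __name__ == "__main__":
    print(failed_login_user_ids(buffer.getvalue().splitlines()))
```

Because the message text never varies and the user ID travels as a separate key, grouping and filtering become a one-line query instead of a string-parsing exercise.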
Add instrumentation to apps
The core of error monitoring is tracking the application from the perspective of the application, according to Rue. So, to use it, the cloud app developer needs to be able to add instrumentation to the application. Typically that’s as simple as installing a Ruby gem, installing a package from npm, or installing a kind of Java middleware, all things most development teams have done. “But, at a high level, this requires buy-in from the developers to identify what there is, and then make sure that each component is instrumented,” Rue said.
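What does instrumenting a component look like in practice? The Python sketch below wraps a function in a decorator that reports any exception before re-raising it. The report_error function and the in-memory reports list are stand-ins I made up; a real monitoring SDK would ship the report over the network through its own API.

```python
import functools

reports = []  # stand-in for a monitoring service's inbox


def report_error(exc, context):
    """Hypothetical reporter; a real SDK would send this somewhere."""
    reports.append({"error": type(exc).__name__, "message": str(exc), **context})


def instrumented(component):
    """Decorator that reports exceptions from the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                report_error(
                    exc, {"component": component, "function": func.__name__}
                )
                raise  # reporting must not swallow the failure
        return wrapper
    return decorator


@instrumented("checkout")
def apply_discount(total, percent):
    if not 0 <= percent <= 100:
        raise ValueError(f"invalid discount: {percent}")
    return total * (100 - percent) / 100


if __name__ == "__main__":
    print(apply_discount(200, 25))
    try:
        apply_discount(200, 140)
    except ValueError:
        print(reports[-1])
```

Tagging each report with a component name is what makes Rue's point about buy-in practical: someone has to decide what the components are and wrap every one of them.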
The Red Hat Summit in Boston this week drew more than 5,000 developers, according to Paul Cormier, president of Red Hat’s products and technologies business. That’s impressive for a major software company that literally started out as a flea-market operation.
“It’s so much fun to watch this all roll out,” Cormier says. “I’ve been at Red Hat for 16 years and was employee #120.” And how did Red Hat get its start? Cormier says company founder Robert Young began by “downloading Linux off the ‘net, burning it to CDs, and selling it out of the trunk of his car at flea markets.” This is an outfit that’s come a long way, with a pervasiveness that extends into almost every home. In the early days, there were seemingly dozens of free, open-source Linux distros, but it’s the one company that created tools, platforms, and enterprise-class support that is the premier survivor.
Two key announcements made at the Red Hat Summit were OpenShift.io, a complete development environment accessed through the browser, and the Red Hat Container Health Index, a method for scoring containers for several factors, including version currency and security. Other announcements were a tightening of Red Hat’s relationship with Amazon Web Services and an on-premises containerized API management platform, which I reported on last week.
OpenShift.io is a new, comprehensive, end-to-end development, test, and deploy environment in a browser. There’s nothing to install on developers’ local desktops, on-premises, or in a business’s private cloud. Everything needed to design, build, and deploy is available through the browser.
“Now that we’ve finally put Dev and Ops together, we’re making the tooling more intelligent and more intuitive for developers to be even more productive,” Cormier says. “The OpenShift.io stuff uses artificial intelligence from all the things we’ve learned over the last 15 years to guide developers through building their application and recommend what might be a better path to go than the path they’re on.” With nothing to install, Cormier says developers can begin building from day one, avoiding the weeks and months it can sometimes take to procure and spin up development resources and infrastructures.
Another major announcement was the Red Hat Container Health Index, a service that grades the containerization performance and security of Red Hat’s own products and the products of certified ISVs. It’s not a one-time examination of containers, but rather a way to track changes in container health over time, letting you know that a container considered fully secure a month ago, earning an “A” rating, is now vulnerable, dropping to a grade of D or F.
“I’ve said this until I was blue in the face: a container is Linux, it’s just Linux carved up in a different way,” Cormier says. “Container tools help you package just the pieces of the user space OS that you need with the application.” When people were playing with containers and not yet betting their business on them, they pulled containers from everywhere. Now, customers want a commercial-grade system.
“What we’ve done is containerize all of our products into a RHEL (Red Hat Enterprise Linux) container. We can scan the pieces of the OS that are included and tell if there are known security vulnerabilities, bugs, or if there’s a new version available. We’ve built that into our back-end systems that we use to build all our products,” Cormier says.
Red Hat will now make those tools available to ISV partners to test their own containers. All results will be available through a portal. “If you’re going to be a container provider in the commercial world, this is what you have to do.”
Do you use Red Hat development tools and platforms? What do you think of the company’s announcements this week and how do you plan to leverage these technologies in your upcoming projects? Share your thoughts with us; we’d like to hear from you.
One thing we know for sure is that under CEO Satya Nadella, Microsoft — in both action and spirit — looks very little like the Windows Or Else empire from the days of Steve Ballmer. The latest move is Microsoft’s acquisition this week of Deis, a little-known San Francisco developer of open-source software that makes Kubernetes easier to use.
Deis, in its own words, “helps developers and operators build, deploy, manage, and scale their applications on top of Kubernetes.” We all want to do that.
In an April 10, 2017 blog post, Scott Guthrie, executive vice president of Microsoft’s cloud and enterprise group, wrote “we’ve seen explosive growth in both interest and deployment of containerized workloads on Azure, and we’re committed to ensuring Azure is the best place to run them.” The post goes on to say, “Deis gives developers the means to vastly improve application agility, efficiency and reliability through their Kubernetes container management technologies.” Guthrie expects the technology to make it easier for customers to work with existing Microsoft container technologies, including Linux and Windows Server Containers, Hyper-V Containers, and Azure Container Service, “no matter what tools they choose to use.”
Deis CTO Gabriel Monroy perhaps put it best, saying “robust and open container orchestration, paired with new application architectures are giving organizations unprecedented flexibility and choice.” That could be a covert comment on the current Kubernetes vs. Docker Swarm competition.
Monroy goes on to issue something of a minor mea culpa, noting that the union with Microsoft continues Deis’s mission “to make container technology easier to use.”
And there’s the rub. It’s not always easy to use. We’ve got a zillion cloud services and providers giving us an overabundance of tools, languages, technologies, platforms, and techniques. For all the problems legacy monolithic architecture presented, programmers (we didn’t call them developers back then) and SysOps staffers had few components to manage.
Today, here we are with lots and lots of pieces that need to be assembled, like a mosaic, into something that runs flawlessly, performs perfectly, provides unfettered access, supports instant change, and provides a business advantage. What do you do with all these little shards? You put them into containers and orchestrate their deployment and management so they, like a symphony orchestra, play together and become a whole that together is greater than the individual parts. After all, there’s a reason Kubernetes describes itself as “production-grade container orchestration.”
Where do you fall into line when it comes to containers and orchestration? For all the talk, it seems lots of IT operations have yet to dip their collective toes into the containerization waters. How about you? Actively using container technology in production? Working with an early proof-of-concept mini-project? Learning but haven’t taken the plunge yet? Share your experiences (and concerns) with us; we’d like to hear from you.
With Apple’s early June Worldwide Developers Conference a little more than two months away, it’s time to get moving, if you haven’t already, on one of the big changes almost certainly coming to iOS 11 — the dropping of support for apps that are not written for 64-bit processors.
According to a report from metrics provider SensorTower, the number of ripe-for-banishment 32-bit apps in the Apple app store hovered around 170,000 as of mid-March 2017. A big number, indeed, but it represents only about 8% of the approximately 2.4 million apps currently available in the app store. The good news is that the other 92% of listed apps are already 64-bit compatible.
Perhaps not surprisingly, the category with the most non-conforming apps is games at nearly 39,000. That’s about 20.6% of all the problem apps and nearly double the number of apps in the next offending category, education, just shy of 20,000. Other categories with a significant number of non-64-bit apps include entertainment, lifestyle, business, books, utilities, travel, and music. Only two categories have fewer than 1,000 offending apps: weather and shopping.
The high number of problem game apps is, of course, a reflection of the number of gaming apps in the app store in the first place. Gaming apps, many of them positively awful and almost always free, are often the domain of teenagers learning how to write code and do design. It makes sense then that these apps are the ones most likely to be abandoned as their creators mature, their coding skills evolve, and they move on to weightier projects. No doubt some of those apps were written to run on the now-defunct Parse platform, a popular choice for game development, but were never migrated to another hosting environment and were left to wither on the vine.
While this purge is all about 64-bitness, it’s not the first time Apple has made an attempt to clean house. In the first nine months of 2016, Apple deleted roughly 14,000 apps per month, according to SensorTower. That changed drastically in September 2016 when Apple started to notify developers that it would remove apps it considered outdated or that did not adhere to various current guidelines. You got 30 days to fix your app or it would be removed. The company wasn’t kidding: in October 2016, the number of apps purged soared to more than 47,000.
The message is clear: If your app hasn’t been updated in eons, doesn’t comply with current standards, or is still mired in the 32-bit world, it’s headed for oblivion. Need some help to convert your app to a 64-bit binary? Fear not, Apple has an online guide that includes sample code. Better get busy.
Are your iOS apps up to date as 64-bit binaries? What difficulties did you encounter and how did you solve them? Do you have apps in the Apple app store that you simply chose to abandon? What tools do you use to build apps for iOS? Share your thoughts with us; we’d like to hear from you.
Cloud computing is a long way from being fully mature, but its obsolescence may already be upon us. Is the cloud’s future really up in the air?
As Peter Levine, a partner at venture capital firm Andreessen Horowitz, puts it, “Everything that’s popular in technology always gets replaced by something else,” be it Microsoft Windows, minicomputers exemplified by Digital Equipment Corp., specialized workstations typified by Sun Microsystems, or, yes, even cloud computing.
As Levine explains it, cloud computing, which he views as the centralization of IT workloads into a small number of super-mega-huge datacenters, is an unsustainable, unworkable, slow-to-respond method. The need for instantaneous information makes the network latency associated with a device-to-datacenter model and the corresponding datacenter-to-device return trip simply too long and therefore unacceptable.
Computing, Levine suggests, will move to a peer mesh of edge devices, migrating away from the centralized cloud model. Consider smart cars. They need to continually exchange information with each other about immediate, hyperlocal traffic conditions. Smart cars need to know that an accident occurred 10 seconds ago a half-mile up the road, that a pedestrian is entering a crosswalk, or that a traffic light is about to turn red. For this to work requires realtime data collection, processing, and sharing with other vehicles in the immediate area. The round-trip processing in the cloud model isn’t even remotely (pun intended) fast enough.
Text messaging is similar in that messages exchanged between people sitting just feet apart are still routed through a distant datacenter. It’s inefficient, slow (in compute terms), and unsustainable. The centralization is needed only for logging and journaling.
The answer, Levine postulates, is pushing processing and intelligence out to the edge, using many-to-many relationships among vehicles for information exchange, along with edge-based processing based on super-powerful machine-learning algorithms. No wonder he describes the self-driving car as “a datacenter on wheels.” Similarly, a drone is a datacenter with wings and a robot is a datacenter with arms and legs. They all need to process data in real time. The latency of the network plus the amount of information needing to travel renders the round-trip on the cloud unsuitable, though that’s still plenty fast enough for a Google search, he says.
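A little arithmetic shows why the round trip matters for a moving car. The Python sketch below computes how far a vehicle travels while it waits for an answer. The speed and the two round-trip times are assumptions I picked for illustration, not measurements from Levine's talk.

```python
def metres_travelled(speed_kmh, round_trip_ms):
    """Distance a vehicle covers while waiting for one network round trip."""
    metres_per_second = speed_kmh * 1000 / 3600
    return metres_per_second * round_trip_ms / 1000


# Assumed scenarios: a car at 100 km/h, with hypothetical round-trip times.
SCENARIOS = (("nearby peer", 5), ("distant datacenter", 100))

if __name__ == "__main__":
    for label, rtt_ms in SCENARIOS:
        distance = metres_travelled(100, rtt_ms)
        print(f"{label}: {distance:.2f} m travelled while waiting")
```

Under those assumptions the car covers nearly three metres before a distant datacenter replies, versus a few centimetres for a nearby peer. That gap, multiplied across every decision the car makes, is the case for the edge.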
The cloud still plays a role; data eventually needs to be stored, after all. That makes this model not fully edge and not fully cloud. It’s perhaps closer to what Cisco dubs “fog computing.” It also speaks to the inevitability of how IoT-driven smart cities must operate, a concept explained to me by Esmeralda Swartz, vice president of strategy and marketing at Ericsson.
There’s a profound irony to this. We started the age of IT (MIS as it was then known) with the IBM mainframe as the centralized place where all programs ran, all processing was done, and all data was stored. That was blown apart by decentralization, driven by the client/server model, Ethernet (or Token Ring or ARCnet), network operating systems (NetWare, VINES, LAN Manager, 3+ Open, Windows for Workgroups, Windows NT, OS/2 Warp, etc.) and early network-aware databases, such as Btrieve. Cloud computing swings the pendulum back to the centralized data model of the past, albeit with a dose of edge processing.
It’s throwing out everything you know and seeing it from a paradoxically different perspective, just like the young girl presented on Christmas morning with her great-grandmother’s heirloom wristwatch, only to declare, “A watch that doesn’t need batteries? Gee, what will they think of next!”
You can watch Levine’s presentation “Return to the Edge and the End of Cloud Computing” on YouTube.
No doubt you’ve already thought about this. Where do you think cloud computing is headed? Is this a technology that is ultimately doomed to be superseded by something different, better, faster, and cheaper? What does this mean for you as an application developer? Share your thoughts and fears; we’d like to hear from you.
Now that La La Land, er, Moonlight has won the Academy Award for best picture, this is as good a time as any to look back at some screw-ups in the world of cloud computing. May we all learn from our mistakes.
The Force is not with you: Take a trip back to May 9, 2016, less than a year ago. It was on that day the Silicon Valley NA14 instance of Salesforce.com went offline, a condition colloquially known as Total Inability To Support Usual Performance (I’m not going anywhere near the acronym). Customers lost several hours of data and the outage dragged on for nearly 24 hours. CEO Marc Benioff took to his Twitter account to ask for forgiveness. Shortly after, Salesforce moved some of its workloads to Amazon Web Services.
AWS giveth, AWS taketh away: Though transferring workloads to AWS helped Salesforce recover lost customer confidence (though not lost data), the opposite was true for Netflix. On Christmas Eve 2012, at a time when kids might be watching back-to-back-to-back showings of A Christmas Story, problems with AWS’s Elastic Load Balancing service caused Netflix to go down. This Grinch stole Christmas not just from little Cindy Lou Who, but from millions of paying subscribers waiting to see if Ralphie gets his dreamed-about Red Ryder BB rifle. Lessons were learned. Two years later, during a massive AWS EC2 update, Netflix rebooted 218 of its 2,700 production nodes. Alarmingly, 22 failed to reboot, but the Netflix service never went offline. At the opposing end, Dropbox went old school in March 2016, dumping AWS and moving its entire operation onto its own newly built, enormous infrastructure.
Those darn updates’ll getcha every time: Amid verdant woodlands, beneath pure azure skies, protected by mountains, our cloud service lies. That bucolic portrait of the Pacific Northwest (or New Hampshire, perhaps) mattered little to Microsoft on Nov. 18, 2014 when the Azure Storage Service suffered a widespread outage traced back to the tiered rollout of software updates intended to improve performance. “We discovered an issue that resulted in storage blob front ends going into an infinite loop, which had gone undetected…” was the blogged explanation. Another major outage occurred in Dec. 2015.
Eat in, Dyn out: The Oct. 21, 2016 wave of coordinated distributed denial-of-service attacks targeting Domain Name System provider Dyn impacted dozens of high-profile businesses to varying degrees. These included Airbnb, Twitter, Amazon, Ancestry, Netflix, PayPal, and a long list of others. Dyn’s own detailed post-mortem of the attack makes for fascinating reading. If you think it’s impossible for millions of geographically far-flung, seemingly unrelated IoT devices to attack in a coordinated manner, think again.
You’ve heard of Office 360? Sure you have. The name is favored among cynics who joke that Microsoft’s cloud-based productivity software should be called that because it is offline five days out of every year. Office 365’s e-mail service was down for many users for about 12 hours on June 30, 2016. That follows other outages in various geographies on Dec. 3, 2015; Dec. 18, 2015; Jan. 18, 2016; and Feb. 22, 2016.
Got healthcare? We all know the stories about how healthcare.gov kept crashing due to poor design, inadequate compute resources, demand that vastly exceeded expectations, and so on. Enough said.
What’s that one cloud disaster story you’ve been dying to share? Now’s your chance. Tell us all about it; we’d like to hear from you.