Enterprise cloud users may have written off the spate between gaming companies Improbable and Unity as nothing more than industry in-fighting, but there are some cautionary tales emerging from the dispute they would do well to take note of, argues Caroline Donnelly.
The bun fight that has broken out over the last day or so between UK-based gaming company Improbable and its long-time technology partner Unity has (most likely) gone unnoticed by the majority of enterprise cloud users.
After all, what does a terms and conditions (T&Cs)-related dispute between two gaming firms have to do with them?
But, within the claims and counter claims being bandied about online by both parties (and other assorted gaming industry stakeholders) are some important points that should give enterprises, in the throes of a cloud migration, some degree of pause.
In amongst all the bickering (which we will get on to shortly) there are cautionary notes to be found about why it is so important for enterprises not to rush blindly into the cloud, and to take steps to protect themselves from undue risk. To ensure they won’t be left high and dry if, for example, their preferred cloud partner should alter their service terms without due warning.
From a provider standpoint too, the case also highlights a problem lots of tech firms run into, where their products end up being used in ways they hadn’t quite planned on that might – in turn – negatively affect their commercial interests or corporate ethics.
Improbable vs. Unity: The enterprise angle
For those who haven’t been following (or simply have no clue who either of these two firms are), Improbable is the creator of a game development platform called SpatialOS that some gaming firms pair with Unity’s graphics engine to render their creations and bring them to life.
But, since Thursday 10 January 2019, the two companies have been embroiled in an online war of words. This started with the emergence of a blog by Improbable that claimed any game built using the Unity engine and SpatialOS is now in breach of Unity’s recently tweaked T&Cs.
This could have serious ramifications for developers who have games in production or under development that make use of the combined Unity-Improbable platform, the blog post warned.
“This is an action by Unity that has immediately done harm to projects across the industry, including those of extremely vulnerable or small-scale developers and damaged major projects in development over many years.”
Concerns were raised, in response to the blog, on social media that this might lead to some in production games being pulled, the release dates of those in development being massively delayed, and some development studios going under as a result.
Unity later responded with a blog post of its own, where it moved to address some of these concerns by declaring that any existing projects that rely on the combined might of Unity and Improbable to run will be unaffected by the dispute.
When users go rogue
The post also goes on to call into question Improbable’s take on the situation, before accusing the firm of making “unauthorised and improper use” of Unity’s technology and branding while developing, selling and marketing its own wares ,which is why it is allegedly in breach of Unity’s T&Cs.
“If you want to run your Unity-based game-server, on your own servers, or a cloud provider that [gives] you instances to run your own server for your game, you are covered by our end-user license agreement [EULA],” the Unity blog post continued.
“However, if a third-party service wants to run the Unity Runtime in the cloud with their additional software development kit [SDK], we consider this a platform. In these cases, we require the service to be an approved Unity platform partner.
“These partnerships enable broad and robust platform support so developers can be successful. We enter into these partnerships all the time. This kind of partnership is what we have continuously worked towards with Improbable,” the blog post concludes.
Now, the story does not stop there. Since Unity’s blog post went live, Improbable has released a follow-up post. It has also separately announced a collaboration with Unity rival Epic Games that will see the pair establish a $25m fund to help support developers that need to migrate their projects “left in limbo” by the Unity dispute. There will undoubtedly be more to come before the week is out.
Calls for a cloud code of conduct
While all that plays out, though, there is an acknowledgement made in the second blog post from Improbable about how the overall growth in online gaming platforms is putting the livelihoods of developers in a precarious position, and increasingly at the mercy of others.
Namely the ever-broadening ecosystem of platform providers they have to rely upon to get their games out there. And there are definite parallels to be drawn here for enterprises that are in the throes of building out their cloud-based software and infrastructure footprints too.
“As we move towards more online, more complex, more rapidly-evolving worlds, we will become increasingly interdependent on a plethora of platforms that will end up having enormous power over developers. The games we want to make are too hard and too expensive to make alone,” the blog post reads.
“In the near future, as more and more people transition from entertainment to earning a real income playing games, a platform going down or changing its Terms of Service could have devastating repercussions on a scale much worse than today.”
The company then goes on to make a case for the creation of a “code of conduct” that would offer developers a degree of protection in disputes like this, by laying down some rules about what is (and what is not) permissible behaviour for suppliers within the ecosystem to indulge in.
There are similar efforts afoot within the enterprise cloud space focused on this, led by various governing bodies and trade associations. Yet still reports of providers introducing unexpected price hikes or shutting down services with little notice still occur. So one wonders if a renewed, more coordinated push on this front might be worthwhile within the B2B space as well?
In this guest post, DevOps consultant and researcher Dr. Nicole Forsgren, PhD, tells enterprises why execution is key to building high-performing technology teams within their organisations.
I often meet with enterprise executives and leadership teams, and they all want to know what it takes to make a high-performing technology team.
They’ve read my book Accelerate: The Science of Lean Software and DevOps and my latest research in the Accelerate: State of DevOps 2018 Report (both co-authored with Jez Humble and Gene Kim), but they always ask: “If I could narrow it down to just one piece of advice to really give an organisation an edge, and help them become a high performer, what would it be?”
My first (and right) answer is that there is no one piece of advice that will apply to every organisation, since each is different, with their own unique challenges.
The key to success is to identify the things, such as technology, process or culture, currently holding the organisation back, and work to improve those until they are no longer a constraint. And then repeat the process.
This model of continuous improvement, which is actually identifying strategy, is the most effective way to accelerate an organisational or technology transformation.
However, in looking over this year’s research and checking back in with some of my leadership teams, I realised that one piece of advice would be applicable to everyone, and that is: “High performance is available to everyone, but only if they are willing to put in the work; it all comes down to execution.”
Yes, that’s right – basic execution. While revolutionary ideas and superior strategy are important and unique to your business, without the ability to execute and deliver these to market, your amazing ideas will remain trapped.
Either trapped in a slide deck or in your engineering team, without ever reaching the customer where they can truly make a difference.
If you still aren’t convinced, let’s look at some examples: cloud computing and database change management.
Cloud computing: Done right
Moving to the cloud is often an important initiative for companies today. It is also often something technical executives talk about with frustration: an important initiative that hasn’t delivered the promised value. Why is that? Lack of execution.
In the 2018 Accelerate State of DevOps Report, we asked respondents – almost 1,900 technical experts – if their products or applications were in the cloud. Later in the survey, we asked details about these products and applications they supported. Specifically, these covered five technical characteristics regarding their cloud usage:
- On-demand self-service. Consumers can provision computing resources as needed, automatically, without any human interaction required.
- Broad network access. Capabilities are widely available and can be accessed through heterogeneous platforms (e.g., mobile phones, tablets, laptops, and workstations).
- Resource pooling. Provider resources are pooled in a multi-tenant model, with physical and virtual resources dynamically assigned and reassigned on-demand. The customer generally has no direct control over the exact location of provided resources, but may specify location at a higher level of abstraction (e.g., country, state, or datacentre).
- Rapid elasticity. Capabilities can be elastically provisioned and released to rapidly scale outward or inward commensurate with demand. Consumer capabilities available for provisioning appear to be unlimited and can be appropriated in any quantity at any time.
- Measured service. Cloud systems automatically control and optimise resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported for transparency.
The research revealed 22% of respondents (who said they were in the cloud) also said they were following all five of these essential practices. Why is this important? Because these practices are what make cloud computing, according to the National Institute of Standards and Technology’s (NIST) own definition.
And so, it follows, if teams are not executing on all five of them, they aren’t actually doing cloud. This is not necessarily the fault of the technical teams, but the leadership team, who set initiatives for the organisation and define targets.
Now, why does this matter? We found teams that met all five characteristics of cloud computing were 23 times more likely to be elite performers. This means they are able to develop and deliver software with speed, stability, and reliability — getting their ideas to their end users.
So, if you want to reap the benefits of the cloud and leverage it to really drive value to your customers, you have to execute.
Database: A hidden constraint
Database changes are often a major source of risk and delay when performing deployments so in this year’s research we also investigated the role of database change management.
Our analysis found that integrating database work into the software delivery process positively contributed to continuous delivery and a significant predictor of delivering software with speed, stability, and reliability.
Again, these are key to getting ideas and features (or even just security patches) to end users. But what do we mean by database change management? There are four key practices included in doing this right:
- Communication. Notify upcoming database and schema changes with developers, testers, and the people that maintain your database.
- Including teams. This goes a step beyond discussion to really including the teams involved in software delivery in discussions and designs. Involving all players is what DevOps is all about, and it will make these changes more successful.
- Comprehensive configuration management. This means including your database changes as scripts in version control and managing them just as you do your other application code changes.
- Visibility. Making sure everyone in the technical organisation has visibility to the progress of pending database changes is important.
To restate a point from above, when teams follow these practices, database changes don’t slow software teams down or cause problems when they perform code deployments.
These may look similar to, well, DevOps: integrating functionality throughout the pipeline and shifting left, and that’s exactly what it is.
The great news about this is we can take things we have learned about integrating and shifting left on other parts of our work (like IT operations and security) and apply them to the database.
Our analysis this year also found that database is equally important for teams at every stage of their journey, whether they were low, medium, or high performers.
We see this in our conversations in the industry: data is a hard problem and one that eventually needs to be tackled… and one that needs to be tackled with correct practices.
Particularly in data- and transaction-heavy environments, teams and organisations can leverage their data to drive value to their customers – but only when they execute on the key practices necessary.
Execution: Both simple and hard
For anyone who skips to the bottom of an article looking for the summary, here you go: execute on the basics and don’t skip steps. I acknowledge this is often difficult, but the data is definitive: teams and organisations that don’t fully execute don’t realise the benefits. The great news is this: excellence is available to everyone. Good luck!
In this guest post, Jens Struckmeier, founder and CTO of cloud service provider Cloud&Heat Technologies, shares his views on the ecological and technological challenges caused by Europe’s booming datacentre market.
People have become accustomed to many digital conveniences, thanks to the emergence of Netflix, Spotify, Alexa and YouTube. What many may not know is that with every crime series they stream online, they generate data streams that also gnaw on our environment due to their energy consumption.
Most of the data generated in this way flows through the cloud, which is made up of high-performance datacentres that relieve local devices of most of the computing work. These clouds now account for almost 90 percent of global data traffic. More and better cloud services also mean more and more powerful data centres. These in turn consume a lot of energy and generate a lot of waste heat – this is not a good development from an ecological and economic point of view.
Information and Communication Technology (ICT) with its datacentres now account for three-to-four percent of the world’s electric power consumption, estimates the London research department of Savills. Furthermore, from a global perspective, it is thought they consume more than 416 terawatt hours (TWh) of electricity a year – about as much as the whole of France. And that’s just the tip of the iceberg.
The Internet of Things, autonomous vehicles, the growing use of artificial intelligence (AI), the digitalisation of public administration, and the fully automated networked factories of industry 4.0 are likely to generate data floods far beyond today’s levels.
According to Cisco, global data traffic will almost double to 278 exabytes per month by 2021. This means that even more datacentres will be created or upgraded, with a trend towards particularly large, so-called hyperscale datacentres with thousands of servers.
Why datacentre location matters
In competitive terms, the rural regions of Northern Europe benefit from these trends. In cool regions such as Scandinavia, power is available in large quantities and at low cost.
Additionally, the trend towards hyperscale datacentres means the demand for decentralised datacentres near locations where the processed data is also needed is growing. If the next cloud is physically only a few miles away, there are only short delays between action and reaction, and the risk of failure is also reduced.
With a volume of over £20 billion, the UK is Europe’s largest target market for datacentre equipment. When all construction and infrastructure expenses are added together, Savills estimates that 41% of all investments in the UK were made between 2007 and 2017.
The Knowledge Transfer Network (KTN) estimates annual growth at around 8%. There are currently 412 facilities of different sizes, followed by Germany with 405 and France with 259. For comparison: 2,176 data centres are concentrated in the home country of the Internet Economy, the USA.
In a comparison of European metropolises, London is the capital of data centres. The location has twice the computing capacity of Frankfurt, the number two on the market. According to Savills, however, the pole position could change after the Brexit. Either way, London is likely to remain a magnet as an important financial and technology hub, with the location benefiting from existing ecosystems and major hubs such as Slough.
Environmental pressure on the rise
However, as the European leader in terms of location, the UK also has to deal particularly intensively with the energy and environmental challenges in this sector.
Overall, the larger datacentres consume around 2.47 terawatt hours a year, claims techUK. This corresponds to about 0.76 percent of total energy production in the UK. In London alone, capacities with a power consumption of 384 megawatts were installed in 2016, which corresponds roughly to the output of a medium-sized gas-fired power plant such as the Lynn Power Station in Norfolk.
In view of the strong ecological and economic challenges associated with this development, around 130 data centre operators have joined the Climate Change Agreement (CCA) for datacentres.
In this agreement, they voluntarily commit themselves to improving the power usage effectiveness (PUE) of their facilities by 15 percent by 2023.
PUE stands for the ratio between the total energy fed into the grid and the energy consumption of the computer infrastructure itself. In the UK, the average PUE is around 1.9: almost twice as much energy is used for cooling the data centres as for actual operation.
Smarter power use
There are several ways to use this energy more intelligently: while older, smaller datacentres are often air- or water-cooled, resource-conscious operators rely on thermally more effective hot water-cooling systems.
The computer racks are not cooled with mechanically tempered water or air, but with water that is 40 degrees warm, for example, and the cooling process makes it five degrees warmer, for example. In the northern parts of Europe, where the outside temperature hardly ever rises above 40 degrees, the water can then quite simply be cooled down to the inlet temperature outside.
Because water or air do not have to be cooled down mechanically, the energy savings are significant: Compared to the cold-water version, hot water cooling consumes up to 77 percent less electricity, depending on the selected inlet temperature and system technology.
The benefits of datacentre heat reuse
Modern technological approaches make it possible to utilise waste heat from datacentres that has previously been cooled away in a costly and time-consuming process. Engineers link the cooling circuits of the racks directly with the heating systems of office or residential buildings.
Power guzzlers can thus become heating and power stations for cities, which in future will not only be climate-neutral but also climate-positive. This not only saves energy and money, but also significantly reduces carbon dioxide emissions.
If all datacentres in the UK were to use this technology, up to 530,000 tons of carbon dioxide could be saved annually. This would correspond to 7,500 football pitches full of trees or a forest the size of Ipswich.
Green datacentres and Sadiq Khan’s environmental goals
Many market analysts and scientists are convinced that these solutions will also have a political resonance, as they open fascinating new scope for local solutions and regional providers.
This is ensured by internet law and data protection regulations, but also by technological necessities. It is foreseeable, for example, that concepts such as edge cloud computing, fog computing and the latency constraints mentioned will soon make mobile and local micro supercomputers take over some of the tasks of traditional mainframe computing centres.
Market participants who are prepared to face all these trends in good time are well advised. Digitalisation cannot be stopped – instead, it is necessary to actively shape a sustainable, green digital future.
The special role of datacentres in this digital change is becoming more and more relevant for society in general.
The ambitious goals recently formulated by London Mayor Sadiq Khan to reduce environmental pollution and carbon dioxide emissions will be a real challenge for operators.
By 2020, for example, all real estate will use only renewable energy. By 2050, London is to become a city that no longer emits any additional carbon dioxide. Datacentre operators will inevitably have to rely on sustainable, energy-efficient technologies.
In this guest post, Neil Briscoe, CTO of hybrid cloud platform provider 6point6 Cloud Gateway, sets about busting some common cloud myths for enterprise IT buyers
The cloud provides endless opportunity for businesses to become more agile, efficient and – ultimately – more profitable. However, the hype around cloud has made firms so desperate for the action, they’ve become blinded to the fact it might not be the right move for every company.
Many firms get sucked into the benefits of the cloud, without truly understanding what the cloud even is. I mean, would you buy a car without knowing how to drive it?
The eight fallacies of distributed computing guided many leaders into making the right decision for their business when adopting new systems. However, as the cloud continues to evolve, it can be hard to keep up.
With this in mind, here are some of commonly recurring myths that need to be busted about cloud, so businesses can make sound decisions on what is right for them.
Myth one – You have to go all-in on cloud
As with any hyped innovation, cloud is being positioned by many as a ‘fix all solution’ and somewhere that businesses can migrate all their systems to in one swoop.
In reality, many services are at different stages of their life cycle, and not suitable for full migration, and running legacy systems are no bad thing.
It’s important that each system is fit for purpose, so shop around and see exactly which one fits with your strategic roadmap. By trying to fit a square peg in a round hole, you open yourself up to all sorts of security and operational issues.
Myth two – Cloud is cheaper than on-premise
Datacentres are notoriously expensive but the hidden costs in maintaining a cloud infrastructure can be deceptively high. Don’t get distracted by the desirably low initial CAPEX and onboarding costs for migrating to the cloud, as the ongoing OPEX can shoot you in the foot long-term.
Public cloud service providers like AWS are now being much more transparent with their Total Cost of Ownership (TCO) calculator, and the numbers are quite accurate as long as you have discipline. This in turn is reducing the need for enterprises to invest in large capital expenditures and means they only pay for what you need and use.
By having a long-term plan and understanding where, why and how the cloud will benefit your business, this will – in-turn – reduce your costs. Migrating the whole datacentre will not.
Myth three – Cloud is more secure than on-premise
On-premise datacentres are clear. You have a door to go in and out of, and you can have a reasonable grasp of where your external perimeter is, and then secure it. In the cloud, there are infinite doors and windows for a malicious actor to exploit, if not secured properly.
Cloud providers give you the tools to secure your data and infrastructure probably better than you can with on-premise datacentre tools, as long as they’re used correctly.
It’s very easy to build a service and make it completely public-facing without having any resilience or security at all. Don’t assume that all cloud providers will create such high levels of security in the architecture “by default”. Sometimes you must dig deep and truly understand what is happening under the hood.
Myth four – One cloud provider is all you need
Don’t be afraid to go multi-cloud. Each system is different, and you have the power to choose the best-of-breed system for all workloads, even if this means using more than one provider.
As cloud is relatively new, people are still experimenting and figuring out what suits them best. By going multi-cloud, enterprises aren’t restricted to one particular operator and can get best-for-the-job services without sacrificing on agility. Never sacrifice on agility. You can have your cake and eat it.
Myth five – It is better to build, not buy
Building a private network and connectivity platform in the cloud sounds desirable; you build the best platform to serve your requirements without answering to an external vendor. However, building your own network isn’t as easy as 1,2,3 and once you’ve built it, maintaining it is the hardest step.
Talent is in high demand, so ensuring you can keep your talent in house to keep developing and improving your network without letting them escape is an ongoing challenge. It’s a task to ensure continuity.
By “buying”, it’s possible to remove the headache of connectivity, security and network agility of a potentially complex architecture and will allow enterprises to focus on the actual digital transformation itself without draining resources for, what is becoming, a commodity IT item.
In closing, while cloud isn’t a new innovation, enterprises and suppliers alike are continuously developing and understanding how it can benefit business.
Enterprises need to de-sensitise themselves from the hype and investigate exactly how and why the cloud will improve efficiency, competitiveness and reduce costs.
Cloud is changing every day and everyone’s experiences are different, but keeping your eyes and ears open to the numerous benefits but also the threats it can create will ensure that enterprises use the cloud effectively.
In this guest post, Chris Hodson, chief information security officer for Europe, Middle East and Africa (EMEA) at internet security firm Zscaler, takes a closer look at why cloud security remains such an enduring barrier to adoption for so many enterprises.
Cloud computing is growing in use across all industries. Multi-tenant solutions often provide cost savings whilst supporting digital transformation initiatives, but concerns about the security of cloud architectures persist.
Some enterprises are still wary using cloud because of security concerns, even though the perceived risks are rarely unique to the cloud. Indeed, my own research shows more than half of the most appropriate vulnerabilities with cloud usage (identified by the European Union Agency for Network and Information Security (ENISA)) would be present in any contemporary technology environment.
Weighing up the cloud security risk
To establish the risks, beneﬁts and suitability of cloud, we need to define the various cloud deployments, and operational and service models in use today and in the future.
A good analogy is that clouds are like roads; they facilitate getting to your destination, which could be a network location, application or a development environment. No-one would enforce a single, rigid set of rules and regulations for all roads – many factors come into play, whether it’s volume of trafﬁc, the likelihood of an accident, safety measures, or requirements for cameras.
If all roads carried a 30 mile per hour limit, you might reduce fatal collisions, but motorways would cease to be efﬁcient. Equally, if you applied a 70 mile per hour limit to a pedestrian precinct, unnecessary risks would be introduced. Context is very important – imperative in fact.
The same goes for cloud computing. To assess cloud risk, it is vital that we define what cloud means. Cloud adoption continues to grow, and as it does, such an explicit delineation of cloud and on-premise will not be necessary.
Is the world of commodity computing displacing traditional datacentre models to such an extent that soon all computing will be elastic, distributed and based on virtualisation? On the whole, I believe this is true. Organisations may have speciﬁc legacy, regulatory or performance requirements for retaining certain applications closer-to-home, but these will likely become the exception, not the rule.
Consumers and businesses continue to beneﬁt from the convenience and cost savings associated with multi-tenant, cloud-based services. Service-based, shared solutions are pervasive in all industry verticals and the cloud/non-cloud delineation is not a suitable method of performing risk assessment.
Does cloud introduce new cloud security risks?
According to the International Organisation for Standardisation (ISO), risk is “the effect of uncertainty on objectives.” So, does cloud introduce new forms of risk which didn’t exist in previous computing ecosystems? It is important that we understand how many of these are unique to the cloud and a result of the intrinsic nature of cloud architecture.
Taking an ethological or sociological approach to risk perception, German psychologist Gerd Gigerenzer asserts that people tend to fear what are called “dread risks”: low-probability, high-consequence but with a primitive, overt impact. In other words, we feel safer with our servers in our datacentre even though we would (likely) be better served to leave security to those with cyber security as their core business.
The cloud is no more or less secure than on-premise technical architecture per se. There are entire application ecosystems running in public cloud that have a defence-in-depth set of security capabilities. Equally, there are a plethora of solutions that are deployed with default conﬁgurations and patch management issues.
Identifying key cloud security risks
ENISA, which provides the most comprehensive and well-constructed decomposition of what it considers the most appropriate vulnerabilities with cloud usage, breaks cloud vulnerabilities into three areas:
- Policy and organisational – this includes vendor lock-in, governance, compliance, reputation and supply chain failures.
- Technical – this covers resource exhaustion, isolation failure, malicious insider, interface compromise, data interception, data leakage, insecure data deletion, denial of service (DDoS) and loss of encryption keys
- Legal – such as subpoena and e-discovery, changes of jurisdiction, data protection risks and licensing risks
The fact is, however, that most of these vulnerabilities are not unique to the cloud. Instead, they are the result of a need for process change as opposed to any technical vulnerability. The threat actors in an on-premise and public cloud ecosystem are broadly similar.
An organisation is idiomatically only as strong as its weakest link. Whilst it is prudent to acknowledge the threats and vulnerabilities associated with public cloud computing, there are a myriad of risks to the conﬁdentiality, integrity and availability which exist across enterprise environments.
Ultimately, when it comes to the cloud it’s all about contextualising risk. Businesses tend to automatically think of high profile attacks, such as the Spectre meltdown, but the chances of this type of attack happening is extremely low.
Organisations undoubtedly need to assess the risks and make necessary changes to ensure they are compliant when making the move to the cloud, but it is wrong to assume that the cloud brings more vulnerabilities – in many situations, public cloud adoption can improve a company’s security posture.
In this guest post, Allan Brearley, cloud practice lead at IT services consultancy ECS, advises enterprises to start small to achieve big change in their organisations with cloud.
The success of Elon Musk’s Falcon Heavy rocket (and its subsequent return to earth) wowed the world. Imprinted on everyone’s memory, thanks to the inspired addition of the red Tesla Roadster as its payload blasted out Bowie’s Life on Mars on repeat, the mission demonstrates the “start small, think big, learn fast” ethos evangelised by that other great American innovator, Steve Jobs.
And this “start small, think big” ethos can equally be applied to cloud-based transformation projects.
Making the right decisions at the right time is key. Musk understands this, expounding the need to focus on evidence-based decision making to come up with a new idea or solve a problem.
While thinking big about rocket innovation, he committed to three attempts, setting strict boundaries and budgets for each phase. This meant he didn’t waste cash or resources on something that wouldn’t work.
Coming back down to earth, IT teams tasked with moving workloads to the cloud (rather than putting payloads into space) can learn a lot from this approach to innovation.
For organisations not born in the cloud, the decision to bring off-premise technologies into the mix throws up some tough questions and challenges. As well as the obvious technical considerations, there are other hoops to jump through from a cost, risk, compliance, and regulatory point of view, to ensuring you have suitably qualified people in place.
Lay the groundwork for cloud
It is quite common for highly-regulated businesses with complex infrastructure to enter a state of analysis paralysis at the early stages of a cloud transformation due to the sheer scale and difficulty of the task ahead.
Instead of pausing and getting agreement on a “start small” strategy, they feel compelled to bite off more than they can chew and “go large”.
At this point, enterprises often go into overdrive to establish the business case, scope the project, and formulate the approach all in one fell swoop. But this is likely to result in the cloud programme tumbling back to earth with a bang.
It is simply impossible on day one to plan and build a cloud platform that will be capable of accepting every flavour of application, security posture and workload type.
As with any major transformation project, the cultural and organisational changes will take considerable time and effort. Getting your priorities straight at the outset and straightening out your people, process, and technology issues is critical. This involves getting closer to your business teams, being champions for change, and up-skilling your workforce.
A shift in cloud culture
Moving to the cloud often heralds a shift in company culture, and it’s important to consider up front how the operating model will adapt and what impact this will have across the business.
Leadership needs to prepare for the shift away from following traditional waterfall software development processes to embracing agile and DevOps methodologies that will help the business make full use of a cloud environment, while speeding up the pace of digital transformation.
Cloud ushers in a new way of working that cuts swathes across enterprises’ traditional set-up of infrastructure teams specialising in compute, network, and storage etc. Care must be taken to ensure these individuals are fully engaged, supported, trained and incentivised to ensure a successful outcome.
Start small to achieve big things
Starting with a small pathfinder project is a good strategy, as it allows you to lay the foundations for the accelerated adoption of cloud downstream, as well as for migration at scale – assuming this is the chosen strategy.
Suitable pathfinder candidates might be an application currently hosted on-premise that can be migrated to the cloud, or a new application/service that is being developed in-house to run in the cloud from day one.
Once the pathfinder project is agreed upon, the race is on to assemble a dream team of experts to deliver a Minimum Viable Cloud (MVC) capability that can support the activity; this team will also establish the core of your fledgling cloud Centre of Excellence (CoE).
Once built, the MVC can be extended to support more complex and demanding applications that require more robust security measures.
The CoE team will also be on hand to support the construction of the business case for a larger migration. This includes assessing suitable application migration candidates, and grouping them together into virtual buckets, based on their suitability for cloud.
These quick wins will help to convert any naysayers and secure stakeholder buy-in across the business ahead of a broader cloud adoption exercise. They are also a powerful way to get the various infrastructure teams on side, providing as they do a great platform for re-skilling and opening up fresh career opportunities to boot.
In summary, taking a scientific approach to your cloud journey and moving back and forth between thinking big and moving things forward in small, manageable steps will help enable, de-risk and ultimately accelerate a successful mass migration.
As both the space race and the cloud race heat up, it’s good to remember the wise words of the late great Steve Jobs: “Start small, think big. Don’t worry about too many things at once. Take a handful of simple things to begin with, and then progress to more complex ones. Think about not just tomorrow, but the future. Put a ding in the universe.”
To coincide with the first day of the Google Cloud Next 2018 conference (taking place from 24-26 July) in San Francisco, John Jainschigg, content strategy lead at enterprise systems monitoring software provider Opsview shares his views on what datacentre operators can learn from the search giant’s site reliability engineers.
The noughties witnessed many experimental breakthroughs in technology, from the introduction of the iPod to the launch of YouTube. This era also saw a fresh-faced Google, embarking on a quest to expand its portfolio of services beyond search. Much like any highly ambitious, innovative technology initiative, the firm encountered a number of challenges along the way.
In response, Google began evolving a discipline called Site Reliability Engineering (SRE), about which they published a very useful and fascinating book in 2016. SRE and DevOps share a lot of conceptual and an increasing amount of practical DNA; particularly true since cloud software and tooling have now evolved to enable ambitious folks to begin emulating parts of Google’s infrastructure using open source software like Kubernetes.
Google has used the statement “class SRE implements DevOps” to title a new (and growing) video playlist by Liz Fong-Jones and Seth Vargo of Google Cloud Platform, showing how and where these disciplines connect, while nudging DevOps practitioners to consider some key SRE insights, including the following.
- The normality of failure: It is near impossible to produce 100% uptime for a service. Therefore, expecting such a high success-rate is expensive, and pointless, given the existence of masking error rates among your service’s dependencies.
- Ensure your organisation agrees on its Service Level Indicators (SLIS) and Objectives (SLOs): Since failure is normal, you need to agree across your entire organisation what availability means; what specific metrics are relevant in determining availability (SLIs); and what acceptable availability looks like, numerically, in terms of these metrics (SLOs).
- Create an ‘error budget’ using agreed-upon SLOs: SLO is used to define what SREs call the “error budget” which is a numeric line in the sand (such as minutes of service downtime acceptable per month). The error budget is used to encourage collective ownership of service availability and blamelessly resolve disputes about balancing risk and stability. For example, if programmers are releasing risky new features too frequently, compromising availability, this will deplete the error budget. SREs can point to the at-risk error budget, and argue for halting releases and refocusing coders on efforts to improve system resilience.
The error budget point is important because it lets the organisation as a whole effectively balance speed/risk with stability. Paying attention to this economy encourages investment in strategies that accelerate the business while minimising risk: writing error- and chaos-tolerant apps, automating away pointless toil, advancing by means of small changes, and evaluating ‘canary’ deployments before proceeding with full releases.
Monitoring systems are key to making this whole, elegant tranche of DevOps/SRE discipline work. It’s important to note (because Google isn’t running your datacentre) this has nothing to do with what kind of technologies you’re monitoring, with the processes you’re wrangling, or with the specific techniques you might apply to stay above your SLOs. In short, it makes just as much sense to apply SRE metrics discipline to conventional enterprise systems as it does to twelve-factor apps running on container orchestration.
So with that in mind, these are the main things that Google SRE can teach you about monitoring:
- Do not over-alert the user: Alert exhaustion is a real thing, and paging a human is an expensive use of an employee’s time.
- Be smart by efficiently deploying monitoring experts: Google SRE teams with a dozen or so members typically employ one or two monitoring specialists. But they don’t busy these experts by having them stare at real-time charts and graphs to spot problems: that’s a kind of work SREs call ‘toil’ — they think it’s ineffective and they know it doesn’t scale.
- Clear, real-time analysis, with no smoke and mirrors: Google SREs like simple, fast monitoring systems that help them quickly figure out why problems occurred, after they occurred. They don’t trust magic solutions that try to automate root-cause analysis, and they try to keep alerting rules in general as simple as possible, without complex dependency hierarchies, except for (rare) parts of their systems that are in very stable, unambiguous states.
- The value of far-reaching “white box” monitoring: Google likes to perform deeply introspective monitoring of target systems grouped by application. Viewing related metrics from all systems (e.g., databases, web servers) supporting an application lets them identify root causes with less ambiguity (for example, is the database really slow, or is there a problem on the network link between it and the web host?)
- Latency, traffic/demand, errors, and saturation: Part of the point of monitoring is communication, and Google SREs strongly favour building SLOs (and SLAs) on small groups of related, easily-understood SLI metrics. As such, it is believed that measuring “four golden signals” – latency, traffic/demand, errors, and saturation – can help pinpoint most problems, even in complex systems such as carrier orchestrators with limited workload visibility. It’s important to note, however, that this austere schematic doesn’t automatically confer simplicity, as some monitoring makers have suggested. Google notes that ‘errors’ are intrinsically hugely diverse, and range from easy to almost impossible to trap; and they note that ‘saturation’ often depends on monitoring constrained resources (e.g., CPU capacity, RAM, etc.) and carefully testing hypotheses about the levels at which utilisation becomes problematic.
Ultimately, an effective DevOps monitoring system must entail far more than do-it-yourself toolkits. While versatility and configurability are essential, more important is the ability of a mature monitoring solution to provide distilled operational intelligence about specific systems and services under observation, along with the ability to group and visualise these systems collectively, as business services.
In this guest post Paul Timms, managing director of IT support provider MCSA, shares his thoughts on why enterprises can ill-afford to overlook the importance of business continuity and disaster recovery.
With downtime leading to reputational damage, lost trade and productivity loss, organisations are starting to realise continuity planning and disaster recovery are critical to success.
Business continuity needs to be properly planned, tested and reviewed in order to be successful. For most businesses, planning for disaster recovery will raise more questions than answers to begin with, but doing the hard work now will save a lot of pain in the future.
Ascertain the appetite for risk
All businesses are different when it comes to risk. While some may view a ransomware attack as a minor inconvenience, dealt with by running on paper for a week while they rebuild systems, whereas others view any sort of downtime as unacceptable.
The individual risk appetite of your organisation will have a significant impact on how you plan and prepare for business continuity. You will need to consider your sector, size, and attitude towards downtime, verses cost and resources. Assessing this risk appetite will let you judge where to allocate resources, and focus your priorities.
Plan, plan and plan some more
To properly plan for disaster recovery, it is critical to consider all aspects of a business continuity event, together with the impact of it, and how to mitigate these.
For example, if power goes down in the organisation’s headquarters, so will the landline phones, but mobiles will still be functional. A way to mitigate this impact would be to reroute the office phone number to another office or a mobile at headquarters. To do that you need to consider where you store the information about how to do that, and who knows where it is.
This is just one example. You need to consider all the risks, and all the technology and processes that you use. Consider the plan, the risk, the solution and where you need to invest and strengthen your plan to ensure your business can still function in the event of a disaster.
Build in blocks and test rigorously
Ideally IT solutions will be built and tested in blocks, so you can consider the risks and solutions in a functional way. You can consider for example your network, WAN/LAN, storage and data protection.
Plans for each then need to be tested in a rigorous way with likely scenarios. What if, for example, a particular machine fails? What happens if the power supply cuts out? What happens in the case of a crypto virus? Do you have back-ups? Are they on site? Do you have a failover set-up in case of system failure? Is the second system on site or in a completely different geography? What do I with my staff – can they work from home? Are there enough laptops?
These will drive out and validate (or not) assumptions on managing during a business continuity event. For example if your company is infected with a crypto virus and has infected the main servers, it will also have replicated across to your other sites, therefore your only option is to restore from back-ups or have a technology solution that allows you to roll back before the virus was unleashed.
Cloud is not the only answer
It can be tempting to think cloud can solve all the problems, but that is not always the case. Data in the cloud is not automatically backed up and is not necessarily replicated to a second site. These are options on some public cloud services, but they are often expensive and under used, as a result.
Despite being cloud-based, a company’s Office365 environment can still get a virus and become corrupted. If you have not put the IT systems in place to back that data up, then it will be lost. If for example, the cloud goes down, you need to consider a failover system.
The interesting part of this is the public cloud doesn’t go down very often, but when it does it is impossible to tell you how long it will be out of action for. Businesses must therefore consider when to invoke disaster recover in that instance.
Know your core systems
One solution that some companies adopt is running core systems and storage on infrastructure that they know and trust. This means knowing where it is and what it is, so it meets their performance criteria. Businesses also consider how this system is backed up including what network it is plugged into, ensuring it has a wireless router as standby, that the failover system is at least 50 miles away on a separate IP provider, that the replication is tested and that the data protection technology works and is tested.
This gives business much better knowledge and control in a business continuity event such as a power outage. Businesses can get direct feedback about the length of outage meaning they have better visibility and ability to make the right decisions.
Plan and prioritise
When making a plan you need to consider the risks and your objectives. A useful approach towards technology can be to consider how it can help mitigate these risks and help you meet your objectives. When considering budget, there is no upper limit to what you can spend, instead focus on your priorities and then have the board sign them off.
Spending a day brainstorming is a good way of working out what concerns your organisation the most and what will have the most detrimental impact should it go wrong. Needless to say, something that has a high risk of impact needs to be prioritised. In terms of the executor of any business continuity plan, as the saying goes, don’t put your eggs in one basket – involving numerous people and hence ensuring more than one person is trained in the business continuity plan will significantly mitigate the impact of any event.
In this guest post, Eran Kinsbruner, lead technical evangelist at DevOps software supplier Perfecto, talks about why success in agile software development hinges on getting the people, processes and technology elements all in alignment
In a super-charged digital environment where competition is fierce and speed and quality are crucial, many organisations are incorporating DevOps practices (including continuous integration and continuous delivery) into their software development processes.
These companies know software is safer when people with complementary skills in technical operations and software development work together, not apart. However, to keep succeeding organisations must be committed to on-going self-evaluation and embrace the need to change when necessary.
Keeping it fresh is key to success – and for us, this comes from three primary areas; people, processes and technology.
The people part of the DevOps equation
The developer’s role is constantly evolving, and functions that were once owned by testers or quality assurance (QA) teams now fall firmly under their remit. This means new skills are needed, and it’s important that organisations are committed to upskilling their developers.
For some, this means facilitating the mentoring of testers by highly qualified developers. And for others it means considering a change in software development practices to include Acceptance Test Driven Development (ATDD), which promotes defining tests as code is written.
Test automation becomes a core practice during feature development rather than afterwards. Depending on team skills, implementing Behaviour Driven Development (BDD) (which implements test automation with simple English-like syntax) serves less technical teams extremely well. There are bound to be blind spots between developer, business and test personas – and choosing development practices matched to team skills can contribute to accelerating development velocity.
Leadership is another critical aspect of success in DevOps and continuous testing. Diverse teams and personas call for strong leadership as a unifying force, and a leader’s active role in affecting change is crucial. Of course, part of leadership is to enforce stringent metrics and KPIs which help to keep everyone on track.
The importance of process
Teams must always work to clean up their code and to do it regularly. That includes more than just testing. Code refactoring (the process of restructuring computer code) is important for optimal performance, as is continually scanning for security holes.
It also includes more than just making sure production code is ‘clean’. It’s crucial to ‘treat test code as production code’ and maintain that too. Good code is always tested and version controlled.
Code can be cleaned and quality ensured in several ways. The first is through code reviews and code analysis; making make sure code is well-maintained and there are no memory leaks. Using dashboards, analytics and other visibility enablers can also help power real-time decision making which is based on concrete data – and can help teams deliver quicker and more accurately.
Finally, continuous testing by each feature team is important. Often, a team is responsible for specific functional components along with testing, and so testing code merges locally is key to detect issues earlier. Only then can teams be sure that, once a merge happens into the main branch, the entire product is not broken, and that the overall quality picture is kept consistent and visible at all times.
Let’s talk technology
When there is a mismatch between the technology and the processes or people, development teams simply won’t be able to meet their objectives
A primary technology in development is the lab itself. A test environment is the foundation of the entire testing workflow, including all continuous integration activities. It perhaps goes without saying that when the lab is not available or unstable, the entire process breaks.
For many, the requirement for speed and quality means a shift to open-source test automation tools. But, as with many free and open-source software markets, a plethora of tools complicates the selection process. Choosing an ideal framework isn’t easy, and there are material differences between the needs of developers and engineers, which must be catered for.
A developer’s primary objective is for fast feedback for their localised code changes. Frameworks like Espresso, XCUITest and JSDom or Headless Chrome Puppeteer are good options for this.
A test engineer, on the other hand, concentrates on the integration of various developers into a complete packaged product, and for that, their end-to-end testing objectives require different frameworks, like Appium, Selenium or Protractor. And production engineers are executing end-to-end tests to identify and resolve service degradations before the user experience is impacted. Frameworks such as Selenium or Protractor are also relevant here but the integration with monitoring and alerting tools becomes essential to fit into their workflow.
With such different needs, many organisations opt for a hybrid model, where they use several frameworks in tandem.
People, processes and technology – together
Ultimately, we believe that only by continually re-evaluating people, processes and technology – the three tenets of DevOps – can teams achieve accelerated delivery while ensuring quality. It’s crucial in today’s hyper-competitive landscape that speed and quality go hand in hand, and so we’d advise every organisation to take a look at their own operations and see how they can be spring-cleaned for success.
In this guest post, Chip Childers, CTO of open source platform-as-a-service Cloud Foundry, makes the case for why the future of public cloud and IaaS won’t be proprietary.
Once upon a time, ‘Infrastructure-as-a-Service’ basically meant services provided by the likes of Amazon Web Services (AWS), Microsoft Azure or Google Cloud Platform (GCP), but things are changing as more open source offerings enter the fray.
This is partly down to Kubernetes, which has done much to popularise container technology, helped by its association with Docker and others, which has ushered in a period of explosive innovation in the ‘container platform’ space. This is where Kubernetes stands out, and today it could hold the key to the future of IaaS.
History in technology, as in everything else, matters. And so does context. A year in tech can seem like a lifetime, and it’s really quite incredible to think back to how things have changed in just a few short years since the dawn of IaaS.
Back then the technology industry was trying to deal with the inexorable rise of AWS, and the growing risk of a monopoly emerging in the world of infrastructure provision.
In a bid to counteract Amazon’s head start, hosting providers started to make the move from traditional hosting services to cloud (or cloud-like) services. We also began to see the emergence of cloud-like automation platforms that could potentially be used by both enterprise and service providers.
Open source projects such as OpenStack touted the potential of a ‘free and open cloud’, and standards bodies began to develop common cloud provider APIs.
As a follow on to this, API abstraction libraries started to emerge, with the aim of making things easier for developers working with cloud providers who did not just want to rely on a few key players.
It was around this time that many of the cloud’s blue-sky thinkers first began to envisage the age of commoditised computing. Theories were posited that claimed we were just around the corner from providers trading capacity with each other and regular price arbitrage.
Those predictions proved to be premature. At that time, and in the years hence, computing capacity simply wasn’t ready to become a commodity that providers could pass between each other – the implementation differences were simply too great.
That was then – but things are certainly looking different today. Although we still have some major implementation differences between cloud service providers, including the types of capabilities they’re offering, we’re seeing the way forward to an eventual commoditisation of computing infrastructure.
While even the basic compute services remain different enough to avoid categorisation as a commodity, this no longer seems to matter in the way that it once did.
That change has largely come about because of the ‘managed Kubernetes clusters’ used by most public cloud providers now.
The shift has also been happening in the private sector, with many private cloud software vendors adopting either a ‘Kubernetes-first’ architecture or a ‘with Kubernetes’ product offering.
As Kubernetes continues its seemingly unstoppable move towards ubiquity, Linux containers now look likely to become the currency of commodified compute.
There are still implementation differences of course, with cloud providers differing in the details of how they offer Kubernetes services, but the path towards consistency now seems a lot clearer than it did a decade ago.
This more consistent approach to compute now seems as inevitable as the future of IaaS, made possible by the open source approach of Kubernetes.