Thinking about Cloud Strategy

Cloud Computing has been a buzz topic for a while. I understand it: it promises exciting benefits and, therefore, everyone wants to be there. Obviously, that’s not new. Nevertheless, I happened to attend several Cloud Computing events recently and I couldn’t help having a strange feeling of “déjà vu”:

both Skeptics and Believers keep saying the same things over and over again.
there seems to be some confusion or lack of understanding about how different concepts and components relate to each other.
a broad set of diverse offerings and their respective confusing marketing messages don’t help to understand the context.

That is why I’ve decided to sort out my ideas and tell you my own vision about this subject. But, before going on, I would like to point out some important nuances:

I am a Believer with huge respect for the Skeptics’ points of view.
This is a complex phenomenon with many interrelated and relevant perspectives that have to be analyzed both individually and also as a whole.
I believe that this is a never-ending learning process. In fact, I have invested quite a significant amount of time listening, reading, thinking about and discussing this matter. So, this vision is inevitably connected to what I’ve learned and, of course, that is the reason why I have waited so long to write it down.
This is a vision, neither an answer nor a solution. Once you have a vision, you can ask your questions or build your own solutions.
This vision may not be right or complete but we can discuss it here if you want to. I would love to learn more from your point of view.

As I’ve already mentioned, I consider this a question with multiple interconnected facets. So, in order to focus on the aspects I would like to address, I will assume that you are familiar with certain terms and concepts. Anyway, I will try to incorporate as many references as I can.

Multiple Cloud definitions

It is always hard to deal with something you can’t define or identify. Cloud computing has always had this problem. Let’s see some definition types:

“New IT Delivery Model”

Let’s start by looking at how NIST defines Cloud Computing:

Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics (On-demand self-service, Broad network access, Resource Pooling, Rapid elasticity, Measured Service), three service models (SaaS, PaaS, IaaS), and four deployment models (Private Cloud, Community Cloud, Public Cloud, Hybrid Cloud).

As you can see this is a pretty good, complete, descriptive and technology-oriented definition. We may agree that this is not a “short” description and that some concepts might require further explanation (in the original document, they are).

Although I consider this the most relevant definition in its class, let’s move on to the next one …

“I am The Cloud”

This is the most common definition among vendors. Obviously, it is a natural consequence of the marketing battle between them. “I am The Cloud” definitions try to narrow the meaning of this concept following two main strategies:

aligning The Cloud to the Vendor product portfolio: e.g. let’s say that I am an infrastructure vendor. Then Cloud Computing means “IaaS”.
creating a market segment with a new set of acronyms and definitions: DaaS (Desktop or Data), BaaS (Backup), SaaS (Storage), CaaS (Communications), VaaS (Voice), etc.

To be honest, some of these definitions are correct. In other words, not all vendors try to play “evil” marketing tricks.

“IT Paradigm Shift”

This definition type describes Cloud Computing as an “IT Paradigm Shift: a Marketplace of utility-like commodity Services”.

This new computing context delivers key characteristics on the following areas:

economy: economies of scale, pay-per-use, sense and vision of a competitive market.
productivity: automation, fast provisioning.
delivery: simplicity, agility, elasticity, on-demand, self-service.
security: designed for failure.
architecture: scale-out design, multi-tenancy.
infrastructure: commodity components, network convergence, appliances.

This is, by far, my favorite definition. It is a short and business-oriented definition. It puts focus on the effects and changes -both benefits and challenges- that this technology drives.

There is one last Cloud Computing classification that is often left behind: a definition by Target Market. For example: Cloud Services for Consumers, Cloud Services for SMBs, Cloud Services for Enterprises.

First Reflections

Life is not usually black or white. So, you will often find different nuances and interpretations in different situations. Nevertheless, what I am trying to do here is to clarify concepts (if such a thing is possible) and stress what I consider most essential, significant or relevant.

Therefore, we can start pointing out some considerations:

If you don’t have economies of scale (because you don’t share resources or you don’t have unusually high volume procurement purchases), you don’t have Cloud Computing.
If you don’t seriously compare your Cloud Service offering with those of other market players, whether public or private, you don’t have Cloud Computing.
If you don’t deliver an on-demand self-service interface to your customers, you don’t have Cloud Computing.
If you have people to manage your customer relationship, it is very likely that you don’t have Cloud Computing.
If you don’t manage procurement in advance in order to serve your future demands, you don’t have Cloud Computing.
If a Contract with a Customer covers all of their Infrastructure costs, you don’t have Cloud Computing.
Virtualization + Data Center Automation is not Cloud Computing.
If you don’t manage your infrastructure as a commodity, it is very likely that you don’t have Cloud Computing.
If you haven’t reduced your infrastructure complexity, it is very likely that you don’t have Cloud Computing.
If you haven’t sped up your delivery times by several orders of magnitude (from months or weeks to minutes), you don’t have Cloud Computing.
Cloud Computing is one class of Outsourcing. Nevertheless, Outsourcing is not Cloud Computing.

Where are the Lock-in Risks, really?

The Cloud Computing scene is full of warnings around Risk Management, good Governance Practices, Models, etc. In fact, one of the most common topics in conversations about it is Lock-in Risk. Many of these risks are of a technological nature and, to make it more fun, this debate is heavily connected to the one regarding Cloud Standards.

Cloud Standards

At this point you might suspect that I am trying to demystify some of these concerns. You are right ;-). So, let’s invest some time in the Cloud Standards “hot topic”. I encourage you to spend some time watching the following video. It’s a pretty revealing debate in which Sam Johnston and Benjamin Black take opposing positions on it at the last OSCON.

I confess that I completely agree with Benjamin Black’s position. His arguments can be summarized in the following points:

Standards will come up naturally from a de-facto situation.
There is a hierarchy of needs: Utility, Interoperability and Independence.
some problems may not need to interoperate to deliver full utility.
some problems can’t achieve utility without interoperability.
some problems may not need to reach complete independence to achieve utility.
The adoption process follows three steps: Disruption, Competition and Maturation.
We are living through a discovery process, and that exploration will define what the future will look like.

I hope that what follows reinforces this vision and that, therefore, we will see that our concerns are, hopefully, overrated.

The commoditization status rules it all…

Let’s have a look at the IaaS and PaaS simplified architecture maps:

As you can see, both of them follow similar patterns. In fact, PaaS solutions are, usually, built on top of some kind of IaaS infrastructure.

I have consciously excluded SaaS from this analysis because its lock-in risks can be reduced to the availability of ETL processes and data formats.

I would also like to point out the presence of the OSS and BSS subsystems because they are often left out.

Lock-in risks become significant when a component is far from being a commodity. Fortunately, when you analyze these architecture maps you can make the following reasonable assumptions from a Cloud Computing perspective:

You don’t usually mess with the Fabric Controller. If you do, you have a lock-in risk here.
You don’t usually couple your software with your Cloud Provider’s OSS and BSS subsystems. If you do, you have a lock-in risk here.
The Service Definition Manifest is not present in IaaS. IaaS players are focusing on Federation to enable Hybrid Cloud scenarios rather than on carrying out full Service Layer automation. In fact, this is an optional component in IaaS, as its key semantics here have to do with VM operations for the most part.
The Service Definition Manifest is not standard in PaaS. In this case, every PaaS player takes its own path. Sometimes it might not exist but, when it does, it can be a standalone component or a function integrated into Package Deployment Services. Anyway, this Manifest is what allows the Fabric Controller to perform full Service Layer automation. Fortunately, not being a standard is not a big deal, as this feature doesn’t have to do with Application Execution but with Service Automation Helper Services.
Virtualization can be considered a commodity. You are not locked in, as P2V and V2V procedures and tools are widely available. We should also remember that, in case you want to switch to a different IaaS provider, it is usually cheaper, quicker and less risky to provision new VMs at your new location and then move Applications and Data.
Run-time Services can be considered a commodity. The Web Hosting industry has been showing this off for years. No matter the license type, you can find a wide variety of languages and libraries to build your solutions upon.
Messaging Services may be considered a commodity. Nevertheless, you may need to evaluate if your IaaS provider supports underlying, and potentially hidden, requirements for these services; such as Active/Passive clustering or IP multicast.
Data Persistence Services might not be considered a commodity and represent, in my opinion, the highest lock-in risk. Traditional RDBMSs could be considered a commodity but, unfortunately, they don’t usually scale out smoothly. Over the past few years, new solutions and innovations have emerged to deal with this problem. As a picture is worth a thousand words, here is a product map that the 451 Group published several weeks ago:

In fact, I would encourage you to read the article “NewSQL, NoSQL and Beyond”, where this work is briefly described and discussed.

Anyway, the availability of ETL processes and common data formats is still both a need and a mitigation factor, just as it is in the conventional RDBMS world.
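To make the ETL point a bit more tangible, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `customers` table, the helper names `export_table` and `import_documents`, JSON Lines as the common format, and a plain dictionary standing in for whatever key-addressed service sits on the other side.

```python
import json
import sqlite3


def export_table(conn, table, columns):
    """Extract: dump the rows of one table as JSON Lines text (one JSON object per line)."""
    # Identifiers are interpolated directly, so they must come from trusted code, not user input.
    column_list = ", ".join(columns)
    cursor = conn.execute(f"SELECT {column_list} FROM {table}")
    return "\n".join(json.dumps(dict(zip(columns, row))) for row in cursor)


def import_documents(jsonl_text, key_field):
    """Load: rebuild the records in a key-addressed store (a dict stands in for the target service)."""
    store = {}
    for line in jsonl_text.splitlines():
        record = json.loads(line)
        store[record[key_field]] = record
    return store


# Hypothetical source system: an in-memory SQLite database with one small table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", [(1, "Ada"), (2, "Linus")])

dump = export_table(source, "customers", ["id", "name"])
target = import_documents(dump, key_field="id")
print(target[2]["name"])
```

The interesting part is the text in `dump`: as long as both sides can read and write a plain, well-known format, the data itself is not what keeps you with a provider.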

Identifying risks

With these assumptions in mind, we are ready to draw a risk map for IaaS and PaaS depending on our own choices:

From a technology perspective, there is an inverse relationship between Diversity and Ubiquity when it comes to lock-in risk evaluation:

a higher degree of ubiquity of a given component or solution usually represents lower lock-in risk.
too many alternatives for a given component or solution (a high degree of diversity) represent high uncertainty about which one will have a long and stable life cycle. That uncertainty can be treated as a high lock-in risk when it comes to protecting investments in software development.

Nevertheless, high diversity degrees can be mitigated with:

High ubiquity degrees: choose a product you can find almost everywhere.
Open Source adoption: at least you will have access to the source code.
Careful vendor selection: it may guarantee a long life cycle and a reasonable product roadmap.
Known Software Engineering Patterns and Practices such as Adapters, Facades, ORMs, and so on.
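As an illustration of the last mitigation, here is a minimal sketch of the Adapter idea in Python. The `DocumentStore` interface, the two adapters and the `register_customer` function are hypothetical names invented for this example; a dictionary and SQLite merely stand in for two different Data Persistence Services.

```python
import sqlite3
from typing import Optional, Protocol


class DocumentStore(Protocol):
    """The only persistence interface the application is allowed to see (hypothetical)."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    """Adapter over a plain dict; stands in for one provider's key-value service."""

    def __init__(self) -> None:
        self._items = {}

    def put(self, key: str, value: str) -> None:
        self._items[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._items.get(key)


class SqliteStore:
    """Adapter over SQLite; stands in for a conventional RDBMS back end."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS items (item_key TEXT PRIMARY KEY, item_value TEXT)"
        )

    def put(self, key: str, value: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO items (item_key, item_value) VALUES (?, ?)", (key, value)
        )
        self._conn.commit()

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT item_value FROM items WHERE item_key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def register_customer(store: DocumentStore, customer_id: str, name: str) -> None:
    """Application code: depends on the interface, never on a concrete product."""
    store.put(f"customer:{customer_id}", name)


# The same application code runs unchanged against either back end.
for store in (InMemoryStore(), SqliteStore()):
    register_customer(store, "42", "Ada")
    print(type(store).__name__, store.get("customer:42"))
```

Switching the persistence product then means writing one new adapter rather than touching every place where the application stores data. It does not remove the switching cost, but it keeps that cost in one known place.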

You also have to take into account the expected life cycle of your Cloud Service or Application. More often than one would expect, this is completely unknown or uncertain. If that’s the case, this uncertainty reduces the lock-in risks posed by diversity.

Let’s say, for example, that our Business may not exist in two or three years, or that we are testing the market, or that our Engineering Team is very likely to change the whole thing in the same period (e.g. I recall Twitter switching from Ruby to Java). In those cases, too much diversity doesn’t represent any issue for our investments.

Cloud vs. Enterprise Vendor lock-in

When thinking about risks, things should be put in context. Cloud lock-in risks may not be different from Enterprise Vendor lock-in risks. I can’t tell which is the right choice. In fact, I think it depends on the case. But there are, definitely, some considerations that can be made:

If you are locked in on something that the market considers a commodity, you have a problem.
Vendor lock-in may cost you more than an alternative in the cloud.
Cloud Computing Transformation efforts may put you on the right path and get you ready for the next stage. So, instead of a problem, it may represent an opportunity for you and your organization.

This is a good moment to think for a while about one common practice in Enterprise environments: the use of Supported and/or Certified Configurations as a sort of “emotional blackmail”. This technique usually works because there is an asymmetric relationship between vendor and customer, rooted in information asymmetry. As a result, several psychological mechanisms come into play on the customer’s side:

You don’t want problems: you want to run your Business and deliver your Projects.
Everyone shares the idea that the least risky option is just following an existing path: just do what everyone does. Never mind the “someone has to be the first” …
Vendors are supposed to “know” more about their products than customers themselves.

Just remember something:

You pay a License fee.
You pay an Extended Support Contract: Premium, Gold, … whatever.
You have to Train your Technical Staff, and you do.
You take your Technical Staff through a Vendor Certification Process, and you do.
You work with an Integration Partner which also has its own Extended Support Contract and Certified Professionals.
… and they are placing Terms and Conditions that may be there for your own good. But, undoubtedly, for theirs …
… and, when it comes to non-certified configurations, they are clearly not delivering what you wanted and, nevertheless, it is costing you a lot of money. In the end, you are paying for assurances to manage risks that you have ultimately decided to avoid completely.

… now a Cloud Computing Transformation doesn’t look so bad, does it? 😉

Think for yourself

“We” are all on the same learning journey, but “they” are all trying to sell something that might not be properly aligned to your strategy.

But, wait, what about the Applications themselves?

The problem of an IaaS-centric mindset …

Some vendors stress the need to move to Private Cloud as a transitional solution. Some others even perform the following equation: IaaS = Virtualization + Data Center Automation. I don’t agree with that relationship but it is true that Cloud Computing is usually built on top of Virtualization. So, let’s use that connection as a common, shared and agreed starting point.

Last December, Gartner published a work titled “Server Virtualization: from Virtual Machines to Clouds”. There you will find interesting information and figures about Server Virtualization Market Penetration. So, I decided to take their data and perform the following operations:

Perform a projection for 2015 to have a wider landscape.
Estimate how much of the virtualized workload belongs to new VMs and how much belongs to P2V operations from legacy systems.
Try to figure out the “Downsizing Effect”, as new hardware purchases have more computing power than previous ones.
Calculate yearly variations.
Analyze the results.

Important note: These Projections and Estimates don’t have to be right. This is only one scenario used as a tool to help us think about this context. Feel free to change your Projections and Estimates to evaluate different scenarios.
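If you want to play with your own scenario, the operations above can be sketched in a few lines of Python. Please read it as a thinking tool under stated assumptions, not as the spreadsheet behind the figures in this post: the yearly numbers are invented placeholders in arbitrary units (they are not Gartner’s data), the share of new VMs is assumed to be constant, and the “Downsizing Effect” is modeled, as one possible assumption, as a fixed yearly gain in capacity relative to the first year. The names `project` and `tipping_point` are hypothetical.

```python
def project(virtualized_by_year, new_vm_share, yearly_capacity_gain):
    """Return one row per year: virtualized systems, their new-VM/P2V split,
    the yearly variation, and the capacity-adjusted figure."""
    rows = []
    previous = None
    for index, year in enumerate(sorted(virtualized_by_year)):
        virtualized = virtualized_by_year[year]
        new_vms = virtualized * new_vm_share
        rows.append({
            "year": year,
            "virtualized": virtualized,
            "new_vms": new_vms,
            "p2v": virtualized - new_vms,
            # No variation can be computed for the first year of the series.
            "yearly_variation": None if previous is None else virtualized - previous,
            # Assumption: each year's hardware is more powerful than the previous
            # year's by a fixed factor, measured against the first year.
            "adjusted_capacity": virtualized * (1 + yearly_capacity_gain) ** index,
        })
        previous = virtualized
    return rows


def tipping_point(rows, target_year, fraction=0.5):
    """First year in which virtualized systems reach `fraction` of the target year's total."""
    target = next(row["virtualized"] for row in rows if row["year"] == target_year)
    return next(row["year"] for row in rows if row["virtualized"] >= fraction * target)


# Placeholder scenario: invented values, replace them with your own estimates.
placeholder_scenario = {
    2008: 10, 2009: 18, 2010: 28, 2011: 40,
    2012: 55, 2013: 72, 2014: 90, 2015: 110,
}

rows = project(placeholder_scenario, new_vm_share=0.4, yearly_capacity_gain=0.25)
for row in rows:
    print(row["year"], row["virtualized"], row["yearly_variation"], round(row["adjusted_capacity"], 1))
print("halfway year:", tipping_point(rows, target_year=2015))
```

Changing `new_vm_share` or `yearly_capacity_gain` is the quickest way to see how sensitive any conclusion is to the estimates you feed in.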

In this first figure you can find a comparison between the Systems Virtualized and their corresponding Capacity when you take into account the “Downsizing Effect” (Virtualized Adjusted Capacity).

We can extract a couple of conclusions from this graph:

As we don’t share resources, we have low economies of scale and can’t take advantage of our excess of capacity.
That excess of capacity may help us when taking our legacy applications into a virtual environment, giving them a longer life cycle. Old applications might not fit very well into a virtualized environment, but the capacity present in the new location can render that risk negligible.

In the following chart you can see how Virtualized Systems break down into P2V and New Virtual Deployments.

Now we can see one really exciting fact. “Around 2012” we will reach a tipping point: we will have virtualized 50% of the VMs we will have in 2015. This means several things:

These VMs represent our Next Generation Legacy Applications. These applications follow Traditional Enterprise Architecture Design Patterns and, usually, don’t easily scale out (e.g. they often use Active/Passive clustering for databases).
It has taken 4 years to create this Next Generation Legacy, but we will repeat exactly the same milestone in the next 3 years. We can see that change speeds up.
So, if we don’t do anything about Traditional Enterprise Applications Architecture, the problem will only get worse. But, now, we won’t have virtualization to save us and move the problem forward. We will have the “Downsizing Effect” helping us to scale up, but, will it be enough?

Anyway, without making a strategic decision about Applications and Traditional Enterprise Architectures you won’t be able to move to the Cloud. This is where it becomes clearer that you shouldn’t think about an IaaS strategy without facing your applications ecosystem (PaaS) at the same time.

Now we can recall the statement that led us to this reflection: “Some vendors stress the need to move to Private Cloud as a transitional solution”. If we listen more carefully we will discover that, when they say Private Cloud, they usually mean IaaS. But, remember, your Applications and Services will still be there… Are you sure that IaaS is a complete and/or right answer to your strategic issues?

Regular Hosting Services as Cloud Providers

In a marketplace of utility-like commodity services there are no predefined “form factors”. In fact, as long as your solution delivers the value that the market expects, it could take any form.

Traditional Hosting Services share some characteristics that may enable them as players of this game:

very narrow pricing.
feature rich portfolio with more predictable and simple price structure than their Cloud alternatives: unlimited traffic and storage, etc.
wide range of alternatives: from full shared to dedicated.
covering many geographies: almost all over the world.
high diversity of run-time environments: OSs, languages and libraries.
fairly traditional back-end services (RDBMS).
wide experience running that business on a highly competitive market.

We can see Hosting Service Providers in the Virtual Machine market today. And, sooner or later, we will see them implementing scale-out application platforms to complete their current portfolio.

So, despite their original purpose, their reduced prices and value proposition make them appealing as potential “building blocks” on highly distributed applications or services.

Naturally, not every application or service might be suitable to be designed like this. Nevertheless, it might represent an opportunity for other scenarios. In fact, as current cloud offerings require new controls and governance frameworks to deal with variable costs:

wouldn’t it be interesting to design a solution as a Global Mashup where you combine on-premises components with cloud and hosting services from different Service Providers as part of a new Technology Mix that really matches business needs?
isn’t the Intelligence required to analyze, define, build, operate, maintain and govern that new Technology Mix, a new source of value?

We are used to designing applications following the Traditional Enterprise Applications Architecture Patterns. But we should remember that they are not the goal: they are the way. What if we change the way we look at the problem? What if we try different perspectives? Engineering is not about fears, it is about numbers. So, why don’t we try, measure, decide and deliver?

It’s all about vendor switching …

It is clear that a fully automated Service Provider switching procedure doesn’t exist. But, as we have already seen, if we have followed mitigation guidelines, lock-in risks are not a big deal. It is, however, a matter of switching costs and efforts:

Cloud Vendors may charge Data Transfer fees that will impact your migration costs.
You may pay twice the capacity while you are migrating.
Cloud Vendors may have different Service Definition Manifests, which affects how many manual tasks you will have to carry out as part of your migration.
Cloud Vendors may have different Infrastructure Abstractions and Semantics (Update Domains, Fault Domains, Availability Zones, Regions, etc.), which affects how many manual tasks and design changes you will have to carry out as part of your migration.
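Since this is a matter of numbers, the items above can be turned into a rough estimate. The following Python sketch is only an illustration: the `MigrationPlan` fields, the `switching_cost` function and every figure in the example are hypothetical, and real pricing models are usually more nuanced than a flat fee per gigabyte.

```python
from dataclasses import dataclass


@dataclass
class MigrationPlan:
    data_gb: float                # data to move out of the current provider
    transfer_fee_per_gb: float    # data transfer fee charged for moving it
    monthly_capacity_cost: float  # monthly cost of the capacity you pay twice during the overlap
    overlap_months: float         # time both environments run side by side
    manual_task_hours: float      # manifest rewrites, redesign around zones/domains, testing
    hourly_rate: float            # cost per hour of the people doing that work


def switching_cost(plan: MigrationPlan) -> dict:
    """Break a provider switch down into the cost items discussed in the text."""
    transfer = plan.data_gb * plan.transfer_fee_per_gb
    duplicate_capacity = plan.monthly_capacity_cost * plan.overlap_months
    manual_work = plan.manual_task_hours * plan.hourly_rate
    return {
        "transfer": transfer,
        "duplicate_capacity": duplicate_capacity,
        "manual_work": manual_work,
        "total": transfer + duplicate_capacity + manual_work,
    }


# Invented figures, only to show how the pieces add up.
example = MigrationPlan(
    data_gb=2000,
    transfer_fee_per_gb=0.25,
    monthly_capacity_cost=3000,
    overlap_months=2,
    manual_task_hours=80,
    hourly_rate=50,
)
for item, amount in switching_cost(example).items():
    print(f"{item}: {amount}")
```

Once the estimate is written down like this, you can compare it with what staying where you are costs over the same period, which is the comparison that actually matters.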

Migration Processes happen every day on traditional IT environments. I can’t see any structural difference when it comes to Cloud Vendor switching:

If you choose to change your underlying OS, you will have to deal with the application impacts.
If you choose to change your run-time environment, you will have to change your code.
If you choose to migrate your Data Persistence Service, you will have to deal with ETL processes.
etc.

It is also important to remember that “Things aren’t accountable, People are”. So, when it comes to Cloud Failures, even though we can always say it is the Cloud’s fault, it is very likely that we will find a lesson of our own to learn.

Let’s recap …

We have wandered around pretty abstract and diverse topics, but I think the following points summarize our “little journey”:

We now know that Cloud Computing can be defined from multiple perspectives.
We have an idea of what Cloud Computing is not.
We have concluded that the “cloud standards debate” is useless for the most part.
We have identified the Data Persistence Layer on PaaS offerings as the riskiest one.
We have also identified potential mitigation paths: highly ubiquitous solutions, Open Source adoption, careful vendor selection, Software Engineering Patterns and the availability of ETL processes and common data formats.
We have discussed that Cloud lock-in risks may not be different from Enterprise Vendor lock-in risks.
We have discovered that an IaaS-only approach may not be complete.
We have tried to open our minds to consider Traditional Hosting Services as potential input in our Solution Architectures.
We have discovered that Cloud Vendor switching is about cost and effort, like any other regular IT asset migration.

Last words …

Fortunately or not, this is an endless and exciting topic. Setting up a strategy begins with a Vision of the Context, and I hope I have helped to clarify it somehow. Now it’s your time to think about it and map it to who you are and what you want to do. Then, you will be able to make your own decisions.

I can’t finish this post without naming two additional investments that are key for Cloud Computing:

Invest in Scripting and Automation Skills: no product or solution will ever fully meet the needs of your context.
Markets and Roles in IT are changing: invest in passionate and talented people.

Well, these are my thoughts. Now it’s your turn, what do you think? :-).

Picture: “Puget Sound just before dawn” by b k. Licensed under CC BY 2.0.