Despite how much I love the Cloud, it would be foolish to ignore the many challenges that it poses. And, when concepts such as Liquid IT or Multi-Cloud become part of the agenda, one of those is, without a doubt, Portability.
Back when I was a member of the Atos Scientific Community, I was one of the authors of a whitepaper that addressed this very same topic. Since then, I have been fortunate enough to witness other points of view about the subject and some have certainly got me thinking. And, although Cloud Portability might be one of those never-ending discussions, I think that some aspects are worth additional consideration.
My starting point
Portability in the Cloud has multiple facets and there is no single, easy answer to it. That is why we defended the “Architect's Approach” as the way to address it.
Without a doubt, going “all-in” with a Cloud Service Provider (CSP) and giving up on Portability has immediate returns. It allows you to gain speed and extract value from day zero. However, the same can be said about any other technology adoption choice. This is not a new problem. Why do we care, then? Because there are risks and concerns at multiple levels. Cloud technologies are unique, though, in the scale of the potential impact of such decisions and the speed at which it materializes.
In any case, hiding behind Architecture Principles such as “Technology and/or Vendor Independence” or “Technology and/or Vendor Agnosticism” to prevent or set back changes does not do anybody a favor, especially since inaction or delays can represent a real competitive disadvantage.
This means that Architects and Organizations alike must find their balance while, at the same time, pushing themselves to be honest and open to challenges that defy their own positions and preconceptions.
Having said that, let's discuss some of the points of view that drew my attention:
A systemic risk means “no risk”
I've seen this argument taking different shapes. But, in essence, it states that the Cloud is now a systemic risk: since everyone is using it, everyone is affected. As a result, and here is the catch, nothing is going to happen because it is in everybody's interest… but, if it does, it doesn't matter either since everyone will be screwed …
… I don't know a single serious business that would accept such a statement as a valid Risk Management strategy. Actually, if we were to accept it for just a second, then it would make no sense to outline Disaster Recovery or Business Continuity Plans, would it? Our lives would be so much simpler, that is for sure!
Risk Management is an individual responsibility which can't be delegated. Sure, we can ignore it, we can fool ourselves, whatever, fine … But this doesn't change what our responsibilities are: it is one thing to make conscious decisions based on thoughtful consideration and quite another to claim that company in distress makes the sorrow less …
Scale mitigates disruptive changes…
Supporters of this idea suggest that, even though CSPs may not make decisions compatible with customers' needs and concerns, the scale they have and at which they operate prevents them from impacting a large number of users. This way, Scale itself becomes the mitigating factor that forces them to behave and exercise self-control.
Let's use one example to illustrate the point. Let's assume that AWS still has just 1M enterprise customers and let's suppose that they make one decision that impacts just 1% of them. We can argue that 1% qualifies as a “small number” of customers or, at least, not a large one … But, oh wait! 1% of 1M represents 10,000 customers! To give you an idea, that is roughly equivalent to all the companies in Spain with between 100 and 500 employees.
This proves that, when talking about planet-scale figures, words like “large” or “small” do have an impact and cannot be dismissed or treated lightly. However, the truth is that all of this is completely irrelevant. If you are one of those 10,000, the argument that you are part of the small community of affected companies will not solve your problems at all. That is an argument for CSPs, but not for consumers of Cloud Services.
This brings back the notion of Risk Management as an individual responsibility that I've mentioned above. If it does matter to you, it is on you to do something about it.
Investment + Talent = Reliability
This claim seeks to dismiss the need for Risk Management by assuming that the reliability of a cloud-based solution is a direct function of the huge capital and unparalleled talent that CSPs have. If they are investing so much, and have the brightest minds on earth, they can't fail. As a result, it would be crazy not to go all-in, right?
Needless to say, Investment, Talent and Reliability are three different things that may be related, might be correlated, but, by no means, are equivalent. If they were, instead of three words, we would certainly have just … one?
But, anyway, the crux of the argument is Reliability. So, what is it? Reliability is a multi-level quality of an Architecture or a Solution. Cloud Services usually form the foundations upon which we build and deploy our own stuff in order to deliver an application or a service. This means that at least one piece is not owned by the CSP. We share responsibility with the Service Provider and, as a result, externalizing reliability is, simply, not possible.
On the other hand, CSPs have been shouting the “Design for Failure” mantra at us for years. With that, they indicate that reliability should be baked into the application code rather than into infrastructure components. “Design for Failure” represents a paradigm shift from traditional Enterprise Architectures, in which reliability was the responsibility of the infrastructure, and it also means the opposite of the original claim.
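To make “reliability baked into the application code” a bit more tangible, here is a minimal sketch of one common “Design for Failure” tactic: retrying a failing call with exponential backoff. It is only an illustration under my own assumptions. `TransientError`, `call_with_retries` and `make_flaky_service` are hypothetical names, not part of any CSP's SDK, and the fake service merely simulates a dependency that is temporarily unavailable.

```python
import time


class TransientError(Exception):
    """Hypothetical error type standing in for a temporary dependency failure."""


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation`, retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up: the caller decides what happens next
            # Wait base_delay, then 2x, then 4x ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_service(failures_before_success):
    """Build a fake dependency that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def service():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientError("temporary outage")
        return "ok"

    return service


if __name__ == "__main__":
    # The application, not the infrastructure, absorbs the two simulated failures.
    print(call_with_retries(make_flaky_service(2)))
```

The point is not the retry loop itself but who owns it: this code lives in our application, so the responsibility for reliability stays, at least partly, with us.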
A global oligopoly is good …
This other position claims that prices will always be pushed downwards since the big players are on a quest for never-ending economies of scale. At the same time, Cloud Service Providers will never be tempted to turn the screws on fees to drive margins up because the competition is so intense.
I concur that this picture describes the current situation quite well. However, there is a limit to how far prices can go down since infinite growth doesn't exist. On the other hand, “lock-in” (the opposite of Portability) represents, by definition, a barrier to competition. This means that the more “lock-in”, the less competition and, therefore, the bigger the temptation to raise prices.
The truth is that we can't tell how prices will evolve. We know, however, a few things:
- Lock-in is less likely to happen (although not impossible) in commodity services offered by different providers supported by de-facto or industry standards that facilitate both entries and exits.
- Lock-in is more likely to happen in situations in which data gravity has become an issue.
- Lock-in also happens more frequently with highly differentiated services for which there is no easy replacement.
- A small number of globally dominant players is known as an oligopoly. And this is not good. Period.
This means that, sooner or later, price evolution will throw unpleasant surprises at us. Actually, it has already happened. The question will always be the depth and breadth of the impact.
Cloud is “just” a way of consuming technology and innovation
I could not agree more with this position. There is a BIG catch, though. Cloud Computing is much more than that. Above all other considerations, it is a relationship with a Service Provider. As with any other type of relationship, things can go south. This is why contracts, laws, and regulations contemplate exit clauses.
And here is where we must assess the situation. It is one thing to change the energy provider for your home or even to end a life-long marriage. That may affect you personally and even your family, but the scope is limited. On the other hand, ending the thing that powers systems, data, and processes for a company has deeper and wider implications. The more exposed to technology and the more coupled the company is with the Service Provider, the harder the impact will be.
Therefore, besides being a way of consuming technology and innovation, Cloud Computing is a relationship that must be managed and taken care of.
The innovation argument also raises the question of “where” that innovation happens. A single-vendor strategy could place you at a competitive disadvantage. Let's say, for instance, that qualitative leaps in progress happen at a different Cloud Service Provider. Data gravity, licensing or other contractual terms might become barriers to entry and ruin that opportunity.
In other words, Portability, as an architectural quality, not only supports Vendor Management activities but also ensures that the company is free to act as it needs either at tactical or strategic levels.
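As a rough illustration of what Portability as an architectural quality can look like in code, the sketch below keeps business logic dependent on a small interface rather than on a specific provider's SDK. Everything here is an assumption made for the example: `ObjectStore`, `InMemoryStore` and `archive_invoice` are hypothetical names, and the in-memory adapter merely stands in for whatever a real provider-specific adapter would do.

```python
from abc import ABC, abstractmethod


class ObjectStore(ABC):
    """Hypothetical port: the only storage contract the application depends on."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> bytes:
        ...


class InMemoryStore(ObjectStore):
    """Stand-in adapter; a real one would wrap a specific provider's SDK."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def archive_invoice(store: ObjectStore, invoice_id: str, pdf: bytes) -> str:
    """Business logic: knows about invoices, not about any provider."""
    key = f"invoices/{invoice_id}.pdf"
    store.put(key, pdf)
    return key


if __name__ == "__main__":
    store = InMemoryStore()
    print(archive_invoice(store, "2024-001", b"%PDF"))
```

Swapping providers then means writing another adapter, not rewriting `archive_invoice`. Of course, this only helps where such a common contract exists; highly differentiated services rarely fit behind one.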
The zero-sum game
Some people maintain that we must disregard the concern of being agnostic. This is a cost-based argument which claims that either you pay the price at the beginning (to become agnostic) or you pay it at the end (when you want to exit a Cloud Service Provider). Both situations cancel each other out, resulting in a zero-sum game. This reasoning also raises a huge opportunity cost: if there is no “exit” there is no price to pay … Compelling, isn't it?
Hopefully, at this stage, we have already debunked the fallacy about ignoring Risk Management or exit strategies.
In any case, talking about zero-sum games means talking about numbers. Therefore, we cannot discard the possibility that there is indeed a zero-sum game in certain cases. However, we can't claim it to be a general case. Actually, given that the entrance and the exit can't easily be compared, I would assume that zero-sum games would be the exceptions rather than the norm.
Why do I think they can't be easily compared? Let's see:
- Entrances are progressive over time. That is, you adopt once, and you grow from there. Exits, on the other hand, are bound to a period of time: “I want to exit this year on this date” (hello BREXIT!).
- When you adopt the cloud, you know both the source and the target environment. This is not the case with exits unless you have a strategy planned in advance.
- Systems establish relationships with one another over time making the whole bigger than its parts. This means that exits can be expected to be harder and more expensive than entrances.
Obviously, these aspects grow in importance with scale. The bigger the adoption, the bigger the impact. And, when things get “big enough”, the nature of the problem changes too …
All in all, despite the initial appeal of the zero-sum game argument, I can't buy into it …
Conventional wisdom coined the expression “better safe than sorry” many moons ago. I subscribe to that and, not surprisingly, my general point of view sits on the side of Portability, Agnosticism, and Independence as core values. However, they must be integrated within an approach driven by Enterprise Architecture. This means framing them in a business context and a business/service strategy. Consequently, technological decisions must be subordinated to them and not the other way around. This also means that universal or maximalist positions will never work.
Lock-in risks tend to concentrate in the upper layers of the stack (especially FaaS/BaaS and SaaS), which are usually closer to business services. This suggests that the pressure to leverage them will always be there. In my opinion, lock-in is OK provided that decisions are rational and conscious; that risks are managed accordingly; and that we make sure that we are not trapped by fallacies like the ones discussed above. But, more importantly, we must make sure that there is complete alignment with the business at all levels.
Picture: “Think!” by Christian Weidinger. Licensed under CC BY-NC-ND 2.0