Back when consumer USB drives were flooding the market, it was quite common to see people challenge datacenter-grade storage by comparing it with devices they could buy at any retail store immediately and inexpensively. By comparison, IT storage was costly and slower to deliver.
Perceptive observers could conclude that either there was a scam going on… or, more likely, those two things were not exactly the same…
The funny thing is that the same still (yes, still) happens with Cloud Services. Although they have been around for more than a decade, people still claim that “Cloud Service Providers (CSPs) are quicker and cheaper than the IT department”. There is some merit to that claim (and we will get to that in a minute). However, as in the case of consumer storage, these two are different actors doing different things in the IT Service Delivery value chain.
I know the answer might be clear to some of you, but I keep seeing this confusion far too often. It also spans a wide range of people: from those who are not tech-savvy to folks who participate in the Cloud Services business to varying degrees.
This suggests that, regardless of what somebody would think, this topic is quite sophisticated and complex for many people. Therefore, let's try to clear it up, highlight the potential differences, and explore the key implications that they represent.
The tip of the iceberg
Oftentimes, when people make those claims about purchasing and provisioning technical components quickly and inexpensively, they refer to Cloud Services that match the core of the computing stack. The context of these conversations is shown in Figures 1 and 2.
Sure, purchasing and provisioning take time in the traditional IT world and, yes, they take just minutes in the Cloud. However, these processes are just two small pieces in the IT Service Delivery value chain. For instance: somebody has to select which components must be provisioned and why; someone must integrate them, configure them and support them; someone else must make sure that the solution is secure at all stages of its life cycle; the operational model of that solution must fit in the ecosystem it belongs to; medium and long term concerns also ought to be taken into consideration …
In other words, whereas the scope of the initial conversation might seem narrow and self-contained, the reality of IT Service Delivery is much more complex, as highlighted in Figure 3.
As you can see, the IT Service Delivery story is about a team effort of different functions and processes that go beyond purchasing and provisioning.
In addition, when the discussion is focused on technology, speed of development and the price of specific components, a more important aspect gets abused in the heat of the argument: “Business Value”.
The value chain
The term “Business Value” is quite overused these days. That is good since everybody is keen to showcase how their different proposals provide a positive contribution to their prospects. Nevertheless, the constant reference to it also tends to erode its actual relevance and meaning.
As hinted above, the actual value to the business comes from IT Service Delivery as a value chain and not from any single element that participates in it. It is a team effort, not a “solo” play. Yes, local incremental improvements have an impact on the overall process (especially when they accumulate). However, it is critical to pay attention to the whole process to make sure that those improvements do not create unexpected problems somewhere else (as the Theory of Constraints shows).
Nicholas Carr in his piece “IT doesn't matter” stated that:
“as infrastructural technologies availability increases and their cost decreases – as they become ubiquitous – they become commodity inputs. They may be more essential than ever to society, but from a strategic business standpoint, they become invisible; they no longer matter”.
Since then, everybody agrees that Cloud Services are just commodities. And, when you put them in the context of the IT Service Delivery value chain, they represent a fraction of the Service Pricing scheme (as depicted in Figure 4) and, generally speaking, have a minor relative weight in the grand scheme of things.
Now that we understand the landscape, we are in a much better position to confront the original claim: “CSPs are quicker and cheaper than the IT department”. Of course they are. But they do different things and have different missions. CSPs' costs are also commodities with a minor relative weight in the IT Service Delivery value chain. In other words, in my opinion, that statement misses the whole point.
CSPs optimize provisioning and purchasing in a very immediate way, but we must remember that this just represents a local optimization of the IT Service Delivery pipeline. We can't forget that traditional approaches still respond to business needs and constraints. And, even in these cases, there are optimization strategies that can be explored too, like co-location, renting or outsourcing, to name just a few.
Cloud Service Management and Operations, practices such as ChatOps and DevSecOps, and new roles such as the SRE, leverage Cloud-based Technologies to improve other parts of that value chain. However, these are not “things” that you can buy. In fact, adopting them takes commitment, time, and effort.
This is the actual Cloud Transformation story. A story that also takes into consideration traditional environments, existing modes of operation and specific business challenges, constraints, and priorities. In this context, the perspective of an Enterprise Architecture is key: it is not Cloud Transformation, but Business Transformation that matters. Cloud Transformation is just a means to that end! The ultimate challenge is the ability of the whole organization to develop and transform its own capabilities. Because these will ultimately determine its ability to thrive in the marketplace in the medium and long term.
In summary: the next time you hear “CSPs are quicker and cheaper than the IT department”, I hope you can say to yourself “seriously?” and, maybe, help by explaining why that is not the point.
I have been dealing with WordPress for years now. I have developed and maintained my own site; built a number of experiments and Proofs-of-Concept and managed blogging platforms for others. Each context has taught me something new, but one of the aspects that keeps me hooked into this environment is the vibrant community that WordPress has.
The number of plugins, themes, and solutions that get built on top of it is astonishing. WordPress is estimated to run 33% of all websites and 60% of the ones with a known CMS. There is certainly a virtuous cycle behind this success … but that is a topic for another post 😉
With so many options comes complexity and a certain disutility. There are many choices for doing the same thing; components that get abandoned after some period of time; components with serious quality problems despite being “beautiful”; components that do not respect privacy or local regulation; components that are insecure or contain malware; components that do not get properly supported; components that are not compatible with each other; etc.
All this raises the need for filtering and curation. Unfortunately, we are left alone with this task. You could try searching the web for articles that share some recipes, and some of them can help you discover useful ingredients for your solution. However, extracting the set of criteria that teaches you how to make your own selection and maintain it over time is a different thing.
That is why I've decided to share the criteria that I use myself, hoping that they can also be useful to you.
Dimensions to consider
There are a number of aspects that help us ensure that we are fully addressing our concerns and that make our check-list more actionable. These are:
- Risks and concerns addressed by each criterion.
- Metrics and indicators that we can use to compare components.
- Rating/Ranking/Priority that can help us weigh situations created by conflicting criteria. I like the MoSCoW method, but you can use any other system or combine them in whatever way best fits your needs.
I have also found that including a description for each item and providing some room to provide notes and remarks is quite helpful. This is especially true when other teams must interact with that list on their own.
You can also consider including the rationale for each criterion. However, most of the time, this is implicitly covered by the list of risks and concerns. My take up until now has been to include it only when my stakeholders lack some key domain expertise that would help them understand that implicit rationale. This way, I make sure that the outcome is less redundant and more pragmatic.
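The dimensions described above can be captured in a small data structure. The following is only a minimal sketch: the class names, field names, and the two example rows are assumptions made for illustration, not the actual check-list.

```python
from dataclasses import dataclass
from enum import IntEnum


class MoSCoW(IntEnum):
    """MoSCoW priorities; a lower value means a higher priority."""
    MUST = 1
    SHOULD = 2
    COULD = 3
    WONT = 4


@dataclass
class Criterion:
    """One row of a component-selection check-list."""
    name: str
    description: str
    risks: list[str]      # risks and concerns the criterion addresses
    metrics: list[str]    # indicators used to compare components
    priority: MoSCoW
    notes: str = ""       # room for remarks from other teams
    rationale: str = ""   # optional; often implied by the risks


def by_priority(criteria: list[Criterion]) -> list[Criterion]:
    """Return the criteria ordered from Must to Won't (stable sort)."""
    return sorted(criteria, key=lambda c: c.priority)


# Hypothetical example rows, not the author's actual check-list.
checklist = [
    Criterion(
        name="Documentation quality",
        description="The component ships usable documentation.",
        risks=["Hard to configure or troubleshoot"],
        metrics=["Documentation available (yes/no)"],
        priority=MoSCoW.COULD,
    ),
    Criterion(
        name="Active maintenance",
        description="The component receives regular updates.",
        risks=["Abandoned component", "Unpatched vulnerabilities"],
        metrics=["Days since last release"],
        priority=MoSCoW.MUST,
    ),
]

ordered_names = [c.name for c in by_priority(checklist)]
```

Sorting by priority is one simple way to decide which criterion wins when two of them conflict; any other rating or ranking system could replace the `MoSCoW` enum.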
With all this in mind, here you have the architecture artifact that I regularly use myself:
The WordPress ecosystem is great. It provides plenty of opportunities to minimize coding and maintenance efforts thanks to the consumption and integration of features delivered by existing components. However, selecting them requires work and discipline to separate the wheat from the chaff.
The table above is, by no means, a definitive one. I wouldn't be surprised if you have to tailor it to fulfill your specific needs. What matters, in my opinion, is to be driven by a check-list like this. That way you will always be sure that your key concerns will be addressed properly at the earliest stage of the life cycle. Other platforms also share the same type of problems. As a result, this approach might also be helpful there.
Of course, this is my point of view and the strategy I follow to address this curation problem. My question to you is: how do you do it? Are any criteria missing? Which other concerns do you take into consideration?
Despite how much I love the Cloud, it would be foolish to ignore the many challenges that it poses. And, when concepts such as Liquid IT or Multi-Cloud become part of the agenda, one of those is, without a doubt, Portability.
Back when I was a member of the Atos Scientific Community, I was one of the authors of a whitepaper that addressed this very same topic. Since then, I have been fortunate enough to come across other points of view on the subject, and some have certainly got me thinking. And, although Cloud Portability might be one of those never-ending discussions, I think that some aspects are worth further consideration.
My starting point
Portability in the Cloud has multiple facets and there is no easy and single answer to it. That is why we defended the “Architect's Approach” as the way to address it.
Without a doubt, going “all-in” with a Cloud Service Provider (CSP) and giving up on Portability has immediate returns. It allows you to gain speed and extract value from day zero. However, the same can be said of any other technology adoption choice. This is not a new problem. Why do we care, then? Because there are risks and concerns at multiple levels. Cloud technologies are unique, though, in the scale of the potential impacts that such decisions might have and the speed at which they happen.
In any case, hiding behind Architecture Principles such as “Technology and/or Vendor Independence” or “Technology and/or Vendor Agnosticism” to prevent or set back changes does not do anybody a favor, especially since inaction or delays can represent a real competitive disadvantage.
This means that Architects and Organizations alike must find their balance while, at the same time, push themselves to be honest and open to challenges that defy their own positions and preconceptions.
Having said that, let's discuss some of the points of view that drew my attention:
A systemic risk means “no risk”
I've seen this argument taking different shapes. But, in essence, it states that the Cloud is now a systemic risk: since everyone is using it, everyone is affected. As a result, and here is the catch, nothing is going to happen because it is in everybody's interest… but, if it does, it doesn't matter either since everyone will be screwed …
… I don't know a single serious business that would accept such a statement as a valid Risk Management strategy. Actually, if we were to accept it for just a second, then it would make no sense to outline Disaster Recovery or Business Continuity Plans, would it? Our lives would be so much simpler, that is for sure!
Risk Management is an individual responsibility which can't be delegated. Sure, we can ignore it, we can fool ourselves, whatever, fine … But this doesn't change what our responsibilities are: it is one thing to make conscious decisions based on thoughtful consideration, and quite another to claim that company in distress makes the sorrow less …
Scale mitigates disruptive changes…
Supporters of this idea suggest that, even though CSPs may not make decisions compatible with customers' needs and concerns, the scale they have and at which they operate prevents them from impacting a large number of users. This way, Scale itself becomes the mitigating factor that forces them to behave and control themselves.
Let's use one example to illustrate the point. Let's assume that AWS still has just 1M enterprise customers and let's suppose that they make one decision that impacts just 1% of them. We can argue that 1% qualifies as a “small number” of customers or, at least, not a large one … But, oh wait! 1% of 1M represents 10,000 customers! To give you an idea, that is roughly equivalent to all the companies in Spain with between 100 and 500 employees.
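The arithmetic behind the example can be checked in a couple of lines. This is only a sketch: both figures are the assumed values from the paragraph above, not real AWS data, and the variable names are made up for illustration.

```python
# Assumed figures taken from the example above, not real AWS data.
enterprise_customers = 1_000_000
affected_share_percent = 1

# Integer arithmetic keeps the result exact.
affected_customers = enterprise_customers * affected_share_percent // 100
print(affected_customers)  # prints 10000
```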
This proves that, when talking about planet-scale figures, words like “large” or “small” do have an impact and cannot be dismissed or treated lightly. However, the truth is that all of this is completely irrelevant. If you are one of those 10,000, the argument that you are part of the small community of affected companies will not solve your problems at all. That is an argument for CSPs, but not for consumers of Cloud Services.
This brings back the notion of Risk Management as an individual responsibility that I've mentioned above. If it does matter to you, it is on you to do something about it.
Investment + Talent = Reliability
This claim attempts to dismiss the need for Risk Management by assuming that the reliability of a cloud-based solution is a direct function of the huge capital and unparalleled talent that CSPs have. If they are investing so much, and have the brightest minds on earth, they can't fail. As a result, it would be crazy not to go all-in, right?
Needless to say, Investment, Talent and Reliability are three different things that may be related, might be correlated, but, by no means, are equivalent. If they were, instead of three words, we would certainly have just … one?
But, anyway, the crux of the argument is Reliability. So, what is it? Reliability is a multi-level quality of an Architecture or a Solution. Cloud Services usually make the foundations upon which we build and deploy our own stuff in order to deliver an application or a service. This means that at least one piece is not owned by the CSP. We share responsibility with the Service Provider and, as a result, externalizing reliability is, simply, not possible.
On the other hand, CSPs have been preaching the “Design for Failure” mantra to us for years. With that, they indicate that reliability should be baked into the application code rather than into infrastructure components. “Design for Failure” represents a paradigm shift from traditional Enterprise Architectures, in which reliability was the responsibility of the infrastructure, and it also means the opposite of the original claim.
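To make the idea of reliability living in the application code a bit more tangible, here is a minimal sketch of one common pattern, retrying a failing call with exponential backoff. It is just one possible illustration of “Design for Failure”, not a method prescribed by any CSP; the function names and the `flaky_fetch` dependency are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying failed attempts with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


# Hypothetical dependency that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record the delays instead of actually sleeping, to keep the example fast.
delays: list[float] = []
result = call_with_retries(flaky_fetch, attempts=3, base_delay=0.1, sleep=delays.append)
```

The point of the sketch is that the caller, not the infrastructure, decides how to react to a failing dependency. A real application would typically also add jitter, limit which exceptions are retried, and combine this with timeouts.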
A global oligopoly is good …
This other position claims that prices will always be pushed downwards since big players are on a never-ending quest for economies of scale. At the same time, Cloud Service Providers will never be tempted to turn the screws on fees to drive margins up because the competition is so intense.
I concur that this is a picture that describes the current situation quite well. However, there is a limit on how much prices can go down since infinite growth doesn't exist. On the other hand, “lock-in” (the opposite of Portability) represents, by definition, a barrier to competition. This means that the more “lock-in”, the less competition and, therefore, the bigger the temptation to raise prices.
The truth is that we can't tell how prices will evolve. We know, however, three things:
- Lock-in is less likely to happen (although not impossible) in commodity services offered by different providers supported by de-facto or industry standards that facilitate both entries and exits.
- Lock-in happens more frequently with highly differentiated services for which there is no easy replacement. It is also more likely in situations in which data gravity has become an issue.
- A small number of globally dominant players is known as an oligopoly. And this is not good. Period.
This means that, sooner or later, price evolution will throw us unpleasant surprises. Actually, it has already happened. The question will always be the depth and breadth of the impact.
Cloud is “just” a way of consuming technology and innovation
I cannot agree more with this position. There is a BIG catch, though. Cloud Computing is much more than that. Above all other considerations, it is a relationship with a Service Provider. As with any other type of relationship, things can go south. This is why contracts, laws, and regulations contemplate exit clauses.
And here is where we must assess the situation. It is one thing to decide to change the energy provider for your home or even to end a life-long marriage. That may affect you personally and even your family, but the scope is limited. On the other hand, ending the thing that powers systems, data, and processes for a company has deeper and wider implications. The more exposed to technology and the more coupled the company is with the Service Provider, the harder the impact will be.
Therefore, besides being a way of consuming technology and innovation, Cloud Computing is a relationship that must be managed and taken care of.
The innovation argument also raises the question of “where” that innovation happens. A single-vendor strategy could place you at a competitive disadvantage. Let's say, for instance, that qualitative leaps in progress happen at a different Cloud Service Provider. Data gravity, licensing or other contractual terms might become barriers to entry and ruin that opportunity.
In other words, Portability, as an architectural quality, not only supports Vendor Management activities but also ensures that the company is free to act as it needs either at tactical or strategic levels.
The zero-sum game
Some people maintain that we must disregard the concern of being agnostic. This is a cost-based argument which claims that either you pay the price at the beginning (to become agnostic) or you pay it at the end (when you want to exit a Cloud Service Provider). Both costs cancel each other out, resulting in a zero-sum game. This reasoning also raises a huge opportunity cost: if there is no “exit” there is no price to pay … Compelling, isn't it?
Hopefully, at this stage, we have already debunked the fallacy about ignoring Risk Management or exit strategies.
In any case, talking about zero-sum games means talking about numbers. Therefore, we cannot discard the possibility that there is indeed a zero-sum game in certain cases. However, we can't claim it to be a general case. Actually, given that the entrance and the exit can't easily be compared, I would assume that zero-sum games would be the exceptions rather than the norm.
Why do I think they can't be easily compared? Let's see:
- Entrances are progressive over time. That is, you adopt once, and you grow from there. In contrast, exits are bound to a period of time: “I want to exit this year on this date” (hello BREXIT!).
- When you adopt the cloud, you know both the source and the target environment. This is not the case with exits unless you have a strategy planned in advance.
- Systems establish relationships with one another over time making the whole bigger than its parts. This means that exits can be expected to be harder and more expensive than entrances.
Obviously, these aspects grow in importance with scale. The bigger the adoption, the bigger the impact. And, when things get “big enough”, the nature of the problem changes too …
All in all, despite the initial appeal of the zero-sum game argument, I can't buy into it …
Conventional wisdom minted the expression “better safe than sorry” many moons ago. I subscribe to that and, not surprisingly, my general point of view sits on the side of Portability, Agnosticism, and Independence as core values. However, they must be integrated within an approach driven by Enterprise Architecture. This means framing them in a business context and a business/service strategy. Consequently, technological decisions must be subordinated to them and not the other way around. This also means that universal or maximalist positions will never work.
Lock-in risks tend to concentrate on the upper layers of the stack (especially FaaS/BaaS and SaaS), which are usually closer to business services. This suggests that the pressure to leverage them will always be there. In my opinion, lock-in is OK provided that decisions are rational and conscious; that risks are managed accordingly; and that we make sure that we are not trapped by fallacies like the ones exposed before. But, more importantly, we must make sure that there is complete alignment with the business at all levels.
I don't know your experience with this, but I have always had the feeling that Enterprise Architecture is one of those things that everybody talks about without quite knowing what it actually is.
I myself can be blamed for this very same thing… I have practiced IT Architecture for years at different levels and across different domains. And, despite always doing my best, learning from brilliant people and working on challenging projects, the same issues emerged over and over again. Here are just a few:
- The focus on a narrow scope wouldn't let me connect with the global picture or the ultimate vision that was driving what I was doing.
- Despite using the most widely accepted conventions in the field, I regularly witnessed how interactions with stakeholders from other domains were hard and painful. There was no “lingua franca” and everybody had to do their best to bridge the gap.
- I was learning incrementally about discrete techniques and approaches and thinking, “this is it”.
I couldn't have been more wrong… Eventually, it became obvious to me that, in this hyper-sophisticated world, many things must have already been solved; that my quest to connect with a global vision (whatever that meant at each moment) had a body of knowledge, a method and a set of techniques; that a community of people concerned about them must be doing something somewhere. In other words, somebody must have applied the mindset of an Architect to the Architecture discipline itself.
If all that was true, I had to do something. Otherwise, I would be doomed to fighting the trees without understanding the forest. And, in a sense, I would also be reinventing the wheel over and over again.
Unfortunately, this has a limited impact unless TOGAF becomes a widespread practice in the industry and becomes a de-facto standard. That is why I have decided to share some materials I developed while I was preparing for my certification. I have translated the books (TOGAF 9.1 and 9.2) into a series of interconnected mind maps. There, you will find not just the structure and the key elements of each topic, but also links going back to the original sources, embedded diagrams, notes, and other online references.
I have found them useful not only to study the subject but also as a quick reference for everyday practice. I hope you find them useful too and, if you have any feedback, just let me know! 😀
Digital Transformation is a buzzword these days. It is so for good reasons. Many businesses today face fierce pressures from competitors leveraging web scale technologies and digital business models. Many others must also adapt to younger customer bases that are particularly sensitive to engagements via digital means. In this context, Digital is no longer an instrumental item to the business, but a defining one.
Technology has always had the ability to speed things up. We have been hearing this forever and, precisely because of this, we all tend to normalize our perception about speed and minimize how relevant it can be. Just think for a second how many technology advances have been piling up since Computing was born as a discipline: this accumulation results in the business speed that we perceive today.
But progress is not linear. Through history, different waves of change have brought explosive new developments in relatively short periods of time. We may all remember the impact of some of them: the dawn of Personal Computing, The Internet, Mobile Computing, The Cloud and, now, the Internet of Everything. Therefore, speed is the result of something even more important: non-linear acceleration.
In the past, we have been able to address speed and acceleration without breaking the essentials of our tools and practices. However, we have reached a point in which simple evolutions are not enough and more radical approaches are being played. This level of speed and acceleration calls for doing things differently. We must now use different tools, change our practices, rationalize both assets and teams, optimize communication and collaboration and even change our culture!
All of this might look bright and shiny. And, if you are like me, you couldn’t be more excited about such a prospect! It’s all opportunity!
But let’s now put this vision about speed in the Digital Transformation context. In this accelerated world, we are desperate to complete this transformation quickly. And here is the thing: at speed and at scale, risks can be deadly, and there are plenty of them. In particular, there is one that has to do with the notion that we can be “quicker” and “cheaper” by “doing less stuff”. Unfortunately, when this idea is taken too far, the consequences can be catastrophic.
Yes, we must identify what really matters and think seriously about what to do with the rest. However, crossing the line of doing less Quality Assurance, less Testing, less Security or less Automation (to name just a few), means setting up a recipe for disaster. Actually, “at speed” and “at scale”, you may need more of them.
In other words, rationalization is one piece of the puzzle. Certainly a necessary one. But when thinking about speed, the key is, more than anything else, to think and do things differently.