Popularity of and confidence in cloud computing platforms continue to grow unabated. More and more businesses are moving mission-critical workloads to public clouds. Forbes recently projected that by 2021, 32% of IT budgets will be spent on public cloud platforms. Forbes also points out that cloud spending has grown 59% on average since 2018.
The Recent Trends of Multi-Cloud Optimization will Continue — Elevating the Importance of a Multi-Cloud Strategy.
The elasticity of cloud platforms provides great potential from an engineering perspective but great challenges from a cost-containment perspective. Traditional engineering teams using on-premises infrastructure are not accustomed to considering cost in a pay-as-you-go environment. When migrating from limited on-premises hardware to the comparatively infinite expanse and variety of the cloud, cost containment, tracking, and optimization have to be considered.
Cost discipline, by necessity, becomes part of engineering awareness and vigilance — a requirement for businesses looking to exploit the new paradigm.
The Multi-Cloud Way
Many businesses already have a presence in multiple cloud platforms, either by strategy or, more likely, through organic growth. The benefits include freedom from reliance on a single provider, agility, scalability, high availability, SaaS services, and PaaS platforms. These higher-quality services, along with the pay-as-you-go billing model, are very attractive.
Controlling the associated costs requires a well-thought-out multi-cloud strategy.
A multi-cloud cost strategy considers workload placement according to several factors:
- Workload/platform optimization. Does the application utilize sufficient platform features to justify placement there? Conversely, does the availability zone provide needed features for the workload? How can inter-region bandwidth charges be balanced against fixed availability zone costs in a distributed deployment?
- Performance. Can the workload be placed on a platform, region, or server class with overall lower performance without impact? Workloads that can tolerate lower average performance can benefit from right-sizing the computing environment. The same applies to storage: can the workload tolerate lower-performance storage, or even object storage, to lower costs?
- Availability. Are some workloads tolerant of low (or at least not high) availability? Can they be placed on excess cloud capacity when it is available? Most cloud platforms offer far cheaper preemptible instances for workloads that can tolerate interruption (e.g., ETL or batch jobs that can snapshot progress).
- Serverless. Does the workload require a dedicated server? Similar to shopping for excess capacity, serverless offerings have the potential for cost savings by not maintaining a running server and only incurring costs based on resource consumption on a highly granular basis.
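Once a workload's usage profile is known, the placement questions above largely reduce to arithmetic. The Python sketch below compares a dedicated on-demand server, a preemptible instance, and a serverless deployment for a single workload. Every price, the retry overhead, and all names in it are hypothetical placeholders chosen for illustration, not any provider's actual rates.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # average number of hours in a month

# Hypothetical prices in USD; substitute figures from a real price list.
ON_DEMAND_PER_HOUR = 0.10
PREEMPTIBLE_PER_HOUR = 0.03
SERVERLESS_PER_BUSY_HOUR = 0.25  # higher unit price, billed only while running
RETRY_OVERHEAD = 1.15            # assumption: 15% of preempted work is redone


@dataclass
class Workload:
    busy_hours_per_month: float   # hours the workload actually does work
    tolerates_interruption: bool  # e.g. a batch job that snapshots progress


def monthly_options(w: Workload) -> dict:
    """Return the estimated monthly cost of each eligible placement."""
    options = {
        # A dedicated server is billed for every hour, busy or idle.
        "on_demand": ON_DEMAND_PER_HOUR * HOURS_PER_MONTH,
        # Serverless is billed only for the hours actually consumed.
        "serverless": SERVERLESS_PER_BUSY_HOUR * w.busy_hours_per_month,
    }
    if w.tolerates_interruption:
        # Preemptible capacity runs only while busy, plus repeated work.
        options["preemptible"] = (
            PREEMPTIBLE_PER_HOUR * w.busy_hours_per_month * RETRY_OVERHEAD
        )
    return options


def cheapest(w: Workload) -> str:
    """Name the placement with the lowest estimated monthly cost."""
    options = monthly_options(w)
    return min(options, key=options.get)
```

With these placeholder numbers, a lightly used service favors serverless, a nearly always-busy service favors a dedicated server, and an interruptible batch job favors preemptible capacity. The crossover points move as soon as real prices are substituted, which is the point of running the comparison per workload.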
Hybrid cloud strategies can also have an important impact on cost. Hybrid cloud, which uses on-premises capacity along with public cloud resources, should be considered when excess on-premises capacity exists or where public cloud offerings aren’t cost-competitive.
For many businesses, compliance requirements will make a hybrid approach necessary. For others, hybrid cloud deployments are simply the result of a phased migration of workloads to the cloud, which may take many months or years.
The basic promise of the public cloud, the efficient consumption of resources on-demand as an operational expense vs. large capital plus operational expense, isn’t guaranteed to make sense under all circumstances.
Cloud Cost Assessment
If some workloads are already running on the public cloud, the first step is quantifying the costs of existing workloads and services over time as a baseline. Quantifying the cost-baseline is key to getting a detailed profile of consumption and waste beyond simple aggregation of spending. Once this baseline is established, it can serve as a starting point for identifying problem areas and building an understanding of how cost relates to system usage.
It is critical for cost control to correlate current costs to internal teams or projects, both to enable accountability and to identify the low-hanging fruit. This correlation can be very difficult unless teams that deploy cloud workloads assign tags or labels to cloud instances as a matter of general policy.
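As an illustration of why tagging matters, the following Python sketch rolls billing line items up by a tag and measures how much spend cannot be attributed to anyone. The record layout, the `team` tag key, and the sample figures are all hypothetical; real billing exports carry many more columns.

```python
from collections import defaultdict

# Hypothetical billing line items, invented for this example.
line_items = [
    {"resource": "vm-1", "cost": 120.0, "tags": {"team": "search"}},
    {"resource": "vm-2", "cost": 80.0, "tags": {"team": "billing"}},
    {"resource": "disk-7", "cost": 45.5, "tags": {}},
    {"resource": "vm-3", "cost": 30.0, "tags": {"team": "search"}},
]


def cost_by_tag(items, tag_key, untagged_label="(untagged)"):
    """Sum cost per value of tag_key; untagged items share one bucket."""
    totals = defaultdict(float)
    for item in items:
        owner = item["tags"].get(tag_key, untagged_label)
        totals[owner] += item["cost"]
    return dict(totals)


def untagged_share(items, tag_key):
    """Fraction of total spend that carries no value for tag_key."""
    total = sum(item["cost"] for item in items)
    untagged = sum(
        item["cost"] for item in items if tag_key not in item["tags"]
    )
    return untagged / total if total else 0.0
```

Tracking the untagged share over time gives a simple measure of how well the tagging policy is being followed.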
One of the benefits of a high level of cloud automation is the ability to tag workloads transparently so that cost traceability can be achieved consistently. The benefits of cloud workload orchestration in the context of day-to-day operations (CI/CD processes) are discussed later.
Cloud providers offer tools that can assist with cost analysis. For example, AWS has its “Cost Explorer” and its “Cost and Usage Report.” These are particularly useful in combination with AWS cost allocation tagging.
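For readers who want to pull this data programmatically, the Cost Explorer API exposes a `GetCostAndUsage` operation that can group costs by a cost allocation tag. The sketch below assumes a client object with a `get_cost_and_usage` method, such as the one returned by `boto3.client("ce")`. The function name and the `(untagged)` label are illustrative choices, and pagination through `NextPageToken` is deliberately left out for brevity.

```python
from decimal import Decimal


def monthly_cost_by_tag(ce_client, start, end, tag_key):
    """Sum UnblendedCost per tag value from start (inclusive) to end (exclusive).

    start and end are "YYYY-MM-DD" strings. ce_client is any object with a
    Cost Explorer style get_cost_and_usage method.
    """
    response = ce_client.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "TAG", "Key": tag_key}],
    )
    totals = {}
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            # Tag groups are keyed as "<tag key>$<tag value>"; an empty
            # value means the resource carried no such tag.
            _, _, value = group["Keys"][0].partition("$")
            value = value or "(untagged)"
            amount = Decimal(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[value] = totals.get(value, Decimal("0")) + amount
    return totals
```

A call might look like `monthly_cost_by_tag(boto3.client("ce"), "2024-01-01", "2024-03-01", "team")`, where `team` is a hypothetical tag key. The tag has to be activated as a cost allocation tag in the billing console before it appears in these results.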
Azure offers “Cost Management” from the Azure console, which can provide detailed reports. Azure also uses resource tagging to associate cloud resources with accounts (and with other indicators, such as “projects”).
Google Cloud has a similar service. In addition to the native tools, cloud management platform vendors such as Flexera, CloudBolt, CloudApp and others provide cost analysis tools across multiple cloud platforms.
Cloud Cost Control
It is critical to raise awareness in teams that use cloud resources of the cost behavior of their workloads so the impact of design and operational decisions can be understood in context. Teams may be consuming large compute instances, retaining unneeded logs or other data on cloud storage, or not tearing down idle resources.
Even with all the benefits of a multi-cloud strategy, tracking and forecasting the cost of workloads hosted on multiple cloud platforms is a challenge. Add to that the unpredictability of workload scale, which is one of the major benefits of cloud architectures, and the complexity can become overwhelming.
A strategy for dealing with cost control is needed, potentially along with controls that can overlap with modern DevOps practices.
A casual survey of cloud billing models may give the impression that they are all the same, but actual costs can be highly workload-dependent. Use the baseline measurement to identify cost hot spots, then compare public cloud billing models against those hot spots to find significant savings.
Migrating and maintaining services on multiple cloud platforms takes significant effort, which must be justified by a comparable benefit. The costs and benefits are highly workload-dependent. Because of this dependency, any multi-cloud strategy will benefit from a multi-cloud orchestration layer.
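To make the workload dependence concrete, here is a minimal Python sketch that prices the same usage profile against two rate cards. The providers, rates, and usage figures are invented for illustration and do not reflect any real price list.

```python
# Hypothetical per-unit rates in USD for two unnamed providers.
rate_cards = {
    "provider_a": {"compute_hours": 0.096, "storage_gb_months": 0.023, "egress_gb": 0.090},
    "provider_b": {"compute_hours": 0.104, "storage_gb_months": 0.018, "egress_gb": 0.087},
}

# Two hypothetical monthly usage profiles taken from a cost baseline.
storage_heavy = {"compute_hours": 2000, "storage_gb_months": 5000, "egress_gb": 800}
compute_heavy = {"compute_hours": 10000, "storage_gb_months": 100, "egress_gb": 10}


def estimate(usage, rates):
    """Price a usage profile against one provider's rate card."""
    return sum(quantity * rates[item] for item, quantity in usage.items())


def rank_providers(usage, cards):
    """Return (provider, estimated cost) pairs, cheapest first."""
    costs = [(name, estimate(usage, rates)) for name, rates in cards.items()]
    return sorted(costs, key=lambda pair: pair[1])
```

With these made-up rates, `provider_b` is cheaper for the storage-heavy profile and `provider_a` is cheaper for the compute-heavy one, even though the two rate cards look nearly identical at a glance.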
The orchestration layer will provide a degree of portability and make it easier to exploit new cloud providers and changing cost advantages. In addition, discounts provided by cloud providers can provide significant savings for organizations.
Flexera reports that fewer than half of customers, and far fewer in some cases, exploit cloud discounts such as AWS spot instances, Azure low-priority instances, and Google's ad hoc negotiated discounts.
Besides operational automation, the adoption of a multi-cloud orchestrator that integrates with modern DevOps practices can provide cost containment benefits.
An orchestrator with a declarative “infrastructure as code” approach makes templates a reviewable part of the release process. Cost containment policies can be applied to the template during review to effectively deny the deployment of problematic workloads. Labels or tags are then applied automatically for cost tracking.
For example, the attempted use of inappropriate instance types can be denied well before any damage is done. Furthermore, a competent orchestrator will be capable of applying user-specific, group-specific, or even time-specific barriers to workload deployment.
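A review step of this kind can be sketched in a few lines of Python. The template layout, policy fields, group names, and instance type names below are hypothetical and do not correspond to any particular orchestrator's format.

```python
import copy

# Hypothetical cost policy, invented for this example.
POLICY = {
    "allowed_instance_types": {"small", "medium"},
    "max_instances_by_group": {"dev": 4, "prod": 20},
    "deploy_hours": range(8, 18),  # deployments allowed from 08:00 to 17:59
}


def review_template(template, group, hour):
    """Return a list of policy violations; an empty list means approved."""
    violations = []
    if hour not in POLICY["deploy_hours"]:
        violations.append(f"deployments are not allowed at hour {hour}")
    # Groups without an explicit limit may not deploy any instances.
    limit = POLICY["max_instances_by_group"].get(group, 0)
    for res in template["resources"]:
        if res["instance_type"] not in POLICY["allowed_instance_types"]:
            violations.append(
                f"{res['name']}: instance type {res['instance_type']} is not allowed"
            )
        if res["count"] > limit:
            violations.append(
                f"{res['name']}: {res['count']} instances exceeds the limit "
                f"of {limit} for group {group}"
            )
    return violations


def apply_cost_tags(template, group, project):
    """Return a copy of the template with cost-tracking tags on every resource."""
    tagged = copy.deepcopy(template)
    for res in tagged["resources"]:
        res.setdefault("tags", {}).update({"team": group, "project": project})
    return tagged
```

Run as a check in the release pipeline, a non-empty result from `review_template` would block the deployment, while `apply_cost_tags` would run on approved templates so that every resource is traceable without relying on each team to remember its tags.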
In addition, an orchestrator can limit scaling behavior and ensure that complex deployments are completely cleaned up. Complete cleanup is critical to avoid zombie cost sources such as abandoned, unattached storage.
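A cleanup audit can be as simple as the following Python sketch, which scans a resource inventory for leftovers. The inventory format, the `deployment` tag, and the `attached_to` field are assumptions made for this example; a real inventory would come from the orchestrator or the provider's API.

```python
# Hypothetical inventory snapshot taken after deployment "d-42" was torn down.
inventory = [
    {"id": "vm-9", "type": "vm", "tags": {"deployment": "d-77"}},
    {"id": "vol-1", "type": "volume", "attached_to": "vm-9", "tags": {"deployment": "d-77"}},
    {"id": "vol-2", "type": "volume", "attached_to": None, "tags": {"deployment": "d-42"}},
    {"id": "vol-3", "type": "volume", "attached_to": None, "tags": {}},
]


def find_leftovers(resources, deployment_id):
    """Resources that still carry the tag of a deployment already torn down."""
    return [
        r for r in resources if r["tags"].get("deployment") == deployment_id
    ]


def find_unattached_volumes(resources):
    """Storage volumes that are not attached to any instance."""
    return [
        r for r in resources
        if r["type"] == "volume" and r.get("attached_to") is None
    ]
```

Here `find_leftovers(inventory, "d-42")` flags `vol-2`, and `find_unattached_volumes(inventory)` flags both `vol-2` and `vol-3`. The second volume has no tags at all, so only the unattached check catches it.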
The journey to an optimal, cost-efficient multi/hybrid cloud strategy is a complex one. It is important to understand current costs, including those of on-premises workloads. That understanding will be your foundation for advancement and growth, and it will show which of the various platforms provide the tools you require.
Automation will play a key role in standardizing and controlling the approved interactions and workload placement on various platforms and provide a degree of workload portability.
Portability is key because the world of cloud providers never stands still and cloud billing models vary over time, so adaptability is required.
Finally, besides ongoing cost auditing, a practice of manual and automated review of orchestration templates must be in place to avoid unpleasant billing surprises.