Cutting Cloud Costs with Spot Instances on AWS and Azure
What Are Spot Instances?
Spot instances are changing the way people use public cloud services. They are short-term instances offered by cloud providers that cost significantly less than traditional types of instances. Although the price of spot instances depends on supply and demand, users can save up to 90 percent compared to regular on-demand instances.
When Should You Use Spot Instances?
Spot instances are well suited for a variety of applications that can withstand interruption, including:
- Distributed systems such as microservices and cloud-native applications: These types of applications are fault tolerant by design, so even if a spot instance is interrupted, workloads can be automatically transitioned to another instance.
- Scaling out infrastructure already running in on-demand instances: It can be very useful to run the core components of an application, which serve regular baseline traffic, on on-demand instances, which cannot be interrupted. Then, to scale out and serve additional load, the application can use spot instances. Even if they are interrupted, the core components will still be running.
- Batch processing, such as Hadoop data processing: These types of jobs are stateless and can be stopped and started on request. If a spot instance is interrupted, the job can simply be restarted on another instance.
- Continuous integration (CI), continuous delivery (CD), and large-scale DevOps processes: Most DevOps processes, such as build jobs, can easily be restarted if they are interrupted, so they are suitable for running on spot instances.
- High-performance computing (HPC), machine learning, and heavy-duty databases like SAP HANA: These are workloads that have relatively high compute costs and at the same time are commonly deployed in a distributed topology. Running some of the application instances on spot instances can dramatically cut down costs.
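The scale-out pattern from the list above can be sketched in a few lines of Python. This is a minimal illustration, not a provider API: `CapacityPlan`, `plan_capacity`, and `hourly_cost` are hypothetical names, and the prices and discount are placeholder inputs that you would replace with real figures.

```python
from dataclasses import dataclass


@dataclass
class CapacityPlan:
    on_demand: int  # instances that must not be interrupted
    spot: int       # interruptible instances used for scale-out


def plan_capacity(required: int, baseline: int) -> CapacityPlan:
    """Cover up to `baseline` instances with on-demand, the rest with spot."""
    if required < 0 or baseline < 0:
        raise ValueError("instance counts must be non-negative")
    on_demand = min(required, baseline)
    return CapacityPlan(on_demand=on_demand, spot=required - on_demand)


def hourly_cost(plan: CapacityPlan, on_demand_price: float,
                spot_discount: float) -> float:
    """Blended hourly cost; spot_discount is a fraction (0.7 means 70 percent)."""
    if not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0 and 1")
    spot_price = on_demand_price * (1.0 - spot_discount)
    return plan.on_demand * on_demand_price + plan.spot * spot_price
```

For example, `plan_capacity(10, 4)` keeps four instances on-demand and places the remaining six on spot, so an interruption can only affect the scale-out portion.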
AWS Spot Instances
AWS sells off unused capacity in the form of Elastic Compute Cloud (EC2) spot instances. The hourly price of a spot instance is called the spot price. It is lower than the on-demand price for the same instance type, with a variable discount of up to 90 percent.
The spot price for each instance type in each Availability Zone (AZ) is set by Amazon EC2 and is adjusted gradually based on long-term supply and demand. Users make spot instance requests, and spot instances run as soon as capacity is available and the maximum price specified in the request exceeds the current spot price.
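The run condition described above (capacity is available and the user's price exceeds the current price) can be written as a small predicate. This is only a sketch of the rule as stated in the text; `spot_request_can_run` and its parameters are hypothetical names, not part of any AWS SDK.

```python
def spot_request_can_run(capacity_available: bool,
                         user_price: float,
                         current_price: float) -> bool:
    """Return True when a spot request would be fulfilled under the stated rule.

    The strict comparison mirrors the wording "exceeds" in the text.
    """
    return capacity_available and user_price > current_price
```

The same check explains interruptions: once either condition stops holding, the instance becomes a candidate for reclamation.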
Strengths:
- EC2 provides a two-minute notice before a spot instance is interrupted, and in some cases an earlier warning.
- In AWS, spot instances can run indefinitely as long as capacity remains available and the user's price stays above the spot price.
- Amazon provides Spot Fleet, a way to manage groups of on-demand and spot instances automatically.
- Spot Instance Advisor helps identify which regions and instance types are likely to see the fewest interruptions.
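One way to act on the two-minute notice is to poll the instance metadata service from inside the instance. The sketch below assumes the `spot/instance-action` metadata path and the IMDSv2 token flow; the function names are illustrative, and the network call only works on an EC2 instance.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Optional

METADATA = "http://169.254.169.254/latest"


def fetch_instance_action(timeout: float = 1.0) -> Optional[str]:
    """Return the raw interruption notice, or None if none is pending."""
    token_request = urllib.request.Request(
        f"{METADATA}/api/token", method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"})
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()
    request = urllib.request.Request(
        f"{METADATA}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode()
    except urllib.error.HTTPError as error:
        if error.code == 404:  # no interruption is scheduled
            return None
        raise


def seconds_until_interruption(body: Optional[str],
                               now: datetime) -> Optional[float]:
    """Parse a notice such as {"action": "terminate", "time": "...Z"}."""
    if not body or not body.strip():
        return None
    notice = json.loads(body)
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    return (when.replace(tzinfo=timezone.utc) - now).total_seconds()
```

A worker could call `fetch_instance_action()` every few seconds and start draining or checkpointing as soon as `seconds_until_interruption` returns a value.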
Weaknesses:
- Spot instances are not application aware.
- AWS cannot guarantee availability of spot capacity.
Bottom Line:
With appropriate automation and analytics, Amazon lets you combine spot instances with on-demand and reserved instances to run a variety of mission-critical workloads.
Azure Spot Instances
Azure also sells its spare capacity at a discount, calling its spot instances spot virtual machines (spot VMs). They are offered as single VMs or as Virtual Machine Scale Sets (VMSS), which let you request a group of spot VMs. Spot VMs offer the same features as pay-as-you-go virtual machines of the same VM size.
The price of Spot VMs is determined according to available capacity in the Azure region and the VM type (SKU). Azure commits to changing spot prices slowly (even if market prices fluctuate) to maintain stability, so you can better predict and manage your budget.
Strengths:
- Configure spot VMs through Azure Portal, Azure CLI, Azure PowerShell, or Azure Resource Manager templates.
- Relatively fixed price of spot VMs, providing cost predictability.
- Capacity managed at a region level (not at the AZ level like in AWS).
Weaknesses:
- Does not support B-series VMs.
- You cannot convert a spot VM to a regular VM or vice versa.
- Azure does not provide an SLA for spot VMs.
- Only 30 seconds' notice before VMs are evicted.
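Because the eviction notice is so short, it helps to detect it programmatically. The sketch below assumes the notice arrives through the Azure Instance Metadata Service Scheduled Events endpoint as an event of type `Preempt`; the API version and function names are assumptions for illustration, and the network call only works from inside an Azure VM.

```python
import json
import urllib.request
from typing import List

# Assumed endpoint and API version for Scheduled Events.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(timeout: float = 2.0) -> str:
    """Fetch the raw Scheduled Events document from inside the VM."""
    request = urllib.request.Request(
        SCHEDULED_EVENTS_URL, headers={"Metadata": "true"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()


def preempted_resources(body: str) -> List[str]:
    """Return the names of resources that have a pending Preempt event."""
    document = json.loads(body)
    names: List[str] = []
    for event in document.get("Events", []):
        if event.get("EventType") == "Preempt":
            names.extend(event.get("Resources", []))
    return names
```

With roughly 30 seconds available, the handler that reacts to a non-empty result should do as little as possible, for example flushing a checkpoint and deregistering from a load balancer.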
Bottom Line:
If you want to use automated strategies to transfer workloads from terminated spot VMs to other VMs, Azure provides less-mature capabilities. It does not allow you to manage spot VMs in groups (beyond basic VMSS functionality), and it does not let you mix spot VMs with other VM types, a practice that can increase stability.
Remember that in order to take advantage of spot instance discounts, you must have a mechanism in place to restart workloads on a new instance after a spot instance is terminated. This is just like running another instance of the application on a new compute instance. The process of running spot instances and regular instances is exactly the same, except that spot instances can be terminated at short notice.
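A restart mechanism of this kind can be as simple as a loop that records progress and resumes from the last checkpoint. The sketch below simulates that idea in memory; `SpotInterrupted` and `run_with_restarts` are hypothetical names, and a real system would keep the checkpoint in durable storage that outlives the instance.

```python
from typing import Callable, Dict, List


class SpotInterrupted(Exception):
    """Raised by the hypothetical worker when its spot instance is reclaimed."""


def run_with_restarts(items: List[int],
                      process: Callable[[int], None],
                      checkpoint: Dict[str, int],
                      max_restarts: int = 5) -> int:
    """Process all items, resuming from the checkpoint after each interruption.

    Returns the number of restarts that were needed.
    """
    restarts = 0
    while checkpoint["next"] < len(items):
        try:
            for index in range(checkpoint["next"], len(items)):
                process(items[index])
                checkpoint["next"] = index + 1  # record progress per item
        except SpotInterrupted:
            restarts += 1
            if restarts > max_restarts:
                raise
    return restarts
```

Because progress is recorded after every item, an interruption repeats at most the item that was in flight, which is why the work units should be safe to run twice.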
Amazon and Azure both provide spot instances, and each cloud provider’s solution has its pros and cons:
- AWS provides more advance notice before terminating spot instances and better automation capabilities for managing them, including Spot Fleet.
- Azure ensures spot prices are relatively stable and provides capacity at the region level, making it easier to find the instance type you need.
Both AWS and Azure provide tools for leveraging spot instance capacity, but using them well is not easy. The effort is worth it, however, once you achieve the deep discounts that both cloud giants offer on spot capacity.