Cloud providers charge for computing resources using several distinct cost models, each designed for different workload patterns. The three core models are on-demand (pay-as-you-go), reserved instances, and spot pricing. Beyond those, savings plans, serverless billing, and usage-based charges for storage and data transfer round out the pricing landscape. Understanding how each model works helps you match your spending to your actual needs and avoid overpaying.
On-Demand (Pay-as-You-Go)
On-demand pricing is the default model across AWS, Microsoft Azure, and Google Cloud Platform. You pay for compute resources by the second or minute with no upfront commitment and no long-term contract. You can spin up a server, run it for three hours, shut it down, and only pay for those three hours. This flexibility makes on-demand ideal for short-term projects, development environments, and workloads where usage is unpredictable.
The tradeoff is cost. On-demand carries the highest per-unit price of any model because you’re paying for maximum flexibility. If you run a production workload 24/7 on on-demand instances for a full year, you’ll spend significantly more than you would with a commitment-based option. Think of on-demand as the retail price of cloud computing: convenient but expensive at scale.
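To make the "retail price" point concrete, here is a back-of-the-envelope comparison. The $0.10/hour rate is a hypothetical placeholder, not a quoted price from any provider:

```python
# Illustrative arithmetic only: the $0.10/hour rate is a made-up
# placeholder, not any provider's actual price.
hourly_rate = 0.10

# A short-lived dev server: three hours of use, three hours of charges.
dev_session = hourly_rate * 3

# The same instance left running 24/7 for a full year.
always_on_year = hourly_rate * 24 * 365

print(f"3-hour session: ${dev_session:.2f}")
print(f"Always-on year: ${always_on_year:,.2f}")
```

Thirty cents for an afternoon of testing is hard to beat; nearly nine hundred dollars per instance per year for an always-on workload is where commitment-based pricing starts to matter.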
Reserved Instances
Reserved pricing lets you lock in compute capacity for one or three years in exchange for a steep discount. AWS offers up to 75% off on-demand prices for reserved instances. Azure reservations can save up to 72%, and Google Cloud’s committed use discounts (CUDs) reduce costs by up to 57%. The longer the commitment, the deeper the discount.
The catch is inflexibility. When you reserve an instance, you’re committing to a specific type of virtual machine, often in a specific region. If your needs change midway through a three-year term, you may end up paying for capacity you no longer use. Reserved instances work best for predictable, steady-state workloads: a database server that runs around the clock, a production application with consistent traffic, or a backend service that doesn’t fluctuate much.
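The break-even logic can be sketched with hypothetical numbers. A reservation bills for every hour of its term whether you use it or not, so the discount only pays off above a certain utilization. Assuming an illustrative $0.10/hour on-demand rate and a 72% discount:

```python
# Hypothetical comparison of on-demand vs. a reserved instance at a
# 72% discount. The rates are illustrative, not quoted prices.
on_demand_hourly = 0.10
reserved_discount = 0.72
hours_per_year = 24 * 365

def annual_cost(utilization: float) -> tuple[float, float]:
    """Annual cost of covering `utilization` (0..1) of the year's hours."""
    on_demand = on_demand_hourly * hours_per_year * utilization
    # A reservation bills for every hour of the term, used or not.
    reserved = on_demand_hourly * (1 - reserved_discount) * hours_per_year
    return on_demand, reserved

for util in (1.0, 0.5, 0.25):
    od, rsv = annual_cost(util)
    cheaper = "reserved" if rsv < od else "on-demand"
    print(f"{util:>4.0%} utilization: on-demand ${od:,.0f} vs reserved ${rsv:,.0f} -> {cheaper}")
```

At these assumed rates the reservation wins any time the instance runs more than roughly 28% of the year, which is why steady-state workloads are such clear candidates and sporadic ones are not.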
Savings Plans
Savings plans sit between on-demand and reserved instances. Instead of committing to a specific machine type in a specific region, you commit to spending a certain dollar amount per hour on eligible compute services. Azure and AWS both offer this model for one- or three-year terms.
The key difference from reservations is flexibility. A savings plan applies your committed spend across regions and instance types automatically, so if you shift workloads from one machine size to another or move between regions, you still get the discount up to your hourly commitment. When your usage exceeds that commitment, the overage is billed at on-demand rates. For organizations running diverse workloads that change over time, savings plans offer deep discounts without locking you into a single configuration.
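The hourly mechanics can be sketched as follows. This is a simplified model of how a spend commitment applies, not any provider's exact billing algorithm:

```python
# Simplified sketch of an hourly spend commitment: the commitment absorbs
# eligible usage at a discounted rate, and anything beyond it bills at
# on-demand rates. Discount and rates are illustrative assumptions.
def hourly_charge(usage_on_demand_value: float,
                  commitment: float,
                  discount: float) -> float:
    """usage_on_demand_value: this hour's eligible usage priced at on-demand rates.
    commitment: committed $/hour, billed in full whether used or not.
    discount: fractional savings-plan discount, e.g. 0.30 for 30%."""
    # $1 of commitment covers 1 / (1 - discount) dollars of on-demand value.
    covered_value = commitment / (1 - discount)
    overage = max(0.0, usage_on_demand_value - covered_value)
    # You always pay the full commitment; overage bills at on-demand rates.
    return commitment + overage

# Under-using the commitment still costs the full committed amount:
print(hourly_charge(1.00, commitment=1.00, discount=0.30))
# Heavy hours pay the commitment plus on-demand overage:
print(round(hourly_charge(2.00, commitment=1.00, discount=0.30), 2))
```

The asymmetry is the thing to size for: an unused commitment is pure waste, while overage simply reverts to on-demand rates, so most organizations commit somewhat below their observed baseline.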
Spot Pricing
Spot instances let you use a cloud provider’s surplus capacity at dramatic discounts. AWS spot instances can cost up to 90% less than on-demand, Azure spot pricing tops out at a similar 90%, and Google Cloud’s Spot VMs save between 60% and 91% compared to on-demand prices.
The risk is interruption. The provider can reclaim spot capacity at any time when demand from other customers rises. Your workload gets a short warning (typically two minutes on AWS) and then shuts down. This makes spot instances a poor fit for anything that needs to run uninterrupted, like a customer-facing web application or a primary database. They’re a great fit for workloads that can tolerate being stopped and restarted: batch data processing, continuous integration pipelines, rendering jobs, or large-scale simulations where work can be checkpointed and resumed.
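The checkpoint-and-resume pattern mentioned above can be as simple as persisting an index after each unit of work. The file name and job shape here are illustrative; a real spot workload would also watch the provider's interruption notice so it can flush state before the shutdown deadline:

```python
# Minimal checkpoint-and-resume sketch for interruption-tolerant batch
# work. The checkpoint file and job structure are hypothetical.
import json
import os

CHECKPOINT = "progress.json"  # hypothetical checkpoint file

def load_progress() -> int:
    """Return the index of the next unprocessed item (0 on a fresh start)."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)["next_item"]
    return 0

def save_progress(next_item: int) -> None:
    with open(CHECKPOINT, "w") as f:
        json.dump({"next_item": next_item}, f)

def process(item: str) -> None:
    pass  # placeholder for the real unit of work

def run_batch(items: list[str]) -> None:
    start = load_progress()          # resume where the last run stopped
    for i in range(start, len(items)):
        process(items[i])
        save_progress(i + 1)         # after each item, safe to be killed
```

If the instance is reclaimed mid-run, the replacement instance picks up from the last saved index instead of starting over, which is what makes the steep spot discount usable in practice.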
Serverless and Per-Request Billing
Serverless platforms like AWS Lambda, Azure Functions, and Google Cloud Functions use an entirely different billing approach. Instead of paying for a virtual machine that runs continuously, you pay per invocation and per unit of compute time. The billing unit is typically measured in milliseconds of execution time multiplied by the amount of memory your function uses.
When no requests come in, your cost drops to zero. When traffic spikes, the platform scales automatically and you pay for exactly what runs. This model eliminates the problem of idle resources entirely, which makes it attractive for event-driven workloads, APIs with variable traffic, and scheduled tasks that only run a few times per day. The per-request cost is tiny in isolation (fractions of a cent), but it adds up quickly at high volumes, so serverless isn’t always the cheapest option for sustained, high-throughput processing.
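The GB-second arithmetic looks like this. The rates below are placeholders in a plausible order of magnitude, not any provider's current price sheet:

```python
# Illustrative serverless cost math; both rates are assumed placeholders.
PRICE_PER_GB_SECOND = 0.0000167
PRICE_PER_MILLION_REQUESTS = 0.20

def monthly_cost(invocations: int, duration_ms: float, memory_mb: int) -> float:
    # Billing unit: execution seconds multiplied by memory in GB.
    gb_seconds = invocations * (duration_ms / 1000) * (memory_mb / 1024)
    compute = gb_seconds * PRICE_PER_GB_SECOND
    requests = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return compute + requests

# 2 million requests/month, 120 ms each, 256 MB of memory:
cost = monthly_cost(2_000_000, 120, 256)
print(f"${cost:.2f}/month")
```

A couple of million short invocations costs single-digit dollars at these assumed rates, but scale the same function to billions of invocations or long durations and the per-millisecond charges overtake an always-on instance.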
Storage and Data Transfer Costs
Compute pricing gets the most attention, but storage and data transfer charges are where many cloud bills grow unexpectedly. Cloud providers typically charge separately for three things: storing data, reading and writing data (API calls), and moving data out of their network.
Data egress, the cost of transferring data out of a cloud provider’s network, is the one that catches organizations off guard. Uploading data into the cloud is usually free, but downloading it or sending it to users costs money per gigabyte. The major providers charge varying rates depending on volume and destination, and those charges can become substantial for applications that serve large files, stream media, or transfer data between cloud regions. Some smaller providers, like Backblaze, offer generous free egress tiers or unlimited free egress through CDN partners, specifically to compete on this front.
Storage itself is priced per gigabyte per month, with different tiers based on how frequently you access the data. “Hot” storage for frequently accessed files costs more per gigabyte than “cold” or archival storage for data you rarely touch. Choosing the right tier matters: storing infrequently accessed backups in a hot storage class can quietly inflate your bill, while putting frequently needed data in a cold tier adds retrieval fees every time you access it.
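A quick tier comparison shows how the access pattern decides the winner. All rates here are illustrative placeholders, not any provider's published prices:

```python
# Hypothetical tiers: hot storage with free reads vs. a cheaper cold tier
# that charges a per-GB retrieval fee. All rates are illustrative.
def monthly_storage_cost(gb_stored: float, per_gb_month: float,
                         gb_retrieved: float, retrieval_per_gb: float) -> float:
    return gb_stored * per_gb_month + gb_retrieved * retrieval_per_gb

backups_gb = 1_000

# Rarely touched backups: retrieve ~10 GB a month.
hot = monthly_storage_cost(backups_gb, 0.023, 10, 0.00)
cold = monthly_storage_cost(backups_gb, 0.004, 10, 0.02)
print(f"Hot tier:  ${hot:.2f}/month")
print(f"Cold tier: ${cold:.2f}/month")

# Flip the access pattern (retrieve the full terabyte monthly) and
# retrieval fees erase the cold tier's storage savings:
cold_heavy = monthly_storage_cost(backups_gb, 0.004, backups_gb, 0.02)
print(f"Cold tier, heavy reads: ${cold_heavy:.2f}/month")
```

The same terabyte is several times cheaper in the cold tier when it sits untouched, and more expensive than hot storage once it is read in full every month.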
How Costs Add Up in Practice
Your total cloud bill is rarely just one pricing model. A typical setup might combine reserved instances for your always-on database, on-demand instances for your application servers during normal hours, spot instances for nightly batch jobs, and serverless functions for handling webhooks. On top of that, you pay for storage, data transfer, load balancers, DNS queries, logging, and monitoring.
This complexity is why many organizations adopt a FinOps approach to cloud spending. FinOps, short for cloud financial operations, is a practice built around a few core ideas: engineering teams take ownership of their cloud costs, spending decisions are driven by business value rather than technical defaults, and cost data is visible and accessible to everyone involved. The FinOps Foundation, which maintains the framework used by many large companies, emphasizes that managing cloud costs is an ongoing discipline rather than a one-time optimization.
Choosing the Right Model
The right cost model depends on your workload’s behavior. Start by categorizing what you’re running:
- Steady, predictable workloads that run 24/7 with consistent resource needs are the best candidates for reserved instances or savings plans. The upfront commitment pays for itself within a few months.
- Variable or short-lived workloads like development environments, testing servers, or seasonal applications fit on-demand pricing. You pay more per hour but nothing when the resources are off.
- Fault-tolerant batch workloads that can handle interruptions should use spot instances. The savings are enormous if you architect your jobs to checkpoint progress and resume gracefully.
- Event-driven or low-traffic workloads with unpredictable spikes are natural fits for serverless. You avoid paying for idle time entirely.
Most organizations use a blend. The goal is to push as much of your baseline, always-on usage into commitment-based pricing as possible, handle variable demand with on-demand or serverless, and take advantage of spot pricing wherever your architecture allows it. Reviewing your cloud bill monthly and right-sizing instances (switching to smaller machines when utilization is low) can trim costs further without changing models at all.
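Putting it together, a blended monthly bill might be estimated like the sketch below. Every rate and usage figure here is hypothetical; the point is the shape of the bill, not the numbers:

```python
# Rough blended-bill sketch combining the models discussed above.
# All rates and usage figures are hypothetical placeholders.
line_items = {
    # name: (unit rate in $, units consumed this month)
    "reserved DB instance":    (0.028, 730),      # discounted hourly x hours
    "on-demand app servers":   (0.10, 2 * 300),   # 2 instances, business hours
    "spot batch jobs":         (0.01, 120),       # nightly runs
    "serverless webhooks":     (1.40, 1),         # flat monthly estimate
    "storage (hot, GB-month)": (0.023, 500),
    "egress (GB)":             (0.09, 200),
}

total = sum(rate * units for rate, units in line_items.values())
for name, (rate, units) in line_items.items():
    print(f"{name:<26} ${rate * units:>8.2f}")
print(f"{'total':<26} ${total:>8.2f}")
```

Even in this toy estimate, the always-on database is cheap because it is reserved, the variable capacity runs on-demand, and the interruptible work rides spot, which is exactly the blend the categorization above is meant to produce.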

