The Azure pricing calculator is a useful tool. It is also systematically misleading about total cost if you do not know where to look. The per-service prices are reliable: compute, storage capacity, managed service tiers. The gap between estimated and actual spend appears in egress, transactions, and operations at scale. None of this is hidden; it is all documented. But it is easy to miss during architecture and expensive to discover in production. What follows are the four areas that consistently catch architects off guard.
Bandwidth Egress — The Cost Nobody Models
Azure charges for data leaving a region. Data entering a region is free. The first 100 GB leaving per month is free. After that, pricing is tiered — approximately £0.07/GB for the next 9.9 TB, dropping at higher volumes. The exact rates vary by region pair, but the direction is consistent: outbound data transfer costs money, and it compounds quickly at scale.
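To make the tiering concrete, here is a minimal sketch of the calculation. Only the first-tier figure (roughly £0.07/GB after the 100 GB free grant) comes from the rates discussed above; the higher-tier boundaries and rates are placeholders, so substitute the current numbers for your region from the bandwidth pricing page.

```python
# Sketch of tiered egress pricing. Only the first-tier rate reflects the
# approximate figure above; the remaining tiers are illustrative placeholders.
FREE_GB = 100
TIERS = [
    (10_000, 0.07),        # next ~10 TB after the free grant, ~£0.07/GB
    (40_000, 0.065),       # placeholder rate for the following tier
    (float("inf"), 0.06),  # placeholder rate beyond that
]

def monthly_egress_cost(gb_out: float) -> float:
    """Approximate monthly charge (GBP) for data leaving a region."""
    remaining = max(gb_out - FREE_GB, 0)
    cost = 0.0
    for tier_size, rate in TIERS:
        in_tier = min(remaining, tier_size)
        cost += in_tier * rate
        remaining -= in_tier
        if remaining <= 0:
            break
    return cost

print(f"£{monthly_egress_cost(5_000):,.0f}/month for 5 TB of egress")
```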
The trap is that egress costs are invisible in the Azure pricing calculator unless you explicitly add a bandwidth line item. Most architecture estimates built in the calculator omit it entirely, because the calculator does not prompt you for it when you add compute or storage services. The services look complete. The cost estimate is not.
The places where egress compounds badly are predictable once you know to look for them. Multi-region architectures with cross-region service calls (an API tier in UK South calling a database or cache in West Europe) generate egress on every call. The per-call data volume looks trivial; at production call volumes, it is not. Log analytics pipelines that ship data to a centralised workspace in a different region from the one where the sources are deployed can generate sustained egress at a volume nobody priced. Content delivery without a CDN in front means every user request for static assets or API responses generates egress charges that a CDN would have absorbed at its own, usually lower, transfer pricing.
The modelling discipline is straightforward: for any cross-region architecture, estimate data volume per region boundary per month before finalising the topology. A rough estimate based on expected call volume and average payload size is sufficient to determine whether the topology is cost-viable. Discovering that a multi-region design generates five figures of monthly egress after the architecture is committed and the deployment is in production is a much harder conversation than adjusting the topology on paper.
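As a sketch of what that estimate looks like in practice: the figures below (calls per day, payload size) are hypothetical, and the rate is a flat approximation of the first egress tier. The point is the order of magnitude, not a precise bill.

```python
# Hypothetical cross-region workload; substitute your own figures.
CALLS_PER_DAY = 50_000_000   # API tier in one region calling a cache in another
AVG_PAYLOAD_KB = 25          # average data volume crossing the boundary per call
RATE_PER_GB = 0.07           # flat approximation of the first egress tier (GBP)

monthly_gb = CALLS_PER_DAY * 30 * AVG_PAYLOAD_KB / (1024 * 1024)  # KB -> GB
monthly_cost = monthly_gb * RATE_PER_GB

print(f"~{monthly_gb:,.0f} GB/month across the boundary, ~£{monthly_cost:,.0f}/month in egress")
```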
[DAN: Add a specific case where egress costs came in significantly higher than the architecture estimate — the magnitude matters here. If you have an example where egress added 30% or more to the monthly bill that was not in the original estimate, that is the concrete illustration this section needs. The cross-region service call pattern or a log pipeline scenario would be the most relatable for architects who have built similar things.]
Azure SQL — DTU vs vCore and When the Expensive Tier Is Cheaper
Azure SQL Database offers two purchasing models with meaningfully different cost structures at different scales.
The DTU model bundles compute, memory, and I/O into a single unit. It is simple to reason about at low scale and cheap to start with. A Basic or Standard tier database costs very little and is appropriate for development databases, small applications, and workloads with predictable and modest resource requirements. The simplicity of the DTU model is genuine — you pick a tier, you get a predictable cost, you do not need to think about resource allocation.
The vCore model separates compute from storage and unlocks capabilities the DTU model does not support: Hyperscale, Azure Hybrid Benefit for bringing existing SQL Server licences, and independent scaling of compute and storage. It also offers zone redundancy across its production tiers in regions with availability zones, whereas in the DTU model zone redundancy is limited to the Premium tier; Basic and Standard databases cannot be made zone redundant. For production databases that need zone redundancy, which most should, the realistic choice is Premium or a vCore tier.
The common mistake is staying on DTU past the right migration point. At Premium P4 (500 DTUs) and above, vCore is almost always more cost-effective for equivalent performance, particularly once Azure Hybrid Benefit is applied, and it adds the independent scaling and Hyperscale options that Premium lacks. The Premium tier is also where DTU pricing starts to look expensive for what you get: you are paying for bundled resources without the ability to right-size compute and storage independently.
[DAN: Add your rule of thumb for when you recommend the DTU-to-vCore migration. If you have a specific crossover point from your own experience — a DTU tier or monthly cost threshold where you start the vCore conversation — that is more useful than the general principle. Something like “at P2 I start the conversation, at P4 I make the recommendation firmly” gives readers a concrete trigger rather than a gradient.]
The Elastic Pool misconception is worth addressing separately. Elastic Pools allow multiple databases to share a pool of DTUs or vCores, which can be cost-effective when the databases have non-overlapping peak patterns — one database peaks during business hours, another during a nightly batch, and together they stay within the pool’s resource ceiling. The mistake is treating Elastic Pools as a general cost-reduction tool for any group of databases. When all databases in a pool peak simultaneously — which is common in multi-tenant architectures where all tenants share the same usage patterns — the pool provides no resource-sharing benefit and you are paying for pool overhead on top of the compute you need. Match the tool to the actual access pattern, not to the general principle.
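One way to test the access-pattern question before committing to a pool is to compare the sum of each database's individual peak against the peak of their combined demand. The hourly profiles below are hypothetical; the shape of the comparison is what matters.

```python
# Hypothetical hourly DTU demand for three databases over a day.
# db_a peaks in business hours, db_b in a nightly batch, db_c is flat.
db_a = [50] * 8 + [400] * 10 + [50] * 6
db_b = [300] * 4 + [20] * 16 + [300] * 4
db_c = [60] * 24

profiles = [db_a, db_b, db_c]

sum_of_peaks = sum(max(p) for p in profiles)      # sizing each database on its own
peak_of_sums = max(map(sum, zip(*profiles)))      # sizing a shared pool

print(f"individual sizing: {sum_of_peaks} DTUs in total")
print(f"shared pool sizing: {peak_of_sums} eDTUs at the busiest hour")
# A pool only pays off when peak_of_sums sits well below sum_of_peaks.
# When every database peaks together, the two numbers converge and the
# pool's per-unit premium makes it the more expensive option.
```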
Storage Transaction Costs — The Invisible Scaling Problem
Azure Blob Storage pricing has two components: capacity (per GB stored) and operations (per transaction). The capacity component is what most estimates capture. The operations component is what surprises teams at scale.
Operations pricing varies by operation type, access tier, and redundancy option. Write operations cost more than read operations. List operations (iterating through containers) are charged as a separate operation class. Deletes are free, although deleting data from the cool or archive tiers before the minimum retention period triggers an early-deletion charge. On LRS (locally redundant storage), these costs are modest per operation. On GRS (geo-redundant storage), every write operation is replicated to the secondary region, which increases per-operation costs. GRS is the right choice for data durability requirements that need geographic redundancy; it is not the right choice for high-frequency operational data where the per-operation cost increase is not justified by the durability requirement.
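Below is a sketch of how the operations side of the bill accumulates, priced per 10,000 operations the way the Blob Storage pricing page expresses it. The rates are illustrative placeholders, not quoted prices; they vary by access tier and redundancy option.

```python
# Illustrative per-10,000-operation rates (GBP); check current regional pricing.
RATES_PER_10K = {"write": 0.05, "read": 0.004, "list": 0.05}

SECONDS_PER_MONTH = 60 * 60 * 24 * 30

def monthly_operation_cost(writes_per_sec: float, reads_per_sec: float,
                           lists_per_sec: float) -> float:
    """Approximate monthly spend on blob operations, ignoring capacity entirely."""
    counts = {
        "write": writes_per_sec * SECONDS_PER_MONTH,
        "read": reads_per_sec * SECONDS_PER_MONTH,
        "list": lists_per_sec * SECONDS_PER_MONTH,
    }
    return sum(counts[op] / 10_000 * RATES_PER_10K[op] for op in counts)

# An event-staging pattern writing 200 small blobs per second, reading each
# back once, and listing the container once a second:
print(f"£{monthly_operation_cost(200, 200, 1):,.0f}/month in operations alone")
```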
The failure mode is architectures that treat blob storage as a high-frequency cache or a message queue. Blob storage is not designed for high-frequency small operations. Writing thousands of small state files per hour, reading individual records from large files, or using blob storage as an event staging area generates transaction costs that accumulate quickly and were not in the original estimate. The right tools for those access patterns — Redis Cache, Service Bus, Event Hubs — have their own pricing, but that pricing is designed for the access pattern they serve.
ADLS Gen2 uses a different operation pricing model from standard Blob Storage, and the difference matters for big data workloads. ADLS Gen2 charges differently for hierarchical namespace operations — reads, writes, and listings in a directory structure rather than a flat container. For workloads generating millions of small file operations (a common pattern in Spark or Databricks pipelines that create many small part files), the ADLS Gen2 pricing model can be meaningfully more or less favourable than standard Blob depending on whether the access pattern aligns with how ADLS Gen2 prices operations.
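To see why small part files dominate the bill, count the operations a run generates rather than the bytes it moves. The figures below are hypothetical, and the assumption that a listing returns results in pages of up to 5,000 entries is approximate; the exact operation counts depend on the writer and committer in use.

```python
import math

# Hypothetical pipeline: each run writes many small part files, and a
# downstream job lists the directory tree and reads every file back.
PART_FILES_PER_RUN = 200_000
RUNS_PER_DAY = 24
LIST_PAGE_SIZE = 5_000   # assumed entries per billed list call

writes_per_day = PART_FILES_PER_RUN * RUNS_PER_DAY   # at least one write op per file
reads_per_day = PART_FILES_PER_RUN * RUNS_PER_DAY    # at least one read op per file
lists_per_day = math.ceil(PART_FILES_PER_RUN / LIST_PAGE_SIZE) * RUNS_PER_DAY

total = writes_per_day + reads_per_day + lists_per_day
print(f"~{total:,} billed operations per day before any retries")
# Compacting output into fewer, larger files reduces every one of these
# counts roughly in proportion to the drop in file count.
```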
[DAN: Add a specific storage access pattern that surprised you in production — the frequency of operations, the operation type (list operations are a particularly common culprit), and the resulting cost that was not in the original estimate. A concrete example like “we were listing a container with 2 million objects hourly and the list operation cost alone was X per month” is the kind of specific that makes this section actionable for architects who are reviewing their own designs.]
The Free Tier Trap
Several Azure services have free tiers that are genuinely free at low scale and non-linearly priced at production scale. The free tier shapes the architecture decision early in development, when production volumes are not yet known, and the pricing cliff only becomes visible when the architecture is already committed.
Azure Functions on the consumption plan gives 1 million executions per month free and 400,000 GB-seconds of execution time free. At development and early production scale, the consumption plan is essentially free. At production volumes — a few million executions per day — the cost calculation changes. The consumption plan then often costs more per month than an equivalent Premium plan or App Service hosting, while also introducing cold-start latency that requires additional engineering effort to mitigate. The break-even between consumption and committed pricing is almost always lower than architects estimate when they first make the hosting decision.
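A sketch of that break-even, using the consumption plan's published structure (a per-execution charge plus GB-seconds of execution time, after the free grant) with placeholder rates, and a placeholder monthly price standing in for a small Premium plan instance; swap in current regional figures before relying on the result.

```python
# Placeholder rates (GBP); the structure is real, the numbers are illustrative.
PRICE_PER_MILLION_EXECS = 0.16
PRICE_PER_GB_SECOND = 0.000013
FREE_EXECS = 1_000_000
FREE_GB_SECONDS = 400_000
PREMIUM_PLAN_MONTHLY = 300.0   # placeholder for one small Premium plan instance

def consumption_monthly(execs_per_day: float, avg_duration_s: float, mem_gb: float) -> float:
    """Approximate monthly consumption-plan cost for a steady workload."""
    execs = execs_per_day * 30
    gb_seconds = execs * avg_duration_s * mem_gb
    exec_cost = max(execs - FREE_EXECS, 0) / 1_000_000 * PRICE_PER_MILLION_EXECS
    duration_cost = max(gb_seconds - FREE_GB_SECONDS, 0) * PRICE_PER_GB_SECOND
    return exec_cost + duration_cost

# 5 million executions/day, 400 ms average duration, 0.5 GB memory:
cost = consumption_monthly(5_000_000, 0.4, 0.5)
print(f"consumption: £{cost:,.0f}/month vs premium plan: £{PREMIUM_PLAN_MONTHLY:,.0f}/month")
```

With these placeholder figures the consumption plan already exceeds the Premium price at that volume, before counting the engineering effort spent mitigating cold starts; the exact crossover depends on the real rates and the plan size, but that is the shape of the comparison the hosting decision needs.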
Azure AI Services (Cognitive Services, OpenAI, Vision, Speech) are priced per call. Per-call pricing of £0.001 looks negligible when you are testing against a development endpoint making a few hundred calls per day. At a production volume of 1 million calls per month, that is £1,000 per month (£12,000 per year) that was not in the architecture estimate, because nobody modelled the call volume before committing to the pricing tier. The numbers are not hidden; they are on the pricing page. The error is not reading the pricing page with production call volume in mind.
The modelling discipline for any pay-per-use service is the same: estimate production call volume before choosing the pricing tier. For Azure Functions, calculate whether production execution volume and duration exceed the Premium plan cost, and factor in the operational value of eliminating cold starts. For AI Services, model the per-call cost at peak and average production volumes and compare that against Standard tier committed pricing or provisioned throughput options where they exist.
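For the AI services side, the model is a single multiplication, which is exactly why it is so often skipped. The call volumes below are hypothetical; the per-call rate is the £0.001 figure from the example above.

```python
PRICE_PER_CALL = 0.001   # per-call rate from the example above (GBP)

def monthly_cost(calls_per_day: float) -> float:
    return calls_per_day * 30 * PRICE_PER_CALL

average_calls_per_day = 33_000    # roughly 1 million calls per month
peak_calls_per_day = 120_000      # hypothetical seasonal peak

print(f"average month: £{monthly_cost(average_calls_per_day):,.0f}")
print(f"peak month:    £{monthly_cost(peak_calls_per_day):,.0f}")
# Compare both figures against committed-tier or provisioned-throughput
# pricing before the call volume is designed into the architecture.
```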
The break-even between consumption pricing and committed or tier-based pricing is consistently lower than architects expect. The free tier anchors the perception of cost during development. The production bill is calculated against a different set of numbers.
The Gap Between Estimate and Bill
The Azure pricing calculator is a starting point. The bill is the reality check. The gap between them is almost always in the same places: egress that was not modelled, transaction costs that were not counted, or a pricing tier that made sense at development scale and is the wrong choice at production scale.
The correction is not more complex tooling. It is modelling discipline applied at the right point in the architecture process — before the design is committed, not after the first production bill arrives. For any cross-region design, estimate the egress. For any Azure SQL database approaching Premium tier, evaluate whether vCore is cheaper and whether zone redundancy is needed. For any storage-intensive workload, count the operations, not just the capacity. For any pay-per-use service, calculate the break-even before choosing the consumption tier.
These are not edge cases. They are standard patterns in production Azure architectures, and they have standard solutions. The variable is whether the architecture team knows to look for them before the infrastructure is running.