Most Azure cost conversations start in the wrong room. Finance flags the bill. Someone pulls a report. Eyes scan down a list of resource names that mean nothing to anyone without an architecture background. A few VMs get resized. The bill drops 4%. Everyone calls it a win.
It isn’t a win. It’s noise reduction on a structural problem.
In every organisation I’ve worked in where Azure spend was out of control, the root cause was never “we bought too many VMs.” It was a series of architecture decisions — made months or years earlier, often under pressure — that nobody went back to question. This post is about those decisions: what they look like, why they happen, and how to find them before they find you.
1. The Over-Provisioning Trap
The most common architecture decision that destroys Azure budgets isn’t a bad decision at the time — it’s a decision that was never revisited.
A team spins up a D8s_v3 VM during a proof of concept because they need headroom to experiment. The PoC becomes production. The VM stays a D8s_v3. Eighteen months later, Azure Monitor shows it running at an average of 9% CPU utilisation.
What to look for:
- VMs and App Service Plans consistently below 20% CPU and memory utilisation over a 30-day window.
- Azure SQL databases on Premium or Business Critical tiers where the consumption data doesn’t justify it.
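Once you have exported 30-day utilisation averages (for example, from Azure Monitor metrics), the filter itself is trivial. A minimal sketch — the record fields, threshold, and sample data below are illustrative, not an Azure API:

```python
from dataclasses import dataclass

UTIL_THRESHOLD = 20.0  # percent; flag anything averaging below this over 30 days


@dataclass
class ResourceUsage:
    name: str
    sku: str
    avg_cpu_pct: float  # 30-day average CPU utilisation
    avg_mem_pct: float  # 30-day average memory utilisation


def find_rightsizing_candidates(resources):
    """Return resources whose CPU *and* memory both sit below the threshold."""
    return [
        r for r in resources
        if r.avg_cpu_pct < UTIL_THRESHOLD and r.avg_mem_pct < UTIL_THRESHOLD
    ]


# Hypothetical fleet: the 9%-CPU PoC machine from the story above gets flagged.
fleet = [
    ResourceUsage("poc-api-vm", "D8s_v3", avg_cpu_pct=9.0, avg_mem_pct=14.0),
    ResourceUsage("batch-worker", "D4s_v3", avg_cpu_pct=61.0, avg_mem_pct=48.0),
]
for r in find_rightsizing_candidates(fleet):
    print(f"{r.name} ({r.sku}) is a right-sizing candidate")
```

The point of requiring both CPU and memory to be low is to avoid flagging memory-bound workloads that look idle on CPU alone.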
The Dan Perspective: While “quick wins” always depend on the scale of the environment, the single biggest lever I’ve found is switching to Linux for App Services. Many teams default to Windows out of habit, but moving to Linux eliminates the licensing overhead and can reduce your compute cost by 50% instantly. If your code can run on Linux, staying on Windows is just paying a voluntary tax to Microsoft.
2. Synchronous by Default
This one costs more than over-provisioned VMs and is harder to spot because it doesn’t show up obviously in a cost report. When teams design services to call each other synchronously, both services must be running, and sized for the simultaneous peak, all the time.
Async patterns — queues, event grids, service buses — decouple the services. Neither needs to be sized for the simultaneous peak. The cost difference between a synchronous and asynchronous architecture for the same logical workload is often 30–50% on the compute line, but it appears as a legitimate “App Service Plan” charge rather than “architectural waste.”
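The sizing arithmetic behind that difference is worth making explicit. A simplified sketch with made-up traffic numbers: synchronously, the downstream service must absorb the caller’s peak directly; with a queue in between, it only has to keep pace with average throughput while the backlog drains:

```python
def instances_needed(rps, per_instance_rps):
    """Ceiling division: how many instances to serve a given request rate."""
    return -(-rps // per_instance_rps)


PEAK_RPS = 400      # short-lived traffic spike (illustrative)
AVG_RPS = 120       # sustained daily average (illustrative)
PER_INSTANCE = 50   # requests/sec one instance can serve (illustrative)

# Synchronous: downstream must be sized for the caller's peak, all the time.
sync_downstream = instances_needed(PEAK_RPS, PER_INSTANCE)

# Asynchronous: a queue absorbs the spike, so the downstream worker pool
# only needs to keep up with average throughput; the backlog drains later.
async_downstream = instances_needed(AVG_RPS, PER_INSTANCE)

print(sync_downstream, async_downstream)  # 8 vs 3 downstream instances
```

The trade, of course, is latency: the async design is only valid where the workload tolerates the queue delay, which is exactly the question the architecture review should ask.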
3. The Region Decision Nobody Revisits
Azure pricing varies by region. UK South and UK West are not interchangeable from a cost perspective.
Three areas where region decisions consistently waste budget:
- Non-production in premium regions: If production must be in UK South for compliance, your Dev environment doesn’t have to be. Moving non-prod to a cheaper paired region is a zero-impact win.
- DR secondaries priced as active replicas: For many workloads, a warm or cold secondary satisfies RTO/RPO requirements at a fraction of the cost of a hot replica.
- Log Analytics location: Egress costs for shipping logs across regions are often lower than the sustained cost difference of premium region storage pricing.
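A back-of-envelope comparison is enough to justify the non-prod move. The hourly rates below are placeholders — look up current figures in the Azure pricing calculator; the point is the ratio between paired regions, not these numbers:

```python
# Placeholder hourly VM rates, NOT current Azure prices.
HOURLY_RATE = {"uksouth": 0.46, "ukwest": 0.41}
HOURS_PER_MONTH = 730  # Azure's standard monthly hour count


def monthly_cost(region, vm_count):
    """Monthly compute cost for a fleet of identical VMs in one region."""
    return HOURLY_RATE[region] * HOURS_PER_MONTH * vm_count


dev_vms = 12  # hypothetical dev/test fleet
saving = monthly_cost("uksouth", dev_vms) - monthly_cost("ukwest", dev_vms)
print(f"Moving {dev_vms} dev VMs out of uksouth saves ~£{saving:.0f}/month")
```

Even a few pence per hour compounds: a small percentage difference across a whole non-prod estate, running every hour of every month, adds up to real money.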
4. Reserved Instances as an Afterthought
Azure Reserved Instances and Savings Plans can reduce compute costs by 40–72%. Most organisations treat these as something to think about “once we’ve stabilised.”
The architecture never stabilises. The result is that mature, baseline workloads run on pay-as-you-go pricing for years. The architectural discipline here is maintaining an inventory of stable baseline, variable but bounded, and genuinely unpredictable workloads, then reviewing that inventory quarterly.
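A useful framing for that quarterly review is the break-even utilisation: under a simplified model where a reservation costs (1 − discount) of the equivalent full-time pay-as-you-go price, any workload running more than that fraction of hours is cheaper reserved. A sketch, not Azure’s actual billing maths:

```python
def ri_breakeven_utilisation(discount):
    """Fraction of hours a workload must run before reserving beats PAYG.

    Simplified model: reservation cost = (1 - discount) * full-time PAYG
    cost, so reserving wins once actual utilisation exceeds (1 - discount).
    """
    return 1.0 - discount


# At a 40% discount the bar is ~60% utilisation; at 72% it drops to ~28%.
for discount in (0.40, 0.60, 0.72):
    bar = ri_breakeven_utilisation(discount)
    print(f"{discount:.0%} discount -> break-even at {bar:.0%} utilisation")
```

Workloads in the “stable baseline” bucket typically sit near 100% utilisation, which is why leaving them on pay-as-you-go for years is pure waste.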
5. Tagging as a Vanity Exercise
You cannot fix architecture-driven cost problems you cannot see. And you cannot see them without a tagging architecture that actually works.
The fix has three parts:
- Azure Policy in `deny` or `modify` mode: Don’t just audit violations; prevent them. (See Azure Policy for Cost Governance: The Three Rules That Matter Most for a practical policy framework.)
- Standardised Schema: Use a platform-level schema (e.g., `Environment`, `Owner`, `CostCenter`) to avoid forty-seven different spellings of “Production.”
- Accountability: A cost allocation view reviewed monthly by someone with budget authority.
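The schema check itself is simple enough to prototype before you codify it in Azure Policy. A minimal sketch — the required-tag set, canonical values, and helper name are illustrative, not a real policy definition:

```python
REQUIRED_TAGS = {"Environment", "Owner", "CostCenter"}  # illustrative schema

# Collapse the forty-seven spellings of "Production" into one canonical form.
CANONICAL_ENV = {
    "prod": "Production", "production": "Production",
    "dev": "Development", "development": "Development",
}


def tag_violations(tags):
    """Return human-readable problems with one resource's tag dictionary."""
    problems = [f"missing tag: {t}" for t in sorted(REQUIRED_TAGS - tags.keys())]
    env = tags.get("Environment")
    if env is not None and env not in CANONICAL_ENV.values():
        canonical = CANONICAL_ENV.get(env.lower())
        if canonical:
            problems.append(f"non-canonical Environment '{env}' -> use '{canonical}'")
        else:
            problems.append(f"unknown Environment value: '{env}'")
    return problems


print(tag_violations({"Environment": "prod", "Owner": "team-payments"}))
```

In production this logic belongs in policy, not a script — but running it once against an export of your current tags tells you how big the cleanup job is before you turn on `deny`.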
Where to Start
If you are looking at an Azure environment and want to find the highest-impact cost optimisation opportunity quickly, I would start in this order:
1. Pull 30 days of Azure Advisor recommendations to find the abandonment and over-provisioning problems.
2. Audit your App Service OS choices and look for Linux migration candidates to cut licensing costs.
3. Review synchronous integration patterns and look for call chains that force over-sizing.
4. Check your tag compliance score; if it’s below 80%, your data is unreliable.
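The compliance score in that last step is just the percentage of resources carrying every required tag. A quick sketch against exported tag data — field names and sample fleet are illustrative:

```python
def compliance_score(resources, required=("Environment", "Owner", "CostCenter")):
    """Percentage of resources carrying every required tag."""
    if not resources:
        return 0.0
    compliant = sum(1 for tags in resources if all(t in tags for t in required))
    return 100.0 * compliant / len(resources)


# Hypothetical export: one dict of tags per resource.
fleet_tags = [
    {"Environment": "Production", "Owner": "team-a", "CostCenter": "4100"},
    {"Environment": "Production"},  # missing Owner and CostCenter
    {"Environment": "Development", "Owner": "team-b", "CostCenter": "4100"},
    {},  # entirely untagged
]
score = compliance_score(fleet_tags)
print(f"{score:.0f}% compliant")  # well below the 80% bar -> data unreliable
```

Below 80%, any per-team or per-product cost breakdown is guesswork, which is why this check gates the others.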
Cost optimisation is not a task you hand to a FinOps team and consider done. It is a discipline baked into the architecture process.
My Philosophy: The cloud isn’t inherently expensive, but it can become a massive liability if it isn’t configured right. Treat cost as a performance metric, and the bill usually takes care of itself.
This is the first post in the Cost-Optimised Cloud series — a practitioner’s guide to designing Azure workloads where cost is a first-class architectural concern.