A landing zone is the term Azure uses for something much more important than it sounds. It is not a template. It is not a starter kit. It is the set of architecture decisions about identity, networking, policy, management, and connectivity that every workload you ever deploy into Azure will inherit. Get those decisions right and your platform is a foundation. Get them wrong and every new workload adds to the remediation debt. The difference between the two is almost always visible within twelve months.
What a Landing Zone Actually Is
Microsoft’s definition is deliberately broad: “an environment for hosting your workloads, pre-provisioned through code.” That definition is accurate but not particularly useful. The more useful framing is narrower: a landing zone is the answer to the question “what must be true about every Azure environment before we deploy anything into it?”
That question resolves into three categories of decisions. First, identity and access — who can do what, where, and under what conditions. This covers your management group RBAC hierarchy, Privileged Identity Management configuration, conditional access policies that apply to Azure management plane access, and how workload identities authenticate to platform services. Second, network topology — hub-and-spoke vs Virtual WAN, how on-premises connectivity terminates, DNS resolution architecture, and how private endpoints resolve across the environment. Third, governance and management — which Azure Policy initiatives are assigned at which scope, what constitutes the log collection baseline across all workloads, and how resources are organised within the management group and subscription hierarchy.
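To make the first category concrete, here is a minimal Bicep sketch of an RBAC assignment at management group scope. The group object ID is a placeholder and the Reader role is chosen purely for illustration; the same pattern applies to any built-in or custom role.

```bicep
// Minimal sketch, not a full landing zone RBAC model. Deploys at
// management group scope; the management group is assumed to exist.
targetScope = 'managementGroup'

@description('Object ID of the Entra ID platform team group (placeholder).')
param platformTeamGroupId string

// Built-in Reader role definition (well-known GUID).
var readerRoleId = 'acdd72a7-3385-48ef-bd42-f606fba81ae7'

// Assigned here, the role is inherited by every subscription and
// resource group beneath this management group.
resource platformReader 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(managementGroup().id, platformTeamGroupId, readerRoleId)
  properties: {
    principalId: platformTeamGroupId
    principalType: 'Group'
    roleDefinitionId: tenantResourceId('Microsoft.Authorization/roleDefinitions', readerRoleId)
  }
}
```

The other two categories follow the same principle: they are decisions that should be expressed as deployable artefacts at a defined scope, not as conventions.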
The Cloud Adoption Framework provides a reference architecture for all three categories, packaged as the Azure Landing Zones (ALZ) reference implementation. It is a genuinely useful starting point. It is not a prescription, though, and organisations that treat it as one, deploying the reference implementation wholesale and calling it a finished landing zone, consistently spend more time adapting it than teams that use it as a reference and build something fit for their context.
The trap is that the ALZ reference implementation looks correct. The management group hierarchy is well-structured, the policy initiative coverage is broad, the connectivity model is defensible. The problem surfaces when you map it to how a specific organisation actually operates: business units that don’t correspond to the Corp/Online split, compliance requirements that don’t align neatly to the policy initiatives included, network connectivity models that don’t fit cleanly into hub-and-spoke or Virtual WAN. A landing zone that looks correct in the portal but doesn’t match how the business operates is one that engineers will route around.
The Three Most Common Landing Zone Mistakes
Mistake 1: Building one landing zone for everything. A single management group structure and policy set that applies equally to production, development, sandbox, and identity workloads is one of the most common patterns and one of the most problematic. The result is a policy configuration that is either too restrictive for development — in which case engineers bypass it — or too permissive for production — in which case governance gaps accumulate in your most critical environments.
The fix is landing zone segmentation. At minimum, you need separate policy scopes for platform workloads (identity, connectivity, management), corp-connected workloads (workloads with line-of-sight to on-premises or to the hub), and online workloads (internet-facing workloads with different security controls). Development and sandbox environments need their own scope with policies tuned to encourage experimentation without the controls that would make iterative work impractical. The management group hierarchy exists precisely to enable this segmentation through policy inheritance.
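A minimal sketch of what that segmentation can look like in code, assuming a Bicep-based landing zone: the same baseline initiative assigned at two scopes with different enforcement postures. The parameter names and management group IDs are illustrative, not the ALZ implementation's own.

```bicep
// Illustrative: one baseline initiative, two enforcement postures.
// Deploy once per management group, e.g. enforced for corp,
// audit-only for sandbox:
//   az deployment mg create --management-group-id mg-corp ...
//   az deployment mg create --management-group-id mg-sandbox \
//     --parameters enforcementMode=DoNotEnforce ...
targetScope = 'managementGroup'

@description('Resource ID of the baseline policy initiative (placeholder).')
param policySetDefinitionId string

@allowed(['Default', 'DoNotEnforce'])
param enforcementMode string = 'Default'

resource baseline 'Microsoft.Authorization/policyAssignments@2022-06-01' = {
  name: 'lz-baseline'
  properties: {
    displayName: 'Landing zone baseline'
    policyDefinitionId: policySetDefinitionId
    // 'DoNotEnforce' evaluates and reports compliance without denying,
    // keeping sandbox friction low while production stays enforced.
    enforcementMode: enforcementMode
  }
}
```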
Mistake 2: Deferring networking decisions. Network topology is the hardest landing zone decision to reverse. Hub-and-spoke and Virtual WAN are not interchangeable: they have different routing models, different cost structures, different operability characteristics, and different implications for private endpoint DNS resolution. Private endpoint DNS resolution in particular requires a specific architecture — centralised private DNS zones, conditional forwarders configured correctly, DNS resolver configuration — that is much harder to retrofit than to build correctly from the start.
Teams that defer the network decision and connect workloads ad hoc end up with overlapping IP ranges, broken private endpoint resolution, and a network topology that cannot be normalised without redeploying workloads. The rule is simple: make the network decision first. Even if the initial answer is modest (a single hub VNet, basic hub-and-spoke, minimal on-premises connectivity), making it deliberately and encoding it in IaC means the topology can evolve. Not making it means the topology accretes.
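For the private endpoint DNS piece specifically, the centralised pattern reduces to a small amount of code once decided. A sketch in Bicep with placeholder names; it assumes spokes resolve through the hub's resolvers rather than holding their own zone copies.

```bicep
// Centralised private DNS: one shared zone per service, linked to the
// hub VNet. The zone name is the real zone for blob storage;
// everything else is a placeholder.
@description('Resource ID of the hub VNet (placeholder).')
param hubVnetId string

resource blobZone 'Microsoft.Network/privateDnsZones@2020-06-01' = {
  name: 'privatelink.blob.core.windows.net'
  location: 'global' // private DNS zones are always global
}

// Linking only the hub means resolution flows through hub resolvers;
// spokes forward to the hub instead of linking the zone themselves.
resource hubLink 'Microsoft.Network/privateDnsZones/virtualNetworkLinks@2020-06-01' = {
  parent: blobZone
  name: 'link-hub'
  location: 'global'
  properties: {
    registrationEnabled: false // records come from private endpoints, not VM registration
    virtualNetwork: {
      id: hubVnetId
    }
  }
}
```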
Mistake 3: Manual landing zones. A landing zone defined in a Word document and applied through the portal is not a landing zone. It is a description of what someone intended to configure. Drift begins on day one and accelerates as engineers make portal changes to unblock workload deployments. Within six months, the documented landing zone and the actual environment are meaningfully different. Within twelve months, nobody is confident they know what is actually configured.
Infrastructure as code is not optional for landing zones at any scale beyond a single team. Terraform and Bicep are both viable — the choice matters less than the commitment. The code is the landing zone; the portal and the documentation are representations of what the code has deployed. Anything not in code is not reliably in the landing zone.
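What "the code is the landing zone" can look like at the top level: a single hypothetical entry point that composes the platform from modules. The module paths and management group name are illustrative, not the ALZ repository layout.

```bicep
// Hypothetical top-level composition; module paths and the management
// group name are placeholders. Deployed with something like:
//   az deployment tenant create --location westeurope --template-file main.bicep
targetScope = 'tenant'

// The management group hierarchy is itself a deployed artefact.
module managementGroups 'modules/management-groups.bicep' = {
  name: 'management-groups'
}

// Policy baselines are assigned per scope, after the hierarchy exists.
module corpBaseline 'modules/policy-baseline.bicep' = {
  name: 'corp-baseline'
  scope: managementGroup('mg-corp') // placeholder name
  dependsOn: [managementGroups]
}
```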
The Decisions That Are Hard to Undo
Some landing zone decisions are costly to change after workloads are deployed. These deserve disproportionate attention at design time.
Management group hierarchy depth. Adding management group levels is relatively straightforward — you insert a level, move subscriptions, update RBAC and policy assignments. Removing a level after RBAC assignments and policy assignments exist at that scope is disruptive and error-prone. Design the hierarchy for where the organisation will be in three years, not where it is today. A hierarchy that looks overbuilt for a twenty-subscription environment is correct for a two-hundred-subscription environment.
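A sketch of what an explicit, code-defined hierarchy looks like in Bicep, with placeholder names throughout; the shape, not the labels, is the point.

```bicep
targetScope = 'tenant'

resource root 'Microsoft.Management/managementGroups@2021-04-01' = {
  name: 'contoso'
  properties: { displayName: 'Contoso' }
}

resource landingZones 'Microsoft.Management/managementGroups@2021-04-01' = {
  name: 'contoso-landing-zones'
  properties: {
    displayName: 'Landing Zones'
    details: { parent: { id: root.id } }
  }
}

// Adding a sibling of corp later (online, sandbox, a new business
// unit) is one more block like this. Removing a level once RBAC and
// policy assignments exist at its scope is a migration project.
resource corp 'Microsoft.Management/managementGroups@2021-04-01' = {
  name: 'contoso-corp'
  properties: {
    displayName: 'Corp'
    details: { parent: { id: landingZones.id } }
  }
}
```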
Primary region and paired region strategy. Changing an Azure region after workloads are deployed means redeploying those workloads. The primary region decision deserves more time than it typically gets. The relevant questions are: where is on-premises connectivity terminating? Where are users? What data residency requirements apply? What services are available in which regions and how does that constrain workload architecture? The paired region selection is not entirely free either — Microsoft’s regional pairs are fixed, and workloads designed for geo-redundancy need to work with those pairs.
Identity provider integration architecture. Rearchitecting Entra ID integration after the landing zone is built is expensive. This includes decisions about B2B federation, managed identity usage patterns, Privileged Identity Management configuration, and how workloads authenticate to each other and to platform services. The identity architecture should be settled before the first workload is deployed, because the first workload will make assumptions about it.
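One example of settling a workload identity pattern up front, sketched in Bicep with placeholder names: a user-assigned managed identity granted data-plane access to a storage account through RBAC rather than a connection string. The role GUID is the built-in Storage Blob Data Reader.

```bicep
// Placeholder names; resource-group scope. The workload authenticates
// to storage with its managed identity instead of a stored secret.
param location string = resourceGroup().location

resource appIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: 'id-app-example'
  location: location
}

resource dataAccount 'Microsoft.Storage/storageAccounts@2023-01-01' existing = {
  name: 'stexampledata' // placeholder
}

// Built-in 'Storage Blob Data Reader' role (well-known GUID).
var blobReaderRoleId = '2a2b9908-6ea1-4ae2-8e65-a410df84e7d1'

resource grantRead 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(dataAccount.id, appIdentity.id, blobReaderRoleId)
  scope: dataAccount
  properties: {
    principalId: appIdentity.properties.principalId
    principalType: 'ServicePrincipal'
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', blobReaderRoleId)
  }
}
```

If the first workload ships with this pattern, subsequent workloads tend to copy it; if it ships with secrets in configuration, they copy that instead.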
Hub network IP ranges. RFC 1918 space is finite, and Azure, on-premises, partner networks, and future acquisitions all compete for it. Hub VNet address space cannot be extended without redeployment of resources in that VNet. Overlaps with on-premises ranges or partner network ranges cause routing problems that are painful and time-consuming to resolve — especially when the overlap is discovered after the hub has dependent spoke VNets peered to it. Document every known network range before committing to hub address space, and leave more room than you think you need.
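An illustrative sketch of hub address space encoded deliberately. The ranges are examples that must be validated against your own documented networks before use; the reserved subnet names (GatewaySubnet, AzureFirewallSubnet) are real Azure requirements, everything else is a placeholder.

```bicep
// Example ranges only: check against every documented on-premises and
// partner range first. A /22 (1,024 addresses) leaves headroom for
// future shared-service subnets without renumbering.
param location string = resourceGroup().location
param hubAddressSpace string = '10.100.0.0/22'

resource hubVnet 'Microsoft.Network/virtualNetworks@2023-04-01' = {
  name: 'vnet-hub'
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [hubAddressSpace]
    }
    subnets: [
      {
        name: 'GatewaySubnet' // reserved name required for VPN/ExpressRoute gateways
        properties: { addressPrefix: '10.100.0.0/26' }
      }
      {
        name: 'AzureFirewallSubnet' // reserved name, /26 minimum
        properties: { addressPrefix: '10.100.0.64/26' }
      }
    ]
  }
}
```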
Getting It Right
The organisations that build landing zones that hold up start with questions, not architecture. The questions are: What workload types will we host and what are their connectivity and compliance requirements? What is the identity model? What are the compliance obligations and which workloads do they apply to? How will we manage and monitor the environment? The answers to these questions should determine the architecture; the architecture should not determine the answers.
Validate the landing zone design with a real workload deployment before declaring it done. This sounds obvious and is routinely skipped. A landing zone that has never had a workload deployed into it has never been tested. The policy assignments that seemed reasonable in design will produce unexpected deny effects. The network topology that looked correct in the diagram will have routing gaps. DNS resolution that worked in isolation will break when a workload tries to resolve a private endpoint. Deploy something real, work through the friction, and fix the design before committing to it at scale.
Treat the landing zone as a product, not a project. It needs owners, a backlog, and a release process. Policy assignments will be added as compliance requirements evolve. Network topology will change as workload connectivity requirements expand. RBAC assignments will need updating as organisational structure changes. Teams that treat the landing zone as a project with a completion date produce a foundation that decays — not because it was wrong at deployment, but because it was never maintained.
The most expensive landing zone mistake is the one you make at the beginning and discover at scale. The decisions that feel like details at deployment time — IP ranges, management group depth, policy scope — are the ones that drive the most remediation cost when they turn out to be wrong. The time to be deliberate about them is before the first workload is in production.