How to Deploy and Configure a Windows Virtual Machine in Azure (Step-by-Step for Admins)

Last updated January 17, 2026

Deploying a Windows virtual machine in Azure is quick, but deploying one that is secure, operable, and cost-aware takes deliberate choices. For IT administrators and system engineers, the work is less about clicking “Create” and more about establishing consistent patterns: how you isolate networks, how you grant access, how you patch and monitor, and how you standardize builds so the next VM looks like the last.

This guide is a step-by-step, production-oriented walkthrough for deploying and configuring a Windows virtual machine in Azure. It assumes you want a VM that fits into real operational practices—least-privilege access, controlled inbound connectivity, predictable updates, logging, and backup—without requiring you to rebuild your entire platform. You’ll see the same deployment done via the Azure Portal, Azure CLI, and Bicep, with guidance on when each approach makes sense.

To keep the narrative grounded, the article weaves in several real-world scenarios: a small IT team lifting a line-of-business app server, an enterprise team enforcing “no public RDP” with centralized logging, and a dev/test environment that needs fast rebuilds and tight cost control.

Prerequisites and planning decisions that matter

Before you provision anything, align on a few decisions that affect everything that follows: where the VM will live (subscription/resource group/region), how it will be accessed (RDP via public IP vs Bastion vs VPN), and how it will be operated (patching, monitoring, backup, and identity).

At minimum you’ll need an Azure subscription where you can create resource groups, virtual networks, and compute resources. If you’re working in an enterprise tenant, you may be constrained by policies (Azure Policy), naming standards, allowed regions, or required tags. It’s better to understand those constraints first than to discover later that your deployment was noncompliant.

You should also decide whether this VM is a one-off workload or part of a repeatable pattern. If you expect to deploy more than a few VMs, infrastructure-as-code (IaC) is worth adopting early, even if you still use the Portal to prototype. This guide shows both styles.

Choose a subscription, resource group, region, and naming strategy

Resource groups are the lifecycle boundary for many teams: delete the resource group and you delete the VM, disks, NICs, public IPs, and often the supporting network components if you placed them together. In production, it’s common to split shared network resources into one resource group (for example, rg-network-prod) and workload resources into separate groups (rg-app1-prod). That separation reduces accidental deletion and makes role assignment cleaner.

Region choice is not just about latency. It affects service availability, SKU availability (some VM sizes are not in all regions), and cost. If you’re planning to use Availability Zones, confirm the region supports them for the VM family you want.

Naming matters because Azure resources show up everywhere: logs, metrics, cost reports, and RBAC assignments. Pick a simple, consistent convention that encodes environment and role. Example: vmw-app01-prod or vm-win-web-01-dev. Apply the same pattern to NICs, disks, NSGs, and public IPs so operators can immediately recognize relationships.

Decide how you will access the VM (and avoid “public RDP” by default)

Many first-time deployments expose TCP/3389 from the internet via a public IP and an NSG rule. This works, but it increases your attack surface and creates ongoing operational risk. A safer default is to keep the VM on a private address and use one of these access patterns:

If you already have private connectivity to Azure (site-to-site VPN, ExpressRoute, or a point-to-site VPN), then RDP can remain private.

If you don’t have private connectivity, Azure Bastion is designed to provide browser-based RDP/SSH over TLS without assigning the VM a public IP. Bastion adds cost, but it is often cheaper than the operational burden of managing exposed RDP plus just-in-time access exceptions.

A third approach is to use Azure Arc or management tooling to reduce the need for interactive sessions, but for a first Windows server you will still want a reliable administrative path.

We’ll implement a secure approach in the networking sections, and when RDP is enabled we’ll do it in a controlled way.

Pick an operating system image and understand licensing basics

Most deployments start with an Azure Marketplace image such as Windows Server 2019 Datacenter or Windows Server 2022 Datacenter. Azure images typically include the Azure VM Agent, enabling extensions, diagnostics, and certain automation features.

Licensing can be handled in two main ways:

Azure “pay-as-you-go” includes Windows licensing in the hourly compute cost.

If your organization has eligible licenses with Software Assurance, you may use Azure Hybrid Benefit to reduce cost. Whether you can use it is a licensing decision, but the practical implication is that you should decide early because it affects ongoing billing.

Size the VM with CPU, memory, and disk IOPS in mind

Sizing is not only “how many vCPUs.” For many Windows workloads, memory is the primary limiter, and for database or file workloads, storage IOPS and throughput can dominate.

General-purpose series (like D-family) fit many application servers. Compute-optimized or memory-optimized families may be required for specific workloads. If you’re migrating from on-premises, compare current CPU and memory utilization, but also consider that Azure vCPU performance is not identical to a physical core. Benchmarking or a short proof-of-concept can prevent expensive mistakes.

Disk choice also matters:

Premium SSD managed disks provide predictable IOPS/throughput for many production workloads.

Standard SSD/HDD may be fine for dev/test or low-IO workloads.

Ultra Disk targets very high performance but has constraints (availability and configuration requirements).

A key practical point: application performance can degrade when disk performance is underprovisioned, even while CPU utilization looks healthy.

Establish baseline operational requirements (patching, backup, monitoring)

A Windows VM that is not patched and not monitored becomes a liability quickly. Before you deploy, decide:

How you will handle Windows Updates (native Windows Update, WSUS, Microsoft Configuration Manager, or Azure Update Manager).

How you will back up (Azure Backup is a common default).

What logs and metrics you need (Azure Monitor and Log Analytics are the usual building blocks).

This guide shows an Azure-native baseline that is common in modern environments, while noting where enterprise tooling may replace it.

Build the network foundation (VNet, subnetting, and DNS)

Networking is the foundation for everything else: identity integration, access paths, and security controls. Even if you’re deploying a single VM, it’s worth creating a virtual network structure that won’t force a redesign later.

Create a virtual network with sensible address space

Choose an address space that will not collide with your on-premises networks or other VNets you expect to peer later. Overlapping CIDRs are a common source of pain during later integration.

A typical starting point could be 10.20.0.0/16 for a workload VNet, with subnets like 10.20.1.0/24 for application servers and 10.20.2.0/24 for management or shared services. The exact scheme is less important than ensuring it is consistent and leaves room to grow.

In enterprise environments, DNS is often provided by domain controllers or central resolvers. In that case, configure the VNet to use those DNS servers so your Windows VM can resolve internal names (and join a domain) without hacks.
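
If you need to point the VNet at internal resolvers, you can set this at the VNet level. A minimal Azure CLI sketch, assuming illustrative resource names and resolver addresses:

bash
# Point the VNet at internal DNS servers (names and addresses are illustrative)
az network vnet update \
  --resource-group "rg-winvm-demo" \
  --name "vnet-winvm-demo" \
  --dns-servers 10.20.2.4 10.20.2.5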

Subnets, segmentation, and why “one big subnet” doesn’t scale

For a single VM, placing everything in one subnet is tempting. The problem is that you later need to apply different security controls for different roles. It’s much easier to start with at least a basic separation between workload subnets and a subnet reserved for platform services.

Also note that some services require dedicated subnets (for example, Azure Bastion uses AzureBastionSubnet). Plan those early so you don’t have to readdress later.

Network Security Groups (NSGs) as your first line of control

An NSG is a stateful packet filter applied to a subnet or NIC. In most designs, applying an NSG at the subnet level creates consistency: every VM in the subnet inherits baseline rules.

Your baseline should typically:

Deny inbound from the internet by default.

Allow inbound only from known management sources (VPN subnet, Bastion, jump host subnet).

Allow required app ports from specific sources (for example, allow 443 from an internal load balancer subnet).

Remember that “allowing RDP from my home IP” is not a stable pattern for teams. It also doesn’t help incident response if credentials are compromised.

Scenario: migrating a line-of-business app with minimal change

Consider a small IT team migrating an on-prem Windows Server hosting a line-of-business application. They need the app reachable by internal users, and administrators need occasional RDP access.

A practical Azure approach is:

Place the VM in an app subnet with an NSG allowing app ports only from internal networks.

Use Azure Bastion for admin access if no VPN is available yet.

Add Azure Backup and basic monitoring from day one.

This meets the “minimal change” goal while avoiding the common shortcut of opening RDP to the public internet.

Decide on availability and resiliency (and be realistic)

Not every VM needs the same resiliency model. Your options range from a single VM to multiple VMs behind a load balancer, to availability sets, to Availability Zones.

A single VM is simplest, but it is also a single point of failure. If the workload is critical, decide whether you need at least platform redundancy.

Availability Sets provide fault domain and update domain separation within a datacenter.

Availability Zones place instances across physically separate zones within a region, improving resiliency for certain failures.

For many Windows application servers that are not easily scaled out, a common compromise is a single VM with strong backup and a documented restore process. For customer-facing services or highly critical internal services, architecting for multi-instance availability is usually worth it.

Create the Windows VM (Azure Portal approach)

The Azure Portal is useful for learning and for one-off deployments. The key is to treat it as a way to produce a known-good configuration that you can later translate into CLI or IaC.

Basics: project details and instance details

Start by selecting the subscription and resource group. Use a naming convention that indicates role and environment.

Choose the region aligned with your network design. If you already created the VNet in a region, deploy the VM there.

Select an image such as Windows Server 2022 Datacenter. For VM size, start with a general-purpose size and adjust based on workload needs.

Administrator account, authentication, and secrets handling

When you create a Windows VM, Azure requires a local administrator username and password. Treat this as a break-glass account; you should not routinely use it for daily operations.

In enterprise operations, you typically:

Use Entra ID (formerly Azure AD) login for admin access where supported.

Or join the VM to Active Directory and use domain accounts.

If you must store the local admin password, store it in a secure secret store such as Azure Key Vault, not in a ticket or a spreadsheet.
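
As a sketch, storing that break-glass password in Key Vault with the Azure CLI might look like this (the vault and secret names are illustrative):

bash
# Store the local admin password as a Key Vault secret
az keyvault secret set \
  --vault-name "kv-app-secrets-001" \
  --name "vmw-app01-dev-localadmin" \
  --value "<use-a-strong-password-here>"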

Disks: OS disk, data disks, and caching

By default, Azure creates an OS disk as a managed disk. For many workloads, Premium SSD is a reasonable default if cost permits.

If the VM will host an application with significant data, add a separate data disk rather than placing everything on the OS disk. This helps with:

Performance tuning (separate IOPS allocation)

Operational separation (easier data handling during rebuilds)

Backup/restore strategies (depending on tooling)

Disk caching settings (ReadOnly/ReadWrite/None) can materially affect performance for certain workloads. Microsoft documentation provides guidance by workload type (for example, databases often use specific caching patterns). If you’re not sure, start with defaults and validate with benchmarks.
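
For example, adding a new data disk with explicit caching to an existing VM could look like the following (size, SKU, and caching are illustrative; validate them against your workload):

bash
# Create and attach a new Premium SSD data disk with ReadOnly caching
az vm disk attach \
  --resource-group "rg-winvm-demo" \
  --vm-name "vmw-app01-dev" \
  --name "disk-vmw-app01-data01" \
  --new \
  --size-gb 256 \
  --sku Premium_LRS \
  --caching ReadOnly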

Networking: NIC, private IP, public IP, and NSG association

Select the VNet and subnet you prepared. Prefer a private IP-only deployment.

If you require a public IP temporarily, make it explicit and time-bound. Better is to deploy Bastion or use a VPN.

Attach an NSG at the subnet level when possible. If you attach an NSG directly to the NIC, be aware it can create drift if subnet rules change later.

Management: boot diagnostics, identity, and extensions

Enable boot diagnostics so you can retrieve screenshots and logs if the VM fails to boot. This is useful even for experienced operators.

For identity, consider enabling a system-assigned managed identity. A managed identity is an Azure-managed service principal tied to the VM that can authenticate to Azure services without embedding secrets. It becomes important later for accessing Key Vault, Azure Storage, or other resources securely.
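
If the VM was created without a managed identity, you can add one afterward. A minimal Azure CLI sketch:

bash
# Enable a system-assigned managed identity on an existing VM
az vm identity assign \
  --resource-group "rg-winvm-demo" \
  --name "vmw-app01-dev"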

Extensions can bootstrap agents (monitoring, domain join, configuration). Keep extensions minimal and intentional; too many can complicate troubleshooting and upgrades.

Tags and governance

Apply tags such as Environment, Owner, CostCenter, and Application. Tags are not just for finance; they help operations filter resources and enforce policies.

After reviewing settings, create the VM.

Create the Windows VM (Azure CLI approach)

Once you’ve built a working configuration in the Portal, the Azure CLI makes it repeatable. This is especially useful for dev/test or for teams standardizing deployments.

The following example creates a resource group, VNet/subnet, NSG with a conservative rule set, and then a Windows VM without opening public RDP. It also enables a system-assigned managed identity.


bash
# Variables

RG="rg-winvm-demo"
LOC="eastus"
VNET="vnet-winvm-demo"
SUBNET="snet-app"
NSG="nsg-snet-app"
VMNAME="vmw-app01-dev"
ADMINUSER="localadmin"

# Create resource group

az group create --name "$RG" --location "$LOC"

# Create VNet and subnet

az network vnet create \
  --resource-group "$RG" \
  --name "$VNET" \
  --address-prefixes 10.20.0.0/16 \
  --subnet-name "$SUBNET" \
  --subnet-prefixes 10.20.1.0/24

# Create NSG

az network nsg create \
  --resource-group "$RG" \
  --name "$NSG"

# Associate NSG to subnet

az network vnet subnet update \
  --resource-group "$RG" \
  --vnet-name "$VNET" \
  --name "$SUBNET" \
  --network-security-group "$NSG"

# Create VM (no public IP, no open RDP)

az vm create \
  --resource-group "$RG" \
  --name "$VMNAME" \
  --image "Win2022Datacenter" \
  --size "Standard_D2s_v5" \
  --admin-username "$ADMINUSER" \
  --admin-password "<use-a-strong-password-here>" \
  --vnet-name "$VNET" \
  --subnet "$SUBNET" \
  --public-ip-address "" \
  --nsg "" \
  --assign-identity \
  --tags Environment=Dev Application=Demo Owner=IT

A few notes about this pattern:

The VM is created without a public IP. That means you must already have a private access method (VPN/ExpressRoute) or you must add Azure Bastion later.

We created an NSG but did not add inbound rules. The default NSG rules allow inbound traffic from within the virtual network and from the Azure load balancer (health probes), while denying all other inbound traffic. That’s a safer baseline.

In a real environment, you should avoid placing the password inline. For automation, store it in a secure pipeline secret store or use a controlled provisioning process.
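
One option is to read the password from Key Vault at deployment time rather than typing it inline. A sketch, assuming the vault and secret already exist:

bash
# Fetch the admin password from Key Vault at deployment time
ADMINPASS=$(az keyvault secret show \
  --vault-name "kv-app-secrets-001" \
  --name "vmw-app01-dev-localadmin" \
  --query value -o tsv)

# ...then use --admin-password "$ADMINPASS" instead of an inline value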

If you need to allow RDP from a specific management subnet (for example, a jump host subnet), create an explicit NSG rule. Here is an example allowing TCP/3389 only from 10.20.10.0/24:

bash
az network nsg rule create \
  --resource-group "$RG" \
  --nsg-name "$NSG" \
  --name "Allow-RDP-From-Management" \
  --priority 100 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --source-address-prefixes 10.20.10.0/24 \
  --source-port-ranges "*" \
  --destination-address-prefixes "*" \
  --destination-port-ranges 3389

This approach scales: you can deploy multiple Windows VMs into the same subnet and control access centrally through the subnet NSG.

Create the Windows VM (Bicep approach for repeatability)

If you want deployments to be consistent across environments, Bicep is a strong choice. Bicep is a domain-specific language that compiles to ARM templates, designed to simplify Azure infrastructure definitions.

Below is a minimal—but production-oriented—Bicep example that creates a VNet/subnet, NSG, a NIC, and a Windows VM with a system-assigned managed identity. It intentionally does not assign a public IP.

bicep
param location string = resourceGroup().location
param vmName string
param adminUsername string
@secure()
param adminPassword string

param vnetName string = 'vnet-${vmName}'
param addressPrefix string = '10.20.0.0/16'
param subnetName string = 'snet-app'
param subnetPrefix string = '10.20.1.0/24'
param nsgName string = 'nsg-${subnetName}'

resource vnet 'Microsoft.Network/virtualNetworks@2023-11-01' = {
  name: vnetName
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [ addressPrefix ]
    }
    subnets: [
      {
        name: subnetName
        properties: {
          addressPrefix: subnetPrefix
          networkSecurityGroup: {
            id: nsg.id
          }
        }
      }
    ]
  }
}

resource nsg 'Microsoft.Network/networkSecurityGroups@2023-11-01' = {
  name: nsgName
  location: location
}

resource nic 'Microsoft.Network/networkInterfaces@2023-11-01' = {
  name: 'nic-${vmName}'
  location: location
  properties: {
    ipConfigurations: [
      {
        name: 'ipconfig1'
        properties: {
          privateIPAllocationMethod: 'Dynamic'
          subnet: {
            id: vnet.properties.subnets[0].id
          }
        }
      }
    ]
  }
}

resource vm 'Microsoft.Compute/virtualMachines@2024-03-01' = {
  name: vmName
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    hardwareProfile: {
      vmSize: 'Standard_D2s_v5'
    }
    osProfile: {
      computerName: vmName
      adminUsername: adminUsername
      adminPassword: adminPassword
    }
    storageProfile: {
      imageReference: {
        publisher: 'MicrosoftWindowsServer'
        offer: 'WindowsServer'
        sku: '2022-datacenter'
        version: 'latest'
      }
      osDisk: {
        createOption: 'FromImage'
        managedDisk: {
          storageAccountType: 'Premium_LRS'
        }
      }
    }
    networkProfile: {
      networkInterfaces: [
        {
          id: nic.id
        }
      ]
    }
    diagnosticsProfile: {
      bootDiagnostics: {
        enabled: true
      }
    }
  }
}

Deploy it with Azure CLI:

bash
az deployment group create \
  --resource-group rg-winvm-demo \
  --template-file main.bicep \
  --parameters vmName=vmw-app01-dev adminUsername=localadmin adminPassword='<secure>'

In a mature setup, you would not pass passwords directly. You’d either generate them in a secure pipeline and store them, or you’d use a different initial access strategy (for example, domain join during provisioning and disabling local admin sign-in for normal operations). The point here is that Bicep gives you a stable baseline to evolve.

Establish secure administrative access (Bastion, jump hosts, and JIT)

Once the VM is deployed, the next operational question is: how will administrators connect safely?

Azure Bastion for browser-based RDP without public exposure

Azure Bastion places a managed service in your VNet that proxies RDP/SSH over TLS. The VM remains private; you connect via the Azure Portal (and in some configurations, native clients).

To use Bastion, you create a dedicated subnet named AzureBastionSubnet and deploy the Bastion resource. Because Bastion is a shared service, it’s often deployed once per VNet (or per hub VNet in a hub-and-spoke design) and then used for multiple VMs.

This is a common pattern for organizations that want to eliminate public IPs on servers. It also reduces the need to maintain traditional jump hosts.
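
A minimal sketch of adding Bastion to the demo VNet from the CLI section (the address prefix, resource names, and region are illustrative; Bastion requires a dedicated subnet of at least /26):

bash
# Dedicated subnet for Bastion
az network vnet subnet create \
  --resource-group "rg-winvm-demo" \
  --vnet-name "vnet-winvm-demo" \
  --name AzureBastionSubnet \
  --address-prefixes 10.20.250.0/26

# Standard public IP used by the Bastion host (not by the VM)
az network public-ip create \
  --resource-group "rg-winvm-demo" \
  --name "pip-bastion" \
  --sku Standard

# Bastion host itself
az network bastion create \
  --resource-group "rg-winvm-demo" \
  --name "bas-winvm-demo" \
  --vnet-name "vnet-winvm-demo" \
  --public-ip-address "pip-bastion" \
  --location "eastus"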

Just-in-time (JIT) access and Defender for Cloud

For environments that still require inbound RDP in some cases, Microsoft Defender for Cloud can provide just-in-time VM access, which opens management ports only for a limited time and only to approved source IPs. This can reduce exposure compared to permanent inbound rules.

JIT is not a substitute for proper network design, but it can be a pragmatic bridge when you can’t immediately implement Bastion or VPN.

Scenario: enterprise security baseline with “no public RDP”

A common enterprise requirement is “no inbound management ports from the internet.” In practice, the platform team creates a hub VNet with Bastion and connectivity to on-premises, then spokes for workloads. Workload VMs, including Windows servers, have no public IPs and are reachable only through Bastion or private connectivity.

This shapes your VM configuration: you can keep the NSG locked down, reduce the chance of credential stuffing attacks, and centralize audit trails through Azure Activity logs and Bastion session data (where applicable). It’s an example of how access requirements drive network architecture, not the other way around.

Configure identity: Entra ID, domain join, and RBAC

A Windows VM is not just compute; it’s an identity boundary. Decide early how administrators authenticate and how the VM integrates with directory services.

Entra ID login for Windows VMs (where it fits)

Azure supports signing in to certain Windows Server VMs using Entra ID-based authentication via extensions and role assignments. This can reduce reliance on local accounts and improve traceability. However, it doesn’t replace traditional Active Directory domain join for workloads that require Kerberos/LDAP or group policy.

If you plan to use Entra ID login, ensure your operational model supports it: administrators must have appropriate RBAC roles on the VM resource, and you should still have a break-glass method.
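
A sketch of enabling it on the demo VM: install the AADLoginForWindows extension and grant an administrator sign-in rights (the user principal name is illustrative):

bash
# Install the Entra ID login extension
az vm extension set \
  --resource-group "rg-winvm-demo" \
  --vm-name "vmw-app01-dev" \
  --publisher Microsoft.Azure.ActiveDirectory \
  --name AADLoginForWindows

# Allow an admin to sign in with administrator privileges
az role assignment create \
  --assignee "admin@contoso.com" \
  --role "Virtual Machine Administrator Login" \
  --scope "$(az vm show -g rg-winvm-demo -n vmw-app01-dev --query id -o tsv)"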

Active Directory domain join (classic enterprise pattern)

If your environment uses Active Directory, joining the VM to the domain enables group policy, centralized authentication, and consistent admin delegation.

In Azure, domain join usually means one of these:

The VM connects to domain controllers reachable over the VNet (either DCs hosted in Azure or on-premises over VPN/ExpressRoute).

You use Microsoft Entra Domain Services (formerly Azure AD Domain Services) for managed domain services when you don’t want to run your own DCs.

Domain join automation can be done via VM extensions, but be careful with credentials. If you automate domain join, use a tightly scoped domain join account and protect its password.

Use RBAC to control who can operate the VM

Azure RBAC controls management plane access: who can start/stop the VM, change networking, view boot diagnostics, or reset passwords. This is separate from OS-level rights.

A practical baseline is:

Grant a small group “Virtual Machine Contributor” or a custom role for VM operations.

Limit “Owner” rights to very few users.

Use separate roles for network changes versus compute operations, especially in larger teams.

This reduces the chance that a VM admin accidentally modifies the VNet or NSG in a way that breaks other workloads.
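
As an example, granting an operations group “Virtual Machine Contributor” on the workload resource group might look like this (the group object ID is illustrative; the scope can be narrowed to a single VM):

bash
# Assign Virtual Machine Contributor to an operations group
az role assignment create \
  --assignee-object-id "<vm-operators-group-object-id>" \
  --assignee-principal-type Group \
  --role "Virtual Machine Contributor" \
  --scope "$(az group show -n rg-winvm-demo --query id -o tsv)"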

Post-deployment OS configuration: do the minimum, then automate

After provisioning, you still need to configure the guest OS. The goal is not to hand-tune one server; it’s to define a baseline that can be applied repeatedly.

Enable and verify Remote Desktop settings

If you use Bastion or private RDP, you still need RDP enabled on the OS and allowed through Windows Defender Firewall. Most Windows Server images enable RDP, but verify it aligns with your security baseline.

If you are using Bastion, you typically do not need to create inbound NSG rules for RDP from the internet, but the VM must accept RDP from within the VNet.

Windows Firewall and service exposure

Treat Windows Firewall as a second layer after NSGs. NSGs filter at the network level; Windows Firewall filters on the host.

Even if the NSG blocks inbound traffic, leaving unnecessary services open on the host increases risk if the network controls change later or if lateral movement occurs. Establish a baseline where only required inbound ports are open.

Time zone, NTP, and domain time considerations

Time drift can cause authentication failures, especially in domain environments. Windows VMs in Azure generally sync time appropriately, but once domain-joined, domain time hierarchy applies. Validate time configuration early, particularly if you rely on Kerberos.

Install required roles/features and keep it reproducible

If this VM will host IIS, .NET, or file services, install only what you need and document it as code where possible.

For example, installing IIS with PowerShell:

powershell
Install-WindowsFeature -Name Web-Server -IncludeManagementTools

For a file server role:

powershell
Install-WindowsFeature -Name FS-FileServer

The key is to avoid ad-hoc configuration that only exists in someone’s memory. Even if you start manually, translate the steps into scripts, Desired State Configuration, or your configuration management platform.

Secure secrets and configuration with managed identity and Key Vault

One of the fastest ways to create long-term security debt is embedding secrets (API keys, database passwords) inside scripts or application config files. Azure provides a cleaner pattern:

Use a managed identity on the VM.

Store secrets in Azure Key Vault.

Grant the managed identity access to read only the secrets it needs.

Grant Key Vault access to the VM’s managed identity

If you enabled a system-assigned managed identity on the VM, you can retrieve its principal ID and grant permissions. The exact permission model depends on whether the Key Vault uses Azure RBAC or access policies. Many organizations standardize on Azure RBAC for consistency.

Here is an Azure CLI example using RBAC. It assumes you have a Key Vault already created.

bash
RG="rg-winvm-demo"
VMNAME="vmw-app01-dev"
KVNAME="kv-app-secrets-001"

PRINCIPAL_ID=$(az vm show -g "$RG" -n "$VMNAME" --query identity.principalId -o tsv)
KV_ID=$(az keyvault show -n "$KVNAME" --query id -o tsv)

# Grant the VM identity permission to read secrets

az role assignment create \
  --assignee-object-id "$PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --role "Key Vault Secrets User" \
  --scope "$KV_ID"

On the VM, your application or script can then use Azure SDK authentication (DefaultAzureCredential) to fetch secrets without storing long-lived credentials. This reduces credential sprawl and makes rotation manageable.

Configure updates and patch management

Patching is one of the most consistent operational tasks for Windows servers. In Azure, you can run Windows Update manually, but at scale that doesn’t hold.

Use Azure Update Manager where appropriate

Azure Update Manager (the current Azure-native update service) can orchestrate patching schedules for Azure VMs and, with Azure Arc, for non-Azure machines. The operational value is centralized visibility and scheduling, with reporting.

If your organization already uses WSUS or Configuration Manager, you might keep those systems and simply ensure the VM can reach them. The important part is that patching is scheduled, monitored, and audited.
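
For ad-hoc validation of patch state, the VM patch APIs are exposed through the CLI. A sketch (the classifications and duration are illustrative):

bash
# Assess which updates are missing
az vm assess-patches \
  --resource-group "rg-winvm-demo" \
  --name "vmw-app01-dev"

# Install critical and security updates, rebooting only if required
az vm install-patches \
  --resource-group "rg-winvm-demo" \
  --name "vmw-app01-dev" \
  --maximum-duration PT2H \
  --reboot-setting IfRequired \
  --classifications-to-include-win Critical Security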

Coordinate reboots and maintenance windows

Patching often implies reboots. For application servers, coordinate patch windows with stakeholders and document expected downtime.

This is where the earlier availability discussion becomes real: a single VM may require downtime, while a multi-instance service can be patched in a rolling manner if designed appropriately.

Scenario: dev/test servers that must rebuild cleanly every sprint

A platform team supporting development might rebuild Windows VMs frequently to keep environments clean. In that case, the strategy shifts:

Use IaC (Bicep/Terraform) and scripts to rebuild.

Rely on automated patch baselines rather than manual updates.

Store configuration in source control.

This scenario highlights why automation matters even when the VM “isn’t production.” Dev/test environments often create the most sprawl, and sprawl becomes cost and risk.

Enable monitoring: Azure Monitor, Log Analytics, and VM insights

A Windows VM in Azure should emit metrics and logs that let you answer basic questions: Is it up? Is performance normal? What changed? Are there security-relevant events?

Azure Monitor is the umbrella for metrics and logs. Log Analytics workspaces store and query logs using Kusto Query Language (KQL).

Create or choose a Log Analytics workspace

In larger environments, you typically centralize logs into a shared workspace per environment or per region. Centralization makes correlation and alerting easier, but it also requires governance: workspace access is sensitive because logs can contain security and operational data.

If you create a new workspace for a small deployment, choose a region aligned with your VM and follow your organization’s data residency requirements.

Enable VM insights (where applicable)

VM insights collects guest performance data and dependency information via agents. This can be useful for capacity planning and operational visibility.

Depending on current Azure agent strategy, you might be using the Azure Monitor agent (AMA) or legacy agents. Align with Microsoft’s current guidance and your enterprise standards. The main point is to standardize on one approach rather than mixing.
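
A minimal sketch of the plumbing, assuming the Azure Monitor agent approach (a data collection rule is still required to define what is collected and where it lands):

bash
# Create a Log Analytics workspace
az monitor log-analytics workspace create \
  --resource-group "rg-winvm-demo" \
  --workspace-name "log-winvm-demo" \
  --location "eastus"

# Install the Azure Monitor agent on the Windows VM
az vm extension set \
  --resource-group "rg-winvm-demo" \
  --vm-name "vmw-app01-dev" \
  --publisher Microsoft.Azure.Monitor \
  --name AzureMonitorWindowsAgent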

Basic KQL queries you’ll actually use

Once logs are flowing, validate that you can query them. For example, to review heartbeat data for machines:

kusto
Heartbeat
| summarize LastSeen=max(TimeGenerated) by Computer
| order by LastSeen desc

To view Windows security events (if collected):

kusto
SecurityEvent
| where TimeGenerated > ago(24h)
| summarize Count=count() by EventID
| order by Count desc

The specific tables available depend on what you collect. The operational takeaway is to verify end-to-end: agent installed, logs arriving, and queries returning expected results.

Alerts: start small and actionable

Alert fatigue is real. Start with a few alerts that are genuinely actionable:

VM not responding (heartbeat missing)

High CPU sustained

Low disk space on OS/data volume

Unexpected shutdown/reboot events

Each alert should have an owner and a response expectation. Otherwise, you’re building noise.
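
As an illustration, a sustained-CPU alert on the demo VM could be defined like this (the threshold, time windows, and action group name are illustrative):

bash
# Alert when average CPU exceeds 85% over a 15-minute window
az monitor metrics alert create \
  --name "alert-vmw-app01-cpu" \
  --resource-group "rg-winvm-demo" \
  --scopes "$(az vm show -g rg-winvm-demo -n vmw-app01-dev --query id -o tsv)" \
  --condition "avg Percentage CPU > 85" \
  --window-size 15m \
  --evaluation-frequency 5m \
  --action "ag-ops-oncall"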

Configure backup and recovery with Azure Backup

Backups are one of the most important controls for a single VM workload. Azure Backup typically uses a Recovery Services vault to manage backup policies and restore points.

Choose an appropriate backup policy

For many Windows server workloads, daily backups with a retention period aligned to business requirements are a common baseline. Critical systems may need more frequent restore points.

Also consider application-consistent backups. For some workloads (notably databases), you should use application-aware backup methods rather than relying on VM-level backups alone. VM-level backups can still be part of the strategy, but they might not satisfy recovery objectives on their own.
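
A sketch of enabling VM backup with the CLI, assuming a new Recovery Services vault and the built-in default policy (the vault name is illustrative):

bash
# Create the Recovery Services vault
az backup vault create \
  --resource-group "rg-winvm-demo" \
  --name "rsv-winvm-demo" \
  --location "eastus"

# Enable protection for the VM with the default policy
az backup protection enable-for-vm \
  --resource-group "rg-winvm-demo" \
  --vault-name "rsv-winvm-demo" \
  --vm "vmw-app01-dev" \
  --policy-name "DefaultPolicy"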

Test restores as part of the deployment lifecycle

A backup that has never been restored is not a plan—it’s a hope. Build at least one restore test into your deployment process. For example, restore the VM to an isolated subnet and confirm it boots and that data is present.

This is where your network segmentation pays off: an isolated “restore validation” subnet can allow safe testing without impacting production.

Harden the Windows VM: baseline security without overcomplication

Security hardening can become a deep project, but you can achieve significant risk reduction with a clear baseline.

Reduce inbound exposure and remove unnecessary public endpoints

The most impactful step is to avoid public IPs and open inbound ports. If a public IP is required for a specific service, prefer placing it on a load balancer or application gateway and keep the VM itself private.

Where you must expose ports, restrict sources as much as possible. “Allow from Internet” should be an exception with documented justification.

Configure Defender for Cloud recommendations thoughtfully

Microsoft Defender for Cloud can assess security posture and recommend improvements. Treat these recommendations as input to your baseline, not as a checklist to blindly satisfy.

Some recommendations will be easy wins (endpoint protection, missing OS updates). Others may not apply to your workload or may require architectural changes.

Local admin management and privilege hygiene

Limit local administrator use. If domain-joined, consider group policy for local admin group membership. If not domain-joined, consider Windows LAPS (Local Administrator Password Solution) where appropriate and supported.

The objective is to prevent shared local admin passwords across servers and to make credential misuse harder.

Encrypt disks and protect data at rest

Azure managed disks are encrypted at rest by default using platform-managed keys. For workloads with stricter compliance requirements, you may need customer-managed keys (CMK) via Disk Encryption Sets, which adds complexity.

Start by confirming your compliance requirements. Don’t implement CMK “just because” unless you can operate it (key rotation, access controls, audit).

Storage and performance tuning: OS disk vs data disk, temp disk, and filesystems

After the VM is running, validate that disk layout matches the workload.

Use data disks for application data and logs

Many Windows applications write logs and data to C: by default. In Azure, separating application data onto a dedicated managed disk can improve performance and simplify recovery. It also reduces the chance that an OS disk fills and destabilizes the server.

Format data disks with an appropriate allocation unit size for the workload (for example, SQL Server commonly uses 64K allocation units). Use workload vendor guidance where available.

Understand the temporary disk (D:) in Azure

Many Azure VM sizes provide a temporary disk (often D: on Windows). This storage is not persistent and can be lost during host maintenance or redeployments. Do not store anything critical there.

Use the temporary disk only for transient data such as page file (in some designs) or temp files where loss is acceptable.

Scenario: IIS web server with strict uptime requirements

For an internal web application hosted on IIS, a common pattern is a pair of Windows VMs behind an Azure Load Balancer or Application Gateway, with content deployed via a CI/CD pipeline.

In this scenario, each VM should be stateless where possible. Store session state and uploaded content outside the VM (for example, in a database or storage service) so that a VM can be rebuilt or patched without service interruption. Even if you start with one VM, designing with statelessness in mind makes later scaling and maintenance much easier.

Automate baseline configuration with PowerShell and extensions

Manual configuration doesn’t scale. Even if you only have a few servers today, automation reduces variance and accelerates recovery.

Use PowerShell to apply a baseline after provisioning

A simple pattern is to run a PowerShell script post-deployment to:

Set firewall rules

Install roles/features

Configure logging settings

Set local policies

If you use Azure VM extensions to run scripts, keep scripts idempotent (safe to run multiple times) and version-controlled.

For example, installing features and creating a basic inbound firewall rule for HTTPS:

powershell
Install-WindowsFeature -Name Web-Server -IncludeManagementTools

New-NetFirewallRule -DisplayName "Allow HTTPS" -Direction Inbound -Protocol TCP -LocalPort 443 -Action Allow

You still need to ensure the NSG permits the traffic if it is meant to reach the VM. Host firewall and NSG should agree.

Use Custom Script Extension carefully

The Custom Script Extension can download and execute scripts. It’s useful, but it introduces concerns:

Where are scripts hosted (Storage with SAS tokens, Git repo, internal web)?

How are secrets protected?

How do you track which version ran on which server?

In many organizations, configuration management tools (Ansible, Chef, Puppet, DSC, or SCCM) provide a better long-term approach. If you do use extensions, standardize and document the pattern.
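
For reference, a minimal Custom Script Extension invocation looks like the following (the inline command is illustrative; real deployments usually reference a versioned script instead):

bash
# Run a one-off command through the Custom Script Extension
az vm extension set \
  --resource-group "rg-winvm-demo" \
  --vm-name "vmw-app01-dev" \
  --publisher Microsoft.Compute \
  --name CustomScriptExtension \
  --settings '{"commandToExecute": "powershell.exe -ExecutionPolicy Bypass -Command Install-WindowsFeature -Name Web-Server"}'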

Cost management for Windows VMs in Azure

Cost issues usually come from a small set of drivers: oversized compute, premium disks where unnecessary, forgotten dev/test resources, and outbound data transfers in certain architectures.

Right-size based on actual utilization

After deployment, watch CPU, memory, and disk metrics. If the VM is consistently underutilized, resize to a smaller SKU. In Azure, resizing is usually straightforward, though some changes require downtime.
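
Resizing from the CLI is a single command (the target size is illustrative; the VM restarts as part of the resize):

bash
# Resize the VM to a smaller SKU
az vm resize \
  --resource-group "rg-winvm-demo" \
  --name "vmw-app01-dev" \
  --size Standard_B2ms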

Use reserved instances and Azure Hybrid Benefit where eligible

For steady-state production workloads, reservations can reduce compute cost. Azure Hybrid Benefit can reduce Windows licensing cost if you have eligible licenses.

These are finance and licensing decisions as much as technical ones, but as an operator you should know they exist because they can change architecture decisions (for example, consolidating workloads onto fewer VMs).
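
If your licensing allows it, Azure Hybrid Benefit can be applied to an existing VM without redeploying. A sketch:

bash
# Apply Azure Hybrid Benefit for Windows Server
az vm update \
  --resource-group "rg-winvm-demo" \
  --name "vmw-app01-dev" \
  --license-type Windows_Server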

Turn off what you don’t need in dev/test

Dev/test sprawl is common. If workloads don’t need 24/7 uptime, use automation to stop VMs after hours. Be careful with dependencies: stopping a domain controller or a shared build server may break other workflows.

An Azure Automation or Logic Apps schedule can manage stop/start, or you can implement it in your CI/CD pipeline.
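
For a single dev/test VM, the built-in auto-shutdown schedule is often the simplest option. A sketch (the time is specified as HHMM in UTC):

bash
# Shut the VM down automatically at 19:00 UTC every day
az vm auto-shutdown \
  --resource-group "rg-winvm-demo" \
  --name "vmw-app01-dev" \
  --time 1900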

Operational practices: change control, images, and drift management

A Windows VM is a living system; changes happen through patching, app updates, and admin actions. Without discipline, servers drift and become unique snowflakes.

Capture a “golden image” approach when it’s warranted

If you deploy many similar Windows VMs, consider building a golden image using Azure Image Builder or your preferred imaging pipeline. The image can include base hardening, agents, and common components.

The advantage is speed and consistency. The tradeoff is that you now operate an image pipeline, and you must keep it patched.

For small environments, a script-based approach may be sufficient. The key is consistency, not the specific tool.

Document configuration and keep it close to code

Even if you are not fully IaC-driven, keep configuration artifacts versioned: scripts, NSG rules, VM sizing rationale, and access patterns. This reduces onboarding time for new engineers and makes incident response faster.

Avoid manual changes in the Portal as the steady state

Using the Portal for exploration is fine. Using it for recurring changes tends to create drift. If you create an NSG rule manually and later redeploy an IaC template that doesn’t include it, the rule may be removed or cause a conflict.

A practical workflow is:

Prototype in Portal

Export and translate to CLI/Bicep

Apply changes through code

End-to-end deployment walkthrough tying it together

At this point, you’ve seen the building blocks. To tie them together in a realistic flow, consider this sequence for deploying a production-leaning Windows VM:

First, create or select a resource group and region that match your governance requirements, then create a VNet with non-overlapping CIDR ranges and at least one workload subnet.

Second, attach an NSG to the workload subnet with a “deny-by-default” inbound posture, then decide on an administrative access method. If you can, deploy Azure Bastion and keep the VM private.

Third, deploy the Windows VM with a system-assigned managed identity, Premium SSD for OS disk if the workload needs it, and add data disks for application data. Apply consistent tags.

Fourth, configure identity: domain join if required, or Entra ID login where it fits your model. Align RBAC so VM operators can manage the resource without granting excessive rights.

Fifth, configure guest baseline: roles/features, firewall rules, logging, and time settings—preferably automated.

Sixth, enable monitoring to a Log Analytics workspace, validate data ingestion with KQL queries, and create a minimal set of actionable alerts.

Seventh, enable backup with a policy that matches recovery objectives, and run a restore test into an isolated subnet.

Finally, review cost and right-size based on telemetry rather than assumptions.

This flow matches how mature teams operate: secure connectivity first, then compute, then identity and operations, rather than deploying a VM and scrambling later.