A major topic since COVID has been BCP (Business Continuity Planning) and DR (Disaster Recovery). Recently Windows 365 released some amazing capabilities that we’re going to discuss today. We will cover:
- Existing Resiliency with Windows 365 Cloud PCs
- The New Cross Region Disaster Recovery in Windows 365
- Managing Cloud PC Cross Region Disaster Recovery
Existing Resiliency with Windows 365 Cloud PCs
Today, Windows 365 Cloud PC resiliency is based on a few different metrics:
- 99.9% highly available Cloud PC user sessions (as referenced in the MSFT SLA)
- MSFT measures downtime in minutes, the period in which all connection attempts by a user to a Cloud PC were unsuccessful, excluding any of the following types of failures:
- Failures resulting from the Cloud PC being in an inoperable state unrelated to the underlying Azure infrastructure (e.g., damaged or corrupt operating system, operating system configuration, or misconfiguration); and
- Failure resulting from an application or other software installed on the Cloud PC.
- Data object resiliency for disk storage of 99.999999999% (Microsoft’s main recommendation for data resiliency is leveraging OneDrive along with OneDrive’s Best Practices that I discussed recently.)
- Automated availability zone failover for the compute instance.
- Recovery Point Objective (RPO) of ~0 (the time-based measurement of the maximum data that can be lost for a user as a result of the DR event)
Some of the failures that can trigger an AZ failover are a failure of the virtual NIC, compute instance, storage plane instance, or compute power instance. When one of these failures occurs, the failover happens automatically. The user has to log back in to their Cloud PC and may see some minimal disruption.
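To put the 99.9% session availability figure from the list above into perspective, here is a minimal sketch that converts an availability percentage into a downtime budget. The 30-day period is my assumption for illustration; check the SLA itself for the exact measurement window and exclusions.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_pct: float, days: int = 30) -> float:
    """Return the downtime budget, in minutes, for a period of `days` days.

    `days=30` is an illustrative assumption, not a value taken from the SLA.
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = days * MINUTES_PER_DAY
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    budget = allowed_downtime_minutes(99.9)
    print(f"99.9% over 30 days allows about {budget:.1f} minutes of downtime")
```

Under that assumption, 99.9% works out to a little over 43 minutes of allowed downtime in a 30-day period.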
Resiliency of the Cloud PC Management Service
The Cloud PC Management Service plays into the resiliency story as well. For clarity, the Cloud PC Management Service includes the Intune admin center and the Cloud PC end user portal (windows365.microsoft.com).
The Cloud PC Management Service has redundant architecture within the region with a target uptime of 99.99%. The service has these target objectives:
- RTO of < 6 hours.
- RPO of <30 minutes for changes made in the management service.
The good news is that if the management service has an outage, users can still reach their session through the Windows App, https://rdweb.wvd.microsoft.com/webclient/index.html, or a bookmark for their session.
Now, let’s talk about a feature that people tend to overlook called Enterprise State Roaming.
Enterprise State Roaming
One of the interesting items Microsoft recommends as part of your strategy is Enterprise State Roaming, which you can enable here:

Enterprise State Roaming (ESR) is an often-forgotten feature, which the graphic below covers nicely:

ESR provides a unified experience by synchronizing data across a user's devices. The data is hosted in Azure according to the table below and is retained until it becomes stale or is deleted manually:
| Country/region value | Data is hosted in |
|---|---|
| An EMEA country/region such as France or Zambia | One or more of the Azure regions within Europe |
| A North American country/region such as United States or Canada | One or more of the Azure regions within the US |
| An APAC country/region such as Australia or New Zealand | One or more of the Azure regions within Asia |
| South American and Antarctica regions | One or more Azure regions within the US |
There’s no retention policy around that data, and it can be removed in a few different ways:
- User deleted from Entra (removed within 90-180 days)
- Directory deleted from Entra (removed within 90-180 days)
- Admin opens an Azure support ticket to delete the data
One last note: for ESR to work properly, make sure neither of these settings is disabled in Intune:
- Allow Microsoft Account Connection
- Allow Sync My Settings
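If you export or document your Intune policy settings, a quick sanity check for the two settings above could look like the sketch below. The dictionary shape (display name mapped to `True`, `False`, or `None` for "not configured") is a hypothetical representation I made up for illustration; it is not an Intune export format or API.

```python
from typing import List, Mapping, Optional

# Setting names as listed in this post.
REQUIRED_SETTINGS = (
    "Allow Microsoft Account Connection",
    "Allow Sync My Settings",
)


def esr_blockers(settings: Mapping[str, Optional[bool]]) -> List[str]:
    """Return the required settings that are explicitly disabled.

    A missing key or None is treated as "not configured", which is fine:
    the requirement is only that the setting is not disabled.
    """
    return [name for name in REQUIRED_SETTINGS if settings.get(name) is False]


if __name__ == "__main__":
    # Hypothetical policy snapshot, for illustration only.
    policy = {
        "Allow Microsoft Account Connection": True,
        "Allow Sync My Settings": False,
    }
    for name in esr_blockers(policy):
        print(f"ESR will not work while '{name}' is disabled")
```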
The New Cross Region Disaster Recovery in Windows 365
Windows 365 now has a new optional offer called “Cross Region Disaster Recovery”.

As you can see, this new license, which is $4.50 per month per user, provides a really interesting capability. Instead of the standard availability zone failover, this feature provides cross region DR, which is a best practice for all cloud services.
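For budgeting, the per-user price quoted above is easy to turn into an annual figure. This is a rough sketch that assumes the $4.50 list price applies flat to every licensed user, with no discounts or taxes; the fleet sizes are made-up examples.

```python
from decimal import Decimal

# Monthly add-on price per user, as quoted in this post.
MONTHLY_PRICE_PER_USER = Decimal("4.50")


def annual_cost(users: int, monthly_price: Decimal = MONTHLY_PRICE_PER_USER) -> Decimal:
    """Return the yearly add-on cost for `users` licensed users."""
    if users < 0:
        raise ValueError("users must be non-negative")
    return monthly_price * 12 * users


if __name__ == "__main__":
    # Illustrative fleet sizes, not real data.
    for users in (1, 250, 5000):
        print(f"{users:>5} users: ${annual_cost(users):,.2f} per year")
```

`Decimal` is used instead of `float` so that currency amounts stay exact.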
Windows 365 cross region DR creates geographically distant temporary copies of Cloud PCs that can be accessed in the fallback region (the region you will fail over to in the event of a disaster in your primary site).

Once you license the feature, it’s pretty easy to onboard your environment. You will modify your user settings policy to set the fallback region:

Once the user license has synchronized and the user has the proper user settings, you will manually onboard them via bulk device actions:

Now, check out the video below where we will cover setting up the user settings policy and triggering a move of the Cloud PC to the fallback region.
After the manual activation, a temporary copy of the Cloud PC (as mentioned earlier) is created using the latest restore point in the fallback region. That means all installed apps, settings, and data move with you.
If an admin deactivates the cross region disaster recovery after the outage event, the temporary Cloud PC is deleted. No applications, settings, data, or other information is preserved from the temporary Cloud PC.
In the event of an outage, your RPO and RTO are:
- RTO of < 4 hours for tenants with fewer than 50,000 Cloud PCs in a region.
- RPO of < 4 hours
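To make the RPO concrete: the work at risk is whatever happened between the latest restore point and the start of the outage. Here is a minimal sketch of that arithmetic; the timestamps are invented for illustration, and the 4-hour value is simply the target listed above.

```python
from datetime import datetime, timedelta

# RPO target listed above for cross region disaster recovery.
RPO_TARGET = timedelta(hours=4)


def data_loss_window(last_restore_point: datetime, outage_start: datetime) -> timedelta:
    """Return how much recent work would be missing after a failover."""
    if last_restore_point > outage_start:
        raise ValueError("restore point cannot be later than the outage start")
    return outage_start - last_restore_point


def within_rpo(last_restore_point: datetime, outage_start: datetime,
               rpo: timedelta = RPO_TARGET) -> bool:
    """Return True if the data loss window is inside the RPO target."""
    return data_loss_window(last_restore_point, outage_start) < rpo


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    restore_point = datetime(2024, 7, 15, 9, 0)
    outage = datetime(2024, 7, 15, 11, 30)
    print("Data loss window:", data_loss_window(restore_point, outage))
    print("Within RPO:", within_rpo(restore_point, outage))
```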
Devices are restored as quickly as possible, but the amazing thing is that you can target which devices you want restored first. The speed and scale of the restoration process are per region and per tenant. This strategy prioritizes certain devices but doesn’t change the overall RTO for the full environment.
One thing to note: if the fallback region doesn’t have capacity or is unhealthy, the backup Cloud PC won’t be provisioned. The data is still preserved and accessible in the fallback region regardless. You will also want to wait 12-24 hours after assigning the license to give replication time to complete and become ready.
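If you track when licenses were assigned, the 12-24 hour guidance can be turned into a simple reminder. This sketch is only a time-based heuristic with labels I chose myself; it does not query Windows 365, so confirm actual readiness in the service before relying on a failover.

```python
from datetime import datetime, timedelta

# Wait window from the guidance above.
MIN_WAIT = timedelta(hours=12)
MAX_WAIT = timedelta(hours=24)


def replication_wait_status(license_assigned_at: datetime, now: datetime) -> str:
    """Classify elapsed time since license assignment against the wait window."""
    elapsed = now - license_assigned_at
    if elapsed < timedelta(0):
        raise ValueError("license_assigned_at is in the future")
    if elapsed < MIN_WAIT:
        return "too early"
    if elapsed < MAX_WAIT:
        return "possibly ready"
    return "wait window elapsed"


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    assigned = datetime(2024, 7, 15, 8, 0)
    print(replication_wait_status(assigned, datetime(2024, 7, 15, 14, 0)))
    print(replication_wait_status(assigned, datetime(2024, 7, 16, 9, 0)))
```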
Cross Region Disaster Recovery User Experience
When cross region disaster recovery is activated, users will see this on their Cloud PC at the next login:

After the cross region disaster recovery activation is complete, when a user signs in to their Cloud PC they receive a temporary Cloud PC. With this device, they get full user context, including:
- Configuration
- Data stored on the local disk
- User-installed applications up to the RPO for the device.
As I mentioned earlier, once you deactivate the failover, the fallback device is removed. The user returns to their primary device and none of the data saved to the fallback device is kept. Data you stored in OneDrive, cloud apps, etc. will not be impacted.
For context, I did some timing exercises, and it takes about 16-18 minutes from shutdown for the Cloud PC to move to the fallback region.
Another interesting note: it appears that currently you can only access the temporary Cloud PC from the Windows 365 web portal.
If you do connect via the Windows App, you might see some weirdness like this, but feel free to ignore it:

Managing Cloud PC Cross Region Disaster Recovery
When you have cross region DR implemented, it’s important to be able to get a full view into your fleet. You can see in the video below how we leverage Windows 365 reports to get insight into the status of our devices, their failover status, and much more:
One interesting note: when you trigger a failover, the red exclamation point makes you think you broke something:

Final Thoughts
In closing, we already have strong resiliency present in Windows 365 and its SLA. This new capability in Windows 365 provides peace of mind by allowing users to quickly fall back to a secondary region during a DR event.
It also shows some thoughtful design in letting you prioritize the devices you want to move or move back, which helps account for VIP users. The cost isn’t too terrible at roughly $54 a year per user ($4.50 per month) to deliver your BCP strategy in Windows 365. It should be interesting to see how the capability improves and matures within enterprises.
