How to successfully Virtualize MS Exchange – Part 5 – High Availability (HA)

vSphere HA has two main configuration options which can significantly impact the availability and consolidation of any vSphere environment, and which can have an even greater impact when it comes to Business Critical Applications such as MS Exchange.

As MS Exchange MBX or MSR VMs can be very large in terms of vCPU and vRAM, understanding and choosing the appropriate settings is critical to the success of not only the MS Exchange deployment but also any other VMs sharing the same HA cluster.

Let’s start with the “Admission Control Setting“.

Admission control can be configured in either “Enabled” or “Disabled” mode. “Enabled” means that if the Admission Control Policy (discussed later in this post) is going to be breached by powering on one or more VMs, the VM will not be permitted to power on, which guarantees a minimum level of performance for the running VMs.

If the setting is “Disabled”, VMs will be powered on regardless of available capacity. This can lead to significant contention for compute resources, which for MS Exchange MBX or MSR VMs is far from ideal.

As a result, it is my strong recommendation that the “Admission Control Setting” be set to “Enabled”.

Next let’s discuss the “Admission Control Policy”.

There are three policies to choose from (shown below), each with its own pros and cons.

[Image: Admission Control Policies]

1. Host failures the cluster tolerates

This option is the default and the most conservative. However, it calculates cluster capacity using what many describe as a very inefficient algorithm based on “slot sizes”.

A slot size is calculated by taking the largest VM from a vCPU perspective AND the largest VM from a vRAM perspective and combining the two. The cluster then calculates how many “slots” it can support.

The issue is that in environments with a range of VM sizes, a small VM with 1 vCPU and 1GB RAM consumes one slot, as does a VM with 8 vCPU and 64GB RAM. This leads to a very low consolidation ratio, an unnecessarily high number of ESXi hosts, and underutilization.
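To illustrate why, below is a minimal Python sketch of the slot logic as described above. It is a simplification (the real calculation uses CPU and memory reservations plus overhead rather than configured vCPU/vRAM), and all VM and host figures are hypothetical.

```python
# A minimal sketch of slot-based admission control as described above. This is
# a simplification (the real calculation uses CPU/memory reservations plus
# overhead, not configured vCPU/vRAM); all VM and host figures are hypothetical.
vms = [
    {"name": "DC01",      "vcpu": 1, "ram_gb": 1},
    {"name": "DC02",      "vcpu": 1, "ram_gb": 1},
    {"name": "EXCH-MSR1", "vcpu": 8, "ram_gb": 64},
    {"name": "EXCH-MSR2", "vcpu": 8, "ram_gb": 64},
]

# Slot size = largest vCPU count AND largest vRAM across all VMs, combined.
slot_vcpu = max(vm["vcpu"] for vm in vms)
slot_ram_gb = max(vm["ram_gb"] for vm in vms)

# Hypothetical host: 16 cores / 256GB. Slots per host is limited by whichever
# dimension is more constrained.
host_cores, host_ram_gb = 16, 256
slots_per_host = min(host_cores // slot_vcpu, host_ram_gb // slot_ram_gb)

print(f"Slot size: {slot_vcpu} vCPU / {slot_ram_gb} GB")
print(f"Slots per host: {slots_per_host}")  # even a 1 vCPU / 1GB DC consumes a full slot
```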

As such, this policy is not recommended for environments with mixed VM sizes, for example MS Exchange MBX or MSR VMs combined with Domain Controllers.

2. Specify failover hosts

“Specify failover hosts” is a very easy setting to understand: you specify one or more failover hosts, and they do exactly that; if a host fails, all the VMs fail over onto the designated failover host.

Great, but the failover ESXi host then remains powered on doing nothing until such time as there is a failure, so the failover hardware provides no value during normal operations.

As such, this setting is not recommended.

3. Percentage of cluster resources reserved as failover spare capacity

This setting is also fairly easy to understand at a high level, although under the covers it is more complicated and does not work the way many people believe it does.

With that being said, it is a very efficient policy for environments with large VMs like Exchange MBX or MSR.

It avoids the inefficient “slot size” calculation and instead works on virtual machine reservations to calculate cluster capacity.

For VMs with no reservation, 32MHz of CPU and 0MB of RAM (plus memory overhead) are used from vSphere 5.0 onwards. However, for Exchange MBX/MSR VMs, which as discussed in Part 3 should have memory reservations, HA will use the full reserved memory to ensure sufficient cluster capacity exists for the Exchange VM to fail over without impacting memory performance. This is great news, as we don’t want to overcommit memory for Exchange even in a failure scenario.

From a CPU perspective, only the default 32MHz will be reserved for any Exchange MBX or MSR VM which does not have a CPU reservation, so it makes sense from a HA perspective to use CPU reservations for Exchange VMs to ensure sufficient capacity exists within the cluster to tolerate an ESXi host failure.
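The following minimal Python sketch shows how this policy counts capacity from reservations, based on the behaviour described above; all VM and cluster figures are hypothetical.

```python
# A minimal sketch of how the "percentage" policy counts capacity from
# reservations, per the behaviour described above (32MHz / 0MB defaults for
# unreserved VMs from vSphere 5.0 onwards). All figures are hypothetical.
DEFAULT_CPU_MHZ = 32   # applied when a VM has no CPU reservation
DEFAULT_MEM_MB = 0     # plus per-VM memory overhead, ignored here for simplicity

vms = [
    {"name": "DC01",      "cpu_res_mhz": 0,     "mem_res_mb": 0},       # no reservations
    {"name": "EXCH-MSR1", "cpu_res_mhz": 16000, "mem_res_mb": 131072},  # CPU + full memory reservation
]

reserved_cpu_mhz = sum(vm["cpu_res_mhz"] or DEFAULT_CPU_MHZ for vm in vms)
reserved_mem_mb = sum(vm["mem_res_mb"] or DEFAULT_MEM_MB for vm in vms)

# Hypothetical cluster: 4 hosts, each with 2 x 10 cores @ 2.5GHz and 256GB RAM.
cluster_cpu_mhz = 4 * 2 * 10 * 2500
cluster_mem_mb = 4 * 256 * 1024

print(f"CPU counted by admission control: {reserved_cpu_mhz / cluster_cpu_mhz:.1%}")
print(f"RAM counted by admission control: {reserved_mem_mb / cluster_mem_mb:.1%}")
```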

CPU reservations will be discussed in more detail in a future post in this series.

As a result, I recommend using “Percentage of cluster resources reserved as failover spare capacity” for the admission control policy for Exchange environments.
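For those who prefer to configure this via script rather than the vSphere client, here is a hedged pyVmomi (Python) sketch. The vCenter address, credentials, cluster name and the 25% figures are placeholders only; the percentage should come from Table 1 below.

```python
# A hedged pyVmomi sketch of enabling admission control with the percentage
# policy. The vCenter address, credentials, cluster name and the 25% values
# are placeholders; pick percentages per Table 1 below.
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Exchange-Cluster")

das_config = vim.cluster.DasConfigInfo(
    enabled=True,                     # HA on
    admissionControlEnabled=True,     # "Admission Control Setting" = Enabled
    admissionControlPolicy=vim.cluster.FailoverResourcesAdmissionControlPolicy(
        cpuFailoverResourcesPercent=25,      # e.g. N+2 for an 8 host cluster
        memoryFailoverResourcesPercent=25))

cluster.ReconfigureComputeResource_Task(vim.cluster.ConfigSpecEx(dasConfig=das_config),
                                        modify=True)
```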

Next we need to discuss the most suitable percentage to set for CPU and RAM.

The below table shows the required percentage for N+1 (Green) and N+2 (Blue) deployments based on the number of nodes in a vSphere HA cluster.

Table 1:

[Image: Table 1 – Admission control percentages for N+1 and N+2 deployments]

The above is generally what I recommend, as N+2 provides excellent availability, including being able to tolerate a failure during maintenance, or multiple concurrent host failures, with little or no impact to performance after the VMs restart.

So for clusters of fewer than 16 ESXi hosts, N+1 can be considered, but I recommend N+2 for clusters of greater than 16 ESXi hosts.

The next table shows the required percentage for a cluster scaling from N+1 availability for up to 8 hosts, N+2 for up to 16 hosts, N+3 for up to 24 hosts and N+4 for the current maximum vSphere cluster size of 32 hosts.

Table 2:

[Image: Table 2 – Admission control percentages scaling from N+1 to N+4]

It’s safe to say the above table is quite a conservative option (going up to N+4); however, depending on business requirements these HA reservation values may be perfectly suited and are worth considering.
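The percentages in both tables come from simple arithmetic: reserve enough of the cluster to cover the number of host failures you want to tolerate. A minimal Python sketch (rounded up to a whole percent; the original tables may round slightly differently):

```python
# A minimal sketch of the arithmetic behind Tables 1 and 2: reserve enough of
# the cluster to cover the number of host failures you want to tolerate.
import math

def ha_percentage(total_hosts: int, host_failures_to_tolerate: int) -> int:
    return math.ceil(host_failures_to_tolerate / total_hosts * 100)

for hosts in (4, 8, 16, 24, 32):
    print(f"{hosts} hosts: N+1 = {ha_percentage(hosts, 1)}%, "
          f"N+2 = {ha_percentage(hosts, 2)}%")
```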

For more information see:
1. Example Architectural Decision – Admission Control Setting and Policy
2. Example Architectural Decision – VMware HA – Percentage of Cluster Resources Reserved for HA

Next let’s discuss the “HA Virtual Machine Options”.

The screenshot below shows the “Cluster default settings” along with the “Virtual Machine settings” which allow you to override the cluster settings.

[Image: HA Virtual Machine Options]

For the “VM restart priority”, I recommend leaving the “Cluster default setting” as “Medium” (Default).

For “Host Isolation Response” this heavily depends on your underlying storage and availability requirements, as such, I will address this setting in detail later in this series.

For the “VM Restart Priority” under “Virtual Machine Settings“, we have a number of options. If a DAG is being used, one option would be to Disable VM Restart and depend solely on the DAG for availability.

This has the advantage of reducing the compute requirements for the cluster to satisfy HA while still providing the level of availability the DAG delivers, which in many cases will meet the customer’s requirements.

Alternatively, the Exchange MBX or MSR VMs could be configured to “High” to ensure they are restarted as soon as possible following a failure, ahead of less critical VMs such as test/development workloads.
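If you want to script the per-VM override, the following is a hedged pyVmomi sketch; the vCenter address, credentials and VM name are placeholders only.

```python
# A hedged pyVmomi sketch of a per-VM HA restart priority override (the
# "Virtual Machine Settings" shown above). The vCenter address, credentials
# and VM name are placeholders.
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "Exchange-MSR1")
cluster = vm.resourcePool.owner   # the cluster the VM runs in

override = vim.cluster.DasVmConfigSpec(
    operation="add",
    info=vim.cluster.DasVmConfigInfo(
        key=vm,
        dasSettings=vim.cluster.DasVmSettings(restartPriority="high")))

cluster.ReconfigureComputeResource_Task(vim.cluster.ConfigSpecEx(dasVmConfigSpec=[override]),
                                        modify=True)
```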

Regarding “Datastore Heartbeating” and “VM Monitoring“, these will be discussed in future posts.

Recommendations for HA:

1. Set the “Admission Control Setting” to “Enabled”.
2. Set the “Admission Control Policy” to “Percentage of cluster resources reserved as failover spare capacity”.
3. Configure the “Percentage of cluster resources reserved as failover spare capacity” as per Table 1 (at a minimum).

Recommendations for HA Virtual Machine Options:

1. Do not disable HA restart for Exchange MBX or MSR VMs.
2. Leave the “HA restart priority” for Exchange MBX or MSR VMs at the cluster default (“Medium”) for DAG deployments.
3. Set the “HA restart priority” for Exchange MBX or MSR VMs to “High” for non-DAG deployments.

Back to the Index of How to successfully Virtualize MS Exchange.

How to successfully Virtualize MS Exchange – Part 4 – DRS

DRS is a well known feature of vSphere which is designed to help load balance virtual environments for optimal performance.

With most virtual workloads, DRS does an excellent job of load balancing, so leaving DRS set to “Fully Automated” without specifying any DRS rules is fine.

The “Migration Threshold” can be adjusted from Conservative to Aggressive in 5 increments, with the default being “3”, which I recommend.

For more information on this recommendation see : Example Architectural Decision – DRS Automation Level

These two settings are shown below:

[Image: Cluster DRS settings]
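If you prefer to set these via script, the following is a hedged pyVmomi sketch of the same two settings; the vCenter address, credentials and cluster name are placeholders only.

```python
# A hedged pyVmomi sketch of enabling DRS in "Fully Automated" mode with the
# default migration threshold of 3. The vCenter address, credentials and
# cluster name are placeholders.
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Exchange-Cluster")

drs_config = vim.cluster.DrsConfigInfo(enabled=True,
                                       defaultVmBehavior="fullyAutomated",
                                       vmotionRate=3)  # default "Migration Threshold"

cluster.ReconfigureComputeResource_Task(vim.cluster.ConfigSpecEx(drsConfig=drs_config),
                                        modify=True)
```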

However, with MS Exchange VMs being CPU and RAM intensive, it doesn’t make sense to have these VMs moved around automatically if it can be avoided. If Exchange MBX/MSR VMs were vMotioned, the process may take several minutes to complete and, depending on the vMotion configuration and bandwidth, could degrade performance during that time. As a result, avoiding vMotion where possible reduces the risk to Exchange.

Note: I am not saying vMotion does not work, or cannot be configured to work very well for large VMs like MBX/MSR, but if vMotion can be avoided without adding significant complexity or operational cost to an environment, I try to avoid it except during planned maintenance activities.

I still however recommend enabling DRS and configuring it in “Fully Automated” mode, but by combining it with DRS rules for MBX / MSR VMs we can provide both higher and more consistent performance for MS Exchange.

To achieve this I recommend the following:

Create a “Host DRS Group” for each ESXi host in the cluster where Exchange VMs are expected to run, naming each group after its ESXi host to make it easily identifiable.

[Image: Host DRS Group]

Next I recommend creating a “VM DRS Group” per Exchange Mailbox VM, naming each VM DRS Group after the Exchange MBX or MSR server name, or another easily identifiable name such as “Exchange DAG Node 1” shown below.

[Image: VM DRS Group – Exchange DAG Node 1]

Now that we have our “Host DRS Group/s” and “VM DRS Group/s” created, we set up a DRS “Virtual Machines to Hosts” rule per MBX/MSR VM and ESXi host with the policy “Should run on hosts in group” as shown below.

[Image: Exchange VM “Should run on hosts in group” rule]

What the above rule does is ensure the MSR or MBX VM runs only on the specified ESXi host unless there is an ESXi host failure, in which case it can automatically be restarted on another node within the cluster.

[Image: DRS rule – Should run on hosts in group]

The screenshot below shows an example of the recommended DRS rules in an environment with four MSR or MBX servers.

[Image: Exchange “should run” DRS rules]

The above rules will result in the MBX or MSR VMs running on separate hosts as shown below.

[Image: Exchange VMs running one per ESXi host]
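For those scripting the above, here is a hedged pyVmomi sketch that creates one Host DRS Group, one VM DRS Group and the corresponding “Should run on hosts in group” rule; all names are placeholders, and the same pattern is repeated per MBX/MSR VM and ESXi host on a 1:1 basis.

```python
# A hedged pyVmomi sketch of one Host DRS Group, one VM DRS Group and the
# "Should run on hosts in group" rule linking them. The vCenter address,
# credentials, cluster, host and VM names are placeholders; repeat per
# MBX/MSR VM and ESXi host on a 1:1 basis.
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
cluster = next(c for c in content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True).view
    if c.name == "Exchange-Cluster")
host = next(h for h in cluster.host if h.name == "esxi01.lab.local")
vm = next(v for v in content.viewManager.CreateContainerView(
    cluster, [vim.VirtualMachine], True).view if v.name == "Exchange DAG Node 1")

host_group = vim.cluster.HostGroup(name="esxi01.lab.local", host=[host])
vm_group = vim.cluster.VmGroup(name="Exchange DAG Node 1", vm=[vm])

# mandatory=False makes this a "should" (not "must") rule, so HA can still
# restart the VM on another host if esxi01 fails.
rule = vim.cluster.VmHostRuleInfo(name="Exchange DAG Node 1 should run on esxi01",
                                  vmGroupName=vm_group.name,
                                  affineHostGroupName=host_group.name,
                                  enabled=True, mandatory=False)

spec = vim.cluster.ConfigSpecEx(
    groupSpec=[vim.cluster.GroupSpec(info=host_group, operation="add"),
               vim.cluster.GroupSpec(info=vm_group, operation="add")],
    rulesSpec=[vim.cluster.RuleSpec(info=rule, operation="add")])

cluster.ReconfigureComputeResource_Task(spec, modify=True)
```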

Advantages of this DRS configuration:

1. Ensures no compute or network contention between the Exchange VMs
2. Ensures no host storage-layer contention between Exchange VMs, such as HBA or NIC queue depths. Note: This will not eliminate storage contention which may exist at the SAN/NAS layer.
3. DRS will not automatically move an MBX or MSR VM, meaning performance will not be impacted by vMotion activity
4. HA is still fully functional
5. vMotion can still be used if required. e.g.: Prior to host maintenance.
6. DRS will still automatically load balance VMs throughout the cluster to ensure optimal performance of all ESXi hosts
7. More efficient than simply using Anti-Affinity rules for MBX/MSR VMs
8. Ensures two or more DAG members will not be impacted in the event of a single ESXi host failure.

Recommendations for DRS:

1. Set the DRS automation level to “Fully Automated”
2. Set the DRS “Migration Threshold” to “3” (default)
3. Create a “VM DRS Group” per Exchange Mailbox VM
4. Create a “Host DRS Group” on a 1:1 basis with Exchange MSR or MBX VMs
5. Create a DRS “Virtual Machines to Hosts” rule with the policy “Should run on hosts in group” on a 1:1 basis with Exchange MSR or MBX VMs and ESXi hosts
6. Disable Distributed Power Management (DPM) for hosts running Exchange MBX/MSR VMs.

Back to the Index of How to successfully Virtualize MS Exchange.

How to successfully Virtualize MS Exchange – Part 3 – Memory

In Part 1 and Part 2, we discussed how to size and configure Exchange VMs to meet CPU requirements. In Part 3 we will focus on virtual memory (vRAM).

Exchange 2013 is quite RAM intensive, and it is not unusual to have memory requirements of >128GB RAM in larger deployments. As such, one of the first things we should consider is the Virtual Machine maximums.

Luckily in recent years the maximum VM size in vSphere has increased and is no longer a constraint for virtualizing even the largest of Exchange environments.

The current maximum vRAM configuration per VM is shown below:

vSphere Virtual Machine RAM Maximums:

Maximum vRAM per VM: 1TB (vSphere 5.0 or later)
Maximum vRAM per VM: 255GB (vSphere 4.1)

The key point here is that memory is in no way a constraining factor when virtualizing Exchange, even in older vSphere 4.1 deployments.

Memory Sizing

For maximum memory performance, sizing the Exchange VM within a NUMA node gives the maximum benefit from NUMA locality, meaning the latency between CPU and RAM is minimized.

In the event the memory requirements exceed the NUMA node, consider scaling out until you have at least 4 Exchange VMs (across 4 ESXi hosts) before scaling Exchange VMs up. This ensures higher resiliency and aligns with a Virtualization friendly scale out approach. Once the environment has 4 or more Exchange VMs, scaling up beyond the size of a NUMA node can be a good option to reduce the number of Exchange instances to manage and license without significantly impacting resiliency.
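A minimal Python sketch of the NUMA sizing check, using hypothetical host and VM figures:

```python
# A minimal sketch of checking whether an Exchange VM fits within a single
# NUMA node; assumes RAM is evenly split across sockets. Host and VM figures
# are hypothetical.
host_sockets = 2
host_cores_per_socket = 12
host_ram_gb = 256

numa_node_cores = host_cores_per_socket
numa_node_ram_gb = host_ram_gb // host_sockets

vm_vcpu, vm_vram_gb = 10, 96

fits = vm_vcpu <= numa_node_cores and vm_vram_gb <= numa_node_ram_gb
print(f"NUMA node: {numa_node_cores} cores / {numa_node_ram_gb} GB -> VM fits: {fits}")
```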

Memory Overcommitment

ESXi has excellent memory overcommitment capabilities which can work very well depending on the operating system and application running within the guest. However, Exchange is considered a Business Critical Application, so overcommitting memory for Exchange is generally not a good idea and should be avoided where possible.

Memory Reservations

For Exchange VMs, I recommend configuring the VM with “All Memory Locked”, or in other words, a 100% memory reservation.

This has two advantages, the first being consistent memory performance for MS Exchange which is critical to ensure a great end user experience.

The second is the potentially large storage saving, as the vSwap file is eliminated. For example, if an Exchange VM has 128GB RAM and no memory reservation, a 128GB vSwap file will be created by default in the same datastore as the VM’s .vmx file, which could impact storage sizing and performance.
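The following hedged pyVmomi sketch applies the “All Memory Locked” setting to an existing VM; the vCenter address, credentials and VM name are placeholders only.

```python
# A hedged pyVmomi sketch of applying "All Memory Locked" (equivalent to
# ticking "Reserve all guest memory" in the UI) to an existing VM. The vCenter
# address, credentials and VM name are placeholders.
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "Exchange-MSR1")

vm.ReconfigVM_Task(vim.vm.ConfigSpec(memoryReservationLockedToMax=True))
```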

ESXi Host / Cluster Sizing Considerations

Exchange VMs are typically larger than the average VM and as a result can consume a significant percentage of an ESXi host’s memory resources. It is therefore important to size your ESXi hosts with sufficient RAM for the Exchange VMs.

As such in cases where the Exchange VM is sized to exceed the NUMA node, I recommend sizing ESXi hosts to have at least 25% more physical RAM than the vRAM assigned to your Exchange VMs.

Example: If your Exchange VM is assigned 96GB, the ESXi hosts in the cluster should have at least 128GB. This leaves memory for the hypervisor and other smaller VMs, such as Domain Controllers servicing the global catalog requirements for Exchange, without contention.
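A minimal Python sketch of the 25% headroom guideline:

```python
# A minimal sketch of the 25% host RAM headroom guideline described above.
import math

def min_host_ram_gb(largest_vm_vram_gb: int, headroom: float = 0.25) -> int:
    return math.ceil(largest_vm_vram_gb * (1 + headroom))

# 96GB VM -> 120GB minimum; in practice round up to the next standard host
# memory configuration, e.g. 128GB as in the example above.
print(min_host_ram_gb(96))
```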

Recommendations:

1. Set “All Memory Locked” (100% Memory Reservation) for Exchange VMs.
2. Where possible, size the Exchange VM’s RAM within a NUMA node.
3. Where Exchange RAM requirements exceed the NUMA node, ensure ESXi hosts are sized with at least 25% more RAM than the Exchange VM (or the largest vRAM VM in the cluster).
4. Ensure VM vRAM is right-sized after deployment to minimize waste (especially considering the recommendation to use memory reservations).

Back to the Index of How to successfully Virtualize MS Exchange.