Example Architectural Decision – Storage DRS Configuration for NFS Datastores

Problem Statement

In a vSphere environment, a NAS array is presenting Thin Provisioned NFS mounts (Datastores) to the vSphere environment. The storage has deduplication enabled across the datastores being used for the SDRS cluster. What is the most suitable configuration for SDRS to ensure the underlying storage efficiencies are not compromised while maintaining an even distribution of utilized capacity and I/O across all datastores?

Assumptions

1. vSphere 5.0 or later
2. NFS Based storage
3. NFS Mounts (Datastores) are Thin Provisioned
4. Deduplication is enabled on the array
5. VAAI is supported by the array and enabled across the vSphere environment
6. All datastores in a datastore cluster are of the same RAID type and offer similar performance due to having a similar spindle count
7. All datastores are presented to all hosts within the cluster

Motivation

1. Ensure storage efficiencies are not negatively impacted
2. Minimize the vSphere administrator's workload where possible

Architectural Decision

Set the Storage DRS automation setting to “No Automation (Manual Mode)”

  • Set “Utilized Space” threshold to 80%
  • Set “I/O latency” to 15ms
  • I/O Metric Inclusion – Enabled

Advanced Options

  • No recommendations until utilization difference between source and destination is: 10%
  • Evaluate I/O load every 8 Hours
  • I/O Imbalance threshold: 3
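
As a rough illustration of how the space settings above interact, the following Python sketch flags source/destination pairs that an administrator might expect to see as recommendations. It is a simplified model and not VMware's actual algorithm: the Datastore class, the function name and the sample capacities are all hypothetical, and only the 80% threshold and 10% difference come from the decision above.

```python
from dataclasses import dataclass


@dataclass
class Datastore:
    """Hypothetical stand-in for an NFS datastore in the SDRS cluster."""
    name: str
    capacity_gb: float
    used_gb: float

    @property
    def utilization_pct(self) -> float:
        return 100.0 * self.used_gb / self.capacity_gb


def space_move_candidates(datastores, space_threshold_pct=80.0,
                          min_difference_pct=10.0):
    """Return (source, destination) name pairs worth reviewing.

    A source qualifies only when it exceeds the utilized space threshold,
    and a destination only when it is at least min_difference_pct less
    utilized than the source.
    """
    pairs = []
    for src in datastores:
        if src.utilization_pct <= space_threshold_pct:
            continue  # source is not over the utilized space threshold
        for dst in datastores:
            if dst is src:
                continue
            if src.utilization_pct - dst.utilization_pct >= min_difference_pct:
                pairs.append((src.name, dst.name))
    return pairs


# Hypothetical sample data: three 1TB datastores at 85%, 80% and 60%.
sample = [
    Datastore("nfs01", 1000, 850),
    Datastore("nfs02", 1000, 800),
    Datastore("nfs03", 1000, 600),
]
print(space_move_candidates(sample))
```

In Manual Mode each such pair would only ever be a recommendation for the administrator to review, which is the point of the decision.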

Justification

1. Setting Storage DRS to “No Automation (Manual Mode)” ensures that the administrator can confirm the recommendation will not negatively impact the efficiency of deduplication or the thin provisioned NFS mounts
2. When creating a new Virtual Machine, in the “Ready to complete” window, tick the “Show all storage recommendations” check box to review Storage DRS recommendations and override the recommendations where required
3. Where a VM is deduplicated on the source datastore and it is moved to the destination datastore, this write activity is considered new data which will be scanned by the post deduplication process, which will use valuable CPU cycles on the array
4. “XCOPY” is not supported for NFS. As such, Storage vMotion activity can only be offloaded to the array using “Full File Clone”, and only when the virtual machine is powered off.
5. Array level snapshots cannot be migrated with the VM using Storage DRS. If virtual machines were moved automatically, the array level snapshot relationship with the VM would be broken and the snapshots could not be leveraged
6. NFS datastores can be set to autogrow by a predefined size in the event they reach a predefined utilization threshold
7. Where a significant I/O imbalance is detected by SDRS, the vSphere administrator can consider the impact of the Storage vMotion and where suitable apply the SDRS recommendation
8. SDRS still provides valuable “initial placement” for new virtual machines which will help avoid a situation where datastores are unevenly balanced from a capacity perspective
9. Storage DRS will still analyze I/O and where an imbalance is identified the vSphere administrator can choose to apply the SDRS recommendation to address the I/O imbalance

Implications

1. When selecting datastores for the datastore cluster, having VASA enabled allows the “System Capability” column to be populated in the “New Datastore Cluster” wizard to ensure suitable datastores of similar performance, RAID type and features are grouped together
2. A vSphere administrator will need to review SDRS recommendations

Alternatives

1. Use “Fully Automated”

Example Architectural Decision – HA Admission Control Policy with Software licensing constraints

High Availability Admission Control Setting & Policy with a Software Licensing Constraint

Problem Statement

The customer has a requirement to virtualize “Application X” which is currently running on physical servers. The customer is licensed for a maximum of 32 cores and the software vendor has strict licensing restrictions which do not recognize the use of DRS rules to restrict virtual machines to a sub-set of hosts within a cluster.

The application is Tier 1, and requires maximum availability. A capacity planner assessment has been conducted and found 32 cores and 256GB RAM is sufficient to run all servers.

The servers' requirements vary greatly from 1vCPU/2GB RAM to 8vCPU/64GB RAM, with the bulk of the VMs 2vCPU or less with varying RAM sizes.

What is the most suitable hardware configuration and HA admission control policy / setting that complies with the licensing restrictions while ensuring N+1 redundancy and minimizing the chance of poor application performance?

Assumptions

1. None

Constraints

1. Software vendor has strict licensing requirements
2. Only 32 cores are licensed and the customer has no budget for further licenses
3. DRS rules cannot be used to isolate VMs onto one or more hosts due to software licensing agreement

Motivation

1. Ensure maximum availability for the Tier 1 application/s
2. Ensure optimal performance for Tier 1 application/s

Architectural Decision

Purchase a total of three (3) x two (2) way servers, with 8 core CPUs and 128GB RAM each, and form a cluster of three nodes.

For the HA Admission control setting use “Enable – Do not power on virtual machines that violate availability constraints”

For the HA admission control policy use “Specify a Failover Host” and select the third host in the cluster. (Leaving two active hosts in the cluster).

Justification

1. Enabling strict admission control is critical to ensure the required level of availability for the Tier 1 application
2. Ensure maximum CPU scheduling efficiency by having two hosts active within the cluster running virtual machines as opposed to a single large host
3. Having 2 active hosts in the cluster allows DRS some flexibility to load balance to resolve contention compared to using a single large 32 core host
4. N+1 redundancy is achieved as one host can fail and the “fail-over” host will become active and be able to take the failed host's workloads without performance degrading
5. As only 32 cores (2 servers with 16 cores each) are active at any one time, the solution complies with the licensing constraint
6. Using CPUs with smaller numbers of cores (such as 5 x 2 way servers with 4 cores per socket) would result in larger VMs not fitting within NUMA nodes and potentially impacting memory performance. Although, with vNUMA in vSphere 5.0 this would be less of an issue.
7. All VMs will fit within a NUMA node thus giving the VMs maximum performance without the requirement for vNUMA which is only available in vSphere 5.0 or later
8. The compute resource supplied by the proposed cluster is sufficient to run the workloads as per the capacity planner assessment.
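
The NUMA reasoning in points 6 and 7 can be checked with a small Python sketch. It assumes one NUMA node per socket and looks only at vCPU count, ignoring memory locality; the function name and the list of VM sizes are illustrative, with the sizes picked from the 1 to 8 vCPU range given in the problem statement.

```python
def fits_in_numa_node(vm_vcpus: int, cores_per_socket: int) -> bool:
    """True if the VM's vCPUs fit on one socket (assumed to be one NUMA node)."""
    return vm_vcpus <= cores_per_socket


# Illustrative VM sizes within the 1 to 8 vCPU range from the problem statement.
vm_sizes = [1, 2, 4, 8]

for cores_per_socket in (8, 4):
    too_wide = [v for v in vm_sizes
                if not fits_in_numa_node(v, cores_per_socket)]
    print(f"{cores_per_socket} cores per socket: "
          f"VM sizes spanning NUMA nodes = {too_wide}")
```

With 8 cores per socket nothing spans a node, while with 4 cores per socket the 8 vCPU VMs do, which is the case point 6 warns about.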

Implications

1. Additional networking and storage ports for three hosts as opposed to a two host cluster
2. If additional compute is required in the cluster, additional software licenses would need to be purchased. Alternatively, if the application servers were redesigned to use a scale out methodology (especially for VMs with 4-8vCPUs) it would likely result in higher overcommitment ratios without significant contention and better utilization of the existing licensed cores
3. One host is sitting as a hot standby not servicing customer workloads and may be considered to be “waste”

Alternatives

1. Use 2 x 4 way 8 core ESXi hosts (32 cores per host) and set HA admission control to specify a fail over host
2. Use 5 x 2 Way 4 core ESXi hosts (8 cores per host) and set HA admission control to specify a fail over host
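
The core counts behind the chosen design and the two alternatives can be compared with a short Python sketch. The function and dictionary names are hypothetical; the host counts, socket counts and cores per socket come from the decision and alternatives above, and one host is assumed to be reserved as the failover host in every option.

```python
def cluster_summary(hosts: int, sockets_per_host: int, cores_per_socket: int,
                    failover_hosts: int = 1) -> dict:
    """Summarize total and active core counts when failover hosts sit idle."""
    cores_per_host = sockets_per_host * cores_per_socket
    active_hosts = hosts - failover_hosts
    return {
        "cores_per_host": cores_per_host,
        "active_hosts": active_hosts,
        "active_cores": active_hosts * cores_per_host,
        "total_cores": hosts * cores_per_host,
    }


# (hosts, sockets per host, cores per socket)
options = {
    "3 x 2-way, 8-core (chosen)": (3, 2, 8),
    "2 x 4-way, 8-core": (2, 4, 8),
    "5 x 2-way, 4-core": (5, 2, 4),
}

for label, spec in options.items():
    s = cluster_summary(*spec)
    print(f"{label}: {s['active_hosts']} active hosts, "
          f"{s['active_cores']} active cores of {s['total_cores']} installed")
```

All three options keep 32 cores active, so the choice between them rests on the scheduling, NUMA and DRS flexibility points in the justification rather than on the license count.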

Below is a basic diagram of the proposed solution.

[Diagram: FailoverHost]

*Post updated February 11th to correct an error.

High CPU Ready with Low CPU Utilization?

I have noticed an increasing number of search engine terms bringing people to my blog, similar to:

* High CPU Ready Low CPU usage
* CPU ready and Low utilization
* CPU ready relationship to utilization

So I wanted to try and clear this issue up.

First, let's define CPU Ready & CPU Utilization.

CPU ready (percentage) is the percentage of time a virtual machine is waiting to be scheduled onto a physical (or HT) core by the CPU scheduler.

CPU utilization measures the amount of MHz or GHz that is being used.
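
Performance charts often report CPU ready as a summation in milliseconds rather than a percentage. The Python sketch below converts one to the other; the function name is hypothetical, and the 20 second default is an assumption matching the sample interval of vCenter real-time charts, so pass the interval that matches the chart you are reading.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0) -> float:
    """Convert a CPU ready summation (ms) into a percentage of the interval.

    interval_s is the chart sample interval in seconds (assumed 20s here).
    """
    return 100.0 * ready_ms / (interval_s * 1000.0)


# 2,000 ms of ready time in a 20 second sample is 10% CPU ready.
print(cpu_ready_percent(2000))
```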

Next to find out how much CPU ready is ok, check out my post How Much CPU ready is OK?

CPU Ready and CPU utilization have very little to do with each other. High CPU utilization does not mean you will have high CPU ready, and vice versa.

So it is entirely possible to have either of the below scenarios

Scenario 1: An ESXi host has 20% CPU utilization and its VMs suffer high CPU ready (>10%).
Scenario 2: An ESXi host has 95% CPU utilization and its VMs have little or no CPU ready (<2.5%).

How are the above two scenarios possible?

Scenario 1 may occur when

* One or more VMs are oversized (ie: not utilizing the resources they are assigned)
* The host (or cluster) is highly overcommitted (either with or without right sized VMs)
* Where power management settings are set to Balanced / Low Power or custom

Scenario 2 may occur when

* VMs are correctly sized
* The ESXi hosts are well sized for the virtual machine workloads
* The VM to host ratio has been well architected

So the question on everyone's lips: how can high CPU ready with low CPU utilization be addressed/avoided?

If you have a situation where you are experiencing high CPU ready and low ESXi host utilization the following steps should be taken

* Right size your VMs

This is by far the most important thing to do. I recommend using a tool such as vCenter Operations to assist with determining the correct size for VMs.

* Ensure your hosts/clusters are not excessively overcommitted

I generally find 4:1 vCPU overcommitment is achievable with right sized VMs where the avg VM size is <4 vCPUs. The higher the vCPU per VM average, the lower the CPU overcommitment you will achieve.
If you have an average VM size of 8 vCPUs then you may only see <1.5:1 overcommitment before suffering contention (CPU ready).

* Use DRS affinity rules to keep complementary workloads together
VMs with high CPU utilization and VMs with very low CPU utilization can work well together. You may also have an environment where some servers are busy overnight and others are only busy during business hours. These are examples of workloads to keep together.

* Use DRS anti-affinity rules to keep non-complementary workloads apart

VMs with very high CPU utilization (assuming the high utilization is at the same time) can be spread over a number of hosts to avoid stress on the CPU scheduler.

* Ensure your ESXi hosts are chosen with the virtual machine workloads in mind
If your VMs are >=8vCPUs, choose a CPU with >=8 cores per socket and more sockets per host, such as 4 socket hosts as opposed to 2 socket hosts. If the bulk of your VMs are 1 or 2 vCPUs, then even older 2 socket 4 core processors should generally work well.

* Use Hyperthreading
Assuming you have a mix of workloads and not all VMs require large amounts of cores and GHz, using Hyperthreading increases the efficiency of the CPU scheduler by effectively doubling the scheduling opportunities. Note: A HT core will generally give much less than half the performance of a pCore.

* Use “High Performance” for your Power Management Policy

The above seven (7) steps should resolve the vast majority of issues with CPU ready.
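
To see where a host sits against the overcommitment rule of thumb given in the second step, the vCPU to physical core ratio and the average VM size can be worked out as in the Python sketch below. The function names and the sample host (16 physical cores running 32 VMs with 2 vCPUs each) are hypothetical.

```python
def vcpu_overcommit_ratio(vm_vcpus: list, physical_cores: int) -> float:
    """Total allocated vCPUs divided by physical cores (HT threads excluded)."""
    return sum(vm_vcpus) / physical_cores


def average_vm_size(vm_vcpus: list) -> float:
    """Average number of vCPUs per VM."""
    return sum(vm_vcpus) / len(vm_vcpus)


# Hypothetical host: 16 physical cores running 32 VMs with 2 vCPUs each.
vms = [2] * 32
print(f"overcommitment {vcpu_overcommit_ratio(vms, 16):.1f}:1, "
      f"average VM size {average_vm_size(vms):.1f} vCPUs")
```

A result of 4.0:1 with an average VM size under 4 vCPUs is in line with the rule of thumb, although CPU ready itself remains the measure that tells you whether the host is actually contended.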

For an example of the benefits of right sizing your VMs, check out my earlier post – VM Right Sizing , An example of the benefits.

Also please note, using CPU reservations does not solve CPU ready. I have also written an article on this topic – Common Mistake – Using CPU reservations to solve CPU ready

I hope this helps clear up this issue.