Example Architectural Decision – Storage Protocol Choice for a VMware View Environment using Linked Clones

Problem Statement

What is the most suitable storage protocol for a Virtual Desktop (VMware View) environment using Linked Clones?

Assumptions

1. The Storage Array supports NFS native snapshot offload
2. VMware View 5.1 or later

Motivation

1. Minimize recompose (maintenance) window
2. Minimize impact on the storage array and HA/DRS cluster during recompose activities
3. Reduce storage costs where possible
4. Simplify the storage design, e.g. the number/size of datastores and the storage connectivity
5. Reduce the total solution cost, e.g. the number of hosts required

Architectural Decision

Use Network File System (NFS)

Justification

1. Using native NFS snapshot (VAAI) offloads the creation of VMs to the array, therefore reducing the compute overhead on the ESXi hosts
2. Native NFS snapshots require much less disk space than traditional linked clones
3. Recomposition times are reduced due to the offloading of the cloning to the array
4. More virtual machines can be supported per NFS datastore compared to VMFS datastores (200+ for NFS compared to a maximum recommended 140 per VMFS, with designs generally targeting far fewer, e.g. 64 per VMFS) – see the sizing sketch after this list
5. Recompositions/Refresh activities can be performed during business hours, or at Logoff (for Refresh) with minimal impact to the HA/DRS cluster, thus giving more flexibility to maintain the environment
6. Avoids potential VMFS locking issues – although this issue is not as important for environments using vSphere 4.1 onward with VAAI compatible arrays
7. When sizing your storage array, less capacity is required. Note: Performance sizing is also critical
8. The cost of an FC Storage Area Network can be avoided
9. Fewer ESXi hosts may be required as the compute overhead of driving cloning has been removed
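
The sizing sketch referenced in point 4 above is shown here. It is plain Python arithmetic, assumes a hypothetical pool of 2,000 desktops, and simply reuses the per-datastore figures quoted in the justification; substitute your own pool size and design targets.

import math

def datastores_required(desktop_count, vms_per_datastore):
    # Number of datastores needed for a pool of linked clone desktops
    return math.ceil(desktop_count / vms_per_datastore)

desktops = 2000  # hypothetical pool size
print("VMFS at the common design target of 64 VMs/datastore:", datastores_required(desktops, 64))
print("VMFS at the maximum recommended 140 VMs/datastore:", datastores_required(desktops, 140))
print("NFS at 200 VMs/datastore:", datastores_required(desktops, 200))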

Implications

1. In the current release, 5.1, View Storage Accelerator (formerly Content Based Read Cache or CBRC) is not supported when using Native NFS snapshots (VAAI)
2. Also in the current release, 5.1, “Use native NFS snapshots (VAAI)” is in “Tech Preview” – this is rumored to change in View 5.2

Alternatives

1. Use VMFS (block) based datastores and have more VMFS datastores – Note: Recompose activity will be driven by the hosts, which adds overhead to the cluster.

Example Architectural Decision – Enhanced vMotion Compatibility

Problem Statement

The virtual infrastructure is required to scale over time as demand for compute and/or availability increases.
When purchasing additional ESXi hosts over an expected host hardware life of >=3 years, it is unlikely that the exact make/model of server or CPU type will be available. The solution needs to ensure full functionality (specifically vMotion) across ESXi hosts which may not be identical hardware, although all processors will always be from the same vendor.

How can the vSphere cluster/s be configured for maximum flexibility without significant impact to Virtual machine performance?

Assumptions

1. All CPU types will be Intel or AMD but not a mix of the two
2. All CPUs will have a supported EVC mode

Motivation

1. Ensure full functionality between ESXi hosts whose Intel CPUs may not match exactly
2. Prevent having to purchase large volumes of identical hardware at one time
3. Allow vSphere clusters to be expanded over time using similar, but not identical, hardware while maintaining the same CPU make.

Architectural Decision

Enable EVC and maintain it at the maximum supported EVC level for all ESXi hosts in each vSphere cluster.
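
The sketch below shows one way this decision could be applied programmatically using pyVmomi (the Python bindings for the vSphere API). It is a minimal example only: the vCenter address, credentials and cluster name are placeholders, and the chosen EVC mode should always be validated against the CPUs actually in (or planned for) the cluster.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; use valid certificates in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == "Production-Cluster-01")
    view.Destroy()

    evc_manager = cluster.EvcManager()
    supported = evc_manager.evcState.supportedEVCMode  # modes every current host can run
    if supported:
        # Select the highest baseline supported by all hosts in the cluster
        highest = max(supported, key=lambda mode: mode.vendorTier)
        print("Configuring EVC mode:", highest.key)
        evc_manager.ConfigureEvcMode_Task(evcModeKey=highest.key)
finally:
    Disconnect(si)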

Justification

1. vMotion is a requirement for the cluster/s to ensure maximum flexibility
2. It is essential to avoid downtime where possible. EVC ensures VMs can be vMotion’d to newer hosts for the purpose of expanding a cluster, OR alternatively, to newer hardware so older hardware can be decommissioned without impact to the VM.
3. The EVC level for the cluster can be increased without downtime
4. Having EVC disabled would require downtime for virtual machines being migrated to new hardware where CPU types are not similar
5. If EVC were not enabled, newer hardware may need to be placed into new (smaller) cluster/s, which would add unnecessary HA overhead and reduce the efficiency of DRS

Implications

1. Where the EVC level for a cluster is increased, virtual machines will not leverage new CPU features unmasked by EVC until the next reboot
2. In the event new hardware is added to a cluster and is compatible with a higher EVC mode, a virtual machine whose workload could benefit from CPU features masked by the existing EVC mode may not perform at the optimal level until the older hardware is removed from the cluster and the EVC mode is increased.

Alternatives

1. Leave EVC disabled and, where CPU types are not compatible for vMotion, shut down the guest OS for migrations.

Example Architectural Decision – Virtual Machine swap file location

Problem Statement

When using shared storage where deduplication is utilized along with an array-level, snapshot-based backup solution, what can be done to minimize the capacity wasted by capturing transient files in backup snapshots, and the CPU overhead on the storage controller from attempting to deduplicate data which cannot be deduplicated?

Assumptions

1. Virtual machine memory reservations cannot be used to reduce the vswap file size

Motivation

1. Reduce the snapshot size for backups without impacting the ability to backup and restore
2. Minimize the overhead on the storage controller for deduplication processing
3. Optimize the vSphere / Storage solution for maximum performance

Architectural Decision

1. Configure the HA swap file policy to store the swap file in a datastore specified by the host.
2. Create a new datastore per cluster which is hosted on Tier 1 storage and ensure deduplication is disabled on that volume
3. Configure all hosts within a cluster to use the same specified datastore for vswap files (see the configuration sketch after this list)
4. Ensure the new datastore is not part of any backup job
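
A minimal sketch of points 1 and 3 using pyVmomi is shown below. The vCenter address, credentials, cluster name and the “Cluster01-vSwap” datastore name are placeholders; creating the datastore itself (point 2) and excluding it from backup jobs (point 4) are array and backup tool tasks not shown here.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; use valid certificates in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == "Desktop-Cluster-01")
    view.Destroy()

    # Point 1: store the swapfile in the datastore specified by the host
    spec = vim.cluster.ConfigSpecEx(vmSwapPlacement="hostLocal")
    cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)

    # Point 3: point every host in the cluster at the dedicated vswap datastore
    swap_ds = next(ds for ds in cluster.datastore if ds.name == "Cluster01-vSwap")
    for host in cluster.host:
        host.configManager.datastoreSystem.UpdateLocalSwapDatastore(swap_ds)
finally:
    Disconnect(si)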

Justification

1. With minimal added complexity, backup jobs now exclude the VM swap file which reduces the backup size by the total amount of vRAM assigned to the VMs within the environment
2. As the vswap file is recreated at start-up, losing this file has no consequence
3. Tier 1 storage requirements are decreased
4. The storage controller will not waste CPU cycles attempting to dedupe data which will not dedupe
5. Setting high percentages of memory reservation may impact overcommitment in the environment, whereas specifying a datastore for vswap reduces overhead without any significant downside

Implications

1. A datastore will need to be created for swapfiles
2. HA will need to be configured to store the swap file in a datastore specified by the host
3. The host (via Host Profiles) will need to be configured to use a specified datastore for vswap
4. vMotion performance will not be impacted, as one datastore per cluster will be used for vswap files; the impact of vMotioning a VM between two hosts that do not share a common vswap datastore is therefore avoided
5. The datastore will need to be sized to take into account the total vRAM assigned to VMs within the cluster (a rough sizing sketch follows this list)
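
The sizing sketch referenced in implication 5 is below. The per-VM swap file equals the configured vRAM less any memory reservation; the VM counts, sizes and 10% headroom figure are hypothetical design inputs.

# (count, vRAM in GB, memory reservation in GB) - hypothetical values
vm_profiles = [
    (500, 2, 0),   # task worker desktops
    (200, 4, 0),   # power user desktops
    (10, 16, 8),   # management VMs with partial reservations
]

swap_gb = sum(count * (vram - reservation) for count, vram, reservation in vm_profiles)
headroom = 1.1  # assumed ~10% buffer for pool growth and maintenance
print(f"Minimum vswap datastore capacity: {swap_gb} GB")
print(f"Recommended with headroom: {swap_gb * headroom:.0f} GB")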

Alternatives

1. Set Virtual machine memory reservations of 100% to eliminate the vswap file
2. Store the swap file in the same directory as the virtual machine and accept the overhead on backups & dedupe
3. Use multiple datastores for vSwap across the cluster and accept the impact on vMotion