Example Architectural Decision – Datastore (LUN) and Virtual Disk Provisioning (Thin on Thin)

Problem Statement

In a vSphere environment, what is the most suitable disk provisioning type to use for the LUN and the virtual machines to ensure minimum storage overhead and optimal performance?

Requirements

1. Ensure optimal storage capacity utilization
2. Ensure storage performance is both consistent & maximized

Assumptions

1. vSphere 5.0 or later
2. VAAI is supported and enabled
3. The time frame to order new hardware (eg: New Disk Shelves) is <= 4 weeks
4. The storage solution has tools for fast/easy capacity management

Constraints

1. Block Based Storage

Motivation

1. Increase flexibility
2. Ensure physical disk space is not unnecessarily wasted

Architectural Decision

“Thin Provision” the LUN at the Storage layer and “Thin Provision” the virtual machines at the VMware layer

(Optional) Do not present more LUNs (capacity) than you have underlying physical storage (Only over-commitment happens at the vSphere layer)
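
The optional rule above comes down to simple arithmetic. The sketch below is illustrative only: the function name and the capacity figures are hypothetical and not taken from any vendor tool.

```python
def storage_layer_overcommitted(lun_sizes_gb, physical_capacity_gb):
    """Return True if the LUNs presented exceed the usable physical capacity."""
    return sum(lun_sizes_gb) > physical_capacity_gb


# Hypothetical example: four 2 TB thin provisioned LUNs on 10 TB of usable disk.
luns_gb = [2048, 2048, 2048, 2048]
print(storage_layer_overcommitted(luns_gb, 10240))  # False: 8192 GB <= 10240 GB
```

If this check returns False, any over-commitment can only come from thin provisioned VMDKs at the vSphere layer.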

Justification

1. Capacity can be easily managed using storage vendor tools, e.g. NetApp VSC / EMC VSI / Nutanix Command Center
2. Thin Provisioning minimizes the impact of situations where customers demand a lot of disk space up front when they only end up using a small portion of the available disk space
3. Increases flexibility as all unused capacity of all datastores and the underlying physical storage remains available
4. Creating VMs with “Thick Provisioned – Eager Zeroed” disks would unnecessarily increase the provisioning time for new VMs
5. Creating VMs as “Thick Provisioned” (Eager or Lazy Zeroed) does not provide any significant benefit (ie: Performance) but adds a serious capacity penalty
6. Using Thin Provisioned LUNs increases the flexibility at the storage layer
7. VAAI automatically raises an alarm in vSphere when the usage of a Thin Provisioned datastore reaches >= 75% of its capacity
8. The impact of SCSI reservations causing performance issues (increased latency) when thin provisioned virtual machines (VMDKs) grow is no longer an issue as the VAAI Atomic Test & Set (ATS) primitive alleviates the issue of SCSI reservations.
9. Thin provisioned VMs reduce the overhead of Storage vMotion, Cloning and Snapshot activities. E.g. for Storage vMotion, it eliminates the requirement for Storage vMotion (or the array, when offloaded by the VAAI XCOPY primitive) to relocate “white space”
10. Thin provisioning leaves maximum available free space on the physical spindles which should improve performance of the storage as a whole
11. Where there is a real or perceived issue with performance, any VM can be converted to Thick Provisioned non-disruptively using Storage vMotion.
12. Using Thin Provisioned LUNs with no actual over-commitment at the storage layer reduces the risk of out-of-space conditions, and the dependency on monitoring, while maintaining flexibility and efficiency.
13. The VAAI UNMAP primitive provides automated space reclamation to reduce wasted space from files or VMs being deleted
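
To make the capacity figures in this list concrete, here is a minimal sketch of how datastore usage and the vSphere-layer over-commitment could be summarised. The 75% threshold comes from the alarm described above. The function name, field names and sample numbers are hypothetical, and the input would come from whatever monitoring tool is in use.

```python
ALARM_THRESHOLD = 0.75  # usage level at which the thin provisioning alarm is raised


def datastore_report(capacity_gb, used_gb, provisioned_vmdk_gb):
    """Summarise one thin provisioned datastore.

    capacity_gb         -- size of the datastore (LUN)
    used_gb             -- space actually consumed by thin VMDKs
    provisioned_vmdk_gb -- total size the VMDKs could grow to
    """
    usage = used_gb / capacity_gb
    return {
        "usage_pct": round(usage * 100, 1),
        "overcommit_ratio": round(provisioned_vmdk_gb / capacity_gb, 2),
        "alarm": usage >= ALARM_THRESHOLD,
    }


# Hypothetical datastore: 2 TB LUN, 1.5 TB consumed, 3 TB of VMDKs provisioned.
print(datastore_report(2048, 1536, 3072))
```

An over-commit ratio above 1.0 is expected with thin provisioned VMDKs. It is the usage percentage that signals when capacity needs attention.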

Alternatives

1.  Thin Provision the LUN and thick provision virtual machine disks (VMDKs)
2.  Thick provision the LUN and thick provision virtual machine disks (VMDKs)
3.  Thick provision the LUN and thin provision virtual machine disks (VMDKs)

Implications

1. If the storage at the vSphere and array level is not properly monitored, out-of-space conditions may occur, which will lead to downtime for VMs requiring additional disk space. VMs not requiring additional disk space can continue to operate even where there is no available space on the datastore.
2. The storage may need to be monitored in multiple locations increasing BAU effort
3. It is possible for the vSphere layer to report sufficient free space when the underlying physical capacity is close to or entirely used
4. When migrating VMs from one thin provisioned datastore to another (ie: Storage vMotion), the storage vMotion will utilize additional space on the destination datastore (and underlying storage) while leaving the source thin provisioned datastore inflated even after successful completion of the storage vMotion.
5. While the VAAI UNMAP primitive provides automated space reclamation, this is a post-process. As such, you still need to maintain sufficient available capacity for VMs to grow prior to UNMAP reclaiming the dead space.
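
The space accounting behind the Storage vMotion and UNMAP implications can be illustrated with a small worked example. This is a simplified model with hypothetical names and numbers. It ignores deduplication, compression and snapshots.

```python
def physical_usage_after_svmotion(source_used_gb, dest_used_gb, vm_gb, reclaimed=False):
    """Physical space consumed by two thin provisioned LUNs after a VM of
    vm_gb is moved from the source datastore to the destination.

    Until dead space is reclaimed, the blocks the VM occupied on the source
    LUN stay allocated on the array, so the VM is effectively counted twice.
    """
    dest_after = dest_used_gb + vm_gb
    source_after = source_used_gb - vm_gb if reclaimed else source_used_gb
    return source_after + dest_after


# Hypothetical: 1000 GB used on the source, 500 GB on the destination, 200 GB VM.
print(physical_usage_after_svmotion(1000, 500, 200))                  # 1700
print(physical_usage_after_svmotion(1000, 500, 200, reclaimed=True))  # 1500
```

The 200 GB difference is the headroom that must exist on the array until the dead space is reclaimed.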

Related Articles

1. Datastore (LUN) and Virtual Disk Provisioning (Thin on Thick)

 

Storage DRS Configuration – Architectural Decision making flowchart

I was speaking to a number of people recently, who were trying to come up with a one size fits all Storage DRS configuration for a reference architecture document.

As Storage DRS is a reasonably complicated feature, it was my opinion that a one size fits all would not be suitable, and that multiple examples should be provided when writing a reference architecture.

A colleague suggested a flowchart would assist in making the right decision around Storage DRS, so I took up the challenge to put one together.

Below is my version 0.1 of the flowchart, which I thought I would post to hopefully get some good feedback from the community and create a good guide for those who may not have the in-depth knowledge or experience to choose what should, in most cases, be an appropriate configuration for SDRS.

This also complements some of my previous example architectural decisions, which are shown in the related articles section below.

As always, feedback is welcome.

I hope you find this helpful.

* Updated to include the previously missing “NO” option for Data replication.

SDRS flowchart V0.2

Related Articles

1. Example Architectural Decision – Storage DRS configuration for NFS datastores

2. Example Architectural Decision – Storage DRS configuration for VMFS datastores

Example Architectural Decision – vSphere configuration for handling APD/PDL scenarios

Problem Statement

What is the best way to configure the vSphere environment to handle All Paths Down (APD) and Permanent Device Loss (PDL) situations where the environment uses Active/Active (IBM SVC) storage with FC connectivity via a dedicated highly available Storage Area Network (SAN) fabric?

Requirements

1. Ensure in the event of storage issues the impact to the vSphere environment is minimized.
2. Where possible have the environment automatically respond in the event of storage problems

Assumptions

1. vSphere 5.1 or later
2. The Storage Area Network (SAN) fabric is highly available (>99.999% availability)
3. All storage is FC (block) based via an Active/Active Disk array (IBM SVC disk system)
4. All ESXi hosts have storage connectivity via multiple HBAs
5. All ESXi hosts are connected to two (2) physically separate FC switches
6. The Path Selection Plugin (PSP) being used is “VMW_PSP_RR” (Round Robin)

Constraints

1. None

Motivation

1. Minimize impact of APD and PDL situations

Architectural Decision

Configure the following advanced settings

Set “Misc.APDHandlingEnable” to 1 (Enabled)
Set “Misc.APDTimeout” to 20 (140 seconds is default)

Set “disk.terminateVMOnPDLDefault” to 1 (Enabled)
Set “das.maskCleanShutdownEnabled” to 1 (Enabled)
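
As a rough illustration, the four values above can be captured as data and compared against what is actually configured. This sketch does not call any VMware API. The `find_drift` helper and the `current` dictionary are hypothetical, and the current values are assumed to be gathered by whatever tooling is already in use.

```python
# Desired values from the Architectural Decision above.
DESIRED_SETTINGS = {
    "Misc.APDHandlingEnable": 1,
    "Misc.APDTimeout": 20,
    "disk.terminateVMOnPDLDefault": 1,
    "das.maskCleanShutdownEnabled": 1,
}


def find_drift(current_settings):
    """Return {name: (current, desired)} for every setting that differs.

    A setting missing from current_settings is reported with a current
    value of None.
    """
    return {
        name: (current_settings.get(name), desired)
        for name, desired in DESIRED_SETTINGS.items()
        if current_settings.get(name) != desired
    }


# Hypothetical environment still running with a 140 second APD timeout.
current = {
    "Misc.APDHandlingEnable": 1,
    "Misc.APDTimeout": 140,
    "disk.terminateVMOnPDLDefault": 1,
}
print(find_drift(current))
```

An empty result means the environment matches the decision.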

Justification

1. The storage array (IBM SVC) operates in an Active/Active manner where the Path Selection Plugin (PSP) is either “VMW_PSP_RR” (Round Robin), “VMW_PSP_MRU” (Most Recently Used) or “VMW_PSP_FIXED_AP” (Note: now included in VMW_PSP_FIXED in vSphere 5.1). In the event of one or more path failures, the PSP will handle the event and use a working path. Where an APD situation occurs in a highly available SAN fabric, it is likely the issue is a catastrophic failure and it is ideal to terminate I/O as soon as possible. As such, lowering “Misc.APDTimeout” to 20 (the minimum) allows for a short outage but does not allow the VM to continue attempting I/O where it cannot be committed to disk.

2. After 20 seconds, any I/O from the VMs will be “fast-failed” with a status of “No_Connect” to prevent “hostd” worker threads being exhausted and causing the “hostd” service to hang, thus increasing resiliency at the ESXi layer.

3. In the event not all hosts in the cluster are impacted by the PDL, HA can detect the PDL on one (or more) hosts and restart the virtual machines on one of the hosts in the cluster which does not have the PDL state on the datastore/s.

4. Having “disk.terminateVMOnPDLDefault” enabled ensures VMs are shut down in a PDL event.

5. The “das.maskCleanShutdownEnabled” setting allows VMs shut down as a result of a PDL to be automatically restarted by HA.

6. Setting the Misc.APDTimeout to “20” does not impact the storage connectivity even in the event of a single SVC cluster node failing as all Storage is Active on all SVC cluster nodes. Note: Half the paths would be lost in the event of a failed SVC cluster node but this does not constitute an APD situation.
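
The timeout behaviour described in this section can be pictured as a simple threshold. The sketch below is a deliberately simplified model of that description and not a representation of ESXi internals. The function name is hypothetical.

```python
APD_TIMEOUT_SECONDS = 20  # value chosen for Misc.APDTimeout in this decision


def io_behaviour(seconds_since_apd_start, timeout=APD_TIMEOUT_SECONDS):
    """Simplified model: I/O is retried until the timeout, then fast-failed."""
    if seconds_since_apd_start < timeout:
        return "retry"
    return "fast-fail (No_Connect)"


# With the default of 140 seconds, I/O would still be retried at 60 seconds.
print(io_behaviour(60, timeout=140))  # retry
print(io_behaviour(60))               # fast-fail (No_Connect)
```

Lowering the timeout shortens the window in which I/O is retried against storage that is unlikely to come back quickly.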

Alternatives

1. Set “Misc.APDHandlingEnable” to 0 (Disabled)
2. Leave “Misc.APDTimeout” at 140 (default) OR set a higher or lower value (20 Min / 99999 Max)
3. Set “das.maskCleanShutdownEnabled” to Disabled
4. Set “disk.terminateVMOnPDLDefault” to 0 (Disabled)
5. Various combinations of the above

Implications

1. After 20 seconds, any I/O from the VMs will be “fast-failed” with a status of “No_Connect”. In the unlikely event of an outage lasting >20 seconds, manual intervention will be required.
2. In the event of an APD situation, virtual machines will not be restarted by HA, even where other ESXi hosts are not impacted by the APD situation.
3. Due to the nature of an APD situation, there is no clean way to recover. Once the issue is resolved at the SAN fabric or disk system layer, ESXi hosts may need to be rebooted.
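
The difference between the two failure types under this configuration can be summarised in a small lookup. This is only a restatement of the outcomes described in this article, with hypothetical function and key names.

```python
def expected_response(condition):
    """Map a storage condition to the outcome this decision describes.

    condition is "PDL" or "APD"; anything else raises ValueError.
    """
    if condition == "PDL":
        return {
            "vm_action": "VM shut down on the affected host",
            "ha_restart": True,
            "follow_up": "HA restarts the VM on a host without the PDL state",
        }
    if condition == "APD":
        return {
            "vm_action": "I/O fast-failed after 20 seconds",
            "ha_restart": False,
            "follow_up": "manual intervention; ESXi hosts may need to be rebooted",
        }
    raise ValueError(f"unknown condition: {condition!r}")


for condition in ("PDL", "APD"):
    print(condition, expected_response(condition))
```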

Related Articles

1. Advanced Configuration options for VMware High Availability in vSphere 5.0 and 5.1 (2033250)
