Problem Statement
In a production , self service vCloud Director environment, What is the most suitable Storage DRS configuration to improve storage utilization , performance, as well as reduce administrative effort for BAU staff?
Requirements
1. Make the most efficient use of the available storage capacity
2. Maintain consistent level of storage performance
3. Reduce the risk and overhead of capacity management
4. Reduce the risk of a unintentional or otherwise DoS event caused by self service
Assumptions
1. vSphere 5.0 or later
2. VMFS 5 Datastores which are Thick Provisioned
3. Deduplication is not in use
4. VAAI is supported by the array and enabled across the vSphere environment
5. All datastores in each respective Datastore clusters reside on the same RAID type with similar spindle types and count
6. All datastores are presented to all hosts within the cluster
7. Array level snapshots are not in use
8. IBM SVC Storage is being used
9. vCloud Director 5.1 or later
10. Storage I/O Control is enabled at set to 30ms
Constraints
1. IBM SVC storage does not currently support VASA (VMware API for Storage Awareness)
Motivation
1. Ensure production storage performance is not negatively impacted
2. Minimize the vSphere administrators workload where possible
Architectural Decision
Set the DRS automation setting to “Fully Automated”
- Set “Utilized Space” threshold to 80%
- Set “I/O latency” to 15ms
- I/O Metric Inclusion – Enabled
Advanced Options
- No recommendations until utilization difference between source and destination is: 10%
- Evaluate I/O load every 8 Hours
- I/O Imbalance threshold 4
Justification
1. Setting Storage DRS to “Fully Automated” ensures that the administrator does not need to be concerned with initial placement of virtual machines as this will be dynamically and intelligently determined and executed
2. “XCOPY” is fully supported for Block based storage, as such, any Storage vMotion activity is offloaded to the array therefore removing the I/O overhead on the compute and storage fabric.
3. Where a significant I/O imbalance is detected by SDRS, the vSphere administrator is not required to take any action, Storage DRS can identify and remediate issues which fall outside parameters (which are determined by the VMware Architect) automatically. This improves the efficiency of the environment, and reduces the involvement of BAU.
4. SDRS provides valuable “initial placement” for new virtual machines which will help avoid a situation where datastores are unevenly balanced from a capacity perspective in the first place, therefore reducing the chance of virtual machines requiring migration.
5. Setting the “No recommendations until utilization difference between source and destination is” to 20% ensures that SDRS does not move virtual machines around where significant benefit is not realized This prevents unnecessary Storage vMotion activity on the disk system, although this is offloaded from the host to the array, the I/O still may impact production performance for workloads on the same disk system.
6. Setting the “I/O Imbalance threshold (5 Aggressive / Conservative 1 ) to “4” (2nd most aggressive) ensures that I/O imbalance should be addressed before significant imbalance is experienced by the end users. This level of “ aggressiveness” is acceptable as the Storage vMotion can be offloaded (via VAAI “XCOPY” primitive and has almost zero impact on the host. Setting this to “5” may result in minor I/O imbalances being corrected, at the cost of a Storage vMotion and as a result the impact of the more frequent Storage vMotion activity may negate the benefit of the I/O balancing.
7. Storage DRS will address I/O imbalance across the datastore cluster if the latency meets or exceeds the set value of 15ms (the default) and in the event of latency increasing during peak times to >=30ms , Storage I/O Control will ensure fair acess to the storage.
Alternatives
1. Use “No Automation (Manual Mode)”
2. Not use Storage DRS
Implications
1. When selecting datastores for the datastore cluster, having VASA enabled allows the “System Capability” column to be populated in the “New Datastore Cluster” wizard to ensure suitable datastores of similar performance, RAID type and features are grouped together. VASA is currently NOT supported by SVC, as such the datastore naming convension needs to accurately reflect the capabilities of the LUN/s to ensure suitable datastores are grouped together.