Example Architectural Decision Competition – Submissions

All suitable Example Architectural Decision submissions will be posted here. Please vote for your favourite decision by leaving a comment on this page with the example decision number.

SUBMISSIONS FOR ROUND 1 (Closed!)

1. TSM backup configuration for PureFlex environment?

2. Use of RDMs in Standard IaaS Clusters

3. Scalable network architecture for VXLAN

4. vCloud Allocation Pool Usable Memory

5. New vSphere 5.x environment

6. Improve Performance for BCAs on Cisco UCS

7. (More Coming Soon)

WINNER ROUND ONE: Use of RDMs in Standard IaaS Clusters by Chris Jones @cpjones44

This design decision works within some fairly strict constraints, such as no >2TB LUNs, no IP-based storage and the inability to customize the monitoring solution.

While the decision is ultimately fairly straightforward, the submission documents the issue well, justifies the decision and discusses its implications in depth.

This is an example of a fairly obvious decision (considering the constraints), but it shows that even where a decision may be obvious, or the only option, understanding the implications is important. Documenting even obvious decisions also matters so that, in the event of movement within the team, the solution can be understood by people who were not involved in the original design process.

RUNNER UP ROUND ONE: TSM backup configuration for PureFlex environment? by Ash Simpson @Yipikaye1

Not unlike Chris Jones’ decision, Ash’s submission works within the constraints of an existing environment, where hardware and software have already been purchased. This is a common issue, where hardware/software is purchased before a detailed design phase. This is a huge problem in the industry, and I encourage you all to ensure this trend does not continue. Without a detailed design phase it is not possible to confirm what hardware/software is required; as such, hardware/software should only be purchased after the design is completed.

Again, this decision is fairly obvious given the constraints, but the submission explains the benefits of this method of configuration and discusses the implications, which is important.

The constraints did not list anything preventing the purchase of a different backup solution, although this is somewhat implied by the assumptions.

Congratulations to Chris Jones @cpjones44 & Ash Simpson @Yipikaye1!

Thank you to everyone who submitted design decisions. I encourage you all to submit new decisions for Round 2, and I look forward to welcoming new competition participants.

SUBMISSIONS FOR ROUND 2 (Closing 31st October 2013)

1. (More Coming Soon)

2. (More Coming Soon)

3. (More Coming Soon)

 

Example Architectural Decision – vMotion configuration for Cisco UCS

Problem Statement

In an environment where a customer has pre-purchased Cisco UCS to replace end-of-life equipment, what is the most suitable way to configure vMotion to make the most efficient use of the infrastructure?

Assumptions

1. vSphere 5.1 or greater
2. Two (2) x 10Gb network interfaces per UCS blade (Cisco Palo adapters)
3. Core & Edge Network topology is in place using Cisco Nexus
4. Cisco Fabric Interconnects are in use

Motivation

1. Optimize performance for vMotion without impacting other traffic
2. Reduce complexity where possible
3. Minimize network traffic across the Nexus core

Architectural Decision

Two (2) vNICs will be presented from the Cisco Fabric Interconnects to each blade (ESXi host), which will appear to the ESXi host as vmNIC0 and vmNIC1.

vNIC0 will be connected to “Fabric A” and vNIC1 will be connected to “Fabric B”.

The vMotion VMkernel (VMK) port for each ESXi host will be configured on a vSwitch (or Distributed vSwitch) with two (2) 10Gb network adapters, with vmNIC0 as “Active” and vmNIC1 as “Standby”.

Fabric failover will not be enabled in the fabric interconnect.

vmNIC Failback at the vSphere layer will be disabled.
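To make the intent concrete, below is a minimal pyVmomi (Python) sketch of the teaming configuration described above. It assumes a standard vSwitch port group named “vMotion” and ESXi uplink names vmnic0/vmnic1 (the lowercase ESXi device names for vmNIC0/vmNIC1); the vCenter hostname, credentials and port group name are placeholders and not part of the original decision. The same result would normally be configured through the vSphere Client or Host Profiles.

```python
# Minimal pyVmomi sketch (assumptions: vCenter reachable, a standard vSwitch
# port group named "vMotion", ESXi uplinks vmnic0/vmnic1 mapped to UCS vNIC0/vNIC1).
# Hostname, credentials and the port group name are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim


def pin_vmotion_portgroup(host, pg_name="vMotion"):
    """Set explicit failover order (vmnic0 active, vmnic1 standby) and disable failback."""
    net_sys = host.configManager.networkSystem
    for pg in net_sys.networkInfo.portgroup:
        if pg.spec.name != pg_name:
            continue
        spec = pg.spec
        teaming = vim.host.NetworkPolicy.NicTeamingPolicy()
        teaming.policy = "failover_explicit"      # use explicit failover order
        teaming.nicOrder = vim.host.NetworkPolicy.NicOrderPolicy(
            activeNic=["vmnic0"], standbyNic=["vmnic1"])
        # rollingOrder=True corresponds to "Failback: No" in the vSphere Client
        teaming.rollingOrder = True
        spec.policy.nicTeaming = teaming
        net_sys.UpdatePortGroup(pgName=pg_name, portgrp=spec)


def main():
    ctx = ssl._create_unverified_context()        # lab only; validate certificates in production
    si = SmartConnect(host="vcenter.example.com",
                      user="administrator@vsphere.local",
                      pwd="CHANGEME", sslContext=ctx)
    try:
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.HostSystem], True)
        for host in view.view:
            pin_vmotion_portgroup(host)
    finally:
        Disconnect(si)


if __name__ == "__main__":
    main()
```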

Justification

1. Under normal circumstances vMotion traffic will only traverse Fabric A and will not impact Fabric B or the core network, thus minimizing north-south traffic.
2. In the event Fabric A suffers a failure of any kind, the VMK for vMotion will fail over to the standby vNIC (vmNIC1), which results in the same optimal configuration, as traffic will only traverse Fabric B and not the core network, again minimizing north-south traffic.
3. Failover is handled by vSphere at the software layer, which removes the requirement for Fabric Failover to be enabled. This gives a vSphere administrator visibility of the networking status without going into UCS Manager.
4. Operational complexity is reduced
5. The solution is self-healing at the UCS layer, and this is transparent to the vSphere environment
6. At the vSphere layer, failback is not required, as using Fabric B for all VMK vMotion traffic is still optimal. In the event Fabric B subsequently fails, the environment will fail over automatically back to Fabric A.

Implications

1. Initial setup has a small amount of additional complexity; however, this is a one-time task (Set & Forget)
2. vNIC0 and vNIC1 need to be manually pinned to Fabric A and Fabric B respectively at the Cisco Fabric Interconnects via UCS Manager; however, this is also a one-time task (Set & Forget). A configuration sketch follows this list.
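For reference, the vNIC-to-fabric pinning can also be scripted with Cisco’s ucsmsdk Python SDK rather than clicked through UCS Manager. This is only a hedged sketch: the UCS Manager hostname, credentials, service profile DN and vNIC names below are hypothetical placeholders, not values from this decision.

```python
# Hedged ucsmsdk sketch: pin vNIC0 to Fabric A and vNIC1 to Fabric B, leaving
# Fabric Failover disabled (switch_id "A"/"B"; "A-B"/"B-A" would enable failover).
# Hostname, credentials, DN and vNIC names are placeholders.
from ucsmsdk.ucshandle import UcsHandle
from ucsmsdk.mometa.vnic.VnicEther import VnicEther

handle = UcsHandle("ucsm.example.com", "admin", "CHANGEME")
handle.login()
try:
    sp_dn = "org-root/ls-esxi-blade-01"   # hypothetical service profile DN
    vnic0 = VnicEther(parent_mo_or_dn=sp_dn, name="vNIC0", switch_id="A")
    vnic1 = VnicEther(parent_mo_or_dn=sp_dn, name="vNIC1", switch_id="B")
    handle.add_mo(vnic0, modify_present=True)
    handle.add_mo(vnic1, modify_present=True)
    handle.commit()
finally:
    handle.logout()
```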

Alternatives

1. Use Route Based on Physical NIC Load and have VMK for vMotion managed automatically by LBT
2. Use vPC and Route based on IP Hash for all vSwitch traffic (including vMotion VMK)
3. Use the Fabric Failover option at the UCS layer using a single vNIC presented to ESXi for all traffic
4. Use the Fabric Failover option at the UCS layer using two vNICs presented to ESXi for all traffic – Each vNIC would be pinned to a single Fabric (A or B)

Thank you to Prasenjit Sarkar (@stretchcloud) for Co-authoring this Example Architectural Decision.

Related Articles

1. Trade-off factor – Cisco UCS Fabric Failover OR OS based NIC teaming using dual fabric (Stretch-cloud – By Prasenjit Sarkar @stretchcloud)
2. Why You Should Pin vMotion Port Groups In Converged Environments (By Chris Wahl @ChrisWahl)

Example Architectural Decision – Hyperthreading with Business Critical Applications (Exchange 2013)

Problem Statement

When virtualizing Exchange 2013 (which the customer considers a Business Critical Application) in a vSphere cluster shared with other production workloads of varying sizes and performance requirements, should Hyper Threading (HT) be used?

Assumptions

1. vSphere 5.0 or greater
2. Exchange Servers are correctly sized day one or are subsequently “Right Sized”
3. Cluster average CPU overcommitment of 3:1

Motivation

1. Ensure Optimal performance for BCAs (Exchange)
2. Ensure Optimal performance for other Virtual servers in the shared vSphere cluster

Architectural Decision

Enable Hyper Threading (HT)

Alternatives

1. Disable Hyper Threading (HT)
2. Enable Hyper Threading but configure Exchange Virtual machine/s with Advanced CPU, HT Sharing Mode of “None” to ensure Exchange is always scheduled onto physical CPU cores
3. Split off a limited number of ESXi hosts and form a dedicated BCA cluster with <=2:1 overcommitment and HT disabled
4. Disable HT on a number of nodes within the cluster but leave HT enabled on other nodes and use DRS rules to pin Exchange VMs to non-HT hosts

Justification

1. Enabling Hyper Threading (HT) improves the efficiency of the CPU scheduler, which will minimize the possibility of CPU Ready for the Exchange server/s and other virtual machines on the host where a low level of overcommitment exists (<2:1)
2. For optimal performance, DRS “Should” rules will be used to keep Exchange (BCA) workloads on specific ESXi hosts within the cluster where <=2:1 CPU overcommitment is maintained
3. Configuring Advanced CPU, HT Sharing Mode to “None” (to guarantee only pCores are used) may result in increased CPU Ready, as the CPU scheduler is forced to find (and wait for) available pCores, which may lead to degraded or inconsistent performance (see the configuration sketch after this list).
4. Sizing for the Exchange solution took into account only pCores (not HT threads), which keeps the sizing simple.
5. As the cluster where Exchange is virtualized is shared with other server workloads of varying importance and performance requirements, HT benefits the vast majority of workloads and results in a higher consolidation ratio and better performance for the vSphere cluster as a whole.
6. In physical servers, enabling Hyper Threading on Exchange servers resulted in wasted or excessive RAM usage for .NET garbage collection, because .NET memory is allocated based on the number of logical cores. This does not impact “Right Sized” virtual machines, as only the required number of vCPUs are assigned to the VM and therefore visible to the guest OS, which avoids memory being wasted on HT cores.
7. The CPU scheduler in vSphere 5.0 or later is very efficient and can intelligently schedule workload on a hyper-thread or a physical core depending on the VM’s CPU demand. While the Exchange server will at some point be scheduled onto an HT thread, this is not likely to be for any extended duration or to have any significant impact on performance.
8. Splitting the cluster into separate BCA and general server workload clusters would increase the HA overhead and effectively reduce the usable compute capacity of the infrastructure.
9. Having a cluster with varying configurations (e.g. HT enabled on some hosts and not others) is not advisable, as it may lead to inconsistent performance and adds unnecessary complexity to the environment
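As a reference for Alternative 2 and Justification 3, the per-VM “HT Sharing” setting can be changed programmatically. The pyVmomi (Python) sketch below is illustrative only: the decision above deliberately leaves the default of “any” in place, and the VM object is assumed to have been retrieved elsewhere.

```python
# Illustrative pyVmomi sketch: set a VM's Hyperthreaded Core Sharing mode.
# Valid modes are "any" (the default, which this decision keeps), "none" and "internal".
# The vm argument is assumed to be an already-retrieved vim.VirtualMachine.
from pyVmomi import vim


def set_ht_sharing(vm, mode="any"):
    """Reconfigure the per-VM HT sharing flag; "none" forces scheduling onto whole physical cores."""
    spec = vim.vm.ConfigSpec()
    spec.flags = vim.vm.FlagInfo(htSharing=mode)
    return vm.ReconfigVM_Task(spec=spec)  # returns a vCenter task to monitor
```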

Implications

1. In the event the vCPU:pCore ratio exceeds 2:1 for any reason (including an HA event or virtual server sprawl), the number of users supported and/or the performance of the Exchange environment may be impacted (a ratio check sketch follows this list)
2. DRS “Should” rules will need to be created to keep Exchange workloads on hosts with a <=2:1 vCPU:pCore ratio
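To keep an eye on Implication 1, the cluster’s vCPU:pCore ratio can be reported programmatically. The pyVmomi (Python) sketch below is a minimal example; it assumes a vim.ClusterComputeResource object has already been retrieved, counts all provisioned vCPUs (not just powered-on VMs), and does not account for the DRS “Should” rule host grouping, which would be monitored separately.

```python
# Minimal pyVmomi sketch: report a cluster's vCPU:pCore ratio so the <=2:1
# target for the Exchange hosts can be monitored over time.
# The cluster argument is assumed to be a vim.ClusterComputeResource.
def vcpu_to_pcore_ratio(cluster):
    pcores = sum(h.hardware.cpuInfo.numCpuCores for h in cluster.host)
    vcpus = sum(vm.config.hardware.numCPU
                for h in cluster.host
                for vm in h.vm
                if vm.config is not None and not vm.config.template)
    return vcpus / float(pcores) if pcores else 0.0


# Example usage (cluster lookup not shown):
# ratio = vcpu_to_pcore_ratio(cluster)
# print("vCPU:pCore ratio is %.2f:1" % ratio)
```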