Melbourne VMUG Presentation 26/7/2012 – Our VCDX Journey

Shane White and I gave a presentation to the Melbourne VMware User Group this past Thursday evening.

For those who are interested, here is a copy of the presentation.

Josh Odgers & Shane White – MVMUG July 27th 2012 – VCDX Experience

The VCDX Application Process

I was asked by a person interested in attempting the VCDX if I could share my VCDX application / design. Unfortunately, as my application was based on an internal IBM project, it is strictly commercial in confidence.

However, I don’t think this is a huge problem as I can share my experience to assist potential candidates with their applications.

In the VCDX Certification Handbook and Application there are several sections; this post focuses on section 4.5, “Design Deliverable Documentation”, and specifically “A. Architectural design”.

Below is a screen shot of this section.

A piece of advice I shared in my post “The VCDX Journey” was that everything in your design is fair game for the VCDX panel to question you about. So, for example, if your design includes Site Recovery Manager or vCloud Director, expect to answer questions about how your design caters for these products.

With that in mind, here are my tips.

Tip # 1 – Your design does not have to be perfect!

Don’t make the mistake of thinking you need to submit a design which follows every single “best practice”, as this is very rare in reality. “Best practice” is really a concept for VCPs and, to a lesser extent, VCAPs; a VCDX should be at a level of expertise to develop best practices, rather than simply follow them.

Keep in mind, regardless of the architectural decision/s themselves, you need to be able to justify them and align them to your “Requirements”, “Constraints” & “Assumptions”, both in your documentation and at the VCDX defense panel itself.

So you may have been forced to do something which is not best practice, and which you wouldn’t normally recommend, due to a “Constraint”. This is not a problem for the VCDX application, but be sure to fully understand the constraint and document in detail why the decision you made was the best option.

My design did not follow all best practices, nor was it the fastest or most highly available solution I could have designed. Ensure you’re aware of the things you could have done better, or could have changed if you did not have certain constraints, and document the alternatives.

I would suggest a design which complied with all best practices could be harder to defend than one which had a lot of constraints preventing the use of best practices. As a candidate attempting to demonstrate your “Expert” level knowledge, working around constraints to meet your customer’s requirements gives you a better opportunity to show you can think outside the square, and this goes for your documentation as well as the panel itself.

Tip # 2 – Don’t just fill out the VMware Solution Enablement Toolkit (SET) templates!

If you’re a VMware Partner, you will likely have access to VMware SETs. These are great resources which make doing designs easier (especially for people new to VMware architecture); however, they are templates, and anyone can fill out a template. As a VCDX applicant, you should be showing your “Expert” level knowledge, experience and innovation.

I personally have created my own template, which draws on numerous resources, including the SETs, but also contains a lot of work I have created myself.

My template has a lot more detail than what can be found in the SET templates, and I felt this really assisted me in demonstrating my expert-level knowledge.

For example, I have a dedicated section for architectural decisions, and I had around 25 ADs in the design I submitted for VCDX, covering not just specific VMware options, but storage, backup, networking, etc., as these are all critical parts of a VMware solution. I could have documented a lot more, but I ran out of time.

Tip # 3 – Document all your Requirements / Constraints / Assumptions and reference them.

Throughout your design, and especially your Architectural decisions, you should refer back to your Requirements, Constraints and Assumptions.

Doing this properly will assist the VCDX panel members who review your design to understand the solution. If the design document doesn’t give the reader a clear understanding of the solution, then I would be surprised if you were invited to defend.

During the VCDX defense, you should talk about how you designed to meet the Requirements and how the Constraints impacted your design. You should also call out any Assumptions and discuss what risks or impacts these assumptions may have; this will be a huge help in your VCDX defense. So ensuring you have documented the ADs well for your application is a big step towards your application being accepted.

Tip # 4 – Have your design peer reviewed

Where possible, I always have my work reviewed by colleagues. Even VCAPs and VCDXes make mistakes, so ensure you have your work reviewed. This is an excellent way to make sure your design makes sense and is complete.

I touched on this in Tip # 3, but make sure a person with zero knowledge of the solution can read your design and understand the solution. So, where possible, get a review completed by somebody not involved with the project.

Tip # 5 – Include information about Storage/Networking etc in your design

We all know no VMware solution is complete without some form of network and storage, so ensure that your design has at least some high-level details of the network and storage. This should support the other sections of your design document which explain your architectural decisions, and give the reader a clearer picture of the whole environment.

Include diagrams of the end to end solution in an appendix so the reader can refer to them if any clarification is required.

Tip # 6 – Read the VCDX handbook and address each criterion.

As per the requirements document screen shot (above), the handbook actually tells you what VMware is looking for in your architectural design.

It states “Including but not limited to: logical design, physical design, diagrams, requirements, constraints, assumptions and risks.”

Originally, in my opinion, my design didn’t strictly meet all of the criteria, so I went back and added detail to ensure I exceeded them.

So in choosing which design to use for your application, my recommendation would be to not pick a small/simple design, but to choose one which allows you to show your in-depth knowledge and some innovation. This will make the application process a little more time consuming from a documentation point of view, but it should increase your chance of success at the VCDX defense.

I hope this helps, and best of luck to anyone attempting the VCDX @ VMworld this year!

VMware HA and IP Storage *Updated*

With IP storage (particularly NFS in my experience) becoming more popular over recent years, I have been designing more and more VMware solutions with IP Storage, both iSCSI and NFS.

The purpose of this post is not to debate the pros and cons of IP storage, or Block vs File, or even vendor vs vendor, but to explore how a VMware environment (vSphere 4 or 5) using IP storage can be made as resilient as possible, purely from a VMware HA perspective. (I will be writing another post on highly available vNetworking for IP Storage.)

So what are some considerations when using IP storage and VMware HA?

In many solutions I’ve seen (and designed), the ESXi Management VMkernel is on “vSwitch0” and uses two (2) x 1Gb NICs, while the IP storage (and data network) is on a dvSwitch (or dvSwitches) and uses two or more 10Gb NICs which connect to different physical switches than the ESXi Management 1Gb NICs.

So does this matter? Well, while it is a good idea, there are some things we need to consider.

What happens if the 1Gb network is offline for whatever reason, but the 10Gb network is still operational?

Do we want this event to trigger a HA isolation event? In my opinion, not always.

So let’s investigate further.

1. Host Isolation Response.

Host Isolation response is important to any cluster, but for IP storage it is especially critical.

How does Host Isolation Response work? Well, in vSphere 5, three conditions must be met before the response is triggered:

1. The host fails to receive heartbeats from the HA master

2. The host does not receive any HA election traffic

3. Failing conditions 1 & 2, the host attempts to ping the “isolation address/es” and is unsuccessful.

4. The isolation response is triggered

So in the scenario I have provided, the goal is to ensure that if a host becomes isolated from the HA Primary nodes (or HA Master in vSphere 5) via the 1Gb network, the host does not unnecessarily trigger the “host isolation response”.

Now why would you want to stop HA restarting the VM on another host? Don’t we want the VMs to be restarted in the event of a failure?

Yes & no. In this scenario it’s possible the ESXi host still has access to the IP storage network, and the VMs still have access to their data network/s, via the 10Gb network. The 1Gb network may have suffered a failure which affects management, but it may be desirable to leave the VMs running to avoid outages.

If both the 1Gb and 10Gb networks go down to the host, this results in the host being isolated from the HA Primary nodes (or HA Master in vSphere 5); the host would not receive HA election traffic and would suffer an “APD” (All Paths Down) condition. The HA isolation response will then rightly be triggered and the VMs will be “Powered Off”. This is desirable, as the VMs can then be restarted on the surviving hosts, assuming the failure is not network wide.

Here is a screen grab (vSphere 5) of the “Host Isolation response” setting, which is found by right-clicking your cluster, then selecting “Edit Settings”, “vSphere HA” and “Virtual Machine Options”.

The host isolation response setting for environments with IP Storage should always be configured to “Power Off” (and not “Shutdown”). Duncan Epping explained this well on his blog, so there is no need to cover it again.
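For those who prefer to script this rather than click through the UI, below is a minimal pyVmomi sketch of how the cluster-wide isolation response could be set to “Power Off”. The vCenter address, credentials and the cluster name “Cluster01” are placeholders for illustration only, not part of the original design.

```python
# A minimal pyVmomi sketch (not production code): set the cluster-wide
# Host Isolation Response to "Power Off".
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="VMware1!")          # hypothetical vCenter details
content = si.RetrieveContent()

# Locate the cluster object by name
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster01")
view.DestroyView()

# Build a partial cluster spec that only changes the HA default VM settings
spec = vim.cluster.ConfigSpecEx()
spec.dasConfig = vim.cluster.DasConfigInfo()
spec.dasConfig.defaultVmSettings = vim.cluster.DasVmSettings(
    isolationResponse="powerOff")          # "Power Off", never "Shutdown", for IP storage

# modify=True applies the change incrementally rather than replacing the config
cluster.ReconfigureComputeResource_Task(spec, modify=True)
```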

But wait, there’s more! 😉

How do I avoid false positives which may cause outages for my VMs?

If using vSphere 5, we can use Datastore Heartbeating (which I will discuss later), but in vSphere 4 some more thought needs to go into the design.

So let’s recap step three of the isolation detection process we discussed earlier:

“3. Failing conditions 1 & 2, the host attempts to ping the ‘isolation address/es’”

What is the “isolation address”? By default, it’s the ESXi Management VMkernel default gateway.

Is this the best address to check for isolation? In an environment without IP storage, in my experience it normally is, although it is best to discuss this with your network architect, as the device you ping needs to be highly available. Note: It also needs to respond to ICMP!

When using IP storage, I recommend overriding the default by setting the advanced option “das.usedefaultisolationaddress” to “false”, then configuring “das.isolationaddress1” through “das.isolationaddress9” with the IP address/es of your IP storage (in this example, NetApp vFilers). The ESXi host will then ping your IP storage (assuming the HA Primaries, or the “Master” in vSphere 5, are unavailable and no election traffic is being received) to check whether it is isolated or not.

If the host/s complete the isolation detection process and are unable to ping any of the isolation addresses (the IP storage, meaning the ESXi host will not be able to access its storage either), they will declare themselves isolated and trigger the HA isolation response (which should always be “Power Off”, as we discussed earlier).

The below screen shot shows the Advanced options and the settings chosen.
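To complement the screen shot, here is a minimal pyVmomi sketch of how the same advanced options could be applied to a cluster programmatically. The vFiler IP addresses are placeholders, and “cluster” is assumed to be the vim.ClusterComputeResource object located as in the earlier sketch.

```python
# A minimal pyVmomi sketch: apply the HA advanced options discussed above.
from pyVmomi import vim

spec = vim.cluster.ConfigSpecEx()
spec.dasConfig = vim.cluster.DasConfigInfo()
spec.dasConfig.option = [
    # Don't use the Management VMkernel default gateway as the isolation address
    vim.option.OptionValue(key="das.usedefaultisolationaddress", value="false"),
    # Ping the IP storage (NetApp vFilers) instead to decide whether the host is isolated
    vim.option.OptionValue(key="das.isolationaddress1", value="192.168.10.11"),
    vim.option.OptionValue(key="das.isolationaddress2", value="192.168.10.12"),
]
cluster.ReconfigureComputeResource_Task(spec, modify=True)
```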

In this case, the IP storage (NetApp vFilers) is connected to the same physical 10Gb switches as the ESXi hosts (one “hop”), so pinging the vFilers is a perfect way to test connectivity of the network and access to the storage.

In the event the IP storage (NetApp vFilers) is inaccessible, this alone would not trigger the HA isolation response, as connectivity to the HA Primary nodes (or HA Master in vSphere 5) may still be functional. If the storage is in fact inaccessible for >125 seconds (using the default settings – an NFS “HeartbeatFrequency” of 12 seconds & “HeartbeatMaxFailures” of 10), the datastore/s will be marked as unavailable and an “APD” event may occur. See VMware KB 2004684 for details on APD events.

Below is a screen grab of a vSphere 5 host showing the advanced NFS settings discussed above.
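If you want to confirm these values from a script rather than the UI, here is a minimal pyVmomi sketch that reads the NFS heartbeat settings from a host; “host” is assumed to be an existing vim.HostSystem object. Note the ~125 second figure above comes from the defaults of 12 second frequency x 10 failures, plus the default NFS.HeartbeatTimeout of 5 seconds.

```python
# A minimal pyVmomi sketch: read the NFS heartbeat settings from an ESXi host.
# "host" is assumed to be an existing vim.HostSystem object.
opt_mgr = host.configManager.advancedOption
for key in ("NFS.HeartbeatFrequency", "NFS.HeartbeatMaxFailures", "NFS.HeartbeatTimeout"):
    value = opt_mgr.QueryOptions(name=key)[0].value
    print(key, "=", value)
# With the defaults (12, 10 and 5 respectively) an unreachable NFS datastore is
# declared unavailable after roughly 12 x 10 + 5 = 125 seconds.
```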

Note: With NetApp storage it is recommended to configure the VMs with a disk timeout of 190 seconds, to allow for intermittent network issues and/or a total controller loss (a takeover completes in <180 seconds, usually much less), so the VMs can continue running and no outage is caused.
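For reference, the guest disk timeout is a Windows registry value; tools such as the NetApp Virtual Storage Console normally set it for you, but a minimal sketch of doing it by hand inside the Windows guest (run as Administrator) would look like this:

```python
# A minimal sketch for the Windows guest (run as Administrator): set the SCSI disk
# timeout to 190 seconds. This only illustrates the registry value involved.
import winreg

key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,
                     r"SYSTEM\CurrentControlSet\Services\Disk",
                     0, winreg.KEY_SET_VALUE)
winreg.SetValueEx(key, "TimeoutValue", 0, winreg.REG_DWORD, 190)
winreg.CloseKey(key)
```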

My advice would be that modifying “das.usedefaultisolationaddress” and “das.isolationaddressX” is an excellent way, in vSphere 4 (and 5), of determining whether your host is isolated by checking that the IP storage is available; after all, the storage is critical to the ESXi host’s functionality! 😀

If for any reason the IP storage is not responding, and steps 1 & 2 of the HA isolation detection process have completed, an isolation event is triggered and HA will take swift action (powering off the VMs) to ensure the VMs can be restarted on another host (assuming the issue is not network wide).

Note: Powering Off the VM in the event of Isolation helps prevent a split brain scenario where the VM is live on two hosts at the same time.

While datastore heartbeating is an excellent feature, it is only used by the HA Master to verify whether a host is “isolated” or “failed”. The “das.isolationaddressX” setting is a very good way to ensure your ESXi host can check if the IP storage is accessible or not, and in my experience (and testing) it works well.

Now, this brings me onto the new feature in vSphere 5…..

2. Datastore Heartbeating.

It provides that extra layer of protection from HA isolation “false positives”, but adds little value for IP Storage unless the Management and IP Storage run over different physical NICs (in the scenario we are discussing they do).

Note: If the “Network Heartbeat” is not received and the “Datastore Heartbeat” is not received by the HA Master, the host is considered “Failed” and the VMs will be restarted. But if the “Network Heartbeat” is not received and the “Datastore Heartbeat” is received by the HA Master, the host is “Isolated” and HA will trigger the “Host isolation response”.

The benefit here, in the scenario I have described, is that the “das.usedefaultisolationaddress” setting is “false”, preventing HA from trying to ping the VMkernel default gateway, and “das.isolationaddress1” & “das.isolationaddress2” have been configured so HA will ping the IP storage (vFilers) to check for isolation.

Datastore heartbeating was configured to “Select any of the cluster datastores taking into account my preferences”. This allows a VMware administrator to specify a number of datastores, and these should be datastores critical to the operation of the cluster (yes, I know, almost every datastore will be important).

In this case, being a NetApp environment, the best practice is to separate OS / page file / data / vSwap etc. onto separate datastores.

Therefore I decided to select the Windows OS and swap file datastores, as without these none of the VMs would function, so they are the logical choice.

The below screen grab shows where Datastore heartbeating is configured, under the Cluster settings.
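For completeness, the equivalent configuration could also be scripted. Below is a minimal pyVmomi sketch that selects the “take my preferences into account” policy and nominates the OS and swap file datastores; the datastore names are placeholders and “cluster” is assumed to be an existing vim.ClusterComputeResource object.

```python
# A minimal pyVmomi sketch: use the "any cluster datastore, taking my preferences
# into account" heartbeat policy and nominate the OS and swap file datastores.
from pyVmomi import vim

preferred = [ds for ds in cluster.datastore
             if ds.name in ("OS_Datastore", "Swap_Datastore")]

spec = vim.cluster.ConfigSpecEx()
spec.dasConfig = vim.cluster.DasConfigInfo()
spec.dasConfig.hBDatastoreCandidatePolicy = "allFeasibleDsWithUserPreference"
spec.dasConfig.heartbeatDatastore = preferred
cluster.ReconfigureComputeResource_Task(spec, modify=True)
```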

So what has this achieved?

We have the ESXi host pinging the isolation addresses (NetApp vFilers), and we have the HA Master checking datastore heartbeating, to accurately identify whether the host is failed, isolated or partitioned. In the event the HA Master does not receive network heartbeats or datastore heartbeats, it is extremely likely there has been a total failure of the network (at least for this host) and the storage is no longer accessible, which obviously means the VMs cannot run, and therefore the host will be considered “Failed” by the master. The host will then trigger the configured “host isolation response”, which for IP storage is “Power off”.

QUOTE: Duncan Epping – Datastore Heartbeating: “To summarize, the datastore heartbeat mechanism has been introduced to allow the master to identify the state of hosts and is not used by the ‘isolated host’ to prevent isolation.”

I couldn’t have said it better myself.

If the failure is not affecting the entire cluster, then the VMs will be powered off and recovered by VMware HA shortly thereafter. If the network failure affects all hosts in the cluster, then the VMs will not be restarted until the network problem is resolved.