Example VMware vNetworking Design for IP Storage

I am regularly asked how to configure vNetworking to support environments using IP Storage (NFS / iSCSI).

The short answer is, as always, that it depends on your requirements, but below is an example of a solution I designed in the past.

Requirements

1. Provide high performance and redundant access to the IP Storage (in this case it was NFS)
2. Ensure ESXi hosts could be evacuated in a timely manner for maintenance
3. Prevent vMotion / Fault Tolerance and virtual machine traffic from significantly impacting storage performance
4. Ensure high availability for ESXi Management / VMkernel and virtual machine network traffic

Constraints

1. Four (4) x 10Gb NICs
2. Six (6) x 1Gb NICs (two onboard NICs and a quad-port NIC)

Note: In my opinion, the above NICs are hardly “constraining”, but they are still important to mention.

Solution

Use a standard vSwitch (vSwitch0) for the ESXi Management VMkernel. Configure it with vmNIC0 (onboard NIC 1) and vmNIC2 (quad-port NIC, port 1).

ESXi Management will be Active on both vmNIC0 and vmNIC2, although it will only use one path at any given time.

Use a Distributed Virtual Switch (dvSwitch-admin) for IP Storage, vMotion and Fault Tolerance.

Configure vmNIC6 (10Gb Virtual Fabric Adapter NIC 1 Port 1) and vmNIC9 (10Gb Virtual Fabric Adapter NIC 2 Port 2)

Configure Network I/O Control with NFS traffic having a share value of 100, while vMotion and FT each have a share value of 25.

Each VMkernel for NFS will be active on one NIC and standby on the other.

vMotion will be Active on vmNIC6 and Standby on vmNIC9 and Fault Tolerance vice versa.

vNetworking Example dvSwitch-Admin

Use a Distributed Virtual Switch (dvSwitch-data) for Virtual Machine traffic

Configure vmNIC7 (10Gb Virtual Fabric Adapter NIC 1 Port 2) and vmNIC8 (10Gb Virtual Fabric Adapter NIC 2 Port 1)

Conclusion

While there are many ways to configure vNetworking, and there may be more efficient ways to achieve the requirements set out in this example, I believe the above configuration achieves all the customer requirements.

For example, it provides high-performance and redundant access to the IP Storage by using two (2) VMkernels, each active on one 10Gb NIC.

IP storage will not be significantly impacted during periods of contention, as Network I/O Control will ensure the IP Storage traffic receives ~66% of the available bandwidth.

ESXi hosts will be able to be evacuated in a timely manner for maintenance because:

1. vMotion is active on a 10Gb NIC, thus supporting the maximum of 8 concurrent vMotions
2. In the event of contention, worst case scenario, vMotion will receive just short of 2Gb/sec of bandwidth (~1750Mb/sec)
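The worst-case bandwidth figures above come straight from the Network I/O Control share values. As a rough sketch (my own illustration, not anything NIOC exposes), each traffic type's guaranteed minimum on a contended 10Gb uplink is its share divided by the total shares in play; the exact figure in practice depends on which traffic types are actually contending at that moment.

```python
# Illustrative only: how NIOC share values translate to worst-case
# guaranteed bandwidth on a single 10Gb uplink when all listed traffic
# types are contending. Share values are from this example design.

def worst_case_bandwidth(shares, link_mbps=10000):
    """Return each traffic type's guaranteed minimum bandwidth (Mb/sec)
    under full contention, proportional to its share of the total."""
    total = sum(shares.values())
    return {name: round(link_mbps * value / total)
            for name, value in shares.items()}

shares = {"NFS": 100, "vMotion": 25, "FT": 25}
print(worst_case_bandwidth(shares))
# NFS is guaranteed 100/150 of the link (~66%), matching the ~66% claim;
# vMotion and FT each get 25/150 (~1.7Gb/sec) when all three contend.
```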

High availability is ensured as each vSwitch and dvSwitch has two (2) connections from physically different NICs, which connect to physically separate switches.

Hopefully you have found this example helpful. For an example Architectural Decision, see Example Architectural Decision – Network I/O Control for ESXi Host using IP Storage.

Example Architectural Decision – Host Isolation Response for IP Storage

Problem Statement

What is the most suitable HA host isolation response when using IP-based storage (in this case, a Netapp HA pair in 7-mode), where the IP storage runs over physically separate network cards and switches from ESXi management?

Assumptions

1. vSphere 5.0 or greater (to enable use of Datastore Heartbeating)
2. vFiler1 & vFiler2 reside on different physical Netapp Controllers (within the same HA Pair in 7-mode)
3. Virtual machine guest operating systems with an I/O timeout of 190 seconds, to allow for a controller fail-over (maximum 180 seconds)

Motivation

1. Minimize the chance of a false positive isolation response
2. Ensure that in the event the storage is unavailable, virtual machines are promptly shut down to minimize impact on the applications/data.

Architectural Decision

Turn off the default isolation address and configure the isolation addresses specified below, which check connectivity to multiple Netapp vFilers (IP storage) on the vFiler management VLAN and the IP storage interface.

Utilize Datastore heartbeating, checking multiple datastores hosted across both Netapp controllers (in HA Pair) to confirm the datastores themselves are accessible.

Services VLANs
das.isolationaddress1 : vFiler1 Mgmt Interface 192.168.1.10
das.isolationaddress2 : vFiler2 Mgmt Interface 192.168.2.10

IP Storage VLANs
das.isolationaddress3 : vFiler1 vIF 192.168.10.10
das.isolationaddress4 : vFiler2 vIF 192.168.20.10

Configure Datastore Heartbeating with “Select any of the cluster datastores taking into account my preferences” and select the following datastores:

  • One datastore from vFiler1 (Preference)
  • One datastore from vFiler2 (Preference)
  • A second datastore from vFiler1
  • A second datastore from vFiler2

Configure Host Isolation Response to: Power off.
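The advanced options above are normally entered by hand under the cluster's vSphere HA settings. As an illustrative sketch only (this is not a VMware API, just a way to show the key/value pairs and catch typos), the full set of options from this decision can be expressed and validated like this:

```python
# Illustrative sketch: build the HA advanced-option key/value pairs used
# in this Architectural Decision, validating each isolation address.
# HA supports das.isolationaddress1 through das.isolationaddress9.
import ipaddress

def build_isolation_options(addresses):
    """Return the advanced options dict: default isolation address off,
    plus one das.isolationaddressN entry per supplied IP address."""
    if not 1 <= len(addresses) <= 9:
        raise ValueError("HA supports 1 to 9 isolation addresses")
    for addr in addresses:
        ipaddress.ip_address(addr)  # raises ValueError on a malformed IP
    options = {"das.usedefaultisolationaddress": "false"}
    for i, addr in enumerate(addresses, start=1):
        options["das.isolationaddress%d" % i] = addr
    return options

# The four example vFiler addresses from this decision: two management
# interfaces (Services VLANs) and two vIFs (IP Storage VLANs).
print(build_isolation_options(
    ["192.168.1.10", "192.168.2.10", "192.168.10.10", "192.168.20.10"]))
```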

Justification

1. The ESXi Management traffic runs on a standard vSwitch with 2 x 1Gb connections, which connect to different physical switches from the IP storage (and data) traffic (which runs over 10Gb connections). Using the ESXi management gateway (the default isolation address) to determine isolation is not suitable, as the management network can be offline without impacting the IP storage or data networks. This situation could lead to false positive isolation responses.
2. The isolation addresses chosen test both data and IP storage connectivity over the converged 10Gb network
3. In the event the four isolation addresses (Netapp vFilers on the Services and IP storage interfaces) cannot be reached by ICMP, Datastore heartbeating will be used to confirm if the specified datastores (hosted on separate physical Netapp controllers) are accessible or not before any isolation action will be taken.
4. In the event the two storage controllers do not respond to ICMP on either the Services or IP storage interfaces, and both the specified datastores are inaccessible, it is likely there has been a catastrophic failure in the environment, either to the network or to the storage controllers themselves, in which case the safest option is to power off the VMs.
5. In the event the isolation response is triggered and the isolation does not impact all hosts within the cluster, the VMs will be restarted by HA onto surviving hosts.

Implications

1. In the event the host cannot reach any of the isolation addresses, and datastore heartbeating cannot access the specified datastores, virtual machines will be powered off.

Alternatives

1. Set Host isolation response to “Leave Powered On”
2. Do not use Datastore heartbeating
3. Use the default isolation address

For more details, refer to my post “VMware HA and IP Storage”.

VMware HA and IP Storage *Updated*

With IP storage (particularly NFS in my experience) becoming more popular over recent years, I have been designing more and more VMware solutions with IP Storage, both iSCSI and NFS.

The purpose of this post is not to debate the pros and cons of IP storage, or Block vs File, or even vendor vs vendor, but to explore how to ensure VMware environments (vSphere 4 and 5) using IP storage can be made as resilient as possible, purely from a VMware HA perspective. (I will be writing another post on highly available vNetworking for IP Storage.)

So what are some considerations when using IP storage and VMware HA?

In many solutions I’ve seen (and designed), the ESXi Management VMkernel is on “vSwitch0” and uses two (2) x 1Gb NICs, while the IP storage (and data network) is on a dvSwitch/es and uses two or more 10Gb NICs, which connect to different physical switches from the ESXi Management 1Gb NICs.

So does this matter? Well, while it is a good idea, there are some things we need to consider.

What happens if the 1Gb network is offline for whatever reason, but the 10Gb network is still operational?

Do we want this event to trigger an HA isolation event? In my opinion, not always.

So let’s investigate further.

1. Host Isolation Response.

Host Isolation response is important to any cluster, but for IP storage it is especially critical.

How does Host Isolation Response work? Well, in vSphere 5, three conditions must be met:

1. The host fails to receive heartbeats from the HA master

2. The host does not receive any HA election traffic

3. Failing conditions 1 & 2, the host attempts to ping the “isolation address/es” and is unsuccessful

Once all three conditions are met, the isolation response is triggered.
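The detection sequence above can be reduced to a small decision function. This is a hedged sketch of the logic as described in this post, not VMware's implementation:

```python
# Illustrative sketch of vSphere 5 HA isolation detection: a host only
# declares itself isolated when it receives no master heartbeats, no
# election traffic, AND cannot ping any configured isolation address.

def is_isolated(master_heartbeat, election_traffic,
                isolation_addresses_reachable):
    """Return True only when all three isolation conditions are met."""
    if master_heartbeat or election_traffic:
        return False  # conditions 1 and 2 not met: not isolated
    return not isolation_addresses_reachable  # condition 3: ping test

# 1Gb management network down, but the isolation addresses (IP storage
# on the 10Gb network) still respond: no isolation response is triggered.
print(is_isolated(False, False, True))   # False
# Nothing reachable at all: the isolation response fires.
print(is_isolated(False, False, False))  # True
```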

So in the scenario I have provided, the goal is to ensure that if a host becomes isolated from the HA Primary nodes (or HA Master in vSphere 5) via the 1Gb network, the host does not unnecessarily trigger the “host isolation response”.

Now why would you want to stop HA restarting the VM on another host? Don’t we want the VMs to be restarted in the event of a failure?

Yes and no. In this scenario it’s possible the ESXi host still has access to the IP Storage network, and the VMs to the data network/s, via the 10Gb network. The 1Gb network may have suffered a failure, which may affect management, but it may be desirable to leave the VMs running to avoid outages.

If both the 1Gb and 10Gb networks go down to the host, the host would be isolated from the HA Primary nodes (or HA Master in vSphere 5), would not receive HA election traffic, and would suffer an “APD” (All Paths Down) condition. The HA isolation response will then rightly be triggered and the VMs will be “Powered Off”. This is desirable, as the VMs can then be restarted on the surviving hosts, assuming the failure is not network wide.

Here is a screen grab (vSphere 5) of the “Host Isolation response” setting, which is located when you right click your cluster and select “Edit Settings”, “vSphere HA” and “Virtual Machine Options”.

The host isolation response setting for environments with IP Storage should always be configured to “Power Off” (and not “Shut down”). Duncan Epping explained this well on his blog, so there is no need to cover it again.

But wait, there’s more! 😉

How do I avoid false positives which may cause outages for my VMs?

If using vSphere 5, we can use Datastore Heartbeating (which I will discuss later), but in vSphere 4 some more thought needs to go into the design.

So let’s recap step three in the isolation detection process we discussed earlier:

“3. Failing conditions 1 & 2, the host attempts to ping the ‘isolation address/es’”

What is the “isolation address”? By default, it’s the ESXi Management VMkernel default gateway.

Is this the best address to check for isolation? In an environment without IP storage, in my experience it is normally suitable, although it is best to discuss this with your network architect, as the device you ping needs to be highly available. Note: It also needs to respond to ICMP!

When using IP storage, I recommend overriding the default by setting the advanced setting “das.usedefaultisolationaddress” to “false”. Then configure “das.isolationaddress1” through “das.isolationaddress9” with the IP address/es of your IP storage (in this example, Netapp vFilers). The ESXi host will now ping your IP storage (assuming the HA Primaries, or “Master” in vSphere 5, are unavailable and no election traffic is being received) to check whether it is isolated.

If the host/s complete the isolation detection process and are unable to ping any of the isolation addresses (IP Storage), and therefore cannot access the storage, the host will declare itself isolated and trigger an HA isolation response (which, as we discussed earlier, should always be “Power Off”).

The below screen shot shows the Advanced options and the settings chosen.

In this case, the IP Storage (Netapp vFilers) is connected to the same physical 10Gb switches as the ESXi hosts (one “hop” away), so it is a perfect way to test network connectivity and access to the storage.

In the event the IP Storage (Netapp vFilers) is inaccessible, this alone would not trigger the HA isolation response, as connectivity to the HA Primary nodes (or HA Master in vSphere 5) may still be functional. If the storage is in fact inaccessible for >125 secs (using the default settings – an NFS “HeartbeatFrequency” of 12 seconds and “HeartbeatMaxFailures” of 10), the datastore/s will be marked as unavailable and an “APD” event may occur. See VMware KB 2004684 for details on APD events.
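The ~125 second figure falls out of the default NFS heartbeat settings. This is my reading of how the defaults add up (frequency times maximum failures, plus the per-heartbeat timeout), shown as a quick calculation:

```python
# Worst-case time before an NFS datastore is marked unavailable, using
# the default ESXi advanced NFS settings mentioned above. The formula
# (frequency x max failures + timeout) is my reading of how the
# defaults combine to give the ~125 second figure.

NFS_HEARTBEAT_FREQUENCY = 12    # seconds between heartbeats (default)
NFS_HEARTBEAT_MAX_FAILURES = 10  # failed heartbeats before giving up
NFS_HEARTBEAT_TIMEOUT = 5        # seconds to wait for each heartbeat reply

unavailable_after = (NFS_HEARTBEAT_FREQUENCY * NFS_HEARTBEAT_MAX_FAILURES
                     + NFS_HEARTBEAT_TIMEOUT)
print(unavailable_after)  # 125 seconds
```

Note this is comfortably under the 190-second guest disk timeout recommended below, so VMs can ride out a controller fail-over.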

Below is a screen grab of a vSphere 5 host showing the advanced NFS settings discussed above.

Note: With Netapp storage, it is recommended to configure the VMs with a disk timeout of 190 seconds, to allow for intermittent network issues and/or total controller loss (which takes place in <180 seconds, usually much less), so the VMs can continue running and no outage is caused.

My advice would be that modifying “das.usedefaultisolationaddress” and “das.isolationaddressX” is an excellent way in vSphere 4 (and 5) to check whether your host is isolated, by confirming whether the IP storage is available. After all, the storage is critical to the ESXi host’s functionality! 😀

If for any reason the IP Storage is not responding, and steps 1 & 2 of the HA isolation detection process have completed, an isolation event is triggered and HA will take swift action (powering off the VM) to ensure the VM can be restarted on another host (assuming the issue is not network wide).

Note: Powering Off the VM in the event of Isolation helps prevent a split brain scenario where the VM is live on two hosts at the same time.

While Datastore Heartbeating is an excellent feature, it is only used by the HA Master to verify whether a host is “isolated” or “failed”. The “das.isolationaddressX” setting is a very good way to ensure your ESXi host can check whether the IP storage is accessible, and in my experience (and testing) it works well.

Now, this brings me onto the new feature in vSphere 5…..

2. Datastore Heartbeating.

It provides an extra layer of protection from HA isolation “false positives”, but adds little value for IP Storage unless the Management and IP Storage run over different physical NICs (in the scenario we are discussing, they do).

Note: If the “Network Heartbeat” is not received, and the “Datastore Heartbeat” is not received by the HA Master, the host is considered “Failed” and the VMs will be restarted. If the “Network Heartbeat” is not received but the “Datastore Heartbeat” is received by the HA Master, the host is “Isolated” and HA will trigger the “Host isolation response”.
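The HA Master's decision above is effectively a small truth table. Sketched as a function (an illustration of the logic in this post, with state names matching the terms used here, not VMware's code):

```python
# Illustrative sketch of the HA Master's host-state decision based on
# the two heartbeat channels described above.

def host_state(network_heartbeat, datastore_heartbeat):
    """Classify a host from the HA Master's perspective."""
    if network_heartbeat:
        return "live"      # normal operation, no action taken
    if datastore_heartbeat:
        return "isolated"  # host triggers the host isolation response
    return "failed"        # master restarts the VMs on surviving hosts

print(host_state(False, True))   # isolated
print(host_state(False, False))  # failed
```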

The benefit here, in the scenario I have described, is that the “das.usedefaultisolationaddress” setting is “false”, preventing HA from trying to ping the VMkernel default gateway, and “das.isolationaddress1” and “das.isolationaddress2” have been configured so HA will ping the IP Storage (vFilers) to check for isolation.

Datastore Heartbeating was configured to “Select any of the cluster datastores taking into account my preferences”. This allows a VMware administrator to specify a number of datastores, and these should be datastores critical to the operation of the cluster (yes, I know, almost every datastore will be important).

In this case, being a Netapp environment, the best practice is to separate OS / Page-file / Data / vSwap etc.

Therefore I decided to select the Windows OS and the swap file datastores, as without these none of the VMs would function, so they are the logical choice.

The below screen grab shows where Datastore Heartbeating is configured, under the Cluster settings.

So what has this achieved?

We have the ESXi host pinging the isolation addresses (Netapp vFilers), and we have the HA Master checking Datastore Heartbeating, to accurately identify whether the host is failed, isolated or partitioned. In the event the HA Master receives neither network heartbeats nor datastore heartbeats, it is extremely likely there has been a total failure of the network (at least for this host) and that the storage is no longer accessible, which obviously means the VMs cannot run, and therefore the host will be considered “Failed” by the master. The host will then trigger the configured “host isolation response”, which for IP storage is “Power off”.

QUOTE: Duncan Epping – Datastore Heartbeating: “To summarize, the datastore heartbeat mechanism has been introduced to allow the master to identify the state of hosts and is not used by the ‘isolated host’ to prevent isolation.”

I couldn’t have said it better myself.

If the failure does not affect the entire cluster, the VMs will be powered off and recovered by VMware HA shortly thereafter. If the network failure affects all hosts in the cluster, the VMs will not be restarted until the network problem is resolved.