Virtualizing Exchange on vSphere with NFS backed storage?

For many years, customers have been realising the benefits of file based storage from one or more of the many storage vendors offering NFS.

NFS makes a ton of sense for virtualization, and virtualizing Business Critical applications such as Exchange, along with the rest of a company’s servers, can be a great way to reduce complexity and save on CAPEX/OPEX.

However, some vendors have licensing or support statements which make this more difficult than it needs to be.

One such vendor is Microsoft.

Microsoft currently don’t support Exchange running inside a VMDK on an NFS datastore, even though the VMDK is a virtual SCSI device and acts/performs the same as if it was on a block based LUN, such as FC/FCoE or iSCSI.

I decided to reach out to a bunch of great guys in the virtualization community to try and get some awareness of this issue, and get Microsoft to update the outdated and technically invalid support statement.

As a result, the following TechNet forum article has been posted:

Support for Exchange Databases running within VMDKs on NFS datastores

There is also a suggestion in the Microsoft Product improvement forum on the same topic which, as a result of the community’s efforts in the past few weeks, has skyrocketed to the #1 improvement suggestion to Microsoft.

The post and voting can be found here.

Support storing Exchange data on VMDKs on File shares (NFS/SMB)

So please check out these two articles, vote, and leave your comments in support of this issue. Supporting Exchange in VMDKs on NFS is a no-lose situation for customers, and that is what it is all about!

Related Articles:

Integrity of Write I/O for VMs on NFS Datastores Series

Part 1 – Emulation of the SCSI Protocol
Part 2 – Forced Unit Access (FUA) & Write Through
Part 3 – Write Ordering
Part 4 – Torn Writes
Part 5 – Data Corruption

The future of NAS Storage (NFS) for Virtual Environments

I read the article (below) by Howard Marks after seeing it come up on Twitter today, and I found it to be very interesting and refreshing to read as it hits the nail on the head.

http://www.networkcomputing.com/storage-networking-management/vmware-has-to-step-up-on-nfs/240163350

For a long time Network Attached Storage (NAS) has been considered by many (including myself in the past) as a second class citizen, or Tier 3 storage and not a serious choice for mission critical virtual environments.

In recent years, I have used NFS more and more in vSphere environments. As I went through my VCDX journey, trying to learn as much as possible about every storage alternative available to vSphere, I formed the strong view that NFS was in fact the best storage protocol for vSphere/vCloud/View environments.

In fact my VCDX design was based on a vCloud solution running on NFS, and this was one area I found quite easy to present and defend during the VCDX defence due to the many advantages of NFS.

In the article, Howard wrote

“It’s time for VMware to upgrade its support for file storage (as opposed to block storage) and embrace the pioneering vendors who are building storage systems specifically for the virtualization environment.”

I totally agree with this statement, and I think it is in the best interest of VMware, its partners and customers for VMware to go down this path. I think most would agree that NetApp have been leading the charge with NFS based storage for a long time, and in my opinion rightly so, with some new storage vendors also choosing to build solutions around NFS.

Another comment Howard made was

“managing vSphere with NFS storage is somewhat simpler than managing an equivalent system on block storage. Even better, a good NFS storage system, because it knows which blocks belong to which virtual machine, can perform storage management tasks such as snapshots, replication and storage quality of service per virtual machine, rather than per volume.”

I totally agree with the above statement, and VMware’s development of features such as View Composer for Array Integration (VCAI), which is only supported on NFS, shows the protocol has significant advantages over block based storage, especially for deployment speed and reduced workload on the storage. (VCAI uses the Fast File Clone VAAI-NAS primitive to create near-instant, space-efficient Linked Clone desktops.)

I wrote an example architectural decision regarding storage protocol choice for Horizon View (VDI) environments which covers in more depth the advantages of NFS for VDI environments. The article can be viewed here : Example Architectural Decision – Storage Protocol Choice for Horizon View

Also, NFS does not suffer from the same challenges as block based storage. Much larger numbers of VMs can share an NFS datastore than a VMFS datastore without being negatively impacted by latency resulting from SCSI reservations (although this is vastly improved with VAAI) or by contention resulting from limited SCSI queue depths, which is something VAAI still does not address.

These limitations of block storage lead to the number of VMs per datastore remaining at the old rule of thumb of <25 for non I/O intensive workloads, even with VAAI, which some felt was the magic solution to the issue; sadly, it was not. (Note: For desktop VMs, the recommended maximum per VMFS datastore is 140 with VAAI, compared to 64 without VAAI and more than 200 on NFS.)
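To put the per-datastore maximums from the note above into perspective, here is a minimal Python sketch that works out how many datastores a desktop pool would need. The pool size of 1,000 desktops is a hypothetical figure chosen purely for illustration, and since the NFS recommendation is "more than 200", using 200 gives a conservative (worst case) count for NFS.

```python
import math

# Recommended maximum desktop VMs per datastore, taken from the note above.
# NFS is quoted as ">200", so 200 is used here as a conservative floor.
MAX_DESKTOPS_PER_DATASTORE = {
    "VMFS without VAAI": 64,
    "VMFS with VAAI": 140,
    "NFS": 200,
}


def datastores_needed(desktops: int, max_per_datastore: int) -> int:
    """Minimum number of datastores needed to stay within the per-datastore maximum."""
    return math.ceil(desktops / max_per_datastore)


# Hypothetical pool size, for illustration only.
POOL_SIZE = 1000

for storage, limit in MAX_DESKTOPS_PER_DATASTORE.items():
    print(f"{storage}: {datastores_needed(POOL_SIZE, limit)} datastores")
```

Fewer datastores means fewer objects to provision, monitor and balance, which is a big part of why NFS is simpler to manage at scale.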

Howard went on to write

“The first step would be for VMware to acknowledge that NFS has advanced in the past decade.”

I think this has been acknowledged by VMware along with many experts in the industry which is a positive step forward and I believe VMware will give more attention to NFS in future versions.

Howard further commented that

“Today vSphere supports version 3.0 of NFS—which is seventeen years old. NFS 4.1 has much more sophisticated security, locking and network improvements than NFS 3.0. The optional pNFS extension can bring the performance and multipathing of SANs with centralized file system management.”

I think that VMware adding support for NFS 4.1 in the future will really help cement NFS as the protocol of choice for virtual environments and will be complementary to VMware’s upcoming VSAN offering.

I think that with bolstered NFS support and VSAN, VMware will have a solid storage layer to take virtualization into the future, without requiring storage vendors to immediately support vVOLs. In my opinion, vVOLs are being built (at least in part) to solve the challenges of VMFS and block based storage, when NFS (even version 3) addresses most requirements in virtual environments very well today, and NFS 4.1 support will only improve the situation.

Howard’s comment (below) appears to echo these thoughts.

“Better NFS support will empower storage vendors to innovate and strengthen the vSphere ecosystem and fill the gap until vVols are ready. NFS support will also provide an alternative once vVols hit the market.”

 

To finish, I thought Howard’s comment (below) on snapshots and replication being per virtual machine rather than per volume or LUN is worth repeating. Several vendors are doing this today, and moving towards NFS 4.1 will help these vendors continue to innovate and provide better and more efficient storage solutions for VMware’s customers, which I think is what everyone wants.

“Even better, a good NFS storage system, because it knows which blocks belong to which virtual machine, can perform storage management tasks such as snapshots, replication and storage quality of service per virtual machine, rather than per volume.”

Scaling problems with traditional shared storage

At VMware vForum Sydney this week I presented “Taking vSphere to the next level with converged infrastructure”.

Firstly, I wanted to thank everyone who attended the session. It was a great turnout, and during the Q&A there were a ton of great questions.

One part of the presentation I got a lot of feedback on was when I spoke about Performance and Scaling and how this is a major issue with traditional shared storage.

So for those who couldn’t attend the session, I decided to create this post.

So let’s start with a traditional environment with two VMware ESXi hosts, connected via FC or IP to a storage array. In this example the storage controllers have a combined capability of 100K IOPS.

[Figure: two ESXi hosts sharing the storage array, 50K IOPS per host]

As we have two (2) ESXi hosts, if we divide the performance capabilities of the storage controllers between the two hosts we get 50K IOPS per node.

This is an example of what I have typically seen at customer sites on day 1, and performance normally meets the customer’s requirements.

As environments tend to grow over time, the most common thing to expand is the compute layer, so the below shows what happens when a third ESXi host is added to the cluster, and connected to the SAN.

[Figure: three ESXi hosts sharing the storage array, 33K IOPS per host]

The 100K IOPS is now divided by 3, and each ESXi host now has 33K IOPS.

This isn’t really what customers expect when they add servers to an environment, but in reality the storage performance is further divided between ESXi hosts, which results in fewer IOPS per host in the best case scenario. In the worst case scenario, the additional workloads on the third host create contention, and each host may have even fewer IOPS available to it.

But wait, there’s more!

What happens when we add a fourth host? We further reduce the storage performance per ESXi host to 25K IOPS as shown below, which is HALF the original performance.

[Figure: four ESXi hosts sharing the storage array, 25K IOPS per host]

At this stage, the customer’s performance is generally significantly impacted, and there is no easy or cost effective resolution to the problem.

... and when we add a fifth host? We continue to reduce the storage performance per ESXi host, now to 20K IOPS, which is less than half its original performance.

[Figure: five ESXi hosts sharing the storage array, 20K IOPS per host]
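The arithmetic in the walkthrough above can be sketched in a few lines of Python. This is a best-case illustration only: it assumes the 100K IOPS of controller capability is split evenly between hosts and that there is no contention, which, as mentioned earlier, is optimistic.

```python
def iops_per_host(controller_iops: int, hosts: int) -> float:
    """Best-case share of controller IOPS per host (even split, no contention)."""
    if hosts < 1:
        raise ValueError("hosts must be at least 1")
    return controller_iops / hosts


# Combined storage controller capability from the example above.
CONTROLLER_IOPS = 100_000

for hosts in range(2, 6):
    share = iops_per_host(CONTROLLER_IOPS, hosts)
    print(f"{hosts} hosts: {share / 1000:.0f}K IOPS per host")
```

Every host added to the cluster shrinks the share for all the others, because the controllers (the numerator) never change.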

So at this stage, some of you may be thinking, “yeah yeah, but I would also scale my storage by adding disk shelves.”

So let’s add a disk shelf and see what happens.

[Figure: five ESXi hosts, still 20K IOPS per host after adding a disk shelf]

We still only have 100K IOPS capable storage controllers, so we don’t get any additional IOPS to our ESXi hosts. The result of adding the additional disk shelf is REDUCED performance per GB!
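To illustrate the performance per GB point, here is a rough Python sketch. The shelf capacity of 20,000 GB is a hypothetical number picked only to make the maths easy to follow; the point is the trend, not the absolute values.

```python
def iops_per_gb(controller_iops: int, shelves: int, gb_per_shelf: int) -> float:
    """IOPS available per GB when the controllers are the performance ceiling."""
    return controller_iops / (shelves * gb_per_shelf)


CONTROLLER_IOPS = 100_000  # controller capability from the example above
GB_PER_SHELF = 20_000      # hypothetical shelf capacity, for illustration only

for shelves in (1, 2, 3):
    density = iops_per_gb(CONTROLLER_IOPS, shelves, GB_PER_SHELF)
    print(f"{shelves} shelf/shelves: {density:.2f} IOPS per GB")
```

Capacity grows with every shelf while the controller ceiling stays fixed, so the IOPS available per GB falls each time.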

Make sure when you’re looking at implementing, upgrading or replacing your storage solution that it can actually scale both performance (IOPS/throughput) AND capacity in a linear fashion, otherwise your environment will to some extent be impacted by what I have explained above. The only way to avoid the above is to oversize your storage on day 1, but even if you do this, over time your environment will appear to become slower (and your CAPEX will be very high).

Also, consider the scaling increments, as a solution’s ability to scale should not require you to replace controllers or disks, or be limited by a maximum number of controllers in the cluster. It should also scale in small, medium and large increments depending on the requirements of the customer.

This is why I believe scale out shared nothing architecture will be the architecture of the future and it has already been proven by the likes of Google, Facebook and Twitter, and now brought to market by Nutanix.

Traditional storage, no matter how intelligent, does not scale linearly or granularly enough. This results in complexity in the architecture of storage solutions for environments which grow over time, and leads to customers spending more money up front when the investment may not be realised for 2-5 years.

I’d prefer to be able to start small with as little as 3 nodes, and scale one node at a time (regardless of node model, i.e. NX1000, NX3000, NX6000) to meet my customers’ requirements and never have to replace hardware just to get more performance or capacity.

Here is a summary of the Nutanix scaling capabilities, where you can scale Compute heavy, storage heavy or a mix of both as required.

[Figure: summary of Nutanix scaling options]