Integrity of I/O for VMs on NFS Datastores – Part 2 – Forced Unit Access (FUA) & Write Through

This is the second part of the series, and its focus is a critical requirement for many applications, including MS SQL and MS Exchange (which are designed to work with block based storage), to operate as designed and to ensure data integrity: support for Forced Unit Access (FUA) & Write Through.

As a reminder from the first post, this post is not talking about presenting NFS directly to Windows.

The key here is for the storage solution to honour the “write to stable media” intent and not depend on potentially vulnerable caching/buffering on non-persistent media, which may require battery backing.

Microsoft has a Knowledge Base article on the requirements for SQL Server which details the FUA & Write Through requirements, along with other requirements covered in this series, and which I would recommend reading:

Key factors to consider when evaluating third-party file cache systems with SQL Server
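
From the application's side, write-through is requested when a file is opened. Below is a minimal sketch (my own illustration, not a reproduction of how SQL Server or Exchange is implemented internally, and the file path is hypothetical) of a Windows program opening a file with FILE_FLAG_WRITE_THROUGH so that each write should complete only once the data has reached stable media; it is this intent that the hypervisor and the underlying storage are expected to honour.

```c
/* Minimal sketch: requesting write-through semantics on Windows.
 * Assumption: illustrative only; the path below is hypothetical and this
 * is not SQL Server's or Exchange's internal I/O code. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* FILE_FLAG_WRITE_THROUGH asks the storage stack to commit each write
     * to stable media before completing it. Database engines typically
     * combine it with FILE_FLAG_NO_BUFFERING to also bypass the OS file
     * cache (omitted here to avoid its sector-alignment requirements). */
    HANDLE h = CreateFileA("D:\\Data\\example.log",
                           GENERIC_WRITE,
                           0,
                           NULL,
                           OPEN_ALWAYS,
                           FILE_ATTRIBUTE_NORMAL | FILE_FLAG_WRITE_THROUGH,
                           NULL);
    if (h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    const char record[] = "COMMIT;\r\n";
    DWORD written = 0;

    /* When WriteFile returns success here, the expectation is that the data
     * is on persistent media, not merely sitting in a volatile cache. */
    if (!WriteFile(h, record, (DWORD)(sizeof(record) - 1), &written, NULL)) {
        fprintf(stderr, "WriteFile failed: %lu\n", GetLastError());
        CloseHandle(h);
        return 1;
    }

    CloseHandle(h);
    return 0;
}
```

Inside the guest, this write-through intent is translated into SCSI writes with the FUA bit set (or explicit cache flushes), and as discussed below it is then up to the hypervisor and ultimately the storage solution to honour it.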

Forced Unit Access (FUA) & Write-Through are supported by VMware, but even with this support it remains a function of the underlying storage to honour the request, and this process (or even the level of support) may vary from storage vendor to storage vendor.

A key point here is that this process is delivered by the VMDK at the hypervisor level and passed on to the underlying storage, so regardless of whether the protocol is block based (iSCSI/FCP) or file based (NFS), it is the responsibility of the storage solution once the I/O is handed to it by the hypervisor.

Where a write cache on non-persistent media (e.g. RAM) is used, the storage vendor needs to ensure that in the event of a power outage there is sufficient battery backing for the cache to be de-staged to persistent media (e.g. SSD / SAS / SATA).

Some solutions use a mirrored write cache to attempt to mitigate the risk of power outages, but it could be argued this is not in compliance with FUA, which requires the write I/O to be committed to stable media BEFORE the I/O is acknowledged as written.

If the solution does not ensure data is written to persistent media, it is not compliant and applications requiring FUA & Write-Through will likely be impacted at some point.

As I work for a storage vendor, I won't go into detail about any other vendor, but I will have an upcoming post on how Nutanix is in compliance with FUA & Write-Through.

In part three, I will discuss Write Ordering.

Integrity of Write I/O for VMs on NFS Datastores Series

Part 1 – Emulation of the SCSI Protocol
Part 2 – Forced Unit Access (FUA) & Write Through
Part 3 – Write Ordering
Part 4 – Torn Writes
Part 5 – Data Corruption

Nutanix Specific Articles

Part 6 – Emulation of the SCSI Protocol (Coming soon)
Part 7 – Forced Unit Access (FUA) & Write Through (Coming soon)
Part 8 – Write Ordering (Coming soon)
Part 9 – Torn I/O Protection (Coming soon)
Part 10 – Data Corruption (Coming soon)

Related Articles

1. What does Exchange running in a VMDK on NFS datastore look like to the Guest OS?
2. Support for Exchange Databases running within VMDKs on NFS datastores (TechNet)
3. Microsoft Exchange Improvements Suggestions Forum – Exchange on NFS/SMB
4. Virtualizing Exchange on vSphere with NFS backed storage

Integrity of I/O for VMs on NFS Datastores – Part 1 – Emulation of the SCSI Protocol

This is the first of a series of posts covering how the integrity of I/O is ensured for Virtual Machines when writing to VMDK/s (virtual SCSI hard drives) residing on NFS storage presented to VMware’s ESXi hypervisor as a “datastore”.

Note: To be crystal clear, this post is not talking about presenting NFS direct to Windows or any other guest operating system.

This process is patented (US7865663) by VMware and its inventors, and in the patent the process is called “SCSI Protocol Emulation”.

This series will first cover the topics in a vendor agnostic manner, meaning I am talking in general about VMware + any NFS storage on the VMware HCL with NFS support.

Following the vendor agnostic posts, I will follow with a series of posts focusing specifically on Nutanix, as the motivation for the series was to cover off this topic for existing or potential Nutanix customers, some of whom are less familiar with NFS and have asked for clarification, especially around virtualizing Business Critical Applications (vBCA) such as Microsoft SQL and Exchange.

The below diagram shows how storage can be presented to an ESXi host and what this series will focus on.

A VM accesses its .vmx and .vmdk file/s via a datastore the same way, regardless of the underlying storage protocol (DAS SCSI, iSCSI, NFS, FCP).

[Image: VMware diagram showing how storage can be presented to an ESXi host]

In the case of NFS datastores, SCSI protocol emulation is used to allow the Guest Operating System (OS) and application/s to read and write via SCSI even when the underlying storage (which is abstracted by the hypervisor) is served via NFS which does not natively support the same commands.

Image Source: https://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.vsphere.introduction.doc_50%2FGUID-2E7DB290-2A07-4F54-9199-B68FCB210BBA.html

In the following section, and throughout this series, many images shown are from the patent (US7865663) and are the property of the patent owners, not the author of this article.

The areas I will be focusing on are those where there has been the most concern in the industry, especially for business critical applications such as Microsoft SQL and Microsoft Exchange: namely, how the VM’s operating system, application/s and data integrity are impacted when issuing commands while the storage is abstracted by the hypervisor and served via NFS, which does not have I/O commands equivalent to SCSI.

Some example areas of concern around the industry for VMs running on datastores backed by NFS are:

1. SCSI Aborts / Resets
2. Forced Unit Access (FUA) & Write Through
3. Write Ordering
4. Torn I/O (Writes + Reads)

In this first part, we will look at the SCSI Protocol Emulation process and discuss SCSI Aborts and Resets and how the SCSI protocol emulation process deals with these.

Below is a diagram showing the flow of an I/O request for a VM writing SCSI commands to a VMDK (formatted as NTFS) through the SCSI emulation process and through to the NFS storage.

[Image: Patent US7865663 drawing D00005 – flow of an I/O request through the SCSI emulation process]

The first few steps are, in my opinion, fairly self explanatory. Where it gets interesting for me, and one of the points of contention among I.T. professionals (being SCSI aborts), is described in the box labelled “550”.

If the SCSI command is an abort (which has no equivalent in the NFS protocol), the SCSI emulation process removes the corresponding request from the virtual SCSI request list created in the previous step (box labelled “540”).

The same is true if the SCSI command is a reset (which also has no equivalent in the NFS protocol): the SCSI emulation process removes the corresponding request from the virtual SCSI request list. This process is shown below in the box labelled “560”.

[Image: Patent US7865663 drawing D00006 – handling of SCSI abort and reset commands]

Next, let’s look at what happens if a SCSI “abort” or “reset” command is issued after the SCSI emulation process has already passed the command on to the storage, and a reply then arrives for a command which the Guest OS / application has aborted.

It’s quite simple: the SCSI emulation process receives the reply from the NFS server, looks up the corresponding tag in the virtual SCSI request list, and because the tag no longer exists, the emulator drops the reply, thereby emulating a SCSI abort command.

The process is shown below from the box labelled “710” through “720”, finishing at “730”.

[Image: Patent US7865663 drawing D00007 – handling of replies with no matching request (boxes 710 to 730)]

In the patent, the above process is summed up nicely in the following paragraph.

Accordingly, a faithful emulation of SCSI aborts and resets, where the guest OS has total control over which commands are aborted and retried can be achieved by keeping a virtual SCSI request list of outstanding requests that have been sent to the NFS server. When the response to a request comes back, an attempt is made to find a matching request in the virtual SCSI request list. If successful, the matching request is removed from the list and the result of the response is returned to the virtual machine. If a matching request is not found in the virtual SCSI request list, the results are thrown away, dropped, ignored or the like.

So there we have it: that is how VMware’s patented SCSI protocol emulation allows SCSI commands not supported natively by NFS to be honoured, therefore allowing applications dependent on block based storage to run successfully within a VM whose VMDK is backed by NFS storage.
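
To make the mechanism concrete, here is a minimal sketch (my own simplified model based on the patent’s description, not VMware’s actual code) of a virtual SCSI request list: outstanding commands are tracked by tag, an abort or reset simply removes the matching entry, and a late NFS reply whose tag is no longer tracked is dropped rather than returned to the guest.

```c
/* Minimal sketch of a virtual SCSI request list, as described in patent
 * US7865663. Assumption: a simplified model for illustration, not
 * VMware's actual implementation. */
#include <stdbool.h>
#include <stdio.h>

#define MAX_OUTSTANDING 64

static int  request_tag[MAX_OUTSTANDING];
static bool request_in_use[MAX_OUTSTANDING];

/* Box 540: record an outstanding request before issuing the NFS call. */
static void track_request(int tag)
{
    for (int i = 0; i < MAX_OUTSTANDING; i++) {
        if (!request_in_use[i]) {
            request_in_use[i] = true;
            request_tag[i] = tag;
            return;
        }
    }
}

/* Boxes 550/560: a SCSI abort or reset just removes the tracked entry. */
static void abort_request(int tag)
{
    for (int i = 0; i < MAX_OUTSTANDING; i++) {
        if (request_in_use[i] && request_tag[i] == tag)
            request_in_use[i] = false;
    }
}

/* Boxes 710-730: when an NFS reply arrives, return it to the guest only if
 * its tag is still tracked; otherwise the reply is dropped, which is what
 * emulates the abort from the guest's point of view. */
static void handle_nfs_reply(int tag)
{
    for (int i = 0; i < MAX_OUTSTANDING; i++) {
        if (request_in_use[i] && request_tag[i] == tag) {
            request_in_use[i] = false;
            printf("tag %d: completion returned to the guest\n", tag);
            return;
        }
    }
    printf("tag %d: no matching request, reply dropped (abort emulated)\n", tag);
}

int main(void)
{
    track_request(7);    /* guest issues a write, tracked as tag 7       */
    abort_request(7);    /* guest aborts it before the NFS reply arrives */
    handle_nfs_reply(7); /* the late reply is silently discarded         */
    return 0;
}
```

Running this, the late reply for tag 7 is discarded, which from the guest’s perspective is indistinguishable from the command having been aborted, exactly as the patent paragraph above describes.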

Let’s recap what we have learned so far.

1. The SCSI commands abort & reset have no equivalent in the NFS protocol.
2. The VMware SCSI emulation process handles SCSI commands not supported natively by NFS thanks to the Virtual SCSI Request List.
3. Guest Operating Systems and Applications running in Virtual Machines on ESXi issue native SCSI commands to the NTFS volume, which is presented to the VM via a VMDK and housed on an NFS datastore.
4. The underlying NFS protocol is not exposed to the Guest OS, Application/s or Virtual Machine.
5. The SCSI commands abort & reset are emulated by the hypervisor by removing these requests from the Virtual SCSI Request List.

In part two, I will discuss Forced Unit Access (FUA) & Write Through.

Integrity of Write I/O for VMs on NFS Datastores Series

Part 1 – Emulation of the SCSI Protocol
Part 2 – Forced Unit Access (FUA) & Write Through
Part 3 – Write Ordering
Part 4 – Torn Writes
Part 5 – Data Corruption

Nutanix Specific Articles

Part 6 – Emulation of the SCSI Protocol (Coming soon)
Part 7 – Forced Unit Access (FUA) & Write Through (Coming soon)
Part 8 – Write Ordering (Coming soon)
Part 9 – Torn I/O Protection (Coming soon)
Part 10 – Data Corruption (Coming soon)

Related Articles

1. What does Exchange running in a VMDK on NFS datastore look like to the Guest OS?
2. Support for Exchange Databases running within VMDKs on NFS datastores (TechNet)
3. Microsoft Exchange Improvements Suggestions Forum – Exchange on NFS/SMB
4. Virtualizing Exchange on vSphere with NFS backed storage?

Can I use my existing SAN/NAS storage with Nutanix?

A question I get regularly is, “Can I use my existing SAN/NAS storage with Nutanix?”

The short answer is, as always, “It depends”.

  • iSCSI, NFS & SMB 3.0 can be presented to Nutanix nodes just like existing non Nutanix nodes.
  • FC based storage cannot be used as Nutanix does not support FC HBAs.

The below diagram shows a Nutanix NX-3460 block with 4 nodes having both Nutanix Containers presented to the nodes, as well as iSCSI LUNs, SMB 3.0 shares or NFS mount points connected from the centralized SAN/NAS.

Note: SMB 3 is not supported for ESXi hosts & NFS is not supported for Hyper-V.

[Image: Nutanix NX-3460 block with external iSCSI / NFS / SMB 3.0 storage]

So what are the use cases for this style of deployment?

If you’re not ready to do an entire infrastructure refresh for whatever reason/s, you may wish to transition to Nutanix over time while maximizing the ROI and lifespan of your existing storage.

Here are some examples of what I recommend customers do:

1. Migrate Business Critical Applications (BCAs) to Nutanix

There are many benefits of doing this including:

  • Improving resiliency / performance for vBCAs
  • Simplifying storage management for vBCAs
  • Freeing up capacity and reducing the workload on legacy SAN
  • Increasing ease of scalability for critical workloads
  • Using legacy SAN/NAS for high capacity, low IOPS workloads which are better suited to centralized storage than vBCAs

Another great option is:

2. Migrate Virtual Desktops (VDI) to Nutanix, which shares similar benefits with migrating vBCAs, including:

  • Separating non-complementary VDI workloads from server & vBCA workloads, as these workloads do not mix well in centralized storage deployments
  • Improving resiliency / performance for VDI
  • Simplifying storage management for VDI
  • Reducing the workload on legacy SAN/NAS which will give an effective increase in performance for workloads remaining on the SAN/NAS
  • Increasing linear scalability for VDI if/when the environment scales
  • Using legacy SAN/NAS for high capacity, low IOPS workloads which are better suited to centralized storage than VDI

The last example I wanted to point out is Management workloads.

3. Migrate Infrastructure Management workloads to Nutanix.

As has been recommended by many industry experts, separating Management VMs from customer (e.g.: vCAC / vCloud tenants) or production server/desktop workloads (at both the Compute & Storage layers) can dramatically simplify the datacenter and help improve performance, resiliency & recoverability.

Again, doing this provides similar benefits to the previous two examples:

  • Separating Management workloads from Server / vBCA / VDI workloads, as these workloads should be separate from security, resiliency, performance and recoverability perspectives
  • Improving resiliency / performance for all workloads in the datacenter
  • Simplifying storage management for Management workloads
  • Reducing the workload on legacy SAN/NAS which will give an effective increase in performance for workloads remaining on the SAN/NAS
  • Increasing scalability if/when the management demands increase
  • Maximizing the lifespan / performance of the legacy SAN/NAS

In summary, where it is not possible for budgetary reasons to migrate all workloads to Nutanix, migrating some workloads such as VDI, vBCA or Management to Nutanix will help alleviate the impact of scalability, performance and/or resiliency issues with your existing centralized SAN/NAS.

Nutanix also provides a solution which can start (very) small and continue to be scaled in a granular fashion over time until the SAN/NAS goes End of Life and/or when budget exists. At this time all workloads can then be migrated to Nutanix!