Storage Performance : ReFS vs NTFS

I am regularly asked by customers whether they should use NTFS or the newer ReFS when formatting drives for applications like Microsoft Exchange and SQL Server.

Most customers are asking in the context of performance, so I thought I would share some recent testing results using MS Exchange Jetstress.

Firstly, what is ReFS and when would you use it?

What is ReFS?

Resilient File System (ReFS) is a new local file system. It maximizes data availability, despite errors that would historically cause data loss or downtime. Data integrity ensures that business critical data is protected from errors and available when needed. Its architecture is designed to provide scalability and performance in an era of constantly growing data set sizes and dynamic workloads.

The key features of ReFS are:

  • Integrity: ReFS stores data so that it is protected from many of the common errors that can cause data loss. File system metadata is always protected. Optionally, user data can be protected on a per-volume, per-directory, or per-file basis. If corruption occurs, ReFS can detect and, when configured with Storage Spaces, automatically correct the corruption. In the event of a system error, ReFS is designed to recover from that error rapidly, with no loss of user data.
  • Availability: ReFS is designed to prioritize the availability of data. With ReFS, if corruption occurs, and it cannot be repaired automatically, the online salvage process is localized to the area of corruption, requiring no volume down-time. In short, if corruption occurs, ReFS will stay online.
  • Scalability: ReFS is designed for the data set sizes of today and the data set sizes of tomorrow; it’s optimized for high scalability.
  • App Compatibility: To maximize AppCompat, ReFS supports a subset of NTFS features plus Win32 APIs that are widely adopted.
  • Proactive Error Identification: The integrity capabilities of ReFS are leveraged by a data integrity scanner (a “scrubber”) that periodically scans the volume, attempts to identify latent corruption, and then proactively triggers a repair of that corrupt data.

Source: Microsoft TechNet – Resilient File System

From my perspective, ReFS makes sense on physical servers with unintelligent storage such as JBOD, or on any storage that does not checksum both read and write IO and enforce Force Unit Access (FUA). However, if you’re deploying MS Exchange, MS SQL and similar applications on intelligent storage such as Nutanix Acropolis Distributed Storage Fabric (ADSF), ReFS is not required because data integrity is already ensured by the storage layer. For example, in the event of silent data corruption, ADSF detects the corruption on read and simply retrieves the data from the second copy, which resides on a different physical drive on a different node within the cluster. This is transparent to the virtual machine, OS and application, and is therefore compatible with any OS and application.
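The detect-on-read behaviour described above can be illustrated with a toy sketch. This is not how ADSF is implemented; the `ReplicatedStore` class, its two in-memory "copies" and the use of CRC32 are all assumptions made purely for illustration of the general idea (verify a checksum on read, fall back to the second copy on mismatch).

```python
import zlib


class ReplicatedStore:
    """Toy block store: two copies of every block plus a CRC32 checksum.

    Hypothetical illustration only; real storage layers are far more involved.
    """

    def __init__(self):
        # Stand-ins for two physical drives on two different nodes.
        self.copies = [{}, {}]
        self.checksums = {}

    def write(self, key, data: bytes):
        # Record the checksum at write time and store the data twice.
        self.checksums[key] = zlib.crc32(data)
        for copy in self.copies:
            copy[key] = data

    def read(self, key) -> bytes:
        # Verify each copy against the stored checksum; return the first good one.
        expected = self.checksums[key]
        for copy in self.copies:
            data = copy[key]
            if zlib.crc32(data) == expected:
                return data
        raise IOError(f"all copies of {key!r} failed checksum verification")


store = ReplicatedStore()
store.write("block-0", b"exchange database page")

# Simulate silent corruption of the first copy only.
store.copies[0]["block-0"] = b"exchange databaXe page"

recovered = store.read("block-0")
print(recovered)
```

The caller of `read` never sees the corrupted copy, which is the sense in which the recovery is transparent to the layers above the storage.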

As a result, ReFS (at least in its current version) is not required for deployments of Microsoft operating systems and applications on Nutanix, or on other storage solutions that provide the same functionality.

Nonetheless, this is not supposed to be a post about Nutanix, so let’s now look at the test bed and the results of the performance comparison so you can make an informed decision about which to use.

Test Bed Setup

The test bed setup is as follows:

Hypervisor: ESXi 5.5 Rel: 3248547

2 Virtual Machines cloned from the same template:
Windows 2012 R2, 4 vCPUs, 24GB RAM
4 Paravirtual SCSI adapters
1 vDisk for OS, 4 vDisks for DB, 4 vDisks for Logs

Both VMs are running on the same node, with only one VM running Jetstress at a time. All test runs were performed back to back to keep the comparison fair and to check the consistency of the results.

The only difference between the two VMs is as follows:

VM1:

4 vDisks formatted with NTFS and 64k allocation size for Database
4 vDisks formatted with NTFS and 4k allocation size for Logs

VM2:

All 8 vDisks formatted with ReFS (64k)

Tests performed:

Three Jetstress runs per VM, one after another, importantly with new databases created before each run to ensure a fair baseline. Doing this ensured the results were not skewed by having the Extent Cache (In-Memory Read Cache) or the Medusa Cache (In-Memory Metadata Cache) pre-warmed.

Each run used 16 threads and produced the following results.

ReFS Jetstress Instance:

Run One: 6697 IOPS
Run Two: 6896 IOPS
Run Three: 6796 IOPS

Average: 6796 IOPS (approx. 3% spread between runs)

NTFS Jetstress Instance:

Run One: 7328 IOPS
Run Two: 7240 IOPS
Run Three: 7296 IOPS

Average: 7288 IOPS (approx. 1% spread between runs)

Result:

NTFS delivered approximately 7% higher performance than ReFS, with more consistent results between runs.
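For readers who want to check the arithmetic, the averages, run-to-run spread and relative difference can be recomputed from the six run figures above. This is a minimal sketch; the variable names and the definition of "spread" (gap between fastest and slowest run, relative to the slowest) are my own choices, not something reported by Jetstress.

```python
from statistics import mean

# IOPS from the three 16-thread Jetstress runs per file system.
refs_runs = [6697, 6896, 6796]
ntfs_runs = [7328, 7240, 7296]


def spread_pct(runs):
    """Gap between the fastest and slowest run, as a percentage of the slowest."""
    return (max(runs) - min(runs)) / min(runs) * 100


refs_avg = mean(refs_runs)
ntfs_avg = mean(ntfs_runs)

# Relative NTFS advantage, using the ReFS average as the baseline.
advantage_pct = (ntfs_avg - refs_avg) / refs_avg * 100

print(f"ReFS: avg {refs_avg:.0f} IOPS, spread {spread_pct(refs_runs):.1f}%")
print(f"NTFS: avg {ntfs_avg:.0f} IOPS, spread {spread_pct(ntfs_runs):.1f}%")
print(f"NTFS advantage: {advantage_pct:.1f}%")
```

With these inputs the NTFS advantage works out to roughly 7.2%, and the spread to roughly 3% for ReFS and 1.2% for NTFS.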

Additional Tests:

Out of interest I repeated the tests with a lower thread count (8) to see if the results were consistent as we decreased the threads.

8 Threads:

ReFS: 3921 IOPS
NTFS: 4079 IOPS

The result again went in favour of NTFS by approx 4%. This makes sense as the advantage would diminish as the pressure on the storage layer reduces.

Autotune Result:

I then repeated the test with Jetstress set to Autotune with the following results.

ReFS: 16673 IOPS @ 91 threads (Autotuned)
NTFS: 17758 IOPS @ 96 threads (Autotuned)

The autotune results again show NTFS with an advantage over ReFS of approx 7%, which is in line with the results using 16 manually configured threads.
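The same comparison can be applied across all three configurations. The sketch below only reuses the figures already quoted; the per-thread breakdown for the autotune run is an extra view I have added as an assumption about what might be useful, since autotune settled on different thread counts for the two VMs.

```python
# (ReFS IOPS, NTFS IOPS) per configuration, taken from the results above.
results = {
    "8 threads": (3921, 4079),
    "16 threads (average)": (6796, 7288),
    "autotune": (16673, 17758),
}

# Thread counts Jetstress autotune settled on for each VM.
autotune_threads = {"ReFS": 91, "NTFS": 96}


def advantage_pct(refs_iops, ntfs_iops):
    """NTFS advantage relative to the ReFS figure, in percent."""
    return (ntfs_iops - refs_iops) / refs_iops * 100


advantages = {name: advantage_pct(*pair) for name, pair in results.items()}
for name, pct in advantages.items():
    print(f"{name}: NTFS ahead by {pct:.1f}%")

# Per-thread throughput for the autotune run only.
refs_per_thread = results["autotune"][0] / autotune_threads["ReFS"]
ntfs_per_thread = results["autotune"][1] / autotune_threads["NTFS"]
print(f"autotune IOPS per thread: ReFS {refs_per_thread:.1f}, NTFS {ntfs_per_thread:.1f}")
```

Because the autotune thread counts differ (91 vs 96), the per-thread figures are closer together than the headline IOPS, so the autotune result is probably best read as a comparison of overall achievable throughput rather than a like-for-like thread comparison.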

CPU overhead comparison

ReFS Jetstress Instance:

Run One: Avg 39.293% (Min 23.725 / Max 44.127)
Run Two: Avg 40.28% (Min 37.785 / Max 44.366)
Run Three: Avg 40.175% (Min 36.520 / Max 43.843)

Average: 39.916%

NTFS Jetstress Instance:

Run One: Avg 39.390% (Min 36.746 / Max 42.651)
Run Two: Avg 39.719% (Min 23.613 / Max 45.960)
Run Three: Avg 39.844% (Min 37.347 / Max 42.400)

Average: 39.651%

So NTFS achieved approximately 7% better performance than ReFS at the same thread count without using any more CPU, even though the data integrity features were turned off for the ReFS volumes.
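One way to combine the two sets of numbers is IOPS delivered per percentage point of CPU. The sketch below assumes the CPU figures above come from the same three 16-thread runs as the IOPS averages (the post does not state this explicitly), and the "IOPS per CPU point" metric is my own illustrative choice.

```python
from statistics import mean

# Average CPU utilisation (%) per run, from the figures above.
refs_cpu_runs = [39.293, 40.28, 40.175]
ntfs_cpu_runs = [39.390, 39.719, 39.844]

# Average IOPS from the 16-thread runs (assumed to be the matching runs).
refs_iops, ntfs_iops = 6796, 7288

refs_cpu = mean(refs_cpu_runs)
ntfs_cpu = mean(ntfs_cpu_runs)

# IOPS delivered per percentage point of CPU consumed.
refs_eff = refs_iops / refs_cpu
ntfs_eff = ntfs_iops / ntfs_cpu

print(f"ReFS: {refs_cpu:.3f}% CPU, {refs_eff:.1f} IOPS per CPU point")
print(f"NTFS: {ntfs_cpu:.3f}% CPU, {ntfs_eff:.1f} IOPS per CPU point")
print(f"CPU difference: {refs_cpu - ntfs_cpu:.3f} percentage points")
```

The CPU averages differ by well under one percentage point, so on this view the gap between the two file systems comes almost entirely from the IOPS figures.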

Summary:

Overall, these tests demonstrate that NTFS consistently outperforms ReFS for MS Exchange type IO patterns. On intelligent storage, ReFS has no advantage: NTFS provides better performance with roughly the same CPU overhead, and data integrity is already ensured by the storage layer.

As the recommendation for ReFS is to disable the data integrity features for Exchange, I have yet to hear a good justification for why ReFS is recommended. I welcome any comments from those in the know, and if the justifications are solid I will update the post to reflect them.

5 thoughts on “Storage Performance : ReFS vs NTFS”

  1. Hi,

    this is very interesting, glad to read some benchmarks at last.
    Is there any chance you can at a later time re-run this with the vdisks spread over really different storage devices (so not 8 carved out of one datastore, and not 8 datastores on 8 luns from the same raidset, but 8 datastores on 8 physical block devices, meaning no concurrent IO on the lowest layer.)

    Only once we know whether it behaves the same when it can flex and stretch out among IO queues will we know everything about it.

    It would be very very very interesting (but I understand that is of course a lot more effort & hardware thrown on the same test)

    • Glad you found the article interesting.

      The tests were run on Nutanix, which is very different to traditional shared storage with RAID/LUNs etc. Yes, the tests were on a single datastore, but again, this is very different than with other traditional shared storage. In the case of Nutanix, read IO is served locally, so there is no shared IO and there are no noisy neighbour issues, and write IO is half local, half remote (for redundancy), so it is nothing like a centralized SAN/NAS.

      In this case, >1000 Jetstress IOPS is plenty for any properly sized production Exchange server, and Nutanix is showing >17K IOPS on NTFS from a single VM with just 8 vDisks.

      In my experience with Nutanix, multiple datastores have minimal to no benefit on performance.

      I have another blog coming out showing scale out performance with Jetstress, but it’s NTFS only because, at least in my opinion based on these tests and numerous real world deployments, I see no reason to run ReFS in its current version on intelligent storage.

      I do agree it would be interesting to do what you have asked on other storage, and to test the same on JBOD. I didn’t have time to do those tests, but I figured I would post these results as I’m sure they will help some people, given that there is limited information available currently.

      Thanks for the comments.

  2. (I think we already know that NTFS wouldn’t, even if striped, go sky-rocket performance with the added hardware, but a new-from-the-ground filesystem might act differently)

  3. So,
    1) “ReFS .. is not required for deployments of Microsoft OS”. Not entirely: For Exchange deployments, ReFS is not supported for volumes other than those hosting Exchange data (DB, logs and CIs)
    2) Which Exchange DB engine did you use?
    3) If it’s Exchange, did you disable the ReFS data integrity feature as part of the requirements for Exchange database files or the volume hosting those files?
    3) With Exchange 2016, ReFS is recommended over NTFS
    4) Do you still have the CPU utilization figures from your tests? When I did a small comparison last year, CPU utilization for ReFS was ~15% lower, even with BitLocker applied.

    • Hi Michel,

      Thanks for the comments.

      1) Agree ReFS is not supported for OS and is only supported for DB, logs etc.
      2) ESE Version 15.00.0516.026 & Jetstress Version 15.01.0466.031
      3) Yes the data integrity features were disabled. I have read the same about ReFS being recommended in Exchange 2016, however I am yet to find a detailed justification for the recommendation. If you are aware of one please forward me the URL and I will put a link to it in this post.
      4) Yes I do, I will update the post above with the CPU utilisation details.

      Cheers
