MS Exchange Performance – Nutanix vs VSAN 6.0

When I saw a post (20+ Common VSAN Questions) by Chuck Hollis on VMware’s corporate blog claiming (extract below) a “stunning performance advantage (over Nutanix) on identical hardware with most demanding datacenter workloads”, I honestly wondered where he gets this nonsense.

[Image: Extract from the VMware blog post by Chuck Hollis]

Then, when I saw the Microsoft Applications on Virtual SAN 6.0 white paper released, I thought I would check out what VMware is claiming in terms of this stunning performance advantage for an application I have done a lot of work with lately: MS Exchange.

I have summarized the VMware white paper and the Nutanix testing I personally performed in the table below. These tests were not exactly the same; however, the ESXi host CPU and RAM were identical, and both tests used 2 x 10Gb NICs as well as 4 x SSD devices.

The main differences were: ESXi 6.0 for the VSAN testing versus ESXi 5.5 U2 for Nutanix (I’d say that’s advantage number 1 for VMware); VMware used two LSI controllers while my testing used one (advantage number 2); and VMware had a cluster size of 8 whereas my testing (in this case) used only 3. The larger cluster size is a huge advantage for a distributed platform, especially VSAN since it does not have data locality; the more nodes in the cluster, the less chance of a bottleneck.

Nutanix has one advantage: more spindles. But that advantage largely goes away when you consider they are SATA drives compared to VSAN using SAS. If you really want to kick up a stink about Nutanix having more HDDs, take 100 IOPS per drive (which is much more than you can get from a SATA drive consistently) off the Nutanix Jetstress result.

So the areas where I feel one vendor is at a disadvantage I have highlighted in Red, and the opposing solution in Green. Regardless of these opinions, the results really do speak for themselves.

So here is a summary of the testing performed by each vendor and the results:

 

[Image: Table summarising the VSAN and Nutanix test configurations and Jetstress results]

The VMware white paper did not show the Jetstress report; however, for transparency I have copied the Nutanix Jetstress test summary below.

[Image: Nutanix NX-8150 Jetstress test summary]

Summary: Nutanix has a stunning performance advantage over VSAN 6.0 even on “identical” (in fact, in some respects lesser) hardware, an older version of ESXi and lower spec HDDs, while (apparently) having a significant disadvantage by not running in the kernel.

Peak Performance vs Real World Performance

In this post I will be discussing the Real World Performance of storage solutions compared to their peak performance. To make my point I will be using some car analogies which will hopefully help get it across.

Starting with the Bugatti Veyron Super Sport (below). This car has a quad-turbocharged W16 engine producing 1183BHP (~880kW) and a top speed (peak performance) of 267MPH (431KPH).

[Image: Bugatti Veyron Super Sport]

The Veyron achieved the world record 267MPH at Volkswagen’s Ehra-Lessien test track in Germany. The test track has a 5.6 mile long straight. This is one of the very few places on earth where the Veyron can actually achieve its peak performance.

Now for the Veyron to achieve 267MPH, not only do you need a 5.6 mile long straight, but the Veyron’s rear spoiler must NOT be deployed. Rear spoilers provide downforce for stability, so having the spoiler down reduces the car’s ability to, for example, take corners.

[Image: Bugatti Veyron Super Sport]

In addition to requiring a 5.6 mile long straight and the rear spoiler being down, the Veyron can only maintain its top speed (peak performance) for around 12 minutes before its 26.4-gallon fuel tank is emptied, which is lucky, because the Veyron’s specially designed tyres only last around 15 minutes at >250MPH.

[Image: Bugatti Veyron tyres]

So in reality, while the Bugatti Veyron is one of (if not the) fastest production cars in the world, even when you have all your ducks in a row, you can still only achieve its peak performance for a very short period of time (in this example <12 mins) and with several constraints, such as a reduced ability to corner (due to the reduced downforce from the spoiler being down).

Now what about Fuel Economy? The Veyron is rated as follows:

  • City driving: 29 L/100 km; 9.6 mpg
  • Highway driving: 17 L/100 km; 17 mpg
  • Top speed: 78 L/100 km; 3.6 mpg

As you can see, vastly different figures depending on how the Veyron is being used.

There are numerous other factors which can limit the Veyron’s performance, such as weather. For example, if the test track is wet or has strong headwinds, the Veyron would not be able to perform at its peak.

[Image: Bugatti Veyron]

So while the Veyron can achieve 267MPH, in the real world its average (or Real World) performance will be much lower and will vary significantly from owner to owner.

At this stage you’re probably asking “What has this got to do with Storage”?

A storage solution, be it SAN/NAS or hyper-converged, can be configured and benchmarked to achieve really impressive peak performance (IOPS), much like the Veyron.

But these “Peak Performance” numbers can rarely (if at all) be achieved with “Real World” workloads, especially over an extended duration.

To quote two great guys in the Storage industry (Vaughn Stewart & Chad Sakac):

Absolute performance more often than not, is NOT the only design consideration.

I couldn’t agree with this more. Storage vendors are to blame for advertising unrealistic IOPS numbers based on 100% 4K reads, and now customers expect the same number of IOPS from SQL or Oracle workloads.

The MPG of the Veyron is like the number of IOPS a storage array can achieve: it depends on how the car or storage array is used! The car will get higher MPG if used only on the highway, just like a storage array will deliver higher IOPS if used for only one I/O profile.

As the IO size and profile of workloads like SQL & Oracle are vastly different from the peak performance benchmarks using 100% read 4K IOPS, expecting the same IOPS number for the benchmark and SQL/Oracle is as unrealistic as expecting the Veyron to do 267MPH in heavy traffic.
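To put some purely hypothetical numbers around this (the throughput ceiling below is an assumption for illustration, not a benchmark result of any product), here is a quick sketch of how the theoretical IOPS upper bound collapses as the I/O size grows from the 4K used in “hero” benchmarks to the 32-64K typical of workloads like Exchange:

```
# Simplified illustration with an assumed controller throughput ceiling.
# The point: the same MB/s budget yields vastly different IOPS per I/O size.
THROUGHPUT_BUDGET_MBPS = 2000  # assumed ceiling, not a real benchmark figure

def max_iops(io_size_kb: int, budget_mbps: int = THROUGHPUT_BUDGET_MBPS) -> int:
    """Upper bound on IOPS if throughput were the only limit."""
    return int(budget_mbps * 1024 / io_size_kb)

for io_size in (4, 32, 64):
    print(f"{io_size}K I/O: up to ~{max_iops(io_size):,} IOPS")

# 4K I/O:  up to ~512,000 IOPS
# 32K I/O: up to ~64,000 IOPS
# 64K I/O: up to ~32,000 IOPS
```

In practice latency, cache behaviour and read/write mix matter just as much, but even this crude arithmetic shows why a 4K 100% read number tells you very little about what Exchange or SQL will actually get.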

[Image: Heavy traffic]

But like I said, it’s the storage vendors’ fault for failing to educate customers on real world performance, so many customers have the impression that peak IOPS is a good measurement, and as a result they regularly waste time comparing the peak performance of Vendor A and Vendor B instead of focusing on their requirements and real world performance.

In the real world (at least in the vast majority of cases), customers don’t have dedicated storage solutions for one application where peak performance can be achieved, let alone sustained for any meaningful length of time.

Customers generally run numerous mixed workloads on their storage solutions: everything from Active Directory, DNS and DHCP, which have low capacity/IOPS requirements, to database, email and application servers, which may have higher capacity/IOPS requirements, to backups, which are low IOPS but high capacity.

Each of these workloads has a different IO profile and, depending on the storage architecture, may share storage controllers / SSDs / HDDs / storage networking, all of which can result in congestion/contention, which leads to reduced performance.

Before you start considering which vendor’s storage solution is best, you first need to understand (and document) your requirements, along with a success criteria against which you can validate storage solutions.

If your requirements are for example:

  • Host 10TB of Exchange Mailboxes for 2000 users (~400 random Read/Write 32-64k IOPS)
  • Host 20TB Windows DFS solution
  • Host 50TB of Backups
  • Support 1TB active working set SQL Database
  • Host 10TB of misc low IO random workload
  • Have Per VM snapshot / backup / replication capabilities

Then there is no point having (or testing) a solution for 100K random read 4K IOPS, as your requirement may be less than 10K IOPS of varying sizes and profiles, as the rough sizing sketch below illustrates.
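As a rough illustration only, a back-of-the-envelope sizing for the example requirements above might look like this. Every per-workload IOPS figure below (apart from the ~400 Exchange IOPS taken from the example list) is an assumption I’ve made up for the sake of the sketch, not a vendor number:

```
# Back-of-the-envelope sizing for the example requirements listed above.
# All per-workload IOPS figures are illustrative assumptions.
requirements_iops = {
    "Exchange, 2000 users":           400,   # ~400 random 32-64K IOPS (from the example above)
    "Windows DFS, 20TB":              500,   # assumed
    "Backups, 50TB":                  200,   # low IOPS, high capacity (assumed)
    "SQL, 1TB active working set":  3_000,   # assumed
    "Misc low IO workloads, 10TB":  1_000,   # assumed
}

total = sum(requirements_iops.values())
print(f"Aggregate steady-state requirement: ~{total:,} IOPS")             # ~5,100 IOPS

headroom = 2.0  # generous allowance for growth and bursts (assumed)
print(f"With {headroom:.0f}x headroom: ~{int(total * headroom):,} IOPS")  # ~10,200 IOPS
# Nowhere near the 100K 4K random read numbers used in "hero" benchmarks.
```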

Consider this:

If the storage solution/s you’re considering can achieve the 10K IOPS with the I/O profile of your workloads and can be easily scaled, then a solution able to achieve 20K IOPS on day 1 is of little or no advantage over a solution which can achieve 12K IOPS, since 10K IOPS is all you need.

Now if your Constraints are:

  • 12RU rack space
  • 4kW Power
  • $200k

Anything that is larger than 12RU, uses more than 4kW of power or costs more than $200k is not something you should spend your time looking at or benchmarking, since it’s not something you can purchase. A simple way to apply this filter is shown below.
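As an illustration of screening candidates against your constraints and requirements before spending any time on benchmarking or PoCs, here is a minimal sketch. The vendor names and figures are entirely made up:

```
# Screen hypothetical candidates against the constraints and the ~10K IOPS
# requirement above. All vendor names and figures are fictitious.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    rack_units: int
    power_kw: float
    cost_usd: int
    mixed_workload_iops: int   # with YOUR I/O profile, not 4K 100% read

REQUIRED_IOPS = 10_000
MAX_RU, MAX_KW, MAX_COST = 12, 4.0, 200_000

candidates = [
    Candidate("Vendor A", 8, 3.5, 180_000, 12_000),
    Candidate("Vendor B", 16, 5.0, 250_000, 20_000),   # fastest on paper, but outside the constraints
    Candidate("Vendor C", 10, 3.8, 190_000, 11_000),
]

viable = [
    c for c in candidates
    if c.rack_units <= MAX_RU
    and c.power_kw <= MAX_KW
    and c.cost_usd <= MAX_COST
    and c.mixed_workload_iops >= REQUIRED_IOPS
]

print([c.name for c in viable])   # ['Vendor A', 'Vendor C'] -- Vendor B's 20K IOPS is irrelevant
```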

So to quote Vaughn and Chad again, “Don’t perform Absurd Testing”.

[Image: Absurd testing]

In my opinion, customers should value their own time enough not to waste it doing proof of concepts (PoCs) on multiple different products when in reality only two meet their requirements.

An example of absurd testing would be taking a Toyota Corolla on a test drive to a drag strip and testing its 1/4 mile performance when you plan to use the car to pick up the shopping and drop the kids off at school.

[Images: School crossing; car loaded with shopping]

It’s equally absurd to test 100% random read 4K IOPS, or to consider/test/compare a storage solution’s <insert your favourite feature here>, when it’s not required or applicable to your use case.

Summary:

  1. Peak performance is rarely a significant factor for a storage solution.
  2. Understand and document your storage requirements / constraints before considering products.
  3. Create a viability/success criteria when considering storage which validates that the solution meets your requirements within the constraints.
  4. Do not waste time performing absurd testing of “Peak performance” or “features” which are not required/applicable.
  5. Only conduct Proof of Concepts on solutions:
    1. Where no evidence exists of the solution’s capability for your use case/s.
    2. Which fall within your constraints (cost, size, power, cooling etc).
    3. Which on paper meet/exceed your requirements!
    4. Where you have a documented PoC plan with a detailed success criteria!
  6. As long as the solution you’re considering can quickly, easily and non-disruptively scale, there is no need to oversize day 1.
    1. If the solution you’re considering CAN’T quickly, easily and non-disruptively scale, then it’s probably not worth considering.
  7. The performance of a storage solution can be impacted by many factors such as compute, network and applications.
  8. When benchmarking, do so with tests which simulate the workload/s you plan to run, not “hero” style 100% read 4K (to achieve peak IOPS numbers) or 100% read 256K (to achieve high throughput numbers) – a minimal example of a profile-matched test is sketched below.
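As an example of point 8, below is a minimal sketch which builds an fio command matched to a real workload profile rather than a 4K 100% read hero test. The 32K, ~60/40 random read/write profile is an assumption standing in for an Exchange-like workload, and /dev/sdX is a placeholder target – substitute your own measured profile and test device:

```
# Minimal sketch: assemble an fio command approximating an Exchange-like
# random 32K, ~60/40 read/write workload instead of a 4K 100% read "hero" test.
fio_cmd = (
    "fio --name=exchange-like "
    "--filename=/dev/sdX "           # placeholder device/file to test against
    "--rw=randrw --rwmixread=60 "    # random mixed IO, ~60% reads (assumed profile)
    "--bs=32k "                      # IO size matched to the workload, not 4K
    "--iodepth=8 --numjobs=4 "       # modest, realistic outstanding IO
    "--direct=1 "                    # bypass the page cache
    "--runtime=3600 --time_based "   # run for an hour, not a 30 second burst
    "--group_reporting"
)

print(fio_cmd)
# To execute: subprocess.run(shlex.split(fio_cmd), check=True)
# (requires `import shlex, subprocess` and fio installed on the host)
```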

Unlimited VMs per datastore? It’s not a myth with Nutanix!

For many years I have been asked, on countless occasions, how many VMs can (or should) be placed in one datastore.

In fact, just this morning I was asked this same question, and I decided to whip up a quick post.

I have previously posted an Example Architectural Decision relating to datastore sizing for block based storage. What this example aimed to show was how things like RPO/RTO and performance should be taken into consideration when choosing a datastore size.

The above example is not a hard and fast rule, but an example of one deployment which I was involved in.

There is a great article by VCDX Jason Boche (@jasonboche), titled “VAAI and the Unlimited VMs per Datastore Urban Myth”, which covers this topic in great detail as it relates to block based storage (iSCSI, FC & FCoE).

But what about NFS, and what about with Hyper-converged solutions like Nutanix?

NFS has gained significant popularity in recent years and, in my opinion, people who know what they are talking about no longer refer to NFS as “Tier 3 Storage”, as was once common.

With traditional storage solutions, generally only a small number of controllers can actively serve IO to any one NFS mount, so in my experience the limiting factor preventing more virtual machines per NFS mount was performance, although things like RPO/RTO were (and are) also important considerations.

NFS does not suffer from SCSI reservations, which resulted in increased latency and which VAAI, specifically the Atomic Test & Set (ATS) primitive, helped to all but eliminate for block based datastores.

LUNs are limited by their queue depth, which in most cases is 32 (sometimes 64). This is also a limiting factor, as all the VMs in a datastore (LUN) share the same queue, which can lead to contention. SIOC helps manage the contention by ensuring fairness based on share values, but it does not solve the issue.

NFS on the other hand has a much larger queue depth; in fact it’s basically unlimited, as shown below.

[Image: NFS queue depth]
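To illustrate why that 32 (or 64) deep LUN queue matters, consider how each VM’s average share of the queue shrinks as VM density grows. This is a deliberately simplified sketch which ignores burstiness and SIOC share values:

```
# Simplified illustration: every VM on a LUN-backed datastore shares one queue,
# commonly 32 deep, so the average outstanding IO available per VM shrinks
# quickly as VM density grows.
LUN_QUEUE_DEPTH = 32

for vms_per_datastore in (8, 32, 128, 800):
    slots_per_vm = LUN_QUEUE_DEPTH / vms_per_datastore
    print(f"{vms_per_datastore:>4} VMs per LUN: ~{slots_per_vm:.2f} outstanding IOs each")

# Output:
#    8 VMs per LUN: ~4.00 outstanding IOs each
#   32 VMs per LUN: ~1.00 outstanding IOs each
#  128 VMs per LUN: ~0.25 outstanding IOs each
#  800 VMs per LUN: ~0.04 outstanding IOs each
# An NFS datastore's effectively unlimited queue depth removes this particular cap,
# leaving controller performance and RPO/RTO as the real limits.
```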

So as NFS does not suffer from SCSI reservations or queue depth issues, what is stopping us from running hundreds or more VMs per datastore?

It comes down to how many active storage controllers are able to service the NFS mount and the performance of those storage controller/s, in addition to your business requirements around RPO/RTO. In other words, if an NFS mount is lost, how quickly can you recover?

Most traditional shared storage products:

1. Have only 1 or 2 active controllers, thus potentially limiting performance, which would lead to fewer VMs per NFS datastore.

2. Do snapshots at the NFS mount layer, so if you need to recover an entire NFS mount, the larger it is, the longer it may take.

With Nutanix, NFS is used by default to present the Nutanix Distributed File System (NDFS) to vSphere; however, the key difference between Nutanix and traditional shared storage is that every controller in the Nutanix cluster can, and does, actively serve IO to any datastore in the cluster concurrently.

So the limit from a performance perspective is gone thanks to the Nutanix scale out, shared nothing architecture, with one virtual storage controller (CVM) per Nutanix node. The number of nodes that can be scaled to is also unlimited. An example of Nutanix’s ability to scale can be found here – Scaling to 1 million IOPS and beyond, Linearly!

Next, what about the RPO/RTO issue? Well, Nutanix does not rely on LUNs or NFS mounts for data protection (or snapshots); this is all done at the VM layer, so your RPO/RTO is now per VM, which gives you much more flexibility.

With Nutanix, you can literally run hundreds or even thousands of VMs per NFS datastore without performance or RPO/RTO problems, thanks to the scale out, shared nothing architecture and the Nutanix Distributed File System.

There are some reasons why you may choose to have multiple NFS datastores even in a Nutanix environment; for example, if you want to enable compression and/or deduplication, which are enabled/disabled at a per container (or datastore) level. As some workloads don’t compress or dedupe well, these workloads should be excluded to reduce the overhead on the cluster.

It is important to note that Nutanix uses a concept called a “Storage Pool”, which contains all the storage for the Nutanix cluster. On top of a “Storage Pool” you create “Containers” (or datastores). This means that regardless of whether you have 1 or 100 datastores, they all still sit on top of the one “Storage Pool”, which means you still have access to the same amount of storage capacity, with no silos, for maximum capacity utilization (and performance!).
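Conceptually (this is just an illustrative sketch, not Nutanix code), the relationship looks like this: containers are thin carve-outs of a single shared pool, so adding containers never silos capacity:

```
# Conceptual sketch only: many containers (datastores) draw thin-provisioned
# capacity from one shared storage pool as data is written.
class StoragePool:
    def __init__(self, capacity_tb: float):
        self.capacity_tb = capacity_tb
        self.used_tb = 0.0

    def free_tb(self) -> float:
        return self.capacity_tb - self.used_tb

class Container:
    """A datastore; consumes space from the shared pool only as data is written."""
    def __init__(self, name: str, pool: StoragePool):
        self.name, self.pool = name, pool

    def write(self, tb: float) -> None:
        if tb > self.pool.free_tb():
            raise RuntimeError("Pool exhausted")
        self.pool.used_tb += tb

pool = StoragePool(capacity_tb=100)
prod = Container("Prod", pool)        # e.g. compression enabled
backups = Container("Backups", pool)  # e.g. compression disabled

prod.write(30)
backups.write(50)
print(f"Free capacity shared by all containers: {pool.free_tb()} TB")  # 20.0 TB
```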

Lastly, Nutanix does not suffer from the same availability concerns as traditional shared storage, where a single LUN could potentially be lost. This is due to the distributed architecture of the Nutanix solution. For more information on how Nutanix is more highly available than traditional shared storage, check out “Scale out, Shared Nothing Architecture Resiliency by Nutanix”.

Check out the screenshot below of one cluster with ~800 VMs on a single datastore. Note the sub-millisecond latency and 14K IOPS with ~900MBps throughput. Not bad!

[Image: ~800 VMs on a single Nutanix datastore, showing ~14K IOPS, ~900MBps throughput and sub-millisecond latency]