Peak performance vs Real World – Exchange on Nutanix Acropolis Hypervisor (AHV)

I wrote a post in April 2015 titled “Peak Performance vs Real World Performance” which discusses how benchmarks are not realistic and the performance shown in benchmarks can rarely be reproduced with real workloads. It has been one of my most popular posts, and I have had overwhelmingly positive feedback, with only a select few still pushing unrealistic peak performance benchmarks as being of value to customers.

I thought I would whip up a post showing an example of benchmarks vs real world performance requirements using MS Exchange Jetstress on Nutanix.

The below is a screen shot from Nutanix PRISM HTML based GUI showing a Virtual Machines Read/Write IOPS , bandwidth and latency during a MS Exchange Jetstress benchmark.

JetstressAHV20160105

The screen shot shows ~4000 Read IOPS and ~4000 Write IOPS at a latency of 1.59ms.

But what does the above really tell us and what does it mean to a customer?

I’ve been quoted as saying “Benchmarks are of little value without context specific to customer requirements!” and I stand by this statement.

Let’s now look at an example of a real customers requirement:

The below is from the Exchange server role requirements calculator and it is a screen shot from the Role requirements tab which shows an estimate of the IOPS required for the Databases and Logs for a single Exchange instance.

ExchangeIOexample

It shows the required IOPS being 536 for the databases and 115 for the logs.

Note: The sizing calculator was for an environment supporting 20000 mailboxes across 3 mailbox servers. As such, the above IO requirements are for ~6666 users.

So now that we have done the MS Exchange solution sizing (shown above is just the storage performance requirements), we understand the requirement to be around 651 mixed Read/Write IOPS per mailbox VM. We can then take a benchmark such as Jetstress and validate that the solution has sufficient storage performance.

To require the ~8000 IOPS the Jetstress test showed, we would need to scale up each Exchange instances to support have a much larger number of users and have each user send/receive 500 emails per day to reach this requirement.

8kJetstressIOPS

But in scaling up each Exchange instance to reach the peak IOPS that even this 3 year old generation Nutanix node can deliver we would vastly exceed the compute sizing recommendations for Exchange 2013 (being 24vCPUs and 96GB RAM) as shown by the calculator below.

ScaleUpExchange

As we can see, for an Exchange instance to require those peak IOPS, we would have to size the Mailbox server VMs with more than 10x the recommended vCPUs (24) and 15x the RAM (96GB). This shows that peak IOPS which can be achieved are not relevant in the real world.

In fact, Exchange generally does not require more than 1000 IOPS. Typically its requires much less, as my earlier example shows. So peak performance numbers are of little/no value as they can’t (and more importantly don’t need to be) reproduced in the real world.

With a tool like Jetstress we can configure a precise Mailbox profiles and test only what you require. If the solution can produce more IOPS than what you need (such as in this example), that’s fine for headroom, but in this day and age where Nutanix allows you to quickly and easily scale (Compute/Storage performance & capacity), I recommend designing for what you need in the foreseeable future (by this I mean 6-12 months) and scale if/when required.

What a benchmark does help you understand is how much headroom a solution has over and above your requirements which can help choose a solution to support mixed workloads, BUT the benchmark would need to be re-ran concurrently with suitable benchmarks for all other applications you intend on mixing to see how the solution behaves with mixed workloads.

As such, single application peak performance benchmarks are almost never valuable (to customers) unless your planning to run application specific silos. I strongly recommend anyone considering implementing an application specific silo, read the following article: Enterprise Architecture & Avoiding tunnel vision.

And… if you’re planning to run application specific silos and/or scaling up workloads to the point they need crazy IOPS, then you’re increasing the size of your failure domains, CAPEX and OPEX which is only doing yourself (or your customer) a disservice. But that’s a topic for another day.

I hope this example shows how real world requirements and performance is vastly different to what a benchmark shows and why peak performance benchmarks should be taken with a grain of salt.

I’ve always said the focus should be on gathering requirements and delivering on business outcomes, not focusing on performance which is typically only a very small part of a solution that delivers a successful business outcome.

Summary:

When sizing an MS Exchange solution on Nutanix, IOPS is not a constraining factor even for large scale deployments. The most common constraining factor is the Microsoft recommended compute maximums being 24 vCPUs and 96GB RAM, which is the same constraint regardless of if you run on Nutanix, or any other virtual / physical platform.

Related Articles:

Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 8 – Analytics (Performance / Capacity Management)

Acropolis provides a powerful yet simple-to-use Analysis solution which covers the Acropolis Platform, Compute (Acropolis Hypervisor / Virtual Machines) and Storage (Distributed Storage Fabric).

Unlike other Analysis solutions, Acropolis requires no additional software licensing, management infrastructure or virtual machines/applications to design/deploy or configure. The Nutanix Controller VM includes built-in Analysis which have no external dependencies. There is no need to extract/import data into another product or Virtual appliance meaning lower overheads e.g.: Less data is required to be stored and less impact on storage.

Not only is this capability built in day one, but as the environment grows over time, Acropolis automatically scales the analytics capability; there is never a tipping point where you need to deploy additional instances, increase compute/storage resources assigned to Analytics Virtual Appliances or deploy additional back end databases.

For a demo of the Analysis UI see the following YouTube Video from 4:50 onwards.

Summary:

  1. In-Built analysis solution
  2. No additional licensing required
  3. No design/implementation or deployment of VMs/appliances required
  4. Automatically scales as the XCP cluster/s grow

Lower overheads due to being built into Acropolis and utilizing the Distributed Storage Fabric

Back to the Index

Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 6 – Performance

When talking about performance, it’s easy to get caught up in comparing unrealistic speed and feeds such as 4k I/O benchmarks. But, as any real datacenter technology expert knows, IOPS are just a small piece of the puzzle which, in my opinion, get far too much attention as I discussed in my article Peak Performance vs Real World Performance.

When I talk about performance, I am referring to all the components within the datacenter including the Management components, Applications/VMs, Analytics, Data Resiliency and everything in between.

Let’s look at a few examples of how Nutanix XCP running Acropolis Hypervisor (AHV) ensures consistent high performance for all components:

Management Performance:

The Acropolis management layer includes the Acropolis Operating System (formally NOS), Prism (HTML 5 GUI) and Acropolis Hypervisor (AHV) management stack made up of “Master” and “Slave” instances.

This architecture ensures all CVMs actively and equally contribute to ensuring all areas of the platform continue running smoothly. This means there is no central application, database or component which can cause a bottleneck, being fully distributed is key to delivering a web-scale platform.

AcropolisCluster1

Each Controller VM (CVM) runs the components required to manage the local node and contribute to the distributed storage fabric and management tasks.

For example: While there is a single Acropolis “Master” it is not a single point of failure nor is it a performance bottleneck.

The Acropolis Master is responsible for the following tasks:

  1. Scheduler for HA
  2. Network Controller
  3. Task Executors
  4. Collector/Publisher of local stats from Hypervisor
  5. VNC Proxy for VM Console connections
  6. IP address management

Each Acropolis Slave  is responsible for the following tasks:

  1. Collector/Publisher of local stats from Hypervisor
  2. VNC Proxy for VM Console connections

Regardless of being a Master or Slave, each CVM performs the two heaviest tasks: The Collection & Publishing of Hypervisor stats and, when in use, the VM console connections.

The distributed nature of the XCP platform allows it too achieve consistently high performance. Sending stats to a central location such as a central management VM and associated database server not only can become a bottleneck, but without introducing some form of application level HA (e.g.: SQL Always On Availability Group) it also could be a single point of failure which is for most customers unacceptable.

The roles which are performed by the Acropolis Master are all lightweight tasks such as the HA scheduler, Network Controller, IP address management and Task Executor.

The HA scheduler task is only active in the event of a node failure which makes it a very low overhead for the Master. The Network Controller task is only active when tasks such as new VLANs are being configured and Task Execution is simply keeping track of all tasks and distributing them for execution across all CVMs. IP address management is essentially a DHCP service, which is also an extremely low overhead.

In part 8, we will discuss more about Acropolis Analytics.

Data Locality

Data locality is a unique feature of XCP where new I/O writes to the local node where the VM is running as well as replicated to other node/s within the cluster. Data locality eliminates the requirement for servicing subsequent Read I/O by traversing the network and utilizing a remote controller.

As VMs migrate around a cluster, Write I/O is always written locally and remote reads will only occur if remote data is accessed. If data is remote and never accessed, no remote I/O will occur. As a result, it is typical for >90% of I/O to be serviced locally.

Currently bandwidth and latency across a well designed 10Gb network may not be an issue for some customers, however as flash performance exponentially increases the network could quite easily become a major bottleneck without moving to expensive 40Gb (or higher) networking. Data locality helps minimize the dependency on the network by servicing the majority of Read I/O locally and by writing one copy locally it reduces the overheads on the network for Write I/O. Therefore Data Locality allows customers to run lower cost networking without compromising performance.

While data locality works across all supported hypervisors,  AHV is unique as it supports data-aware virtual machine placement:  Virtual Machines are powered onto the node with the highest percentage of local data for that VM which minimizes the chance of remote I/O and reduces the overheads involved in servicing I/O for each VM following failures or maintenance.

In addition, Data Locality also applies to the collection of back end data for Analysis such as hypervisor and virtual machine statistics. As a result, statistics are written locally and a second (or third for environments configured with RF3) written remotely. This means stats data which can be a significant amount of data has the lowest possible impact on the Distributed File System and cluster as a whole.

Summary:

  1. Management components scale with the cluster to ensure consistent performance
  2. Data locality ensures data is as close to the Compute (VM) as possible
  3. Intelligent VM placement based on Data location
  4. All Nutanix Controller VMs work as a team (not in pairs) to ensure optimal performance of all components and workloads (VMs) in the cluster

Back to the Index