Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 1 – Introduction

Before I go into the details of why Acropolis Hypervisor (AHV) is the next generation of hypervisor, I wanted to quickly cover what the Xtreme Computing Platform is made up of and clarify the product names which will be discussed in this series.

In the below picture we can see Prism which is a HTML 5 based user interface sits on top of Acropolis which is a Distributed Storage and Application Mobility across multi-hypervisors and public clouds.

At the bottom we can see the currently support hardware platforms from Supermicro and Dell (OEM) but recently Nutanix has announced an OEM with Lenovo which expands customer choice further.

Please do not confuse Acropolis with Acropolis Hypervisor (AHV) as these are two different components, Acropolis is the platform which can run vSphere, Hyper-V and/or the Acropolis Hypervisor which will be referred to in this series as AHV.
nutanixxcp2

I want to be clear before I get into the list of why AHV is the next generation hypervisor that Nutanix is a hypervisor and cloud agnostic platform designed to give customers flexibility & choice.

The goal of this series is not trying to convince customers who are happy with their current environment/s to change hypervisors.

The goal is simple, to educate current and prospective customers (as well as the broader market) about some of the advantages / values of AHV which is one of the hypervisors (Hyper-V, ESXi and AHV) supported on the Nutanix XCP.

Here are my list of reasons as to why the Nutanix Xtreme Computing Platform based on AHV is the next generation hypervisor/management platform and why you should consider the Nutanix Xtreme Computing Platform (with Acropolis Hypervisor a.k.a AHV) as the standard platform for your datacenter.

Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor

Part 2 – Simplicity
Part 3 – Scalability
Part 4 – Security
Part 5 – Resiliency
Part 6 – Performance
Part 7 – Agility (Time to Value)
Part 8 – Analytics (Performance & Capacity Management)
Part 9 – Functionality (Coming Soon)
Part 10 – Cost

NOTE:  For a high level summary of this series, please see the accompanying post by Steve Kaplan, VP of Client Strategy at Nutanix (@ROIdude)

What if my VMs storage exceeds the capacity of a Nutanix node?

I get this question a lot, What if my VM exceeds the capacity of the node its running on. The answer is simple, the storage available to a VM is the entire storage pool which is made up of all nodes within the cluster and is not limited to the capacity of any single node.

Let’s take an extreme example, a single VM is running on Node B (shown below) and all other nodes have no workloads. Regardless of if the nodes are “Storage only” such as NX-6035C or any Nutanix node capable of running VMs e.g.: NX3060-G4 the SSD and SATA tiers are shared.

AllSSDhybrid

The VM will write data to the SSD tier and only once the entire SSD tier (i.e.: All SSD in all nodes) reaches 75% capacity will ILM tier the coldest data off the to SATA tier. So if the SSD tier never reaches 75% you will have all data in SSD tier both local and remote.

This means multiple CVMs (Nutanix Controller VM) will service the I/O which allows for single VMs to achieve scale up type performance where required.

As the SSD tier exceeds 75% data is tiered down to SATA but active data will still reside in SSD tier across the cluster and be serviced with all flash performance.

The below shows there is a lot of data in the SATA tier but ILM is intelligent enough to ensure hot data remains in the SSD tier.

AllSSDwithColdData

Now what about Data Locality, Data Locality is maintained where possible to ensure the overheads of going across the network are minimized but simply put, if the active working set exceeds the local SSD tier Nutanix ensures maximum performance by distributing data across the shared SSD tier (not just two nodes for example) and services I/O through multiple controllers.

In the worst case where the active working set exceeds the local SSD capacity but fits within the shared SSD tier, you will have the same performance as a Centralised All Flash Array, in the best case, Data Locality will avoid the requirement to traverse the IP network and service reads locally.

If the active working set exceeds the shared SSD tier, Nutanix also distributes data across the shared SATA tier and services I/O from all nodes within the cluster as explained in a recent post “NOS 4.5 Delivers Increased Read Performance from SATA“.

Ideally I recommend sizing the Active working set of VMs to fit within the local SSD tier but this is not always possible. If you’re running Nutanix you can find out what the active working set of a VM is via PRISM (See post here) and if you’re looking to size for a Nutanix solution, use my rule of thumb for sizing for storage performance in the new world.

Bug Life: vSphere 6.0 Network I/O Control & Custom Network Resource Pools

In a previous post How to configure Network I/O Control (NIOC) for Nutanix (or any IP Storage) I showed just how easy configuring NIOC is back in the vSphere 5.x days.

In was based around the concepts of Shares and Limits, of which I have always recommended shares which enable fairness while allowing traffic to burst if/when required. NIOC v2 was a Simple, and effective solution for sure.

Enter NIOC V3 in vSphere 6.0.

Once you upgrade to NIOC v3 you can no longer use the vSphere C# client and NIOC also now has the concept of bandwidth reservations as shown below:

NIOCoverview

I am not really a fan of reservations in NIOC or for CPU (memory is good though) and in fact I’ll go as far as to say NIOC was great in vSphere 5.x and I don’t think it needed any changes.

However with vSphere 6.0 Release 2494585 when attempting to create a custom network resource pool under the “Resource Allocation” menu by using the “+” icon (as shown below) you may experience issues.

As shown below, before even pressing the “+” icon to create a network resource pool, the Yellow warning box tells us we need to configure a bandwidth reservation for virtual machine system traffic first.

issue1

So my first though was, Ok, I can do this, but why? I prefer using Shares as opposed to Limits or reservations because I want traffic to be able to burst when required and for no bandwidth to be wasted if certain traffic types are not using it.

In any case, I followed the link in the warning and went to set a minimal reservation of 10Mbit/s for Virtual machine traffic as shown below.

Pix3

When pressing “Ok” I was greeted with the below error saying the “Resource settings are invalid”. As shown below I also tried higher reservations without success.

Pix2

I spoke to a colleague and had them try the same in a different environment and they also experienced the same issue.

I have currently got a call open with VMware Support. They have acknowledge this is an issue and is being investigated. I will post updates as I hear from them so stay tuned.