Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 3 – Scalability

Scalability is not just about the number of nodes that can form a cluster or the maximum storage capacity. The more important aspect of scalability is how an environment expands from many perspectives, including Management, Performance, Capacity and Resiliency, and how scaling affects Operational aspects.

Let’s start with scalability of the components required to manage/administer AHV:

Management Scalability

AHV automatically sizes all Management components during deployment of the initial cluster, or when adding node/s to the cluster. This means there is no need to do initial sizing or manual scaling of XCP management components regardless of the initial and final size of the cluster/s.

Where a Resiliency Factor of 3 (N+2) is configured, the Acropolis management components will be automatically scaled to meet the N+2 requirement. Let’s face it, there is no point having N+2 at one layer and not at another because availability, like a chain, is only as good as its weakest link.

Storage Capacity Scaling

The Nutanix Distributed Storage Fabric (DSF) has no maximum storage capacity. Additionally, storage capacity can be scaled separately from compute with “Storage-only” nodes such as the NX-6035C.

Storage-only nodes run AHV (which is interoperable with the other supported hypervisors), allowing customers to scale capacity regardless of hypervisor. Storage-only nodes do not require hypervisor licensing or separate management, and they fully support all one-click upgrades for the Acropolis Base Software and AHV just like compute+storage nodes. As a result, storage-only nodes are effectively invisible, apart from the increased capacity and performance which they deliver.

Nutanix storage-only nodes help eliminate the problems of scaling capacity on traditional storage; for more information see: Scaling problems with traditional shared storage.

One of the scaling problems with traditional storage is that adding shelves of drives does not scale the data services/management layer. This leads to problems such as lower IOPS/GB and a higher impact to workloads in the event of component failures such as storage controllers.
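To make the IOPS/GB point concrete, here is a minimal Python sketch. The IOPS and capacity figures in it are made-up, hypothetical round numbers (none come from this post or any benchmark); it simply shows that adding shelves without controllers dilutes IOPS/GB, while adding whole nodes keeps the ratio constant.

```python
# Hypothetical illustration only: the IOPS and capacity figures below are
# made-up round numbers, not benchmark results for any specific product.

def iops_per_gb(total_iops, total_gb):
    """Return the IOPS available per GB of usable capacity."""
    return total_iops / total_gb

# Traditional dual-controller array: adding disk shelves grows capacity,
# but the controllers (and therefore total IOPS) stay the same.
controller_iops = 100_000
for shelves in (1, 2, 4):
    capacity_gb = shelves * 50_000          # 50 TB per shelf (assumed)
    print(f"Traditional, {shelves} shelf/shelves: "
          f"{iops_per_gb(controller_iops, capacity_gb):.2f} IOPS/GB")

# Scale-out model (e.g. adding nodes, each with its own controller VM):
# every node adds both capacity and a controller's worth of IOPS,
# so the IOPS/GB ratio stays constant as the cluster grows.
node_iops, node_gb = 25_000, 12_500         # per-node figures (assumed)
for nodes in (4, 8, 16):
    print(f"Scale-out, {nodes} nodes: "
          f"{iops_per_gb(nodes * node_iops, nodes * node_gb):.2f} IOPS/GB")
```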

Scaling storage-only nodes is remarkably simple. For example, a customer added 8 x NX-6035C nodes to his vSphere cluster from his laptop on the showroom floor of vForum Australia in October of this year.

https://twitter.com/josh_odgers/status/656999546673741824

As each storage-only node is added to the cluster, a lightweight Nutanix CVM joins the cluster to provide data services, ensuring linear scale-out of management and performance capabilities and avoiding the scaling problems which plague traditional storage.

For more information on Storage only nodes, see: http://t.co/LCrheT1YB1

Compute Scalability

Enabling HA within a cluster requires reserving the equivalent of one or more nodes for HA, which can create unnecessary inefficiencies when the hypervisor limits the maximum cluster size. AHV has no limit on the number of nodes within a cluster, so it helps avoid the unnecessary silos that lead to inefficient use of infrastructure when one or more nodes per cluster must be reserved for HA. AHV nodes are also automatically configured with all required settings when joining an existing cluster: all the administrator needs to provide is basic IP address information, press Expand cluster, and Acropolis takes care of the rest.
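As a rough illustration of how little input is required, the sketch below drives a cluster expansion programmatically. The URL, endpoint path, payload fields and credentials shown are hypothetical placeholders, not the actual Prism API schema; they only mirror the “IP addresses in, everything else automated” workflow described above.

```python
import requests

# Hypothetical sketch only: the URL, endpoint and payload below are
# illustrative placeholders, not the real Prism REST API schema.
PRISM = "https://prism.example.local:9440"
AUTH = ("admin", "password")                 # assumed credentials

new_nodes = [
    # The administrator supplies basic IP information per new node;
    # everything else (CVM sizing, settings, data services) is automated.
    {"hypervisor_ip": "10.0.0.31", "cvm_ip": "10.0.0.41", "ipmi_ip": "10.0.0.51"},
    {"hypervisor_ip": "10.0.0.32", "cvm_ip": "10.0.0.42", "ipmi_ip": "10.0.0.52"},
]

resp = requests.post(
    f"{PRISM}/api/example/cluster/expand",   # placeholder path
    json={"nodes": new_nodes},
    auth=AUTH,
    verify=False,                            # lab-only; use proper certificates in production
)
resp.raise_for_status()
print("Expansion task submitted:", resp.json())
```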

See the below demo showing how to expand a Nutanix cluster:

Analytics Scalability

AHV includes built-in Analytics and, as with the other Acropolis Management components, the Analysis components are sized automatically during initial deployment and scale automatically as nodes are added.

This means there is never a tipping point where an administrator is required to scale or deploy new Analysis instances or components. The analysis functionality and its performance remain linear regardless of scale.

This means AHV eliminates the requirement for separate software instances and database/s to provide analytics.

Resiliency Scalability

As Acropolis uses the Nutanix Distributed Storage Fabric, in the event drive/s or node/s fail, all nodes within the cluster participate in restoring the configured Resiliency Factor (RF) for the impacted data. This occurs regardless of hypervisor; however, AHV also includes fully distributed Management components, so the larger the cluster, the more resilient the management layer becomes.

For example, the loss of a single node in a 4-node cluster has a potential 25% impact on the performance of the management components, whereas in a 32-node cluster a single node failure has a much lower potential impact of only 3.125%. As an AHV environment scales, the impact of a failure decreases, and the ability to self-heal increases in both speed of recovery and the number of subsequent failures which can be tolerated.
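The 25% and 3.125% figures above are simply 1/N for an N-node cluster; the short sketch below (illustrative only) reproduces them and shows how quickly the potential impact shrinks as nodes are added.

```python
def failure_impact(nodes: int, failed: int = 1) -> float:
    """Potential share of management/cluster resources lost when
    `failed` of `nodes` nodes are unavailable (simple 1/N model)."""
    return failed / nodes * 100

for size in (4, 8, 16, 32, 64):
    print(f"{size:>2}-node cluster: single node failure = "
          f"{failure_impact(size):.3f}% potential impact")

# 4-node cluster  -> 25.000%  (matches the example above)
# 32-node cluster ->  3.125%
```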

For information about increasing resiliency of large clusters, see: Improving Resiliency of Large Clusters with EC-X

Performance Scalability

Regardless of hypervisor, as XCP clusters grow, performance improves. The moment new node(s) are added to the cluster, the additional CVM/s start participating in Management and Data Resiliency tasks even when no VMs are running on the new nodes. Adding new nodes allows the Storage Fabric to distribute RF traffic among more Controllers, which enhances Write I/O and resiliency while helping decrease latency.

The advantage that AHV has over the other supported hypervisors is that the performance of the Management components (many of which have been discussed previously) scales dynamically with the cluster. As with Analytics, AHV management components scale out; there is never a tipping point requiring manual scale-out of management or the deployment of new instances of management components or their dependencies.

Importantly, for all components, the XCP distributes data and management functions across all nodes within the cluster. Acropolis does not use “mirrored” components/hardware or objects, which ensures no two nodes or components become a bottleneck or point of failure.

Back to the Index

Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 1 – Introduction

Before I go into the details of why Acropolis Hypervisor (AHV) is the next generation of hypervisor, I wanted to quickly cover what the Xtreme Computing Platform is made up of and clarify the product names which will be discussed in this series.

In the below picture we can see Prism, an HTML 5 based user interface, sitting on top of Acropolis, which provides distributed storage and application mobility across multiple hypervisors and public clouds.

At the bottom we can see the currently supported hardware platforms from Supermicro and Dell (OEM), and recently Nutanix has announced an OEM agreement with Lenovo which expands customer choice further.

Please do not confuse Acropolis with the Acropolis Hypervisor (AHV), as these are two different components: Acropolis is the platform which can run vSphere, Hyper-V and/or the Acropolis Hypervisor, which will be referred to in this series as AHV.
[Figure: Nutanix XCP overview – Prism and Acropolis running on supported hardware platforms]

Before I get into the list of reasons why AHV is the next generation hypervisor, I want to be clear that Nutanix is a hypervisor and cloud agnostic platform designed to give customers flexibility & choice.

The goal of this series is not to convince customers who are happy with their current environment/s to change hypervisors.

The goal is simple: to educate current and prospective customers (as well as the broader market) about some of the advantages and value of AHV, which is one of the hypervisors (Hyper-V, ESXi and AHV) supported on the Nutanix XCP.

Here is my list of reasons why the Nutanix Xtreme Computing Platform based on AHV is the next generation hypervisor/management platform, and why you should consider the Nutanix Xtreme Computing Platform (with the Acropolis Hypervisor, a.k.a. AHV) as the standard platform for your datacenter.

Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor

Part 2 – Simplicity
Part 3 – Scalability
Part 4 – Security
Part 5 – Resiliency
Part 6 – Performance
Part 7 – Agility (Time to Value)
Part 8 – Analytics (Performance & Capacity Management)
Part 9 – Functionality (Coming Soon)
Part 10 – Cost

NOTE:  For a high level summary of this series, please see the accompanying post by Steve Kaplan, VP of Client Strategy at Nutanix (@ROIdude)

RF2 & RF3 Usable Capacity with Erasure Coding (EC-X)

Over the past few weeks, with the release of Acropolis base software version 4.5 (formerly known as NOS) on the horizon, there has been a lot of interest in Erasure Coding (EC-X), which was announced at the Nutanix .NEXT conference in June this year.

The most common questions are how EC-X increases the effective SSD tier capacity and the overall cluster usable capacity. This post aims to answer these questions.

Resiliency Factor 2 (RF2) & Erasure Coding

Resiliency Factor 2 ensures that two copies of all data are written to persistent media prior to the write being acknowledged to the guest operating system. This provides an N+1 level of redundancy, which translates to being able to tolerate a single failure.

RF2 provides a usable capacity of ~50% of RAW.

The below figure shows an example of RF2 where six blocks store three pieces of data in a redundant fashion. In this configuration a single SSD/HDD or node can be lost without impacting data availability.

[Figure: RF2 – six blocks storing three pieces of data, two copies each]

Now let’s take a look at how the same 6 blocks will be utilized with Erasure Coding enabled:

[Figure: RF2 + EC-X – four data blocks plus single parity in the same six blocks]

As we can see, we are now able to store four pieces of data (A, B, C, D) with single parity, ensuring data can be rebuilt in the event of a drive or node failure. As with standard RF2, an RF2 + EC-X configuration can also tolerate the loss of a single SSD/HDD or node without impacting data availability. We also free up space to be used for another EC-X stripe.
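The following minimal sketch spells out the block layouts the two figures describe (purely illustrative; the block labels are those used in the figures):

```python
# Six storage blocks holding logical data A, B, C (and D with EC-X).

rf2_layout = ["A", "A", "B", "B", "C", "C"]               # RF2: two full copies of each piece
rf2_ecx_layout = ["A", "B", "C", "D", "P(ABCD)", "free"]  # EC-X: 4 data + 1 parity, 1 block freed

def usable_fraction(layout):
    """Fraction of consumed blocks holding unique user data."""
    data = [b for b in layout if not b.startswith("P") and b != "free"]
    used = [b for b in layout if b != "free"]
    return len(set(data)) / len(used)

print(f"RF2:        {usable_fraction(rf2_layout):.0%} of consumed blocks are unique data")      # 50%
print(f"RF2 + EC-X: {usable_fraction(rf2_ecx_layout):.0%} of consumed blocks are unique data")  # 80%
```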

As a result, the usable capacity increases from approximately 50% of RAW up to 80% of RAW for clusters of six (6) nodes or larger.

The following table shows the maximum usable capacity for RF2 + EC-X based on cluster size:

Note: Assumes 20TB RAW per node

[Table: maximum usable capacity for RF2 + EC-X by cluster size, assuming 20TB RAW per node]
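Since the table itself is an image, here is a sketch of how such figures can be approximated. It assumes the EC-X stripe for RF2 uses up to 4 data blocks plus 1 parity block, with the stripe no wider than (nodes - 2) data blocks; that stripe-width rule is an assumption inferred from the 80% figure quoted above, not a statement of the exact EC-X sizing logic. The 20TB RAW per node comes from the note.

```python
RAW_PER_NODE_TB = 20   # per the note above

def rf2_ecx_usable(nodes: int) -> float:
    """Approximate usable fraction for RF2 + EC-X.

    Assumption (not official sizing): the stripe holds up to 4 data blocks
    plus 1 parity block, and cannot be wider than (nodes - 2) data blocks.
    """
    data_blocks = min(4, max(1, nodes - 2))
    if data_blocks < 2:               # too small for EC-X; falls back to plain RF2
        return 0.5
    return data_blocks / (data_blocks + 1)

for nodes in (4, 5, 6, 8, 16, 32):
    usable_tb = nodes * RAW_PER_NODE_TB * rf2_ecx_usable(nodes)
    print(f"{nodes:>2} nodes: ~{rf2_ecx_usable(nodes):.0%} usable "
          f"(~{usable_tb:.0f} TB of {nodes * RAW_PER_NODE_TB} TB RAW)")
```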

Resiliency Factor 3 (RF3) & Erasure Coding

Resiliency Factor 3 ensures that three copies of all data are written to persistent media prior to the write being acknowledged to the guest operating system. This provides an N+2 level of redundancy, which translates to being able to tolerate two concurrent SSD/HDD or node failures.

RF3 provides a usable capacity of ~33% of RAW.

The below figure shows an example of RF3 where six blocks store two pieces of data in a redundant fashion. In this configuration the environment can tolerate two concurrent SSD/HDD or node failures without impacting data availability.

[Figure: RF3 – six blocks storing two pieces of data, three copies each]

Now let’s take a look at how the same 6 blocks will be utilized with Erasure Coding enabled:

[Figure: RF3 + EC-X – four data blocks plus dual parity in the same six blocks]

Similar to the RF2 example, we can see we are now able to store more data with the same level of redundancy; in this case, four pieces of data (A, B, C, D) with dual parity to ensure data can be rebuilt in the event of two concurrent drive or node failures. As with standard RF3, RF3 + EC-X provides an N+2 level of availability while delivering higher usable capacity.

The following table shows the usable capacity for RF3 + EC-X based on cluster size:

Note: Assumes 20TB RAW per node

[Table: usable capacity for RF3 + EC-X by cluster size, assuming 20TB RAW per node]
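As above, the underlying arithmetic is simple. The short sketch below compares plain RF3 with an assumed RF3 + EC-X stripe of 4 data blocks plus 2 parity blocks (the layout shown in the figure), again assuming 20TB RAW per node as per the note.

```python
RAW_PER_NODE_TB = 20   # per the note above

rf3_usable = 1 / 3                 # three copies of every piece of data (~33%)
rf3_ecx_usable = 4 / (4 + 2)       # assumed stripe: 4 data + 2 parity (~67%)

for nodes in (6, 8, 16, 32):
    raw = nodes * RAW_PER_NODE_TB
    print(f"{nodes:>2} nodes ({raw} TB RAW): "
          f"RF3 ~{raw * rf3_usable:.0f} TB usable, "
          f"RF3 + EC-X up to ~{raw * rf3_ecx_usable:.0f} TB usable")
```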

EC-X Parity Placement

To further increase the effective capacity of the SSD tier, and therefore support larger working set sizes with all-flash performance, the parity for containers with EC-X enabled is stored on the SATA tier.

The following figure shows a standard RF3 deployment:

[Figure: standard RF3 – all data copies residing in the SSD tier]

As we can see, six blocks of storage contain just two actual pieces of user data, all of which reside in the SSD tier.

With RF3 + EC-X, the same six blocks of storage contain four pieces of user data, increasing the effective capacity of the SSD tier by 100% due to being able to store four pieces of data compared to two with RF3. In addition, the effective SSD capacity is further increased by moving the two parity blocks to SATA, freeing up a further 33% of the SSD tier capacity.

[Figure: RF3 + EC-X with parity blocks placed on the SATA tier]
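The percentages in the paragraph above fall out of simple block counting, as this illustrative snippet shows:

```python
SSD_BLOCKS = 6                      # blocks in the example above

rf3_data_in_ssd = 2                 # RF3: 2 unique pieces, 3 copies each, all on SSD
ecx_data_in_ssd = 4                 # RF3 + EC-X: 4 unique pieces of data
ecx_parity_blocks = 2               # dual parity, moved to the SATA tier

print(f"Effective SSD data capacity gain: "
      f"{(ecx_data_in_ssd / rf3_data_in_ssd - 1):.0%}")          # 100%
print(f"SSD blocks freed by moving parity to SATA: "
      f"{ecx_parity_blocks / SSD_BLOCKS:.0%} of the tier")        # ~33%
```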

I hope that explains how EC-X works and why it’s such an advantage for Nutanix’s current and future customers.

Related Articles:

  1. Nutanix Erasure Coding Deep Dive
  2. Increasing resiliency of large clusters with Erasure Coding
  3. What I/O will EC-X take effect on?
  4. Sizing assumptions for solutions with Erasure Coding (EC-X)