NetApp HCI Versus Nutanix – The Rebuttal

I was made aware of a recent article from Rob Klusman at Netapp titled “Netapp HCI Verses Nutanix” by a Nutanix Technology Champion (NTC), who asked us to respond to the article “cause there’s some b*llsh*t in it”.

** UPDATE **

Netapp have since removed the post; it can now be viewed via Google Cache here:

http://webcache.googleusercontent.com/search?q=cache:https://blog.netapp.com/netapp-hci-vs-nutanix/

I like it when people call it like it is, so here I am responding to the bullshit (article).

The first point I would like to address is the final statement in the article.

NetApp HCI is the first choice, and Nutanix is the second choice. Leading in an economics battle just doesn’t work if performance is lacking.

Rob rightly points out that Nutanix leads the economic battle, so kudos for that, but he follows up by implying Nutanix performance is lacking. Wisely, Rob does not provide any supporting detail which could be discredited, so I will just leave you with these three posts from my Scalability, Resiliency & Performance blog series discussing how Nutanix scales performance for Single VMs, Monster VMs and Physical servers.

Part 3 – Storage Performance for a single Virtual Machine
Part 4 – Storage Performance for Monster VMs with AHV!
Part 5 – Scaling Storage Performance for Physical Machines

Rob goes on to make the claim:

Nutanix wants infrastructure “islands” to spread out the workloads

This is simply incorrect; in fact, Nutanix has been recommending mixed workload deployments for many years. Here is an article I wrote in July 2016 titled “The All-Flash Array (AFA) is Obsolete!” where I conclude with the following summary:

MixedWorkloads2016

I specifically state that mixed workloads, including business critical applications, are supported without creating silos. It’s important to note this statement was made in July 2016, before Netapp had even started shipping (Oct 25th, 2017) their 3-tier architecture product, which they continue to incorrectly refer to as HCI.

Gartner supports my statement that the Netapp product is not HCI and states:

“NetApp HCI competes directly against HCI suppliers, but its solution does not meet Gartner’s functional definition of HCI.”

Mixed workloads are nothing new for Nutanix; not only is mixing workloads supported, I frequently recommend it, as it increases performance and resiliency, as described in detail in my blog series Nutanix | Scalability, Resiliency & Performance.

Now let’s address the “Key Differences” Netapp claim:

User interface. Both products have an intuitive graphical interface that is well integrated into the hypervisor of choice. But what’s not obvious is that simplicity goes well beyond where you click. NetApp HCI has the most extensive API in the market, with integration that allows end users to automate even the most minute features in the NetApp HCI stack.

The philosophy behind the Nutanix GUI (whose intuitiveness Netapp concedes) is that every feature available in the GUI must also be made available via an API. In the PRISM GUI Nutanix provides the “REST API Explorer” (shown below), where users can easily discover the available operations to automate anything they choose.

RestAPIexplorer

NutanixRESTAPI
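To illustrate, here is a minimal sketch (not an official Nutanix example) of driving the same operations the GUI exposes via the REST API. The cluster address and credentials are placeholders, and the v2 endpoint path is the one surfaced by the REST API Explorer; confirm the exact path and response fields for your AOS version there.

```python
# Minimal sketch: list VMs via the Prism v2 REST API (placeholder address/credentials).
import requests
from requests.auth import HTTPBasicAuth

PRISM = "https://prism-cluster.example.com:9440"   # hypothetical cluster address

def list_vms():
    """Return the VMs known to the cluster, as PRISM itself would display them."""
    resp = requests.get(
        f"{PRISM}/PrismGateway/services/rest/v2.0/vms",
        auth=HTTPBasicAuth("admin", "password"),    # use proper credential handling in practice
        verify=False,                               # lab only: self-signed certs are common
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("entities", [])

if __name__ == "__main__":
    for vm in list_vms():
        print(vm.get("name"), vm.get("power_state"))
```

The same pattern applies to any other operation shown in the REST API Explorer, which is the point: anything you can click in PRISM you can automate.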

Next up we have:

Versatile scale. How scaling is accomplished is important. NetApp HCI scales in small infrastructure components (compute, memory, storage) that are all interchangeable. Nutanix requires growth in specific block components, limiting the choices you can make.

When vendors attack Nutanix, I am always surprised they try to attack the scalability capabilities as, if anything, this is one of the strongest areas for Nutanix.

I’ve already referenced my Scalability, Resiliency & Performance blog series where I go into a lot of detail on these topics, but in short, Nutanix can scale:

  1. Storage Only by adding drives or nodes
  2. Compute Only by adding RAM or nodes
  3. Compute + Storage by adding drives and/or nodes

Back in mid-2013 when I joined Nutanix, the claim by Netapp was true, as only one node type (NX-3450) was available, but later that same year the 1000 and 6000 series were released, giving more flexibility, and things have continued to become more flexible over the years.

Today the flexibility (or versatility) in scale for Nutanix solutions is second to none.

Performance. Today, it’s an absolute requirement for HCI to have an all-flash solution. Spinning disks are slightly less expensive, but you’re sacrificing production workloads. NetApp HCI only offers an all-flash solution.

Congratulations Netapp, you do all flash, just like everyone else (but you came to the party years later). There are many use cases for bulk storage capacity, be it all flash or hybrid, and Nutanix provides NVMe+SATA-SSD, all-SATA-SSD and SATA-SSD+SAS/SATA-HDD options to cover all use cases and requirements.

Not only that but Nutanix allows mixing of All Flash and Hybrid nodes to further avoid the creation of silos.

Enterprise ready. This is an important test. One downfall of Nutanix software running on exactly the same CPU cores as your applications is the effect on enterprise readiness. Many of our customers have shifted away from Nutanix once they’ve seen what happens when a Nutanix component fails. It’s easier to move the VM workload off the current Nutanix system (the one that’s failing) than it is to wait for the fix. Nutanix does not run optimally in hardware-degraded situations. NetApp HCI has no such problem; it can run at full workloads, full bandwidth, and full speed while any given component has failed.

It’s a huge claim by Netapp to dispute Nutanix’ enterprise readiness, considering we have many more years of experience shipping HCI product. But hey, Netapp’s article is proving to be without factual basis every step of the way.

The beauty of Nutanix is the ability to self heal after failures (hardware or software) and then tolerate subsequent failures. Nutanix also has the ability to tolerate multiple concurrent failures including up to 8 nodes and 48 physical drives (NVMe/SSD/HDD).

Nutanix can also tolerate one or more failures and FULLY self heal without any hardware being replaced. This is critical as I detailed in my post: Hardware support contracts & why 24×7 4 hour onsite should no longer be required.

For more details on these failure scenarios checkout the Resiliency section of my blog series Nutanix | Scalability, Resiliency & Performance.

Workload performance protection. No one should attempt an advanced HCI deployment without workload performance protection. Only NetApp HCI provides such a guarantee, because this protection is built into the native technology.

 

One critical factor in delivering consistently high performance is data locality. The further data is from the compute layer, the more bottlenecks there are to potentially impact performance. It’s important to Evaluate Nutanix’ original & unique implementation of Data Locality to understand that features such as QoS for Storage IO are critical with scale-up shared storage (a.k.a. SAN/NAS), but when using a highly distributed scale-out architecture, noisy neighbour problems are all but eliminated by the fact that you have more controllers and that those controllers are local to the VMs.

Storage QoS is added complexity and is only required when a product such as a SAN/NAS has no choice but to deal with the IO blender effect, where sequential IO is received as random due to competing workloads. This effect is minimised with the Nutanix Distributed Storage Fabric.

Shared CPU cores. One key technical difference between the Nutanix product and NetApp HCI is the concept of shared CPU cores. Nutanix has processes running in the same cores as your applications, whereas NetApp HCI does not. There is a cost associated with sharing cores when applications like Oracle and VMware are licensed by core count. You actually pay more for those applications when Nutanix runs their processes on your cores. It’s important to do that math.

I’m very happy Rob raised the point regarding VMware’s licensing (part of what I’d call #vTAX), as this is one of the many great reasons to move to Nutanix’ next generation hypervisor, AHV (Acropolis Hypervisor).

In addition, for workloads like Oracle or SQL where licensing is an issue, Nutanix offers two solutions which address these issues:

  1. Compute Only Nodes running AHV
  2. Acropolis Block Services (ABS) to provide the Nutanix Distributed Storage Fabric (ADSF) to physical or virtual servers not running on Nutanix HCI nodes.

But what about the Nutanix Controller VM (CVM) itself? It is assigned vCPUs which share physical CPU cores with other virtual machines.

Sharing physical cores is a bad idea, as virtualisation has taught us over many years. Hold on, wait, no, that’s not it (LOL!). Virtualisation has taught us we can share physical CPU cores very successfully, even for mission critical applications, where it’s done correctly.

Here is a detailed post on the topic titled: Cost vs Reward for the Nutanix Controller VM (CVM)
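To illustrate the “do the math” point, here is a back-of-envelope sketch of the cost side of that cost-vs-reward equation. All inputs are hypothetical assumptions (node core count, typical CVM utilisation), not measurements; the point is simply that the vCPUs assigned to the CVM and the CPU it typically consumes are quite different figures, as discussed further below.

```python
# Back-of-envelope sketch of the CVM "cost" side of the cost vs reward equation.
# All inputs are hypothetical; substitute your own node spec and observed CVM load.

node_physical_cores = 2 * 16      # e.g. a dual-socket node with 16 cores per CPU
cvm_assigned_vcpus  = 8           # default CVM assignment, discussed further below
cvm_typical_usage   = 4           # vCPUs actually consumed is workload dependent

assigned_fraction = cvm_assigned_vcpus / node_physical_cores
typical_fraction  = cvm_typical_usage / node_physical_cores

print(f"CVM assigned: {assigned_fraction:.0%} of physical cores")   # 25% on this example node
print(f"CVM typical:  {typical_fraction:.0%} of physical cores")    # ~13% on this example node
```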

Asset fluidity. An important part of the NetApp scale functionality is asset fluidity – being able to move subcomponents of HCI around to different applications, nodes, sites, and continents and to use them long beyond the 3-year depreciation cycle.

This is possibly the weakest argument in Netapp’s post. Nutanix nodes can be removed non-disruptively from a cluster and added to any other cluster, including mixing all flash and hybrid. Brand new nodes can be mixed with any other generation of nodes; I regularly form large clusters using multiple generations of hardware.

Here is a tweet of mine from 2016 showing a 22 node cluster with four different node types across three generations of hardware (G3 being the original NX-8150, G4 and G5).

Data Fabric. The NetApp Data Fabric simplifies and integrates data management across clouds and on the premises to accelerate digital transformation. To plan an enterprise rollout of HCI, a Data Fabric is required – and Nutanix has no such thing. NetApp delivers a Data Fabric that’s built for the data-driven world.

I had to look up what Netapp mean by “Data Fabric” as it sounded to me like a nonsense marketing term, and surprise, surprise, I was right. Here is how Netapp describe “Data Fabric”:

Data Fabric is an architecture and set of data services that provide consistent capabilities across a choice of endpoints spanning on-premises and multiple cloud environments.

It’s a fluffy marketing phrase, but the same could easily be argued about the Nutanix Distributed Storage Fabric (ADSF). ADSF is hypervisor agnostic, which straight away delivers a multi-platform solution (cloud or on-premises), including AWS and Azure (below).

CloudSite

Nutanix can replicate and protect data including virtual machines across different hardware, clusters, hypervisors and clouds.

So the claim “Nutanix does not have a Data Fabric” is pretty laughable based on Netapp’s own description of “Data Fabric”.

Now the final point:

Choosing the Right Infrastructure for Your Enterprise

I’ve written about Things to consider when choosing infrastructure and my conclusion was:

ThingtoconsiderSummary

Nutanix has for many years provided a platform which can be your standard for all workloads, and the niche workloads that cannot genuinely be supported are now rare thanks to all the enhancements we’ve made over the years.

The best thing about Nutanix is that, with our world class enterprise architect enablement and Nutanix Platform Expert (NPX) certification programmes, we ensure our field SEs, Architects and certified individuals who design and implement solutions for customers every day know exactly when to say “No”.

This culture of customer success first, sales last, comes from our former President Sudheesh Nair, who wrote this excellent article during his time at Nutanix:

Quite possibly the most powerful 2-letter word in Sales – No

After addressing all the points raised by Netapp, it’s easy to see that Nutanix has a very complete solution thanks to years of development and experience with enterprise customers and their mission critical applications.

Have you read any other “b*llsh*t” you’d like Nutanix to respond to? If so, don’t hesitate to reach out.

VMware you’re full of it (FUD): Nutanix CVM/AHV & vSphere/VSAN overheads

For a long time now, VMware & EMC have been leading the charge, along with other vendors, in spreading FUD regarding the Nutanix Controller VM (CVM), making claims that it uses a lot of resources to drive storage I/O and that being a Virtual Machine (a.k.a. Virtual Storage Appliance / VSA) it is inefficient / slower than running in-kernel.

A recent example of the FUD comes from an article written by the President of VCE himself, Mr Chad Sakac, who wrote:

… it (VxRAIL) is the also the ONLY HCIA that has a fully integrated SDS stack that is embedded into the kernel – specifically VSAN because VxRail uses vSphere.  No crazy 8vCPU, 16+ GB of RAM for the storage stack (per “storage controller” or even per node in some cases with other HCIA choices!) needed.

So I thought I would put together a post covering what the Nutanix CVM provides and giving a comparison to what Chad referred to as a fully integrated SDS stack.

Let’s compare what resources are required between the Nutanix suite, which is made up of the Acropolis Distributed Storage Fabric (ADSF) & Acropolis Hypervisor (AHV), and VMware’s suite, made up of vCenter, ESXi, VSAN and associated components.

This should assist those not familiar with the Nutanix platform to understand the capabilities and value the CVM provides, and correct the FUD being spread by some competitors.

Before we begin, let’s address the default size for the Nutanix CVM.

As it stands today, the CVM is assigned 8 vCPUs and 16GB RAM by default.

What CPU resources the CVM actually uses obviously depends on the customer’s use case/s, so if the I/O requirements are low, the CVM won’t use 8 vCPUs, or even 4 vCPUs, but it is assigned 8 vCPUs.

With the improvement in ESXi CPU scheduling over the years, the impact of having more than the required vCPUs assigned to a limited number of VMs (such as the CVM) in an environment is typically negligible, but the CVM can be right-sized, which is also common.

The recommended RAM allocation is 24GB when using deduplication, and for workloads which are very read intensive, the RAM can be increased to provide more read cache.

However, increasing the CVM RAM for read cache (Extent Cache) is more of a legacy recommendation as the Acropolis Operating System (AOS) 4.6 release achieves outstanding performance even with the read cache disabled.

In fact, the >150K 4K random read IOPS per node which AOS 4.6 achieves on NX-9040-G4 nodes was done without the use of the in-memory read cache, as part of engineering testing to see how hard the SSD drives could be pushed. As a result, even for extreme levels of performance, increasing the CVM RAM for read cache is no longer a requirement. As such, 24GB RAM will be more than sufficient for the vast majority of workloads, and reducing RAM levels is also on the cards.

Thought: Even if it were true that in-kernel solutions provided faster outright storage performance (which is not the case, as I showed here), this is only one small part of the equation. What about management? VSAN management is done via the vSphere Web Client, which runs in a VM in user space (i.e.: not “in-kernel”) and connects to vCenter, which also runs as a VM in user space and commonly leverages an SQL/Oracle database which also runs in user space.

Now think about replication: VSAN uses vSphere Replication which, you guessed it, runs in a VM in user space. For capacity/performance management, VSAN leverages vRealize Operations Manager (vROM), which also runs in user space. What about backup? The vSphere Data Protection appliance is yet another service which runs in a VM in user space.

All of these products require the data to move from kernel space into user space, so for almost every function apart from basic VM I/O, VSAN is dependent on components which are running in user space (i.e.: not in-kernel).

Let’s take a look at the requirements for VSAN itself.

According to the VSAN design and sizing guide (page 56), VSAN uses up to 10% of the host’s CPU and requires 32GB RAM for full VSAN functionality. Now, the RAM requirement doesn’t mean VSAN is using all 32GB, and the same is true for the Nutanix CVM: if it doesn’t need/use all the assigned RAM, it can be downsized, although 12GB is the recommended minimum and 16GB is typical. For a node with even 192GB RAM, which is small by today’s standards, 16GB is <10%, which is a minimal overhead for either VSAN or the Nutanix CVM.

In my testing, VSAN is not limited to 10% CPU usage, and this can be confirmed in VMware’s own official testing of SQL in: VMware Virtual SAN™ Performance with Microsoft SQL Server

In short, the performance testing was conducted with 3 VMs, each with 4 vCPUs, on hosts containing dual-socket Intel Xeon E5-2650 v2 processors (16 cores, 32 threads in total, @2.6GHz).

So assuming the VMs were at 100% utilisation, they would only be using 75% of the total cores (12 of 16). As we can see from the graph below, the hosts were almost 100% utilised, so something other than the VMs is using the CPU. Best case, VSAN is using ~20% CPU with the hypervisor using 5%; in reality the VMs won’t be pegged at 100%, so the overhead of VSAN will be higher than 20%.

VSAN10PercentBS
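For those who want the arithmetic behind that estimate, here is the back-of-envelope calculation. The ~5% hypervisor overhead is my assumption; the rest comes straight from the paper’s configuration and graphs.

```python
# Working the same numbers as the SQL-on-VSAN paper (best case for VSAN):
total_cores        = 16            # dual-socket E5-2650 v2, 8 cores per socket
vm_vcpus           = 3 * 4         # three SQL VMs with 4 vCPUs each
observed_host_util = 1.00          # hosts shown at ~100% in the paper's graphs
hypervisor_util    = 0.05          # assumed ESXi overhead

vm_max_fraction = vm_vcpus / total_cores            # 0.75, assumes the VMs are pegged at 100%
vsan_fraction   = observed_host_util - vm_max_fraction - hypervisor_util

print(f"VM ceiling:   {vm_max_fraction:.0%}")
print(f"Implied VSAN: {vsan_fraction:.0%} (higher if the VMs are below 100%)")
```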

Now, I understand I/O requires CPU, and I don’t have a problem with VSAN using 20% or even more CPU. What I have a problem with is VMware lying to customers that it only uses 10% AND spreading FUD that other vendors’ virtual appliances, such as the Nutanix CVM, are resource hogs.

Don’t take my word for it: do your own testing and read their documents, like the one above, where simple maths shows the claim of 10% max is a myth.

So that’s roughly 4 vCPUs (on a typical dual-socket 8-core system) and up to 32GB RAM required for VSAN, but let’s assume just 16GB RAM on average, as not all systems are scaled to 5 disk groups.

The above testing was not on the latest VSAN 6.2, so things may have changed. One such change is the introduction of software checksums into VSAN. This actually reduces performance (as you would expect) because it provides a layer of data integrity with every I/O. As such, the above performance is still a fair comparison, because Nutanix has always had software checksums, as these are essential for any production-ready storage solution.

Now keep in mind, VSAN is really only providing the storage stack, so it’s using ~20% CPU under heavy load for just the storage stack, unlike the Nutanix CVM, which also provides a highly available management layer with comparable (and in many cases better) functionality/availability/scalability to vCenter, VUM, vROM, vSphere Replication, vSphere Data Protection, vSphere Web Client, Platform Services Controller (PSC) and the supporting database platform (e.g.: SQL/Oracle/Postgres).

So comparing VSAN CPU utilization to a Nutanix CVM is about as far from apples/apples as you could get, so let’s look at what all the vSphere management components’ resource requirements are and make a fairer comparison.

vCenter Server

Resource Requirements:

Small | Medium | Large

  • 4vCPUs | 8vCPUs | 16vCPUs
  • 16GB | 24GB | 32GB RAM

Reference: http://pubs.vmware.com/vsphere-60/index.jsp#com.vmware.vsphere.install.doc/GUID-D2121DC5-1FC8-48DC-A4BA-C3FD72D0BE77.html

Platform Services Controller

Resource Requirements:

  • 2vCPUs
  • 2GB RAM

Reference: http://pubs.vmware.com/vsphere-60/index.jsp#com.vmware.vsphere.install.doc/GUID-D2121DC5-1FC8-48DC-A4BA-C3FD72D0BE77.html

vCenter Heartbeat (Deprecated)

If we were to compare apples to apples, vCenter would need to be fully distributed and highly available, which it’s not. The now-deprecated vCenter Heartbeat used to be able to somewhat provide this, so that’s 2x the resources of vCenter, VUM etc., but since it’s deprecated we’ll give VMware the benefit of the doubt and not count the resources required to make their management components highly available.

What about vCenter Linked Mode? 

I couldn’t find its resource requirements in the documentation, so let’s give VMware the benefit of the doubt and say it doesn’t add any overheads. But regardless of overheads, it’s another product to install, validate and maintain.

vSphere Web Client

The Web Client is required for full VSAN management/functionality and has its own resource requirements:

  • 4vCPUs
  • 2GB RAM (at least)

Reference: https://pubs.vmware.com/vsphere-50/index.jsp#com.vmware.vsphere.install.doc_50/GUID-67C4D2A0-10F7-4158-A249-D1B7D7B3BC99.html

vSphere Update Manager (VUM)

VUM can be installed on the vCenter server (if you are using the Windows installation) to save having another management VM and OS to manage; if you are using the Virtual Appliance then a separate Windows instance is required.

Resource Requirements:

  • 2vCPUs
  • 2GB

The Nutanix CVM provides the ability to do Major and Minor patch updates for ESXi and of course for AHV.

vRealize Operations Manager (vROM)

Nutanix provides built-in analytics in PRISM Element, similar to what vROM provides, along with centrally managed capacity planning/management and “what if” scenarios for adding nodes to the cluster. As such, including vROM in the comparison is essential if we want to get close to apples/apples.

Resource Requirements:

Small | Medium | Large

  • 4vCPUs | 8vCPUs | 16vCPUs
  • 16GB | 32GB | 48GB
  • 14GB Storage

Remote Collectors Standard | Large

  • 2vCPUs | 4vCPUs
  • 4GB | 16GB

Reference: https://pubs.vmware.com/vrealizeoperationsmanager-62/index.jsp#com.vmware.vcom.core.doc/GUID-071E3259-625A-437B-AB34-E6A58B87C65B.html

vSphere Data Protection

Nutanix also has built-in backup/recovery/snapshotting capabilities which include application consistency via VSS. As with vROM, we need to include vSphere Data Protection in any comparison to the Nutanix CVM.

vSphere Data Protection can be deployed in sizes from 0.5TB to 8TB, as shown below:

VDPrequirements

Reference: http://pubs.vmware.com/vsphere-60/topic/com.vmware.ICbase/PDF/vmware-data-protection-administration-guide-61.pdf

The minimum size is 4 vCPUs and 4GB RAM, but that only supports 0.5TB. For even an average-sized node, which supports say 4TB, 4 vCPUs and 8GB are required.

So, best case scenario, we need to deploy one VDP appliance per 8TB, which is smaller than some Nutanix (or VSAN Ready) nodes (e.g.: NX6035 / NX8035 / NX8150), so that could potentially mean one VDP appliance per node when running VSAN, since the backup capabilities are not built in like they are with Nutanix.

Now, what if I want to replicate my VMs or use Site Recovery Manager (SRM)?

vSphere Replication

As with vROM and vSphere Data Protection, vSphere Replication provides functionality for VSAN which Nutanix has built into the CVM. So we also need to include vSphere Replication resources in any comparison to the CVM.

While vSphere Replication has fairly light resource requirements, if all my replication needs to go via the appliance, it means one VSAN node will be a hotspot for storage and network traffic, potentially saturating the network/node and being a noisy neighbour to any virtual machines on the node.

Resource Requirements:

  • 2vCPUs
  • 4GB RAM
  • 14GB Storage

Limitations:

  • 1 vSphere replication appliance per vCenter
  • Limited to 2000 VMs

Reference: http://pubs.vmware.com/vsphere-replication-61/index.jsp?topic=%2Fcom.vmware.vsphere.replication-admin.doc%2FGUID-E114BAB8-F423-45D4-B029-91A5D551AC47.html

So scaling beyond 2000 VMs requires another vCenter, which means another VUM, another Heartbeat VM (if it were still available), and potentially more databases on SQL or Oracle.

Nutanix doesn’t have this limitation, but again we’ll give VMware the benefit of the doubt for this comparison.

Supporting Databases

The size of even a small SQL server is typically at least 2 vCPUs and 8GB+ RAM, and if you want to compare apples/apples with Nutanix AHV/CVM, you need to make the supporting database server/s highly available.

So even in a small environment we would be talking 2 VMs @ 2 vCPUs and 8GB+ RAM each just to support the back-end database requirements for vCenter, VUM, SRM etc.

As the environment grows so does the vCPU/vRAM and Storage (Capacity/IOPS) requirements, so keep this in mind.

So what are the approx. VSAN overheads for a small 4 node cluster?

The table below shows the minimum vCPU/vRAM requirements for the various components I have discussed previously to get VSAN comparable (not equivalent) functionality to what the Nutanix CVM provides.

VSANoverheads3

As the above only covers the minimum requirements for a small (say 4 node) environment, keep in mind that things like vSphere Data Protection will require multiple instances, SQL should be made highly available using an AlwaysOn Availability Group (AAG), which requires a 2nd SQL server, and as the environment grows, so do the vCPU/vRAM requirements for vCenter, vRealize Operations Manager and SQL.

A Nutanix AHV environment on the other hand looks like this:

NutanixOverheads2

So that’s just 32 vCPUs and 64GB RAM for a 4 node cluster, which is 8 vCPUs and 54GB RAM LESS than the comparable vSphere/VSAN 4 node solution.
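For clarity, here is the simple arithmetic behind those numbers. The Nutanix side is just four default-sized CVMs; the vSphere/VSAN totals are restated from the comparison table above (via the quoted delta), not re-derived component by component.

```python
# Rough tally behind the 4-node comparison above.
nodes = 4
cvm_vcpus, cvm_ram = 8, 16                      # default CVM assignment per node

nutanix_vcpus = nodes * cvm_vcpus               # 32 vCPUs
nutanix_ram   = nodes * cvm_ram                 # 64 GB

vsphere_vsan_vcpus, vsphere_vsan_ram = 40, 118  # per the comparison table / quoted delta above

print(f"Nutanix AHV:  {nutanix_vcpus} vCPUs / {nutanix_ram} GB RAM")
print(f"vSphere/VSAN: {vsphere_vsan_vcpus} vCPUs / {vsphere_vsan_ram} GB RAM")
print(f"Delta:        {vsphere_vsan_vcpus - nutanix_vcpus} vCPUs / "
      f"{vsphere_vsan_ram - nutanix_ram} GB RAM in favour of Nutanix")
```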

If we add Nutanix Scale-out File Server functionality into the mix (which is optionally enabled), this increases to 48 vCPUs and 100GB RAM. That’s just 8 vCPUs more and still 18GB RAM LESS than vSphere/VSAN, while Nutanix provides MORE functionality (e.g.: Scale-out File Services) and comes out of the box with a fully distributed, highly available, self-healing, FULLY INTEGRATED management stack.

The Nutanix vCPU count assumes all vCPUs are in use, which is VERY rarely the case, so this comparison is well and truly in favour of VSAN. Even so, it still shows vSphere/VSAN having higher overheads for a typical/comparable solution, with Nutanix providing additional built-in features such as the Scale-out File Server (another distributed and highly available solution) for only a small amount more resources than vSphere/VSAN, which does not provide comparable native file serving functionality.

What if you don’t use all those vSphere/VSAN features and therefore don’t deploy all those management VMs? VSAN overheads are lower, right?

It is a fair argument to say not all vSphere/VSAN features need to be deployed, so this will reduce the vSphere/VSAN requirements (or overheads).

The same however is true for the Nutanix Controller VM.

It’s not uncommon, where customers don’t run all features and/or have lower I/O requirements, for the CVM to be downsized to 6 vCPUs. I personally did this earlier this week for a customer running SQL/Exchange, and the CVM is still only running at ~75%, or approx ~4 vCPUs, and that’s running vBCA with in-line compression.

So the overheads depend on the workloads, and the default sizes can be changed for both vSphere/VSAN components and the Nutanix CVM.

Now back to the whole In-Kernel nonsense.

VMware also like to spread FUD that their own hypervisor has such high overheads that it’s crazy to run any storage through it. I’ve always found this funny, since VMware have been telling the market for years that the hypervisor has a low overhead (which it does), but they change their tune like the weather to suit their latest slideware.

One such example of this FUD comes from VMware’s Chief Technologist, Duncan Epping who tweeted:

DuncanInKernelRubbish

The tweet is trying to imply that going through the hypervisor to another Virtual Machine (in this case a Nutanix CVM) is inefficient, which is interesting for a few reasons:

  1. If going from one VM to another via the kernel has such high overheads, why do VMware themselves recommend virtualizing business critical, high I/O applications which access data between VMs (and ESXi hosts) all the time? e.g.: when a Web Server VM accesses an Application Server VM, which in turn accesses data from a Database VM. All of this traffic goes from one VM, through the kernel, and into another VM.
  2. Because VSAN has to do exactly this to leverage many of the features it advertises, such as:
  • Replication (via vSphere Replication)
  • vRealize Operations Manager (vROM)
  • vSphere Data Protection (vDP)
  • vCenter and supporting components

Another example of FUD from VMware, in this case from Principal Engineer Jad El-Zein, implies VSAN has low(er) overheads compared to Nutanix (Blocks = Nutanix “Blocks”):

screenshot_2015-03-10-10-05-44-1.png

I guess he forgot about the large number of VMs (and resources) required to provide VSAN functionality and basic vSphere management. Any advantage of being in-kernel (assuming you still believe it is in fact an advantage) is well and truly eliminated by the constant traffic across the hypervisor to and from the management VMs, all of which are not in-kernel, as shown below.

vSphereManagement

I’d say it’s #AHVisTheOnlyWay and #GoNutanix, since the overheads of AHV are lower than vSphere/VSAN!

Summary:

  1. The Nutanix CVM provides a fully integrated, preconfigured and highly available, self-healing management stack. vSphere/VSAN requires numerous appliances and/or software to be installed.
  2. The Nutanix AHV Management stack (provided by the CVM) using just 8vCPUs and typically 16GB RAM provides functionality which in many cases exceeds the capabilities of vSphere/VSAN which requires vastly more resources and VMs/Appliances to provide comparable (but in many cases not equivalent) functionality.
  3. The Nutanix CVM provides these capabilities built in (with the exception of PRISM Central, which is a separate Virtual Appliance) rather than being dependent on multiple virtual appliances, VMs and/or 3rd party database products for various functionality.
  4. The Nutanix management stack is also more resilient/highly available than competing products such as the VMware management components, and comes this way out of the box. As the cluster scales, the Acropolis management stack continues to automatically scale management capabilities to ensure linear scalability and consistent performance.
  5. Next time VMware/EMC try to spread FUD about the Nutanix Controller VM (CVM) being a resource hog or similar, ask them what resources are required for all functionality they are referring to. They probably haven’t even considered all the points we have discussed in this post so get them to review the above as a learning experience.
  6. Nutanix/AHV management is fully distributed and highly available. Ask VMware how to make all the vSphere/VSAN management components highly available and what the professional services costs will be to design/install/validate/maintain that solution.
  7. The next conversation to have would be “How much does VSAN cost compared to Nutanix?”, now that we understand all the resource overheads and the complexity in designing/implementing/validating the VSAN/vSphere environment, not to mention that most management components will not be highly available beyond vSphere HA. But cost is a topic for another post, as the ELA / licensing costs are the least of your worries.

To our friends at VMware/EMC, the Nutanix CVM says,

“Go ahead, underestimate me”.

 

Nutanix Data Protection Capabilities

There is a lot of misinformation being spread in the HCI space about Nutanix data protection capabilities. One such example (below) was published recently on InfoStore.

Evaluating Data Protection for Hyperconverged Infrastructure

When I see articles like this, it really makes me wonder about the accuracy of content on these types of websites, as it seems articles are published without so much as a brief fact check from InfoStore.

Nonetheless, I am writing this post to confirm what data protection capabilities Nutanix provides.

  • Native In-Built Data protection

Prior to my joining Nutanix in mid-2013, Nutanix already provided a hypervisor-agnostic integrated backup and disaster recovery solution with centralised, consumer-grade management through our PRISM GUI, which is HTML 5 based.

The built-in capabilities provide flexible, VM-centric policies to protect virtualised applications with different RPOs and RTOs, with or without application consistency.

The solution also supports local, remote, and cloud-based backups, as well as synchronous and asynchronous replication-based disaster recovery solutions.

Currently supported cloud targets include AWS and Azure as shown below.

CloudBackup

The below video shows in real time how to create application-consistent snapshots from the Nutanix PRISM GUI.

Nutanix can also perform One to One, One to Many and Many to One replication of application consistent snapshots to onsite or offsite Nutanix clusters as well as Cloud providers (AWS/Azure), ensuring choice and flexibility for customers.

Nutanix native data protection can also replicate between and recover VMs to clusters of different hypervisors.
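As a simple illustration of how this is consumable programmatically, here is a minimal sketch listing protection domains and the VMs they protect via the Prism REST API. The endpoint path and field names are assumptions based on the v2 API; confirm them in the REST API Explorer mentioned earlier in this post, and treat the cluster address and credentials as placeholders.

```python
# Minimal sketch (assumed v2 endpoint/fields; verify in the REST API Explorer).
import requests
from requests.auth import HTTPBasicAuth

PRISM = "https://prism-cluster.example.com:9440"    # hypothetical cluster address
AUTH  = HTTPBasicAuth("admin", "password")          # use proper credential handling in practice

def protection_domains():
    """List protection domains configured on the cluster."""
    resp = requests.get(
        f"{PRISM}/PrismGateway/services/rest/v2.0/protection_domains",
        auth=AUTH, verify=False, timeout=30,        # verify=False for lab/self-signed certs only
    )
    resp.raise_for_status()
    return resp.json().get("entities", [])

for pd in protection_domains():
    # Field names ("name", "vms", "vm_name") are assumptions; check the explorer output.
    vms = [vm.get("vm_name") for vm in pd.get("vms", [])]
    print(pd.get("name"), "->", vms)
```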

  • CommVault Intellisnap Integration

Nutanix also provides integration with Commvault Intellisnap which allows existing Commvault customers to continue leveraging their investment in the market leading data protection product and to take advantage of other features where required.

The below shows how agentless backups of Virtual Machines are supported with Acropolis Hypervisor (AHV). Note: Commvault is also fully supported with Hyper-V and ESXi.

By calling the Nutanix Distributed Storage Fabric (NDSF) directly, Commvault ensures snapshots are taken quickly and efficiently without a dependency on the hypervisor.

  • Hypervisor specific support such as VMware vStorage APIs for Data Protection (VADP)

Nutanix also supports solutions which leverage VADP, allowing customers with existing investment in products such as Veeam & Netbackup to continue with their existing strategy until such time as they want to migrate to Nutanix native data protection or solutions such as Commvault.

  • In-Guest Agents

Nutanix supports the use of in-guest agents, which are typically very inefficient with centralised SAN/NAS storage, but due to data locality and NDSF being a truly distributed platform, in-guest incremental-forever backups perform extremely well on Nutanix, as the traditional choke points such as the network, storage controllers & RAID packs have been eliminated.

Summary:

As one size does not fit all in the world of I.T, Nutanix provides customers choice to meet a wide range of market segments and requirements with strong native data protection capabilities as well as 3rd party integration.