Nutanix Data Protection Capabilities

There is a lot of misinformation being spread in the HCI space about Nutanix data protection capabilities. One such example (below) was published recently on InfoStore.

Evaluating Data Protection for Hyperconverged Infrastructure

When I see articles like this, It really makes me wonder about the accuracy of content on these type of website as it seems articles are published without so much as a brief fact check from InfoStore.

None the less, I am writing this post to confirm what Data Protection Capabilities Nutanix provides.

  • Native In-Built Data protection

Prior to my joining Nutanix in mid-2013, Nutanix already provided a Hypervisor agnostic Integrated backup and disaster recovery solution with centralised consumer- grade management through our PRISM GUI which is HTML 5 based.

The built in capabilties are flexible and VM-centric policies to protect virtualized applications with different RPOs and RTOs with or without application consistency.

The solution also supports Local, remote, and cloud-based backups, and synchronous and asynchronous replication-based disaster recovery solutions.

Currently supported cloud targets include AWS and Azure as shown below.

CloudBackup

The below video which shows in real time how to create Application consistent snapshots from the Nutanix PRISM GUI.

Nutanix can also perform One to One, One to Many and Many to One replication of application consistent snapshots to onsite or offsite Nutanix clusters as well as Cloud providers (AWS/Azure), ensuring choice and flexibility for customers.

Nutanix native data protection can also replicate between and recover VMs to clusters of different hypervisors.

  • CommVault Intellisnap Integration

Nutanix also provides integration with Commvault Intellisnap which allows existing Commvault customers to continue leveraging their investment in the market leading data protection product and to take advantage of other features where required.

The below shows how agentless backups of Virtual Machines is supported with Acropolis Hypervisor (AHV). Note: Commvault is also fully supported with Hyper-V and ESXi.

By Commvault directly calling the Nutanix Distributed Storage Fabric (NDSF) it ensures snapshots are taken quickly and efficiently without the dependancy on a hypervisor.

  • Hypervisor specific support such as VMware API Data Protection (VADP)

Nutanix also supports solutions which leverage VADP, allowing customers with existing investment in products such as Veeam & Netbackup to continue with their existing strategy until such time as they want to migrate to Nutanix native data protection or solutions such as Commvault.

  • In-Guest Agents

Nutanix supports the use of In-Guest agents which are typically very inefficient with centralised SAN/NAS storage but due to data locality and NDSF being a truly distributed platform, In-Guest Incremental forever backups perform extremely well on Nutanix as the traditional choke points such as Network, Storage Controllers & RAID packs have been eliminated.

Summary:

As one size does not fit all in the world of I.T, Nutanix provides customers choice to meet a wide range of market segments and requirements with strong native data protection capabilities as well as 3rd party integration.

VADP or Agent Based Backups

In light of ongoing bugs with VMware’s API for Data Protection (VADP), I figured it worth re-visiting the topic of VADP or Agent Based backups.

VADP gives backup products the ability to kick off snapshots and use Changed Block Tracking (CBT) to allow incremental style backups which improve the efficiency of backup solutions by reducing the impact (performance, think storage, network and compute overheads) and duration (backup window).

But the problem is, there has now been several instances of VADP bugs in recent years which has meant incremental backups have lacked integrity due to the changed blocks not being correctly reported.

Here is a list of some of the VADP related issues/bugs:

  1. Backups with Changed Block Tracking can return incorrect changed sectors in ESXi 6.0 (2136854)
  2. Backing up a virtual machine with Changed Block Tracking (CBT) enabled fails after upgrading to or installing VMware ESXi 6.0 (2114076)
  3. Changed Block Tracking (CBT) on virtual machines (1020128)
  4. Enabling or disabling Changed Block Tracking (CBT) on virtual machines(1031873)
  5. Changed Block Tracking is reset after a storage vMotion operation in vSphere 5.x (2048201)
  6. When Changed Block Tracking is enabled in VMware vSphere 5.x, vMotion migration fails with error: The source detected that the destination failed to resume (2086670)
  7. QueryChangedDiskAreas API returns incorrect sectors after extending virtual machine VMDK file with Changed Block Tracking (CBT) enabled (2090639)

From the above (albeit a limited list of VADP related issues) we can see that there are issues related to integrity of VADP CBT as well as operational considerations (limitations) when using CBT, such as not being able to Storage vMotion and having vMotion operations fail.

So while VADP in theory has its advantages, should it be used in production environments?

At this stage I am highlighting the risks associated with using VADP with customers and where required/possible mitigating the issue.

But what about good ol’ agent based backups?

Agent based backups have a bad rap in my opinion mainly because of 3-Tier solutions and the fact backup windows take a long time due to the contention in the storage network, controllers and back end disk.

Now people ask me all the time, how can we do backups on Nutanix? The answer is, you have numerous (very good) options without using VADP (or for non vSphere customers).

Using a product like Commvault, In-Guest Agent’s can be deployed and managed centrally, removing much of the administrative overhead (downside) of agent based backups.

Then by configuring incremental forever backups, Commvault manages the change block tracking (regardless of hypervisor) and can even do source side deduplication and compression before sending the delta’s over the network to the Commvault Media Agent (ie.: The backup server).

Now since all new write I/O is written to Nutanix SSD tier, it is very likely that all changes will still be in the SSD tier when a daily incremental backup is started meaning the delta’s will be quickly read and send over the network. Why is this solving the problems of 3-Tier i discussed earlier, well its thanks to data locality and the fact Nutanix XCP is a highly distributed platform.

Because each Nutanix node has a local storage controller with local SSD, AND critically, Data Locality writes new data to the node where the VM is running, most data (under normal situations) will be read locally (without traversing a NIC/HBA or the storage network). This means there is no impact on other nodes from the backup of VMs on each node.

Due to these factors, the only traffic traversing the IP network to the backup server (Commvault Media Agent in this example), are the delta changes in a compressed and deduplicated format.

So a Commvault Agent Based backup solution on Nutanix XCP, on any hypervisor, avoids the dependancy on hypervisor APIs (which have proven in several cases not to be reliable) and ensures backup windows and the impact of backup jobs is minimal due to intelligent incremental forever style backups running on an intelligent distributed storage fabric.

In-Guest agent based backups may just be making a comeback!

Note: In y experience, Agent based backups typically provide more granularity/flexibility compared to VADP backups, for specifics speak with your preferred backup vendor.

Oh BTW, did I mention Nutanix XCP supports Commvault Intellisnap for storage level snapshots on the Distributed Storage Fabric… again just another option for Nutanix customers wanting to avoid further pain with VADP.