What’s .NEXT 2016 – Metro Availability Witness

In 2014, Nutanix introduced Metro Availability, which allows Virtual Machines to move between sites and provides failover in the event of a site failure.

The goal of the Metro Availability (MA) Witness is to automate failovers in the case of inter-site network failures or site failures. By virtue of running in a different location than the two Metro sites, the Witness provides the ‘outside’ view needed to determine whether a site is actually down or only the network connection between the two sites has failed, avoiding the split-brain scenario that can occur without an external Witness.

The main functions of a Witness include:

  • Making failover decisions in the event of a site or inter-site network failure
  • Avoiding split-brain, where the same container is active on both sites
  • Handling situations where a single storage or network domain fails

For example, in the case of a Metro Availability (MA) relationship between clusters, a Witness residing in a separate failure domain (e.g. a 3rd site) decides which site should be activated when a split-brain occurs due to a WAN failure, or when a site goes down. For that reason, it is a requirement that there are independent network connections for inter-site connectivity *and* for connections to the Witness.

How Metro works without a Witness:

In the event of a failure of the primary site (the site where the Metro container is currently active), or of the links between the sites going offline, the Nutanix administrator is required to manually Disable Metro Availability and Promote the Container to Active on the site where the VMs should run. This is a quick and simple process, but it is not automated, which may impact the Recovery Time Objective (RTO).

In the case of a communication failure with the secondary site (either because the site goes down or because the network link between the sites goes down), the Nutanix administrator can configure the system in one of two ways (a conceptual sketch follows this list):

  • Automatic: the system automatically disables Metro Availability on the container at the primary site after a short pause, if the connection to the secondary site does not recover within that time
  • Manual: the system waits for the administrator to take action
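
To make the difference concrete, the following minimal Python sketch (not Nutanix code) shows the idea: in Manual mode the system simply waits for the administrator, while in Automatic mode it waits a short pause and then disables Metro Availability on the primary container. The timeout value, the container object and the connectivity check are illustrative assumptions.

    import time

    BREAK_TIMEOUT_SECONDS = 30  # hypothetical pause before auto-disabling Metro

    def handle_secondary_unreachable(container, mode, is_secondary_reachable):
        """Decide what to do when the primary loses contact with the secondary site."""
        if mode == "Manual":
            # Writes stay paused until the administrator disables Metro manually.
            return "waiting_for_admin"

        # Automatic mode: give the link a short window to recover.
        deadline = time.time() + BREAK_TIMEOUT_SECONDS
        while time.time() < deadline:
            if is_secondary_reachable():
                return "replication_resumed"
            time.sleep(1)

        # The secondary never came back: disable Metro on this container so
        # local writes can continue without synchronous replication.
        container.disable_metro_availability()
        return "metro_disabled_on_primary"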

[Figure: Metro Availability without a Witness]

How Metro Availability (MA) works with the witness:

With the new Witness capability, the process of disabling Metro Availability and Promoting the Container in the case of a site outage or a network failure is fully automated, which ensures the fastest possible RTO. The Witness is only consulted in the case of a failure, meaning a failure of the Witness itself will not affect VMs running on either site.
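
Conceptually, the Witness acts as an arbiter in a third failure domain: whichever site can still reach it acquires a lock for the Metro relationship and promotes its container, while the other site stays passive. The Python sketch below illustrates that arbitration idea only; it is not the Nutanix implementation, and all class and method names are assumptions.

    class Witness:
        """Arbiter running at a 3rd site with independent links to Site 1 and Site 2."""

        def __init__(self):
            self._lock_holder = None

        def try_acquire_lock(self, site_name):
            # Only one site can hold the lock for a given Metro relationship.
            if self._lock_holder in (None, site_name):
                self._lock_holder = site_name
                return True
            return False

    def on_peer_unreachable(witness, local_site, container):
        """Called by a site when it can no longer reach its Metro peer."""
        if witness.try_acquire_lock(local_site):
            # We won arbitration: the peer cannot also promote, so it is safe
            # to break replication and activate the container here.
            container.disable_metro_availability()
            container.promote_to_active()
        else:
            # The other site holds the lock (or the Witness is unreachable too):
            # keep I/O paused rather than risk a split-brain.
            container.pause_io()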

[Figure: Metro Availability with the Witness]

Failure Scenarios Addressed by the MA Witness:

There are a number of scenarios which can occur, and Metro Availability responds differently depending on whether MA is configured in “Witness mode”, “Automatic Resume mode” or “Manual mode”.

The following table details the scenarios and the behaviour based on the configuration of Metro Availability.

[Table: Metro Availability failure scenarios and behaviour by configuration]

In all cases except a failure at both Site 1 and Site 2, the MA Witness automatically handles the situation and ensures the fastest possible RTO.

The following videos show how each of the above scenarios functions.

Deployment of the Metro Availability (MA) Witness:

The Witness capability is deployed as a stand-alone Virtual Machine which can be imported onto any hypervisor in a separate failure domain, typically a 3rd site. This VM can run on non-Nutanix hardware. The 3rd site is expected to have dedicated network connections to Site 1 and Site 2 to avoid a single point of failure.

As a result, MA Witness is quick and easy to deploy, resulting in lower complexity and risk compared to other solutions on the market.

Summary:

The Nutanix Metro Witness completes the Nutanix Metro Availability functionality by providing completely automatic failover in case of site or networking failures.


What’s .NEXT 2016 – Acropolis Block Services (ABS)

Acropolis Block Services, or ABS (not to be confused with Anti-lock Braking Systems), is an extension of the In-Guest iSCSI support Nutanix announced at .NEXT 2015.

The original goal of In-Guest iSCSI was to enable support for applications like MS Exchange, which is not supported on NFS, and for use cases such as SQL clustering quorum drives, and this has been very successful. However, customers have been telling us for a number of years that they want to make Nutanix the standard platform for their datacenters, yet they have not been able to realise this vision for a number of reasons, including:

  • The desire/requirement to re-use existing servers
  • Applications which are not virtual (for many reasons, mostly political)
  • Performance / Scalability of externally connected servers
  • Complexity including operational considerations of external iSCSI

Let’s discuss each of these topics and how ABS solves these challenges.

Re-using existing servers

As it is uncommon for customers to be at exactly the right point in the refresh cycle for both servers and storage to replace all infrastructure at once, ABS allows customers either to get started with Nutanix by deploying a few nodes/blocks, or to scale their existing environment/s, while using the Acropolis Distributed Storage Fabric (ADSF) to provide storage to both HCI and non-HCI workloads.

A couple of key advantages of ABS compared to the existing In-Guest iSCSI support and traditional SAN/NAS are:

  • ABS load balances and optimizes paths so MPIO and ALUA are not needed
  • New storage is automatically added without requiring client-side changes

The downside to using ABS as a stopgap until the compute hardware reaches its refresh cycle is that it does add complexity, which I discuss in this article from July 2015:

Scaling Hyper-converged solutions – Compute only

However, if the goal is to maximise the return on investment (ROI) of existing infrastructure, ABS is in my opinion a better option than installing, configuring and managing another silo of storage, as it:

  • Load balances and optimizes paths, so MPIO and ALUA are not needed
  • Automatically adds new storage without requiring client-side changes
  • Removes the requirement for another silo
  • Increases the performance/capacity/resiliency of the existing cluster
  • Allows customers to standardize their infrastructure
  • Gives customers the flexibility to quickly add/remove nodes from a cluster/s to meet requirements

Scalability:

ABS provides linear, automated scalability by creating virtual targets, so performance is not limited by the iSCSI restriction of one session per initiator and target pair. This means a single LUN (or Volume Group in Nutanix speak) can be serviced by multiple virtual targets spread across all Nutanix CVMs, which spreads the load over multiple network threads and mitigates the risk of any one of them becoming a bottleneck.

By default, 32 virtual targets are used to ensure optimal performance for even the largest and most I/O intensive workloads.

This process is also transparent to the administrator and application to avoid any complexity in implementation and ongoing support.
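
The following short sketch illustrates how virtual targets can be spread across CVMs. It is a conceptual Python illustration only, not Nutanix's actual algorithm: the round-robin mapping, the function name and the example IP addresses are assumptions, while the default of 32 virtual targets per Volume Group is taken from the description above.

    NUM_VIRTUAL_TARGETS = 32  # default number of virtual targets per Volume Group

    def distribute_virtual_targets(volume_group, cvm_ips):
        """Map each virtual target of a Volume Group to a CVM, round-robin."""
        mapping = {}
        for index in range(NUM_VIRTUAL_TARGETS):
            target_name = f"{volume_group}-vt{index}"        # e.g. "vg-sql-vt0"
            mapping[target_name] = cvm_ips[index % len(cvm_ips)]
        return mapping

    # Example: with four CVMs, each CVM services 8 of the 32 virtual targets,
    # so no single initiator/target session limits throughput.
    print(distribute_virtual_targets("vg-sql", ["10.0.0.11", "10.0.0.12",
                                                "10.0.0.13", "10.0.0.14"]))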

The following diagram shows how the data services IP sits in front of the virtual targets (which are on each CVM) and the vDisks are spread across all controllers for maximum performance.

[Diagram: the data services IP in front of virtual targets spread across all CVMs]

At .NEXT 2015, Nutanix announced support for scaling storage separately from compute using “Storage Only” nodes, and this capability is fully compatible with ABS. This ensures capacity and performance can be scaled independently of compute for maximum flexibility.

[Diagram: ABS connectivity with no client-side iSCSI MPIO required]

Resiliency:

If a vDisk’s active CVM goes offline due to a failure or planned maintenance, any active sessions against that CVM are disconnected, which triggers a re-logon from the iSCSI client. The re-logon occurs through the external data services IP, which redirects the session to a healthy CVM.

This means things like One-Click rolling AOS upgrades can still be performed as they are with native Nutanix environments.
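
The reconnect behaviour can be sketched as follows. This is a conceptual Python illustration, not real initiator or Nutanix code: the data services IP value and the connect_iscsi() helper (including its redirect behaviour) are assumptions, used only to show that the client always logs back in through the one well-known IP and needs no per-CVM configuration.

    import time

    DATA_SERVICES_IP = "10.0.0.100"  # hypothetical cluster-wide data services IP

    def keep_session_alive(connect_iscsi, target):
        """Reconnect through the data services IP whenever the session drops."""
        while True:
            # The login lands on the data services IP, which redirects the
            # session to a healthy CVM hosting a virtual target for this vDisk.
            session = connect_iscsi(DATA_SERVICES_IP, target)
            session.wait_until_disconnected()  # e.g. CVM failure or rolling upgrade
            time.sleep(5)  # brief back-off, then re-logon via the same IP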

[Figure: ABS session redirection on CVM failure]

Functionality:

ABS supports SCSI-3 persistent reservations for shared storage-based Windows clusters, which are commonly used with Microsoft SQL Server and clustered file servers.
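
As a rough illustration of why clusters rely on this, the sketch below models the register/reserve/preempt semantics of SCSI-3 persistent reservations in Python. It is a simplified conceptual model, not Nutanix code or a real SCSI implementation, and the class and method names are assumptions.

    class PersistentReservation:
        """Toy model of SCSI-3 PR semantics used by clusters for disk fencing."""

        def __init__(self):
            self.registered_keys = set()    # every cluster node registers a key
            self.reservation_holder = None  # at most one node holds the reservation

        def register(self, key):
            self.registered_keys.add(key)

        def reserve(self, key):
            # A registered node takes the reservation if nobody else holds it.
            if key in self.registered_keys and self.reservation_holder is None:
                self.reservation_holder = key
                return True
            return False

        def preempt(self, key, victim_key):
            # A surviving node evicts a failed node's registration and takes
            # over the reservation: this is how the cluster fences that node.
            if key not in self.registered_keys:
                return False
            self.registered_keys.discard(victim_key)
            if self.reservation_holder == victim_key:
                self.reservation_holder = key
            return True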

As of Acropolis OS (AOS) 4.7, ABS will be supported with physical servers or virtual machines. Support for connecting ESXi via iSCSI is expected to follow in a future release.

ABS supports several use cases, including:

  • iSCSI for Microsoft Exchange Server
  • Shared storage for Linux-based clusters
  • Windows Server Failover Clustering (WSFC)
  • SCSI-3 persistent reservations for shared storage-based Windows clusters
  • Shared storage for Oracle RAC environments
  • Bare-metal environments

[Figure: ABS overview]

ABS enables server hardware separate from the Nutanix environment to consume the Acropolis DSF resources, so you can leverage existing server hardware investments against Nutanix storage resources. Workloads not targeted for virtualization can also use the DSF.

Supported Client OS & Qualified Applications

  • RHEL 6+
  • Windows 2008 R2 & Windows 2012 R2
  • Oracle RAC
  • Microsoft SQL Server
  • Microsoft Exchange Server

Summary:

Whether you have applications that require shared storage access or environments with separate storage and compute needs, Acropolis Block Services (ABS) simplifies deployment and highlights the dynamic scale out, extreme performance, and high availability of the Nutanix platform. ABS automatically load balances iSCSI clients to take advantage of all resources in the cluster, and failure events are managed seamlessly. The same upgrade, snapshot, and asynchronous replication workflows that customers leverage today work consistently whether you are using VMs or VGs. By enabling VM, file, and block services, Nutanix offers a single platform to consolidate workloads and ease administration, thus reducing risk and enabling organizations to simplify their infrastructure.
