Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 4 – Security

Security is a major pillar of the XCP design. The use of innovative automation results in one of the most hardened, simple and comprehensive virtualization infrastructures in the industry.

AHV is not designed to work with a comprehensive HCL of hardware vendors, nor does it need to cater for countless bolt-on style products. Instead, the Acropolis hypervisor has been optimized to work with the Nutanix Distributed Storage Fabric and approved appliances from Nutanix and OEM partners to provide all services/functionality in a truly web-scale manner.

This allows for much tighter and more targeted quality assurance and dramatically reduces the attack surface compared to other hypervisors.

The Security Development Lifecycle (SecDL) is leveraged across the entire Acropolis platform, ensuring every line of code is production ready. The design follows a defense-in-depth model: all unnecessary services are removed from libvirt/QEMU (SPICE, unused drivers), libvirt non-root group sockets enforce the principle of least privilege, SELinux confines guests to protect against VM escape, and an intrusion detection system is embedded.

[Image: Security Development Lifecycle (SecDL)]

The Acropolis hypervisor has a documented and supported security baseline (XCCDF STIG), and introduces the self-remediating hypervisor. On a customer-defined interval, the hypervisor is scanned for any deviation from the supported security baseline; if an anomaly is detected, the configuration is reset to the secure state in the background with no user intervention.
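The scan-and-remediate loop described above can be sketched in a few lines. This is a minimal illustration only; the baseline keys, values and function names are hypothetical and not the actual AHV implementation.

```python
# Illustrative sketch of a self-remediating security baseline.
# The settings below are examples, not the real XCCDF STIG content.

SECURITY_BASELINE = {
    "ssh_root_login": "disabled",
    "selinux_mode": "enforcing",
    "unused_services": "removed",
}

def scan_for_drift(current_config):
    """Return any settings that deviate from the security baseline."""
    return {
        key: current_config.get(key)
        for key, expected in SECURITY_BASELINE.items()
        if current_config.get(key) != expected
    }

def remediate(current_config):
    """Reset drifted settings back to the secure baseline state."""
    drift = scan_for_drift(current_config)
    for key in drift:
        current_config[key] = SECURITY_BASELINE[key]
    return drift  # report what was reverted

# Example: an out-of-band change is detected and silently reverted.
config = dict(SECURITY_BASELINE, ssh_root_login="enabled")
fixed = remediate(config)
print(fixed)                         # {'ssh_root_login': 'enabled'}
print(config == SECURITY_BASELINE)   # True
```

In the real product this runs on the customer-defined interval with no administrator involvement; the sketch just shows the scan/reset pattern.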

The Acropolis platform also boasts a comprehensive list of security certifications/validations:

[Image: Security certifications/validations]

Summary

Acropolis provides numerous security advantages including:

  1. In-built and self-auditing Security Technical Implementation Guides (STIGs)
  2. Hardened hypervisor out of the box without the requirement for administrators to apply hardening recommendations
  3. Reduced attack surface compared to other supported hypervisors

For more information on Nutanix security see:

Back to the Index

Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 3 – Scalability

Scalability is not just about the number of nodes that can form a cluster or the maximum storage capacity. The more important aspects of scalability are how an environment expands from many perspectives including Management, Performance, Capacity and Resiliency, and how scaling affects Operational aspects.

Let’s start with scalability of the components required to manage/administer AHV:

Management Scalability

AHV automatically sizes all Management components during deployment of the initial cluster, or when adding node/s to the cluster. This means there is no need to do initial sizing or manual scaling of XCP management components regardless of the initial and final size of the cluster/s.

Where Resiliency Factor 3 (N+2) is configured, the Acropolis management components are automatically scaled to meet the N+2 requirement. Let’s face it, there is no point having N+2 at one layer and not at another; availability, like a chain, is only as good as its weakest link.

Storage Capacity Scaling

The Nutanix Distributed Storage Fabric (DSF) has no maximum storage capacity. Additionally, storage capacity can even be scaled separately from compute with “Storage-only” nodes such as the NX-6035C.

Storage-only nodes run AHV (and are interoperable with clusters running other supported hypervisors), allowing customers to scale capacity regardless of hypervisor. Storage-only nodes do not require hypervisor licensing or separate management, and they fully support all one-click upgrades for the Acropolis Base Software and AHV, just like compute+storage nodes. As a result, storage-only nodes are effectively invisible, apart from the increased capacity and performance they deliver.

Nutanix storage-only nodes help eliminate the problems of scaling capacity with traditional storage; for more information see: Scaling problems with traditional shared storage.

One of the scaling problems with traditional storage is adding shelves of drives without scaling data services/management. This leads to problems such as lower IOPS/GB and a higher impact on workloads in the event of component failures such as storage controllers.

Scaling storage-only nodes is remarkably simple. For example, a customer added 8 x NX-6035C nodes to his vSphere cluster via his laptop on the showroom floor of vForum Australia in October of this year.

https://twitter.com/josh_odgers/status/656999546673741824

As each storage-only node is added to the cluster, a light-weight Nutanix CVM joins the cluster to provide data services to ensure linear scale out management and performance capabilities, thus avoiding the scaling problems which plague traditional storage.

For more information on Storage only nodes, see: http://t.co/LCrheT1YB1

Compute Scalability

Enabling HA within a cluster requires reserving the capacity of one or more nodes for failover, which creates unnecessary inefficiency when the hypervisor limits the maximum cluster size. AHV has no limit on the number of nodes within a cluster, so it helps avoid the unnecessary silos that lead to inefficient use of infrastructure through reserving one or more nodes per cluster for HA. AHV nodes are also automatically configured with all required settings when joining an existing cluster: all the administrator needs to provide is basic IP address information, press Expand Cluster, and Acropolis takes care of the rest.
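The HA-reservation inefficiency is easy to quantify. The sketch below (illustrative numbers, not Nutanix sizing guidance) compares the share of nodes reserved for HA when a fleet is split into small clusters versus one large cluster:

```python
# Share of total nodes consumed by HA reservations when a fleet is
# carved into fixed-size clusters, each reserving capacity for HA.

def ha_overhead(total_nodes, cluster_size, reserved_per_cluster=1):
    """Fraction of all nodes reserved for HA across the fleet."""
    clusters = total_nodes // cluster_size
    return clusters * reserved_per_cluster / total_nodes

# 64 nodes split into four 16-node clusters vs one 64-node cluster:
print(ha_overhead(64, 16))  # 0.0625   -> 6.25% of nodes reserved
print(ha_overhead(64, 64))  # 0.015625 -> ~1.6% of nodes reserved
```

The smaller the mandated cluster size, the more nodes sit reserved across the environment, which is the silo effect the paragraph above describes.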

See the below demo showing how to expand a Nutanix cluster:

Analytics Scalability

AHV includes built-in analytics and, as with the other Acropolis management components, the analytics components are sized automatically during initial deployment and scale automatically as nodes are added.

This means there is never a tipping point where an administrator is required to scale or deploy new analytics instances or components. The analytics functionality and its performance remain linear regardless of scale.

This means AHV eliminates the requirement for separate software instances and databases to provide analytics.

Resiliency Scalability

As Acropolis uses the Nutanix Distributed Storage Fabric, in the event drives or nodes fail, all nodes within the cluster participate in restoring the configured Resiliency Factor (RF) for the impacted data. This occurs regardless of hypervisor; however, AHV also includes fully distributed management components, so the larger the cluster, the more resilient the management layer becomes.

For example, the loss of a single node in a 4-node cluster would have potentially a 25% impact on the performance of the management components. In a 32-node cluster, a single node failure would have a much lower potential impact of only 3.125%. As an AHV environment scales, the impact of a failure decreases and the ability to self-heal increases in both speed to recover and number of subsequent failures which can be supported.
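The worked numbers above follow directly from management being fully distributed: losing one node impacts roughly 1/N of the cluster's management capacity.

```python
# With management components distributed across every node, a single
# node failure impacts approximately 1/N of the cluster.

def single_node_failure_impact(cluster_size):
    """Approximate percentage of distributed management capacity
    lost when one node fails."""
    return 100.0 / cluster_size

print(single_node_failure_impact(4))   # 25.0  (%)
print(single_node_failure_impact(32))  # 3.125 (%)
```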

For information about increasing resiliency of large clusters, see: Improving Resiliency of Large Clusters with EC-X

Performance Scalability

Regardless of hypervisor, as XCP clusters grow, the performance improves. The moment new node(s) are added to the cluster, the additional CVM/s start participating in Management and Data Resiliency tasks even when no VMs are running on the nodes. Adding new nodes allows the Storage Fabric to distribute RF traffic among more Controllers which enhances Write I/O & resiliency while helping decrease latency.
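A simple back-of-envelope model shows why spreading RF traffic across more controllers helps. The figures below are illustrative, not Nutanix measurements:

```python
# Average replica (RF) write traffic handled per controller, assuming
# writes are distributed evenly across all CVMs in the cluster.
# Illustrative model only.

def per_node_write_load(total_writes, resiliency_factor, nodes):
    """Average replica writes each controller services."""
    return total_writes * resiliency_factor / nodes

# 10,000 writes at RF2, before and after doubling the cluster:
print(per_node_write_load(10_000, 2, 4))  # 5000.0 per controller
print(per_node_write_load(10_000, 2, 8))  # 2500.0 per controller
```

Halving the per-controller load as the cluster doubles is what allows write performance and latency to improve as nodes are added.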

The advantage AHV has over the other supported hypervisors is that the performance of the management components (many of which have been previously discussed) scales dynamically with the cluster. As with analytics, AHV management components scale out; there is never a tipping point requiring manual scale-out of management or the deployment of new instances of management components or their dependencies.

Importantly, for all components, XCP distributes data and management functions across all nodes within the cluster. Acropolis does not use “mirrored” components/hardware or objects, which ensures no individual node or component becomes a bottleneck or single point of failure.

Back to the Index

Why Nutanix Acropolis hypervisor (AHV) is the next generation hypervisor – Part 2 – Simplicity

Let me start by saying I believe complexity is one of the biggest, and potentially the most overlooked, issues in modern datacenters.

Virtualization has enabled increased flexibility and solved countless problems within the datacenter. But over time I have observed an increase in complexity, especially around the management components, which for many customers is a major pain point.

Complexity leads to things like increased cost (both CAPEX & OPEX) and risk, which commonly leads to reduced availability/performance.

In Part 10, I will cover Cost in more depth so let’s park it for the time being.

When architecting solutions for customers, my number one goal is to meet/exceed all my customers’ requirements with the simplest solution possible.

[Image: “any fool” quote on complexity]

Acropolis is where web-scale technology delivers enterprise grade functionality with consumer-grade simplicity, and with AHV the story gets even better.

Removing Dependencies

A great example of the simplicity of the Nutanix Xtreme Computing Platform (XCP) is its lack of external dependencies. There is no requirement for any external databases when running Acropolis Hypervisor (AHV) which removes the complexity of designing, implementing and maintaining enterprise grade database solutions such as Microsoft SQL or Oracle.

This is even more of an advantage when you take into account the complexity of deploying these platforms in highly available configurations such as AlwaysOn Availability Groups (SQL) or Real Application Clusters (Oracle RAC), where SMEs need to be engaged for design, implementation and maintenance. Because AHV is not dependent on 3rd-party database products, it reduces or removes complexity around product interoperability and the need to call multiple vendors if something goes wrong. This also means no more investigating Hardware Compatibility Lists (HCLs) and interoperability matrices when performing upgrades.

Management VMs

Only a single management virtual machine (Prism Central) needs to be deployed, even for multi-cluster, globally distributed AHV environments. Prism Central is an easy-to-deploy appliance and, since it is stateless, it does not require backing up. In the event the appliance is lost, an administrator simply deploys a new Prism Central appliance and connects it to the clusters, which can be done in a matter of seconds per cluster. No historical data is lost, as the data is maintained on the clusters being managed.

Because Acropolis requires no additional components, it all but eliminates the design/implementation and operational complexity for management compared to other virtualization / HCI offerings.

Other supported hypervisors commonly require multiple management VMs and backend databases even for relatively small scale/simple deployments just to provide basic administration, patching and operations management capabilities.

Because Acropolis has zero dependencies during the installation phase, customers can implement a fully featured AHV environment without any existing hardware/software in the datacenter. Not only does this make initial deployment easy, it also removes the complexity around interoperability when patching or upgrading in the future.

Ease of Management

Nutanix XCP clusters running any hypervisor can be managed individually using Prism Element or centrally via Prism Central.

Prism Element requires no installation; it is available and performs optimally out-of-the-box. Administrators can access Prism Element via the XCP Cluster IP address or via any Controller VM IP address.

Administrators of Legacy virtualization products often need to use hypervisor-specific tools to complete various tasks requiring design/deployment and management of these components and their dependencies. With AHV, all hypervisor level functionality is completed via Prism providing a true single pane of glass interface for everything from Storage, Compute, Backup, Data Replication, Hardware monitoring and more.

The image below shows the Prism Central home screen, which provides a high-level summary of all clusters in the environment. From this screen, you can drill down to individual clusters to get more granular information where required.

[Image: Prism Central overview]

Administrators perform all upgrades from Prism without the requirement for external update management applications/appliances/VMs or supporting back-end databases.

Prism performs one-click, fully automated rolling upgrades of all components including the hypervisor, the Acropolis Base Platform (formerly known as NOS), firmware and Nutanix Cluster Check (NCC).

For a demo of Prism Central see the following YouTube video:

Further Reduced Storage Complexity

Storage has long been, and continues for many customers to be, a major hurdle to successful virtual environments. Nutanix has essentially made storage invisible over the past few years by removing the requirement for dedicated Storage Area Networks, Zoning, Masking, RAID and LUNs. When combined with AHV, XCP has taken this innovation yet another big step forward by removing the concepts of datastores/mounts and virtual SCSI controllers.

For each Virtual Machine disk, AHV presents the vDisk directly to the VM, and the VM simply sees the vDisk as if it were a physically attached drive. There is no in-guest configuration. It just works.

This means there is no complexity around how many virtual SCSI controllers to use, or where to place a VM or vDisk and as such, Acropolis has eliminated the requirement for advanced features to manage virtual machine placement and capacity management such as vSphere’s Storage DRS.

Don’t get me wrong, Storage DRS is a great feature which helps solve serious problems with traditional storage.  With XCP these problems just don’t exist.

For more details see:  Storage DRS and Nutanix – To use, or not to use, that is the question?

The following screen shot shows just how simple vDisks appear under the VM configuration menu in Prism Element. There is no need to assign vDisks to Virtual SCSI controllers which ensures vDisks can be added quickly and perform optimally.

[Image: VM disks configuration in Prism Element]

Node Configuration

Configuring an AHV environment via Prism automatically applies all changes to each node within the cluster. Critically, Acropolis Host Profiles functionality does not need to be enabled or configured, nor do Administrators have to check for compliance or create/apply profiles to nodes.

In AHV all networking is fully distributed similar to the vSphere Distributed Switch (VDS) from VMware. AHV network configuration is automatically applied to all nodes within the cluster without requiring the administrator to attach nodes/hosts to the virtual networking. This helps ensure a consistent configuration throughout the cluster.
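The "configure once, applied everywhere" model described above can be sketched as follows. The class and method names are hypothetical, purely to illustrate the pattern; they are not the AHV API.

```python
# Conceptual sketch: a single network definition is applied to every
# node in the cluster, with no per-host attachment step.

class Cluster:
    def __init__(self, node_names):
        # Each node starts with an empty network configuration.
        self.nodes = {name: {"networks": {}} for name in node_names}

    def create_network(self, name, vlan):
        # One operation configures all nodes consistently.
        for node in self.nodes.values():
            node["networks"][name] = {"vlan": vlan}

cluster = Cluster(["node-1", "node-2", "node-3"])
cluster.create_network("prod", vlan=101)

# Every node now carries an identical network definition.
assert all(n["networks"]["prod"]["vlan"] == 101
           for n in cluster.nodes.values())
```

Contrast this with per-host virtual switch configuration, where each host must be attached and kept consistent manually.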

The reason the above points are so important is each dramatically simplifies the environment by removing (not just abstracting) many complicated design/configuration items such as:

  • Multipathing
  • Deciding how many datastores are required and what size each should be
  • Considering how many VMs should reside per datastore/LUN
  • Configuration maximums for datastores/paths
  • Managing consistent configuration across nodes/hosts
  • Managing network configuration

Administrators can optionally join Acropolis built-in authentication to an Active Directory domain, removing the requirement for additional Single Sign-On components. All Acropolis components include High Availability out-of-the-box, removing the requirement to design (and license) HA solutions for individual management components.

Data Protection / Replication

The Nutanix CVM includes built-in data protection and replication components, removing the requirement to design/deploy/manage one or more Virtual Appliances. This also avoids the need to design, implement and scale these components as the environment grows.

All of the data protection and replication features are also available via Prism and, importantly, are configured on a per VM basis making configuration easier and reducing overheads.

Summary

In summary, the simplicity of AHV eliminates:

  1. Single points of failures for all management components out of the box
  2. The requirement for dedicated management clusters for Acropolis components
  3. Dependency on 3rd Party Operating Systems & Database platforms
  4. The requirement for design, implementation and ongoing maintenance for Virtualization management components
  5. The need to design, install, configure & maintain a web or desktop type management client
  6. Complexity such as
    1. The requirement to install software or appliances to allow patching / upgrading
    2. The requirement for an SME to design a solution to make management components highly available
    3. The requirement to follow complex Hardening Guides to achieve security compliance.
    4. The requirement for additional Appliances/interfaces and external dependencies (i.e.: Database Platforms)
  7. The requirement to license features to allow Centralised configuration management of nodes.

Back to the Index