Peak Performance vs Real World Performance

In this post I will compare the Real World performance of storage solutions with their peak performance, using some car analogies to help get the point across.

Let’s start with the Bugatti Veyron Super Sport. This car has a W16 engine with 4 turbochargers, produces 1183BHP (~880kW) and has a top speed (peak performance) of 267MPH (431KPH).

The Veyron achieved the world record 267MPH at Volkswagen’s Ehra-Lessien test track in Germany. The test track has a 5.6 mile long straight. This is one of the very few places on earth where the Veyron can actually achieve its peak performance.

For the Veyron to achieve 267MPH, not only do you need a 5.6 mile long straight, but the Veyron’s rear spoiler must NOT be deployed. Rear spoilers provide down-force for stability, so having the spoiler down reduces the car’s ability to, for example, take corners.

In addition to requiring a 5.6 mile long straight and the rear spoiler being down, the Veyron can only maintain its top speed (peak performance) for 12 minutes before its 26.4-gallon fuel tank is emptied, which is lucky because the Veyron’s specially designed tyres only last 15 minutes at >250MPH.

So in reality, while the Bugatti Veyron is one of the fastest (if not the fastest) production cars in the world, even when you have all your ducks in a row you can only achieve its peak performance for a very short period of time (in this example <12 mins), and with several constraints such as a reduced ability to corner (due to reduced down-force from the spoiler being down).

Now what about fuel economy? The Veyron is rated as follows (the mpg figures are in imperial gallons):

City Driving: 29 L/100 km; 9.6 mpg

Highway Driving: 17 L/100 km; 17 mpg

Top Speed: 78 L/100 km; 3.6 mpg

As you can see, the figures differ vastly depending on how the Veyron is being used.

There are numerous other factors which can limit the Veyron’s performance, such as weather. For example if the test track is wet, or has strong head winds, the Veyron would not be able to perform at its peak.

So while the Veyron can achieve 267MPH, in the real world its average (or Real World) performance will be much lower and will vary significantly from owner to owner.

At this stage you’re probably asking “What has this got to do with Storage?”

Any storage solution, be it SAN/NAS or Hyper-Converged, can be configured and benchmarked to achieve really impressive Peak Performance (IOPS), much like the Veyron.

But these “Peak Performance” numbers can rarely (if ever) be achieved with “Real World” workloads, especially over an extended duration.

To quote two great guys in the Storage industry (Vaughn Stewart & Chad Sakac):

Absolute performance more often than not, is NOT the only design consideration.

I couldn’t agree with this more. The storage vendors are to blame for advertising unrealistic IOPS numbers based on 4K 100% read, and now customers expect the same number of IOPS from SQL or Oracle.

The MPG of the Veyron is like the number of IOPS a storage array can achieve: it depends on how the car or storage array is used! The car will get higher MPG if used only on the highway, just like a storage array will get higher IOPS if only used for one I/O profile.

As the I/O size and profile of workloads like SQL & Oracle are vastly different from the peak performance benchmarks using 4K 100% Read IOPS, expecting the same IOPS number for the benchmark and SQL/Oracle is as unrealistic as expecting the Veyron to do 267MPH in heavy traffic.
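
To put some rough numbers on why I/O size matters, here is a minimal Python sketch. It assumes throughput is the only bottleneck and uses a made-up 1 GB/s ceiling (the function name and the ceiling are my own illustration, not figures from any vendor). All it shows is that throughput = IOPS x I/O size, so the same array delivers far fewer IOPS as the I/O size grows.

```python
# Illustrative only: the 1 GB/s ceiling is an assumed figure, not a measurement.
BANDWIDTH_LIMIT_BYTES_PER_SEC = 1_000_000_000  # hypothetical array limit


def max_iops(io_size_bytes: int,
             bandwidth_limit: int = BANDWIDTH_LIMIT_BYTES_PER_SEC) -> int:
    """Upper bound on IOPS when throughput is the only bottleneck."""
    return bandwidth_limit // io_size_bytes


# Compare the "hero" 4K size with sizes closer to real application I/O.
for label, size in [("4K", 4 * 1024), ("32K", 32 * 1024), ("64K", 64 * 1024)]:
    print(f"{label:>3} I/O: at most {max_iops(size):,} IOPS")
```

Real arrays hit other limits first (controller CPU, latency, write penalties), so treat this as a ceiling on the argument, not a prediction.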

But like I said, it’s the storage vendors’ fault for failing to educate customers on real world performance. As a result, many customers have the impression that peak IOPS is a good measurement and regularly waste time comparing the Peak Performance of Vendor A and Vendor B, instead of focusing on their requirements and Real World performance.

In the real world (at least in the vast majority of cases), customers don’t have dedicated storage solutions for one application where peak performance can be achieved, let alone sustained for any meaningful length of time.

Customers generally run numerous mixed workloads on their storage solutions: everything from Active Directory, DNS, DHCP etc., which have low capacity/IOPS requirements, through Database, Email and Application servers, which may have higher capacity/IOPS requirements, to archive and backup, which are low IOPS but high capacity.

Each of these workloads has a different I/O profile and, depending on the storage architecture, may share storage controllers / SSDs / HDDs / storage networking, all of which can result in congestion / contention, which leads to reduced performance.

Before you start considering which vendor’s storage solution is best, you need to first understand (and document) your requirements, along with success criteria which you can validate storage solutions against.

If your requirements are for example:

  • Host 10TB of Exchange Mailboxes for 2000 users (~400 random Read/Write 32-64k IOPS)
  • Host 20TB Windows DFS solution
  • Host 50TB of Backups
  • Support 1TB active working set SQL Database
  • Host 10TB of misc low IO random workload
  • Have Per VM snapshot / backup / replication capabilities

Then there is no point having (or testing) a solution for 100K Random Read 4K IOPS, as your requirement may be less than 10K IOPS of varying sizes and profiles.

Consider this:

If the storage solution/s you’re considering can achieve the 10K IOPS with the I/O profile of your workloads and can be easily scaled, then a solution able to achieve 20K IOPS on day 1 has little/no advantage over a solution which can achieve 12K IOPS, since 10K IOPS is all that you need.

Now if your Constraints are:

  • 12RU rack space
  • 4kW Power
  • $200k

Anything that’s larger than 12RU, uses more than 4kW of power or costs more than $200k is not something you should spend your time looking at / benchmarking etc., since it’s not something you can purchase.
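
As a minimal sketch of this kind of up-front filtering, the Python below applies the three constraints above to a list of candidates. The candidate names and their figures are entirely hypothetical, invented for illustration; only the 12RU / 4kW / $200k limits come from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    rack_units: int
    power_kw: float
    cost_usd: int


# Constraints from the example above.
MAX_RACK_UNITS = 12
MAX_POWER_KW = 4.0
MAX_COST_USD = 200_000


def within_constraints(c: Candidate) -> bool:
    """True only if the candidate violates none of the constraints."""
    return (
        c.rack_units <= MAX_RACK_UNITS
        and c.power_kw <= MAX_POWER_KW
        and c.cost_usd <= MAX_COST_USD
    )


# Hypothetical candidates, invented purely for illustration.
candidates = [
    Candidate("Solution A", rack_units=8, power_kw=3.2, cost_usd=180_000),
    Candidate("Solution B", rack_units=16, power_kw=3.5, cost_usd=150_000),
    Candidate("Solution C", rack_units=10, power_kw=4.5, cost_usd=190_000),
    Candidate("Solution D", rack_units=12, power_kw=4.0, cost_usd=200_000),
]

# Only the candidates on this shortlist are worth any PoC effort.
shortlist = [c.name for c in candidates if within_constraints(c)]
print(shortlist)
```

The point is the order of operations: rule products out on paper first, then spend testing time only on what survives.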

So to quote Vaughn and Chad again, “Don’t perform Absurd Testing”.

In my opinion, customers should value their own time enough not to waste it doing proofs of concept (PoCs) on multiple different products when in reality only 2 meet their requirements.

An example of Absurd testing would be taking a Toyota Corolla on a test drive to a drag strip and testing its 1/4 mile performance when you plan to use the car to pick up the shopping and drop the kids off at school.

It’s equally Absurd to test 100% Random Read 4K IOPS, or to consider/test/compare a storage solution’s <insert your favourite feature here>, when it’s not required or applicable to your use case.

Summary:

  1. Peak performance is rarely a significant factor for a storage solution.
  2. Understand and document your storage requirements / constraints before considering products.
  3. Create viability/success criteria when considering storage which validate that the solution meets your requirements within the constraints.
  4. Do not waste time performing absurd testing of “Peak performance” or “features” which are not required/applicable.
  5. Only conduct Proof of Concepts on solutions:
    1. Where no evidence exists of the solution’s capability for your use case/s.
    2. Which fall within your constraints (Cost, Size, Power, Cooling etc).
    3. Which on paper meet/exceed your requirements!
    4. Where you have a documented PoC plan with detailed success criteria!
  6. As long as the solution you’re considering can quickly, easily and non-disruptively scale, there is no need to oversize day 1.
    1. If the solution you’re considering CAN’T quickly, easily and non-disruptively scale, then it’s probably not worth considering.
  7. The performance of a storage solution can be impacted by many factors such as compute, network and applications.
  8. When Benchmarking, do so with tests which simulate the workload/s you plan to run, not “hero” style 100% read 4k (to achieve peak IOPS numbers) or 100% read 256k (to achieve high throughput numbers).
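
As a rough illustration of point 8, here is a minimal Python sketch which describes a mixed workload as data and summarises it into the blended profile a benchmark should simulate. Every figure in the mix is hypothetical (loosely inspired by the Exchange example earlier) and should be replaced with numbers measured in your own environment; the class and function names are my own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadProfile:
    name: str
    iops: int        # expected I/O operations per second
    io_size_kb: int  # typical I/O size in KiB
    read_pct: int    # percentage of operations that are reads


# Hypothetical mix; replace with figures measured in your own environment.
mix = [
    WorkloadProfile("email", iops=400, io_size_kb=32, read_pct=60),
    WorkloadProfile("database", iops=3000, io_size_kb=8, read_pct=70),
    WorkloadProfile("backup", iops=200, io_size_kb=256, read_pct=10),
]


def summarise(profiles):
    """Return total IOPS, IOPS-weighted average I/O size (KiB),
    IOPS-weighted read percentage and total throughput (MiB/s)."""
    total_iops = sum(p.iops for p in profiles)
    total_kb_per_sec = sum(p.iops * p.io_size_kb for p in profiles)
    avg_io_kb = total_kb_per_sec / total_iops
    read_pct = sum(p.iops * p.read_pct for p in profiles) / total_iops
    throughput_mib_s = total_kb_per_sec / 1024
    return total_iops, avg_io_kb, read_pct, throughput_mib_s


total_iops, avg_io_kb, read_pct, throughput_mib_s = summarise(mix)
print(f"{total_iops} IOPS, {avg_io_kb:.1f} KiB average I/O, "
      f"{read_pct:.0f}% read, {throughput_mib_s:.1f} MiB/s")
```

Even with invented numbers, the blended profile looks nothing like 100% read 4K, which is the whole point: test against the mix you will actually run.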

The new standard in Enterprise Architecture certifications

I am very proud to have been selected to be part of a team of absolute superstars who in the last few months have developed what I believe will be the new standard in Enterprise Architecture certifications, the Nutanix Platform Expert (NPX).

The NPX was developed under the guidance of Lisa O’Leary, a PhD psychometrician and recognized authority in the development of expert-level panel-based assessments for the IT industry. This was a real eye opener for me into how to create a scoring rubric and how to make sure different examiners score as evenly as possible so that results are consistent.

The NPX certification (along with Nutanix nu.School Education) is designed to produce and certify the best of the best enterprise architects with the main goal of ensuring customers get the best architects to design and deliver solutions which solve real world business problems while maximizing value and reducing ongoing costs.

During the development of NPX, the other members of the group and I basically decided that none of us should be able to achieve NPX without putting in significant time and effort to improve our skills, especially as it is required to demonstrate expertise both architecturally and hands on in multiple hypervisors and vendor software stacks. Considering the talent in the group, this was a big call!

I personally am enjoying the challenge of preparing my submission for the NPX based on a large scale project I am working on at the moment, and look forward to submitting my application and hopefully being invited to the Nutanix Design Review (NDR) to defend. I can already tell you this is more comprehensive than any single design I have done to date, and it will be a blast to defend.

So what will being an NPX mean?

Certified graduates of the NPX Program will have a very unique set of skills, including the demonstrated ability to deliver enterprise-class Web-scale solutions using multiple hypervisors and vendor software stacks on the Nutanix platform (VMware® vSphere®, Microsoft® Hyper-V®, and KVM).

This hypervisor agnostic certification for Enterprise Architects is a first in the industry; our groundbreaking approach allows an NPX the freedom to design cutting-edge Web-scale solutions for customers based solely on their business needs.

The depth and breadth of the solution design and delivery skills validated through our peer-vetted program make NPX the new standard for excellence. In accordance with program goals every NPX will be a superb technologist, a visionary evangelist for Web-scale, and a true Enterprise Architect – capable of designing and delivering a wide range of cutting-edge solutions; custom built to support the business goals of the Global 2000 and government agencies in every region of the world.

So what’s required to achieve NPX?

The first prerequisite is the Nutanix Platform Professional (NPP) certification. The NPP is really the entry level certification showing core Nutanix knowledge.

As per the NPX Application, the NPX certification is a two-stage process:

Stage 1 is a review of a candidate’s NPX Program Application.

If a candidate’s application is accepted they will be invited to participate in the NPX Design Review (NDR).

Now at this stage you’re probably saying, this doesn’t seem that hard, right?

Well, here is an idea of the required documentation:

  • A current state and operational readiness assessment
  • A Web-scale migration and transition plan
  • Documentation of specific business requirements driving the solution design
  • Documentation of assumptions that impacted the solution design
  • Documentation of design constraints that impacted the design and delivery of the solution
  • Documentation describing risks identified in the design and delivery of the solution and how those risks were mitigated
  • A solution architecture including a conceptual/logical and physical design with appropriate diagrams and descriptions of all functional components of the solution
  • Documentation of operational procedures and verification

The documentation set goes well beyond any certification I am aware of, but more importantly demonstrates a candidate’s ability to produce documentation which ensures the solution can be implemented, validated and operated in the event the lead architect is unavailable. This is a very high standard of documentation which I’ve rarely seen in my career.

In addition, 3 professional references will also be required to validate the candidate’s experience.

Stage 2, the NDR, is modeled after an academic viva voce defense (live, oral exam) and requires candidates to present their solution to, and answer questions posed to them by, NPX-Certified Examiners (NCE). The NDR also includes a series of hands-on exercises, which must be completed by the candidate. Successful completion of both stages is required to earn the NPX credential.

The NPX has a strict policy regarding fictitious solution designs.

NPX candidates may not submit wholly fictitious designs.

I pushed for this during the development of the certification as, in my opinion, an enterprise architect should have a portfolio of work to choose from, which negates the need to create a fictitious design.

That said, partially fictitious designs are permitted when an existing design requires additions or enhancements in order to demonstrate competence in required knowledge areas (e.g., a backup or DR solution may be added if this component was outside the scope of the original design).

Adapting an existing 3-tier solution design to the Nutanix platform is also permitted. In either case the submitted design should contain a majority of solution components architected to support applications with service level agreements specified by actual business stakeholders.

The NDR itself requires the completion of an exercise involving a live Nutanix environment and completion of a design scenario. Both exercises will require demonstration of NPX-level solution design and delivery skills with a second solution stack/hypervisor.

An NPX candidate is permitted to choose the hypervisor they will be tested on during their NDR (it must be different from the hypervisor utilized in the submitted solution design). The hypervisor selected will be used for the Hands-on and Design scenarios during the NDR.

The Hypervisor choices are:

  • VMware® vSphere®
  • Microsoft® Hyper-V®
  • KVM

What next?

I would encourage all enterprise architects to stay tuned for the release of more NPX details via the Nutanix nu.School website and take on the challenge of NPX and become a better architect in the process.

The Nutanix Platform Expert Official Certification Guide is currently being written and should be released at Nutanix .NEXT this coming June.

Summary:

I really enjoyed working with such a talented group of people in developing NPX, and I look forward to being a part of the program, first as a candidate and in the future as a certified examiner, to ensure the quality of Enterprise Architects in the industry only gets better!

Here is a group shot of the team on the final day of NPX development in San Jose.

Names (Left to right): Derek Seaman, Steven Poitras, Jon Kohler, Ray Hassan, Bas Raayman, Raymon Epping, Josh Odgers, Michael Webster, Artur Krzywdzinski, Samir Roshan, Lane Laverett, Mark Brunstad and Richard Arsenian.

Absent for Photo: Magnus Andersson, Lisa O’Leary, PhD Psychometrician.