I have noticed an increasing number of search engine terms bringing people to my blog, such as:
* High CPU Ready Low CPU usage
* CPU ready and Low utilization
* CPU ready relationship to utilization
So I wanted to try and clear this issue up.
First, let's define CPU ready and CPU utilization.
CPU ready (percentage) is the percentage of time a virtual machine is waiting to be scheduled onto a physical (or HT) core by the CPU scheduler.
CPU utilization measures the amount of MHz or GHz that is being used.
Next, to find out how much CPU ready is OK, check out my post How Much CPU ready is OK?
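CPU ready is often reported in vCenter performance charts as a summation value in milliseconds rather than as a percentage. As a rough sketch, the conversion below divides the summation by the chart's sample interval (20 seconds for realtime charts). The function name, the per-vCPU averaging and the sample figures are illustrative assumptions, and the averaging assumes the value was read at the VM level (summed across all of its vCPUs).

```python
# Sample intervals (in seconds) used by the vSphere performance charts.
CHART_INTERVAL_SECONDS = {
    "realtime": 20,
    "past_day": 300,
    "past_week": 1800,
    "past_month": 7200,
    "past_year": 86400,
}


def cpu_ready_percent(summation_ms, interval_seconds, vcpus=1):
    """Convert a CPU ready summation (ms) into a percentage.

    summation_ms     -- ready time reported for one sample interval
    interval_seconds -- length of that sample interval
    vcpus            -- divide by the vCPU count to get an average per vCPU
                        (assumes summation_ms is the VM-level total)
    """
    if interval_seconds <= 0 or vcpus <= 0:
        raise ValueError("interval_seconds and vcpus must be positive")
    return summation_ms * 100.0 / (interval_seconds * 1000.0 * vcpus)


if __name__ == "__main__":
    # Hypothetical example: 9000 ms of ready time on a 2 vCPU VM.
    realtime = cpu_ready_percent(9000, CHART_INTERVAL_SECONDS["realtime"], vcpus=2)
    past_day = cpu_ready_percent(9000, CHART_INTERVAL_SECONDS["past_day"], vcpus=2)
    print(f"realtime chart: {realtime:.1f}% ready per vCPU")
    print(f"past-day chart: {past_day:.1f}% ready per vCPU")
```

The same 9000 ms works out to 22.5% per vCPU on a realtime chart but only 1.5% on a past-day chart, so it matters which chart the number came from.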
CPU ready and CPU utilization have very little to do with each other. High CPU utilization does not mean you will have high CPU ready, and vice versa.
So it is entirely possible to have either of the scenarios below.
Scenario 1: An ESXi host has 20% CPU utilization and its VMs suffer high CPU ready (>10%).
Scenario 2: An ESXi host has 95% CPU utilization and its VMs have little or no CPU ready (<2.5%).
How are the above two scenarios possible?
Scenario 1 may occur when
* One or more VMs are oversized (i.e. not utilizing the resources they are assigned)
* The host (or cluster) is highly overcommitted (either with or without right sized VMs)
* Power management settings are set to Balanced, Low Power or Custom
Scenario 2 may occur when
* VMs are correctly sized
* The ESXi hosts are well sized for the virtual machine workloads
* The VM to host ratio has been well architected
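Because the two metrics are largely independent, it can help to judge each one on its own before combining them. The Python sketch below does that. The ready thresholds (>10% and <2.5%) come from the scenarios above, while the host utilization cut-offs of 30% and 80% are arbitrary assumptions chosen purely for illustration, as are the function names.

```python
# CPU ready thresholds taken from the two scenarios in this post.
HIGH_READY_PCT = 10.0
LOW_READY_PCT = 2.5

# Host utilization cut-offs: assumptions for illustration only.
LOW_UTIL_PCT = 30.0
HIGH_UTIL_PCT = 80.0


def ready_level(ready_pct):
    """Classify CPU ready without looking at utilization."""
    if ready_pct > HIGH_READY_PCT:
        return "high"
    if ready_pct < LOW_READY_PCT:
        return "low"
    return "moderate"


def util_level(util_pct):
    """Classify host CPU utilization without looking at CPU ready."""
    if util_pct < LOW_UTIL_PCT:
        return "low"
    if util_pct > HIGH_UTIL_PCT:
        return "high"
    return "moderate"


def describe(host_util_pct, worst_vm_ready_pct):
    """Return (utilization level, ready level, note) for one host."""
    util = util_level(host_util_pct)
    ready = ready_level(worst_vm_ready_pct)
    if util == "low" and ready == "high":
        note = "Scenario 1: check VM sizing, overcommitment and power management"
    elif util == "high" and ready == "low":
        note = "Scenario 2: hosts and VMs look well sized"
    else:
        note = "Neither scenario"
    return util, ready, note


if __name__ == "__main__":
    print(describe(20.0, 12.0))
    print(describe(95.0, 1.0))
```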
So the question on everyone's lips: how can high CPU ready with low CPU utilization be addressed or avoided?
If you are experiencing high CPU ready and low ESXi host utilization, the following steps should be taken:
* Right size your VMs
This is by far the most important thing to do. I recommend using a tool such as vCenter Operations to assist with determining the correct size for VMs.
* Ensure your hosts/clusters are not excessively overcommitted
I generally find 4:1 vCPU overcommitment is achievable with right sized VMs where the average VM size is <4 vCPUs. The higher the average vCPU count per VM, the lower the CPU overcommitment you will achieve.
If you have an average VM size of 8 vCPUs then you may only see <1.5:1 overcommitment before suffering contention (CPU ready).
* Use DRS affinity rules to keep complementary workloads together
VMs with high CPU utilization and VMs with very low CPU utilization can work well together. You may also have an environment where some servers are busy overnight and others are only busy during business hours. These are examples of workloads to keep together.
* Use DRS anti-affinity rules to keep non-complementary workloads apart
VMs with very high CPU utilization (assuming the high utilization occurs at the same time) can be spread over a number of hosts to avoid stress on the CPU scheduler.
* Ensure your ESXi hosts are chosen with the virtual machine workloads in mind
If your VMs have >=8 vCPUs, choose a CPU with >=8 cores per socket and more sockets per host, such as 4 socket hosts as opposed to 2 socket hosts. If the bulk of your VMs have 1 or 2 vCPUs, then even older 2 socket, 4 core processors should generally work well.
* Use Hyperthreading
Assuming you have a mix of workloads and not all VMs require large numbers of cores and GHz, using Hyperthreading increases the efficiency of the CPU scheduler by effectively doubling the scheduling opportunities. Note: An HT core will generally give much less than half the performance of a pCore.
* Use “High Performance” for your Power Management Policy
The above seven (7) steps should resolve the vast majority of issues with CPU ready.
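To put a number on the overcommitment guidance above, the following Python sketch calculates the vCPU to physical core ratio for a host and compares it with the two rules of thumb from this post (around 4:1 when the average VM has fewer than 4 vCPUs, and under 1.5:1 when the average is 8 vCPUs). The function names and the sample host are hypothetical, the ratio counts physical cores rather than Hyperthreads, and no figure is returned for average sizes between 4 and 8 vCPUs because the post does not give one.

```python
def overcommit_summary(vm_vcpus, physical_cores):
    """Return (vCPU:pCore ratio, average vCPUs per VM) for one host.

    vm_vcpus       -- list with the vCPU count of each VM on the host
    physical_cores -- physical cores in the host (Hyperthreads not counted)
    """
    if not vm_vcpus or physical_cores <= 0:
        raise ValueError("need at least one VM and one physical core")
    total_vcpus = sum(vm_vcpus)
    return total_vcpus / physical_cores, total_vcpus / len(vm_vcpus)


def rule_of_thumb_ceiling(avg_vcpus_per_vm):
    """Rough overcommitment ceiling from the two figures quoted in the post.

    Returns None for average sizes the post gives no figure for.
    """
    if avg_vcpus_per_vm < 4:
        return 4.0
    if avg_vcpus_per_vm >= 8:
        return 1.5
    return None


if __name__ == "__main__":
    # Hypothetical 16 core host running 20 x 2 vCPU VMs and 4 x 4 vCPU VMs.
    vms = [2] * 20 + [4] * 4
    ratio, avg = overcommit_summary(vms, physical_cores=16)
    ceiling = rule_of_thumb_ceiling(avg)
    print(f"overcommitment {ratio:.2f}:1, average {avg:.2f} vCPUs per VM")
    if ceiling is None:
        print("no rule of thumb for this average VM size")
    elif ratio > ceiling:
        print(f"above the rough {ceiling}:1 ceiling, expect CPU ready")
    else:
        print(f"within the rough {ceiling}:1 ceiling")
```

These ceilings are starting points rather than limits, so treat the result as a prompt to check CPU ready, not as a verdict.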
For an example of the benefits of right sizing your VMs, check out my earlier post – VM Right Sizing , An example of the benefits.
Also please note that using CPU reservations does not solve CPU ready. I have also written an article on this topic – Common Mistake – Using CPU reservations to solve CPU ready
I hope this helps clear up this issue.
Great post. I’d love to see more resource consumers better right size their virtual machines. I sometimes wonder how many shops out there are suffering from high CPU ready simply because their standard template was designed to handle “just about any” workload rather than deploying at lower vCPU assignment and having consumers request increases based on demonstrated need.
Josh, nice article. One other thing worth mentioning in regards to unexplained high CPU ready times would be power management settings. Some manufacturers default to dynamic power settings, basically parking cores and reducing CPU clock speeds, which ESXi is unaware of. I usually hand control over to the hypervisor.
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1018206
Hi Carlos, thanks for the feedback. Power Management settings have been added as a cause, and changing the setting to High Performance as a possible resolution to ready issues.
Cheers
Great post. Also worth adding would be NUMA node considerations, sometimes for workloads such as Oracle/SQL, to better utilize CPUs and/or cores within the same CPU.
Thanks for the comment. I agree NUMA is an important consideration, but it's not directly related to CPU ready / contention. However, since you're interested, I will do a separate post on sizing VMs which will include NUMA node sizing considerations.
Cheers
Awesome article as always. You’re keeping in touch with the common man. You’re still Joshie from the block
Quite often I see low CPU / high CPU ready in our environment.
Trialling vCOPs, and it says we can get up to 15:1 vCPU ratio. I think it's a bit high, but hopefully we can win the political war and right size VMs.
Cheers Dauncy!
vC Ops is awesome, but yeah, I wouldn't push too hard. Gradually increase overcommitment instead.
Best of luck getting a win on the board with the political war, as you put it. It's generally the biggest hurdle to right sizing.
Catch you at Melbourne VMUG?
Another culprit I have seen in the past is high I/O latency causing high CPU ready times without high CPU utilization.
Hi Josh. I have a VM with 2 vCPUs averaging close to 9000 milliseconds of CPU ready, 9% CPU usage and zero core count contention. The cluster the VM is running on has 4% total CPU usage and zero CPU core count contention. Shouldn't CPU core count contention be higher than zero if the VM is experiencing 9000 milliseconds of CPU ready?
Hi Peter, assuming you have no CPU overcommitment, in theory you should have next to zero CPU ready. I have seen this problem before due to Power Management settings, so I would suggest you try setting it to High Performance.
Also, if your VM/s have only 9% CPU usage, I would drop them back to 1 vCPU regardless of whether you have overcommitment or not.
Let me know how it goes for you.
Hi, what if we have VMs that are slow 95% of the time but then require heavy CPU usage the other 5%? So, most of the time 2/4 vCPUs would suffice, but then when they peak they'd need 8/16 vCPUs to keep up. Is there a solution to this issue?
Thanks!
In most cases, if a VM is idle, the number of vCPUs it is assigned is generally not an issue, even in a production environment. Ideally, for VMs that run end of month activities and do zero work during the rest of the month, you could automate the shutdown/startup if for whatever reason there is an impact on your ESXi hosts' performance as a result of CPU ready.