Not clear on what happens when you assign CPUs to a virtual machine? The good news is that you are not alone! The even better news is that you are about to get an explanation!
The first thing you must know is that physical processors are never assigned to specific virtual machines. Assigning two virtual CPUs to a virtual machine does not mean that Hyper-V reserves two cores out of the physical pool and permanently assigns them to that virtual machine. That's not how any hypervisor works; at the very least, it's not how Hyper-V works.
Take a look at a screen shot of my Task Manager screen.
Back when computers didn't come with multiple CPUs or CPUs with multiple cores, we knew that they couldn't really multitask. There was one CPU with one core, so only one thread of execution was possible at a time.
Task Manager then looked pretty much like Task Manager now. You had a long list of running processes, each with a metric indicating what percentage of the CPU's time it was using. Each line item you see is a process (in recent Task Manager versions, a process group).
A process might consist of one or many threads. A thread is nothing more than a sequence of CPU instructions.
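To make that concrete, here's a minimal Python sketch (purely illustrative, nothing Hyper-V-specific): one process starting a few threads, each of which is just a sequence of instructions for a CPU to execute.

```python
# One process (this script) containing several threads. Each thread is nothing more
# than a sequence of instructions; here, that sequence is a simple counting loop.
import threading

def count_to(n):
    total = 0
    for i in range(n):
        total += i
    print(f"{threading.current_thread().name} finished with total={total}")

workers = [
    threading.Thread(target=count_to, args=(1_000_000,), name=f"worker-{i}")
    for i in range(3)
]
for t in workers:
    t.start()
for t in workers:
    t.join()
```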
What happens is that the operating system stops a running thread, preserves its state, and then starts another thread. After a bit of time, it stops that thread, preserves its state, and starts yet another, and so on. All of this is pre-emptive, which means that the operating system decides when a new thread will run. A thread can beg for more CPU time, and priorities can be set that affect where a process goes in line, but the OS is in charge of thread scheduling.
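If it helps to see the idea in code, here's a toy model of that pre-emptive, round-robin behavior. The thread names, instruction counts, and time slice are all invented; a real scheduler is far more sophisticated, but the stop/preserve-state/resume loop is the essence.

```python
# Toy pre-emptive scheduler: run each "thread" for a fixed slice, preserve its state,
# and move on to the next one. Generators stand in for threads because pausing a
# generator preserves exactly where it left off.
from collections import deque

def make_thread(name, instructions):
    def run():
        for step in range(instructions):
            yield f"{name}: instruction {step}"
    return run()

ready_queue = deque([make_thread("A", 5), make_thread("B", 3), make_thread("C", 4)])
TIME_SLICE = 2   # instructions a thread may run before the scheduler pre-empts it

while ready_queue:
    thread = ready_queue.popleft()       # the scheduler picks the next thread in line
    for _ in range(TIME_SLICE):
        try:
            print(next(thread))          # execute one instruction
        except StopIteration:
            break                        # this thread is finished; don't requeue it
    else:
        ready_queue.append(thread)       # pre-empted: state preserved, back of the line
```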
The only difference today is that we have multiple CPUs with multiple cores in practically every system (and don’t forget about hyper-threading in Intel processors), so operating systems can actually multi-task now.
Because of its role as a thread manager, Windows can be called a “supervisor” (very old terminology):
“A supervisory program or supervisor is a computer program, usually part of an operating system, that controls the execution of other routines and regulates work scheduling, input/output operations, error actions, and similar functions and regulates the flow of work in a data processing system.”
Hyper-V is a hypervisor:
“A system that manages supervisors that manage processes that are made up of threads.”
Task Manager doesn’t work the same way for Hyper-V, but the same thing is going on. There is a list of partitions, and inside those partitions are processes and threads. The thread scheduler works pretty much the same way.
The image presented above is necessarily an oversimplification, as it’s not simple first-in-first-out (NUMA plays a role). There are always going to be a lot more than just nine threads going at any given time. They’ll be queued up in the thread scheduler.
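If a rough sketch helps, here's the hierarchy in a few lines of Python, with made-up partition, process, and thread counts rather than real Hyper-V objects:

```python
# Made-up counts, purely illustrative: partitions hold processes, processes hold threads,
# and the scheduler nearly always has far more runnable threads than physical cores.
partitions = {
    "Root Partition": {"vmms.exe": 3, "vmcompute.exe": 2},
    "Guest VM 1":     {"sqlservr.exe": 12, "explorer.exe": 4},
    "Guest VM 2":     {"w3wp.exe": 8},
}
physical_cores = 4

total_threads = sum(
    count
    for processes in partitions.values()
    for count in processes.values()
)
print(f"{total_threads} schedulable threads, {physical_cores} cores to run them on")
# 29 schedulable threads, 4 cores to run them on; the rest wait in the queue
```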
We know that we can affinitize threads in Windows so that they always run on a particular core or set of cores. As far as I know there’s no way to do that in Hyper-V with vCPUs. Dedicating a thread to a core is not the same thing as dedicating a core to a thread. You can’t prevent a core from running other threads in the Windows world, which is what many people really want to try to do.
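As an example, process-level affinity can be set from Python with the third-party psutil package (per-thread pinning goes through the Win32 SetThreadAffinityMask API instead). The core numbers here are assumptions for illustration, and note what this does and doesn't do: it restricts where the process may run, it doesn't reserve those cores for it.

```python
# Minimal sketch: restrict the current process to cores 0 and 1 (assumes they exist).
# Requires the third-party psutil package (pip install psutil). This limits where
# these threads may run; it does NOT stop other threads from using those same cores.
import psutil

proc = psutil.Process()                      # the current process
print("Allowed cores before:", proc.cpu_affinity())

proc.cpu_affinity([0, 1])                    # pin to cores 0 and 1
print("Allowed cores after:", proc.cpu_affinity())
```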
Hyper-V makes the scheduling decision at the hypervisor level, and it doesn't really give the guests any input into it. Guest operating systems decide which of their own threads they want to run.
Really understanding this topic requires a fairly deep dive into some complex ideas, and that level of depth is not really necessary for most administrators.
Affinity aside, you never know where any given thread is actually going to execute. A thread that was paused to yield CPU time to another thread may, when it’s resumed, be assigned to another core.
When you see an application consuming exactly 50% of a dual-core system, with each core apparently running at 50% usage, that indicates a single-threaded application. Each time it is scheduled, it consumes 100% of whichever core it lands on. The next time it's scheduled, it may go to the other core and consume 100% there. When the performance is aggregated for Task Manager, that averages out to an even 50% utilization for the app. Since the cores are handing the thread off at each scheduling event, and each is mostly idle while the other is running that app, each also averages out to 50% utilization for the measured time period. If you could shrink the measurement period down to individual time slices, you'd actually see each core spiking to 100% and dropping to 0% (or to whatever the other threads are using) in an alternating pattern.
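You can model that averaging effect with a few lines of arithmetic; the numbers below are made up rather than real performance counter data.

```python
# A back-of-the-envelope model of the effect: one thread alternates between two cores,
# saturating whichever core it lands on, yet each core averages 50% over the interval.
slices = 10                       # scheduling events in the measured interval
core0, core1 = [], []

for event in range(slices):
    if event % 2 == 0:            # the thread lands on core 0 this slice
        core0.append(100)
        core1.append(0)
    else:                         # next slice it lands on core 1
        core0.append(0)
        core1.append(100)

print("core 0 average:", sum(core0) / slices, "%")                    # 50.0 %
print("core 1 average:", sum(core1) / slices, "%")                    # 50.0 %
print("app average:", (sum(core0) + sum(core1)) / (2 * slices), "%")  # 50.0 %
```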
What we're really concerned with is the number of vCPUs assigned to a virtual machine and the priority those vCPUs are given.
You can't assign more vCPUs to a virtual machine than your host has physical cores. So, a virtual machine's vCPU count sets the maximum number of its threads that can be running on physical cores at any given moment. Here, I can't set that virtual machine to have more than four vCPUs because the host only has four; therefore, there is nowhere for a fifth thread to be scheduled. If I had more cores in my system and left this VM at 4 vCPUs, it would still only ever send a maximum of four threads up to the hypervisor for scheduling. Other threads are kept in the guest's thread scheduler (the supervisor), waiting their turn.
This is no different from the fact that we've got 50+ processes running on a quad-core laptop. We can't actually run more than four threads at a time, but we're always going to have far more than four threads scheduled. Windows is so good at this (most of the time) that most people don't even pause to consider what's going on. Your VMs (supervisors) will bubble up threads to run, and Hyper-V (the hypervisor) will schedule them (mostly) the same way that Windows has been scheduling threads ever since it outgrew cooperative scheduling in Windows 3.x.
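A loose analogy in code, if you want one: a pool with four worker threads draining a queue of fifty jobs. Nothing here is Hyper-V-specific; it just shows "many runnable, few running at once."

```python
# A loose analogy, not Hyper-V code: fifty jobs are queued, but only four worker
# threads ever exist, so no more than four jobs run at any given moment.
from concurrent.futures import ThreadPoolExecutor
import time

def task(n):
    time.sleep(0.01)          # stand-in for a short burst of work
    return n

with ThreadPoolExecutor(max_workers=4) as pool:   # only four threads ever exist
    results = list(pool.map(task, range(50)))     # the other jobs wait their turn

print(f"Completed {len(results)} jobs with at most 4 running at any moment")
```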
So, how many vCPUs per physical core is the right ratio? In the generic sense, that question has no answer. Some people say 1:1. You can do that, but it's wasteful. I could run my current desktop configuration on a 64-core server and wouldn't see much performance difference, because almost all of my threads sit idle almost all of the time. If something needs 0% CPU time, giving it its own core accomplishes nothing. Later, the answer was upgraded to 12 vCPUs per 1 physical core, and then the recommendations went away altogether.
It was probably a reasonable guideline, but think about it. For the most part, running threads will be distributed fairly evenly across whatever hardware is available, so the number of physical CPUs you need doesn't depend on how many virtual CPUs there are; it depends entirely on what the running threads need. And even if you've got a bunch of heavy threads going, that doesn't mean their systems will die as they get pre-empted by other heavy threads. It really depends on how many other heavy threads they have to wait for.
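Here's a back-of-the-envelope illustration with invented numbers: the physical capacity you need tracks what the threads actually demand, not the vCPU count you handed out.

```python
# Back-of-the-envelope only; every number here is invented. The point: the physical
# capacity you need tracks actual thread demand, not the vCPU count you assigned.
vms = [
    {"name": "web-01",  "vcpus": 4, "avg_cpu_demand": 0.05},   # mostly idle
    {"name": "web-02",  "vcpus": 4, "avg_cpu_demand": 0.05},
    {"name": "sql-01",  "vcpus": 8, "avg_cpu_demand": 0.60},   # genuinely busy guest
    {"name": "file-01", "vcpus": 2, "avg_cpu_demand": 0.02},
]

total_vcpus = sum(vm["vcpus"] for vm in vms)
cores_of_demand = sum(vm["vcpus"] * vm["avg_cpu_demand"] for vm in vms)

print(f"{total_vcpus} vCPUs assigned, roughly {cores_of_demand:.1f} cores of steady demand")
# Prints: 18 vCPUs assigned, roughly 5.2 cores of steady demand
```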
The CPU is a binary device; it's either processing or it isn't. Here's the kicker: every single time a thread runs, no matter what it is, it drives the CPU at 100% (power throttling changes the clock speed, not workload saturation). The 100% or 50% or whatever number you see is entirely a function of the measurement interval. 100% means that the CPU was completely active across the measured span of time; 50% means it was running a thread half of that time and sitting idle the other half.
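The arithmetic is as simple as it sounds; the numbers below are hypothetical.

```python
# Utilization is just busy time divided by the measurement window (hypothetical numbers).
interval_ms = 1000   # the span the counter averaged over
busy_ms = 500        # time the core actually spent executing threads at full tilt

utilization = busy_ms / interval_ms * 100
print(f"{utilization:.0f}% busy: flat-out for {busy_ms} ms, idle for the other {interval_ms - busy_ms} ms")
```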
What this means is that a single thread can't actually consume 100% of the CPU in the way people think it can, because Windows/Hyper-V will pre-empt it when it's another thread's turn. You can actually have multiple "100%" CPU threads running on the same system.
The problem is that a normally responsive system expects some idle time, mean