Paravirtualization (see TechOverview) has a number of advantages over full virtualization. Performance is the most obvious current advantage, but there are many more.
Performance
Performance is the best-known advantage of paravirtualization. With paravirtualized device drivers now available for fully virtualized operating systems, this advantage is shrinking over time.
Compared to traditional full virtualization, where the virtualization software emulates a complete computer and runs a completely unmodified guest operating system, paravirtualization still has very significant performance advantages.
CPU time usage measurement
Any virtualized system tends to have multiple virtual machines sharing the system resources. Each of these virtual machines will have its own CPU scheduler, in addition to the hypervisor's scheduler. Each virtual machine can get preempted by the hypervisor in order to have another virtual machine run.
This produces a curious effect: two virtual machines sharing a CPU can each have a process that appears to use 100% of the CPU, even though each of them can really get only 50%.
Usually this is not much of an issue, but in environments where users get billed for the CPU time used, the system really ought to get it right.
Another major concern is performance measurement of applications. On a system with 3 virtual CPUs per physical CPU, you can easily end up with a "fudge factor" of 3, which makes the measurements useless. On the other hand, measuring the application in a non-virtualized environment makes little sense either if it will only ever run in a virtualized environment in production.
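As a rough worked example of the "fudge factor" described above, the sketch below estimates how much CPU time a process really received when the guest cannot see that it was preempted. It assumes that all virtual CPUs sharing the physical CPU are busy and get equal shares; the function name is made up for illustration.

```python
def cpu_time_received(apparent_cpu_seconds, busy_vcpus_per_pcpu):
    """Estimate the CPU time a process actually received.

    Assumption: every virtual CPU sharing the physical CPU is busy and
    the hypervisor gives each of them an equal share.
    """
    if busy_vcpus_per_pcpu < 1:
        raise ValueError("need at least one virtual CPU per physical CPU")
    return apparent_cpu_seconds / busy_vcpus_per_pcpu


# A guest without steal time accounting reports 30 s of CPU use,
# but with 3 busy virtual CPUs per physical CPU it really got 10 s.
print(cpu_time_received(30.0, 3))
```

Real hypervisor schedulers rarely share a CPU this evenly, so the factor varies over time, which is why a fixed correction is not a substitute for proper accounting.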
Steal time
Paravirtualization gets this time accounting right. The time spent waiting for a physical CPU is never billed against a process, allowing for accurate performance measurement even when there is CPU time contention between multiple virtual machines.
The amount of time by which the virtual machine was slowed down due to such CPU time contention is split out as so-called "steal time" in /proc/stat and displayed by tools such as vmstat(8), top(1) and sar(1).
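A minimal sketch of reading steal time directly, assuming a Linux guest whose aggregate "cpu" line in /proc/stat carries the steal counter as its eighth value (after user, nice, system, idle, iowait, irq and softirq). The helper names are made up for illustration, and the one-second sampling interval is arbitrary.

```python
import time

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line of /proc/stat into a dict of tick counts."""
    values = [int(v) for v in line.split()[1:1 + len(FIELDS)]]
    # Very old kernels report fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def read_cpu_times(path="/proc/stat"):
    """Read the aggregate CPU counters from /proc/stat."""
    with open(path) as f:
        for line in f:
            if line.startswith("cpu "):
                return parse_cpu_line(line)
    raise RuntimeError("no aggregate cpu line found in " + path)


def steal_fraction(before, after):
    """Fraction of the elapsed CPU time that was stolen between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    return deltas["steal"] / total if total else 0.0


if __name__ == "__main__":
    first = read_cpu_times()
    time.sleep(1)
    second = read_cpu_times()
    print("steal: {:.1%}".format(steal_fraction(first, second)))
```

The counters are cumulative since boot, so the sketch compares two samples instead of reading a single value.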
Profiling
Xen has an oprofile port called "xenoprof", which allows administrators to accurately measure the CPU use of everything on the system. With full virtualization, you would be limited to profiling code that runs inside the guest virtual machine, with all the timing issues of full virtualization.
Timekeeping
A fully virtualized system, like an OS running on bare hardware, relies on the timer interrupt for its timekeeping. This has two consequences:
- An idle virtual machine still has to process hundreds of interrupts a second.
- Missed interrupts result in unstable time.
The unstable timekeeping has been observed for years by users in the field. Paravirtualization solves this by keeping a stable time in the hypervisor, and having the guest OS query the hypervisor time.
This also allows a paravirtualized guest to receive no timer interrupts at all while it is idle. It simply tells the hypervisor "wake me up in half a second" (or whenever the next scheduled event is) and goes to sleep. Idle guests therefore use far less CPU time, allowing more guests to run on the same physical hardware. Mostly idle guests are very common in, for example, ISP hosting environments, and this tickless behavior ("dynamic ticks") provides a large benefit there.
Having idle virtual machines really idle also allows the hypervisor to put the CPU in power saving mode when nothing is running. A few hundred interrupts per second to a few dozen virtual machines would prevent power saving from working right.
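The difference between a periodic tick and the "wake me up at the next event" approach can be sketched with a small simulation. This is an illustrative model only, not how Xen or Linux implement it: the tick rate of 250 per second and the timer deadlines are assumptions, and both function names are made up.

```python
import heapq


def periodic_wakeups(duration_s, hz):
    """Wakeups an idle guest takes when a timer interrupt fires hz times a second."""
    return int(duration_s * hz)


def tickless_wakeups(duration_s, timer_deadlines):
    """Wakeups an idle guest takes when it sleeps until its next pending timer."""
    pending = [t for t in timer_deadlines if t <= duration_s]
    heapq.heapify(pending)
    wakeups = 0
    while pending:
        now = heapq.heappop(pending)  # "wake me up at this time"
        wakeups += 1
        # Timers that expire at the same instant share one wakeup.
        while pending and pending[0] == now:
            heapq.heappop(pending)
    return wakeups


# Ten idle seconds with an assumed 250 ticks per second, versus three pending
# timers of which two expire together and one lies beyond the window.
print(periodic_wakeups(10, 250))
print(tickless_wakeups(10, [0.5, 0.5, 3.0, 12.0]))
```

Multiplied across a few dozen mostly idle guests, the gap between the two counts is what lets the physical CPU stay in a power saving state.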
SMP scalability
A fully virtualized SMP guest expects its virtual CPUs to be able to communicate with each other immediately at all times. A paravirtualized guest knows that this is not always possible, because the hypervisor also schedules other guests. The Xen paravirtualized kernel has several optimizations that help the SMP scalability of virtual machines.
Memory resizing
In order to fit more virtual machines on a physical system, it is possible to simply reduce the amount of memory each virtual machine gets. However, sometimes a workload simply needs more memory for a period of time.
Paravirtualized Xen kernels have the ability to dynamically resize the memory of virtual machines on the fly. This is not possible with fully virtualized kernels.
CPU hotplug
The number of virtual CPUs assigned to Xen guest kernels can be varied on the fly. Currently, Linux allows the number of CPUs to be reduced below the number the virtual machine booted with and raised again, but not beyond the boot-time number.
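From inside a Linux guest, the effect of such a change can be observed through sysfs. The sketch below assumes the guest exposes /sys/devices/system/cpu/online, which lists the online CPUs in a form such as "0-3,5"; the function names are made up for illustration.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,5' into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty input means no CPUs listed
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def online_cpus(path="/sys/devices/system/cpu/online"):
    """Return the CPU numbers the guest kernel currently has online."""
    with open(path) as f:
        return parse_cpu_list(f.read())


if __name__ == "__main__":
    print(len(online_cpus()), "CPUs online")
```

Running this before and after the virtual CPU count is changed shows whether the guest kernel has picked up the new number.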
What about mainframes?
Don't mainframes use full virtualization?
Well, not exactly. Mainframes have CPU steal time accounting, no clock tick on idle and extensive device hotplugging support for virtual machines. In short, mainframes use all the same technologies that are commonly grouped under the "paravirtualization" category.