[FoRK] Purpose-built unikernels and further specializations of VMs
meltsner at alum.mit.edu
Wed Mar 12 14:46:13 PDT 2014
It'd be cool if this worked really well with the Docker tools mentioned by
sdw. OCaml + a componentized kernel and build system, I guess.

The Compiler Solution

This problem has received a lot of thought at the University of Cambridge,
both at the Computer Laboratory (where the Xen hypervisor originated in
2003) and within the Xen Project (custodian of the hypervisor that now
powers the public cloud via companies such as Amazon and Rackspace). The
solution--dubbed *MirageOS*--has its ideas rooted in research concepts that
have been around for decades but are only now viable to deploy at scale
since cloud-computing resources have become more widely available.
The goal of MirageOS is to restructure entire VMs--including all kernel and
user-space code--into more modular components that are flexible, secure,
and reusable in the style of a library operating system. What would the
benefits be if *all* the software layers in an appliance could be compiled
within the same high-level language framework instead of dynamically
assembling them on every boot? First, some background information about
appliances, library operating systems, and type-safe programming languages.
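To make the question concrete, here is a minimal OCaml sketch contrasting a configuration parsed from text at boot with one compiled in as a typed value. It is not taken from MirageOS; the `config` type, `parse_config`, and `static_config` are hypothetical names chosen for illustration.

```ocaml
(* Hypothetical appliance configuration; these names are not from MirageOS. *)
type config = { port : int; doc_root : string }

(* Conventional approach: parse "key=value" lines every time the VM boots.
   Mistakes in the file only surface at run time. *)
let parse_config (lines : string list) : (config, string) result =
  let lookup key =
    List.find_map
      (fun line ->
        match String.index_opt line '=' with
        | Some i when String.sub line 0 i = key ->
            Some (String.sub line (i + 1) (String.length line - i - 1))
        | _ -> None)
      lines
  in
  match lookup "port", lookup "doc_root" with
  | Some p, Some d -> (
      match int_of_string_opt p with
      | Some port -> Ok { port; doc_root = d }
      | None -> Error ("port is not an integer: " ^ p))
  | None, _ -> Error "missing port"
  | _, None -> Error "missing doc_root"

(* Compiled-in approach: the configuration is an ordinary typed value,
   so a missing field or a non-integer port is rejected by the compiler. *)
let static_config : config = { port = 8080; doc_root = "/site" }
```

In the compiled-in form there is no parsing step left to run at boot, which is the kind of specialization the question above is pointing at.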

The Shift to Single-Purpose Appliances

A typical VM running on the cloud today contains a full operating-system
image: a kernel such as Linux or Windows hosting a primary application
running in user space (e.g., MySQL or Apache), along with secondary
services (e.g., syslog or NTP) running concurrently. The generic software
within each VM is initialized every time the VM is booted by reading
configuration files from storage.
Despite containing many flexible layers of software, most deployed VMs
ultimately perform a single function such as acting as a database or Web
server. The shift toward single-purpose VMs is a reflection of just how
easy it has become to deploy a new virtual computer on demand. Even a
decade ago, it would have taken more time and money to deploy a single
(physical) machine instance, so the single machine would need to run
multiple end-user applications and therefore be carefully configured to
isolate the constituent services and users from each other.
The software layers that form a VM haven't yet caught up to this trend, and
this represents a real opportunity for optimization--not only in terms of
performance by adapting the appliance to its task, but also for improving
security by eliminating redundant functionality and reducing the attack
surface of services running on the public cloud. Doing so statically is a
challenge, however, because of the structure of existing operating systems.

Limitations of Current Operating Systems

The modern hypervisor provides a resource abstraction that can be scaled
dynamically--both vertically by adding memory and cores, and horizontally
by spawning more VMs. Many applications and operating systems can't fully
utilize this capability since they were designed before modern hypervisors
came about (and the physical analogs such as memory hotplug were never
ubiquitous in commodity hardware). Often, external application-level load
balancers are added to traditional applications running in VMs in order to
make the service respond *elastically* by spawning new VMs when load
increases. Traditional systems, however, are not optimized for size or boot
time (Windows might apply a number of patches at boot time, for example),
so the load balancer must compensate by keeping idle VMs around to deal
with load spikes, wasting resources and money.
Why couldn't these problems with operating systems simply be fixed? Modern
operating systems are intended to remain resolutely general-purpose to
solve problems for a wide audience. For example, Linux runs on an
incredibly diverse set of platforms, from low-power mobile devices to
high-end servers powering vast data centers. Compromising this flexibility
simply to help one class of users improve application performance would not
be appropriate.
On the other hand, a specialized server appliance no longer requires an OS
to act as a resource multiplexer since the hypervisor can do this at a
lower level. One obvious problem with this approach is that most existing
code presumes the existence of large but rather calcified interfaces such
as POSIX or the Win32 API. Another potential problem is that conventional
operating systems provide services such as a TCP/IP stack for communication
and a file-system interface for storing persistent data: in our brave new
world, where would these come from?
The MirageOS architecture--dubbed *unikernels*--is outlined in figure 1.
Unikernels are specialized OS kernels that are written in a high-level
language and act as individual software components. A full application (or
*appliance*) consists of a set of running unikernels working together as a
distributed system. MirageOS is based on the OCaml (http://ocaml.org)
language and emits unikernels that run on the Xen hypervisor. To explain
how it works, let's look at a radical operating-system architecture from
the 1990s that was clearly ahead of its time. ....
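A rough illustration of the library-operating-system idea in OCaml: the application is written as a functor over the services it needs, and a library implementation of those services is linked in at compile time. The module names (`STORE`, `Web_server`, `Static_store`, `Appliance`) are invented for this sketch and are not the MirageOS API.

```ocaml
(* All names here are hypothetical; this is not the MirageOS API. *)

(* What the application needs from the "operating system". *)
module type STORE = sig
  val read : string -> string option
end

(* The application logic is a functor over the services it depends on. *)
module Web_server (S : STORE) = struct
  let handle path =
    match S.read path with
    | Some body -> (200, body)
    | None -> (404, "not found")
end

(* One possible library implementation: a read-only store built into the image. *)
module Static_store : STORE = struct
  let files = [ ("/index.html", "<h1>hello</h1>") ]
  let read path = List.assoc_opt path files
end

(* The pieces are combined at compile time, not assembled at boot. *)
module Appliance = Web_server (Static_store)

let () =
  let status, body = Appliance.handle "/index.html" in
  Printf.printf "%d %s\n" status body
```

Swapping `Static_store` for another module with the same signature would change where the data comes from without touching `Web_server`, and the compiler checks that the replacement really provides `read`.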
After 30+ years of email, I have used up my supply of clever .sig material.