This chapter will take a quick look at the inside of a UML. I will concentrate
on the relationship between the UML and the host. For many
people, encountering a virtual machine for the first time can be confusing
because it may not be clear where the host ends and the virtual
machine starts.
For example, the virtual machine obviously is part of the host
since it can’t exist without the host. However, it is totally separate from
the host in other ways. You can be root inside the UML and have no
privileges[1] whatsoever on the host. When UML is run, it is provided
some host resources to use as its own. The root user within UML has
absolute control over those, but no control, not even access, to anything
else on the host. It’s this extremely sharp distinction between what the
UML has access to and what it doesn’t that makes UML useful for a
large number of applications.
[1] In order to run a process, you obviously need some level of privilege on the
system. However, a UML host can be set up such that the user that owns
the UML processes on the host can do nothing but run the UML process.

A second common source of confusion is the duality of UML. It is
both a Linux kernel and a Linux process. It is useful, and instructive,
to look at UML from both perspectives. However, to many people, a kernel
and a process are two completely different things, and there can be
no overlap between them. So, we will look at a UML from both inside
and outside, on the host, in order to compare the two views to each
other. We will see different views of the same things. They will look different
but will both be correct in their own ways. Hopefully, by the end
of the chapter, it will be clear how something can be both a Linux kernel
and a Linux process.
Figure 2.1 shows the relationship among a UML instance, the host
kernel, and UML processes. To the host kernel, the UML instance is a
normal process. To the UML processes, the UML instance is a kernel.
Processes interact with the kernel by making system calls, which are
like procedure calls except that they request the kernel do something
on their behalf.


Like all other processes on the host, UML makes system calls to
the host kernel in order to do its work. Unlike the other host processes,
UML has its own system call interface for its processes to use. This is
the source of the duality of UML. It makes system calls to the host,
which makes it a process, and it implements system calls for its own
processes, making it a kernel.
Let’s take a look at the UML binary, which is normally called
linux:
host% ls -l linux
-rwxrw-rw- 2 jdike jdike 23346513 Jan 27 12:16 linux
Figure 2.1 UML as both a process and a kernel (diagram: the host kernel runs on the hardware; ls, ps, and the UML instance make system calls to the host kernel, while the ls and ps inside UML make their system calls to the UML instance)

This is a normal Linux ELF binary, as you can see by running file on
it:
host% file linux
linux: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), \
for GNU/Linux 2.2.5, statically linked, not stripped
It is also a Linux kernel, so it may be instructive to compare it to the
kernel running on this machine:
host% ls -l /boot/vmlinuz-2.4.26
-rw-r--r-- 1 root root 945800 Sep 18 17:12 /boot/vmlinuz-2.4.26
The UML binary is quite a bit larger than the kernel on the host,
but it has a full symbol table, as you can see from the output of file
above. So, let’s strip it and see what that does:
host% strip linux
host% ls -l linux
-rwxrw-rw- 2 jdike jdike 2236936 Jan 27 15:01 linux
It’s a bit more than twice as large as the host kernel, possibly
because the configurations are different. I tend to build into UML
options that the host builds as modules. Checking this by adding up the
sizes of the modules loaded on the host yields this:
host% lsmod
Module                  Size  Used by
usblp                  17473  0
parport_pc             31749  1
lp                     16713  0
parport                39561  2 parport_pc,lp
autofs4                23493  2
sunrpc                145541  1
...
host% lsmod | awk '{n += $2} END {print n}'
1147092
Adding that to the file size of vmlinuz-2.4.26 gives us something
close to the size of the UML binary after the symbol table has
been stripped off.
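The arithmetic can be checked directly. A quick sketch with shell arithmetic, using the figures from the transcripts above:

```shell
# Numbers taken from the transcripts above: stripped UML binary versus
# the host kernel image plus the summed sizes of its loaded modules.
host_kernel=945800      # /boot/vmlinuz-2.4.26, in bytes
modules=1147092         # total from the lsmod/awk pipeline
uml_stripped=2236936    # the stripped linux binary

echo $(( host_kernel + modules ))                  # 2092892
echo $(( uml_stripped - host_kernel - modules ))   # 144044 bytes left over
```

The totals don't match exactly, but they are close enough to support the point: most of the size difference is configuration, not overhead.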
What is the point of this comparison? It is to introduce the fact
that UML is both a Linux kernel and a Linux process. As a Linux process,
it can be run just like any other executable on the system, such as
bash or ls.

BOOTING UML FOR THE FIRST TIME
Let’s boot UML now:
host% ./linux
Checking for /proc/mm...not found
Checking for the skas3 patch in the host...not found
Checking PROT_EXEC mmap in /tmp...OK
Linux version 2.6.11-rc1-mm1 (jdike@tp.user-mode-linux.org) (gcc version 3.3.2
20031022 (Red Hat Linux 3.3.2-1)) #83 Thu Jan 27 12:16:00 EST 2005
Built 1 zonelists
Kernel command line: root=98:0
PID hash table entries: 256 (order: 8, 4096 bytes)
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 29368k available
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
Checking for host processor cmov support...Yes
Checking for host processor xmm support...No
Checking that ptrace can change system call numbers...OK
Checking syscall emulation patch for ptrace...missing
Checking that host ptys support output SIGIO...Yes
Checking that host ptys support SIGIO on close...No, enabling workaround
Checking for /dev/anon on the host...Not available (open failed with errno 2)
NET: Registered protocol family 16
mconsole (version 2) initialized on /home/jdike/.uml/3m3vDd/mconsole
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
NET: Registered protocol family 2
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP established hash table entries: 2048 (order: 2, 16384 bytes)
TCP bind hash table entries: 2048 (order: 1, 8192 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
NET: Registered protocol family 1
NET: Registered protocol family 17
Initialized stdio console driver
Console initialized on /dev/tty0
Initializing software serial port version 1
VFS: Waiting 19sec for root device...
VFS: Waiting 18sec for root device...
VFS: Waiting 17sec for root device...
VFS: Waiting 16sec for root device...
Figure 2.2 Output from the first boot of UML

VFS: Waiting 15sec for root device...
VFS: Waiting 14sec for root device...
VFS: Waiting 13sec for root device...
VFS: Waiting 12sec for root device...
VFS: Waiting 11sec for root device...
VFS: Waiting 10sec for root device...
VFS: Waiting 9sec for root device...
VFS: Waiting 8sec for root device...
VFS: Waiting 7sec for root device...
VFS: Waiting 6sec for root device...
VFS: Waiting 5sec for root device...
VFS: Waiting 4sec for root device...
VFS: Waiting 3sec for root device...
VFS: Waiting 2sec for root device...
VFS: Waiting 1sec for root device...
VFS: Cannot open root device "98:0" or unknown-block(98,0)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(98,0)
EIP: 0023:[<a015a751>] CPU: 0 Not tainted ESP: 002b:40001fa0 EFLAGS: 00000206
Not tainted
EAX: 00000000 EBX: 00002146 ECX: 00000013 EDX: 00002146
ESI: 00002145 EDI: 00000000 EBP: 40001fbc DS: 002b ES: 002b
Call Trace:
a0863af0: [<a0030446>] printk+0x12/0x14
a0863b00: [<a003ff32>] notifier_call_chain+0x22/0x40
a0863b30: [<a002f9f2>] panic+0x56/0x108
a0863b40: [<a003c0f6>] msleep+0x42/0x4c
a0863b50: [<a0002d96>] mount_block_root+0xd6/0x188
a0863bb0: [<a0002e9c>] mount_root+0x54/0x5c
a0863bc0: [<a0002f07>] prepare_namespace+0x63/0xa8
a0863bd0: [<a0002ebb>] prepare_namespace+0x17/0xa8
a0863bd4: [<a000e190>] init+0x0/0x108
a0863be4: [<a000e190>] init+0x0/0x108
a0863bf0: [<a000e291>] init+0x101/0x108
a0863c00: [<a0027131>] run_kernel_thread+0x39/0x40
a0863c18: [<a000e190>] init+0x0/0x108
a0863c28: [<a0027117>] run_kernel_thread+0x1f/0x40
a0863c50: [<a0013211>] unblock_signals+0xd/0x10
a0863c70: [<a002c51c>] finish_task_switch+0x24/0xa4
a0863c84: [<a000e190>] init+0x0/0x108
a0863c90: [<a002c5ad>] schedule_tail+0x11/0x124
a0863cc4: [<a000e190>] init+0x0/0x108
a0863cd0: [<a001ad58>] new_thread_handler+0xb0/0x104
a0863cd4: [<a000e190>] init+0x0/0x108
a0863d20: [<a015a508>] __restore+0x0/0x8
a0863d60: [<a015a751>] kill+0x11/0x20
Figure 2.2 Output from the first boot of UML (continued)

Notice two obvious things about the results, shown in Figure 2.2.
1. The output resembles the boot output of a normal Linux machine.
2. The boot was not very successful, as you can see from the panic
and stack dump at the end.
It’s worth comparing this to the boot output of a Linux system,
which is normally available by running dmesg. You’ll see a lot of similarities—
many of the messages, such as the ones from the filesystem
and network subsystems, are identical. Many of the rest are totally different,
although they should seem similar in purpose. This is largely
due to hardware drivers initializing. UML doesn’t have the same hardware
or drivers as the host, so their bootup messages will be different.
If you have access to Linux on several different architectures, such as
x86 and x86_64 or ppc, you’ll see the same sorts of differences between
their boot output. In fact, this is a very apt comparison because UML is
a different architecture from the Linux kernel running on the host.
Let’s look at the output in more detail.
Checking for /proc/mm...not found
Checking for the skas3 patch in the host...not found
Checking PROT_EXEC mmap in /tmp...OK
These are checking the environment on the host to see if it can run
at all (the executable /tmp check) and whether the host kernel has
capabilities that allow UML to run more efficiently. You’ll see more of
this below, but these particular checks need to be done very early.
Checking for host processor cmov support...Yes
Checking for host processor xmm support...No
Checking that ptrace can change system call numbers...OK
These are checking some more capabilities of the host. The first
two are checking processor capabilities, and the last is checking
whether the host has a feature that’s absolutely needed for UML to run
(which all modern hosts do).
mconsole (version 2) initialized on /home/jdike/.uml/3m3vDd/mconsole
...
Initialized stdio console driver
...
Initializing software serial port version 1
Here, UML is initializing its drivers. A UML boot has much less
output of this sort compared with a boot of a physical Linux system.

This is because UML uses resources on the host to support its virtual
hardware, and there are many fewer types of these resources than
there are different types of devices on a physical system. For example,
every possible sort of block device within UML can be accessed as a
host file, so block devices require a single UML driver. In contrast, the
host has multitudes of block drivers, for IDE disks, SCSI disks, SATA
disks, and so on. Because of the uniform interface provided by the host,
UML requires many fewer drivers in order to access these devices and
the data on them.
The first driver is the mconsole[2] driver, which allows a UML to be
controlled and managed from the host. This has no hardware equivalent
on most Linux systems. The last two are the console and serial line
drivers, which obviously do have hardware equivalents, except that the
UML drivers will communicate using virtual devices such as pseudoterminals
rather than physical devices such as a graphics card or serial
line.
VFS: Waiting 1sec for root device...
VFS: Cannot open root device "98:0" or unknown-block(98,0)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on \
unknown-block(98,0)
Here is the panic that killed off this attempted run of UML. The
problem is that we didn’t provide UML with a root device, so it couldn’t
mount its root filesystem. This is fatal and causes the panic and the
stack trace. You can make a physical Linux machine do exactly the
same thing by putting a bogus root= option on the kernel command
line using LILO or GRUB.[3]
Finally, an important point is that we just panicked a UML kernel,
and the only result was that we were dropped back to the shell prompt.
The host system itself, and everything else on the system, was totally
unaffected by the crash. This demonstrates the basis of many of the
advantages of UML over a physical system—it can be used in ways
that may cause system crashes or other software malfunctions, but the
damage is limited to the virtual machine. As we will see later, even this
damage can be undone quite easily.
[2] MConsole stands for “Management Console” and is a mechanism for controlling
and monitoring a UML instance from the host.
[3] UML needs no bootloader like the host needs LILO or GRUB. As it is run from
the command line, you can think of the host as being the UML bootloader.
That may have been interesting, but not very useful. Now, we will
boot UML successfully and see how it looks inside.
BOOTING UML SUCCESSFULLY
The problem was that we didn’t tell UML what its root device was. This
is an important special case of a more general property of UML—its
hardware is configured on the fly. In contrast to a physical system,
whose hardware is fixed, a virtual system can be different every time it
is booted. So, it expects to be told, either on the command line or later
via the mconsole interface, what hardware it possesses.
Here, we will configure UML on the command line. The first order
of business is to give it a proper root device so that it has something it
can boot. As I mentioned earlier, UML devices are virtual and constructed
from host resources. Specifically, UML’s disks are generally
(but not always, as we will see later) files in the host’s filesystem.
For example, here is the filesystem we will boot:
host% ls -l ~/roots/debian_22
-rw-rw-r-- 1 jdike jdike 1074790400 Jan 27 18:31 \
/home/jdike/roots/debian_22
One obvious thing here is that the filesystem image is very large.
file will tell us a bit more about it:
host% file ~/roots/debian_22
/home/jdike/roots/debian_22: Linux rev 1.0 ext2 filesystem data
This tells us that the data in this file is an ext2 filesystem image.
In other words, we can loopback-mount it and see that it contains a full
filesystem:
host# mount ~/roots/debian_22 ~/mnt -o loop
host% ls ~/mnt
bfs boot dev floppy initrd lib mnt root tmp var
bin cdrom etc home kernel lost+found proc sbin usr
In fact, when mounting this as its root filesystem, UML will do
something very similar to a loopback mount. The UML block driver
operates by calling read and write on this file on the host, analogous to
a block driver on the host doing reads and writes on a physical disk.

The loopback driver on the host is doing exactly the same thing, except
from within the host kernel, rather than from a process, where the
UML block driver is.
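The driver's I/O thus boils down to plain reads and writes at byte offsets within the backing file. A minimal sketch of that "sector" arithmetic, using dd on a scratch image (the file name is illustrative, not from the text):

```shell
# Create an 8-"sector" scratch image, write sector 3, then read it back --
# the same kind of offset I/O the ubd driver performs on its backing file.
dd if=/dev/zero of=disk.img bs=512 count=8 2>/dev/null
printf 'sector three' | dd of=disk.img bs=512 seek=3 conv=notrunc 2>/dev/null
dd if=disk.img bs=512 skip=3 count=1 2>/dev/null | head -c 12   # prints "sector three"
rm disk.img
```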
So, in order to provide this file to UML as its root device, we need
to tell the UML block driver (the ubd or UML Block Device driver) to
attach itself to it. This is done with this option:
ubda=~/roots/debian_22
This is the easiest way to initialize a UML block device, and it simply
says that the first UML block device is to be attached to the file ~/roots/
debian_22. Internally, UML tells the kernel initialization code to use
the ubda device as its default root device (this can be overridden by
specifying a different device with the root= switch, as the panic message
suggested).
I’m going to add one more option to the command line to make the
virtual machine’s configuration more explicit:
mem=128M
This makes UML believe it has 128MB of physical memory but
does not actually allocate 128MB on the host. Rather, this creates a
128MB sparse file on the host. Being sparse, this file will occupy very
little space until data starts being written to it. As the UML instance
uses its memory, it will start putting data in the memory backed by
this file. As that happens, the host will start allocating memory to hold
that data. Since the file is fixed in size, the UML instance is limited to
that amount of memory. Its memory consumption will approach this
limit asymptotically as it reads file data from its own disks and caches
it in its memory.
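The effect of a sparse file is easy to see on the host. A quick sketch (the file name here is made up; UML's actual backing file is the deleted /tmp file discussed later in this chapter):

```shell
# Create a 128MB sparse file: seek to the end without writing any data.
dd if=/dev/zero of=mem.sparse bs=1 count=0 seek=$((128 * 1024 * 1024)) 2>/dev/null
ls -l mem.sparse    # apparent size: 134217728 bytes
du -k mem.sparse    # blocks actually allocated: essentially zero
rm mem.sparse
```

Only as data is written into the file does the host allocate real disk blocks (or, for tmpfs, real memory) to back it.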
Since the host will be allocating memory for the UML instance
dynamically, as needed, the actual consumption will be less than the
maximum for a time. This conserves memory, making it possible to run
a greater number of not-too-active UML instances than would be possible
otherwise.
The host memory consumption will, in this case, be at most
128MB. Even if the UML instance is fully using its memory, the host
memory consumption may be less, as it may have swapped out some of
the UML memory. The UML instance, like any other process that has
been swapped out, will be unaware of this and will use its memory as
though it is present in the host’s memory. The host kernel is responsible
for managing this paging, just as it is for any other process on the system.

The UML instance will also swap if its workload exceeds its physical
memory. This is entirely independent from the host swapping the
UML instance’s memory. Each system will swap when it needs more
memory, so if the host is short of memory and the UML instance has
plenty, the host will swap and the UML instance won’t. Conversely, if
the UML instance is short of memory and the host isn’t, the UML
instance will swap and the host won’t. The case where both are swapping
at the same time is interesting and can lead to pathological performance
problems.[4]
So, the UML command ends up looking like this:
~/linux mem=128M ubda=/home/jdike/roots/debian_22
Figure 2.3 shows the results.
This is much more interesting than the last attempt. We get to see
the filesystem booting. Note that it’s almost exactly the same as it
would be if the same filesystem were booted on the host. The underlying
virtual machine shows through in only a couple of places. One is
when the root filesystem is checked[5]:
/dev/ubd0: clean, 9591/131328 files, 64611/262144 blocks
where we see the UML device name, /dev/ubd0, rather than hda1 or
sda1 as on a physical machine.
[4] Consider the case where both the host and the UML instance are swapping
at the same time. They may both choose the same page to swap out. If the
host swaps it out first, then when the UML instance swaps it, the host will
need to read it back from disk so that the UML instance can write it to its
own swap device. This will cause the page to be read and written a total of
three times, when only once was desirable. This will increase the I/O load
on the host at a time when it is already under stress. Solutions for this sort
of situation are under investigation and will be described in Chapter 10.
[5] The fsck message refers to /dev/ubd0 rather than /dev/ubda. Devices
can be specified with either numbers or letters. Using letters is generally favored
since it is similar to current practice with other drivers, such as naming
IDE disks hda, hdb, and so on. It also makes the use of multiple ubd
devices within UML less confusing. There’s less expectation that ubdb on
the command line corresponds to minor number 1 inside the UML instance,
as the use of ubd1 does. In fact, ubdb has minor number 16 (to allow for partitions
on ubda). The one case where numbers are needed is when you are
plugging a large number of disks into a UML instance. There is no letter
equivalent of ubd512, so you’d have to use a number to describe this device.
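The letter-to-minor mapping the footnote describes can be sketched as a bit of shell arithmetic (assuming 16 minors per disk, one whole-disk device plus room for its partitions):

```shell
# ubda -> minor 0, ubdb -> minor 16, ubdc -> minor 32, ...
for d in a b c; do
  idx=$(( $(printf '%d' "'$d") - 97 ))   # letter position: a=0, b=1, ...
  echo "ubd$d -> minor $(( idx * 16 ))"
done
```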

~/linux/2.6/2.6.10 22849: ./linux mem=128M ubda=/home/jdike/roots/debian_22
Checking for /proc/mm...not found
Checking for the skas3 patch in the host...not found
Checking PROT_EXEC mmap in /tmp...OK
Linux version 2.6.11-rc1-mm1 (jdike@tp.user-mode-linux.org) (gcc version 3.3.2
20031022 (Red Hat Linux 3.3.2-1)) #83 Thu Jan 27 12:16:00 EST 2005
Built 1 zonelists
Kernel command line: mem=128M ubda=/home/jdike/roots/debian_22 root=98:0
PID hash table entries: 1024 (order: 10, 16384 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 126720k available
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
Checking for host processor cmov support...Yes
Checking for host processor xmm support...No
Checking that ptrace can change system call numbers...OK
Checking syscall emulation patch for ptrace...missing
Checking that host ptys support output SIGIO...Yes
Checking that host ptys support SIGIO on close...No, enabling workaround
Checking for /dev/anon on the host...Not available (open failed with errno 2)
NET: Registered protocol family 16
mconsole (version 2) initialized on /home/jdike/.uml/igpn9r/mconsole
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
NET: Registered protocol family 2
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
NET: Registered protocol family 1
NET: Registered protocol family 17
Initialized stdio console driver
Console initialized on /dev/tty0
Initializing software serial port version 1
ubda: unknown partition table
VFS: Mounted root (ext2 filesystem) readonly.
line_ioctl: tty0: ioctl KDSIGACCEPT called
INIT: version 2.78 booting
Activating swap...
Checking root file system...
Parallelizing fsck version 1.18 (11-Nov-1999)
/dev/ubd0: clean, 9591/131328 files, 64611/262144 blocks
Calculating module dependencies... depmod: get_kernel_syms: Function not
implemented
done.
Loading modules: cat: /etc/modules: No such file or directory
(continues)
Figure 2.3 Output from the first successful boot of UML

modprobe:
Can’t open dependencies file /lib/modules/2.6.11-rc1-mm1/modules.dep
(No such file or directory)
Checking all file systems...
Parallelizing fsck version 1.18 (11-Nov-1999)
Setting kernel variables.
Mounting local filesystems...
mount: devpts already mounted on /dev/pts
none on /tmp type tmpfs (rw)
Setting up IP spoofing protection: rp_filter.
Configuring network interfaces: done.
Setting the System Clock using the Hardware Clock as reference...
line_ioctl: tty1: unknown ioctl: 0x4b50
hwclock is unable to get I/O port access: the iopl(3) call failed.
System Clock set. Local time: Thu Jan 27 18:51:28 EST 2005
Cleaning: /tmp /var/lock /var/run.
Initializing random number generator... done.
Recovering nvi editor sessions... done.
INIT: Entering runlevel: 2
Starting system log daemon: syslogd syslogd: /dev/xconsole: No such file or
directory
klogd.
Starting portmap daemon: portmap.
Starting NFS common utilities: statd lockdlockdsvc: Function not implemented
.
Starting internet superserver: inetd.
Starting MySQL database server: mysqld.
Not starting NFS kernel daemon: No exports.
Starting OpenBSD Secure Shell server: sshd.
Starting web server: apache.
/usr/sbin/apachectl start: httpd started
Debian GNU/Linux 2.2 usermode tty0
usermode login:
Figure 2.3 Output from the first successful boot of UML (continued)

The other is when the boot scripts try to synchronize the internal
kernel clock with the system’s hardware clock:
Setting the System Clock using the Hardware Clock as reference...
line_ioctl: tty1: unknown ioctl: 0x4b50
hwclock is unable to get I/O port access: the iopl(3) call \
failed.
The UML serial line driver is complaining about an ioctl it
doesn’t implement, and the hwclock program inside UML is complaining
that it tried to execute the iopl instruction and failed. These are
both symptoms of hwclock trying different methods of accessing the
hardware system clock and failing because the device doesn’t exist in
UML. The UML kernel does have access to a clock, but it is not one
that hwclock will recognize. Rather, it is simply a call to the host’s
gettimeofday.
After that, you’ll notice that a relatively small number of services
are started, but they do include such things as NFS, MySQL, and
Apache. All of these run just as they would on a physical machine. This
boot process took about 5 seconds on my laptop, demonstrating one of
the conveniences of UML—the ability to quickly create and destroy virtual
machines.
LOOKING AT A UML FROM THE INSIDE AND OUTSIDE
Finally, we’ll see a login prompt. Actually, I see three on my screen.
One is in the xterm window in which I ran UML. The other two are in
xterm windows run by UML in order to hold the second console and the
first serial line, which are configured to have gettys running on them.
We’ll log in as root (using the highly secure default root password of
root that most of my UML filesystems have) and get a shell:
usermode login: root
Password:
Last login: Thu Jan 27 18:51:35 2005 on tty0
Linux usermode 2.6.11-rc1-mm1 #83 Thu Jan 27 12:16:00 EST 2005 \
i686 unknown
usermode:~#
Again, this is identical to what you’d see if you logged in to a physical
machine booted on this filesystem.
Now it’s time to start poking around inside this UML and see
what it looks like. First, we’ll look at what processes are running, as
shown in Figure 2.4.
There’s not much to comment on except the total normality of this
output. What’s interesting here is to look at the host. Figure 2.5 shows
the corresponding processes on the host.
Each of the nameless host processes corresponds to an address
space inside this UML instance. Except for application and kernel
threads, there’s a one-to-one correspondence between UML processes
and these host processes.

usermode:~# ps uax
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.3 1100 464 ? S 19:17 0:00 init [2]
root 2 0.0 0.0 0 0 ? RWN 19:17 0:00 [ksoftirqd/0]
root 3 0.0 0.0 0 0 ? SW< 19:17 0:00 [events/0]
root 4 0.0 0.0 0 0 ? SW< 19:17 0:00 [khelper]
root 5 0.0 0.0 0 0 ? SW< 19:17 0:00 [kthread]
root 6 0.0 0.0 0 0 ? SW< 19:17 0:00 [kblockd/0]
root 7 0.0 0.0 0 0 ? SW 19:17 0:00 [pdflush]
root 8 0.0 0.0 0 0 ? SW 19:17 0:00 [pdflush]
root 10 0.0 0.0 0 0 ? SW< 19:17 0:00 [aio/0]
root 9 0.0 0.0 0 0 ? SW 19:17 0:00 [kswapd0]
root 96 0.0 0.4 1420 624 ? S 19:17 0:00 /sbin/syslogd
root 98 0.0 0.3 1084 408 ? S 19:17 0:00 /sbin/klogd
daemon 102 0.0 0.3 1200 420 ? S 19:17 0:00 /sbin/portmap
root 105 0.0 0.4 1128 548 ? S 19:17 0:00 /sbin/rpc.statd
root 111 0.0 0.4 1376 540 ? S 19:17 0:00 /usr/sbin/inetd
root 120 0.0 0.6 1820 828 ? S 19:17 0:00 /bin/sh /usr/bin/
mysql 133 0.1 1.2 19244 1540 ? S 19:17 0:00 /usr/sbin/mysqld
mysql 135 0.0 1.2 19244 1540 ? S 19:17 0:00 /usr/sbin/mysqld
mysql 136 0.0 1.2 19244 1540 ? S 19:17 0:00 /usr/sbin/mysqld
root 144 0.9 0.9 2616 1224 ? S 19:17 0:00 /usr/sbin/sshd
root 149 0.0 1.0 2588 1288 ? S 19:17 0:00 /usr/sbin/apache
root 152 0.0 0.9 2084 1220 tty0 S 19:17 0:00 -bash
root 153 0.0 0.3 1084 444 tty1 S 19:17 0:00 /sbin/getty 38400
root 154 0.0 0.3 1084 444 tty2 S 19:17 0:00 /sbin/getty 38400
root 155 0.0 0.3 1084 444 ttyS0 S 19:17 0:00 /sbin/getty 38400
www-data 156 0.0 1.0 2600 1284 ? S 19:17 0:00 /usr/sbin/apache
www-data 157 0.0 1.0 2600 1284 ? S 19:17 0:00 /usr/sbin/apache
www-data 158 0.0 1.0 2600 1284 ? S 19:17 0:00 /usr/sbin/apache
www-data 159 0.0 1.0 2600 1284 ? S 19:17 0:00 /usr/sbin/apache
www-data 160 0.0 1.0 2600 1284 ? S 19:17 0:00 /usr/sbin/apache
root 162 2.0 0.5 2384 736 tty0 R 19:17 0:00 ps uax
usermode:~#
Figure 2.4 Output from ps uax inside UML

Notice that the properties of the UML processes and the corresponding
host processes don’t have much in common. All of the host
processes are owned by me, whereas the UML processes have various
owners, including root. The process IDs are totally different, as are the
virtual and resident memory sizes.
This is because the host processes are simply containers for UML
address spaces. All of the properties visible inside UML are maintained
by UML totally separate from the host. For example, the owner of the
host processes will be whoever ran UML. However, many UML processes
will be owned by root. These processes have root privileges

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
jdike 9938 0.1 3.1 131112 16264 pts/3 R 19:17 0:03 ./linux [ps]
jdike 9942 0.0 3.1 131112 16264 pts/3 S 19:17 0:00 ./linux [ps]
jdike 9943 0.0 3.1 131112 16264 pts/3 S 19:17 0:00 ./linux [ps]
jdike 9944 0.0 0.0 472 132 pts/3 T 19:17 0:00
jdike 10036 0.0 0.5 8640 2960 pts/3 S 19:17 0:00 xterm -T Virtual
jdike 10038 0.0 0.0 1368 232 ? S 19:17 0:00 /usr/lib/uml/port
jdike 10039 0.0 1.5 131092 8076 pts/6 S 19:17 0:00 ./linux [hwclock]
jdike 10095 0.0 0.1 632 604 pts/3 T 19:17 0:00
jdike 10099 0.0 0.0 416 352 pts/3 T 19:17 0:00
jdike 10107 0.0 0.0 428 332 pts/3 T 19:17 0:00
jdike 10113 0.0 0.1 556 516 pts/3 T 19:17 0:00
jdike 10126 0.0 0.0 548 508 pts/3 T 19:17 0:00
jdike 10143 0.0 0.0 840 160 pts/3 T 19:17 0:00
jdike 10173 0.0 0.2 1548 1140 pts/3 T 19:17 0:00
jdike 10188 0.0 0.1 1232 780 pts/3 T 19:17 0:00
jdike 10197 0.0 0.1 1296 712 pts/3 T 19:17 0:00
jdike 10205 0.0 0.0 452 452 pts/3 T 19:17 0:00
jdike 10207 0.0 0.0 452 452 pts/3 T 19:17 0:00
jdike 10209 0.0 0.0 452 452 pts/3 T 19:17 0:00
jdike 10210 0.0 0.5 8640 2960 pts/3 S 19:17 0:00 xterm -T Virtual
jdike 10212 0.0 0.0 1368 232 ? S 19:17 0:00 /usr/lib/uml/port
jdike 10213 0.0 2.9 131092 15092 pts/7 S 19:17 0:00 ./linux [/sbin/ge
jdike 10214 0.0 0.1 1292 688 pts/3 T 19:17 0:00
jdike 10215 0.0 0.1 1292 676 pts/3 T 19:17 0:00
jdike 10216 0.0 0.1 1292 676 pts/3 T 19:17 0:00
jdike 10217 0.0 0.1 1292 676 pts/3 T 19:17 0:00
jdike 10218 0.0 0.1 1292 676 pts/3 T 19:17 0:00
jdike 10220 0.0 0.1 1228 552 pts/3 T 19:17 0:00
Figure 2.5 Partial output from ps uax on the host

inside UML, but they have no special privileges on the host. This
important fact means that root can do anything inside UML without
being able to do anything on the host. A user logged in to a UML as
root has no special abilities on the host and, in fact, may not have any
abilities at all on the host.
Now, let’s look at the memory usage information in /proc/meminfo,
shown in Figure 2.6.
The total amount of memory shown, 126796K, is close to the
128MB we specified on the command line. It’s not exactly 128MB
because some memory allocated during early boot isn’t counted in the
total. Going back to the host ps output in Figure 2.5, notice that the
linux processes have a virtual size (the VSZ column) of almost exactly
128MB. The difference of 40K is due to a small amount of memory in
the UML binary, which isn’t counted as part of its physical memory.
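The arithmetic is worth spelling out (a quick check with shell arithmetic, using the VSZ figure from the host ps output):

```shell
# 128MB expressed in KB, versus the VSZ ps reported for the linux processes.
echo $(( 128 * 1024 ))        # 131072 KB
echo $(( 131112 - 131072 ))   # 40 KB of extra mappings from the binary itself
```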

usermode:~# cat /proc/meminfo
MemTotal: 126796 kB
MemFree: 112952 kB
Buffers: 512 kB
Cached: 7388 kB
SwapCached: 0 kB
Active: 6596 kB
Inactive: 3844 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 126796 kB
LowFree: 112952 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
Mapped: 5424 kB
Slab: 2660 kB
CommitLimit: 63396 kB
Committed_AS: 23100 kB
PageTables: 248 kB
VmallocTotal: 383984 kB
VmallocUsed: 24 kB
VmallocChunk: 383960 kB
Figure 2.6 The UML /proc/meminfo

Now, let’s go back to the host ps output and pick one of the UML
processes:
jdike 9938 0.1 3.1 131112 16264 pts/3 R 19:17 0:03 \
./linux [ps]
We can look at its open files by looking at the /proc/9938/fd directory,
which shows an entry like this:
lrwx------ 1 jdike jdike 64 Jan 28 12:48 3 -> \
/tmp/vm_file-AwBs1z (deleted)
This is the host file that holds, and is the same size (128MB in our
case) as, the UML “physical” memory. It is created in /tmp and then
deleted. The deletion prevents something else on the host from opening
it and corrupting it. However, this has the somewhat undesirable side
effect that /tmp can become filled with invisible files, which can confuse
people who don’t know about this aspect of UML’s behavior.
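The create-then-delete trick is easy to reproduce in a shell (a sketch; the file name is made up):

```shell
# Open a file on fd 3, unlink it, and show that the fd still works.
echo data > vm_file_demo
exec 3<> vm_file_demo       # hold the file open on fd 3
rm vm_file_demo             # unlink: gone from the directory listing
read line <&3               # ...but still readable through the fd
echo "$line"                # prints "data"
readlink /proc/$$/fd/3      # on Linux, shows ".../vm_file_demo (deleted)"
exec 3>&-                   # close the fd: now the space is reclaimed
```

The kernel frees the file's space only when the last open file descriptor is closed, which for UML means when the instance exits.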
To make matters worse, it is recommended for performance reasons
to use tmpfs on /tmp. UML performs noticeably better when its memory

file is on tmpfs rather than on a disk-based filesystem such as ext3.
However, a tmpfs mount is smaller than the disk-based filesystem /tmp
would normally be on and thus more likely to run out of space when
running multiple UML instances. This can be handled by making the
tmpfs mount large enough to hold the maximum physical memories of
all the UML instances on the host or by creating a tmpfs mount for
each UML instance that is large enough to hold its physical memory.
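Both arrangements can be sketched as follows; the mount points and sizes here are examples only, and the use of TMPDIR to steer where UML creates its memory file is an assumption you should verify against your UML version:

```shell
# Option 1: one large /tmp sized for all instances, e.g. four 128MB UMLs
# plus headroom, via an /etc/fstab entry:
#
#   none  /tmp  tmpfs  size=768m  0  0
#
# Option 2: a dedicated tmpfs per instance, selected through TMPDIR
# (hypothetical paths; run the mount as root):
mount -t tmpfs -o size=160m none /uml1-tmp
TMPDIR=/uml1-tmp ./linux ubd0=root_fs mem=128M
```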
Take a look at the root directory:
UML# ls /
bfs boot dev floppy initrd lib mnt root tmp var
bin cdrom etc home kernel lost+found proc sbin usr
This looks strikingly similar to the listing of the loopback mount
earlier and somewhat different from the host. Here UML has done the
equivalent of a loopback mount of the ~/roots/debian_22 file on the
host.
Note that making the loopback mount on the host required root
privileges, while I ran UML as my normal, nonroot self and accomplished
the same thing. You might think this demonstrates either that the
requirement of root privileges on the host is unnecessary or that UML
is some sort of security hole for not requiring them. Actually, neither
is true, because the two operations,
the loopback mount on the host and UML mounting its root filesystem,
aren’t quite the same thing. The loopback mount added a mount point
to the host’s filesystem, while the mount of / within UML doesn’t. The
UML mount is completely separate from the host’s filesystem, so the
ability to do this has no security implications.
However, from a different point of view, some security implications do
arise. There is no access from the UML filesystem to the host filesystem.
The root user inside the UML can do anything on the UML filesystem,
and thus, to the host file that contains it, but can’t do anything outside
it. So, inside UML, even root is jailed and can’t break out.6
6. We will talk about this in greater detail in Chapter 10, but UML is secure
against a breakout by the superuser only if it is configured properly. Most
important, module support and the ability to write to physical memory
must be disabled within the UML instance. The UML instance is owned by
some user on the host, and the UML kernel has the same privileges as that
user. So, the ability for root to modify kernel memory and inject code into it
would allow doing anything on the host that the host user can do. Disallowing
this ensures that even the superuser inside UML stays jailed.

This is a general property of UML—a UML is a full-blown Linux
machine with its own resources. With respect to those resources, the
root user within UML can do anything. But it can do nothing at all to
anything on the host that’s not explicitly provided to the UML. We’ve
just seen this with disk space and files, and it’s also true for networking,
memory, and every other type of host resource that can be made
accessible within UML.
Next, we can see some of UML’s hardware support by looking at
the mount table:
UML# mount
/dev/ubd0 on / type ext2 (rw)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
none on /tmp type tmpfs (rw)
Here we see the ubd device we configured on the command line
now mounted as the root filesystem. The other mounts are normal virtual
filesystems, procfs and devpts, and a tmpfs mount on /tmp.
df will show us how much space is available on the virtual disk:
UML# df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/ubd0              1032056    242108    737468  25% /
none                     63396         0     63396   0% /tmp
Compare the total size of /dev/ubd0 (1032056K) to that of the
host file:
-rw-rw-r-- 1 jdike jdike 1074790400 Jan 27 18:31 \
/home/jdike/roots/debian_22
They are nearly the same,7 with the difference probably being the
ext2 filesystem overhead. The entire UML filesystem exists in and is
confined to that host file. This is another way in which users inside the
UML are confined or jailed. A UML user has no way to consume more
disk space than is in that host file.
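The size comparison can be reproduced with shell arithmetic. The filesystem reports 1032056 1K-blocks while the host file is 1074790400 bytes; the gap is the ext2 overhead:

```shell
# Compare the ext2 filesystem size to the host file size and express the
# overhead as a rounded percentage with one decimal place.
file_bytes=1074790400
fs_bytes=$((1032056 * 1024))
diff_bytes=$((file_bytes - fs_bytes))
# scale by 1000 and round to keep one decimal digit of the percentage
pct10=$(((diff_bytes * 1000 + file_bytes / 2) / file_bytes))
echo "overhead: ${diff_bytes} bytes (~$((pct10 / 10)).$((pct10 % 10))%)"
```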
7. The difference between the 1074790400-byte host file and 1032056K
(1056825344 bytes) is 1.7%.

However, on the host, it is possible to extend the filesystem file,
and the extra space becomes available to UML. In Chapter 6 we will
see exactly how this is done, but for now, it’s just important to note that
this is a good example of how much more flexible virtual hardware is in
comparison to physical hardware. Try adding extra space to a physical
disk or a physical disk partition. You can repartition the disk in order
to extend a partition, but that’s a nontrivial, angst-ridden operation
that potentially puts all of the data on the disk at risk if you make a
mistake. You can also add a new volume to the volume group you wish
to increase, but this requires that the volume group be set up beforehand
and that you have a spare partition to add to it. In comparison,
extending a file using dd is a trivial operation that can be done as a
normal user, doesn’t put any data at risk except that in the file, and
doesn’t require any prior setup.
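The dd technique can be sketched like this (the file name and target size are examples). Seeking past the end of the image grows it sparsely without rewriting existing data; conv=notrunc matters, because without it dd would truncate the image first and destroy the filesystem inside:

```shell
# Grow a filesystem image to 2GB by writing a single byte at the new end.
dd if=/dev/zero of=root_fs bs=1 count=1 conv=notrunc \
   seek=$((2 * 1024 * 1024 * 1024 - 1))
ls -l root_fs
```

Inside UML, resize2fs (or the filesystem's equivalent) then makes the new space usable.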
We can poke around /proc some more to compare and contrast
this virtual machine with the physical host it’s running on. For some
similarities, let’s look at /proc/filesystems:
UML# more /proc/filesystems
nodev sysfs
nodev rootfs
nodev bdev
nodev proc
nodev sockfs
nodev pipefs
nodev futexfs
nodev tmpfs
nodev eventpollfs
nodev devpts
reiserfs
ext3
ext2
nodev ramfs
nodev mqueue
There’s no sign of any UML oddities here at all. The reason is that
the filesystems are not hardware dependent. Anything that doesn’t
depend on hardware will be exactly the same in UML as on the host.
This includes things such as virtual devices (e.g., pseudo-terminals,
loop devices, and TUN/TAP8 network interfaces) and network protocols,
as well as the filesystems.
So, in order to see something different from the host, we have to
look at hardware-specific stuff. For example, /proc/interrupts contains
information about all interrupt sources on the system. On the
8. The TUN/TAP driver is a virtual network interface that allows packets to
be handled by a process, in order to create a tunnel (the origin of “TUN”) or
a virtual Ethernet device (“TAP”).

host, it contains information about devices such as the timer, keyboard,
and disks. In UML, it looks like this:
UML# more /proc/interrupts
           CPU0
  0:   211586  SIGVTALRM  timer
  2:       87      SIGIO  console, console, console
  3:        0      SIGIO  console-write, console-write, \
                          console-write
  4:     2061      SIGIO  ubd
  6:        0      SIGIO  ssl
  7:        0      SIGIO  ssl-write
  9:        0      SIGIO  mconsole
 10:        0      SIGIO  winch, winch, winch
 11:       56      SIGIO  write sigio
The timer, keyboard, and disks are here (entries 0, 2 and 6, and 4,
respectively), as are a bunch of mysterious-looking entries. The -write
entries stem from a weakness in the host Linux SIGIO support. SIGIO
is a signal generated when input is available, or output is possible, on a
file descriptor. A process wishing to do interrupt-driven I/O would set
up SIGIO support on the file descriptors it’s using. An interrupt when
input is available on a file descriptor is obviously useful. However, an
interrupt when output is possible is also sometimes needed.
If a process is writing to a descriptor, such as one belonging to a
pipe or a network socket, faster than the process on the other side is
reading it, then the kernel will buffer the extra data. However, only a
limited amount of buffering is available. When that limit is reached,
further writes will fail, returning EAGAIN. It is necessary to know when
some of the data has been read by the other side and writes may be
attempted again. Here, a SIGIO signal would be very handy. The trouble
is that support of SIGIO when output is possible is not universal.
Some IPC mechanisms support SIGIO when input is available, but not
when output is possible.
In these cases, UML emulates this support with a separate thread
that calls poll to wait for output to become possible on these descriptors,
interrupting the UML kernel when this happens. The interrupt
this generates is represented by one of the -write interrupts.
The other mysterious entry is the winch interrupt. This appears
because UML wants to detect when one of its consoles changes size, as
when you resize the xterm in which you ran UML. Obviously this is not
a concern for the host, but it is for a virtual machine. Because of the
interface for registering for SIGWINCH on a host device, a separate
thread is created to receive SIGWINCH, and it interrupts UML itself

whenever one comes in. Thus, SIGWINCH looks like a separate device
from the point of view of /proc/interrupts.
/proc/cpuinfo is interesting:
UML# more /proc/cpuinfo
processor : 0
vendor_id : User Mode Linux
model name : UML
mode : skas
host : Linux tp.user-mode-linux.org 2.4.27 #6 \
Thu Jan 13 17:06:15 EST 2005 i686
bogomips : 1592.52
Much of the information in the host’s /proc/cpuinfo makes no
sense in UML. It contains information about the physical CPU, which
UML doesn’t have. So, I just put in some information about the host,
plus some about the UML itself.
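The same probe works on host and guest alike; only the values differ. A quick way to pull out the identifying fields, assuming a Linux /proc as described above:

```shell
# On the host, vendor_id names the physical CPU vendor; inside UML it
# reads "User Mode Linux" and model name reads "UML".
grep -E '^(processor|vendor_id|model name)' /proc/cpuinfo
```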
CONCLUSION
At this point, we’ve seen a UML from both the inside and the outside.
We’ve seen how a UML can use host resources for its hardware and
how it’s confined to whatever has been provided to it.
A UML is both very similar to and very different from a physical
machine. It is similar as long as you don’t look at its hardware. When
you do, it becomes clear that you are looking at a virtual machine with
virtual hardware. However, as long as you stay away from the hardware,
it is very hard to tell that you are inside a virtual machine.
Both the similarities and the differences have advantages. Obviously,
having a UML run applications in exactly the same way as on
the host is critical for it to be useful. In this chapter we glimpsed some
of the advantages of virtual hardware. Soon we will see that virtualized
hardware can be plugged, unplugged, extended, and managed in ways
that physical hardware can’t. The next chapter begins to show you
what this means.
