Thursday, 24 March 2011

Designing an infrastructure for booting linux from iSCSI

The previous post outlined the components needed to boot linux with iSCSI root:

  • A kernel
  • An initial root-fs with necessary drivers and scripts for mounting an iSCSI-volume
  • The root-fs on iSCSI
  • Configuration

Keep in mind that all of this has to fit together. Both the init-root and the iSCSI-root need a /lib/modules directory containing the kernel modules compiled for the specific kernel that is loaded. In practice, the init-root is built for one specific kernel, while the actual root might support several (i.e., it will typically contain modules for each kernel ever installed), although the userland tools might work only with the latest. The configuration needs to take this into consideration.

Now, for actual requirements:

  • The iSCSI-root should be as general as possible, meaning I don't want any specific configuration inside this filesystem (there are some things actually needed, more on that below). This is because I would like to be able to create a new instance by cloning a template and booting this without having to change any config file inside the filesystem itself.
  • The same goes for the init root-fs and the kernel, these should be re-used across several instances.
  • There must be a naming scheme which makes it easy to understand which components go together
  • There should be no redundancy in the configuration: the same thing should not be configured in several places (as an inconsistent configuration could lead to hard-to-track bugs)
  • The configuration should be as concise as possible, just list the things that actually vary in a brief format

There are also a few other parameters that will (or should) vary between instances:

  • Ethernet MAC-address. This will be my preferred way of locating which configuration to use; that is, given a MAC-address, the configuration should determine all other parameters
  • IPv4-address, determined by MAC-address using DHCP
  • IPv6-address, I will probably not use DHCP, but instead use the router-solicitation mechanism in IPv6, as this is simpler and more elegant
  • Filesystem UUID. I don't think it matters if separate machines have the same UUIDs, but if I some day try to mount these filesystems on the same machine, it will probably get confused.
  • Hostname. This could be determined by DNS, but it would probably be an advantage if the machine knows its name even if network hasn't come up yet. It could also be the other way around, that DNS is updated by DHCP when the IP is assigned. (This is more or less out-of-the-box functionality with many DHCP-servers. Unfortunately, there seems to be no standard solution for IPv6 yet)
  • SSH-keys, re-using these across machines would be a security issue.
  • iSCSI-initiator id, a unique name identifying the client.

The design for handling these parameters could be as follows:

  • The MAC-address is either determined by the ethernet-card (in a physical machine) or by the hypervisor-tools (in a virtual one)
  • IP-addresses are assigned automatically based on MAC
  • Boot-parameters (location of kernel and init root, along with parameters to mount the iSCSI-volume) could be set by the hypervisor or the DHCP-server
  • Filesystem UUID should be set when creating it (so if we are creating a new instance by cloning a template, it should be immediately followed by a change of UUID). Keep in mind that /etc/fstab should refer to something else than UUID (I am thinking volume label, but I will get back to this)
  • SSH-keys will be generated on first boot (the ssh startup script should check if keys exist, and generate them if not)
  • Hostname could be set manually on first boot.
  • iSCSI initiator id must be set outside of the booted system, as it needs to be available before the machine has access to the actual root-fs. But it must also be known to the machine itself, because it might want to attach other iSCSI-volumes after it has booted. One possibility is to create an address-structure based on MAC (even one more id attached to the MAC...)
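
One way to picture the MAC-based addressing idea for the initiator id is the minimal sketch below. It assumes a naming prefix of iqn.2011-03.net.example, which is only a placeholder and not something I have settled on, and the helper name iqn_from_mac is made up for the illustration. The point is that the boot environment and the booted system could both run the same function, so the id never has to be stored in two places.

```shell
#!/bin/sh
# Sketch: derive an iSCSI initiator name (IQN) from a MAC address.
# The prefix below is a placeholder assumption; substitute your own domain.
IQN_PREFIX="iqn.2011-03.net.example"

iqn_from_mac() {
    # Lower-case the MAC and strip the colon/dash separators.
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f' | tr -d ':-')
    printf '%s:%s\n' "$IQN_PREFIX" "$mac"
}
```

On a running Linux system it could be called as iqn_from_mac "$(cat /sys/class/net/eth0/address)", assuming the boot interface is eth0.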

Wednesday, 23 March 2011

How to boot linux from iSCSI

I would like to boot virtual (or physical) linux hosts from the network, with an iSCSI-device as root.
There are bits and pieces of information concerning this available, but I have not been able to find a complete guide to how this is done.

This is what I would like to do:

  • Load the kernel, either using the Xen bootloader or with PXE (network boot)
  • Attach to an iSCSI-disk available somewhere on my local network (in iSCSI-terms: let the newly booted linux be an iSCSI-initiator and log it in to an iSCSI-target)
  • Mount the iSCSI-volume as root
  • Continue booting from the new root-filesystem
This is more or less the same as booting from NFS, but iSCSI is far more efficient than NFS.

To get this to work, we need the following pieces:

  • A kernel that can be loaded before the filesystem is mounted, which means that it needs to be copied to a location outside of the host we want to boot. For Xen, this can be anywhere on the Dom0 filesystem, but to be really general and independent of any local hardware, it should be put on a tftp-server
  • The initial root filesystem (initrd.img) needs to be in the same place
  • Some configuration telling the bootloader where to find the kernel and the root filesystem. With Xen, this could be the Xen config file, but for PXE, it should be placed in the dhcp-configuration.
  • After booting the kernel with the initial root-fs, it will need some tools to mount the actual root-fs from iSCSI:
    • Kernel modules for iSCSI. I'm not exactly sure what the minimum set is, but we probably need at least iscsi_tcp.
    • Tools to connect to iSCSI, iscsistart
    • Configuration of where to find the iSCSI-volume
    • Scripts pulling everything together
  • Scripts run by init on the initial root file-system will mount the actual root from iSCSI and continue booting from this.
See initrd(4) for a detailed and general description of the boot-process. (However, this seems a bit out of date, because it refers to LILO and LOADLIN, and seems unaware of GRUB. Documentation/initrd.txt seems just as outdated.)
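
The pieces listed above can be sketched as one step of an initramfs script. This is a rough illustration and not the script any distribution ships: the function name and all values passed to it are made-up placeholders, and the iscsistart options used here (-i, -t, -g, -a, -p) should be checked against the version you actually have.

```shell
#!/bin/sh
# Sketch: commands an initramfs script could run to get an iSCSI root.
# All values passed in are placeholders; nothing here is distribution-specific.

# Print the commands instead of running them, so the plan can be inspected.
iscsi_root_plan() {
    # $1 = initiator name, $2 = target name, $3 = target address,
    # $4 = target port (defaults to 3260), $5 = block device to mount as root
    initiator=$1 target=$2 address=$3 port=${4:-3260} rootdev=$5

    echo "modprobe iscsi_tcp"
    echo "iscsistart -i $initiator -t $target -g 1 -a $address -p $port"
    echo "mount -o ro $rootdev /root"
}
```

Printing the commands rather than executing them makes it easy to check the plan from a normal shell before anything is put into an initramfs.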

I have investigated the kernels bundled with the latest ubuntu and debian variants and found that:

  • Both have the necessary kernel modules included in initrd.img
  • Debian has scripts for setting up iSCSI included in initrd.img
  • Neither has the iscsistart tool
But wait a minute, what really happens here? Actually, the initrd.img is not copied from an installation archive; it is generated when installing the kernel. This is in fact documented, if you know what to look for:
  • initramfs-tools is the tool actually generating the image. The contents come from /usr/share/initramfs-tools. Some other packages put contents here.
  • open-iscsi provides the iscsi-script mentioned above
  • It also provides the iscsistart command, but apparently not in a location picked up by the initramfs-tools.
Further digging and searching (use the source, Luke...), and voilà: if you create /etc/iscsi/iscsi.initramfs with default values for the iscsi-configuration, the necessary files will be included when generating the initramfs. This is actually described here.
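
As a hedged illustration of what such a file could contain, the sketch below writes a few default values. The variable names are what I believe the Debian open-iscsi initramfs script reads, but treat them as an assumption and verify against the package's README.Debian. The function name and the values in the usage example are placeholders.

```shell
#!/bin/sh
# Sketch: write default iSCSI boot settings for the initramfs hook to pick up.
# Variable names are assumed from the Debian open-iscsi package; verify them.

write_iscsi_initramfs() {
    # $1 = output file, $2 = initiator name, $3 = target name, $4 = target IP
    cat > "$1" <<EOF
ISCSI_INITIATOR=$2
ISCSI_TARGET_NAME=$3
ISCSI_TARGET_IP=$4
ISCSI_TARGET_PORT=3260
EOF
}
```

After writing /etc/iscsi/iscsi.initramfs this way, the image has to be regenerated, for instance with update-initramfs -u.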

This was the general info on how this fits together; some recommendations for the actual setup will come in a later post.

Thursday, 10 March 2011

Powershell for Unix-users

PowerShell is an object-oriented scripting language bundled with Windows 7 and Windows Server 2008, heavily influenced by unix scripting, python, perl, lisp and more. This guide lists some common unix-commands and their PowerShell equivalents. Please note that they are not completely equal, as unix-commands work on streams of bytes, typically split into lines, while PowerShell works on streams of objects.

  • grep: Where-Object, but see below for details
  • cd: Set-Location, but cd is an alias
  • cat: Get-Content, but cat and type are aliases

grep "pattern"
If input is a list of strings, the Select-String command is equal to grep:
Select-String -Pattern "pattern"
However, input will typically be a stream of objects, and what you want to do is to filter this. Thus Perl's grep or Common Lisp's remove-if-not, which both accept a general selection function as parameter, are more appropriate comparisons. The PowerShell command is inspired by SQL SELECT ... WHERE ...
Where-Object { <condition on $_, the current object> }

Sunday, 6 February 2011

Troll i eske, February

Upcoming films from filmweb.no, minus those showing at the Gimle film festival and minus those that are not in the "troll i eske" genre:
  • Fjellet
  • Miral
  • 80 dager
  • 14 kilometer
  • Onkel Boonmee
  • The Adjustment Bureau
  • Exteriors

Thursday, 3 February 2011

netfilter

Linux has a very comprehensive set of modules for filtering and changing network packets as they flow through the network stack. However, this framework has been through several major and quite a few minor re-designs (the last major one seems to be the introduction of the nf_conntrack subsystem in 2.6.15).

There are vast amounts of documentation available on the web, but very little is up to date with the last changes. I will try to summarize my experiences with kernel 2.6.32.

Some pointers for good places to start:

  • This is a good introduction, which also covers the last changes (pdf)
  • The authoritative source is netfilter.org, but most documentation there is outdated by almost a decade.
  • This tutorial from Oskar Andreasson is not too out of date
  • You can also take a look at the wikipedia page for iptables which has a nice flowchart

A quick introduction for the impatient:

  • Each network packet is sent through a set of tables
  • For each table, the packet is sent through a set of chains; which chains depends on the final destination of the packet: inbound, outbound or routed through (see the drawing on wikipedia for a full picture)
  • Each built-in chain has a default policy, and any chain can have a set of additional rules
  • Each rule is a filter and a command. The filter can be anything that the netfilter framework can find in the packet: interface, source, destination, port, protocol etc. The command is either a pre-defined one (DROP, ACCEPT, REJECT etc) or a jump to a user-defined chain. The filter can be regarded as an "if"-statement, and the command as a function call. But pay attention: not all commands make sense in all tables or all chains. Read the documentation for details.
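
To make the table/chain/rule vocabulary concrete, here is a small sketch that prints a ruleset in the format read by iptables-restore. It is only an illustration under assumptions: the interface name, the ssh port and the chain name ssh-in are examples, and this is not my actual firewall configuration. It shows one table (filter), three built-in chains with a default policy each, and a user-defined chain that a rule jumps to much like a function call.

```shell
#!/bin/sh
# Sketch: emit a tiny ruleset in iptables-restore format.
# Interface name, port and the chain name "ssh-in" are illustrative assumptions.

emit_ruleset() {
    # $1 = outside-facing interface (defaults to eth0)
    wan_if=${1:-eth0}
    cat <<EOF
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
:ssh-in - [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -i $wan_if -p tcp --dport 22 -j ssh-in
-A ssh-in -j ACCEPT
COMMIT
EOF
}
```

The output could be loaded with emit_ruleset eth0 | iptables-restore. Each "-A" line is a filter (the match options) followed by a command (the -j target).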

Tuesday, 1 February 2011

udhcpc

One of the utilities bundled in BusyBox is the udhcpc DHCP client.  This is a tool built according to the unix principle: do one thing and do it well.  The udhcpc command handles the DHCP protocol as described in RFC 2131, but it doesn't actually configure the network based on the replies.

However, the documentation available from busybox is not very comprehensive.  There is a man-page available which can be found by googling udhcpc - very small DHCP client.

The operation of udhcpc is simple and unix-y.  When it receives a reply from a DHCP-server, it calls a script with one parameter, which is one of:
  • deconfig: remove configuration (when lease is lost or udhcpc starts)
  • bound: moving from unbound to bound state (receives configuration)
  • renew: lease is renewed
  • nak: nak received from server
  • leasefail: (not documented in the man-page): run if there is no reply after configured timeouts and retries
Lots of other configuration parameters are available as environment variables.  The example scripts included with busybox implement this by configuring udhcpc to call a script which calls simple.$1 (and of course there are 4 scripts: simple.deconfig, simple.bound, simple.renew, simple.nak).  Simple and easy!

I have basically used the sample scripts that came with BusyBox (examples/udhcpc/*), but converted them to using the ip(8)-command instead of ifconfig(8). I also added logging to syslog with the following function:

# Log a message to syslog (facility daemon, level info) with the tag "dhcp"
log() {
  logger -p daemon.info -t dhcp "$*"
}
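
To illustrate the conversion to ip(8), here is a rough sketch of a single handler, not a copy of my actual scripts. It assumes udhcpc exports the lease in the environment variables interface, ip, mask (prefix length) and router, and that only one router is offered; the fallback to a /24 prefix is a guess that only suits a network like mine. The DRY_RUN switch and the function names are additions for the illustration.

```shell
#!/bin/sh
# Sketch: single udhcpc event handler using ip(8) instead of ifconfig(8).
# udhcpc passes the event name as $1 and the lease data in environment
# variables; $interface, $ip, $mask and $router are assumed to be set.

# With DRY_RUN=1 the commands are printed rather than executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

handle_event() {
    case "$1" in
        deconfig)
            # Drop any old address but keep the link up so DHCP can proceed.
            run ip addr flush dev "$interface"
            run ip link set dev "$interface" up
            ;;
        bound|renew)
            run ip addr flush dev "$interface"
            run ip addr add "$ip/${mask:-24}" dev "$interface"
            [ -n "$router" ] && run ip route add default via "$router"
            ;;
        nak|leasefail)
            echo "udhcpc: $1 on $interface" >&2
            ;;
    esac
    return 0
}
```

Saved as a script with a final line handle_event "$1", it would be handed to the client with the -s option, for example udhcpc -i eth0 -s /path/to/handler (the path is up to you).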

Tuesday, 25 January 2011

dnsmasq

dnsmasq is a wonderful little daemon that provides DHCP (IPv4) and DNS and integrates the two, so a client given an IP-address through DHCP will also be resolvable through DNS.  It is really easy to set up: it gets all relevant information from /etc/hosts and /etc/ethers, but you can also add extra parameters through the command line or a config file.
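
As a small sketch of how the two files fit together: /etc/ethers maps a MAC-address to a name, and /etc/hosts maps the same name to an IP-address. The helper below is hypothetical (the name add_machine and the example values are mine), and dnsmasq only consults /etc/ethers when started with the read-ethers option.

```shell
#!/bin/sh
# Sketch: register one machine for dnsmasq via the two standard files.
# dnsmasq needs --read-ethers (or read-ethers in its config) to use /etc/ethers.

add_machine() {
    # $1 = MAC, $2 = IPv4 address, $3 = hostname,
    # $4 = ethers file, $5 = hosts file (defaults are the system files)
    ethers=${4:-/etc/ethers} hosts=${5:-/etc/hosts}
    printf '%s %s\n' "$1" "$3" >> "$ethers"
    printf '%s %s\n' "$2" "$3" >> "$hosts"
}
```

For example, add_machine 00:16:3e:12:34:56 192.168.32.20 storage would append one line to each file. dnsmasq then has to re-read them, which it does at startup and, as far as I know, on SIGHUP.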


Configuration

The build system for dnsmasq doesn't come with configure, but there are really few options and these can be manipulated with the make command line, or, as I did, by changing config.h. I removed:

  • tftp-support (cmdline: COPTS=-DNO_TFTP), if I want to boot machines through the network, I will use my regular fileserver for tftp, not the firewall
  • script-support (cmdline: COPTS=-DNO_SCRIPT), I don't think I will need scripts, and including them seems like a potential security problem

Build with make and install in /usr/sbin.

installing ssh

Naturally, I could just copy ssh from my regular ubuntu-installation, but that wouldn't be as rewarding, and it is nice to be able to exclude features I don't need.

OpenSSL

To build ssh, we first need openssl. I set PREFIX to a separate directory, to be able to run make install with full control over which files are included in the distribution.

mkdir /home/build/firewall/dist
PREFIX=/home/build/firewall/dist
cd openssl-1.0.0c/
./config --prefix=$PREFIX
make -j 3
make install

OpenSSH is statically linked with OpenSSL, so no openssl-files are actually needed on the firewall. The openssl client utilities are meant for manipulating certificates or testing ssl connections. This functionality will not be needed on the firewall.

OpenSSH

OpenSSH is built in a similar fashion:
cd ../openssh-5.6p1/
./configure --help
./configure --with-ssl-dir=$PREFIX --prefix=$PREFIX --with-privsep-user=sshd --with-4in6
make -j 3
make install
From the dist-directory I made a selective installation of just some utilities:
  • ssh
  • ssh-add
  • ssh-agent
  • ssh-keygen
  • ssh-keyscan
all installed in /usr/bin, and sshd installed in /usr/sbin. Configuration files are tailored and put in /etc/ssh. The daemon is started from inittab using:
::respawn:/usr/sbin/sshd -f /etc/ssh/sshd_config
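
Since this is a hand-built system, nothing creates the sshd host keys automatically. A minimal sketch of a first-boot check follows; it is an assumption about how one could do it, not something taken from my setup, and the function names are made up. The key types are limited to rsa and dsa, which is what I would expect OpenSSH 5.6 to use.

```shell
#!/bin/sh
# Sketch: find out which sshd host keys are missing, so they can be
# generated on first boot. Key types are an assumption for OpenSSH 5.6.

missing_host_keys() {
    # $1 = directory holding the keys (normally /etc/ssh)
    for kt in rsa dsa; do
        [ -f "$1/ssh_host_${kt}_key" ] || echo "$kt"
    done
}

generate_host_keys() {
    # Create each missing key with an empty passphrase.
    for kt in $(missing_host_keys "$1"); do
        ssh-keygen -q -t "$kt" -N '' -f "$1/ssh_host_${kt}_key"
    done
}
```

A wrapper script started from inittab could call generate_host_keys /etc/ssh and then exec sshd. With a respawn entry it probably makes sense to add -D so that sshd stays in the foreground.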

Basic linux setup

First things first: basic linux installation. From scratch.

Linux kernel 2.6.37

  • there is no need for every driver available as a module; what I need is xen-support, basic drivers and some networking. As a baseline, a minimum set of drivers compiled in should be sufficient. However, quite a few network-modules would be "nice to have" (tunneling, ipsec, vlan, bonding etc) but not needed for booting, and some of these need to be modules to be able to provide load-time parameters.
  • there is also no need for an initial ram-disk, this will be a small, simple setup with one ext2-partition, and the few drivers needed for boot will be linked into the kernel
I have set up a 4GB logical volume in dom0 for the installation. (Normally I would use iSCSI, but the firewall must be able to boot without any other networking present. In fact, the iSCSI-server expects to get an IP-address from the firewall with DHCP.)

BusyBox 1.18.1

I will use BusyBox for basic unix utilities. This has probably been compiled with far more functionality than what is currently needed, but it would be a bother to re-compile just to get that one extra utility. Currently I have included:
  • init and related utilities
  • basic file-utils and text-utils
  • every network-util
  • some filesystem-tools
The boot-sequence is very simple:
  • init starts all daemons through inittab
  • init also starts /etc/init.d/rcS which mounts filesystems and sets up networking

Putting it all together

Basic directory structure:

  • /etc
  • /etc/init.d
  • /bin
  • /sbin
  • /usr
  • /usr/bin
  • /usr/sbin
  • /usr/lib
  • /lib
  • /var
  • /lib64
  • /proc
  • /dev
  • /tmp
I added the following files to /etc:
  • fstab
  • group
  • init.d/rcS
  • inittab
  • passwd
  • shadow
  • nsswitch.conf
  • resolv.conf
contents of inittab:
::sysinit:/etc/init.d/rcS
::respawn:/sbin/getty -L hvc0 9600 linux
::restart:/sbin/init
::ctrlaltdel:/sbin/reboot
::shutdown:umount -a -r

contents of rcS:

#!/bin/sh

fsck /dev/root
mount -t proc proc /proc
mount -o remount,rw /
#mount -a

hostname firewall2
ip address add dev eth0 local 192.168.32.10/24
ip link set dev eth0 up
ip route add to default via 192.168.32.1
(since everything is mounted in the rc-script, fstab is really not needed)

I put busybox in /bin and ran:

/bin/busybox --install -s
This created symlinks to all busybox commands in /bin, /sbin, /usr/bin and /usr/sbin.

Finally, I copied these libraries from an Ubuntu-installation:

/lib/libm.so.6
/lib/libc.so.6
/lib/libcrypt.so.1
/lib/libdl.so.2
/lib/libnsl.so.1
/lib/libresolv.so.2
/lib/libutil.so.1
/lib/libz.so.1
/lib/libnss_files-2.11.1.so
/lib/libnss_dns-2.11.1.so
/lib64/ld-linux-x86-64.so.2

The kernel itself is not on the guest filesystem, it is started by xen in dom0.

Monday, 24 January 2011

IPv6 at home

Ok, so it's time to get on the IPv6-bandwagon. That means:
  • setting up a firewall/router that can handle both IPv4 and IPv6
  • creating an address-policy for both protocols
  • setting up an IPv6-tunnel while waiting for my ISP to provide IPv6
For the firewall/router, I would like the following functionality:
  • DHCP for IPv4
  • Router advertisements (aka stateless autoconfiguration for IPv6, as defined in RFC 4862)
  • DNS for v4 and v6
  • 6in4 tunnel through Hurricane Electric
  • ntp-daemon for a local time-source
  • ssh for administration
  • running a linux-kernel in a xen virtual machine
  • basic linux-utilities provided by busybox
Some justifications for this setup:
  • running linux on regular hw gives more flexibility than on custom-hw (like a linksys)
  • however, I don't need much computing power for this, so a timeshare of my regular server is OK
  • Hurricane Electric seems to be the most used and easy to set up tunnel service
  • I would like to have a clean IPv6-network internally and translate in the router; however, I have some clients that might not do IPv6 at all (like a Blu-ray player and the PS3), and even on regular platforms, some features are missing (like getting DNS-setup from the IPv6 autoconfig, or even doing DNS-lookups over IPv6)
  • IPv6 stateless autoconfiguration is far more elegant than DHCP. I would really like to use stateless autoconfiguration for everything but servers, and set up other network parameters using DNS SRV-records or zeroconf/bonjour. But the support for this is scarce, so I'll stick with autoconfiguration for addresses and get the rest through DHCP (v4) for now. (Servers need stateful configuration anyway to have a stable IP-address independent of the network card)
So, the plan is:
  • set up a custom linux-system, running on xen
  • use busybox for providing basic utilities
  • ssh for remote login
  • use linux built in netfilter functionality to provide routing, NAT, filtering etc, configured with iptables
  • isc dhcp or dnsmasq for dhcp-functionality (dnsmasq probably has enough functionality for IPv4, but has no DHCPv6-support. I will start with using dnsmasq for DHCP and use only stateless configuration of IPv6)
  • radvd sending router advertisement messages providing IPv6 stateless autoconfiguration.
  • dnsmasq for DNS
  • ntpd
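
For the tunnel part of the plan, a 6in4 tunnel on linux is a "sit" device. The sketch below prints the usual commands. It is only an outline under assumptions: the device name he-ipv6 is a common convention, and the addresses in the usage example are documentation-range placeholders, not my real endpoints.

```shell
#!/bin/sh
# Sketch: commands for a 6in4 (sit) tunnel of the kind Hurricane Electric provides.
# The device name and all addresses are placeholders, not real endpoints.

tunnel_commands() {
    # $1 = remote IPv4 endpoint, $2 = local IPv4 address,
    # $3 = IPv6 address/prefix for the local end of the tunnel
    echo "ip tunnel add he-ipv6 mode sit remote $1 local $2 ttl 255"
    echo "ip link set dev he-ipv6 up"
    echo "ip -6 addr add $3 dev he-ipv6"
    echo "ip -6 route add default dev he-ipv6"
}
```

For example, tunnel_commands 192.0.2.1 198.51.100.7 2001:db8:1f0a::2/64 prints four commands that could be reviewed and then run from the rc-script.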