Thursday, 24 March 2011

Designing an infrastructure for booting linux from iSCSI

The previous post outlined the components needed to boot linux with iSCSI root:

  • A kernel
  • An initial root-fs with necessary drivers and scripts for mounting an iSCSI-volume
  • The root-fs on iSCSI
  • Configuration

Keep in mind that all of this has to fit together. The init-root and the iSCSI-root both need a /lib/modules directory containing the kernel modules compiled for the specific kernel that is loaded. In practice, the init-root is built for one specific kernel, while the actual root might support several (it will typically contain modules for every kernel ever installed), although userland tools might work only with the latest. The configuration needs to take this into account.
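
A minimal sketch of how this fit could be checked before booting, assuming the init-root and the iSCSI-root are both available as unpacked directory trees. The function names and the kernel release string used in the usage note are hypothetical, chosen for illustration only.

```python
from pathlib import Path


def has_modules_for(root, kernel_release):
    """Return True if <root>/lib/modules/<kernel_release> is a directory."""
    return (Path(root) / "lib" / "modules" / kernel_release).is_dir()


def missing_modules(roots, kernel_release):
    """Return the roots that lack a module directory for kernel_release."""
    return [root for root in roots if not has_modules_for(root, kernel_release)]
```

Calling missing_modules([init_root, iscsi_root], "2.6.32-5-amd64") would return an empty list when both trees carry modules for that kernel.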

Now, for actual requirements:

  • The iSCSI-root should be as general as possible, meaning I don't want any specific configuration inside this filesystem (there are some things actually needed, more on that below). This is because I would like to be able to create a new instance by cloning a template and booting this without having to change any config file inside the filesystem itself.
  • The same goes for the init root-fs and the kernel: these should be re-used across several instances.
  • There must be a naming scheme which makes it easy to understand which components go together
  • There should be no redundancy in the configuration: the same thing should not be configured in several places (as an inconsistent configuration could lead to hard-to-track bugs)
  • The configuration should be as concise as possible: just list the things that actually vary, in a brief format
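
The last two requirements could be met with shared defaults plus a per-instance table that lists only what differs. The following is a sketch under assumptions, not a finished format: every name, address and file name in it is made up for illustration (3260 is the standard iSCSI port).

```python
# Shared values, configured exactly once (all values are hypothetical).
DEFAULTS = {
    "kernel": "vmlinuz-2.6.32-5-amd64",
    "initrd": "initrd.img-2.6.32-5-amd64",  # same version suffix as the kernel
    "target_ip": "192.168.1.5",
    "target_port": 3260,
}

# Per-instance entries, keyed by MAC-address, listing only what varies.
INSTANCES = {
    "00:16:3e:00:00:01": {"hostname": "web1", "volume": "web1-root"},
    "00:16:3e:00:00:02": {
        "hostname": "db1",
        "volume": "db1-root",
        "kernel": "vmlinuz-2.6.38-test",
        "initrd": "initrd.img-2.6.38-test",
    },
}


def config_for(mac):
    """Return the full configuration for a MAC-address (KeyError if unknown)."""
    key = mac.lower()
    return {**DEFAULTS, **INSTANCES[key], "mac": key}
```

Here config_for("00:16:3e:00:00:01") yields the shared kernel and initrd together with that instance's hostname and volume, so nothing is written down twice.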

There are also a few other parameters that will (or should) vary between instances:

  • Ethernet MAC-address. This will be my preferred way of locating which configuration to use; that is, given a MAC-address, the configuration should determine all other parameters
  • IPv4-address, determined by MAC-address using DHCP
  • IPv6-address, I will probably not use DHCP, but instead use the router-solicitation mechanism in IPv6, as this is simpler and more elegant
  • Filesystem UUID. I don't think it matters if separate machines have the same UUIDs, but if I some day try to mount these filesystems on the same machine, it will probably get confused.
  • Hostname. This could be determined by DNS, but it would probably be an advantage if the machine knows its name even if network hasn't come up yet. It could also be the other way around, that DNS is updated by DHCP when the IP is assigned. (This is more or less out-of-the-box functionality with many DHCP-servers. Unfortunately, there seems to be no standard solution for IPv6 yet)
  • SSH-keys, re-using these across machines would be a security issue.
  • iSCSI-initiator id, a unique name identifying the client.
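
For the IPv6 item, the address a host ends up with can be predicted from its MAC-address when it autoconfigures with the classic modified EUI-64 interface identifier. That is an assumption: hosts using privacy or otherwise randomized identifiers will get a different address. A small sketch, using the documentation prefix 2001:db8::/64 as a stand-in for the real one:

```python
import ipaddress


def slaac_address(prefix, mac):
    """Combine a /64 prefix with the modified EUI-64 identifier of a MAC."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("an EUI-64 interface identifier needs a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 6-byte MAC-address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC-address.
    eui64 = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return net[int.from_bytes(eui64, "big")]
```

For example, slaac_address("2001:db8::/64", "00:16:3e:12:34:56") gives 2001:db8::216:3eff:fe12:3456.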

The design for handling these parameters could be as follows:

  • The MAC-address is either determined by the ethernet-card (in a physical machine) or by the hypervisor-tools (in a virtual one)
  • IP-addresses are assigned automatically based on MAC
  • Boot-parameters (location of kernel and init root, along with parameters to mount the iSCSI-volume) could be set by the hypervisor or the DHCP-server
  • Filesystem UUID should be set when creating it (so if we are creating a new instance by cloning a template, the cloning should be immediately followed by a change of UUID). Keep in mind that /etc/fstab should refer to something other than the UUID (I am thinking volume label, but I will get back to this)
  • SSH-keys will be generated on first boot (the ssh startup script should check if keys exist, and generate them if not)
  • Hostname could be set manually on first boot.
  • iSCSI initiator id must be set outside of the booted system, as it needs to be available before the system has access to the actual root-fs, but it must also be known to the machine itself, because it might want to attach other iSCSI-interfaces after it has booted. One possibility is to create an address-structure based on MAC (yet another id attached to the MAC...)
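
One way the MAC-based initiator id could look is sketched below. It follows the general IQN layout (iqn.<year-month>.<reversed domain>:<unique part>); the domain example.com and the date 2011-03 are placeholders, and using the bare MAC digits as the unique part is just one possible convention.

```python
import re

MAC_PATTERN = re.compile(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}")


def initiator_name(mac, domain="example.com", year_month="2011-03"):
    """Build an IQN-style initiator name whose unique part is the MAC."""
    if not MAC_PATTERN.fullmatch(mac):
        raise ValueError("not a MAC-address: %r" % mac)
    reversed_domain = ".".join(reversed(domain.split(".")))
    unique = mac.lower().replace(":", "")
    return "iqn.%s.%s:%s" % (year_month, reversed_domain, unique)
```

With these defaults, initiator_name("00:16:3E:12:34:56") returns iqn.2011-03.com.example:00163e123456. Both the boot configuration and the booted machine can compute the same name from the MAC alone.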

Wednesday, 23 March 2011

How to boot linux from iSCSI

I would like to boot virtual (or physical) linux hosts from the network, with an iSCSI-device as root.
There are bits and pieces of information concerning this available, but I have not been able to find a complete guide to how this is done.

This is what I would like to do:

  • Load the kernel, either using the Xen bootloader or with PXE (network boot)
  • Attach to an iSCSI-disk available somewhere on my local network (in iSCSI-terms: let the newly booted linux be an iSCSI-initiator and log it in to an iSCSI-target)
  • Mount the iSCSI as root
  • Continue booting from the new root-filesystem
This is more or less the same as booting from NFS, but iSCSI is far more efficient than NFS.

To get this to work, we need the following pieces:

  • A kernel that can be loaded before the filesystem is mounted, which means that it needs to be copied to a location outside of the host we want to boot. For Xen, this can be anywhere on the Dom0 filesystem, but to be really general and independent of any local hardware, it should be put on a tftp-server
  • The initial root filesystem (initrd.img) needs to be in the same place
  • Some configuration telling the bootloader where to find the kernel and the root filesystem. With Xen, this could be the Xen config file, but for PXE, it should be placed in the DHCP configuration.
  • After booting the kernel with the initial root-fs, it will need some tools to mount the actual root-fs from iSCSI:
    • Kernel modules for iSCSI. I'm not exactly sure what the minimum set is, but we probably need at least iscsi_tcp.
    • Tools to connect to iSCSI, iscsistart
    • Configuration of where to find the iSCSI-volume
    • Scripts pulling everything together
  • Scripts run by init on the initial root file-system will mount the actual root from iSCSI and continue booting from this.
See initrd(4) for a detailed and general description of the boot process. (However, it seems a bit out of date: it refers to LILO and LOADLIN and seems unaware of GRUB. Documentation/initrd.txt seems just as outdated.)
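
The iSCSI part of the list above boils down to three commands run from the initial root-fs. The sketch below only builds the command lines and does not run them. The iscsistart options used (-i, -t, -g, -a, -p) are the ones I know from its usage text and should be checked against the installed version; the target name, the root device and the /root mount point are assumptions.

```python
def iscsistart_argv(initiator, target, address, port=3260, tpgt=1):
    """Command line for logging in to one iSCSI-target with iscsistart."""
    return [
        "iscsistart",
        "-i", initiator,   # initiator name of this client
        "-t", target,      # name of the target to log in to
        "-g", str(tpgt),   # target portal group tag
        "-a", address,     # IP-address of the target
        "-p", str(port),   # TCP port of the target
    ]


def root_mount_plan(initiator, target, address, root_device, mountpoint="/root"):
    """Commands, in order, that bring up the iSCSI-root (assumed names)."""
    return [
        ["modprobe", "iscsi_tcp"],
        iscsistart_argv(initiator, target, address),
        ["mount", "-o", "ro", root_device, mountpoint],
    ]
```

A script in the initial root-fs would run the equivalent of these three steps and then hand control to init on the newly mounted root.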

I have investigated the kernels bundled with the latest ubuntu and debian variants and found that:

  • Both have the necessary kernel modules included in initrd.img
  • Debian has scripts for setting up iSCSI included in initrd.img
  • Neither has the iscsistart-tool
But wait a minute, what really happens here? Actually, the initrd.img is not copied from an installation archive; it is generated when the kernel is installed. This is in fact documented, if you know what to look for:
  • initramfs-tools is the tool that actually generates the image. The contents come from /usr/share/initramfs-tools. Some other packages put content there as well.
  • open-iscsi provides the iscsi-script mentioned above
  • It also provides the iscsistart command, but apparently not in a location picked up by the initramfs-tools.
Further digging and searching (use the source, Luke...), and voilà: if you create /etc/iscsi/iscsi.initramfs with default values for the iSCSI configuration, the necessary files will be included when generating the initramfs. This is actually described here.

That was the general picture of how this fits together; some recommendations for the actual setup will come in a later post.

Thursday, 10 March 2011

Powershell for Unix-users

PowerShell is an object-oriented scripting language bundled with Windows 7 and Windows Server 2008, and it is heavily influenced by Unix scripting, Python, Perl, Lisp and more. This guide lists some common Unix commands and their PowerShell equivalents. Please note that they are not completely equivalent, as Unix commands work on streams of bytes, typically split into lines, while PowerShell works on streams of objects.

grep    Where-Object, but see below for details
cd      Set-Location, but cd is an alias
cat     Get-Content, but cat and type are aliases

grep "pattern"
If the input is a list of strings, the Select-String command is equivalent to grep:
Select-String -Pattern "pattern"
However, the input will typically be a stream of objects, and what you want to do is filter it. Thus Perl's grep or Common Lisp's remove-if-not, which both accept a general selection function as a parameter, are more appropriate comparisons. The PowerShell command is inspired by SQL SELECT WHERE ...
Where-Object { $_ -match "pattern" }   # $_ is the current object