Sunday, 21 February 2016

Aftenposten on blockchain

So a journalist from Aftenposten has heard a talk about bitcoin and come out of it starry-eyed:

"Tapscott explains very briefly that a technology is now arriving that will turn upside down who we can trust in society."

I remember people saying something like that about the Segway too, when it was first announced: "...Cities will be built around this device." (according to Wikipedia). And he goes on to specify:

"It is about building a totally reliable system, so secure that we will no longer need large companies, or states, to vouch for the most sensitive and valuable things, such as money."

But basic source criticism seems to be completely absent here, for example an assessment of the source's motivation:

"But the details, Tapscott says, will have to wait until his book on blockchain comes out in May."

I bet they will...

Next comes a "layman's" explanation of what "blockchain" is, an explanation that is not very wrong, but which probably does not tell you much unless you already know the technology. Some quotes:

  • "Blockchain is, at its deepest, a tool for preserving the truth in a society where nobody needs to trust each other."
  • "What is special is that the spreadsheet is not stored in one particular place, but all over the world in completely identical copies."
  • "Using so-called "hashing" it is possible to see instantly whether a single error has been introduced into the ledger. This is extremely efficient."

To make this a bit more down-to-earth, I will attempt a simpler explanation using the following metaphor: a blockchain is like one long, continuous till receipt that is impossible to change after it has been written. With this in mind, let us look further at the claims made in the article:

  • "If a blockchain contained, for example, all the accounts of every company in Norway, anyone could see at a glance whether someone at some point in history had tried to enter a single incorrect number."
  • "This gives an accountability that traditional systems are nowhere near."

But listen: even though a physical paper roll from a cash register is not impossible to forge in the same way a blockchain is, it is still not on this receipt that the forgery happens. For all practical purposes we can assume that such a physical, paper-based solution is also impossible to change after it has been written, and yet people still manage to falsify accounts! Think a little about how this fits together. Those who falsify accounts do not do it by altering the receipt; they do it by moving money that never gets written to the receipt at all! For bitcoin there is no parallel to this, because it is impossible to move bitcoins without writing it into the blockchain, so it looks like a watertight system. But this is not the whole system. It is only a small part of the system, and it is the part that is easiest to secure.
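
How little it takes to secure that part can be sketched in a few lines of Python. This is only a minimal illustration of a hash chain, not how bitcoin actually stores its blocks; the function name chain_hashes and the sample receipt lines are made up for the example.

```python
import hashlib

def chain_hashes(entries):
    """Return one hash per entry; each hash also covers the previous hash."""
    hashes = []
    previous = b""  # the first entry has no predecessor
    for entry in entries:
        digest = hashlib.sha256(previous + entry.encode("utf-8")).digest()
        hashes.append(digest.hex())
        previous = digest
    return hashes

# A hypothetical "till receipt" with three lines.
receipt = ["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"]
original = chain_hashes(receipt)

# Alter the first line: its hash changes, and so does every later hash.
tampered = ["alice pays bob 50"] + receipt[1:]
changed = chain_hashes(tampered)

print(original[-1] != changed[-1])
```

Comparing the last hash is enough to notice that something on the receipt was altered. What the check cannot do is say anything about a payment that was never written to the receipt in the first place.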

Even though a hash code is (for all practical purposes, in this context) impossible to forge, that does not mean the system as a whole is secure. In bitcoin's case the blockchain contains only the list of transfers between bitcoin wallets, but to spend money that sits in such a wallet you need a private key, and the wallet contains both this and other sensitive data that you must guard carefully. This is hard. Aftenposten even points this out itself by linking to an earlier article saying that "Bitcoins worth billions taken out of circulation". So even though the blockchain is impossible to forge, that did not stop MtGox from losing its customers' bitcoins.

And since everyone except the most risk-hungry adventurers understands this (or knows that they do not understand it and therefore wisely stays away), it does not look as if bitcoin will conquer the world just yet.

"An important guarantor of truth today is the banks. How much of banking does not consist of keeping track of who owns which money? That is exactly what a blockchain does. No wonder that director Christoffer Hernæs at Sparebank 1 and his people have started looking at what they can use blockchain for."

As I said, it takes a great deal more than a hash code to create trust. From the short paragraph above it is hard to tell what Christoffer Hernæs actually thinks about trust, but if the director of a bank really believes that a hash code is enough, I at least know which bank I will not be putting my money in.

But back to the claim that a hash code makes everything so extremely efficient. I will say a bit more about this shortly, but further down in the article this point is extrapolated into the following claims:

  • "Much of what banks do today can be automated."
  • "Blockchain 2.0 makes it possible for computer programs to run large organizations, in principle without humans being needed in the administration."
  • "The many drivers at Uber will no longer need a central organization with world domination that runs off with a gigantic profit. Instead, all the tasks can be carried out in a decentralized manner in a blockchain."

Well, I do not think I am exaggerating if I claim that 80% of the world's programmers work exclusively on making organizations more efficient so that fewer resources are spent on administration. And this has been the situation for the last 50 years. If hash codes were enough to fix this, we would surely have done it long ago.

But then, efficiency. Since bitcoin writes transactions into the blockchain in a way that means they cannot be changed later, you get a receipt that is impossible to forge, without any human involvement and without any central authority or organization vouching for the transaction. But this actually has a price. For bitcoin, that price is the following limitations:

  • It takes around 15 minutes to get a transaction confirmed (blocks are produced roughly every 10 minutes on average, and a transaction is not always included in the next one). This is as if you, as a customer, stood in the shop, entered your PIN and pressed OK ("kode og klar"), and then had to wait 15 minutes for the receipt.
  • There is a theoretical upper limit of 7 transactions per second (the practical limit is probably much lower). Compare this with PayPal, which performs 115 transactions per second, or Visa, which performs 2000.
  • Producing this receipt takes 14 kWh of energy, which is what an average Norwegian household uses in a quarter of a day. (According to calculations I found on reddit.)
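
To put these figures in perspective, the arithmetic can be spelled out. The inputs below are simply the numbers quoted in the list above; nothing here is measured, and the variable names are my own.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the list above.
bitcoin_tps = 7            # theoretical upper limit
paypal_tps = 115
visa_tps = 2000
kwh_per_transaction = 14   # reddit estimate, equal to a quarter of a household-day

# Most transactions the network could confirm in one day.
bitcoin_per_day = bitcoin_tps * SECONDS_PER_DAY

# How many times more transactions per second the others handle.
paypal_ratio = paypal_tps / bitcoin_tps
visa_ratio = visa_tps / bitcoin_tps

# Daily household consumption implied by "14 kWh is a quarter of a day".
household_kwh_per_day = kwh_per_transaction * 4

# Energy per day if the network ran at its theoretical limit.
kwh_per_day_at_limit = bitcoin_per_day * kwh_per_transaction

print(bitcoin_per_day, round(visa_ratio), household_kwh_per_day, kwh_per_day_at_limit)
```

In other words, even at the theoretical limit the network handles about 600,000 transactions a day, and each one costs as much energy as a household uses in six hours.
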

The rest of the article drifts off into a borderland between new age and sci-fi.

Saturday, 27 September 2014

Trying to understand keybase.io

I have recently been invited to keybase.io, and to increase my understanding of how this works, I will try to accomplish the same as keybase provides, but manually using the gpg command line.

The goal of keybase.io is to create links between public keys and online identities. For example, I have a public pgp key which can be downloaded from here: https://pgp.mit.edu/pks/lookup?op=vindex&search=0xDC82662DC1136424

To fetch my key into your gpg keyring, use:

gpg --recv-keys 0xdc82662dc1136424

This key is combined with a certificate in which I claim that my email-address is rolf.ness(at)pvv.org. Using gpg you can check that this claim was made by someone controlling the private part of the key referred to above. However, you cannot check whether this claim is actually true. I'll get back to this later, but some other examples first.

Let's take a look at this blogspot account and make some claims about this:

My name is Rolf Rander Næss
I control the blogspot-account: rolfn.blogspot.no (a.k.a. rolfn.blogspot.com)
I have a pgp-key with id: 0xdc82662dc1136424
The key fingerprint is: 5D18 257C 9F45 7108 DFA6  AD51 DC82 662D C113 6424
This message is signed with this key

Now I can sign this message, proving possession of the corresponding private key, and by posting it here, I also prove that I control this blogspot account.

The signed message is below. It is formatted in binary (and then converted to ASCII with base64) to avoid errors due to formatting, charset or copy-paste.

-----BEGIN PGP MESSAGE-----

owGbwMvMwMF4pylN96BwigrjWka1JOGknPz04oL8Et3knMTM3GK9koqSELXqSt9K
hbzE3FSFzGKFoPycNIWgxLyU1CIFv8PLiot5uTwVkvPzSorycxRKMlIV4EYkJifn
l+aVWCkAZdLy9GDienn5ChqJetl6iXroMsn5uZog4zISy1IVEhUK0gt0s1MrFcoz
SzIUMlOsFAwqUpItjMzMjFKSDQ2NzUyMTHi5QoBWghSlZealpxYVFGXmlQCdaaVg
6mJooWBkau6sYOlmYqpgbmhgoeDi5mimoODoYmqo4OJsYaQANMlFwRlolALMLKAH
c1OLixPTwX4tzkzPS02B2F8CkgNaxMvVySjDwsDIwcDGygQKGwYuTgFYIPacZP/v
c2u1XP3Nswc5AyZaG/LpZ+Yc8xD5MO191gTXzY4zmd51SU8pcVy3z+XE8m7vI64f
LnQuLHlXv3plwM74Q/JWi5ZvMtYP9uHtbWivjLppy5ju9Obw39AQz5WxPFrSSne4
mW91sE7SiJZYZWZ49sn+XcVC8VN+yy7rfPbQNeJkxnROef6TqbLmXW/awurnJT6N
NDtdr1c3J0zf2I7H5OFaS3lNQ5u/npddpmTuWuS6y9PGwvOMbm+XemJX3DktZaFl
ESuvqKkVCvw+oHmcWSb2Z7OmZdUGFbZmjhNOYf5+/yY9ct4xW2nqo4Tgiz8KX6ZM
EL1ZPrX3YNTWtjax/19OZ+gfu88qKr7v19ovegA=
=DCLD
-----END PGP MESSAGE-----

So, by posting this here, I have created a two-way link between my blogspot-account and my private key. (However, in this particular case it is not worth much, because blogspot doesn't support https, so a man-in-the-middle could change this message before it reached your browser).

If you have my key imported, you can paste this message (including "BEGIN" and "END") into gpg, and gpg will tell you that the signature is good (i.e., that it was made by someone controlling the key). You should check that the fingerprint in the message and the fingerprint reported by gpg match.

Let's try the same for twitter, which is slightly more useful. Here is a statement with a similar set of claims:

My name is Rolf Rander Næss
I control the twitter-account: @rolfrander
I have a pgp-key with id: 0xdc82662dc1136424
The key fingerprint is: 5D18 257C 9F45 7108 DFA6  AD51 DC82 662D C113 6424
This message is signed with this key

The binary-encoded, signed, ascii-armoured version is:

-----BEGIN PGP MESSAGE-----

owGbwMvMwMF4pylN96BwigrjWkamJKGS8sySktQi3eScxMzcYr2SipIQtZo+30qF
vMTcVIXMYoWg/Jw0haDEvJTUIgW/w8uKi3m5PBWS8/NKivJzFEoyUhVgJiQmJ+eX
5pVYKTgAZdKKwDpAajMSy1IVEhUK0gt0s1MrFYCqMxQyU6wUDCpSki2MzMyMUpIN
DY3NTIxMeLlCgOaBFKVl5qWnFhUUZeaVAN1gpWDqYmihYGRq7qxg6WZiqmBuaGCh
4OLmaKag4Ohiaqjg4mxhpAA0yUXBGWiUAswsoOtzU4uLE9PBHinOTM9LTYHYXwKS
A1rEy9XJKMPCwMjBwMbKBPI4AxenACyANixj/ytd1Xtd8ZtSfORNsTlL+Z+r+7+I
+iWw+n+TvLKiRvl1dnOW8JSHhY16IV7vfocJep8+udOGK9Ot8NmJJPsuA+XFjRu/
Bjy5emDeDzW5l643dpStSGqe1fX106JVCyvqn5qJ6x7+o3dzmUvYm3aGH+Gzsia4
7zk9+XjC5L2T5ae7fX9y75fHtNjT3V9ykkSO7enZZaJ3QoZJcp7BFRWn/Uv/Fr+z
jNvw1aznhGYgn/Ax1o+q/x+F+G3nzf1z0ni595Tfth8uJSw9MsXRbgI/X7lw+Myu
fvllK+/VTreZdMBYX/LCUh8Jq43+za5GiUw+AcqW/n+aXm4PCT4lblXKsHHLyl6l
CrUfu3asltS8/R4A
=pG8g
-----END PGP MESSAGE-----

As before, this proves that the person making these claims possesses the private part of the key with this fingerprint. Now, if I could post this to twitter, I would also prove that I control the twitter account and thus provide a two-way link. However, twitter only allows 140 chars, so the message above is too large. To get around this, I only post a hash of this message to twitter. The hash must be constructed from the binary encoded message, again to avoid formatting or cut-and-paste issues. Starting with the pgp-message, the hash can be obtained by piping the message into this command:

gpg 2>/dev/null | openssl sha256 -binary | openssl base64

which returns NMTiZ2clKwsuQnRFQjFxuL1oTL6NE+R2doBG3ohPThA=
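
The two openssl steps in that pipeline can be reproduced in Python if openssl is not at hand. This is only a sketch of the hashing part: it assumes you already have the bytes that gpg writes to standard output, and the function name sha256_base64 is my own.

```python
import base64
import hashlib

def sha256_base64(data: bytes) -> str:
    """SHA-256 of data, base64-encoded.

    Corresponds to: openssl sha256 -binary | openssl base64
    """
    digest = hashlib.sha256(data).digest()  # raw 32-byte digest, like -binary
    return base64.b64encode(digest).decode("ascii")

# Example with a placeholder input; the real input would be gpg's output.
print(sha256_base64(b"example message\n"))
```

A SHA-256 digest always encodes to 44 base64 characters, which fits comfortably within twitter's 140-character limit.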

Now check twitter: https://twitter.com/rolfrander/status/515794922851827712

Since twitter uses https, you can trust (within reason) that no man-in-the-middle has changed this message before it reached your browser.

Now, there are some pieces of software and quite a few organizations you need to trust to be able to trust this key, but I'll get back to that in a later post...

So, given the steps (and caveats) above, you have now established, beyond reasonable doubt, that:

  • the person controlling the private part of key 0xdc82662dc1136424
  • also controls the address rolfn.blogspot.no
  • and controls the twitter-account @rolfrander

However, what you have not done is prove anything about who I really am. To do that, you have to meet me in person (and possibly check some government-issued ID, depending on your use case). If we meet, I can provide some data which enables you to connect my real-life identity to my key. Previously this would mean me giving you a hash of my key, which you could check. However, now that we have established a connection between my twitter-handle and my key, I only need to give you my twitter-handle, which is much easier for you to remember.

This is basically what keybase.io does, but it is wrapped in a nice user interface and with a tool which makes it easier to handle. In addition, they have the added functionality of "tracking"; I will get back to that later.

Edit: I just realized that the command for computing the hash above really just hashes the message, not the signature, whereas keybase hashes message+signature. I don't think this has any security implications, but there could be some corner case I haven't seen yet.

Anyway, the message signed by keybase includes other security measures as well, such as the current time, so I really recommend using keybase as opposed to doing this yourself.

And I am on keybase as well: https://keybase.io/rolfn

Edit 2: oh, and one more thing: I promised to get back to the email example. My pgp-key (as posted on keybase and on the network of public keyservers) contains one or more email-addresses. How can you be sure that these are accurate? First of all, we need a precise formulation of what this is:

  • The key contains a claim about my email-address
  • This claim is signed with my private key (verifiable with the public key), thus it contains proof that I possess the private key

Now, what we want to check is that I also control the email-address. This can be done using challenge-response authentication. For example, if you email me some unique data (such as a random number), encrypted with my public key, I need access to my private key to decrypt. Then I can sign this number with my private key and return to you. If I was able to decrypt correctly and sign verifiably, this proves that I have the correct private key. Since this was done through my email address, it also proves that I control the email-address.
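
The exchange can be sketched in Python. Note the loud assumptions: this uses a toy textbook-RSA key with tiny primes instead of a real pgp-key, there is no email transport, and all names are made up for the illustration. It only shows the order of the steps and must never be used for anything real.

```python
import secrets

# Toy textbook-RSA key built from the primes 61 and 53. NOT secure.
N = 3233   # public modulus
E = 17     # public exponent
D = 2753   # private exponent, known only to the key holder

def encrypt(value, e=E, n=N):
    """Anyone can do this with the public key."""
    return pow(value, e, n)

def decrypt(cipher, d=D, n=N):
    """Requires the private key."""
    return pow(cipher, d, n)

def sign(value, d=D, n=N):
    """Requires the private key."""
    return pow(value, d, n)

def verify(value, signature, e=E, n=N):
    """Anyone can do this with the public key."""
    return pow(signature, e, n) == value

# Verifier: pick a random challenge and "email" it encrypted.
challenge = secrets.randbelow(N - 2) + 2
sent = encrypt(challenge)

# Key holder: decrypt the challenge, sign it, and reply.
recovered = decrypt(sent)
reply = (recovered, sign(recovered))

# Verifier: the reply must contain the original number and a valid signature.
ok = reply[0] == challenge and verify(reply[0], reply[1])
print(ok)
```

With a real pgp-key the same steps would be done with gpg's encrypt, decrypt, sign and verify operations, and the email round trip is what ties the key to the address.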

But do note that you still don't know for sure who I am; you just know that the email-address and the private key are controlled by the same entity. To prove you are talking to me, we need to meet in person.

Monday, 2 June 2014

Re: Client Feedback On the Creation of the Earth

Regarding:
http://www.mcsweeneys.net/articles/client-feedback-on-the-creation-of-the-earth

Dear Mike, thank you for this thorough feedback. I will run this by my engineering team, but I can give some preliminary feedback on the issues you point out.

1. Please note that "day" and "night" are just handles we use internally. Wording and translation to different languages is the customer's responsibility, and we have provided a configuration system for this purpose. Please see clause 5b in the contract.

2. Well, you didn't provide any strict requirements regarding color, but you had quite a few requirements regarding the composition of the atmosphere and the fact that carbon-based life was to be able to live on "earth" without external support systems. Finding the right balance here turned out to be quite a challenge, but I believe we found a good tradeoff between cost and functionality. The color is really just a consequence of the composition of the atmosphere and there is nothing we can do about this so late in the process.

3. Unfortunately it is not possible to make life out of carbon alone, you also need fluid to transport stuff around (sorry about this simple explanation, my engineers can get back to you with more details if necessary). Water turned out to be the most stable fluid available. I realize that the amount of water might seem excessive, but it is really necessary to get everything working.

4. I'm not sure what you are getting at. You specifically wanted carbon-based life. Now, your definition of "life" might differ from the generally accepted meaning in the industry, but some sort of "reproduction" is usually regarded as a key ingredient. The "seeds" and "fruits" (these are your terms; our internal, technical terms are more refined) are needed for reproduction. Thus no "seeds", no life.

5. This is really a cost issue. It turned out that creating a source of light was far, far more expensive than expected. Thus we only created one. The second source you see at night is really just a reflection of the primary source. We considered adding more reflections like this, but it soon became unstable and crashed.

6. As I explained above, we need sea to get life. The fact that the life also spreads to the sea is a side-effect of life. (Actually it was the other way around, it turned out that the easiest way of bootstrapping the life-process was by starting it at sea, but this doesn't matter. Even if we had started the life on land it would have spread to the sea eventually)

7. The birds were just an add-on, really. One of our engineers thought it would be a fun idea and added them. I believe they add more drama and movement and overall make the system more rewarding to use.

8. You can regulate the number of animals by adjusting the amount of plants available. Please see the user's manual.

9. They aren't really made "in my image", that is just something they keep telling themselves to boost their own ego and sense of superiority. But they really are at the top of the food-chain, and that means they can pretty much do what they like. I understand that this can be a PITA, but life is usually structured like this, with one species on top, and everything else is really just a consequence of this.

10. Please see note about life above (pt 4). Mankind, being at the top of the food-chain, cannot be expected to figure out the most rational way to behave by themselves. Everyone else on "earth" is hunted by someone else and will adopt the behavior necessary to avoid extinction. Previous trials show that this doesn't work with the species on top, and they need to be told what to do. If not given explicit instructions to "be fruitful and multiply", they will just vanish after a few generations. Please note that we tried to put this as delicately as possible, using the euphemism "fruitful" instead of more explicit terms.

Working on Sunday is a no-go, but I will gladly bring my chief architect to go over these issues with you and your stakeholders on Monday.

regards
God

Thursday, 26 September 2013

Interpreting the snare drum part in Ravel's Bolero

A study of performance traditions as represented on Spotify

The snare drum part in Bolero is, as is well known, a monotonous two-bar rhythm repeated throughout the entire work, namely:

But even though this appears simple and monotonous, I would argue that there is still room for interpretation and phrasing.

The question is: is it right to read any phrasing into it, and is it common to do so?

I have at my disposal just under 200 different recordings available on Spotify; here follows an overview of the most remarkable ones.

First, a few words about tempo. Wikipedia has a longer discussion. Ravel himself indicated a tempo of 66, which means a duration of just under 16 minutes. But practice varies from 17:40 (Pedro de Freitas Branco – Le boléro) down to 12:06 (Leopold Stokowski – Bolero). Barenboim/Chicago is probably quite close to Ravel's intention.
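
The durations can be turned back into approximate tempi, since tempo is inversely proportional to duration. The sketch below assumes that tempo 66 corresponds to exactly 16:00 (the text says "just under"), so the results come out slightly too high; the function names are made up for the example.

```python
REFERENCE_BPM = 66
REFERENCE_SECONDS = 16 * 60  # "just under 16 minutes", rounded up to 16:00

def to_seconds(mm_ss: str) -> int:
    """Convert a duration written as 'mm:ss' to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)

def implied_tempo(duration: str) -> float:
    """Average tempo implied by a recording's duration."""
    return REFERENCE_BPM * REFERENCE_SECONDS / to_seconds(duration)

for name, duration in [("Freitas Branco", "17:40"), ("Stokowski", "12:06")]:
    print(name, round(implied_tempo(duration)))
```

Under that assumption, the slowest recording averages roughly 60 beats per minute and the fastest roughly 87.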

But it was the phrasing of the snare drum part we were going to talk about. To make music of this, and not just sound like a metronome, I think it is natural to place a slight emphasis on the eighth notes on the third beat of the first bar, and to let the triplets in the second half of the second bar lead clearly towards the downbeat of the following bar. Stokowski, mentioned above (the fastest recording), is a good example of what I mean. Other examples are:

Some other, slightly more imaginative variants:

  • Lamoureux Concert Association (?) sounds as if it is trying to move the triplets a little "to the right" (without managing it every time)
  • Yes, it is hard to play softly, but surely it must be possible to play a little softer than "drum march"? New York/Kurt Masur
  • London/Abbado (Huh? It sounds as if the drummer had been on a bender the day before and has not quite recovered. And this is London with Abbado?!)

And then you have those who really do not manage it. Some are just a little uneven, so that you get a slightly odd phrasing with no real direction, while the most extreme cannot get the same sound from the left and the right hand. It almost gets a bit of a school-band feel at the end of the second bar. Some examples of this:

Personally I prefer the snare drum playing to swing a little. This is meant to be a dance, and it becomes a very boring dance with a monotonous metronome in the background. At the same time the phrasing must be careful and subtle so that it does not tip over into parody. I think Boston/Ozawa pull it off well.

Thursday, 24 March 2011

Designing an infrastructure for booting linux from iSCSI

The previous post outlined the components needed to boot linux with iSCSI root:

  • A kernel
  • An initial root-fs with necessary drivers and scripts for mounting an iSCSI-volume
  • The root-fs on iSCSI
  • Configuration

Keep in mind that all of this has to fit together. The init-root and iSCSI-root need a /lib/modules directory which has the kernel-modules compiled for the specific kernel loaded. In practice, the init-root is built for one specific kernel, while the actual root might support several (i.e., it will typically contain modules for each kernel ever installed), but userland tools might work only with the latest. The configuration needs to take this into consideration.

Now, for actual requirements:

  • The iSCSI-root should be as general as possible, meaning I don't want any specific configuration inside this filesystem (there are some things actually needed, more on that below). This is because I would like to be able to create a new instance by cloning a template and booting this without having to change any config file inside the filesystem itself.
  • The same goes for the init root-fs and the kernel, these should be re-used across several instances.
  • There must be a naming scheme which makes it easy to understand which components go together
  • There should be no redundancy in the configuration, the same things should not be configured several places (as an inconsistent configuration could lead to hard-to-track bugs)
  • The configuration should be as concise as possible, just list the things that actually vary in a brief format

There are also a few other parameters that will (or should) vary between instances:

  • Ethernet MAC-address. This will be my preferred way of locating which configuration to use, that is: given a MAC-address, the configuration should determine all other parameters
  • IPv4-address, determined by MAC-address using DHCP
  • IPv6-address, I will probably not use DHCP, but instead use the router-solicitation mechanism in IPv6, as this is simpler and more elegant
  • Filesystem UUID. I don't think it matters if separate machines have the same UUIDs, but if I some day try to mount these filesystems on the same machine, it will probably get confused.
  • Hostname. This could be determined by DNS, but it would probably be an advantage if the machine knows its name even if network hasn't come up yet. It could also be the other way around, that DNS is updated by DHCP when the IP is assigned. (This is more or less out-of-the-box functionality with many DHCP-servers. Unfortunately, there seems to be no standard solution for IPv6 yet)
  • SSH-keys, re-using these across machines would be a security issue.
  • iSCSI-initiator id, a unique name identifying the client.

The design for handling these parameters could be as follows:

  • The MAC-address is either determined by the ethernet-card (in a physical machine) or by the hypervisor-tools (in a virtual one)
  • IP-addresses are assigned automatically based on MAC
  • Boot-parameters (location of kernel and init root, along with parameters to mount the iSCSI-volume) could be set by the hypervisor or the DHCP-server
  • Filesystem UUID should be set when creating it (so if we are creating a new instance by cloning a template, it should be immediately followed by a change of UUID). Keep in mind that /etc/fstab should refer to something else than UUID (I am thinking volume label, but I will get back to this)
  • SSH-keys will be generated on first boot (the ssh startup script should check if keys exist, and generate them if not)
  • Hostname could be set manually on first boot.
  • iSCSI initiator id must be set outside of the booted system, as this needs to be available before it has access to the actual root-fs, but it must also be known to the machine itself, because it might want to attach other iSCSI-interfaces after it has booted. One possibility is to create an address-structure based on MAC (even one more id attached to the MAC...)
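
The idea of deriving everything from the MAC-address can be sketched as follows. This is just one possible shape of the design: the iqn prefix, the hostname, the INSTANCES table and the function names are all hypothetical, and only the general iqn.YYYY-MM.reversed-domain:identifier form of the initiator name follows the iSCSI naming convention.

```python
def normalize_mac(mac: str) -> str:
    """Reduce 'AA:BB:..' or 'aa-bb-..' to twelve lowercase hex digits."""
    digits = mac.replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits

def initiator_name(mac: str, prefix: str = "iqn.2011-03.org.example") -> str:
    """Derive the iSCSI initiator id from the MAC (hypothetical scheme)."""
    return f"{prefix}:{normalize_mac(mac)}"

# Only the values that really vary per instance are listed here;
# everything else is derived from the MAC or shared between instances.
INSTANCES = {
    "001122334455": {"hostname": "web1"},
}

def boot_parameters(mac: str) -> dict:
    """Look up an instance by MAC and add the derived parameters."""
    params = dict(INSTANCES[normalize_mac(mac)])
    params["initiator"] = initiator_name(mac)
    return params

print(boot_parameters("00:11:22:33:44:55"))
```

Because the initiator id is computed rather than stored, both the boot infrastructure and the booted machine can arrive at the same name without it being configured in two places.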

Wednesday, 23 March 2011

How to boot linux from iSCSI

I would like to boot virtual (or physical) linux hosts from the network, with an iSCSI-device as root.
There are bits and pieces of information concerning this available, but I have not been able to find a complete guide to how this is done.

This is what I would like to do:

  • Load the kernel, either using the Xen bootloader or with PXE (network boot)
  • Attach to an iSCSI-disk available somewhere on my local network (in iSCSI-terms: let the newly booted linux be an iSCSI-initiator and log it in to an iSCSI-target)
  • Mount the iSCSI as root
  • Continue booting from the new root-filesystem

This is more or less the same as booting from NFS, but iSCSI is far more efficient than NFS.

To get this to work, we need the following pieces:

  • A kernel that can be loaded before the filesystem is mounted, which means that it needs to be copied to a location outside of the host we want to boot. For Xen, this can be anywhere on the Dom0 filesystem, but to be really general and independent of any local hardware, it should be put on a tftp-server
  • The initial root filesystem (initrd.img) needs to be in the same place
  • Some configuration telling the bootloader where to find the kernel and the root filesystem. With XEN, this could be the xen config file, but for PXE, it should be placed in the dhcp-configuration.
  • After booting the kernel with the initial root-fs, it will need some tools to mount the actual root-fs from iSCSI:
    • Kernel modules for iSCSI. I'm not exactly sure what the minimum set is, but we probably need at least iscsi_tcp.
    • Tools to connect to iSCSI, iscsistart
    • Configuration of where to find the iSCSI-volume
    • Scripts pulling everything together
  • Scripts run by init on the initial root file-system will mount the actual root from iSCSI and continue booting from this.
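
For the PXE case, the configuration piece in the list above could end up as a host entry in the DHCP server's configuration. The sketch below assumes ISC dhcpd and a PXELINUX boot file, neither of which is named in this post; the host name, the addresses and the helper dhcpd_host_stanza are hypothetical.

```python
def dhcpd_host_stanza(name, mac, ip, tftp_server, bootfile="pxelinux.0"):
    """Render one host entry in ISC dhcpd.conf syntax."""
    return "\n".join([
        f"host {name} {{",
        f"  hardware ethernet {mac};",    # match the client by MAC
        f"  fixed-address {ip};",         # always hand out the same IPv4 address
        f"  next-server {tftp_server};",  # tftp-server holding kernel and initrd
        f'  filename "{bootfile}";',      # boot file the client fetches first
        "}",
    ])

stanza = dhcpd_host_stanza("web1", "00:11:22:33:44:55", "192.0.2.10", "192.0.2.1")
print(stanza)
```

The location of the kernel and initrd.img would then be given in the boot loader's own configuration on the tftp-server.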

See initrd(4) for a detailed and general description of the boot-process. (However, this seems a bit out of date, because it refers to LILO and LOADLIN, and seems unaware of GRUB. Documentation/initrd.txt seems just as outdated.)

I have investigated the kernels bundled with the latest ubuntu and debian variants and found that:

  • Both have the necessary kernel modules included in initrd.img
  • Debian has scripts for setting up iSCSI included in initrd.img
  • Neither has the iscsistart-tool

But wait a minute, what really happens here? Actually, the initrd.img is not copied from an installation-archive; it is generated when installing the kernel. This is in fact documented, if you know what to look for:

  • initramfs-tools is the tool that actually generates the image. The contents come from /usr/share/initramfs-tools. Some other packages put contents here.
  • open-iscsi provides the iscsi-script mentioned above
  • It also provides the iscsistart command, but apparently not in a location picked up by the initramfs-tools.

Further digging and searching (use the source, Luke...), and voilà: if you create /etc/iscsi/iscsi.initramfs with default values for the iscsi-configuration, the necessary files will be included when generating the initramfs. This is actually described here.

This was the general info on how this fits together, some recommendations to actual setup will come in a later post.

Thursday, 10 March 2011

Powershell for Unix-users

PowerShell is an object-oriented scripting language bundled with Windows 7 and Windows Server 2008 which is heavily influenced by unix scripting, python, perl, lisp and more. This guide lists some common unix-commands and their PowerShell equivalents. Please note that they are not completely equal, as unix-commands work on streams of bytes, typically split into lines, while PowerShell works on streams of objects.

grep    Where-Object, but see below for details
cd      Set-Location, but cd is an alias
cat     Get-Content, but cat and type are aliases

grep "pattern"
If input is a list of strings, the Select-String command is equal to grep:
Select-String -Pattern "pattern"
However, input will typically be a stream of objects, and what you want to do is to filter this stream. Perl's grep or Common Lisp's remove-if-not, which both accept a general selection function as a parameter, are therefore more appropriate comparisons. The PowerShell command is inspired by SQL's SELECT ... WHERE:
Where-Object { $_.Name -match "pattern" }   # $_ is the current object; Name is just an example property
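
The difference between filtering lines and filtering objects can also be shown outside PowerShell. The Python below is only an analogy, not PowerShell itself: the Process class, the sample data and the helper names grep and where_object are made up for the illustration.

```python
import re
from dataclasses import dataclass

@dataclass
class Process:
    name: str
    memory_mb: int

def grep(pattern, lines):
    """Unix style: keep the text lines that match a regular expression."""
    return [line for line in lines if re.search(pattern, line)]

def where_object(predicate, objects):
    """PowerShell style: keep the objects for which the predicate is true."""
    return [obj for obj in objects if predicate(obj)]

processes = [Process("sshd", 5), Process("firefox", 900), Process("bash", 3)]

# The same data flattened to text, as a unix pipeline would see it.
lines = [f"{p.name} {p.memory_mb}" for p in processes]

print(grep(r"fire", lines))
big = where_object(lambda p: p.memory_mb > 100, processes)
print([p.name for p in big])
```

With text you can only match on what the line looks like; with objects the filter can use any property directly, which is why Where-Object takes a script block rather than a pattern.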