Debian testing, using http for gpxe
Here is the rundown on how I have set up my gpxe client:
Compile a kernel (debian 2.6.26)
- herbie is my model machine; it has a hard drive, but usually boots off the network. I prepared the kernel and the image from it. Once the image is created by oneSIS on the server (mimi), updates are done directly on mimi. There is a copy of the image in /tftpboot, and every time I do an update and have verified that everything is working as advertised, I rsync this directory so there is always a backup copy. There is enough room on /tftpboot that I may just keep the previous version until the next update, so if a problem doesn't show up for a while, I can still see how things were previously. When it is time to update the kernel, herbie has to be booted off the hard drive and brought up to date first.
- downloaded latest linux-source and prepare directories (apt-cache search linux-source will tell you what goes in the ... part):
apt-get install linux-source...
cd /usr/src
unpack kernel
tar -jvxf linux-source...
ln -s linux-source... linux
- install the latest linux-image to get the config from. You are probably already at the latest image, in which case just copy the config; otherwise reboot after installing to make sure you are using that kernel and that it works fine.
- copy config from latest linux-image, example:
cp /boot/config-2.6.8-2-k7 /usr/src/linux/.config
- other software you will need:
- libncurses5-dev
- kernel-package
- move to /usr/src/linux directory, and get ready for actual compile
make menuconfig
- compile options: make sure each option below gets an asterisk (built in), not just an M (module). Use Exit to return to the previous menu; Exit from the top level (Linux Kernel Configuration) really exits menuconfig, so be careful. Save on exit when asked.
Top Level (Linux Kernel Configuration) -> Networking -> Networking Options -> IP: kernel level autoconfiguration
(once autoconfiguration is asterisked, these show up to put an asterisk in:)
IP: DHCP support
IP: BOOTP support (maybe IP: RARP support?)
Top Level -> Device Drivers -> Network Device Support -> appropriate network card (Ethernet; currently using an Intel PRO PCI card, Intel® 82576 on custom built servers)
Top Level -> Device Drivers -> Input device support -> Mice -> PS/2 mouse
Top Level -> File systems -> Miscellaneous filesystems -> make sure Compressed ROM file system support has *
Top Level -> File systems -> Network File Systems -> NFS client support -> Root file system on NFS
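After saving the config, a quick way to double-check that the options above ended up built in (=y) rather than modular (=m) might be a small shell function like the one below. This is only a sketch: the function name is made up, and the CONFIG_ symbol names in the usage comment are my mapping of the menu entries above to .config symbols, so check them against your own kernel version.

```shell
#!/bin/sh
# Print every option from the argument list that is NOT built in (=y)
# in the given kernel .config. Options set to =m or "is not set" are reported.
missing_builtin_options() {
    config=$1
    shift
    for opt in "$@"; do
        # Built in means the line reads exactly OPTION=y
        grep -q "^${opt}=y\$" "$config" || echo "$opt"
    done
}

# Hypothetical usage (symbol names assumed from the menu entries above):
# missing_builtin_options /usr/src/linux/.config \
#     CONFIG_IP_PNP CONFIG_IP_PNP_DHCP CONFIG_IP_PNP_BOOTP \
#     CONFIG_CRAMFS CONFIG_NFS_FS CONFIG_ROOT_NFS
```

No output means everything in the list is built in; anything printed needs another trip through make menuconfig. The network card driver is not in the list because it depends on the hardware.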
make-kpkg clean
note: use new revision number if using same kernel-image version
look at deb packages in /usr/src to see what number/letter to use.
make-kpkg --initrd --revision=1a kernel_image
notes: the revision number is only used to identify the image in the deb package
kernel_image is typed just as written - it is not a placeholder for a specific number or image
need to have initrd support (hence --initrd), at least the way I have done this, and certainly if you want to boot the server using this kernel
Congratulations, you should now have a newly compiled kernel!
kernel image can now be found here:
/usr/src/linux-source-2.xx.xx/arch/i386/boot/ (new amd64 computer images found in x86_64 directory)
image is called bzImage
Prepare Kernel
- Download and install oneSIS deb package:
http://sourceforge.net/projects/onesis/files/, put in /root/src and install using dpkg
- build modules for new kernel
cd /usr/src/linux
make modules
while doing make modules got this message:
WARNING: modpost: Found 2 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'
I tried doing this, but didn't find anything strange. Maybe I didn't do it correctly: I just added the CONFIG...=y bit to the .config file in /usr/src/linux before building the kernel (the message itself suggests passing it on the make command line instead), and I didn't see any output referring to actual mismatches. I suspect this isn't a big deal anyway, since a real problem should also affect this machine booting with this kernel, not just a computer booting off the network, and I was able to boot the server with the newly compiled kernel with no problem. Furthermore, according to the searches I did, this will either royally screw your system or have no effect.
make modules_install
- install the newly compiled kernel on the server, and make sure this machine will boot from it, has the right modules, etc. This is partly a test, and partly because I want the machine that I do updates on to be running the same kernel as the clients. From /usr/src/:
dpkg -i linux-image-2.6.8_2d_i386.deb
(or whatever your compiled image is called)
put the kernel package on hold so that it doesn't get updated. First check the package name with:
dpkg -l |grep linux-image
You should recognize the revision you gave it; it should also be the only one that is a kernel binary. Then set the hold:
echo "linux-image-2.6.26 hold" | dpkg --set-selections
reboot server, make sure that is a go
At the end of this tutorial, there will be two initrd images: one in your /boot directory, created by dpkg when installing the kernel on the server, and one that you will create below for the clients. To make the client initrd, change the option BOOT=local to BOOT=nfs in /etc/initramfs-tools/initramfs.conf, then change it back after you run the mkinitramfs command below, so that the local initrd is created correctly when a kernel is installed using dpkg. This is the config for all initrds made, so even if you aren't going to install the kernel locally on the server, you need to change it back when you are done so that you don't mess up future kernel installations (this shows up as not being able to find the root directory when booting).
Note: I am using the directory /tftpboot for all diskless booting stuff for historical reasons, but since I am now actually serving the kernel with a web server, and the root directories are served using nfs, it doesn't actually need to be called /tftpboot.
Once you have changed initramfs.conf to BOOT=nfs, this command will use the running kernel to create a new initrd.img in the oneSIS directory (use current numbers):
mkinitramfs -o /oneSIS/initrd.img-2.6.26
If you are not running your compiled kernel, or if you are using a different kernel on the server, you have to specify the kernel version:
mkinitramfs -o /oneSIS/initrd.img-2.6.26 2.6.26
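Since forgetting to switch BOOT back is easy, the BOOT=local / BOOT=nfs switch described above could be wrapped in a small helper. This is a sketch, not something from the oneSIS or initramfs-tools packages: the function name is made up, and it assumes the config file has a single uncommented line starting with BOOT=.

```shell
#!/bin/sh
# Set the BOOT= line of an initramfs.conf to "local" or "nfs".
# The second argument is optional and defaults to the system config file.
set_initramfs_boot() {
    mode=$1
    conf=${2:-/etc/initramfs-tools/initramfs.conf}
    case "$mode" in
        local|nfs) ;;
        *) echo "usage: set_initramfs_boot local|nfs [conf-file]" >&2
           return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    # Rewrite via a temp file, then copy back so the original file keeps
    # its owner and permissions
    if sed "s/^BOOT=.*/BOOT=$mode/" "$conf" > "$tmp"; then
        cat "$tmp" > "$conf"
        rc=$?
    else
        rc=1
    fi
    rm -f "$tmp"
    return $rc
}

# Hypothetical usage: build the client initrd, then always switch back.
# set_initramfs_boot nfs && mkinitramfs -o /oneSIS/initrd.img-2.6.26
# set_initramfs_boot local
```

Running the switch back to local as a separate command means it still happens if mkinitramfs fails.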
Use wraplinux, which is on mimi, to put the kernel and initrd together.
- copy the new kernel image to mimi (something like /usr/src/linux-source-2.xx.xx/arch/i386/boot/bzImage as above, x86_64 directory for amd64), along with the new initrd you created (/oneSIS/initrd.img-2.6.26 or whatever). I copy them both to /oneSIS and use wraplinux there:
wraplinux -i /oneSIS/initrd.img-2.6.26 -p "root=/dev/nfs ip=dhcp nfsroot=10.208.108.17:/var/lib/oneSIS/image" --output=/oneSIS/kernel bzImage
If the nfsroot in the wraplinux command is incorrect, the client will go into an endless loop about trying to unmount when you try to boot it. This is pretty misleading, but look at the server logs and you should see what it attempted to mount.
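Because a wrong nfsroot fails in such a confusing way, it may be worth building the parameter string with a helper that at least rejects a path without a leading slash. This is only a sketch with a made-up function name; it checks nothing on the server side, so the export still has to exist on mimi.

```shell
#!/bin/sh
# Build the kernel parameter string that is handed to wraplinux -p.
# Arguments: NFS server address, absolute path of the image on that server.
nfs_boot_params() {
    server=$1
    image=$2
    case "$image" in
        /*) ;;   # absolute path, fine
        *)  echo "image path must be absolute: $image" >&2
            return 1 ;;
    esac
    echo "root=/dev/nfs ip=dhcp nfsroot=${server}:${image}"
}

# Hypothetical usage, same command as above with the string generated:
# wraplinux -i /oneSIS/initrd.img-2.6.26 \
#     -p "$(nfs_boot_params 10.208.108.17 /var/lib/oneSIS/image)" \
#     --output=/oneSIS/kernel bzImage
```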
Note: wraplinux is not a debian package, need to download from http://freshmeat.net/projects/wraplinux, put in /root/src/
Note: bzImage will need to be moved to http server. I generally call it bzImage there, but the file you are actually moving from mimi is kernel, since that was the output file when wrapping the bzImage and initrd together.
If this is a kernel update, rather than first time install, go to OneSIS Configuration at this point.
Create a gpxe file for floppy:
Webserver:
stuck the bzImage I created onto my webserver and created a file boot.gpxe with the following contents:
#!gpxe
kernel http://www.shadlen.org/gpxe/bzImage root=/dev/nfs ip=dhcp
boot
DHCP server:
- added stuff to my dhcpd.conf:
# computational cluster
group {
    filename "http://www.shadlen.org/gpxe/boot.gpxe";
    server-identifier 10.208.108.18;
    option subnet-mask 255.255.255.128;
    use-host-decl-names on;
    default-lease-time 86400; # 24 hours
    max-lease-time 86400; # 24 hours
    host nina {
        hardware ethernet 00:04:23:e0:5f:9b;
        fixed-address 10.208.108.18;
    }
    host oscar {
        hardware ethernet 00:04:23:e0:5e:68;
        fixed-address 10.208.108.28;
    }
}
OneSIS
Currently some issues with oneSIS and Java. cd /root/src/oneSIS-svn/trunk and run:
prefix=/var/lib/oneSIS/image make install
Get the image from your client with a hard drive (the same one you got the kernel from; I used herbie). On mimi run:
copy-rootfs -r herbie /var/lib/oneSIS/image
It will ask you several times for the root password for herbie.
There are several directories that you will want to empty out, so that you can create links to ram for them and the OS can write to them as it boots. Unfortunately, just emptying them doesn't always really empty them (?!?), so remove the directories and then recreate them:
- rm -r /var/lib/oneSIS/image/var/run
- rm -r /var/lib/oneSIS/image/var/log
- rm -r /var/lib/oneSIS/image/var/lock
- rm -r /var/lib/oneSIS/image/tmp
- rm -r /var/lib/oneSIS/image/var/lib/gdm
- rm -r /var/lib/oneSIS/image/var/lib/munin
- rm -r /var/lib/oneSIS/image/usr/tmp
- mkdir /var/lib/oneSIS/image/var/run
- mkdir /var/lib/oneSIS/image/var/log
- mkdir /var/lib/oneSIS/image/var/log/gdm
- mkdir /var/lib/oneSIS/image/var/log/munin
- mkdir /var/lib/oneSIS/image/var/lock
- mkdir /var/lib/oneSIS/image/tmp
- mkdir /var/lib/oneSIS/image/var/lib/gdm
- mkdir /var/lib/oneSIS/image/var/lib/munin
- mkdir /var/lib/oneSIS/image/var/lib/ntp
- mkdir /var/lib/oneSIS/image/usr/tmp
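The remove-and-recreate list above has to be repeated at every kernel update, so it could be kept as a function. This is a sketch under my own assumptions: the function name is made up, it uses rm -rf and mkdir -p (instead of rm -r and mkdir) so that missing directories are not an error, and it refuses to run without an existing image directory so a typo cannot wipe the server's own /var/log.

```shell
#!/bin/sh
# Remove and recreate the directories that will later be linked to ram.
# Argument: root of the image, e.g. /var/lib/oneSIS/image
reset_image_dirs() {
    image=$1
    # Refuse an empty argument, "/" itself, or a directory that does not exist
    if [ -z "$image" ] || [ "$image" = "/" ] || [ ! -d "$image" ]; then
        echo "usage: reset_image_dirs /path/to/image" >&2
        return 1
    fi
    # Same directories as the rm -r list above
    for d in var/run var/log var/lock tmp var/lib/gdm var/lib/munin usr/tmp; do
        rm -rf "$image/$d"
    done
    # Same directories as the mkdir list above
    for d in var/run var/log var/log/gdm var/log/munin var/lock tmp \
             var/lib/gdm var/lib/munin var/lib/ntp usr/tmp; do
        mkdir -p "$image/$d" || return 1
    done
}

# Hypothetical usage:
# reset_image_dirs /var/lib/oneSIS/image
```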
The /root directory on herbie has stuff that we don't want in the /root directory for the clients. Life is easier if the OS can write to /root, so we put it in ram, but we don't want it loading a bunch of crap into ram. So I moved the src directory and netboot directory that live in /root on herbie to /backup on mimi:
mv /var/lib/oneSIS/image/root* /backup/herbie_root/
Also make sure you have the distro set in the image config file, /var/lib/oneSIS/image/etc/sysimage.conf
Flexnet is not on herbie, so copy it from an old oneSIS image on mimi. (Remember that on the image the actual file will be the .default file, but we copy it to the normal file name, since oneSIS will move it to the .backup file for us.)
cp /backup/old_image/etc/init.d/.flexnet.default /var/lib/oneSIS/image/etc/init.d/flexnet
or copy from the matlab directory /usr/local/matlabx/etc/flexnet.boot.linux. Remember to put in a username if copying from the matlab directory.
- chroot /var/lib/oneSIS/image
- update-rc.d flexnet defaults
- exit
mk-sysimage will not edit the /etc/fstab file now that debian is using UUIDs instead of /dev, so you need to edit fstab by hand and comment out every line that would use a local drive.
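For the fstab edit, something like the following sed filter might save some typing. It is a sketch, not part of oneSIS: the function name is made up, and it assumes that every line starting with UUID=, LABEL= or /dev/ refers to a local drive (this also catches things like a cdrom entry). It prints to stdout so the result can be reviewed before replacing the file.

```shell
#!/bin/sh
# Print the given fstab with every UUID=, LABEL= and /dev/ line commented out.
# Lines for proc, tmpfs, nfs mounts and existing comments pass through unchanged.
comment_local_mounts() {
    sed -e 's|^UUID=|#&|' -e 's|^LABEL=|#&|' -e 's|^/dev/|#&|' "$1"
}

# Hypothetical usage:
# comment_local_mounts /var/lib/oneSIS/image/etc/fstab > /tmp/fstab.new
# (look at /tmp/fstab.new, then copy it over the image's etc/fstab)
```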
get the distro name from the patch directory, since that is what it is used for:
/usr/share/oneSIS/distro-patches (patch currently called debian-6.patch)
make sure the config file in the image has the line:
# DISTRO: <name> <version> #
DISTRO: debian 6.32
I had to create my own debian patch, because the one in the oneSIS package was too old. The patch for Debian Squeeze can be found here: oneSIS
mk-sysimage will read the configuration file from the image, but it will look for the patch on the host (here it is mimi:/usr/share/oneSIS/distro-patches)
mk-sysimage --dryrun /var/lib/oneSIS/image
if everything works, can add stuff to /var/lib/oneSIS/image/etc/sysimage.conf
my config file can be found at oneSIS
Re-do the dry run, if still okay, run mk-sysimage for real
mk-sysimage /var/lib/oneSIS/image
- Last time I had problems with the /tmp directory having the wrong permissions (gdm wasn't working properly, would log you out as soon as you tried to log in). Change the permissions of /var/lib/oneSIS/image/tmp.default to 777 to fix it.
Finish up
- restarted various services
- exportfs on machine with root directories
- dhcp on dhcp server
- boot up client with floppy!
Making Changes, minor updates
If you make changes, make a note, since this will make the image different from the image on herbie's hard drive, and when it is time to update the kernel you will want to reconcile them. Sometimes it is necessary to chroot to make changes, but this is not the preferred method.
using chroot:
chroot /var/lib/oneSIS/image
now can run aptitude update, etc.
better for updating:
make a copy of the image. copy should already be at the latest, but rsync again, just in case you forgot at the end of the last update
rsync -avu /var/lib/oneSIS/image /backup/
things are set up so that according to ella everyone can mount the image rw, but mimi only allows herbie to mount rw, others are all ro. Default should still be that herbie is ro, so change mount.
NOTE: I was having a hard time getting herbie to change from ro to rw, so started changing on mimi instead. Need to edit /etc/exports and run exportfs -a
Once the backup of the image has been created, and mounted rw, can update as root from herbie.
- reboot herbie and test or just remount image ro (remount doesn't always work)
- mount -o remount -r /
- if everything is ok, rsync again so the most current image is in the backup folder
rsync -avu /var/lib/oneSIS/image /backup/
It would probably be better to update the backup image, rather than the live image, but this would involve running wraplinux again and booting with a different kernel, since this is where the image location is set. So far, no problems with updating the live image, and can always rsync back to the previous image if things go awry.
Updating the kernel
If updating the kernel, have to boot herbie into hard drive. I think the only difference between herbie local and image is the license for matlab9 (uses different ethernet card, both are present on both as backup). Remember, kernel will need to be recompiled, so only do this if major kernel upgrade, or if it has been a long time since you reconciled. Remember to always boot into compiled kernel to make sure using correct kernel modules, etc.
- booted herbie on the hard drive, and updated everything (see above for dealing with kernel, will have to take kernel off hold to install new one)
On mimi:
- back up the current image.
- rsync -avu /var/lib/oneSIS/image/ /backup/old_image/
Tried changing boot.gpxe on ella so that I was using the image in /backup instead of /var/lib/oneSIS/ but this didn't work. Must be set somewhere else. So, instead after making a copy of the old image, get rid of the image in /var/lib/oneSIS and replace it with the new image from herbie.
- copy-rootfs -r herbie /var/lib/oneSIS/image
- have to pay attention and put in password when it wants it for various directories
- copy certain files from old image
- cp /backup/old_image/etc/sysimage.conf /var/lib/oneSIS/image/etc/
- cp /backup/old_image/usr/share/oneSIS/distro-patches/debian.patch /var/lib/oneSIS/image/usr/share/oneSIS/distro-patches/
- copy the munin-node start script, because have included a hack to get image to mount everything. someday need to fix this hack...
- cp /backup/old_image/etc/init.d/munin-node /var/lib/oneSIS/image/etc/init.d/
- remove and replace directories listed in OneSIS Configuration
- patch the image
mk-sysimage --dryrun /var/lib/oneSIS/image
if all is okay, run mk-sysimage for real
mk-sysimage /var/lib/oneSIS/image
patch no longer able to patch fstab, since debian is using UUID, so make sure to comment out the local hard drives in fstab before testing.
now, test to make sure herbie will boot off this image. Once confident this image is working well, rsync so there is a backup
rsync -avu /var/lib/oneSIS/image/ /backup/image
If you want to do a test run on just one computer, put the new image in a new directory, for example, /backup/new_image.
Rewrap the kernel so it looks for the new image in the correct place:
wraplinux -i /oneSIS/initrd.img-2.6.32 -p "root=/dev/nfs ip=dhcp nfsroot=10.208.108.17:/backup/new_image" --output=/oneSIS/kernel_herbie bzImage
Place this on the web server:
scp /oneSIS/kernel_herbie ella:/var/www/gpxe/new_bzImage
and edit boot.gpxe (make sure there is a backup copy first) to reflect the new image:
#!gpxe
kernel http://www.shadlen.org/gpxe/new_bzImage ramdisk_size=14332 root=/dev/nfs ip=dhcp nfsroot=10.208.108.17:/backup/new_image rw
boot
Make sure mimi is set to export this directory to herbie (/etc/exports -> exportfs -a)
OneSIS hints
myclass tells you the class
Last time I had to patch the file /usr/share/perl/5.10.1/oneSIS.pm (exists on both mimi and the image, but most important to patch on the image)
This was so that I wasn't getting this error:
/ram/var/spool/torque/server_name: Too many levels of symbolic links
Line numbers were not correct in the patch from Josh. Here it is as given by Josh:
--- oneSIS.pm/lib/oneSIS.pm (revision 501)
+++ oneSIS.pm/lib/oneSIS.pm (working copy)
@@ -3368,9 +3368,11 @@
my $filename = $1;
if (defined (copy_file($final_target, "$dest_dir/$filename"))) {
- chdir($dest_dir);
- unless (symlink($filename, $src_file)) {
- warn "oneSIS: Error! Could not create symlink for $src: $!\n";
+ unless ($filename eq $src_file) {
+ chdir($dest_dir);
+ unless (symlink($filename, $src_file)) {
+ warn "oneSIS: Error! Could not create symlink for $src: $!\n";
+ }
}
}
else {
OneSIS Patch
Had to make a new patch (oneSIS), because the patch for debian was outdated. The final patch lives in /usr/share/oneSIS/distro-patches, and I use ~/patch as the working directory for making patches. To make a patch:
- copy the files to be patched to ~/patch/, making 2 copies of each file in this directory (filename and filename.orig)
- leave the .orig file alone, and open the copy with the actual file name (filename)
- make the desired changes to this file
- now make the patch. For each file in this directory that needs to be patched:
diff -uNr bootmisc.sh.orig bootmisc.sh >> patchfile
The double arrows mean append, so new patches don't overwrite the file patches already in the patchfile. Do this for every file you are changing. Now you need to adjust the path to the files to be patched: add etc/init.d/ in front of bootmisc.sh, for example. Don't worry about the timestamps.
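The diff and the path adjustment can probably be done in one step by letting diff write the adjusted names itself with -L (label). This is a sketch: the function name is made up, -L is supported by GNU diff but is not in every diff implementation, and labelled headers carry no timestamps at all. I use -uN rather than -uNr because only single files are compared.

```shell
#!/bin/sh
# Append a unified diff of FILE.orig vs FILE to a patchfile, with the
# header paths already prefixed (e.g. etc/init.d/bootmisc.sh).
# Arguments: file name, path prefix inside the image, patchfile
append_patch() {
    file=$1
    prefix=$2
    patchfile=$3
    diff -uN -L "$prefix/$file.orig" -L "$prefix/$file" \
        "$file.orig" "$file" >> "$patchfile"
    # diff exits 0 (identical) or 1 (differences found); 2 or more is an error
    [ $? -le 1 ]
}

# Hypothetical usage, run from ~/patch:
# append_patch bootmisc.sh etc/init.d patchfile
```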
After you make a new patch, be sure to edit sysimage.conf and run mk-sysimage again.