PXE boot notes (part 5, kickstart config)

By this point in the process we’ve had a lot of magic:

  • automagically gaining an IP address via BOOTP (and subsequently DHCP)
  • fetching a bootloader via TFTP
  • having the bootloader fetch a kernel and initial ramdisk via TFTP
  • preparing the HTTP distro environment and populating it

But that magic is nothing on the next bit.

If you’re not familiar with Kickstart – why not? Most likely you’re not using a Red Hat-derived distribution such as CentOS, Scientific Linux or Fedora. Or you are, but you’ve never had a need to simultaneously provision lots of bare-metal installs – I mean, who uses bare metal these days? It’s so 2010!

Oh, people like me.

Anyway – in Part 3 I detailed the config file for the PXE bootloader, which referred to a boot option:

inst.ks=http://10.30.110.40/ks.cfg

That should be pretty clear: go fetch a file called ks.cfg from that webserver.

I already detailed a Kickstart configuration file in Part 1, and this one is more or less the same with a couple of notable differences. Firstly, the method is different – rather than

install
cdrom

we have

install
url --url http://10.30.110.40/centos/
text

The ‘url’ option tells the installer (whose kernel and initrd were provided over TFTP) to fetch the installation tree from a local webserver rather than from physical media.

The next section is as follows:

reboot

lang en_GB.UTF-8
keyboard --vckeymap=gb --xlayouts='gb'
timezone --utc Europe/London

ignoredisk --only-use=sda

network --onboot yes --device enp6s0 --bootproto dhcp

authconfig --enableshadow --passalgo=sha512
rootpw --iscrypted [SHA512_HASH]

selinux --disabled

bootloader --location=mbr --driveorder=sda --append="crashkernel=auto rhgb quiet"

clearpart --all --drives=sda

part /boot --fstype=xfs --size=500
part pv.08002 --grow --size=1

volgroup vg_centos --pesize=4096 pv.08002
logvol / --fstype=ext4 --name=lv_root --vgname=vg_centos --grow --size=1024 --maxsize=204800
logvol swap --name=lv_swap --vgname=vg_centos --grow --size=4096 --maxsize=4096

%packages --nobase
openssh-server
%end

So nothing that special so far – the ‘reboot’ option means the machine reboots itself once the install completes rather than waiting for human intervention, which makes this a completely unattended installation.

And here is where, as they say, the magic happens: the post-installation section. This starts with ‘%post’ and usually runs in the chroot of the newly installed operating system (it is possible to override that if required, but doing so can make things considerably more complicated in terms of referencing the right paths).
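
For illustration only – this isn’t used here – the non-chroot variant looks something like the following, with the freshly installed system reachable under /mnt/sysimage (the file name is purely hypothetical):

%post --nochroot
# runs in the installer environment, not the new OS;
# the installed system is mounted at /mnt/sysimage
cp /tmp/some-file /mnt/sysimage/root/some-file
%end

Firstly: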

%post --log=/root/postinstall.log

echo "proxy=http://[PROXY_SERVER]:80" >> /etc/yum.conf

yum -y install strace sysstat at ntp kexec-tools numactl rng-tools wget lsof psmisc postfix iptables dd nfs-utils libaio libaio-devel gcc-c++ net-tools zip unzip tcsh
yum -y update

mkdir /root/postinst
cd /root/postinst
for x in iometer-dynamo.tgz startup_dynamo.sh iperf3-3.0.11-1.fc22.x86_64.rpm jre-8u66-linux-x64.rpm vdbench50403.zip kmod-enic-2.1.1.99-rhel.el7.centos.x86_64.rpm tput.sh sysctl.conf.10GbE sunrpc.conf ntp.conf; do wget http://10.30.110.40/local/$x; done

That section sets up a logfile, sets the proxy server for yum to use, installs a bunch of useful bits and pieces from the distro for this test platform, then fetches some local requirements which are then processed thus:

# local tools
for x in tput.sh startup_dynamo.sh
do
 cp $x /root && chmod 0700 /root/$x
done

# dynamo
(cd /tmp && tar xvpzf /root/postinst/iometer-dynamo.tgz)
echo "* * * * * root /root/startup_dynamo.sh" >> /etc/cron.d/startup_dynamo

# vdbench
unzip -d /usr/local/vdbench /root/postinst/vdbench50403.zip

mv ntp.conf /etc/ntp.conf
systemctl enable ntpd

yum -y localinstall iperf3-3.0.11-1.fc22.x86_64.rpm jre-8u66-linux-x64.rpm
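
The startup_dynamo.sh script itself isn’t reproduced here, but since that cron.d entry fires every minute it needs a guard so dynamo only gets launched once. A minimal sketch – assuming the tarball unpacks a dynamo binary into /tmp (where we extracted it), and with a purely illustrative controller name – might be:

#!/bin/sh
# bail out if dynamo is already running
pgrep -x dynamo > /dev/null && exit 0
# otherwise start it, pointing at the Iometer controller
# ('iometer-controller' is an illustrative name, not from this setup)
cd /tmp && ./dynamo -i iometer-controller -m $(hostname) &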

Now – this was being put together to do some performance testing of a big storage array, as I mentioned right back in Part 1. There were a few things to do which were specific to the testing methodology: we had 22 separate volumes created for each test machine, spread across a number of different devices, with two different NICs (each on a different VLAN, and therefore subnet) carrying the NFS traffic and one carrying ‘management’. That meant we had to get a bit creative on the addressing and mount front, which required a bit of shell scripting – and yes, there may be far more efficient ways to have done this, but hey ho – it worked. Firstly, setting the NICs up. This depends on the address the management port received over DHCP, and the hostname that was set via the same route (which ended in either an ‘a’ or a ‘b’):

# configure secondary NIC for NFS gubbins
ADDR=$( ip ad ls enp6s0 2>&1 | egrep "inet " )
IP=$( echo $ADDR | awk "{ print \$2 }" | sed "s/\/.\+//" )
IP1=$( echo $IP | sed 's/\(^.\+\)\.\([0-9]\+\)/\2/' )
IP2=$( echo $IP | sed 's/\(^.\+\)\.\([0-9]\+\)/\1/' )
echo $HOSTNAME | egrep -q a$
if [ $? -eq 0 ]
then
 MULT=12
 CLASS="prod1"
else
 MULT=106
 CLASS="prod2"
fi
IP3=$(( $IP1+$MULT ))
SECADDR1=$( echo $IP2 | sed 's/\.110/.114/' )
SECADDR2=$( echo $IP2 | sed 's/\.110/.111/' )

MOUNTIP1=$SECADDR1.$IP3
SECINTIP1=$SECADDR1.$IP1
MOUNTIP2=$SECADDR2.$IP3
SECINTIP2=$SECADDR2.$IP1
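
To make that concrete: suppose centos1a picked up 10.30.110.21 on its management NIC. Then:

# IP=10.30.110.21  ->  IP1=21, IP2=10.30.110
# hostname ends in 'a'  ->  MULT=12, CLASS=prod1
# IP3 = 21 + 12 = 33
# SECADDR1=10.30.114   SECADDR2=10.30.111
#
# MOUNTIP1 =10.30.114.33   # NFS server address on the .114 subnet
# SECINTIP1=10.30.114.21   # this host's address on the .114 subnet
# MOUNTIP2 =10.30.111.33   # NFS server address on the .111 subnet
# SECINTIP2=10.30.111.21   # this host's address on the .111 subnet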

cat <<EOF > /etc/sysconfig/network-scripts/ifcfg-enp12s0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=STATIC
DEFROUTE=no
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=no
IPV6_DEFROUTE=no
IPV6_FAILURE_FATAL=no
NAME=enp12s0
DEVICE=enp12s0
IPADDR=$SECINTIP1
PREFIX=24
MTU=9000

EOF

cat <<EOF > /etc/sysconfig/network-scripts/ifcfg-enp7s0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=STATIC
DEFROUTE=no
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=no
IPV6_DEFROUTE=no
IPV6_FAILURE_FATAL=no
NAME=enp7s0
DEVICE=enp7s0
IPADDR=$SECINTIP2
PREFIX=24
MTU=9000

EOF
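
Nothing needs restarting inside the %post chroot – those ifcfg files simply take effect when the freshly installed system boots. After first boot, a quick check confirms the addressing and jumbo frames took:

ip addr show enp12s0
ip addr show enp7s0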

After that wrangling, some more shell scripting to create the mount points and set up /etc/fstab to mount the appropriate NFS exports from the right device in the right place. Again, this is partially defined by the hostname, via the $CLASS variable we set above:

# sort out NFS mounts
NFSOPTS="rw,bg,vers=3,hard,proto=tcp,timeo=600,rsize=65536,wsize=65536,retry=2 0 0"

for x in {0..9}
do
mkdir /tmp/${CLASS}_${HOSTNAME}_${x}
 cat <<EOF >> /etc/fstab
${MOUNTIP1}:/${CLASS}_${HOSTNAME}_${x} /tmp/${CLASS}_${HOSTNAME}_${x} nfs $NFSOPTS
EOF
done

for x in {10..19}
do
mkdir /tmp/${CLASS}_${HOSTNAME}_${x}
 cat <<EOF >> /etc/fstab
${MOUNTIP2}:/${CLASS}_${HOSTNAME}_${x} /tmp/${CLASS}_${HOSTNAME}_${x} nfs $NFSOPTS
EOF
done

cat <<EOF >> /etc/fstab
${MOUNTIP1}:/${CLASS}_${HOSTNAME} /tmp/${CLASS}_${HOSTNAME} nfs $NFSOPTS
${MOUNTIP2}:/${CLASS}_${HOSTNAME}_clone /tmp/${CLASS}_${HOSTNAME}_clone nfs $NFSOPTS
EOF

mkdir -p /tmp/${CLASS}_${HOSTNAME}
mkdir -p /tmp/${CLASS}_${HOSTNAME}_clone
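
Carrying on the hypothetical centos1a example from above (CLASS=prod1), the generated /etc/fstab entries come out along these lines (first of each batch shown, options abbreviated):

10.30.114.33:/prod1_centos1a_0      /tmp/prod1_centos1a_0      nfs rw,bg,vers=3,...
10.30.111.33:/prod1_centos1a_10     /tmp/prod1_centos1a_10     nfs rw,bg,vers=3,...
10.30.114.33:/prod1_centos1a        /tmp/prod1_centos1a        nfs rw,bg,vers=3,...
10.30.111.33:/prod1_centos1a_clone  /tmp/prod1_centos1a_clone  nfs rw,bg,vers=3,...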

The next section creates some SSH keys and adds known keys to root’s ‘authorized_keys’ file – left as an exercise for the reader. We also add some options to /etc/rc.local (CPU affinity for certain tasks, for example), which I’ve left out as they’re a bit too environment-specific for a general document.

Finally, we create a consistent /etc/hosts and apply it so all the test machines know each other’s names and don’t need to use DNS:

cat <<EOF > /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.30.110.21 centos1a
10.30.110.22 centos2a
10.30.110.23 centos3a
10.30.110.24 centos4a
10.30.110.25 centos5a
10.30.110.26 centos6a
10.30.110.27 centos1b
10.30.110.28 centos2b
10.30.110.29 centos3b
10.30.110.30 centos4b
10.30.110.31 centos5b
10.30.110.32 centos6b

10.30.110.40 perf-mgmt
EOF

%end

And with that ‘%end’ option, that’s that.

Setting the machines to boot from the network (using the blade profile management tools) and rebooting them takes us from bare metal to a ready-made test platform in a little under 15 minutes.

Magic!

PS: One last step is to change the pxelinux.cfg file to boot from local disk after they’ve started installing otherwise they’ll just go round and round and round in circles, reinstalling themselves each time. That’s not a desired outcome! I’ve got a shell script to do that, which simply copies in the right file to the right place once the installs have reached the postinstall phase, but again I’ll leave that to the reader to work out. It isn’t that hard!
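
(A hint, if you want one: PXELINUX checks for a config file named after the client MAC address before falling back to ‘default’, so a sketch of the idea – not my actual script – is simply:)

# 'localboot' is an assumed template containing only the
# LOCALBOOT entry from the menu above
MAC=01-00-aa-bb-cc-dd-00   # ARP type prefix + MAC, colons to dashes
cp /var/lib/tftpboot/pxelinux.cfg/localboot \
   /var/lib/tftpboot/pxelinux.cfg/$MAC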

PXE boot notes (part 4, serving files)

Now we have a working DHCP server with the ability to allow clients to boot, and a TFTP server providing the PXE bootloader – which in turn references some other content that we haven’t yet detailed.

Remember the CentOS 7 Minimal ISO I referred to earlier? We need that on the boot server – so get it copied over. Once it’s there, mount it in a convenient place:

mkdir /tmp/minimal-iso
mount -o loop $PATH_TO_ISO/CentOS-7-x86_64-Minimal.iso /tmp/minimal-iso

We could simply refer to the paths under the mountpoint in our various configs from here, but disk space wasn’t exactly at a premium so I lazily copied the content to two locations – the TFTP root, and under the webserver root. There are about a million ways to do this reliably (cp, rsync, tar, etc.) but I’m a bit old and like tar, so:

for x in /var/www/html/centos /var/lib/tftpboot/centos
do
  mkdir -p $x
  ( cd /tmp/minimal-iso && tar cpf - . ) | ( cd $x && tar xvpf - )
done

Yes, it’s not very efficient, but humour me – with hundreds of TB of storage to play with in this system I could afford to be profligate!
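
(If you’d rather not humour me, an rsync equivalent of that loop would be:)

for x in /var/www/html/centos /var/lib/tftpboot/centos
do
  mkdir -p $x
  rsync -a /tmp/minimal-iso/ $x/
done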

That puts in place the files that pxelinux.cfg/default referred to in the previous post, so now the client machines can grab not only a bootloader but a kernel and initial ramdisk too. Good.

Webserver configuration is uncomplicated: we left it at the package defaults and placed everything in the preconfigured DocumentRoot of /var/www/html. The only thing left to do with the webserver itself is to enable and start it:

systemctl enable httpd
systemctl start httpd
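
A quick sanity check from anywhere on the subnet confirms the installer tree is actually reachable over HTTP:

curl -I http://10.30.110.40/centos/images/pxeboot/vmlinuz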

And there was one more thing referred to in the PXE config file: ks.cfg, the actual Kickstart configuration for the test boxes. That’ll be described in the next post.

PXE boot notes (part 3, TFTP)

So in part 2 we set up the DHCP server. It referenced a TFTP server in its configuration, so now we need to get the TFTP server ready to do some neat stuff.

In CentOS 7, the TFTP service starts by default from xinetd, so we need to edit /etc/xinetd.d/tftp and set ‘disable’ to ‘no’. Then kick the xinetd service:

systemctl enable xinetd
systemctl start xinetd
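
For reference, the relevant stanza in /etc/xinetd.d/tftp should end up looking something like this (abridged) – ‘disable’ is the only line that needs changing from the package default:

service tftp
{
        socket_type     = dgram
        protocol        = udp
        wait            = yes
        user            = root
        server          = /usr/sbin/in.tftpd
        server_args     = -s /var/lib/tftpboot
        disable         = no
}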

And now we need some extra help, starting with another package – syslinux.

yum -y install syslinux

…and we need some files from that package copied to the TFTP root, in our case /var/lib/tftpboot:

cd /usr/share/syslinux
cp -p pxelinux.0 menu.c32 memdisk mboot.c32 chain.c32 /var/lib/tftpboot/

Now that bit’s done (and note that pxelinux.0 is the file the DHCP server referenced in the last post), we also need a config file. There are several filenames under the TFTP root that a PXE-booting client will try in turn, which I won’t go into in detail here as we’re going to use the default. So we:

mkdir -p /var/lib/tftpboot/pxelinux.cfg/
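
(A little detail for the curious: before falling back to ‘default’, PXELINUX tries a file named after the client MAC address prefixed with the ARP type ‘01’, then progressively shorter hex encodings of the client IP. For centos1a that search would look like:)

pxelinux.cfg/01-00-aa-bb-cc-dd-00    # MAC-based, colons become dashes
pxelinux.cfg/0A1E6E15                # 10.30.110.21 as hex
pxelinux.cfg/0A1E6E1                 # ...then one digit shorter each try
pxelinux.cfg/default                 # our catch-all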

In there, we need a file called ‘default’. That contains the following (for an automated installation, anyway):

default menu.c32
prompt 0
timeout 50
ONTIMEOUT kickstart

MENU TITLE PXE Menu

LABEL kickstart
 MENU LABEL Kickstart
 kernel /centos/images/pxeboot/vmlinuz
 append initrd=/centos/images/pxeboot/initrd.img inst.ks=http://10.30.110.40/ks.cfg ip=dhcp

LABEL run
 MENU LABEL run (local boot)
 LOCALBOOT 0
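
One note on the values there: syslinux counts ‘timeout’ in tenths of a second, so ‘timeout 50’ gives the menu five seconds before ‘ONTIMEOUT kickstart’ fires and the unattended install begins.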

But… where do all those other files come from? You know, the kernel, initrd, and that pesky ks.cfg? Read on!

PXE boot notes (part 2, DHCP)

So, having deployed a VM in part 1, I needed to make it do useful stuff…

Running through a PXE boot process in order, what happens first? Well, the device utilising PXE needs to get on the network – so firstly it needs an ethernet card (obviously). We’ll leave that bit as an exercise for the reader; in my case it was a blade system so we made sure the appropriate profile was applied to the hardware we were using and gave it an ethernet card (more than one, actually).

And when it boots, the PXE process starts and looks for… a BOOTP server. So here’s where the DHCP server comes in.

The basics, in /etc/dhcp/dhcpd.conf (locations may vary according to distro in use):

option domain-name "[YOUR DOMAIN HERE]";
option domain-name-servers [COMMA SEPARATED NAMESERVER LIST];
default-lease-time 86400;
max-lease-time 604800;
ddns-update-style none;

allow booting;                            # <- REALLY IMPORTANT
allow bootp;                              # <- REALLY IMPORTANT
option option-128 code 128 = string;
option option-129 code 129 = text;
next-server 10.30.110.40;                 # <- TFTP server
filename "/pxelinux.0";                   # <- PXE loader

authoritative;

subnet 10.30.110.0 netmask 255.255.255.0 {
 option routers 10.30.110.1;
 range 10.30.110.41 10.30.110.49;
}

host centos1a { hardware ethernet 00:AA:BB:CC:DD:00; fixed-address 10.30.110.21; option host-name "centos1a"; }
host centos2a { hardware ethernet 00:AA:BB:CC:DD:01; fixed-address 10.30.110.22; option host-name "centos2a"; }
host centos3a { hardware ethernet 00:AA:BB:CC:DD:02; fixed-address 10.30.110.23; option host-name "centos3a"; }
host centos4a { hardware ethernet 00:AA:BB:CC:DD:03; fixed-address 10.30.110.24; option host-name "centos4a"; }
host centos5a { hardware ethernet 00:AA:BB:CC:DD:04; fixed-address 10.30.110.25; option host-name "centos5a"; }
host centos6a { hardware ethernet 00:AA:BB:CC:DD:05; fixed-address 10.30.110.26; option host-name "centos6a"; }
host centos1b { hardware ethernet 00:AA:BB:C1:DD:00; fixed-address 10.30.110.27; option host-name "centos1b"; }
host centos2b { hardware ethernet 00:AA:BB:C1:DD:01; fixed-address 10.30.110.28; option host-name "centos2b"; }
host centos3b { hardware ethernet 00:AA:BB:C1:DD:02; fixed-address 10.30.110.29; option host-name "centos3b"; }
host centos4b { hardware ethernet 00:AA:BB:C1:DD:03; fixed-address 10.30.110.30; option host-name "centos4b"; }
host centos5b { hardware ethernet 00:AA:BB:C1:DD:04; fixed-address 10.30.110.31; option host-name "centos5b"; }
host centos6b { hardware ethernet 00:AA:BB:C1:DD:05; fixed-address 10.30.110.32; option host-name "centos6b"; }

The platform was using 6 blades in each location, each in a different chassis, so their base MAC addresses differed slightly.
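
Before starting the daemon it’s worth getting dhcpd to syntax-check the file – it catches missing semicolons and similar typos:

dhcpd -t -cf /etc/dhcp/dhcpd.conf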

A quick

systemctl enable dhcpd
systemctl start dhcpd

and we’re up and running. Sort of. Obviously we haven’t done the other magic bits yet, but we’ll get to them shortly – in Part 3.

PXE boot notes (storage test platform)

At work I’ve recently had the task of setting up a performance test platform for a storage project. Without going into vast amounts of detail of the hardware, the physical environment was a $VENDOR1 clustered & replicated storage platform split across two locations with identical systems at either end, in conjunction with a blade system from $VENDOR2.

Conveniently, with help from colleagues in our network team, the whole system was put into private (RFC1918) network space with a number of VLANs for different tasks. The VLANs spanned both locations.

One blade was provisioned as a VMware ESXi host with a couple of CentOS 7 virtual machines on it. One of these VMs was set up as a DHCP, TFTP, web and management server to run performance testing applications across the subsequent test platform.

This post is my notes – a HOWTO, if you like – on the setup of the platform.


Firstly, a very ‘skinny’ CentOS 7 install was put on the VM using a mounted ISO image from my local machine (CentOS-7-x86_64-Minimal). I booted it to the bootloader, interrupted the boot process and told it to use a Kickstart configuration on a local webserver:

#version=DEVEL
# System authorization information
auth --enableshadow --passalgo=sha512
# Use CDROM installation media
cdrom
# Use graphical install
graphical
# Don't run the Setup Agent on first boot
firstboot --disable
ignoredisk --only-use=sda
# Keyboard layouts
keyboard --vckeymap=gb --xlayouts='gb'
# System language
lang en_GB.UTF-8

# Network information
network --bootproto=static --device=eth0 --gateway=10.30.110.1 --ip=10.30.110.40 --nameserver=[LIST_REMOVED] --netmask=255.255.255.0 --noipv6 --activate
network --hostname=perf-mgmt

# Root password
rootpw --iscrypted [SHA_REMOVED]
# System timezone
timezone Europe/London --isUtc
# System bootloader configuration
bootloader --append=" crashkernel=auto" --location=mbr --boot-drive=sda
# Partition clearing information
clearpart --none --initlabel
# Disk partitioning information
part pv.547 --fstype="lvmpv" --ondisk=sda --size=101899
part /boot --fstype="xfs" --ondisk=sda --size=500
volgroup centos --pesize=4096 pv.547
logvol swap --fstype="swap" --size=32768 --name=swap --vgname=centos
logvol / --fstype="xfs" --size=69127 --name=root --vgname=centos

%packages
@^minimal
@core
kexec-tools

%end

%addon com_redhat_kdump --enable --reserve-mb='auto'

%end

All reasonably default so far. That file was put on a local webserver, accessible to the new VM, and the installer was told to load it by adding the following options to the appropriate line in the bootloader:

inst.ks=http://webserver/ks.cfg ip=10.30.110.40::10.30.110.1:255.255.255.0:perf-mgmt:eth0:none
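
(For reference, the fields in that dracut-style ‘ip=’ option are client-IP:server-IP:gateway:netmask:hostname:interface:autoconf – the empty second field is fine here, and ‘none’ switches autoconfiguration off since everything has been supplied statically.)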

Once it had installed and booted, useful packages were then installed using yum:

dhcp
httpd
xinetd
tftp-server
ntp
ntpdate
mlocate
tcpdump
wget
curl
rsync
[group] Development Tools

…and a few other bits & pieces.
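
In practice that boils down to a couple of yum invocations, something like:

yum -y install dhcp httpd xinetd tftp-server ntp ntpdate \
  mlocate tcpdump wget curl rsync
yum -y groupinstall "Development Tools"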

Now we get on to the really meaty bit: turning this into a DHCP, TFTP and webserver that can build all the other blades without user interaction. See the next post!

Ooh. A Google Drive CLI tool that works properly!

In a past post I mentioned gsync, which I started using to do offsite data sync from my Linux boxes and NAS at home to Google Drive.

It worked, after a fashion, but I found it wasn’t quite perfect and I struggled to get the changes I needed sorted out – I’m heavily invested in a couple of other FLOSS projects and the time I needed to give to gsync just wasn’t there. So I stopped using it and went back to holding my breath…

Until earlier in the week, when I ran across rclone. Written in Go (not a language I’m familiar with at all), it looked like a useful bit of kit – but I wondered how well maintained it was, and whether it was already suffering API bitrot, like a number of other sync clients which have foundered in the past.

I needn’t have worried. For the first time, I was able to download, install, configure and run a tool like this without patching; I didn’t have to worry about MIME types; and it’s now merrily uploading almost 1 terabyte of data (which I reckon will take about two weeks, thanks to Virgin Media’s traffic management policy).
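
For anyone wanting to try it, the workflow really is that short – something like the following, where ‘gdrive’ is whatever name you give the remote during configuration and the paths are illustrative:

# one-off interactive setup of the Google Drive remote
rclone config

# then push the local tree up to it
rclone sync /mnt/nas/data gdrive:backup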

The most interesting thing about it is the extensible support for multiple cloud storage platforms. Right now it has support for:

  • Google Drive
  • Amazon S3
  • Openstack Swift / Rackspace cloud files / Memset Memstore
  • Dropbox
  • Google Cloud Storage
  • The local filesystem

This looks like it’s going to form an exceedingly useful weapon in my work and home tech armoury. I might look into extending it towards Microsoft’s OneDrive API, if I’ve got time, because that will provide a useful “missing link” at work (where we have both Office365 and Google platforms in use alongside on-premise storage).

More news as I have it.

Hi, nice people from the Nagios forums…

The stats on my posts about Exim+Logstash have gone a bit daft in the last week or so.

It appears that a nice person on the Nagios Forums answered a question about using NLS with Exim by linking to here (thanks!).

If you’re visiting from those forums, please note that I’m more than happy to answer questions (and really rather flattered that you’re asking) but… *please* make sure you read all the posts, in order, and grab the associated examples, config files and/or patches from Github. There are links right there in the articles.

If you do that then you should be good to go. If not, ask away.