Running a Kubernetes native x86_64 application on Raspberry Pis, and why you shouldn’t!
While this guide does work, don’t do this. Emulation is slow, and Raspberry Pi CPUs (even overclocked to 2GHz) are slow to begin with!
Also, you have to use full x86_64 emulation (rather than KVM) because Pis use an ARM instruction set rather than x86.
CPU benchmark for slowness
This benchmark is terrible, but it gives an indication of single-threaded performance.
# On my work machine
time dd if=/dev/urandom of=/dev/null bs=2000000 count=100
real 0m4.264s
user 0m0.000s
sys 0m4.264s
# On a VM on k3s-005
$ time dd if=/dev/urandom of=/dev/null bs=2000000 count=100
real 0m 19.08s
user 0m 0.03s
sys 0m 18.94s
# Natively on that Pi
pi@node-005:~ $ time dd if=/dev/urandom of=/dev/null bs=2000000 count=100
real 0m2.784s
user 0m0.004s
sys 0m2.769s
Architecture
Equipment in use
- 4 * Pi 4 8GB
- SSD enclosure per Pi supporting UASP
- SSD per Pi
- Ethernet switch
- A router
- Power
- misc cables
OSs in use
- Raspbian 64 bit on the Pis
- Alpine in the x86_64 VMs
Setting up the Pis
We installed 64 bit Raspbian on the SSDs and booted from them.
Installing the dependencies
sudo apt update && sudo apt upgrade -y && sudo apt install -y qemu nmon virtinst qemu-utils qemu-system-x86 tmux vim dnsmasq-utils dnsmasq-base iptables libvirt-daemon-system
Overclocking the Pis
The Raspberry Pi doesn’t have the fastest CPU, so we overclocked it to 2GHz with over_voltage set to 6, which still keeps the warranty. Full guide: https://www.seeedstudio.com/blog/2020/02/12/how-to-safely-overclock-your-raspberry-pi-4-to-2-147ghz/
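As a rough sketch (values as described in the linked guide; adjust for your cooling), the overclock lives in /boot/config.txt:
# /boot/config.txt — assumed values for a 2GHz overclock with adequate cooling
over_voltage=6
arm_freq=2000
# Reboot for the settings to take effect
sudo reboot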
Setting up the QEMU groups
sudo groupadd libvirt-qemu
sudo groupadd libvirt
sudo groupadd libvirtd
sudo useradd -g libvirt-qemu libvirt-qemu
Setting up swap (in case it’s needed, since 8GB of RAM isn’t much)
# Create and mount swap
sudo fallocate -l 64G /swapfile && sudo chmod 600 /swapfile && sudo mkswap /swapfile && sudo swapon /swapfile
Next add an entry to fstab so that the swap is mounted on boot.
sudo su -c "echo '/swapfile swap swap defaults 0 0' >> /etc/fstab"
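A quick sanity check (not in the original steps) to confirm the 64GB swapfile is active:
# The swapfile should appear here with the expected size
swapon --show
free -h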
Next, check that TRIM works on the Pi. If it’s working, you should get something similar to the following:
sudo fstrim -av
/boot: 0 B (0 bytes) trimmed
/: 19.9 GiB (21294051328 bytes) trimmed
If this fails, you’ll need to follow https://www.jeffgeerling.com/blog/2020/enabling-trim-on-external-ssd-on-raspberry-pi
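For reference, the fix in that guide boils down to telling the kernel that the USB-to-SATA bridge supports unmap. A rough sketch, with placeholder USB vendor/product IDs (find yours with lsusb); the exact rule depends on your adapter, so follow the guide:
# Find the enclosure's USB IDs (e.g. 152d:0578) — yours will differ
lsusb
# Hypothetical udev rule (placeholder IDs) enabling unmap for that adapter
sudo tee /etc/udev/rules.d/10-trim.rules <<'EOT'
ACTION=="add|change", ATTRS{idVendor}=="152d", ATTRS{idProduct}=="0578", SUBSYSTEM=="scsi_disk", ATTR{provisioning_mode}="unmap"
EOT
sudo reboot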
Due to the high volume of writes expected, since the SSD will effectively be used as RAM via swap, configure TRIM to run frequently to prevent rapid deterioration of the SSD.
# trimming every 2 min:
sudo vi /lib/systemd/system/fstrim.timer
# change:
[Timer]
OnCalendar=*:0/2
AccuracySec=0
sudo systemctl daemon-reload
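After the reload, you may also want to restart the timer and confirm the new schedule took effect (our addition, not part of the original unit file):
sudo systemctl restart fstrim.timer
systemctl list-timers fstrim.timer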
Setting up bridge networking so the VMs can connect directly (helpful for k8s)
Largely following https://www.raspberrypi.com/documentation/computers/configuration.html#bridging
sudo su -c "echo '[NetDev]
Name=br0
Kind=bridge
' >> /etc/systemd/network/bridge-br0.netdev"
sudo su -c "echo '[Match]
Name=eth0
[Network]
Bridge=br0
' >> /etc/systemd/network/br0-member-eth0.network"
sudo systemctl enable systemd-networkd
Next, ensure that eth0 is a member of the bridge, and that DHCP runs on the bridge rather than on the other interfaces.
sudo vim /etc/dhcpcd.conf
# At the top, add the following (without the leading #):
# denyinterfaces wlan0 eth0
# At the very bottom, add (without the leading #):
# interface br0
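After a reboot, a quick way to confirm the bridge is up is to check that br0 holds the Pi’s LAN address and that eth0 shows up as a member with no address of its own:
ip addr show br0
bridge link show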
Next, let QEMU use this bridge, largely following https://wiki.archlinux.org/title/QEMU#Bridged_networking_using_qemu-bridge-helper
sudo mkdir /etc/qemu
# Alternatively, put all of the above into a script (run as root):
#!/bin/bash
cat <<EOT > /etc/systemd/network/bridge-br0.netdev
[NetDev]
Name=br0
Kind=bridge
EOT
cat <<EOT > /etc/systemd/network/br0-member-eth0.network
[Match]
Name=eth0
[Network]
Bridge=br0
EOT
sed -i '1s/^/denyinterfaces wlan0 eth0 \n/' /etc/dhcpcd.conf
echo "interface br0" >> /etc/dhcpcd.conf
if [[ ! -d "/etc/qemu" ]]; then
mkdir /etc/qemu
fi
echo "allow br0" > /etc/qemu/bridge.conf
# then enable networkd so the bridge comes up on boot
systemctl enable systemd-networkd
Running the x86 VM
For the VM we used Alpine as it’s a light OS and compatible with k3s. You can choose your download from https://alpinelinux.org/downloads/
wget https://dl-cdn.alpinelinux.org/alpine/v3.15/releases/x86_64/alpine-virt-3.15.0-x86_64.iso
Next you’ll have to create a disk image for the Alpine VM. We created a 128GB sparse image, given the size of the SSDs we used.
qemu-img create -f qcow2 alpine.qcow2 128G
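The run command below boots the existing disk; for the very first boot you’ll need the ISO attached so you can run setup-alpine and install onto the qcow2. A minimal sketch (the -cdrom/-boot d flags and the smaller memory size are our assumption for the install step, not from the original guide):
# First boot only: attach the ISO, install Alpine onto the disk, then shut down
sudo qemu-system-x86_64 --name alpine-install -drive file=/home/pi/alpine.qcow2 \
  -cdrom /home/pi/alpine-virt-3.15.0-x86_64.iso -boot d \
  -smp cpus=4 -m 4G -accel tcg,thread=multi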
Now to run the VM. This command must be run from a display because it attempts to open a window; it is possible to do this in a headless fashion, but we couldn’t get that working. Each VM needs a unique MAC address for the networking to work correctly.
sudo qemu-system-x86_64 --name alpine-node -drive file=/home/pi/alpine.qcow2 -smp cpus=4 -m 60G,slots=4,maxmem=61G -accel tcg,thread=multi -nic bridge,br=br0,model=virtio-net-pci,mac=52:54:00:12:34:50
Exciting options:
- -accel tcg,thread=multi enables the VM to use multiple host cores
- model=virtio-net-pci provides a virtual NIC that works with the bridge configured earlier
You should ensure each of these VMs has a unique hostname and a static IP in your router. This will make k3s configuration much easier.
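For example, on the next Pi you might run the same command with only the name and MAC changed (the name and MAC here are illustrative):
# On the second Pi — identical flags, but a different name and MAC
sudo qemu-system-x86_64 --name alpine-node-2 -drive file=/home/pi/alpine.qcow2 -smp cpus=4 \
  -m 60G,slots=4,maxmem=61G -accel tcg,thread=multi \
  -nic bridge,br=br0,model=virtio-net-pci,mac=52:54:00:12:34:51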
Configuring k3s
For our setup, one Pi ran the k3s control plane natively (and didn’t have a VM running) and the other Pis each had a VM running. Those VMs were joined to the k3s cluster as worker nodes.
Before installing on the control node, follow https://rancher.com/docs/k3s/latest/en/advanced/#enabling-legacy-iptables-on-raspbian-buster to set up iptables correctly and https://rancher.com/docs/k3s/latest/en/advanced/#enabling-cgroups-for-raspbian-buster to set up cgroups correctly.
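For reference, those two pages boil down to roughly the following on Raspbian (check the linked docs in case they have changed):
# Switch Raspbian to legacy iptables (per the k3s docs)
sudo iptables -F
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
# Enable memory cgroups by appending to the single line in /boot/cmdline.txt, then reboot
sudo su -c "sed -i '1s/$/ cgroup_memory=1 cgroup_enable=memory/' /boot/cmdline.txt"
sudo reboot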
Before installing on the (virtual) worker nodes, follow https://rancher.com/docs/k3s/latest/en/advanced/#additional-preparation-for-alpine-linux-setup
Installing the control-plane
Unfortunately, the easiest way to install k3s with systemd is running random scripts from the internet…
sudo -i
apt install -y curl
curl -sfL https://get.k3s.io | sh -
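The worker join later on needs a token from the control plane; on k3s it lives at a well-known path:
# On the control-plane node: this value becomes K3S_TOKEN for the workers
sudo cat /var/lib/rancher/k3s/server/node-token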
Preventing pods from running on the control plane
It’s generally a bad idea to run workloads on your control plane, especially in this case as our workloads will be x86 but the control plane is ARM64.
kubectl taint nodes node-001 node-role.kubernetes.io/master=true:NoSchedule
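To confirm the taint landed (node name as in the command above):
kubectl describe node node-001 | grep Taints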
Adding worker nodes
Pretty simple; full guide here.
apk add curl && curl -sfL https://get.k3s.io | K3S_URL=https://node-001:6443 K3S_TOKEN=SomeTokenFromControlPlane sh -
With this, the x86_64 worker nodes should be added to the cluster. Each node should have a unique hostname and be reachable on that hostname from all other nodes.
Success?
If you’re successful, you should have a working k3s cluster with workloads running on x86 (slowly!)
pi@node-001:~ $ kubectl get nodes
NAME STATUS ROLES AGE VERSION
node-001 Ready control-plane,master 105m v1.21.5+k3s2
k3s-002 Ready control-plane,master 100m v1.21.5+k3s2
k3s-003 Ready control-plane,master 90m v1.21.5+k3s2
k3s-004 Ready control-plane,master 80m v1.21.5+k3s2
k3s-005 Ready control-plane,master 70m v1.21.5+k3s2
Why you shouldn’t do this
Over 1.5 cores of the Pi are used by k3s, never mind the workloads we want to run!
Here’s a simple pod which didn’t manage to start even after 5 minutes, due to the CPU limitations.
pi@node-001:~ $ kubectl get pod -w
NAME READY STATUS RESTARTS AGE
kubegres-controller-manager-75b6765589-kvr97 1/2 ContainerCreating 0 4m54s
Conclusions
Don’t run x86 on Pis, especially not on Kubernetes! They just aren’t fast enough (yet).