Raspberry Pi 4: boot from USB with Ubuntu, ZFS

Helmut Neukirchen, 18. November 2020

First steps

When I was new to Raspberry Pi, I followed these German instructions.

Running Raspberry Pi on SD card and syslog wear

By default, the Raspberry Pi syslog logs to the normal file system, i.e. to the SD card (if not using USB).
To log to RAM instead and write to the file system only when needed, you can use Log2RAM, installing it either manually or via apt. Alternatively (I did not try this myself), use Zram-config.

In any case, you might want to use USB storage instead of an SD card.

Booting Raspberry Pi from USB

Raspberry Pi now supports booting from USB. Having the latest firmware installed does no harm: I updated it by booting Raspberry Pi OS from SD card. (Note that, in contrast to Raspberry Pi <4, where the GPU firmware loads the firmware as a file from the FAT boot partition at every boot, Raspberry Pi 4 actually stores the firmware in an EEPROM.) I then dd'ed the image from the SD card to USB mass storage.

Booting Ubuntu from USB

If you want to go for Ubuntu (available as a stable 64-bit release with ZFS support, whereas 64-bit Raspberry Pi OS is only available as a beta), the following is relevant:

Ubuntu for Raspberry Pi uses U-Boot as bootloader. However, U-Boot does not support booting from USB on Raspberry Pi, only from SD card. So while Ubuntu 20.04 LTS works out of the box when booting from SD card, booting failed after I dd'ed the SD card onto a USB drive, because U-Boot could not load the kernel via USB.

Luckily, the bootloader that is part of the Raspberry Pi firmware can boot Ubuntu without U-Boot. As always, a FAT format boot partition is needed that contains a couple of files in order to boot.

While U-Boot can load compressed (vmlinuz) kernel images and can load the kernel from an ext4 root filesystem, the Raspberry Pi bootloader firmware can only load uncompressed (vmlinux) kernel images and only from the FAT-based boot filesystem.

While the Ubuntu 20.04 LTS ARM64 image has a kernel on the FAT-based boot partition, it is unfortunately compressed (because the assumed U-Boot would be able to deal with it). Hence, you need to uncompress the kernel manually to allow the Raspberry Pi firmware bootloader to load and start the kernel.

In addition, it seems that the .dat and .elf files that are part of the bootstrapping need to be the most recent ones.

Hence, I downloaded the whole Raspberry Pi firmware from GitHub (via the green Code button) and extracted the .dat and .elf files from the boot directory.

Finally, you need to change config.txt by adding the following two lines to the [all] section and commenting out the [pi4] section:
kernel=vmlinux
initramfs initrd.img followkernel

That should be enough to boot Ubuntu from USB. My above steps are essentially based on https://eugenegrechko.com/blog/USB-Boot-Ubuntu-Server-20.04-on-Raspberry-Pi-4 where you find step-by-step instructions.

Note that when you later do a kernel update inside the booted Ubuntu, it might only update the kernel image in the ext4 root partition -- not in the FAT boot partition. In this case, you need to copy the kernel over to the FAT boot partition and decompress it again.
(It should be possible to automate this; to get an idea, see: https://krdesigns.com/articles/Boot-raspbian-ubuntu-20.04-official-from-SSD-without-microsd or https://medium.com/@zsmahi/make-ubuntu-server-20-04-boot-from-an-ssd-on-raspberry-pi-4-33f15c66acd4.)
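As a rough idea of what such automation could look like, here is a minimal sketch of a shell function that decompresses an updated kernel onto the FAT boot partition. It assumes that the kernel image is gzip-compressed; the function name sync_kernel and the paths in the usage note are my own placeholders, not something taken from the linked articles.

```shell
# sync_kernel SRC_VMLINUZ DST_VMLINUX
# Decompress the gzip-compressed kernel SRC into DST, but only if DST is
# missing or older than SRC (hypothetical helper, for illustration only).
sync_kernel() {
    src=$1
    dst=$2
    if [ ! -e "$dst" ] || [ "$src" -nt "$dst" ]; then
        # Write to a temporary file first so that a failed decompression
        # never leaves a truncated kernel in place.
        gzip -dc "$src" > "$dst.new" && mv "$dst.new" "$dst"
    fi
}
```

It could be called after a kernel update, e.g. as sync_kernel /boot/vmlinuz /boot/firmware/vmlinux, assuming the FAT boot partition is mounted at /boot/firmware (check where it is mounted on your system).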

If you want to use the system headless (in fact, connecting a keyboard did not produce any input in my case), you can configure the network settings via the FAT-based boot partition: https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#3-wifi-or-ethernet

ZFS

Work in progress...

The ultimate goal is to have two drives as ZFS mirrors (RAID1) connected via USB.

(Be aware: 1. if the USB adapter claims that data has been written when in fact it has not yet been written, ZFS may fail -- just like probably any journaling-based file system; 2. USB is not as stable as SATA, so an ODROID-HC4 or a Raspberry Pi 4 compute module with PCIe-based SATA might be better, or a Helios64, which might in future even have ECC RAM. But at least the Raspberry Pi has the better ecosystem, and ZFS has a memory debug flag that does checksums for its RAM buffers.)

While https://www.nasbeery.de/ has some very easy script to use ZFS, it still assumes an SD card for the boot and root filesystem. It would of course be better to have everything on the USB drive (and even using RAID).

As the Raspberry Pi bootloader can only access a FAT-based boot partition, we still need a FAT-based boot partition on the USB drive. According to the documentation, if the first probed USB drive does not have a boot partition, the next drive will be probed. So, it should be possible to have some sort of redundancy here (but we need to manually take care that both FAT-based boot partitions are synced after each kernel update to have some sort of RAID1).
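The manual syncing could be scripted. The following is only a sketch under the assumption that both FAT boot partitions are mounted as ordinary directories; the function name sync_boot and the mount points in the usage note are made up for illustration.

```shell
# sync_boot SRC_DIR DST_DIR
# Copy the contents of the primary boot partition onto the second one and
# verify the result (hypothetical helper, for illustration only).
sync_boot() {
    src=$1
    dst=$2
    # "SRC/." copies the contents of SRC (including subdirectories such as
    # overlays) into DST instead of creating DST/SRC.
    cp -R "$src"/. "$dst"/ || return 1
    # diff -r returns non-zero if the two trees still differ, for example
    # because stale files were left over on the copy.
    diff -r "$src" "$dst"
}
```

Usage could look like sync_boot /boot/firmware /mnt/boot2, where /mnt/boot2 stands for wherever the second drive's boot partition is mounted. Note that this sketch does not delete files that exist only on the copy.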

As Ubuntu should be able to have the root partition on ZFS (once the Raspberry Pi firmware bootloader loaded the kernel from the FAT-based boot partition), it should be possible to use ZFS as root partition (what size? 50GB?). The remainder could then be a ZFS data pool.

Note that if one of the RAID1 drives fails and needs to be replaced, the new drive might have slightly fewer sectors, so it is wise not to use all available space for the ZFS data pool. If we use a swap partition in addition anyway, we could use it to fill the remaining space (and then have a slightly smaller swap partition on the replacement drive if that drive is smaller). The swap partition should not be on ZFS but a raw swap partition: Linux can either use multiple swap partitions, i.e. from all of the RAID drives -- or only use one and keep the others unused.

This means that we still partition the mass storage instead of letting ZFS use it exclusively. The Raspberry Pi bootloader understands only the MBR partition format -- this might limit drive size to 2 TB.

The following web pages cover ZFS as root:

Compiling ZFS for Raspberry Pi OS (first part), switching to ZFS root (second part), including initramfs and kernel cmdline.txt

https://github.com/jrcichra/rpi-zfs-root

https://www.reddit.com/r/zfs/comments/ekl4e1/ubuntu_with_zfs_on_raspberry_pi_4/

USB adapters

I used USB to SATA adapters with the ASMedia ASM1153E as these work with Raspberry Pi and UASP.

TODO:
When checking dmesg after connecting a USB drive, it did spit out some warning that could be ignored.

TODO: performance tuning

https://www.jeffgeerling.com/blog/2020/raspberry-pi-usb-boot-uasp-trim-and-performance

https://www.jeffgeerling.com/blog/2020/enabling-trim-on-external-ssd-on-raspberry-pi

However that made in fact everything slower (TODO: speed testing via hdparm, dd).

Debian 10 Buster Linux on Thinkpad T14 AMD

Helmut Neukirchen, 12. November 2020

Update: the text below refers to Debian 10 "Buster". Now that Debian 11 "Bullseye" has been released, which has Linux kernel 5.10, things should work out of the box. I just did a dist-upgrade from Buster to Bullseye, which was the smoothest dist-upgrade that I ever had, i.e. no problems at all, except that I had to do an apt-get install linux-image-amd64 to get the standard Bullseye kernel (my kernel installed manually under Buster confused VirtualBox, which complained about missing matching kernel header source files).

The Thinkpad T14 AMD is a very nice machine and everything works with Linux (I did not test the fingerprint reader and infrared camera, though). I opted for the T14 over the slimmer T14s, because the T14s has no full sized Ethernet port and it seems that (due to being slimmer) the cooling is not as good as with the T14.

Kernel 5.9 (or later) is a must to support all the hardware of Thinkpad T14 AMD, but the 4.x kernel used by the installer of Debian Buster is sufficient to do the installation, except that Wifi does not work, so you need an Ethernet cable connection during installation.

To get the 5.9 kernel in Debian Buster, it is at time of writing available from Sid (there are various ways to use packages from Sid in Stable -- in the simplest case, download the debs manually), and the following packages are needed:

  • linux-image-5.9*-amd64*
  • firmware-linux-free*
  • firmware-linux-nonfree*
  • firmware-misc-nonfree*
  • firmware-amd-graphics*
  • amd64-microcode* (Checking that the UEFI/BIOS is most recent before using a Microcode upgrade is recommended. But the update might be anyway blocked via /etc/modprobe.d/amd64-microcode-blacklist.conf)

In principle, the kernel header files are also nice to have, but it may involve updating to a completely new GCC from Sid:

  • linux-headers-5.9*-amd64*
  • linux-headers-5.9*-common*

Note that you will not automatically get security updates if you update the kernel manually. You may want to give APT pinning a try to take only the kernel from Sid.
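To give an idea of what such pinning could look like, here is a sketch that writes an APT preferences file. I did not use this setup myself; the priorities, the package pattern and the file name are assumptions for illustration, and Sid must additionally be configured as an APT source.

```shell
# write_kernel_pin DEST_FILE
# Write an APT preferences file that keeps Sid at low priority in general
# but lets kernel images from Sid compete with Stable
# (hypothetical helper, for illustration only).
write_kernel_pin() {
    dest=$1
    cat > "$dest" <<'EOF'
Explanation: Keep packages from unstable (Sid) at low priority by default
Package: *
Pin: release a=unstable
Pin-Priority: 100

Explanation: Allow kernel images to be taken from unstable (Sid)
Package: linux-image-*
Pin: release a=unstable
Pin-Priority: 500
EOF
}
```

It could be used as write_kernel_pin /etc/apt/preferences.d/sid-kernel (as root), followed by apt-get update. The firmware packages listed above would need similar entries if they should also come from Sid.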

Update: a Buster backport of a more recent kernel is available now; install it via apt-get install -t buster-backports linux-image-amd64 (assuming you have backports configured). Backports also provide all the above packages, including the Linux headers.

Note that for the Intel AX200 Wifi to work, you also need the latest firmware-iwlwifi (the one from buster-backports is enough -- the one from Sid should not be necessary):
apt-get install --target-release=buster-backports firmware-iwlwifi (assuming that backports have been configured as APT source). Initially, I had to download some files directly from Intel, but it seems that the buster-backports package now contains the missing files.

The graphics was straightforward from Debian Stable:

  • xserver-xorg-video-amdgpu
  • firmware-amd-graphics*

I cannot remember whether I installed them explicitly or whether they are just installed because they are dependencies of the above AMDGPU package: I have a couple of mesa and DRM packages installed (see also Debian Howto on AMDGPU and the Debian Page on Video acceleration), e.g.:

  • libgl1-mesa-dri
  • libglx-mesa0
  • mesa-vulkan-drivers
  • mesa-va-drivers (for video decoding)
  • vdpau-driver-all (for video decoding)
  • libdrm-amdgpu

(The whole Linux graphics stack consists of more components, which you may want to check.)

Check whether MESA is activated.
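One possible way to check this is to look at the OpenGL renderer string printed by glxinfo (from the mesa-utils package). The sketch below is an assumption on my side about what "activated" means: it treats the llvmpipe and softpipe software renderers as "not activated"; the function name uses_hw_renderer is made up.

```shell
# uses_hw_renderer
# Read the output of "glxinfo -B" on stdin and succeed only if the OpenGL
# renderer is not one of Mesa's software renderers
# (hypothetical helper, for illustration only).
uses_hw_renderer() {
    renderer=$(grep 'OpenGL renderer string')
    # No renderer line at all: treat as failure.
    [ -n "$renderer" ] || return 1
    case $renderer in
        *llvmpipe*|*softpipe*) return 1 ;;
        *) return 0 ;;
    esac
}
```

Usage: glxinfo -B | uses_hw_renderer && echo "hardware rendering"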

Also, some BIOS/UEFI tweaking was needed, e.g. changing the sleep state from Windows to Linux mode (and, if you use an unsigned kernel, disabling secure boot).

There is one thing concerning suspend and resume, though: my Jabra USB headset stops working after resume -- manually unplugging and re-plugging the USB plug solves this problem, and remarkably other USB devices (keyboard and mouse) work after resume. In the end, I wrote a script that removes and re-inserts the USB module after resume and placed it as file /etc/pm/sleep.d/10-usbquirk.sh (do a chmod a+x). The content is as follows:

#!/bin/bash
# USB headset not working after suspend, so reload the USB host controller
# module (other USB devices work, though).
case "${1}" in
    hibernate|suspend)
        # Nothing to do before going to sleep.
        ;;
    thaw|resume)
        # Reload the xHCI driver after waking up.
        rmmod xhci_hcd
        sleep 0.5
        modprobe xhci_hcd
        ;;
    *)
        ;;
esac

(Indentation does not matter for the Bash script; it is only there for readability.)

I also have the UltraDock docking station: it uses the two USB-C ports of the T14 plus a third proprietary connector. AFAIK the latter is for Ethernet, even though it seems that it does not simply mechanically extend the Ethernet port, but that an extra Ethernet chip is built into the docking station. And I guess that one of the USB-C ports works in a mode where USB-C is not used but the video signals are transmitted directly, so that is different from a pure USB-C dock. And indeed, the dock seems to have its own MAC address (I did not figure out how to find out what the MAC address is -- I probably need to connect it to my smart switch). While the BIOS has a MAC passthrough setting (that is enabled), under Linux the MAC address does not get passed through.

Otherwise, the dock works without any extra configuration with Debian, including dock and undock (I use a 4k screen via one of the HDMI connectors of the UltraDock and a couple of USB A ports of the UltraDock).

I still dislike not having anymore the bottom dock connector used by Lenovo in the past: you cannot anymore simply "throw" the Laptop onto the dock, but need significant force now to insert the three connectors from the side and I am not sure, how many docking cycles the USB-C connectors last from a mechanical point of view.

I also tried a 4K screen (Lenovo P32p) that has USB-C for power delivery and video and even serves as USB hub as well (i.e. with one USB-C cable, it powers the laptop, transmits the video, and mouse and keyboard are attached to the screen; it even has Ethernet that will then go via the same, single USB-C cable). This works nicely and in fact replaces a dock. However, at the beginning I always needed to reboot the system in order to make the video work via USB-C. For some reason this problem magically disappeared.

TODO: Check Thinkpad specific packages (I have them currently not installed, and do not see any hardware support missing).

Update: There is now also a Debian Wiki page on the Thinkpad T14.

Tahoma and Tahoma bold font in Wine/CrossOver

Helmut Neukirchen, 27. October 2016

Even if the free Microsoft Core fonts are installed, Tahoma is missing. A Microsoft knowledge base support entry is available to download as Tahoma32.exe, however this is a broken link. Hence, download the therein contained files (tahoma.ttf and tahomabd.ttf) from elsewhere (seems to be legal as Microsoft offered them anyway to the public), e.g. https://github.com/caarlos0/msfonts/tree/master/fonts

Copy both font files to the ~/.fonts directory and run fc-cache -fv

Some notes on using a Spark cluster

Helmut Neukirchen, 18. August 2016

The following notes are mainly for my personal use referring to the Spark 1.6/YARN cluster that I access, but maybe they are helpful for you as well...

Upload to HDFS

By default (=used implicitly by all HDFS operations), HDFS paths are relative to your HDFS home directory: it needs to be created first by the administrator!

While piping through SSH should work ( cat test.txt | ssh username@masternode "hdfs dfs -put - hadoopFoldername/" ), it is reported to be slow -- I never checked this, but as I only used rather small data anyway, I instead did an scp to the local file system of the master node (note the trailing colon) and afterwards used hdfs put there:
scp localFile username@masternode:
hdfs dfs -put twitterSmall.csv Twitter

Concatenate HDFS files (all inside an HDFS directory) and store in local file system (without sorting)

hdfs dfs -getmerge HdfFolderContainingSplitResultFiles LocalFileToBeCreated

Note that Spark does not overwrite output files in HDFS by default. Either take care when you re-run jobs that the output files have been (re-)moved or you have to allow it in the Spark conf of your program:  conf.set("spark.hadoop.validateOutputSpecs","false")

Debugging

  1. See http://spark.apache.org/docs/latest/running-on-yarn.html
  2. Use spark-submit --verbose
  3. If executor processes are killed, this is mainly due to insufficient RAM (garbage collection takes too long, thus timeouts occur, or simply out of memory/OOM exceptions). While in this case you see only "exit code 143" in the log of the driver on the spark-submit console, the details need to be found in the logs of the nodes/executors. This may not be possible via the Web UI due to executor nodes being firewalled -- in this case use:
    yarn logs -applicationId application_1470137500465_0147
    (App ID to be taken from the ID column in the Cluster Web UI. Works only for completed runs, not the current run.) In these logs, you can then find / search for java.lang.OutOfMemoryError: GC overhead limit exceeded or java.lang.OutOfMemoryError: Java heap space
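Searching these logs can be condensed into a small filter. The following is just a sketch; the function name oom_summary is made up, and it only looks for the java.lang.OutOfMemoryError messages mentioned above.

```shell
# oom_summary
# Read YARN log text on stdin and print how often each distinct
# OutOfMemoryError message occurs (hypothetical helper, for illustration).
oom_summary() {
    grep -o 'java\.lang\.OutOfMemoryError: [A-Za-z ]*' | sort | uniq -c
}
```

Usage: yarn logs -applicationId application_1470137500465_0147 | oom_summary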

Performance tuning

  1. Note that due to the HDFS block size of 128 MB, partitions of this size are created by default when reading data. To enforce a higher number of partitions/higher parallelism, use the optional numberOfPartitions parameter already at the file read stage (many other RDD creating operations support it as well).
  2. Some introduction https://www.mapr.com/blog/resource-allocation-configuration-spark-yarn
    http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
    (in particular: more than 5 cores per executor is said to lead to bad HDFS throughput. Note that “executor” is not identical to “node”, thus instead of running one executor with 24 cores on one node, rather run 4 executors with 5 cores on each node or 8 executors with 3 cores! Note that then, however, the overall memory of a node needs to be divided by the number of executors per node, e.g. 5 GB per executor with 8 executors per node on a 40 GB RAM node.)
  3. Config for RAM-intensive jobs (=1 core per executor only & 1 core per node only, using 40GB heap space and 2GB overhead for Spark/Yarn itself => on each of the 38 nodes only one core is used that thus can make use of all available RAM), in addition increase timeouts and message size:
    spark-submit --conf "spark.network.timeout=300s" --conf "spark.akka.frameSize=2000" --driver-memory 30g --num-executors 38 --executor-cores 1 --conf "yarn.nodemanager.resource.cpu-vcores=1" --executor-memory 40g --conf "spark.yarn.executor.memoryOverhead=2000" --conf "spark.driver.cores=4" --conf "spark.driver.maxResultSize=0"
    (Note: not sure about the driver memory and cores: this seems to have no influence -- is it too late to set it here?)
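The executor sizing arithmetic from item 2 can be written down as a tiny sketch. The function names are made up, and the calculation ignores the memory overhead for Spark/YARN itself, which needs to be subtracted in practice.

```shell
# executors_per_node NODE_CORES CORES_PER_EXECUTOR
# How many executors fit on one node (integer division).
executors_per_node() {
    echo $(( $1 / $2 ))
}

# executor_memory_gb NODE_RAM_GB EXECUTORS_PER_NODE
# How much RAM each executor can get if the node's RAM is shared equally.
executor_memory_gb() {
    echo $(( $1 / $2 ))
}
```

For the numbers used above, executors_per_node 24 3 gives 8 executors per node, and executor_memory_gb 40 8 gives 5 GB per executor.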

CORBA remote object IORs in a NAT environment

Helmut Neukirchen, 23. October 2015

When running CORBA remote objects in a NAT environment (assuming Internet protocols are used), the IIOP IOR remote object references that will be created (and registered at some nameservice) will contain the private IP address (to convince yourself: dump the IOR as string and paste that string in http://www2.parc.com/istl/projects/ILU/parseIOR/). As a result, when a client outside the NAT environment looks up the IOR, it will get one containing the private IP and access to the remote object does of course not work. For the Oracle OpenJDK CORBA implementation, the following command line parameter needs to be provided to both the ORB and the JVM running at the remote object side:
-ORBServerHost PublicIPofServer

Concerning the ports:
By default, the Oracle OpenJDK is using TCP port 1049 for the activation service. You can change this port via the ORB command line parameter -port.

The port used for the CORBA Naming Service (which is automatically provided by the OpenJDK Java ORB) depends on whether orbd is started as root or as an ordinary user: when started as root, TCP port 900 is used, otherwise TCP port 1049 (because ports lower than 1024 can only be created by root). Unfortunately, TCP port 1049 is also used by the activation service as described above. Hence, a port collision (=exceptions) will occur (what a stupid design)!
In this case, let the ORB start the Naming Service e.g. on TCP port 1050:
orbd -ORBInitialPort 1050

When changing the Naming Service port from the default 900, client and server JVMs that use that Naming Service also need to know about the changed Naming Service port number: Start the JVMs with additional parameter:
java -ORBInitialPort 1050

When running client and server on different hosts, take care that they use the same Naming Service. Assuming that the Naming Service running on the server's host is used: the server will anyway use this local Naming Service, but the client needs to know the hostname of the server's Naming Service: start the client JVM with additional parameter:
java -ORBInitialHost nameserverhost

Note that in addition to these standard services (Activation and Naming), CORBA uses by default dynamically assigned TCP ports (=expect difficulties with firewalls) for all further objects such as your own remote objects that are contained in the IORs. However, you can enforce a port to be used by a servant created within a JVM using the additional parameter:
java -ORBServerPort port

Debian Linux on Thinkpad X250

Helmut Neukirchen, 4. March 2015

What I did to install Debian Linux (Jessie) on Thinkpad X250:

Booting from USB device (to install Debian) was some challenge: in particular USB 3 needed to be disabled in BIOS (maybe some more BIOS tweaks that I cannot remember anymore).

To make the Trackpoint keys work:

In BIOS, disable Touchpad (anyway a good idea to prevent accidental touches there).

Added file /etc/modprobe.d/x250.conf with content
options psmouse proto=imps

Added file /usr/share/X11/xorg.conf.d/20-thinkpad.conf with content (works only if Touchpad is disabled in BIOS)

Section "InputClass"
Identifier "Trackpoint Wheel Emulation"
MatchProduct "PPS/2 IBM TrackPoint|DualPoint Stick|Synaptics Inc. Composite TouchPad / TrackPoint|ThinkPad USB Keyboard with TrackPoint|USB Trackpoint pointing device|Composite TouchPad / TrackPoint|PS/2 Synaptics TouchPad"
MatchDevicePath "/dev/input/event*"
Option "EmulateWheel" "true"
Option "EmulateWheelButton" "2"
Option "Emulate3Buttons" "false"
Option "XAxisMapping" "6 7"
Option "YAxisMapping" "4 5"
EndSection

Also to make side button of my Logitech USB mouse act as middle button:
Added file 20-logitech-mouse-side-button.conf with content

Section "InputClass"
Identifier "Logitech mouse side button remap"
MatchProduct "Logitech USB Receiver"
MatchDevicePath "/dev/input/event*"
Option "ButtonMapping" "1 0 3 4 5 6 7 2 9 10"
EndSection

(Still, the Logitech mouse sometimes stops working completely; unplugging and re-plugging the USB receiver at the docking station then helps -- I still need to investigate that. Update: it seems that plugging the USB receiver into another USB port (=other USB type) helps.)

I also sometimes experience that my external Dell monitor connected via a DP cable to my dock blanks for half a second: a firmware update of the dock is needed, but it is only available as an MS Windows executable. Any hints on how to do this via Linux are welcome! (A BIOS update via Linux is possible and worked.)
I do not have that problem when using the DVI-D port and cable of the dock -- however, for 4k resolution, DP is better than DVI!

I also had an old 1440x900 display that did not report its native resolution when connected via VGA (which btw. reports as DP2). While I might probably add some modeline to some xconfig file as I last did probably 10 years ago, I did the following:

cvt 1440 900
Then pasted the modeline generated by cvt (the mode name needs to be the same in all three commands):
xrandr --newmode "1440x900_60.00" 106.50 1440 1528 1672 1904 900 903 909 934 -hsync +vsync
xrandr --addmode DP2 "1440x900_60.00"
xrandr --output DP2 --mode "1440x900_60.00"

Also, my other display sometimes does not get recognised:

cvt 1920 1080
Then pasted the modeline generated by cvt:
xrandr --newmode "1920x1080_60.00" 173.00 1920 2048 2248 2576 1080 1083 1088 1120 -hsync +vsync
xrandr --addmode DP2 "1920x1080_60.00"
xrandr --output DP2 --mode "1920x1080_60.00"
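The copying and pasting of the modeline could be scripted. This is only a sketch, not what I actually ran: the function names modeline_from_cvt and add_mode are made up, and it assumes that cvt prints exactly one line starting with Modeline.

```shell
# modeline_from_cvt
# Read the output of cvt on stdin and print the modeline arguments
# (mode name without quotes, followed by the timings) on one line.
modeline_from_cvt() {
    sed -n 's/^Modeline[[:space:]]*//p' | tr -d '"'
}

# add_mode OUTPUT WIDTH HEIGHT
# Create the mode computed by cvt, attach it to OUTPUT and switch to it
# (hypothetical helper, for illustration only).
add_mode() {
    output=$1
    modeline=$(cvt "$2" "$3" | modeline_from_cvt)
    # The mode name is the first word of the modeline.
    name=${modeline%% *}
    # $modeline is deliberately unquoted so that it is split into the
    # separate arguments that xrandr --newmode expects.
    xrandr --newmode $modeline
    xrandr --addmode "$output" "$name"
    xrandr --output "$output" --mode "$name"
}
```

Usage: add_mode DP2 1440 900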

For getting cloned display output with the KDE "Display and Monitor" configuration system setting pane, the two screens have to be dragged onto each other. However, I like the old "Size & Orientation" pane more, which can be obtained by installing the kde-workspace-randr package.

Just as reminder for me: to use Gutenprint for the photoprinter: create first in CUPS (e.g. via web interface) an entry for the photoprinter so that the printer gets an own queue. Then, in Gimp, this queue can be used when setting up the photoprinter there. In case the Print with Gutenprint menu entry does not show up in Gimp, an extra package needs to be installed: IIRC for Debian it is package: gimp-gutenprint