Notes After Migrating to BTRFS

In 2022 I wrote that I wanted to migrate to BTRFS on my old, trusty laptop. Since then I have migrated to or installed BTRFS on many personal computers (mine and my family’s) and on servers. Here are some of the notes I made along the way, which may make someone’s life easier. I always refer to them when I set up BTRFS on a new system.

Data Migration

I migrated from Ext4 on top of LVM on top of LUKS to BTRFS on top of LUKS.
Just in case, I made an ordinary backup to devices which were not involved in the migration process.
To migrate I used an external HDD connected via USB3. The procedure is dead simple:
- boot into live CD
  - I recommend using Debian Live KDE
  - For Debian Live systems the username is user and password live. UID is 1000.
- mount file systems: home partition under /mnt/home, root under /mnt/r and external HDD under /mnt/usb
- remove unnecessary trash from /mnt/home to speed up data migration
- clone the whole file system to the external drive:
```
mkdir /mnt/usb/{root,home}
rsync -axHAWXS --numeric-ids --info=progress2 /mnt/r/ /mnt/usb/root
```
  - Note the trailing slash on rsync’s source path: with it rsync copies the contents of the directory, without it the directory itself. Repeat for /mnt/home with /mnt/usb/home as the target. This ends up with separate directories for rootfs and home, which isn’t strictly necessary, but I like this separation.
  - For ~120 GB of data cloning took ~30 minutes (and then an additional 30 minutes for cloning back)
  - Copying data from Ext4 to BTRFS often froze and slowed down. It didn’t matter which command was used (rsync, cp).
    - After the migration I made some tests and copying ~30 GB of data from Ext4 to BTRFS froze the whole system. It eventually unfroze.
    - I don’t know the reason for the freezes.
    - It’s best to not plan on doing anything in parallel when copying is in progress.
- re-partition disks:
  - KDE has a wonderful GUI partition manager, much better than gparted, which can create encrypted partitions
  - umount /mnt/{r,home}, close luks container
  - remove root and home partitions from the internal drive. Leave /boot and /boot/efi partitions intact.
  - create a new BTRFS partition, optionally encrypted. Leave space for swap behind it if you wish
    - BTRFS has problems with swap files in multi-disk setups, so if you have many drives in your PC and plan to use swap, it’s better to create a swap partition.
  - command line version of creating luks encrypted container: cryptsetup luksFormat /dev/nvme0n1p3
    - I think it’s possible to reuse the old LUKS partition by mounting it and running wipefs on /dev/mapper/. I didn’t see a reason to do that and simply created new ones.
- create BTRFS subvolumes:
  - I prefer a flat layout of subvolumes, which I find clearer and more explicit than subvolumes nested under a root subvolume. I create subvolumes for /home, /opt, /srv, /tmp, /usr/local and /var, because they all hold files (either user data or important runtime data) which shouldn’t be snapshotted together with the root file system. This prevents data loss when a root snapshot is rolled back.
```
cryptsetup luksOpen /dev/nvme0n1p3 crypt_nvme0n1p3
mount /dev/mapper/crypt_nvme0n1p3 /mnt/r
for subvol in @rootfs @home @opt @srv @tmp @usr_local @var; do
    btrfs subvolume create /mnt/r/${subvol}
done

btrfs subvolume create /mnt/r/@snapshots
```
  - we’re manually opening a luks container under a desired name. We’ll use this name in /etc/crypttab later on
  - This follows openSUSE recommendations for subvolume layout.
  - 2024-04-24: I usually don’t create a subvolume for /root, because I don’t use root account directly and don’t store any files there
  - The default subvolume is named @rootfs because I have read somewhere that this is what Debian calls its own default subvolume. I figured that this would make it more compatible with whatever features Debian cooks up for BTRFS.
  - I’m creating additional subvolume for snapper snapshots
- umount root btrfs and mount subvolumes to their target hierarchy:
```
mount /dev/mapper/crypt_nvme0n1p3 /mnt/r -o subvol=@rootfs,compress=zstd,noatime
for subvol in home opt srv tmp usr_local var; do
    d=$(echo ${subvol} | sed 's|_|/|')
    mkdir -p /mnt/r/${d}
    mount /dev/mapper/crypt_nvme0n1p3 /mnt/r/${d} -o subvol=@${subvol},compress=zstd,noatime
done
```
- clone data back:
```
rsync -axHAWXS --numeric-ids --info=progress2 /mnt/usb/root/ /mnt/r
rsync -axHAWXS --numeric-ids --info=progress2 /mnt/usb/home/ /mnt/r/home
```
- in case rsync had copied some devices or other files created by the system, we may (should?) remove them
```
rm -rf /mnt/r/{dev,media,mnt,proc,run,sys,tmp}/*
```
- at some point I disable copy-on-write for /var. This can (should?) be done on a running system: chattr +C /var. On a directory the flag only affects files created afterwards; existing files keep copy-on-write.
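The mapping from subvolume names to mount points used above can be checked without touching any disk. The following is a minimal sketch that only prints the mount commands. The function names subvol_to_mountpoint and plan_mounts are my own, and DEV, TARGET and OPTS mirror the values from these notes. Unlike the sed call above, it turns every underscore into a slash, so it assumes that no directory name contains an underscore.

```shell
#!/bin/sh
# Dry run: print (do not execute) the mount commands for a flat subvolume layout.
# DEV, TARGET and OPTS are placeholders taken from the notes; adjust before use.
DEV=/dev/mapper/crypt_nvme0n1p3
TARGET=/mnt/r
OPTS=compress=zstd,noatime

# Map a subvolume name to its mount point: @rootfs -> /, @usr_local -> /usr/local.
subvol_to_mountpoint() {
    case "$1" in
        @rootfs) echo "/" ;;
        @*)      echo "/$(echo "${1#@}" | tr '_' '/')" ;;
        *)       echo "not a subvolume name: $1" >&2; return 1 ;;
    esac
}

# Print one mount command per subvolume, in the order given (root first).
plan_mounts() {
    for subvol in "$@"; do
        mp=$(subvol_to_mountpoint "$subvol") || return 1
        echo "mount $DEV ${TARGET}${mp%/} -o subvol=$subvol,$OPTS"
    done
}

plan_mounts @rootfs @home @opt @srv @tmp @usr_local @var
```

Reviewing the printed commands before running them is a cheap way to catch a typo in a subvolume name.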

Adapt crypttab and fstab

New partitions mean new UUIDs, which must be updated in /etc/crypttab and /etc/fstab. Alternatively, the btrfstune program, which is part of btrfs-progs, may be used to change UUIDs of offline filesystems.
- lsblk -o name,uuid
- remember to put UUIDs of decrypted partitions to /etc/fstab (the ones in /dev/mapper, not the encrypted ones)!
- similarly, put UUIDs of encrypted partitions in crypttab!
- Remember to put a correct name of luks container in crypttab. You should manually use cryptsetup luksOpen to make sure it’ll be the same as in your target system! See “create BTRFS subvolumes” step above.
- My /etc/fstab for reference:
```
UUID=...  /            btrfs  subvol=@rootfs,compress=zstd,noatime 0 1
UUID=...  /boot        ext2   defaults 0 2
UUID=...  /boot/efi    vfat   umask=0077 0 1
UUID=...  /home        btrfs  subvol=@home,compress=zstd,noatime 0 1
UUID=...  /opt         btrfs  subvol=@opt,compress=zstd,noatime 0 1
UUID=...  /srv         btrfs  subvol=@srv,compress=zstd,noatime 0 1
UUID=...  /tmp         btrfs  subvol=@tmp,compress=zstd,noatime 0 1
UUID=...  /usr/local   btrfs  subvol=@usr_local,compress=zstd,noatime 0 1
UUID=...  /var         btrfs  subvol=@var,compress=zstd,noatime 0 1
UUID=...  /.snapshots  btrfs  subvol=@snapshots,compress=zstd,noatime 0 1
UUID=...  none         swap   defaults 0 0
```
  - Fedora defaults to 0 0 for btrfs mounts, but this disables fsck and I don’t like this.
  - discard=async is the default since kernel 6.2. Apparently it helps to reduce read latencies. Add it to the mount options if you use an older kernel.
- when we mount BTRFS manually, it’s easy to forget to enable compression, and compression applies only to newly written files. To re-compress the whole system in this case, just defragment it. Example for zstd: btrfs filesystem defragment -r -v -czstd /
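Because a missing compress option produces no error, it may be worth checking fstab for it. A small sketch, assuming the usual six-field fstab format; btrfs_without_compress is a hypothetical helper name, not part of any package:

```shell
#!/bin/sh
# Print the mount points of btrfs entries whose options lack compress=... .
# $1: path to an fstab-formatted file (defaults to /etc/fstab).
btrfs_without_compress() {
    awk '
        $1 ~ /^#/ || NF == 0 { next }    # skip comments and blank lines
        $3 == "btrfs" && $4 !~ /(^|,)compress(-force)?(=|,|$)/ { print $2 }
    ' "${1:-/etc/fstab}"
}

# Usage: btrfs_without_compress /etc/fstab
```

An empty output means every btrfs entry carries a compress or compress-force option.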
At some point I migrated to a bigger drive. (sidenote: With clonezilla disk-to-disk) I used KDE partition manager to resize the BTRFS partition on the new drive to use it fully. KDE partition manager has good support for BTRFS and runs btrfs commands underneath to resize the file system. Even so, it “ate” ~200 GB of space: the partition had 930 GB and btrfs filesystem usage showed only 700 GB available. I had to resize it manually: btrfs filesystem resize max /.
In /etc/crypttab I gave up the trick of unlocking many encrypted devices by passing a keyfile which resides on the first unlocked device. This trick is unsuitable for some RAIDs and multi-disk BTRFS setups which require all disks to be present to mount a file system properly. (sidenote: Maybe it works with noearly or some other options which I didn’t investigate.) Instead I chose the keyscript approach. Debian (and its derivatives, I guess) ships with the decrypt_keyctl script, which caches a password for 60 seconds and passes it to all encrypted volumes in a configured group (which I called pw1), so if they share the same password, they’ll be unlocked automatically.
```
# rootfs
crypt_nvme0n1p3 UUID=... pw1 luks,discard,keyscript=decrypt_keyctl
# storage
crypt_sda1 UUID=... pw1 luks,discard,keyscript=decrypt_keyctl
# swap
crypt_nvme0n1p4 UUID=... pw1 luks,discard,keyscript=decrypt_keyctl
```
- You must run update-initramfs after adding a keyscript to crypttab.
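A mistyped separator in crypttab is easy to miss and only shows up at boot. The sketch below is a rough check, assuming the four-field layout (name, device, key, options) used in the entries above; lint_crypttab is a name I made up for illustration:

```shell
#!/bin/sh
# Report crypttab lines that mention keyscript= outside the 4th (options) field,
# which usually means a space was replaced by a comma or a field is missing.
# $1: path to a crypttab-formatted file (defaults to /etc/crypttab).
# Returns non-zero if any suspicious line was found.
lint_crypttab() {
    awk '
        BEGIN { bad = 0 }
        $1 ~ /^#/ || NF == 0 { next }    # skip comments and blank lines
        /keyscript=/ && (NF != 4 || $4 !~ /(^|,)keyscript=/) {
            printf "line %d: expected <name> <device> <key> <options>: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "${1:-/etc/crypttab}"
}

# Usage: lint_crypttab /etc/crypttab && update-initramfs -u -k all
```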

Chroot

We need to update initramfs and grub to boot our system. To do that we need to mount /boot, /dev and friends and enter a chroot. We already have the rootfs in place, so it’s a matter of:
```
cryptsetup luksOpen /dev/nvme0n1p4 crypt_nvme0n1p4
mount /dev/nvme0n1p2 /mnt/r/boot
mount /dev/nvme0n1p1 /mnt/r/boot/efi
mkdir -p /mnt/r/{dev,media,mnt,proc,run,sys,tmp}
for d in dev media mnt proc run sys tmp; do 
    mount --rbind /$d /mnt/r/$d
done
chroot /mnt/r

# in chroot:
update-initramfs -u -k all
update-grub
```
- In the above it was important to mount both /boot and /boot/efi
- I think that opening a container for the swap partition is important. Anyway, I like to have the filesystem in the state it will normally be in.
- If you mess up for example names of luks containers, systemd might complain during boot and delay the whole boot for 90 seconds, when it tries to do impossible mounts.
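To catch mismatched container names before rebooting, the names in crypttab can be compared with the containers that are currently open. This is only a sketch: check_mapper_names is a hypothetical helper, and it assumes that every container was opened manually under its target name, as described above.

```shell
#!/bin/sh
# For every mapping name in a crypttab file, report whether a device of that
# name exists in the device-mapper directory.
# $1: crypttab path (default /etc/crypttab), $2: mapper dir (default /dev/mapper).
# Returns non-zero if at least one name is missing.
check_mapper_names() {
    crypttab=${1:-/etc/crypttab}
    mapper=${2:-/dev/mapper}
    status=0
    for name in $(awk '$1 !~ /^#/ && NF > 0 { print $1 }' "$crypttab"); do
        if [ -e "$mapper/$name" ]; then
            echo "ok       $name"
        else
            echo "MISSING  $name"
            status=1
        fi
    done
    return $status
}

# Usage (inside the chroot, where /dev is bind-mounted): check_mapper_names
```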

Conversion from ext4

The following procedure worked for converting a simple ext4 filesystem on my other server in-place:
```
fsck.ext4 -f /dev/sda1
btrfs-convert /dev/sda1
```
To create a subvolume structure mentioned above I used a trick which doesn’t use any additional disk space (note that this removes /root - this is because on the server I have created a subvolume for /root. Don’t remove it otherwise!)
```
mount /dev/sda1 /mnt/r && cd /mnt/r
btrfs subvolume snapshot . @rootfs
cd @rootfs
# relative paths: delete inside the snapshot, not on the running system
rm -rf ./home ./opt ./root ./srv ./tmp ./usr/local ./var
```
Now for each removed directory we can create a snapshot and remove+move all unwanted contents. For example (beware of Bashism in first line!):
```
shopt -s extglob
btrfs subvolume snapshot .. ./home
cd home
rm -rf -- !(home)
mv home/* . ; mv home/.* . ; rmdir home
```
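For shells without extglob, the same "keep one directory and hoist its contents" step can be written with find. This is a swapped-in alternative rather than the commands I actually ran; keep_only_subdir is a hypothetical name, and -mindepth/-maxdepth are find extensions (available in GNU and BSD find) rather than strict POSIX. It also moves dotfiles without tripping over . and .., which the home/.* glob can match in older Bash versions.

```shell
#!/bin/sh
# keep_only_subdir DIR NAME
# Delete everything in DIR except NAME, then move NAME's contents (dotfiles
# included) up into DIR and remove the now-empty NAME.
# Destructive: run it only inside a freshly created snapshot.
keep_only_subdir() {
    dir=$1
    name=$2
    [ -d "$dir/$name" ] || { echo "no such directory: $dir/$name" >&2; return 1; }
    # Remove every top-level entry except NAME.
    find "$dir" -mindepth 1 -maxdepth 1 ! -name "$name" -exec rm -rf -- {} +
    # Move NAME's entries one level up; find also returns hidden files.
    find "$dir/$name" -mindepth 1 -maxdepth 1 -exec mv -- {} "$dir"/ \;
    rmdir "$dir/$name"
}

# Usage, from inside @rootfs after creating the snapshot:
#   keep_only_subdir ./home home
```

Like the glob version, it fails on the final rmdir if NAME contains an entry that is itself called NAME.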
If /boot is on btrfs after the conversion, remember to reinstall GRUB:
```
grub-install /dev/sda
update-grub
```
I screwed this up initially, but rolling back to ext4 with btrfs-convert -r /dev/sda1 worked flawlessly.

Daily maintenance

btrfs-assistant is a well-done GUI for btrfs management. I like using it instead of raw command line. To run it:
```
QT_QPA_PLATFORM=wayland btrfs-assistant-launcher
```
btrfsmaintenance is a “set up once and forget” package for periodic maintenance of BTRFS.
Its configuration is in /etc/default/btrfsmaintenance. Edit it and then activate the changes with systemctl restart btrfsmaintenance-refresh.service, which sets up several systemd timers for tasks like scrubbing and balancing the filesystem.
I set up monthly scrubbing and balancing on my personal laptop. I disabled defragmentation and trimming.
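The corresponding settings might look roughly like the excerpt below. The variable names are written from memory of the packaged default file and the "/" mount points are an assumption, so compare them with the comments in your installed /etc/default/btrfsmaintenance before copying anything.

```shell
# /etc/default/btrfsmaintenance (excerpt; mount points are an assumption)
BTRFS_SCRUB_MOUNTPOINTS="/"
BTRFS_SCRUB_PERIOD="monthly"
BTRFS_BALANCE_MOUNTPOINTS="/"
BTRFS_BALANCE_PERIOD="monthly"
# "none" disables the periodic task
BTRFS_DEFRAG_PERIOD="none"
BTRFS_TRIM_PERIOD="none"
```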

Snapper

I configured snapper to make auto snapshots of @rootfs (which excludes nested subvolumes). It makes them on several occasions: on boot, before and after apt upgrade and once every hour. It performs auto-cleanup of old snapshots once a day.
- snapper -c root create-config /
- snapper is a wonderful program which you set up once and then forget about as it does its work.
- snapper lists snapshots very slowly when BTRFS quotas are enabled.
- snapper creates snapshots in a /.snapshots directory. It’s a good idea to create a root-level snapshots subvolume and mount it at /.snapshots via fstab, because it should ease recoveries from GRUB (I haven’t had the opportunity to test this mechanism yet). If you haven’t created it before:
```
mkdir /mnt/foo
mount -o subvolid=5 /dev/... /mnt/foo
cd /mnt/foo
btrfs subvolume create @snapshots

# and in /etc/fstab:
UUID=... /.snapshots btrfs subvol=@snapshots,compress=zstd 0 1
```
- if you had mounted @snapshots before creating the snapper configuration, snapper will complain that /.snapshots already exists. You must first unmount it and remove the /.snapshots directory, then run snapper’s create-config command, then remove the subvolume it created, recreate the empty /.snapshots directory and re-mount our @snapshots (easy with mount -a).
- grub-btrfs adds snapshots to the grub menu, allowing you to boot into them
I disabled the quota feature, which apparently is responsible for a lot of slowdowns: btrfs quota disable <path>. I did it for all of my subvolumes.
I lowered the limits of the “timeline” and “number” snapshot cleanup algorithms to something like 2-4 for the different time periods and 20 for “number”.
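In snapper’s configuration file for the root config this could look roughly as follows. The concrete per-period values are only an illustration within the 2-4 range mentioned above, not a copy of my file.

```shell
# /etc/snapper/configs/root (excerpt; per-period values are illustrative)
TIMELINE_CREATE="yes"
TIMELINE_CLEANUP="yes"
TIMELINE_LIMIT_HOURLY="4"
TIMELINE_LIMIT_DAILY="4"
TIMELINE_LIMIT_WEEKLY="2"
TIMELINE_LIMIT_MONTHLY="2"
TIMELINE_LIMIT_YEARLY="2"
NUMBER_CLEANUP="yes"
NUMBER_LIMIT="20"
```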

Fedora

Fedora (as of version 40) supports BTRFS at the installer level, but the interface for disk partitioning could be improved; it wasn’t crystal clear to me what I was doing. (sidenote: Terminology issues, maybe translation - I used the Polish installer.)
I found it easiest to choose the “semi-automatic” approach. (sidenote: Sorry, I don’t remember the exact name - not automatic, not gparted-like partitioning - the middle one.) I let Fedora use the whole disk with the “BTRFS scheme” (sidenote: I had to manually remove all existing partitions before I started the installer. IIRC this wasn’t the case in Fedora 39.) and then I added the new mountpoints. Fedora automatically creates subvolumes for them (at the root level) and assigns their names.
Using one of the immutable variants of Fedora may remove my need for snapshots. I’ll evaluate them when I decide to move away from Debian on my main machine (probably when I have to change hardware).
- Fedora Atomic projects are immutable but not reproducible (as NixOS is), and reproducibility is more important to me. On the other hand, Fedora Atomic doesn’t invent a language with horrible syntax.
- rpm-ostree - the one component to bind them all.