Notes After Migrating to BTRFS
—
Last year I wrote that I wanted to migrate to BTRFS on my old, trusty laptop. I did it this weekend. Although I have bootstrap scripts, I didn’t want to reinstall the OS, so I simply copied my entire file system to an external hard drive and back. Here are the notes I made along the way, which may make someone’s life easier.
Data Migration
- I migrated from Ext4 on top of LVM on top of LUKS to BTRFS on top of LUKS.
- Just in case, I made an ordinary backup to devices which were not involved in the migration process.
- To migrate, I used an external HDD connected via USB3 and formatted as Ext4. The procedure was dead simple:
- clone the whole file system to the external drive
- re-partition my disks from a Debian Live image (I used the Debian Bookworm KDE Live image on a USB stick)
- KDE has a wonderful partitioning GUI, much better than gparted, which can create encrypted partitions
- clone data back
- For Debian Live systems the username is user and password live.
- I used rsync to clone data.
- I did it from the Live image.
- I first mounted my old FS at /mnt/r and the USB drive at /mnt/usb.
- The exact command was:
rsync -axHAWXS --numeric-ids --info=progress2 /mnt/r/ /mnt/usb/
Note the slashes at the end of rsync’s paths, which change the semantics of the command. I used the same command with the paths reversed to clone the data back.
- Copying data from Ext4 to BTRFS often froze and slowed down, regardless of which command was used (rsync, cp).
- After the migration I ran some tests, and copying ~30 GB of data from Ext4 to BTRFS froze the whole system. It eventually unfroze.
- I don’t know the reason for the freezes.
- It’s best not to plan on doing anything in parallel while copying is in progress.
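The trailing-slash rule is easy to verify in a scratch directory before pointing rsync at real disks. The sketch below is not part of the original migration; the directory names (with_slash, without_slash) are made up for the demo, and it only touches a temporary directory.

```shell
# Scratch demo of rsync's trailing-slash rule; touches only a temp directory.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub"
echo hello > "$demo/src/sub/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # Trailing slash on the source: copy the *contents* of src into the target.
    rsync -a "$demo/src/" "$demo/with_slash/"
    # No trailing slash: copy the directory itself, creating target/src/.
    rsync -a "$demo/src" "$demo/without_slash/"
    # List the resulting files to compare both layouts.
    find "$demo/with_slash" "$demo/without_slash" -type f | sort
else
    echo "rsync not installed; skipping demo" >&2
fi
```

With the slash, file.txt ends up under with_slash/sub/; without it, under without_slash/src/sub/. That extra directory level is exactly what you don't want when cloning a root file system.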
Partitioning
- I didn’t touch the partition table of my main SSD because it contained separate /boot and /boot/efi partitions which I wanted to leave alone.
- I created a separate partition on each of my SSDs (I have 2). One holds the root file system, the other is storage for internet downloads, photos etc.
- I didn’t want to join these disks in RAID. They’re both 256 GB, which is a little low these days. I know I should buy a new, bigger disk and I have it on my list.
- I didn’t want to join these disks in a single or RAID0 array, because when one disk in such an array fails, the whole array is dead.
- Instead of RAID I’m taking regular backups to my NAS and to the cloud.
- I created a swap partition on my secondary SSD. Previously I used a swap file, but I’ve read that BTRFS has problems with these in multi-disk setups. I made it a little bigger than the previous swap file, because I plan to enable hibernation one day.
- I reformatted both drives to BTRFS on LUKS. KDE Partition Manager works fine for this purpose. Alternatively,
cryptsetup luksFormat /dev/nvme0n1p3
worked for my NVMe SSD.
  - I think it’s possible to reuse the old LUKS partition by unlocking it and running wipefs on its device under /dev/mapper/. I didn’t see a reason to do that and simply created new ones.
- New partitions mean new UUIDs, which must be changed in /etc/crypttab and /etc/fstab. Alternatively, the btrfstune program, which is part of btrfs-progs (formerly packaged as btrfs-tools), may be used to change the UUIDs of offline filesystems.
- At some point I migrated to a bigger drive (sidenote: with a clonezilla disk-to-disk clone). I used KDE Partition Manager to resize the BTRFS partition on the new drive to use it fully. KDE Partition Manager has good support for BTRFS and runs btrfs commands underneath to resize the file system. Even so, it “ate” ~200 GB of space: the partition had 930 GB and
btrfs filesystem usage
showed only 700 GB available. I had to resize it manually:
btrfs filesystem resize max /
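Updating UUIDs by hand is error-prone, so here is a minimal sketch of how it could be scripted. replace_uuid is a hypothetical helper, not something from the original setup, and OLD_UUID in the usage comment is a placeholder. Keep in mind that /etc/crypttab refers to the UUID of the LUKS container, while /etc/fstab refers to the UUID of the file system inside it.

```shell
# Hypothetical helper: replace one UUID with another in a config file,
# keeping a .bak copy of the original next to it.
replace_uuid() {
    old=$1
    new=$2
    file=$3
    cp -p -- "$file" "$file.bak" || return 1
    # Write back through redirection so the file keeps its owner and mode.
    sed "s/UUID=$old/UUID=$new/g" "$file.bak" > "$file"
}

# Usage sketch (as root; OLD_UUID is a placeholder for the value being replaced):
#   new=$(blkid -s UUID -o value /dev/nvme0n1p3)              # LUKS container, for crypttab
#   replace_uuid "$OLD_UUID" "$new" /etc/crypttab
#   new=$(blkid -s UUID -o value /dev/mapper/crypt_nvme0n1p3) # file system, for fstab
#   replace_uuid "$OLD_UUID" "$new" /etc/fstab
```

The .bak copies make it easy to diff the result before rebooting.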
Chroot
- I did the initial setup of BTRFS subvolumes in a chroot environment. I ran the following commands to get into my FS:
    # cryptsetup luksOpen /dev/nvme0n1p3 crypt_nvme0n1p3
    # mkdir /mnt/r
    # mount /dev/mapper/crypt_nvme0n1p3 /mnt/r
    # mount /dev/nvme0n1p2 /mnt/r/boot
    # mount /dev/nvme0n1p1 /mnt/r/boot/efi
    # mkdir -p /mnt/r/{dev,media,mnt,proc,run,sys,tmp}
    # for d in dev media mnt proc run sys tmp; do mount --rbind /$d /mnt/r/$d; done
    # chroot /mnt/r
- In the above it was important to mount both /boot and /boot/efi.
- I modified /etc/crypttab and /etc/fstab to reflect the changed UUIDs of the partitions.
- At the end of the chroot session I updated the initramfs and regenerated grub’s configuration:
    # update-initramfs -u -k all
    # update-grub
Mounting File System
- In /etc/crypttab I gave up the trick of unlocking many encrypted devices by passing a keyfile which resides on the first unlocked device. This trick is unsuitable for some RAIDs and multi-disk BTRFS setups, which require all disks to be present to mount a file system properly (sidenote: maybe it works with noearly or some other options which I didn’t investigate). Instead I chose the keyscript approach. Debian (and its derivatives, I guess) ships with the decrypt_keyctl script, which caches a password for 60 seconds and passes it to all encrypted volumes in a configured group (which I called pw1), so if they share the same password, they’ll be unlocked automatically.
    # rootfs
    crypt_nvme0n1p3 UUID=... pw1 luks,discard,keyscript=decrypt_keyctl
    # storage
    crypt_sda1 UUID=... pw1 luks,discard,keyscript=decrypt_keyctl
    # swap
    crypt_sda2 UUID=... pw1 luks,discard,keyscript=decrypt_keyctl
  - You must run update-initramfs after adding a keyscript to crypttab.
- I used the following options in /etc/fstab for BTRFS:
    UUID=... / btrfs subvol=@rootfs,noatime,compress=zstd,discard=async 0 1
  discard=async is the default since kernel 6.2. Apparently it helps to reduce read latencies.
- In the chroot phase it’s easy to forget to mount the BTRFS file system with compression enabled, and compression applies only to newly written files. To re-compress the whole system in this case, just defragment it. Example for zstd:
    btrfs filesystem defragment -r -v -czstd /
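Back to /etc/crypttab for a moment: a stray comma or a missing space there shifts the fields, and the mistake is easy to miss until the next boot. Below is a minimal sketch of a sanity check, assuming every entry is expected to have four fields (name, device, key, options) and to use the same key group. check_crypttab is a hypothetical helper, not part of cryptsetup.

```shell
# Sketch: report crypttab lines that do not have exactly four fields
# or whose key field (3rd column) differs from the expected group name.
# Exit status is non-zero when any problem was found.
check_crypttab() {
    file=$1
    group=$2
    awk -v group="$group" '
        BEGIN { bad = 0 }
        /^[ \t]*(#|$)/ { next }   # skip comments and blank lines
        NF != 4 {
            printf "line %d: expected 4 fields, got %d\n", NR, NF
            bad = 1
            next
        }
        $3 != group {
            printf "line %d: key field is %s, not %s\n", NR, $3, group
            bad = 1
        }
        END { exit bad }
    ' "$file"
}

# Usage sketch:
#   check_crypttab /etc/crypttab pw1 && echo "crypttab looks consistent"
```

It only checks the shape of the file; whether the volumes really share a passphrase still has to be verified by unlocking them.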
BTRFS Subvolumes
- I followed the openSUSE recommendations regarding subvolume layout: @rootfs for the root and nested subvolumes for /home, /opt, /root, /srv, /tmp, /usr/local and /var (which also has copy-on-write disabled: chattr +C /var).
  - A BTRFS snapshot doesn’t include nested subvolumes. It’s a feature, not a bug: with the above layout, rolling back @rootfs leaves /home, /var and the others untouched, which prevents data loss.
  - 2024-04-24: I usually don’t create a subvolume for /root, because I don’t use the root account directly and don’t store any files there.
- The default subvolume is named @rootfs because I have read somewhere that this is what Debian calls its own default subvolume. I figured that this would make it more compatible with whatever features Debian cooks up for BTRFS.
- I configured snapper to take automatic snapshots of @rootfs (which excludes nested subvolumes). It takes them on several occasions: on boot, before and after apt upgrade, and once every hour. It performs auto-cleanup of old snapshots once a day.
  - snapper is a wonderful program which you set up once and then forget about while it does its work.
  - snapper lists snapshots very slowly when BTRFS quota is enabled.
- snapper creates snapshots in the /.snapshots directory. It’s a good idea to create a root-level snapshots subvolume and mount it at /.snapshots via fstab, because it should ease recoveries from grub (I haven’t had the opportunity to test this mechanism yet):
    mkdir /mnt/foo
    mount -o subvolid=5 /dev/... /mnt/foo
    cd /mnt/foo
    btrfs subvolume create @snapshots
    # and in /etc/fstab:
    UUID=... /.snapshots btrfs subvol=@snapshots,compress=zstd 0 0
- grub-btrfs adds snapshots to the grub menu, allowing you to boot into them.
- btrfs-assistant is a well-done GUI for btrfs management.
- I disabled the quota feature, which apparently is responsible for a lot of slowdowns:
    btrfs quota disable <path>
  I did it for all of my subvolumes.
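For reference, the nested subvolume layout from the beginning of this section could be created roughly like this. It is a sketch under assumptions, not necessarily the exact commands used for this migration: it assumes @rootfs is already mounted at /mnt/r and still empty, and the run and create_nested_subvolumes helpers are made up for illustration. By default it only prints the commands.

```shell
# Sketch only: by default this prints the commands instead of running them.
# Set DRY_RUN=0 (as root, on a freshly created file system) to execute.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the nested subvolumes inside an already mounted @rootfs.
# $1 is the mount point of @rootfs (/mnt/r is the path used in this post).
create_nested_subvolumes() {
    root=$1
    # /usr must exist before /usr/local can be created as a subvolume.
    run mkdir -p "$root/usr" || return 1
    for d in home opt root srv tmp usr/local var; do
        run btrfs subvolume create "$root/$d" || return 1
    done
    # Disable copy-on-write for files created under /var from now on.
    run chattr +C "$root/var"
}

create_nested_subvolumes /mnt/r
```

The subvolumes have to be created before data is copied in, because btrfs subvolume create refuses to overwrite an existing directory. chattr +C only affects files created afterwards, which is another reason to set it on an empty /var.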
Conversion from ext4
- The following procedure worked for converting a simple ext4 filesystem on my other server:
    fsck.ext4 -f /dev/sda1
    btrfs-convert /dev/sda1
- To create the subvolume structure mentioned above I used a trick which doesn’t use any additional disk space:
    mount /dev/sda1 /mnt/r && cd /mnt/r
    btrfs subvolume snapshot . @rootfs
    cd @rootfs
    rm -rf home opt root srv tmp usr/local var
  Note that the paths given to rm are relative to the @rootfs snapshot; absolute paths would delete the directories of the running system. Now for each removed directory we can create a snapshot and remove or move all unwanted contents. For example (beware of the Bashism in the first line!):
    shopt -s extglob
    btrfs subvolume snapshot .. ./home
    cd home
    rm -rf -- !(home)
    mv home/* . ; mv home/.* . ; rmdir home
- If /boot is on btrfs after the conversion, remember to reinstall grub:
    grub-install /dev/sda
    update-grub
- I screwed this up initially, but the rollback with btrfs-convert -r /dev/sda1 worked flawlessly.
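The extglob pattern above is bash-only. A rough POSIX-shell equivalent of the “delete everything except one directory, then move its contents up” step can be built with find, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do). keep_only_and_hoist is a hypothetical helper, and the demo runs on plain directories in a temporary location, not on real subvolumes.

```shell
# Hypothetical helper: inside directory $1, delete everything except the entry
# named $2, then move that entry's contents (dotfiles included) up one level.
# The body runs in a subshell, so the caller's working directory is unchanged.
keep_only_and_hoist() (
    cd -- "$1" || exit 1
    # Stand-in for bash's:  rm -rf -- !(home)
    find . -mindepth 1 -maxdepth 1 ! -name "$2" -exec rm -rf -- {} +
    # Stand-in for:  mv home/* . ; mv home/.* .   (never touches . and ..)
    find "./$2" -mindepth 1 -maxdepth 1 -exec mv -- {} . \;
    rmdir -- "./$2"
)

# Scratch demo on ordinary directories.
scratch=$(mktemp -d)
mkdir -p "$scratch/snap/home/alice" "$scratch/snap/etc" "$scratch/snap/var/log"
touch "$scratch/snap/home/.hidden" "$scratch/snap/etc/fstab"
keep_only_and_hoist "$scratch/snap" home
ls -A "$scratch/snap"
```

One case it doesn't handle is an entry inside the kept directory with the same name as the directory itself (home/home), so check for that before using it on real data.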
Daily maintenance
- btrfsmaintenance is a “set up once and forget” package for periodic maintenance of BTRFS.
- Its configuration is in /etc/default/btrfsmaintenance. One should edit it and then activate it with
    systemctl restart btrfsmaintenance-refresh.service
  which sets up several systemd timers for tasks like scrubbing and balancing the filesystem.
- I set up monthly scrubbing and balancing on my personal laptop. I disabled defragmentation and trimming.
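A configuration matching the schedule described above might look like the excerpt below. The variable names are the ones I believe the package's default file uses, and the mount points are an assumption; check the comments in your own /etc/default/btrfsmaintenance before copying anything.

```shell
# /etc/default/btrfsmaintenance (excerpt; values are illustrative assumptions)

# Scrub and balance the root file system once a month.
BTRFS_SCRUB_MOUNTPOINTS="/"
BTRFS_SCRUB_PERIOD="monthly"
BTRFS_BALANCE_MOUNTPOINTS="/"
BTRFS_BALANCE_PERIOD="monthly"

# Periodic defragmentation and trimming switched off.
BTRFS_DEFRAG_PERIOD="none"
BTRFS_TRIM_PERIOD="none"
```

After editing the file, restarting btrfsmaintenance-refresh.service as described above should regenerate the timers; systemctl list-timers shows what is actually scheduled.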
Fedora
- Fedora (as of version 40) supports BTRFS at the installer level, but the interface for disk partitioning could be improved; it wasn’t crystal clear to me what I was doing (sidenote: terminology issues, maybe the translation; I used the Polish installer).
- I found it easiest to choose the “semi-automatic” approach (sidenote: sorry, I don’t remember the exact name; not automatic, not gparted-like partitioning, the middle one). I let Fedora use the whole disk with the “BTRFS scheme” (sidenote: I had to manually remove all existing partitions before I started the installer. IIRC this wasn’t the case in Fedora 39.) and then I added the new mountpoints. Fedora automatically creates subvolumes for them (at the root level) and assigns their names.
- Using one of the immutable variants of Fedora may remove my need for snapshots. I’ll evaluate them when I decide to move away from Debian on my main machine (probably when I have to change hardware).
  - Fedora Atomic projects are immutable but not reproducible the way NixOS is, and reproducibility matters more to me. On the other hand, Fedora Atomic doesn’t invent a language with horrible syntax.
  - rpm-ostree: the one component to bind them all.