nixos/zfs: Improve the ZFS boot process

The upstream systemd services that import ZFS pools contain serious bugs.
The first major problem is that the import service fails if there are no
pools to import. The second is that if a pool is recorded in
/etc/zfs/zpool.cache but has since disappeared from the system (e.g. you
unplug your ZFS-formatted USB pen drive during a reboot), the import
service fails on every boot, and the pool cannot be removed from the cache
short of manually deleting the cache file.

Also, the upstream service always imports all available ZFS pools on every
boot, which is not always what is desired.

This commit will solve these problems in the following ways:

1. Ignore /etc/zfs/zpool.cache. It seems to be a major source of
issues and does not fit NixOS's philosophy of reproducible
configurations. Instead, on every boot NixOS tries to import the set of
pools specified in its configuration. This is also the direction in which
upstream is moving.

2. Instead of trying to import all ZFS pools, only import those that are
actually necessary. NixOS determines these automatically from the
config.fileSystems.* options. Users can import additional pools on every
boot by adding them to the config.boot.zfs.extraPools option, but this is
only necessary if the pools' filesystems are not specified in
config.fileSystems.*.

3. Add options to control whether ZFS pools are force-imported
(boot.zfs.forceImportRoot and boot.zfs.forceImportAll). Force-importing may
currently be necessary, especially if your pools were not imported with a
proper host id configuration (which is probably true for 99% of current
NixOS ZFS users). Once host id configuration becomes mandatory when using
ZFS in NixOS and we are sure that most users have updated their
configurations and rebooted at least once, we should disable force-import by
default. This probably shouldn't be done before the next stable release.
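
As a rough illustration of points 2 and 3, a configuration using the new
options might look like the sketch below. The pool names "rpool" and "tank"
and the dataset name are made up for this example; the option names are the
ones introduced by this commit.

```nix
{
  # Hypothetical root dataset on a pool named "rpool". Because it is listed
  # in fileSystems, its pool is imported without being named anywhere else.
  fileSystems."/" = {
    device = "rpool/root/nixos";
    fsType = "zfs";
  };

  # Hypothetical pool "tank" whose datasets are managed only with zfs
  # commands, so it must be named explicitly to be imported on every boot.
  boot.zfs.extraPools = [ "tank" ];

  # Both options default to true in this commit. Setting them to false
  # restores the ZFS import safeguards. Enabling forceImportAll requires
  # forceImportRoot to be enabled as well (enforced by an assertion).
  boot.zfs.forceImportRoot = false;
  boot.zfs.forceImportAll = false;
}
```

If the root pool then fails to import, booting once with zfs_force=1 on the
kernel command line forces the import, as described in the option
documentation.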

WARNING: This commit may change the order in which your ZFS and non-ZFS
filesystems are mounted. To avoid this problem (now or in the future),
it is recommended that you set the 'mountpoint' property of your ZFS
filesystems to 'legacy' and manage them through config.fileSystems, just
as non-ZFS filesystems are usually managed in NixOS.
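
A minimal sketch of that recommendation, assuming a hypothetical dataset
"tank/home" that should be mounted at /home (both names are placeholders):

```nix
{
  # Assumes the dataset was switched to legacy mounting beforehand with:
  #   zfs set mountpoint=legacy tank/home
  # After that, it is declared like any other filesystem, and NixOS
  # imports the pool "tank" automatically because the dataset is listed here.
  fileSystems."/home" = {
    device = "tank/home";
    fsType = "zfs";
  };
}
```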
Ricardo M. Correia 2014-10-22 19:17:21 +02:00
parent 1fea5866ae
commit 12e77fdc3f
3 changed files with 179 additions and 25 deletions

@ -1,11 +1,10 @@
{ config, lib, pkgs, ... }:
{ config, lib, pkgs, utils, ... }:
#
# todo:
# - crontab for scrubs, etc
# - zfs tunables
# - /etc/zfs/zpool.cache handling
with utils;
with lib;
let
@ -31,6 +30,20 @@ let
zfsAutoSnap = "${autosnapPkg}/bin/zfs-auto-snapshot";
datasetToPool = x: elemAt (splitString "/" x) 0;
fsToPool = fs: datasetToPool fs.device;
zfsFilesystems = filter (x: x.fsType == "zfs") (attrValues config.fileSystems);
isRoot = fs: fs.neededForBoot || elem fs.mountPoint [ "/" "/nix" "/nix/store" "/var" "/var/log" "/var/lib" "/etc" ];
allPools = unique ((map fsToPool zfsFilesystems) ++ cfgZfs.extraPools);
rootPools = unique (map fsToPool (filter isRoot zfsFilesystems));
dataPools = unique (filter (pool: !(elem pool rootPools)) allPools);
in
{
@ -42,24 +55,82 @@ in
default = "";
example = "0xdeadbeef";
description = ''
ZFS uses a system's hostid to determine if a storage pool (zpool) is
native to this system, and should thus be imported automatically.
ZFS uses a system's hostid to determine if a storage pool (zpool) has been
imported on this system, and can thus be used again without reimporting.
Unfortunately, this hostid can change under linux from boot to boot (by
changing network adapters, for instance). Specify a unique 32 bit hostid in
hex here for zfs to prevent getting a random hostid between boots and having to
manually import pools.
manually and forcibly reimport pools.
'';
};
boot.zfs.useGit = mkOption {
type = types.bool;
default = false;
example = true;
description = ''
Use the git version of the SPL and ZFS packages.
Note that these are unreleased versions, with less testing, and therefore
may be more unstable.
'';
boot.zfs = {
useGit = mkOption {
type = types.bool;
default = false;
example = true;
description = ''
Use the git version of the SPL and ZFS packages.
Note that these are unreleased versions, with less testing, and therefore
may be more unstable.
'';
};
extraPools = mkOption {
type = types.listOf types.str;
default = [];
example = [ "tank" "data" ];
description = ''
Name or GUID of extra ZFS pools that you wish to import during boot.
Usually this is not necessary. Instead, you should set the mountpoint property
of ZFS filesystems to <literal>legacy</literal> and add the ZFS filesystems to
NixOS's <option>fileSystems</option> option, which makes NixOS automatically
import the associated pool.
However, in some cases (e.g. if you have many filesystems) it may be preferable
to exclusively use ZFS commands to manage filesystems. If so, since NixOS/systemd
will not be managing those filesystems, you will need to specify the ZFS pool here
so that NixOS automatically imports it on every boot.
'';
};
forceImportRoot = mkOption {
type = types.bool;
default = true;
example = false;
description = ''
Forcibly import the ZFS root pool(s) during early boot.
This is enabled by default for backwards compatibility purposes, but it is highly
recommended to disable this option, as it bypasses some of the safeguards ZFS uses
to protect your ZFS pools.
If you set this option to <literal>false</literal> and NixOS subsequently fails to
boot because it cannot import the root pool, you should boot with the
<literal>zfs_force=1</literal> option as a kernel parameter (e.g. by manually
editing the kernel params in grub during boot). You should only need to do this
once.
'';
};
forceImportAll = mkOption {
type = types.bool;
default = true;
example = false;
description = ''
Forcibly import all ZFS pool(s).
This is enabled by default for backwards compatibility purposes, but it is highly
recommended to disable this option, as it bypasses some of the safeguards ZFS uses
to protect your ZFS pools.
If you set this option to <literal>false</literal> and NixOS subsequently fails to
import your non-root ZFS pool(s), you should manually import each pool with
"zpool import -f &lt;pool-name&gt;", and then reboot. You should only need to do
this once.
'';
};
};
services.zfs.autoSnapshot = {
@ -124,6 +195,13 @@ in
config = mkMerge [
(mkIf enableZfs {
assertions = [
{
assertion = !cfgZfs.forceImportAll || cfgZfs.forceImportRoot;
message = "If you enable boot.zfs.forceImportAll, you must also enable boot.zfs.forceImportRoot";
}
];
boot = {
kernelModules = [ "spl" "zfs" ] ;
extraModulePackages = [ splPkg zfsPkg ];
@ -142,10 +220,20 @@ in
cp -pdv ${zfsPkg}/lib/lib*.so* $out/lib
cp -pdv ${pkgs.zlib}/lib/lib*.so* $out/lib
'';
postDeviceCommands =
''
zpool import -f -a
'';
postDeviceCommands = concatStringsSep "\n" ([''
ZFS_FORCE="${optionalString cfgZfs.forceImportRoot "-f"}"
for o in $(cat /proc/cmdline); do
case $o in
zfs_force|zfs_force=1)
ZFS_FORCE="-f"
;;
esac
done
''] ++ (map (pool: ''
echo "importing root ZFS pool \"${pool}\"..."
zpool import -N $ZFS_FORCE "${pool}"
'') rootPools));
};
boot.loader.grub = mkIf inInitrd {
@ -159,13 +247,57 @@ in
services.udev.packages = [ zfsPkg ]; # to hook zvol naming, etc.
systemd.packages = [ zfsPkg ];
systemd.services = let
getPoolFilesystems = pool:
filter (x: x.fsType == "zfs" && (fsToPool x) == pool) (attrValues config.fileSystems);
getPoolMounts = pool:
let
mountPoint = fs: escapeSystemdPath fs.mountPoint;
in
map (x: "${mountPoint x}.mount") (getPoolFilesystems pool);
createImportService = pool:
nameValuePair "zfs-import-${pool}" {
description = "Import ZFS pool \"${pool}\"";
requires = [ "systemd-udev-settle.service" ];
after = [ "systemd-udev-settle.service" "systemd-modules-load.service" ];
wantedBy = (getPoolMounts pool) ++ [ "local-fs.target" ];
before = (getPoolMounts pool) ++ [ "local-fs.target" ];
unitConfig = {
DefaultDependencies = "no";
};
serviceConfig = {
Type = "oneshot";
RemainAfterExit = true;
};
script = ''
zpool_cmd="${zfsPkg}/sbin/zpool"
("$zpool_cmd" list "${pool}" >/dev/null) || "$zpool_cmd" import -N ${optionalString cfgZfs.forceImportAll "-f"} "${pool}"
'';
};
in listToAttrs (map createImportService dataPools) // {
"zfs-mount" = { after = [ "systemd-modules-load.service" ]; };
"zfs-share" = { after = [ "systemd-modules-load.service" ]; };
"zed" = { after = [ "systemd-modules-load.service" ]; };
};
systemd.targets."zfs-import" =
let
services = map (pool: "zfs-import-${pool}.service") dataPools;
in
{
requires = services;
after = services;
};
systemd.targets."zfs".wantedBy = [ "multi-user.target" ];
})
(mkIf enableAutoSnapshots {
systemd.services."zfs-snapshot-frequent" = {
description = "ZFS auto-snapshotting every 15 mins";
after = [ "zfs-import-scan.service" "zfs-import-cache.service" ];
after = [ "zfs-import.target" ];
serviceConfig = {
Type = "oneshot";
ExecStart = "${zfsAutoSnap} frequent ${toString cfgSnapshots.frequent}";
@ -176,7 +308,7 @@ in
systemd.services."zfs-snapshot-hourly" = {
description = "ZFS auto-snapshotting every hour";
after = [ "zfs-import-scan.service" "zfs-import-cache.service" ];
after = [ "zfs-import.target" ];
serviceConfig = {
Type = "oneshot";
ExecStart = "${zfsAutoSnap} hourly ${toString cfgSnapshots.hourly}";
@ -187,7 +319,7 @@ in
systemd.services."zfs-snapshot-daily" = {
description = "ZFS auto-snapshotting every day";
after = [ "zfs-import-scan.service" "zfs-import-cache.service" ];
after = [ "zfs-import.target" ];
serviceConfig = {
Type = "oneshot";
ExecStart = "${zfsAutoSnap} daily ${toString cfgSnapshots.daily}";
@ -198,7 +330,7 @@ in
systemd.services."zfs-snapshot-weekly" = {
description = "ZFS auto-snapshotting every week";
after = [ "zfs-import-scan.service" "zfs-import-cache.service" ];
after = [ "zfs-import.target" ];
serviceConfig = {
Type = "oneshot";
ExecStart = "${zfsAutoSnap} weekly ${toString cfgSnapshots.weekly}";
@ -209,7 +341,7 @@ in
systemd.services."zfs-snapshot-monthly" = {
description = "ZFS auto-snapshotting every month";
after = [ "zfs-import-scan.service" "zfs-import-cache.service" ];
after = [ "zfs-import.target" ];
serviceConfig = {
Type = "oneshot";
ExecStart = "${zfsAutoSnap} monthly ${toString cfgSnapshots.monthly}";

@ -50,12 +50,23 @@ stdenv.mkDerivation {
enableParallelBuilding = true;
# Remove provided services as they are buggy
postInstall = ''
rm $out/etc/systemd/system/zfs-import-*.service
sed -i '/zfs-import-scan.service/d' $out/etc/systemd/system/*
for i in $out/etc/systemd/system/*; do
substituteInPlace $i --replace "zfs-import-cache.service" "zfs-import.target"
done
'';
meta = {
description = "ZFS Filesystem Linux Kernel module";
longDescription = ''
ZFS is a filesystem that combines a logical volume manager with a
Copy-On-Write filesystem with data integrity detection and repair,
snapshotting, cloning, block devices, deduplication, and more.
snapshotting, cloning, block devices, deduplication, and more.
'';
homepage = http://zfsonlinux.org/;
license = stdenv.lib.licenses.cddl;

@ -47,6 +47,17 @@ stdenv.mkDerivation {
enableParallelBuilding = true;
# Remove provided services as they are buggy
postInstall = ''
rm $out/etc/systemd/system/zfs-import-*.service
sed -i '/zfs-import-scan.service/d' $out/etc/systemd/system/*
for i in $out/etc/systemd/system/*; do
substituteInPlace $i --replace "zfs-import-cache.service" "zfs-import.target"
done
'';
meta = {
description = "ZFS Filesystem Linux Kernel module";
longDescription = ''