-
-
Notifications
You must be signed in to change notification settings - Fork 14.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zfs: warnIf kernel is not LTS, add LTS flag to kernels #361573
base: master
Are you sure you want to change the base?
Conversation
cc52cd8
to
1f04cdc
Compare
Looks good to me. I probably wouldn’t have set up my system with zfs last week if this warning had been there. |
I am "just" supporting a non mainline kernel, but also as ZFS user i like the change, as it makes users explicitly aware, that ZFS is likely just working stable with LTS kernel variants (as otherwise people might have to expect downgrades due to EOL dates) and if their either dont care or can handle the situation on their own, they can opt-in. I assume non mainline variants will have to set the flags on their own according to their supported variants, correct (which is totally fine for me, just asking to that i dont get it wrong)?
why that? zfs works totally fine with LTS kernels and as long as you dont have bleeding edge hardware that really requires a super recent kernel there is no issue with that IMHO. I'd judge more about the use cases and not about the kernel version, if there is not really a use case for having zfs in your setup, it might be not the best ideas anyways in the first place. |
This is getting into off-topic, but yes. My usecase is that I want checksums, snapshoting and rollbacks for an impermanence setup. Also my thinkpad is so new that it has no wifi on 6.6. |
Yes indeed. I'll let the xanmod maintainers know to set the flag accordingly ;) I'll also take a look at all the other variants that might have LTS-based kernels.
Good that you mention it: It's implied but I wasn't sure whether we should call it out explicitly that using ZFS as your filesystem is simply not a valid option if you require a kernel newer than the latest LTS due to hardware support; that there is also the option of not using ZFS. |
05341fd
to
b5bc67e
Compare
@@ -133,3 +133,8 @@ This section was moved to the [Nixpkgs manual](https://nixos.org/nixpkgs/manual# | |||
It's a common issue that the latest stable version of ZFS doesn't support the latest | |||
available Linux kernel. It is recommended to use the latest available LTS that's compatible | |||
with ZFS. Usually this is the default kernel provided by nixpkgs (i.e. `pkgs.linuxPackages`). | |||
|
|||
:::{.note} | |||
Using non-LTS releases with ZFS is *not supported* because it is likely to break. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this wording is a bit strong. It makes it seem like ZFS will malfunction with a non-LTS kernel. But the only breakage is an error at eval time, so the user can pick another kernel or otherwise address the issue.
I think it's reasonable to say NixOS "supports" any kernel+ZFS combination that doesn't fail to eval, to the extent that "support" means the kernel module should work. If I understand correctly, the "support" that's being dropped is just for those cases that would cause eval failures anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"support" generally implies that it can be expected to keep working, and in this case that's not true. Somebody could have hardware that's too new for the LTS kernel, but the only newer kernels supported by ZFS might have gone EOL and been removed from Nixpkgs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that ZFS works with the latest non-LTS kernel for like half the year and doesn't for the other half.
At install time, it might just happen to be the time of the year where the newest non-LTS kernel is supported by ZFS, so the user installs with no errors and all is good but then a few months later it's EOL'd and ZFS won't support the next kernel for at least a few months. The user is then left with a system that doesn't eval anymore which I'd consider breakage.
The problem is not how we handle the breakage induced by kernel EOLs but rather that users are not initially made aware that this breakage can happen and that it's actually very likely to happen. The purpose is mostly to set the expectations right from the beginning; inform users to do the right thing before their systems break and not have a bad awakening a few months down the line.
As @maralorn mentioned for instance, he wouldn't have gotten himself into the predicament where he will soon have to choose between filesystem or WiFi working had this warning existed when he installed NixOS on his new laptop last week and still had the choice of not using ZFS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand all that, and I agree with the general premise of this PR. I am commenting only on this one line:
Using non-LTS releases with ZFS is not supported because it is likely to break.
As a user, this line makes me think that using ZFS with a non-LTS kernel could cause crashes or data corruption on NixOS. That's not the case, since any actual incompatibility should be caught at eval. I think this line could be rephrased so that it doesn't imply more than intended.
I've used ZFS with a vendor kernel for a couple years, so I understand the pitfalls. I think we can simultaneously say that using non-LTS kernels with ZFS is not a supported config while also saying that running non-LTS kernels with ZFS can be expected to work as long as the build successfully evals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question is: what does this statement add that isn’t already said in the previous paragraph plus the new eval warning (plus the NixOS wiki)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have there been a lot of issues getting opened? If so, I haven’t seen too many. More so that folks are confused, rather than demanding that it works. “It’s not supported by upstream” is a pretty easy way to end the issue, IMO. And I think that the eval warning will go a long way to reducing confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line makes me think that using ZFS with a non-LTS kernel could cause crashes or data corruption on NixOS
Right, now I get it. Yes, that's indeed not the intention but I also didn't want to re-iterate the entire eval warning here.
We should find a wording that captures the kind of breakage that is to be expected here.
My question is: what does this statement add that isn’t already said in the previous paragraph plus the new eval warning (plus the NixOS wiki)?
Nothing. I only added it to the manual for completeness' sake.
While the Wiki is sort of official now, I feel like the manual is the more appropriate place to say "this is how it is" whereas the Wiki is more of a "this is how you do x"/"this is how you work with x".
Have there been a lot of issues getting opened? If so, I haven’t seen too many. More so that folks are confused, rather than demanding that it works.
Preventing said confusion pre-emptively is the one and only purpose.
Ack mainline kernel changes. |
LGTM, though please rebase as to not include de version upgrades of kernels. (Just merged into the main branch) |
We might actually want to extend this to other kernel modules (e.g. Nvidia would be a good candidate) as those are likely to have the same ailment. Though to their credit, Nvidia usually at least gets an update out for the next kernel before the previous one is EOL. Still, you shouldn't for example use There are OOT kernel module which only depend on extremely stable kernel interfaces that can be used with practically any kernel but most OOT modules should probably only be used/supported on LTS kernels. I'll keep this to ZFS for this PR though as it's the most prominent case. |
A rebase is not necessary/wouldn't do anything. You only still see alyssa's commits due to a quirk of the GH UI because it doesn't re-calculate the diff if master is changed. If I pushed the exact same commit ID now, GH would re-calculate against the new master and not show alyssa's commits as new changes. The merge commit would be the exact same no matter what the GH UI shows. |
|
||
:::{.warning} | ||
This is explicitly *unsupported*. Do not create issues regarding ZFS | ||
kernel support if you have enabled this option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type = lib.types.package; | ||
description = "Configured ZFS kernel module package."; | ||
}; | ||
|
||
allowNonLTS = lib.mkEnableOption '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allowNonLTS = lib.mkEnableOption '' | |
allowNonLTSKernel = lib.mkEnableOption '' |
ZFS itself has the concept of “LTS”, so IMO this should be disambiguated.
I think this is over the top. A trace warning that the combination is fragile in certain circumstances is fine, making people set an option that they'll behave well borders on bad parenting practices. Instead of telling people what not to do it would help to link people to descriptions of what they can do from where they are. |
A push would be nice nonetheless to improve clarity of what is “really” in this PR (but yes, GitHub does Git pretty badly). |
I don't really see the benefit in this. It would be cool to have metadata about which kernels are LTS or not. Otherwise I consider this fixing the symptom not the problem, and users will continue to have issues. Throughout the years various people have tried multiple options to solve the kernel support being removed for the latest zfs compatible version, and yet we have only a history of awkward workarounds and footguns. I'll kindly ask again, can we mark kernels as deprecated/insecure for a short time prior to tree removal? I'll even offer to remove the kernel from the tree once we have a zfs package that can move past it. |
@@ -30,6 +30,7 @@ let | |||
, extraPatches ? [] | |||
, rev ? "zfs-${version}" | |||
, kernelCompatible ? null | |||
, allowNonLTS ? false # Whether to allow usage with unsupported kernels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is less allow, and more perhaps warnIfNonLTS ? true
.
Without the option, though, there’s no way to turn off the warning (and it is just a warning) (and that’s all the option really does: disable the warning). Unless I’m missing something. I do think there is a bit much duplication of the documentation of “no really, you probably don’t want to do this”, and the language a bit strong, where we only need to nudge. In general I’m in favor of nudges away from non-LTS Kernels with ZFS (shouldn’t be surprising to many). Others seem to not like this, though. I wonder if it’s simpler to just add a warning or trace (including perhaps reference to docs) when the existing I do agree that old Kernels getting marked insecure for a short time would be nice for folks that want that trade-off. |
Yeah, perhaps it should just say something more to the effect of "Don't expect upstream support, you're on you're own if you choose to ignore this warning.". The intention of this is twofold though; it's also to accentuate the fact that doing this is a really bad idea.
Note that the entire purpose of this is to prevent people from getting into that situation in the first place by informing them of the pitfall. This doesn't change anything about what happens after someone fell into the pitfall; it doesn't apply to that case at all. It merely documents the fact that maintainers won't be able to help them out of it. That'd be for another PR to improve. |
Yup, that's the purpose of this PR. We just simply cannot realistically fight the underlying problem, so fighting the symptom is all we can do.
Using an EOL kernel is not at all a great solution. It's also not like this is a rare occurrence; it happens practically every year. You'd be running an EOL kernel for large portions of the year, sometimes multiple months at a time. Even if we kept EOL kernels for a bit longer as an escape hatch (which I'm also in support of), I'd want this warning to exist. I just remembered that this issue actually also bit me at one point. I may have not had to deal with migrating my gaming machine/workstation to and away from ZFS and deal with not being able to update in between had this warning existed back then. |
You are using the ZFS kernel module in combination with a non-LTS kernel: | ||
|
||
${kernel.modDirVersion} | ||
|
||
Because the development cycles of OpenZFS and the Linux kernel do not line | ||
up, historical precedent has show it to be extremely likely that this kernel | ||
will go EOL (and will therefore be removed from Nixpkgs) before an OpenZFS | ||
release is available that supports the (at that time) latest non-LTS kernel. | ||
This is the case for any kernel that is not an LTS release. | ||
|
||
See https://www.kernel.org/ for more information on LTS kernel releases. | ||
|
||
When that happens, you will *no longer be able to update* and will receive | ||
*no support* from NixOS/Nixpkgs maintainers. | ||
|
||
Please use the latest LTS kernel (`pkgs.linuxPackages`) instead if you wish | ||
to continue to use ZFS in the future in a way that NixOS/Nixpkgs supports. | ||
|
||
The ZFS package can be overridden with `allowNonLTS = true` to hide this | ||
warning if you are willing to risk running into the described issue. The | ||
NixOS option `boot.zfs.allowNonLTS` also exists for convenience. **Do not | ||
create issues regarding to ZFS kernel support if you enable this flag; it is | ||
not supported and will not be supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are using the ZFS kernel module in combination with a non-LTS kernel: | |
${kernel.modDirVersion} | |
Because the development cycles of OpenZFS and the Linux kernel do not line | |
up, historical precedent has show it to be extremely likely that this kernel | |
will go EOL (and will therefore be removed from Nixpkgs) before an OpenZFS | |
release is available that supports the (at that time) latest non-LTS kernel. | |
This is the case for any kernel that is not an LTS release. | |
See https://www.kernel.org/ for more information on LTS kernel releases. | |
When that happens, you will *no longer be able to update* and will receive | |
*no support* from NixOS/Nixpkgs maintainers. | |
Please use the latest LTS kernel (`pkgs.linuxPackages`) instead if you wish | |
to continue to use ZFS in the future in a way that NixOS/Nixpkgs supports. | |
The ZFS package can be overridden with `allowNonLTS = true` to hide this | |
warning if you are willing to risk running into the described issue. The | |
NixOS option `boot.zfs.allowNonLTS` also exists for convenience. **Do not | |
create issues regarding to ZFS kernel support if you enable this flag; it is | |
not supported and will not be supported. | |
You have enabled the ZFS kernel module in combination with a non-LTS kernel, | |
which could prevent you from upgrading for days or weeks due to misalignment | |
in upstream release cycles. This configuration may received limited support | |
accordingly. | |
Please use the latest LTS kernel (`pkgs.linuxPackages`), remove zfs, set the | |
NixOS option `boot.zfs.allowNonLTS`, or see more options at | |
https://wiki.nixos.org/wiki/ZFS |
Here's a shorter version that I think is still much too long. Let's keep it short and to the point if we're going to warn users.
I'm still not convinced about this constant warning, and may prefer users opt in explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making the warning opt-in mostly defeats the purpose though, no? Since the main target (IMO) is folks who don’t know about the problem, not folks who do but have accidentally ended up with an unexpected Kernel version.
I would say “Please use the latest LTS kernel” → “Please either use an LTS kernel (e.g. pkgs.linuxPackages
),”. Leading either disam
I would adjust last paragraph to perhaps:
Please use either the latest LTS kernel (e.g.
pkgs.linuxPackages
) or remove ZFS. This warning may be suppressed by settingboot.zfs.allowNonLTS = true
. For more options and details, see https://wiki.nixos.org/wiki/ZFS.
So, I think this is the result of a longer discussion at the past NixOS Darmstadt meetup this Monday. |
I don't see how this helps. ZFS can take arbitrarily long to support a new kernel, and often takes a long time indeed. Keeping an old kernel around for some pre-agreed amount of time is still likely to result in having no kernel newer than LTS that works with ZFS. Also, insecure stuff isn't built on Hydra. |
Folks keep asking for it 🤷♂️. I think the biggest argument against it is that unless we align on keeping it around as insecure for at least as long as it’s the latest ZFS, there may still be misalignment and folks end up in the same position. So yea, it would probably be nice some of the time for some people (to avoid having to manually copy-paste stuff or w.e. I guess). But indeed I think it’s fair to say they’re already doing unsupported or insecure things and so can put up with manually managing it themselves. I know some disagree, but that’s the reality of what ZFS upstream provides. That’s why I generally like the nudge towards using LTS only with ZFS—not doing so ends up in a special sort of hell. |
After rereading this I also tend to think that the wording here is too alarmistic. Essentially you are trying to do three things here
I would like this PR to be reduced to 1). That would also make it much less wordy. Maybe we can put one sentence into the option description. "You have been warned of this problem. nixpkgs will not keep insecure kernels around when this happens." If this is the position that the kernel maintainers have consensus on and are still getting annoyed by these requests in-spite of the warning. Words like "unsupported" and "will break" seem inappropriate to me, especially at places where they stand without 1). Using stable zfs with a stable kernel, which is officially supported by that zfs version should be supported in nixos. And telling people that their filesystem will break is very scary and while technically true if you mean "breaking eval" it is the wrong wording imo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I prefer the toned down versions.
I second the suggestion to rename the option to warnIfNonLTSKernel
.
Besides that I have a suggestion for one paragraph change.
EDIT: I should make clear that I am very much from the bikeshedding department here. I have no experience, stakes or authority to as to the merit of this PR and would like to delegate that to the maintainers of the relevant packages. I just had opinions about the wording.
Forgot to change the name, will do. |
16f3cbb
to
bb3e2d7
Compare
}, | ||
"6.12": { | ||
"version": "6.12.1", | ||
"hash": "sha256:06f6y37fi7galj001wwrq5pz3vhdl9nryydf3f4yqwnkdpcb34q1" | ||
"hash": "sha256:06f6y37fi7galj001wwrq5pz3vhdl9nryydf3f4yqwnkdpcb34q1", | ||
"lts": false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'll update itself when updating to the next 6.12 release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Or whenever kernel.org actually marks it as an LTS kernel, they haven't done that on the main page yet where the update script draws its info from.)
This serves to differentiate between a kernel that is an LTS release and one that is merely a regular "stable" release. This is required for the ZFS kernel module to show a warning when not used in combination with an LTS kernel. As this is a non-breaking change, this will be backported to all supported channels to allow for simpler backports of updates.
These are all LTS kernels
These are based on 6.6.y
The lack of swift support for new kernels in OpenZFS has been a pain point for many years and there is no sign of the situation changing any time soon. This has lead to the creation of the fundamentally flawed zfs.latestCompatibleLinuxPackages and created friction when removing EOL and massively insecure kernels from Nixpkgs because users are forced to rely on them for ZFS support (https://discourse.nixos.org/t/aggressive-kernel-removal-on-eol-in-nixos/23097/). To prevent users from unknowingly running into this issue, this commit adds an eval warning, informing the user that what they're doing is very likely to cause issues down the line and that they will receive no support from us when that happens. This effectively deprecates usage of the ZFS package with non-LTS kernels. A new override option is introduced to disable the warning should the user choose to ignore it as to not annoy them. This trace warning could be upgraded to a throw/assert if it does not show the desired effect on users. I'd have preferred to have this warning only appear after e.g. broken checks but I couldn't think of a good way of implementing that. The idea to add this warning is the result of a discussion at the 2024-12-02 Darmstadt NixOS Meetup between ma27 (Linux kernel package maintainer), hexa, maralorn, a few further ZFS users and yours truly but no ZFS maintainer has been involved until proposing these patches. I myself do not use ZFS anymore (in part due to the lack of upstream support for new kernels) nor maintain it. I have made extensive use of it before though and have seen enough ZFS users run into trap.
bb3e2d7
to
084db74
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK on the LTS stuff from me as well.
|
||
Please see the evaluation warning of the ZFS package for more | ||
information. If you have never seen such a warning, this option very | ||
likely doesn't apply to you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just quote it here again? Seems a little cumbersome to just reference to it when looking something up in the manual.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get that, yeah, but the thing is that it's hard to do this without duplicating the docstring; requiring updating both places.
Since this only applies to those who attempt to use such a configuration and will have seen the detailed warning, I think its fine this way.
Hm, thinking some more about this, perhaps it'd be better to only handle this in the NixOS module. A few have expressed dislike for handing it as an eval warning for the package; would handling it as a NixOS warning be better? That'd also simplify solving the issue @Ma27 just pointed out. |
So, let me quickly share a bunch of thoughts on the whole kernel removal thing: I'm heavily torn between two aspects:
The obvious part is that we have a documentation problem: I did an abysmal job in making it clear from the manual which versions are provided (on which NixOS branches and how long). I mean, I regularly fail to find the section even though I wrote it! With that in place, do people consider this enough of an escape hatch? |
Please see the individual commit messages.
This is based on #361159 as to not cause conflicts with it.
If the kernel maintainers give their ACK for the kernel part, we can merge it before the next kernel update in a separate PR should the ZFS part take longer.
Things done
nix.conf
? (See Nix manual)sandbox = relaxed
sandbox = true
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"
. Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/
)Add a 👍 reaction to pull requests you find important.