Wrong kernel output after system update #38
Comments
There may be multiple kernels installed (not old ones, but kernels from different packages). You want to process them all. I wonder how you managed to have files from an old kernel in |
Yes, thanks, this feature is what I missed.
Surely not by copying them manually. One possibility is that at some point I used this pacman hook, which does this kind of thing, and that it introduced the issue. IIRC, I uninstalled the hook because I suspected it was responsible for the issue, but the issue persisted nonetheless (because I didn't think of cleaning up the old kernel modules). AFAIC, I don't think the way that hook works is particularly illegitimate, intrusive or "dirty", and it looks quite heavily used. There was even a topic about moving it from AUR to the community repo.
I concur that it may not be that easy to guess which one is the legitimate kernel, but most importantly, the current behavior of sbupdate when this happens is undefined, which surely is not expected: a crash with a clear error message like "you have multiple versions of the same kernel, please remove one" would be better than potential silent failures that may break the boot. That's why I think it is sbupdate's responsibility to either inform the user that something looks wrong, let the user pick the kernel version they want to sign, or pick one version arbitrarily (the most recent one sounds to me like a reasonable trade-off). I've been struggling with this issue for a while, and as it is inconsistent (it depends on which package you upgrade) and unrelated to signatures, it was quite hard to figure out that sbupdate was actually involved. On top of that, something did go wrong in sbupdate, as it was overwriting the same kernel within one run, which looks very much like a bug. |
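To make the "fail with a clear error message" option concrete, here is a minimal Python sketch. It is not sbupdate's code (sbupdate is a shell script) and the function names are hypothetical. It assumes the Arch layout in which each directory under /usr/lib/modules contains a pkgbase file naming the kernel package.

```python
from collections import defaultdict
from pathlib import Path


def group_kernels_by_pkgbase(modules_dir):
    """Map each pkgbase name to the sorted kernel version directories declaring it."""
    groups = defaultdict(list)
    # Assumption: every installed kernel has <modules_dir>/<version>/pkgbase.
    for pkgbase_file in Path(modules_dir).glob("*/pkgbase"):
        name = pkgbase_file.read_text().strip()
        groups[name].append(pkgbase_file.parent.name)
    return {name: sorted(versions) for name, versions in groups.items()}


def check_unique_kernels(modules_dir):
    """Raise instead of silently picking one when a pkgbase has several versions."""
    groups = group_kernels_by_pkgbase(modules_dir)
    duplicates = {n: v for n, v in groups.items() if len(v) > 1}
    if duplicates:
        details = "; ".join(
            f"{n}: {', '.join(v)}" for n, v in sorted(duplicates.items())
        )
        raise RuntimeError(
            f"multiple versions of the same kernel found ({details}); "
            "please remove one"
        )
    return groups
```

Calling check_unique_kernels("/usr/lib/modules") would then either return the pkgbase-to-version mapping or stop with a message naming the conflicting versions, rather than overwriting one image with another.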
I use the exact same hook and never had issues like yours, so this may be unrelated. |
After trying it out, this hook does put the two consecutive kernel versions in |
I would rather know if I booted a kernel that doesn't have corresponding modules in |
Their cleanup service doesn't remove the modules of the currently loaded kernel (thanks to the |
I have also built my own kernels for every single release for several years, and I can assure you I know what kernel version I boot because I must test them. |
I don't know about your specific workflows. As said in the first post, if sbupdate is triggered from the hook following the installation of a kernel package, it'll properly detect the latest kernel and you would fail to reproduce the issue. You need to run sbupdate manually or trigger it from another hook (e.g. upgrade systemd) after having installed the kernel. From a purely programmatic standpoint, the issue is crystal clear, easy to reproduce and requires no assumption on the user's setup. When run outside of a kernel upgrade hook, sbupdate:
This is very easy to reproduce and very easy to understand by just scrolling through the code, and I just cannot believe it is the intended or desired behavior, regardless of the reasons that led to having multiple versions of the same kernel in /usr/lib/modules. |
Which I did hundreds of times over all the years I have used both tools.
The error you showed comes from the initramfs. The initramfs is created by mkinitcpio (or an alternative), which copies modules from the latest installed kernel (unless you claim mkinitcpio is affected by the same issue and confuses kernel versions the same way sbupdate does). Therefore, an old kernel copied by sbupdate cannot work with the initramfs created by mkinitcpio inside the bundled image. It will always fail as you posted in the first comment.
This won't work because, after the kernel hook restores the old kernel backup, that backup will be the latest one (based on lstat) instead of the one coming from the package.
It requires an assumption about the correct action to take when multiple versions of the same kernel exist on the system. Even if you think the issue is obvious, the correct action isn't. It's not always correct to choose the kernel with the higher version (even if compared numerically instead of alphabetically) because the user may have downgraded. It's not always correct to choose the latest timestamp because the older version may have been created later, as in the hook backup case mentioned above. It's possible there is no correct action that works all the time, especially in irregular circumstances. The question is whether a scenario with multiple instances of the same kernel can be supported. I'm skeptical that it can, at least not with the solutions you provided. |
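As a small illustration of the "numerically instead of alphabetically" point, the Python sketch below compares two made-up version strings. The numeric_key helper is hypothetical and much simpler than pacman's own version comparison; it only shows why plain string ordering is misleading.

```python
import re


def numeric_key(version):
    """Split a version into digit runs (as ints) and the text between them.

    Assumes every compared version starts with a digit, so ints are only
    ever compared with ints and strings with strings.
    """
    return [
        int(part) if part.isdigit() else part
        for part in re.findall(r"\d+|\D+", version)
    ]


# Illustrative version strings, not taken from the report.
versions = ["5.9.1-arch1-1", "5.10.1-arch1-1"]

# Plain string comparison ranks "5.9..." above "5.10..." because "9" > "1".
alphabetical_latest = max(versions)

# Comparing the numeric components ranks 5.10 above 5.9.
numeric_latest = max(versions, key=numeric_key)

print(alphabetical_latest)
print(numeric_latest)
```

Even the numeric ordering only answers "which version is higher", which, as noted above, is not necessarily the kernel the user wants after a downgrade.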
@gilbsgilbs Thanks for the detailed report. sbupdate looks specifically for kernel images, like There can be multiple kernels installed, like It is expected behavior for sbupdate to process all installed kernels under the above assumptions. In your case you had two On the sbupdate side, it makes sense to display an error if multiple |
@andreyv you summed it up perfectly. Just having sbupdate raise an error in such a case would look like intended behavior and would have saved me a lot of hassle. While working through the issue, though, I noticed that there is at least one actual real-world use case where one would have multiple versions of the same "pkgbase" in /usr/lib/modules: kernel-modules-hook. For users of this hook, sbupdate's behavior is already shady but will very rarely cause any apparent issue. If you decide to consistently raise an error, sbupdate may just stop working for them. You can decide this is fine or take an alternative approach such as:
Let me know if you need some help. I can also submit a PR. |
This will cause more harm than good, as every single kernel-module-hook user will then have it break at every update, and even the OP admitted:
Currently it works without issues 99% of the time. I wholly disagree that this is a fine approach.
This sounds promising: in case there is more than one kernel version per pkgbase, always choose the one coming from a package. I wonder how hard it would be to implement.
Do you mean if there are many kernels for the same pkgbase owned by actual packages? How can this happen? This wouldn't help in your case.
As explained earlier this isn't correct.
Do you mean after every kernel upgrade you will go to |
That's what I meant. You're right in saying it wouldn't have helped in my specific case (filtering kernels belonging to a package would have been sufficient). The only case I can think of where this would happen is if the user installed two kernels with the same "pkgbase" that weren't properly marked as conflicting, but I agree it sounds already dubious enough that it may be safe to just ignore that case. Though it always feels safer when a program is self-aware of its own limitations and is able to detect inconsistent states (rather than breaking the system by doing something wrong).
There may be other ways: e.g. define some rules in That said, I agree with all your points. I tried to come up with multiple approaches so as not to restrict the discussion, but I think the first two are superior (leaving aside details of error handling). Of the two, the first one might be the easiest to implement, yet slightly less flexible and a bit more prone to breaking existing setups. Both solutions would work seamlessly with kernel-module-hook, would have prevented my original issue, shouldn't break any real-world setup, and I think they'd improve sbupdate's behavior overall. |
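The "only consider kernels that come from a package" idea discussed above could be sketched roughly as follows in Python. This is an illustration under assumptions, not the change that ended up in sbupdate: it assumes each kernel directory contains a pkgbase file, that copies left behind by a hook are not owned by any package, and that pacman -Qqo exits with a non-zero status for unowned files. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def owned_by_package(path):
    """Return True if pacman reports that an installed package owns `path`."""
    result = subprocess.run(
        ["pacman", "-Qqo", str(path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def package_kernels(modules_dir="/usr/lib/modules", is_owned=owned_by_package):
    """Map pkgbase -> kernel versions, skipping directories no package owns."""
    kernels = {}
    for pkgbase_file in sorted(Path(modules_dir).glob("*/pkgbase")):
        if not is_owned(pkgbase_file):
            continue  # e.g. a backup copy restored by a hook
        name = pkgbase_file.read_text().strip()
        kernels.setdefault(name, []).append(pkgbase_file.parent.name)
    return kernels
```

The is_owned parameter exists only so the filtering logic can be exercised without pacman. A pkgbase that still maps to more than one version after filtering would be the "dubious" case mentioned above and could be reported as an error.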
This should be fixed in 4e6d106. You can test the |
Thanks @andreyv. Just tested it and it worked well (both with and without the hook). Very smart and clean solution, I only see upsides to it. |
Alright, thanks. |
After thinking about it a bit more, I suspect this solution has an important security drawback related to #36. An attacker with write access to the |
See the comment in the linked issue. Previously the same conditions already applied to Probably the related warning in README should be made more prominent. |
What's the issue:
Sometimes, sbupdate outputs the wrong kernel in the /boot partition, which results in being unable to boot with an error similar to this:
See this thread on BBS for more context.
How to reproduce the issue:
(don't try it unless you know what you're doing, it's annoying)
/var/lib/modules/ along with your current kernel
pacman -S systemd (I guess it'd also reproduce when running just the sbupdate command, but note it does not reproduce with pacman -S linux)
Additional info:
As I understand the problem, I think the following events happen:
Some remarks:
linux package because it catches the new kernel file from standard input, so there's no confusion possible. This trick doesn't work when the hook triggers from a different package such as systemd.
/boot partition. Am I missing something?
/var/lib/modules/ could be made configurable somehow.
Thanks.