Hello,
(posting again as plain text, excuse me if you already received the malformed HTML mail.)
-----"Greg KH" gregkh@linuxfoundation.org schrieb: -----
An: t.martitz@avm.de Von: "Greg KH" gregkh@linuxfoundation.org Datum: 17.08.2023 16:43 Kopie: stable@vger.kernel.org, "Al Viro" viro@zeniv.linux.org.uk Betreff: Re: proc_lseek backport request
On Thu, Aug 17, 2023 at 11:22:30AM +0200, t.martitz@avm.de wrote:
Dear stable team,
I'm asking that
commit 3f61631d47f1 ("take care to handle NULL ->proc_lseek()")
gets backported to the stable and LTS kernels down to 5.10.
Background: We are in the process of upgrading our kernels. One target kernel is based on 5.15 LTS.
Here we found that, if proc file drivers do not implement
proc_lseek,
user space crashes easily, because various library routines
internally
perform lseek(2). The crash happens in proc_reg_llseek, where it wants to jump to a NULL pointer.
We could, arguably, fix these drivers to use ".proc_lseek =
no_llseek".
But this doesn't seem like a worthwhile path forward, considering
that
latest Linux kernels (including 6.1 LTS) allow proc_lseek == NULL
again
and *remove* no_lseek. Essentially, on HEAD, it's best practice to
leave
proc_lseek == NULL. Therefore, I ask that the above procfs fix gets backported so that
our
drivers can work across all kernel versions, including latest 6.x.
For obvious technical, and legal reasons, we can not take kernel changes only for out-of-tree kernel modules, you know this :)
So sorry, no, we should not backport this change because as-is, all in-tree code works just fine, right?
The kernel is constantly being changed to support out-of-tree modules, be it kbuild changes or new EXPORT_SYMBOLs (all in-tree modules can use EXPORT_SYMBOL_GPLs right?).
Granted, such changes are typically not backported to stable (probably, haven't checked). I had hoped that you'd be less strict if we talk about a patch that's already in mainline.
But well, we'll cherry-pick this in our tree then and move on. I won't argue about this particular patch further.
Attempting to keep kernel code outside of the kernel tree is, on purpose, very expensive in time and resources. The very simple way to solve this is to get your drivers merged properly into the mainline kernel tree.
Have you submitted your drivers and had them rejected?
Most drivers affected by the above patch are delivered to us by chip vendors that we cannot post publicly without their consent. It's also not our job to get their crappy code (and it's a lot of that!) to a state that meets your quality standards. We can and do ask for mainline drivers but our influence is limited.
Also, would driver code for chips that aren't publicly available any useful for you?
There is also some in-house code affected but that "drivers" don't usually drive hardware but simply provide F!OS-specific proc interfaces (F!OS is the name of the firmware that runs on our devices). These are just software, often device or vendor specific, and not suitable for the wider kernel community. Also we don't have the resources to get our code top-notch for potential mainline inclusion (although it's usually better than the vendor code we receive).
On the positive side, we do realize that mainlining things can be a net win for us long term and we have started an internal process that allows us to selectively mainline portions of our in-house code, but it's limited by resources and therefore a slow process. See [1] for example.
[1] https://lore.kernel.org/all/20230619071444.14625-1-jnixdorf-oss@avm.de/
Have you taken advantage of the projects that are willing to take out-of-tree drivers and get them merged upstream properly for free?
I don't know about any such project. Interesting to hear they exist! Who are they?
Is there anything else preventing your code from being accepted into the upstream kernel tree that we can help with?
Thanks for the offer. I don't think you can help much. We need to get more resources internally for mainlining but we can barely keep up with maintaining our code base for our devices currently.
Best regards, Thomas Martitz
thanks,
greg k-h
I checked that this commit applies and works as expected on a board
that
runs Linux 5.15, and the observed crash goes away.
Furthermore, I investigated that the fix applies to older LTS
kernels, down
to 5.10. The lseek(2) path uses vfs_llseek() which checks for
FMODE_LSEEK. This
has been like that forever since the initial git import. However,
5.4 LTS and
older kernels do not have "struct proc_ops".
Thank you in advance.
Best regards, Thomas Martitz