Dear stable team,
I'm asking that
commit 3f61631d47f1 ("take care to handle NULL ->proc_lseek()")
gets backported to the stable and LTS kernels down to 5.10.
Background: We are in the process of upgrading our kernels. One target kernel is based on 5.15 LTS.
Here we found that, if proc file drivers do not implement proc_lseek, user space crashes easily, because various library routines internally perform lseek(2). The crash happens in proc_reg_llseek, where it wants to jump to a NULL pointer.
We could, arguably, fix these drivers to use ".proc_lseek = no_llseek". But this doesn't seem like a worthwhile path forward, considering that latest Linux kernels (including 6.1 LTS) allow proc_lseek == NULL again and *remove* no_lseek. Essentially, on HEAD, it's best practice to leave proc_lseek == NULL. Therefore, I ask that the above procfs fix gets backported so that our drivers can work across all kernel versions, including latest 6.x.
I checked that this commit applies and works as expected on a board that runs Linux 5.15, and the observed crash goes away.
Furthermore, I investigated that the fix applies to older LTS kernels, down to 5.10. The lseek(2) path uses vfs_llseek() which checks for FMODE_LSEEK. This has been like that forever since the initial git import. However, 5.4 LTS and older kernels do not have "struct proc_ops".
Thank you in advance.
Best regards, Thomas Martitz
On Thu, Aug 17, 2023 at 11:22:30AM +0200, t.martitz@avm.de wrote:
Dear stable team,
I'm asking that
commit 3f61631d47f1 ("take care to handle NULL ->proc_lseek()")
gets backported to the stable and LTS kernels down to 5.10.
Background: We are in the process of upgrading our kernels. One target kernel is based on 5.15 LTS.
Here we found that, if proc file drivers do not implement proc_lseek, user space crashes easily, because various library routines internally perform lseek(2). The crash happens in proc_reg_llseek, where it wants to jump to a NULL pointer.
We could, arguably, fix these drivers to use ".proc_lseek = no_llseek". But this doesn't seem like a worthwhile path forward, considering that latest Linux kernels (including 6.1 LTS) allow proc_lseek == NULL again and *remove* no_lseek. Essentially, on HEAD, it's best practice to leave proc_lseek == NULL. Therefore, I ask that the above procfs fix gets backported so that our drivers can work across all kernel versions, including latest 6.x.
For obvious technical, and legal reasons, we can not take kernel changes only for out-of-tree kernel modules, you know this :)
So sorry, no, we should not backport this change because as-is, all in-tree code works just fine, right?
Attempting to keep kernel code outside of the kernel tree is, on purpose, very expensive in time and resources. The very simple way to solve this is to get your drivers merged properly into the mainline kernel tree.
Have you submitted your drivers and had them rejected?
Have you taken advantage of the projects that are willing to take out-of-tree drivers and get them merged upstream properly for free?
Is there anything else preventing your code from being accepted into the upstream kernel tree that we can help with?
thanks,
greg k-h
I checked that this commit applies and works as expected on a board that runs Linux 5.15, and the observed crash goes away.
Furthermore, I investigated that the fix applies to older LTS kernels, down to 5.10. The lseek(2) path uses vfs_llseek() which checks for FMODE_LSEEK. This has been like that forever since the initial git import. However, 5.4 LTS and older kernels do not have "struct proc_ops".
Thank you in advance.
Best regards, Thomas Martitz
From: Greg KH
Sent: Thursday, August 17, 2023 3:43 PM
On Thu, Aug 17, 2023 at 11:22:30AM +0200, t.martitz@avm.de wrote:
Dear stable team,
I'm asking that
commit 3f61631d47f1 ("take care to handle NULL ->proc_lseek()")
gets backported to the stable and LTS kernels down to 5.10.
Background: We are in the process of upgrading our kernels. One target kernel is based on 5.15 LTS.
Here we found that, if proc file drivers do not implement proc_lseek, user space crashes easily, because various library routines internally perform lseek(2). The crash happens in proc_reg_llseek, where it wants to jump to a NULL pointer.
We could, arguably, fix these drivers to use ".proc_lseek = no_llseek". But this doesn't seem like a worthwhile path forward, considering that latest Linux kernels (including 6.1 LTS) allow proc_lseek == NULL again and *remove* no_lseek. Essentially, on HEAD, it's best practice to leave proc_lseek == NULL. Therefore, I ask that the above procfs fix gets backported so that our drivers can work across all kernel versions, including latest 6.x.
Wrong patch and wrong default behaviour. See d4455faccd.
All the NULL got converted to default_llseek().
Attempting to keep kernel code outside of the kernel tree is, on purpose, very expensive in time and resources. The very simple way to solve this is to get your drivers merged properly into the mainline kernel tree.
I've got some of those, you really wouldn't want them. They are audio/telephony drivers for some very specific hardware.
David
- Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Thu, Aug 17, 2023 at 03:42:02PM +0000, David Laight wrote:
Attempting to keep kernel code outside of the kernel tree is, on purpose, very expensive in time and resources. The very simple way to solve this is to get your drivers merged properly into the mainline kernel tree.
I've got some of those, you really wouldn't want them.
I would argue we do, but if you want to spend the extra time and money keeping them out of the kernel, that's your choice to make.
They are audio/telephony drivers for some very specific hardware.
As are almost all drivers in the kernel. Your hardware is special and unique, just like all others :)
thanks,
greg k-h
Hello,
(posting again as plain text, excuse me if you already received the malformed HTML mail.)
-----"Greg KH" gregkh@linuxfoundation.org schrieb: -----
An: t.martitz@avm.de Von: "Greg KH" gregkh@linuxfoundation.org Datum: 17.08.2023 16:43 Kopie: stable@vger.kernel.org, "Al Viro" viro@zeniv.linux.org.uk Betreff: Re: proc_lseek backport request
On Thu, Aug 17, 2023 at 11:22:30AM +0200, t.martitz@avm.de wrote:
Dear stable team,
I'm asking that
commit 3f61631d47f1 ("take care to handle NULL ->proc_lseek()")
gets backported to the stable and LTS kernels down to 5.10.
Background: We are in the process of upgrading our kernels. One target kernel is based on 5.15 LTS.
Here we found that, if proc file drivers do not implement
proc_lseek,
user space crashes easily, because various library routines
internally
perform lseek(2). The crash happens in proc_reg_llseek, where it wants to jump to a NULL pointer.
We could, arguably, fix these drivers to use ".proc_lseek =
no_llseek".
But this doesn't seem like a worthwhile path forward, considering
that
latest Linux kernels (including 6.1 LTS) allow proc_lseek == NULL
again
and *remove* no_lseek. Essentially, on HEAD, it's best practice to
leave
proc_lseek == NULL. Therefore, I ask that the above procfs fix gets backported so that
our
drivers can work across all kernel versions, including latest 6.x.
For obvious technical, and legal reasons, we can not take kernel changes only for out-of-tree kernel modules, you know this :)
So sorry, no, we should not backport this change because as-is, all in-tree code works just fine, right?
The kernel is constantly being changed to support out-of-tree modules, be it kbuild changes or new EXPORT_SYMBOLs (all in-tree modules can use EXPORT_SYMBOL_GPLs right?).
Granted, such changes are typically not backported to stable (probably, haven't checked). I had hoped that you'd be less strict if we talk about a patch that's already in mainline.
But well, we'll cherry-pick this in our tree then and move on. I won't argue about this particular patch further.
Attempting to keep kernel code outside of the kernel tree is, on purpose, very expensive in time and resources. The very simple way to solve this is to get your drivers merged properly into the mainline kernel tree.
Have you submitted your drivers and had them rejected?
Most drivers affected by the above patch are delivered to us by chip vendors that we cannot post publicly without their consent. It's also not our job to get their crappy code (and it's a lot of that!) to a state that meets your quality standards. We can and do ask for mainline drivers but our influence is limited.
Also, would driver code for chips that aren't publicly available any useful for you?
There is also some in-house code affected but that "drivers" don't usually drive hardware but simply provide F!OS-specific proc interfaces (F!OS is the name of the firmware that runs on our devices). These are just software, often device or vendor specific, and not suitable for the wider kernel community. Also we don't have the resources to get our code top-notch for potential mainline inclusion (although it's usually better than the vendor code we receive).
On the positive side, we do realize that mainlining things can be a net win for us long term and we have started an internal process that allows us to selectively mainline portions of our in-house code, but it's limited by resources and therefore a slow process. See [1] for example.
[1] https://lore.kernel.org/all/20230619071444.14625-1-jnixdorf-oss@avm.de/
Have you taken advantage of the projects that are willing to take out-of-tree drivers and get them merged upstream properly for free?
I don't know about any such project. Interesting to hear they exist! Who are they?
Is there anything else preventing your code from being accepted into the upstream kernel tree that we can help with?
Thanks for the offer. I don't think you can help much. We need to get more resources internally for mainlining but we can barely keep up with maintaining our code base for our devices currently.
Best regards, Thomas Martitz
thanks,
greg k-h
I checked that this commit applies and works as expected on a board
that
runs Linux 5.15, and the observed crash goes away.
Furthermore, I investigated that the fix applies to older LTS
kernels, down
to 5.10. The lseek(2) path uses vfs_llseek() which checks for
FMODE_LSEEK. This
has been like that forever since the initial git import. However,
5.4 LTS and
older kernels do not have "struct proc_ops".
Thank you in advance.
Best regards, Thomas Martitz
On Mon, Aug 21, 2023 at 08:28:44AM +0200, t.martitz@avm.de wrote:
Attempting to keep kernel code outside of the kernel tree is, on purpose, very expensive in time and resources. The very simple way to solve this is to get your drivers merged properly into the mainline kernel tree.
Have you submitted your drivers and had them rejected?
Most drivers affected by the above patch are delivered to us by chip vendors that we cannot post publicly without their consent.
As it's all GPLv2 code, you don't need their "consent" to post the code publicly, in fact, you are obligated to do so :)
It's also not our job to get their crappy code (and it's a lot of that!) to a state that meets your quality standards. We can and do ask for mainline drivers but our influence is limited.
You can write this into your contract in order to pick their chips. That's how this was resolved decades ago for scsi and ethernet chips/drivers, you have more influence here than you might think.
Also, would driver code for chips that aren't publicly available any useful for you?
Of course it would, it's available for someone, right?
There is also some in-house code affected but that "drivers" don't usually drive hardware but simply provide F!OS-specific proc interfaces (F!OS is the name of the firmware that runs on our devices). These are just software, often device or vendor specific, and not suitable for the wider kernel community. Also we don't have the resources to get our code top-notch for potential mainline inclusion (although it's usually better than the vendor code we receive).
As stated many times before, by many companies, you will save time and money if you get your code merged upstream. If you have time and money to burn (like nvidia), then sure, keep the code out of the kernel tree, it's your choice.
On the positive side, we do realize that mainlining things can be a net win for us long term and we have started an internal process that allows us to selectively mainline portions of our in-house code, but it's limited by resources and therefore a slow process. See [1] for example.
[1] https://lore.kernel.org/all/20230619071444.14625-1-jnixdorf-oss@avm.de/
That's great!
Have you taken advantage of the projects that are willing to take out-of-tree drivers and get them merged upstream properly for free?
I don't know about any such project. Interesting to hear they exist! Who are they?
The old "driverdevel" mailing list would do this, but that got removed many years ago when companies stopped needing this. If you are interested, email me off-list and we can take it from there.
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org