Hi,
On 8/16/21 10:17 AM, Jiri Kosina wrote:
On Sun, 8 Aug 2021, Denis Efremov wrote:
The patch breaks userspace implementations (e.g. fdutils) and introduces regressions in behaviour. Previously, it was possible to O_NDELAY open a floppy device with no media inserted or with write protected media without an error. Some userspace tools use this particular behavior for probing.
This is not the first time we have reverted this patch. The previous revert is commit f2791e7eadf4 (Revert "floppy: refactor open() flags handling").
This reverts commit 8a0c014cd20516ade9654fc13b51345ec58e7be8.
By reverting it you bring back the bugs that were fixed by it
I agree with you that O_NDELAY is broken for floppies (and always has been). However, just by removing O_NDELAY we break many existing tools that use it for probing and ioctl-only opens. With the patch, tools fail to open the device when there is no diskette, and try to read the diskette when there is one (this is not as fast on real hardware as in QEMU). I think there should be a better fix that doesn't break existing tools. It appears that people still use software that depends on O_NDELAY for floppies. The same patch was already reverted in 2016, presumably for the same reason.
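To illustrate what "probing and ioctl-only opens" means here, a minimal sketch of such an open follows. The helper name probe_open() is hypothetical and is not taken from fdutils or any other tool; it only shows the open(2) flags in question.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* On Linux, O_NDELAY is the historical name for O_NONBLOCK. */
#ifndef O_NDELAY
#define O_NDELAY O_NONBLOCK
#endif

/*
 * Hypothetical helper: open a device for probing or ioctl-only use.
 * Returns a file descriptor on success, or a negative errno value.
 */
int probe_open(const char *path)
{
    int fd = open(path, O_RDONLY | O_NDELAY);

    if (fd < 0)
        return -errno;
    return fd;
}
```

A tool would call probe_open("/dev/fd0"), issue its ioctls on the returned descriptor, and close() it. Before the patch, such an open succeeded even with no media in the drive.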
-- e.g. the possibility to livelock mmap() on the returned fd to keep waiting on the page unlock bit forever
As far as I understand, this is a problem only for syzkaller. It is also not a security issue nowadays, since most distributions (I don't know of any exceptions) require at least "disk" group membership to access floppies. Do you have a link to the syzkaller reproducer?
or the functionality bug reported at [1], and likely others.
With the patch, O_NDELAY|O_RDONLY opens start to return -ENXIO for devices without a diskette. I don't think this is expected behavior during libblkid probing.
There is probably a better fix for [1], maybe even an additional workaround for floppies in libblkid. They already have workarounds for CD-ROMs: https://github.com/karelzak/util-linux/commit/dc30fd4383e57a0440cdb0e16ba5c4...
I started to add simple tests: https://lkml.org/lkml/2021/8/18/845. However, I failed to reproduce the mount bug [1], probably because I don't know how to configure cloud-init properly. I tried to reproduce the mount failure with open("/dev/fd0", O_NDELAY|O_RDONLY) followed by mount("/dev/fd0", ...), but it works. It looks like there must be something else in between...
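A rough sketch of that open-then-mount sequence is below. The function name, the read-only mount flag, and the choice to keep the descriptor open across mount() are assumptions for illustration; the mount point and filesystem type are left to the caller because the arguments are not spelled out above.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mount.h>
#include <unistd.h>

/* On Linux, O_NDELAY is the historical name for O_NONBLOCK. */
#ifndef O_NDELAY
#define O_NDELAY O_NONBLOCK
#endif

/*
 * Hypothetical reproducer: open the device with O_NDELAY|O_RDONLY,
 * then try to mount it while the descriptor is still open.
 * Returns 0 on success, or a negative errno value from the failing call.
 */
int open_then_mount(const char *dev, const char *target, const char *fstype)
{
    int ret = 0;
    int fd = open(dev, O_RDONLY | O_NDELAY);

    if (fd < 0)
        return -errno;

    /* Assumption: a read-only mount is enough to trigger the bug. */
    if (mount(dev, target, fstype, MS_RDONLY, NULL) < 0)
        ret = -errno;

    close(fd);
    return ret;
}
```

Running it needs root and a real drive, for example open_then_mount("/dev/fd0", "/mnt", "vfat"), where "/mnt" and "vfat" are placeholders.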
Regards, Denis