On Tue, 2024-01-23 at 08:19 -0500, Jeff Layton wrote:
On Tue, 2024-01-23 at 12:46 +0100, Sedat Dilek wrote:
On Tue, Jan 23, 2024 at 12:16 PM Jeff Layton <jlayton@kernel.org> wrote:
On Tue, 2024-01-23 at 07:39 +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
[a quick follow-up with an important correction from the reporter, for those I added to the list of recipients]
On 23.01.24 06:37, Linux regression tracking (Thorsten Leemhuis) wrote:
On 23.01.24 05:40, Paul Thompson wrote:
With my longstanding configuration, kernels up to 6.6.9 work fine. Kernels 6.6.1[0123] and 6.7.[01] all lock up in early (OpenRC) init, before even the virtual filesystems are mounted.
The last thing visible on the console is the nfsclient service being started and:
Call to flock failed: Function not implemented. (twice)
Then the machine is unresponsive, Num Lock doesn't toggle the keyboard LED, and the Alt-SysRq chords appear to do nothing.
The problem is solved by changing my 6.6.9 config option from:
# CONFIG_FILE_LOCKING is not set
to:
CONFIG_FILE_LOCKING=y
(This option is under File Systems > Enable POSIX file locking API)
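As an aside, "Function not implemented" is the strerror() text for ENOSYS, which is what flock(2) returns when the kernel was built without the flock syscall (CONFIG_FILE_LOCKING=n). A minimal check (my own sketch, not part of the report) that can confirm this on a running kernel:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

int main(void)
{
        /* Try to take (and immediately drop) an exclusive flock on a scratch
         * file; on a CONFIG_FILE_LOCKING=n kernel this should fail with
         * ENOSYS ("Function not implemented"). */
        int fd = open("/tmp/locktest", O_CREAT | O_RDWR, 0644);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        if (flock(fd, LOCK_EX | LOCK_NB) < 0)
                fprintf(stderr, "flock: %s\n", strerror(errno));
        else
                printf("flock succeeded; kernel file locking is available\n");
        close(fd);
        return 0;
}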
The reporter replied out-of-thread: https://lore.kernel.org/all/Za9TRtSjubbX0bVu@squish.home.loc/
""" Now I feel stupid or like Im losing it, but I went back and grepped for the CONFIG_FILE_LOCKING in my old Configs, and it was turned on in all but 6.6.9. So, somehow I turned that off *after I built 6.6.9? Argh. I just built 6.6.4 with it unset and that locked up too. Sorry if this is just noise, though one would have hoped the failure was less severe... """
Ok, so not necessarily a regression? It might be helpful to know the earliest kernel you can boot with CONFIG_FILE_LOCKING turned off.
I'll give reproducing this a try later, though.
Quote from Paul: " Now I feel stupid, or like I'm losing it, but I went back and grepped for CONFIG_FILE_LOCKING in my old configs, and it was turned on in all but 6.6.9. So, somehow I turned that off *after* I built 6.6.9? Argh. I just built 6.6.4 with it unset and that locked up too. Sorry if this is just noise, though one would have hoped the failure was less severe... "
-Sedat-
https://lore.kernel.org/all/Za9TRtSjubbX0bVu@squish.home.loc/#t
Ok, I can reproduce this in KVM, which should make this a bit simpler:
I tried turning off CONFIG_FILE_LOCKING on mainline kernels and it also hung at boot for me here (I think it was trying to bring up the NVMe disks attached to this host):
[ OK ] Reached target sysinit.target - System Initialization.
[ OK ] Finished dracut-pre-mount.service - dracut pre-mount hook.
[ OK ] Started plymouth-start.service - Show Plymouth Boot Screen.
[ OK ] Started systemd-ask-password-plymo…quests to Plymouth Directory Watch.
[ OK ] Reached target paths.target - Path Units.
[ OK ] Reached target basic.target - Basic System.
[    4.647183] cryptd: max_cpu_qlen set to 1000
[    4.650280] AVX2 version of gcm_enc/dec engaged.
[    4.651252] AES CTR mode by8 optimization enabled
         Starting systemd-vconsole-setup.service - Virtual Console Setup...
[FAILED] Failed to start systemd-vconsole-s…up.service - Virtual Console Setup.
See 'systemctl status systemd-vconsole-setup.service' for details.
[    5.777176] virtio_blk virtio3: 8/0/0 default/read/poll queues
[    5.784633] virtio_blk virtio3: [vda] 41943040 512-byte logical blocks (21.5 GB/20.0 GiB)
[    5.791351]  vda: vda1 vda2 vda3
[    5.792672] virtio_blk virtio6: 8/0/0 default/read/poll queues
[    5.801796] virtio_blk virtio6: [vdb] 209715200 512-byte logical blocks (107 GB/100 GiB)
[    5.807839] virtio_blk virtio7: 8/0/0 default/read/poll queues
[    5.813098] virtio_blk virtio7: [vdc] 209715200 512-byte logical blocks (107 GB/100 GiB)
[    5.818500] virtio_blk virtio8: 8/0/0 default/read/poll queues
[    5.823969] virtio_blk virtio8: [vdd] 209715200 512-byte logical blocks (107 GB/100 GiB)
[    5.829217] virtio_blk virtio9: 8/0/0 default/read/poll queues
[    5.834636] virtio_blk virtio9: [vde] 209715200 512-byte logical blocks (107 GB/100 GiB)
[ **] Job dev-disk-by\x2duuid-5a8a135f\x2…art running (13min 46s / no limit)
The last part will just keep spinning forever.
I've gone back as far as v6.0, and I see the same behavior. I then tried changing the disks in the VM to be attached by virtio instead of NVMe, and that also didn't help.
That said, I'm using a Fedora 39 cloud image here. I'm not sure it's reasonable to expect that to boot properly with file locking disabled. Paul, what distro are you running? When you say that it's hung, are you seeing similar behavior?
FWIW, I grabbed a dump of the VM's memory and took a quick look with crash. All of the tasks are either idle, or waiting in epoll. Perhaps there is some subtle dependency between epoll and CONFIG_FILE_LOCKING?
PID: 190      TASK: ffff8fa846eb3080  CPU: 7    COMMAND: "systemd-journal"
 #0 [ffffb5560063bd18] __schedule at ffffffffa10e8d39
 #1 [ffffb5560063bd88] schedule at ffffffffa10e9491
 #2 [ffffb5560063bda0] schedule_hrtimeout_range_clock at ffffffffa10eff99
 #3 [ffffb5560063be10] do_epoll_wait at ffffffffa0a08106
 #4 [ffffb5560063bee8] __x64_sys_epoll_wait at ffffffffa0a0872d
 #5 [ffffb5560063bf38] do_syscall_64 at ffffffffa10d3af4
 #6 [ffffb5560063bf50] entry_SYSCALL_64_after_hwframe at ffffffffa12000e6
    RIP: 00007f975753cac7  RSP: 00007ffe07ab17b8  RFLAGS: 00000202
    RAX: ffffffffffffffda  RBX: 000000000000001e  RCX: 00007f975753cac7
    RDX: 000000000000001e  RSI: 000055d723ad8ca0  RDI: 0000000000000007
    RBP: 00007ffe07ab18d0   R8: 000055d723ad79ac   R9: 0000000000000007
    R10: 00000000ffffffff  R11: 0000000000000202  R12: 000055d723ad8ca0
    R13: 0000000000000010  R14: 000055d723ad33b0  R15: ffffffffffffffff
    ORIG_RAX: 00000000000000e8  CS: 0033  SS: 002b
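For what it's worth, here's a minimal sketch of the kind of dependency I have in mind (purely illustrative, not taken from the dump or from systemd's actual code): a "service" that only signals readiness after taking an flock, with a "manager" parked in epoll_wait on the other end of a pipe. If flock() fails with ENOSYS and the error path never sends the readiness byte, the waiter sits in epoll_wait indefinitely, which would look a lot like the backtrace above:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/file.h>
#include <unistd.h>

int main(void)
{
        int pipefd[2];

        if (pipe(pipefd) < 0) {
                perror("pipe");
                return 1;
        }

        pid_t pid = fork();
        if (pid < 0) {
                perror("fork");
                return 1;
        }

        if (pid == 0) {         /* "service": take a lock, then signal readiness */
                close(pipefd[0]);
                int fd = open("/tmp/flock-demo.lock", O_CREAT | O_RDWR, 0644);
                if (fd < 0) {
                        perror("open");
                        _exit(1);
                }
                if (flock(fd, LOCK_EX) == 0) {
                        if (write(pipefd[1], "r", 1) != 1)      /* readiness byte */
                                _exit(1);
                } else {
                        /* e.g. ENOSYS with CONFIG_FILE_LOCKING=n: readiness is
                         * never signalled; keep the pipe open and stall */
                        fprintf(stderr, "flock: %s\n", strerror(errno));
                        pause();
                }
                _exit(0);
        }

        /* "manager": wait for the readiness byte via epoll, with no timeout */
        close(pipefd[1]);
        int ep = epoll_create1(0);
        if (ep < 0) {
                perror("epoll_create1");
                return 1;
        }
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = pipefd[0] };
        if (epoll_ctl(ep, EPOLL_CTL_ADD, pipefd[0], &ev) < 0) {
                perror("epoll_ctl");
                return 1;
        }
        struct epoll_event out;
        int n = epoll_wait(ep, &out, 1, -1);    /* blocks forever if the byte never arrives */
        printf("epoll_wait returned %d\n", n);
        return 0;
}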
Whether this is a regression or not, a lot of userland software relies on file locking these days. Maybe this is a good time to consider getting rid of CONFIG_FILE_LOCKING and just hardcoding it on.
By disabling it, it looks like you save 4 bytes in struct inode. I'm not sure that's worth the hassle of having to deal with the extra test matrix dimension. In a really stripped down configuration where you don't need file locking, are you likely to have a lot of inodes in core anyway?
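Going from memory of include/linux/fs.h here (so treat the exact member and its placement as approximate), the per-inode cost is the lock-context member that is only compiled in when CONFIG_FILE_LOCKING is set, roughly:

/* Rough sketch of the CONFIG_FILE_LOCKING-gated member of struct inode,
 * from memory of include/linux/fs.h; this is a stand-in, not the real
 * definition, and the exact layout varies by kernel version. */
struct file_lock_context;

struct inode_sketch {
        /* ... many other fields ... */
#ifdef CONFIG_FILE_LOCKING
        struct file_lock_context *i_flctx;      /* per-inode lock state */
#endif
        /* ... */
};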
I guess you also save a little kernel text, but I still have to wonder if it's worth it.