Hi Paul,
On Wed, 2022-09-21 at 13:42 +0200, Paul Menzel wrote:
Dear Linux folks,
After moving from Linux 5.10.113 to 5.15.69, starting Mozilla Thunderbird or Mozilla Firefox with the home directory on NFS gets both programs killed, and Linux 5.15.69 logs:
[ 3827.604396] BUG: unable to handle page fault for address: 000000001d473c07
[ 3827.611297] #PF: supervisor read access in kernel mode
[ 3827.616452] #PF: error_code(0x0000) - not-present page
[ 3827.621604] PGD 0 P4D 0
[ 3827.624152] Oops: 0000 [#1] SMP PTI
[ 3827.627657] CPU: 0 PID: 2378 Comm: firefox Not tainted 5.15.69.mx64.435 #1
[ 3827.634551] Hardware name: Dell Inc. Precision Tower 3620/0MWYPT, BIOS 2.20.0 12/09/2021
[ 3827.642659] RIP: 0010:nfs_scan_commit_list+0x1e/0x100 [nfs]
[ 3827.648256] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 57 41 56 41 55 41 54 55 53 48 83 ec 10 4c 8b 2f 48 89 3c 24 89 4c 24 0c <49> 8b 5d 00 4c 39 ef 0f 84 c3 00 00 00 48 89 f5 49 89 d6 4d 89 ef
[ 3827.667057] RSP: 0018:ffffc90002097ce0 EFLAGS: 00010282
[ 3827.672294] RAX: 000000006329dcd6 RBX: ffffc90002097d60 RCX: 000000007fffffff
[ 3827.679440] RDX: ffffc90002097d60 RSI: ffffc90002097d50 RDI: ffff8881d7618b38
[ 3827.686587] RBP: ffffc90002097d50 R08: 0000000000000001 R09: 0000000000000000
[ 3827.693734] R10: 0000000000000000 R11: 61c8864680b583eb R12: 0000000000000000
[ 3827.700880] R13: 000000001d473c07 R14: 0000000000000001 R15: 0000000000000000
[ 3827.708027] FS: 00007fa6141f2780(0000) GS:ffff88881dc00000(0000) knlGS:0000000000000000
[ 3827.716131] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3827.721886] CR2: 000000001d473c07 CR3: 000000012dae0006 CR4: 00000000003706f0
[ 3827.729034] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3827.736180] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3827.743328] Call Trace:
[ 3827.745779] <TASK>
[ 3827.747883] nfs_scan_commit+0x76/0xb0 [nfs]
[ 3827.752167] __nfs_commit_inode+0x108/0x180 [nfs]
[ 3827.756886] nfs_wb_all+0x59/0x110 [nfs]
[ 3827.760822] nfs4_inode_return_delegation+0x58/0x90 [nfsv4]
[ 3827.766413] nfs4_proc_remove+0x101/0x110 [nfsv4]
[ 3827.771130] nfs_unlink+0xf5/0x2d0 [nfs]
[ 3827.775065] vfs_unlink+0x10b/0x280
[ 3827.778563] do_unlinkat+0x19e/0x2c0
[ 3827.782158] __x64_sys_unlink+0x3e/0x60
[ 3827.786002] ? __x64_sys_readlink+0x1b/0x30
[ 3827.790192] do_syscall_64+0x40/0x90
[ 3827.793779] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 3827.798847] RIP: 0033:0x7fa6142e2aa7
[ 3827.802435] Code: f0 ff ff 73 01 c3 48 8b 0d be 03 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 57 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 03 0d 00 f7 d8 64 89 01 48
[ 3827.821264] RSP: 002b:00007fff37879a08 EFLAGS: 00000202 ORIG_RAX: 0000000000000057
[ 3827.828848] RAX: ffffffffffffffda RBX: 0000000080004005 RCX: 00007fa6142e2aa7
[ 3827.835997] RDX: 0000000077120e8d RSI: 00007fa614383520 RDI: 00007fa605425b88
[ 3827.843145] RBP: 00007fa605425b88 R08: 00007fff37879add R09: 0000000000000000
[ 3827.850291] R10: 00007fa614362ae0 R11: 0000000000000202 R12: 0000000077120e8d
[ 3827.857439] R13: 00007fff37879add R14: 00007fa6141f26c8 R15: 0000000000000065
[ 3827.864586] </TASK>
[ 3827.866776] Modules linked in: rpcsec_gss_krb5 nfsv4 nfs 8021q garp stp mrp llc amdgpu snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio i915 iommu_v2 gpu_sched drm_ttm_helper iosf_mbi ttm drm_kms_helper x86_pkg_temp_thermal kvm_intel drm kvm snd_hda_codec_hdmi intel_gtt i2c_algo_bit fb_sys_fops syscopyarea sysfillrect snd_hda_intel input_leds led_class snd_intel_dspcfg sysimgblt e1000e snd_hda_codec hid_logitech_hidpp snd_hda_core hid_logitech_dj snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_pcm snd_timer uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd wmi_bmof soundcore wmi iTCO_wdt video irqbypass crc32c_intel iTCO_vendor_support nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip_tables x_tables unix ipv6 autofs4
[ 3827.935422] CR2: 000000001d473c07
[ 3827.938745] ---[ end trace d7dc2bc122fe8836 ]---
Does cherry-picking commit 6e176d47160c ("NFSv4: Fixes for nfs4_inode_return_delegation()") into 5.15.69 from the upstream kernel tree fix the problem?
8<---------------------------------------------------
From 6e176d47160cec8bcaa28d9aa06926d72d54237c Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Sun, 10 Oct 2021 10:58:12 +0200
Subject: [PATCH] NFSv4: Fixes for nfs4_inode_return_delegation()

We mustn't call nfs_wb_all() on anything other than a regular file. Furthermore, we can exit early when we don't hold a delegation.

Reported-by: David Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/delegation.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 11118398f495..7c9eb679dbdb 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -755,11 +755,13 @@ int nfs4_inode_return_delegation(struct inode *inode)
 	struct nfs_delegation *delegation;
 
 	delegation = nfs_start_delegation_return(nfsi);
-	/* Synchronous recall of any application leases */
-	break_lease(inode, O_WRONLY | O_RDWR);
-	nfs_wb_all(inode);
-	if (delegation != NULL)
+	if (delegation != NULL) {
+		/* Synchronous recall of any application leases */
+		break_lease(inode, O_WRONLY | O_RDWR);
+		if (S_ISREG(inode->i_mode))
+			nfs_wb_all(inode);
 		return nfs_end_delegation_return(inode, delegation, 1);
+	}
 	return 0;
 }