Hi,
On Thu, Aug 24, 2023 at 4:51 PM Alexander Aring aahringo@redhat.com wrote:
This patch fixes an issues when concurrent fcntl() syscalls are executing on two different gfs2 filesystems. Each gfs2 filesystem creates an DLM lockspace, it seems that VFS only allows fcntl() syscalls at one time on a per filesystem basis. However if there are two filesystems and we executing fcntl() syscalls our lookup mechanism on the global plock op list does not work anymore.
It can be reproduced with two mounted gfs2 filesystems using DLM locking. Then call stress-ng --fcntl 32 on each mount point. The kernel log will show several:
WARNING: CPU: 4 PID: 943 at fs/dlm/plock.c:574 dev_write+0x15c/0x590
because we have a sanity check if it's was really the meant original plock op when dev_write() does a lookup. This patch adds just a additional check for fsid to find the right plock op which is an indicator that the recv_list should be on a per lockspace basis and not globally defined. After this patch the sanity check never warned again that the wrong plock op was being looked up.
Cc: stable@vger.kernel.org Reported-by: Barry Marson bmarson@redhat.com Fixes: 57e2c2f2d94c ("fs: dlm: fix mismatch of plock results from userspace") Signed-off-by: Alexander Aring aahringo@redhat.com
fs/dlm/plock.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c index 00e1d802a81c..e6b4c1a21446 100644 --- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -556,7 +556,8 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count, op = plock_lookup_waiter(&info); } else { list_for_each_entry(iter, &recv_list, list) {
if (!iter->info.wait) {
if (!iter->info.wait &&
iter->info.fsid == info.fsid) { op = iter; break; }
@@ -568,8 +569,7 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count, if (info.wait) WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK); else
WARN_ON(op->info.fsid != info.fsid ||
op->info.number != info.number ||
WARN_ON(op->info.number != info.number ||
Please drop this patch as I was curious where the per lockspace/filesystem locking is for fcntl(). The answer is, it does not exist... I added here also checks to compare start and end fields. The lookup does not work, even with this patch applied. It's because several fcntl() races to put something into send_list and it gets out of order and we can't assume that recv_list contains the order of fcntl() calls. We need to compare all fields to match a correct one which needs to be granted.
The reason why I probably never saw it is because those fields in my tests are always the same and we simply don't compare all fields on the sanity check.
- Alex