This patch fixes an issues when concurrent fcntl() syscalls are executing on two different gfs2 filesystems. Each gfs2 filesystem creates an DLM lockspace, it seems that VFS only allows fcntl() syscalls at one time on a per filesystem basis. However if there are two filesystems and we executing fcntl() syscalls our lookup mechanism on the global plock op list does not work anymore.
It can be reproduced with two mounted gfs2 filesystems using DLM locking. Then call stress-ng --fcntl 32 on each mount point. The kernel log will show several:
WARNING: CPU: 4 PID: 943 at fs/dlm/plock.c:574 dev_write+0x15c/0x590
because we have a sanity check if it's was really the meant original plock op when dev_write() does a lookup. This patch adds just a additional check for fsid to find the right plock op which is an indicator that the recv_list should be on a per lockspace basis and not globally defined. After this patch the sanity check never warned again that the wrong plock op was being looked up.
Cc: stable@vger.kernel.org Reported-by: Barry Marson bmarson@redhat.com Fixes: 57e2c2f2d94c ("fs: dlm: fix mismatch of plock results from userspace") Signed-off-by: Alexander Aring aahringo@redhat.com --- fs/dlm/plock.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c index 00e1d802a81c..e6b4c1a21446 100644 --- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -556,7 +556,8 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count, op = plock_lookup_waiter(&info); } else { list_for_each_entry(iter, &recv_list, list) { - if (!iter->info.wait) { + if (!iter->info.wait && + iter->info.fsid == info.fsid) { op = iter; break; } @@ -568,8 +569,7 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count, if (info.wait) WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK); else - WARN_ON(op->info.fsid != info.fsid || - op->info.number != info.number || + WARN_ON(op->info.number != info.number || op->info.owner != info.owner || op->info.optype != info.optype);
Hi,
On Thu, Aug 24, 2023 at 4:51 PM Alexander Aring aahringo@redhat.com wrote:
This patch fixes an issues when concurrent fcntl() syscalls are executing on two different gfs2 filesystems. Each gfs2 filesystem creates an DLM lockspace, it seems that VFS only allows fcntl() syscalls at one time on a per filesystem basis. However if there are two filesystems and we executing fcntl() syscalls our lookup mechanism on the global plock op list does not work anymore.
It can be reproduced with two mounted gfs2 filesystems using DLM locking. Then call stress-ng --fcntl 32 on each mount point. The kernel log will show several:
WARNING: CPU: 4 PID: 943 at fs/dlm/plock.c:574 dev_write+0x15c/0x590
because we have a sanity check if it's was really the meant original plock op when dev_write() does a lookup. This patch adds just a additional check for fsid to find the right plock op which is an indicator that the recv_list should be on a per lockspace basis and not globally defined. After this patch the sanity check never warned again that the wrong plock op was being looked up.
Cc: stable@vger.kernel.org Reported-by: Barry Marson bmarson@redhat.com Fixes: 57e2c2f2d94c ("fs: dlm: fix mismatch of plock results from userspace") Signed-off-by: Alexander Aring aahringo@redhat.com
fs/dlm/plock.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c index 00e1d802a81c..e6b4c1a21446 100644 --- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -556,7 +556,8 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count, op = plock_lookup_waiter(&info); } else { list_for_each_entry(iter, &recv_list, list) {
if (!iter->info.wait) {
if (!iter->info.wait &&
iter->info.fsid == info.fsid) { op = iter; break; }
@@ -568,8 +569,7 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count, if (info.wait) WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK); else
WARN_ON(op->info.fsid != info.fsid ||
op->info.number != info.number ||
WARN_ON(op->info.number != info.number ||
Please drop this patch as I was curious where the per lockspace/filesystem locking is for fcntl(). The answer is, it does not exist... I added here also checks to compare start and end fields. The lookup does not work, even with this patch applied. It's because several fcntl() races to put something into send_list and it gets out of order and we can't assume that recv_list contains the order of fcntl() calls. We need to compare all fields to match a correct one which needs to be granted.
The reason why I probably never saw it is because those fields in my tests are always the same and we simply don't compare all fields on the sanity check.
- Alex
Hi,
On Thu, Aug 24, 2023 at 7:22 PM Alexander Aring aahringo@redhat.com wrote: ...
The reason why I probably never saw it is because those fields in my tests are always the same and we simply don't compare all fields on the sanity check.
I need to correct some things here... the patch works, but the commit message related to some locking issues is wrong. It works to make the lookup on a per lockspace basis because dlm_controld has a per lockspace corosync handle. Corosync keeps plock_op and results in order which is necessary for the lookup mechanism here. So this mechanism is on a per lockspace basis in order, if dlm_controld would have a global corosync handle it would work on a global basis.
The issue I saw with putting more sanity checks for start/end fields does not work because DLM_PLOCK_OP_GET will manipulate fields and they can't be compared between original request and result. I ran into this issue a couple times...
- Alex
linux-stable-mirror@lists.linaro.org