This patch introduces a new flag, DLM_PLOCK_FL_NO_REPLY, for cases where a dlm plock operation should not send a reply back. Currently this is partly handled via DLM_PLOCK_FL_CLOSE, but DLM_PLOCK_FL_CLOSE carries additional meaning: it removes all waiters for specific nodeid/owner values by doing an unlock operation. If an error occurs in dlm user space software, e.g. dlm_controld, we get a reply with an error back. This reply cannot be matched because there is no op in recv_list to match it against. We now filter on DLM_PLOCK_FL_NO_REPLY when an error comes back as a reply. Newer dlm_controld versions will never send a result back when DLM_PLOCK_FL_NO_REPLY is set; this filter is a workaround to handle older dlm_controld versions.
Fixes: 901025d2f319 ("dlm: make plock operation killable")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/plock.c                 | 23 +++++++++++++++++++----
 include/uapi/linux/dlm_plock.h |  1 +
 2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 70a4752ed913..7fe9f4b922d3 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -96,7 +96,7 @@ static void do_unlock_close(const struct dlm_plock_info *info)
 	op->info.end = OFFSET_MAX;
 	op->info.owner = info->owner;
 
-	op->info.flags |= DLM_PLOCK_FL_CLOSE;
+	op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
 	send_op(op);
 }
 
@@ -293,7 +293,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	op->info.owner = (__u64)(long) fl->fl_owner;
 
 	if (fl->fl_flags & FL_CLOSE) {
-		op->info.flags |= DLM_PLOCK_FL_CLOSE;
+		op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
 		send_op(op);
 		rv = 0;
 		goto out;
@@ -392,7 +392,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
 	spin_lock(&ops_lock);
 	if (!list_empty(&send_list)) {
 		op = list_first_entry(&send_list, struct plock_op, list);
-		if (op->info.flags & DLM_PLOCK_FL_CLOSE)
+		if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
 			list_del(&op->list);
 		else
 			list_move_tail(&op->list, &recv_list);
@@ -407,7 +407,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
 	   that were generated by the vfs cleaning up for a close
 	   (the process did not make an unlock call). */
 
-	if (op->info.flags & DLM_PLOCK_FL_CLOSE)
+	if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
 		dlm_release_plock_op(op);
 
 	if (copy_to_user(u, &info, sizeof(info)))
@@ -433,6 +433,21 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
 	if (check_version(&info))
 		return -EINVAL;
 
+	/* Some old dlm user space software will send replies back,
+	 * even if DLM_PLOCK_FL_NO_REPLY is set (because the flag is
+	 * new) e.g. if a error occur. We can't match them in recv_list
+	 * because they were never be part of it. We filter it here,
+	 * new dlm user space software will filter it in user space.
+	 *
+	 * In future this handling can be removed.
+	 */
+	if (info.flags & DLM_PLOCK_FL_NO_REPLY) {
+		pr_info("Received unexpected reply from op %d, "
+			"please update DLM user space software!\n",
+			info.optype);
+		return count;
+	}
+
 	/*
	 * The results for waiting ops (SETLKW) can be returned in any
	 * order, so match all fields to find the op. The results for
diff --git a/include/uapi/linux/dlm_plock.h b/include/uapi/linux/dlm_plock.h
index 63b6c1fd9169..8dfa272c929a 100644
--- a/include/uapi/linux/dlm_plock.h
+++ b/include/uapi/linux/dlm_plock.h
@@ -25,6 +25,7 @@ enum {
 };
 
 #define DLM_PLOCK_FL_CLOSE 1
+#define DLM_PLOCK_FL_NO_REPLY 2
 
 struct dlm_plock_info {
 	__u32 version[3];
This patch implements the dlm plock F_SETLKW interruption feature. If a pending plock operation has not been sent to user space yet, it can simply be dropped from the send_list. If it has already been sent, we need to ask the dlm user space tool to remove the waiter. On success, a reply with optype DLM_PLOCK_OP_CANCEL instead of DLM_PLOCK_OP_LOCK comes back (the DLM_PLOCK_FL_NO_REPLY flag having been cleared in user space) to signal that the cancellation succeeded. If a result with optype DLM_PLOCK_OP_LOCK comes back, the cancellation was not successful.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/plock.c                 | 91 ++++++++++++++++++++++++----------
 include/uapi/linux/dlm_plock.h |  1 +
 2 files changed, 66 insertions(+), 26 deletions(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 7fe9f4b922d3..5faa428fff1a 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -29,6 +29,9 @@ struct plock_async_data {
 
 struct plock_op {
 	struct list_head list;
+#define DLM_PLOCK_OP_FLAG_SENT 0
+#define DLM_PLOCK_OP_FLAG_INTERRUPTED 1
+	unsigned long flags;
 	int done;
 	struct dlm_plock_info info;
 	/* if set indicates async handling */
@@ -74,30 +77,25 @@ static void send_op(struct plock_op *op)
 	wake_up(&send_wq);
 }
 
-/* If a process was killed while waiting for the only plock on a file,
-   locks_remove_posix will not see any lock on the file so it won't
-   send an unlock-close to us to pass on to userspace to clean up the
-   abandoned waiter. So, we have to insert the unlock-close when the
-   lock call is interrupted. */
-
-static void do_unlock_close(const struct dlm_plock_info *info)
+static int do_lock_cancel(const struct plock_op *orig_op)
 {
 	struct plock_op *op;
 
 	op = kzalloc(sizeof(*op), GFP_NOFS);
 	if (!op)
-		return;
+		return -ENOMEM;
+
+	op->info = orig_op->info;
+	op->info.optype = DLM_PLOCK_OP_CANCEL;
+	op->info.flags = DLM_PLOCK_FL_NO_REPLY;
 
-	op->info.optype = DLM_PLOCK_OP_UNLOCK;
-	op->info.pid = info->pid;
-	op->info.fsid = info->fsid;
-	op->info.number = info->number;
-	op->info.start = 0;
-	op->info.end = OFFSET_MAX;
-	op->info.owner = info->owner;
-
-	op->info.flags |= (DLM_PLOCK_FL_CLOSE | DLM_PLOCK_FL_NO_REPLY);
 	send_op(op);
+	wait_event(recv_wq, (orig_op->done != 0));
+
+	if (orig_op->info.optype == DLM_PLOCK_OP_CANCEL)
+		return 0;
+
+	return 1;
 }
 
 int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
@@ -156,7 +154,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 	send_op(op);
 
 	if (op->info.wait) {
-		rv = wait_event_killable(recv_wq, (op->done != 0));
+		rv = wait_event_interruptible(recv_wq, (op->done != 0));
 		if (rv == -ERESTARTSYS) {
 			spin_lock(&ops_lock);
 			/* recheck under ops_lock if we got a done != 0,
@@ -166,17 +164,37 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 				spin_unlock(&ops_lock);
 				goto do_lock_wait;
 			}
-			list_del(&op->list);
-			spin_unlock(&ops_lock);
+
+			if (!test_bit(DLM_PLOCK_OP_FLAG_SENT, &op->flags)) {
+				/* part of send_list, user never saw the op */
+				list_del(&op->list);
+				spin_unlock(&ops_lock);
+				rv = -EINTR;
+			} else {
+				set_bit(DLM_PLOCK_OP_FLAG_INTERRUPTED, &op->flags);
+				spin_unlock(&ops_lock);
+				rv = do_lock_cancel(op);
+				switch (rv) {
+				case 0:
+					rv = -EINTR;
+					break;
+				case 1:
+					/* cancellation wasn't successful but op is done */
+					goto do_lock_wait;
+				default:
+					/* internal error doing cancel we need to wait */
+					goto wait;
+				}
+			}
 
 			log_debug(ls, "%s: wait interrupted %x %llx pid %d",
 				  __func__, ls->ls_global_id,
 				  (unsigned long long)number, op->info.pid);
-			do_unlock_close(&op->info);
 			dlm_release_plock_op(op);
 			goto out;
 		}
 	} else {
+wait:
 		wait_event(recv_wq, (op->done != 0));
 	}
 
@@ -392,10 +410,12 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
 	spin_lock(&ops_lock);
 	if (!list_empty(&send_list)) {
 		op = list_first_entry(&send_list, struct plock_op, list);
-		if (op->info.flags & DLM_PLOCK_FL_NO_REPLY)
+		if (op->info.flags & DLM_PLOCK_FL_NO_REPLY) {
 			list_del(&op->list);
-		else
+		} else {
 			list_move_tail(&op->list, &recv_list);
+			set_bit(DLM_PLOCK_OP_FLAG_SENT, &op->flags);
+		}
 		memcpy(&info, &op->info, sizeof(info));
 	}
 	spin_unlock(&ops_lock);
@@ -457,6 +477,27 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
 	spin_lock(&ops_lock);
 	if (info.wait) {
 		list_for_each_entry(iter, &recv_list, list) {
+			/* A very specific posix lock case allows two
+			 * lock request with the same meaning by using
+			 * threads. It makes no sense from the application
+			 * to do such request, however it is possible.
+			 * We need to check the state for cancellation because
+			 * we need to know the instance which is interrupted
+			 * if two or more of the "same" lock requests are
+			 * in waiting state and got interrupted.
+			 *
+			 * TODO we should move to a instance reference from
+			 * request and reply and not go over lock states, but
+			 * it seems going over lock states and get the instance
+			 * does not make any problems (at least there were no
+			 * issues found yet) but it's much cleaner to not think
+			 * about all possible special cases and states to instance
+			 * has no 1:1 mapping anymore.
+			 */
+			if (info.optype == DLM_PLOCK_OP_CANCEL &&
+			    !test_bit(DLM_PLOCK_OP_FLAG_INTERRUPTED, &iter->flags))
+				continue;
+
 			if (iter->info.fsid == info.fsid &&
 			    iter->info.number == info.number &&
 			    iter->info.owner == info.owner &&
@@ -480,9 +521,7 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
 
 	if (op) {
 		/* Sanity check that op and info match. */
-		if (info.wait)
-			WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK);
-		else
+		if (!info.wait)
 			WARN_ON(op->info.fsid != info.fsid ||
 				op->info.number != info.number ||
 				op->info.owner != info.owner ||
diff --git a/include/uapi/linux/dlm_plock.h b/include/uapi/linux/dlm_plock.h
index 8dfa272c929a..9c4c083c824a 100644
--- a/include/uapi/linux/dlm_plock.h
+++ b/include/uapi/linux/dlm_plock.h
@@ -22,6 +22,7 @@ enum {
 	DLM_PLOCK_OP_LOCK = 1,
 	DLM_PLOCK_OP_UNLOCK,
 	DLM_PLOCK_OP_GET,
+	DLM_PLOCK_OP_CANCEL,
 };
 
 #define DLM_PLOCK_FL_CLOSE 1
On Thu, Jul 13, 2023 at 10:40:28AM -0400, Alexander Aring wrote:
This patch introduces a new flag DLM_PLOCK_FL_NO_REPLY in case an dlm plock operation should not send a reply back. Currently this is kind of being handled in DLM_PLOCK_FL_CLOSE, but DLM_PLOCK_FL_CLOSE has more meanings that it will remove all waiters for a specific nodeid/owner values in by doing a unlock operation. In case of an error in dlm user space software e.g. dlm_controld we get an reply with an error back. This cannot be matched because there is no op to match in recv_list. We filter now on DLM_PLOCK_FL_NO_REPLY in case we had an error back as reply. In newer dlm_controld version it will never send a result back when DLM_PLOCK_FL_NO_REPLY is set. This filter is a workaround to handle older dlm_controld versions.
Fixes: 901025d2f319 ("dlm: make plock operation killable")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Why is adding a new uapi a stable patch?
+	/* Some old dlm user space software will send replies back,
+	 * even if DLM_PLOCK_FL_NO_REPLY is set (because the flag is
+	 * new) e.g. if a error occur. We can't match them in recv_list
+	 * because they were never be part of it. We filter it here,
+	 * new dlm user space software will filter it in user space.
+	 *
+	 * In future this handling can be removed.
+	 */
+	if (info.flags & DLM_PLOCK_FL_NO_REPLY) {
+		pr_info("Received unexpected reply from op %d, "
+			"please update DLM user space software!\n",
+			info.optype);
Never allow userspace to spam the kernel log. And this is not going to work, you need to handle the error and at most, report this to userspace once.
Also, don't wrap your strings, checkpatch should have told you this.
thanks,
greg k-h
Hi,
On Thu, Jul 13, 2023 at 10:49 AM Greg KH gregkh@linuxfoundation.org wrote:
On Thu, Jul 13, 2023 at 10:40:28AM -0400, Alexander Aring wrote:
This patch introduces a new flag DLM_PLOCK_FL_NO_REPLY in case an dlm plock operation should not send a reply back. Currently this is kind of being handled in DLM_PLOCK_FL_CLOSE, but DLM_PLOCK_FL_CLOSE has more meanings that it will remove all waiters for a specific nodeid/owner values in by doing a unlock operation. In case of an error in dlm user space software e.g. dlm_controld we get an reply with an error back. This cannot be matched because there is no op to match in recv_list. We filter now on DLM_PLOCK_FL_NO_REPLY in case we had an error back as reply. In newer dlm_controld version it will never send a result back when DLM_PLOCK_FL_NO_REPLY is set. This filter is a workaround to handle older dlm_controld versions.
Fixes: 901025d2f319 ("dlm: make plock operation killable")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Why is adding a new uapi a stable patch?
because user space is supposed to just copy the flags back to the kernel. I thought it would work. :)
+	if (info.flags & DLM_PLOCK_FL_NO_REPLY) {
+		pr_info("Received unexpected reply from op %d, "
+			"please update DLM user space software!\n",
+			info.optype);
Never allow userspace to spam the kernel log. And this is not going to work, you need to handle the error and at most, report this to userspace once.
I will skip handling this issue for older kernels; it is probably fine, because in practice user space never ends up sending such an invalid value back.
Also, don't wrap your strings, checkpatch should have told you this.
That is correct; I ignored it because the implementation already has another wrapped string somewhere else, and it is a warning, not an error.

Will send a v2 that does not wrap the string and drops the Fixes and Cc: stable tags.
- Alex
Hi,
On Thu, Jul 13, 2023 at 10:57 AM Alexander Aring aahringo@redhat.com wrote:
Hi,
On Thu, Jul 13, 2023 at 10:49 AM Greg KH gregkh@linuxfoundation.org wrote:
On Thu, Jul 13, 2023 at 10:40:28AM -0400, Alexander Aring wrote:
This patch introduces a new flag DLM_PLOCK_FL_NO_REPLY in case an dlm plock operation should not send a reply back. Currently this is kind of being handled in DLM_PLOCK_FL_CLOSE, but DLM_PLOCK_FL_CLOSE has more meanings that it will remove all waiters for a specific nodeid/owner values in by doing a unlock operation. In case of an error in dlm user space software e.g. dlm_controld we get an reply with an error back. This cannot be matched because there is no op to match in recv_list. We filter now on DLM_PLOCK_FL_NO_REPLY in case we had an error back as reply. In newer dlm_controld version it will never send a result back when DLM_PLOCK_FL_NO_REPLY is set. This filter is a workaround to handle older dlm_controld versions.
Fixes: 901025d2f319 ("dlm: make plock operation killable")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Why is adding a new uapi a stable patch?
because the user space is just to copy the flags back to the kernel. I thought it would work. :)
* Speaking of dlm_controld here: we don't know of any other implementation that uses this UAPI. If there is another user space application using it with different behaviour, then this issue is unfixable, as we don't know what behaviour we would get there.
- Alex
On Thu, Jul 13, 2023 at 4:40 PM Alexander Aring aahringo@redhat.com wrote:
This patch introduces a new flag DLM_PLOCK_FL_NO_REPLY in case an dlm plock operation should not send a reply back. Currently this is kind of being handled in DLM_PLOCK_FL_CLOSE, but DLM_PLOCK_FL_CLOSE has more meanings that it will remove all waiters for a specific nodeid/owner values in by doing a unlock operation. In case of an error in dlm user space software e.g. dlm_controld we get an reply with an error back. This cannot be matched because there is no op to match in recv_list. We filter now on DLM_PLOCK_FL_NO_REPLY in case we had an error back as reply. In newer dlm_controld version it will never send a result back when DLM_PLOCK_FL_NO_REPLY is set. This filter is a workaround to handle older dlm_controld versions.
I don't think this makes sense. If dlm_controld understands a particular request, it already knows whether or not that request should receive a reply. On the other hand, if dlm_controld doesn't understand a particular request, it should communicate that fact back to the kernel so that the kernel will know. The kernel knows which kinds of requests should and shouldn't receive replies, so when it is sent a reply it doesn't expect, it knows that dlm_controld didn't understand the request and is either outdated or plain broken. The kernel doesn't need to pipe a flag through dlm_controld for figuring that out.
Thanks, Andreas
Hi,
On Fri, Jul 14, 2023 at 9:54 AM Andreas Gruenbacher agruenba@redhat.com wrote:
On Thu, Jul 13, 2023 at 4:40 PM Alexander Aring aahringo@redhat.com wrote:
This patch introduces a new flag DLM_PLOCK_FL_NO_REPLY in case an dlm plock operation should not send a reply back. Currently this is kind of being handled in DLM_PLOCK_FL_CLOSE, but DLM_PLOCK_FL_CLOSE has more meanings that it will remove all waiters for a specific nodeid/owner values in by doing a unlock operation. In case of an error in dlm user space software e.g. dlm_controld we get an reply with an error back. This cannot be matched because there is no op to match in recv_list. We filter now on DLM_PLOCK_FL_NO_REPLY in case we had an error back as reply. In newer dlm_controld version it will never send a result back when DLM_PLOCK_FL_NO_REPLY is set. This filter is a workaround to handle older dlm_controld versions.
I don't think this makes sense. If dlm_controld understands a particular request, it already knows whether or not that request should receive a reply. On the other hand, if dlm_controld doesn't understand a particular request, it should communicate that fact back to the kernel so that the kernel will know. The kernel knows which kinds of requests should and shouldn't receive replies, so when it is sent a reply it doesn't expect, it knows that dlm_controld didn't understand the request and is either outdated or plain broken. The kernel doesn't need to pipe a flag through dlm_controld for figuring that out.
It's already part of the UAPI that a flag signals that a reply is not expected [0]. If this flag is set and current user space software does not understand the request (or hits any possible future user space error handling), it will send a reply back that cannot be matched; with the current match logic it will match the wrong op. I would say it is broken, and we probably don't care because we never assume an error will happen, but we need to be careful not to add any other possible error replies in the future.
The bigger problem is introducing new ops [1] (not flags) on the kernel side that do not send a reply back. Older user space software will not understand the optype and will send a reply; newer software will not send a reply (because it understands the optype).

Therefore I think we should never introduce a new optype that does not send a reply back. The new DLM_PLOCK_OP_CANCEL was trying to do exactly that, which is why I tried to fix this behaviour with what is, admittedly, another ugly workaround of passing flags around.
- Alex
[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/d... [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/incl...