On Wed, Apr 24, 2019 at 12:44:50PM +0200, Miklos Szeredi wrote:
On Wed, Mar 27, 2019 at 11:15 AM Kirill Smelkov kirr@nexedi.com wrote:
FUSE filesystem server and kernel client negotiate during initialization phase, what should be the maximum write size the client will ever issue. Correspondingly the filesystem server then queues sys_read calls to read requests with buffer capacity large enough to carry request header
- that max_write bytes. A filesystem server is free to set its max_write
in anywhere in the range between [1·page, fc->max_pages·page]. In particular go-fuse[2] sets max_write by default as 64K, wheres default fc->max_pages corresponds to 128K. Libfuse also allows users to configure max_write, but by default presets it to possible maximum.
If max_write is < fc->max_pages·page, and in NOTIFY_RETRIEVE handler we allow to retrieve more than max_write bytes, corresponding prepared NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the filesystem server, in full correspondence with server/client contract, will be only queuing sys_read with ~max_write buffer capacity, and fuse_dev_do_read throws away requests that cannot fit into server request buffer. In turn the filesystem server could get stuck waiting indefinitely for NOTIFY_REPLY since NOTIFY_RETRIEVE handler returned OK which is understood by clients as that NOTIFY_REPLY was queued and will be sent back.
-> Cap requested size to negotiate max_write to avoid the problem. This aligns with the way NOTIFY_RETRIEVE handler works, which already unconditionally caps requested retrieve size to fuse_conn->max_pages. This way it should not hurt NOTIFY_RETRIEVE semantic if we return less data than was originally requested.
Please see [1] for context where the problem of stuck filesystem was hit for real, how the situation was traced and for more involving patch that did not make it into the tree.
[1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 [2] https://github.com/hanwen/go-fuse
Signed-off-by: Kirill Smelkov kirr@nexedi.com Cc: Han-Wen Nienhuys hanwen@google.com Cc: Jakob Unterwurzacher jakobunt@gmail.com Cc: stable@vger.kernel.org # v2.6.36+
fs/fuse/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index 8a63e52785e9..38e94bc43053 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1749,7 +1749,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode, offset = outarg->offset & ~PAGE_MASK; file_size = i_size_read(inode);
num = outarg->size;
num = min(outarg->size, fc->max_write);
This is wrong: the max_size limited num is overwritten if constrained by file size.
I assume you are meaning this:
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1745,15 +1745,15 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode, unsigned int offset; size_t total_len = 0; unsigned int num_pages; offset = outarg->offset & ~PAGE_MASK; file_size = i_size_read(inode); - num = outarg->size; + num = min(outarg->size, fc->max_write); if (outarg->offset > file_size) num = 0; else if (outarg->offset + num > file_size) num = file_size - outarg->offset; <-- THIS num_pages = (num + offset + PAGE_SIZE - 1) >> PAGE_SHIFT; num_pages = min(num_pages, fc->max_pages);
and then in this case (offset + num > file_size) num overwrite
num = file_size - offset
can make num only smaller, right? And then the patch is not wrong because there is no other num overwriting in this function except when num is being further decremented in loop that prepares pages to retrieve.
Or am I missing something? Would it be more clear to cap num to max_write after all calculations? But then - if we are not sure that file_size check can only lower num - we have a problem: we are no longer sure that num is <= outarg->size.
Also the patch is whitespace damaged.
I've tried to do the following in my mutt on "RESEND4, PATCH 1/2" message:
|(cd ~/src/linux/linux && git am -)
and the patch applied successfully. So could you please clarify what "whitespace damaged" means?
Attaching the patch once again just in case.
Kirill
---- 8< ----
From 000a5a6c91992f2a361a846cd050444964920f85 Mon Sep 17 00:00:00 2001
From: Kirill Smelkov kirr@nexedi.com Date: Wed, 27 Mar 2019 13:00:56 +0300 Subject: [PATCH] fuse: retrieve: cap requested size to negotiated max_write MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit
FUSE filesystem server and kernel client negotiate during initialization phase, what should be the maximum write size the client will ever issue. Correspondingly the filesystem server then queues sys_read calls to read requests with buffer capacity large enough to carry request header + that max_write bytes. A filesystem server is free to set its max_write in anywhere in the range between [1·page, fc->max_pages·page]. In particular go-fuse[2] sets max_write by default as 64K, wheres default fc->max_pages corresponds to 128K. Libfuse also allows users to configure max_write, but by default presets it to possible maximum.
If max_write is < fc->max_pages·page, and in NOTIFY_RETRIEVE handler we allow to retrieve more than max_write bytes, corresponding prepared NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the filesystem server, in full correspondence with server/client contract, will be only queuing sys_read with ~max_write buffer capacity, and fuse_dev_do_read throws away requests that cannot fit into server request buffer. In turn the filesystem server could get stuck waiting indefinitely for NOTIFY_REPLY since NOTIFY_RETRIEVE handler returned OK which is understood by clients as that NOTIFY_REPLY was queued and will be sent back.
-> Cap requested size to negotiate max_write to avoid the problem. This aligns with the way NOTIFY_RETRIEVE handler works, which already unconditionally caps requested retrieve size to fuse_conn->max_pages. This way it should not hurt NOTIFY_RETRIEVE semantic if we return less data than was originally requested.
Please see [1] for context where the problem of stuck filesystem was hit for real, how the situation was traced and for more involving patch that did not make it into the tree.
[1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 [2] https://github.com/hanwen/go-fuse
Signed-off-by: Kirill Smelkov kirr@nexedi.com Cc: Han-Wen Nienhuys hanwen@google.com Cc: Jakob Unterwurzacher jakobunt@gmail.com Cc: stable@vger.kernel.org # v2.6.36+ --- fs/fuse/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index 9971a35cf1ef..1e2efd873201 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1749,7 +1749,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode, offset = outarg->offset & ~PAGE_MASK; file_size = i_size_read(inode);
- num = outarg->size; + num = min(outarg->size, fc->max_write); if (outarg->offset > file_size) num = 0; else if (outarg->offset + num > file_size)