Re: [PATCH] btrfs: save i_size in compress_file_range

11 Oct 2019

On Fri, Oct 11, 2019 at 2:05 PM Josef Bacik josef@toxicpanda.com wrote:
...
We hit a regression while rolling out 5.2 internally where we were
hitting the following panic
kernel BUG at mm/page-writeback.c:2659!
RIP: 0010:clear_page_dirty_for_io+0xe6/0x1f0
Call Trace:
 __process_pages_contig+0x25a/0x350
 ? extent_clear_unlock_delalloc+0x43/0x70
 submit_compressed_extents+0x359/0x4d0
 normal_work_helper+0x15a/0x330
 process_one_work+0x1f5/0x3f0
 worker_thread+0x2d/0x3d0
 ? rescuer_thread+0x340/0x340
 kthread+0x111/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x1f/0x30
this is happening because the page is not locked when doing
clear_page_dirty_for_io.  Looking at the core dump it was because our
async_extent had a ram_size of 24576 but our async_chunk range only
spanned 20480, so we had a whole extra page in our ram_size for our
async_extent.
This happened because we try not to compress pages outside of our
i_size, however a cleanup patch changed us to do
actual_end = min_t(u64, i_size_read(inode), end + 1);
which is problematic because i_size_read() can evaluate to different
values in between checking and assigning.  So either a expanding
truncate or a fallocate could increase our i_size while we're doing
Well, not just those operations but a write starting at or beyond eof
could also do it,
since at writeback time we don't hold the inode's lock and the
buffered write path doesn't lock the range if it starts at or past
current i_size.
...
writeout and actual_end would end up being past the range we have
locked.
I confirmed this was what was happening by installing a debug kernel
that had
actual_end = min_t(u64, i_size_read(inode), end + 1);
if (actual_end > end + 1) {
        printk(KERN_ERR "WE GOT FUCKED\n");
I think we coud be more polite on changelogs and mailing lists, or we
could add a tag in the subjects warning about improper content like
video games and music records :)
...
    actual_end = end + 1;

}
and installing it onto 500 boxes of the tier that had been seeing the
problem regularly.  Last night I got my debug message and no panic,
confirming what I expected.
Fixes: 62b37622718c ("btrfs: Remove isize local variable in compress_file_range")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik josef@toxicpanda.com
Anyway, looks good to me.
Reviewed-by: Filipe Manana fdmanana@suse.com
...

fs/btrfs/inode.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2eb1d7249f83..9a483d1f61f8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -474,6 +474,7 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
        u64 start = async_chunk->start;
        u64 end = async_chunk->end;
        u64 actual_end;

  loff_t i_size = i_size_read(inode);
  int ret = 0;
  struct page **pages = NULL;
  unsigned long nr_pages;



@@ -488,7 +489,13 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
        inode_should_defrag(BTRFS_I(inode), start, end, end - start + 1,
                        SZ_16K);

  actual_end = min_t(u64, i_size_read(inode), end + 1);




  /*


   * We need to save i_size before now because it could change in between


   * us evaluating the size and assigning it.  This is because we lock and


   * unlock the page in truncate and fallocate, and then modify the i_size


   * later on.


   */


  actual_end = min_t(u64, i_size, end + 1);



again:
        will_compress = 0;
        nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1;
--
2.21.0
-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH] btrfs: save i_size in compress_file_range