Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Luis
Brian Foster (1):
  xfs: fix shared extent data corruption due to missing cow reservation

Carlos Maiolino (1):
  xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat

Christoph Hellwig (1):
  xfs: cancel COW blocks before swapext

Christophe JAILLET (1):
  xfs: Fix error code in 'xfs_ioc_getbmap()'

Darrick J. Wong (1):
  xfs: fix PAGE_MASK usage in xfs_free_file_space

Dave Chinner (3):
  xfs: fix overflow in xfs_attr3_leaf_verify
  xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers
  xfs: delalloc -> unwritten COW fork allocation can go wrong

Eric Sandeen (1):
  xfs: fix inverted return from xfs_btree_sblock_verify_crc

Ye Yin (1):
  fs/xfs: fix f_ffree value for statfs when project quota is set
 fs/xfs/libxfs/xfs_attr_leaf.c | 11 +++++++++--
 fs/xfs/libxfs/xfs_bmap.c      |  5 ++++-
 fs/xfs/libxfs/xfs_btree.c     |  2 +-
 fs/xfs/xfs_bmap_util.c        | 10 ++++++++--
 fs/xfs/xfs_buf_item.c         | 28 +++++++++++++++++++++-------
 fs/xfs/xfs_ioctl.c            |  2 +-
 fs/xfs/xfs_qm_bhv.c           |  2 +-
 fs/xfs/xfs_reflink.c          |  1 +
 fs/xfs/xfs_stats.c            |  2 +-
 9 files changed, 47 insertions(+), 16 deletions(-)
From: Carlos Maiolino <cmaiolino@redhat.com>
commit 41657e5507b13e963be906d5d874f4f02374fd5c upstream.
The addition of FIBT, RMAP and REFCOUNT changed the offsets into the __xfsstats structure.
This caused xqmstat_proc_show() to display garbage data via /proc/fs/xfs/xqmstat, since it relies on the offsets marked via macros.
Fix it.
Fixes: 00f4e4f9 xfs: add rmap btree stats infrastructure
Fixes: aafc3c24 xfs: support the XFS_BTNUM_FINOBT free inode btree type
Fixes: 46eeb521 xfs: introduce refcount btree definitions
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_stats.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_stats.c b/fs/xfs/xfs_stats.c
index 4e4423153071..740ac9674848 100644
--- a/fs/xfs/xfs_stats.c
+++ b/fs/xfs/xfs_stats.c
@@ -119,7 +119,7 @@ static int xqmstat_proc_show(struct seq_file *m, void *v)
 	int j;
 
 	seq_printf(m, "qm");
-	for (j = XFSSTAT_END_IBT_V2; j < XFSSTAT_END_XQMSTAT; j++)
+	for (j = XFSSTAT_END_REFCOUNT; j < XFSSTAT_END_XQMSTAT; j++)
 		seq_printf(m, " %u", counter_val(xfsstats.xs_stats, j));
 	seq_putc(m, '\n');
 	return 0;
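To make the offset shift concrete, here is a minimal sketch with hypothetical values (the real constants live in fs/xfs/xfs_stats.h and differ): the stats array is indexed by cumulative XFSSTAT_END_* offsets, so inserting the FIBT, RMAP and REFCOUNT counter groups between the inobt and quota groups moves the start of the quota counters, and a loop that still begins at XFSSTAT_END_IBT_V2 reads btree counters and prints them as quota stats.

/* Illustrative sketch only - hypothetical values, not the real
 * xfs_stats.h constants. */
enum {
	XFSSTAT_END_IBT_V2   = 100,                         /* hypothetical */
	XFSSTAT_END_FIBT_V2  = XFSSTAT_END_IBT_V2 + 15,     /* new group */
	XFSSTAT_END_RMAP_V2  = XFSSTAT_END_FIBT_V2 + 15,    /* new group */
	XFSSTAT_END_REFCOUNT = XFSSTAT_END_RMAP_V2 + 15,    /* new group */
	XFSSTAT_END_XQMSTAT  = XFSSTAT_END_REFCOUNT + 6,
};
/* the quota counters now live at [XFSSTAT_END_REFCOUNT, XFSSTAT_END_XQMSTAT),
 * which is what the one-line fix above makes the loop iterate over */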
From: Christoph Hellwig <hch@lst.de>
commit 96987eea537d6ccd98704a71958f9ba02da80843 upstream.
We need to make sure we have no outstanding COW blocks before we swap extents, as there is nothing preventing us from having preallocated COW delalloc on either inode that swapext is called on. That case can easily be reproduced by running generic/324 in always_cow mode:
[  620.760572] XFS: Assertion failed: tip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap_util.c, line: 1669
[  620.761608] ------------[ cut here ]------------
[  620.762171] kernel BUG at fs/xfs/xfs_message.c:102!
[  620.762732] invalid opcode: 0000 [#1] SMP PTI
[  620.763272] CPU: 0 PID: 24153 Comm: xfs_fsr Tainted: G        W 4.19.0-rc1+ #4182
[  620.764203] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
[  620.765202] RIP: 0010:assfail+0x20/0x28
[  620.765646] Code: 31 ff e8 83 fc ff ff 0f 0b c3 48 89 f1 41 89 d0 48 c7 c6 48 ca 8d 82 48 89 fa 38
[  620.767758] RSP: 0018:ffffc9000898bc10 EFLAGS: 00010202
[  620.768359] RAX: 0000000000000000 RBX: ffff88012f14ba40 RCX: 0000000000000000
[  620.769174] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff828560d9
[  620.769982] RBP: ffff88012f14b300 R08: 0000000000000000 R09: 0000000000000000
[  620.770788] R10: 000000000000000a R11: f000000000000000 R12: ffffc9000898bc98
[  620.771638] R13: ffffc9000898bc9c R14: ffff880130b5e2b8 R15: ffff88012a1fa2a8
[  620.772504] FS:  00007fdc36e0fbc0(0000) GS:ffff88013ba00000(0000) knlGS:0000000000000000
[  620.773475] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  620.774168] CR2: 00007fdc3604d000 CR3: 0000000132afc000 CR4: 00000000000006f0
[  620.774978] Call Trace:
[  620.775274]  xfs_swap_extent_forks+0x2a0/0x2e0
[  620.775792]  xfs_swap_extents+0x38b/0xab0
[  620.776256]  xfs_ioc_swapext+0x121/0x140
[  620.776709]  xfs_file_ioctl+0x328/0xc90
[  620.777154]  ? rcu_read_lock_sched_held+0x50/0x60
[  620.777694]  ? xfs_iunlock+0x233/0x260
[  620.778127]  ? xfs_setattr_nonsize+0x3be/0x6a0
[  620.778647]  do_vfs_ioctl+0x9d/0x680
[  620.779071]  ? ksys_fchown+0x47/0x80
[  620.779552]  ksys_ioctl+0x35/0x70
[  620.780040]  __x64_sys_ioctl+0x11/0x20
[  620.780530]  do_syscall_64+0x4b/0x190
[  620.780927]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  620.781467] RIP: 0033:0x7fdc364d0f07
[  620.781900] Code: b3 66 90 48 8b 05 81 5f 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 28
[  620.784044] RSP: 002b:00007ffe2a766038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  620.784896] RAX: ffffffffffffffda RBX: 0000000000000025 RCX: 00007fdc364d0f07
[  620.785667] RDX: 0000560296ca2fc0 RSI: 00000000c0c0586d RDI: 0000000000000005
[  620.786398] RBP: 0000000000000025 R08: 0000000000001200 R09: 0000000000000000
[  620.787283] R10: 0000000000000432 R11: 0000000000000246 R12: 0000000000000005
[  620.788051] R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000006
[  620.788927] Modules linked in:
[  620.789340] ---[ end trace 9503b7417ffdbdb0 ]---
[  620.790065] RIP: 0010:assfail+0x20/0x28
[  620.790642] Code: 31 ff e8 83 fc ff ff 0f 0b c3 48 89 f1 41 89 d0 48 c7 c6 48 ca 8d 82 48 89 fa 38
[  620.793038] RSP: 0018:ffffc9000898bc10 EFLAGS: 00010202
[  620.793609] RAX: 0000000000000000 RBX: ffff88012f14ba40 RCX: 0000000000000000
[  620.794317] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff828560d9
[  620.795025] RBP: ffff88012f14b300 R08: 0000000000000000 R09: 0000000000000000
[  620.795778] R10: 000000000000000a R11: f000000000000000 R12: ffffc9000898bc98
[  620.796675] R13: ffffc9000898bc9c R14: ffff880130b5e2b8 R15: ffff88012a1fa2a8
[  620.797782] FS:  00007fdc36e0fbc0(0000) GS:ffff88013ba00000(0000) knlGS:0000000000000000
[  620.798908] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  620.799594] CR2: 00007fdc3604d000 CR3: 0000000132afc000 CR4: 00000000000006f0
[  620.800424] Kernel panic - not syncing: Fatal exception
[  620.801191] Kernel Offset: disabled
[  620.801597] ---[ end Kernel panic - not syncing: Fatal exception ]---
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_bmap_util.c | 6 ++++++
 1 file changed, 6 insertions(+)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 6de8d90041ff..9d1e5c3a661e 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1824,6 +1824,12 @@ xfs_swap_extents(
 	if (error)
 		goto out_unlock;
 
+	if (xfs_inode_has_cow_data(tip)) {
+		error = xfs_reflink_cancel_cow_range(tip, 0, NULLFILEOFF, true);
+		if (error)
+			return error;
+	}
+
 	/*
 	 * Extent "swapping" with rmap requires a permanent reservation and
 	 * a block reservation because it's really just a remap operation
From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
commit 132bf6723749f7219c399831eeb286dbbb985429 upstream.
In this function, once 'buf' has been allocated, we unconditionally return 0. However, 'error' is set to some error codes in several error handling paths. Before commit 232b51948b99 ("xfs: simplify the xfs_getbmap interface") this was not an issue because all error paths were returning directly, but now that some cleanup at the end may be needed, we must propagate the error code.
Fixes: 232b51948b99 ("xfs: simplify the xfs_getbmap interface")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 0ef5ece5634c..bad90479ade2 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1616,7 +1616,7 @@ xfs_ioc_getbmap(
 	error = 0;
 out_free_buf:
 	kmem_free(buf);
-	return 0;
+	return error;
 }
 
 struct getfsmap_info {
From: Dave Chinner <dchinner@redhat.com>
commit 837514f7a4ca4aca06aec5caa5ff56d33ef06976 upstream.
generic/070 on 64k block size filesystems is failing with a verifier corruption on writeback of an attribute leaf block:
[   94.973083] XFS (pmem0): Metadata corruption detected at xfs_attr3_leaf_verify+0x246/0x260, xfs_attr3_leaf block 0x811480
[   94.975623] XFS (pmem0): Unmount and run xfs_repair
[   94.976720] XFS (pmem0): First 128 bytes of corrupted metadata buffer:
[   94.978270] 000000004b2e7b45: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00  ........;.......
[   94.980268] 000000006b1db90b: 00 00 00 00 00 81 14 80 00 00 00 00 00 00 00 00  ................
[   94.982251] 00000000433f2407: 22 7b 5c 82 2d 5c 47 4c bb 31 1c 37 fa a9 ce d6  "{.-\GL.1.7....
[   94.984157] 0000000010dc7dfb: 00 00 00 00 00 81 04 8a 00 0a 18 e8 dd 94 01 00  ................
[   94.986215] 00000000d5a19229: 00 a0 dc f4 fe 98 01 68 f0 d8 07 e0 00 00 00 00  .......h........
[   94.988171] 00000000521df36c: 0c 2d 32 e2 fe 20 01 00 0c 2d 58 65 fe 0c 01 00  .-2.. ...-Xe....
[   94.990162] 000000008477ae06: 0c 2d 5b 66 fe 8c 01 00 0c 2d 71 35 fe 7c 01 00  .-[f.....-q5.|..
[   94.992139] 00000000a4a6bca6: 0c 2d 72 37 fc d4 01 00 0c 2d d8 b8 f0 90 01 00  .-r7.....-......
[   94.994789] XFS (pmem0): xfs_do_force_shutdown(0x8) called from line 1453 of file fs/xfs/xfs_buf.c. Return address = ffffffff815365f3
This is failing this check:
	end = ichdr.freemap[i].base + ichdr.freemap[i].size;
	if (end < ichdr.freemap[i].base)
		return __this_address;
	if (end > mp->m_attr_geo->blksize)
		return __this_address;
And from the buffer output above, the freemap array is:
freemap[0].base = 0x00a0  freemap[0].size = 0xdcf4  end = 0xdd94
freemap[1].base = 0xfe98  freemap[1].size = 0x0168  end = 0x10000
freemap[2].base = 0xf0d8  freemap[2].size = 0x07e0  end = 0xf8b8
These all look valid - the block size is 0x10000 and so from the last check in the above verifier fragment we know that the end of freemap[1] is valid. The problem is that end is declared as:
uint16_t end;
And (uint16_t)0x10000 = 0. So we have a verifier bug here, not a corruption. Fix the verifier to use uint32_t types for the check and hence avoid the overflow.
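The truncation is easy to reproduce in a standalone userspace program, using the freemap[1] values from the report above (an illustrative sketch, not part of the patch):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint16_t end16 = 0xfe98 + 0x0168;		/* truncates to 0x0000 */
	uint32_t end32 = (uint32_t)0xfe98 + 0x0168;	/* 0x10000 */

	printf("uint16_t end = 0x%x\n", end16);	/* 0 < base -> "corrupt" */
	printf("uint32_t end = 0x%x\n", end32);	/* == blksize -> valid */
	return 0;
}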
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=201577
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/libxfs/xfs_attr_leaf.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index 6fc5425b1474..2652d00842d6 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -243,7 +243,7 @@ xfs_attr3_leaf_verify(
 	struct xfs_mount		*mp = bp->b_target->bt_mount;
 	struct xfs_attr_leafblock	*leaf = bp->b_addr;
 	struct xfs_attr_leaf_entry	*entries;
-	uint16_t			end;
+	uint32_t			end;	/* must be 32bit - see below */
 	int				i;
 
 	xfs_attr3_leaf_hdr_from_disk(mp->m_attr_geo, &ichdr, leaf);
@@ -293,6 +293,11 @@ xfs_attr3_leaf_verify(
 	/*
 	 * Quickly check the freemap information. Attribute data has to be
 	 * aligned to 4-byte boundaries, and likewise for the free space.
+	 *
+	 * Note that for 64k block size filesystems, the freemap entries cannot
+	 * overflow as they are only be16 fields. However, when checking end
+	 * pointer of the freemap, we have to be careful to detect overflows and
+	 * so use uint32_t for those checks.
 	 */
 	for (i = 0; i < XFS_ATTR_LEAF_MAPSIZE; i++) {
 		if (ichdr.freemap[i].base > mp->m_attr_geo->blksize)
@@ -303,7 +308,9 @@ xfs_attr3_leaf_verify(
 			return __this_address;
 		if (ichdr.freemap[i].size & 0x3)
 			return __this_address;
-		end = ichdr.freemap[i].base + ichdr.freemap[i].size;
+
+		/* be care of 16 bit overflows here */
+		end = (uint32_t)ichdr.freemap[i].base + ichdr.freemap[i].size;
 		if (end < ichdr.freemap[i].base)
 			return __this_address;
 		if (end > mp->m_attr_geo->blksize)
From: Brian Foster <bfoster@redhat.com>
commit 59e4293149106fb92530f8e56fa3992d8548c5e6 upstream.
Page writeback indirectly handles shared extents via the existence of overlapping COW fork blocks. If COW fork blocks exist, writeback always performs the associated copy-on-write, regardless of whether the underlying blocks are actually shared. If the blocks are shared, then overlapping COW fork blocks must always exist.
fstests shared/010 reproduces a case where a buffered write occurs over a shared block without performing the requisite COW fork reservation. This ultimately causes writeback to the shared extent and data corruption that is detected across md5 checks of the filesystem across a mount cycle.
The problem occurs when a buffered write lands over a shared extent that crosses an extent size hint boundary and that also happens to have a partial COW reservation that doesn't cover the start and end blocks of the data fork extent.
For example, a buffered write occurs across the file offset (in FSB units) range of [29, 57]. A shared extent exists at blocks [29, 35] and COW reservation already exists at blocks [32, 34]. After accommodating a COW extent size hint of 32 blocks and the existing reservation at offset 32, xfs_reflink_reserve_cow() allocates 32 blocks of reservation at offset 0 and returns with COW reservation across the range of [0, 34]. The associated data fork extent is still [29, 35], however, which isn't fully covered by the COW reservation.
This leads to a buffered write at file offset 35 over a shared extent without associated COW reservation. Writeback eventually kicks in, performs an overwrite of the underlying shared block and causes the associated data corruption.
Update xfs_reflink_reserve_cow() to accommodate the fact that a delalloc allocation request may not fully cover the extent in the data fork. Trim the data fork extent appropriately, just as is done for shared extent boundaries and/or existing COW reservations that happen to overlap the start of the data fork extent. This prevents shared/010 failures due to data corruption on reflink enabled filesystems.
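The semantics of the trim are roughly the following (a minimal userspace sketch of xfs_trim_extent()-like clamping, not the kernel code; it assumes the two ranges overlap, as the caller guarantees here):

#include <assert.h>

struct irec {
	unsigned long long	off;	/* file offset, in FSBs */
	unsigned long long	count;	/* length, in FSBs */
};

/* Clamp the mapping *m to the file range [bno, bno + len). */
static void trim_extent(struct irec *m, unsigned long long bno,
			unsigned long long len)
{
	unsigned long long end = bno + len;

	assert(m->off < end && m->off + m->count > bno);
	if (m->off < bno) {
		m->count -= bno - m->off;
		m->off = bno;
	}
	if (m->off + m->count > end)
		m->count = end - m->off;
}

With the numbers from the example above, trimming the data fork extent [29, 35] (off 29, count 7) to the COW reservation [0, 34] (off 0, count 35) leaves [29, 34]; block 35 is no longer claimed as covered and gets its own COW reservation on a later pass.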
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_reflink.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 42ea7bab9144..7088f44c0c59 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -302,6 +302,7 @@ xfs_reflink_reserve_cow(
 	if (error)
 		return error;
 
+	xfs_trim_extent(imap, got.br_startoff, got.br_blockcount);
 	trace_xfs_reflink_cow_alloc(ip, &got);
 	return 0;
 }
From: Dave Chinner <dchinner@redhat.com>
commit d43aaf1685aa471f0593685c9f54d53e3af3cf3f upstream.
When retrying a failed inode or dquot buffer, xfs_buf_resubmit_failed_buffers() clears all the failed flags from the inode/dquot log items. In doing so, it also drops all the reference counts on the buffer that the failed log items hold. This means it can drop all the active references on the buffer and hence free the buffer before it queues it for write again.
Putting the buffer on the delwri queue takes a reference to the buffer (so that it hangs around until it has been written and completed), but this goes bang if the buffer has already been freed.
Hence we need to add the buffer to the delwri queue before we remove the failed flags from the log items attached to the buffer to ensure it always remains referenced during the resubmit process.
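The shape of the race, reduced to a self-contained sketch with hypothetical stand-ins (put_ref() and delwri_queue() model the log item reference drops and the delwri queue; this is not the kernel code):

#include <stdbool.h>
#include <stdlib.h>

struct buf { int ref; };

static void put_ref(struct buf *bp)
{
	if (--bp->ref == 0)
		free(bp);		/* last reference frees the buffer */
}

static bool delwri_queue(struct buf *bp)
{
	bp->ref++;			/* the queue holds its own reference */
	return true;
}

/* Broken ordering: dropping the failed log item references first can take
 * the refcount to zero, so delwri_queue() then touches freed memory. */
static bool resubmit_broken(struct buf *bp, int nr_failed_items)
{
	while (nr_failed_items--)
		put_ref(bp);
	return delwri_queue(bp);	/* potential use-after-free */
}

/* Fixed ordering, as in the patch below: queue first, then drop the item
 * references, so the refcount never transiently hits zero. */
static bool resubmit_fixed(struct buf *bp, int nr_failed_items)
{
	bool ret = delwri_queue(bp);

	while (nr_failed_items--)
		put_ref(bp);
	return ret;
}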
Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_buf_item.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 12d8455bfbb2..010db5f8fb00 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -1233,9 +1233,23 @@ xfs_buf_iodone(
 }
 
 /*
- * Requeue a failed buffer for writeback
+ * Requeue a failed buffer for writeback.
  *
- * Return true if the buffer has been re-queued properly, false otherwise
+ * We clear the log item failed state here as well, but we have to be careful
+ * about reference counts because the only active reference counts on the buffer
+ * may be the failed log items. Hence if we clear the log item failed state
+ * before queuing the buffer for IO we can release all active references to
+ * the buffer and free it, leading to use after free problems in
+ * xfs_buf_delwri_queue. It makes no difference to the buffer or log items which
+ * order we process them in - the buffer is locked, and we own the buffer list
+ * so nothing on them is going to change while we are performing this action.
+ *
+ * Hence we can safely queue the buffer for IO before we clear the failed log
+ * item state, therefore always having an active reference to the buffer and
+ * avoiding the transient zero-reference state that leads to use-after-free.
+ *
+ * Return true if the buffer was added to the buffer list, false if it was
+ * already on the buffer list.
  */
 bool
 xfs_buf_resubmit_failed_buffers(
@@ -1243,16 +1257,16 @@ xfs_buf_resubmit_failed_buffers(
 	struct list_head	*buffer_list)
 {
 	struct xfs_log_item	*lip;
+	bool			ret;
+
+	ret = xfs_buf_delwri_queue(bp, buffer_list);
 
 	/*
-	 * Clear XFS_LI_FAILED flag from all items before resubmit
-	 *
-	 * XFS_LI_FAILED set/clear is protected by ail_lock, caller this
+	 * XFS_LI_FAILED set/clear is protected by ail_lock, caller of this
 	 * function already have it acquired
 	 */
 	list_for_each_entry(lip, &bp->b_li_list, li_bio_list)
 		xfs_clear_li_failed(lip);
 
-	/* Add this buffer back to the delayed write list */
-	return xfs_buf_delwri_queue(bp, buffer_list);
+	return ret;
 }
From: Dave Chinner <dchinner@redhat.com>
commit 9230a0b65b47fe6856c4468ec0175c4987e5bede upstream.
Long saga. There have been days spent following this through dead end after dead end in multi-GB event traces. This morning, after writing a trace-cmd wrapper that enabled me to be more selective about XFS trace points, I discovered that I could get just enough essential tracepoints enabled that there was a 50:50 chance the fsx config would fail at ~115k ops. If it didn't fail at op 115547, I stopped fsx at op 115548 anyway.
That gave me two traces - one where the problem manifested, and one where it didn't. After refining the traces to have the necessary information, I found that in the failing case there was a real extent in the COW fork compared to an unwritten extent in the working case.
Walking back through the two traces to the point where the COW fork extents actually diverged, I found that the bad case had an extra unwritten extent in it. This is likely because the bug it led me to had triggered multiple times in those 115k ops, leaving stray COW extents around. What I saw was a COW delalloc conversion to an unwritten extent (as they should always be through xfs_iomap_write_allocate()) resulted in a /written extent/:
xfs_writepage:        dev 259:0 ino 0x83 pgoff 0x17000 size 0x79a00 offset 0 length 0
xfs_iext_remove:      dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/2 offset 32 block 152 count 20 flag 1 caller xfs_bmap_add_extent_delay_real
xfs_bmap_pre_update:  dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/1 offset 1 block 4503599627239429 count 31 flag 0 caller xfs_bmap_add_extent_delay_real
xfs_bmap_post_update: dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/1 offset 1 block 121 count 51 flag 0 caller xfs_bmap_add_ex
Basically, COW fork before:
  0 1            32          52
 +H+DDDDDDDDDDDD+UUUUUUUUUUU+
    PREV          RIGHT
COW delalloc conversion allocates:
    1            32
   +uuuuuuuuuuuu+
   NEW
And the result according to the xfs_bmap_post_update trace was:
  0 1            32          52
 +H+wwwwwwwwwwwwwwwwwwwwwwww+
    PREV
Which is clearly wrong - it should be a merged unwritten extent, not a written extent.
That led me to look at the LEFT_FILLING|RIGHT_FILLING|RIGHT_CONTIG case in xfs_bmap_add_extent_delay_real(), and sure enough, there's the bug.
It takes the old delalloc extent (PREV) and adds the length of the RIGHT extent to it, takes the start block from NEW, removes the RIGHT extent and then updates PREV with the new extent.
What it fails to do is update PREV.br_state. For delalloc, this is always XFS_EXT_NORM, while in this case we are converting the delayed allocation to unwritten, so it needs to be updated to XFS_EXT_UNWRITTEN. This LF|RF|RC case does not do this, and so the resultant extent is always written.
And that's the bug I've been chasing for a week - a bmap btree bug, not a reflink/dedupe/copy_file_range bug, but a BMBT bug introduced with the recent in core extent tree scalability enhancements.
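The shape of the broken merge, as a sketch with simplified types (not the kernel structures):

/* PREV is the delalloc record being filled (state EXT_NORM, i.e. written),
 * NEW is the freshly allocated unwritten extent, RIGHT is the contiguous
 * unwritten neighbour being absorbed. */
enum ext_state { EXT_NORM, EXT_UNWRITTEN };

struct ext {
	unsigned long long	startblock;
	unsigned long long	blockcount;
	enum ext_state		state;
};

static void merge_lf_rf_rc(struct ext *prev, const struct ext *new,
			   const struct ext *right)
{
	prev->startblock = new->startblock;
	prev->blockcount += right->blockcount;
	/* the assignment the case was missing: without it prev keeps the
	 * delalloc EXT_NORM state and the merged extent ends up written */
	prev->state = new->state;
}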
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/libxfs/xfs_bmap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index a47670332326..3a496ffe6551 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1683,10 +1683,13 @@ xfs_bmap_add_extent_delay_real(
 	case BMAP_LEFT_FILLING | BMAP_RIGHT_FILLING | BMAP_RIGHT_CONTIG:
 		/*
 		 * Filling in all of a previously delayed allocation extent.
-		 * The right neighbor is contiguous, the left is not.
+		 * The right neighbor is contiguous, the left is not. Take care
+		 * with delay -> unwritten extent allocation here because the
+		 * delalloc record we are overwriting is always written.
 		 */
 		PREV.br_startblock = new->br_startblock;
 		PREV.br_blockcount += RIGHT.br_blockcount;
+		PREV.br_state = new->br_state;
 
 		xfs_iext_next(ifp, &bma->icur);
 		xfs_iext_remove(bma->ip, &bma->icur, state);
From: Ye Yin <dbyin@tencent.com>
commit de7243057e7cefa923fa5f467c0f1ec24eef41d2 upstream.
When project quota is set, we should use the inode limit minus the used count.
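A worked example of the fix as a userspace sketch: with a project quota inode limit of 1000 and 300 inodes used, f_ffree must be 700; the buggy expression subtracted the used count from the filesystem-wide f_ffree value computed earlier in the function instead of from the limit.

#include <stdio.h>

int main(void)
{
	unsigned long long limit = 1000, used = 300;
	unsigned long long f_files, f_ffree;

	f_ffree = 5000;			/* leftover filesystem-wide value */
	f_files = limit;

	/* buggy:  f_ffree = f_ffree - used  -> 4700 */
	/* fixed:  f_ffree = f_files - used  ->  700 */
	f_ffree = (f_files > used) ? (f_files - used) : 0;
	printf("f_files=%llu f_ffree=%llu\n", f_files, f_ffree);
	return 0;
}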
Signed-off-by: Ye Yin <dbyin@tencent.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_qm_bhv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_qm_bhv.c b/fs/xfs/xfs_qm_bhv.c
index 73a1d77ec187..3091e4bc04ef 100644
--- a/fs/xfs/xfs_qm_bhv.c
+++ b/fs/xfs/xfs_qm_bhv.c
@@ -40,7 +40,7 @@ xfs_fill_statvfs_from_dquot(
 
 		statp->f_files = limit;
 		statp->f_ffree =
 			(statp->f_files > dqp->q_res_icount) ?
-			(statp->f_ffree - dqp->q_res_icount) : 0;
+			(statp->f_files - dqp->q_res_icount) : 0;
 	}
 }
From: "Darrick J. Wong" darrick.wong@oracle.com
commit a579121f94aba4e8bad1a121a0fad050d6925296 upstream.
In commit e53c4b598, I *tried* to teach xfs to force writeback when we fzero/fpunch right up to EOF so that if EOF is in the middle of a page, the post-EOF part of the page gets zeroed before we return to userspace. Unfortunately, I missed the part where PAGE_MASK is ~(PAGE_SIZE - 1), which means that we totally fail to zero if we're fpunching and EOF is within the first page. Worse yet, the same PAGE_MASK thinko plagues the filemap_write_and_wait_range call, so we'd initiate writeback of the entire file, which (mostly) masked the thinko.
Drop the tricky PAGE_MASK and replace it with correct usage of PAGE_SIZE and the proper rounding macros.
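A userspace demo of the thinko for a 4k page size (a sketch; offset_in_page(x) is x & ~PAGE_MASK and round_down(x, PAGE_SIZE) is x & PAGE_MASK):

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define PAGE_MASK	(~(PAGE_SIZE - 1))

int main(void)
{
	unsigned long eof1 = 0x800;	/* EOF inside the first page */
	unsigned long eof2 = 0x5800;	/* EOF inside a later page */

	/* buggy condition: zero for eof1, so zeroing is skipped entirely */
	printf("eof1 & PAGE_MASK  = %#lx\n", eof1 & PAGE_MASK);	/* 0 */
	/* intended condition: "is EOF mid-page?" - nonzero for both */
	printf("eof1 & ~PAGE_MASK = %#lx\n", eof1 & ~PAGE_MASK);	/* 0x800 */

	/* buggy writeback start: the offset within the page, i.e. near
	 * file offset 0, so nearly the whole file gets written back */
	printf("eof2 & ~PAGE_MASK = %#lx\n", eof2 & ~PAGE_MASK);	/* 0x800 */
	/* fixed start: the page containing EOF */
	printf("round_down(eof2)  = %#lx\n", eof2 & PAGE_MASK);	/* 0x5000 */
	return 0;
}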
Fixes: e53c4b598 ("xfs: ensure post-EOF zeroing happens after zeroing part of a file")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/xfs_bmap_util.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 9d1e5c3a661e..211b06e4702e 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1175,9 +1175,9 @@ xfs_free_file_space(
 	 * page could be mmap'd and iomap_zero_range doesn't do that for us.
 	 * Writeback of the eof page will do this, albeit clumsily.
 	 */
-	if (offset + len >= XFS_ISIZE(ip) && ((offset + len) & PAGE_MASK)) {
+	if (offset + len >= XFS_ISIZE(ip) && offset_in_page(offset + len) > 0) {
 		error = filemap_write_and_wait_range(VFS_I(ip)->i_mapping,
-				(offset + len) & ~PAGE_MASK, LLONG_MAX);
+				round_down(offset + len, PAGE_SIZE), LLONG_MAX);
 	}
 
 	return error;
From: Eric Sandeen <sandeen@redhat.com>
commit 7d048df4e9b05ba89b74d062df59498aa81f3785 upstream.
xfs_btree_sblock_verify_crc is a bool so should not be returning a failaddr_t; worse, if xfs_log_check_lsn fails it returns __this_address which looks like a boolean true (i.e. success) to the caller.
(interestingly xfs_btree_lblock_verify_crc doesn't have the issue)
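A sketch of why the bug is silent: failaddr_t behaves like a pointer, and a non-NULL pointer implicitly converts to boolean true (modelled here with a GCC builtin; the real __this_address definition differs):

#include <stdbool.h>
#include <stdio.h>

typedef void *failaddr_t;
#define __this_address ((failaddr_t)__builtin_return_address(0))

static bool verify(bool log_check_ok)
{
	if (!log_check_ok)
		return __this_address;	/* the bug: non-NULL -> true */
	return true;
}

int main(void)
{
	/* a log check failure is reported as 1, i.e. verification success */
	printf("verify(false) = %d\n", verify(false));
	return 0;
}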
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/libxfs/xfs_btree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 34c6d7bd4d18..bbdae2b4559f 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -330,7 +330,7 @@ xfs_btree_sblock_verify_crc(
 
 	if (xfs_sb_version_hascrc(&mp->m_sb)) {
 		if (!xfs_log_check_lsn(mp, be64_to_cpu(block->bb_u.s.bb_lsn)))
-			return __this_address;
+			return false;
 		return xfs_buf_verify_cksum(bp, XFS_BTREE_SBLOCK_CRC_OFF);
 	}
On Mon, Feb 4, 2019 at 6:54 PM Luis Chamberlain mcgrof@kernel.org wrote:
Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Luis,
Thanks a lot for doing this work.
For the sake of people not following "oscheck", could you please write a list of configurations you tested with xfstests? The auto group? Any expunged tests we should know about?
I went over the candidate patches and to me, they all look like stable worthy patches and I have not identified any dependencies.
Original authors and reviewers are in the best position to verify those assessments, so please guys, if each one of you acks his own patch, that shouldn't take a lot of anyone's time.
Specifically, repeating Luis's request from v1 cover letter - There are two patches by Dave ([6,7/10]) that are originally from a 7 patch series of assorted fixes: https://patchwork.kernel.org/cover/10689445/
Please confirm that those two patches do stand on their own.
Thanks, Amir.
On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Test results?
The set of changes look fine themselves, but as always, the proof is in the testing...
Cheers,
Dave.
On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Test results?
The set of changes look fine themselves, but as always, the proof is in the testing...
Luis noted on v1 that it passes through his oscheck test suite, and I noted that I haven't seen any regression with the xfstests scripts I have.
What sort of data are you looking for beyond "we didn't see a regression"?
-- Thanks, Sasha
On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Test results?
The set of changes look fine themselves, but as always, the proof is in the testing...
Luis noted on v1 that it passes through his oscheck test suite, and I noted that I haven't seen any regression with the xfstests scripts I have.
What sort of data are you looking for beyond "we didn't see a regression"?
Nothing special, just a summary of what was tested so we have some visibility of whether the testing covered the proposed changes sufficiently. i.e. something like:
Patchset was run through ltp and the fstests "auto" group with the following configs:
- mkfs/mount defaults
- -m reflink=1,rmapbt=1
- -b size=1k
- -m crc=0
....
No new regressions were reported.
Really, all I'm looking for is a bit more context for the review process - nobody remembers what configs other people test. However, it's important in reviewing a backport to know whether a backport to a fix, say, a bug in the rmap code actually got exercised by the tests on an rmap enabled filesystem...
Cheers,
Dave.
On Thu, Feb 07, 2019 at 08:54:54AM +1100, Dave Chinner wrote:
On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Test results?
The set of changes look fine themselves, but as always, the proof is in the testing...
Luis noted on v1 that it passes through his oscheck test suite, and I noted that I haven't seen any regression with the xfstests scripts I have.
What sort of data are you looking for beyond "we didn't see a regression"?
Nothing special, just a summary of what was tested so we have some visibility of whether the testing covered the proposed changes sufficiently. i.e. something like:
Patchset was run through ltp and the fstests "auto" group with the following configs:
- mkfs/mount defaults
- -m reflink=1,rmapbt=1
- -b size=1k
- -m crc=0
....
No new regressions were reported.
Really, all I'm looking for is a bit more context for the review process - nobody remembers what configs other people test. However, it's important in reviewing a backport to know whether a backport to a fix, say, a bug in the rmap code actually got exercised by the tests on an rmap enabled filesystem...
Sure! Below are the various configs this was run against. There were multiple runs over 48+ hours and no regressions from a 4.14.17 baseline were observed.
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs

[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs

[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs

[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs

[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs

[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs

[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs

[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs

[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs

[default_pmem]
TEST_DEV=/dev/pmem0
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/pmem2
FSTYP=xfs
-- Thanks, Sasha
On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
On Thu, Feb 07, 2019 at 08:54:54AM +1100, Dave Chinner wrote:
On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Test results?
The set of changes look fine themselves, but as always, the proof is in the testing...
Luis noted on v1 that it passes through his oscheck test suite, and I noted that I haven't seen any regression with the xfstests scripts I have.
What sort of data are you looking for beyond "we didn't see a regression"?
Nothing special, just a summary of what was tested so we have some visibility of whether the testing covered the proposed changes sufficiently. i.e. something like:
Patchset was run through ltp and the fstests "auto" group with the following configs:
- mkfs/mount defaults
- -m reflink=1,rmapbt=1
- -b size=1k
- -m crc=0
....
No new regressions were reported.
Really, all I'm looking for is a bit more context for the review process - nobody remembers what configs other people test. However, it's important in reviewing a backport to know whether a backport to a fix, say, a bug in the rmap code actually got exercised by the tests on an rmap enabled filesystem...
Sure! Below are the various configs this was run against.
To be clear, that was Sasha's own effort. I just replied with my own set of tests and results against the baseline to confirm no regressions were found.

My tests run on 8-core KVM VMs with 8 GiB of RAM, using qcow2 images which reside on an XFS partition mounted on NVMe drives on the hypervisor. The hypervisor runs CentOS 7, on 3.10.0-862.3.2.el7.x86_64.

For the guest I use different qcow2 images. One is 100 GiB and is used to expose a disk to the guest so it has somewhere to store the files used for the SCRATCH_DEV_POOL. For the SCRATCH_DEV_POOL I use loopback devices, backed by files created on the guest's own /media/truncated/ partition on that 100 GiB disk. I end up with 8 loopback devices to test with then:
SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11"
The loopback devices are set up using oscheck's gendisks.sh script (./gendisks.sh -d).
Since Sasha seems to have a system rigged for testing XFS, what I could do is collaborate with Sasha to consolidate our sections for testing, and also have both of our systems run all tests, so that we at least have two different test systems confirming no regressions. That is, if Sasha is up for that. Otherwise I'll continue with whatever rig I can get my hands on each time I test.

I have an expunge list, and he has his own; we need to consolidate those as well with time.

Since some tests have a failure rate which is not 1 -- i.e., they don't fail 100% of the time -- I am considering adding a *spinner tester* which runs each test up to 1000 times and records when it first fails. It assumes that if a test can run 1000 times without failing, we really shouldn't have it on an expunge list. If there is a better term for failure rate, let's use it; I'm just not familiar with one, but I'm sure this nomenclature must exist.

A curious thing I noted was that the ppc64le bug didn't actually fail for me as a straightforward test. That is, I had to *first* manually mkfs.xfs with the big block specification for the partition used for TEST_DEV, and then also for the first device in SCRATCH_DEV_POOL. Only after I did this and then ran the test did I trigger the failure, with a 100% failure rate.
It has me wondering how many other test may fail if we did the same.
Luis
On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
On Thu, Feb 07, 2019 at 08:54:54AM +1100, Dave Chinner wrote:
On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Test results?
The set of changes look fine themselves, but as always, the proof is in the testing...
Luis noted on v1 that it passes through his oscheck test suite, and I noted that I haven't seen any regression with the xfstests scripts I have.
What sort of data are you looking for beyond "we didn't see a regression"?
Nothing special, just a summary of what was tested so we have some visibility of whether the testing covered the proposed changes sufficiently. i.e. something like:
Patchset was run through ltp and the fstests "auto" group with the following configs:
- mkfs/mount defaults
- -m reflink=1,rmapbt=1
- -b size=1k
- -m crc=0
....
No new regressions were reported.
Really, all I'm looking for is a bit more context for the review process - nobody remembers what configs other people test. However, it's important in reviewing a backport to know whether a backport to a fix, say, a bug in the rmap code actually got exercised by the tests on an rmap enabled filesystem...
Sure! Below are the various configs this was run against. There were multiple runs over 48+ hours and no regressions from a 4.14.17 baseline were observed.
Thanks, Sasha. As an ongoing thing, I reckon a "grep _OPTIONS <config_files>" (catches both mkfs and mount options) would be sufficient as a summary of what was tested in the series description...
Cheers,
Dave.
On Sat, Feb 09, 2019 at 08:29:21AM +1100, Dave Chinner wrote:
On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
On Thu, Feb 07, 2019 at 08:54:54AM +1100, Dave Chinner wrote:
On Tue, Feb 05, 2019 at 11:05:59PM -0500, Sasha Levin wrote:
On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Test results?
The set of changes look fine themselves, but as always, the proof is in the testing...
Luis noted on v1 that it passes through his oscheck test suite, and I noted that I haven't seen any regression with the xfstests scripts I have.
What sort of data are you looking for beyond "we didn't see a regression"?
Nothing special, just a summary of what was tested so we have some visibility of whether the testing covered the proposed changes sufficiently. i.e. something like:
Patchset was run through ltp and the fstests "auto" group with the following configs:
- mkfs/mount defaults
- -m reflink=1,rmapbt=1
- -b size=1k
- -m crc=0
....
No new regressions were reported.
Really, all I'm looking for is a bit more context for the review process - nobody remembers what configs other people test. However, it's important in reviewing a backport to know whether a backport to a fix, say, a bug in the rmap code actually got exercised by the tests on an rmap enabled filesystem...
Sure! Below are the various configs this was run against. There were multiple runs over 48+ hours and no regressions from a 4.14.17 baseline were observed.
Thanks, Sasha. As an ongoing thing, I reckon a "grep _OPTIONS <config_files>" (catches both mkfs and mount options) would be sufficient as a summary of what was tested in the series decription...
Will do.
-- Thanks, Sasha
On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
Sure! Below are the various configs this was run against. There were multiple runs over 48+ hours and no regressions from a 4.14.17 baseline were observed.
In an effort to consolidate our sections:
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
This matches my "xfs" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
This matches my "xfs_reflink" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
This matches my "xfs_reflink_1024" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
This matches my "xfs_nocrc" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
This matches my "xfs_nocrc_512" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default_pmem]
TEST_DEV=/dev/pmem0
I'll have to add this to my framework. Have you found pmem issues not present on other sections?
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
OK so you just repeat the above options verbatim but for pmem. Correct?
Any reason you don't name the sections with finer granularity? It would help me in ensuring that when we revise both of our tests we can more easily ensure we're talking about apples, pears, or bananas.
FWIW, I run two different bare metal hosts now, and each has a VM guest per section above. One host I use for tracking stable, the other host for my changes. This ensures I don't mess things up easier and I can re-test any time fast.
I dedicate a VM guest to test *one* section. I do this with oscheck easily:
./oscheck.sh --test-section xfs_nocrc | tee log-xfs-4.19.18+
For instance will just test xfs_nocrc section. On average each section takes about 1 hour to run.
I could run the tests on raw nvme and do away with the guests, but that loses some of my ability to debug crashes easily and takes me out to bare metal... but curious, how long do your tests take? How about per section? Say just the default "xfs" section?
IIRC you also had your system on hyperV :) so maybe you can still debug easily on crashes.
Luis
On Fri, Feb 08, 2019 at 02:17:26PM -0800, Luis Chamberlain wrote:
On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
Sure! Below are the various configs this was run against. There were multiple runs over 48+ hours and no regressions from a 4.14.17 baseline were observed.
In an effort to consolidate our sections:
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
This matches my "xfs" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
This matches my "xfs_reflink"
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
This matches my "xfs_reflink_1024" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
This matches my "xfs_nocrc" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
This matches my "xfs_nocrc_512" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default_pmem]
TEST_DEV=/dev/pmem0
I'll have to add this to my framework. Have you found pmem issues not present on other sections?
Originally I've added this because the xfs folks suggested that pmem vs block exercises very different code paths and we should be testing both of them.
Looking at the baseline I have, it seems that there are differences between the failing tests. For example, with "MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'", generic/524 seems to fail on pmem but not on block.
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
OK so you just repeat the above options verbatim but for pmem. Correct?
Right.
Any reason you don't name the sections with finer granularity? It would help me in ensuring that when we revise both of our tests we can more easily ensure we're talking about apples, pears, or bananas.
Nope, I'll happily rename them if there are "official" names for it :)
FWIW, I run two different bare metal hosts now, and each has a VM guest per section above. One host I use for tracking stable, the other host for my changes. This ensures I don't mess things up easier and I can re-test any time fast.
I dedicate a VM guest to test *one* section. I do this with oscheck easily:
./oscheck.sh --test-section xfs_nocrc | tee log-xfs-4.19.18+
For instance will just test xfs_nocrc section. On average each section takes about 1 hour to run.
We have a similar setup then. I just spawn the VM on azure for each section and run them all in parallel that way.
I thought oscheck runs everything on a single VM; is there a built-in mechanism to spawn a VM for each config? If so, I can add some code in to support azure and we can use the same codebase.
I could run the tests on raw nvme and do away with the guests, but that loses some of my ability to debug crashes easily and takes me out to bare metal... but curious, how long do your tests take? How about per section? Say just the default "xfs" section?
I think that the longest config takes about 5 hours, otherwise everything tends to take about 2 hours.
I basically run these on "repeat" until I issue a stop order, so in a timespan of 48 hours some configs run ~20 times and some only ~10.
IIRC you also had your system on hyperV :) so maybe you can still debug easily on crashes.
Luis
On Sat, Feb 09, 2019 at 04:56:27PM -0500, Sasha Levin wrote:
On Fri, Feb 08, 2019 at 02:17:26PM -0800, Luis Chamberlain wrote:
On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
Have you found pmem issues not present on other sections?
Originally I've added this because the xfs folks suggested that pmem vs block exercises very different code paths and we should be testing both of them.
Looking at the baseline I have, it seems that there are differences between the failing tests. For example, with "MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'",
That's my "xfs" section.
generic/524 seems to fail on pmem but not on block.
This is useful, thanks! Can you get the failure rate? How often does it fail when you run the test? Always? Does it *never* fail on block? How many consecutive runs did you do on block?
To help with this, oscheck has naggy-check.sh; you could run it until a failure is hit:
./naggy-check.sh -f -s xfs generic/524
And on another host:
./naggy-check.sh -f -s xfs_pmem generic/524
Any reason you don't name the sections with finer granularity? It would help me in ensuring that when we revise both of our tests we can more easily ensure we're talking about apples, pears, or bananas.
Nope, I'll happily rename them if there are "official" names for it :)
Well since I am pushing out the stable fixes and am using oscheck to be transparent about how I test and what I track, and since I'm using section names, yes it would be useful to me. Simply adding a _pmem postfix to the pmem ones would suffice.
FWIW, I run two different bare metal hosts now, and each has a VM guest per section above. One host I use for tracking stable, the other host for my changes. This ensures I don't mess things up easier and I can re-test any time fast.
I dedicate a VM guest to test *one* section. I do this with oscheck easily:
./oscheck.sh --test-section xfs_nocrc | tee log-xfs-4.19.18+
For instance will just test xfs_nocrc section. On average each section takes about 1 hour to run.
We have a similar setup then. I just spawn the VM on azure for each section and run them all in parallel that way.
Indeed.
I thought oscheck runs everything on a single VM,
By default it does.
is there a built-in mechanism to spawn a VM for each config?
Yes:
./oscheck.sh --test-section xfs_nocrc_512
For instance will test section xfs_nocrc_512 *only* on that host.
If so, I can add some code in to support azure and we can use the same codebase.
Groovy. I believe the next step will be for you to send me your delta of expunges, and then I can run naggy-check.sh on them to see if I can reach similar results. I believe you have a larger expunge list. I suspect some of this may be because you may not have certain quirks handled. We will see. But getting this right and syncing our testing should yield good confirmation of failures.
I could run the tests on raw nvme and do away with the guests, but that loses some of my ability to debug crashes easily and takes me out to bare metal... but curious, how long do your tests take? How about per section? Say just the default "xfs" section?
I think that the longest config takes about 5 hours, otherwise everything tends to take about 2 hours.
Oh wow, mine are only 1 hour each. Guess I got a decent rig now :)
I basically run these on "repeat" until I issue a stop order, so in a timespan of 48 hours some configs run ~20 times and some only ~10.
I see... so you iterate over all tests many times a day, and this is how you've built your expunge list. Correct?
That could explain how you may end up with a larger set. Some tests only fail at a non-100% failure rate; for these I'm annotating the failure rate as a comment on each expunge line. Having a consistent format for this, and a properly agreed-upon term, would be good. Right now I just mention how often I have to run a test before reaching a failure. This provides a rough estimate of how many times one should iterate running the test in a loop before detecting a failure. Of course this may not always be accurate, given that systems vary and this could have an impact on the failure... but at least it provides some guidance. It would be curious to see if we end up with similar failure rates for tests that don't always fail, and if there is a divergence, how big it could be.
Luis
On Wed, Feb 06, 2019 at 09:06:55AM +1100, Dave Chinner wrote:
On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
Kernel stable team,
here is a v2 respin of my XFS stable patches for v4.19.y. The only change in this series is adding the upstream commit to the commit log, and I've now also Cc'd stable@vger.kernel.org as well. No other issues were spotted or raised with this series.
Reviews, questions, or rants are greatly appreciated.
Test results?
The set of changes look fine themselves, but as always, the proof is in the testing...
I first established a baseline for v4.19.18 with fstests, using a series of different sections to test against. I annotated the failures on an expunge list and then used that expunge list to confirm no regressions -- no failures if we skip the failures already known for v4.19.18.

I use a section for each different configuration I test against. I only test x86_64 for now, but am starting to create a baseline for ppc64le.
The sections I use:
* xfs
* xfs_nocrc
* xfs_nocrc_512
* xfs_reflink
* xfs_reflink_1024
* xfs_logdev
* xfs_realtimedev
The section definitions for these are below:
[xfs]
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_nocrc]
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_nocrc_512]
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_reflink]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_reflink_1024]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/loop15
FSTYP=xfs

[xfs_logdev]
MKFS_OPTIONS="-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0 -lsize=1g"
SCRATCH_LOGDEV=/dev/loop15
USE_EXTERNAL=yes
FSTYP=xfs

[xfs_realtimedev]
MKFS_OPTIONS="-f -lsize=1g"
SCRATCH_LOGDEV=/dev/loop15
SCRATCH_RTDEV=/dev/loop14
USE_EXTERNAL=yes
FSTYP=xfs
These are listed in my example.config, which oscheck copies to /var/lib/xfstests/config/$(hostname).config upon install if you don't have one.
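With a sectioned config in place, a single section can also be run directly with fstests itself -- assuming a reasonably recent fstests with section support:

./check -s xfs_nocrc_512 -g auto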
I didn't find any regressions against my tests.
The baseline is reflected in oscheck's expunge list per kernel release, in this case expunges/4.19.18. A file exists for each section, listing the tests known to fail.
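The layout looks roughly like this (paths taken from the files quoted below):

expunges/
└── 4.19.18/
    └── xfs/
        └── unassigned/
            ├── xfs.txt
            ├── xfs_nocrc.txt
            ├── xfs_nocrc_512.txt
            ├── xfs_reflink.txt
            ├── xfs_reflink_1024.txt
            ├── xfs_logdev.txt
            └── xfs_realtimedev.txt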
I'll put them below for completeness, but all of these files are present in my oscheck repository [0], which is what I use to track fstests failure baselines for upstream kernels:
$ cat expunges/4.19.18/xfs/unassigned/xfs.txt
generic/091
generic/263
generic/464 # after ~6 runs
generic/475 # after ~15 runs
generic/484
xfs/191-input-validation
xfs/278
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_nocrc.txt
generic/091
generic/263
generic/464 # after ~39 runs
generic/475 # after ~5-10 runs
generic/484
xfs/191-input-validation
xfs/273
xfs/278
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_nocrc_512.txt
generic/091
generic/263
generic/475 # after ~33 runs
generic/482 # after ~16 runs
generic/484
xfs/071
xfs/191-input-validation
xfs/273
xfs/278
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_reflink.txt
generic/091
generic/263
generic/464 # after ~1 run
generic/475 # after ~5 runs
generic/484
xfs/191-input-validation
xfs/278
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_reflink_1024.txt
generic/091
generic/263
generic/475 # after ~2 runs
generic/484
xfs/191-input-validation
xfs/278
xfs/451
xfs/495
xfs/499
The xfs_logdev and xfs_realtimedev sections use an external log, and as I have noted before, it seems work is needed to rule out whether these are actual failures.
But for completeness, the tests which fstests says fail for these sections are below:
$ cat expunges/4.19.18/xfs/unassigned/xfs_logdev.txt
generic/034
generic/039
generic/040
generic/041
generic/054
generic/055
generic/056
generic/057
generic/059
generic/065
generic/066
generic/073
generic/081
generic/090
generic/091
generic/101
generic/104
generic/106
generic/107
generic/177
generic/204
generic/207
generic/223
generic/260
generic/263
generic/311
generic/321
generic/322
generic/325
generic/335
generic/336
generic/341
generic/342
generic/343
generic/347
generic/348
generic/361
generic/376
generic/455
generic/459
generic/464 # fails after ~2 runs
generic/475 # fails after ~5 runs, crashes sometimes
generic/482
generic/483
generic/484
generic/489
generic/498
generic/500
generic/502
generic/510
generic/512
generic/520
shared/002
shared/298
xfs/030
xfs/033
xfs/045
xfs/070
xfs/137
xfs/138
xfs/191-input-validation
xfs/194
xfs/195
xfs/199
xfs/278
xfs/284
xfs/291
xfs/294
xfs/424
xfs/451
xfs/495
xfs/499

$ cat expunges/4.19.18/xfs/unassigned/xfs_realtimedev.txt
generic/034
generic/039
generic/040
generic/041
generic/054
generic/056
generic/057
generic/059
generic/065
generic/066
generic/073
generic/081
generic/090
generic/091
generic/101
generic/104
generic/106
generic/107
generic/177
generic/204
generic/207
generic/223
generic/260
generic/263
generic/311
generic/321
generic/322
generic/325
generic/335
generic/336
generic/341
generic/342
generic/343
generic/347
generic/348
generic/361
generic/376
generic/455
generic/459
generic/464 # fails after ~40 runs
generic/475 # fails, and sometimes crashes
generic/482
generic/483
generic/484
generic/489
generic/498
generic/500
generic/502
generic/510
generic/512
generic/520
shared/002
shared/298
xfs/002
xfs/030
xfs/033
xfs/068
xfs/070
xfs/137
xfs/138
xfs/191-input-validation
xfs/194
xfs/195
xfs/199
xfs/278
xfs/291
xfs/294
xfs/419
xfs/424
xfs/451
xfs/495
xfs/499
Perhaps worth noting, as a curiosity, that I could not trigger generic/464 on the xfs_nocrc_512 and xfs_reflink_1024 sections.
Although I don't have a full baseline for ppc64le, I did confirm that backporting upstream commit 837514f7a4ca fixes the kernel.org bug [1] triggerable via generic/070 on ppc64le.
If you have any questions please let me know.
[0] https://gitlab.com/mcgrof/oscheck
[1] https://bugzilla.kernel.org/show_bug.cgi?id=201577
Luis
On Fri, Feb 08, 2019 at 11:48:29AM -0800, Luis Chamberlain wrote:
The sections I use:
- xfs
- xfs_nocrc
- xfs_nocrc_512
- xfs_reflink
- xfs_reflink_1024
- xfs_logdev
- xfs_realtimedev
Yup, that seems to cover most common things :)
The xfs_logdev and xfs_realtimedev sections use an external log, and as I have noted before, it seems work is needed to rule out whether these are actual failures.
Yeah, there are many tests that don't work properly with external devices, esp. RT devices. That's a less critical area to cover, but it's still good to run it :)
Thanks, Luis!
-Dave.
On Sat, Feb 09, 2019 at 08:32:01AM +1100, Dave Chinner wrote:
Yup, that seems to cover most common things :)
To be clear, in the future I hope to also have a baseline for:
* xfs_bigblock
But that is *currently* [0] only possible on the following architectures with the respective kernel config:
aarch64: CONFIG_ARM64_64K_PAGES=y
ppc64le: CONFIG_PPC_64K_PAGES=y
[0] Someone is working on 64k pages on x86 I think?
Luis
On Fri, Feb 08, 2019 at 01:50:57PM -0800, Luis Chamberlain wrote:
[0] Someone is working on 64k pages on x86 I think?
Yup, I am, but that got derailed by wanting fsx coverage w/ dedup/clone/copy_file_range before going any further with it. That was one of the triggers that led to finding all those data corruption and API problems late last year...
Cheers,
Dave.
On Fri, Feb 8, 2019 at 1:48 PM Luis Chamberlain mcgrof@kernel.org wrote:
Perhaps worth noting, as a curiosity, that I could not trigger generic/464 on the xfs_nocrc_512 and xfs_reflink_1024 sections.
Well, I just hit a failure of generic/464 on 4.19.17 after ~3996 runs for the xfs_nocrc_512 section, and after 7382 runs for xfs_reflink_1024. I've updated the expunge list on oscheck to reflect the difficult-to-hit failure of generic/464 and its failure rate on xfs_nocrc_512.
Luis
On Mon, Feb 04, 2019 at 08:54:17AM -0800, Luis Chamberlain wrote:
Queued for 4.19, thank you.
--
Thanks,
Sasha