This piece was missing in commit ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag").
There is one remaining use of PG_private_2: the function __fscache_clear_page_bits(), whose only purpose is to clear PG_private_2. This is done via folio_end_private_2() which also releases the folio reference which was supposed to be taken by folio_start_private_2() (via ceph_set_page_fscache()).
__fscache_clear_page_bits() is called by __fscache_write_to_cache(), but only if the parameter using_pgpriv2 is true; the only caller of that function is ceph_fscache_write_to_cache() which still passes true.
Calling folio_end_private_2() without a preceding folio_start_private_2() unbalances the folio refcount, which causes trouble such as RCU stalls and general protection faults.
Cc: stable@vger.kernel.org
Fixes: ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag")
Link: https://lore.kernel.org/ceph-devel/CAKPOu+_DA8XiMAA2ApMj7Pyshve_YWknw8Hdt1=z...
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
 fs/ceph/addr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 8c16bc5250ef..aacea3e8fd6d 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -512,7 +512,7 @@ static void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, b
 	struct fscache_cookie *cookie = ceph_fscache_cookie(ci);
 
 	fscache_write_to_cache(cookie, inode->i_mapping, off, len, i_size_read(inode),
-			       ceph_fscache_write_terminated, inode, true, caching);
+			       ceph_fscache_write_terminated, inode, false, caching);
 }
 #else
 static inline void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, bool caching)

For the moment, ceph has to continue using PG_private_2. It doesn't use netfs_writepages(). I have mostly complete patches to fix that, but they got popped onto the back burner for a bit.
I've finally managed to get cephfs set up and can now reproduce the hang you're seeing.
David
I think the right thing to do is probably to at least partially revert:

	ae678317b95e760607c7b20b97c9cd4ca9ed6e1a
	netfs: Remove deprecated use of PG_private_2 as a second writeback flag

for the moment. That removed the bit that actually did the write to the cache on behalf of ceph.
David
linux-stable-mirror@lists.linaro.org