On Fri, Jul 12, 2019 at 02:21:11PM -0700, Andrew Morton wrote:
On Fri, 12 Jul 2019 13:39:35 +0100 Mel Gorman mgorman@suse.de wrote:
So although I still think that just failing the migration if we cannot invalidate buffer heads is a safer choice, just extending the private_lock protected section does not seem as bad as I was afraid.
That does not seem too bad and your revised patch looks functionally fine. I'd leave out the tracepoints though because a perf probe would have got roughly the same data and the tracepoint may be too specific to track another class of problem. Whether the tracepoint survives or not and with a changelog added;
Acked-by: Mel Gorman mgorman@techsingularity.net
Andrew, which version do you want to go with, the original version or this one that holds private_lock for slightly longer during migration?
The revised version looks much more appealing for a -stable backport. I expect any mild performance issues can be address in the usual fashion. My main concern is not to put a large performance regression into mainline and stable kernels. How confident are we that this is (will be) sufficiently tested from that point of view?
Fairly confident. If we agree on this patch in principle (build tested only), I'll make sure it gets tested from a functional point of view and queue up a few migration-intensive tests while a metadata workload is running in the background to see what falls out. Furthermore, not all filesystems even take this path even for migration-intensive situations so some setups will never notice a difference.
---8<---
From a4f07d789ba5742cd2fe6bcfb635502f2a1de004 Mon Sep 17 00:00:00 2001
From: Jan Kara jack@suse.cz Date: Wed, 10 Jul 2019 11:31:01 +0200 Subject: [PATCH] mm: migrate: Fix race with __find_get_block()
buffer_migrate_page_norefs() can race with bh users in a following way:
CPU1 CPU2 buffer_migrate_page_norefs() buffer_migrate_lock_buffers() checks bh refs spin_unlock(&mapping->private_lock) __find_get_block() spin_lock(&mapping->private_lock) grab bh ref spin_unlock(&mapping->private_lock) move page do bh work
This can result in various issues like lost updates to buffers (i.e. metadata corruption) or use after free issues for the old page.
This patch closes the race by holding mapping->private_lock while the mapping is being moved to a new page. Ordinarily, a reference can be taken outside of the private_lock using the per-cpu BH LRU but the references are checked and the LRU invalidated if necessary. The private_lock is held once the references are known so the buffer lookup slow path will spin on the private_lock. Between the page lock and private_lock, it should be impossible for other references to be acquired and updates to happen during the migration.
[mgorman@techsingularity.net: Changelog, removed tracing] Fixes: 89cb0888ca14 "mm: migrate: provide buffer_migrate_page_norefs()" CC: stable@vger.kernel.org Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Mel Gorman mgorman@techsingularity.net --- mm/migrate.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c index e9594bc0d406..a59e4aed6d2e 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -771,12 +771,12 @@ static int __buffer_migrate_page(struct address_space *mapping, } bh = bh->b_this_page; } while (bh != head); - spin_unlock(&mapping->private_lock); if (busy) { if (invalidated) { rc = -EAGAIN; goto unlock_buffers; } + spin_unlock(&mapping->private_lock); invalidate_bh_lrus(); invalidated = true; goto recheck_buffers; @@ -809,6 +809,8 @@ static int __buffer_migrate_page(struct address_space *mapping,
rc = MIGRATEPAGE_SUCCESS; unlock_buffers: + if (check_refs) + spin_unlock(&mapping->private_lock); bh = head; do { unlock_buffer(bh);