On Thu, Jul 11, 2019 at 05:04:55PM -0700, Andrew Morton wrote:
> On Thu, 11 Jul 2019 14:58:38 +0200 Jan Kara <jack@suse.cz> wrote:
> > buffer_migrate_page_norefs() can race with bh users in the following way:
> > CPU1                                     CPU2
> > buffer_migrate_page_norefs()
> >   buffer_migrate_lock_buffers()
> >   checks bh refs
> >   spin_unlock(&mapping->private_lock)
> >                                          __find_get_block()
> >                                            spin_lock(&mapping->private_lock)
> >                                            grab bh ref
> >                                            spin_unlock(&mapping->private_lock)
> >   move page
> >                                          do bh work
> > This can result in various issues like lost updates to buffers (i.e.
> > metadata corruption) or use-after-free issues for the old page.
> > Closing this race window is relatively difficult. We could hold
> > mapping->private_lock in buffer_migrate_page_norefs() until we are
> > finished with migrating the page but the lock hold times would be
> > rather long. So let's revert to a more careful variant of page
> > migration requiring eviction of buffers on the migrated page. This is
> > effectively fallback_migrate_page() that additionally invalidates bh
> > LRUs in case try_to_free_buffers() failed.
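For concreteness, a minimal sketch of what that eviction-based fallback
could look like is below. This illustrates the approach rather than the
actual patch: buffer_migrate_page_evict() is a hypothetical name, and
writeout() is assumed to be the static helper in mm/migrate.c, so the
function would have to live in that file.

	#include <linux/buffer_head.h>
	#include <linux/migrate.h>

	static int buffer_migrate_page_evict(struct address_space *mapping,
					     struct page *newpage,
					     struct page *page,
					     enum migrate_mode mode)
	{
		if (!page_has_buffers(page))
			return migrate_page(mapping, newpage, page, mode);

		/* Dirty buffers must be written back before they can be dropped. */
		if (PageDirty(page)) {
			if (mode != MIGRATE_SYNC)
				return -EBUSY;
			/* writeout() is the static helper in mm/migrate.c */
			if (writeout(mapping, page))
				return -EAGAIN;
		}

		if (!try_to_free_buffers(page)) {
			/*
			 * The buffers may be pinned only by the per-CPU bh
			 * LRUs; invalidate those and retry once before
			 * failing the migration.
			 */
			invalidate_bh_lrus();
			if (!try_to_free_buffers(page))
				return -EAGAIN;
		}

		return migrate_page(mapping, newpage, page, mode);
	}

The point is that a buffer pinned through the bh LRUs is never migrated
out from under its user; the migration simply fails with -EAGAIN and is
retried or abandoned.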
> Is this premature optimization? Holding ->private_lock while messing
> with the buffers would be the standard way of addressing this. The
> longer hold times *might* be an issue, but we don't know this, do we?
> If there are indeed such problems then they could be improved by, say,
> doing more of the newpage preparation prior to taking ->private_lock.
To some extent, we do not know how much of a problem this patch will be
either, or what the impact of avoiding dirty block pages during
migration is. So both approaches have their downsides.
However, failing a high-order allocation is typically benign, and it is
an inevitable problem that depends on the workload. I don't think we
could ever hit a case where there was enough spinning to cause a soft
lockup but, on the other hand, I don't think there is much scope for
doing more of the preparation steps before acquiring private_lock
either.
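For reference, the locking alternative would roughly amount to doing the
reference check and the page move under a single mapping->private_lock
critical section. A sketch of the check half, using the real b_count and
b_this_page fields of struct buffer_head but a hypothetical helper name:

	#include <linux/buffer_head.h>

	/*
	 * Sketch only: returns true if no buffer on the page holds a
	 * reference. The caller must already hold mapping->private_lock
	 * and, crucially, must keep holding it across the page move for
	 * the check to remain valid against __find_get_block().
	 * Assumes page_has_buffers(page) is true.
	 */
	static bool page_bh_refs_ok(struct page *page)
	{
		struct buffer_head *head = page_buffers(page);
		struct buffer_head *bh = head;

		do {
			if (atomic_read(&bh->b_count))
				return false; /* __find_get_block() won the race */
			bh = bh->b_this_page;
		} while (bh != head);

		return true;
	}

Keeping the lock held across the copy is what closes the window, at the
cost of the longer hold times discussed above.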