On Thu, Oct 24, 2019 at 11:03:20PM +0800, zhong jiang wrote:
> By reviewing the code, I found a race between iterating the radix tree
> and radix_tree_insert()/radix_tree_delete(): the iteration only accesses
> its slots under RCU protection, which keeps each slot dereference safe
> but does not prevent the radix tree itself from being changed underneath
> the walker.
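To make the race concrete: the pre-patch walker holds only rcu_read_lock(),
so a concurrent insert or delete (done under xa_lock) can reshape the tree
while the walk is in progress. A simplified sketch of the two sides, taken
from the shape of the code in the diff below (the index/page used by the
updater are placeholders):

	/* Walker (pre-patch): RCU keeps each slot dereference safe,
	 * but nothing stops the tree being modified under us. */
	rcu_read_lock();
	radix_tree_for_each_slot(slot, &mapping->i_pages, &iter, start) {
		...
		/* lock taken only around the tag update itself */
		xa_lock_irq(&mapping->i_pages);
		radix_tree_tag_set(&mapping->i_pages, iter.index,
				   MEMFD_TAG_PINNED);
		xa_unlock_irq(&mapping->i_pages);
	}
	rcu_read_unlock();

	/* Concurrent updater: holds xa_lock, which the walker never
	 * holds across the iteration, so the two can interleave. */
	xa_lock_irq(&mapping->i_pages);
	radix_tree_insert(&mapping->i_pages, index, page);
	xa_unlock_irq(&mapping->i_pages);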
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
The locking here now matches the locking in memfd_tag_pins() that was
changed in ef3038a573aa8bf2f3797b110f7244b55a0e519c (part of 4.20-rc1);
I didn't notice that I was fixing a bug when I changed the locking there.
The bug has been present since 05f65b5c70909ef686f865f0a85406d74d75f70f
(part of 3.17), so backports will need to go further back than 4.20.
This code has moved around a bit (it used to live in mm/shmem.c) and the
APIs have changed, so it will take some effort.
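For anyone doing the backport, the post-ef3038a573aa shape is the useful
reference: the XArray walk holds the lock for the whole iteration and only
drops it momentarily to reschedule. Roughly like this (a from-memory sketch
of the 4.20-era memfd_tag_pins(), not a verbatim copy; check the tree you
are backporting to):

	XA_STATE(xas, &mapping->i_pages, 0);
	struct page *page;

	xas_lock_irq(&xas);
	xas_for_each(&xas, page, ULONG_MAX) {
		if (xa_is_value(page))
			continue;
		if (page_count(page) - page_mapcount(page) > 1)
			xas_set_mark(&xas, MEMFD_TAG_PINNED);
		/* every few thousand entries: xas_pause(), drop the
		 * lock, cond_resched(), retake the lock */
	}
	xas_unlock_irq(&xas);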
> Cc: stable@vger.kernel.org
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
> ---
>  mm/memfd.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 2bb5e25..0b3fedc 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -37,8 +37,8 @@ static void memfd_tag_pins(struct address_space *mapping)
>
>  	lru_add_drain();
>  	start = 0;
> -	rcu_read_lock();
>
> +	xa_lock_irq(&mapping->i_pages);
>  	radix_tree_for_each_slot(slot, &mapping->i_pages, &iter, start) {
>  		page = radix_tree_deref_slot(slot);
>  		if (!page || radix_tree_exception(page)) {
> @@ -47,18 +47,16 @@ static void memfd_tag_pins(struct address_space *mapping)
>  				continue;
>  			}
>  		} else if (page_count(page) - page_mapcount(page) > 1) {
> -			xa_lock_irq(&mapping->i_pages);
>  			radix_tree_tag_set(&mapping->i_pages, iter.index,
>  					   MEMFD_TAG_PINNED);
> -			xa_unlock_irq(&mapping->i_pages);
>  		}
>
>  		if (need_resched()) {
>  			slot = radix_tree_iter_resume(slot, &iter);
> -			cond_resched_rcu();
> +			cond_resched_lock(&mapping->i_pages.xa_lock);
>  		}
>  	}
> -	rcu_read_unlock();
> +	xa_unlock_irq(&mapping->i_pages);
>  }
>
>  /*
> --
> 1.7.12.4
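One note on the resched path: cond_resched_lock() drops the given spinlock,
reschedules if required, and retakes it, which is why the patch pairs it
with radix_tree_iter_resume() so the iterator can continue safely after the
lock has been dropped. Paraphrased shape (not the verbatim kernel
implementation; see __cond_resched_lock() in kernel/sched/core.c):

	int cond_resched_lock(spinlock_t *lock)
	{
		if (spin_needbreak(lock) || need_resched()) {
			spin_unlock(lock);
			cond_resched();	/* may actually schedule here */
			spin_lock(lock);
			return 1;	/* we dropped the lock */
		}
		return 0;
	}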