On 2021/7/19 18:14, Boqun Feng wrote:
On Mon, Jul 19, 2021 at 03:43:00AM +0100, Matthew Wilcox wrote:
On Mon, Jul 19, 2021 at 10:24:18AM +0800, Zhouyi Zhou wrote:
Meanwhile, I examined the 5.12.17 by naked eye, and found a suspicious place that could possibly trigger that problem:
struct swap_info_struct *get_swap_device(swp_entry_t entry) { struct swap_info_struct *si; unsigned long offset;
if (!entry.val) goto out; si = swp_swap_info(entry); if (!si) goto bad_nofile;
rcu_read_lock(); if (data_race(!(si->flags & SWP_VALID))) goto unlock_out; offset = swp_offset(entry); if (offset >= si->max) goto unlock_out;
return si; bad_nofile: pr_err("%s: %s%08lx\n", __func__, Bad_file, entry.val); out: return NULL; unlock_out: rcu_read_unlock(); return NULL; } I guess the function "return si" without a rcu_read_unlock.
Yes, but the caller is supposed to call put_swap_device() which calls rcu_read_unlock(). See commit eb085574a752.
Right, but we need to make sure there is no sleepable function called before put_swap_device() called, and the call trace showed the following happened:
do_swap_page(): si = get_swap_device(): rcu_read_lock(); lock_page_or_retry(): might_sleep(); // call a sleepable function inside RCU read-side c.s. __lock_page_or_retry(): wait_on_page_bit_common(): schedule(): rcu_note_context_switch(); // Warn here put_swap_device(); rcu_read_unlock();
, which introduced by commit 2799e77529c2a
When in the commit 2799e77529c2a, we're using the percpu_ref to serialize against concurrent swapoff, i.e. there's percpu_ref inside get_swap_device() instead of rcu_read_lock(). Please see commit 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against concurrent swapoff") for detail.
Thanks.
[Copy the author]
Regards, Boqun
.