On Tue, Oct 15, 2024 at 02:36:23PM +0300, Ido Schimmel wrote:
Upstream commit c2368b19807a ("net: devlink: introduce "unregistering" mark and use it during devlinks iteration") in v6.0 introduced a race when unregistering a devlink instance that can result in RCU stalls and in the system completely locking up. Exact details and reproducer can be found here [1]. The bug was inadvertently fixed in v6.3 by upstream commit d77278196441 ("devlink: bump the instance index directly when iterating").
This patchset fixes the bug by backporting the second commit and a related dependency from v6.3 to v6.1.y while adjusting them to the devlink file structure in v6.1.y (net/devlink/{core.c,devl_internal.h} -> net/devlink/leftover.c).
Tested by running the devlink tests under tools/testing/selftests/drivers/net/netdevsim/ and the reproducer mentioned in [1].
[1] https://lore.kernel.org/stable/20241001112035.973187-1-idosch@nvidia.com/
Jakub Kicinski (2):
  devlink: drop the filter argument from devlinks_xa_find_get
  devlink: bump the instance index directly when iterating

 net/devlink/leftover.c | 40 ++++++++++------------------------------
 1 file changed, 10 insertions(+), 30 deletions(-)
--
2.47.0
Both now queued up, thanks.
greg k-h