From: Filipe Manana <fdmanana(a)suse.com>
commit e0391e92f9ab4fb3dbdeb139c967dcfa7ac4b115 upstream.
If we do a direct IO sync write, at btrfs_sync_file(), and we need to skip
inode logging or we get an error starting a transaction or an error when
flushing delalloc, we end up unlocking the inode when we shouldn't under
the 'out_release_extents' label, and then unlock it again at
btrfs_direct_write().
Fix that by checking if we have to skip inode unlocking under that label.
Reported-by: syzbot+7dbbb74af6291b5a5a8b(a)syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/000000000000dfd631061eaeb4bc@google.com/
Fixes: 939b656bc8ab ("btrfs: fix corruption after buffer fault in during direct IO append write")
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
---
fs/btrfs/file.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 7ca49c02e8f8..c44dfb4370d7 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2433,7 +2433,10 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
out_release_extents:
btrfs_release_log_ctx_extents(&ctx);
- btrfs_inode_unlock(inode, BTRFS_ILOCK_MMAP);
+ if (skip_ilock)
+ up_write(&BTRFS_I(inode)->i_mmap_lock);
+ else
+ btrfs_inode_unlock(inode, BTRFS_ILOCK_MMAP);
goto out;
}
--
2.43.0
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x cd7c957f936f8cb80d03e5152f4013aae65bd986
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024081244-uncertain-snarl-e4f6@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
cd7c957f936f ("mptcp: pm: don't try to create sf if alloc failed")
c95eb32ced82 ("mptcp: pm: reduce indentation blocks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cd7c957f936f8cb80d03e5152f4013aae65bd986 Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Wed, 31 Jul 2024 13:05:56 +0200
Subject: [PATCH] mptcp: pm: don't try to create sf if alloc failed
It sounds better to avoid wasting cycles and / or put extreme memory
pressure on the system by trying to create new subflows if it was not
possible to add a new item in the announce list.
While at it, a warning is now printed if the entry was already in the
list as it should not happen with the in-kernel path-manager. With this
PM, mptcp_pm_alloc_anno_list() should only fail in case of memory
pressure.
Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink")
Cc: stable(a)vger.kernel.org
Suggested-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 780f4cca165c..2be7af377cda 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -348,7 +348,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
add_entry = mptcp_lookup_anno_list_by_saddr(msk, addr);
if (add_entry) {
- if (mptcp_pm_is_kernel(msk))
+ if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk)))
return false;
sk_reset_timer(sk, &add_entry->add_timer,
@@ -555,8 +555,6 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
/* check first for announce */
if (msk->pm.add_addr_signaled < add_addr_signal_max) {
- local = select_signal_address(pernet, msk);
-
/* due to racing events on both ends we can reach here while
* previous add address is still running: if we invoke now
* mptcp_pm_announce_addr(), that will fail and the
@@ -567,11 +565,15 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL))
return;
+ local = select_signal_address(pernet, msk);
if (!local)
goto subflow;
+ /* If the alloc fails, we are on memory pressure, not worth
+ * continuing, and trying to create subflows.
+ */
if (!mptcp_pm_alloc_anno_list(msk, &local->addr))
- goto subflow;
+ return;
__clear_bit(local->addr.id, msk->pm.id_avail_bitmap);
msk->pm.add_addr_signaled++;
This reverts commit 052c9f0c140c78e5d6826688b5d2f33e19db0ae7.
If the test doesn't pass, we can get this error:
# ./simult_flows.sh: line 275: mptcp_lib_subtest_is_flaky: command not found
This patch is not needed in v6.6: it is there to mark a test as "flaky",
but the MPTCP selftests infrastructure in v6.6 doesn't support them. So
it looks better to revert this patch.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
tools/testing/selftests/net/mptcp/simult_flows.sh | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/net/mptcp/simult_flows.sh b/tools/testing/selftests/net/mptcp/simult_flows.sh
index be97a7ed0950..f24bd2bf0831 100755
--- a/tools/testing/selftests/net/mptcp/simult_flows.sh
+++ b/tools/testing/selftests/net/mptcp/simult_flows.sh
@@ -262,7 +262,7 @@ run_test()
do_transfer $small $large $time
lret=$?
mptcp_lib_result_code "${lret}" "${msg}"
- if [ $lret -ne 0 ] && ! mptcp_lib_subtest_is_flaky; then
+ if [ $lret -ne 0 ]; then
ret=$lret
[ $bail -eq 0 ] || exit $ret
fi
@@ -272,7 +272,7 @@ run_test()
do_transfer $large $small $time
lret=$?
mptcp_lib_result_code "${lret}" "${msg}"
- if [ $lret -ne 0 ] && ! mptcp_lib_subtest_is_flaky; then
+ if [ $lret -ne 0 ]; then
ret=$lret
[ $bail -eq 0 ] || exit $ret
fi
@@ -305,7 +305,7 @@ run_test 10 10 0 0 "balanced bwidth"
run_test 10 10 1 25 "balanced bwidth with unbalanced delay"
# we still need some additional infrastructure to pass the following test-cases
-MPTCP_LIB_SUBTEST_FLAKY=1 run_test 10 3 0 0 "unbalanced bwidth"
+run_test 10 3 0 0 "unbalanced bwidth"
run_test 10 3 1 25 "unbalanced bwidth with unbalanced delay"
run_test 10 3 25 1 "unbalanced bwidth with opposed, unbalanced delay"
--
2.45.2
From: Filipe Manana <fdmanana(a)suse.com>
commit e0391e92f9ab4fb3dbdeb139c967dcfa7ac4b115 upstream.
If we do a direct IO sync write, at btrfs_sync_file(), and we need to skip
inode logging or we get an error starting a transaction or an error when
flushing delalloc, we end up unlocking the inode when we shouldn't under
the 'out_release_extents' label, and then unlock it again at
btrfs_direct_write().
Fix that by checking if we have to skip inode unlocking under that label.
Reported-by: syzbot+7dbbb74af6291b5a5a8b(a)syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/000000000000dfd631061eaeb4bc@google.com/
Fixes: 939b656bc8ab ("btrfs: fix corruption after buffer fault in during direct IO append write")
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
---
fs/btrfs/file.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 7c3ae295fdb5..e23d178f9778 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2037,7 +2037,10 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
out_release_extents:
btrfs_release_log_ctx_extents(&ctx);
- btrfs_inode_unlock(inode, BTRFS_ILOCK_MMAP);
+ if (skip_ilock)
+ up_write(&BTRFS_I(inode)->i_mmap_lock);
+ else
+ btrfs_inode_unlock(inode, BTRFS_ILOCK_MMAP);
goto out;
}
--
2.43.0
From: Filipe Manana <fdmanana(a)suse.com>
commit e0391e92f9ab4fb3dbdeb139c967dcfa7ac4b115 upstream.
If we do a direct IO sync write, at btrfs_sync_file(), and we need to skip
inode logging or we get an error starting a transaction or an error when
flushing delalloc, we end up unlocking the inode when we shouldn't under
the 'out_release_extents' label, and then unlock it again at
btrfs_direct_write().
Fix that by checking if we have to skip inode unlocking under that label.
Reported-by: syzbot+7dbbb74af6291b5a5a8b(a)syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/000000000000dfd631061eaeb4bc@google.com/
Fixes: 939b656bc8ab ("btrfs: fix corruption after buffer fault in during direct IO append write")
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
---
fs/btrfs/file.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index a47a7bbf9b7e..952cf145c629 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2038,7 +2038,10 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
out_release_extents:
btrfs_release_log_ctx_extents(&ctx);
- btrfs_inode_unlock(BTRFS_I(inode), BTRFS_ILOCK_MMAP);
+ if (skip_ilock)
+ up_write(&BTRFS_I(inode)->i_mmap_lock);
+ else
+ btrfs_inode_unlock(BTRFS_I(inode), BTRFS_ILOCK_MMAP);
goto out;
}
--
2.43.0
From: Filipe Manana <fdmanana(a)suse.com>
commit e0391e92f9ab4fb3dbdeb139c967dcfa7ac4b115 upstream.
If we do a direct IO sync write, at btrfs_sync_file(), and we need to skip
inode logging or we get an error starting a transaction or an error when
flushing delalloc, we end up unlocking the inode when we shouldn't under
the 'out_release_extents' label, and then unlock it again at
btrfs_direct_write().
Fix that by checking if we have to skip inode unlocking under that label.
Reported-by: syzbot+7dbbb74af6291b5a5a8b(a)syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/000000000000dfd631061eaeb4bc@google.com/
Fixes: 939b656bc8ab ("btrfs: fix corruption after buffer fault in during direct IO append write")
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
---
fs/btrfs/file.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 35ce1c810bd3..ca434f0cd27f 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2080,7 +2080,10 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
out_release_extents:
btrfs_release_log_ctx_extents(&ctx);
- btrfs_inode_unlock(BTRFS_I(inode), BTRFS_ILOCK_MMAP);
+ if (skip_ilock)
+ up_write(&BTRFS_I(inode)->i_mmap_lock);
+ else
+ btrfs_inode_unlock(BTRFS_I(inode), BTRFS_ILOCK_MMAP);
goto out;
}
--
2.43.0
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x cd7c957f936f8cb80d03e5152f4013aae65bd986
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024081244-smock-nearest-c09a@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
cd7c957f936f ("mptcp: pm: don't try to create sf if alloc failed")
c95eb32ced82 ("mptcp: pm: reduce indentation blocks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cd7c957f936f8cb80d03e5152f4013aae65bd986 Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Wed, 31 Jul 2024 13:05:56 +0200
Subject: [PATCH] mptcp: pm: don't try to create sf if alloc failed
It sounds better to avoid wasting cycles and / or put extreme memory
pressure on the system by trying to create new subflows if it was not
possible to add a new item in the announce list.
While at it, a warning is now printed if the entry was already in the
list as it should not happen with the in-kernel path-manager. With this
PM, mptcp_pm_alloc_anno_list() should only fail in case of memory
pressure.
Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink")
Cc: stable(a)vger.kernel.org
Suggested-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 780f4cca165c..2be7af377cda 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -348,7 +348,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
add_entry = mptcp_lookup_anno_list_by_saddr(msk, addr);
if (add_entry) {
- if (mptcp_pm_is_kernel(msk))
+ if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk)))
return false;
sk_reset_timer(sk, &add_entry->add_timer,
@@ -555,8 +555,6 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
/* check first for announce */
if (msk->pm.add_addr_signaled < add_addr_signal_max) {
- local = select_signal_address(pernet, msk);
-
/* due to racing events on both ends we can reach here while
* previous add address is still running: if we invoke now
* mptcp_pm_announce_addr(), that will fail and the
@@ -567,11 +565,15 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL))
return;
+ local = select_signal_address(pernet, msk);
if (!local)
goto subflow;
+ /* If the alloc fails, we are on memory pressure, not worth
+ * continuing, and trying to create subflows.
+ */
if (!mptcp_pm_alloc_anno_list(msk, &local->addr))
- goto subflow;
+ return;
__clear_bit(local->addr.id, msk->pm.id_avail_bitmap);
msk->pm.add_addr_signaled++;
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x e0391e92f9ab4fb3dbdeb139c967dcfa7ac4b115
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024081250-taco-simmering-1043@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
e0391e92f9ab ("btrfs: fix double inode unlock for direct IO sync writes")
56b7169f691c ("btrfs: use a btrfs_inode local variable at btrfs_sync_file()")
e641e323abb3 ("btrfs: pass a btrfs_inode to btrfs_wait_ordered_range()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e0391e92f9ab4fb3dbdeb139c967dcfa7ac4b115 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Fri, 2 Aug 2024 09:38:51 +0100
Subject: [PATCH] btrfs: fix double inode unlock for direct IO sync writes
If we do a direct IO sync write, at btrfs_sync_file(), and we need to skip
inode logging or we get an error starting a transaction or an error when
flushing delalloc, we end up unlocking the inode when we shouldn't under
the 'out_release_extents' label, and then unlock it again at
btrfs_direct_write().
Fix that by checking if we have to skip inode unlocking under that label.
Reported-by: syzbot+7dbbb74af6291b5a5a8b(a)syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/000000000000dfd631061eaeb4bc@google.com/
Fixes: 939b656bc8ab ("btrfs: fix corruption after buffer fault in during direct IO append write")
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 9f10a9f23fcc..9914419f3b7d 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1868,7 +1868,10 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
out_release_extents:
btrfs_release_log_ctx_extents(&ctx);
- btrfs_inode_unlock(inode, BTRFS_ILOCK_MMAP);
+ if (skip_ilock)
+ up_write(&inode->i_mmap_lock);
+ else
+ btrfs_inode_unlock(inode, BTRFS_ILOCK_MMAP);
goto out;
}