January 2024 - Linux-stable-mirror

FAILED: patch "[PATCH] mptcp: move __mptcp_error_report in protocol.c" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x d5fbeff1ab812b6c473b6924bee8748469462e2c # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023100421-divisible-bacterium-18b5@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From d5fbeff1ab812b6c473b6924bee8748469462e2c Mon Sep 17 00:00:00 2001 From: Paolo Abeni <pabeni(a)redhat.com> Date: Sat, 16 Sep 2023 12:52:46 +0200 Subject: [PATCH] mptcp: move __mptcp_error_report in protocol.c This will simplify the next patch ("mptcp: process pending subflow error on close"). No functional change intended. Cc: stable(a)vger.kernel.org # v5.12+ Signed-off-by: Paolo Abeni <pabeni(a)redhat.com> Reviewed-by: Mat Martineau <martineau(a)kernel.org> Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index a7fc16f5175d..915860027b1a 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -770,6 +770,42 @@ static bool __mptcp_ofo_queue(struct mptcp_sock *msk) return moved; } +void __mptcp_error_report(struct sock *sk) +{ + struct mptcp_subflow_context *subflow; + struct mptcp_sock *msk = mptcp_sk(sk); + + mptcp_for_each_subflow(msk, subflow) { + struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + int err = sock_error(ssk); + int ssk_state; + + if (!err) + continue; + + /* only propagate errors on fallen-back sockets or + * on MPC connect + */ + if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(msk)) + continue; + + /* We need to propagate only transition to CLOSE state. + * Orphaned socket will see such state change via + * subflow_sched_work_if_closed() and that path will properly + * destroy the msk as needed. + */ + ssk_state = inet_sk_state_load(ssk); + if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD)) + inet_sk_state_store(sk, ssk_state); + WRITE_ONCE(sk->sk_err, -err); + + /* This barrier is coupled with smp_rmb() in mptcp_poll() */ + smp_wmb(); + sk_error_report(sk); + break; + } +} + /* In most cases we will be able to lock the mptcp socket. If its already * owned, we need to defer to the work queue to avoid ABBA deadlock. */ diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 9bf3c7bc1762..2f40c23fdb0d 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1362,42 +1362,6 @@ void mptcp_space(const struct sock *ssk, int *space, int *full_space) *full_space = mptcp_win_from_space(sk, READ_ONCE(sk->sk_rcvbuf)); } -void __mptcp_error_report(struct sock *sk) -{ - struct mptcp_subflow_context *subflow; - struct mptcp_sock *msk = mptcp_sk(sk); - - mptcp_for_each_subflow(msk, subflow) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); - int err = sock_error(ssk); - int ssk_state; - - if (!err) - continue; - - /* only propagate errors on fallen-back sockets or - * on MPC connect - */ - if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(msk)) - continue; - - /* We need to propagate only transition to CLOSE state. - * Orphaned socket will see such state change via - * subflow_sched_work_if_closed() and that path will properly - * destroy the msk as needed. - */ - ssk_state = inet_sk_state_load(ssk); - if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD)) - inet_sk_state_store(sk, ssk_state); - WRITE_ONCE(sk->sk_err, -err); - - /* This barrier is coupled with smp_rmb() in mptcp_poll() */ - smp_wmb(); - sk_error_report(sk); - break; - } -} - static void subflow_error_report(struct sock *ssk) { struct sock *sk = mptcp_subflow_ctx(ssk)->conn;

1 year, 3 months

3
2
0 0

[PATCH] block: Remove special-casing of compound pages

by Matthew Wilcox (Oracle)

The special casing was originally added in pre-git history; reproducing the commit log here: > commit a318a92567d77 > Author: Andrew Morton <akpm(a)osdl.org> > Date: Sun Sep 21 01:42:22 2003 -0700 > > [PATCH] Speed up direct-io hugetlbpage handling > > This patch short-circuits all the direct-io page dirtying logic for > higher-order pages. Without this, we pointlessly bounce BIOs up to > keventd all the time. In the last twenty years, compound pages have become used for more than just hugetlb. Rewrite these functions to operate on folios instead of pages and remove the special case for hugetlbfs; I don't think it's needed any more (and if it is, we can put it back in as a call to folio_test_hugetlb()). This was found by inspection; as far as I can tell, this bug can lead to pages used as the destination of a direct I/O read not being marked as dirty. If those pages are then reclaimed by the MM without being dirtied for some other reason, they won't be written out. Then when they're faulted back in, they will not contain the data they should. It'll take a pretty unusual setup to produce this problem with several races all going the wrong way. This problem predates the folio work; it could for example have been triggered by mmaping a THP in tmpfs and using that as the target of an O_DIRECT read. Fixes: 800d8c63b2e98 ("shmem: add huge pages support") Cc: stable(a)vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org> --- block/bio.c | 46 ++++++++++++++++++++++++---------------------- 1 file changed, 24 insertions(+), 22 deletions(-) diff --git a/block/bio.c b/block/bio.c index 8672179213b9..f46d8ec71fbd 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1171,13 +1171,22 @@ EXPORT_SYMBOL(bio_add_folio); void __bio_release_pages(struct bio *bio, bool mark_dirty) { - struct bvec_iter_all iter_all; - struct bio_vec *bvec; + struct folio_iter fi; + + bio_for_each_folio_all(fi, bio) { + struct page *page; + size_t done = 0; - bio_for_each_segment_all(bvec, bio, iter_all) { - if (mark_dirty && !PageCompound(bvec->bv_page)) - set_page_dirty_lock(bvec->bv_page); - bio_release_page(bio, bvec->bv_page); + if (mark_dirty) { + folio_lock(fi.folio); + folio_mark_dirty(fi.folio); + folio_unlock(fi.folio); + } + page = folio_page(fi.folio, fi.offset / PAGE_SIZE); + do { + bio_release_page(bio, page++); + done += PAGE_SIZE; + } while (done < fi.length); } } EXPORT_SYMBOL_GPL(__bio_release_pages); @@ -1455,18 +1464,12 @@ EXPORT_SYMBOL(bio_free_pages); * bio_set_pages_dirty() and bio_check_pages_dirty() are support functions * for performing direct-IO in BIOs. * - * The problem is that we cannot run set_page_dirty() from interrupt context + * The problem is that we cannot run folio_mark_dirty() from interrupt context * because the required locks are not interrupt-safe. So what we can do is to * mark the pages dirty _before_ performing IO. And in interrupt context, * check that the pages are still dirty. If so, fine. If not, redirty them * in process context. * - * We special-case compound pages here: normally this means reads into hugetlb - * pages. The logic in here doesn't really work right for compound pages - * because the VM does not uniformly chase down the head page in all cases. - * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't - * handle them at all. So we skip compound pages here at an early stage. - * * Note that this code is very hard to test under normal circumstances because * direct-io pins the pages with get_user_pages(). This makes * is_page_cache_freeable return false, and the VM will not clean the pages. @@ -1482,12 +1485,12 @@ EXPORT_SYMBOL(bio_free_pages); */ void bio_set_pages_dirty(struct bio *bio) { - struct bio_vec *bvec; - struct bvec_iter_all iter_all; + struct folio_iter fi; - bio_for_each_segment_all(bvec, bio, iter_all) { - if (!PageCompound(bvec->bv_page)) - set_page_dirty_lock(bvec->bv_page); + bio_for_each_folio_all(fi, bio) { + folio_lock(fi.folio); + folio_mark_dirty(fi.folio); + folio_unlock(fi.folio); } } @@ -1530,12 +1533,11 @@ static void bio_dirty_fn(struct work_struct *work) void bio_check_pages_dirty(struct bio *bio) { - struct bio_vec *bvec; + struct folio_iter fi; unsigned long flags; - struct bvec_iter_all iter_all; - bio_for_each_segment_all(bvec, bio, iter_all) { - if (!PageDirty(bvec->bv_page) && !PageCompound(bvec->bv_page)) + bio_for_each_folio_all(fi, bio) { + if (!folio_test_dirty(fi.folio)) goto defer; } -- 2.40.1

1 year, 4 months

7
10
0 0

FAILED: patch "[PATCH] mptcp: rename timer related helper to less confusing names" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x f6909dc1c1f4452879278128012da6c76bc186a5 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023100402-launder-percent-3d59@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From f6909dc1c1f4452879278128012da6c76bc186a5 Mon Sep 17 00:00:00 2001 From: Paolo Abeni <pabeni(a)redhat.com> Date: Sat, 16 Sep 2023 12:52:48 +0200 Subject: [PATCH] mptcp: rename timer related helper to less confusing names The msk socket uses to different timeout to track close related events and retransmissions. The existing helpers do not indicate clearly which timer they actually touch, making the related code quite confusing. Change the existing helpers name to avoid such confusion. No functional change intended. This patch is linked to the next one ("mptcp: fix dangling connection hang-up"). The two patches are supposed to be backported together. Cc: stable(a)vger.kernel.org # v5.11+ Signed-off-by: Paolo Abeni <pabeni(a)redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts(a)tessares.net> Reviewed-by: Mat Martineau <martineau(a)kernel.org> Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 1c96b8da71df..c8f38f303a90 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -405,7 +405,7 @@ static bool __mptcp_move_skb(struct mptcp_sock *msk, struct sock *ssk, return false; } -static void mptcp_stop_timer(struct sock *sk) +static void mptcp_stop_rtx_timer(struct sock *sk) { struct inet_connection_sock *icsk = inet_csk(sk); @@ -911,12 +911,12 @@ static void __mptcp_flush_join_list(struct sock *sk, struct list_head *join_list } } -static bool mptcp_timer_pending(struct sock *sk) +static bool mptcp_rtx_timer_pending(struct sock *sk) { return timer_pending(&inet_csk(sk)->icsk_retransmit_timer); } -static void mptcp_reset_timer(struct sock *sk) +static void mptcp_reset_rtx_timer(struct sock *sk) { struct inet_connection_sock *icsk = inet_csk(sk); unsigned long tout; @@ -1050,10 +1050,10 @@ static void __mptcp_clean_una(struct sock *sk) out: if (snd_una == READ_ONCE(msk->snd_nxt) && snd_una == READ_ONCE(msk->write_seq)) { - if (mptcp_timer_pending(sk) && !mptcp_data_fin_enabled(msk)) - mptcp_stop_timer(sk); + if (mptcp_rtx_timer_pending(sk) && !mptcp_data_fin_enabled(msk)) + mptcp_stop_rtx_timer(sk); } else { - mptcp_reset_timer(sk); + mptcp_reset_rtx_timer(sk); } } @@ -1626,8 +1626,8 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) mptcp_push_release(ssk, &info); /* ensure the rtx timer is running */ - if (!mptcp_timer_pending(sk)) - mptcp_reset_timer(sk); + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk); if (do_check_data_fin) mptcp_check_send_data_fin(sk); } @@ -1690,8 +1690,8 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool if (copied) { tcp_push(ssk, 0, info.mss_now, tcp_sk(ssk)->nonagle, info.size_goal); - if (!mptcp_timer_pending(sk)) - mptcp_reset_timer(sk); + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk); if (msk->snd_data_fin_enable && msk->snd_nxt + 1 == msk->write_seq) @@ -2260,7 +2260,7 @@ static void mptcp_retransmit_timer(struct timer_list *t) sock_put(sk); } -static void mptcp_timeout_timer(struct timer_list *t) +static void mptcp_tout_timer(struct timer_list *t) { struct sock *sk = from_timer(sk, t, sk_timer); @@ -2629,14 +2629,14 @@ static void __mptcp_retrans(struct sock *sk) reset_timer: mptcp_check_and_set_pending(sk); - if (!mptcp_timer_pending(sk)) - mptcp_reset_timer(sk); + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk); } /* schedule the timeout timer for the relevant event: either close timeout * or mp_fail timeout. The close timeout takes precedence on the mp_fail one */ -void mptcp_reset_timeout(struct mptcp_sock *msk, unsigned long fail_tout) +void mptcp_reset_tout_timer(struct mptcp_sock *msk, unsigned long fail_tout) { struct sock *sk = (struct sock *)msk; unsigned long timeout, close_timeout; @@ -2669,7 +2669,7 @@ static void mptcp_mp_fail_no_response(struct mptcp_sock *msk) WRITE_ONCE(mptcp_subflow_ctx(ssk)->fail_tout, 0); unlock_sock_fast(ssk, slow); - mptcp_reset_timeout(msk, 0); + mptcp_reset_tout_timer(msk, 0); } static void mptcp_do_fastclose(struct sock *sk) @@ -2758,7 +2758,7 @@ static void __mptcp_init_sock(struct sock *sk) /* re-use the csk retrans timer for MPTCP-level retrans */ timer_setup(&msk->sk.icsk_retransmit_timer, mptcp_retransmit_timer, 0); - timer_setup(&sk->sk_timer, mptcp_timeout_timer, 0); + timer_setup(&sk->sk_timer, mptcp_tout_timer, 0); } static void mptcp_ca_reset(struct sock *sk) @@ -2849,8 +2849,8 @@ void mptcp_subflow_shutdown(struct sock *sk, struct sock *ssk, int how) } else { pr_debug("Sending DATA_FIN on subflow %p", ssk); tcp_send_ack(ssk); - if (!mptcp_timer_pending(sk)) - mptcp_reset_timer(sk); + if (!mptcp_rtx_timer_pending(sk)) + mptcp_reset_rtx_timer(sk); } break; } @@ -2933,7 +2933,7 @@ static void __mptcp_destroy_sock(struct sock *sk) might_sleep(); - mptcp_stop_timer(sk); + mptcp_stop_rtx_timer(sk); sk_stop_timer(sk, &sk->sk_timer); msk->pm.status = 0; mptcp_release_sched(msk); @@ -3053,7 +3053,7 @@ bool __mptcp_close(struct sock *sk, long timeout) __mptcp_destroy_sock(sk); do_cancel_work = true; } else { - mptcp_reset_timeout(msk, 0); + mptcp_reset_tout_timer(msk, 0); } return do_cancel_work; @@ -3116,7 +3116,7 @@ static int mptcp_disconnect(struct sock *sk, int flags) mptcp_check_listen_stop(sk); inet_sk_state_store(sk, TCP_CLOSE); - mptcp_stop_timer(sk); + mptcp_stop_rtx_timer(sk); sk_stop_timer(sk, &sk->sk_timer); if (msk->token) diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 7254b3562575..5e2026815c8e 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -718,7 +718,7 @@ void mptcp_get_options(const struct sk_buff *skb, void mptcp_finish_connect(struct sock *sk); void __mptcp_set_connected(struct sock *sk); -void mptcp_reset_timeout(struct mptcp_sock *msk, unsigned long fail_tout); +void mptcp_reset_tout_timer(struct mptcp_sock *msk, unsigned long fail_tout); static inline bool mptcp_is_fully_established(struct sock *sk) { return inet_sk_state_load(sk) == TCP_ESTABLISHED && diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 2f40c23fdb0d..433f290984c8 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1226,7 +1226,7 @@ static void mptcp_subflow_fail(struct mptcp_sock *msk, struct sock *ssk) WRITE_ONCE(subflow->fail_tout, fail_tout); tcp_send_ack(ssk); - mptcp_reset_timeout(msk, subflow->fail_tout); + mptcp_reset_tout_timer(msk, subflow->fail_tout); } static bool subflow_check_data_avail(struct sock *ssk)

1 year, 4 months

2
1
0 0

FAILED: patch "[PATCH] mptcp: process pending subflow error on close" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x 9f1a98813b4b686482e5ef3c9d998581cace0ba6 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023100447-durable-snowiness-8b36@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 9f1a98813b4b686482e5ef3c9d998581cace0ba6 Mon Sep 17 00:00:00 2001 From: Paolo Abeni <pabeni(a)redhat.com> Date: Sat, 16 Sep 2023 12:52:47 +0200 Subject: [PATCH] mptcp: process pending subflow error on close On incoming TCP reset, subflow closing could happen before error propagation. That in turn could cause the socket error being ignored, and a missing socket state transition, as reported by Daire-Byrne. Address the issues explicitly checking for subflow socket error at close time. To avoid code duplication, factor-out of __mptcp_error_report() a new helper implementing the relevant bits. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/429 Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk") Cc: stable(a)vger.kernel.org Signed-off-by: Paolo Abeni <pabeni(a)redhat.com> Reviewed-by: Mat Martineau <martineau(a)kernel.org> Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 915860027b1a..1c96b8da71df 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -770,40 +770,44 @@ static bool __mptcp_ofo_queue(struct mptcp_sock *msk) return moved; } +static bool __mptcp_subflow_error_report(struct sock *sk, struct sock *ssk) +{ + int err = sock_error(ssk); + int ssk_state; + + if (!err) + return false; + + /* only propagate errors on fallen-back sockets or + * on MPC connect + */ + if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(mptcp_sk(sk))) + return false; + + /* We need to propagate only transition to CLOSE state. + * Orphaned socket will see such state change via + * subflow_sched_work_if_closed() and that path will properly + * destroy the msk as needed. + */ + ssk_state = inet_sk_state_load(ssk); + if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD)) + inet_sk_state_store(sk, ssk_state); + WRITE_ONCE(sk->sk_err, -err); + + /* This barrier is coupled with smp_rmb() in mptcp_poll() */ + smp_wmb(); + sk_error_report(sk); + return true; +} + void __mptcp_error_report(struct sock *sk) { struct mptcp_subflow_context *subflow; struct mptcp_sock *msk = mptcp_sk(sk); - mptcp_for_each_subflow(msk, subflow) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); - int err = sock_error(ssk); - int ssk_state; - - if (!err) - continue; - - /* only propagate errors on fallen-back sockets or - * on MPC connect - */ - if (sk->sk_state != TCP_SYN_SENT && !__mptcp_check_fallback(msk)) - continue; - - /* We need to propagate only transition to CLOSE state. - * Orphaned socket will see such state change via - * subflow_sched_work_if_closed() and that path will properly - * destroy the msk as needed. - */ - ssk_state = inet_sk_state_load(ssk); - if (ssk_state == TCP_CLOSE && !sock_flag(sk, SOCK_DEAD)) - inet_sk_state_store(sk, ssk_state); - WRITE_ONCE(sk->sk_err, -err); - - /* This barrier is coupled with smp_rmb() in mptcp_poll() */ - smp_wmb(); - sk_error_report(sk); - break; - } + mptcp_for_each_subflow(msk, subflow) + if (__mptcp_subflow_error_report(sk, mptcp_subflow_tcp_sock(subflow))) + break; } /* In most cases we will be able to lock the mptcp socket. If its already @@ -2428,6 +2432,7 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, } out_release: + __mptcp_subflow_error_report(sk, ssk); release_sock(ssk); sock_put(ssk);

1 year, 4 months

2
1
0 0

[PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

by Paolo Bonzini

Supersedes: <20240130180400.1698136-1-pbonzini(a)redhat.com> MKTME repurposes the high bit of physical address to key id for encryption key and, even though MAXPHYADDR in CPUID[0x80000008] remains the same, the valid bits in the MTRR mask register are based on the reduced number of physical address bits. This breaks boot on machines that have TME enabled and do something to cleanup MTRRs, unless "disable_mtrr_cleanup" is passed on the command line. The fix is to move the check to early CPU initialization, which runs before Linux sets up MTRRs. However, as noticed by Kirill, the patch I sent as v1 actually works only until Linux 6.6. In Linux 6.7, commit fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach") reorganized the initialization of c->x86_phys_bits in a way that broke the patch. But even in 6.7 AMD processors, which did try to reduce it in this_cpu->c_early_init(c), had their x86_phys_bits value overwritten by get_cpu_address_sizes(), so that early_identify_cpu() left the wrong value in x86_phys_bits. This probably went unnoticed because on AMD processors you need not apply the reduced MAXPHYADDR to MTRR masks. Therefore, this v2 prepends the fix for this issue in commit fbf6449f84bf. Apologies for the oversight. Tested on an AMD Epyc machine (where I resorted to dumping mtrr_state) and on the problematic Intel Emerald Rapids machine. Thanks, Paolo Paolo Bonzini (2): x86/cpu: allow reducing x86_phys_bits during early_identify_cpu() x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers arch/x86/kernel/cpu/common.c | 4 +- arch/x86/kernel/cpu/intel.c | 178 ++++++++++++++++++----------------- 2 files changed, 93 insertions(+), 89 deletions(-) -- 2.43.0

1 year, 4 months

5
23
0 0

[PATCH v4] md/raid5: fix atomicity violation in raid5_cache_count

by Gui-Dong Han

In raid5_cache_count(): if (conf->max_nr_stripes < conf->min_nr_stripes) return 0; return conf->max_nr_stripes - conf->min_nr_stripes; The current check is ineffective, as the values could change immediately after being checked. In raid5_set_cache_size(): ... conf->min_nr_stripes = size; ... while (size > conf->max_nr_stripes) conf->min_nr_stripes = conf->max_nr_stripes; ... Due to intermediate value updates in raid5_set_cache_size(), concurrent execution of raid5_cache_count() and raid5_set_cache_size() may lead to inconsistent reads of conf->max_nr_stripes and conf->min_nr_stripes. The current checks are ineffective as values could change immediately after being checked, raising the risk of conf->min_nr_stripes exceeding conf->max_nr_stripes and potentially causing an integer overflow. This possible bug is found by an experimental static analysis tool developed by our team. This tool analyzes the locking APIs to extract function pairs that can be concurrently executed, and then analyzes the instructions in the paired functions to identify possible concurrency bugs including data races and atomicity violations. The above possible bug is reported when our tool analyzes the source code of Linux 6.2. To resolve this issue, it is suggested to introduce local variables 'min_stripes' and 'max_stripes' in raid5_cache_count() to ensure the values remain stable throughout the check. Adding locks in raid5_cache_count() fails to resolve atomicity violations, as raid5_set_cache_size() may hold intermediate values of conf->min_nr_stripes while unlocked. With this patch applied, our tool no longer reports the bug, with the kernel configuration allyesconfig for x86_64. Due to the lack of associated hardware, we cannot test the patch in runtime testing, and just verify it according to the code logic. Fixes: edbe83ab4c27 ("md/raid5: allow the stripe_cache to grow and shrink.") Cc: stable(a)vger.kernel.org Signed-off-by: Gui-Dong Han <2045gemini(a)gmail.com> --- v2: * In this patch v2, we've updated to use READ_ONCE() instead of direct reads for accessing max_nr_stripes and min_nr_stripes, since read and write can concurrent. Thank Yu Kuai for helpful advice. --- v3: * In this patch v3, we've updated to use WRITE_ONCE() in raid5_set_cache_size(), grow_one_stripe() and drop_one_stripe(), in order to pair READ_ONCE() with WRITE_ONCE(). Thank Yu Kuai for helpful advice. --- v4: * In this patch v4, we've addressed several code style issues. Thank Yu Kuai for helpful advice. --- drivers/md/raid5.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 8497880135ee..30e118d10c0b 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -2412,7 +2412,7 @@ static int grow_one_stripe(struct r5conf *conf, gfp_t gfp) atomic_inc(&conf->active_stripes); raid5_release_stripe(sh); - conf->max_nr_stripes++; + WRITE_ONCE(conf->max_nr_stripes, conf->max_nr_stripes + 1); return 1; } @@ -2707,7 +2707,7 @@ static int drop_one_stripe(struct r5conf *conf) shrink_buffers(sh); free_stripe(conf->slab_cache, sh); atomic_dec(&conf->active_stripes); - conf->max_nr_stripes--; + WRITE_ONCE(conf->max_nr_stripes, conf->max_nr_stripes - 1); return 1; } @@ -6820,7 +6820,7 @@ raid5_set_cache_size(struct mddev *mddev, int size) if (size <= 16 || size > 32768) return -EINVAL; - conf->min_nr_stripes = size; + WRITE_ONCE(conf->min_nr_stripes, size); mutex_lock(&conf->cache_size_mutex); while (size < conf->max_nr_stripes && drop_one_stripe(conf)) @@ -6832,7 +6832,7 @@ raid5_set_cache_size(struct mddev *mddev, int size) mutex_lock(&conf->cache_size_mutex); while (size > conf->max_nr_stripes) if (!grow_one_stripe(conf, GFP_KERNEL)) { - conf->min_nr_stripes = conf->max_nr_stripes; + WRITE_ONCE(conf->min_nr_stripes, conf->max_nr_stripes); result = -ENOMEM; break; } @@ -7390,11 +7390,13 @@ static unsigned long raid5_cache_count(struct shrinker *shrink, struct shrink_control *sc) { struct r5conf *conf = shrink->private_data; + int max_stripes = READ_ONCE(conf->max_nr_stripes); + int min_stripes = READ_ONCE(conf->min_nr_stripes); - if (conf->max_nr_stripes < conf->min_nr_stripes) + if (max_stripes < min_stripes) /* unlikely, but not impossible */ return 0; - return conf->max_nr_stripes - conf->min_nr_stripes; + return max_stripes - min_stripes; } static struct r5conf *setup_conf(struct mddev *mddev) -- 2.34.1

1 year, 4 months

3
4
0 0

[PATCH v1] usb: typec: altmodes/displayport: add null pointer check for sysfs nodes

by RD Babiera

The DisplayPort driver's sysfs nodes may be present to the userspace before typec_altmode_set_drvdata() completes in dp_altmode_probe. This means that a sysfs read can trigger a NULL pointer error by deferencing dp->hpd in hpd_show or dp->lock in pin_assignment_show, as dev_get_drvdata() returns NULL in those cases. Verify dp drvdata is present in sysfs reads and writes before proceeding. Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode") Cc: stable(a)vger.kernel.org Signed-off-by: RD Babiera <rdbabiera(a)google.com> --- drivers/usb/typec/altmodes/displayport.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c index 5a80776c7255..0423326219d8 100644 --- a/drivers/usb/typec/altmodes/displayport.c +++ b/drivers/usb/typec/altmodes/displayport.c @@ -518,6 +518,9 @@ configuration_store(struct device *dev, struct device_attribute *attr, int con; int ret = 0; + if (!dp) + return -ENODEV; + con = sysfs_match_string(configurations, buf); if (con < 0) return con; @@ -563,6 +566,9 @@ static ssize_t configuration_show(struct device *dev, u8 cur; int i; + if (!dp) + return -ENODEV; + mutex_lock(&dp->lock); cap = DP_CAP_CAPABILITY(dp->alt->vdo); @@ -615,6 +621,9 @@ pin_assignment_store(struct device *dev, struct device_attribute *attr, u32 conf; int ret; + if (!dp) + return -ENODEV; + ret = sysfs_match_string(pin_assignments, buf); if (ret < 0) return ret; @@ -666,6 +675,9 @@ static ssize_t pin_assignment_show(struct device *dev, u8 cur; int i; + if (!dp) + return -ENODEV; + mutex_lock(&dp->lock); cur = get_count_order(DP_CONF_GET_PIN_ASSIGN(dp->data.conf)); @@ -698,6 +710,9 @@ static ssize_t hpd_show(struct device *dev, struct device_attribute *attr, char { struct dp_altmode *dp = dev_get_drvdata(dev); + if (!dp) + return -ENODEV; + return sysfs_emit(buf, "%d\n", dp->hpd); } static DEVICE_ATTR_RO(hpd); base-commit: f1a27f081c1fa1eeebf38406e45f29636114470f -- 2.43.0.429.g432eaa2c6b-goog

1 year, 4 months

2
2
0 0

Re: [REGRESSION 6.1.70] system calls with CIFS mounts failing with "Resource temporarily unavailable"

by Mohamed Abuelfotoh, Hazem

It looks like both 5.15.146 and 5.10.206 are impacted by this regression as they both have the bad commit 33eae65c6f (smb: client: fix OOB in SMB2_query_info_init()). We tried to apply the proposed fix eb3e28c1e89b ("smb3: Replace smb2pdu 1-element arrays with flex-arrays”) but there are a lot of dependencies required to do the backport. Is it possible to consider the simple fix that Paulo proposed as a solution for 5.10 and 5.15. We were lucky with 5.4 as it doesn’t have the bad commit because of merge conflict reported in https://lore.kernel.org/all/2023122857-doubling-crazed-27f4@gregkh/T/#m3aa0… diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index 05ff8a457a3d..aed5067661de 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -3556,7 +3556,7 @@ SMB2_query_info_init(struct cifs_tcon *tcon, struct TCP_Server_Info *server, iov[0].iov_base = (char *)req; /* 1 for Buffer */ - iov[0].iov_len = len; + iov[0].iov_len = len - 1; return 0; } Hazem

1 year, 4 months

9
19
0 0

FAILED: patch "[PATCH] erofs: fix lz4 inplace decompression" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y git checkout FETCH_HEAD git cherry-pick -x 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024012650-altitude-gush-572f@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^.. Possible dependencies: 3c12466b6b7b ("erofs: fix lz4 inplace decompression") 123ec246ebe3 ("erofs: get rid of the remaining kmap_atomic()") ab749badf9f4 ("erofs: support unaligned data decompression") 10e5f6e482e1 ("erofs: introduce z_erofs_fixup_insize") d67aee76d418 ("erofs: tidy up z_erofs_lz4_decompress") 7e508f2ca8bb ("erofs: rename lz4_0pading to zero_padding") eaa9172ad988 ("erofs: get rid of ->lru usage") 622ceaddb764 ("erofs: lzma compression support") 966edfb0a3dc ("erofs: rename some generic methods in decompressor") 386292919c25 ("erofs: introduce readmore decompression strategy") 8f89926290c4 ("erofs: get compression algorithms directly on mapping") dfeab2e95a75 ("erofs: add multiple device support") e62424651f43 ("erofs: decouple basic mount options from fs_context") 5b6e7e120e71 ("erofs: remove the fast path of per-CPU buffer decompression") 2e5fd489a4e5 ("Merge tag 'libnvdimm-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de Mon Sep 17 00:00:00 2001 From: Gao Xiang <xiang(a)kernel.org> Date: Wed, 6 Dec 2023 12:55:34 +0800 Subject: [PATCH] erofs: fix lz4 inplace decompression Currently EROFS can map another compressed buffer for inplace decompression, that was used to handle the cases that some pages of compressed data are actually not in-place I/O. However, like most simple LZ77 algorithms, LZ4 expects the compressed data is arranged at the end of the decompressed buffer and it explicitly uses memmove() to handle overlapping: __________________________________________________________ |_ direction of decompression --> ____ |_ compressed data _| Although EROFS arranges compressed data like this, it typically maps two individual virtual buffers so the relative order is uncertain. Previously, it was hardly observed since LZ4 only uses memmove() for short overlapped literals and x86/arm64 memmove implementations seem to completely cover it up and they don't have this issue. Juhyung reported that EROFS data corruption can be found on a new Intel x86 processor. After some analysis, it seems that recent x86 processors with the new FSRM feature expose this issue with "rep movsb". Let's strictly use the decompressed buffer for lz4 inplace decompression for now. Later, as an useful improvement, we could try to tie up these two buffers together in the correct order. Reported-and-tested-by: Juhyung Park <qkrwngud825(a)gmail.com> Closes: https://lore.kernel.org/r/CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR… Fixes: 0ffd71bcc3a0 ("staging: erofs: introduce LZ4 decompression inplace") Fixes: 598162d05080 ("erofs: support decompress big pcluster for lz4 backend") Cc: stable <stable(a)vger.kernel.org> # 5.4+ Tested-by: Yifan Zhao <zhaoyifan(a)sjtu.edu.cn> Signed-off-by: Gao Xiang <hsiangkao(a)linux.alibaba.com> Link: https://lore.kernel.org/r/20231206045534.3920847-1-hsiangkao@linux.alibaba.… diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c index 021be5feb1bc..e0d609c3958f 100644 --- a/fs/erofs/decompressor.c +++ b/fs/erofs/decompressor.c @@ -121,11 +121,11 @@ static int z_erofs_lz4_prepare_dstpages(struct z_erofs_lz4_decompress_ctx *ctx, } static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx, - void *inpage, unsigned int *inputmargin, int *maptype, - bool may_inplace) + void *inpage, void *out, unsigned int *inputmargin, + int *maptype, bool may_inplace) { struct z_erofs_decompress_req *rq = ctx->rq; - unsigned int omargin, total, i, j; + unsigned int omargin, total, i; struct page **in; void *src, *tmp; @@ -135,12 +135,13 @@ static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx, omargin < LZ4_DECOMPRESS_INPLACE_MARGIN(rq->inputsize)) goto docopy; - for (i = 0; i < ctx->inpages; ++i) { - DBG_BUGON(rq->in[i] == NULL); - for (j = 0; j < ctx->outpages - ctx->inpages + i; ++j) - if (rq->out[j] == rq->in[i]) - goto docopy; - } + for (i = 0; i < ctx->inpages; ++i) + if (rq->out[ctx->outpages - ctx->inpages + i] != + rq->in[i]) + goto docopy; + kunmap_local(inpage); + *maptype = 3; + return out + ((ctx->outpages - ctx->inpages) << PAGE_SHIFT); } if (ctx->inpages <= 1) { @@ -148,7 +149,6 @@ static void *z_erofs_lz4_handle_overlap(struct z_erofs_lz4_decompress_ctx *ctx, return inpage; } kunmap_local(inpage); - might_sleep(); src = erofs_vm_map_ram(rq->in, ctx->inpages); if (!src) return ERR_PTR(-ENOMEM); @@ -204,12 +204,12 @@ int z_erofs_fixup_insize(struct z_erofs_decompress_req *rq, const char *padbuf, } static int z_erofs_lz4_decompress_mem(struct z_erofs_lz4_decompress_ctx *ctx, - u8 *out) + u8 *dst) { struct z_erofs_decompress_req *rq = ctx->rq; bool support_0padding = false, may_inplace = false; unsigned int inputmargin; - u8 *headpage, *src; + u8 *out, *headpage, *src; int ret, maptype; DBG_BUGON(*rq->in == NULL); @@ -230,11 +230,12 @@ static int z_erofs_lz4_decompress_mem(struct z_erofs_lz4_decompress_ctx *ctx, } inputmargin = rq->pageofs_in; - src = z_erofs_lz4_handle_overlap(ctx, headpage, &inputmargin, + src = z_erofs_lz4_handle_overlap(ctx, headpage, dst, &inputmargin, &maptype, may_inplace); if (IS_ERR(src)) return PTR_ERR(src); + out = dst + rq->pageofs_out; /* legacy format could compress extra data in a pcluster. */ if (rq->partial_decoding || !support_0padding) ret = LZ4_decompress_safe_partial(src + inputmargin, out, @@ -265,7 +266,7 @@ static int z_erofs_lz4_decompress_mem(struct z_erofs_lz4_decompress_ctx *ctx, vm_unmap_ram(src, ctx->inpages); } else if (maptype == 2) { erofs_put_pcpubuf(src); - } else { + } else if (maptype != 3) { DBG_BUGON(1); return -EFAULT; } @@ -308,7 +309,7 @@ static int z_erofs_lz4_decompress(struct z_erofs_decompress_req *rq, } dstmap_out: - ret = z_erofs_lz4_decompress_mem(&ctx, dst + rq->pageofs_out); + ret = z_erofs_lz4_decompress_mem(&ctx, dst); if (!dst_maptype) kunmap_local(dst); else if (dst_maptype == 2)

1 year, 4 months

3
2
0 0

[PATCH v2] kbuild: tools: drop overridden CFLAGS from MAKEOVERRIDES

by Max Filippov

Some Makefiles under tools/ use the 'override CFLAGS += ...' construct to add a few required options to CFLAGS passed by the user. Unfortunately that only works when user passes CFLAGS as an environment variable, i.e. CFLAGS=... make ... and not in case when CFLAGS are passed as make command line arguments: make ... CFLAGS=... It happens because in the latter case CFLAGS=... is recorded in the make variable MAKEOVERRIDES and this variable is passed in its original form to all $(MAKE) subcommands, taking precedence over modified CFLAGS value passed in the environment variable. E.g. this causes build failure for gpio and iio tools when the build is run with user CFLAGS because of missing _GNU_SOURCE definition needed for the asprintf(). One way to fix it is by removing overridden variables from the MAKEOVERRIDES. Add macro 'drop-var-from-overrides' that removes a definition of a variable passed to it from the MAKEOVERRIDES and use it to fix CFLAGS passing for tools/gpio and tools/iio. This implementation tries to be precise in string processing and handle variables with embedded spaces and backslashes correctly. To achieve that it replaces every '\\' sequence with '\-' to make sure that every '\' in the resulting string is an escape character. It then replaces every '\ ' sequence with '\_' to turn string values with embedded spaces into single words. After filtering the overridden variable definition out of the resulting string these two transformations are reversed. Cc: stable(a)vger.kernel.org Fixes: 4ccc98a48958 ("tools gpio: Allow overriding CFLAGS") Fixes: 572974610273 ("tools iio: Override CFLAGS assignments") Signed-off-by: Max Filippov <jcmvbkbc(a)gmail.com> --- Changes v1->v2: - make drop-var-from-overrides-code work correctly with arbitrary variables, including thoses ending with '\'. tools/gpio/Makefile | 1 + tools/iio/Makefile | 1 + tools/scripts/Makefile.include | 9 +++++++++ 3 files changed, 11 insertions(+) diff --git a/tools/gpio/Makefile b/tools/gpio/Makefile index d29c9c49e251..46fc38d51639 100644 --- a/tools/gpio/Makefile +++ b/tools/gpio/Makefile @@ -24,6 +24,7 @@ ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS)) all: $(ALL_PROGRAMS) export srctree OUTPUT CC LD CFLAGS +$(call drop-var-from-overrides,CFLAGS) include $(srctree)/tools/build/Makefile.include # diff --git a/tools/iio/Makefile b/tools/iio/Makefile index fa720f062229..04307588dd3f 100644 --- a/tools/iio/Makefile +++ b/tools/iio/Makefile @@ -20,6 +20,7 @@ ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS)) all: $(ALL_PROGRAMS) export srctree OUTPUT CC LD CFLAGS +$(call drop-var-from-overrides,CFLAGS) include $(srctree)/tools/build/Makefile.include # diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include index 6fba29f3222d..0f68b95cf55c 100644 --- a/tools/scripts/Makefile.include +++ b/tools/scripts/Makefile.include @@ -51,6 +51,15 @@ define allow-override $(eval $(1) = $(2))) endef +# When a Makefile overrides a variable and exports it for the nested $(MAKE) +# invocations to use its modified value, it must remove that variable definition +# from the MAKEOVERRIDES variable, otherwise the original definition from the +# MAKEOVERRIDES takes precedence over the exported value. +drop-var-from-overrides = $(eval $(drop-var-from-overrides-code)) +define drop-var-from-overrides-code +MAKEOVERRIDES := $(subst \-,\\,$(subst \_,\ ,$(filter-out $(1)=%,$(subst \ ,\_,$(subst \\,\-,$(MAKEOVERRIDES)))))) +endef + ifneq ($(LLVM),) ifneq ($(filter %/,$(LLVM)),) LLVM_PREFIX := $(LLVM) -- 2.39.2

1 year, 4 months

2
2
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror January 2024