The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x dad6a09f3148257ac1773cd90934d721d68ab595
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2024021300-sagging-enhance-9113@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
dad6a09f3148 ("hrtimer: Report offline hrtimer enqueue")
5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
f61eff83cec9 ("hrtimer: Prepare support for PREEMPT_RT")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From dad6a09f3148257ac1773cd90934d721d68ab595 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <frederic@kernel.org>
Date: Mon, 29 Jan 2024 15:56:36 -0800
Subject: [PATCH] hrtimer: Report offline hrtimer enqueue
The hrtimer migration on the CPU-down hotplug path has been moved
earlier, before the CPU actually dies. This leaves a small window
of opportunity to queue an hrtimer in a blind spot, leaving it ignored.
For example, a practical case has been reported in which RCU wakes up a
SCHED_FIFO task right before the CPUHP_AP_IDLE_DEAD stage, thereby
queuing a sched/rt timer on the local offline CPU.
Make sure such situations never go unnoticed and warn when that happens.
Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240129235646.3171983-4-boqun.feng@gmail.com
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 87e3bedf8eb0..641c4567cfa7 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -157,6 +157,7 @@ enum hrtimer_base_type {
* @max_hang_time: Maximum time spent in hrtimer_interrupt
* @softirq_expiry_lock: Lock which is taken while softirq based hrtimers are
* expired
+ * @online: CPU is online from an hrtimers point of view
* @timer_waiters: A hrtimer_cancel() invocation waits for the timer
* callback to finish.
* @expires_next: absolute time of the next event, is required for remote
@@ -179,7 +180,8 @@ struct hrtimer_cpu_base {
unsigned int hres_active : 1,
in_hrtirq : 1,
hang_detected : 1,
- softirq_activated : 1;
+ softirq_activated : 1,
+ online : 1;
#ifdef CONFIG_HIGH_RES_TIMERS
unsigned int nr_events;
unsigned short nr_retries;
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 760793998cdd..edb0f821dcea 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1085,6 +1085,7 @@ static int enqueue_hrtimer(struct hrtimer *timer,
enum hrtimer_mode mode)
{
debug_activate(timer, mode);
+ WARN_ON_ONCE(!base->cpu_base->online);
base->cpu_base->active_bases |= 1 << base->index;
@@ -2183,6 +2184,7 @@ int hrtimers_prepare_cpu(unsigned int cpu)
cpu_base->softirq_next_timer = NULL;
cpu_base->expires_next = KTIME_MAX;
cpu_base->softirq_expires_next = KTIME_MAX;
+ cpu_base->online = 1;
hrtimer_cpu_base_init_expiry_lock(cpu_base);
return 0;
}
@@ -2250,6 +2252,7 @@ int hrtimers_cpu_dying(unsigned int dying_cpu)
smp_call_function_single(ncpu, retrigger_next_event, NULL, 0);
raw_spin_unlock(&new_base->lock);
+ old_base->online = 0;
raw_spin_unlock(&old_base->lock);
return 0;
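
For whoever picks up the 4.19 backport, the shape of the race may be
worth sketching. The names below follow the upstream patch; the
interleaving itself is illustrative, not code from the change:

    /*
     * Illustrative timeline of the blind spot the warning covers:
     *
     *   outgoing CPU                      late waker on that CPU
     *   ------------                      ----------------------
     *   hrtimers_cpu_dying()
     *     migrate pending hrtimers away
     *     old_base->online = 0;
     *                                     hrtimer_start()
     *                                       enqueue_hrtimer()
     *                                         WARN_ON_ONCE(!base->cpu_base->online);
     *
     * Before this change, a timer enqueued in that window landed on a CPU
     * that was never coming back to service it and was silently ignored;
     * now the enqueue is at least reported.
     */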
The WCN6855 firmware on the Lenovo ThinkPad X13s expects the Bluetooth
device address in MSB order when setting it using the
EDL_WRITE_BD_ADDR_OPCODE command.
Presumably, this is the case for all non-ROME devices which all use the
EDL_WRITE_BD_ADDR_OPCODE command for this (unlike the ROME devices which
use a different command and expect the address in LSB order).
Reverse the little-endian address before setting it to make sure that
the address can be configured using tools like btmgmt or using the
'local-bd-address' devicetree property.
Note that this can potentially break systems with boot firmware which
has started relying on the broken behaviour and is incorrectly passing
the address via devicetree in MSB order.
Fixes: 5c0a1001c8be ("Bluetooth: hci_qca: Add helper to set device address")
Cc: stable(a)vger.kernel.org # 5.1
Cc: Balakrishna Godavarthi <quic_bgodavar@quicinc.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
---
Hi Qualcomm people,
Could you please verify with your documentation that all non-ROME
devices expect the address provided in the EDL_WRITE_BD_ADDR_OPCODE
command in MSB order?
I assume this is not something that anyone would change between firmware
revisions, but if that turns out to be the case, we'd need to reverse
the address based on firmware revision or similar.
Johan
drivers/bluetooth/btqca.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/bluetooth/btqca.c b/drivers/bluetooth/btqca.c
index fdb0fae88d1c..29035daf21bc 100644
--- a/drivers/bluetooth/btqca.c
+++ b/drivers/bluetooth/btqca.c
@@ -826,11 +826,15 @@ EXPORT_SYMBOL_GPL(qca_uart_setup);
int qca_set_bdaddr(struct hci_dev *hdev, const bdaddr_t *bdaddr)
{
+ bdaddr_t bdaddr_swapped;
struct sk_buff *skb;
int err;
- skb = __hci_cmd_sync_ev(hdev, EDL_WRITE_BD_ADDR_OPCODE, 6, bdaddr,
- HCI_EV_VENDOR, HCI_INIT_TIMEOUT);
+ baswap(&bdaddr_swapped, bdaddr);
+
+ skb = __hci_cmd_sync_ev(hdev, EDL_WRITE_BD_ADDR_OPCODE, 6,
+ &bdaddr_swapped, HCI_EV_VENDOR,
+ HCI_INIT_TIMEOUT);
if (IS_ERR(skb)) {
err = PTR_ERR(skb);
bt_dev_err(hdev, "QCA Change address cmd failed (%d)", err);
--
2.41.0
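
For anyone who wants to sanity-check the swap without a device at hand,
baswap() simply reverses the six bytes of the little-endian bdaddr_t; a
standalone sketch of the transformation the patch applies (an
illustrative user-space analogue mirroring the kernel helper's
semantics, not the kernel code itself):

    #include <stdint.h>

    /* bdaddr_t stores the address LSB first; EDL_WRITE_BD_ADDR_OPCODE
     * expects it MSB first, so reverse the byte order before sending. */
    static void bdaddr_reverse(uint8_t dst[6], const uint8_t src[6])
    {
            int i;

            for (i = 0; i < 6; i++)
                    dst[i] = src[5 - i];
    }

    /* Example: 00:1A:7D:DA:71:13 is held in memory as
     * {0x13, 0x71, 0xda, 0x7d, 0x1a, 0x00} and must go out on the
     * wire as {0x00, 0x1a, 0x7d, 0xda, 0x71, 0x13}. */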
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 644649553508b9bacf0fc7a5bdc4f9e0165576a5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2024012905-carry-revolt-b8d5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
644649553508 ("clocksource: Skip watchdog check for large watchdog intervals")
c37e85c135ce ("clocksource: Loosen clocksource watchdog constraints")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 644649553508b9bacf0fc7a5bdc4f9e0165576a5 Mon Sep 17 00:00:00 2001
From: Jiri Wiesner <jwiesner@suse.de>
Date: Mon, 22 Jan 2024 18:23:50 +0100
Subject: [PATCH] clocksource: Skip watchdog check for large watchdog intervals
There have been reports of the watchdog marking clocksources unstable on
machines with 8 NUMA nodes:
clocksource: timekeeping watchdog on CPU373:
Marking clocksource 'tsc' as unstable because the skew is too large:
clocksource: 'hpet' wd_nsec: 14523447520
clocksource: 'tsc' cs_nsec: 14524115132
The measured clocksource skew - the absolute difference between cs_nsec
and wd_nsec - was 668 microseconds:
cs_nsec - wd_nsec = 14524115132 - 14523447520 = 667612
The kernel used 200 microseconds for the uncertainty_margin of both the
clocksource and watchdog, resulting in a threshold of 400 microseconds (the
md variable). Both the cs_nsec and the wd_nsec value indicate that the
readout interval was circa 14.5 seconds. The observed behaviour is that
watchdog checks failed for large readout intervals on 8 NUMA node
machines. This indicates that the size of the skew was directly proportional
to the length of the readout interval on those machines. The measured
clocksource skew, 668 microseconds, was evaluated against a threshold (the
md variable) that is suited for readout intervals of roughly
WATCHDOG_INTERVAL, i.e. HZ >> 1, which is 0.5 second.
The intention of 2e27e793e280 ("clocksource: Reduce clocksource-skew
threshold") was to tighten the threshold for evaluating skew and set the
lower bound for the uncertainty_margin of clocksources to twice
WATCHDOG_MAX_SKEW. Later in c37e85c135ce ("clocksource: Loosen clocksource
watchdog constraints"), the WATCHDOG_MAX_SKEW constant was increased to
125 microseconds to fit the limit of NTP, which is able to use a
clocksource that suffers from up to 500 microseconds of skew per second.
Both the TSC and the HPET use default uncertainty_margin. When the
readout interval gets stretched the default uncertainty_margin is no
longer a suitable lower bound for evaluating skew - it imposes a limit
that is far stricter than the skew with which NTP can deal.
The root causes of the skew being directly proportional to the length of
the readout interval are:
* the inaccuracy of the shift/mult pairs of clocksources and the watchdog
* the conversion to nanoseconds is imprecise for large readout intervals
Prevent this by skipping the current watchdog check if the readout
interval exceeds 2 * WATCHDOG_INTERVAL. Considering the maximum readout
interval of 2 * WATCHDOG_INTERVAL, the current default uncertainty margin
(of the TSC and HPET) corresponds to a limit on clocksource skew of 250
ppm (microseconds of skew per second). To keep the limit imposed by NTP
(500 microseconds of skew per second) for all possible readout intervals,
the margins would have to be scaled so that the threshold value is
proportional to the length of the actual readout interval.
As for why the readout interval may get stretched: Since the watchdog is
executed in softirq context the expiration of the watchdog timer can get
severely delayed on account of a ksoftirqd thread not getting to run in a
timely manner. Surely, a system with such belated softirq execution is not
working well and the scheduling issue should be looked into but the
clocksource watchdog should be able to deal with it accordingly.
Fixes: 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold")
Suggested-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Jiri Wiesner <jwiesner@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Feng Tang <feng.tang@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240122172350.GA740@incl
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index c108ed8a9804..3052b1f1168e 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -99,6 +99,7 @@ static u64 suspend_start;
* Interval: 0.5sec.
*/
#define WATCHDOG_INTERVAL (HZ >> 1)
+#define WATCHDOG_INTERVAL_MAX_NS ((2 * WATCHDOG_INTERVAL) * (NSEC_PER_SEC / HZ))
/*
* Threshold: 0.0312s, when doubled: 0.0625s.
@@ -134,6 +135,7 @@ static DECLARE_WORK(watchdog_work, clocksource_watchdog_work);
static DEFINE_SPINLOCK(watchdog_lock);
static int watchdog_running;
static atomic_t watchdog_reset_pending;
+static int64_t watchdog_max_interval;
static inline void clocksource_watchdog_lock(unsigned long *flags)
{
@@ -399,8 +401,8 @@ static inline void clocksource_reset_watchdog(void)
static void clocksource_watchdog(struct timer_list *unused)
{
u64 csnow, wdnow, cslast, wdlast, delta;
+ int64_t wd_nsec, cs_nsec, interval;
int next_cpu, reset_pending;
- int64_t wd_nsec, cs_nsec;
struct clocksource *cs;
enum wd_read_status read_ret;
unsigned long extra_wait = 0;
@@ -470,6 +472,27 @@ static void clocksource_watchdog(struct timer_list *unused)
if (atomic_read(&watchdog_reset_pending))
continue;
+ /*
+ * The processing of timer softirqs can get delayed (usually
+ * on account of ksoftirqd not getting to run in a timely
+ * manner), which causes the watchdog interval to stretch.
+ * Skew detection may fail for longer watchdog intervals
+ * on account of fixed margins being used.
+ * Some clocksources, e.g. acpi_pm, cannot tolerate
+ * watchdog intervals longer than a few seconds.
+ */
+ interval = max(cs_nsec, wd_nsec);
+ if (unlikely(interval > WATCHDOG_INTERVAL_MAX_NS)) {
+ if (system_state > SYSTEM_SCHEDULING &&
+ interval > 2 * watchdog_max_interval) {
+ watchdog_max_interval = interval;
+ pr_warn("Long readout interval, skipping watchdog check: cs_nsec: %lld wd_nsec: %lld\n",
+ cs_nsec, wd_nsec);
+ }
+ watchdog_timer.expires = jiffies;
+ continue;
+ }
+
/* Check the deviation from the watchdog clocksource. */
md = cs->uncertainty_margin + watchdog->uncertainty_margin;
if (abs(cs_nsec - wd_nsec) > md) {
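
To make the new constant concrete, here is the arithmetic for an even
HZ, derived from the values quoted in the changelog (a sketch, not
authoritative documentation):

    /*
     * WATCHDOG_INTERVAL        = HZ >> 1                            -> 0.5 s
     * WATCHDOG_INTERVAL_MAX_NS = (2 * WATCHDOG_INTERVAL) * (NSEC_PER_SEC / HZ)
     *                          = HZ * (NSEC_PER_SEC / HZ)
     *                          = NSEC_PER_SEC                       -> 1 s
     *
     * With WATCHDOG_MAX_SKEW = 125 us, the default uncertainty_margin is
     * 2 * 125 us = 250 us per clocksource.  Over the now-maximal readout
     * interval of 1 s, that margin corresponds to the 250 ppm limit quoted
     * above; any interval stretched beyond 1 s is skipped instead of being
     * judged against a margin it was never sized for.
     */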
Please backport to stable 6.7 the following patch which was merged
upstream as
commit fe92f874f09145a6951deacaa4961390238bbe0d
Author: Michael Lass <bevan@bi-co.net>
Date: Wed Jan 31 16:52:20 2024 +0100
net: Fix from address in memcpy_to_iter_csum()
While inlining csum_and_memcpy() into memcpy_to_iter_csum(), the from
address passed to csum_partial_copy_nocheck() was accidentally changed.
This causes a regression in applications using UDP, as for example
OpenAFS, causing loss of datagrams.
Fixes: dc32bff195b4 ("iov_iter, net: Fold in csum_and_memcpy()")
Cc: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org
Cc: regressions@lists.linux.dev
Signed-off-by: Michael Lass <bevan@bi-co.net>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thank you.
-------- Forwarded Message --------
Subject: [PATCH] net: Fix from address in memcpy_to_iter_csum()
Date: Wed, 31 Jan 2024 16:52:20 +0100
From: Michael Lass <bevan@bi-co.net>
To: netdev@vger.kernel.org
CC: David Howells <dhowells@redhat.com>, regressions@lists.linux.dev
While inlining csum_and_memcpy() into memcpy_to_iter_csum(), the from
address passed to csum_partial_copy_nocheck() was accidentally changed.
This causes a regression in applications using UDP, as for example
OpenAFS, causing loss of datagrams.
Fixes: dc32bff195b4 ("iov_iter, net: Fold in csum_and_memcpy()")
Cc: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org
Cc: regressions@lists.linux.dev
Signed-off-by: Michael Lass <bevan@bi-co.net>
---
net/core/datagram.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 103d46fa0eeb..a8b625abe242 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -751,7 +751,7 @@ size_t memcpy_to_iter_csum(void *iter_to, size_t progress,
size_t len, void *from, void *priv2)
{
__wsum *csum = priv2;
- __wsum next = csum_partial_copy_nocheck(from, iter_to, len);
+ __wsum next = csum_partial_copy_nocheck(from + progress, iter_to, len);
*csum = csum_block_add(*csum, next, progress);
return 0;
--
2.43.0
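
The one-liner is easiest to review with the callback contract in mind:
the iterator invokes the step function once per segment, with
'progress' bytes of the source already consumed by earlier calls. A
minimal user-space analogue (illustrative; not the kernel code):

    #include <string.h>

    /* Copy one segment of a larger source buffer; 'progress' is how
     * much of 'from' previous invocations already consumed. */
    static size_t copy_step(void *iter_to, size_t progress, size_t len,
                            const void *from)
    {
            /* Correct: advance into the source by 'progress'. */
            memcpy(iter_to, (const char *)from + progress, len);

            /* The regression was the equivalent of:
             *     memcpy(iter_to, from, len);
             * which re-copies (and re-checksums) the first segment on
             * every call, corrupting any datagram spanning segments. */
            return 0;
    }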
Hi,
Can you cherry pick commit 33391eecd631 to 6.1-stable? Looks like we
never got that one marked appropriately. For full reference:
commit 33391eecd63158536fb5257fee5be3a3bdc30e3c
Author: Jens Axboe <axboe@kernel.dk>
Date: Fri Jan 20 07:51:07 2023 -0700
block: treat poll queue enter similarly to timeouts
Thanks,
--
Jens Axboe
Hi Greg, Sasha, and David,
I noticed a regression report in bugzilla.kernel.org that seems to be
specific to the linux-6.6.y series:
Quoting from https://bugzilla.kernel.org/show_bug.cgi?id=218484 :
> After upgrading to version 6.6.16, the kernel compilation on a i586
> arch (on a 32bit chroot in a 64bit host) fails with a message:
>
> virtual memory exhausted: Cannot allocate memory
>
> this happens even lowering the number of parallel compilation
> threads. On a x86_64 arch the same problem doesn't occur. It's not
> clear whether some weird recursion is triggered that exhausts the
> memory, but it seems that the problem is caused by the patchset
> 'minmax' added to the 6.6.16 version, in particular it seems caused
> by these patches:
>
> - minmax-allow-min-max-clamp-if-the-arguments-have-the-same-signedness.patch
> - minmax-fix-indentation-of-__cmp_once-and-__clamp_once.patch
> - minmax-allow-comparisons-of-int-against-unsigned-char-short.patch
> - minmax-relax-check-to-allow-comparison-between-unsigned-arguments-and-signed-constants.patch
>
> Reverting those patches fixes the memory exhaustion problem during compilation.
The reporter later added:
> From a quick test the same problem doesn't occur in 6.8-rc4.
See the ticket for more details.
Note, you have to use bugzilla to reach the reporter, as I sadly[1] can
not CC them in mails like this.
[TLDR for the rest of this mail: I'm adding this report to the list of
tracked Linux kernel regressions; the text you find below is based on a
few templates paragraphs you might have encountered already in similar
form.]
BTW, let me use this mail to also add the report to the list of tracked
regressions to ensure it doesn't fall through the cracks:
#regzbot introduced: 204c653d5d0c79940..9487d93f172acef
https://bugzilla.kernel.org/show_bug.cgi?id=218484
#regzbot title: minmax: virtual memory exhausted in 6.6.16 with i586 chroot
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
[1] because bugzilla.kernel.org tells users upon registration their
"email address will never be displayed to logged out users"
The patch below does not apply to the 6.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.7.y
git checkout FETCH_HEAD
git cherry-pick -x 76b367a2d83163cf19173d5cb0b562acbabc8eac
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2024021328-washboard-crevice-aaa0@gregkh' --subject-prefix 'PATCH 6.7.y' HEAD^..
Possible dependencies:
76b367a2d831 ("io_uring/net: limit inline multishot retries")
91e5d765a82f ("io_uring/net: un-indent mshot retry path in io_recv_finish()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76b367a2d83163cf19173d5cb0b562acbabc8eac Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 29 Jan 2024 12:00:58 -0700
Subject: [PATCH] io_uring/net: limit inline multishot retries
If we have multiple clients and some/all are flooding the receives to
such an extent that we can retry a LOT handling multishot receives, then
we can be starving some clients and hence serving traffic in an
imbalanced fashion.
Limit multishot retry attempts to some arbitrary value, whose only
purpose is to ensure that we don't keep serving a single connection
for way too long. We default to 32 retries, which should be more than
enough to provide fairness, yet not so small that we'll spend too much
time requeuing rather than handling traffic.
Cc: stable@vger.kernel.org
Depends-on: 704ea888d646 ("io_uring/poll: add requeue return code from poll multishot handling")
Depends-on: 91e5d765a82f ("io_uring/net: un-indent mshot retry path in io_recv_finish()")
Depends-on: e84b01a880f6 ("io_uring/poll: move poll execution helpers higher up")
Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
Fixes: 9bb66906f23e ("io_uring: support multishot in recvmsg")
Link: https://github.com/axboe/liburing/issues/1043
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/io_uring/net.c b/io_uring/net.c
index 740c6bfa5b59..a12ff69e6843 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -60,6 +60,7 @@ struct io_sr_msg {
unsigned len;
unsigned done_io;
unsigned msg_flags;
+ unsigned nr_multishot_loops;
u16 flags;
/* initialised and used only by !msg send variants */
u16 addr_len;
@@ -70,6 +71,13 @@ struct io_sr_msg {
struct io_kiocb *notif;
};
+/*
+ * Number of times we'll try and do receives if there's more data. If we
+ * exceed this limit, then add us to the back of the queue and retry from
+ * there. This helps fairness between flooding clients.
+ */
+#define MULTISHOT_MAX_RETRY 32
+
static inline bool io_check_multishot(struct io_kiocb *req,
unsigned int issue_flags)
{
@@ -611,6 +619,7 @@ int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
sr->msg_flags |= MSG_CMSG_COMPAT;
#endif
sr->done_io = 0;
+ sr->nr_multishot_loops = 0;
return 0;
}
@@ -654,12 +663,20 @@ static inline bool io_recv_finish(struct io_kiocb *req, int *ret,
*/
if (io_fill_cqe_req_aux(req, issue_flags & IO_URING_F_COMPLETE_DEFER,
*ret, cflags | IORING_CQE_F_MORE)) {
+ struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
+ int mshot_retry_ret = IOU_ISSUE_SKIP_COMPLETE;
+
io_recv_prep_retry(req);
/* Known not-empty or unknown state, retry */
- if (cflags & IORING_CQE_F_SOCK_NONEMPTY || msg->msg_inq == -1)
- return false;
+ if (cflags & IORING_CQE_F_SOCK_NONEMPTY || msg->msg_inq == -1) {
+ if (sr->nr_multishot_loops++ < MULTISHOT_MAX_RETRY)
+ return false;
+ /* mshot retries exceeded, force a requeue */
+ sr->nr_multishot_loops = 0;
+ mshot_retry_ret = IOU_REQUEUE;
+ }
if (issue_flags & IO_URING_F_MULTISHOT)
- *ret = IOU_ISSUE_SKIP_COMPLETE;
+ *ret = mshot_retry_ret;
else
*ret = -EAGAIN;
return true;
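
The capping pattern is recognizable outside io_uring too; a user-space
analogue (illustrative; handle() and requeue() are hypothetical
helpers, and 32 mirrors MULTISHOT_MAX_RETRY):

    #include <sys/socket.h>

    void handle(const char *buf, size_t n);     /* hypothetical */
    void requeue(int fd);                       /* hypothetical */

    /* Drain a ready socket, but bound the per-wakeup iterations so one
     * flooding peer cannot monopolize the event loop; past the cap, the
     * fd goes to the back of the ready queue and others get served. */
    static void serve(int fd)
    {
            char buf[4096];
            ssize_t n;
            int loops = 0;

            while ((n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT)) > 0) {
                    handle(buf, n);
                    if (++loops >= 32) {
                            requeue(fd);
                            break;
                    }
            }
    }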
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 72bd80252feeb3bef8724230ee15d9f7ab541c6e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2024021339-flick-facsimile-65c3@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
72bd80252fee ("io_uring/net: fix sr->len for IORING_OP_RECV with MSG_WAITALL and buffers")
f9ead18c1058 ("io_uring: split network related opcodes into its own file")
e0da14def1ee ("io_uring: move statx handling to its own file")
a9c210cebe13 ("io_uring: move epoll handler to its own file")
4cf90495281b ("io_uring: add a dummy -EOPNOTSUPP prep handler")
99f15d8d6136 ("io_uring: move uring_cmd handling to its own file")
cd40cae29ef8 ("io_uring: split out open/close operations")
453b329be5ea ("io_uring: separate out file table handling code")
f4c163dd7d4b ("io_uring: split out fadvise/madvise operations")
0d5847274037 ("io_uring: split out fs related sync/fallocate functions")
531113bbd5bf ("io_uring: split out splice related operations")
11aeb71406dd ("io_uring: split out filesystem related operations")
e28683bdfc2f ("io_uring: move nop into its own file")
5e2a18d93fec ("io_uring: move xattr related opcodes to its own file")
97b388d70b53 ("io_uring: handle completions in the core")
de23077eda61 ("io_uring: set completion results upfront")
e27f928ee1cb ("io_uring: add io_uring_types.h")
4d4c9cff4f70 ("io_uring: define a request type cleanup handler")
890968dc0336 ("io_uring: unify struct io_symlink and io_hardlink")
9a3a11f977f9 ("io_uring: convert iouring_cmd to io_cmd_type")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 72bd80252feeb3bef8724230ee15d9f7ab541c6e Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Thu, 1 Feb 2024 06:42:36 -0700
Subject: [PATCH] io_uring/net: fix sr->len for IORING_OP_RECV with MSG_WAITALL
and buffers
If we use IORING_OP_RECV with provided buffers and pass in '0' as the
length of the request, the length is retrieved from the selected buffer.
If MSG_WAITALL is also set and we get a short receive, then we may hit
the retry path which decrements sr->len and increments the buffer for
a retry. However, the length is still zero at this point, which means
that sr->len now becomes huge and import_ubuf() will cap it to
MAX_RW_COUNT and subsequently return -EFAULT for the range as a whole.
Fix this by always assigning sr->len once the buffer has been selected.
Cc: stable@vger.kernel.org
Fixes: 7ba89d2af17a ("io_uring: ensure recv and recvmsg handle MSG_WAITALL correctly")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/io_uring/net.c b/io_uring/net.c
index a12ff69e6843..43bc9a5f96f9 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -923,6 +923,7 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags)
if (!buf)
return -ENOBUFS;
sr->buf = buf;
+ sr->len = len;
}
ret = import_ubuf(ITER_DEST, sr->buf, len, &msg.msg_iter);
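
The failure mode underneath is ordinary unsigned wraparound; a minimal
demonstration (illustrative, user-space):

    #include <stdio.h>

    int main(void)
    {
            unsigned int len = 0;   /* sr->len: caller passed 0, buffer selected */
            int ret = 100;          /* short receive consumed 100 bytes */

            len -= ret;             /* the retry path: 0 - 100 wraps */
            printf("len = %u\n", len);  /* 4294967196 - the "huge" value,
                                         * later capped to MAX_RW_COUNT and
                                         * rejected with -EFAULT by the
                                         * range check */
            return 0;
    }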