Hi,
Can you add the below to 6.1-stable? Thanks!
commit 3181e22fb79910c7071e84a43af93ac89e8a7106
Author: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Mon Jan 9 14:46:10 2023 +0000
io_uring: wake up optimisations
Commit 3181e22fb79910c7071e84a43af93ac89e8a7106 upstream.
Completions are flushed either from the submit syscall or from
task_work, both of which run in the context of the submitter task. For
single-threaded rings, as implied by ->task_complete, there won't be
any waiters on ->cq_wait other than the master task. That means no task
can be sleeping on cq_wait while we run
__io_submit_flush_completions(), so the wake-up can be skipped.
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/60ad9768ec74435a0ddaa6eec0ffa7729474f69f.16732742…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 4f0ae938b146..0b1361663267 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -582,6 +582,16 @@ static inline void __io_cq_unlock_post(struct io_ring_ctx *ctx)
io_cqring_ev_posted(ctx);
}
+static inline void __io_cq_unlock_post_flush(struct io_ring_ctx *ctx)
+ __releases(ctx->completion_lock)
+{
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ io_commit_cqring_flush(ctx);
+ if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN))
+ __io_cqring_wake(ctx);
+}
+
void io_cq_unlock_post(struct io_ring_ctx *ctx)
{
__io_cq_unlock_post(ctx);
@@ -1339,7 +1349,7 @@ static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
if (!(req->flags & REQ_F_CQE_SKIP))
__io_fill_cqe_req(ctx, req);
}
- __io_cq_unlock_post(ctx);
+ __io_cq_unlock_post_flush(ctx);
io_free_batch_list(ctx, state->compl_reqs.first);
INIT_WQ_LIST(&state->compl_reqs);
--
Jens Axboe
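Reviewer's note, not part of the patch: the wake-up decision made by the
new helper can be modelled in user space roughly as below. This is only a
sketch. SK_SETUP_DEFER_TASKRUN and sk_should_wake() are made-up names, and
the flag value is a stand-in, not the one from the uapi header. Note that
the commit message reasons in terms of ->task_complete, while the hunk
above tests IORING_SETUP_DEFER_TASKRUN directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for IORING_SETUP_DEFER_TASKRUN; the value is illustrative only. */
#define SK_SETUP_DEFER_TASKRUN	(1u << 0)

/*
 * Model of the check in __io_cq_unlock_post_flush(): when the ring was set
 * up with deferred task running, only the submitter task can be waiting on
 * cq_wait, and it is the one flushing, so nobody needs to be woken.
 */
static bool sk_should_wake(unsigned int ring_flags)
{
	return !(ring_flags & SK_SETUP_DEFER_TASKRUN);
}
```

With this model, a ring created without the flag still gets a wake-up on
every flush, and a ring created with it does not.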
From: Ard Biesheuvel <ardb(a)kernel.org>
When the host stage1 is configured for LPA2, the value currently being
programmed into TCR_EL2.T0SZ may be invalid unless LPA2 is configured
at HYP as well. This means kvm_lpa2_is_enabled() is not the right
condition to test when setting TCR_EL2.DS, as it will return false if
LPA2 is only available for stage 1 but not for stage 2.
Similarly, programming TCR_EL2.PS based on a limited IPA range due to
lack of stage2 LPA2 support could potentially result in problems.
So use lpa2_is_enabled() instead, and set the PS field according to the
host's IPS, which is capped at 48 bits if LPA2 support is absent or
disabled. Whether or not we can make meaningful use of such a
configuration is a different question.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
---
arch/arm64/kvm/arm.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
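Illustration only, not part of the patch: a user-space sketch of how the
host's IPS value ends up in the PS field, and how DS follows
lpa2_is_enabled(). The SK_* macros and sk_hyp_ps_ds() are made-up names.
The bit positions (IPS at [34:32] in TCR_EL1, PS at [18:16] and DS at bit
32 in TCR_EL2) are my assumption about the layout and should be checked
against the kernel's TCR_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layouts; stand-ins for the kernel's TCR_* macros. */
#define SK_TCR_EL1_IPS_SHIFT	32
#define SK_TCR_EL1_IPS_MASK	(UINT64_C(7) << SK_TCR_EL1_IPS_SHIFT)
#define SK_TCR_EL2_PS_SHIFT	16
#define SK_TCR_EL2_PS_MASK	(UINT64_C(7) << SK_TCR_EL2_PS_SHIFT)
#define SK_TCR_EL2_DS		(UINT64_C(1) << 32)

/*
 * Return only the PS and DS bits that would be OR-ed into the EL2 value:
 * PS is copied from the host's IPS, DS is set only when LPA2 is enabled
 * for stage 1.
 */
static uint64_t sk_hyp_ps_ds(uint64_t tcr_el1, bool lpa2_enabled)
{
	uint64_t ips = (tcr_el1 & SK_TCR_EL1_IPS_MASK) >> SK_TCR_EL1_IPS_SHIFT;
	uint64_t out = (ips << SK_TCR_EL2_PS_SHIFT) & SK_TCR_EL2_PS_MASK;

	if (lpa2_enabled)
		out |= SK_TCR_EL2_DS;
	return out;
}
```

The point of the sketch is that PS no longer depends on the stage 2
parange, only on what the host already programmed for stage 1.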
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a102c3aebdbc..7b2735ad32e9 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1990,8 +1990,7 @@ static int kvm_init_vector_slots(void)
static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
{
struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
- u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
- unsigned long tcr;
+ unsigned long tcr, ips;
/*
* Calculate the raw per-cpu offset without a translation from the
@@ -2005,6 +2004,7 @@ static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
params->mair_el2 = read_sysreg(mair_el1);
tcr = read_sysreg(tcr_el1);
+ ips = FIELD_GET(TCR_IPS_MASK, tcr);
if (cpus_have_final_cap(ARM64_KVM_HVHE)) {
tcr |= TCR_EPD1_MASK;
} else {
@@ -2014,8 +2014,8 @@ static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
tcr &= ~TCR_T0SZ_MASK;
tcr |= TCR_T0SZ(hyp_va_bits);
tcr &= ~TCR_EL2_PS_MASK;
- tcr |= FIELD_PREP(TCR_EL2_PS_MASK, kvm_get_parange(mmfr0));
- if (kvm_lpa2_is_enabled())
+ tcr |= FIELD_PREP(TCR_EL2_PS_MASK, ips);
+ if (lpa2_is_enabled())
tcr |= TCR_EL2_DS;
params->tcr_el2 = tcr;
--
2.47.1.613.gc27f4b7a9f-goog
From: Ard Biesheuvel <ardb(a)kernel.org>
Currently, LPA2 kernel support implies support for up to 52 bits of
physical addressing, and this is reflected in global definitions such as
PHYS_MASK_SHIFT and MAX_PHYSMEM_BITS.
This is potentially problematic, given that LPA2 hardware support is
modeled as a CPU feature which can be overridden, and with LPA2 hardware
support turned off, attempting to map physical regions with address bits
[51:48] set (which may exist on LPA2 capable systems booting with
arm64.nolva) will result in corrupted mappings with a truncated output
address and bogus shareability attributes.
This means that the accepted physical address range in the mapping
routines should be at most 48 bits wide when LPA2 support is configured
but not enabled at runtime.
Fixes: 352b0395b505 ("arm64: Enable 52-bit virtual addressing for 4k and 16k granule configs")
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
---
arch/arm64/include/asm/pgtable-hwdef.h | 6 ------
arch/arm64/include/asm/pgtable-prot.h | 7 +++++++
arch/arm64/include/asm/sparsemem.h | 5 ++++-
3 files changed, 11 insertions(+), 7 deletions(-)
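Illustration only, not part of the patch: a user-space sketch of the
runtime-dependent mask and of why an address with bits [51:48] set has to
be rejected when LPA2 is off. SK_CONFIG_PA_BITS, sk_phys_mask() and
sk_pa_is_mappable() are made-up names, and the lpa2_enabled parameter
stands in for lpa2_is_enabled(). A 52-bit configuration is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_CONFIG_PA_BITS	52	/* stand-in for CONFIG_ARM64_PA_BITS */

/* Mirrors the new PHYS_MASK_SHIFT: 48 unless LPA2 is enabled at runtime. */
static unsigned int sk_phys_mask_shift(bool lpa2_enabled)
{
	return lpa2_enabled ? SK_CONFIG_PA_BITS : 48;
}

static uint64_t sk_phys_mask(bool lpa2_enabled)
{
	return (UINT64_C(1) << sk_phys_mask_shift(lpa2_enabled)) - 1;
}

/* True if pa has no bits set above the currently usable PA range. */
static bool sk_pa_is_mappable(uint64_t pa, bool lpa2_enabled)
{
	return (pa & ~sk_phys_mask(lpa2_enabled)) == 0;
}
```

An address such as 1 << 50 passes the check with LPA2 enabled and fails
it with LPA2 disabled, which is the case the commit message describes.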
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index c78a988cca93..a9136cc551cc 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -222,12 +222,6 @@
*/
#define S1_TABLE_AP (_AT(pmdval_t, 3) << 61)
-/*
- * Highest possible physical address supported.
- */
-#define PHYS_MASK_SHIFT (CONFIG_ARM64_PA_BITS)
-#define PHYS_MASK ((UL(1) << PHYS_MASK_SHIFT) - 1)
-
#define TTBR_CNP_BIT (UL(1) << 0)
/*
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 9f9cf13bbd95..a95f1f77bb39 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -81,6 +81,7 @@ extern unsigned long prot_ns_shared;
#define lpa2_is_enabled() false
#define PTE_MAYBE_SHARED PTE_SHARED
#define PMD_MAYBE_SHARED PMD_SECT_S
+#define PHYS_MASK_SHIFT (CONFIG_ARM64_PA_BITS)
#else
static inline bool __pure lpa2_is_enabled(void)
{
@@ -89,8 +90,14 @@ static inline bool __pure lpa2_is_enabled(void)
#define PTE_MAYBE_SHARED (lpa2_is_enabled() ? 0 : PTE_SHARED)
#define PMD_MAYBE_SHARED (lpa2_is_enabled() ? 0 : PMD_SECT_S)
+#define PHYS_MASK_SHIFT (lpa2_is_enabled() ? CONFIG_ARM64_PA_BITS : 48)
#endif
+/*
+ * Highest possible physical address supported.
+ */
+#define PHYS_MASK ((UL(1) << PHYS_MASK_SHIFT) - 1)
+
/*
* If we have userspace only BTI we don't want to mark kernel pages
* guarded even if the system does support BTI.
diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h
index 8a8acc220371..84783efdc9d1 100644
--- a/arch/arm64/include/asm/sparsemem.h
+++ b/arch/arm64/include/asm/sparsemem.h
@@ -5,7 +5,10 @@
#ifndef __ASM_SPARSEMEM_H
#define __ASM_SPARSEMEM_H
-#define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS
+#include <asm/pgtable-prot.h>
+
+#define MAX_PHYSMEM_BITS PHYS_MASK_SHIFT
+#define MAX_POSSIBLE_PHYSMEM_BITS (52)
/*
* Section size must be at least 512MB for 64K base
--
2.47.1.613.gc27f4b7a9f-goog
The PE Reset State "0" obtained from RTAS calls
ibm_read_slot_reset_[state|state2] indicates that
the Reset is deactivated and the PE is not in the MMIO
Stopped or DMA Stopped state.
With PE Reset State "0", MMIO and DMA are allowed for
the PE. The function pseries_eeh_get_state() currently
does not indicate that to the caller, so drivers are
unable to resume MMIO and DMA activity.
This patch fixes that by reflecting what is actually allowed.
Fixes: 00ba05a12b3c ("powerpc/pseries: Cleanup on pseries_eeh_get_state()")
Signed-off-by: Narayana Murty N <nnmlinux(a)linux.ibm.com>
---
Changelog:
V1:https://lore.kernel.org/all/20241107042027.338065-1-nnmlinux@linux.ibm.c…
--added Fixes tag for "powerpc/pseries: Cleanup on
pseries_eeh_get_state()".
---
arch/powerpc/platforms/pseries/eeh_pseries.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
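Illustration only, not part of the patch: a user-space sketch of the flag
combination before and after the change, with a hypothetical caller-side
check. The SK_STATE_* values are stand-ins, not the EEH_STATE_* values
from the kernel header, and sk_can_resume_io() is an assumed example of
how a caller might test the result, not a function from the EEH core.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the EEH_STATE_* flags; values are illustrative only. */
#define SK_STATE_MMIO_ACTIVE	(1 << 0)
#define SK_STATE_DMA_ACTIVE	(1 << 1)
#define SK_STATE_MMIO_ENABLED	(1 << 2)
#define SK_STATE_DMA_ENABLED	(1 << 3)

/* Result reported for PE Reset State 0, without and with the fix. */
static int sk_state_for_reset_state0(bool with_fix)
{
	int result = SK_STATE_MMIO_ACTIVE | SK_STATE_DMA_ACTIVE;

	if (with_fix)
		result |= SK_STATE_MMIO_ENABLED | SK_STATE_DMA_ENABLED;
	return result;
}

/* Hypothetical caller: resume I/O only if both ENABLED bits are set. */
static bool sk_can_resume_io(int state)
{
	int need = SK_STATE_MMIO_ENABLED | SK_STATE_DMA_ENABLED;

	return (state & need) == need;
}
```

A caller written like this never sees the PE as usable without the fix,
even though firmware reports that MMIO and DMA are allowed.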
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 1893f66371fa..b12ef382fec7 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -580,8 +580,10 @@ static int pseries_eeh_get_state(struct eeh_pe *pe, int *delay)
switch(rets[0]) {
case 0:
- result = EEH_STATE_MMIO_ACTIVE |
- EEH_STATE_DMA_ACTIVE;
+ result = EEH_STATE_MMIO_ACTIVE |
+ EEH_STATE_DMA_ACTIVE |
+ EEH_STATE_MMIO_ENABLED |
+ EEH_STATE_DMA_ENABLED;
break;
case 1:
result = EEH_STATE_RESET_ACTIVE |
--
2.47.1
When `skb_splice_from_iter` was introduced, it inadvertently added
checksumming for AF_UNIX sockets. This resulted in significant
slowdowns, for example when using sendfile over unix sockets.
Using the test code in [1] in my test setup (2G single core qemu),
the client receives a 1000M file in:
- without the patch: 1482ms (+/- 36ms)
- with the patch: 652.5ms (+/- 22.9ms)
This commit addresses the issue by marking checksumming as unnecessary in
`unix_stream_sendmsg`.
Cc: stable(a)vger.kernel.org
Signed-off-by: Frederik Deweerdt <deweerdt.lkml(a)gmail.com>
Fixes: 2e910b95329c ("net: Add a function to splice pages into an skbuff for MSG_SPLICE_PAGES")
---
net/unix/af_unix.c | 1 +
1 file changed, 1 insertion(+)
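Illustration only, not part of the patch and not the test code referred
to as [1]: a minimal user-space sketch of the kind of workload involved,
sendfile() from a regular file into an AF_UNIX stream socket. The helper
name sk_sendfile_over_unix() is made up. I am assuming, as the commit
message implies, that this path reaches the MSG_SPLICE_PAGES branch in
unix_stream_sendmsg. The sketch does no timing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Send len bytes from a temporary file over an AF_UNIX stream socketpair
 * with sendfile() and return how many bytes the peer read, or -1 on error.
 * len is capped at one small buffer so the payload fits in the socket
 * queue without a concurrent reader.
 */
static long sk_sendfile_over_unix(size_t len)
{
	char buf[4096];
	int sv[2];
	FILE *f;
	off_t off = 0;
	size_t left = len;
	long received = -1;
	ssize_t n;

	if (len > sizeof(buf))
		return -1;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	f = tmpfile();
	if (!f)
		goto out_sock;

	memset(buf, 'x', len);
	if (fwrite(buf, 1, len, f) != len || fflush(f) != 0)
		goto out_file;

	while (left > 0) {
		n = sendfile(sv[0], fileno(f), &off, left);
		if (n <= 0)
			goto out_file;
		left -= (size_t)n;
	}

	/* Half-close the sender so the reader sees EOF after the payload. */
	shutdown(sv[0], SHUT_WR);
	received = 0;
	while ((n = read(sv[1], buf, sizeof(buf))) > 0)
		received += n;
	if (n < 0)
		received = -1;

out_file:
	fclose(f);
out_sock:
	close(sv[0]);
	close(sv[1]);
	return received;
}
```

A real measurement would use a much larger file and a separate reader
process, as in the numbers quoted above.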
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 001ccc55ef0f..6b1762300443 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2313,6 +2313,7 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
fds_sent = true;
if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) {
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
err = skb_splice_from_iter(skb, &msg->msg_iter, size,
sk->sk_allocation);
if (err < 0) {
--
2.44.1