The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042354-dividers-trowel-ba4a@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
73554b29bd70 ("KVM: x86/pmu: Synthesize at most one PMI per VM-exit")
b29a2acd36dd ("KVM: x86/pmu: Truncate counter value to allowed width on write")
4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
1c2bf8a6b045 ("KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap")
13afa29ae489 ("KVM: x86/pmu: Provide Intel PMU's pmc_is_enabled() as generic x86 code")
c85cdc1cc1ea ("KVM: x86/pmu: Move handling PERF_GLOBAL_CTRL and friends to common x86")
30dab5c0b65e ("KVM: x86/pmu: Reject userspace attempts to set reserved GLOBAL_STATUS bits")
8de18543dfe3 ("KVM: x86/pmu: Move reprogram_counters() to pmu.h")
53550b89220b ("KVM: x86/pmu: Rename global_ovf_ctrl_mask to global_status_mask")
649bccd7fac9 ("KVM: x86/pmu: Rewrite reprogram_counters() to improve performance")
8bca8c5ce40b ("KVM: VMX: Refactor intel_pmu_{g,}set_msr() to align with other helpers")
cdd2fbf6360e ("KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()")
957d0f70e97b ("KVM: x86/pmu: Zero out LBR capabilities during PMU refresh")
3a6de51a437f ("KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run")
7e768ce8278b ("KVM: x86/pmu: Zero out pmu->all_valid_pmc_idx each time it's refreshed")
e33b6d79acac ("KVM: x86/pmu: Don't tell userspace to save MSRs for non-existent fixed PMCs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042350-unaware-very-3433@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
73554b29bd70 ("KVM: x86/pmu: Synthesize at most one PMI per VM-exit")
b29a2acd36dd ("KVM: x86/pmu: Truncate counter value to allowed width on write")
4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
1c2bf8a6b045 ("KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap")
13afa29ae489 ("KVM: x86/pmu: Provide Intel PMU's pmc_is_enabled() as generic x86 code")
c85cdc1cc1ea ("KVM: x86/pmu: Move handling PERF_GLOBAL_CTRL and friends to common x86")
30dab5c0b65e ("KVM: x86/pmu: Reject userspace attempts to set reserved GLOBAL_STATUS bits")
8de18543dfe3 ("KVM: x86/pmu: Move reprogram_counters() to pmu.h")
53550b89220b ("KVM: x86/pmu: Rename global_ovf_ctrl_mask to global_status_mask")
649bccd7fac9 ("KVM: x86/pmu: Rewrite reprogram_counters() to improve performance")
8bca8c5ce40b ("KVM: VMX: Refactor intel_pmu_{g,}set_msr() to align with other helpers")
cdd2fbf6360e ("KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()")
957d0f70e97b ("KVM: x86/pmu: Zero out LBR capabilities during PMU refresh")
3a6de51a437f ("KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run")
7e768ce8278b ("KVM: x86/pmu: Zero out pmu->all_valid_pmc_idx each time it's refreshed")
e33b6d79acac ("KVM: x86/pmu: Don't tell userspace to save MSRs for non-existent fixed PMCs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042347-polar-gains-eb8f@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
73554b29bd70 ("KVM: x86/pmu: Synthesize at most one PMI per VM-exit")
b29a2acd36dd ("KVM: x86/pmu: Truncate counter value to allowed width on write")
4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support")
1c2bf8a6b045 ("KVM: x86/pmu: Constrain the num of guest counters with kvm_pmu_cap")
13afa29ae489 ("KVM: x86/pmu: Provide Intel PMU's pmc_is_enabled() as generic x86 code")
c85cdc1cc1ea ("KVM: x86/pmu: Move handling PERF_GLOBAL_CTRL and friends to common x86")
30dab5c0b65e ("KVM: x86/pmu: Reject userspace attempts to set reserved GLOBAL_STATUS bits")
8de18543dfe3 ("KVM: x86/pmu: Move reprogram_counters() to pmu.h")
53550b89220b ("KVM: x86/pmu: Rename global_ovf_ctrl_mask to global_status_mask")
649bccd7fac9 ("KVM: x86/pmu: Rewrite reprogram_counters() to improve performance")
8bca8c5ce40b ("KVM: VMX: Refactor intel_pmu_{g,}set_msr() to align with other helpers")
cdd2fbf6360e ("KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()")
957d0f70e97b ("KVM: x86/pmu: Zero out LBR capabilities during PMU refresh")
3a6de51a437f ("KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run")
7e768ce8278b ("KVM: x86/pmu: Zero out pmu->all_valid_pmc_idx each time it's refreshed")
e33b6d79acac ("KVM: x86/pmu: Don't tell userspace to save MSRs for non-existent fixed PMCs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042343-imitate-divinity-9e38@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
1647b52757d5 ("KVM: x86/pmu: Reset the PMU, i.e. stop counters, before refreshing")
cbb359d81a26 ("KVM: x86/pmu: Move PMU reset logic to common x86 code")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x de120e1d692d73c7eefa3278837b1eb68f90728a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042340-retrieval-unhook-57b9@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
de120e1d692d ("KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"")
f933b88e2015 ("KVM: x86/pmu: Zero out PMU metadata on AMD if PMU is disabled")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From de120e1d692d73c7eefa3278837b1eb68f90728a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 8 Mar 2024 17:36:40 -0800
Subject: [PATCH] KVM: x86/pmu: Set enable bits for GP counters in
PERF_GLOBAL_CTRL at "RESET"
Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
when refreshing the PMU to emulate the MSR's architecturally defined
post-RESET behavior. Per Intel's SDM:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
and
Where "n" is the number of general-purpose counters available in the processor.
AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
PPRs.
Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
counters, although a literal reading of the SDM would require the CPU to
set either bits 63:0 or 31:0. The intent of the behavior is to globally
enable all GP counters; honor the intent, if not the letter of the law.
Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
This bug was recently exposed when KVM added supported for AMD's
PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
guest software that only knew how to program v1 PMUs (that don't support
PERF_GLOBAL_CTRL).
Failure to emulate the post-RESET behavior results in such guests
unknowingly leaving all general purpose counters globally disabled (the
entire reason the post-RESET value sets the GP counter enable bits is to
maintain backwards compatibility).
The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
PMUs. And because up until v6.0, KVM _did_ emulate the behavior for Intel
CPUs, although the old behavior was likely dumb luck.
Because (a) that old code was also broken in its own way (the history of
this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
as having a value of '0' post-RESET in all SDMs before March 2023.
Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
architectural PMU to a guests") *almost* got it right (again likely by
dumb luck), but for some reason only set the bits if the guest PMU was
advertised as v1:
if (pmu->version == 1) {
pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
return;
}
Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
enabled on reset") then tried to remedy that goof, presumably because
guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
counters.
pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
Don't overwrite the pmu->global_ctrl when refreshing") removed
*everything*. However, it did so based on the behavior defined by the
SDM , which at the time stated that "Global Perf Counter Controls" is
'0' at Power-Up and RESET.
But then the March 2023 SDM (325462-079US), stealthily changed its
"IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
table to say:
IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
"pure" RESET flow. But it can only be called prior to the first KVM_RUN,
i.e. the guest will only ever observe the final value.
Note #2, KVM has always cleared global_ctrl during refresh (see commit
f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
i.e. there is no danger of breaking existing setups by clobbering a value
set by userspace.
Reported-by: Babu Moger <babu.moger(a)amd.com>
Cc: Sandipan Das <sandipan.das(a)amd.com>
Cc: Like Xu <like.xu.linux(a)gmail.com>
Cc: Mingwei Zhang <mizhang(a)google.com>
Cc: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Tested-by: Dapeng Mi <dapeng1.mi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240309013641.1413400-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c397b28e3d1b..a593b03c9aed 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -775,8 +775,20 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_data_cfg_mask = ~0ull;
bitmap_zero(pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX);
- if (vcpu->kvm->arch.enable_pmu)
- static_call(kvm_x86_pmu_refresh)(vcpu);
+ if (!vcpu->kvm->arch.enable_pmu)
+ return;
+
+ static_call(kvm_x86_pmu_refresh)(vcpu);
+
+ /*
+ * At RESET, both Intel and AMD CPUs set all enable bits for general
+ * purpose counters in IA32_PERF_GLOBAL_CTRL (so that software that
+ * was written for v1 PMUs don't unknowingly leave GP counters disabled
+ * in the global controls). Emulate that behavior when refreshing the
+ * PMU so that userspace doesn't need to manually set PERF_GLOBAL_CTRL.
+ */
+ if (kvm_pmu_has_perf_global_ctrl(pmu) && pmu->nr_arch_gp_counters)
+ pmu->global_ctrl = GENMASK_ULL(pmu->nr_arch_gp_counters - 1, 0);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 5bfc311dd6c376d350b39028b9000ad766ddc934
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042323-ransack-tightness-3eba@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
5bfc311dd6c3 ("usb: xhci: correct return value in case of STS_HCE")
84ac5e4fa517 ("xhci: move event processing for one interrupter to a separate function")
e30e9ad9ed66 ("xhci: update event ring dequeue pointer position to controller correctly")
143e64df1bda ("xhci: remove unnecessary event_ring_deq parameter from xhci_handle_event()")
becbd202af84 ("xhci: make isoc_bei_interval variable interrupter specific.")
4f022aad80dc ("xhci: Add interrupt pending autoclear flag to each interrupter")
c99b38c41234 ("xhci: add support to allocate several interrupters")
47f503cf5f79 ("xhci: split free interrupter into separate remove and free parts")
d1830364e963 ("xhci: Simplify event ring dequeue pointer update for port change events")
08cc5616798d ("xhci: Clean up ERST_PTR_MASK inversion")
28084d3fcc3c ("xhci: Use more than one Event Ring segment")
044818a6cd80 ("xhci: Set DESI bits in ERDP register correctly")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5bfc311dd6c376d350b39028b9000ad766ddc934 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum(a)suse.com>
Date: Thu, 4 Apr 2024 15:11:05 +0300
Subject: [PATCH] usb: xhci: correct return value in case of STS_HCE
If we get STS_HCE we give up on the interrupt, but for the purpose
of IRQ handling that still counts as ours. We may return IRQ_NONE
only if we are positive that it wasn't ours. Hence correct the default.
Fixes: 2a25e66d676d ("xhci: print warning when HCE was set")
Cc: stable(a)vger.kernel.org # v6.2+
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240404121106.2842417-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52278afea94b..575f0fd9c9f1 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3133,7 +3133,7 @@ static int xhci_handle_events(struct xhci_hcd *xhci, struct xhci_interrupter *ir
irqreturn_t xhci_irq(struct usb_hcd *hcd)
{
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
- irqreturn_t ret = IRQ_NONE;
+ irqreturn_t ret = IRQ_HANDLED;
u32 status;
spin_lock(&xhci->lock);
@@ -3141,12 +3141,13 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
status = readl(&xhci->op_regs->status);
if (status == ~(u32)0) {
xhci_hc_died(xhci);
- ret = IRQ_HANDLED;
goto out;
}
- if (!(status & STS_EINT))
+ if (!(status & STS_EINT)) {
+ ret = IRQ_NONE;
goto out;
+ }
if (status & STS_HCE) {
xhci_warn(xhci, "WARNING: Host Controller Error\n");
@@ -3156,7 +3157,6 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
if (status & STS_FATAL) {
xhci_warn(xhci, "WARNING: Host System Error\n");
xhci_halt(xhci);
- ret = IRQ_HANDLED;
goto out;
}
@@ -3167,7 +3167,6 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
*/
status |= STS_EINT;
writel(status, &xhci->op_regs->status);
- ret = IRQ_HANDLED;
/* This is the handler of the primary interrupter */
xhci_handle_events(xhci, xhci->interrupters[0]);
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x 5bfc311dd6c376d350b39028b9000ad766ddc934
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042322-lunchbox-flint-5a2b@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
5bfc311dd6c3 ("usb: xhci: correct return value in case of STS_HCE")
84ac5e4fa517 ("xhci: move event processing for one interrupter to a separate function")
e30e9ad9ed66 ("xhci: update event ring dequeue pointer position to controller correctly")
143e64df1bda ("xhci: remove unnecessary event_ring_deq parameter from xhci_handle_event()")
becbd202af84 ("xhci: make isoc_bei_interval variable interrupter specific.")
4f022aad80dc ("xhci: Add interrupt pending autoclear flag to each interrupter")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5bfc311dd6c376d350b39028b9000ad766ddc934 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum(a)suse.com>
Date: Thu, 4 Apr 2024 15:11:05 +0300
Subject: [PATCH] usb: xhci: correct return value in case of STS_HCE
If we get STS_HCE we give up on the interrupt, but for the purpose
of IRQ handling that still counts as ours. We may return IRQ_NONE
only if we are positive that it wasn't ours. Hence correct the default.
Fixes: 2a25e66d676d ("xhci: print warning when HCE was set")
Cc: stable(a)vger.kernel.org # v6.2+
Signed-off-by: Oliver Neukum <oneukum(a)suse.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20240404121106.2842417-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 52278afea94b..575f0fd9c9f1 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3133,7 +3133,7 @@ static int xhci_handle_events(struct xhci_hcd *xhci, struct xhci_interrupter *ir
irqreturn_t xhci_irq(struct usb_hcd *hcd)
{
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
- irqreturn_t ret = IRQ_NONE;
+ irqreturn_t ret = IRQ_HANDLED;
u32 status;
spin_lock(&xhci->lock);
@@ -3141,12 +3141,13 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
status = readl(&xhci->op_regs->status);
if (status == ~(u32)0) {
xhci_hc_died(xhci);
- ret = IRQ_HANDLED;
goto out;
}
- if (!(status & STS_EINT))
+ if (!(status & STS_EINT)) {
+ ret = IRQ_NONE;
goto out;
+ }
if (status & STS_HCE) {
xhci_warn(xhci, "WARNING: Host Controller Error\n");
@@ -3156,7 +3157,6 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
if (status & STS_FATAL) {
xhci_warn(xhci, "WARNING: Host System Error\n");
xhci_halt(xhci);
- ret = IRQ_HANDLED;
goto out;
}
@@ -3167,7 +3167,6 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd)
*/
status |= STS_EINT;
writel(status, &xhci->op_regs->status);
- ret = IRQ_HANDLED;
/* This is the handler of the primary interrupter */
xhci_handle_events(xhci, xhci->interrupters[0]);
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x 91f10a3d21f2313485178d49efef8a3ba02bd8c7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042332-depose-frenzy-e027@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
91f10a3d21f2 ("Revert "drm/amd/display: fix USB-C flag update after enc10 feature init"")
7d1e9d0369e4 ("drm/amd/display: Check DP Alt mode DPCS state via DMUB")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 91f10a3d21f2313485178d49efef8a3ba02bd8c7 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher(a)amd.com>
Date: Fri, 29 Mar 2024 18:03:03 -0400
Subject: [PATCH] Revert "drm/amd/display: fix USB-C flag update after enc10
feature init"
This reverts commit b5abd7f983e14054593dc91d6df2aa5f8cc67652.
This change breaks DSC on 4k monitors at 144Hz over USB-C.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3254
Reviewed-by: Harry Wentland <harry.wentland(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Muhammad Ahmed <ahmed.ahmed(a)amd.com>
Cc: Tom Chung <chiahsuan.chung(a)amd.com>
Cc: Charlene Liu <charlene.liu(a)amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Cc: Harry Wentland <harry.wentland(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dio_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dio_link_encoder.c
index e224a028d68a..8a0460e86309 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dio_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_dio_link_encoder.c
@@ -248,14 +248,12 @@ void dcn32_link_encoder_construct(
enc10->base.hpd_source = init_data->hpd_source;
enc10->base.connector = init_data->connector;
+ if (enc10->base.connector.id == CONNECTOR_ID_USBC)
+ enc10->base.features.flags.bits.DP_IS_USB_C = 1;
+
enc10->base.preferred_engine = ENGINE_ID_UNKNOWN;
enc10->base.features = *enc_features;
- if (enc10->base.connector.id == CONNECTOR_ID_USBC)
- enc10->base.features.flags.bits.DP_IS_USB_C = 1;
-
- if (enc10->base.connector.id == CONNECTOR_ID_USBC)
- enc10->base.features.flags.bits.DP_IS_USB_C = 1;
enc10->base.transmitter = init_data->transmitter;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_dio_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_dio_link_encoder.c
index 81e349d5835b..da94e5309fba 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_dio_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn35/dcn35_dio_link_encoder.c
@@ -184,6 +184,8 @@ void dcn35_link_encoder_construct(
enc10->base.hpd_source = init_data->hpd_source;
enc10->base.connector = init_data->connector;
+ if (enc10->base.connector.id == CONNECTOR_ID_USBC)
+ enc10->base.features.flags.bits.DP_IS_USB_C = 1;
enc10->base.preferred_engine = ENGINE_ID_UNKNOWN;
@@ -238,8 +240,6 @@ void dcn35_link_encoder_construct(
}
enc10->base.features.flags.bits.HDMI_6GB_EN = 1;
- if (enc10->base.connector.id == CONNECTOR_ID_USBC)
- enc10->base.features.flags.bits.DP_IS_USB_C = 1;
if (bp_funcs->get_connector_speed_cap_info)
result = bp_funcs->get_connector_speed_cap_info(enc10->base.ctx->dc_bios,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 13c785323b36b845300b256d0e5963c3727667d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042348-galleria-wielder-75a5@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
13c785323b36 ("serial: stm32: Return IRQ_NONE in the ISR if no handling happend")
c5d06662551c ("serial: stm32: Use port lock wrappers")
a01ae50d7eae ("serial: stm32: replace access to DMAR bit by dmaengine_pause/resume")
7f28bcea824e ("serial: stm32: group dma pause/resume error handling into single function")
00d1f9c6af0d ("serial: stm32: modify parameter and rename stm32_usart_rx_dma_enabled")
db89728abad5 ("serial: stm32: avoid clearing DMAT bit during transfer")
3f6c02fa712b ("serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler")
d7c76716169d ("serial: stm32: Use TC interrupt to deassert GPIO RTS in RS485 mode")
3bcea529b295 ("serial: stm32: Factor out GPIO RTS toggling into separate function")
037b91ec7729 ("serial: stm32: fix software flow control transfer")
d3d079bde07e ("serial: stm32: prevent TDR register overwrite when sending x_char")
195437d14fb4 ("serial: stm32: correct loop for dma error handling")
2a3bcfe03725 ("serial: stm32: fix flow control transfer in DMA mode")
9a135f16d228 ("serial: stm32: rework TX DMA state condition")
56a23f9319e8 ("serial: stm32: move tx dma terminate DMA to shutdown")
6333a4850621 ("serial: stm32: push DMA RX data before suspending")
6eeb348c8482 ("serial: stm32: terminate / restart DMA transfer at suspend / resume")
e0abc903deea ("serial: stm32: rework RX dma initialization and release")
d1ec8a2eabe9 ("serial: stm32: update throttle and unthrottle ops for dma mode")
33bb2f6ac308 ("serial: stm32: rework RX over DMA")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13c785323b36b845300b256d0e5963c3727667d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)pengutronix.de>
Date: Wed, 17 Apr 2024 11:03:27 +0200
Subject: [PATCH] serial: stm32: Return IRQ_NONE in the ISR if no handling
happend
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If there is a stuck irq that the handler doesn't address, returning
IRQ_HANDLED unconditionally makes it impossible for the irq core to
detect the problem and disable the irq. So only return IRQ_HANDLED if
an event was handled.
A stuck irq is still problematic, but with this change at least it only
makes the UART nonfunctional instead of occupying the (usually only) CPU
by 100% and so stall the whole machine.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Link: https://lore.kernel.org/r/5f92603d0dfd8a5b8014b2b10a902d91e0bb881f.17133441…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 58d169e5c1db..d60cbac69194 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -861,6 +861,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
u32 sr;
unsigned int size;
+ irqreturn_t ret = IRQ_NONE;
sr = readl_relaxed(port->membase + ofs->isr);
@@ -869,11 +870,14 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
(sr & USART_SR_TC)) {
stm32_usart_tc_interrupt_disable(port);
stm32_usart_rs485_rts_disable(port);
+ ret = IRQ_HANDLED;
}
- if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG)
+ if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG) {
writel_relaxed(USART_ICR_RTOCF,
port->membase + ofs->icr);
+ ret = IRQ_HANDLED;
+ }
if ((sr & USART_SR_WUF) && ofs->icr != UNDEF_REG) {
/* Clear wake up flag and disable wake up interrupt */
@@ -882,6 +886,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_WUFIE);
if (irqd_is_wakeup_set(irq_get_irq_data(port->irq)))
pm_wakeup_event(tport->tty->dev, 0);
+ ret = IRQ_HANDLED;
}
/*
@@ -896,6 +901,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
}
@@ -903,6 +909,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_port_lock(port);
stm32_usart_transmit_chars(port);
uart_port_unlock(port);
+ ret = IRQ_HANDLED;
}
/* Receiver timeout irq for DMA RX */
@@ -912,9 +919,10 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
- return IRQ_HANDLED;
+ return ret;
}
static void stm32_usart_set_mctrl(struct uart_port *port, unsigned int mctrl)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 13c785323b36b845300b256d0e5963c3727667d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042346-correct-footgear-b44a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
13c785323b36 ("serial: stm32: Return IRQ_NONE in the ISR if no handling happend")
c5d06662551c ("serial: stm32: Use port lock wrappers")
a01ae50d7eae ("serial: stm32: replace access to DMAR bit by dmaengine_pause/resume")
7f28bcea824e ("serial: stm32: group dma pause/resume error handling into single function")
00d1f9c6af0d ("serial: stm32: modify parameter and rename stm32_usart_rx_dma_enabled")
db89728abad5 ("serial: stm32: avoid clearing DMAT bit during transfer")
3f6c02fa712b ("serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler")
d7c76716169d ("serial: stm32: Use TC interrupt to deassert GPIO RTS in RS485 mode")
3bcea529b295 ("serial: stm32: Factor out GPIO RTS toggling into separate function")
037b91ec7729 ("serial: stm32: fix software flow control transfer")
d3d079bde07e ("serial: stm32: prevent TDR register overwrite when sending x_char")
195437d14fb4 ("serial: stm32: correct loop for dma error handling")
2a3bcfe03725 ("serial: stm32: fix flow control transfer in DMA mode")
9a135f16d228 ("serial: stm32: rework TX DMA state condition")
56a23f9319e8 ("serial: stm32: move tx dma terminate DMA to shutdown")
6333a4850621 ("serial: stm32: push DMA RX data before suspending")
6eeb348c8482 ("serial: stm32: terminate / restart DMA transfer at suspend / resume")
e0abc903deea ("serial: stm32: rework RX dma initialization and release")
d1ec8a2eabe9 ("serial: stm32: update throttle and unthrottle ops for dma mode")
33bb2f6ac308 ("serial: stm32: rework RX over DMA")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13c785323b36b845300b256d0e5963c3727667d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)pengutronix.de>
Date: Wed, 17 Apr 2024 11:03:27 +0200
Subject: [PATCH] serial: stm32: Return IRQ_NONE in the ISR if no handling
happend
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If there is a stuck irq that the handler doesn't address, returning
IRQ_HANDLED unconditionally makes it impossible for the irq core to
detect the problem and disable the irq. So only return IRQ_HANDLED if
an event was handled.
A stuck irq is still problematic, but with this change at least it only
makes the UART nonfunctional instead of occupying the (usually only) CPU
by 100% and so stall the whole machine.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Link: https://lore.kernel.org/r/5f92603d0dfd8a5b8014b2b10a902d91e0bb881f.17133441…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 58d169e5c1db..d60cbac69194 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -861,6 +861,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
u32 sr;
unsigned int size;
+ irqreturn_t ret = IRQ_NONE;
sr = readl_relaxed(port->membase + ofs->isr);
@@ -869,11 +870,14 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
(sr & USART_SR_TC)) {
stm32_usart_tc_interrupt_disable(port);
stm32_usart_rs485_rts_disable(port);
+ ret = IRQ_HANDLED;
}
- if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG)
+ if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG) {
writel_relaxed(USART_ICR_RTOCF,
port->membase + ofs->icr);
+ ret = IRQ_HANDLED;
+ }
if ((sr & USART_SR_WUF) && ofs->icr != UNDEF_REG) {
/* Clear wake up flag and disable wake up interrupt */
@@ -882,6 +886,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_WUFIE);
if (irqd_is_wakeup_set(irq_get_irq_data(port->irq)))
pm_wakeup_event(tport->tty->dev, 0);
+ ret = IRQ_HANDLED;
}
/*
@@ -896,6 +901,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
}
@@ -903,6 +909,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_port_lock(port);
stm32_usart_transmit_chars(port);
uart_port_unlock(port);
+ ret = IRQ_HANDLED;
}
/* Receiver timeout irq for DMA RX */
@@ -912,9 +919,10 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
- return IRQ_HANDLED;
+ return ret;
}
static void stm32_usart_set_mctrl(struct uart_port *port, unsigned int mctrl)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 13c785323b36b845300b256d0e5963c3727667d7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042345-brunette-striking-6f32@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
13c785323b36 ("serial: stm32: Return IRQ_NONE in the ISR if no handling happend")
c5d06662551c ("serial: stm32: Use port lock wrappers")
a01ae50d7eae ("serial: stm32: replace access to DMAR bit by dmaengine_pause/resume")
7f28bcea824e ("serial: stm32: group dma pause/resume error handling into single function")
00d1f9c6af0d ("serial: stm32: modify parameter and rename stm32_usart_rx_dma_enabled")
db89728abad5 ("serial: stm32: avoid clearing DMAT bit during transfer")
3f6c02fa712b ("serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler")
d7c76716169d ("serial: stm32: Use TC interrupt to deassert GPIO RTS in RS485 mode")
3bcea529b295 ("serial: stm32: Factor out GPIO RTS toggling into separate function")
037b91ec7729 ("serial: stm32: fix software flow control transfer")
d3d079bde07e ("serial: stm32: prevent TDR register overwrite when sending x_char")
195437d14fb4 ("serial: stm32: correct loop for dma error handling")
2a3bcfe03725 ("serial: stm32: fix flow control transfer in DMA mode")
9a135f16d228 ("serial: stm32: rework TX DMA state condition")
56a23f9319e8 ("serial: stm32: move tx dma terminate DMA to shutdown")
6333a4850621 ("serial: stm32: push DMA RX data before suspending")
6eeb348c8482 ("serial: stm32: terminate / restart DMA transfer at suspend / resume")
e0abc903deea ("serial: stm32: rework RX dma initialization and release")
d1ec8a2eabe9 ("serial: stm32: update throttle and unthrottle ops for dma mode")
33bb2f6ac308 ("serial: stm32: rework RX over DMA")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13c785323b36b845300b256d0e5963c3727667d7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)pengutronix.de>
Date: Wed, 17 Apr 2024 11:03:27 +0200
Subject: [PATCH] serial: stm32: Return IRQ_NONE in the ISR if no handling
happend
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If there is a stuck irq that the handler doesn't address, returning
IRQ_HANDLED unconditionally makes it impossible for the irq core to
detect the problem and disable the irq. So only return IRQ_HANDLED if
an event was handled.
A stuck irq is still problematic, but with this change at least it only
makes the UART nonfunctional instead of occupying the (usually only) CPU
by 100% and so stall the whole machine.
Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Link: https://lore.kernel.org/r/5f92603d0dfd8a5b8014b2b10a902d91e0bb881f.17133441…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index 58d169e5c1db..d60cbac69194 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -861,6 +861,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
u32 sr;
unsigned int size;
+ irqreturn_t ret = IRQ_NONE;
sr = readl_relaxed(port->membase + ofs->isr);
@@ -869,11 +870,14 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
(sr & USART_SR_TC)) {
stm32_usart_tc_interrupt_disable(port);
stm32_usart_rs485_rts_disable(port);
+ ret = IRQ_HANDLED;
}
- if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG)
+ if ((sr & USART_SR_RTOF) && ofs->icr != UNDEF_REG) {
writel_relaxed(USART_ICR_RTOCF,
port->membase + ofs->icr);
+ ret = IRQ_HANDLED;
+ }
if ((sr & USART_SR_WUF) && ofs->icr != UNDEF_REG) {
/* Clear wake up flag and disable wake up interrupt */
@@ -882,6 +886,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
stm32_usart_clr_bits(port, ofs->cr3, USART_CR3_WUFIE);
if (irqd_is_wakeup_set(irq_get_irq_data(port->irq)))
pm_wakeup_event(tport->tty->dev, 0);
+ ret = IRQ_HANDLED;
}
/*
@@ -896,6 +901,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
}
@@ -903,6 +909,7 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_port_lock(port);
stm32_usart_transmit_chars(port);
uart_port_unlock(port);
+ ret = IRQ_HANDLED;
}
/* Receiver timeout irq for DMA RX */
@@ -912,9 +919,10 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
+ ret = IRQ_HANDLED;
}
- return IRQ_HANDLED;
+ return ret;
}
static void stm32_usart_set_mctrl(struct uart_port *port, unsigned int mctrl)
With deferred IO enabled, a page fault happens when data is written to the
framebuffer device. Then driver determines which page is being updated by
calculating the offset of the written virtual address within the virtual
memory area, and uses this offset to get the updated page within the
internal buffer. This page is later copied to hardware (thus the name
"deferred IO").
This calculation is only correct if the virtual memory area is mapped to
the beginning of the internal buffer. Otherwise this is wrong. For example,
if users do:
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
Then the virtual memory area will mapped at offset 0xff000 within the
internal buffer. This offset 0xff000 is not accounted for, and wrong page
is updated. This will lead to wrong pixels being updated on the device.
However, it gets worse: if users do 2 mmap to the same virtual address, for
example:
int fd = open("/dev/fb0", O_RDWR, 0);
char *ptr = (char *) 0x20000000ul;
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
*ptr = 0; // write #1
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0);
*ptr = 0; // write #2
In this case, both write #1 and write #2 apply to the same virtual address
(0x20000000ul), and the driver mistakenly thinks the same page is being
written to. When the second write happens, the driver thinks this is the
same page as the last time, and reuse the page from write #1. The driver
then lock_page() an incorrect page, and returns VM_FAULT_LOCKED with the
correct page unlocked. It is unclear what will happen with memory
management subsystem after that, but likely something terrible.
Fix this by taking the mapping offset into account.
Reported-and-tested-by: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@or…
Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
Cc: stable(a)vger.kernel.org
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
---
drivers/video/fbdev/core/fb_defio.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index dae96c9f61cf..d5d6cd9e8b29 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
*/
static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
{
- unsigned long offset = vmf->address - vmf->vma->vm_start;
+ unsigned long offset = vmf->address - vmf->vma->vm_start
+ + (vmf->vma->vm_pgoff << PAGE_SHIFT);
struct page *page = vmf->page;
file_update_time(vmf->vma->vm_file);
--
2.39.2
Inspired by a patch from [Justin][1] I took a closer look at kdb_read().
Despite Justin's patch being a (correct) one-line manipulation it was a
tough patch to review because the surrounding code was hard to read and
it looked like there were unfixed problems.
This series isn't enough to make kdb_read() beautiful but it does make
it shorter, easier to reason about and fixes two buffer overflows and a
screen redraw problem!
[1]: https://lore.kernel.org/all/20240403-strncpy-kernel-debug-kdb-kdb_io-c-v1-1…
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
---
Changes in v2:
- No code changes!
- I belatedly realized that one of the cleanups actually fixed a buffer
overflow so there are changes to Cc: (to add stable@...) and to one
of the patch descriptions.
- Link to v1: https://lore.kernel.org/r/20240416-kgdb_read_refactor-v1-0-b18c2d01076d@lin…
---
Daniel Thompson (7):
kdb: Fix buffer overflow during tab-complete
kdb: Use format-strings rather than '\0' injection in kdb_read()
kdb: Fix console handling when editing and tab-completing commands
kdb: Merge identical case statements in kdb_read()
kdb: Use format-specifiers rather than memset() for padding in kdb_read()
kdb: Replace double memcpy() with memmove() in kdb_read()
kdb: Simplify management of tmpbuffer in kdb_read()
kernel/debug/kdb/kdb_io.c | 133 ++++++++++++++++++++--------------------------
1 file changed, 58 insertions(+), 75 deletions(-)
---
base-commit: dccce9b8780618986962ba37c373668bcf426866
change-id: 20240415-kgdb_read_refactor-2ea2dfc15dbb
Best regards,
--
Daniel Thompson <daniel.thompson(a)linaro.org>
[ Upstream commit 9fb9eb4b59acc607e978288c96ac7efa917153d4 ]
systems, using igb driver, crash while executing poweroff command
as per following call stack:
crash> bt -a
PID: 62583 TASK: ffff97ebbf28dc40 CPU: 0 COMMAND: "poweroff"
#0 [ffffa7adcd64f8a0] machine_kexec at ffffffffa606c7c1
#1 [ffffa7adcd64f900] __crash_kexec at ffffffffa613bb52
#2 [ffffa7adcd64f9d0] panic at ffffffffa6099c45
#3 [ffffa7adcd64fa50] oops_end at ffffffffa603359a
#4 [ffffa7adcd64fa78] die at ffffffffa6033c32
#5 [ffffa7adcd64faa8] do_trap at ffffffffa60309a0
#6 [ffffa7adcd64faf8] do_error_trap at ffffffffa60311e7
#7 [ffffa7adcd64fbc0] do_invalid_op at ffffffffa6031320
#8 [ffffa7adcd64fbd0] invalid_op at ffffffffa6a01f2a
[exception RIP: free_msi_irqs+408]
RIP: ffffffffa645d248 RSP: ffffa7adcd64fc88 RFLAGS: 00010286
RAX: ffff97eb1396fe00 RBX: 0000000000000000 RCX: ffff97eb1396fe00
RDX: ffff97eb1396fe00 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffa7adcd64fcb0 R8: 0000000000000002 R9: 000000000000fbff
R10: 0000000000000000 R11: 0000000000000000 R12: ffff98c047af4720
R13: ffff97eb87cd32a0 R14: ffff97eb87cd3000 R15: ffffa7adcd64fd57
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffffa7adcd64fc80] free_msi_irqs at ffffffffa645d0fc
#10 [ffffa7adcd64fcb8] pci_disable_msix at ffffffffa645d896
#11 [ffffa7adcd64fce0] igb_reset_interrupt_capability at ffffffffc024f335 [igb]
#12 [ffffa7adcd64fd08] __igb_shutdown at ffffffffc0258ed7 [igb]
#13 [ffffa7adcd64fd48] igb_shutdown at ffffffffc025908b [igb]
#14 [ffffa7adcd64fd70] pci_device_shutdown at ffffffffa6441e3a
#15 [ffffa7adcd64fd98] device_shutdown at ffffffffa6570260
#16 [ffffa7adcd64fdc8] kernel_power_off at ffffffffa60c0725
#17 [ffffa7adcd64fdd8] SYSC_reboot at ffffffffa60c08f1
#18 [ffffa7adcd64ff18] sys_reboot at ffffffffa60c09ee
#19 [ffffa7adcd64ff28] do_syscall_64 at ffffffffa6003ca9
#20 [ffffa7adcd64ff50] entry_SYSCALL_64_after_hwframe at ffffffffa6a001b1
This happens because igb_shutdown has not yet freed up allocated irqs and
free_msi_irqs finds irq_has_action true for involved msi irqs here and this
condition triggers BUG_ON.
Freeing irqs before proceeding further in igb_clear_interrupt_scheme,
fixes this problem.
Signed-off-by: Imran Khan <imran.f.khan(a)oracle.com>
---
This issue does not happen in v5.17 or later kernel versions because
'commit 9fb9eb4b59ac ("PCI/MSI: Let core code free MSI descriptors")',
explicitly frees up MSI based irqs and hence indirectly fixes this issue
as well. Also this is why I have mentioned this commit as equivalent
upstream commit. But this upstream change itself is dependent on a bunch
of changes starting from 'commit 288c81ce4be7 ("PCI/MSI: Move code into a
separate directory")', which refactored msi driver into multiple parts.
So another way of fixing this issue would be to backport these patches and
get this issue implictly fixed.
Kindly let me know if my current patch is not acceptable and in that case
will it be fine if I backport the above mentioned msi driver refactoring
patches to LST.
drivers/net/ethernet/intel/igb/igb_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 7c42a99be5065..5b59fb65231ba 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -107,6 +107,7 @@ static int igb_setup_all_rx_resources(struct igb_adapter *);
static void igb_free_all_tx_resources(struct igb_adapter *);
static void igb_free_all_rx_resources(struct igb_adapter *);
static void igb_setup_mrqc(struct igb_adapter *);
+static void igb_free_irq(struct igb_adapter *adapter);
static int igb_probe(struct pci_dev *, const struct pci_device_id *);
static void igb_remove(struct pci_dev *pdev);
static int igb_sw_init(struct igb_adapter *);
@@ -1080,6 +1081,7 @@ static void igb_free_q_vectors(struct igb_adapter *adapter)
*/
static void igb_clear_interrupt_scheme(struct igb_adapter *adapter)
{
+ igb_free_irq(adapter);
igb_free_q_vectors(adapter);
igb_reset_interrupt_capability(adapter);
}
--
2.34.1
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 72b779bc09422..d1a0f36dcdd43 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -101,7 +101,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 0791480bf922b..88ca5015f987e 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -86,7 +86,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 0d9b7d453a877..75907f77f9e38 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -87,7 +87,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 5e2657c1dbbe6..a0c5a372dcf62 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -85,7 +85,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index ea695c4a7a3fb..3bdf6df4b553e 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -83,7 +83,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
From: Joakim Sindholt <opensource(a)zhasha.com>
[ Upstream commit cd25e15e57e68a6b18dc9323047fe9c68b99290b ]
Garbage in plain 9P2000's perm bits is allowed through, which causes it
to be able to set (among others) the suid bit. This was presumably not
the intent since the unix extended bits are handled explicitly and
conditionally on .u.
Signed-off-by: Joakim Sindholt <opensource(a)zhasha.com>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/9p/vfs_inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 32572982f72e6..e337fec9b18e1 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -83,7 +83,7 @@ static int p9mode2perm(struct v9fs_session_info *v9ses,
int res;
int mode = stat->mode;
- res = mode & S_IALLUGO;
+ res = mode & 0777; /* S_IRWXUGO */
if (v9fs_proto_dotu(v9ses)) {
if ((mode & P9_DMSETUID) == P9_DMSETUID)
res |= S_ISUID;
--
2.43.0
Removes the output of this purely informational message from the
kernel buffer:
"ntfs3: Max link count 4000"
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Cc: stable(a)vger.kernel.org
---
fs/ntfs3/super.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 9df7c20d066f..ac4722011140 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -1804,8 +1804,6 @@ static int __init init_ntfs_fs(void)
{
int err;
- pr_info("ntfs3: Max link count %u\n", NTFS_LINK_MAX);
-
if (IS_ENABLED(CONFIG_NTFS3_FS_POSIX_ACL))
pr_info("ntfs3: Enabled Linux POSIX ACLs support\n");
if (IS_ENABLED(CONFIG_NTFS3_64BIT_CLUSTER))
--
2.34.1
When counting and checking hard links in an ntfs file record,
struct MFT_REC {
struct NTFS_RECORD_HEADER rhdr; // 'FILE'
__le16 seq; // 0x10: Sequence number for this record.
>> __le16 hard_links; // 0x12: The number of hard links to record.
__le16 attr_off; // 0x14: Offset to attributes.
...
the ntfs3 driver ignored short names (DOS names), causing the link count
to be reduced by 1 and messages to be output to dmesg.
For Windows, such a situation is a minor error, meaning chkdsk does not report
errors on such a volume, and in the case of using the /f switch, it silently
corrects them, reporting that no errors were found. This does not affect
the consistency of the file system.
Nevertheless, the behavior in the ntfs3 driver is incorrect and
changes the content of the file system. This patch should fix that.
PS: most likely, there has been a confusion of concepts
MFT_REC::hard_links and inode::__i_nlink.
Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich(a)paragon-software.com>
Cc: stable(a)vger.kernel.org
---
fs/ntfs3/inode.c | 7 ++++---
fs/ntfs3/record.c | 11 ++---------
2 files changed, 6 insertions(+), 12 deletions(-)
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index eb7a8c9fba01..05f169018c4e 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -37,7 +37,7 @@ static struct inode *ntfs_read_mft(struct inode *inode,
bool is_dir;
unsigned long ino = inode->i_ino;
u32 rp_fa = 0, asize, t32;
- u16 roff, rsize, names = 0;
+ u16 roff, rsize, names = 0, links = 0;
const struct ATTR_FILE_NAME *fname = NULL;
const struct INDEX_ROOT *root;
struct REPARSE_DATA_BUFFER rp; // 0x18 bytes
@@ -200,11 +200,12 @@ static struct inode *ntfs_read_mft(struct inode *inode,
rsize < SIZEOF_ATTRIBUTE_FILENAME)
goto out;
+ names += 1;
fname = Add2Ptr(attr, roff);
if (fname->type == FILE_NAME_DOS)
goto next_attr;
- names += 1;
+ links += 1;
if (name && name->len == fname->name_len &&
!ntfs_cmp_names_cpu(name, (struct le_str *)&fname->name_len,
NULL, false))
@@ -429,7 +430,7 @@ static struct inode *ntfs_read_mft(struct inode *inode,
ni->mi.dirty = true;
}
- set_nlink(inode, names);
+ set_nlink(inode, links);
if (S_ISDIR(mode)) {
ni->std_fa |= FILE_ATTRIBUTE_DIRECTORY;
diff --git a/fs/ntfs3/record.c b/fs/ntfs3/record.c
index 6aa3a9d44df1..6c76503edc20 100644
--- a/fs/ntfs3/record.c
+++ b/fs/ntfs3/record.c
@@ -534,16 +534,9 @@ bool mi_remove_attr(struct ntfs_inode *ni, struct mft_inode *mi,
if (aoff + asize > used)
return false;
- if (ni && is_attr_indexed(attr)) {
+ if (ni && is_attr_indexed(attr) && attr->type == ATTR_NAME) {
u16 links = le16_to_cpu(ni->mi.mrec->hard_links);
- struct ATTR_FILE_NAME *fname =
- attr->type != ATTR_NAME ?
- NULL :
- resident_data_ex(attr,
- SIZEOF_ATTRIBUTE_FILENAME);
- if (fname && fname->type == FILE_NAME_DOS) {
- /* Do not decrease links count deleting DOS name. */
- } else if (!links) {
+ if (!links) {
/* minor error. Not critical. */
} else {
ni->mi.mrec->hard_links = cpu_to_le16(links - 1);
--
2.34.1
When KUnit tests are enabled, under very big kernel configurations
(e.g. `allyesconfig`), we can trigger a `rustdoc` ICE [1]:
RUSTDOC TK rust/kernel/lib.rs
error: the compiler unexpectedly panicked. this is a bug.
The reason is that this build step has a duplicated `@rustc_cfg` argument,
which contains the kernel configuration, and thus a lot of arguments. The
factor 2 happens to be enough to reach the ICE.
Thus remove the unneeded `@rustc_cfg`. By doing so, we clean up the
command and workaround the ICE.
The ICE has been fixed in the upcoming Rust 1.79 [2].
Cc: stable(a)vger.kernel.org
Fixes: a66d733da801 ("rust: support running Rust documentation tests as KUnit ones")
Link: https://github.com/rust-lang/rust/issues/122722 [1]
Link: https://github.com/rust-lang/rust/pull/122840 [2]
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
rust/Makefile | 1 -
1 file changed, 1 deletion(-)
diff --git a/rust/Makefile b/rust/Makefile
index 846e6ab9d5a9..86a125c4243c 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -175,7 +175,6 @@ quiet_cmd_rustdoc_test_kernel = RUSTDOC TK $<
mkdir -p $(objtree)/$(obj)/test/doctests/kernel; \
OBJTREE=$(abspath $(objtree)) \
$(RUSTDOC) --test $(rust_flags) \
- @$(objtree)/include/generated/rustc_cfg \
-L$(objtree)/$(obj) --extern alloc --extern kernel \
--extern build_error --extern macros \
--extern bindings --extern uapi \
base-commit: 4cece764965020c22cff7665b18a012006359095
--
2.44.0
On Sun, Apr 21, 2024 at 01:11:19PM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> configs/hardening: Fix disabling UBSAN configurations
>
> to the 6.8-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> configs-hardening-fix-disabling-ubsan-configurations.patch
> and it can be found in the queue-6.8 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit a54fba0bb1f52707b423c908e153d6429d08db58
> Author: Nathan Chancellor <nathan(a)kernel.org>
> Date: Thu Apr 11 11:11:06 2024 -0700
>
> configs/hardening: Fix disabling UBSAN configurations
>
> [ Upstream commit e048d668f2969cf2b76e0fa21882a1b3bb323eca ]
While I think backporting this makes sense, I don't know that
backporting 918327e9b7ff ("ubsan: Remove CONFIG_UBSAN_SANITIZE_ALL") to
resolve the conflict with 6.8 is entirely necessary (or beneficial, I
don't know how Kees feels about it though). I've attached a version that
applies cleanly to 6.8, in case it is desirable.
Cheers,
Nathan
Hi,
Based on the issue reported here [0], commit 0553e753ea9e ("net/mlx5: E-
switch, store eswitch pointer before registering devlink_param") is missing from
kernel 6.6.x due to a missing tag. Could you please include it in 6.6.x and
6.8.x?
[0] https://lore.kernel.org/netdev/20240422170428.32576-1-oxana@cloudflare.com/
Thanks,
Dragos
The patch titled
Subject: maple_tree: fix mas_empty_area_rev() null pointer dereference
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
maple_tree-fix-mas_empty_area_rev-null-pointer-dereference.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Subject: maple_tree: fix mas_empty_area_rev() null pointer dereference
Date: Mon, 22 Apr 2024 16:33:49 -0400
Currently the code calls mas_start() followed by mas_data_end() if the
maple state is MA_START, but mas_start() may return with the maple state
node == NULL. This will lead to a null pointer dereference when checking
information in the NULL node, which is done in mas_data_end().
Avoid setting the offset if there is no node by waiting until after the
maple state is checked for an empty or single entry state.
A user could trigger the events to cause a kernel oops by unmapping all
vmas to produce an empty maple tree, then mapping a vma that would cause
the scenario described above.
Link: https://lkml.kernel.org/r/20240422203349.2418465-1-Liam.Howlett@oracle.com
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reported-by: Marius Fleischer <fleischermarius(a)gmail.com>
Closes: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW…
Link: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW…
Tested-by: Marius Fleischer <fleischermarius(a)gmail.com>
Tested-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/maple_tree.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
--- a/lib/maple_tree.c~maple_tree-fix-mas_empty_area_rev-null-pointer-dereference
+++ a/lib/maple_tree.c
@@ -5109,18 +5109,18 @@ int mas_empty_area_rev(struct ma_state *
if (size == 0 || max - min < size - 1)
return -EINVAL;
- if (mas_is_start(mas)) {
+ if (mas_is_start(mas))
mas_start(mas);
- mas->offset = mas_data_end(mas);
- } else if (mas->offset >= 2) {
- mas->offset -= 2;
- } else if (!mas_rewind_node(mas)) {
+ else if ((mas->offset < 2) && (!mas_rewind_node(mas)))
return -EBUSY;
- }
- /* Empty set. */
- if (mas_is_none(mas) || mas_is_ptr(mas))
+ if (unlikely(mas_is_none(mas) || mas_is_ptr(mas)))
return mas_sparse_area(mas, min, max, size, false);
+ else if (mas->offset >= 2)
+ mas->offset -= 2;
+ else
+ mas->offset = mas_data_end(mas);
+
/* The start of the window can only be within these values. */
mas->index = min;
_
Patches currently in -mm which might be from Liam.Howlett(a)oracle.com are
maple_tree-fix-mas_empty_area_rev-null-pointer-dereference.patch
Currently, when kdb is compiled with keyboard support, then we will use
schedule_work() to provoke reset of the keyboard status. Unfortunately
schedule_work() gets called from the kgdboc post-debug-exception
handler. That risks deadlock since schedule_work() is not NMI-safe and,
even on platforms where the NMI is not directly used for debugging, the
debug trap can have NMI-like behaviour depending on where breakpoints
are placed.
Fix this by using the irq work system, which is NMI-safe, to defer the
call to schedule_work() to a point when it is safe to call.
Reported-by: Liuye <liu.yeC(a)h3c.com>
Closes: https://lore.kernel.org/all/20240228025602.3087748-1-liu.yeC@h3c.com/
Cc: stable(a)vger.kernel.org
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
---
drivers/tty/serial/kgdboc.c | 30 +++++++++++++++++++++++++++++-
1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index 7ce7bb1640054..adcea70fd7507 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -19,6 +19,7 @@
#include <linux/console.h>
#include <linux/vt_kern.h>
#include <linux/input.h>
+#include <linux/irq_work.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/serial_core.h>
@@ -48,6 +49,25 @@ static struct kgdb_io kgdboc_earlycon_io_ops;
static int (*earlycon_orig_exit)(struct console *con);
#endif /* IS_BUILTIN(CONFIG_KGDB_SERIAL_CONSOLE) */
+/*
+ * When we leave the debug trap handler we need to reset the keyboard status
+ * (since the original keyboard state gets partially clobbered by kdb use of
+ * the keyboard).
+ *
+ * The path to deliver the reset is somewhat circuitous.
+ *
+ * To deliver the reset we register an input handler, reset the keyboard and
+ * then deregister the input handler. However, to get this done right, we do
+ * have to carefully manage the calling context because we can only register
+ * input handlers from task context.
+ *
+ * In particular we need to trigger the action from the debug trap handler with
+ * all its NMI and/or NMI-like oddities. To solve this the kgdboc trap exit code
+ * (the "post_exception" callback) uses irq_work_queue(), which is NMI-safe, to
+ * schedule a callback from a hardirq context. From there we have to defer the
+ * work again, this time using schedule_Work(), to get a callback using the
+ * system workqueue, which runs in task context.
+ */
#ifdef CONFIG_KDB_KEYBOARD
static int kgdboc_reset_connect(struct input_handler *handler,
struct input_dev *dev,
@@ -99,10 +119,17 @@ static void kgdboc_restore_input_helper(struct work_struct *dummy)
static DECLARE_WORK(kgdboc_restore_input_work, kgdboc_restore_input_helper);
+static void kgdboc_queue_restore_input_helper(struct irq_work *unused)
+{
+ schedule_work(&kgdboc_restore_input_work);
+}
+
+static DEFINE_IRQ_WORK(kgdboc_restore_input_irq_work, kgdboc_queue_restore_input_helper);
+
static void kgdboc_restore_input(void)
{
if (likely(system_state == SYSTEM_RUNNING))
- schedule_work(&kgdboc_restore_input_work);
+ irq_work_queue(&kgdboc_restore_input_irq_work);
}
static int kgdboc_register_kbd(char **cptr)
@@ -133,6 +160,7 @@ static void kgdboc_unregister_kbd(void)
i--;
}
}
+ irq_work_sync(&kgdboc_restore_input_irq_work);
flush_work(&kgdboc_restore_input_work);
}
#else /* ! CONFIG_KDB_KEYBOARD */
---
base-commit: 0bbac3facb5d6cc0171c45c9873a2dc96bea9680
change-id: 20240419-kgdboc_fix_schedule_work-f0cb44b8a354
Best regards,
--
Daniel Thompson <daniel.thompson(a)linaro.org>
Currently the code calls mas_start() followed by mas_data_end() if the
maple state is MA_START, but mas_start() may return with the maple state
node == NULL. This will lead to a null pointer dereference when
checking information in the NULL node, which is done in mas_data_end().
Avoid setting the offset if there is no node by waiting until after the
maple state is checked for an empty or single entry state.
A user could trigger the events to cause a kernel oops by unmapping all
vmas to produce an empty maple tree, then mapping a vma that would cause
the scenario described above.
Reported-by: Marius Fleischer <fleischermarius(a)gmail.com>
Closes: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW…
Link: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW…
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Tested-by: Marius Fleischer <fleischermarius(a)gmail.com>
Tested-by: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: maple-tree(a)lists.infradead.org
Cc: linux-mm(a)kvack.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
---
lib/maple_tree.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 55e1b35bf877..2d7d27e6ae3c 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5109,18 +5109,18 @@ int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
if (size == 0 || max - min < size - 1)
return -EINVAL;
- if (mas_is_start(mas)) {
+ if (mas_is_start(mas))
mas_start(mas);
- mas->offset = mas_data_end(mas);
- } else if (mas->offset >= 2) {
- mas->offset -= 2;
- } else if (!mas_rewind_node(mas)) {
+ else if ((mas->offset < 2) && (!mas_rewind_node(mas)))
return -EBUSY;
- }
- /* Empty set. */
- if (mas_is_none(mas) || mas_is_ptr(mas))
+ if (unlikely(mas_is_none(mas) || mas_is_ptr(mas)))
return mas_sparse_area(mas, min, max, size, false);
+ else if (mas->offset >= 2)
+ mas->offset -= 2;
+ else
+ mas->offset = mas_data_end(mas);
+
/* The start of the window can only be within these values. */
mas->index = min;
--
2.43.0
The patch titled
Subject: mm/userfaultfd: reset ptes when close() for wr-protected ones
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-userfaultfd-reset-ptes-when-close-for-wr-protected-ones.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/userfaultfd: reset ptes when close() for wr-protected ones
Date: Mon, 22 Apr 2024 09:33:11 -0400
Userfaultfd unregister includes a step to remove wr-protect bits from all
the relevant pgtable entries, but that only covered an explicit
UFFDIO_UNREGISTER ioctl, not a close() on the userfaultfd itself. Cover
that too. This fixes a WARN trace.
Link: https://lore.kernel.org/all/000000000000ca4df20616a0fe16@google.com/
Analyzed-by: David Hildenbrand <david(a)redhat.com>
Link: https://lkml.kernel.org/r/20240422133311.2987675-1-peterx@redhat.com
Reported-by: syzbot+d8426b591c36b21c750e(a)syzkaller.appspotmail.com
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/userfaultfd.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/fs/userfaultfd.c~mm-userfaultfd-reset-ptes-when-close-for-wr-protected-ones
+++ a/fs/userfaultfd.c
@@ -895,6 +895,10 @@ static int userfaultfd_release(struct in
prev = vma;
continue;
}
+ /* Reset ptes for the whole vma range if wr-protected */
+ if (userfaultfd_wp(vma))
+ uffd_wp_range(vma, vma->vm_start,
+ vma->vm_end - vma->vm_start, false);
new_flags = vma->vm_flags & ~__VM_UFFD_FLAGS;
vma = vma_modify_flags_uffd(&vmi, prev, vma, vma->vm_start,
vma->vm_end, new_flags,
_
Patches currently in -mm which might be from peterx(a)redhat.com are
mm-hugetlb-fix-missing-hugetlb_lock-for-resv-uncharge.patch
mm-userfaultfd-reset-ptes-when-close-for-wr-protected-ones.patch
mm-hmm-process-pud-swap-entry-without-pud_huge.patch
mm-gup-cache-p4d-in-follow_p4d_mask.patch
mm-gup-check-p4d-presence-before-going-on.patch
mm-x86-change-pxd_huge-behavior-to-exclude-swap-entries.patch
mm-sparc-change-pxd_huge-behavior-to-exclude-swap-entries.patch
mm-arm-use-macros-to-define-pmd-pud-helpers.patch
mm-arm-redefine-pmd_huge-with-pmd_leaf.patch
mm-arm64-merge-pxd_huge-and-pxd_leaf-definitions.patch
mm-powerpc-redefine-pxd_huge-with-pxd_leaf.patch
mm-gup-merge-pxd-huge-mapping-checks.patch
mm-treewide-replace-pxd_huge-with-pxd_leaf.patch
mm-treewide-remove-pxd_huge.patch
mm-arm-remove-pmd_thp_or_huge.patch
mm-document-pxd_leaf-api.patch
mm-always-initialise-folio-_deferred_list-fix.patch
selftests-mm-run_vmtestssh-fix-hugetlb-mem-size-calculation.patch
selftests-mm-run_vmtestssh-fix-hugetlb-mem-size-calculation-fix.patch
mm-kconfig-config_pgtable_has_huge_leaves.patch
mm-hugetlb-declare-hugetlbfs_pagecache_present-non-static.patch
mm-make-hpage_pxd_-macros-even-if-thp.patch
mm-introduce-vma_pgtable_walk_beginend.patch
mm-arch-provide-pud_pfn-fallback.patch
mm-arch-provide-pud_pfn-fallback-fix.patch
mm-gup-drop-folio_fast_pin_allowed-in-hugepd-processing.patch
mm-gup-refactor-record_subpages-to-find-1st-small-page.patch
mm-gup-handle-hugetlb-for-no_page_table.patch
mm-gup-cache-pudp-in-follow_pud_mask.patch
mm-gup-handle-huge-pud-for-follow_pud_mask.patch
mm-gup-handle-huge-pmd-for-follow_pmd_mask.patch
mm-gup-handle-huge-pmd-for-follow_pmd_mask-fix.patch
mm-gup-handle-hugepd-for-follow_page.patch
mm-gup-handle-hugetlb-in-the-generic-follow_page_mask-code.patch
mm-allow-anon-exclusive-check-over-hugetlb-tail-pages.patch
mm-hugetlb-assert-hugetlb_lock-in-__hugetlb_cgroup_commit_charge.patch
mm-page_table_check-support-userfault-wr-protect-entries.patch
commit 2f4a4d63a193be6fd530d180bb13c3592052904c modified
cpc_read/cpc_write to use access_width to read CPC registers. For PCC
registers the access width field in the ACPI register macro specifies
the PCC subspace id. For non-zero PCC subspace id the access width is
incorrectly treated as access width. This causes errors when reading
from PCC registers in the CPPC driver.
For PCC registers base the size of read/write on the bit width field.
The debug message in cpc_read/cpc_write is updated to print relevant
information for the address space type used to read the register.
Signed-off-by: Vanshidhar Konda <vanshikonda(a)os.amperecomputing.com>
Tested-by: Jarred White <jarredwhite(a)linux.microsoft.com>
Reviewed-by: Jarred White <jarredwhite(a)linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha(a)linux.microsoft.com>
Cc: 5.15+ <stable(a)vger.kernel.org> # 5.15+
---
When testing v6.9-rc1 kernel on AmpereOne system dmesg showed that
cpufreq policy had failed to initialize on some cores during boot because
cpufreq->get() always returned 0. On this system CPPC registers are in PCC
subspace index 2 that are 32 bits wide. With this patch the CPPC driver
interpreted the access width field as 16 bits, causing the register read
to roll over too quickly to provide valid values during frequency
computation.
v2:
- Use size variable in debug print message
- Use size instead of reg->bit_width for acpi_os_read_memory and
acpi_os_write_memory
v3:
- Fix language in error messages in cpc_read/cpc_write
drivers/acpi/cppc_acpi.c | 53 ++++++++++++++++++++++++++++------------
1 file changed, 37 insertions(+), 16 deletions(-)
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 4bfbe55553f4..7d476988fae3 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1002,14 +1002,14 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
}
*val = 0;
+ size = GET_BIT_WIDTH(reg);
if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
- u32 width = GET_BIT_WIDTH(reg);
u32 val_u32;
acpi_status status;
status = acpi_os_read_port((acpi_io_address)reg->address,
- &val_u32, width);
+ &val_u32, size);
if (ACPI_FAILURE(status)) {
pr_debug("Error: Failed to read SystemIO port %llx\n",
reg->address);
@@ -1018,17 +1018,22 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
*val = val_u32;
return 0;
- } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0)
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) {
+ /*
+ * For registers in PCC space, the register size is determined
+ * by the bit width field; the access size is used to indicate
+ * the PCC subspace id.
+ */
+ size = reg->bit_width;
vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
+ }
else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
vaddr = reg_res->sys_mem_vaddr;
else if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE)
return cpc_read_ffh(cpu, reg, val);
else
return acpi_os_read_memory((acpi_physical_address)reg->address,
- val, reg->bit_width);
-
- size = GET_BIT_WIDTH(reg);
+ val, size);
switch (size) {
case 8:
@@ -1044,8 +1049,13 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
*val = readq_relaxed(vaddr);
break;
default:
- pr_debug("Error: Cannot read %u bit width from PCC for ss: %d\n",
- reg->bit_width, pcc_ss_id);
+ if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
+ pr_debug("Error: Cannot read %u bit width from system memory: 0x%llx\n",
+ size, reg->address);
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
+ pr_debug("Error: Cannot read %u bit width from PCC for ss: %d\n",
+ size, pcc_ss_id);
+ }
return -EFAULT;
}
@@ -1063,12 +1073,13 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
struct cpc_reg *reg = ®_res->cpc_entry.reg;
+ size = GET_BIT_WIDTH(reg);
+
if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
- u32 width = GET_BIT_WIDTH(reg);
acpi_status status;
status = acpi_os_write_port((acpi_io_address)reg->address,
- (u32)val, width);
+ (u32)val, size);
if (ACPI_FAILURE(status)) {
pr_debug("Error: Failed to write SystemIO port %llx\n",
reg->address);
@@ -1076,17 +1087,22 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
}
return 0;
- } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0)
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM && pcc_ss_id >= 0) {
+ /*
+ * For registers in PCC space, the register size is determined
+ * by the bit width field; the access size is used to indicate
+ * the PCC subspace id.
+ */
+ size = reg->bit_width;
vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
+ }
else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
vaddr = reg_res->sys_mem_vaddr;
else if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE)
return cpc_write_ffh(cpu, reg, val);
else
return acpi_os_write_memory((acpi_physical_address)reg->address,
- val, reg->bit_width);
-
- size = GET_BIT_WIDTH(reg);
+ val, size);
if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
val = MASK_VAL(reg, val);
@@ -1105,8 +1121,13 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
writeq_relaxed(val, vaddr);
break;
default:
- pr_debug("Error: Cannot write %u bit width to PCC for ss: %d\n",
- reg->bit_width, pcc_ss_id);
+ if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
+ pr_debug("Error: Cannot write %u bit width to system memory: 0x%llx\n",
+ size, reg->address);
+ } else if (reg->space_id == ACPI_ADR_SPACE_PLATFORM_COMM) {
+ pr_debug("Error: Cannot write %u bit width to PCC for ss: %d\n",
+ size, pcc_ss_id);
+ }
ret_val = -EFAULT;
break;
}
--
2.43.1
Commit 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for
system memory accesses") neglected to properly wrap the bit_offset shift
when it comes to applying the mask. This may cause incorrect values to be
read and may cause the cpufreq module not be loaded.
[ 11.059751] cpu_capacity: CPU0 missing/invalid highest performance.
[ 11.066005] cpu_capacity: partial information: fallback to 1024 for all CPUs
Also, corrected the bitmask generation in GENMASK (extra bit being added).
Fixes: 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Jarred White <jarredwhite(a)linux.microsoft.com>
CC: Vanshidhar Konda <vanshikonda(a)os.amperecomputing.com>
CC: stable(a)vger.kernel.org #5.15+
---
drivers/acpi/cppc_acpi.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 4bfbe55553f4..00a30ca35e78 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -170,8 +170,8 @@ show_cppc_data(cppc_get_perf_ctrs, cppc_perf_fb_ctrs, wraparound_time);
#define GET_BIT_WIDTH(reg) ((reg)->access_width ? (8 << ((reg)->access_width - 1)) : (reg)->bit_width)
/* Shift and apply the mask for CPC reads/writes */
-#define MASK_VAL(reg, val) ((val) >> ((reg)->bit_offset & \
- GENMASK(((reg)->bit_width), 0)))
+#define MASK_VAL(reg, val) (((val) >> (reg)->bit_offset) & \
+ GENMASK(((reg)->bit_width) - 1, 0))
static ssize_t show_feedback_ctrs(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
--
2.33.8
From: George Shen <george.shen(a)amd.com>
Theoretically rare corner case where ceil(Y) results in rounding up to
an integer. If this happens, the 1 should be carried over to the X
value.
CC: stable(a)vger.kernel.org
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Signed-off-by: George Shen <george.shen(a)amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com>
---
.../drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
index 48e63550f696..03b4ac2f1991 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hpo_dp_link_encoder.c
@@ -395,6 +395,12 @@ void dcn31_hpo_dp_link_enc_set_throttled_vcp_size(
x),
25));
+ // If y rounds up to integer, carry it over to x.
+ if (y >> 25) {
+ x += 1;
+ y = 0;
+ }
+
switch (stream_encoder_inst) {
case 0:
REG_SET_2(DP_DPHY_SYM32_VC_RATE_CNTL0, 0,
--
2.44.0
An async dio write to a sparse file can generate a lot of extents
and when we unlink this file (using rm), the kernel can be busy in umapping
and freeing those extents as part of transaction processing.
Add cond_resched() in xfs_defer_finish_noroll() to avoid soft lockups
messages. Here is a call trace of such soft lockup.
watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [rm:81335]
CPU: 1 PID: 81335 Comm: rm Kdump: loaded Tainted: G L X 5.14.21-150500.53-default
NIP [c00800001b174768] xfs_extent_busy_trim+0xc0/0x2a0 [xfs]
LR [c00800001b1746f4] xfs_extent_busy_trim+0x4c/0x2a0 [xfs]
Call Trace:
0xc0000000a8268340 (unreliable)
xfs_alloc_compute_aligned+0x5c/0x150 [xfs]
xfs_alloc_ag_vextent_size+0x1dc/0x8c0 [xfs]
xfs_alloc_ag_vextent+0x17c/0x1c0 [xfs]
xfs_alloc_fix_freelist+0x274/0x4b0 [xfs]
xfs_free_extent_fix_freelist+0x84/0xe0 [xfs]
__xfs_free_extent+0xa0/0x240 [xfs]
xfs_trans_free_extent+0x6c/0x140 [xfs]
xfs_defer_finish_noroll+0x2b0/0x650 [xfs]
xfs_inactive_truncate+0xe8/0x140 [xfs]
xfs_fs_destroy_inode+0xdc/0x320 [xfs]
destroy_inode+0x6c/0xc0
do_unlinkat+0x1fc/0x410
sys_unlinkat+0x58/0xb0
system_call_exception+0x15c/0x330
system_call_common+0xec/0x250
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
cc: stable(a)vger.kernel.org
cc: Ojaswin Mujoo <ojaswin(a)linux.ibm.com>
---
fs/xfs/libxfs/xfs_defer.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index c13276095cc0..cb185b97447d 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -705,6 +705,7 @@ xfs_defer_finish_noroll(
error = xfs_defer_finish_one(*tp, dfp);
if (error && error != -EAGAIN)
goto out_shutdown;
+ cond_resched();
}
/* Requeue the paused items in the outgoing transaction. */
--
2.44.0
Qualcomm ROME controllers can be registered from the Bluetooth line
discipline and in this case the HCI UART serdev pointer is NULL.
Add the missing sanity check to prevent a NULL-pointer dereference when
setup() is called for a non-serdev controller.
Fixes: e9b3e5b8c657 ("Bluetooth: hci_qca: only assign wakeup with serial port support")
Cc: stable(a)vger.kernel.org # 6.2
Cc: Zhengping Jiang <jiangzp(a)google.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/bluetooth/hci_qca.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 94c85f4fbf3b..b621a0a40ea4 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1958,8 +1958,10 @@ static int qca_setup(struct hci_uart *hu)
qca_debugfs_init(hdev);
hu->hdev->hw_error = qca_hw_error;
hu->hdev->cmd_timeout = qca_cmd_timeout;
- if (device_can_wakeup(hu->serdev->ctrl->dev.parent))
- hu->hdev->wakeup = qca_wakeup;
+ if (hu->serdev) {
+ if (device_can_wakeup(hu->serdev->ctrl->dev.parent))
+ hu->hdev->wakeup = qca_wakeup;
+ }
} else if (ret == -ENOENT) {
/* No patch/nvm-config found, run with original fw/config */
set_bit(QCA_ROM_FW, &qca->flags);
--
2.43.2
Qualcomm ROME controllers can be registered from the Bluetooth line
discipline and in this case the HCI UART serdev pointer is NULL.
Add the missing sanity check to prevent a NULL-pointer dereference when
wakeup() is called for a non-serdev controller during suspend.
Just return true for now to restore the original behaviour and address
the crash with pre-6.2 kernels, which do not have commit e9b3e5b8c657
("Bluetooth: hci_qca: only assign wakeup with serial port support") that
causes the crash to happen already at setup() time.
Fixes: c1a74160eaf1 ("Bluetooth: hci_qca: Add device_may_wakeup support")
Cc: stable(a)vger.kernel.org # 5.13
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/bluetooth/hci_qca.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 92fa20f5ac7d..94c85f4fbf3b 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -1672,6 +1672,9 @@ static bool qca_wakeup(struct hci_dev *hdev)
struct hci_uart *hu = hci_get_drvdata(hdev);
bool wakeup;
+ if (!hu->serdev)
+ return true;
+
/* BT SoC attached through the serial bus is handled by the serdev driver.
* So we need to use the device handle of the serdev driver to get the
* status of device may wakeup.
--
2.43.2
If none of the clusters are added because of some error, fail to load
driver without presenting root domain. In this case root domain will
present invalid data.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Fixes: 01c10f88c9b7 ("platform/x86/intel-uncore-freq: tpmi: Provide cluster level control")
Cc: <stable(a)vger.kernel.org> # 6.5+
---
This error can be reproduced in the pre production hardware only.
So can go through regular cycle and they apply to stable.
.../x86/intel/uncore-frequency/uncore-frequency-tpmi.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
index bd75d61ff8a6..587437211d72 100644
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
@@ -240,6 +240,7 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
bool read_blocked = 0, write_blocked = 0;
struct intel_tpmi_plat_info *plat_info;
struct tpmi_uncore_struct *tpmi_uncore;
+ bool uncore_sysfs_added = false;
int ret, i, pkg = 0;
int num_resources;
@@ -384,9 +385,15 @@ static int uncore_probe(struct auxiliary_device *auxdev, const struct auxiliary_
}
/* Point to next cluster offset */
cluster_offset >>= UNCORE_MAX_CLUSTER_PER_DOMAIN;
+ uncore_sysfs_added = true;
}
}
+ if (!uncore_sysfs_added) {
+ ret = -ENODEV;
+ goto remove_clusters;
+ }
+
auxiliary_set_drvdata(auxdev, tpmi_uncore);
tpmi_uncore->root_cluster.root_domain = true;
--
2.40.1
Nick Bowler reported that sparc64 failed to bring all his CPU's online,
and that turned out to be an easy fix.
The sparc64 build was rather noisy with a lot of warnings which had
irritated me enough to go ahead and fix them.
With this set of patches my arch/sparc/ is almost warning free for
all{no,yes,mod}config + defconfig builds.
There is one warning about "clone3 not implemented", which I have ignored.
The warning fixes hides the fact that sparc64 is not yet y2038 prepared,
and it would be preferable if someone knowledgeable would fix this
poperly.
All fixes looks like 6.9 material to me.
Sam
---
Sam Ravnborg (10):
sparc64: Fix prototype warning for init_vdso_image
sparc64: Fix prototype warnings in traps_64.c
sparc64: Fix prototype warning for vmemmap_free
sparc64: Fix prototype warning for alloc_irqstack_bootmem
sparc64: Fix prototype warning for uprobe_trap
sparc64: Fix prototype warning for dma_4v_iotsb_bind
sparc64: Fix prototype warnings in adi_64.c
sparc64: Fix prototype warning for sched_clock
sparc64: Fix number of online CPUs
sparc64: Fix prototype warnings for vdso
arch/sparc/include/asm/smp_64.h | 2 --
arch/sparc/include/asm/vdso.h | 10 ++++++++++
arch/sparc/kernel/adi_64.c | 14 +++++++-------
arch/sparc/kernel/kernel.h | 4 ++++
arch/sparc/kernel/pci_sun4v.c | 6 +++---
arch/sparc/kernel/prom_64.c | 4 +++-
arch/sparc/kernel/setup_64.c | 3 +--
arch/sparc/kernel/smp_64.c | 14 --------------
arch/sparc/kernel/time_64.c | 1 +
arch/sparc/kernel/traps_64.c | 10 +++++-----
arch/sparc/kernel/uprobes.c | 2 ++
arch/sparc/mm/init_64.c | 5 -----
arch/sparc/vdso/vclock_gettime.c | 1 +
arch/sparc/vdso/vma.c | 5 +++--
14 files changed, 40 insertions(+), 41 deletions(-)
---
base-commit: 84b76d05828a1909e20d0f66553b876b801f98c8
change-id: 20240329-sparc64-warnings-668cc90ef53b
Best regards,
--
Sam Ravnborg <sam(a)ravnborg.org>
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: mc: mark the media devnode as registered from the, start
Author: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Date: Fri Feb 23 09:46:19 2024 +0100
First the media device node was created, and if successful it was
marked as 'registered'. This leaves a small race condition where
an application can open the device node and get an error back
because the 'registered' flag was not yet set.
Change the order: first set the 'registered' flag, then actually
register the media device node. If that fails, then clear the flag.
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Acked-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Fixes: cf4b9211b568 ("[media] media: Media device node support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
drivers/media/mc/mc-devnode.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
---
diff --git a/drivers/media/mc/mc-devnode.c b/drivers/media/mc/mc-devnode.c
index 7f67825c8757..318e267e798e 100644
--- a/drivers/media/mc/mc-devnode.c
+++ b/drivers/media/mc/mc-devnode.c
@@ -245,15 +245,14 @@ int __must_check media_devnode_register(struct media_device *mdev,
kobject_set_name(&devnode->cdev.kobj, "media%d", devnode->minor);
/* Part 3: Add the media and char device */
+ set_bit(MEDIA_FLAG_REGISTERED, &devnode->flags);
ret = cdev_device_add(&devnode->cdev, &devnode->dev);
if (ret < 0) {
+ clear_bit(MEDIA_FLAG_REGISTERED, &devnode->flags);
pr_err("%s: cdev_device_add failed\n", __func__);
goto cdev_add_error;
}
- /* Part 4: Activate this minor. The char device can now be used. */
- set_bit(MEDIA_FLAG_REGISTERED, &devnode->flags);
-
return 0;
cdev_add_error:
The type defined for the BINDER_SET_MAX_THREADS ioctl was changed from
size_t to __u32 in order to avoid incompatibility issues between 32 and
64-bit kernels. However, the internal types used to copy from user and
store the value were never updated. Use u32 to fix the inconsistency.
Fixes: a9350fc859ae ("staging: android: binder: fix BINDER_SET_MAX_THREADS declaration")
Reported-by: Arve Hjønnevåg <arve(a)android.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
---
Notes:
v2: rebased, send fix patch separately per Greg's feedback.
drivers/android/binder.c | 2 +-
drivers/android/binder_internal.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index bad28cf42010..5834e829f391 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -5365,7 +5365,7 @@ static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
goto err;
break;
case BINDER_SET_MAX_THREADS: {
- int max_threads;
+ u32 max_threads;
if (copy_from_user(&max_threads, ubuf,
sizeof(max_threads))) {
diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
index 7270d4d22207..5b7c80b99ae8 100644
--- a/drivers/android/binder_internal.h
+++ b/drivers/android/binder_internal.h
@@ -421,7 +421,7 @@ struct binder_proc {
struct list_head todo;
struct binder_stats stats;
struct list_head delivered_death;
- int max_threads;
+ u32 max_threads;
int requested_threads;
int requested_threads_started;
int tmp_ref;
--
2.44.0.769.g3c40516874-goog
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 6d029c25b71f2de2838a6f093ce0fa0e69336154
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041509-triangle-parlor-1783@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d029c25b71f2de2838a6f093ce0fa0e69336154 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg(a)redhat.com>
Date: Tue, 9 Apr 2024 15:38:03 +0200
Subject: [PATCH] selftests/timers/posix_timers: Reimplement
check_timer_distribution()
check_timer_distribution() runs ten threads in a busy loop and tries to
test that the kernel distributes a process posix CPU timer signal to every
thread over time.
There is not guarantee that this is true even after commit bcb7ee79029d
("posix-timers: Prefer delivery of signals to the current thread") because
that commit only avoids waking up the sleeping process leader thread, but
that has nothing to do with the actual signal delivery.
As the signal is process wide the first thread which observes sigpending
and wins the race to lock sighand will deliver the signal. Testing shows
that this hangs on a regular base because some threads never win the race.
The comment "This primarily tests that the kernel does not favour any one."
is wrong. The kernel does favour a thread which hits the timer interrupt
when CLOCK_PROCESS_CPUTIME_ID expires.
Rewrite the test so it only checks that the group leader sleeping in join()
never receives SIGALRM and the thread which burns CPU cycles receives all
signals.
In older kernels which do not have commit bcb7ee79029d ("posix-timers:
Prefer delivery of signals to the current thread") the test-case fails
immediately, the very 1st tick wakes the leader up. Otherwise it quickly
succeeds after 100 ticks.
CI testing wants to use newer selftest versions on stable kernels. In this
case the test is guaranteed to fail.
So check in the failure case whether the kernel version is less than v6.3
and skip the test result in that case.
[ tglx: Massaged change log, renamed the version check helper ]
Fixes: e797203fb3ba ("selftests/timers/posix_timers: Test delivery of signals across threads")
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240409133802.GD29396@redhat.com
diff --git a/tools/testing/selftests/kselftest.h b/tools/testing/selftests/kselftest.h
index 541bf192e30e..973b18e156b2 100644
--- a/tools/testing/selftests/kselftest.h
+++ b/tools/testing/selftests/kselftest.h
@@ -51,6 +51,7 @@
#include <stdarg.h>
#include <string.h>
#include <stdio.h>
+#include <sys/utsname.h>
#endif
#ifndef ARRAY_SIZE
@@ -388,4 +389,16 @@ static inline __printf(1, 2) int ksft_exit_skip(const char *msg, ...)
exit(KSFT_SKIP);
}
+static inline int ksft_min_kernel_version(unsigned int min_major,
+ unsigned int min_minor)
+{
+ unsigned int major, minor;
+ struct utsname info;
+
+ if (uname(&info) || sscanf(info.release, "%u.%u.", &major, &minor) != 2)
+ ksft_exit_fail_msg("Can't parse kernel version\n");
+
+ return major > min_major || (major == min_major && minor >= min_minor);
+}
+
#endif /* __KSELFTEST_H */
diff --git a/tools/testing/selftests/timers/posix_timers.c b/tools/testing/selftests/timers/posix_timers.c
index d49dd3ffd0d9..d86a0e00711e 100644
--- a/tools/testing/selftests/timers/posix_timers.c
+++ b/tools/testing/selftests/timers/posix_timers.c
@@ -184,80 +184,71 @@ static int check_timer_create(int which)
return 0;
}
-int remain;
-__thread int got_signal;
+static pthread_t ctd_thread;
+static volatile int ctd_count, ctd_failed;
-static void *distribution_thread(void *arg)
+static void ctd_sighandler(int sig)
{
- while (__atomic_load_n(&remain, __ATOMIC_RELAXED));
- return NULL;
+ if (pthread_self() != ctd_thread)
+ ctd_failed = 1;
+ ctd_count--;
}
-static void distribution_handler(int nr)
+static void *ctd_thread_func(void *arg)
{
- if (!__atomic_exchange_n(&got_signal, 1, __ATOMIC_RELAXED))
- __atomic_fetch_sub(&remain, 1, __ATOMIC_RELAXED);
-}
-
-/*
- * Test that all running threads _eventually_ receive CLOCK_PROCESS_CPUTIME_ID
- * timer signals. This primarily tests that the kernel does not favour any one.
- */
-static int check_timer_distribution(void)
-{
- int err, i;
- timer_t id;
- const int nthreads = 10;
- pthread_t threads[nthreads];
struct itimerspec val = {
.it_value.tv_sec = 0,
.it_value.tv_nsec = 1000 * 1000,
.it_interval.tv_sec = 0,
.it_interval.tv_nsec = 1000 * 1000,
};
+ timer_t id;
- remain = nthreads + 1; /* worker threads + this thread */
- signal(SIGALRM, distribution_handler);
- err = timer_create(CLOCK_PROCESS_CPUTIME_ID, NULL, &id);
- if (err < 0) {
- ksft_perror("Can't create timer");
- return -1;
- }
- err = timer_settime(id, 0, &val, NULL);
- if (err < 0) {
- ksft_perror("Can't set timer");
- return -1;
- }
+ /* 1/10 seconds to ensure the leader sleeps */
+ usleep(10000);
- for (i = 0; i < nthreads; i++) {
- err = pthread_create(&threads[i], NULL, distribution_thread,
- NULL);
- if (err) {
- ksft_print_msg("Can't create thread: %s (%d)\n",
- strerror(errno), errno);
- return -1;
- }
- }
+ ctd_count = 100;
+ if (timer_create(CLOCK_PROCESS_CPUTIME_ID, NULL, &id))
+ return "Can't create timer\n";
+ if (timer_settime(id, 0, &val, NULL))
+ return "Can't set timer\n";
- /* Wait for all threads to receive the signal. */
- while (__atomic_load_n(&remain, __ATOMIC_RELAXED));
+ while (ctd_count > 0 && !ctd_failed)
+ ;
- for (i = 0; i < nthreads; i++) {
- err = pthread_join(threads[i], NULL);
- if (err) {
- ksft_print_msg("Can't join thread: %s (%d)\n",
- strerror(errno), errno);
- return -1;
- }
- }
+ if (timer_delete(id))
+ return "Can't delete timer\n";
- if (timer_delete(id)) {
- ksft_perror("Can't delete timer");
- return -1;
- }
+ return NULL;
+}
- ksft_test_result_pass("check_timer_distribution\n");
+/*
+ * Test that only the running thread receives the timer signal.
+ */
+static int check_timer_distribution(void)
+{
+ const char *errmsg;
+
+ signal(SIGALRM, ctd_sighandler);
+
+ errmsg = "Can't create thread\n";
+ if (pthread_create(&ctd_thread, NULL, ctd_thread_func, NULL))
+ goto err;
+
+ errmsg = "Can't join thread\n";
+ if (pthread_join(ctd_thread, (void **)&errmsg) || errmsg)
+ goto err;
+
+ if (!ctd_failed)
+ ksft_test_result_pass("check signal distribution\n");
+ else if (ksft_min_kernel_version(6, 3))
+ ksft_test_result_fail("check signal distribution\n");
+ else
+ ksft_test_result_skip("check signal distribution (old kernel)\n");
return 0;
+err:
+ ksft_print_msg(errmsg);
+ return -1;
}
int main(int argc, char **argv)
Hi
We have published a survey on E-bike Drive and Deceleration System. If you have further interest in this or related reports, we are happy to share sample reports for your reference, please contact: kiko(a)vicmarketresearch.com
Some of the prominent players reviewed in the research report include:
Shimano
Bosch
Yamaha
Bafang Electric
Brose
Ananda
Aikem
TQ-Group
Panasonic
MAHLE
……
The primary objectives of this report are to provide
1) global market size and forecasts, growth rates, market dynamics, industry structure and developments, market situation, trends;
2) global market share and ranking by company;
3) comprehensive presentation of the global market for E-bike Drive and Deceleration System, with both quantitative and qualitative analysis through detailed segmentation;
4) detailed value chain analysis and review of growth factors essential for the existing market players and new entrants;
5) emerging opportunities in the market and the future impact of major drivers and restraints of the market.
Segment by Type
Mid Motor
Geared Hub Motor
Direct Drive Hub Motor
Segment by Application
OEM market
Aftermarket
More companies that not listed here are also available.
Thanks for you reading.
Each RPMh VRM accelerator resource has 3 or 4 contiguous 4-byte aligned
addresses associated with it. These control voltage, enable state, mode,
and in legacy targets, voltage headroom. The current in-flight request
checking logic looks for exact address matches. Requests for different
addresses of the same RPMh resource as thus not detected as in-flight.
Add new cmd-db API cmd_db_match_resource_addr() to enhance the in-flight
request check for VRM requests by ignoring the address offset.
This ensures that only one request is allowed to be in-flight for a given
VRM resource. This is needed to avoid scenarios where request commands are
carried out by RPMh hardware out-of-order leading to LDO regulator
over-current protection triggering.
Fixes: 658628e7ef78 ("drivers: qcom: rpmh-rsc: add RPMH controller for QCOM SoCs")
cc: stable(a)vger.kernel.org
Reviewed-by: Konrad Dybcio <konrad.dybcio(a)linaro.org>
Tested-by: Elliot Berman <quic_eberman(a)quicinc.com> # sm8650-qrd
Signed-off-by: Maulik Shah <quic_mkshah(a)quicinc.com>
---
Changes in v4:
- Simplify cmd_db_match_resource_addr()
- Remove unrelated changes to newly added logic
- Update function description comments
- Replace Signed-off-by: with Tested-by: from Elliot
- Link to v3: https://lore.kernel.org/r/20240212-rpmh-rsc-fixes-v3-1-1be0d705dbb5@quicinc…
Changes in v3:
- Fix s-o-b chain
- Add cmd-db API to compare addresses
- Reuse already defined resource types in cmd-db
- Add Fixes tag and Cc to stable
- Retain Reviewed-by tag of v2
- Link to v2: https://lore.kernel.org/r/20240119-rpmh-rsc-fixes-v2-1-e42c0a9e36f0@quicinc…
Changes in v2:
- Use GENMASK() and FIELD_GET()
- Link to v1: https://lore.kernel.org/r/20240117-rpmh-rsc-fixes-v1-1-71ee4f8f72a4@quicinc…
---
drivers/soc/qcom/cmd-db.c | 32 +++++++++++++++++++++++++++++++-
drivers/soc/qcom/rpmh-rsc.c | 3 ++-
include/soc/qcom/cmd-db.h | 10 +++++++++-
3 files changed, 42 insertions(+), 3 deletions(-)
diff --git a/drivers/soc/qcom/cmd-db.c b/drivers/soc/qcom/cmd-db.c
index a5fd68411bed..86fb2cd4f484 100644
--- a/drivers/soc/qcom/cmd-db.c
+++ b/drivers/soc/qcom/cmd-db.c
@@ -1,6 +1,10 @@
/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (c) 2016-2018, 2020, The Linux Foundation. All rights reserved. */
+/*
+ * Copyright (c) 2016-2018, 2020, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#include <linux/bitfield.h>
#include <linux/debugfs.h>
#include <linux/kernel.h>
#include <linux/module.h>
@@ -17,6 +21,8 @@
#define MAX_SLV_ID 8
#define SLAVE_ID_MASK 0x7
#define SLAVE_ID_SHIFT 16
+#define SLAVE_ID(addr) FIELD_GET(GENMASK(19, 16), addr)
+#define VRM_ADDR(addr) FIELD_GET(GENMASK(19, 4), addr)
/**
* struct entry_header: header for each entry in cmddb
@@ -220,6 +226,30 @@ const void *cmd_db_read_aux_data(const char *id, size_t *len)
}
EXPORT_SYMBOL_GPL(cmd_db_read_aux_data);
+/**
+ * cmd_db_match_resource_addr() - Compare if both Resource addresses are same
+ *
+ * @addr1: Resource address to compare
+ * @addr2: Resource address to compare
+ *
+ * Return: true if two addresses refer to the same resource, false otherwise
+ */
+bool cmd_db_match_resource_addr(u32 addr1, u32 addr2)
+{
+ /*
+ * Each RPMh VRM accelerator resource has 3 or 4 contiguous 4-byte
+ * aligned addresses associated with it. Ignore the offset to check
+ * for VRM requests.
+ */
+ if (addr1 == addr2)
+ return true;
+ else if (SLAVE_ID(addr1) == CMD_DB_HW_VRM && VRM_ADDR(addr1) == VRM_ADDR(addr2))
+ return true;
+
+ return false;
+}
+EXPORT_SYMBOL_GPL(cmd_db_match_resource_addr);
+
/**
* cmd_db_read_slave_id - Get the slave ID for a given resource address
*
diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c
index a021dc71807b..daf64be966fe 100644
--- a/drivers/soc/qcom/rpmh-rsc.c
+++ b/drivers/soc/qcom/rpmh-rsc.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2023-2024, Qualcomm Innovation Center, Inc. All rights reserved.
*/
#define pr_fmt(fmt) "%s " fmt, KBUILD_MODNAME
@@ -557,7 +558,7 @@ static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,
for_each_set_bit(j, &curr_enabled, MAX_CMDS_PER_TCS) {
addr = read_tcs_cmd(drv, drv->regs[RSC_DRV_CMD_ADDR], i, j);
for (k = 0; k < msg->num_cmds; k++) {
- if (addr == msg->cmds[k].addr)
+ if (cmd_db_match_resource_addr(msg->cmds[k].addr, addr))
return -EBUSY;
}
}
diff --git a/include/soc/qcom/cmd-db.h b/include/soc/qcom/cmd-db.h
index c8bb56e6852a..47a6cab75e63 100644
--- a/include/soc/qcom/cmd-db.h
+++ b/include/soc/qcom/cmd-db.h
@@ -1,5 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (c) 2016-2018, The Linux Foundation. All rights reserved. */
+/*
+ * Copyright (c) 2016-2018, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
#ifndef __QCOM_COMMAND_DB_H__
#define __QCOM_COMMAND_DB_H__
@@ -21,6 +24,8 @@ u32 cmd_db_read_addr(const char *resource_id);
const void *cmd_db_read_aux_data(const char *resource_id, size_t *len);
+bool cmd_db_match_resource_addr(u32 addr1, u32 addr2);
+
enum cmd_db_hw_type cmd_db_read_slave_id(const char *resource_id);
int cmd_db_ready(void);
@@ -31,6 +36,9 @@ static inline u32 cmd_db_read_addr(const char *resource_id)
static inline const void *cmd_db_read_aux_data(const char *resource_id, size_t *len)
{ return ERR_PTR(-ENODEV); }
+static inline bool cmd_db_match_resource_addr(u32 addr1, u32 addr2)
+{ return false; }
+
static inline enum cmd_db_hw_type cmd_db_read_slave_id(const char *resource_id)
{ return -ENODEV; }
---
base-commit: 615d300648869c774bd1fe54b4627bb0c20faed4
change-id: 20240210-rpmh-rsc-fixes-372a79ab364b
Best regards,
--
Maulik Shah <quic_mkshah(a)quicinc.com>
The type defined for the BINDER_SET_MAX_THREADS ioctl was changed from
size_t to __u32 in order to avoid incompatibility issues between 32 and
64-bit kernels. However, the internal types used to copy from user and
store the value were never updated. Use u32 to fix the inconsistency.
Fixes: a9350fc859ae ("staging: android: binder: fix BINDER_SET_MAX_THREADS declaration")
Reported-by: Arve Hjønnevåg <arve(a)android.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
---
drivers/android/binder.c | 2 +-
drivers/android/binder_internal.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index f120a24c9ae6..2596cbfa16d0 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -5408,7 +5408,7 @@ static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
goto err;
break;
case BINDER_SET_MAX_THREADS: {
- int max_threads;
+ u32 max_threads;
if (copy_from_user(&max_threads, ubuf,
sizeof(max_threads))) {
diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
index 221ab7a6384a..3c522698083f 100644
--- a/drivers/android/binder_internal.h
+++ b/drivers/android/binder_internal.h
@@ -426,7 +426,7 @@ struct binder_proc {
struct list_head todo;
struct binder_stats stats;
struct list_head delivered_death;
- int max_threads;
+ u32 max_threads;
int requested_threads;
int requested_threads_started;
int tmp_ref;
--
2.44.0.683.g7961c838ac-goog
There are two issues with SDHC2 configuration for SA8155P-ADP,
which prevent use of SDHC2 and causes issues with ethernet:
- Card Detect pin for SHDC2 on SA8155P-ADP is connected to gpio4 of
PMM8155AU_1, not to SoC itself. SoC's gpio4 is used for DWMAC
TX. If sdhc driver probes after dwmac driver, it reconfigures
gpio4 and this breaks Ethernet MAC.
- pinctrl configuration mentions gpio96 as CD pin. It seems it was
copied from some SM8150 example, because as mentioned above,
correct CD pin is gpio4 on PMM8155AU_1.
This patch fixes both mentioned issues by providing correct pin handle
and pinctrl configuration.
Fixes: 0deb2624e2d0 ("arm64: dts: qcom: sa8155p-adp: Add support for uSD card")
Cc: stable(a)vger.kernel.org
Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk(a)epam.com>
---
In v3:
- Moved regulator changes to a separate patch
- Renamed pinctrl node
- Moved pinctrl node so it will appear in alphabetical order
In v2:
- Added "Fixes:" tag
- CCed stable ML
- Fixed pinctrl configuration
- Extended voltage range for L13C voltage regulator
---
arch/arm64/boot/dts/qcom/sa8155p-adp.dts | 30 ++++++++++--------------
1 file changed, 13 insertions(+), 17 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/sa8155p-adp.dts b/arch/arm64/boot/dts/qcom/sa8155p-adp.dts
index 5e4287f8c8cd1..b2cf2c988336c 100644
--- a/arch/arm64/boot/dts/qcom/sa8155p-adp.dts
+++ b/arch/arm64/boot/dts/qcom/sa8155p-adp.dts
@@ -367,6 +367,16 @@ queue0 {
};
};
+&pmm8155au_1_gpios {
+ pmm8155au_1_sdc2_cd: sdc2-cd-default-state {
+ pins = "gpio4";
+ function = "normal";
+ input-enable;
+ bias-pull-up;
+ power-source = <0>;
+ };
+};
+
&qupv3_id_1 {
status = "okay";
};
@@ -384,10 +394,10 @@ &remoteproc_cdsp {
&sdhc_2 {
status = "okay";
- cd-gpios = <&tlmm 4 GPIO_ACTIVE_LOW>;
+ cd-gpios = <&pmm8155au_1_gpios 4 GPIO_ACTIVE_LOW>;
pinctrl-names = "default", "sleep";
- pinctrl-0 = <&sdc2_on>;
- pinctrl-1 = <&sdc2_off>;
+ pinctrl-0 = <&sdc2_on &pmm8155au_1_sdc2_cd>;
+ pinctrl-1 = <&sdc2_off &pmm8155au_1_sdc2_cd>;
vqmmc-supply = <&vreg_l13c_2p96>; /* IO line power */
vmmc-supply = <&vreg_l17a_2p96>; /* Card power line */
bus-width = <4>;
@@ -505,13 +515,6 @@ data-pins {
bias-pull-up; /* pull up */
drive-strength = <16>; /* 16 MA */
};
-
- sd-cd-pins {
- pins = "gpio96";
- function = "gpio";
- bias-pull-up; /* pull up */
- drive-strength = <2>; /* 2 MA */
- };
};
sdc2_off: sdc2-off-state {
@@ -532,13 +535,6 @@ data-pins {
bias-pull-up; /* pull up */
drive-strength = <2>; /* 2 MA */
};
-
- sd-cd-pins {
- pins = "gpio96";
- function = "gpio";
- bias-pull-up; /* pull up */
- drive-strength = <2>; /* 2 MA */
- };
};
usb2phy_ac_en1_default: usb2phy-ac-en1-default-state {
--
2.44.0
Since some version of Linux (I think since some 5.x), my computer
started to turn off randomly, especially when under a load. A workaround
is to put it into power saving mode (using Power in Gnome Settings).
This makes it almost 2 times slower.
Unmodified Asus X515EA. I recently updated BIOS to Asus's version 315,
but this didn't help.
Previously we cleared it from dust and replaced the thermal paste, but
that didn't fix the problem.
I am attaching dmidecode.
In my previous email, I forgot to tell Linux version:
$ uname -a
Linux victor 6.5.0-28-generic #29-Ubuntu SMP PREEMPT_DYNAMIC Thu Mar 28
23:46:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Hi,
The Framework 13 quirk for s2idle got extended to cover another BIOS
version very recently.
Can you please backport this to 6.6.y and later as well?
f609e7b1b49e ("platform/x86/amd/pmc: Extend Framework 13 quirk to more
BIOSes")
Thanks,
Hello.
These are the remaining bugfix patches for the MT7530 DSA subdriver. They
didn't apply as is to the 5.15 stable tree so I have submitted the adjusted
versions in this thread. Please apply them in the order they were
submitted.
Signed-off-by: Arınç ÜNAL <arinc.unal(a)arinc9.com>
---
Arınç ÜNAL (3):
net: dsa: mt7530: set all CPU ports in MT7531_CPU_PMAP
net: dsa: mt7530: fix improper frames on all 25MHz and 40MHz XTAL MT7530
net: dsa: mt7530: fix enabling EEE on MT7531 switch on all boards
Vladimir Oltean (1):
net: dsa: introduce preferred_default_local_cpu_port and use on MT7530
drivers/net/dsa/mt7530.c | 54 ++++++++++++++++++++++++++++++++++--------------
drivers/net/dsa/mt7530.h | 2 ++
include/net/dsa.h | 8 +++++++
net/dsa/dsa2.c | 24 ++++++++++++++++++++-
4 files changed, 71 insertions(+), 17 deletions(-)
---
base-commit: c52b9710c83d3b8ab63bb217cc7c8b61e13f12cd
change-id: 20240420-for-stable-5-15-backports-d1fca1fee8c9
Best regards,
--
Arınç ÜNAL <arinc.unal(a)arinc9.com>
On 4/19/24 2:47 PM, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> ravb: Group descriptor types used in Rx ring
>
> to the 6.8-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> ravb-group-descriptor-types-used-in-rx-ring.patch
> and it can be found in the queue-6.8 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit fb17fd565be203e2aa62544a586a72430c457751
> Author: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
> Date: Mon Mar 4 12:08:53 2024 +0100
>
> ravb: Group descriptor types used in Rx ring
>
> [ Upstream commit 4123c3fbf8632e5c553222bf1c10b3a3e0a8dc06 ]
>
> The Rx ring can either be made up of normal or extended descriptors, not
> a mix of the two at the same time. Make this explicit by grouping the
> two variables in a rx_ring union.
>
> The extension of the storage for more than one queue of normal
> descriptors from a single to NUM_RX_QUEUE queues have no practical
> effect. But aids in making the code readable as the code that uses it
> already piggyback on other members of struct ravb_private that are
> arrays of max length NUM_RX_QUEUE, e.g. rx_desc_dma. This will also make
> further refactoring easier.
>
> While at it, rename the normal descriptor Rx ring to make it clear it's
> not strictly related to the GbEthernet E-MAC IP found in RZ/G2L, normal
> descriptors could be used on R-Car SoCs too.
>
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas(a)ragnatech.se>
> Reviewed-by: Paul Barker <paul.barker.ct(a)bp.renesas.com>
> Reviewed-by: Sergey Shtylyov <s.shtylyov(a)omp.ru>
> Signed-off-by: David S. Miller <davem(a)davemloft.net>
> Stable-dep-of: def52db470df ("net: ravb: Count packets instead of descriptors in R-Car RX path")
Hm, I doubt this patch is actually necessary here...
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[...]
MBR, Sergey
Since some version of Linux (I think since some 5.x), my computer
started to turn off randomly, especially when under a load. A workaround
is to put it into power saving mode (using Power in Gnome Settings).
This makes it almost 2 times slower.
Unmodified Asus X515EA. I recently updated BIOS to Asus's version 315,
but this didn't help.
Previously we cleared it from dust and replaced the thermal paste, but
that didn't fix the problem.
I am attaching dmidecode.
The patch titled
Subject: mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-fix-debug_locks_warn_on1-when-dissolve_free_hugetlb_folio.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
Date: Fri, 19 Apr 2024 16:58:19 +0800
When I did memory failure tests recently, below warning occurs:
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 8 PID: 1011 at kernel/locking/lockdep.c:232 __lock_acquire+0xccb/0x1ca0
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
FS: 00007ff9f32aa740(0000) GS:ffffa1ce5fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff9f3134ba0 CR3: 00000008484e4000 CR4: 00000000000006f0
Call Trace:
<TASK>
lock_acquire+0xbe/0x2d0
_raw_spin_lock_irqsave+0x3a/0x60
hugepage_subpool_put_pages.part.0+0xe/0xc0
free_huge_folio+0x253/0x3f0
dissolve_free_huge_page+0x147/0x210
__page_handle_poison+0x9/0x70
memory_failure+0x4e6/0x8c0
hard_offline_page_store+0x55/0xa0
kernfs_fop_write_iter+0x12c/0x1d0
vfs_write+0x380/0x540
ksys_write+0x64/0xe0
do_syscall_64+0xbc/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
</TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
panic+0x326/0x350
check_panic_on_warn+0x4f/0x50
__warn+0x98/0x190
report_bug+0x18e/0x1a0
handle_bug+0x3d/0x70
exc_invalid_op+0x18/0x70
asm_exc_invalid_op+0x1a/0x20
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
lock_acquire+0xbe/0x2d0
_raw_spin_lock_irqsave+0x3a/0x60
hugepage_subpool_put_pages.part.0+0xe/0xc0
free_huge_folio+0x253/0x3f0
dissolve_free_huge_page+0x147/0x210
__page_handle_poison+0x9/0x70
memory_failure+0x4e6/0x8c0
hard_offline_page_store+0x55/0xa0
kernfs_fop_write_iter+0x12c/0x1d0
vfs_write+0x380/0x540
ksys_write+0x64/0xe0
do_syscall_64+0xbc/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
</TASK>
After git bisecting and digging into the code, I believe the root cause is
that _deferred_list field of folio is unioned with _hugetlb_subpool field.
In __update_and_free_hugetlb_folio(), folio->_deferred_list is
initialized leading to corrupted folio->_hugetlb_subpool when folio is
hugetlb. Later free_huge_folio() will use _hugetlb_subpool and above
warning happens.
But it is assumed hugetlb flag must have been cleared when calling
folio_put() in update_and_free_hugetlb_folio(). This assumption is broken
due to below race:
CPU1 CPU2
dissolve_free_huge_page update_and_free_pages_bulk
update_and_free_hugetlb_folio hugetlb_vmemmap_restore_folios
folio_clear_hugetlb_vmemmap_optimized
clear_flag = folio_test_hugetlb_vmemmap_optimized
if (clear_flag) <-- False, it's already cleared.
__folio_clear_hugetlb(folio) <-- Hugetlb is not cleared.
folio_put
free_huge_folio <-- free_the_page is expected.
list_for_each_entry()
__folio_clear_hugetlb <-- Too late.
Fix this issue by checking whether folio is hugetlb directly instead of
checking clear_flag to close the race window.
Link: https://lkml.kernel.org/r/20240419085819.1901645-1-linmiaohe@huawei.com
Fixes: 32c877191e02 ("hugetlb: do not clear hugetlb dtor until allocating vmemmap")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-debug_locks_warn_on1-when-dissolve_free_hugetlb_folio
+++ a/mm/hugetlb.c
@@ -1781,7 +1781,7 @@ static void __update_and_free_hugetlb_fo
* If vmemmap pages were allocated above, then we need to clear the
* hugetlb destructor under the hugetlb lock.
*/
- if (clear_dtor) {
+ if (folio_test_hugetlb(folio)) {
spin_lock_irq(&hugetlb_lock);
__clear_hugetlb_destructor(h, folio);
spin_unlock_irq(&hugetlb_lock);
_
Patches currently in -mm which might be from linmiaohe(a)huawei.com are
mm-hugetlb-fix-debug_locks_warn_on1-when-dissolve_free_hugetlb_folio.patch
(struct dirty_throttle_control *)->thresh is an unsigned long, but is
passed as the u32 divisor argument to div_u64(). On architectures where
unsigned long is 64 bytes, the argument will be implicitly truncated.
Use div64_u64() instead of div_u64() so that the value used in the "is
this a safe division" check is the same as the divisor.
Also, remove redundant cast of the numerator to u64, as that should
happen implicitly.
This would be difficult to exploit in memcg domain, given the
ratio-based arithmetic domain_drity_limits() uses, but is much easier in
global writeback domain with a BDI_CAP_STRICTLIMIT-backing device, using
e.g. vm.dirty_bytes=(1<<32)*PAGE_SIZE so that dtc->thresh == (1<<32)
Fixes: f6789593d5ce ("mm/page-writeback.c: fix divide by zero in bdi_dirty_limits()")
Cc: Maxim Patlasov <MPatlasov(a)parallels.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Zach O'Keefe <zokeefe(a)google.com>
---
mm/page-writeback.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index cd4e4ae77c40a..02147b61712bc 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1638,7 +1638,7 @@ static inline void wb_dirty_limits(struct dirty_throttle_control *dtc)
*/
dtc->wb_thresh = __wb_calc_thresh(dtc);
dtc->wb_bg_thresh = dtc->thresh ?
- div_u64((u64)dtc->wb_thresh * dtc->bg_thresh, dtc->thresh) : 0;
+ div64_u64(dtc->wb_thresh * dtc->bg_thresh, dtc->thresh) : 0;
/*
* In order to avoid the stacked BDI deadlock we need
--
2.43.0.429.g432eaa2c6b-goog
Hi Nam,
+CC stable( heads up as this is a regression affecting 5.15.y and
probably others, Greg: this was reproducible upstream so reported
everything w.r.t upstream code but initially found on 5.15.y)
On 19/04/24 20:29, Nam Cao wrote:
> On 2024-04-18 Harshit Mogalapalli wrote:
>> While fuzzing 5.15.y kernel with Syzkaller, we noticed a INFO: task hung
>> bug in fb_deferred_io_work()
>
> I think the problem is because of improper offset address calculation.
> The kernel calculate address offset with:
> offset = vmf->address - vmf->vma->vm_start
>
> Now the problem is that your C program mmap the framebuffer at 2
> different offsets:
> mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
> mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0);
>
> but the kernel doesn't take these different offsets into account.
> So, 2 different pages are mistakenly recognized as the same page.
>
> Can you try the following patch?
This patch works well against the reproducer, this simplified repro and
the longer repro which syzkaller generated couldn't trigger any hang
with the below patch applied.
Thanks,
Harshit
>
> Best regards,
> Nam
>
> diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
> index dae96c9f61cf..d5d6cd9e8b29 100644
> --- a/drivers/video/fbdev/core/fb_defio.c
> +++ b/drivers/video/fbdev/core/fb_defio.c
> @@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
> */
> static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
> {
> - unsigned long offset = vmf->address - vmf->vma->vm_start;
> + unsigned long offset = vmf->address - vmf->vma->vm_start
> + + (vmf->vma->vm_pgoff << PAGE_SHIFT);
> struct page *page = vmf->page;
>
> file_update_time(vmf->vma->vm_file);
>
>
commit 3aadf100f93d80815685493d60cd8cab206403df upstream.
This reverts commit ad6bcdad2b6724e113f191a12f859a9e8456b26d. I had
nak'd it, and Greg said on the thread that it links that he wasn't going
to take it either, especially since it's not his code or his tree, but
then, seemingly accidentally, it got pushed up some months later, in
what looks like a mistake, with no further discussion in the linked
thread. So revert it, since it's clearly not intended.
Fixes: ad6bcdad2b67 ("vmgenid: emit uevent when VMGENID updates")
Cc: stable(a)vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Link: https://lore.kernel.org/r/20230531095119.11202-2-bchalios@amazon.es
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/virt/vmgenid.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
index b67a28da4702..a1c467a0e9f7 100644
--- a/drivers/virt/vmgenid.c
+++ b/drivers/virt/vmgenid.c
@@ -68,7 +68,6 @@ static int vmgenid_add(struct acpi_device *device)
static void vmgenid_notify(struct acpi_device *device, u32 event)
{
struct vmgenid_state *state = acpi_driver_data(device);
- char *envp[] = { "NEW_VMGENID=1", NULL };
u8 old_id[VMGENID_SIZE];
memcpy(old_id, state->this_id, sizeof(old_id));
@@ -76,7 +75,6 @@ static void vmgenid_notify(struct acpi_device *device, u32 event)
if (!memcmp(old_id, state->this_id, sizeof(old_id)))
return;
add_vmfork_randomness(state->this_id, sizeof(state->this_id));
- kobject_uevent_env(&device->dev.kobj, KOBJ_CHANGE, envp);
}
static const struct acpi_device_id vmgenid_ids[] = {
--
2.44.0
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e871abcda3b67d0820b4182ebe93435624e9c6a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041901-roundish-flint-1143@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
e871abcda3b6 ("random: handle creditable entropy from atomic process context")
bbc7e1bed1f5 ("random: add back async readiness notifier")
9148de3196ed ("random: reseed in delayed work rather than on-demand")
f62384995e4c ("random: split initialization into early step and later step")
745558f95885 ("random: use hwgenerator randomness more frequently at early boot")
228dfe98a313 ("Merge tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e871abcda3b67d0820b4182ebe93435624e9c6a4 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Wed, 17 Apr 2024 13:38:29 +0200
Subject: [PATCH] random: handle creditable entropy from atomic process context
The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.
Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.
This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:
[<ffffffd613025ba0>] die+0xa8/0x2fc
[<ffffffd613027428>] bug_handler+0x44/0xec
[<ffffffd613016964>] brk_handler+0x90/0x144
[<ffffffd613041e58>] do_debug_exception+0xa0/0x148
[<ffffffd61400c208>] el1_dbg+0x60/0x7c
[<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
[<ffffffd613011294>] el1h_64_sync+0x64/0x6c
[<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
[<ffffffd613102b54>] __might_sleep+0x44/0x7c
[<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
[<ffffffd6132c2820>] static_key_enable+0x14/0x38
[<ffffffd61400ac08>] crng_set_ready+0x14/0x28
[<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
[<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
[<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
[<ffffffd613857e54>] add_input_randomness+0x38/0x48
[<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
[<ffffffd613a81310>] input_event+0x6c/0x98
According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.
So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.
Reported-by: Guoyong Wang <guoyong.wang(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 456be28ba67c..2597cb43f438 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
static void __cold _credit_init_bits(size_t bits)
{
- static struct execute_work set_ready;
+ static DECLARE_WORK(set_ready, crng_set_ready);
unsigned int new, orig, add;
unsigned long flags;
@@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
- if (static_key_initialized)
- execute_in_process_context(crng_set_ready, &set_ready);
+ if (static_key_initialized && system_unbound_wq)
+ queue_work(system_unbound_wq, &set_ready);
atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
wake_up_interruptible(&crng_init_wait);
kill_fasync(&fasync, SIGIO, POLL_IN);
@@ -890,8 +890,8 @@ void __init random_init(void)
/*
* If we were initialized by the cpu or bootloader before jump labels
- * are initialized, then we should enable the static branch here, where
- * it's guaranteed that jump labels have been initialized.
+ * or workqueues are initialized, then we should enable the static
+ * branch here, where it's guaranteed that these have been initialized.
*/
if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
crng_set_ready(NULL);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e871abcda3b67d0820b4182ebe93435624e9c6a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041903-pentagon-deceiver-fe1d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
e871abcda3b6 ("random: handle creditable entropy from atomic process context")
bbc7e1bed1f5 ("random: add back async readiness notifier")
9148de3196ed ("random: reseed in delayed work rather than on-demand")
f62384995e4c ("random: split initialization into early step and later step")
745558f95885 ("random: use hwgenerator randomness more frequently at early boot")
228dfe98a313 ("Merge tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e871abcda3b67d0820b4182ebe93435624e9c6a4 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Wed, 17 Apr 2024 13:38:29 +0200
Subject: [PATCH] random: handle creditable entropy from atomic process context
The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.
Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.
This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:
[<ffffffd613025ba0>] die+0xa8/0x2fc
[<ffffffd613027428>] bug_handler+0x44/0xec
[<ffffffd613016964>] brk_handler+0x90/0x144
[<ffffffd613041e58>] do_debug_exception+0xa0/0x148
[<ffffffd61400c208>] el1_dbg+0x60/0x7c
[<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
[<ffffffd613011294>] el1h_64_sync+0x64/0x6c
[<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
[<ffffffd613102b54>] __might_sleep+0x44/0x7c
[<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
[<ffffffd6132c2820>] static_key_enable+0x14/0x38
[<ffffffd61400ac08>] crng_set_ready+0x14/0x28
[<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
[<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
[<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
[<ffffffd613857e54>] add_input_randomness+0x38/0x48
[<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
[<ffffffd613a81310>] input_event+0x6c/0x98
According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.
So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.
Reported-by: Guoyong Wang <guoyong.wang(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 456be28ba67c..2597cb43f438 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
static void __cold _credit_init_bits(size_t bits)
{
- static struct execute_work set_ready;
+ static DECLARE_WORK(set_ready, crng_set_ready);
unsigned int new, orig, add;
unsigned long flags;
@@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
- if (static_key_initialized)
- execute_in_process_context(crng_set_ready, &set_ready);
+ if (static_key_initialized && system_unbound_wq)
+ queue_work(system_unbound_wq, &set_ready);
atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
wake_up_interruptible(&crng_init_wait);
kill_fasync(&fasync, SIGIO, POLL_IN);
@@ -890,8 +890,8 @@ void __init random_init(void)
/*
* If we were initialized by the cpu or bootloader before jump labels
- * are initialized, then we should enable the static branch here, where
- * it's guaranteed that jump labels have been initialized.
+ * or workqueues are initialized, then we should enable the static
+ * branch here, where it's guaranteed that these have been initialized.
*/
if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
crng_set_ready(NULL);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x e871abcda3b67d0820b4182ebe93435624e9c6a4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041905-obvious-blush-10d8@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
e871abcda3b6 ("random: handle creditable entropy from atomic process context")
bbc7e1bed1f5 ("random: add back async readiness notifier")
9148de3196ed ("random: reseed in delayed work rather than on-demand")
f62384995e4c ("random: split initialization into early step and later step")
745558f95885 ("random: use hwgenerator randomness more frequently at early boot")
228dfe98a313 ("Merge tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e871abcda3b67d0820b4182ebe93435624e9c6a4 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Wed, 17 Apr 2024 13:38:29 +0200
Subject: [PATCH] random: handle creditable entropy from atomic process context
The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.
Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.
This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:
[<ffffffd613025ba0>] die+0xa8/0x2fc
[<ffffffd613027428>] bug_handler+0x44/0xec
[<ffffffd613016964>] brk_handler+0x90/0x144
[<ffffffd613041e58>] do_debug_exception+0xa0/0x148
[<ffffffd61400c208>] el1_dbg+0x60/0x7c
[<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
[<ffffffd613011294>] el1h_64_sync+0x64/0x6c
[<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
[<ffffffd613102b54>] __might_sleep+0x44/0x7c
[<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
[<ffffffd6132c2820>] static_key_enable+0x14/0x38
[<ffffffd61400ac08>] crng_set_ready+0x14/0x28
[<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
[<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
[<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
[<ffffffd613857e54>] add_input_randomness+0x38/0x48
[<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
[<ffffffd613a81310>] input_event+0x6c/0x98
According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.
So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.
Reported-by: Guoyong Wang <guoyong.wang(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 456be28ba67c..2597cb43f438 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -702,7 +702,7 @@ static void extract_entropy(void *buf, size_t len)
static void __cold _credit_init_bits(size_t bits)
{
- static struct execute_work set_ready;
+ static DECLARE_WORK(set_ready, crng_set_ready);
unsigned int new, orig, add;
unsigned long flags;
@@ -718,8 +718,8 @@ static void __cold _credit_init_bits(size_t bits)
if (orig < POOL_READY_BITS && new >= POOL_READY_BITS) {
crng_reseed(NULL); /* Sets crng_init to CRNG_READY under base_crng.lock. */
- if (static_key_initialized)
- execute_in_process_context(crng_set_ready, &set_ready);
+ if (static_key_initialized && system_unbound_wq)
+ queue_work(system_unbound_wq, &set_ready);
atomic_notifier_call_chain(&random_ready_notifier, 0, NULL);
wake_up_interruptible(&crng_init_wait);
kill_fasync(&fasync, SIGIO, POLL_IN);
@@ -890,8 +890,8 @@ void __init random_init(void)
/*
* If we were initialized by the cpu or bootloader before jump labels
- * are initialized, then we should enable the static branch here, where
- * it's guaranteed that jump labels have been initialized.
+ * or workqueues are initialized, then we should enable the static
+ * branch here, where it's guaranteed that these have been initialized.
*/
if (!static_branch_likely(&crng_is_ready) && crng_init >= CRNG_READY)
crng_set_ready(NULL);
There is a recent report on UFFDIO_COPY over hugetlb:
https://lore.kernel.org/all/000000000000ee06de0616177560@google.com/
350: lockdep_assert_held(&hugetlb_lock);
Should be an issue in hugetlb but triggered in an userfault context, where
it goes into the unlikely path where two threads modifying the resv map
together. Mike has a fix in that path for resv uncharge but it looks like
the locking criteria was overlooked: hugetlb_cgroup_uncharge_folio_rsvd()
will update the cgroup pointer, so it requires to be called with the lock
held.
Looks like a stable material, so have it copied.
Reported-by: syzbot+4b8077a5fccc61c385a1(a)syzkaller.appspotmail.com
Cc: Mina Almasry <almasrymina(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: linux-stable <stable(a)vger.kernel.org>
Fixes: 79aa925bf239 ("hugetlb_cgroup: fix reservation accounting")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
---
mm/hugetlb.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 26ab9dfc7d63..3158a55ce567 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3247,9 +3247,12 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
rsv_adjust = hugepage_subpool_put_pages(spool, 1);
hugetlb_acct_memory(h, -rsv_adjust);
- if (deferred_reserve)
+ if (deferred_reserve) {
+ spin_lock_irq(&hugetlb_lock);
hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
pages_per_huge_page(h), folio);
+ spin_unlock_irq(&hugetlb_lock);
+ }
}
if (!memcg_charge_ret)
--
2.44.0
Hi,
Roland Rosenfeld reported in Debian a regression after the update to
the 6.1.85 based kernel, with his USB ethernet device not anymore
able to use the usb ethernet names.
https://bugs.debian.org/1069082
it is somehow linked to the already reported regression
https://lore.kernel.org/regressions/ZhFl6xueHnuVHKdp@nuc/ but has
another aspect. I'm quoting his original report:
> Dear Maintainer,
>
> when upgrading from 6.1.76-1 to 6.1.85-1 my USB ethernet device
> ID 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet
> is no longer named enx00249bXXXXXX but eth0.
>
> I see the following in dmsg:
>
> [ 1.484345] usb 4-5: Manufacturer: ASIX Elec. Corp.
> [ 1.484661] usb 4-5: SerialNumber: 0000249BXXXXXX
> [ 1.496312] ax88179_178a 4-5:1.0 eth0: register 'ax88179_178a' at usb-0000:00:14.0-5, ASIX AX88179 USB 3.0 Gigabit Ethernet, d2:60:4c:YY:YY:YY
> [ 1.497746] usbcore: registered new interface driver ax88179_178a
>
> Unplugging and plugging again does not solve the issue, but the
> interface still is named eth0.
>
> Maybe it has to do with the following commit from
> https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.1.85
>
> commit fc77240f6316d17fc58a8881927c3732b1d75d51
> Author: Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
> Date: Wed Apr 3 15:21:58 2024 +0200
>
> net: usb: ax88179_178a: avoid the interface always configured as random address
>
> commit 2e91bb99b9d4f756e92e83c4453f894dda220f09 upstream.
>
> After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two
> consecutive device resets"), reset is not executed from bind operation and
> mac address is not read from the device registers or the devicetree at that
> moment. Since the check to configure if the assigned mac address is random
> or not for the interface, happens after the bind operation from
> usbnet_probe, the interface keeps configured as random address, although the
> address is correctly read and set during open operation (the only reset
> now).
>
> In order to keep only one reset for the device and to avoid the interface
> always configured as random address, after reset, configure correctly the
> suitable field from the driver, if the mac address is read successfully from
> the device registers or the devicetree. Take into account if a locally
> administered address (random) was previously stored.
>
> cc: stable(a)vger.kernel.org # 6.6+
> Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets")
> Reported-by: Dave Stevenson <dave.stevenson(a)raspberrypi.com>
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
> Reviewed-by: Simon Horman <horms(a)kernel.org>
> Link: https://lore.kernel.org/r/20240403132158.344838-1-jtornosm@redhat.com
> Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
>
> Seems, that I'm not alone with this issue, there are also reports in
> https://www.reddit.com/r/debian/comments/1c304xn/linuximageamd64_61851_usb_…
> and https://infosec.space/@topher/112276500329020316
>
>
> All other (pci based) network interfaces still use there static names
> (enp0s25, enp2s0, enp3s0), only the usb ethernet name is broken with
> the new kernel.
>
> Greetings
> Roland
Roland confirmed that reverting both fc77240f6316 ("net: usb:
ax88179_178a: avoid the interface always configured as random
address") and 5c4cbec5106d ("net: usb: ax88179_178a: avoid two
consecutive device resets") fixes the problem.
Confirmation: https://bugs.debian.org/1069082#27
Regards,
Salvatore
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 325f3fb551f8cd672dbbfc4cf58b14f9ee3fc9e8
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041524-monoxide-kilobyte-1c44@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
325f3fb551f8 ("kprobes: Fix possible use-after-free issue on kprobe registration")
1efda38d6f9b ("kprobes: Prohibit probes in gate area")
28f6c37a2910 ("kprobes: Forbid probing on trampoline and BPF code areas")
223a76b268c9 ("kprobes: Fix coding style issues")
9c89bb8e3272 ("kprobes: treewide: Cleanup the error messages for kprobes")
02afb8d6048d ("kprobe: Simplify prepare_kprobe() by dropping redundant version")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 325f3fb551f8cd672dbbfc4cf58b14f9ee3fc9e8 Mon Sep 17 00:00:00 2001
From: Zheng Yejian <zhengyejian1(a)huawei.com>
Date: Wed, 10 Apr 2024 09:58:02 +0800
Subject: [PATCH] kprobes: Fix possible use-after-free issue on kprobe
registration
When unloading a module, its state is changing MODULE_STATE_LIVE ->
MODULE_STATE_GOING -> MODULE_STATE_UNFORMED. Each change will take
a time. `is_module_text_address()` and `__module_text_address()`
works with MODULE_STATE_LIVE and MODULE_STATE_GOING.
If we use `is_module_text_address()` and `__module_text_address()`
separately, there is a chance that the first one is succeeded but the
next one is failed because module->state becomes MODULE_STATE_UNFORMED
between those operations.
In `check_kprobe_address_safe()`, if the second `__module_text_address()`
is failed, that is ignored because it expected a kernel_text address.
But it may have failed simply because module->state has been changed
to MODULE_STATE_UNFORMED. In this case, arm_kprobe() will try to modify
non-exist module text address (use-after-free).
To fix this problem, we should not use separated `is_module_text_address()`
and `__module_text_address()`, but use only `__module_text_address()`
once and do `try_module_get(module)` which is only available with
MODULE_STATE_LIVE.
Link: https://lore.kernel.org/all/20240410015802.265220-1-zhengyejian1@huawei.com/
Fixes: 28f6c37a2910 ("kprobes: Forbid probing on trampoline and BPF code areas")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 9d9095e81792..65adc815fc6e 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1567,10 +1567,17 @@ static int check_kprobe_address_safe(struct kprobe *p,
jump_label_lock();
preempt_disable();
- /* Ensure it is not in reserved area nor out of text */
- if (!(core_kernel_text((unsigned long) p->addr) ||
- is_module_text_address((unsigned long) p->addr)) ||
- in_gate_area_no_mm((unsigned long) p->addr) ||
+ /* Ensure the address is in a text area, and find a module if exists. */
+ *probed_mod = NULL;
+ if (!core_kernel_text((unsigned long) p->addr)) {
+ *probed_mod = __module_text_address((unsigned long) p->addr);
+ if (!(*probed_mod)) {
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+ /* Ensure it is not in reserved area. */
+ if (in_gate_area_no_mm((unsigned long) p->addr) ||
within_kprobe_blacklist((unsigned long) p->addr) ||
jump_label_text_reserved(p->addr, p->addr) ||
static_call_text_reserved(p->addr, p->addr) ||
@@ -1580,8 +1587,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
goto out;
}
- /* Check if 'p' is probing a module. */
- *probed_mod = __module_text_address((unsigned long) p->addr);
+ /* Get module refcount and reject __init functions for loaded modules. */
if (*probed_mod) {
/*
* We must hold a refcount of the probed module while updating
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 64620e0a1e712a778095bd35cbb277dc2259281f Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Tue, 11 Jan 2022 14:43:41 +0000
Subject: [PATCH] bpf: Fix out of bounds access for ringbuf helpers
Both bpf_ringbuf_submit() and bpf_ringbuf_discard() have ARG_PTR_TO_ALLOC_MEM
in their bpf_func_proto definition as their first argument. They both expect
the result from a prior bpf_ringbuf_reserve() call which has a return type of
RET_PTR_TO_ALLOC_MEM_OR_NULL.
Meaning, after a NULL check in the code, the verifier will promote the register
type in the non-NULL branch to a PTR_TO_MEM and in the NULL branch to a known
zero scalar. Generally, pointer arithmetic on PTR_TO_MEM is allowed, so the
latter could have an offset.
The ARG_PTR_TO_ALLOC_MEM expects a PTR_TO_MEM register type. However, the non-
zero result from bpf_ringbuf_reserve() must be fed into either bpf_ringbuf_submit()
or bpf_ringbuf_discard() but with the original offset given it will then read
out the struct bpf_ringbuf_hdr mapping.
The verifier missed to enforce a zero offset, so that out of bounds access
can be triggered which could be used to escalate privileges if unprivileged
BPF was enabled (disabled by default in kernel).
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: <tr3e.wang(a)gmail.com> (SecCoder Security Lab)
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e0b3f4d683eb..c72c57a6684f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5318,9 +5318,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
case PTR_TO_BUF:
case PTR_TO_BUF | MEM_RDONLY:
case PTR_TO_STACK:
+ /* Some of the argument types nevertheless require a
+ * zero register offset.
+ */
+ if (arg_type == ARG_PTR_TO_ALLOC_MEM)
+ goto force_off_check;
break;
/* All the rest must be rejected: */
default:
+force_off_check:
err = __check_ptr_off_reg(env, reg, regno,
type == PTR_TO_BTF_ID);
if (err < 0)
This is the start of the stable review cycle for the 6.6.28 release.
There are 122 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 17 Apr 2024 14:19:30 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.28-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.6.28-rc1
Fudongwang <fudong.wang(a)amd.com>
drm/amd/display: fix disable otg wa logic in DCN316
Harry Wentland <harry.wentland(a)amd.com>
drm/amd/display: Set VSC SDP Colorimetry same way for MST and SST
Harry Wentland <harry.wentland(a)amd.com>
drm/amd/display: Program VSC SDP colorimetry for all DP sinks >= 1.4
Tim Huang <Tim.Huang(a)amd.com>
drm/amdgpu: fix incorrect number of active RBs for gfx11
Alex Deucher <alexander.deucher(a)amd.com>
drm/amdgpu: always force full reset for SOC21
Lijo Lazar <lijo.lazar(a)amd.com>
drm/amdgpu: Reset dGPU if suspend got aborted
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915: Disable port sync when bigjoiner is used
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915/cdclk: Fix CDCLK programming order when pipes are active
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Replace CONFIG_SPECTRE_BHI_{ON,OFF} with CONFIG_MITIGATION_SPECTRE_BHI
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Clarify that syscall hardening isn't a BHI mitigation
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Fix BHI handling of RRSBA
Ingo Molnar <mingo(a)kernel.org>
x86/bugs: Rename various 'ia32_cap' variables to 'x86_arch_cap_msr'
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Cache the value of MSR_IA32_ARCH_CAPABILITIES
Josh Poimboeuf <jpoimboe(a)kernel.org>
x86/bugs: Fix BHI documentation
Daniel Sneddon <daniel.sneddon(a)linux.intel.com>
x86/bugs: Fix return type of spectre_bhi_state()
Arnd Bergmann <arnd(a)arndb.de>
irqflags: Explicitly ignore lockdep_hrtimer_exit() argument
Adam Dunlap <acdunlap(a)google.com>
x86/apic: Force native_apic_mem_read() to use the MOV instruction
John Stultz <jstultz(a)google.com>
selftests: timers: Fix abs() warning in posix_timers test
Sean Christopherson <seanjc(a)google.com>
x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n
Namhyung Kim <namhyung(a)kernel.org>
perf/x86: Fix out of range data
Gavin Shan <gshan(a)redhat.com>
vhost: Add smp_rmb() in vhost_enable_notify()
Gavin Shan <gshan(a)redhat.com>
vhost: Add smp_rmb() in vhost_vq_avail_empty()
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-dma: fix spi lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-lsio: fix pwm lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-conn: fix usb lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-dma: fix adc lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-dma: fix can lpcg indices
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8qm-ss-dma: fix can lpcg indices
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/client: Fully protect modes[] with dev->mode_config.mutex
Boris Brezillon <boris.brezillon(a)collabora.com>
drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr()
Jammy Huang <jammy_huang(a)aspeedtech.com>
drm/ast: Fix soft lockup
Harish Kasiviswanathan <Harish.Kasiviswanathan(a)amd.com>
drm/amdkfd: Reset GPU on queue preemption failure
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
drm/i915/vrr: Disable VRR when using bigjoiner
Zack Rusin <zack.rusin(a)broadcom.com>
drm/vmwgfx: Enable DMA mappings with SEV
Jacek Lawrynowicz <jacek.lawrynowicz(a)linux.intel.com>
accel/ivpu: Fix deadlock in context_xa
Alexander Wetzel <Alexander(a)wetzel-home.de>
scsi: sg: Avoid race in error handling & drop bogus warn
Alexander Wetzel <Alexander(a)wetzel-home.de>
scsi: sg: Avoid sg device teardown race
Zheng Yejian <zhengyejian1(a)huawei.com>
kprobes: Fix possible use-after-free issue on kprobe registration
Pavel Begunkov <asml.silence(a)gmail.com>
io_uring/net: restore msg_control on sendzc retry
Boris Burkov <boris(a)bur.io>
btrfs: qgroup: convert PREALLOC to PERTRANS after record_root_in_trans
Boris Burkov <boris(a)bur.io>
btrfs: record delayed inode root in transaction
Boris Burkov <boris(a)bur.io>
btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations
Boris Burkov <boris(a)bur.io>
btrfs: qgroup: correctly model root qgroup rsv in convert
Geliang Tang <tanggeliang(a)kylinos.cn>
selftests: mptcp: use += operator to append strings
Jacob Pan <jacob.jun.pan(a)linux.intel.com>
iommu/vt-d: Allocate local memory for page request queue
Xuchun Shang <xuchun.shang(a)linux.alibaba.com>
iommu/vt-d: Fix wrong use of pasid config
Arnd Bergmann <arnd(a)arndb.de>
tracing: hide unused ftrace_event_id_fops
David Arinzon <darinzon(a)amazon.com>
net: ena: Set tx_info->xdpf value to NULL
David Arinzon <darinzon(a)amazon.com>
net: ena: Use tx_ring instead of xdp_ring for XDP channel TX
David Arinzon <darinzon(a)amazon.com>
net: ena: Pass ena_adapter instead of net_device to ena_xmit_common()
David Arinzon <darinzon(a)amazon.com>
net: ena: Move XDP code to its new files
David Arinzon <darinzon(a)amazon.com>
net: ena: Fix incorrect descriptor free behavior
David Arinzon <darinzon(a)amazon.com>
net: ena: Wrong missing IO completions check order
David Arinzon <darinzon(a)amazon.com>
net: ena: Fix potential sign extension issue
Michal Luczaj <mhal(a)rbox.co>
af_unix: Fix garbage collector racing against connect()
Kuniyuki Iwashima <kuniyu(a)amazon.com>
af_unix: Do not use atomic ops for unix_sk(sk)->inflight.
Arınç ÜNAL <arinc.unal(a)arinc9.com>
net: dsa: mt7530: trap link-local frames regardless of ST Port State
Gerd Bayer <gbayer(a)linux.ibm.com>
Revert "s390/ism: fix receive message buffer allocation"
Daniel Machon <daniel.machon(a)microchip.com>
net: sparx5: fix wrong config being used when reconfiguring PCS
Rahul Rameshbabu <rrameshbabu(a)nvidia.com>
net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit
Carolina Jubran <cjubran(a)nvidia.com>
net/mlx5e: HTB, Fix inconsistencies with QoS SQs number
Carolina Jubran <cjubran(a)nvidia.com>
net/mlx5e: Fix mlx5e_priv_init() cleanup flow
Cosmin Ratiu <cratiu(a)nvidia.com>
net/mlx5: Correctly compare pkt reformat ids
Cosmin Ratiu <cratiu(a)nvidia.com>
net/mlx5: Properly link new fs rules into the tree
Michael Liang <mliang(a)purestorage.com>
net/mlx5: offset comp irq index in name by one
Shay Drory <shayd(a)nvidia.com>
net/mlx5: Register devlink first under devlink lock
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: SF, Stop waiting for FW as teardown was called
Eric Dumazet <edumazet(a)google.com>
netfilter: complete validation of user input
Archie Pusaka <apusaka(a)chromium.org>
Bluetooth: l2cap: Don't double set the HCI_CONN_MGMT_CONNECTED bit
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: SCO: Fix not validating setsockopt user input
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_sync: Fix using the same interval and window for Coded PHY
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: hci_sync: Use QoS to determine which PHY to scan
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: ISO: Don't reject BT_ISO_QOS if parameters are unset
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Bluetooth: ISO: Align broadcast sync_timeout with connection timeout
Jiri Benc <jbenc(a)redhat.com>
ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
Arnd Bergmann <arnd(a)arndb.de>
ipv4/route: avoid unused-but-set-variable warning
Arnd Bergmann <arnd(a)arndb.de>
ipv6: fib: hide unused 'pn' variable
Geetha sowjanya <gakula(a)marvell.com>
octeontx2-af: Fix NIX SQ mode and BP config
Kuniyuki Iwashima <kuniyu(a)amazon.com>
af_unix: Clear stale u->oob_skb.
Marek Vasut <marex(a)denx.de>
net: ks8851: Handle softirqs at the end of IRQ thread to fix hang
Marek Vasut <marex(a)denx.de>
net: ks8851: Inline ks8851_rx_skb()
Pavan Chebbi <pavan.chebbi(a)broadcom.com>
bnxt_en: Reset PTP tx_avail after possible firmware reset
Vikas Gupta <vikas.gupta(a)broadcom.com>
bnxt_en: Fix error recovery for RoCE ulp client
Vikas Gupta <vikas.gupta(a)broadcom.com>
bnxt_en: Fix possible memory leak in bnxt_rdma_aux_device_init()
Gerd Bayer <gbayer(a)linux.ibm.com>
s390/ism: fix receive message buffer allocation
Eric Dumazet <edumazet(a)google.com>
geneve: fix header validation in geneve[6]_xmit_skb
Ming Lei <ming.lei(a)redhat.com>
block: fix q->blkg_list corruption during disk rebind
Hariprasad Kelam <hkelam(a)marvell.com>
octeontx2-pf: Fix transmit scheduler resource leak
Eric Dumazet <edumazet(a)google.com>
xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING
Petr Tesarik <petr(a)tesarici.cz>
u64_stats: fix u64_stats_init() for lockdep when used repeatedly in one file
Ilya Maximets <i.maximets(a)ovn.org>
net: openvswitch: fix unwanted error log on timeout policy probing
Dan Carpenter <dan.carpenter(a)linaro.org>
scsi: qla2xxx: Fix off by one in qla_edif_app_getstats()
Xiang Chen <chenxiang66(a)hisilicon.com>
scsi: hisi_sas: Modify the deadline for ata_wait_after_reset()
Arnd Bergmann <arnd(a)arndb.de>
nouveau: fix function cast warning
Alex Constantino <dreaming.about.electric.sheep(a)gmail.com>
Revert "drm/qxl: simplify qxl_fence_wait"
Kwangjin Ko <kwangjin.ko(a)sk.com>
cxl/core: Fix initialization of mbox_cmd.size_out in get event
Frank Li <Frank.Li(a)nxp.com>
arm64: dts: imx8-ss-conn: fix usdhc wrong lpcg clock order
Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
drm/msm/dpu: don't allow overriding data from catalog
Dave Jiang <dave.jiang(a)intel.com>
cxl/core/regs: Fix usage of map->reg_type in cxl_decode_regblock() before assigned
Yuquan Wang <wangyuquan1236(a)phytium.com.cn>
cxl/mem: Fix for the index of Clear Event Record Handle
Cristian Marussi <cristian.marussi(a)arm.com>
firmware: arm_scmi: Make raw debugfs entries non-seekable
Aaro Koskinen <aaro.koskinen(a)iki.fi>
ARM: OMAP2+: fix USB regression on Nokia N8x0
Aaro Koskinen <aaro.koskinen(a)iki.fi>
mmc: omap: restore original power up/down steps
Aaro Koskinen <aaro.koskinen(a)iki.fi>
mmc: omap: fix deferred probe
Aaro Koskinen <aaro.koskinen(a)iki.fi>
mmc: omap: fix broken slot switch lookup
Aaro Koskinen <aaro.koskinen(a)iki.fi>
ARM: OMAP2+: fix N810 MMC gpiod table
Aaro Koskinen <aaro.koskinen(a)iki.fi>
ARM: OMAP2+: fix bogus MMC GPIO labels on Nokia N8x0
Nini Song <nini.song(a)mediatek.com>
media: cec: core: remove length check of Timer Status
Anna-Maria Behnsen <anna-maria(a)linutronix.de>
PM: s2idle: Make sure CPUs will wakeup directly on resume
Hans de Goede <hdegoede(a)redhat.com>
ACPI: scan: Do not increase dep_unmet for already met dependencies
Noah Loomans <noah(a)noahloomans.com>
platform/chrome: cros_ec_uart: properly fix race condition
Tim Huang <Tim.Huang(a)amd.com>
drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11
Dmitry Antipov <dmantipov(a)yandex.ru>
Bluetooth: Fix memory leak in hci_req_sync_complete()
Steven Rostedt (Google) <rostedt(a)goodmis.org>
ring-buffer: Only update pages_touched when a new page is touched
Yu Kuai <yukuai3(a)huawei.com>
raid1: fix use-after-free for original bio in raid1_write_request()
Fabio Estevam <festevam(a)denx.de>
ARM: dts: imx7s-warp: Pass OV2680 link-frequencies
Gavin Shan <gshan(a)redhat.com>
arm64: tlb: Fix TLBI RANGE operand
Sven Eckelmann <sven(a)narfation.org>
batman-adv: Avoid infinite loop trying to resize local TT
Damien Le Moal <dlemoal(a)kernel.org>
ata: libata-scsi: Fix ata_scsi_dev_rescan() error path
Igor Pylypiv <ipylypiv(a)google.com>
ata: libata-core: Allow command duration limits detection for ACS-4 drives
Steve French <stfrench(a)microsoft.com>
smb3: fix Open files on server counter going negative
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/spectre.rst | 22 +-
Documentation/admin-guide/kernel-parameters.txt | 12 +-
.../device_drivers/ethernet/amazon/ena.rst | 1 +
Makefile | 4 +-
arch/arm/boot/dts/nxp/imx/imx7s-warp.dts | 1 +
arch/arm/mach-omap2/board-n8x0.c | 23 +-
arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi | 16 +-
arch/arm64/boot/dts/freescale/imx8-ss-dma.dtsi | 36 +-
arch/arm64/boot/dts/freescale/imx8-ss-lsio.dtsi | 16 +-
arch/arm64/boot/dts/freescale/imx8qm-ss-dma.dtsi | 8 +-
arch/arm64/include/asm/tlbflush.h | 20 +-
arch/x86/Kconfig | 21 +-
arch/x86/events/core.c | 1 +
arch/x86/include/asm/apic.h | 3 +-
arch/x86/kernel/apic/apic.c | 6 +-
arch/x86/kernel/cpu/bugs.c | 82 ++-
arch/x86/kernel/cpu/common.c | 48 +-
block/blk-cgroup.c | 9 +-
block/blk-cgroup.h | 2 +
block/blk-core.c | 2 +
drivers/accel/ivpu/ivpu_drv.c | 2 +-
drivers/acpi/scan.c | 3 +-
drivers/ata/libata-core.c | 2 +-
drivers/ata/libata-scsi.c | 9 +-
drivers/cxl/core/mbox.c | 5 +-
drivers/cxl/core/regs.c | 5 +-
drivers/firmware/arm_scmi/raw_mode.c | 7 +-
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +-
drivers/gpu/drm/amd/amdgpu/soc21.c | 27 +-
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 1 +
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +-
.../amd/display/dc/clk_mgr/dcn316/dcn316_clk_mgr.c | 19 +-
.../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c | 12 +-
drivers/gpu/drm/ast/ast_dp.c | 3 +
drivers/gpu/drm/drm_client_modeset.c | 3 +-
drivers/gpu/drm/i915/display/intel_cdclk.c | 7 +-
drivers/gpu/drm/i915/display/intel_cdclk.h | 3 +
drivers/gpu/drm/i915/display/intel_ddi.c | 5 +
drivers/gpu/drm/i915/display/intel_vrr.c | 7 +
drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 10 +-
.../gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c | 7 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 13 +-
drivers/gpu/drm/qxl/qxl_release.c | 50 +-
drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 11 +-
drivers/iommu/intel/perfmon.c | 2 +-
drivers/iommu/intel/svm.c | 2 +-
drivers/md/raid1.c | 2 +-
drivers/media/cec/core/cec-adap.c | 14 -
drivers/mmc/host/omap.c | 48 +-
drivers/net/dsa/mt7530.c | 229 ++++++-
drivers/net/dsa/mt7530.h | 5 +
drivers/net/ethernet/amazon/ena/Makefile | 2 +-
drivers/net/ethernet/amazon/ena/ena_com.c | 2 +-
drivers/net/ethernet/amazon/ena/ena_ethtool.c | 1 +
drivers/net/ethernet/amazon/ena/ena_netdev.c | 688 ++-------------------
drivers/net/ethernet/amazon/ena/ena_netdev.h | 83 +--
drivers/net/ethernet/amazon/ena/ena_xdp.c | 466 ++++++++++++++
drivers/net/ethernet/amazon/ena/ena_xdp.h | 152 +++++
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +
drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 6 +-
.../net/ethernet/marvell/octeontx2/af/rvu_nix.c | 22 +-
drivers/net/ethernet/marvell/octeontx2/nic/qos.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h | 8 +-
drivers/net/ethernet/mellanox/mlx5/core/en/qos.c | 33 +-
drivers/net/ethernet/mellanox/mlx5/core/en/selq.c | 2 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 -
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 7 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 17 +-
drivers/net/ethernet/mellanox/mlx5/core/main.c | 37 +-
drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 4 +-
.../ethernet/mellanox/mlx5/core/sf/dev/driver.c | 22 +-
drivers/net/ethernet/micrel/ks8851.h | 3 -
drivers/net/ethernet/micrel/ks8851_common.c | 16 +-
drivers/net/ethernet/micrel/ks8851_par.c | 11 -
drivers/net/ethernet/micrel/ks8851_spi.c | 11 -
.../net/ethernet/microchip/sparx5/sparx5_port.c | 4 +-
drivers/net/geneve.c | 4 +-
drivers/platform/chrome/cros_ec_uart.c | 28 +-
drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +-
drivers/scsi/qla2xxx/qla_edif.c | 2 +-
drivers/scsi/sg.c | 20 +-
drivers/vhost/vhost.c | 28 +-
fs/btrfs/delayed-inode.c | 3 +
fs/btrfs/inode.c | 13 +-
fs/btrfs/ioctl.c | 37 +-
fs/btrfs/qgroup.c | 2 +
fs/btrfs/root-tree.c | 10 -
fs/btrfs/root-tree.h | 2 -
fs/btrfs/transaction.c | 17 +-
fs/smb/client/cached_dir.c | 4 +-
include/linux/dma-fence.h | 7 +
include/linux/irqflags.h | 2 +-
include/linux/u64_stats_sync.h | 9 +-
include/net/addrconf.h | 4 +
include/net/af_unix.h | 2 +-
include/net/bluetooth/bluetooth.h | 11 +
include/net/ip_tunnels.h | 33 +
io_uring/net.c | 1 +
kernel/cpu.c | 3 +-
kernel/kprobes.c | 18 +-
kernel/power/suspend.c | 6 +
kernel/trace/ring_buffer.c | 6 +-
kernel/trace/trace_events.c | 4 +
net/batman-adv/translation-table.c | 2 +-
net/bluetooth/hci_request.c | 4 +-
net/bluetooth/hci_sync.c | 66 +-
net/bluetooth/iso.c | 14 +-
net/bluetooth/l2cap_core.c | 3 +-
net/bluetooth/sco.c | 23 +-
net/ipv4/netfilter/arp_tables.c | 4 +
net/ipv4/netfilter/ip_tables.c | 4 +
net/ipv4/route.c | 4 +-
net/ipv6/addrconf.c | 7 +-
net/ipv6/ip6_fib.c | 7 +-
net/ipv6/netfilter/ip6_tables.c | 4 +
net/openvswitch/conntrack.c | 5 +-
net/unix/af_unix.c | 8 +-
net/unix/garbage.c | 35 +-
net/unix/scm.c | 8 +-
net/xdp/xsk.c | 2 +
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 53 +-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 30 +-
tools/testing/selftests/timers/posix_timers.c | 2 +-
123 files changed, 1765 insertions(+), 1263 deletions(-)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x a4833e3abae132d613ce7da0e0c9a9465d1681fa
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041925-walnut-flammable-adf1@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
a4833e3abae1 ("SUNRPC: Fix rpcgss_context trace event acceptor field")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a4833e3abae132d613ce7da0e0c9a9465d1681fa Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Date: Wed, 10 Apr 2024 12:38:13 -0400
Subject: [PATCH] SUNRPC: Fix rpcgss_context trace event acceptor field
The rpcgss_context trace event acceptor field is a dynamically sized
string that records the "data" parameter. But this parameter is also
dependent on the "len" field to determine the size of the data.
It needs to use __string_len() helper macro where the length can be passed
in. It also incorrectly uses strncpy() to save it instead of
__assign_str(). As these macros can change, it is not wise to open code
them in trace events.
As of commit c759e609030c ("tracing: Remove __assign_str_len()"),
__assign_str() can be used for both __string() and __string_len() fields.
Before that commit, __assign_str_len() is required to be used. This needs
to be noted for backporting. (In actuality, commit c1fa617caeb0 ("tracing:
Rework __assign_str() and __string() to not duplicate getting the string")
is the commit that makes __string_str_len() obsolete).
Cc: stable(a)vger.kernel.org
Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in rpc_auth_gss.ko")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/include/trace/events/rpcgss.h b/include/trace/events/rpcgss.h
index ba2d96a1bc2f..f50fcafc69de 100644
--- a/include/trace/events/rpcgss.h
+++ b/include/trace/events/rpcgss.h
@@ -609,7 +609,7 @@ TRACE_EVENT(rpcgss_context,
__field(unsigned int, timeout)
__field(u32, window_size)
__field(int, len)
- __string(acceptor, data)
+ __string_len(acceptor, data, len)
),
TP_fast_assign(
@@ -618,7 +618,7 @@ TRACE_EVENT(rpcgss_context,
__entry->timeout = timeout;
__entry->window_size = window_size;
__entry->len = len;
- strncpy(__get_str(acceptor), data, len);
+ __assign_str(acceptor, data);
),
TP_printk("win_size=%u expiry=%lu now=%lu timeout=%u acceptor=%.*s",
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x a4833e3abae132d613ce7da0e0c9a9465d1681fa
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041953-duration-fructose-0bc1@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
a4833e3abae1 ("SUNRPC: Fix rpcgss_context trace event acceptor field")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a4833e3abae132d613ce7da0e0c9a9465d1681fa Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
Date: Wed, 10 Apr 2024 12:38:13 -0400
Subject: [PATCH] SUNRPC: Fix rpcgss_context trace event acceptor field
The rpcgss_context trace event acceptor field is a dynamically sized
string that records the "data" parameter. But this parameter is also
dependent on the "len" field to determine the size of the data.
It needs to use __string_len() helper macro where the length can be passed
in. It also incorrectly uses strncpy() to save it instead of
__assign_str(). As these macros can change, it is not wise to open code
them in trace events.
As of commit c759e609030c ("tracing: Remove __assign_str_len()"),
__assign_str() can be used for both __string() and __string_len() fields.
Before that commit, __assign_str_len() is required to be used. This needs
to be noted for backporting. (In actuality, commit c1fa617caeb0 ("tracing:
Rework __assign_str() and __string() to not duplicate getting the string")
is the commit that makes __string_str_len() obsolete).
Cc: stable(a)vger.kernel.org
Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in rpc_auth_gss.ko")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/include/trace/events/rpcgss.h b/include/trace/events/rpcgss.h
index ba2d96a1bc2f..f50fcafc69de 100644
--- a/include/trace/events/rpcgss.h
+++ b/include/trace/events/rpcgss.h
@@ -609,7 +609,7 @@ TRACE_EVENT(rpcgss_context,
__field(unsigned int, timeout)
__field(u32, window_size)
__field(int, len)
- __string(acceptor, data)
+ __string_len(acceptor, data, len)
),
TP_fast_assign(
@@ -618,7 +618,7 @@ TRACE_EVENT(rpcgss_context,
__entry->timeout = timeout;
__entry->window_size = window_size;
__entry->len = len;
- strncpy(__get_str(acceptor), data, len);
+ __assign_str(acceptor, data);
),
TP_printk("win_size=%u expiry=%lu now=%lu timeout=%u acceptor=%.*s",
From: Ard Biesheuvel <ardb(a)kernel.org>
This is the final batch of changes to bring linux-6.1.y in sync with
6.6 and later in terms of compatibility with tightened boot security
requirements imposed by MicroSoft, compliance with which is a
prerequisite for them to be willing to resume signing distro shim images
with the MS 3rd party secure boot certificate.
Without this, distros can only boot on off-the-shelf x86 PCs after
disabling secure boot explicitly.
Most of these changes appeared in v6.8 and have been backported to v6.6
already.
Ard Biesheuvel (20):
x86/efi: Drop EFI stub .bss from .data section
x86/efi: Disregard setup header of loaded image
x86/efistub: Reinstate soft limit for initrd loading
x86/efi: Drop alignment flags from PE section headers
x86/boot: Remove the 'bugger off' message
x86/boot: Omit compression buffer from PE/COFF image memory footprint
x86/boot: Drop redundant code setting the root device
x86/boot: Drop references to startup_64
x86/boot: Grab kernel_info offset from zoffset header directly
x86/boot: Set EFI handover offset directly in header asm
x86/boot: Define setup size in linker script
x86/boot: Derive file size from _edata symbol
x86/boot: Construct PE/COFF .text section from assembler
x86/boot: Drop PE/COFF .reloc section
x86/boot: Split off PE/COFF .data section
x86/boot: Increase section and file alignment to 4k/512
x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section
x86/sme: Move early SME kernel encryption handling into .head.text
x86/sev: Move early startup code into .head.text section
x86/efistub: Remap kernel text read-only before dropping NX attribute
Hou Wenlong (2):
x86/head/64: Add missing __head annotation to startup_64_load_idt()
x86/head/64: Move the __head definition to <asm/init.h>
Pasha Tatashin (1):
x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros
arch/x86/boot/Makefile | 2 +-
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/compressed/misc.c | 1 +
arch/x86/boot/compressed/sev.c | 3 +
arch/x86/boot/compressed/vmlinux.lds.S | 6 +-
arch/x86/boot/header.S | 211 ++++++---------
arch/x86/boot/setup.ld | 14 +-
arch/x86/boot/tools/build.c | 273 +-------------------
arch/x86/include/asm/boot.h | 1 +
arch/x86/include/asm/init.h | 2 +
arch/x86/include/asm/mem_encrypt.h | 8 +-
arch/x86/include/asm/page_types.h | 12 +-
arch/x86/include/asm/sev.h | 10 +-
arch/x86/kernel/amd_gart_64.c | 2 +-
arch/x86/kernel/head64.c | 7 +-
arch/x86/kernel/sev-shared.c | 23 +-
arch/x86/kernel/sev.c | 11 +-
arch/x86/mm/mem_encrypt_boot.S | 4 +-
arch/x86/mm/mem_encrypt_identity.c | 58 ++---
arch/x86/mm/pat/set_memory.c | 6 +-
arch/x86/mm/pti.c | 2 +-
drivers/firmware/efi/libstub/Makefile | 7 -
drivers/firmware/efi/libstub/x86-stub.c | 58 ++---
23 files changed, 194 insertions(+), 529 deletions(-)
--
2.44.0.769.g3c40516874-goog
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1db7959aacd905e6487d0478ac01d89f86eb1e51
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041954-bullish-slingshot-109f@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
1db7959aacd9 ("btrfs: do not wait for short bulk allocation")
09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method")
397239ed6a6c ("btrfs: allow extent buffer helpers to skip cross-page handling")
94dbf7c0871f ("btrfs: free the allocated memory if btrfs_alloc_page_array() fails")
096d23016543 ("btrfs: refactor main loop in memmove_extent_buffer()")
13840f3f2837 ("btrfs: refactor main loop in memcpy_extent_buffer()")
730c374e5b2c ("btrfs: use write_extent_buffer() to implement write_extent_buffer_*id()")
cb22964f1dad ("btrfs: refactor extent buffer bitmaps operations")
52ea5bfbfa6d ("btrfs: move eb subpage preallocation out of the loop")
5a96341927b0 ("btrfs: subpage: make alloc_extent_buffer() handle previously uptodate range efficiently")
2af2aaf98205 ("btrfs: scrub: introduce structure for new BTRFS_STRIPE_LEN based interface")
5eb30ee26fa4 ("btrfs: raid56: introduce the main entrance for RMW path")
6486d21c99cb ("btrfs: raid56: extract rwm write bios assembly into a helper")
509c27aa2fb6 ("btrfs: raid56: extract the rmw bio list build code into a helper")
30e3c897f4a8 ("btrfs: raid56: extract the pq generation code into a helper")
2fc6822c99d7 ("btrfs: move scrub prototypes into scrub.h")
677074792a1d ("btrfs: move relocation prototypes into relocation.h")
33cf97a7b658 ("btrfs: move acl prototypes into acl.h")
af142b6f44d3 ("btrfs: move file prototypes to file.h")
7572dec8f522 ("btrfs: move ioctl prototypes into ioctl.h")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1db7959aacd905e6487d0478ac01d89f86eb1e51 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Tue, 26 Mar 2024 09:16:46 +1030
Subject: [PATCH] btrfs: do not wait for short bulk allocation
[BUG]
There is a recent report that when memory pressure is high (including
cached pages), btrfs can spend most of its time on memory allocation in
btrfs_alloc_page_array() for compressed read/write.
[CAUSE]
For btrfs_alloc_page_array() we always go alloc_pages_bulk_array(), and
even if the bulk allocation failed (fell back to single page
allocation) we still retry but with extra memalloc_retry_wait().
If the bulk alloc only returned one page a time, we would spend a lot of
time on the retry wait.
The behavior was introduced in commit 395cb57e8560 ("btrfs: wait between
incomplete batch memory allocations").
[FIX]
Although the commit mentioned that other filesystems do the wait, it's
not the case at least nowadays.
All the mainlined filesystems only call memalloc_retry_wait() if they
failed to allocate any page (not only for bulk allocation).
If there is any progress, they won't call memalloc_retry_wait() at all.
For example, xfs_buf_alloc_pages() would only call memalloc_retry_wait()
if there is no allocation progress at all, and the call is not for
metadata readahead.
So I don't believe we should call memalloc_retry_wait() unconditionally
for short allocation.
Call memalloc_retry_wait() if it fails to allocate any page for tree
block allocation (which goes with __GFP_NOFAIL and may not need the
special handling anyway), and reduce the latency for
btrfs_alloc_page_array().
Reported-by: Julian Taylor <julian.taylor(a)1und1.de>
Tested-by: Julian Taylor <julian.taylor(a)1und1.de>
Link: https://lore.kernel.org/all/8966c095-cbe7-4d22-9784-a647d1bf27c3@1und1.de/
Fixes: 395cb57e8560 ("btrfs: wait between incomplete batch memory allocations")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Sweet Tea Dorminy <sweettea-kernel(a)dorminy.me>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index b18034f2ab80..2776112dbdf8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -681,31 +681,21 @@ static void end_bbio_data_read(struct btrfs_bio *bbio)
int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array,
gfp_t extra_gfp)
{
+ const gfp_t gfp = GFP_NOFS | extra_gfp;
unsigned int allocated;
for (allocated = 0; allocated < nr_pages;) {
unsigned int last = allocated;
- allocated = alloc_pages_bulk_array(GFP_NOFS | extra_gfp,
- nr_pages, page_array);
-
- if (allocated == nr_pages)
- return 0;
-
- /*
- * During this iteration, no page could be allocated, even
- * though alloc_pages_bulk_array() falls back to alloc_page()
- * if it could not bulk-allocate. So we must be out of memory.
- */
- if (allocated == last) {
+ allocated = alloc_pages_bulk_array(gfp, nr_pages, page_array);
+ if (unlikely(allocated == last)) {
+ /* No progress, fail and do cleanup. */
for (int i = 0; i < allocated; i++) {
__free_page(page_array[i]);
page_array[i] = NULL;
}
return -ENOMEM;
}
-
- memalloc_retry_wait(GFP_NOFS);
}
return 0;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 1db7959aacd905e6487d0478ac01d89f86eb1e51
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024041951-sports-hula-f2a5@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
1db7959aacd9 ("btrfs: do not wait for short bulk allocation")
09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method")
397239ed6a6c ("btrfs: allow extent buffer helpers to skip cross-page handling")
94dbf7c0871f ("btrfs: free the allocated memory if btrfs_alloc_page_array() fails")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1db7959aacd905e6487d0478ac01d89f86eb1e51 Mon Sep 17 00:00:00 2001
From: Qu Wenruo <wqu(a)suse.com>
Date: Tue, 26 Mar 2024 09:16:46 +1030
Subject: [PATCH] btrfs: do not wait for short bulk allocation
[BUG]
There is a recent report that when memory pressure is high (including
cached pages), btrfs can spend most of its time on memory allocation in
btrfs_alloc_page_array() for compressed read/write.
[CAUSE]
For btrfs_alloc_page_array() we always go alloc_pages_bulk_array(), and
even if the bulk allocation failed (fell back to single page
allocation) we still retry but with extra memalloc_retry_wait().
If the bulk alloc only returned one page a time, we would spend a lot of
time on the retry wait.
The behavior was introduced in commit 395cb57e8560 ("btrfs: wait between
incomplete batch memory allocations").
[FIX]
Although the commit mentioned that other filesystems do the wait, it's
not the case at least nowadays.
All the mainlined filesystems only call memalloc_retry_wait() if they
failed to allocate any page (not only for bulk allocation).
If there is any progress, they won't call memalloc_retry_wait() at all.
For example, xfs_buf_alloc_pages() would only call memalloc_retry_wait()
if there is no allocation progress at all, and the call is not for
metadata readahead.
So I don't believe we should call memalloc_retry_wait() unconditionally
for short allocation.
Call memalloc_retry_wait() if it fails to allocate any page for tree
block allocation (which goes with __GFP_NOFAIL and may not need the
special handling anyway), and reduce the latency for
btrfs_alloc_page_array().
Reported-by: Julian Taylor <julian.taylor(a)1und1.de>
Tested-by: Julian Taylor <julian.taylor(a)1und1.de>
Link: https://lore.kernel.org/all/8966c095-cbe7-4d22-9784-a647d1bf27c3@1und1.de/
Fixes: 395cb57e8560 ("btrfs: wait between incomplete batch memory allocations")
CC: stable(a)vger.kernel.org # 6.1+
Reviewed-by: Sweet Tea Dorminy <sweettea-kernel(a)dorminy.me>
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index b18034f2ab80..2776112dbdf8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -681,31 +681,21 @@ static void end_bbio_data_read(struct btrfs_bio *bbio)
int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array,
gfp_t extra_gfp)
{
+ const gfp_t gfp = GFP_NOFS | extra_gfp;
unsigned int allocated;
for (allocated = 0; allocated < nr_pages;) {
unsigned int last = allocated;
- allocated = alloc_pages_bulk_array(GFP_NOFS | extra_gfp,
- nr_pages, page_array);
-
- if (allocated == nr_pages)
- return 0;
-
- /*
- * During this iteration, no page could be allocated, even
- * though alloc_pages_bulk_array() falls back to alloc_page()
- * if it could not bulk-allocate. So we must be out of memory.
- */
- if (allocated == last) {
+ allocated = alloc_pages_bulk_array(gfp, nr_pages, page_array);
+ if (unlikely(allocated == last)) {
+ /* No progress, fail and do cleanup. */
for (int i = 0; i < allocated; i++) {
__free_page(page_array[i]);
page_array[i] = NULL;
}
return -ENOMEM;
}
-
- memalloc_retry_wait(GFP_NOFS);
}
return 0;
}
From: Joao Paulo Goncalves <joao.goncalves(a)toradex.com>
When using davinci-mcasp as CPU DAI with simple-card, there are some
conditions that cause simple-card to finish registering a sound card before
davinci-mcasp finishes registering all sound components. This creates a
non-working sound card from userspace with no problem indication apart
from not being able to play/record audio on a PCM stream. The issue
arises during simultaneous probe execution of both drivers. Specifically,
the simple-card driver, awaiting a CPU DAI, proceeds as soon as
davinci-mcasp registers its DAI. However, this process can lead to the
client mutex lock (client_mutex in soc-core.c) being held or davinci-mcasp
being preempted before PCM DMA registration on davinci-mcasp finishes.
This situation occurs when the probes of both drivers run concurrently.
Below is the code path for this condition. To solve the issue, defer
davinci-mcasp CPU DAI registration to the last step in the audio part of
it. This way, simple-card CPU DAI parsing will be deferred until all
audio components are registered.
Fail Code Path:
simple-card.c: probe starts
simple-card.c: simple_dai_link_of: simple_parse_node(..,cpu,..) returns EPROBE_DEFER, no CPU DAI yet
davinci-mcasp.c: probe starts
davinci-mcasp.c: devm_snd_soc_register_component() register CPU DAI
simple-card.c: probes again, finish CPU DAI parsing and call devm_snd_soc_register_card()
simple-card.c: finish probe
davinci-mcasp.c: *dma_pcm_platform_register() register PCM DMA
davinci-mcasp.c: probe finish
Cc: stable(a)vger.kernel.org
Fixes: 9fbd58cf4ab0 ("ASoC: davinci-mcasp: Choose PCM driver based on configured DMA controller")
Signed-off-by: Joao Paulo Goncalves <joao.goncalves(a)toradex.com>
---
sound/soc/ti/davinci-mcasp.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/sound/soc/ti/davinci-mcasp.c b/sound/soc/ti/davinci-mcasp.c
index b892d66f78470..1e760c3155213 100644
--- a/sound/soc/ti/davinci-mcasp.c
+++ b/sound/soc/ti/davinci-mcasp.c
@@ -2417,12 +2417,6 @@ static int davinci_mcasp_probe(struct platform_device *pdev)
mcasp_reparent_fck(pdev);
- ret = devm_snd_soc_register_component(&pdev->dev, &davinci_mcasp_component,
- &davinci_mcasp_dai[mcasp->op_mode], 1);
-
- if (ret != 0)
- goto err;
-
ret = davinci_mcasp_get_dma_type(mcasp);
switch (ret) {
case PCM_EDMA:
@@ -2449,6 +2443,12 @@ static int davinci_mcasp_probe(struct platform_device *pdev)
goto err;
}
+ ret = devm_snd_soc_register_component(&pdev->dev, &davinci_mcasp_component,
+ &davinci_mcasp_dai[mcasp->op_mode], 1);
+
+ if (ret != 0)
+ goto err;
+
no_audio:
ret = davinci_mcasp_init_gpiochip(mcasp);
if (ret) {
--
2.34.1
Framebuffer memory is allocated via vzalloc() from non-contiguous
physical pages. The physical framebuffer start address is therefore
meaningless. Do not set it.
The value is not used within the kernel and only exported to userspace
on dedicated ARM configs. No functional change is expected.
v2:
- refer to vzalloc() in commit message (Javier)
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Fixes: a5b44c4adb16 ("drm/fbdev-generic: Always use shadow buffering")
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: Zack Rusin <zackr(a)vmware.com>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: <stable(a)vger.kernel.org> # v6.4+
Reviewed-by: Javier Martinez Canillas <javierm(a)redhat.com>
Reviewed-by: Zack Rusin <zack.rusin(a)broadcom.com>
Reviewed-by: Sui Jingfeng <sui.jingfeng(a)linux.dev>
Tested-by: Sui Jingfeng <sui.jingfeng(a)linux.dev>
Acked-by: Maxime Ripard <mripard(a)kernel.org>
---
drivers/gpu/drm/drm_fbdev_generic.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_fbdev_generic.c b/drivers/gpu/drm/drm_fbdev_generic.c
index be357f926faec..97e579c33d84a 100644
--- a/drivers/gpu/drm/drm_fbdev_generic.c
+++ b/drivers/gpu/drm/drm_fbdev_generic.c
@@ -113,7 +113,6 @@ static int drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper,
/* screen */
info->flags |= FBINFO_VIRTFB | FBINFO_READS_FAST;
info->screen_buffer = screen_buffer;
- info->fix.smem_start = page_to_phys(vmalloc_to_page(info->screen_buffer));
info->fix.smem_len = screen_size;
/* deferred I/O */
--
2.44.0
The patch titled
Subject: init: fix allocated page overlapping with PTR_ERR
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
init-fix-allocated-page-overlapping-with-ptr_err.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Nam Cao <namcao(a)linutronix.de>
Subject: init: fix allocated page overlapping with PTR_ERR
Date: Thu, 18 Apr 2024 12:29:43 +0200
There is nothing preventing kernel memory allocators from allocating a
page that overlaps with PTR_ERR(), except for architecture-specific code
that setup memblock.
It was discovered that RISCV architecture doesn't setup memblock corectly,
leading to a page overlapping with PTR_ERR() being allocated, and
subsequently crashing the kernel (link in Close: )
The reported crash has nothing to do with PTR_ERR(): the last page (at
address 0xfffff000) being allocated leads to an unexpected arithmetic
overflow in ext4; but still, this page shouldn't be allocated in the first
place.
Because PTR_ERR() is an architecture-independent thing, we shouldn't ask
every single architecture to set this up. There may be other
architectures beside RISCV that have the same problem.
Fix this once and for all by reserving the physical memory page that may
be mapped to the last virtual memory page as part of low memory.
Unfortunately, this means if there is actual memory at this reserved
location, that memory will become inaccessible. However, if this page is
not reserved, it can only be accessed as high memory, so this doesn't
matter if high memory is not supported. Even if high memory is supported,
it is still only one page.
Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong…
Link: https://lkml.kernel.org/r/20240418102943.180510-1-namcao@linutronix.de
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Reported-by: Bj��rn T��pel <bjorn(a)kernel.org>
Tested-by: Bj��rn T��pel <bjorn(a)kernel.org>
Reviewed-by: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Andreas Dilger <adilger(a)dilger.ca>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Changbin Du <changbin.du(a)huawei.com>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: Geert Uytterhoeven <geert+renesas(a)glider.be>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Krister Johansen <kjlx(a)templeofstupid.com>
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: Stephen Rothwell <sfr(a)canb.auug.org.au>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
init/main.c | 1 +
1 file changed, 1 insertion(+)
--- a/init/main.c~init-fix-allocated-page-overlapping-with-ptr_err
+++ a/init/main.c
@@ -900,6 +900,7 @@ void start_kernel(void)
page_address_init();
pr_notice("%s", linux_banner);
early_security_init();
+ memblock_reserve(__pa(-PAGE_SIZE), PAGE_SIZE); /* reserve last page for ERR_PTR */
setup_arch(&command_line);
setup_boot_config();
setup_command_line(command_line);
_
Patches currently in -mm which might be from namcao(a)linutronix.de are
init-fix-allocated-page-overlapping-with-ptr_err.patch
The patch titled
Subject: stackdepot: respect __GFP_NOLOCKDEP allocation flag
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
stackdepot-respect-__gfp_nolockdep-allocation-flag.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Subject: stackdepot: respect __GFP_NOLOCKDEP allocation flag
Date: Thu, 18 Apr 2024 16:11:33 +0200
If stack_depot_save_flags() allocates memory it always drops
__GFP_NOLOCKDEP flag. So when KASAN tries to track __GFP_NOLOCKDEP
allocation we may end up with lockdep splat like bellow:
======================================================
WARNING: possible circular locking dependency detected
6.9.0-rc3+ #49 Not tainted
------------------------------------------------------
kswapd0/149 is trying to acquire lock:
ffff88811346a920
(&xfs_nondir_ilock_class){++++}-{4:4}, at: xfs_reclaim_inode+0x3ac/0x590
[xfs]
but task is already holding lock:
ffffffff8bb33100 (fs_reclaim){+.+.}-{0:0}, at:
balance_pgdat+0x5d9/0xad0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}-{0:0}:
__lock_acquire+0x7da/0x1030
lock_acquire+0x15d/0x400
fs_reclaim_acquire+0xb5/0x100
prepare_alloc_pages.constprop.0+0xc5/0x230
__alloc_pages+0x12a/0x3f0
alloc_pages_mpol+0x175/0x340
stack_depot_save_flags+0x4c5/0x510
kasan_save_stack+0x30/0x40
kasan_save_track+0x10/0x30
__kasan_slab_alloc+0x83/0x90
kmem_cache_alloc+0x15e/0x4a0
__alloc_object+0x35/0x370
__create_object+0x22/0x90
__kmalloc_node_track_caller+0x477/0x5b0
krealloc+0x5f/0x110
xfs_iext_insert_raw+0x4b2/0x6e0 [xfs]
xfs_iext_insert+0x2e/0x130 [xfs]
xfs_iread_bmbt_block+0x1a9/0x4d0 [xfs]
xfs_btree_visit_block+0xfb/0x290 [xfs]
xfs_btree_visit_blocks+0x215/0x2c0 [xfs]
xfs_iread_extents+0x1a2/0x2e0 [xfs]
xfs_buffered_write_iomap_begin+0x376/0x10a0 [xfs]
iomap_iter+0x1d1/0x2d0
iomap_file_buffered_write+0x120/0x1a0
xfs_file_buffered_write+0x128/0x4b0 [xfs]
vfs_write+0x675/0x890
ksys_write+0xc3/0x160
do_syscall_64+0x94/0x170
entry_SYSCALL_64_after_hwframe+0x71/0x79
Always preserve __GFP_NOLOCKDEP to fix this.
Link: https://lkml.kernel.org/r/20240418141133.22950-1-ryabinin.a.a@gmail.com
Fixes: cd11016e5f52 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Reported-by: Xiubo Li <xiubli(a)redhat.com>
Closes: https://lore.kernel.org/all/a0caa289-ca02-48eb-9bf2-d86fd47b71f4@redhat.com/
Reported-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Closes: https://lore.kernel.org/all/f9ff999a-e170-b66b-7caf-293f2b147ac2@opensource…
Suggested-by: Dave Chinner <david(a)fromorbit.com>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/stackdepot.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/lib/stackdepot.c~stackdepot-respect-__gfp_nolockdep-allocation-flag
+++ a/lib/stackdepot.c
@@ -627,10 +627,10 @@ depot_stack_handle_t stack_depot_save_fl
/*
* Zero out zone modifiers, as we don't have specific zone
* requirements. Keep the flags related to allocation in atomic
- * contexts and I/O.
+ * contexts, I/O, nolockdep.
*/
alloc_flags &= ~GFP_ZONEMASK;
- alloc_flags &= (GFP_ATOMIC | GFP_KERNEL);
+ alloc_flags &= (GFP_ATOMIC | GFP_KERNEL | __GFP_NOLOCKDEP);
alloc_flags |= __GFP_NOWARN;
page = alloc_pages(alloc_flags, DEPOT_POOL_ORDER);
if (page)
_
Patches currently in -mm which might be from ryabinin.a.a(a)gmail.com are
stackdepot-respect-__gfp_nolockdep-allocation-flag.patch
The patch titled
Subject: hugetlb: check for anon_vma prior to folio allocation
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
hugetlb-check-for-anon_vma-prior-to-folio-allocation.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Vishal Moola (Oracle)" <vishal.moola(a)gmail.com>
Subject: hugetlb: check for anon_vma prior to folio allocation
Date: Mon, 15 Apr 2024 14:17:47 -0700
Commit 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of
anon_vma_prepare()") may bailout after allocating a folio if we do not
hold the mmap lock. When this occurs, vmf_anon_prepare() will release the
vma lock. Hugetlb then attempts to call restore_reserve_on_error(), which
depends on the vma lock being held.
We can move vmf_anon_prepare() prior to the folio allocation in order to
avoid calling restore_reserve_on_error() without the vma lock.
Link: https://lkml.kernel.org/r/ZiFqSrSRLhIV91og@fedora
Fixes: 9acad7ba3e25 ("hugetlb: use vmf_anon_prepare() instead of anon_vma_prepare()")
Reported-by: syzbot+ad1b592fc4483655438b(a)syzkaller.appspotmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola(a)gmail.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/mm/hugetlb.c~hugetlb-check-for-anon_vma-prior-to-folio-allocation
+++ a/mm/hugetlb.c
@@ -6261,6 +6261,12 @@ static vm_fault_t hugetlb_no_page(struct
VM_UFFD_MISSING);
}
+ if (!(vma->vm_flags & VM_MAYSHARE)) {
+ ret = vmf_anon_prepare(vmf);
+ if (unlikely(ret))
+ goto out;
+ }
+
folio = alloc_hugetlb_folio(vma, haddr, 0);
if (IS_ERR(folio)) {
/*
@@ -6297,15 +6303,12 @@ static vm_fault_t hugetlb_no_page(struct
*/
restore_reserve_on_error(h, vma, haddr, folio);
folio_put(folio);
+ ret = VM_FAULT_SIGBUS;
goto out;
}
new_pagecache_folio = true;
} else {
folio_lock(folio);
-
- ret = vmf_anon_prepare(vmf);
- if (unlikely(ret))
- goto backout_unlocked;
anon_rmap = 1;
}
} else {
_
Patches currently in -mm which might be from vishal.moola(a)gmail.com are
hugetlb-check-for-anon_vma-prior-to-folio-allocation.patch
hugetlb-convert-hugetlb_fault-to-use-struct-vm_fault.patch
hugetlb-convert-hugetlb_no_page-to-use-struct-vm_fault.patch
hugetlb-convert-hugetlb_no_page-to-use-struct-vm_fault-fix.patch
hugetlb-convert-hugetlb_wp-to-use-struct-vm_fault.patch
hugetlb-convert-hugetlb_wp-to-use-struct-vm_fault-fix.patch
The patch titled
Subject: mm: zswap: fix shrinker NULL crash with cgroup_disable=memory
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-zswap-fix-shrinker-null-crash-with-cgroup_disable=memory.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: zswap: fix shrinker NULL crash with cgroup_disable=memory
Date: Thu, 18 Apr 2024 08:26:28 -0400
Christian reports a NULL deref in zswap that he bisected down to the zswap
shrinker. The issue also cropped up in the bug trackers of libguestfs [1]
and the Red Hat bugzilla [2].
The problem is that when memcg is disabled with the boot time flag, the
zswap shrinker might get called with sc->memcg == NULL. This is okay in
many places, like the lruvec operations. But it crashes in
memcg_page_state() - which is only used due to the non-node accounting of
cgroup's the zswap memory to begin with.
Nhat spotted that the memcg can be NULL in the memcg-disabled case, and I
was then able to reproduce the crash locally as well.
[1] https://github.com/libguestfs/libguestfs/issues/139
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2275252
Link: https://lkml.kernel.org/r/20240418124043.GC1055428@cmpxchg.org
Link: https://lkml.kernel.org/r/20240417143324.GA1055428@cmpxchg.org
Fixes: b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Christian Heusel <christian(a)heusel.eu>
Debugged-by: Nhat Pham <nphamcs(a)gmail.com>
Suggested-by: Nhat Pham <nphamcs(a)gmail.com>
Tested-by: Christian Heusel <christian(a)heusel.eu>
Cc: Chengming Zhou <chengming.zhou(a)linux.dev>
Cc: Dan Streetman <ddstreet(a)ieee.org>
Cc: Richard W.M. Jones <rjones(a)redhat.com>
Cc: Seth Jennings <sjenning(a)redhat.com>
Cc: Vitaly Wool <vitaly.wool(a)konsulko.com>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: <stable(a)vger.kernel.org> [v6.8]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/zswap.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
--- a/mm/zswap.c~mm-zswap-fix-shrinker-null-crash-with-cgroup_disable=memory
+++ a/mm/zswap.c
@@ -1331,15 +1331,22 @@ static unsigned long zswap_shrinker_coun
if (!gfp_has_io_fs(sc->gfp_mask))
return 0;
-#ifdef CONFIG_MEMCG_KMEM
- mem_cgroup_flush_stats(memcg);
- nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
- nr_stored = memcg_page_state(memcg, MEMCG_ZSWAPPED);
-#else
- /* use pool stats instead of memcg stats */
- nr_backing = zswap_pool_total_size >> PAGE_SHIFT;
- nr_stored = atomic_read(&zswap_nr_stored);
-#endif
+ /*
+ * For memcg, use the cgroup-wide ZSWAP stats since we don't
+ * have them per-node and thus per-lruvec. Careful if memcg is
+ * runtime-disabled: we can get sc->memcg == NULL, which is ok
+ * for the lruvec, but not for memcg_page_state().
+ *
+ * Without memcg, use the zswap pool-wide metrics.
+ */
+ if (!mem_cgroup_disabled()) {
+ mem_cgroup_flush_stats(memcg);
+ nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
+ nr_stored = memcg_page_state(memcg, MEMCG_ZSWAPPED);
+ } else {
+ nr_backing = zswap_pool_total_size >> PAGE_SHIFT;
+ nr_stored = atomic_read(&zswap_nr_stored);
+ }
if (!nr_stored)
return 0;
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
mm-zswap-fix-shrinker-null-crash-with-cgroup_disable=memory.patch
mm-zswap-optimize-zswap-pool-size-tracking.patch
mm-zpool-return-pool-size-in-pages.patch
mm-page_alloc-remove-pcppage-migratetype-caching.patch
mm-page_alloc-optimize-free_unref_folios.patch
mm-page_alloc-fix-up-block-types-when-merging-compatible-blocks.patch
mm-page_alloc-move-free-pages-when-converting-block-during-isolation.patch
mm-page_alloc-fix-move_freepages_block-range-error.patch
mm-page_alloc-fix-freelist-movement-during-block-conversion.patch
mm-page_alloc-close-migratetype-race-between-freeing-and-stealing.patch
mm-page_isolation-prepare-for-hygienic-freelists.patch
mm-page_isolation-prepare-for-hygienic-freelists-fix.patch
mm-page_alloc-consolidate-free-page-accounting.patch
mm-page_alloc-consolidate-free-page-accounting-fix.patch
mm-page_alloc-consolidate-free-page-accounting-fix-2.patch
mm-page_alloc-batch-vmstat-updates-in-expand.patch
After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two
consecutive device resets"), reset operation, in which the default mac
address from the device is read, is not executed from bind operation and
the random address, that is pregenerated just in case, is direclty written
the first time in the device, so the default one from the device is not
even read. This writing is not dangerous because is volatile and the
default mac address is not missed.
In order to avoid this and keep the simplification to have only one
reset and reduce the delays, restore the reset from bind operation and
remove the reset that is commanded from open operation. The behavior is
the same but everything is ready for usbnet_probe.
Tested with ASIX AX88179 USB Gigabit Ethernet devices.
Restore the old behavior for the rest of possible devices because I don't
have the hardware to test.
cc: stable(a)vger.kernel.org # 6.6+
Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets")
Reported-by: Jarkko Palviainen <jarkko.palviainen(a)gmail.com>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
---
v2:
- Restore reset from bind operation to avoid problems with usbnet_probe.
v1: https://lore.kernel.org/netdev/20240410095603.502566-1-jtornosm@redhat.com/
drivers/net/usb/ax88179_178a.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index 69169842fa2f..a493fde1af3f 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -1316,6 +1316,8 @@ static int ax88179_bind(struct usbnet *dev, struct usb_interface *intf)
netif_set_tso_max_size(dev->net, 16384);
+ ax88179_reset(dev);
+
return 0;
}
@@ -1694,7 +1696,6 @@ static const struct driver_info ax88179_info = {
.unbind = ax88179_unbind,
.status = ax88179_status,
.link_reset = ax88179_link_reset,
- .reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
@@ -1707,7 +1708,6 @@ static const struct driver_info ax88178a_info = {
.unbind = ax88179_unbind,
.status = ax88179_status,
.link_reset = ax88179_link_reset,
- .reset = ax88179_reset,
.stop = ax88179_stop,
.flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
--
2.44.0
Sorry, apologies to everyone. By accident I replied off the list.
Redoing it now on the list. More below.
On Mon, Apr 8, 2024 at 12:10 PM Christian König
<christian.koenig(a)amd.com> wrote:
>
> Am 08.04.24 um 18:04 schrieb Zack Rusin:
> > On Mon, Apr 8, 2024 at 11:59 AM Christian König
> > <christian.koenig(a)amd.com> wrote:
> >> Am 08.04.24 um 17:56 schrieb Zack Rusin:
> >>> Stop printing the TT memory decryption status info each time tt is created
> >>> and instead print it just once.
> >>>
> >>> Reduces the spam in the system logs when running guests with SEV enabled.
> >> Do we then really need this in the first place?
> > Thomas asked for it just to have an indication when those paths are
> > being used because they could potentially break things pretty bad. I
> > think it is useful knowing that those paths are hit (but only once).
> > It makes it pretty easy for me to tell whether bug reports with people
> > who report black screen can be answered with "the kernel needs to be
> > upgraded" ;)
>
> Sounds reasonable, but my expectation was rather that we would print
> something on the device level.
>
> If that's not feasible for whatever reason than printing it once works
> as well of course.
TBH, I think it's pretty convenient to have the drm_info in the TT
just to make sure that when drivers request use_dma_alloc on SEV
systems TT turns decryption on correctly, i.e. it's a nice sanity
check when reading the logs. But if you'd prefer it in the driver I
can move this logic there as well.
z