This is a note to let you know that I've just added the patch titled
xhci: Free the command allocated for setting LPM if we return early
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From f6caea4855553a8b99ba3ec23ecdb5ed8262f26c Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Date: Thu, 30 Mar 2023 17:30:56 +0300
Subject: xhci: Free the command allocated for setting LPM if we return early
The command allocated to set exit latency LPM values need to be freed in
case the command is never queued. This would be the case if there is no
change in exit latency values, or device is missing.
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Link: https://lore.kernel.org/linux-usb/24263902-c9b3-ce29-237b-1c3d6918f4fe@alu.…
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac(a)alu.unizg.hr>
Fixes: 5c2a380a5aa8 ("xhci: Allocate separate command structures for each LPM command")
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20230330143056.1390020-4-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index bdb6dd819a3b..6307bae9cddf 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4442,6 +4442,7 @@ static int __maybe_unused xhci_change_max_exit_latency(struct xhci_hcd *xhci,
if (!virt_dev || max_exit_latency == virt_dev->current_mel) {
spin_unlock_irqrestore(&xhci->lock, flags);
+ xhci_free_command(xhci, command);
return 0;
}
--
2.40.0
The RX macro codec comes on some platforms in two variants - ADSP
and ADSP bypassed - thus the clock-names varies from 3 to 5. The clocks
must vary as well:
sc7280-idp.dtb: codec@3200000: clocks: [[202, 8], [202, 7], [203]] is too short
Fixes: 852fda58d99a ("ASoC: qcom: dt-bindings: Update bindings for clocks in lpass digital codes")
Cc: <stable(a)vger.kernel.org>
Acked-by: Rob Herring <robh(a)kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
---
Documentation/devicetree/bindings/sound/qcom,lpass-rx-macro.yaml | 1 +
1 file changed, 1 insertion(+)
diff --git a/Documentation/devicetree/bindings/sound/qcom,lpass-rx-macro.yaml b/Documentation/devicetree/bindings/sound/qcom,lpass-rx-macro.yaml
index f8972769cc6a..ec4b0ac8ad68 100644
--- a/Documentation/devicetree/bindings/sound/qcom,lpass-rx-macro.yaml
+++ b/Documentation/devicetree/bindings/sound/qcom,lpass-rx-macro.yaml
@@ -28,6 +28,7 @@ properties:
const: 0
clocks:
+ minItems: 3
maxItems: 5
clock-names:
--
2.34.1
I'm announcing the release of the 5.4.239 kernel.
It just fixes up a permission of one file,
tools/testing/selftests/net/fib_tests.sh, if you don't have an issue with this
specific testing file, no need to upgrade.
The updated 5.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Greg Kroah-Hartman (1):
Linux 5.4.239
Rishabh Bhatnagar (1):
selftests: Fix the executable permissions for fib_tests.sh
When upreving llvm I realised that kexec stopped working on my test
platform. This patch fixes it.
To: Eric Biederman <ebiederm(a)xmission.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Philipp Rudo <prudo(a)redhat.com>
Cc: kexec(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
Cc: Ross Zwisler <zwisler(a)google.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v4:
- Add Cc: stable
- Add linker script for x86
- Add a warning when the kernel image has overlapping sections.
- Link to v3: https://lore.kernel.org/r/20230321-kexec_clang16-v3-0-5f016c8d0e87@chromium…
Changes in v3:
- Fix initial value. Thanks Ross!
- Link to v2: https://lore.kernel.org/r/20230321-kexec_clang16-v2-0-d10e5d517869@chromium…
Changes in v2:
- Fix if condition. Thanks Steven!.
- Update Philipp email. Thanks Baoquan.
- Link to v1: https://lore.kernel.org/r/20230321-kexec_clang16-v1-0-a768fc2c7c4d@chromium…
---
Ricardo Ribalda (2):
kexec: Support purgatories with .text.hot sections
x86/purgatory: Add linker script
arch/x86/purgatory/.gitignore | 2 ++
arch/x86/purgatory/Makefile | 20 +++++++++----
arch/x86/purgatory/kexec-purgatory.S | 2 +-
arch/x86/purgatory/purgatory.lds.S | 57 ++++++++++++++++++++++++++++++++++++
kernel/kexec_file.c | 13 +++++++-
5 files changed, 86 insertions(+), 8 deletions(-)
---
base-commit: 17214b70a159c6547df9ae204a6275d983146f6b
change-id: 20230321-kexec_clang16-4510c23d129c
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>
The OSM/EPSS hardware controls the frequency of each CPU cluster based
on requests from the OS and various throttling events in the system.
While throttling is in effect the related dcvs interrupt will be kept
high. The purpose of the code handling this interrupt is to
continuously report the thermal pressure based on the throttled
frequency.
The reasoning for adding QoS control to this mechanism is not entirely
clear, but the introduction of commit 'c4c0efb06f17 ("cpufreq:
qcom-cpufreq-hw: Add cpufreq qos for LMh")' causes the
scaling_max_frequncy to be set to the throttled frequency. On the next
iteration of polling, the throttled frequency is above or equal to the
newly requested frequency, so the polling is stopped.
With cpufreq limiting the max frequency, the hardware no longer report a
throttling state and no further updates to thermal pressure or qos
state are made.
The result of this is that scaling_max_frequency can only go down, and
the system becomes slower and slower every time a thermal throttling
event is reported by the hardware.
Even if the logic could be improved, there is no reason for software to
limit the max freqency in response to the hardware limiting the max
frequency. At best software will follow the reported hardware state, but
typically it will cause slower backoff of the throttling.
This reverts commit c4c0efb06f17fa4a37ad99e7752b18a5405c76dc.
Fixes: c4c0efb06f17 ("cpufreq: qcom-cpufreq-hw: Add cpufreq qos for LMh")
Cc: stable(a)vger.kernel.org
Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Signed-off-by: Bjorn Andersson <quic_bjorande(a)quicinc.com>
---
drivers/cpufreq/qcom-cpufreq-hw.c | 14 --------------
1 file changed, 14 deletions(-)
diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 575a4461c25a..1503d315fa7e 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -14,7 +14,6 @@
#include <linux/of_address.h>
#include <linux/of_platform.h>
#include <linux/pm_opp.h>
-#include <linux/pm_qos.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/units.h>
@@ -60,8 +59,6 @@ struct qcom_cpufreq_data {
struct clk_hw cpu_clk;
bool per_core_dcvs;
-
- struct freq_qos_request throttle_freq_req;
};
static struct {
@@ -351,8 +348,6 @@ static void qcom_lmh_dcvs_notify(struct qcom_cpufreq_data *data)
throttled_freq = freq_hz / HZ_PER_KHZ;
- freq_qos_update_request(&data->throttle_freq_req, throttled_freq);
-
/* Update thermal pressure (the boost frequencies are accepted) */
arch_update_thermal_pressure(policy->related_cpus, throttled_freq);
@@ -445,14 +440,6 @@ static int qcom_cpufreq_hw_lmh_init(struct cpufreq_policy *policy, int index)
if (data->throttle_irq < 0)
return data->throttle_irq;
- ret = freq_qos_add_request(&policy->constraints,
- &data->throttle_freq_req, FREQ_QOS_MAX,
- FREQ_QOS_MAX_DEFAULT_VALUE);
- if (ret < 0) {
- dev_err(&pdev->dev, "Failed to add freq constraint (%d)\n", ret);
- return ret;
- }
-
data->cancel_throttle = false;
data->policy = policy;
@@ -519,7 +506,6 @@ static void qcom_cpufreq_hw_lmh_exit(struct qcom_cpufreq_data *data)
if (data->throttle_irq <= 0)
return;
- freq_qos_remove_request(&data->throttle_freq_req);
free_irq(data->throttle_irq, data);
}
--
2.25.1
Device exclusive page table entries are used to prevent CPU access to
a page whilst it is being accessed from a device. Typically this is
used to implement atomic operations when the underlying bus does not
support atomic access. When a CPU thread encounters a device exclusive
entry it locks the page and restores the original entry after calling
mmu notifiers to signal drivers that exclusive access is no longer
available.
The device exclusive entry holds a reference to the page making it
safe to access the struct page whilst the entry is present. However
the fault handling code does not hold the PTL when taking the page
lock. This means if there are multiple threads faulting concurrently
on the device exclusive entry one will remove the entry whilst others
will wait on the page lock without holding a reference.
This can lead to threads locking or waiting on a page with a zero
refcount. Whilst mmap_lock prevents the pages getting freed via
munmap() they may still be freed by a migration. This leads to
warnings such as PAGE_FLAGS_CHECK_AT_FREE due to the page being locked
when the refcount drops to zero. Note that during removal of the
device exclusive entry the PTE is currently re-checked under the PTL
so no futher bad page accesses occur once it is locked.
Signed-off-by: Alistair Popple <apopple(a)nvidia.com>
Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Cc: stable(a)vger.kernel.org
---
mm/memory.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index 8c8420934d60..b499bd283d8e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3623,8 +3623,19 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
struct vm_area_struct *vma = vmf->vma;
struct mmu_notifier_range range;
- if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags))
+ /*
+ * We need a page reference to lock the page because we don't
+ * hold the PTL so a racing thread can remove the
+ * device-exclusive entry and unmap the page. If the page is
+ * free the entry must have been removed already.
+ */
+ if (!get_page_unless_zero(vmf->page))
+ return 0;
+
+ if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags)) {
+ put_page(vmf->page);
return VM_FAULT_RETRY;
+ }
mmu_notifier_range_init_owner(&range, MMU_NOTIFY_EXCLUSIVE, 0, vma,
vma->vm_mm, vmf->address & PAGE_MASK,
(vmf->address & PAGE_MASK) + PAGE_SIZE, NULL);
@@ -3637,6 +3648,7 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
pte_unmap_unlock(vmf->pte, vmf->ptl);
folio_unlock(folio);
+ put_page(vmf->page);
mmu_notifier_invalidate_range_end(&range);
return 0;
--
2.39.2
The quilt patch titled
Subject: mm: take a page reference when removing device exclusive entries
has been removed from the -mm tree. Its filename was
mm-take-a-page-reference-when-removing-device-exclusive-entries.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Alistair Popple <apopple(a)nvidia.com>
Subject: mm: take a page reference when removing device exclusive entries
Date: Tue, 28 Mar 2023 13:14:34 +1100
Device exclusive page table entries are used to prevent CPU access to a
page whilst it is being accessed from a device. Typically this is used to
implement atomic operations when the underlying bus does not support
atomic access. When a CPU thread encounters a device exclusive entry it
locks the page and restores the original entry after calling mmu notifiers
to signal drivers that exclusive access is no longer available.
The device exclusive entry holds a reference to the page making it safe to
access the struct page whilst the entry is present. However the fault
handling code does not hold the PTL when taking the page lock. This means
if there are multiple threads faulting concurrently on the device
exclusive entry one will remove the entry whilst others will wait on the
page lock without holding a reference.
This can lead to threads locking or waiting on a page with a zero
refcount. Whilst mmap_lock prevents the pages getting freed via munmap()
they may still be freed by a migration. This leads to warnings such as
PAGE_FLAGS_CHECK_AT_FREE due to the page being locked when the refcount
drops to zero. Note that during removal of the device exclusive entry the
PTE is currently re-checked under the PTL so no futher bad page accesses
occur once it is locked.
Link: https://lkml.kernel.org/r/20230328021434.292971-1-apopple@nvidia.com
Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Signed-off-by: Alistair Popple <apopple(a)nvidia.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
--- a/mm/memory.c~mm-take-a-page-reference-when-removing-device-exclusive-entries
+++ a/mm/memory.c
@@ -3563,8 +3563,19 @@ static vm_fault_t remove_device_exclusiv
struct vm_area_struct *vma = vmf->vma;
struct mmu_notifier_range range;
- if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags))
+ /*
+ * We need a page reference to lock the page because we don't hold the
+ * PTL so a racing thread can remove the device-exclusive entry and
+ * unmap the page. If the page is free the entry must have been
+ * removed already.
+ */
+ if (!get_page_unless_zero(vmf->page))
+ return 0;
+
+ if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags)) {
+ put_page(vmf->page);
return VM_FAULT_RETRY;
+ }
mmu_notifier_range_init_owner(&range, MMU_NOTIFY_EXCLUSIVE, 0,
vma->vm_mm, vmf->address & PAGE_MASK,
(vmf->address & PAGE_MASK) + PAGE_SIZE, NULL);
@@ -3577,6 +3588,7 @@ static vm_fault_t remove_device_exclusiv
pte_unmap_unlock(vmf->pte, vmf->ptl);
folio_unlock(folio);
+ put_page(vmf->page);
mmu_notifier_invalidate_range_end(&range);
return 0;
_
Patches currently in -mm which might be from apopple(a)nvidia.com are
The patch titled
Subject: mm: khugepaged: fix kernel BUG in hpage_collapse_scan_file()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-khugepaged-fix-kernel-bug-in-hpage_collapse_scan_file.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ivan Orlov <ivan.orlov0322(a)gmail.com>
Subject: mm: khugepaged: fix kernel BUG in hpage_collapse_scan_file()
Date: Wed, 29 Mar 2023 18:53:30 +0400
Syzkaller reported the following issue:
kernel BUG at mm/khugepaged.c:1823!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 5097 Comm: syz-executor220 Not tainted 6.2.0-syzkaller-13154-g857f1268a591 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023
RIP: 0010:collapse_file mm/khugepaged.c:1823 [inline]
RIP: 0010:hpage_collapse_scan_file+0x67c8/0x7580 mm/khugepaged.c:2233
Code: 00 00 89 de e8 c9 66 a3 ff 31 ff 89 de e8 c0 66 a3 ff 45 84 f6 0f 85 28 0d 00 00 e8 22 64 a3 ff e9 dc f7 ff ff e8 18 64 a3 ff <0f> 0b f3 0f 1e fa e8 0d 64 a3 ff e9 93 f6 ff ff f3 0f 1e fa 4c 89
RSP: 0018:ffffc90003dff4e0 EFLAGS: 00010093
RAX: ffffffff81e95988 RBX: 00000000000001c1 RCX: ffff8880205b3a80
RDX: 0000000000000000 RSI: 00000000000001c0 RDI: 00000000000001c1
RBP: ffffc90003dff830 R08: ffffffff81e90e67 R09: fffffbfff1a433c3
R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000000
R13: ffffc90003dff6c0 R14: 00000000000001c0 R15: 0000000000000000
FS: 00007fdbae5ee700(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdbae6901e0 CR3: 000000007b2dd000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
madvise_collapse+0x721/0xf50 mm/khugepaged.c:2693
madvise_vma_behavior mm/madvise.c:1086 [inline]
madvise_walk_vmas mm/madvise.c:1260 [inline]
do_madvise+0x9e5/0x4680 mm/madvise.c:1439
__do_sys_madvise mm/madvise.c:1452 [inline]
__se_sys_madvise mm/madvise.c:1450 [inline]
__x64_sys_madvise+0xa5/0xb0 mm/madvise.c:1450
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The xas_store() call during page cache scanning can potentially translate
'xas' into the error state (with the reproducer provided by the syzkaller
the error code is -ENOMEM). However, there are no further checks after
the 'xas_store', and the next call of 'xas_next' at the start of the
scanning cycle doesn't increase the xa_index, and the issue occurs.
This patch will add the xarray state error checking after the xas_store()
and the corresponding result error code.
Tested via syzbot.
Link: https://lkml.kernel.org/r/20230329145330.23191-1-ivan.orlov0322@gmail.com
Link: https://syzkaller.appspot.com/bug?id=7d6bb3760e026ece7524500fe44fb024a0e959…
Signed-off-by: Ivan Orlov <ivan.orlov0322(a)gmail.com>
Reported-by: syzbot+9578faa5475acb35fa50(a)syzkaller.appspotmail.com
Cc: Himadri Pandya <himadrispandya(a)gmail.com>
Cc: Ivan Orlov <ivan.orlov0322(a)gmail.com>
Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/mm/khugepaged.c~mm-khugepaged-fix-kernel-bug-in-hpage_collapse_scan_file
+++ a/mm/khugepaged.c
@@ -55,6 +55,7 @@ enum scan_result {
SCAN_CGROUP_CHARGE_FAIL,
SCAN_TRUNCATED,
SCAN_PAGE_HAS_PRIVATE,
+ SCAN_STORE_FAILED,
};
#define CREATE_TRACE_POINTS
@@ -1840,6 +1841,15 @@ static int collapse_file(struct mm_struc
goto xa_locked;
}
xas_store(&xas, hpage);
+ if (xas_error(&xas)) {
+ /* revert shmem_charge performed
+ * in the previous condition
+ */
+ mapping->nrpages--;
+ shmem_uncharge(mapping->host, 1);
+ result = SCAN_STORE_FAILED;
+ goto xa_locked;
+ }
nr_none++;
continue;
}
_
Patches currently in -mm which might be from ivan.orlov0322(a)gmail.com are
mm-khugepaged-fix-kernel-bug-in-hpage_collapse_scan_file.patch
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 97a71c444a147ae41c7d0ab5b3d855d7f762f3ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '167812333979118(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
97a71c444a14 ("KVM: x86: Purge "highest ISR" cache when updating APICv state")
ce0a58f4756c ("KVM: x86: Move "apicv_active" into "struct kvm_lapic"")
d39850f57d21 ("KVM: x86: Drop @vcpu parameter from kvm_x86_ops.hwapic_isr_update()")
47e8eec83262 ("Merge tag 'kvmarm-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 97a71c444a147ae41c7d0ab5b3d855d7f762f3ed Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Fri, 6 Jan 2023 01:12:35 +0000
Subject: [PATCH] KVM: x86: Purge "highest ISR" cache when updating APICv state
Purge the "highest ISR" cache when updating APICv state on a vCPU. The
cache must not be used when APICv is active as hardware may emulate EOIs
(and other operations) without exiting to KVM.
This fixes a bug where KVM will effectively block IRQs in perpetuity due
to the "highest ISR" never getting reset if APICv is activated on a vCPU
while an IRQ is in-service. Hardware emulates the EOI and KVM never gets
a chance to update its cache.
Fixes: b26a695a1d78 ("kvm: lapic: Introduce APICv update helper function")
Cc: stable(a)vger.kernel.org
Cc: Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
Cc: Maxim Levitsky <mlevitsk(a)redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20230106011306.85230-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 5c0f93fc073a..33a661d82da7 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2424,6 +2424,7 @@ void kvm_apic_update_apicv(struct kvm_vcpu *vcpu)
*/
apic->isr_count = count_vectors(apic->regs + APIC_ISR);
}
+ apic->highest_isr_cache = -1;
}
void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
@@ -2479,7 +2480,6 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
kvm_lapic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
}
kvm_apic_update_apicv(vcpu);
- apic->highest_isr_cache = -1;
update_divide_count(apic);
atomic_set(&apic->lapic_timer.pending, 0);
@@ -2767,7 +2767,6 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
__start_apic_timer(apic, APIC_TMCCT);
kvm_lapic_set_reg(apic, APIC_TMCCT, 0);
kvm_apic_update_apicv(vcpu);
- apic->highest_isr_cache = -1;
if (apic->apicv_active) {
static_call_cond(kvm_x86_apicv_post_state_restore)(vcpu);
static_call_cond(kvm_x86_hwapic_irr_update)(vcpu, apic_find_highest_irr(apic));
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x ab52be1b310bcb39e6745d34a8f0e8475d67381a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '167812345411383(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
ab52be1b310b ("KVM: x86: Inject #GP on x2APIC WRMSR that sets reserved bits 63:32")
a57a31684d7b ("KVM: x86: Treat x2APIC's ICR as a 64-bit register, not two 32-bit regs")
5429478d038f ("KVM: x86: Add helpers to handle 64-bit APIC MSR read/writes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ab52be1b310bcb39e6745d34a8f0e8475d67381a Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Sat, 7 Jan 2023 01:10:21 +0000
Subject: [PATCH] KVM: x86: Inject #GP on x2APIC WRMSR that sets reserved bits
63:32
Reject attempts to set bits 63:32 for 32-bit x2APIC registers, i.e. all
x2APIC registers except ICR. Per Intel's SDM:
Non-zero writes (by WRMSR instruction) to reserved bits to these
registers will raise a general protection fault exception
Opportunistically fix a typo in a nearby comment.
Reported-by: Marc Orr <marcorr(a)google.com>
Cc: stable(a)vger.kernel.org
Reviewed-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Link: https://lore.kernel.org/r/20230107011025.565472-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9aca006b2d22..814b65106057 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -3114,13 +3114,17 @@ static int kvm_lapic_msr_read(struct kvm_lapic *apic, u32 reg, u64 *data)
static int kvm_lapic_msr_write(struct kvm_lapic *apic, u32 reg, u64 data)
{
/*
- * ICR is a 64-bit register in x2APIC mode (and Hyper'v PV vAPIC) and
+ * ICR is a 64-bit register in x2APIC mode (and Hyper-V PV vAPIC) and
* can be written as such, all other registers remain accessible only
* through 32-bit reads/writes.
*/
if (reg == APIC_ICR)
return kvm_x2apic_icr_write(apic, data);
+ /* Bits 63:32 are reserved in all other registers. */
+ if (data >> 32)
+ return 1;
+
return kvm_lapic_reg_write(apic, reg, (u32)data);
}
Commit 52f04f10b900 ("thermal: intel: int340x: processor_thermal: Fix
deadlock") addressed deadlock issue during user space trip update. But it
missed a case when thermal zone device is disabled when user writes 0.
Call to thermal_zone_device_disable() also causes deadlock as it also
tries to lock tz->lock, which is already claimed by trip_point_temp_store()
in the thermal core code.
Remove call to thermal_zone_device_disable() in the function
sys_set_trip_temp(), which is called from trip_point_temp_store().
Fixes: 52f04f10b900 ("thermal: intel: int340x: processor_thermal: Fix deadlock")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # 6.2+
---
.../thermal/intel/int340x_thermal/processor_thermal_device_pci.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c
index 90526f46c9b1..d71ee50e7878 100644
--- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c
+++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c
@@ -153,7 +153,6 @@ static int sys_set_trip_temp(struct thermal_zone_device *tzd, int trip, int temp
cancel_delayed_work_sync(&pci_info->work);
proc_thermal_mmio_write(pci_info, PROC_THERMAL_MMIO_INT_ENABLE_0, 0);
proc_thermal_mmio_write(pci_info, PROC_THERMAL_MMIO_THRES_0, 0);
- thermal_zone_device_disable(tzd);
pci_info->stored_thres = 0;
return 0;
}
--
2.39.1
commit 727209376f4998bc84db1d5d8af15afea846a92b upstream.
Commit b041b525dab9 ("x86/split_lock: Make life miserable for split lockers")
changed the way the split lock detector works when in "warn" mode;
basically, it not only shows the warn message, but also intentionally
introduces a slowdown through sleeping plus serialization mechanism
on such task. Based on discussions in [0], seems the warning alone
wasn't enough motivation for userspace developers to fix their
applications.
This slowdown is enough to totally break some proprietary (aka.
unfixable) userspace[1].
Happens that originally the proposal in [0] was to add a new mode
which would warns + slowdown the "split locking" task, keeping the
old warn mode untouched. In the end, that idea was discarded and
the regular/default "warn" mode now slows down the applications. This
is quite aggressive with regards proprietary/legacy programs that
basically are unable to properly run in kernel with this change.
While it is understandable that a malicious application could DoS
by split locking, it seems unacceptable to regress old/proprietary
userspace programs through a default configuration that previously
worked. An example of such breakage was reported in [1].
Add a sysctl to allow controlling the "misery mode" behavior, as per
Thomas suggestion on [2]. This way, users running legacy and/or
proprietary software are allowed to still execute them with a decent
performance while still observing the warning messages on kernel log.
[0] https://lore.kernel.org/lkml/20220217012721.9694-1-tony.luck@intel.com/
[1] https://github.com/doitsujin/dxvk/issues/2938
[2] https://lore.kernel.org/lkml/87pmf4bter.ffs@tglx/
[ dhansen: minor changelog tweaks, including clarifying the actual
problem ]
Fixes: b041b525dab9 ("x86/split_lock: Make life miserable for split lockers")
Suggested-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Tony Luck <tony.luck(a)intel.com>
Tested-by: Andre Almeida <andrealmeid(a)igalia.com>
Link: https://lore.kernel.org/all/20221024200254.635256-1-gpiccoli%40igalia.com
---
Hi folks, I've build tested this on both 6.0.13 and 6.1, worked fine. The
split lock detector code changed almost nothing since 6.0, so that makes
sense...
I think this is important to have in stable, some gaming community members
seems excited with that, it'll help with general proprietary software
(that is basically unfixable), making them run smoothly on 6.0.y and 6.1.y.
I've CCed some folks more than just the stable list, to gather more
opinions on that, so apologies if you received this email but think
that you shouldn't have.
Thanks in advance,
Guilherme
Documentation/admin-guide/sysctl/kernel.rst | 23 ++++++++
arch/x86/kernel/cpu/intel.c | 63 +++++++++++++++++----
2 files changed, 76 insertions(+), 10 deletions(-)
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 98d1b198b2b4..c2c64c1b706f 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -1314,6 +1314,29 @@ watchdog work to be queued by the watchdog timer function, otherwise the NMI
watchdog — if enabled — can detect a hard lockup condition.
+split_lock_mitigate (x86 only)
+==============================
+
+On x86, each "split lock" imposes a system-wide performance penalty. On larger
+systems, large numbers of split locks from unprivileged users can result in
+denials of service to well-behaved and potentially more important users.
+
+The kernel mitigates these bad users by detecting split locks and imposing
+penalties: forcing them to wait and only allowing one core to execute split
+locks at a time.
+
+These mitigations can make those bad applications unbearably slow. Setting
+split_lock_mitigate=0 may restore some application performance, but will also
+increase system exposure to denial of service attacks from split lock users.
+
+= ===================================================================
+0 Disable the mitigation mode - just warns the split lock on kernel log
+ and exposes the system to denials of service from the split lockers.
+1 Enable the mitigation mode (this is the default) - penalizes the split
+ lockers with intentional performance degradation.
+= ===================================================================
+
+
stack_erasing
=============
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 2d7ea5480ec3..427899650483 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1034,8 +1034,32 @@ static const struct {
static struct ratelimit_state bld_ratelimit;
+static unsigned int sysctl_sld_mitigate = 1;
static DEFINE_SEMAPHORE(buslock_sem);
+#ifdef CONFIG_PROC_SYSCTL
+static struct ctl_table sld_sysctls[] = {
+ {
+ .procname = "split_lock_mitigate",
+ .data = &sysctl_sld_mitigate,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_douintvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {}
+};
+
+static int __init sld_mitigate_sysctl_init(void)
+{
+ register_sysctl_init("kernel", sld_sysctls);
+ return 0;
+}
+
+late_initcall(sld_mitigate_sysctl_init);
+#endif
+
static inline bool match_option(const char *arg, int arglen, const char *opt)
{
int len = strlen(opt), ratelimit;
@@ -1146,12 +1170,20 @@ static void split_lock_init(void)
split_lock_verify_msr(sld_state != sld_off);
}
-static void __split_lock_reenable(struct work_struct *work)
+static void __split_lock_reenable_unlock(struct work_struct *work)
{
sld_update_msr(true);
up(&buslock_sem);
}
+static DECLARE_DELAYED_WORK(sl_reenable_unlock, __split_lock_reenable_unlock);
+
+static void __split_lock_reenable(struct work_struct *work)
+{
+ sld_update_msr(true);
+}
+static DECLARE_DELAYED_WORK(sl_reenable, __split_lock_reenable);
+
/*
* If a CPU goes offline with pending delayed work to re-enable split lock
* detection then the delayed work will be executed on some other CPU. That
@@ -1169,10 +1201,9 @@ static int splitlock_cpu_offline(unsigned int cpu)
return 0;
}
-static DECLARE_DELAYED_WORK(split_lock_reenable, __split_lock_reenable);
-
static void split_lock_warn(unsigned long ip)
{
+ struct delayed_work *work;
int cpu;
if (!current->reported_split_lock)
@@ -1180,14 +1211,26 @@ static void split_lock_warn(unsigned long ip)
current->comm, current->pid, ip);
current->reported_split_lock = 1;
- /* misery factor #1, sleep 10ms before trying to execute split lock */
- if (msleep_interruptible(10) > 0)
- return;
- /* Misery factor #2, only allow one buslocked disabled core at a time */
- if (down_interruptible(&buslock_sem) == -EINTR)
- return;
+ if (sysctl_sld_mitigate) {
+ /*
+ * misery factor #1:
+ * sleep 10ms before trying to execute split lock.
+ */
+ if (msleep_interruptible(10) > 0)
+ return;
+ /*
+ * Misery factor #2:
+ * only allow one buslocked disabled core at a time.
+ */
+ if (down_interruptible(&buslock_sem) == -EINTR)
+ return;
+ work = &sl_reenable_unlock;
+ } else {
+ work = &sl_reenable;
+ }
+
cpu = get_cpu();
- schedule_delayed_work_on(cpu, &split_lock_reenable, 2);
+ schedule_delayed_work_on(cpu, work, 2);
/* Disable split lock detection on this CPU to make progress */
sld_update_msr(false);
--
2.38.1
From: Guennadi Liakhovetski <guennadi.liakhovetski(a)linux.intel.com>
If an IPC4 topology contains an unsupported widget, its .module_info
field won't be set, then sof_ipc4_route_setup() will cause a kernel
Oops trying to dereference it. Add a check for such cases.
Cc: stable(a)vger.kernel.org # 6.2
Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski(a)linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi(a)linux.intel.com>
---
Hi Mark,
This patch is generated on top of 6.3-rc4, it will have conflict with asoc-next
because we have ChainDMA scheduled for 6.4 in there.
I should have taken this patch a faster track, but missed it when arranging the
patches, features.
We noticed this when trying to use our development IPC4 topologies with mainline
which does not yet able to handle the process module types (slated fro 6.4).
IPC4 is still evolving so it is not rare that fw/tplg/kernel needs to be
lock-stepped, but NULL pointer dereference should not happen.
This is how the merge conflict resolution should end up between 6.3 and 6.4:
int ret;
/* no route set up if chain DMA is used */
if (src_pipeline->use_chain_dma || sink_pipeline->use_chain_dma) {
if (!src_pipeline->use_chain_dma || !sink_pipeline->use_chain_dma) {
dev_err(sdev->dev,
"use_chain_dma must be set for both src %s and sink %s pipelines\n",
src_widget->widget->name, sink_widget->widget->name);
return -EINVAL;
}
return 0;
}
if (!src_fw_module || !sink_fw_module) {
/* The NULL module will print as "(efault)" */
dev_err(sdev->dev, "source %s or sink %s widget weren't set up properly\n",
src_fw_module->man4_module_entry.name,
sink_fw_module->man4_module_entry.name);
return -ENODEV;
}
sroute->src_queue_id = sof_ipc4_get_queue_id(src_widget, sink_widget,
SOF_PIN_TYPE_SOURCE);
Can you send this patch for 6.3 cycle?
Thank you,
Peter
sound/soc/sof/ipc4-topology.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c
index a623707c8ffc..669b99a4f76e 100644
--- a/sound/soc/sof/ipc4-topology.c
+++ b/sound/soc/sof/ipc4-topology.c
@@ -1805,6 +1805,14 @@ static int sof_ipc4_route_setup(struct snd_sof_dev *sdev, struct snd_sof_route *
u32 header, extension;
int ret;
+ if (!src_fw_module || !sink_fw_module) {
+ /* The NULL module will print as "(efault)" */
+ dev_err(sdev->dev, "source %s or sink %s widget weren't set up properly\n",
+ src_fw_module->man4_module_entry.name,
+ sink_fw_module->man4_module_entry.name);
+ return -ENODEV;
+ }
+
sroute->src_queue_id = sof_ipc4_get_queue_id(src_widget, sink_widget,
SOF_PIN_TYPE_SOURCE);
if (sroute->src_queue_id < 0) {
--
2.40.0
We got a WARNING in ext4_add_complete_io:
==================================================================
WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250
CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85
RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4]
[...]
Call Trace:
<TASK>
ext4_end_bio+0xa8/0x240 [ext4]
bio_endio+0x195/0x310
blk_update_request+0x184/0x770
scsi_end_request+0x2f/0x240
scsi_io_completion+0x75/0x450
scsi_finish_command+0xef/0x160
scsi_complete+0xa3/0x180
blk_complete_reqs+0x60/0x80
blk_done_softirq+0x25/0x40
__do_softirq+0x119/0x4c8
run_ksoftirqd+0x42/0x70
smpboot_thread_fn+0x136/0x3c0
kthread+0x140/0x1a0
ret_from_fork+0x2c/0x50
==================================================================
Above issue may happen as follows:
cpu1 cpu2
----------------------------|----------------------------
mount -o dioread_lock
ext4_writepages
ext4_do_writepages
*if (ext4_should_dioread_nolock(inode))*
// rsv_blocks is not assigned here
mount -o remount,dioread_nolock
ext4_journal_start_with_reserve
__ext4_journal_start
__ext4_journal_start_sb
jbd2__journal_start
*if (rsv_blocks)*
// h_rsv_handle is not initialized here
mpage_map_and_submit_extent
mpage_map_one_extent
dioread_nolock = ext4_should_dioread_nolock(inode)
if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN))
mpd->io_submit.io_end->handle = handle->h_rsv_handle
ext4_set_io_unwritten_flag
io_end->flag |= EXT4_IO_END_UNWRITTEN
// now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag
scsi_finish_command
scsi_io_completion
scsi_io_completion_action
scsi_end_request
blk_update_request
req_bio_endio
bio_endio
bio->bi_end_io > ext4_end_bio
ext4_put_io_end_defer
ext4_add_complete_io
// trigger WARN_ON(!io_end->handle && sbi->s_journal);
The immediate cause of this problem is that ext4_should_dioread_nolock()
function returns inconsistent values in the ext4_do_writepages() and
mpage_map_one_extent(). There are four conditions in this function that
can be changed at mount time to cause this problem. These four conditions
can be divided into two categories:
(1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl
(2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount
The two in the first category have been fixed by commit c8585c6fcaf2
("ext4: fix races between changing inode journal mode and ext4_writepages")
and commit cb85f4d23f79 ("ext4: fix race between writepages and enabling
EXT4_EXTENTS_FL") respectively.
Two cases in the other category have not yet been fixed, and the above
issue is caused by this situation. We refer to the fix for the first
category, when applying options during remount, we grab s_writepages_rwsem
to avoid racing with writepages ops to trigger this problem.
Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io")
Cc: stable(a)vger.kernel.org
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
---
V1->V2:
Grab s_writepages_rwsem unconditionally during remount.
Remove patches 1,2 that are no longer needed.
V2->V3:
Also grab s_writepages_rwsem when restoring options.
fs/ext4/ext4.h | 3 ++-
fs/ext4/super.c | 12 ++++++++++++
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9b2cfc32cf78..5f5ee0c20673 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1703,7 +1703,8 @@ struct ext4_sb_info {
/*
* Barrier between writepages ops and changing any inode's JOURNAL_DATA
- * or EXTENTS flag.
+ * or EXTENTS flag or between writepages ops and changing DELALLOC or
+ * DIOREAD_NOLOCK mount options on remount.
*/
struct percpu_rw_semaphore s_writepages_rwsem;
struct dax_device *s_daxdev;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index e6d84c1e34a4..8396da483c17 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6403,7 +6403,16 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
}
+ /*
+ * Changing the DIOREAD_NOLOCK or DELALLOC mount options may cause
+ * two calls to ext4_should_dioread_nolock() to return inconsistent
+ * values, triggering WARN_ON in ext4_add_complete_io(). we grab
+ * here s_writepages_rwsem to avoid race between writepages ops and
+ * remount.
+ */
+ percpu_down_write(&sbi->s_writepages_rwsem);
ext4_apply_options(fc, sb);
+ percpu_up_write(&sbi->s_writepages_rwsem);
if ((old_opts.s_mount_opt & EXT4_MOUNT_JOURNAL_CHECKSUM) ^
test_opt(sb, JOURNAL_CHECKSUM)) {
@@ -6614,6 +6623,7 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
return 0;
restore_opts:
+ percpu_down_write(&sbi->s_writepages_rwsem);
sb->s_flags = old_sb_flags;
sbi->s_mount_opt = old_opts.s_mount_opt;
sbi->s_mount_opt2 = old_opts.s_mount_opt2;
@@ -6622,6 +6632,8 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
sbi->s_commit_interval = old_opts.s_commit_interval;
sbi->s_min_batch_time = old_opts.s_min_batch_time;
sbi->s_max_batch_time = old_opts.s_max_batch_time;
+ percpu_up_write(&sbi->s_writepages_rwsem);
+
if (!test_opt(sb, BLOCK_VALIDITY) && sbi->s_system_blks)
ext4_release_system_zone(sb);
#ifdef CONFIG_QUOTA
--
2.31.1
The patch below does not apply to the 6.2-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.2.y
git checkout FETCH_HEAD
git cherry-pick -x 88b170088ad2c3e27086fe35769aa49f8a512564
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680003625145213(a)kroah.com' --subject-prefix 'PATCH 6.2.y' HEAD^..
Possible dependencies:
88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
aa7f243f32e1 ("zonefs: Separate zone information from inode information")
34422914dc00 ("zonefs: Reduce struct zonefs_inode_info size")
46a9c526eef7 ("zonefs: Simplify IO error handling")
4008e2a0b01a ("zonefs: Reorganize code")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 88b170088ad2c3e27086fe35769aa49f8a512564 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Mon, 20 Mar 2023 22:49:15 +0900
Subject: [PATCH] zonefs: Fix error message in zonefs_file_dio_append()
Since the expected write location in a sequential file is always at the
end of the file (append write), when an invalid write append location is
detected in zonefs_file_dio_append(), print the invalid written location
instead of the expected write location.
Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani(a)oracle.com>
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index a545a6d9a32e..617e4f9db42e 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -426,7 +426,7 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
if (bio->bi_iter.bi_sector != wpsector) {
zonefs_warn(inode->i_sb,
"Corrupted write pointer %llu for zone at %llu\n",
- wpsector, z->z_sector);
+ bio->bi_iter.bi_sector, z->z_sector);
ret = -EIO;
}
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 88b170088ad2c3e27086fe35769aa49f8a512564
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680003630102245(a)kroah.com' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
aa7f243f32e1 ("zonefs: Separate zone information from inode information")
34422914dc00 ("zonefs: Reduce struct zonefs_inode_info size")
46a9c526eef7 ("zonefs: Simplify IO error handling")
4008e2a0b01a ("zonefs: Reorganize code")
a608da3bd730 ("zonefs: Detect append writes at invalid locations")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 88b170088ad2c3e27086fe35769aa49f8a512564 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Mon, 20 Mar 2023 22:49:15 +0900
Subject: [PATCH] zonefs: Fix error message in zonefs_file_dio_append()
Since the expected write location in a sequential file is always at the
end of the file (append write), when an invalid write append location is
detected in zonefs_file_dio_append(), print the invalid written location
instead of the expected write location.
Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani(a)oracle.com>
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index a545a6d9a32e..617e4f9db42e 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -426,7 +426,7 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
if (bio->bi_iter.bi_sector != wpsector) {
zonefs_warn(inode->i_sb,
"Corrupted write pointer %llu for zone at %llu\n",
- wpsector, z->z_sector);
+ bio->bi_iter.bi_sector, z->z_sector);
ret = -EIO;
}
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 88b170088ad2c3e27086fe35769aa49f8a512564
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680003631103149(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
aa7f243f32e1 ("zonefs: Separate zone information from inode information")
34422914dc00 ("zonefs: Reduce struct zonefs_inode_info size")
46a9c526eef7 ("zonefs: Simplify IO error handling")
4008e2a0b01a ("zonefs: Reorganize code")
a608da3bd730 ("zonefs: Detect append writes at invalid locations")
db58653ce0c7 ("zonefs: Fix active zone accounting")
7dd12d65ac64 ("zonefs: fix zone report size in __zonefs_io_error()")
8745889a7fd0 ("Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 88b170088ad2c3e27086fe35769aa49f8a512564 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Mon, 20 Mar 2023 22:49:15 +0900
Subject: [PATCH] zonefs: Fix error message in zonefs_file_dio_append()
Since the expected write location in a sequential file is always at the
end of the file (append write), when an invalid write append location is
detected in zonefs_file_dio_append(), print the invalid written location
instead of the expected write location.
Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani(a)oracle.com>
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index a545a6d9a32e..617e4f9db42e 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -426,7 +426,7 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
if (bio->bi_iter.bi_sector != wpsector) {
zonefs_warn(inode->i_sb,
"Corrupted write pointer %llu for zone at %llu\n",
- wpsector, z->z_sector);
+ bio->bi_iter.bi_sector, z->z_sector);
ret = -EIO;
}
}
SCI IP on RZ/G2L alike SoCs do not need regshift compared to other SCI
IPs on the SH platform. Currently, it does regshift and configuring Rx
wrongly. Drop adding regshift for RZ/G2L alike SoCs.
Fixes: dfc80387aefb ("serial: sh-sci: Compute the regshift value for SCI ports")
Cc: stable(a)vger.kernel.org
Signed-off-by: Biju Das <biju.das.jz(a)bp.renesas.com>
---
v3->v4:
* Updated the fixes tag
* Replaced sci_port->is_rz_sci with dev->dev.of_node as regshift are only needed
for sh770x/sh7750/sh7760, which don't use DT yet.
* Dropped is_rz_sci variable from struct sci_port.
v3:
* New patch.
---
drivers/tty/serial/sh-sci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index 616041faab55..15954ca3e9dc 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -2937,7 +2937,7 @@ static int sci_init_single(struct platform_device *dev,
port->flags = UPF_FIXED_PORT | UPF_BOOT_AUTOCONF | p->flags;
port->fifosize = sci_port->params->fifosize;
- if (port->type == PORT_SCI) {
+ if (port->type == PORT_SCI && !dev->dev.of_node) {
if (sci_port->reg_size >= 0x20)
port->regshift = 2;
else
--
2.25.1
How are you
I want to inform you that I have succeeded in transferring
the huge amount of funds under the cooperation of the new
partner from London and I have written a Bank Draft of $1.9M for
you.
Have you received it? In-case you have not, Contact Mr.
David Lawrence And Ask him for the Bank draft which I kept
for Compensation okay. His email address
(davidllawrence@consultant.com)Phone+:+1(945)212-0126
Mrs. Ester Nelson Philipsxxxxx
Currently, with VHE, KVM enables the EL0 event counting for the
guest on vcpu_load() or KVM enables it as a part of the PMU
register emulation process, when needed. However, in the migration
case (with VHE), the same handling is lacking. So, enable it on the
first KVM_RUN with VHE (after the migration) when needed.
Fixes: d0c94c49792c ("KVM: arm64: Restore PMU configuration on first run")
Cc: stable(a)vger.kernel.org
Signed-off-by: Reiji Watanabe <reijiw(a)google.com>
---
arch/arm64/kvm/pmu-emul.c | 1 +
arch/arm64/kvm/sys_regs.c | 1 -
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index c243b10f3e15..5eca0cdd961d 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -558,6 +558,7 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
for_each_set_bit(i, &mask, 32)
kvm_pmu_set_pmc_value(kvm_vcpu_idx_to_pmc(vcpu, i), 0, true);
}
+ kvm_vcpu_pmu_restore_guest(vcpu);
}
static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 1b2c161120be..34688918c811 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -794,7 +794,6 @@ static bool access_pmcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
if (!kvm_supports_32bit_el0())
val |= ARMV8_PMU_PMCR_LC;
kvm_pmu_handle_pmcr(vcpu, val);
- kvm_vcpu_pmu_restore_guest(vcpu);
} else {
/* PMCR.P & PMCR.C are RAZ */
val = __vcpu_sys_reg(vcpu, PMCR_EL0)
--
2.40.0.348.gf938b09366-goog
The quilt patch titled
Subject: mm: kfence: fix handling discontiguous page
has been removed from the -mm tree. Its filename was
mm-kfence-fix-handling-discontiguous-page.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: kfence: fix handling discontiguous page
Date: Thu, 23 Mar 2023 10:50:03 +0800
The struct pages could be discontiguous when the kfence pool is allocated
via alloc_contig_pages() with CONFIG_SPARSEMEM and
!CONFIG_SPARSEMEM_VMEMMAP.
This may result in setting PG_slab and memcg_data to a arbitrary
address (may be not used as a struct page), which in the worst case
might corrupt the kernel.
So the iteration should use nth_page().
Link: https://lkml.kernel.org/r/20230323025003.94447-1-songmuchun@bytedance.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Marco Elver <elver(a)google.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: SeongJae Park <sjpark(a)amazon.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kfence/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/kfence/core.c~mm-kfence-fix-handling-discontiguous-page
+++ a/mm/kfence/core.c
@@ -556,7 +556,7 @@ static unsigned long kfence_init_pool(vo
* enters __slab_free() slow-path.
*/
for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) {
- struct slab *slab = page_slab(&pages[i]);
+ struct slab *slab = page_slab(nth_page(pages, i));
if (!i || (i % 2))
continue;
@@ -602,7 +602,7 @@ static unsigned long kfence_init_pool(vo
reset_slab:
for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) {
- struct slab *slab = page_slab(&pages[i]);
+ struct slab *slab = page_slab(nth_page(pages, i));
if (!i || (i % 2))
continue;
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlb_vmemmap-simplify-hugetlb_vmemmap_init-a-bit.patch
The quilt patch titled
Subject: mm: kfence: fix PG_slab and memcg_data clearing
has been removed from the -mm tree. Its filename was
mm-kfence-fix-pg_slab-and-memcg_data-clearing.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: kfence: fix PG_slab and memcg_data clearing
Date: Mon, 20 Mar 2023 11:00:59 +0800
It does not reset PG_slab and memcg_data when KFENCE fails to initialize
kfence pool at runtime. It is reporting a "Bad page state" message when
kfence pool is freed to buddy. The checking of whether it is a compound
head page seems unnecessary since we already guarantee this when
allocating kfence pool. Remove the check to simplify the code.
Link: https://lkml.kernel.org/r/20230320030059.20189-1-songmuchun@bytedance.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: SeongJae Park <sjpark(a)amazon.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kfence/core.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
--- a/mm/kfence/core.c~mm-kfence-fix-pg_slab-and-memcg_data-clearing
+++ a/mm/kfence/core.c
@@ -561,10 +561,6 @@ static unsigned long kfence_init_pool(vo
if (!i || (i % 2))
continue;
- /* Verify we do not have a compound head page. */
- if (WARN_ON(compound_head(&pages[i]) != &pages[i]))
- return addr;
-
__folio_set_slab(slab_folio(slab));
#ifdef CONFIG_MEMCG
slab->memcg_data = (unsigned long)&kfence_metadata[i / 2 - 1].objcg |
@@ -597,12 +593,26 @@ static unsigned long kfence_init_pool(vo
/* Protect the right redzone. */
if (unlikely(!kfence_protect(addr + PAGE_SIZE)))
- return addr;
+ goto reset_slab;
addr += 2 * PAGE_SIZE;
}
return 0;
+
+reset_slab:
+ for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) {
+ struct slab *slab = page_slab(&pages[i]);
+
+ if (!i || (i % 2))
+ continue;
+#ifdef CONFIG_MEMCG
+ slab->memcg_data = 0;
+#endif
+ __folio_clear_slab(slab_folio(slab));
+ }
+
+ return addr;
}
static bool __init kfence_init_pool_early(void)
@@ -632,16 +642,6 @@ static bool __init kfence_init_pool_earl
* fails for the first page, and therefore expect addr==__kfence_pool in
* most failure cases.
*/
- for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
- struct slab *slab = virt_to_slab(p);
-
- if (!slab)
- continue;
-#ifdef CONFIG_MEMCG
- slab->memcg_data = 0;
-#endif
- __folio_clear_slab(slab_folio(slab));
- }
memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
__kfence_pool = NULL;
return false;
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlb_vmemmap-simplify-hugetlb_vmemmap_init-a-bit.patch
The quilt patch titled
Subject: fsdax: dedupe should compare the min of two iters' length
has been removed from the -mm tree. Its filename was
fsdax-dedupe-should-compare-the-min-of-two-iters-length.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Shiyang Ruan <ruansy.fnst(a)fujitsu.com>
Subject: fsdax: dedupe should compare the min of two iters' length
Date: Wed, 22 Mar 2023 07:25:58 +0000
In an dedupe comparison iter loop, the length of iomap_iter decreases
because it implies the remaining length after each iteration.
The dedupe command will fail with -EIO if the range is larger than one
page size and not aligned to the page size. Also report warning in dmesg:
[ 4338.498374] ------------[ cut here ]------------
[ 4338.498689] WARNING: CPU: 3 PID: 1415645 at fs/iomap/iter.c:16
...
The compare function should use the min length of the current iters,
not the total length.
Link: https://lkml.kernel.org/r/1679469958-2-1-git-send-email-ruansy.fnst@fujitsu…
Fixes: 0e79e3736d54 ("fsdax: dedupe: iter two files at the same time")
Signed-off-by: Shiyang Ruan <ruansy.fnst(a)fujitsu.com>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/dax.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/dax.c~fsdax-dedupe-should-compare-the-min-of-two-iters-length
+++ a/fs/dax.c
@@ -2027,8 +2027,8 @@ int dax_dedupe_file_range_compare(struct
while ((ret = iomap_iter(&src_iter, ops)) > 0 &&
(ret = iomap_iter(&dst_iter, ops)) > 0) {
- compared = dax_range_compare_iter(&src_iter, &dst_iter, len,
- same);
+ compared = dax_range_compare_iter(&src_iter, &dst_iter,
+ min(src_iter.len, dst_iter.len), same);
if (compared < 0)
return ret;
src_iter.processed = dst_iter.processed = compared;
_
Patches currently in -mm which might be from ruansy.fnst(a)fujitsu.com are
fsdax-force-clear-dirty-mark-if-cow.patch
The quilt patch titled
Subject: fsdax: unshare: zero destination if srcmap is HOLE or UNWRITTEN
has been removed from the -mm tree. Its filename was
fsdax-unshare-zero-destination-if-srcmap-is-hole-or-unwritten.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Shiyang Ruan <ruansy.fnst(a)fujitsu.com>
Subject: fsdax: unshare: zero destination if srcmap is HOLE or UNWRITTEN
Date: Wed, 22 Mar 2023 11:11:09 +0000
unshare copies data from source to destination. But if the source is
HOLE or UNWRITTEN extents, we should zero the destination, otherwise
the HOLE or UNWRITTEN part will be user-visible old data of the new
allocated extent.
Found by running generic/649 while mounting with -o dax=always on pmem.
Link: https://lkml.kernel.org/r/1679483469-2-1-git-send-email-ruansy.fnst@fujitsu…
Fixes: d984648e428b ("fsdax,xfs: port unshare to fsdax")
Signed-off-by: Shiyang Ruan <ruansy.fnst(a)fujitsu.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Darrick J. Wong <djwong(a)kernel.org>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/dax.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
--- a/fs/dax.c~fsdax-unshare-zero-destination-if-srcmap-is-hole-or-unwritten
+++ a/fs/dax.c
@@ -1258,15 +1258,20 @@ static s64 dax_unshare_iter(struct iomap
/* don't bother with blocks that are not shared to start with */
if (!(iomap->flags & IOMAP_F_SHARED))
return length;
- /* don't bother with holes or unwritten extents */
- if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN)
- return length;
id = dax_read_lock();
ret = dax_iomap_direct_access(iomap, pos, length, &daddr, NULL);
if (ret < 0)
goto out_unlock;
+ /* zero the distance if srcmap is HOLE or UNWRITTEN */
+ if (srcmap->flags & IOMAP_F_SHARED || srcmap->type == IOMAP_UNWRITTEN) {
+ memset(daddr, 0, length);
+ dax_flush(iomap->dax_dev, daddr, length);
+ ret = length;
+ goto out_unlock;
+ }
+
ret = dax_iomap_direct_access(srcmap, pos, length, &saddr, NULL);
if (ret < 0)
goto out_unlock;
_
Patches currently in -mm which might be from ruansy.fnst(a)fujitsu.com are
fsdax-force-clear-dirty-mark-if-cow.patch
On Tue, Mar 28, 2023 at 05:36:27PM +0800, Min Li wrote:
> Userspace can guess the id value and try to race oa_config object creation
> with config remove, resulting in a use-after-free if we dereference the
> object after unlocking the metrics_lock. For that reason, unlocking the
> metrics_lock must be done after we are done dereferencing the object.
>
> Signed-off-by: Min Li <lm0963hack(a)gmail.com>
I think we should also add
Fixes: f89823c21224 ("drm/i915/perf: Implement I915_PERF_ADD/REMOVE_CONFIG interface")
Cc: <stable(a)vger.kernel.org> # v4.14+
Andi
> ---
> drivers/gpu/drm/i915/i915_perf.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 824a34ec0b83..93748ca2c5da 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -4634,13 +4634,13 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
> err = oa_config->id;
> goto sysfs_err;
> }
> -
> - mutex_unlock(&perf->metrics_lock);
> + id = oa_config->id;
>
> drm_dbg(&perf->i915->drm,
> "Added config %s id=%i\n", oa_config->uuid, oa_config->id);
> + mutex_unlock(&perf->metrics_lock);
>
> - return oa_config->id;
> + return id;
>
> sysfs_err:
> mutex_unlock(&perf->metrics_lock);
> --
> 2.25.1
The patch titled
Subject: mm: take a page reference when removing device exclusive entries
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-take-a-page-reference-when-removing-device-exclusive-entries.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Alistair Popple <apopple(a)nvidia.com>
Subject: mm: take a page reference when removing device exclusive entries
Date: Tue, 28 Mar 2023 13:14:34 +1100
Device exclusive page table entries are used to prevent CPU access to a
page whilst it is being accessed from a device. Typically this is used to
implement atomic operations when the underlying bus does not support
atomic access. When a CPU thread encounters a device exclusive entry it
locks the page and restores the original entry after calling mmu notifiers
to signal drivers that exclusive access is no longer available.
The device exclusive entry holds a reference to the page making it safe to
access the struct page whilst the entry is present. However the fault
handling code does not hold the PTL when taking the page lock. This means
if there are multiple threads faulting concurrently on the device
exclusive entry one will remove the entry whilst others will wait on the
page lock without holding a reference.
This can lead to threads locking or waiting on a page with a zero
refcount. Whilst mmap_lock prevents the pages getting freed via munmap()
they may still be freed by a migration. This leads to warnings such as
PAGE_FLAGS_CHECK_AT_FREE due to the page being locked when the refcount
drops to zero. Note that during removal of the device exclusive entry the
PTE is currently re-checked under the PTL so no futher bad page accesses
occur once it is locked.
Link: https://lkml.kernel.org/r/20230328021434.292971-1-apopple@nvidia.com
Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Signed-off-by: Alistair Popple <apopple(a)nvidia.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
--- a/mm/memory.c~mm-take-a-page-reference-when-removing-device-exclusive-entries
+++ a/mm/memory.c
@@ -3563,8 +3563,19 @@ static vm_fault_t remove_device_exclusiv
struct vm_area_struct *vma = vmf->vma;
struct mmu_notifier_range range;
- if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags))
+ /*
+ * We need a page reference to lock the page because we don't hold the
+ * PTL so a racing thread can remove the device-exclusive entry and
+ * unmap the page. If the page is free the entry must have been
+ * removed already.
+ */
+ if (!get_page_unless_zero(vmf->page))
+ return 0;
+
+ if (!folio_lock_or_retry(folio, vma->vm_mm, vmf->flags)) {
+ put_page(vmf->page);
return VM_FAULT_RETRY;
+ }
mmu_notifier_range_init_owner(&range, MMU_NOTIFY_EXCLUSIVE, 0,
vma->vm_mm, vmf->address & PAGE_MASK,
(vmf->address & PAGE_MASK) + PAGE_SIZE, NULL);
@@ -3577,6 +3588,7 @@ static vm_fault_t remove_device_exclusiv
pte_unmap_unlock(vmf->pte, vmf->ptl);
folio_unlock(folio);
+ put_page(vmf->page);
mmu_notifier_invalidate_range_end(&range);
return 0;
_
Patches currently in -mm which might be from apopple(a)nvidia.com are
mm-take-a-page-reference-when-removing-device-exclusive-entries.patch
From: Eric Biggers <ebiggers(a)google.com>
commit a075bacde257f755bea0e53400c9f1cdd1b8e8e6 upstream.
[Please apply to 5.10-stable and 5.4-stable.]
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing
performance problems and is hindering adoption of fsverity. It was
intended to solve a race condition where unverified pages might be left
in the pagecache. But actually it doesn't solve it fully.
Since the incomplete solution for this race condition has too much
performance impact for it to be worth it, let's remove it for now.
Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl")
Cc: stable(a)vger.kernel.org
Reviewed-by: Victor Hsieh <victorhsieh(a)google.com>
Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
fs/verity/enable.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index 734862e608fd3..5ceae66e1ae02 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -391,25 +391,27 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg)
goto out_drop_write;
err = enable_verity(filp, &arg);
- if (err)
- goto out_allow_write_access;
/*
- * Some pages of the file may have been evicted from pagecache after
- * being used in the Merkle tree construction, then read into pagecache
- * again by another process reading from the file concurrently. Since
- * these pages didn't undergo verification against the file measurement
- * which fs-verity now claims to be enforcing, we have to wipe the
- * pagecache to ensure that all future reads are verified.
+ * We no longer drop the inode's pagecache after enabling verity. This
+ * used to be done to try to avoid a race condition where pages could be
+ * evicted after being used in the Merkle tree construction, then
+ * re-instantiated by a concurrent read. Such pages are unverified, and
+ * the backing storage could have filled them with different content, so
+ * they shouldn't be used to fulfill reads once verity is enabled.
+ *
+ * But, dropping the pagecache has a big performance impact, and it
+ * doesn't fully solve the race condition anyway. So for those reasons,
+ * and also because this race condition isn't very important relatively
+ * speaking (especially for small-ish files, where the chance of a page
+ * being used, evicted, *and* re-instantiated all while enabling verity
+ * is quite small), we no longer drop the inode's pagecache.
*/
- filemap_write_and_wait(inode->i_mapping);
- invalidate_inode_pages2(inode->i_mapping);
/*
* allow_write_access() is needed to pair with deny_write_access().
* Regardless, the filesystem won't allow writing to verity files.
*/
-out_allow_write_access:
allow_write_access(filp);
out_drop_write:
mnt_drop_write_file(filp);
--
2.40.0
From: Eric Biggers <ebiggers(a)google.com>
commit a075bacde257f755bea0e53400c9f1cdd1b8e8e6 upstream.
[Please apply to 6.1-stable and 5.15-stable.]
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing
performance problems and is hindering adoption of fsverity. It was
intended to solve a race condition where unverified pages might be left
in the pagecache. But actually it doesn't solve it fully.
Since the incomplete solution for this race condition has too much
performance impact for it to be worth it, let's remove it for now.
Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl")
Cc: stable(a)vger.kernel.org
Reviewed-by: Victor Hsieh <victorhsieh(a)google.com>
Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
fs/verity/enable.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index df6b499bf6a14..400c264bf8930 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -390,25 +390,27 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg)
goto out_drop_write;
err = enable_verity(filp, &arg);
- if (err)
- goto out_allow_write_access;
/*
- * Some pages of the file may have been evicted from pagecache after
- * being used in the Merkle tree construction, then read into pagecache
- * again by another process reading from the file concurrently. Since
- * these pages didn't undergo verification against the file digest which
- * fs-verity now claims to be enforcing, we have to wipe the pagecache
- * to ensure that all future reads are verified.
+ * We no longer drop the inode's pagecache after enabling verity. This
+ * used to be done to try to avoid a race condition where pages could be
+ * evicted after being used in the Merkle tree construction, then
+ * re-instantiated by a concurrent read. Such pages are unverified, and
+ * the backing storage could have filled them with different content, so
+ * they shouldn't be used to fulfill reads once verity is enabled.
+ *
+ * But, dropping the pagecache has a big performance impact, and it
+ * doesn't fully solve the race condition anyway. So for those reasons,
+ * and also because this race condition isn't very important relatively
+ * speaking (especially for small-ish files, where the chance of a page
+ * being used, evicted, *and* re-instantiated all while enabling verity
+ * is quite small), we no longer drop the inode's pagecache.
*/
- filemap_write_and_wait(inode->i_mapping);
- invalidate_inode_pages2(inode->i_mapping);
/*
* allow_write_access() is needed to pair with deny_write_access().
* Regardless, the filesystem won't allow writing to verity files.
*/
-out_allow_write_access:
allow_write_access(filp);
out_drop_write:
mnt_drop_write_file(filp);
--
2.40.0
The fix for XSA-423 introduced a bug which resulted in loss of network
connection in some configurations.
The first patch is fixing the issue, while the second one is removing
a test which isn't needed. The third patch is making error messages
more uniform.
Changes in V2:
- add patch 3
- comment addressed (patch 1)
Juergen Gross (3):
xen/netback: don't do grant copy across page boundary
xen/netback: remove not needed test in xenvif_tx_build_gops()
xen/netback: use same error messages for same errors
drivers/net/xen-netback/common.h | 2 +-
drivers/net/xen-netback/netback.c | 37 ++++++++++++++++++++++---------
2 files changed, 28 insertions(+), 11 deletions(-)
--
2.35.3
Greg,
Chandan is preparing a series of backports from v5.11 to 5.4.y.
These two backports were selected by Chandan for 5.4.y, but are
currently missing from 5.10.y.
Specifically, patch #2 fixes a problem seen in the wild on UEK
and the UEK kernels already carry this patch.
The patches have gone through the usual xfs test/review routine.
Thanks,
Amir.
Brian Foster (1):
xfs: don't reuse busy extents on extent trim
Darrick J. Wong (1):
xfs: shut down the filesystem if we screw up quota reservation
fs/xfs/xfs_extent_busy.c | 14 --------------
fs/xfs/xfs_trans_dquot.c | 13 ++++++++++---
2 files changed, 10 insertions(+), 17 deletions(-)
--
2.34.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 90410bcf873cf05f54a32183afff0161f44f9715
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16793134471133(a)kroah.com' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
90410bcf873c ("ocfs2: fix data corruption after failed write")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 90410bcf873cf05f54a32183afff0161f44f9715 Mon Sep 17 00:00:00 2001
From: Jan Kara via Ocfs2-devel <ocfs2-devel(a)oss.oracle.com>
Date: Thu, 2 Mar 2023 16:38:43 +0100
Subject: [PATCH] ocfs2: fix data corruption after failed write
When buffered write fails to copy data into underlying page cache page,
ocfs2_write_end_nolock() just zeroes out and dirties the page. This can
leave dirty page beyond EOF and if page writeback tries to write this page
before write succeeds and expands i_size, page gets into inconsistent
state where page dirty bit is clear but buffer dirty bits stay set
resulting in page data never getting written and so data copied to the
page is lost. Fix the problem by invalidating page beyond EOF after
failed write.
Link: https://lkml.kernel.org/r/20230302153843.18499-1-jack@suse.cz
Fixes: 6dbf7bb55598 ("fs: Don't invalidate page buffers in block_write_full_page()")
Signed-off-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 1d65f6ef00ca..0394505fdce3 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -1977,11 +1977,26 @@ int ocfs2_write_end_nolock(struct address_space *mapping,
}
if (unlikely(copied < len) && wc->w_target_page) {
+ loff_t new_isize;
+
if (!PageUptodate(wc->w_target_page))
copied = 0;
- ocfs2_zero_new_buffers(wc->w_target_page, start+copied,
- start+len);
+ new_isize = max_t(loff_t, i_size_read(inode), pos + copied);
+ if (new_isize > page_offset(wc->w_target_page))
+ ocfs2_zero_new_buffers(wc->w_target_page, start+copied,
+ start+len);
+ else {
+ /*
+ * When page is fully beyond new isize (data copy
+ * failed), do not bother zeroing the page. Invalidate
+ * it instead so that writeback does not get confused
+ * put page & buffer dirty bits into inconsistent
+ * state.
+ */
+ block_invalidate_folio(page_folio(wc->w_target_page),
+ 0, PAGE_SIZE);
+ }
}
if (wc->w_target_page)
flush_dcache_page(wc->w_target_page);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 1c86a188e03156223a34d09ce290b49bd4dd0403
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16800049118459(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
1c86a188e031 ("mm: kfence: fix using kfence_metadata without initialization in show_object()")
b33f778bba5e ("kfence: alloc kfence_pool after system startup")
698361bca2d5 ("kfence: allow re-enabling KFENCE after system startup")
07e8481d3c38 ("kfence: always use static branches to guard kfence_alloc()")
08f6b10630f2 ("kfence: limit currently covered allocations when pool nearly full")
a9ab52bbcb52 ("kfence: move saving stack trace of allocations into __kfence_alloc()")
9a19aeb56650 ("kfence: count unexpectedly skipped allocations")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1c86a188e03156223a34d09ce290b49bd4dd0403 Mon Sep 17 00:00:00 2001
From: Muchun Song <muchun.song(a)linux.dev>
Date: Wed, 15 Mar 2023 11:44:41 +0800
Subject: [PATCH] mm: kfence: fix using kfence_metadata without initialization
in show_object()
The variable kfence_metadata is initialized in kfence_init_pool(), then,
it is not initialized if kfence is disabled after booting. In this case,
kfence_metadata will be used (e.g. ->lock and ->state fields) without
initialization when reading /sys/kernel/debug/kfence/objects. There will
be a warning if you enable CONFIG_DEBUG_SPINLOCK. Fix it by creating
debugfs files when necessary.
Link: https://lkml.kernel.org/r/20230315034441.44321-1-songmuchun@bytedance.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Tested-by: Marco Elver <elver(a)google.com>
Reviewed-by: Marco Elver <elver(a)google.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: SeongJae Park <sjpark(a)amazon.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 5349c37a5dac..79c94ee55f97 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -726,10 +726,14 @@ static const struct seq_operations objects_sops = {
};
DEFINE_SEQ_ATTRIBUTE(objects);
-static int __init kfence_debugfs_init(void)
+static int kfence_debugfs_init(void)
{
- struct dentry *kfence_dir = debugfs_create_dir("kfence", NULL);
+ struct dentry *kfence_dir;
+ if (!READ_ONCE(kfence_enabled))
+ return 0;
+
+ kfence_dir = debugfs_create_dir("kfence", NULL);
debugfs_create_file("stats", 0444, kfence_dir, NULL, &stats_fops);
debugfs_create_file("objects", 0400, kfence_dir, NULL, &objects_fops);
return 0;
@@ -883,6 +887,8 @@ static int kfence_init_late(void)
}
kfence_init_enable();
+ kfence_debugfs_init();
+
return 0;
}
[Public]
Hi,
For a product that has the IP GC 11.0.4, there is a lone error message that comes up during bootup related to some missing support for KFD on kernel 6.1.20.
kfd kfd: amdgpu: GC IP 0b0004 not supported in kfd
This is fixed by this series of commits that landed in 6.2 that fixes KFD support on this product (and also fixes a warning).
fd72e2cb2f9d ("drm/amdkfd: introduce dummy cache info for property asic")
c0cc999f3c32 ("drm/amdkfd: Fix the warning of array-index-out-of-bounds")
88c21c2b56aa ("drm/amdkfd: add GC 11.0.4 KFD support")
Can you please bring to 6.1.y?
Thanks,
The fix for XSA-423 introduced a bug which resulted in loss of network
connection in some configurations.
The first patch is fixing the issue, while the second one is removing
a test which isn't needed.
Juergen Gross (2):
xen/netback: don't do grant copy across page boundary
xen/netback: remove not needed test in xenvif_tx_build_gops()
drivers/net/xen-netback/common.h | 2 +-
drivers/net/xen-netback/netback.c | 29 +++++++++++++++++++++++------
2 files changed, 24 insertions(+), 7 deletions(-)
--
2.35.3
Make sure to destroy the workqueue also in case of early errors during
bind (e.g. a subcomponent failing to bind).
Since commit c3b790ea07a1 ("drm: Manage drm_mode_config_init with
drmm_") the mode config will be freed when the drm device is released
also when using the legacy interface, but add an explicit cleanup for
consistency and to facilitate backporting.
Fixes: 060530f1ea67 ("drm/msm: use componentised device support")
Cc: stable(a)vger.kernel.org # 3.15
Cc: Rob Clark <robdclark(a)gmail.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/gpu/drm/msm/msm_drv.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index ac3b77dbfacc..73c597565f99 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -458,7 +458,7 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
ret = msm_init_vram(ddev);
if (ret)
- goto err_put_dev;
+ goto err_cleanup_mode_config;
/* Bind all our sub-components: */
ret = component_bind_all(dev, ddev);
@@ -563,6 +563,9 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
err_deinit_vram:
msm_deinit_vram(ddev);
+err_cleanup_mode_config:
+ drm_mode_config_cleanup(ddev);
+ destroy_workqueue(priv->wq);
err_put_dev:
drm_dev_put(ddev);
--
2.39.2
在 2023/3/5 12:02, Sasha Levin 写道:
> This is a note to let you know that I've just added the patch titled
>
> sched/fair: sanitize vruntime of entity being placed
>
> to the 4.14-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> sched-fair-sanitize-vruntime-of-entity-being-placed.patch
> and it can be found in the queue-4.14 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit 38247e1de3305a6ef644404ac818bc6129440eae
Hi,
This patch has significant impact on the hackbench.throughput [1].
Please don't backport this patch.
[1] https://lore.kernel.org/lkml/202302211553.9738f304-yujie.liu@intel.com/T/#u
Thanks.
Zhang Qiao.
> Author: Zhang Qiao <zhangqiao22(a)huawei.com>
> Date: Mon Jan 30 13:22:16 2023 +0100
>
> sched/fair: sanitize vruntime of entity being placed
>
> [ Upstream commit 829c1651e9c4a6f78398d3e67651cef9bb6b42cc ]
>
> When a scheduling entity is placed onto cfs_rq, its vruntime is pulled
> to the base level (around cfs_rq->min_vruntime), so that the entity
> doesn't gain extra boost when placed backwards.
>
> However, if the entity being placed wasn't executed for a long time, its
> vruntime may get too far behind (e.g. while cfs_rq was executing a
> low-weight hog), which can inverse the vruntime comparison due to s64
> overflow. This results in the entity being placed with its original
> vruntime way forwards, so that it will effectively never get to the cpu.
>
> To prevent that, ignore the vruntime of the entity being placed if it
> didn't execute for much longer than the characteristic sheduler time
> scale.
>
> [rkagan: formatted, adjusted commit log, comments, cutoff value]
> Signed-off-by: Zhang Qiao <zhangqiao22(a)huawei.com>
> Co-developed-by: Roman Kagan <rkagan(a)amazon.de>
> Signed-off-by: Roman Kagan <rkagan(a)amazon.de>
> Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
> Link: https://lkml.kernel.org/r/20230130122216.3555094-1-rkagan@amazon.de
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3ff60230710c9..afa21e43477fa 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3615,6 +3615,7 @@ static void
> place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> {
> u64 vruntime = cfs_rq->min_vruntime;
> + u64 sleep_time;
>
> /*
> * The 'current' period is already promised to the current tasks,
> @@ -3639,8 +3640,18 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> vruntime -= thresh;
> }
>
> - /* ensure we never gain time by being placed backwards. */
> - se->vruntime = max_vruntime(se->vruntime, vruntime);
> + /*
> + * Pull vruntime of the entity being placed to the base level of
> + * cfs_rq, to prevent boosting it if placed backwards. If the entity
> + * slept for a long time, don't even try to compare its vruntime with
> + * the base as it may be too far off and the comparison may get
> + * inversed due to s64 overflow.
> + */
> + sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> + if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> + se->vruntime = vruntime;
> + else
> + se->vruntime = max_vruntime(se->vruntime, vruntime);
> }
>
> static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
> .
>
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x d9a02e016aaf5a57fb44e9a5e6da8ccd3b9e2e70
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000726237133(a)kroah.com' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
d9a02e016aaf ("dm crypt: avoid accessing uninitialized tasklet")
d3703ef33129 ("dm crypt: use in_hardirq() instead of deprecated in_irq()")
c87a95dc28b1 ("dm crypt: defer decryption to a tasklet if interrupts disabled")
8e14f610159d ("dm crypt: do not call bio_endio() from the dm-crypt tasklet")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d9a02e016aaf5a57fb44e9a5e6da8ccd3b9e2e70 Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)kernel.org>
Date: Wed, 8 Mar 2023 14:39:54 -0500
Subject: [PATCH] dm crypt: avoid accessing uninitialized tasklet
When neither "no_read_workqueue" nor "no_write_workqueue" are enabled,
tasklet_trylock() in crypt_dec_pending() may still return false due to
an uninitialized state, and dm-crypt will unnecessarily do io completion
in io_queue workqueue instead of current context.
Fix this by adding an 'in_tasklet' flag to dm_crypt_io struct and
initialize it to false in crypt_io_init(). Set this flag to true in
kcryptd_queue_crypt() before calling tasklet_schedule(). If set
crypt_dec_pending() will punt io completion to a workqueue.
This also nicely avoids the tasklet_trylock/unlock hack when tasklets
aren't in use.
Fixes: 8e14f610159d ("dm crypt: do not call bio_endio() from the dm-crypt tasklet")
Cc: stable(a)vger.kernel.org
Reported-by: Hou Tao <houtao1(a)huawei.com>
Suggested-by: Ignat Korchagin <ignat(a)cloudflare.com>
Reviewed-by: Ignat Korchagin <ignat(a)cloudflare.com>
Signed-off-by: Mike Snitzer <snitzer(a)kernel.org>
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index faba1be572f9..2764b4ea18a3 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -72,7 +72,9 @@ struct dm_crypt_io {
struct crypt_config *cc;
struct bio *base_bio;
u8 *integrity_metadata;
- bool integrity_metadata_from_pool;
+ bool integrity_metadata_from_pool:1;
+ bool in_tasklet:1;
+
struct work_struct work;
struct tasklet_struct tasklet;
@@ -1731,6 +1733,7 @@ static void crypt_io_init(struct dm_crypt_io *io, struct crypt_config *cc,
io->ctx.r.req = NULL;
io->integrity_metadata = NULL;
io->integrity_metadata_from_pool = false;
+ io->in_tasklet = false;
atomic_set(&io->io_pending, 0);
}
@@ -1777,14 +1780,13 @@ static void crypt_dec_pending(struct dm_crypt_io *io)
* our tasklet. In this case we need to delay bio_endio()
* execution to after the tasklet is done and dequeued.
*/
- if (tasklet_trylock(&io->tasklet)) {
- tasklet_unlock(&io->tasklet);
- bio_endio(base_bio);
+ if (io->in_tasklet) {
+ INIT_WORK(&io->work, kcryptd_io_bio_endio);
+ queue_work(cc->io_queue, &io->work);
return;
}
- INIT_WORK(&io->work, kcryptd_io_bio_endio);
- queue_work(cc->io_queue, &io->work);
+ bio_endio(base_bio);
}
/*
@@ -2233,6 +2235,7 @@ static void kcryptd_queue_crypt(struct dm_crypt_io *io)
* it is being executed with irqs disabled.
*/
if (in_hardirq() || irqs_disabled()) {
+ io->in_tasklet = true;
tasklet_init(&io->tasklet, kcryptd_crypt_tasklet, (unsigned long)&io->work);
tasklet_schedule(&io->tasklet);
return;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 1adab2922c58e7ff4fa9f0b43695079402cce876
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680007196183163(a)kroah.com' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
1adab2922c58 ("bus: imx-weim: fix branch condition evaluates to a garbage value")
e6cb5408289f ("bus: imx-weim: add DT overlay support for WEIM bus")
3b1261fb72c7 ("bus: imx-weim: remove incorrect __init annotations")
4a92f07816ba ("bus: imx-weim: use module_platform_driver()")
77266e722fea ("bus: imx-weim: optionally enable burst clock mode")
c7995bcb36ef ("bus: imx-weim: guard against timing configuration conflicts")
8b8cb52af34d ("bus: imx-weim: support multiple address ranges per child node")
b1a23445364d ("bus: imx-weim: drop unnecessary DT node name NULL check")
d8dfa59f5a51 ("bus: imx-weim: Remove VLA usage")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1adab2922c58e7ff4fa9f0b43695079402cce876 Mon Sep 17 00:00:00 2001
From: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Date: Mon, 6 Mar 2023 16:25:26 +0300
Subject: [PATCH] bus: imx-weim: fix branch condition evaluates to a garbage
value
If bus type is other than imx50_weim_devtype and have no child devices,
variable 'ret' in function weim_parse_dt() will not be initialized, but
will be used as branch condition and return value. Fix this by
initializing 'ret' with 0.
This was discovered with help of clang-analyzer, but the situation is
quite possible in real life.
Fixes: 52c47b63412b ("bus: imx-weim: improve error handling upon child probe-failure")
Signed-off-by: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Cc: stable(a)vger.kernel.org
Reviewed-by: Fabio Estevam <festevam(a)gmail.com>
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
diff --git a/drivers/bus/imx-weim.c b/drivers/bus/imx-weim.c
index 2a6b4f676458..36d42484142a 100644
--- a/drivers/bus/imx-weim.c
+++ b/drivers/bus/imx-weim.c
@@ -204,8 +204,8 @@ static int weim_parse_dt(struct platform_device *pdev)
const struct of_device_id *of_id = of_match_device(weim_id_table,
&pdev->dev);
const struct imx_weim_devtype *devtype = of_id->data;
+ int ret = 0, have_child = 0;
struct device_node *child;
- int ret, have_child = 0;
struct weim_priv *priv;
void __iomem *base;
u32 reg;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 1adab2922c58e7ff4fa9f0b43695079402cce876
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16800071954128(a)kroah.com' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
1adab2922c58 ("bus: imx-weim: fix branch condition evaluates to a garbage value")
e6cb5408289f ("bus: imx-weim: add DT overlay support for WEIM bus")
3b1261fb72c7 ("bus: imx-weim: remove incorrect __init annotations")
4a92f07816ba ("bus: imx-weim: use module_platform_driver()")
77266e722fea ("bus: imx-weim: optionally enable burst clock mode")
c7995bcb36ef ("bus: imx-weim: guard against timing configuration conflicts")
8b8cb52af34d ("bus: imx-weim: support multiple address ranges per child node")
b1a23445364d ("bus: imx-weim: drop unnecessary DT node name NULL check")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1adab2922c58e7ff4fa9f0b43695079402cce876 Mon Sep 17 00:00:00 2001
From: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Date: Mon, 6 Mar 2023 16:25:26 +0300
Subject: [PATCH] bus: imx-weim: fix branch condition evaluates to a garbage
value
If bus type is other than imx50_weim_devtype and have no child devices,
variable 'ret' in function weim_parse_dt() will not be initialized, but
will be used as branch condition and return value. Fix this by
initializing 'ret' with 0.
This was discovered with help of clang-analyzer, but the situation is
quite possible in real life.
Fixes: 52c47b63412b ("bus: imx-weim: improve error handling upon child probe-failure")
Signed-off-by: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Cc: stable(a)vger.kernel.org
Reviewed-by: Fabio Estevam <festevam(a)gmail.com>
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
diff --git a/drivers/bus/imx-weim.c b/drivers/bus/imx-weim.c
index 2a6b4f676458..36d42484142a 100644
--- a/drivers/bus/imx-weim.c
+++ b/drivers/bus/imx-weim.c
@@ -204,8 +204,8 @@ static int weim_parse_dt(struct platform_device *pdev)
const struct of_device_id *of_id = of_match_device(weim_id_table,
&pdev->dev);
const struct imx_weim_devtype *devtype = of_id->data;
+ int ret = 0, have_child = 0;
struct device_node *child;
- int ret, have_child = 0;
struct weim_priv *priv;
void __iomem *base;
u32 reg;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 1adab2922c58e7ff4fa9f0b43695079402cce876
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680007195157101(a)kroah.com' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
1adab2922c58 ("bus: imx-weim: fix branch condition evaluates to a garbage value")
e6cb5408289f ("bus: imx-weim: add DT overlay support for WEIM bus")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1adab2922c58e7ff4fa9f0b43695079402cce876 Mon Sep 17 00:00:00 2001
From: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Date: Mon, 6 Mar 2023 16:25:26 +0300
Subject: [PATCH] bus: imx-weim: fix branch condition evaluates to a garbage
value
If bus type is other than imx50_weim_devtype and have no child devices,
variable 'ret' in function weim_parse_dt() will not be initialized, but
will be used as branch condition and return value. Fix this by
initializing 'ret' with 0.
This was discovered with help of clang-analyzer, but the situation is
quite possible in real life.
Fixes: 52c47b63412b ("bus: imx-weim: improve error handling upon child probe-failure")
Signed-off-by: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Cc: stable(a)vger.kernel.org
Reviewed-by: Fabio Estevam <festevam(a)gmail.com>
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
diff --git a/drivers/bus/imx-weim.c b/drivers/bus/imx-weim.c
index 2a6b4f676458..36d42484142a 100644
--- a/drivers/bus/imx-weim.c
+++ b/drivers/bus/imx-weim.c
@@ -204,8 +204,8 @@ static int weim_parse_dt(struct platform_device *pdev)
const struct of_device_id *of_id = of_match_device(weim_id_table,
&pdev->dev);
const struct imx_weim_devtype *devtype = of_id->data;
+ int ret = 0, have_child = 0;
struct device_node *child;
- int ret, have_child = 0;
struct weim_priv *priv;
void __iomem *base;
u32 reg;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 1adab2922c58e7ff4fa9f0b43695079402cce876
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000719457118(a)kroah.com' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
1adab2922c58 ("bus: imx-weim: fix branch condition evaluates to a garbage value")
e6cb5408289f ("bus: imx-weim: add DT overlay support for WEIM bus")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1adab2922c58e7ff4fa9f0b43695079402cce876 Mon Sep 17 00:00:00 2001
From: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Date: Mon, 6 Mar 2023 16:25:26 +0300
Subject: [PATCH] bus: imx-weim: fix branch condition evaluates to a garbage
value
If bus type is other than imx50_weim_devtype and have no child devices,
variable 'ret' in function weim_parse_dt() will not be initialized, but
will be used as branch condition and return value. Fix this by
initializing 'ret' with 0.
This was discovered with help of clang-analyzer, but the situation is
quite possible in real life.
Fixes: 52c47b63412b ("bus: imx-weim: improve error handling upon child probe-failure")
Signed-off-by: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Cc: stable(a)vger.kernel.org
Reviewed-by: Fabio Estevam <festevam(a)gmail.com>
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
diff --git a/drivers/bus/imx-weim.c b/drivers/bus/imx-weim.c
index 2a6b4f676458..36d42484142a 100644
--- a/drivers/bus/imx-weim.c
+++ b/drivers/bus/imx-weim.c
@@ -204,8 +204,8 @@ static int weim_parse_dt(struct platform_device *pdev)
const struct of_device_id *of_id = of_match_device(weim_id_table,
&pdev->dev);
const struct imx_weim_devtype *devtype = of_id->data;
+ int ret = 0, have_child = 0;
struct device_node *child;
- int ret, have_child = 0;
struct weim_priv *priv;
void __iomem *base;
u32 reg;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 1adab2922c58e7ff4fa9f0b43695079402cce876
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16800071933593(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
1adab2922c58 ("bus: imx-weim: fix branch condition evaluates to a garbage value")
e6cb5408289f ("bus: imx-weim: add DT overlay support for WEIM bus")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1adab2922c58e7ff4fa9f0b43695079402cce876 Mon Sep 17 00:00:00 2001
From: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Date: Mon, 6 Mar 2023 16:25:26 +0300
Subject: [PATCH] bus: imx-weim: fix branch condition evaluates to a garbage
value
If bus type is other than imx50_weim_devtype and have no child devices,
variable 'ret' in function weim_parse_dt() will not be initialized, but
will be used as branch condition and return value. Fix this by
initializing 'ret' with 0.
This was discovered with help of clang-analyzer, but the situation is
quite possible in real life.
Fixes: 52c47b63412b ("bus: imx-weim: improve error handling upon child probe-failure")
Signed-off-by: Ivan Bornyakov <i.bornyakov(a)metrotek.ru>
Cc: stable(a)vger.kernel.org
Reviewed-by: Fabio Estevam <festevam(a)gmail.com>
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
diff --git a/drivers/bus/imx-weim.c b/drivers/bus/imx-weim.c
index 2a6b4f676458..36d42484142a 100644
--- a/drivers/bus/imx-weim.c
+++ b/drivers/bus/imx-weim.c
@@ -204,8 +204,8 @@ static int weim_parse_dt(struct platform_device *pdev)
const struct of_device_id *of_id = of_match_device(weim_id_table,
&pdev->dev);
const struct imx_weim_devtype *devtype = of_id->data;
+ int ret = 0, have_child = 0;
struct device_node *child;
- int ret, have_child = 0;
struct weim_priv *priv;
void __iomem *base;
u32 reg;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x ba98413bf45edbf33672e2539e321b851b2cfbd1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680007002228182(a)kroah.com' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
ba98413bf45e ("drm/meson: fix missing component unbind on bind errors")
fa747d75f65d ("drm/meson: Fix error handling when afbcd.ops->init fails")
e67f6037ae1b ("drm/meson: split out encoder from meson_dw_hdmi")
d4cb82aa2e4b ("drm/meson: Make use of the helper function devm_platform_ioremap_resourcexxx()")
65a969655cb9 ("drm/meson: Convert to Linux IRQ interfaces")
2b6cb81b95d1 ("drm/meson: dw-hdmi: Enable the iahb clock early enough")
1dfeea904550 ("drm/meson: dw-hdmi: Disable clocks on driver teardown")
b33340e33acd ("drm/meson: dw-hdmi: Ensure that clocks are enabled before touching the TOP registers")
0405f94a1ae0 ("drm/meson: dw-hdmi: Register a callback to disable the regulator")
e78ad18ba365 ("drm/meson: Unbind all connectors on module removal")
fa62ee25280f ("drm/meson: Free RDMA resources after tearing down DRM")
35a395f1134b ("drm: bridge: dw-hdmi: Constify mode argument to dw_hdmi_phy_ops .init()")
af05bba0fbe2 ("drm: bridge: dw-hdmi: Pass drm_display_info to .mode_valid()")
9bc78d6dc818 ("drm: meson: dw-hdmi: Use dw_hdmi context to replace hack")
96591a4b93fb ("drm: bridge: dw-hdmi: Pass private data pointer to .mode_valid()")
b54d830ccb65 ("drm/meson: Set GEM CMA functions with DRM_GEM_CMA_DRIVER_OPS_WITH_DUMB_CREATE")
48ab4b8f236c ("drm/meson: Use GEM CMA object functions")
8976eeee8de0 ("drm/meson: add mode selection limits against specific SoC revisions")
bd9ff7b521a6 ("drm/meson: Drop explicit drm_mode_config_cleanup call")
8496a2172d7c ("drm/meson: Add YUV420 output support")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ba98413bf45edbf33672e2539e321b851b2cfbd1 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Mon, 6 Mar 2023 11:35:33 +0100
Subject: [PATCH] drm/meson: fix missing component unbind on bind errors
Make sure to unbind all subcomponents when binding the aggregate device
fails.
Fixes: a41e82e6c457 ("drm/meson: Add support for components")
Cc: stable(a)vger.kernel.org # 4.12
Cc: Neil Armstrong <neil.armstrong(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Acked-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230306103533.4915-1-johan+l…
diff --git a/drivers/gpu/drm/meson/meson_drv.c b/drivers/gpu/drm/meson/meson_drv.c
index 79bfe3938d3c..7caf937c3c90 100644
--- a/drivers/gpu/drm/meson/meson_drv.c
+++ b/drivers/gpu/drm/meson/meson_drv.c
@@ -325,23 +325,23 @@ static int meson_drv_bind_master(struct device *dev, bool has_components)
ret = meson_encoder_hdmi_init(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_plane_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_overlay_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_crtc_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = request_irq(priv->vsync_irq, meson_irq, 0, drm->driver->name, drm);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
drm_mode_config_reset(drm);
@@ -359,6 +359,9 @@ static int meson_drv_bind_master(struct device *dev, bool has_components)
uninstall_irq:
free_irq(priv->vsync_irq, drm);
+unbind_all:
+ if (has_components)
+ component_unbind_all(drm->dev, drm);
exit_afbcd:
if (priv->afbcd.ops)
priv->afbcd.ops->exit(priv);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x ba98413bf45edbf33672e2539e321b851b2cfbd1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16800070013222(a)kroah.com' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
ba98413bf45e ("drm/meson: fix missing component unbind on bind errors")
fa747d75f65d ("drm/meson: Fix error handling when afbcd.ops->init fails")
e67f6037ae1b ("drm/meson: split out encoder from meson_dw_hdmi")
d4cb82aa2e4b ("drm/meson: Make use of the helper function devm_platform_ioremap_resourcexxx()")
65a969655cb9 ("drm/meson: Convert to Linux IRQ interfaces")
2b6cb81b95d1 ("drm/meson: dw-hdmi: Enable the iahb clock early enough")
1dfeea904550 ("drm/meson: dw-hdmi: Disable clocks on driver teardown")
b33340e33acd ("drm/meson: dw-hdmi: Ensure that clocks are enabled before touching the TOP registers")
0405f94a1ae0 ("drm/meson: dw-hdmi: Register a callback to disable the regulator")
e78ad18ba365 ("drm/meson: Unbind all connectors on module removal")
fa62ee25280f ("drm/meson: Free RDMA resources after tearing down DRM")
35a395f1134b ("drm: bridge: dw-hdmi: Constify mode argument to dw_hdmi_phy_ops .init()")
af05bba0fbe2 ("drm: bridge: dw-hdmi: Pass drm_display_info to .mode_valid()")
9bc78d6dc818 ("drm: meson: dw-hdmi: Use dw_hdmi context to replace hack")
96591a4b93fb ("drm: bridge: dw-hdmi: Pass private data pointer to .mode_valid()")
b54d830ccb65 ("drm/meson: Set GEM CMA functions with DRM_GEM_CMA_DRIVER_OPS_WITH_DUMB_CREATE")
48ab4b8f236c ("drm/meson: Use GEM CMA object functions")
8976eeee8de0 ("drm/meson: add mode selection limits against specific SoC revisions")
bd9ff7b521a6 ("drm/meson: Drop explicit drm_mode_config_cleanup call")
8496a2172d7c ("drm/meson: Add YUV420 output support")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ba98413bf45edbf33672e2539e321b851b2cfbd1 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Mon, 6 Mar 2023 11:35:33 +0100
Subject: [PATCH] drm/meson: fix missing component unbind on bind errors
Make sure to unbind all subcomponents when binding the aggregate device
fails.
Fixes: a41e82e6c457 ("drm/meson: Add support for components")
Cc: stable(a)vger.kernel.org # 4.12
Cc: Neil Armstrong <neil.armstrong(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Acked-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230306103533.4915-1-johan+l…
diff --git a/drivers/gpu/drm/meson/meson_drv.c b/drivers/gpu/drm/meson/meson_drv.c
index 79bfe3938d3c..7caf937c3c90 100644
--- a/drivers/gpu/drm/meson/meson_drv.c
+++ b/drivers/gpu/drm/meson/meson_drv.c
@@ -325,23 +325,23 @@ static int meson_drv_bind_master(struct device *dev, bool has_components)
ret = meson_encoder_hdmi_init(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_plane_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_overlay_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_crtc_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = request_irq(priv->vsync_irq, meson_irq, 0, drm->driver->name, drm);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
drm_mode_config_reset(drm);
@@ -359,6 +359,9 @@ static int meson_drv_bind_master(struct device *dev, bool has_components)
uninstall_irq:
free_irq(priv->vsync_irq, drm);
+unbind_all:
+ if (has_components)
+ component_unbind_all(drm->dev, drm);
exit_afbcd:
if (priv->afbcd.ops)
priv->afbcd.ops->exit(priv);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x ba98413bf45edbf33672e2539e321b851b2cfbd1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680007000120162(a)kroah.com' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
ba98413bf45e ("drm/meson: fix missing component unbind on bind errors")
fa747d75f65d ("drm/meson: Fix error handling when afbcd.ops->init fails")
e67f6037ae1b ("drm/meson: split out encoder from meson_dw_hdmi")
d4cb82aa2e4b ("drm/meson: Make use of the helper function devm_platform_ioremap_resourcexxx()")
65a969655cb9 ("drm/meson: Convert to Linux IRQ interfaces")
2b6cb81b95d1 ("drm/meson: dw-hdmi: Enable the iahb clock early enough")
1dfeea904550 ("drm/meson: dw-hdmi: Disable clocks on driver teardown")
b33340e33acd ("drm/meson: dw-hdmi: Ensure that clocks are enabled before touching the TOP registers")
0405f94a1ae0 ("drm/meson: dw-hdmi: Register a callback to disable the regulator")
e78ad18ba365 ("drm/meson: Unbind all connectors on module removal")
fa62ee25280f ("drm/meson: Free RDMA resources after tearing down DRM")
35a395f1134b ("drm: bridge: dw-hdmi: Constify mode argument to dw_hdmi_phy_ops .init()")
af05bba0fbe2 ("drm: bridge: dw-hdmi: Pass drm_display_info to .mode_valid()")
9bc78d6dc818 ("drm: meson: dw-hdmi: Use dw_hdmi context to replace hack")
96591a4b93fb ("drm: bridge: dw-hdmi: Pass private data pointer to .mode_valid()")
b54d830ccb65 ("drm/meson: Set GEM CMA functions with DRM_GEM_CMA_DRIVER_OPS_WITH_DUMB_CREATE")
48ab4b8f236c ("drm/meson: Use GEM CMA object functions")
8976eeee8de0 ("drm/meson: add mode selection limits against specific SoC revisions")
bd9ff7b521a6 ("drm/meson: Drop explicit drm_mode_config_cleanup call")
8496a2172d7c ("drm/meson: Add YUV420 output support")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ba98413bf45edbf33672e2539e321b851b2cfbd1 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Mon, 6 Mar 2023 11:35:33 +0100
Subject: [PATCH] drm/meson: fix missing component unbind on bind errors
Make sure to unbind all subcomponents when binding the aggregate device
fails.
Fixes: a41e82e6c457 ("drm/meson: Add support for components")
Cc: stable(a)vger.kernel.org # 4.12
Cc: Neil Armstrong <neil.armstrong(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Acked-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230306103533.4915-1-johan+l…
diff --git a/drivers/gpu/drm/meson/meson_drv.c b/drivers/gpu/drm/meson/meson_drv.c
index 79bfe3938d3c..7caf937c3c90 100644
--- a/drivers/gpu/drm/meson/meson_drv.c
+++ b/drivers/gpu/drm/meson/meson_drv.c
@@ -325,23 +325,23 @@ static int meson_drv_bind_master(struct device *dev, bool has_components)
ret = meson_encoder_hdmi_init(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_plane_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_overlay_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_crtc_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = request_irq(priv->vsync_irq, meson_irq, 0, drm->driver->name, drm);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
drm_mode_config_reset(drm);
@@ -359,6 +359,9 @@ static int meson_drv_bind_master(struct device *dev, bool has_components)
uninstall_irq:
free_irq(priv->vsync_irq, drm);
+unbind_all:
+ if (has_components)
+ component_unbind_all(drm->dev, drm);
exit_afbcd:
if (priv->afbcd.ops)
priv->afbcd.ops->exit(priv);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x ba98413bf45edbf33672e2539e321b851b2cfbd1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680007000176180(a)kroah.com' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
ba98413bf45e ("drm/meson: fix missing component unbind on bind errors")
fa747d75f65d ("drm/meson: Fix error handling when afbcd.ops->init fails")
e67f6037ae1b ("drm/meson: split out encoder from meson_dw_hdmi")
d4cb82aa2e4b ("drm/meson: Make use of the helper function devm_platform_ioremap_resourcexxx()")
65a969655cb9 ("drm/meson: Convert to Linux IRQ interfaces")
2b6cb81b95d1 ("drm/meson: dw-hdmi: Enable the iahb clock early enough")
1dfeea904550 ("drm/meson: dw-hdmi: Disable clocks on driver teardown")
b33340e33acd ("drm/meson: dw-hdmi: Ensure that clocks are enabled before touching the TOP registers")
0405f94a1ae0 ("drm/meson: dw-hdmi: Register a callback to disable the regulator")
e78ad18ba365 ("drm/meson: Unbind all connectors on module removal")
fa62ee25280f ("drm/meson: Free RDMA resources after tearing down DRM")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ba98413bf45edbf33672e2539e321b851b2cfbd1 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan+linaro(a)kernel.org>
Date: Mon, 6 Mar 2023 11:35:33 +0100
Subject: [PATCH] drm/meson: fix missing component unbind on bind errors
Make sure to unbind all subcomponents when binding the aggregate device
fails.
Fixes: a41e82e6c457 ("drm/meson: Add support for components")
Cc: stable(a)vger.kernel.org # 4.12
Cc: Neil Armstrong <neil.armstrong(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Acked-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230306103533.4915-1-johan+l…
diff --git a/drivers/gpu/drm/meson/meson_drv.c b/drivers/gpu/drm/meson/meson_drv.c
index 79bfe3938d3c..7caf937c3c90 100644
--- a/drivers/gpu/drm/meson/meson_drv.c
+++ b/drivers/gpu/drm/meson/meson_drv.c
@@ -325,23 +325,23 @@ static int meson_drv_bind_master(struct device *dev, bool has_components)
ret = meson_encoder_hdmi_init(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_plane_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_overlay_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = meson_crtc_create(priv);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
ret = request_irq(priv->vsync_irq, meson_irq, 0, drm->driver->name, drm);
if (ret)
- goto exit_afbcd;
+ goto unbind_all;
drm_mode_config_reset(drm);
@@ -359,6 +359,9 @@ static int meson_drv_bind_master(struct device *dev, bool has_components)
uninstall_irq:
free_irq(priv->vsync_irq, drm);
+unbind_all:
+ if (has_components)
+ component_unbind_all(drm->dev, drm);
exit_afbcd:
if (priv->afbcd.ops)
priv->afbcd.ops->exit(priv);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 5eb39cde1e2487ba5ec1802dc5e58a77e700d99e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680006111249181(a)kroah.com' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
5eb39cde1e24 ("kcsan: avoid passing -g for test")
6fcd4267a840 ("kernel: kcsan: kcsan_test: build without structleak plugin")
a146fed56f8a ("kcsan: Make test follow KUnit style recommendations")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5eb39cde1e2487ba5ec1802dc5e58a77e700d99e Mon Sep 17 00:00:00 2001
From: Marco Elver <elver(a)google.com>
Date: Thu, 16 Mar 2023 23:47:05 +0100
Subject: [PATCH] kcsan: avoid passing -g for test
Nathan reported that when building with GNU as and a version of clang that
defaults to DWARF5, the assembler will complain with:
Error: non-constant .uleb128 is not supported
This is because `-g` defaults to the compiler debug info default. If the
assembler does not support some of the directives used, the above errors
occur. To fix, remove the explicit passing of `-g`.
All the test wants is that stack traces print valid function names, and
debug info is not required for that. (I currently cannot recall why I
added the explicit `-g`.)
Link: https://lkml.kernel.org/r/20230316224705.709984-2-elver@google.com
Fixes: 1fe84fd4a402 ("kcsan: Add test suite")
Signed-off-by: Marco Elver <elver(a)google.com>
Reported-by: Nathan Chancellor <nathan(a)kernel.org>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/kernel/kcsan/Makefile b/kernel/kcsan/Makefile
index 8cf70f068d92..a45f3dfc8d14 100644
--- a/kernel/kcsan/Makefile
+++ b/kernel/kcsan/Makefile
@@ -16,6 +16,6 @@ obj-y := core.o debugfs.o report.o
KCSAN_INSTRUMENT_BARRIERS_selftest.o := y
obj-$(CONFIG_KCSAN_SELFTEST) += selftest.o
-CFLAGS_kcsan_test.o := $(CFLAGS_KCSAN) -g -fno-omit-frame-pointer
+CFLAGS_kcsan_test.o := $(CFLAGS_KCSAN) -fno-omit-frame-pointer
CFLAGS_kcsan_test.o += $(DISABLE_STRUCTLEAK_PLUGIN)
obj-$(CONFIG_KCSAN_KUNIT_TEST) += kcsan_test.o
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 5eb39cde1e2487ba5ec1802dc5e58a77e700d99e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680006110146160(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
5eb39cde1e24 ("kcsan: avoid passing -g for test")
6fcd4267a840 ("kernel: kcsan: kcsan_test: build without structleak plugin")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5eb39cde1e2487ba5ec1802dc5e58a77e700d99e Mon Sep 17 00:00:00 2001
From: Marco Elver <elver(a)google.com>
Date: Thu, 16 Mar 2023 23:47:05 +0100
Subject: [PATCH] kcsan: avoid passing -g for test
Nathan reported that when building with GNU as and a version of clang that
defaults to DWARF5, the assembler will complain with:
Error: non-constant .uleb128 is not supported
This is because `-g` defaults to the compiler debug info default. If the
assembler does not support some of the directives used, the above errors
occur. To fix, remove the explicit passing of `-g`.
All the test wants is that stack traces print valid function names, and
debug info is not required for that. (I currently cannot recall why I
added the explicit `-g`.)
Link: https://lkml.kernel.org/r/20230316224705.709984-2-elver@google.com
Fixes: 1fe84fd4a402 ("kcsan: Add test suite")
Signed-off-by: Marco Elver <elver(a)google.com>
Reported-by: Nathan Chancellor <nathan(a)kernel.org>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/kernel/kcsan/Makefile b/kernel/kcsan/Makefile
index 8cf70f068d92..a45f3dfc8d14 100644
--- a/kernel/kcsan/Makefile
+++ b/kernel/kcsan/Makefile
@@ -16,6 +16,6 @@ obj-y := core.o debugfs.o report.o
KCSAN_INSTRUMENT_BARRIERS_selftest.o := y
obj-$(CONFIG_KCSAN_SELFTEST) += selftest.o
-CFLAGS_kcsan_test.o := $(CFLAGS_KCSAN) -g -fno-omit-frame-pointer
+CFLAGS_kcsan_test.o := $(CFLAGS_KCSAN) -fno-omit-frame-pointer
CFLAGS_kcsan_test.o += $(DISABLE_STRUCTLEAK_PLUGIN)
obj-$(CONFIG_KCSAN_KUNIT_TEST) += kcsan_test.o
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 5eb39cde1e2487ba5ec1802dc5e58a77e700d99e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16800061091616(a)kroah.com' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
5eb39cde1e24 ("kcsan: avoid passing -g for test")
6fcd4267a840 ("kernel: kcsan: kcsan_test: build without structleak plugin")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5eb39cde1e2487ba5ec1802dc5e58a77e700d99e Mon Sep 17 00:00:00 2001
From: Marco Elver <elver(a)google.com>
Date: Thu, 16 Mar 2023 23:47:05 +0100
Subject: [PATCH] kcsan: avoid passing -g for test
Nathan reported that when building with GNU as and a version of clang that
defaults to DWARF5, the assembler will complain with:
Error: non-constant .uleb128 is not supported
This is because `-g` defaults to the compiler debug info default. If the
assembler does not support some of the directives used, the above errors
occur. To fix, remove the explicit passing of `-g`.
All the test wants is that stack traces print valid function names, and
debug info is not required for that. (I currently cannot recall why I
added the explicit `-g`.)
Link: https://lkml.kernel.org/r/20230316224705.709984-2-elver@google.com
Fixes: 1fe84fd4a402 ("kcsan: Add test suite")
Signed-off-by: Marco Elver <elver(a)google.com>
Reported-by: Nathan Chancellor <nathan(a)kernel.org>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/kernel/kcsan/Makefile b/kernel/kcsan/Makefile
index 8cf70f068d92..a45f3dfc8d14 100644
--- a/kernel/kcsan/Makefile
+++ b/kernel/kcsan/Makefile
@@ -16,6 +16,6 @@ obj-y := core.o debugfs.o report.o
KCSAN_INSTRUMENT_BARRIERS_selftest.o := y
obj-$(CONFIG_KCSAN_SELFTEST) += selftest.o
-CFLAGS_kcsan_test.o := $(CFLAGS_KCSAN) -g -fno-omit-frame-pointer
+CFLAGS_kcsan_test.o := $(CFLAGS_KCSAN) -fno-omit-frame-pointer
CFLAGS_kcsan_test.o += $(DISABLE_STRUCTLEAK_PLUGIN)
obj-$(CONFIG_KCSAN_KUNIT_TEST) += kcsan_test.o
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x a075bacde257f755bea0e53400c9f1cdd1b8e8e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16800052233714(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
a075bacde257 ("fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a075bacde257f755bea0e53400c9f1cdd1b8e8e6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 14 Mar 2023 16:31:32 -0700
Subject: [PATCH] fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing
performance problems and is hindering adoption of fsverity. It was
intended to solve a race condition where unverified pages might be left
in the pagecache. But actually it doesn't solve it fully.
Since the incomplete solution for this race condition has too much
performance impact for it to be worth it, let's remove it for now.
Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl")
Cc: stable(a)vger.kernel.org
Reviewed-by: Victor Hsieh <victorhsieh(a)google.com>
Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index e13db6507b38..7a0e3a84d370 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -8,7 +8,6 @@
#include "fsverity_private.h"
#include <linux/mount.h>
-#include <linux/pagemap.h>
#include <linux/sched/signal.h>
#include <linux/uaccess.h>
@@ -367,25 +366,27 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg)
goto out_drop_write;
err = enable_verity(filp, &arg);
- if (err)
- goto out_allow_write_access;
/*
- * Some pages of the file may have been evicted from pagecache after
- * being used in the Merkle tree construction, then read into pagecache
- * again by another process reading from the file concurrently. Since
- * these pages didn't undergo verification against the file digest which
- * fs-verity now claims to be enforcing, we have to wipe the pagecache
- * to ensure that all future reads are verified.
+ * We no longer drop the inode's pagecache after enabling verity. This
+ * used to be done to try to avoid a race condition where pages could be
+ * evicted after being used in the Merkle tree construction, then
+ * re-instantiated by a concurrent read. Such pages are unverified, and
+ * the backing storage could have filled them with different content, so
+ * they shouldn't be used to fulfill reads once verity is enabled.
+ *
+ * But, dropping the pagecache has a big performance impact, and it
+ * doesn't fully solve the race condition anyway. So for those reasons,
+ * and also because this race condition isn't very important relatively
+ * speaking (especially for small-ish files, where the chance of a page
+ * being used, evicted, *and* re-instantiated all while enabling verity
+ * is quite small), we no longer drop the inode's pagecache.
*/
- filemap_write_and_wait(inode->i_mapping);
- invalidate_inode_pages2(inode->i_mapping);
/*
* allow_write_access() is needed to pair with deny_write_access().
* Regardless, the filesystem won't allow writing to verity files.
*/
-out_allow_write_access:
allow_write_access(filp);
out_drop_write:
mnt_drop_write_file(filp);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x a075bacde257f755bea0e53400c9f1cdd1b8e8e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680005223233133(a)kroah.com' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
a075bacde257 ("fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a075bacde257f755bea0e53400c9f1cdd1b8e8e6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 14 Mar 2023 16:31:32 -0700
Subject: [PATCH] fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing
performance problems and is hindering adoption of fsverity. It was
intended to solve a race condition where unverified pages might be left
in the pagecache. But actually it doesn't solve it fully.
Since the incomplete solution for this race condition has too much
performance impact for it to be worth it, let's remove it for now.
Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl")
Cc: stable(a)vger.kernel.org
Reviewed-by: Victor Hsieh <victorhsieh(a)google.com>
Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index e13db6507b38..7a0e3a84d370 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -8,7 +8,6 @@
#include "fsverity_private.h"
#include <linux/mount.h>
-#include <linux/pagemap.h>
#include <linux/sched/signal.h>
#include <linux/uaccess.h>
@@ -367,25 +366,27 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg)
goto out_drop_write;
err = enable_verity(filp, &arg);
- if (err)
- goto out_allow_write_access;
/*
- * Some pages of the file may have been evicted from pagecache after
- * being used in the Merkle tree construction, then read into pagecache
- * again by another process reading from the file concurrently. Since
- * these pages didn't undergo verification against the file digest which
- * fs-verity now claims to be enforcing, we have to wipe the pagecache
- * to ensure that all future reads are verified.
+ * We no longer drop the inode's pagecache after enabling verity. This
+ * used to be done to try to avoid a race condition where pages could be
+ * evicted after being used in the Merkle tree construction, then
+ * re-instantiated by a concurrent read. Such pages are unverified, and
+ * the backing storage could have filled them with different content, so
+ * they shouldn't be used to fulfill reads once verity is enabled.
+ *
+ * But, dropping the pagecache has a big performance impact, and it
+ * doesn't fully solve the race condition anyway. So for those reasons,
+ * and also because this race condition isn't very important relatively
+ * speaking (especially for small-ish files, where the chance of a page
+ * being used, evicted, *and* re-instantiated all while enabling verity
+ * is quite small), we no longer drop the inode's pagecache.
*/
- filemap_write_and_wait(inode->i_mapping);
- invalidate_inode_pages2(inode->i_mapping);
/*
* allow_write_access() is needed to pair with deny_write_access().
* Regardless, the filesystem won't allow writing to verity files.
*/
-out_allow_write_access:
allow_write_access(filp);
out_drop_write:
mnt_drop_write_file(filp);
The patch below does not apply to the 6.2-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.2.y
git checkout FETCH_HEAD
git cherry-pick -x a075bacde257f755bea0e53400c9f1cdd1b8e8e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000522214951(a)kroah.com' --subject-prefix 'PATCH 6.2.y' HEAD^..
Possible dependencies:
a075bacde257 ("fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a075bacde257f755bea0e53400c9f1cdd1b8e8e6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 14 Mar 2023 16:31:32 -0700
Subject: [PATCH] fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing
performance problems and is hindering adoption of fsverity. It was
intended to solve a race condition where unverified pages might be left
in the pagecache. But actually it doesn't solve it fully.
Since the incomplete solution for this race condition has too much
performance impact for it to be worth it, let's remove it for now.
Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl")
Cc: stable(a)vger.kernel.org
Reviewed-by: Victor Hsieh <victorhsieh(a)google.com>
Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index e13db6507b38..7a0e3a84d370 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -8,7 +8,6 @@
#include "fsverity_private.h"
#include <linux/mount.h>
-#include <linux/pagemap.h>
#include <linux/sched/signal.h>
#include <linux/uaccess.h>
@@ -367,25 +366,27 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg)
goto out_drop_write;
err = enable_verity(filp, &arg);
- if (err)
- goto out_allow_write_access;
/*
- * Some pages of the file may have been evicted from pagecache after
- * being used in the Merkle tree construction, then read into pagecache
- * again by another process reading from the file concurrently. Since
- * these pages didn't undergo verification against the file digest which
- * fs-verity now claims to be enforcing, we have to wipe the pagecache
- * to ensure that all future reads are verified.
+ * We no longer drop the inode's pagecache after enabling verity. This
+ * used to be done to try to avoid a race condition where pages could be
+ * evicted after being used in the Merkle tree construction, then
+ * re-instantiated by a concurrent read. Such pages are unverified, and
+ * the backing storage could have filled them with different content, so
+ * they shouldn't be used to fulfill reads once verity is enabled.
+ *
+ * But, dropping the pagecache has a big performance impact, and it
+ * doesn't fully solve the race condition anyway. So for those reasons,
+ * and also because this race condition isn't very important relatively
+ * speaking (especially for small-ish files, where the chance of a page
+ * being used, evicted, *and* re-instantiated all while enabling verity
+ * is quite small), we no longer drop the inode's pagecache.
*/
- filemap_write_and_wait(inode->i_mapping);
- invalidate_inode_pages2(inode->i_mapping);
/*
* allow_write_access() is needed to pair with deny_write_access().
* Regardless, the filesystem won't allow writing to verity files.
*/
-out_allow_write_access:
allow_write_access(filp);
out_drop_write:
mnt_drop_write_file(filp);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x fa2068d7e922b434eba5bfb0131e6d39febfdb48
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000508916643(a)kroah.com' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
fa2068d7e922 ("btrfs: zoned: count fresh BG region as zone unusable")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fa2068d7e922b434eba5bfb0131e6d39febfdb48 Mon Sep 17 00:00:00 2001
From: Naohiro Aota <naohiro.aota(a)wdc.com>
Date: Mon, 13 Mar 2023 16:06:13 +0900
Subject: [PATCH] btrfs: zoned: count fresh BG region as zone unusable
The naming of space_info->active_total_bytes is misleading. It counts
not only active block groups but also full ones which are previously
active but now inactive. That confusion results in a bug not counting
the full BGs into active_total_bytes on mount time.
For a background, there are three kinds of block groups in terms of
activation.
1. Block groups never activated
2. Block groups currently active
3. Block groups previously active and currently inactive (due to fully
written or zone finish)
What we really wanted to exclude from "total_bytes" is the total size of
BGs #1. They seem empty and allocatable but since they are not activated,
we cannot rely on them to do the space reservation.
And, since BGs #1 never get activated, they should have no "used",
"reserved" and "pinned" bytes.
OTOH, BGs #3 can be counted in the "total", since they are already full
we cannot allocate from them anyway. For them, "total_bytes == used +
reserved + pinned + zone_unusable" should hold.
Tracking #2 and #3 as "active_total_bytes" (current implementation) is
confusing. And, tracking #1 and subtract that properly from "total_bytes"
every time you need space reservation is cumbersome.
Instead, we can count the whole region of a newly allocated block group as
zone_unusable. Then, once that block group is activated, release
[0 .. zone_capacity] from the zone_unusable counters. With this, we can
eliminate the confusing ->active_total_bytes and the code will be common
among regular and the zoned mode. Also, no additional counter is needed
with this approach.
Fixes: 6a921de58992 ("btrfs: zoned: introduce space_info->active_total_bytes")
CC: stable(a)vger.kernel.org # 6.1+
Signed-off-by: Naohiro Aota <naohiro.aota(a)wdc.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 0d250d052487..d84cef89cdff 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2693,8 +2693,13 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);
spin_lock(&ctl->tree_lock);
+ /* Count initial region as zone_unusable until it gets activated. */
if (!used)
to_free = size;
+ else if (initial &&
+ test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
+ (block_group->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM)))
+ to_free = 0;
else if (initial)
to_free = block_group->zone_capacity;
else if (offset >= block_group->alloc_offset)
@@ -2722,7 +2727,8 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
reclaimable_unusable = block_group->zone_unusable -
(block_group->length - block_group->zone_capacity);
/* All the region is now unusable. Mark it as unused and reclaim */
- if (block_group->zone_unusable == block_group->length) {
+ if (block_group->zone_unusable == block_group->length &&
+ block_group->alloc_offset) {
btrfs_mark_bg_unused(block_group);
} else if (bg_reclaim_threshold &&
reclaimable_unusable >=
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 22d1e930a916..6828712578ca 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1580,9 +1580,19 @@ void btrfs_calc_zone_unusable(struct btrfs_block_group *cache)
return;
WARN_ON(cache->bytes_super != 0);
- unusable = (cache->alloc_offset - cache->used) +
- (cache->length - cache->zone_capacity);
- free = cache->zone_capacity - cache->alloc_offset;
+
+ /* Check for block groups never get activated */
+ if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) &&
+ cache->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM) &&
+ !test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags) &&
+ cache->alloc_offset == 0) {
+ unusable = cache->length;
+ free = 0;
+ } else {
+ unusable = (cache->alloc_offset - cache->used) +
+ (cache->length - cache->zone_capacity);
+ free = cache->zone_capacity - cache->alloc_offset;
+ }
/* We only need ->free_space in ALLOC_SEQ block groups */
cache->cached = BTRFS_CACHE_FINISHED;
@@ -1901,7 +1911,11 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
/* Successfully activated all the zones */
set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags);
- space_info->active_total_bytes += block_group->length;
+ WARN_ON(block_group->alloc_offset != 0);
+ if (block_group->zone_unusable == block_group->length) {
+ block_group->zone_unusable = block_group->length - block_group->zone_capacity;
+ space_info->bytes_zone_unusable -= block_group->zone_capacity;
+ }
spin_unlock(&block_group->lock);
btrfs_try_granting_tickets(fs_info, space_info);
spin_unlock(&space_info->lock);
@@ -2265,7 +2279,7 @@ int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info)
u64 avail;
spin_lock(&block_group->lock);
- if (block_group->reserved ||
+ if (block_group->reserved || block_group->alloc_offset == 0 ||
(block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM)) {
spin_unlock(&block_group->lock);
continue;
The patch below does not apply to the 6.2-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.2.y
git checkout FETCH_HEAD
git cherry-pick -x fa2068d7e922b434eba5bfb0131e6d39febfdb48
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000508740232(a)kroah.com' --subject-prefix 'PATCH 6.2.y' HEAD^..
Possible dependencies:
fa2068d7e922 ("btrfs: zoned: count fresh BG region as zone unusable")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fa2068d7e922b434eba5bfb0131e6d39febfdb48 Mon Sep 17 00:00:00 2001
From: Naohiro Aota <naohiro.aota(a)wdc.com>
Date: Mon, 13 Mar 2023 16:06:13 +0900
Subject: [PATCH] btrfs: zoned: count fresh BG region as zone unusable
The naming of space_info->active_total_bytes is misleading. It counts
not only active block groups but also full ones which are previously
active but now inactive. That confusion results in a bug not counting
the full BGs into active_total_bytes on mount time.
For a background, there are three kinds of block groups in terms of
activation.
1. Block groups never activated
2. Block groups currently active
3. Block groups previously active and currently inactive (due to fully
written or zone finish)
What we really wanted to exclude from "total_bytes" is the total size of
BGs #1. They seem empty and allocatable but since they are not activated,
we cannot rely on them to do the space reservation.
And, since BGs #1 never get activated, they should have no "used",
"reserved" and "pinned" bytes.
OTOH, BGs #3 can be counted in the "total", since they are already full
we cannot allocate from them anyway. For them, "total_bytes == used +
reserved + pinned + zone_unusable" should hold.
Tracking #2 and #3 as "active_total_bytes" (current implementation) is
confusing. And, tracking #1 and subtract that properly from "total_bytes"
every time you need space reservation is cumbersome.
Instead, we can count the whole region of a newly allocated block group as
zone_unusable. Then, once that block group is activated, release
[0 .. zone_capacity] from the zone_unusable counters. With this, we can
eliminate the confusing ->active_total_bytes and the code will be common
among regular and the zoned mode. Also, no additional counter is needed
with this approach.
Fixes: 6a921de58992 ("btrfs: zoned: introduce space_info->active_total_bytes")
CC: stable(a)vger.kernel.org # 6.1+
Signed-off-by: Naohiro Aota <naohiro.aota(a)wdc.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 0d250d052487..d84cef89cdff 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2693,8 +2693,13 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);
spin_lock(&ctl->tree_lock);
+ /* Count initial region as zone_unusable until it gets activated. */
if (!used)
to_free = size;
+ else if (initial &&
+ test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
+ (block_group->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM)))
+ to_free = 0;
else if (initial)
to_free = block_group->zone_capacity;
else if (offset >= block_group->alloc_offset)
@@ -2722,7 +2727,8 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
reclaimable_unusable = block_group->zone_unusable -
(block_group->length - block_group->zone_capacity);
/* All the region is now unusable. Mark it as unused and reclaim */
- if (block_group->zone_unusable == block_group->length) {
+ if (block_group->zone_unusable == block_group->length &&
+ block_group->alloc_offset) {
btrfs_mark_bg_unused(block_group);
} else if (bg_reclaim_threshold &&
reclaimable_unusable >=
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 22d1e930a916..6828712578ca 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1580,9 +1580,19 @@ void btrfs_calc_zone_unusable(struct btrfs_block_group *cache)
return;
WARN_ON(cache->bytes_super != 0);
- unusable = (cache->alloc_offset - cache->used) +
- (cache->length - cache->zone_capacity);
- free = cache->zone_capacity - cache->alloc_offset;
+
+ /* Check for block groups never get activated */
+ if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) &&
+ cache->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM) &&
+ !test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags) &&
+ cache->alloc_offset == 0) {
+ unusable = cache->length;
+ free = 0;
+ } else {
+ unusable = (cache->alloc_offset - cache->used) +
+ (cache->length - cache->zone_capacity);
+ free = cache->zone_capacity - cache->alloc_offset;
+ }
/* We only need ->free_space in ALLOC_SEQ block groups */
cache->cached = BTRFS_CACHE_FINISHED;
@@ -1901,7 +1911,11 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
/* Successfully activated all the zones */
set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags);
- space_info->active_total_bytes += block_group->length;
+ WARN_ON(block_group->alloc_offset != 0);
+ if (block_group->zone_unusable == block_group->length) {
+ block_group->zone_unusable = block_group->length - block_group->zone_capacity;
+ space_info->bytes_zone_unusable -= block_group->zone_capacity;
+ }
spin_unlock(&block_group->lock);
btrfs_try_granting_tickets(fs_info, space_info);
spin_unlock(&space_info->lock);
@@ -2265,7 +2279,7 @@ int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info)
u64 avail;
spin_lock(&block_group->lock);
- if (block_group->reserved ||
+ if (block_group->reserved || block_group->alloc_offset == 0 ||
(block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM)) {
spin_unlock(&block_group->lock);
continue;
During warm reset device->fw_client is set to NULL. If a bus driver is
registered after this NULL setting and before new firmware clients are
enumerated by ISHTP, kernel panic will result in the function
ishtp_cl_bus_match(). This is because of reference to
device->fw_client->props.protocol_name.
ISH firmware after getting successfully loaded, sends a warm reset
notification to remove all clients from the bus and sets
device->fw_client to NULL. Until kernel v5.15, all enabled ISHTP kernel
module drivers were loaded right after any of the first ISHTP device was
registered, regardless of whether it was a matched or an unmatched
device. This resulted in all drivers getting registered much before the
warm reset notification from ISH.
Starting kernel v5.16, this issue got exposed after the change was
introduced to load only bus drivers for the respective matching devices.
In this scenario, cros_ec_ishtp device and cros_ec_ishtp driver are
registered after the warm reset device fw_client NULL setting.
cros_ec_ishtp driver_register() triggers the callback to
ishtp_cl_bus_match() to match ISHTP driver to the device and causes kernel
panic in guid_equal() when dereferencing fw_client NULL pointer to get
protocol_name.
Fixes: f155dfeaa4ee ("platform/x86: isthp_eclite: only load for matching devices")
Fixes: facfe0a4fdce ("platform/chrome: chros_ec_ishtp: only load for matching devices")
Fixes: 0d0cccc0fd83 ("HID: intel-ish-hid: hid-client: only load for matching devices")
Fixes: 44e2a58cb880 ("HID: intel-ish-hid: fw-loader: only load for matching devices")
Cc: <stable(a)vger.kernel.org> # 5.16+
Signed-off-by: Tanu Malhotra <tanu.malhotra(a)intel.com>
Tested-by: Shaunak Saha <shaunak.saha(a)intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
---
v2:
- Cleaned up coding indentation
- Updated commit description
- Added stable mailing list to Cc
---
drivers/hid/intel-ish-hid/ishtp/bus.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hid/intel-ish-hid/ishtp/bus.c b/drivers/hid/intel-ish-hid/ishtp/bus.c
index 81385ab37fa9..7fc738a22375 100644
--- a/drivers/hid/intel-ish-hid/ishtp/bus.c
+++ b/drivers/hid/intel-ish-hid/ishtp/bus.c
@@ -241,8 +241,8 @@ static int ishtp_cl_bus_match(struct device *dev, struct device_driver *drv)
struct ishtp_cl_device *device = to_ishtp_cl_device(dev);
struct ishtp_cl_driver *driver = to_ishtp_cl_driver(drv);
- return guid_equal(&driver->id[0].guid,
- &device->fw_client->props.protocol_name);
+ return(device->fw_client ? guid_equal(&driver->id[0].guid,
+ &device->fw_client->props.protocol_name) : 0);
}
/**
base-commit: 1e760fa3596e8c7f08412712c168288b79670d78
--
2.34.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 0482c34ec6f8557e06cd0f8e2d0e20e8ede6a22c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680004884147142(a)kroah.com' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
0482c34ec6f8 ("usb: ucsi: Fix ucsi->connector race")
f87fb985452a ("usb: ucsi: Fix NULL pointer deref in ucsi_connector_change()")
924fb3ec50f5 ("Merge 6.2-rc7 into usb-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0482c34ec6f8557e06cd0f8e2d0e20e8ede6a22c Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Wed, 8 Mar 2023 16:42:43 +0100
Subject: [PATCH] usb: ucsi: Fix ucsi->connector race
ucsi_init() which runs from a workqueue sets ucsi->connector and
on an error will clear it again.
ucsi->connector gets dereferenced by ucsi_resume(), this checks for
ucsi->connector being NULL in case ucsi_init() has not finished yet;
or in case ucsi_init() has failed.
ucsi_init() setting ucsi->connector and then clearing it again on
an error creates a race where the check in ucsi_resume() may pass,
only to have ucsi->connector free-ed underneath it when ucsi_init()
hits an error.
Fix this race by making ucsi_init() store the connector array in
a local variable and only assign it to ucsi->connector on success.
Fixes: bdc62f2bae8f ("usb: typec: ucsi: Simplified registration and I/O API")
Cc: stable(a)vger.kernel.org
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/20230308154244.722337-3-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 0623861c597b..8d1baf28df55 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -1125,12 +1125,11 @@ static struct fwnode_handle *ucsi_find_fwnode(struct ucsi_connector *con)
return NULL;
}
-static int ucsi_register_port(struct ucsi *ucsi, int index)
+static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con)
{
struct usb_power_delivery_desc desc = { ucsi->cap.pd_version};
struct usb_power_delivery_capabilities_desc pd_caps;
struct usb_power_delivery_capabilities *pd_cap;
- struct ucsi_connector *con = &ucsi->connector[index];
struct typec_capability *cap = &con->typec_cap;
enum typec_accessory *accessory = cap->accessory;
enum usb_role u_role = USB_ROLE_NONE;
@@ -1151,7 +1150,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
init_completion(&con->complete);
mutex_init(&con->lock);
INIT_LIST_HEAD(&con->partner_tasks);
- con->num = index + 1;
con->ucsi = ucsi;
cap->fwnode = ucsi_find_fwnode(con);
@@ -1328,7 +1326,7 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
*/
static int ucsi_init(struct ucsi *ucsi)
{
- struct ucsi_connector *con;
+ struct ucsi_connector *con, *connector;
u64 command, ntfy;
int ret;
int i;
@@ -1359,16 +1357,16 @@ static int ucsi_init(struct ucsi *ucsi)
}
/* Allocate the connectors. Released in ucsi_unregister() */
- ucsi->connector = kcalloc(ucsi->cap.num_connectors + 1,
- sizeof(*ucsi->connector), GFP_KERNEL);
- if (!ucsi->connector) {
+ connector = kcalloc(ucsi->cap.num_connectors + 1, sizeof(*connector), GFP_KERNEL);
+ if (!connector) {
ret = -ENOMEM;
goto err_reset;
}
/* Register all connectors */
for (i = 0; i < ucsi->cap.num_connectors; i++) {
- ret = ucsi_register_port(ucsi, i);
+ connector[i].num = i + 1;
+ ret = ucsi_register_port(ucsi, &connector[i]);
if (ret)
goto err_unregister;
}
@@ -1380,11 +1378,12 @@ static int ucsi_init(struct ucsi *ucsi)
if (ret < 0)
goto err_unregister;
+ ucsi->connector = connector;
ucsi->ntfy = ntfy;
return 0;
err_unregister:
- for (con = ucsi->connector; con->port; con++) {
+ for (con = connector; con->port; con++) {
ucsi_unregister_partner(con);
ucsi_unregister_altmodes(con, UCSI_RECIPIENT_CON);
ucsi_unregister_port_psy(con);
@@ -1400,10 +1399,7 @@ static int ucsi_init(struct ucsi *ucsi)
typec_unregister_port(con->port);
con->port = NULL;
}
-
- kfree(ucsi->connector);
- ucsi->connector = NULL;
-
+ kfree(connector);
err_reset:
memset(&ucsi->cap, 0, sizeof(ucsi->cap));
ucsi_reset_ppm(ucsi);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 0482c34ec6f8557e06cd0f8e2d0e20e8ede6a22c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680004883221229(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
0482c34ec6f8 ("usb: ucsi: Fix ucsi->connector race")
f87fb985452a ("usb: ucsi: Fix NULL pointer deref in ucsi_connector_change()")
924fb3ec50f5 ("Merge 6.2-rc7 into usb-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0482c34ec6f8557e06cd0f8e2d0e20e8ede6a22c Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Wed, 8 Mar 2023 16:42:43 +0100
Subject: [PATCH] usb: ucsi: Fix ucsi->connector race
ucsi_init() which runs from a workqueue sets ucsi->connector and
on an error will clear it again.
ucsi->connector gets dereferenced by ucsi_resume(), this checks for
ucsi->connector being NULL in case ucsi_init() has not finished yet;
or in case ucsi_init() has failed.
ucsi_init() setting ucsi->connector and then clearing it again on
an error creates a race where the check in ucsi_resume() may pass,
only to have ucsi->connector free-ed underneath it when ucsi_init()
hits an error.
Fix this race by making ucsi_init() store the connector array in
a local variable and only assign it to ucsi->connector on success.
Fixes: bdc62f2bae8f ("usb: typec: ucsi: Simplified registration and I/O API")
Cc: stable(a)vger.kernel.org
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/20230308154244.722337-3-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 0623861c597b..8d1baf28df55 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -1125,12 +1125,11 @@ static struct fwnode_handle *ucsi_find_fwnode(struct ucsi_connector *con)
return NULL;
}
-static int ucsi_register_port(struct ucsi *ucsi, int index)
+static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con)
{
struct usb_power_delivery_desc desc = { ucsi->cap.pd_version};
struct usb_power_delivery_capabilities_desc pd_caps;
struct usb_power_delivery_capabilities *pd_cap;
- struct ucsi_connector *con = &ucsi->connector[index];
struct typec_capability *cap = &con->typec_cap;
enum typec_accessory *accessory = cap->accessory;
enum usb_role u_role = USB_ROLE_NONE;
@@ -1151,7 +1150,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
init_completion(&con->complete);
mutex_init(&con->lock);
INIT_LIST_HEAD(&con->partner_tasks);
- con->num = index + 1;
con->ucsi = ucsi;
cap->fwnode = ucsi_find_fwnode(con);
@@ -1328,7 +1326,7 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
*/
static int ucsi_init(struct ucsi *ucsi)
{
- struct ucsi_connector *con;
+ struct ucsi_connector *con, *connector;
u64 command, ntfy;
int ret;
int i;
@@ -1359,16 +1357,16 @@ static int ucsi_init(struct ucsi *ucsi)
}
/* Allocate the connectors. Released in ucsi_unregister() */
- ucsi->connector = kcalloc(ucsi->cap.num_connectors + 1,
- sizeof(*ucsi->connector), GFP_KERNEL);
- if (!ucsi->connector) {
+ connector = kcalloc(ucsi->cap.num_connectors + 1, sizeof(*connector), GFP_KERNEL);
+ if (!connector) {
ret = -ENOMEM;
goto err_reset;
}
/* Register all connectors */
for (i = 0; i < ucsi->cap.num_connectors; i++) {
- ret = ucsi_register_port(ucsi, i);
+ connector[i].num = i + 1;
+ ret = ucsi_register_port(ucsi, &connector[i]);
if (ret)
goto err_unregister;
}
@@ -1380,11 +1378,12 @@ static int ucsi_init(struct ucsi *ucsi)
if (ret < 0)
goto err_unregister;
+ ucsi->connector = connector;
ucsi->ntfy = ntfy;
return 0;
err_unregister:
- for (con = ucsi->connector; con->port; con++) {
+ for (con = connector; con->port; con++) {
ucsi_unregister_partner(con);
ucsi_unregister_altmodes(con, UCSI_RECIPIENT_CON);
ucsi_unregister_port_psy(con);
@@ -1400,10 +1399,7 @@ static int ucsi_init(struct ucsi *ucsi)
typec_unregister_port(con->port);
con->port = NULL;
}
-
- kfree(ucsi->connector);
- ucsi->connector = NULL;
-
+ kfree(connector);
err_reset:
memset(&ucsi->cap, 0, sizeof(ucsi->cap));
ucsi_reset_ppm(ucsi);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 0482c34ec6f8557e06cd0f8e2d0e20e8ede6a22c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '16800048821504(a)kroah.com' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
0482c34ec6f8 ("usb: ucsi: Fix ucsi->connector race")
f87fb985452a ("usb: ucsi: Fix NULL pointer deref in ucsi_connector_change()")
924fb3ec50f5 ("Merge 6.2-rc7 into usb-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0482c34ec6f8557e06cd0f8e2d0e20e8ede6a22c Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Wed, 8 Mar 2023 16:42:43 +0100
Subject: [PATCH] usb: ucsi: Fix ucsi->connector race
ucsi_init() which runs from a workqueue sets ucsi->connector and
on an error will clear it again.
ucsi->connector gets dereferenced by ucsi_resume(), this checks for
ucsi->connector being NULL in case ucsi_init() has not finished yet;
or in case ucsi_init() has failed.
ucsi_init() setting ucsi->connector and then clearing it again on
an error creates a race where the check in ucsi_resume() may pass,
only to have ucsi->connector free-ed underneath it when ucsi_init()
hits an error.
Fix this race by making ucsi_init() store the connector array in
a local variable and only assign it to ucsi->connector on success.
Fixes: bdc62f2bae8f ("usb: typec: ucsi: Simplified registration and I/O API")
Cc: stable(a)vger.kernel.org
Reviewed-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/20230308154244.722337-3-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 0623861c597b..8d1baf28df55 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -1125,12 +1125,11 @@ static struct fwnode_handle *ucsi_find_fwnode(struct ucsi_connector *con)
return NULL;
}
-static int ucsi_register_port(struct ucsi *ucsi, int index)
+static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con)
{
struct usb_power_delivery_desc desc = { ucsi->cap.pd_version};
struct usb_power_delivery_capabilities_desc pd_caps;
struct usb_power_delivery_capabilities *pd_cap;
- struct ucsi_connector *con = &ucsi->connector[index];
struct typec_capability *cap = &con->typec_cap;
enum typec_accessory *accessory = cap->accessory;
enum usb_role u_role = USB_ROLE_NONE;
@@ -1151,7 +1150,6 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
init_completion(&con->complete);
mutex_init(&con->lock);
INIT_LIST_HEAD(&con->partner_tasks);
- con->num = index + 1;
con->ucsi = ucsi;
cap->fwnode = ucsi_find_fwnode(con);
@@ -1328,7 +1326,7 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
*/
static int ucsi_init(struct ucsi *ucsi)
{
- struct ucsi_connector *con;
+ struct ucsi_connector *con, *connector;
u64 command, ntfy;
int ret;
int i;
@@ -1359,16 +1357,16 @@ static int ucsi_init(struct ucsi *ucsi)
}
/* Allocate the connectors. Released in ucsi_unregister() */
- ucsi->connector = kcalloc(ucsi->cap.num_connectors + 1,
- sizeof(*ucsi->connector), GFP_KERNEL);
- if (!ucsi->connector) {
+ connector = kcalloc(ucsi->cap.num_connectors + 1, sizeof(*connector), GFP_KERNEL);
+ if (!connector) {
ret = -ENOMEM;
goto err_reset;
}
/* Register all connectors */
for (i = 0; i < ucsi->cap.num_connectors; i++) {
- ret = ucsi_register_port(ucsi, i);
+ connector[i].num = i + 1;
+ ret = ucsi_register_port(ucsi, &connector[i]);
if (ret)
goto err_unregister;
}
@@ -1380,11 +1378,12 @@ static int ucsi_init(struct ucsi *ucsi)
if (ret < 0)
goto err_unregister;
+ ucsi->connector = connector;
ucsi->ntfy = ntfy;
return 0;
err_unregister:
- for (con = ucsi->connector; con->port; con++) {
+ for (con = connector; con->port; con++) {
ucsi_unregister_partner(con);
ucsi_unregister_altmodes(con, UCSI_RECIPIENT_CON);
ucsi_unregister_port_psy(con);
@@ -1400,10 +1399,7 @@ static int ucsi_init(struct ucsi *ucsi)
typec_unregister_port(con->port);
con->port = NULL;
}
-
- kfree(ucsi->connector);
- ucsi->connector = NULL;
-
+ kfree(connector);
err_reset:
memset(&ucsi->cap, 0, sizeof(ucsi->cap));
ucsi_reset_ppm(ucsi);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x d8a2bb4eb75866275b5cf7de2e593ac3449643e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000482628177(a)kroah.com' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
d8a2bb4eb758 ("usb: dwc3: gadget: Add 1ms delay after end transfer command without IOC")
e192cc7b5239 ("usb: dwc3: gadget: move cmd_endtransfer to extra function")
d74dc3e9f58c ("usb: dwc3: gadget: Ignore NoStream after End Transfer")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d8a2bb4eb75866275b5cf7de2e593ac3449643e2 Mon Sep 17 00:00:00 2001
From: Wesley Cheng <quic_wcheng(a)quicinc.com>
Date: Mon, 6 Mar 2023 12:05:57 -0800
Subject: [PATCH] usb: dwc3: gadget: Add 1ms delay after end transfer command
without IOC
Previously, there was a 100uS delay inserted after issuing an end transfer
command for specific controller revisions. This was due to the fact that
there was a GUCTL2 bit field which enabled synchronous completion of the
end transfer command once the CMDACT bit was cleared in the DEPCMD
register. Since this bit does not exist for all controller revisions and
the current implementation heavily relies on utizling the EndTransfer
command completion interrupt, add the delay back in for uses where the
interrupt on completion bit is not set, and increase the duration to 1ms
for the controller to complete the command.
An issue was seen where the USB request buffer was unmapped while the DWC3
controller was still accessing the TRB. However, it was confirmed that the
end transfer command was successfully submitted. (no end transfer timeout)
In situations, such as dwc3_gadget_soft_disconnect() and
__dwc3_gadget_ep_disable(), the dwc3_remove_request() is utilized, which
will issue the end transfer command, and follow up with
dwc3_gadget_giveback(). At least for the USB ep disable path, it is
required for any pending and started requests to be completed and returned
to the function driver in the same context of the disable call. Without
the GUCTL2 bit, it is not ensured that the end transfer is completed before
the buffers are unmapped.
Fixes: cf2f8b63f7f1 ("usb: dwc3: gadget: Remove END_TRANSFER delay")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Wesley Cheng <quic_wcheng(a)quicinc.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Link: https://lore.kernel.org/r/20230306200557.29387-1-quic_wcheng@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 3c63fa97a680..cf5b4f49c3ed 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1699,6 +1699,7 @@ static int __dwc3_gadget_get_frame(struct dwc3 *dwc)
*/
static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool interrupt)
{
+ struct dwc3 *dwc = dep->dwc;
struct dwc3_gadget_ep_cmd_params params;
u32 cmd;
int ret;
@@ -1722,10 +1723,13 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int
WARN_ON_ONCE(ret);
dep->resource_index = 0;
- if (!interrupt)
+ if (!interrupt) {
+ if (!DWC3_IP_IS(DWC3) || DWC3_VER_IS_PRIOR(DWC3, 310A))
+ mdelay(1);
dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
- else if (!ret)
+ } else if (!ret) {
dep->flags |= DWC3_EP_END_TRANSFER_PENDING;
+ }
dep->flags &= ~DWC3_EP_DELAY_STOP;
return ret;
@@ -3774,7 +3778,11 @@ void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
* enabled, the EndTransfer command will have completed upon
* returning from this function.
*
- * This mode is NOT available on the DWC_usb31 IP.
+ * This mode is NOT available on the DWC_usb31 IP. In this
+ * case, if the IOC bit is not set, then delay by 1ms
+ * after issuing the EndTransfer command. This allows for the
+ * controller to handle the command completely before DWC3
+ * remove requests attempts to unmap USB request buffers.
*/
__dwc3_stop_active_transfer(dep, force, interrupt);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x d8a2bb4eb75866275b5cf7de2e593ac3449643e2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000482518837(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
d8a2bb4eb758 ("usb: dwc3: gadget: Add 1ms delay after end transfer command without IOC")
e192cc7b5239 ("usb: dwc3: gadget: move cmd_endtransfer to extra function")
d74dc3e9f58c ("usb: dwc3: gadget: Ignore NoStream after End Transfer")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d8a2bb4eb75866275b5cf7de2e593ac3449643e2 Mon Sep 17 00:00:00 2001
From: Wesley Cheng <quic_wcheng(a)quicinc.com>
Date: Mon, 6 Mar 2023 12:05:57 -0800
Subject: [PATCH] usb: dwc3: gadget: Add 1ms delay after end transfer command
without IOC
Previously, there was a 100uS delay inserted after issuing an end transfer
command for specific controller revisions. This was due to the fact that
there was a GUCTL2 bit field which enabled synchronous completion of the
end transfer command once the CMDACT bit was cleared in the DEPCMD
register. Since this bit does not exist for all controller revisions and
the current implementation heavily relies on utizling the EndTransfer
command completion interrupt, add the delay back in for uses where the
interrupt on completion bit is not set, and increase the duration to 1ms
for the controller to complete the command.
An issue was seen where the USB request buffer was unmapped while the DWC3
controller was still accessing the TRB. However, it was confirmed that the
end transfer command was successfully submitted. (no end transfer timeout)
In situations, such as dwc3_gadget_soft_disconnect() and
__dwc3_gadget_ep_disable(), the dwc3_remove_request() is utilized, which
will issue the end transfer command, and follow up with
dwc3_gadget_giveback(). At least for the USB ep disable path, it is
required for any pending and started requests to be completed and returned
to the function driver in the same context of the disable call. Without
the GUCTL2 bit, it is not ensured that the end transfer is completed before
the buffers are unmapped.
Fixes: cf2f8b63f7f1 ("usb: dwc3: gadget: Remove END_TRANSFER delay")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Wesley Cheng <quic_wcheng(a)quicinc.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Link: https://lore.kernel.org/r/20230306200557.29387-1-quic_wcheng@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 3c63fa97a680..cf5b4f49c3ed 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1699,6 +1699,7 @@ static int __dwc3_gadget_get_frame(struct dwc3 *dwc)
*/
static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool interrupt)
{
+ struct dwc3 *dwc = dep->dwc;
struct dwc3_gadget_ep_cmd_params params;
u32 cmd;
int ret;
@@ -1722,10 +1723,13 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int
WARN_ON_ONCE(ret);
dep->resource_index = 0;
- if (!interrupt)
+ if (!interrupt) {
+ if (!DWC3_IP_IS(DWC3) || DWC3_VER_IS_PRIOR(DWC3, 310A))
+ mdelay(1);
dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
- else if (!ret)
+ } else if (!ret) {
dep->flags |= DWC3_EP_END_TRANSFER_PENDING;
+ }
dep->flags &= ~DWC3_EP_DELAY_STOP;
return ret;
@@ -3774,7 +3778,11 @@ void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
* enabled, the EndTransfer command will have completed upon
* returning from this function.
*
- * This mode is NOT available on the DWC_usb31 IP.
+ * This mode is NOT available on the DWC_usb31 IP. In this
+ * case, if the IOC bit is not set, then delay by 1ms
+ * after issuing the EndTransfer command. This allows for the
+ * controller to handle the command completely before DWC3
+ * remove requests attempts to unmap USB request buffers.
*/
__dwc3_stop_active_transfer(dep, force, interrupt);
From: Jeff Layton <jlayton(a)kernel.org>
commit 7ff84910c66c9144cc0de9d9deed9fb84c03aff0 upstream.
Commit 6930bcbfb6ce dropped the setting of the file_lock range when
decoding a nlm_lock off the wire. This causes the client side grant
callback to miss matching blocks and reject the lock, only to rerequest
it 30s later.
Add a helper function to set the file_lock range from the start and end
values that the protocol uses, and have the nlm_lock decoder call that to
set up the file_lock args properly.
Fixes: 6930bcbfb6ce ("lockd: detect and reject lock arguments that overflow")
Reported-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: Jeff Layton <jlayton(a)kernel.org>
Tested-by: Amir Goldstein <amir73il(a)gmail.com>
Cc: stable(a)vger.kernel.org #6.0
Signed-off-by: Anna Schumaker <Anna.Schumaker(a)Netapp.com>
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
---
Greg,
The upstream fix applies cleanly to 6.1.y and 6.2.y, so as the
Cc stable mentions, please apply upstream fix to those trees.
Alas, the regressing commit was also applied to v5.15.61,
so please apply this backport to fix 5.15.y.
Thanks,
Amir.
fs/lockd/clnt4xdr.c | 9 +--------
fs/lockd/xdr4.c | 13 ++++++++++++-
include/linux/lockd/xdr4.h | 1 +
3 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/fs/lockd/clnt4xdr.c b/fs/lockd/clnt4xdr.c
index 7df6324ccb8a..8161667c976f 100644
--- a/fs/lockd/clnt4xdr.c
+++ b/fs/lockd/clnt4xdr.c
@@ -261,7 +261,6 @@ static int decode_nlm4_holder(struct xdr_stream *xdr, struct nlm_res *result)
u32 exclusive;
int error;
__be32 *p;
- s32 end;
memset(lock, 0, sizeof(*lock));
locks_init_lock(fl);
@@ -285,13 +284,7 @@ static int decode_nlm4_holder(struct xdr_stream *xdr, struct nlm_res *result)
fl->fl_type = exclusive != 0 ? F_WRLCK : F_RDLCK;
p = xdr_decode_hyper(p, &l_offset);
xdr_decode_hyper(p, &l_len);
- end = l_offset + l_len - 1;
-
- fl->fl_start = (loff_t)l_offset;
- if (l_len == 0 || end < 0)
- fl->fl_end = OFFSET_MAX;
- else
- fl->fl_end = (loff_t)end;
+ nlm4svc_set_file_lock_range(fl, l_offset, l_len);
error = 0;
out:
return error;
diff --git a/fs/lockd/xdr4.c b/fs/lockd/xdr4.c
index 72f7d190fb3b..b303ecd74f33 100644
--- a/fs/lockd/xdr4.c
+++ b/fs/lockd/xdr4.c
@@ -33,6 +33,17 @@ loff_t_to_s64(loff_t offset)
return res;
}
+void nlm4svc_set_file_lock_range(struct file_lock *fl, u64 off, u64 len)
+{
+ s64 end = off + len - 1;
+
+ fl->fl_start = off;
+ if (len == 0 || end < 0)
+ fl->fl_end = OFFSET_MAX;
+ else
+ fl->fl_end = end;
+}
+
/*
* NLM file handles are defined by specification to be a variable-length
* XDR opaque no longer than 1024 bytes. However, this implementation
@@ -80,7 +91,7 @@ svcxdr_decode_lock(struct xdr_stream *xdr, struct nlm_lock *lock)
locks_init_lock(fl);
fl->fl_flags = FL_POSIX;
fl->fl_type = F_RDLCK;
-
+ nlm4svc_set_file_lock_range(fl, lock->lock_start, lock->lock_len);
return true;
}
diff --git a/include/linux/lockd/xdr4.h b/include/linux/lockd/xdr4.h
index 5ae766f26e04..025250ade98e 100644
--- a/include/linux/lockd/xdr4.h
+++ b/include/linux/lockd/xdr4.h
@@ -24,6 +24,7 @@
+void nlm4svc_set_file_lock_range(struct file_lock *fl, u64 off, u64 len);
int nlm4svc_decode_testargs(struct svc_rqst *, __be32 *);
int nlm4svc_encode_testres(struct svc_rqst *, __be32 *);
int nlm4svc_decode_lockargs(struct svc_rqst *, __be32 *);
--
2.16.5
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x a075bacde257f755bea0e53400c9f1cdd1b8e8e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000454829207(a)kroah.com' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
a075bacde257 ("fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY")
ed45e2016493 ("fs-verity: rename "file measurement" to "file digest"")
9e90f30e7857 ("fs-verity: rename fsverity_signed_digest to fsverity_formatted_digest")
7bf765dd8442 ("fs-verity: remove filenames from file comments")
6377a38bd345 ("fs-verity: fix all kerneldoc warnings")
439bea104c3d ("fs-verity: use mempool for hash requests")
fd39073dba86 ("fs-verity: implement readahead of Merkle tree pages")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a075bacde257f755bea0e53400c9f1cdd1b8e8e6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 14 Mar 2023 16:31:32 -0700
Subject: [PATCH] fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing
performance problems and is hindering adoption of fsverity. It was
intended to solve a race condition where unverified pages might be left
in the pagecache. But actually it doesn't solve it fully.
Since the incomplete solution for this race condition has too much
performance impact for it to be worth it, let's remove it for now.
Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl")
Cc: stable(a)vger.kernel.org
Reviewed-by: Victor Hsieh <victorhsieh(a)google.com>
Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index e13db6507b38..7a0e3a84d370 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -8,7 +8,6 @@
#include "fsverity_private.h"
#include <linux/mount.h>
-#include <linux/pagemap.h>
#include <linux/sched/signal.h>
#include <linux/uaccess.h>
@@ -367,25 +366,27 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg)
goto out_drop_write;
err = enable_verity(filp, &arg);
- if (err)
- goto out_allow_write_access;
/*
- * Some pages of the file may have been evicted from pagecache after
- * being used in the Merkle tree construction, then read into pagecache
- * again by another process reading from the file concurrently. Since
- * these pages didn't undergo verification against the file digest which
- * fs-verity now claims to be enforcing, we have to wipe the pagecache
- * to ensure that all future reads are verified.
+ * We no longer drop the inode's pagecache after enabling verity. This
+ * used to be done to try to avoid a race condition where pages could be
+ * evicted after being used in the Merkle tree construction, then
+ * re-instantiated by a concurrent read. Such pages are unverified, and
+ * the backing storage could have filled them with different content, so
+ * they shouldn't be used to fulfill reads once verity is enabled.
+ *
+ * But, dropping the pagecache has a big performance impact, and it
+ * doesn't fully solve the race condition anyway. So for those reasons,
+ * and also because this race condition isn't very important relatively
+ * speaking (especially for small-ish files, where the chance of a page
+ * being used, evicted, *and* re-instantiated all while enabling verity
+ * is quite small), we no longer drop the inode's pagecache.
*/
- filemap_write_and_wait(inode->i_mapping);
- invalidate_inode_pages2(inode->i_mapping);
/*
* allow_write_access() is needed to pair with deny_write_access().
* Regardless, the filesystem won't allow writing to verity files.
*/
-out_allow_write_access:
allow_write_access(filp);
out_drop_write:
mnt_drop_write_file(filp);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x a075bacde257f755bea0e53400c9f1cdd1b8e8e6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000454721840(a)kroah.com' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
a075bacde257 ("fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY")
ed45e2016493 ("fs-verity: rename "file measurement" to "file digest"")
9e90f30e7857 ("fs-verity: rename fsverity_signed_digest to fsverity_formatted_digest")
7bf765dd8442 ("fs-verity: remove filenames from file comments")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a075bacde257f755bea0e53400c9f1cdd1b8e8e6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Tue, 14 Mar 2023 16:31:32 -0700
Subject: [PATCH] fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing
performance problems and is hindering adoption of fsverity. It was
intended to solve a race condition where unverified pages might be left
in the pagecache. But actually it doesn't solve it fully.
Since the incomplete solution for this race condition has too much
performance impact for it to be worth it, let's remove it for now.
Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl")
Cc: stable(a)vger.kernel.org
Reviewed-by: Victor Hsieh <victorhsieh(a)google.com>
Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index e13db6507b38..7a0e3a84d370 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -8,7 +8,6 @@
#include "fsverity_private.h"
#include <linux/mount.h>
-#include <linux/pagemap.h>
#include <linux/sched/signal.h>
#include <linux/uaccess.h>
@@ -367,25 +366,27 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg)
goto out_drop_write;
err = enable_verity(filp, &arg);
- if (err)
- goto out_allow_write_access;
/*
- * Some pages of the file may have been evicted from pagecache after
- * being used in the Merkle tree construction, then read into pagecache
- * again by another process reading from the file concurrently. Since
- * these pages didn't undergo verification against the file digest which
- * fs-verity now claims to be enforcing, we have to wipe the pagecache
- * to ensure that all future reads are verified.
+ * We no longer drop the inode's pagecache after enabling verity. This
+ * used to be done to try to avoid a race condition where pages could be
+ * evicted after being used in the Merkle tree construction, then
+ * re-instantiated by a concurrent read. Such pages are unverified, and
+ * the backing storage could have filled them with different content, so
+ * they shouldn't be used to fulfill reads once verity is enabled.
+ *
+ * But, dropping the pagecache has a big performance impact, and it
+ * doesn't fully solve the race condition anyway. So for those reasons,
+ * and also because this race condition isn't very important relatively
+ * speaking (especially for small-ish files, where the chance of a page
+ * being used, evicted, *and* re-instantiated all while enabling verity
+ * is quite small), we no longer drop the inode's pagecache.
*/
- filemap_write_and_wait(inode->i_mapping);
- invalidate_inode_pages2(inode->i_mapping);
/*
* allow_write_access() is needed to pair with deny_write_access().
* Regardless, the filesystem won't allow writing to verity files.
*/
-out_allow_write_access:
allow_write_access(filp);
out_drop_write:
mnt_drop_write_file(filp);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x ccb820dc7d2236b1af0d54ae038a27b5b6d5ae5a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '168000451363159(a)kroah.com' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
ccb820dc7d22 ("fscrypt: destroy keyring after security_sb_delete()")
83e804f0bfee ("fs,security: Add sb_delete hook")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ccb820dc7d2236b1af0d54ae038a27b5b6d5ae5a Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Mon, 13 Mar 2023 15:12:29 -0700
Subject: [PATCH] fscrypt: destroy keyring after security_sb_delete()
fscrypt_destroy_keyring() must be called after all potentially-encrypted
inodes were evicted; otherwise it cannot safely destroy the keyring.
Since inodes that are in-use by the Landlock LSM don't get evicted until
security_sb_delete(), this means that fscrypt_destroy_keyring() must be
called *after* security_sb_delete().
This fixes a WARN_ON followed by a NULL dereference, only possible if
Landlock was being used on encrypted files.
Fixes: d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+93e495f6a4f748827c88(a)syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/00000000000044651705f6ca1e30@google.com
Reviewed-by: Christian Brauner <brauner(a)kernel.org>
Link: https://lore.kernel.org/r/20230313221231.272498-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
diff --git a/fs/super.c b/fs/super.c
index 84332d5cb817..04bc62ab7dfe 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -475,13 +475,22 @@ void generic_shutdown_super(struct super_block *sb)
cgroup_writeback_umount();
- /* evict all inodes with zero refcount */
+ /* Evict all inodes with zero refcount. */
evict_inodes(sb);
- /* only nonzero refcount inodes can have marks */
+
+ /*
+ * Clean up and evict any inodes that still have references due
+ * to fsnotify or the security policy.
+ */
fsnotify_sb_delete(sb);
- fscrypt_destroy_keyring(sb);
security_sb_delete(sb);
+ /*
+ * Now that all potentially-encrypted inodes have been evicted,
+ * the fscrypt keyring can be destroyed.
+ */
+ fscrypt_destroy_keyring(sb);
+
if (sb->s_dio_done_wq) {
destroy_workqueue(sb->s_dio_done_wq);
sb->s_dio_done_wq = NULL;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 2f4e429c846972c8405951a9ff7a82aceeca7461
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1680003340205249(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
2f4e429c8469 ("cifs: lock chan_lock outside match_session")
d7d7a66aacd6 ("cifs: avoid use of global locks for high contention data")
9543c8ab3016 ("cifs: list_for_each() -> list_for_each_entry()")
af3a6d1018f0 ("cifs: update cifs_ses::ip_addr after failover")
b54034a73baf ("cifs: during reconnect, update interface if necessary")
cc391b694ff0 ("cifs: fix potential deadlock in direct reclaim")
5752bf645f9d ("cifs: avoid parallel session setups on same channel")
dd3cd8709ed5 ("cifs: use new enum for ses_status")
1a6a41d4cedd ("cifs: do not use tcpStatus after negotiate completes")
cd70a3e8988a ("cifs: use correct lock type in cifs_reconnect()")
fb39d30e2272 ("cifs: force new session setup and tcon for dfs")
9a005bea4f59 ("Merge tag '5.18-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2f4e429c846972c8405951a9ff7a82aceeca7461 Mon Sep 17 00:00:00 2001
From: Shyam Prasad N <sprasad(a)microsoft.com>
Date: Mon, 20 Feb 2023 13:02:11 +0000
Subject: [PATCH] cifs: lock chan_lock outside match_session
Coverity had rightly indicated a possible deadlock
due to chan_lock being done inside match_session.
All callers of match_* functions should pick up the
necessary locks and call them.
Signed-off-by: Shyam Prasad N <sprasad(a)microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc(a)manguebit.com>
Cc: stable(a)vger.kernel.org
Fixes: 724244cdb382 ("cifs: protect session channel fields with chan_lock")
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 49b37594e991..f42cc7077312 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1721,7 +1721,7 @@ cifs_get_tcp_session(struct smb3_fs_context *ctx,
return ERR_PTR(rc);
}
-/* this function must be called with ses_lock held */
+/* this function must be called with ses_lock and chan_lock held */
static int match_session(struct cifs_ses *ses, struct smb3_fs_context *ctx)
{
if (ctx->sectype != Unspecified &&
@@ -1732,12 +1732,8 @@ static int match_session(struct cifs_ses *ses, struct smb3_fs_context *ctx)
* If an existing session is limited to less channels than
* requested, it should not be reused
*/
- spin_lock(&ses->chan_lock);
- if (ses->chan_max < ctx->max_channels) {
- spin_unlock(&ses->chan_lock);
+ if (ses->chan_max < ctx->max_channels)
return 0;
- }
- spin_unlock(&ses->chan_lock);
switch (ses->sectype) {
case Kerberos:
@@ -1865,10 +1861,13 @@ cifs_find_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
spin_unlock(&ses->ses_lock);
continue;
}
+ spin_lock(&ses->chan_lock);
if (!match_session(ses, ctx)) {
+ spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
continue;
}
+ spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
++ses->ses_count;
@@ -2693,6 +2692,7 @@ cifs_match_super(struct super_block *sb, void *data)
spin_lock(&tcp_srv->srv_lock);
spin_lock(&ses->ses_lock);
+ spin_lock(&ses->chan_lock);
spin_lock(&tcon->tc_lock);
if (!match_server(tcp_srv, ctx, dfs_super_cmp) ||
!match_session(ses, ctx) ||
@@ -2705,6 +2705,7 @@ cifs_match_super(struct super_block *sb, void *data)
rc = compare_mount_options(sb, mnt_data);
out:
spin_unlock(&tcon->tc_lock);
+ spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
spin_unlock(&tcp_srv->srv_lock);
This is a note to let you know that I've just added the patch titled
iio: adc: ti-ads7950: Set `can_sleep` flag for GPIO chip
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 363c7dc72f79edd55bf1c4380e0fbf7f1bbc2c86 Mon Sep 17 00:00:00 2001
From: Lars-Peter Clausen <lars(a)metafoo.de>
Date: Sun, 12 Mar 2023 14:09:33 -0700
Subject: iio: adc: ti-ads7950: Set `can_sleep` flag for GPIO chip
The ads7950 uses a mutex as well as SPI transfers in its GPIO callbacks.
This means these callbacks can sleep and the `can_sleep` flag should be
set.
Having the flag set will make sure that warnings are generated when calling
any of the callbacks from a potentially non-sleeping context.
Fixes: c97dce792dc8 ("iio: adc: ti-ads7950: add GPIO support")
Signed-off-by: Lars-Peter Clausen <lars(a)metafoo.de>
Acked-by: David Lechner <david(a)lechnology.com>
Link: https://lore.kernel.org/r/20230312210933.2275376-1-lars@metafoo.de
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/ti-ads7950.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/iio/adc/ti-ads7950.c b/drivers/iio/adc/ti-ads7950.c
index 2cc9a9bd9db6..263fc3a1b87e 100644
--- a/drivers/iio/adc/ti-ads7950.c
+++ b/drivers/iio/adc/ti-ads7950.c
@@ -634,6 +634,7 @@ static int ti_ads7950_probe(struct spi_device *spi)
st->chip.label = dev_name(&st->spi->dev);
st->chip.parent = &st->spi->dev;
st->chip.owner = THIS_MODULE;
+ st->chip.can_sleep = true;
st->chip.base = -1;
st->chip.ngpio = TI_ADS7950_NUM_GPIOS;
st->chip.get_direction = ti_ads7950_get_direction;
--
2.40.0
This is a note to let you know that I've just added the patch titled
iio: adc: max11410: fix read_poll_timeout() usage
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 7b3825e9487d77e83bf1e27b10a74cd729b8f972 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nuno=20S=C3=A1?= <nuno.sa(a)analog.com>
Date: Tue, 7 Mar 2023 10:53:03 +0100
Subject: iio: adc: max11410: fix read_poll_timeout() usage
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Even though we are passing 'ret' as stop condition for
read_poll_timeout(), that return code is still being ignored. The reason
is that the poll will stop if the passed condition is true which will
happen if the passed op() returns error. However, read_poll_timeout()
returns 0 if the *complete* condition evaluates to true. Therefore, the
error code returned by op() will be ignored.
To fix this we need to check for both error codes:
* The one returned by read_poll_timeout() which is either 0 or
ETIMEDOUT.
* The one returned by the passed op().
Fixes: a44ef7c46097 ("iio: adc: add max11410 adc driver")
Signed-off-by: Nuno Sá <nuno.sa(a)analog.com>
Acked-by: Ibrahim Tilki <Ibrahim.Tilki(a)analog.com>
Link: https://lore.kernel.org/r/20230307095303.713251-1-nuno.sa@analog.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/max11410.c | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/drivers/iio/adc/max11410.c b/drivers/iio/adc/max11410.c
index fdc9f03135b5..e64cd979688d 100644
--- a/drivers/iio/adc/max11410.c
+++ b/drivers/iio/adc/max11410.c
@@ -413,13 +413,17 @@ static int max11410_sample(struct max11410_state *st, int *sample_raw,
if (!ret)
return -ETIMEDOUT;
} else {
+ int ret2;
+
/* Wait for status register Conversion Ready flag */
- ret = read_poll_timeout(max11410_read_reg, ret,
- ret || (val & MAX11410_STATUS_CONV_READY_BIT),
+ ret = read_poll_timeout(max11410_read_reg, ret2,
+ ret2 || (val & MAX11410_STATUS_CONV_READY_BIT),
5000, MAX11410_CONVERSION_TIMEOUT_MS * 1000,
true, st, MAX11410_REG_STATUS, &val);
if (ret)
return ret;
+ if (ret2)
+ return ret2;
}
/* Read ADC Data */
@@ -850,17 +854,21 @@ static int max11410_init_vref(struct device *dev,
static int max11410_calibrate(struct max11410_state *st, u32 cal_type)
{
- int ret, val;
+ int ret, ret2, val;
ret = max11410_write_reg(st, MAX11410_REG_CAL_START, cal_type);
if (ret)
return ret;
/* Wait for status register Calibration Ready flag */
- return read_poll_timeout(max11410_read_reg, ret,
- ret || (val & MAX11410_STATUS_CAL_READY_BIT),
- 50000, MAX11410_CALIB_TIMEOUT_MS * 1000, true,
- st, MAX11410_REG_STATUS, &val);
+ ret = read_poll_timeout(max11410_read_reg, ret2,
+ ret2 || (val & MAX11410_STATUS_CAL_READY_BIT),
+ 50000, MAX11410_CALIB_TIMEOUT_MS * 1000, true,
+ st, MAX11410_REG_STATUS, &val);
+ if (ret)
+ return ret;
+
+ return ret2;
}
static int max11410_self_calibrate(struct max11410_state *st)
--
2.40.0
This is a note to let you know that I've just added the patch titled
iio: light: cm32181: Unregister second I2C client if present
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 099cc90a5a62e68b2fe3a42da011ab929b98bf73 Mon Sep 17 00:00:00 2001
From: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Date: Thu, 23 Feb 2023 10:00:59 +0800
Subject: iio: light: cm32181: Unregister second I2C client if present
If a second dummy client that talks to the actual I2C address was
created in probe(), there should be a proper cleanup on driver and
device removal to avoid leakage.
So unregister the dummy client via another callback.
Reviewed-by: Hans de Goede <hdegoede(a)redhat.com>
Suggested-by: Hans de Goede <hdegoede(a)redhat.com>
Fixes: c1e62062ff54 ("iio: light: cm32181: Handle CM3218 ACPI devices with 2 I2C resources")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2152281
Signed-off-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Link: https://lore.kernel.org/r/20230223020059.2013993-1-kai.heng.feng@canonical.…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/light/cm32181.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/iio/light/cm32181.c b/drivers/iio/light/cm32181.c
index b1674a5bfa36..d4a34a3bf00d 100644
--- a/drivers/iio/light/cm32181.c
+++ b/drivers/iio/light/cm32181.c
@@ -429,6 +429,14 @@ static const struct iio_info cm32181_info = {
.attrs = &cm32181_attribute_group,
};
+static void cm32181_unregister_dummy_client(void *data)
+{
+ struct i2c_client *client = data;
+
+ /* Unregister the dummy client */
+ i2c_unregister_device(client);
+}
+
static int cm32181_probe(struct i2c_client *client)
{
struct device *dev = &client->dev;
@@ -460,6 +468,10 @@ static int cm32181_probe(struct i2c_client *client)
client = i2c_acpi_new_device(dev, 1, &board_info);
if (IS_ERR(client))
return PTR_ERR(client);
+
+ ret = devm_add_action_or_reset(dev, cm32181_unregister_dummy_client, client);
+ if (ret)
+ return ret;
}
cm32181 = iio_priv(indio_dev);
--
2.40.0
This is a note to let you know that I've just added the patch titled
iio: accel: kionix-kx022a: Get the timestamp from the driver's
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 03fada47311a3e668f73efc9278c4a559e64ee85 Mon Sep 17 00:00:00 2001
From: Mehdi Djait <mehdi.djait.k(a)gmail.com>
Date: Sat, 18 Feb 2023 14:51:11 +0100
Subject: iio: accel: kionix-kx022a: Get the timestamp from the driver's
private data in the trigger_handler
The trigger_handler gets called from the IRQ thread handler using
iio_trigger_poll_chained() which will only call the bottom half of the
pollfunc and therefore pf->timestamp will not get set.
Use instead the timestamp from the driver's private data which is always
set in the IRQ handler.
Fixes: 7c1d1677b322 ("iio: accel: Support Kionix/ROHM KX022A accelerometer")
Link: https://lore.kernel.org/linux-iio/Y+6QoBLh1k82cJVN@carbian/
Reviewed-by: Matti Vaittinen <mazziesaccount(a)gmail.com>
Signed-off-by: Mehdi Djait <mehdi.djait.k(a)gmail.com>
Link: https://lore.kernel.org/r/20230218135111.90061-1-mehdi.djait.k@gmail.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/accel/kionix-kx022a.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/accel/kionix-kx022a.c b/drivers/iio/accel/kionix-kx022a.c
index f866859855cd..1c3a72380fb8 100644
--- a/drivers/iio/accel/kionix-kx022a.c
+++ b/drivers/iio/accel/kionix-kx022a.c
@@ -864,7 +864,7 @@ static irqreturn_t kx022a_trigger_handler(int irq, void *p)
if (ret < 0)
goto err_read;
- iio_push_to_buffers_with_timestamp(idev, data->buffer, pf->timestamp);
+ iio_push_to_buffers_with_timestamp(idev, data->buffer, data->timestamp);
err_read:
iio_trigger_notify_done(idev->trig);
--
2.40.0
This is a note to let you know that I've just added the patch titled
iio: buffer: make sure O_NONBLOCK is respected
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 3da1814184582ed0faf039275a3f02e6f69944ee Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nuno=20S=C3=A1?= <nuno.sa(a)analog.com>
Date: Thu, 16 Feb 2023 11:14:51 +0100
Subject: iio: buffer: make sure O_NONBLOCK is respected
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
For output buffers, there's no guarantee that the buffer won't be full
in the first iteration of the loop in which case we would block
independently of userspace passing O_NONBLOCK or not. Fix it by always
checking the flag before going to sleep.
While at it (and as it's a bit related), refactored the loop so that the
stop condition is 'written != n', i.e, run the loop until all data has
been copied into the IIO buffers. This makes the code a bit simpler.
Fixes: 9eeee3b0bf190 ("iio: Add output buffer support")
Signed-off-by: Nuno Sá <nuno.sa(a)analog.com>
Reviewed-by: Lars-Peter Clausen <lars(a)metafoo.de>
Link: https://lore.kernel.org/r/20230216101452.591805-3-nuno.sa@analog.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/industrialio-buffer.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
index 6340d8e1430b..a7a080bed180 100644
--- a/drivers/iio/industrialio-buffer.c
+++ b/drivers/iio/industrialio-buffer.c
@@ -203,21 +203,24 @@ static ssize_t iio_buffer_write(struct file *filp, const char __user *buf,
break;
}
+ if (filp->f_flags & O_NONBLOCK) {
+ if (!written)
+ ret = -EAGAIN;
+ break;
+ }
+
wait_woken(&wait, TASK_INTERRUPTIBLE,
MAX_SCHEDULE_TIMEOUT);
continue;
}
ret = rb->access->write(rb, n - written, buf + written);
- if (ret == 0 && (filp->f_flags & O_NONBLOCK))
- ret = -EAGAIN;
+ if (ret < 0)
+ break;
- if (ret > 0) {
- written += ret;
- if (written != n && !(filp->f_flags & O_NONBLOCK))
- continue;
- }
- } while (ret == 0);
+ written += ret;
+
+ } while (written != n);
remove_wait_queue(&rb->pollq, &wait);
return ret < 0 ? ret : written;
--
2.40.0
This is a note to let you know that I've just added the patch titled
iio: buffer: correctly return bytes written in output buffers
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From b5184a26a28fac1d708b0bfeeb958a9260c2924c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nuno=20S=C3=A1?= <nuno.sa(a)analog.com>
Date: Thu, 16 Feb 2023 11:14:50 +0100
Subject: iio: buffer: correctly return bytes written in output buffers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If for some reason 'rb->access->write()' does not write the full
requested data and the O_NONBLOCK is set, we would return 'n' to
userspace which is not really truth. Hence, let's return the number of
bytes we effectively wrote.
Fixes: 9eeee3b0bf190 ("iio: Add output buffer support")
Signed-off-by: Nuno Sá <nuno.sa(a)analog.com>
Reviewed-by: Lars-Peter Clausen <lars(a)metafoo.de>
Link: https://lore.kernel.org/r/20230216101452.591805-2-nuno.sa@analog.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/industrialio-buffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
index 80c78bd6bbef..6340d8e1430b 100644
--- a/drivers/iio/industrialio-buffer.c
+++ b/drivers/iio/industrialio-buffer.c
@@ -220,7 +220,7 @@ static ssize_t iio_buffer_write(struct file *filp, const char __user *buf,
} while (ret == 0);
remove_wait_queue(&rb->pollq, &wait);
- return ret < 0 ? ret : n;
+ return ret < 0 ? ret : written;
}
/**
--
2.40.0
This is a note to let you know that I've just added the patch titled
iio: light: vcnl4000: Fix WARN_ON on uninitialized lock
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 42ec40b0883c1cce58b06e8fa82049a61033151c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?M=C3=A5rten=20Lindahl?= <marten.lindahl(a)axis.com>
Date: Tue, 31 Jan 2023 15:01:09 +0100
Subject: iio: light: vcnl4000: Fix WARN_ON on uninitialized lock
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
There are different init functions for the sensors in this driver in
which only one initializes the generic vcnl4000_lock. With commit
e21b5b1f2669 ("iio: light: vcnl4000: Preserve conf bits when toggle power")
the vcnl4040 sensor started to depend on the lock, but it was missed to
initialize it in vcnl4040's init function. This has not been visible
until we run lockdep on it:
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
at kernel/locking/mutex.c:575 __mutex_lock+0x4f8/0x890
Call trace:
__mutex_lock
mutex_lock_nested
vcnl4200_set_power_state
vcnl4200_init
vcnl4000_probe
Fix this by initializing the lock in the probe function instead of doing
it in the chip specific init functions.
Fixes: e21b5b1f2669 ("iio: light: vcnl4000: Preserve conf bits when toggle power")
Signed-off-by: Mårten Lindahl <marten.lindahl(a)axis.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Link: https://lore.kernel.org/r/20230131140109.2067577-1-marten.lindahl@axis.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/light/vcnl4000.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/light/vcnl4000.c b/drivers/iio/light/vcnl4000.c
index cc1a2062e76d..69c5bc987e26 100644
--- a/drivers/iio/light/vcnl4000.c
+++ b/drivers/iio/light/vcnl4000.c
@@ -199,7 +199,6 @@ static int vcnl4000_init(struct vcnl4000_data *data)
data->rev = ret & 0xf;
data->al_scale = 250000;
- mutex_init(&data->vcnl4000_lock);
return data->chip_spec->set_power_state(data, true);
};
@@ -1197,6 +1196,8 @@ static int vcnl4000_probe(struct i2c_client *client)
data->id = id->driver_data;
data->chip_spec = &vcnl4000_chip_spec_cfg[data->id];
+ mutex_init(&data->vcnl4000_lock);
+
ret = data->chip_spec->init(data);
if (ret < 0)
return ret;
--
2.40.0
This is a note to let you know that I've just added the patch titled
iio: adis16480: select CONFIG_CRC32
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From d9b540ee461cca7edca0dd2c2a42625c6b9ffb8f Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd(a)arndb.de>
Date: Tue, 31 Jan 2023 10:46:11 +0100
Subject: iio: adis16480: select CONFIG_CRC32
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
In rare randconfig builds, the missing CRC32 helper causes
a link error:
ld.lld: error: undefined symbol: crc32_le
>>> referenced by usercopy_64.c
>>> vmlinux.o:(adis16480_trigger_handler)
Fixes: 941f130881fa ("iio: adis16480: support burst read function")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Reviewed-by: Nuno Sá <nuno.sa(a)analog.com>
Link: https://lore.kernel.org/r/20230131094616.130238-1-arnd@kernel.org
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/iio/imu/Kconfig b/drivers/iio/imu/Kconfig
index f1d7d4b5e222..c2f97629e9cd 100644
--- a/drivers/iio/imu/Kconfig
+++ b/drivers/iio/imu/Kconfig
@@ -47,6 +47,7 @@ config ADIS16480
depends on SPI
select IIO_ADIS_LIB
select IIO_ADIS_LIB_BUFFER if IIO_BUFFER
+ select CRC32
help
Say yes here to build support for Analog Devices ADIS16375, ADIS16480,
ADIS16485, ADIS16488 inertial sensors.
--
2.40.0
This is a note to let you know that I've just added the patch titled
drivers: iio: adc: ltc2497: fix LSB shift
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 6327a930ab7bfa1ab33bcdffd5f5f4b1e7131504 Mon Sep 17 00:00:00 2001
From: Ian Ray <ian.ray(a)ge.com>
Date: Fri, 27 Jan 2023 14:57:14 +0200
Subject: drivers: iio: adc: ltc2497: fix LSB shift
Correct the "sub_lsb" shift for the ltc2497 and drop the sub_lsb element
which is now constant.
An earlier version of the code shifted by 14 but this was a consequence
of reading three bytes into a __be32 buffer and using be32_to_cpu(), so
eight extra bits needed to be skipped. Now we use get_unaligned_be24()
and thus the additional skip is wrong.
Fixes: 2187cfeb3626 ("drivers: iio: adc: ltc2497: LTC2499 support")
Signed-off-by: Ian Ray <ian.ray(a)ge.com>
Link: https://lore.kernel.org/r/20230127125714.44608-1-ian.ray@ge.com
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/ltc2497.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/iio/adc/ltc2497.c b/drivers/iio/adc/ltc2497.c
index 17370c5eb6fe..ec198c6f13d6 100644
--- a/drivers/iio/adc/ltc2497.c
+++ b/drivers/iio/adc/ltc2497.c
@@ -28,7 +28,6 @@ struct ltc2497_driverdata {
struct ltc2497core_driverdata common_ddata;
struct i2c_client *client;
u32 recv_size;
- u32 sub_lsb;
/*
* DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
@@ -65,10 +64,10 @@ static int ltc2497_result_and_measure(struct ltc2497core_driverdata *ddata,
* equivalent to a sign extension.
*/
if (st->recv_size == 3) {
- *val = (get_unaligned_be24(st->data.d8) >> st->sub_lsb)
+ *val = (get_unaligned_be24(st->data.d8) >> 6)
- BIT(ddata->chip_info->resolution + 1);
} else {
- *val = (be32_to_cpu(st->data.d32) >> st->sub_lsb)
+ *val = (be32_to_cpu(st->data.d32) >> 6)
- BIT(ddata->chip_info->resolution + 1);
}
@@ -122,7 +121,6 @@ static int ltc2497_probe(struct i2c_client *client)
st->common_ddata.chip_info = chip_info;
resolution = chip_info->resolution;
- st->sub_lsb = 31 - (resolution + 1);
st->recv_size = BITS_TO_BYTES(resolution) + 1;
return ltc2497core_probe(dev, indio_dev);
--
2.40.0
This is a note to let you know that I've just added the patch titled
iio: adc: qcom-spmi-adc5: Fix the channel name
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 701c875aded880013aacac608832995c4b052257 Mon Sep 17 00:00:00 2001
From: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Date: Wed, 18 Jan 2023 12:06:23 +0200
Subject: iio: adc: qcom-spmi-adc5: Fix the channel name
The node name can contain an address part which is unused
by the driver. Moreover, this string is propagated into
the userspace label, sysfs filenames *and breaking ABI*.
Cut the address part out before assigning the channel name.
Fixes: 4f47a236a23d ("iio: adc: qcom-spmi-adc5: convert to device properties")
Reported-by: Marijn Suijten <marijn.suijten(a)somainline.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Reviewed-by: Marijn Suijten <marijn.suijten(a)somainline.org>
Link: https://lore.kernel.org/r/20230118100623.42255-1-andriy.shevchenko@linux.in…
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/adc/qcom-spmi-adc5.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/adc/qcom-spmi-adc5.c b/drivers/iio/adc/qcom-spmi-adc5.c
index 821fee60a765..d1b86570768a 100644
--- a/drivers/iio/adc/qcom-spmi-adc5.c
+++ b/drivers/iio/adc/qcom-spmi-adc5.c
@@ -626,12 +626,20 @@ static int adc5_get_fw_channel_data(struct adc5_chip *adc,
struct fwnode_handle *fwnode,
const struct adc5_data *data)
{
- const char *name = fwnode_get_name(fwnode), *channel_name;
+ const char *channel_name;
+ char *name;
u32 chan, value, varr[2];
u32 sid = 0;
int ret;
struct device *dev = adc->dev;
+ name = devm_kasprintf(dev, GFP_KERNEL, "%pfwP", fwnode);
+ if (!name)
+ return -ENOMEM;
+
+ /* Cut the address part */
+ name[strchrnul(name, '@') - name] = '\0';
+
ret = fwnode_property_read_u32(fwnode, "reg", &chan);
if (ret) {
dev_err(dev, "invalid channel number %s\n", name);
--
2.40.0
It is possible that there may be a pending interrupt logged into the dwc IP
while the interrupts were disabled in the driver. And when the wakeup
interrupt is enabled, the pending interrupt might fire which is not
required to be serviced by the driver.
So always clear the pending interrupt before enabling wake interrupt.
Cc: stable(a)vger.kernel.org # 5.20
Fixes: 360e8230516d ("usb: dwc3: qcom: Add helper functions to enable,disable wake irqs")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
drivers/usb/dwc3/dwc3-qcom.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/dwc3/dwc3-qcom.c b/drivers/usb/dwc3/dwc3-qcom.c
index bbf67f705d0d..f1059dfcb0e8 100644
--- a/drivers/usb/dwc3/dwc3-qcom.c
+++ b/drivers/usb/dwc3/dwc3-qcom.c
@@ -346,6 +346,8 @@ static void dwc3_qcom_enable_wakeup_irq(int irq, unsigned int polarity)
if (!irq)
return;
+ irq_set_irqchip_state(irq, IRQCHIP_STATE_PENDING, 0);
+
if (polarity)
irq_set_irq_type(irq, polarity);
--
2.25.1
We got a WARNING in ext4_add_complete_io:
==================================================================
WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250
CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85
RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4]
[...]
Call Trace:
<TASK>
ext4_end_bio+0xa8/0x240 [ext4]
bio_endio+0x195/0x310
blk_update_request+0x184/0x770
scsi_end_request+0x2f/0x240
scsi_io_completion+0x75/0x450
scsi_finish_command+0xef/0x160
scsi_complete+0xa3/0x180
blk_complete_reqs+0x60/0x80
blk_done_softirq+0x25/0x40
__do_softirq+0x119/0x4c8
run_ksoftirqd+0x42/0x70
smpboot_thread_fn+0x136/0x3c0
kthread+0x140/0x1a0
ret_from_fork+0x2c/0x50
==================================================================
Above issue may happen as follows:
cpu1 cpu2
----------------------------|----------------------------
mount -o dioread_lock
ext4_writepages
ext4_do_writepages
*if (ext4_should_dioread_nolock(inode))*
// rsv_blocks is not assigned here
mount -o remount,dioread_nolock
ext4_journal_start_with_reserve
__ext4_journal_start
__ext4_journal_start_sb
jbd2__journal_start
*if (rsv_blocks)*
// h_rsv_handle is not initialized here
mpage_map_and_submit_extent
mpage_map_one_extent
dioread_nolock = ext4_should_dioread_nolock(inode)
if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN))
mpd->io_submit.io_end->handle = handle->h_rsv_handle
ext4_set_io_unwritten_flag
io_end->flag |= EXT4_IO_END_UNWRITTEN
// now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag
scsi_finish_command
scsi_io_completion
scsi_io_completion_action
scsi_end_request
blk_update_request
req_bio_endio
bio_endio
bio->bi_end_io > ext4_end_bio
ext4_put_io_end_defer
ext4_add_complete_io
// trigger WARN_ON(!io_end->handle && sbi->s_journal);
The immediate cause of this problem is that ext4_should_dioread_nolock()
function returns inconsistent values in the ext4_do_writepages() and
mpage_map_one_extent(). There are four conditions in this function that
can be changed at mount time to cause this problem. These four conditions
can be divided into two categories:
(1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl
(2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount
The two in the first category have been fixed by commit c8585c6fcaf2
("ext4: fix races between changing inode journal mode and ext4_writepages")
and commit cb85f4d23f79 ("ext4: fix race between writepages and enabling
EXT4_EXTENTS_FL") respectively.
Two cases in the other category have not yet been fixed, and the above
issue is caused by this situation. We refer to the fix for the first
category, when applying options during remount, we grab s_writepages_rwsem
to avoid racing with writepages ops to trigger this problem.
Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io")
Cc: stable(a)vger.kernel.org
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
---
V1->V2:
Grab s_writepages_rwsem unconditionally during remount.
Remove patches 1,2 that are no longer needed.
fs/ext4/ext4.h | 3 ++-
fs/ext4/super.c | 9 +++++++++
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9b2cfc32cf78..5f5ee0c20673 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1703,7 +1703,8 @@ struct ext4_sb_info {
/*
* Barrier between writepages ops and changing any inode's JOURNAL_DATA
- * or EXTENTS flag.
+ * or EXTENTS flag or between writepages ops and changing DELALLOC or
+ * DIOREAD_NOLOCK mount options on remount.
*/
struct percpu_rw_semaphore s_writepages_rwsem;
struct dax_device *s_daxdev;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index e6d84c1e34a4..607ebf2a008b 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6403,7 +6403,16 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
}
+ /*
+ * Changing the DIOREAD_NOLOCK or DELALLOC mount options may cause
+ * two calls to ext4_should_dioread_nolock() to return inconsistent
+ * values, triggering WARN_ON in ext4_add_complete_io(). we grab
+ * here s_writepages_rwsem to avoid race between writepages ops and
+ * remount.
+ */
+ percpu_down_write(&sbi->s_writepages_rwsem);
ext4_apply_options(fc, sb);
+ percpu_up_write(&sbi->s_writepages_rwsem);
if ((old_opts.s_mount_opt & EXT4_MOUNT_JOURNAL_CHECKSUM) ^
test_opt(sb, JOURNAL_CHECKSUM)) {
--
2.31.1
Dzień dobry,
chciałbym poinformować Państwa o możliwości pozyskania nowych zleceń ze strony www.
Widzimy zainteresowanie potencjalnych Klientów Państwa firmą, dlatego chętnie pomożemy Państwu dotrzeć z ofertą do większego grona odbiorców poprzez efektywne metody pozycjonowania strony w Google.
Czy mógłbym liczyć na kontakt zwrotny?
Pozdrawiam serdecznie,
Wiktor Nurek
Hello,
I am reaching out to provide awareness for a part-time career. We are currently sourcing for representatives in Europe to work from home to act as our company's regional business representative in that region. It will by no means interfere with your current job or business.
Kindly email me for detailed information.
Regards,
Bleiholder Novak
This patch fixes an issue that a hugetlb uffd-wr-protected mapping can be
writable even with uffd-wp bit set. It only happens with hugetlb private
mappings, when someone firstly wr-protects a missing pte (which will
install a pte marker), then a write to the same page without any prior
access to the page.
Userfaultfd-wp trap for hugetlb was implemented in hugetlb_fault() before
reaching hugetlb_wp() to avoid taking more locks that userfault won't need.
However there's one CoW optimization path that can trigger hugetlb_wp()
inside hugetlb_no_page(), which will bypass the trap.
This patch skips hugetlb_wp() for CoW and retries the fault if uffd-wp bit
is detected. The new path will only trigger in the CoW optimization path
because generic hugetlb_fault() (e.g. when a present pte was wr-protected)
will resolve the uffd-wp bit already. Also make sure anonymous UNSHARE
won't be affected and can still be resolved, IOW only skip CoW not CoR.
This patch will be needed for v5.19+ hence copy stable.
Reported-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: linux-stable <stable(a)vger.kernel.org>
Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
---
Notes:
v2 is not on the list but in an attachment in the reply; this v3 is mostly
to make sure it's not the same as the patch used to be attached. Sorry
Andrew, we need to drop the queued one as I rewrote the commit message.
Muhammad, I didn't attach your T-b because of the slight functional change.
Please feel free to re-attach if it still works for you (which I believe
should).
thanks,
---
mm/hugetlb.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8bfd07f4c143..a58b3739ed4b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5478,7 +5478,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
struct folio *pagecache_folio, spinlock_t *ptl)
{
const bool unshare = flags & FAULT_FLAG_UNSHARE;
- pte_t pte;
+ pte_t pte = huge_ptep_get(ptep);
struct hstate *h = hstate_vma(vma);
struct page *old_page;
struct folio *new_folio;
@@ -5487,6 +5487,17 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long haddr = address & huge_page_mask(h);
struct mmu_notifier_range range;
+ /*
+ * Never handle CoW for uffd-wp protected pages. It should be only
+ * handled when the uffd-wp protection is removed.
+ *
+ * Note that only the CoW optimization path (in hugetlb_no_page())
+ * can trigger this, because hugetlb_fault() will always resolve
+ * uffd-wp bit first.
+ */
+ if (!unshare && huge_pte_uffd_wp(pte))
+ return 0;
+
/*
* hugetlb does not support FOLL_FORCE-style write faults that keep the
* PTE mapped R/O such as maybe_mkwrite() would do.
@@ -5500,7 +5511,6 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
return 0;
}
- pte = huge_ptep_get(ptep);
old_page = pte_page(pte);
delayacct_wpcopy_start();
--
2.39.1
The patch titled
Subject: nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-fix-potential-uaf-of-struct-nilfs_sc_info-in-nilfs_segctor_thread.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread()
Date: Tue, 28 Mar 2023 02:53:18 +0900
The finalization of nilfs_segctor_thread() can race with
nilfs_segctor_kill_thread() which terminates that thread, potentially
causing a use-after-free BUG as KASAN detected.
At the end of nilfs_segctor_thread(), it assigns NULL to "sc_task" member
of "struct nilfs_sc_info" to indicate the thread has finished, and then
notifies nilfs_segctor_kill_thread() of this using waitqueue
"sc_wait_task" on the struct nilfs_sc_info.
However, here, immediately after the NULL assignment to "sc_task", it is
possible that nilfs_segctor_kill_thread() will detect it and return to
continue the deallocation, freeing the nilfs_sc_info structure before the
thread does the notification.
This fixes the issue by protecting the NULL assignment to "sc_task" and
its notification, with spinlock "sc_state_lock" of the struct
nilfs_sc_info. Since nilfs_segctor_kill_thread() does a final check to
see if "sc_task" is NULL with "sc_state_lock" locked, this can eliminate
the race.
Link: https://lkml.kernel.org/r/20230327175318.8060-1-konishi.ryusuke@gmail.com
Reported-by: syzbot+b08ebcc22f8f3e6be43a(a)syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/00000000000000660d05f7dfa877@google.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/segment.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/fs/nilfs2/segment.c~nilfs2-fix-potential-uaf-of-struct-nilfs_sc_info-in-nilfs_segctor_thread
+++ a/fs/nilfs2/segment.c
@@ -2609,11 +2609,10 @@ static int nilfs_segctor_thread(void *ar
goto loop;
end_thread:
- spin_unlock(&sci->sc_state_lock);
-
/* end sync. */
sci->sc_task = NULL;
wake_up(&sci->sc_wait_task); /* for nilfs_segctor_kill_thread() */
+ spin_unlock(&sci->sc_state_lock);
return 0;
}
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-fix-potential-uaf-of-struct-nilfs_sc_info-in-nilfs_segctor_thread.patch
From: "Liam R. Howlett" <Liam.Howlett(a)Oracle.com>
The call to mte_set_dead_node() before the smp_wmb() already calls
smp_wmb() so this is not needed. This is an optimization for the RCU mode
of the maple tree.
Link: https://lkml.kernel.org/r/20230227173632.3292573-5-surenb@google.com
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Cc: stable(a)vger.kernel.org
Signed-off-by: Liam Howlett <Liam.Howlett(a)oracle.com>
---
lib/maple_tree.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 44d6ce30b28e..3d5ab02f981a 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5517,7 +5517,6 @@ unsigned char mas_dead_leaves(struct ma_state *mas, void __rcu **slots,
break;
mte_set_node_dead(entry);
- smp_wmb(); /* Needed for RCU */
node->type = type;
rcu_assign_pointer(slots[offset], node);
}
--
2.39.2
From: "Liam R. Howlett" <Liam.Howlett(a)Oracle.com>
When initially starting a search, the root node may already be in the
process of being replaced in RCU mode. Detect and restart the walk if
this is the case. This is necessary for RCU mode of the maple tree.
Link: https://lkml.kernel.org/r/20230227173632.3292573-3-surenb@google.com
Cc: <Stable(a)vger.kernel.org>
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
---
lib/maple_tree.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index cc356b8369ad..089cd76ec379 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1360,12 +1360,16 @@ static inline struct maple_enode *mas_start(struct ma_state *mas)
mas->max = ULONG_MAX;
mas->depth = 0;
+retry:
root = mas_root(mas);
/* Tree with nodes */
if (likely(xa_is_node(root))) {
mas->depth = 1;
mas->node = mte_safe_root(root);
mas->offset = 0;
+ if (mte_dead_node(mas->node))
+ goto retry;
+
return NULL;
}
--
2.39.2
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x f0a57dd33b3eadf540912cd130db727ea824d174
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '167993613776154(a)kroah.com' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
f0a57dd33b3e ("thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers")
7af9da8ce8f9 ("thunderbolt: Add quirk to disable CLx")
6ce3563520be ("thunderbolt: Add support for DisplayPort bandwidth allocation mode")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f0a57dd33b3eadf540912cd130db727ea824d174 Mon Sep 17 00:00:00 2001
From: Gil Fine <gil.fine(a)linux.intel.com>
Date: Tue, 31 Jan 2023 13:04:52 +0200
Subject: [PATCH] thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host
routers
Current Intel USB4 host routers have hardware limitation that the USB3
bandwidth cannot go higher than 16376 Mb/s. Work this around by adding a
new quirk that limits the bandwidth for the affected host routers.
Cc: stable(a)vger.kernel.org
Signed-off-by: Gil Fine <gil.fine(a)linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
diff --git a/drivers/thunderbolt/quirks.c b/drivers/thunderbolt/quirks.c
index ae28a03fa890..1157b8869bcc 100644
--- a/drivers/thunderbolt/quirks.c
+++ b/drivers/thunderbolt/quirks.c
@@ -26,6 +26,19 @@ static void quirk_clx_disable(struct tb_switch *sw)
tb_sw_dbg(sw, "disabling CL states\n");
}
+static void quirk_usb3_maximum_bandwidth(struct tb_switch *sw)
+{
+ struct tb_port *port;
+
+ tb_switch_for_each_port(sw, port) {
+ if (!tb_port_is_usb3_down(port))
+ continue;
+ port->max_bw = 16376;
+ tb_port_dbg(port, "USB3 maximum bandwidth limited to %u Mb/s\n",
+ port->max_bw);
+ }
+}
+
struct tb_quirk {
u16 hw_vendor_id;
u16 hw_device_id;
@@ -43,6 +56,24 @@ static const struct tb_quirk tb_quirks[] = {
* DP buffers.
*/
{ 0x8087, 0x0b26, 0x0000, 0x0000, quirk_dp_credit_allocation },
+ /*
+ * Limit the maximum USB3 bandwidth for the following Intel USB4
+ * host routers due to a hardware issue.
+ */
+ { 0x8087, PCI_DEVICE_ID_INTEL_ADL_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_ADL_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_RPL_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_RPL_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_M_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_P_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_P_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
/*
* CLx is not supported on AMD USB4 Yellow Carp and Pink Sardine platforms.
*/
diff --git a/drivers/thunderbolt/tb.h b/drivers/thunderbolt/tb.h
index b3cd13dc783b..275ff5219a3a 100644
--- a/drivers/thunderbolt/tb.h
+++ b/drivers/thunderbolt/tb.h
@@ -272,6 +272,8 @@ struct tb_bandwidth_group {
* @group: Bandwidth allocation group the adapter is assigned to. Only
* used for DP IN adapters for now.
* @group_list: The adapter is linked to the group's list of ports through this
+ * @max_bw: Maximum possible bandwidth through this adapter if set to
+ * non-zero.
*
* In USB4 terminology this structure represents an adapter (protocol or
* lane adapter).
@@ -299,6 +301,7 @@ struct tb_port {
unsigned int dma_credits;
struct tb_bandwidth_group *group;
struct list_head group_list;
+ unsigned int max_bw;
};
/**
diff --git a/drivers/thunderbolt/usb4.c b/drivers/thunderbolt/usb4.c
index 95ff02395822..6e87cf993c68 100644
--- a/drivers/thunderbolt/usb4.c
+++ b/drivers/thunderbolt/usb4.c
@@ -1882,6 +1882,15 @@ int usb4_port_retimer_nvm_read(struct tb_port *port, u8 index,
usb4_port_retimer_nvm_read_block, &info);
}
+static inline unsigned int
+usb4_usb3_port_max_bandwidth(const struct tb_port *port, unsigned int bw)
+{
+ /* Take the possible bandwidth limitation into account */
+ if (port->max_bw)
+ return min(bw, port->max_bw);
+ return bw;
+}
+
/**
* usb4_usb3_port_max_link_rate() - Maximum support USB3 link rate
* @port: USB3 adapter port
@@ -1903,7 +1912,9 @@ int usb4_usb3_port_max_link_rate(struct tb_port *port)
return ret;
lr = (val & ADP_USB3_CS_4_MSLR_MASK) >> ADP_USB3_CS_4_MSLR_SHIFT;
- return lr == ADP_USB3_CS_4_MSLR_20G ? 20000 : 10000;
+ ret = lr == ADP_USB3_CS_4_MSLR_20G ? 20000 : 10000;
+
+ return usb4_usb3_port_max_bandwidth(port, ret);
}
/**
@@ -1930,7 +1941,9 @@ int usb4_usb3_port_actual_link_rate(struct tb_port *port)
return 0;
lr = val & ADP_USB3_CS_4_ALR_MASK;
- return lr == ADP_USB3_CS_4_ALR_20G ? 20000 : 10000;
+ ret = lr == ADP_USB3_CS_4_ALR_20G ? 20000 : 10000;
+
+ return usb4_usb3_port_max_bandwidth(port, ret);
}
static int usb4_usb3_port_cm_request(struct tb_port *port, bool request)
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x f0a57dd33b3eadf540912cd130db727ea824d174
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '167993613759241(a)kroah.com' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
f0a57dd33b3e ("thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers")
7af9da8ce8f9 ("thunderbolt: Add quirk to disable CLx")
6ce3563520be ("thunderbolt: Add support for DisplayPort bandwidth allocation mode")
aef9c693e7e5 ("thunderbolt: Move vendor specific NVM handling into nvm.c")
8b02b2da77c8 ("thunderbolt: Provide tb_retimer_nvm_read() analogous to tb_switch_nvm_read()")
7bfafaa5185b ("thunderbolt: Rename and make nvm_read() available for other files")
5424e1bf16f9 ("thunderbolt: Extend NVM version fields to 32-bits")
7f333ace0257 ("thunderbolt: Move tb_xdomain_parent() to tb.h")
3084b48fa139 ("thunderbolt: Change TMU mode to HiFi uni-directional once DisplayPort tunneled")
43f977bc60b1 ("thunderbolt: Enable CL0s for Intel Titan Ridge")
23ccd21ccb56 ("thunderbolt: Implement TMU time disruption for Intel Titan Ridge")
8a90e4fa3b4d ("thunderbolt: Add CL0s support for USB4 routers")
a28ec0e165ba ("thunderbolt: Add TMU uni-directional mode")
43bddb26e20a ("thunderbolt: Tear down existing tunnels when resuming from hibernate")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f0a57dd33b3eadf540912cd130db727ea824d174 Mon Sep 17 00:00:00 2001
From: Gil Fine <gil.fine(a)linux.intel.com>
Date: Tue, 31 Jan 2023 13:04:52 +0200
Subject: [PATCH] thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host
routers
Current Intel USB4 host routers have hardware limitation that the USB3
bandwidth cannot go higher than 16376 Mb/s. Work this around by adding a
new quirk that limits the bandwidth for the affected host routers.
Cc: stable(a)vger.kernel.org
Signed-off-by: Gil Fine <gil.fine(a)linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
diff --git a/drivers/thunderbolt/quirks.c b/drivers/thunderbolt/quirks.c
index ae28a03fa890..1157b8869bcc 100644
--- a/drivers/thunderbolt/quirks.c
+++ b/drivers/thunderbolt/quirks.c
@@ -26,6 +26,19 @@ static void quirk_clx_disable(struct tb_switch *sw)
tb_sw_dbg(sw, "disabling CL states\n");
}
+static void quirk_usb3_maximum_bandwidth(struct tb_switch *sw)
+{
+ struct tb_port *port;
+
+ tb_switch_for_each_port(sw, port) {
+ if (!tb_port_is_usb3_down(port))
+ continue;
+ port->max_bw = 16376;
+ tb_port_dbg(port, "USB3 maximum bandwidth limited to %u Mb/s\n",
+ port->max_bw);
+ }
+}
+
struct tb_quirk {
u16 hw_vendor_id;
u16 hw_device_id;
@@ -43,6 +56,24 @@ static const struct tb_quirk tb_quirks[] = {
* DP buffers.
*/
{ 0x8087, 0x0b26, 0x0000, 0x0000, quirk_dp_credit_allocation },
+ /*
+ * Limit the maximum USB3 bandwidth for the following Intel USB4
+ * host routers due to a hardware issue.
+ */
+ { 0x8087, PCI_DEVICE_ID_INTEL_ADL_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_ADL_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_RPL_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_RPL_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_M_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_P_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_P_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
/*
* CLx is not supported on AMD USB4 Yellow Carp and Pink Sardine platforms.
*/
diff --git a/drivers/thunderbolt/tb.h b/drivers/thunderbolt/tb.h
index b3cd13dc783b..275ff5219a3a 100644
--- a/drivers/thunderbolt/tb.h
+++ b/drivers/thunderbolt/tb.h
@@ -272,6 +272,8 @@ struct tb_bandwidth_group {
* @group: Bandwidth allocation group the adapter is assigned to. Only
* used for DP IN adapters for now.
* @group_list: The adapter is linked to the group's list of ports through this
+ * @max_bw: Maximum possible bandwidth through this adapter if set to
+ * non-zero.
*
* In USB4 terminology this structure represents an adapter (protocol or
* lane adapter).
@@ -299,6 +301,7 @@ struct tb_port {
unsigned int dma_credits;
struct tb_bandwidth_group *group;
struct list_head group_list;
+ unsigned int max_bw;
};
/**
diff --git a/drivers/thunderbolt/usb4.c b/drivers/thunderbolt/usb4.c
index 95ff02395822..6e87cf993c68 100644
--- a/drivers/thunderbolt/usb4.c
+++ b/drivers/thunderbolt/usb4.c
@@ -1882,6 +1882,15 @@ int usb4_port_retimer_nvm_read(struct tb_port *port, u8 index,
usb4_port_retimer_nvm_read_block, &info);
}
+static inline unsigned int
+usb4_usb3_port_max_bandwidth(const struct tb_port *port, unsigned int bw)
+{
+ /* Take the possible bandwidth limitation into account */
+ if (port->max_bw)
+ return min(bw, port->max_bw);
+ return bw;
+}
+
/**
* usb4_usb3_port_max_link_rate() - Maximum support USB3 link rate
* @port: USB3 adapter port
@@ -1903,7 +1912,9 @@ int usb4_usb3_port_max_link_rate(struct tb_port *port)
return ret;
lr = (val & ADP_USB3_CS_4_MSLR_MASK) >> ADP_USB3_CS_4_MSLR_SHIFT;
- return lr == ADP_USB3_CS_4_MSLR_20G ? 20000 : 10000;
+ ret = lr == ADP_USB3_CS_4_MSLR_20G ? 20000 : 10000;
+
+ return usb4_usb3_port_max_bandwidth(port, ret);
}
/**
@@ -1930,7 +1941,9 @@ int usb4_usb3_port_actual_link_rate(struct tb_port *port)
return 0;
lr = val & ADP_USB3_CS_4_ALR_MASK;
- return lr == ADP_USB3_CS_4_ALR_20G ? 20000 : 10000;
+ ret = lr == ADP_USB3_CS_4_ALR_20G ? 20000 : 10000;
+
+ return usb4_usb3_port_max_bandwidth(port, ret);
}
static int usb4_usb3_port_cm_request(struct tb_port *port, bool request)
The patch below does not apply to the 6.2-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.2.y
git checkout FETCH_HEAD
git cherry-pick -x f0a57dd33b3eadf540912cd130db727ea824d174
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '1679936136244141(a)kroah.com' --subject-prefix 'PATCH 6.2.y' HEAD^..
Possible dependencies:
f0a57dd33b3e ("thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers")
7af9da8ce8f9 ("thunderbolt: Add quirk to disable CLx")
6ce3563520be ("thunderbolt: Add support for DisplayPort bandwidth allocation mode")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f0a57dd33b3eadf540912cd130db727ea824d174 Mon Sep 17 00:00:00 2001
From: Gil Fine <gil.fine(a)linux.intel.com>
Date: Tue, 31 Jan 2023 13:04:52 +0200
Subject: [PATCH] thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host
routers
Current Intel USB4 host routers have hardware limitation that the USB3
bandwidth cannot go higher than 16376 Mb/s. Work this around by adding a
new quirk that limits the bandwidth for the affected host routers.
Cc: stable(a)vger.kernel.org
Signed-off-by: Gil Fine <gil.fine(a)linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
diff --git a/drivers/thunderbolt/quirks.c b/drivers/thunderbolt/quirks.c
index ae28a03fa890..1157b8869bcc 100644
--- a/drivers/thunderbolt/quirks.c
+++ b/drivers/thunderbolt/quirks.c
@@ -26,6 +26,19 @@ static void quirk_clx_disable(struct tb_switch *sw)
tb_sw_dbg(sw, "disabling CL states\n");
}
+static void quirk_usb3_maximum_bandwidth(struct tb_switch *sw)
+{
+ struct tb_port *port;
+
+ tb_switch_for_each_port(sw, port) {
+ if (!tb_port_is_usb3_down(port))
+ continue;
+ port->max_bw = 16376;
+ tb_port_dbg(port, "USB3 maximum bandwidth limited to %u Mb/s\n",
+ port->max_bw);
+ }
+}
+
struct tb_quirk {
u16 hw_vendor_id;
u16 hw_device_id;
@@ -43,6 +56,24 @@ static const struct tb_quirk tb_quirks[] = {
* DP buffers.
*/
{ 0x8087, 0x0b26, 0x0000, 0x0000, quirk_dp_credit_allocation },
+ /*
+ * Limit the maximum USB3 bandwidth for the following Intel USB4
+ * host routers due to a hardware issue.
+ */
+ { 0x8087, PCI_DEVICE_ID_INTEL_ADL_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_ADL_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_RPL_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_RPL_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_M_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_P_NHI0, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_P_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
/*
* CLx is not supported on AMD USB4 Yellow Carp and Pink Sardine platforms.
*/
diff --git a/drivers/thunderbolt/tb.h b/drivers/thunderbolt/tb.h
index b3cd13dc783b..275ff5219a3a 100644
--- a/drivers/thunderbolt/tb.h
+++ b/drivers/thunderbolt/tb.h
@@ -272,6 +272,8 @@ struct tb_bandwidth_group {
* @group: Bandwidth allocation group the adapter is assigned to. Only
* used for DP IN adapters for now.
* @group_list: The adapter is linked to the group's list of ports through this
+ * @max_bw: Maximum possible bandwidth through this adapter if set to
+ * non-zero.
*
* In USB4 terminology this structure represents an adapter (protocol or
* lane adapter).
@@ -299,6 +301,7 @@ struct tb_port {
unsigned int dma_credits;
struct tb_bandwidth_group *group;
struct list_head group_list;
+ unsigned int max_bw;
};
/**
diff --git a/drivers/thunderbolt/usb4.c b/drivers/thunderbolt/usb4.c
index 95ff02395822..6e87cf993c68 100644
--- a/drivers/thunderbolt/usb4.c
+++ b/drivers/thunderbolt/usb4.c
@@ -1882,6 +1882,15 @@ int usb4_port_retimer_nvm_read(struct tb_port *port, u8 index,
usb4_port_retimer_nvm_read_block, &info);
}
+static inline unsigned int
+usb4_usb3_port_max_bandwidth(const struct tb_port *port, unsigned int bw)
+{
+ /* Take the possible bandwidth limitation into account */
+ if (port->max_bw)
+ return min(bw, port->max_bw);
+ return bw;
+}
+
/**
* usb4_usb3_port_max_link_rate() - Maximum support USB3 link rate
* @port: USB3 adapter port
@@ -1903,7 +1912,9 @@ int usb4_usb3_port_max_link_rate(struct tb_port *port)
return ret;
lr = (val & ADP_USB3_CS_4_MSLR_MASK) >> ADP_USB3_CS_4_MSLR_SHIFT;
- return lr == ADP_USB3_CS_4_MSLR_20G ? 20000 : 10000;
+ ret = lr == ADP_USB3_CS_4_MSLR_20G ? 20000 : 10000;
+
+ return usb4_usb3_port_max_bandwidth(port, ret);
}
/**
@@ -1930,7 +1941,9 @@ int usb4_usb3_port_actual_link_rate(struct tb_port *port)
return 0;
lr = val & ADP_USB3_CS_4_ALR_MASK;
- return lr == ADP_USB3_CS_4_ALR_20G ? 20000 : 10000;
+ ret = lr == ADP_USB3_CS_4_ALR_20G ? 20000 : 10000;
+
+ return usb4_usb3_port_max_bandwidth(port, ret);
}
static int usb4_usb3_port_cm_request(struct tb_port *port, bool request)
There are various bits of VM-scoped data that can only be configured
before the first call to KVM_RUN, such as the hypercall bitmaps and
the PMU. As these fields are protected by the kvm->lock and accessed
while holding vcpu->mutex, this is yet another example of lock
inversion.
Change out the kvm->lock for kvm->arch.config_lock in all of these
instances. Opportunistically simplify the locking mechanics of the
PMU configuration by holding the config_lock for the entirety of
kvm_arm_pmu_v3_set_attr().
Note that this also addresses a couple of bugs. There is an unguarded
read of the PMU version in KVM_ARM_VCPU_PMU_V3_FILTER which could race
with KVM_ARM_VCPU_PMU_V3_SET_PMU. Additionally, until now writes to the
per-vCPU vPMU irq were not serialized VM-wide, meaning concurrent calls
to KVM_ARM_VCPU_PMU_V3_IRQ could lead to a false positive in
pmu_irq_is_valid().
Cc: stable(a)vger.kernel.org
Tested-by: Jeremy Linton <jeremy.linton(a)arm.com>
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
---
arch/arm64/kvm/arm.c | 4 ++--
arch/arm64/kvm/guest.c | 2 ++
arch/arm64/kvm/hypercalls.c | 4 ++--
arch/arm64/kvm/pmu-emul.c | 23 ++++++-----------------
4 files changed, 12 insertions(+), 21 deletions(-)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 1620ec3d95ef..fd8d355aca15 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -624,9 +624,9 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
if (kvm_vm_is_protected(kvm))
kvm_call_hyp_nvhe(__pkvm_vcpu_init_traps, vcpu);
- mutex_lock(&kvm->lock);
+ mutex_lock(&kvm->arch.config_lock);
set_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags);
- mutex_unlock(&kvm->lock);
+ mutex_unlock(&kvm->arch.config_lock);
return ret;
}
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 07444fa22888..481c79cf22cd 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -957,7 +957,9 @@ int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
switch (attr->group) {
case KVM_ARM_VCPU_PMU_V3_CTRL:
+ mutex_lock(&vcpu->kvm->arch.config_lock);
ret = kvm_arm_pmu_v3_set_attr(vcpu, attr);
+ mutex_unlock(&vcpu->kvm->arch.config_lock);
break;
case KVM_ARM_VCPU_TIMER_CTRL:
ret = kvm_arm_timer_set_attr(vcpu, attr);
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 5da884e11337..fbdbf4257f76 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -377,7 +377,7 @@ static int kvm_arm_set_fw_reg_bmap(struct kvm_vcpu *vcpu, u64 reg_id, u64 val)
if (val & ~fw_reg_features)
return -EINVAL;
- mutex_lock(&kvm->lock);
+ mutex_lock(&kvm->arch.config_lock);
if (test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags) &&
val != *fw_reg_bmap) {
@@ -387,7 +387,7 @@ static int kvm_arm_set_fw_reg_bmap(struct kvm_vcpu *vcpu, u64 reg_id, u64 val)
WRITE_ONCE(*fw_reg_bmap, val);
out:
- mutex_unlock(&kvm->lock);
+ mutex_unlock(&kvm->arch.config_lock);
return ret;
}
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index c243b10f3e15..82991d89c2ea 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -875,7 +875,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
struct arm_pmu *arm_pmu;
int ret = -ENXIO;
- mutex_lock(&kvm->lock);
+ lockdep_assert_held(&kvm->arch.config_lock);
mutex_lock(&arm_pmus_lock);
list_for_each_entry(entry, &arm_pmus, entry) {
@@ -895,7 +895,6 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
}
mutex_unlock(&arm_pmus_lock);
- mutex_unlock(&kvm->lock);
return ret;
}
@@ -903,22 +902,20 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
{
struct kvm *kvm = vcpu->kvm;
+ lockdep_assert_held(&kvm->arch.config_lock);
+
if (!kvm_vcpu_has_pmu(vcpu))
return -ENODEV;
if (vcpu->arch.pmu.created)
return -EBUSY;
- mutex_lock(&kvm->lock);
if (!kvm->arch.arm_pmu) {
/* No PMU set, get the default one */
kvm->arch.arm_pmu = kvm_pmu_probe_armpmu();
- if (!kvm->arch.arm_pmu) {
- mutex_unlock(&kvm->lock);
+ if (!kvm->arch.arm_pmu)
return -ENODEV;
- }
}
- mutex_unlock(&kvm->lock);
switch (attr->attr) {
case KVM_ARM_VCPU_PMU_V3_IRQ: {
@@ -962,19 +959,13 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
filter.action != KVM_PMU_EVENT_DENY))
return -EINVAL;
- mutex_lock(&kvm->lock);
-
- if (test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags)) {
- mutex_unlock(&kvm->lock);
+ if (test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags))
return -EBUSY;
- }
if (!kvm->arch.pmu_filter) {
kvm->arch.pmu_filter = bitmap_alloc(nr_events, GFP_KERNEL_ACCOUNT);
- if (!kvm->arch.pmu_filter) {
- mutex_unlock(&kvm->lock);
+ if (!kvm->arch.pmu_filter)
return -ENOMEM;
- }
/*
* The default depends on the first applied filter.
@@ -993,8 +984,6 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
else
bitmap_clear(kvm->arch.pmu_filter, filter.base_event, filter.nevents);
- mutex_unlock(&kvm->lock);
-
return 0;
}
case KVM_ARM_VCPU_PMU_V3_SET_PMU: {
--
2.40.0.348.gf938b09366-goog
kvm->lock must be taken outside of the vcpu->mutex. Of course, the
locking documentation for KVM makes this abundantly clear. Nonetheless,
the locking order in KVM/arm64 has been wrong for quite a while; we
acquire the kvm->lock while holding the vcpu->mutex all over the shop.
All was seemingly fine until commit 42a90008f890 ("KVM: Ensure lockdep
knows about kvm->lock vs. vcpu->mutex ordering rule") caught us with our
pants down, leading to lockdep barfing:
======================================================
WARNING: possible circular locking dependency detected
6.2.0-rc7+ #19 Not tainted
------------------------------------------------------
qemu-system-aar/859 is trying to acquire lock:
ffff5aa69269eba0 (&host_kvm->lock){+.+.}-{3:3}, at: kvm_reset_vcpu+0x34/0x274
but task is already holding lock:
ffff5aa68768c0b8 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x8c/0xba0
which lock already depends on the new lock.
Add a dedicated lock to serialize writes to VM-scoped configuration from
the context of a vCPU. Protect the register width flags with the new
lock, thus avoiding the need to grab the kvm->lock while holding
vcpu->mutex in kvm_reset_vcpu().
Cc: stable(a)vger.kernel.org
Reported-by: Jeremy Linton <jeremy.linton(a)arm.com>
Link: https://lore.kernel.org/kvmarm/f6452cdd-65ff-34b8-bab0-5c06416da5f6@arm.com/
Tested-by: Jeremy Linton <jeremy.linton(a)arm.com>
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
---
arch/arm64/include/asm/kvm_host.h | 3 +++
arch/arm64/kvm/arm.c | 18 ++++++++++++++++++
arch/arm64/kvm/reset.c | 6 +++---
3 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 917586237a4d..cd1ef8716719 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -199,6 +199,9 @@ struct kvm_arch {
/* Mandated version of PSCI */
u32 psci_version;
+ /* Protects VM-scoped configuration data */
+ struct mutex config_lock;
+
/*
* If we encounter a data abort without valid instruction syndrome
* information, report this to user space. User space can (and
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 647798da8c41..1620ec3d95ef 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -128,6 +128,16 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
{
int ret;
+ mutex_init(&kvm->arch.config_lock);
+
+#ifdef CONFIG_LOCKDEP
+ /* Clue in lockdep that the config_lock must be taken inside kvm->lock */
+ mutex_lock(&kvm->lock);
+ mutex_lock(&kvm->arch.config_lock);
+ mutex_unlock(&kvm->arch.config_lock);
+ mutex_unlock(&kvm->lock);
+#endif
+
ret = kvm_share_hyp(kvm, kvm + 1);
if (ret)
return ret;
@@ -328,6 +338,14 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
spin_lock_init(&vcpu->arch.mp_state_lock);
+#ifdef CONFIG_LOCKDEP
+ /* Inform lockdep that the config_lock is acquired after vcpu->mutex */
+ mutex_lock(&vcpu->mutex);
+ mutex_lock(&vcpu->kvm->arch.config_lock);
+ mutex_unlock(&vcpu->kvm->arch.config_lock);
+ mutex_unlock(&vcpu->mutex);
+#endif
+
/* Force users to call KVM_ARM_VCPU_INIT */
vcpu->arch.target = -1;
bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 9e023546bde0..b5dee8e57e77 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -205,7 +205,7 @@ static int kvm_set_vm_width(struct kvm_vcpu *vcpu)
is32bit = vcpu_has_feature(vcpu, KVM_ARM_VCPU_EL1_32BIT);
- lockdep_assert_held(&kvm->lock);
+ lockdep_assert_held(&kvm->arch.config_lock);
if (test_bit(KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED, &kvm->arch.flags)) {
/*
@@ -262,9 +262,9 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
bool loaded;
u32 pstate;
- mutex_lock(&vcpu->kvm->lock);
+ mutex_lock(&vcpu->kvm->arch.config_lock);
ret = kvm_set_vm_width(vcpu);
- mutex_unlock(&vcpu->kvm->lock);
+ mutex_unlock(&vcpu->kvm->arch.config_lock);
if (ret)
return ret;
--
2.40.0.348.gf938b09366-goog
Hello,
I am reaching out to provide awareness for a part-time career. We are currently sourcing for representatives in Europe to work from home to act as our company's regional business representative in that region. It will by no means interfere with your current job or business.
Kindly email me for detailed information.
Regards,
Bleiholder Novak
From: Jeremi Piotrowski <jpiotrowski(a)linux.microsoft.com>
The Hyper-V "EnlightenedNptTlb" enlightenment is always enabled when KVM
is running on top of Hyper-V and Hyper-V exposes support for it (which
is always). On AMD CPUs this enlightenment results in ASID invalidations
not flushing TLB entries derived from the NPT. To force the underlying
(L0) hypervisor to rebuild its shadow page tables, an explicit hypercall
is needed.
The original KVM implementation of Hyper-V's "EnlightenedNptTlb" on SVM
only added remote TLB flush hooks. This worked out fine for a while, as
sufficient remote TLB flushes where being issued in KVM to mask the
problem. Since v5.17, changes in the TDP code reduced the number of
flushes and the out-of-sync TLB prevents guests from booting
successfully.
Split svm_flush_tlb_current() into separate callbacks for the 3 cases
(guest/all/current), and issue the required Hyper-V hypercall when a
Hyper-V TLB flush is needed. The most important case where the TLB flush
was missing is when loading a new PGD, which is followed by what is now
svm_flush_tlb_current().
Cc: stable(a)vger.kernel.org # v5.17+
Fixes: 1e0c7d40758b ("KVM: SVM: hyper-v: Remote TLB flush for SVM")
Link: https://lore.kernel.org/lkml/43980946-7bbf-dcef-7e40-af904c456250@linux.mic…
Suggested-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Jeremi Piotrowski <jpiotrowski(a)linux.microsoft.com>
---
Changes since v1:
- lookup enlightened_npt_tlb in vmcb to determine whether to do the
flush
- when KVM wants a hyperv_flush_guest_mapping() call, don't try to
optimize it out
- don't hide hyperv flush behind helper, make it visible in
svm.c
arch/x86/kvm/kvm_onhyperv.h | 5 +++++
arch/x86/kvm/svm/svm.c | 37 ++++++++++++++++++++++++++++++---
arch/x86/kvm/svm/svm_onhyperv.h | 15 +++++++++++++
3 files changed, 54 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/kvm_onhyperv.h b/arch/x86/kvm/kvm_onhyperv.h
index 287e98ef9df3..67b53057e41c 100644
--- a/arch/x86/kvm/kvm_onhyperv.h
+++ b/arch/x86/kvm/kvm_onhyperv.h
@@ -12,6 +12,11 @@ int hv_remote_flush_tlb_with_range(struct kvm *kvm,
int hv_remote_flush_tlb(struct kvm *kvm);
void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp);
#else /* !CONFIG_HYPERV */
+static inline int hv_remote_flush_tlb(struct kvm *kvm)
+{
+ return -1;
+}
+
static inline void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp)
{
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 252e7f37e4e2..f25bc3cbb250 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3729,7 +3729,7 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
svm->vmcb->save.rflags |= (X86_EFLAGS_TF | X86_EFLAGS_RF);
}
-static void svm_flush_tlb_current(struct kvm_vcpu *vcpu)
+static void svm_flush_tlb_asid(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -3753,6 +3753,37 @@ static void svm_flush_tlb_current(struct kvm_vcpu *vcpu)
svm->current_vmcb->asid_generation--;
}
+static void svm_flush_tlb_current(struct kvm_vcpu *vcpu)
+{
+ hpa_t root_tdp = vcpu->arch.mmu->root.hpa;
+
+ /*
+ * When running on Hyper-V with EnlightenedNptTlb enabled, explicitly
+ * flush the NPT mappings via hypercall as flushing the ASID only
+ * affects virtual to physical mappings, it does not invalidate guest
+ * physical to host physical mappings.
+ */
+ if (svm_hv_is_enlightened_tlb_enabled(vcpu) && VALID_PAGE(root_tdp))
+ hyperv_flush_guest_mapping(root_tdp);
+
+ svm_flush_tlb_asid(vcpu);
+}
+
+static void svm_flush_tlb_all(struct kvm_vcpu *vcpu)
+{
+ /*
+ * When running on Hyper-V with EnlightenedNptTlb enabled, remote TLB
+ * flushes should be routed to hv_remote_flush_tlb() without requesting
+ * a "regular" remote flush. Reaching this point means either there's
+ * a KVM bug or a prior hv_remote_flush_tlb() call failed, both of
+ * which might be fatal to the guest. Yell, but try to recover.
+ */
+ if (WARN_ON_ONCE(svm_hv_is_enlightened_tlb_enabled(vcpu)))
+ hv_remote_flush_tlb(vcpu->kvm);
+
+ svm_flush_tlb_asid(vcpu);
+}
+
static void svm_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t gva)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -4745,10 +4776,10 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
.set_rflags = svm_set_rflags,
.get_if_flag = svm_get_if_flag,
- .flush_tlb_all = svm_flush_tlb_current,
+ .flush_tlb_all = svm_flush_tlb_all,
.flush_tlb_current = svm_flush_tlb_current,
.flush_tlb_gva = svm_flush_tlb_gva,
- .flush_tlb_guest = svm_flush_tlb_current,
+ .flush_tlb_guest = svm_flush_tlb_asid,
.vcpu_pre_run = svm_vcpu_pre_run,
.vcpu_run = svm_vcpu_run,
diff --git a/arch/x86/kvm/svm/svm_onhyperv.h b/arch/x86/kvm/svm/svm_onhyperv.h
index cff838f15db5..786d46d73a8e 100644
--- a/arch/x86/kvm/svm/svm_onhyperv.h
+++ b/arch/x86/kvm/svm/svm_onhyperv.h
@@ -6,6 +6,8 @@
#ifndef __ARCH_X86_KVM_SVM_ONHYPERV_H__
#define __ARCH_X86_KVM_SVM_ONHYPERV_H__
+#include <asm/mshyperv.h>
+
#if IS_ENABLED(CONFIG_HYPERV)
#include "kvm_onhyperv.h"
@@ -15,6 +17,14 @@ static struct kvm_x86_ops svm_x86_ops;
int svm_hv_enable_l2_tlb_flush(struct kvm_vcpu *vcpu);
+static inline bool svm_hv_is_enlightened_tlb_enabled(struct kvm_vcpu *vcpu)
+{
+ struct hv_vmcb_enlightenments *hve = &to_svm(vcpu)->vmcb->control.hv_enlightenments;
+
+ return ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB &&
+ !!hve->hv_enlightenments_control.enlightened_npt_tlb;
+}
+
static inline void svm_hv_init_vmcb(struct vmcb *vmcb)
{
struct hv_vmcb_enlightenments *hve = &vmcb->control.hv_enlightenments;
@@ -80,6 +90,11 @@ static inline void svm_hv_update_vp_id(struct vmcb *vmcb, struct kvm_vcpu *vcpu)
}
#else
+static inline bool svm_hv_is_enlightened_tlb_enabled(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
static inline void svm_hv_init_vmcb(struct vmcb *vmcb)
{
}
--
2.37.2
Following warnings / errors noticed on Linux stable queue 5.15.
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi:460.3-52: Warning
(pci_device_reg): /pcie@f8000000/pcie@0,0:reg: PCI reg address is not
configuration space
arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi:460.3-52: Warning
(pci_device_reg): /pcie@f8000000/pcie@0,0:reg: PCI reg address is not
configuration space
arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi:460.3-52: Warning
(pci_device_reg): /pcie@f8000000/pcie@0,0:reg: PCI reg address is not
configuration space
arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi:460.3-52: Warning
(pci_device_reg): /pcie@f8000000/pcie@0,0:reg: PCI reg address is not
configuration space
drivers/interconnect/qcom/icc-rpmh.c: In function 'qcom_icc_rpmh_probe':
drivers/interconnect/qcom/icc-rpmh.c:221:9: error: implicit
declaration of function 'icc_provider_init'; did you mean
'icc_provider_del'? [-Werror=implicit-function-declaration]
221 | icc_provider_init(provider);
| ^~~~~~~~~~~~~~~~~
| icc_provider_del
drivers/interconnect/qcom/icc-rpmh.c:257:15: error: implicit
declaration of function 'icc_provider_register'; did you mean
'icc_provider_del'? [-Werror=implicit-function-declaration]
257 | ret = icc_provider_register(provider);
| ^~~~~~~~~~~~~~~~~~~~~
| icc_provider_del
drivers/interconnect/qcom/icc-rpmh.c: In function 'qcom_icc_rpmh_remove':
drivers/interconnect/qcom/icc-rpmh.c:276:9: error: implicit
declaration of function 'icc_provider_deregister'; did you mean
'icc_provider_del'? [-Werror=implicit-function-declaration]
276 | icc_provider_deregister(&qp->provider);
| ^~~~~~~~~~~~~~~~~~~~~~~
| icc_provider_del
cc1: some warnings being treated as errors
Links,
- https://qa-reports.linaro.org/lkft/linux-stable-rc-queues-queue_5.15-sanity…
- https://qa-reports.linaro.org/lkft/linux-stable-rc-queues-queue_5.15-sanity…
--
Linaro LKFT
https://lkft.linaro.org
From: Lee Jones <lee(a)kernel.org>
[ Upstream commit 1c5d4221240a233df2440fe75c881465cdf8da07 ]
The default maximum data buffer size for this interface is UHID_DATA_MAX
(4k). When data buffers are being processed, ensure this value is used
when ensuring the sanity, rather than a value between the user provided
value and HID_MAX_BUFFER_SIZE (16k).
Signed-off-by: Lee Jones <lee(a)kernel.org>
Signed-off-by: Jiri Kosina <jkosina(a)suse.cz>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/hid/uhid.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/hid/uhid.c b/drivers/hid/uhid.c
index fc06d8bb42e0f..ba0ca652b9dab 100644
--- a/drivers/hid/uhid.c
+++ b/drivers/hid/uhid.c
@@ -395,6 +395,7 @@ struct hid_ll_driver uhid_hid_driver = {
.parse = uhid_hid_parse,
.raw_request = uhid_hid_raw_request,
.output_report = uhid_hid_output_report,
+ .max_buffer_size = UHID_DATA_MAX,
};
EXPORT_SYMBOL_GPL(uhid_hid_driver);
--
2.39.2
We got a WARNING in ext4_add_complete_io:
==================================================================
WARNING: at fs/ext4/page-io.c:231 ext4_put_io_end_defer+0x182/0x250
CPU: 10 PID: 77 Comm: ksoftirqd/10 Tainted: 6.3.0-rc2 #85
RIP: 0010:ext4_put_io_end_defer+0x182/0x250 [ext4]
[...]
Call Trace:
<TASK>
ext4_end_bio+0xa8/0x240 [ext4]
bio_endio+0x195/0x310
blk_update_request+0x184/0x770
scsi_end_request+0x2f/0x240
scsi_io_completion+0x75/0x450
scsi_finish_command+0xef/0x160
scsi_complete+0xa3/0x180
blk_complete_reqs+0x60/0x80
blk_done_softirq+0x25/0x40
__do_softirq+0x119/0x4c8
run_ksoftirqd+0x42/0x70
smpboot_thread_fn+0x136/0x3c0
kthread+0x140/0x1a0
ret_from_fork+0x2c/0x50
==================================================================
Above issue may happen as follows:
cpu1 cpu2
______________________________|_____________________________
mount -o dioread_lock
ext4_writepages
ext4_do_writepages
*if (ext4_should_dioread_nolock(inode))*
// rsv_blocks is not assigned here
mount -o remount,dioread_nolock
ext4_journal_start_with_reserve
__ext4_journal_start
__ext4_journal_start_sb
jbd2__journal_start
*if (rsv_blocks)*
// h_rsv_handle is not initialized here
mpage_map_and_submit_extent
mpage_map_one_extent
dioread_nolock = ext4_should_dioread_nolock(inode)
if (dioread_nolock && (map->m_flags & EXT4_MAP_UNWRITTEN))
mpd->io_submit.io_end->handle = handle->h_rsv_handle
ext4_set_io_unwritten_flag
io_end->flag |= EXT4_IO_END_UNWRITTEN
// now io_end->handle is NULL but has EXT4_IO_END_UNWRITTEN flag
scsi_finish_command
scsi_io_completion
scsi_io_completion_action
scsi_end_request
blk_update_request
req_bio_endio
bio_endio
bio->bi_end_io > ext4_end_bio
ext4_put_io_end_defer
ext4_add_complete_io
// trigger WARN_ON(!io_end->handle && sbi->s_journal);
The immediate cause of this problem is that ext4_should_dioread_nolock()
function returns inconsistent values in the ext4_do_writepages() and
mpage_map_one_extent(). There are four conditions in this function that
can be changed at mount time to cause this problem. These four conditions
can be divided into two categories:
(1) journal_data and EXT4_EXTENTS_FL, which can be changed by ioctl
(2) DELALLOC and DIOREAD_NOLOCK, which can be changed by remount
The two in the first category have been fixed by commit c8585c6fcaf2
("ext4: fix races between changing inode journal mode and ext4_writepages")
and commit cb85f4d23f79 ("ext4: fix race between writepages and enabling
EXT4_EXTENTS_FL") respectively.
Two cases in the other category have not yet been fixed, and the above
issue is caused by this situation. We refer to the fix for the first
category, When DELALLOC or DIOREAD_NOLOCK is detected to be changed
during remount, we hold the s_writepages_rwsem lock to avoid racing with
ext4_writepages to trigger the problem.
Moreover, we add an EXT4_MOUNT_SHOULD_DIOREAD_NOLOCK macro to ensure that
the mount options used by ext4_should_dioread_nolock() and __ext4_remount()
are always consistent.
Fixes: 6b523df4fb5a ("ext4: use transaction reservation for extent conversion in ext4_end_io")
Cc: stable(a)vger.kernel.org
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
---
fs/ext4/ext4.h | 3 ++-
fs/ext4/ext4_jbd2.h | 9 +++++----
fs/ext4/super.c | 14 ++++++++++++++
3 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 08b29c289da4..f60967fa648f 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1703,7 +1703,8 @@ struct ext4_sb_info {
/*
* Barrier between writepages ops and changing any inode's JOURNAL_DATA
- * or EXTENTS flag.
+ * or EXTENTS flag or between changing SHOULD_DIOREAD_NOLOCK flag on
+ * remount and writepages ops.
*/
struct percpu_rw_semaphore s_writepages_rwsem;
struct dax_device *s_daxdev;
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 0c77697d5e90..d82bfcdd56e5 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -488,6 +488,9 @@ static inline int ext4_free_data_revoke_credits(struct inode *inode, int blocks)
return blocks + 2*(EXT4_SB(inode->i_sb)->s_cluster_ratio - 1);
}
+/* delalloc is a temporary fix to prevent generic/422 test failures*/
+#define EXT4_MOUNT_SHOULD_DIOREAD_NOLOCK (EXT4_MOUNT_DIOREAD_NOLOCK | \
+ EXT4_MOUNT_DELALLOC)
/*
* This function controls whether or not we should try to go down the
* dioread_nolock code paths, which makes it safe to avoid taking
@@ -499,7 +502,8 @@ static inline int ext4_free_data_revoke_credits(struct inode *inode, int blocks)
*/
static inline int ext4_should_dioread_nolock(struct inode *inode)
{
- if (!test_opt(inode->i_sb, DIOREAD_NOLOCK))
+ if (test_opt(inode->i_sb, SHOULD_DIOREAD_NOLOCK) !=
+ EXT4_MOUNT_SHOULD_DIOREAD_NOLOCK)
return 0;
if (!S_ISREG(inode->i_mode))
return 0;
@@ -507,9 +511,6 @@ static inline int ext4_should_dioread_nolock(struct inode *inode)
return 0;
if (ext4_should_journal_data(inode))
return 0;
- /* temporary fix to prevent generic/422 test failures */
- if (!test_opt(inode->i_sb, DELALLOC))
- return 0;
return 1;
}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index fefcd42f34ea..bdf6b288aeff 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -6403,8 +6403,22 @@ static int __ext4_remount(struct fs_context *fc, struct super_block *sb)
}
+ /* Get the flag we really need to set/clear. */
+ ctx->mask_s_mount_opt &= sbi->s_mount_opt;
+ ctx->vals_s_mount_opt &= ~sbi->s_mount_opt;
+
+ /*
+ * If EXT4_MOUNT_SHOULD_DIOREAD_NOLOCK change on remount, we need
+ * to hold s_writepages_rwsem to avoid racing with writepages ops.
+ */
+ if (ctx_changed_mount_opt(ctx, EXT4_MOUNT_SHOULD_DIOREAD_NOLOCK))
+ percpu_down_write(&sbi->s_writepages_rwsem);
+
ext4_apply_options(fc, sb);
+ if (ctx_changed_mount_opt(ctx, EXT4_MOUNT_SHOULD_DIOREAD_NOLOCK))
+ percpu_up_write(&sbi->s_writepages_rwsem);
+
if ((old_opts.s_mount_opt & EXT4_MOUNT_JOURNAL_CHECKSUM) ^
test_opt(sb, JOURNAL_CHECKSUM)) {
ext4_msg(sb, KERN_ERR, "changing journal_checksum "
--
2.31.1