These patches were back-ported to v5.15.x, but we might as well
include them in v5.10.x too.
-Alex
Stephen Boyd (2):
interconnect: qcom: sc7180: Drop IP0 interconnects
interconnect: Restore sync state by ignoring ipa-virt in provider
count
drivers/interconnect/core.c | 7 ++++++-
drivers/interconnect/qcom/sc7180.c | 21 ---------------------
2 files changed, 6 insertions(+), 22 deletions(-)
--
2.34.1
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 228432551bd8783211e494ab35f42a4344580502 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang(a)redhat.com>
Date: Wed, 8 Jun 2022 14:14:22 +0800
Subject: [PATCH] virtio-rng: make device ready before making request
The current virtio-rng driver makes an entropy request before DRIVER_OK,
which violates the spec:
virtio spec requires that all drivers set DRIVER_OK
before using devices.
Further, the kernel will ignore the interrupt after commit
8b4ec69d7e09 ("virtio: harden vring IRQ").
Fix this by making the device ready before the request.
Cc: stable(a)vger.kernel.org
Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
Fixes: f7f510ec1957 ("virtio: An entropy device, as suggested by hpa.")
Reported-and-tested-by: syzbot+5b59d6d459306a556f54(a)syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Message-Id: <20220608061422.38437-1-jasowang(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Reviewed-by: Laurent Vivier <lvivier(a)redhat.com>
diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index e856df7e285c..a6f3a8a2aca6 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -159,6 +159,8 @@ static int probe_common(struct virtio_device *vdev)
goto err_find;
}
+ virtio_device_ready(vdev);
+
/* we always have a pending entropy request */
request_entropy(vi);
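For reference, the ordering that the fix establishes in probe_common() boils
down to the sketch below (illustrative only, not the complete driver; vi is
the driver's private state and request_entropy() is the helper visible in
the hunk above):

/* ... virtqueue setup done; failures jump to err_find ... */

/*
 * Set DRIVER_OK first: the virtio spec forbids driving the device
 * (e.g. kicking a virtqueue) before the driver marks itself ready.
 */
virtio_device_ready(vdev);

/* Only now is it safe to queue the initial entropy request. */
request_entropy(vi);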
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 228432551bd8783211e494ab35f42a4344580502 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang(a)redhat.com>
Date: Wed, 8 Jun 2022 14:14:22 +0800
Subject: [PATCH] virtio-rng: make device ready before making request
The current virtio-rng driver makes an entropy request before DRIVER_OK,
which violates the spec:
virtio spec requires that all drivers set DRIVER_OK
before using devices.
Further, the kernel will ignore the interrupt after commit
8b4ec69d7e09 ("virtio: harden vring IRQ").
Fix this by making the device ready before the request.
Cc: stable(a)vger.kernel.org
Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
Fixes: f7f510ec1957 ("virtio: An entropy device, as suggested by hpa.")
Reported-and-tested-by: syzbot+5b59d6d459306a556f54(a)syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Message-Id: <20220608061422.38437-1-jasowang(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Reviewed-by: Laurent Vivier <lvivier(a)redhat.com>
diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index e856df7e285c..a6f3a8a2aca6 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -159,6 +159,8 @@ static int probe_common(struct virtio_device *vdev)
goto err_find;
}
+ virtio_device_ready(vdev);
+
/* we always have a pending entropy request */
request_entropy(vi);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 228432551bd8783211e494ab35f42a4344580502 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang(a)redhat.com>
Date: Wed, 8 Jun 2022 14:14:22 +0800
Subject: [PATCH] virtio-rng: make device ready before making request
The current virtio-rng driver makes an entropy request before DRIVER_OK,
which violates the spec:
virtio spec requires that all drivers set DRIVER_OK
before using devices.
Further, the kernel will ignore the interrupt after commit
8b4ec69d7e09 ("virtio: harden vring IRQ").
Fix this by making the device ready before the request.
Cc: stable(a)vger.kernel.org
Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
Fixes: f7f510ec1957 ("virtio: An entropy device, as suggested by hpa.")
Reported-and-tested-by: syzbot+5b59d6d459306a556f54(a)syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Message-Id: <20220608061422.38437-1-jasowang(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Reviewed-by: Laurent Vivier <lvivier(a)redhat.com>
diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index e856df7e285c..a6f3a8a2aca6 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -159,6 +159,8 @@ static int probe_common(struct virtio_device *vdev)
goto err_find;
}
+ virtio_device_ready(vdev);
+
/* we always have a pending entropy request */
request_entropy(vi);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 228432551bd8783211e494ab35f42a4344580502 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang(a)redhat.com>
Date: Wed, 8 Jun 2022 14:14:22 +0800
Subject: [PATCH] virtio-rng: make device ready before making request
The current virtio-rng driver makes an entropy request before DRIVER_OK,
which violates the spec:
virtio spec requires that all drivers set DRIVER_OK
before using devices.
Further, the kernel will ignore the interrupt after commit
8b4ec69d7e09 ("virtio: harden vring IRQ").
Fix this by making the device ready before the request.
Cc: stable(a)vger.kernel.org
Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
Fixes: f7f510ec1957 ("virtio: An entropy device, as suggested by hpa.")
Reported-and-tested-by: syzbot+5b59d6d459306a556f54(a)syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Message-Id: <20220608061422.38437-1-jasowang(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Reviewed-by: Laurent Vivier <lvivier(a)redhat.com>
diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index e856df7e285c..a6f3a8a2aca6 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -159,6 +159,8 @@ static int probe_common(struct virtio_device *vdev)
goto err_find;
}
+ virtio_device_ready(vdev);
+
/* we always have a pending entropy request */
request_entropy(vi);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 228432551bd8783211e494ab35f42a4344580502 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang(a)redhat.com>
Date: Wed, 8 Jun 2022 14:14:22 +0800
Subject: [PATCH] virtio-rng: make device ready before making request
The current virtio-rng driver makes an entropy request before DRIVER_OK,
which violates the spec:
virtio spec requires that all drivers set DRIVER_OK
before using devices.
Further, the kernel will ignore the interrupt after commit
8b4ec69d7e09 ("virtio: harden vring IRQ").
Fix this by making the device ready before the request.
Cc: stable(a)vger.kernel.org
Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
Fixes: f7f510ec1957 ("virtio: An entropy device, as suggested by hpa.")
Reported-and-tested-by: syzbot+5b59d6d459306a556f54(a)syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Message-Id: <20220608061422.38437-1-jasowang(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Reviewed-by: Laurent Vivier <lvivier(a)redhat.com>
diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index e856df7e285c..a6f3a8a2aca6 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -159,6 +159,8 @@ static int probe_common(struct virtio_device *vdev)
goto err_find;
}
+ virtio_device_ready(vdev);
+
/* we always have a pending entropy request */
request_entropy(vi);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 228432551bd8783211e494ab35f42a4344580502 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang(a)redhat.com>
Date: Wed, 8 Jun 2022 14:14:22 +0800
Subject: [PATCH] virtio-rng: make device ready before making request
The current virtio-rng driver makes an entropy request before DRIVER_OK,
which violates the spec:
virtio spec requires that all drivers set DRIVER_OK
before using devices.
Further, the kernel will ignore the interrupt after commit
8b4ec69d7e09 ("virtio: harden vring IRQ").
Fix this by making the device ready before the request.
Cc: stable(a)vger.kernel.org
Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
Fixes: f7f510ec1957 ("virtio: An entropy device, as suggested by hpa.")
Reported-and-tested-by: syzbot+5b59d6d459306a556f54(a)syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang(a)redhat.com>
Message-Id: <20220608061422.38437-1-jasowang(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Reviewed-by: Laurent Vivier <lvivier(a)redhat.com>
diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index e856df7e285c..a6f3a8a2aca6 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -159,6 +159,8 @@ static int probe_common(struct virtio_device *vdev)
goto err_find;
}
+ virtio_device_ready(vdev);
+
/* we always have a pending entropy request */
request_entropy(vi);
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 371017309a9f1725bfd3283afe61efa4ac34d30c Mon Sep 17 00:00:00 2001
From: Sunil Khatri <sunil.khatri(a)amd.com>
Date: Mon, 30 May 2022 23:24:09 +0530
Subject: [PATCH] drm/amdgpu: enable tmz by default for GC 10.3.7
Add IP GC 10.3.7 to the list of targets that have
tmz enabled by default.
Signed-off-by: Sunil Khatri <sunil.khatri(a)amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 5.18.x
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 798c56214a23..aebc384531ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -518,6 +518,8 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev)
case IP_VERSION(9, 1, 0):
/* RENOIR looks like RAVEN */
case IP_VERSION(9, 3, 0):
+ /* GC 10.3.7 */
+ case IP_VERSION(10, 3, 7):
if (amdgpu_tmz == 0) {
adev->gmc.tmz_enabled = false;
dev_info(adev->dev,
@@ -540,8 +542,6 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev)
case IP_VERSION(10, 3, 1):
/* YELLOW_CARP*/
case IP_VERSION(10, 3, 3):
- /* GC 10.3.7 */
- case IP_VERSION(10, 3, 7):
/* Don't enable it by default yet.
*/
if (amdgpu_tmz < 1) {
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 88467db6e2f46a2e79b1b67ce6873c284e4cf417 Mon Sep 17 00:00:00 2001
From: Philip Yang <Philip.Yang(a)amd.com>
Date: Fri, 3 Jun 2022 09:19:34 -0400
Subject: [PATCH] drm/amdkfd: Fix partial migration bugs
When migrating a range from system memory to VRAM, if a system page
cannot be locked or unmapped, we do a partial migration and leave some
pages in system memory. Several bugs were found in the page copy and
GPU mapping update for this situation:
1. Copying to VRAM should use migrate->npages, which is the total number
of pages in the range, as npages, not migrate->cpages, which is the
number of pages that can be migrated.
2. After a partial copy, advance the VRAM res cursor by j + 1 pages,
where j is the number of system pages copied, plus 1 page whose copy is
skipped.
3. When copying to RAM, collect all contiguous VRAM pages and copy them
together.
4. Calls to amdgpu_vm_update_range should pass the offset in bytes, not
as a number of pages.
Signed-off-by: Philip Yang <Philip.Yang(a)amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 997650d597ec..e44376c2ecdc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -296,7 +296,7 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
struct migrate_vma *migrate, struct dma_fence **mfence,
dma_addr_t *scratch)
{
- uint64_t npages = migrate->cpages;
+ uint64_t npages = migrate->npages;
struct device *dev = adev->dev;
struct amdgpu_res_cursor cursor;
dma_addr_t *src;
@@ -344,7 +344,7 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
mfence);
if (r)
goto out_free_vram_pages;
- amdgpu_res_next(&cursor, j << PAGE_SHIFT);
+ amdgpu_res_next(&cursor, (j + 1) << PAGE_SHIFT);
j = 0;
} else {
amdgpu_res_next(&cursor, PAGE_SIZE);
@@ -590,7 +590,7 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
continue;
}
src[i] = svm_migrate_addr(adev, spage);
- if (i > 0 && src[i] != src[i - 1] + PAGE_SIZE) {
+ if (j > 0 && src[i] != src[i - 1] + PAGE_SIZE) {
r = svm_migrate_copy_memory_gart(adev, dst + i - j,
src + i - j, j,
FROM_VRAM_TO_RAM,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 3bd0f1a670bb..7b332246eda3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1295,7 +1295,7 @@ svm_range_map_to_gpu(struct kfd_process_device *pdd, struct svm_range *prange,
r = amdgpu_vm_update_range(adev, vm, false, false, flush_tlb, NULL,
last_start, prange->start + i,
pte_flags,
- last_start - prange->start,
+ (last_start - prange->start) << PAGE_SHIFT,
bo_adev ? bo_adev->vm_manager.vram_base_offset : 0,
NULL, dma_addr, &vm->last_update);
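As a standalone illustration of the unit fixes in items 2 and 4 of the
commit message above (the numbers are arbitrary and PAGE_SHIFT of 12 assumes
4 KiB pages; this is not kernel code):

#include <stdio.h>

#define PAGE_SHIFT 12	/* 4 KiB pages assumed for this example */

int main(void)
{
	unsigned long j = 5;		/* system pages copied in a partial batch */
	unsigned long pages = 3;	/* an offset expressed in pages */

	/*
	 * Item 2: after a partial copy the VRAM cursor must skip the j copied
	 * pages plus one more, i.e. advance by (j + 1) pages worth of bytes.
	 */
	unsigned long cursor_bytes = (j + 1) << PAGE_SHIFT;

	/*
	 * Item 4: the mapping update expects the offset in bytes, so a page
	 * count has to be shifted by PAGE_SHIFT before being passed along.
	 */
	unsigned long offset_bytes = pages << PAGE_SHIFT;

	printf("cursor advance: %lu bytes, offset: %lu bytes\n",
	       cursor_bytes, offset_bytes);
	return 0;
}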
The patch below does not apply to the 5.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 88467db6e2f46a2e79b1b67ce6873c284e4cf417 Mon Sep 17 00:00:00 2001
From: Philip Yang <Philip.Yang(a)amd.com>
Date: Fri, 3 Jun 2022 09:19:34 -0400
Subject: [PATCH] drm/amdkfd: Fix partial migration bugs
When migrating a range from system memory to VRAM, if a system page
cannot be locked or unmapped, we do a partial migration and leave some
pages in system memory. Several bugs were found in the page copy and
GPU mapping update for this situation:
1. Copying to VRAM should use migrate->npages, which is the total number
of pages in the range, as npages, not migrate->cpages, which is the
number of pages that can be migrated.
2. After a partial copy, advance the VRAM res cursor by j + 1 pages,
where j is the number of system pages copied, plus 1 page whose copy is
skipped.
3. When copying to RAM, collect all contiguous VRAM pages and copy them
together.
4. Calls to amdgpu_vm_update_range should pass the offset in bytes, not
as a number of pages.
Signed-off-by: Philip Yang <Philip.Yang(a)amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 997650d597ec..e44376c2ecdc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -296,7 +296,7 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
struct migrate_vma *migrate, struct dma_fence **mfence,
dma_addr_t *scratch)
{
- uint64_t npages = migrate->cpages;
+ uint64_t npages = migrate->npages;
struct device *dev = adev->dev;
struct amdgpu_res_cursor cursor;
dma_addr_t *src;
@@ -344,7 +344,7 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
mfence);
if (r)
goto out_free_vram_pages;
- amdgpu_res_next(&cursor, j << PAGE_SHIFT);
+ amdgpu_res_next(&cursor, (j + 1) << PAGE_SHIFT);
j = 0;
} else {
amdgpu_res_next(&cursor, PAGE_SIZE);
@@ -590,7 +590,7 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
continue;
}
src[i] = svm_migrate_addr(adev, spage);
- if (i > 0 && src[i] != src[i - 1] + PAGE_SIZE) {
+ if (j > 0 && src[i] != src[i - 1] + PAGE_SIZE) {
r = svm_migrate_copy_memory_gart(adev, dst + i - j,
src + i - j, j,
FROM_VRAM_TO_RAM,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 3bd0f1a670bb..7b332246eda3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1295,7 +1295,7 @@ svm_range_map_to_gpu(struct kfd_process_device *pdd, struct svm_range *prange,
r = amdgpu_vm_update_range(adev, vm, false, false, flush_tlb, NULL,
last_start, prange->start + i,
pte_flags,
- last_start - prange->start,
+ (last_start - prange->start) << PAGE_SHIFT,
bo_adev ? bo_adev->vm_manager.vram_base_offset : 0,
NULL, dma_addr, &vm->last_update);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e54a4424925a27ed94dff046db3ce5caf4b1e748 Mon Sep 17 00:00:00 2001
From: Brian Norris <briannorris(a)chromium.org>
Date: Mon, 28 Feb 2022 12:25:32 -0800
Subject: [PATCH] drm/atomic: Force bridge self-refresh-exit on CRTC switch
It's possible to change which CRTC is in use for a given
connector/encoder/bridge while we're in self-refresh without fully
disabling the connector/encoder/bridge along the way. This can confuse
the encoder/bridge, because
(a) it needs to track the SR state (trying to perform "active"
operations while the panel is still in SR can be Bad(TM)); and
(b) it tracks the SR state via the CRTC state (and after the switch, the
previous SR state is lost).
Thus, we need to either somehow carry the self-refresh state over to the
new CRTC, or else force an encoder/bridge self-refresh transition during
such a switch.
I choose the latter, so we disable the encoder (and exit PSR) before
attaching it to the new CRTC (where we can continue to assume a clean
(non-self-refresh) state).
This fixes PSR issues seen on Rockchip RK3399 systems with
drivers/gpu/drm/bridge/analogix/analogix_dp_core.c.
Change in v2:
- Drop "->enable" condition; this could possibly be "->active" to
reflect the intended hardware state, but it also is a little
over-specific. We want to make a transition through "disabled" any
time we're exiting PSR at the same time as a CRTC switch.
(Thanks Liu Ying)
Cc: Liu Ying <victor.liu(a)oss.nxp.com>
Cc: <stable(a)vger.kernel.org>
Fixes: 1452c25b0e60 ("drm: Add helpers to kick off self refresh mode in drivers")
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
Reviewed-by: Sean Paul <seanpaul(a)chromium.org>
Signed-off-by: Douglas Anderson <dianders(a)chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228122522.v2.2.Ic15a2ef6…
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 9603193d2fa1..987e4b212e9f 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1011,9 +1011,19 @@ crtc_needs_disable(struct drm_crtc_state *old_state,
return drm_atomic_crtc_effectively_active(old_state);
/*
- * We need to run through the crtc_funcs->disable() function if the CRTC
- * is currently on, if it's transitioning to self refresh mode, or if
- * it's in self refresh mode and needs to be fully disabled.
+ * We need to disable bridge(s) and CRTC if we're transitioning out of
+ * self-refresh and changing CRTCs at the same time, because the
+ * bridge tracks self-refresh status via CRTC state.
+ */
+ if (old_state->self_refresh_active &&
+ old_state->crtc != new_state->crtc)
+ return true;
+
+ /*
+ * We also need to run through the crtc_funcs->disable() function if
+ * the CRTC is currently on, if it's transitioning to self refresh
+ * mode, or if it's in self refresh mode and needs to be fully
+ * disabled.
*/
return old_state->active ||
(old_state->self_refresh_active && !new_state->active) ||
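Condensed, the decision the helper makes after this change reads roughly as
below (a paraphrase of the hunk plus its surrounding context, not a verbatim
copy of the kernel function):

if (!new_state)
	return drm_atomic_crtc_effectively_active(old_state);

/*
 * New: leaving self-refresh while switching CRTCs forces a disable,
 * because the bridge tracks its self-refresh status in the old CRTC state.
 */
if (old_state->self_refresh_active && old_state->crtc != new_state->crtc)
	return true;

/*
 * Otherwise disable if the CRTC is on, transitioning into self refresh,
 * or in self refresh and being fully turned off.
 */
return old_state->active ||
       (old_state->self_refresh_active && !new_state->active) ||
       new_state->self_refresh_active;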
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ca871659ec1606d33b1e76de8d4cf924cf627e34 Mon Sep 17 00:00:00 2001
From: Brian Norris <briannorris(a)chromium.org>
Date: Mon, 28 Feb 2022 12:25:31 -0800
Subject: [PATCH] drm/bridge: analogix_dp: Support PSR-exit to disable
transition
Most eDP panel functions only work correctly when the panel is not in
self-refresh. In particular, analogix_dp_bridge_disable() tends to hit
AUX channel errors if the panel is in self-refresh.
Given the above, it appears that so far, this driver assumes that we are
never in self-refresh when it comes time to fully disable the bridge.
Prior to commit 846c7dfc1193 ("drm/atomic: Try to preserve the crtc
enabled state in drm_atomic_remove_fb, v2."), this tended to be true,
because we would automatically disable the pipe when framebuffers were
removed, and so we'd typically disable the bridge shortly after the last
display activity.
However, that is not guaranteed: an idle (self-refresh) display pipe may
be disabled, e.g., when switching CRTCs. We need to exit PSR first.
Stable notes: this is definitely a bugfix, and the bug has likely
existed in some form for quite a while. It may predate the "PSR helpers"
refactor, but the code looked very different before that, and it's
probably not worth rewriting the fix.
Cc: <stable(a)vger.kernel.org>
Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR")
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
Reviewed-by: Sean Paul <seanpaul(a)chromium.org>
Signed-off-by: Douglas Anderson <dianders(a)chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228122522.v2.1.I161904be…
diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index eb590fb8e8d0..0300f670a4fd 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -1268,6 +1268,25 @@ static int analogix_dp_bridge_attach(struct drm_bridge *bridge,
return 0;
}
+static
+struct drm_crtc *analogix_dp_get_old_crtc(struct analogix_dp_device *dp,
+ struct drm_atomic_state *state)
+{
+ struct drm_encoder *encoder = dp->encoder;
+ struct drm_connector *connector;
+ struct drm_connector_state *conn_state;
+
+ connector = drm_atomic_get_old_connector_for_encoder(state, encoder);
+ if (!connector)
+ return NULL;
+
+ conn_state = drm_atomic_get_old_connector_state(state, connector);
+ if (!conn_state)
+ return NULL;
+
+ return conn_state->crtc;
+}
+
static
struct drm_crtc *analogix_dp_get_new_crtc(struct analogix_dp_device *dp,
struct drm_atomic_state *state)
@@ -1448,14 +1467,16 @@ analogix_dp_bridge_atomic_disable(struct drm_bridge *bridge,
{
struct drm_atomic_state *old_state = old_bridge_state->base.state;
struct analogix_dp_device *dp = bridge->driver_private;
- struct drm_crtc *crtc;
+ struct drm_crtc *old_crtc, *new_crtc;
+ struct drm_crtc_state *old_crtc_state = NULL;
struct drm_crtc_state *new_crtc_state = NULL;
+ int ret;
- crtc = analogix_dp_get_new_crtc(dp, old_state);
- if (!crtc)
+ new_crtc = analogix_dp_get_new_crtc(dp, old_state);
+ if (!new_crtc)
goto out;
- new_crtc_state = drm_atomic_get_new_crtc_state(old_state, crtc);
+ new_crtc_state = drm_atomic_get_new_crtc_state(old_state, new_crtc);
if (!new_crtc_state)
goto out;
@@ -1464,6 +1485,19 @@ analogix_dp_bridge_atomic_disable(struct drm_bridge *bridge,
return;
out:
+ old_crtc = analogix_dp_get_old_crtc(dp, old_state);
+ if (old_crtc) {
+ old_crtc_state = drm_atomic_get_old_crtc_state(old_state,
+ old_crtc);
+
+ /* When moving from PSR to fully disabled, exit PSR first. */
+ if (old_crtc_state && old_crtc_state->self_refresh_active) {
+ ret = analogix_dp_disable_psr(dp);
+ if (ret)
+ DRM_ERROR("Failed to disable psr (%d)\n", ret);
+ }
+ }
+
analogix_dp_bridge_disable(bridge);
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a2a513be7139b279f1b5b2cee59c6c4950c34346 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Thu, 2 Jun 2022 23:16:57 +0900
Subject: [PATCH] zonefs: fix handling of explicit_open option on mount
Ignoring the explicit_open mount option on mount for devices that do not
have a limit on the number of open zones must be done after the mount
options are parsed and set in s_mount_opts. Move the check to ignore
the explicit_open option after the call to zonefs_parse_options() in
zonefs_fill_super().
Fixes: b5c00e975779 ("zonefs: open/close zone on file open/close")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index bcb21aea990a..ecce84909ca1 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -1760,12 +1760,6 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
atomic_set(&sbi->s_wro_seq_files, 0);
sbi->s_max_wro_seq_files = bdev_max_open_zones(sb->s_bdev);
- if (!sbi->s_max_wro_seq_files &&
- sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) {
- zonefs_info(sb, "No open zones limit. Ignoring explicit_open mount option\n");
- sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN;
- }
-
atomic_set(&sbi->s_active_seq_files, 0);
sbi->s_max_active_seq_files = bdev_max_active_zones(sb->s_bdev);
@@ -1790,6 +1784,12 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
zonefs_info(sb, "Mounting %u zones",
blkdev_nr_zones(sb->s_bdev->bd_disk));
+ if (!sbi->s_max_wro_seq_files &&
+ sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) {
+ zonefs_info(sb, "No open zones limit. Ignoring explicit_open mount option\n");
+ sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN;
+ }
+
/* Create root directory inode */
ret = -ENOMEM;
inode = new_inode(sb);
The patch below does not apply to the 5.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a2a513be7139b279f1b5b2cee59c6c4950c34346 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Thu, 2 Jun 2022 23:16:57 +0900
Subject: [PATCH] zonefs: fix handling of explicit_open option on mount
Ignoring the explicit_open mount option on mount for devices that do not
have a limit on the number of open zones must be done after the mount
options are parsed and set in s_mount_opts. Move the check to ignore
the explicit_open option after the call to zonefs_parse_options() in
zonefs_fill_super().
Fixes: b5c00e975779 ("zonefs: open/close zone on file open/close")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index bcb21aea990a..ecce84909ca1 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -1760,12 +1760,6 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
atomic_set(&sbi->s_wro_seq_files, 0);
sbi->s_max_wro_seq_files = bdev_max_open_zones(sb->s_bdev);
- if (!sbi->s_max_wro_seq_files &&
- sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) {
- zonefs_info(sb, "No open zones limit. Ignoring explicit_open mount option\n");
- sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN;
- }
-
atomic_set(&sbi->s_active_seq_files, 0);
sbi->s_max_active_seq_files = bdev_max_active_zones(sb->s_bdev);
@@ -1790,6 +1784,12 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
zonefs_info(sb, "Mounting %u zones",
blkdev_nr_zones(sb->s_bdev->bd_disk));
+ if (!sbi->s_max_wro_seq_files &&
+ sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) {
+ zonefs_info(sb, "No open zones limit. Ignoring explicit_open mount option\n");
+ sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN;
+ }
+
/* Create root directory inode */
ret = -ENOMEM;
inode = new_inode(sb);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a2a513be7139b279f1b5b2cee59c6c4950c34346 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Date: Thu, 2 Jun 2022 23:16:57 +0900
Subject: [PATCH] zonefs: fix handling of explicit_open option on mount
Ignoring the explicit_open mount option on mount for devices that do not
have a limit on the number of open zones must be done after the mount
options are parsed and set in s_mount_opts. Move the check to ignore
the explicit_open option after the call to zonefs_parse_options() in
zonefs_fill_super().
Fixes: b5c00e975779 ("zonefs: open/close zone on file open/close")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal(a)opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index bcb21aea990a..ecce84909ca1 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -1760,12 +1760,6 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
atomic_set(&sbi->s_wro_seq_files, 0);
sbi->s_max_wro_seq_files = bdev_max_open_zones(sb->s_bdev);
- if (!sbi->s_max_wro_seq_files &&
- sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) {
- zonefs_info(sb, "No open zones limit. Ignoring explicit_open mount option\n");
- sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN;
- }
-
atomic_set(&sbi->s_active_seq_files, 0);
sbi->s_max_active_seq_files = bdev_max_active_zones(sb->s_bdev);
@@ -1790,6 +1784,12 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
zonefs_info(sb, "Mounting %u zones",
blkdev_nr_zones(sb->s_bdev->bd_disk));
+ if (!sbi->s_max_wro_seq_files &&
+ sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) {
+ zonefs_info(sb, "No open zones limit. Ignoring explicit_open mount option\n");
+ sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN;
+ }
+
/* Create root directory inode */
ret = -ENOMEM;
inode = new_inode(sb);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7bb0fb7c63df95d6027dc50d6af3bc3bbbc25483 Mon Sep 17 00:00:00 2001
From: Olivier Matz <olivier.matz(a)6wind.com>
Date: Wed, 6 Apr 2022 11:52:52 +0200
Subject: [PATCH] ixgbe: fix unexpected VLAN Rx in promisc mode on VF
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When promiscuous mode is enabled on a VF, the IXGBE_VMOLR_VPE
bit (VLAN Promiscuous Enable) is set. This means that the VF will
receive packets whose VLAN is not the same as the VLAN of the VF.
For instance, in this situation:
┌────────┐ ┌────────┐ ┌────────┐
│ │ │ │ │ │
│ │ │ │ │ │
│ VF0├────┤VF1 VF2├────┤VF3 │
│ │ │ │ │ │
└────────┘ └────────┘ └────────┘
VM1 VM2 VM3
vf 0: vlan 1000
vf 1: vlan 1000
vf 2: vlan 1001
vf 3: vlan 1001
If we tcpdump on VF3, we see all the packets, even those transmitted
on vlan 1000.
This behavior prevents bridging VF1 and VF2 in VM2, because it will
create a loop: packets transmitted on VF1 will be received by VF2 and
vice-versa, and bridged again through the software bridge.
This patch removes the activation of VLAN Promiscuous mode when a VF
enables promiscuous mode. However, the IXGBE_VMOLR_UPE bit (Unicast
Promiscuous) is kept, so that a VF receives all packets that have the
same VLAN, whatever the destination MAC address.
Fixes: 8443c1a4b192 ("ixgbe, ixgbevf: Add new mbox API xcast mode")
Cc: stable(a)vger.kernel.org
Cc: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Signed-off-by: Olivier Matz <olivier.matz(a)6wind.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 8d108a78941b..d4e63f0644c3 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -1208,9 +1208,9 @@ static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
return -EPERM;
}
- disable = 0;
+ disable = IXGBE_VMOLR_VPE;
enable = IXGBE_VMOLR_BAM | IXGBE_VMOLR_ROMPE |
- IXGBE_VMOLR_MPE | IXGBE_VMOLR_UPE | IXGBE_VMOLR_VPE;
+ IXGBE_VMOLR_MPE | IXGBE_VMOLR_UPE;
break;
default:
return -EOPNOTSUPP;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 803e9895ea2b0fe80bc85980ae2d7a7e44037914 Mon Sep 17 00:00:00 2001
From: Olivier Matz <olivier.matz(a)6wind.com>
Date: Wed, 6 Apr 2022 11:52:51 +0200
Subject: [PATCH] ixgbe: fix bcast packets Rx on VF after promisc removal
After a VF requests removal of the promiscuous flag on an interface,
broadcast packets are no longer received. This breaks some protocols
like ARP.
In ixgbe_update_vf_xcast_mode(), we should keep the IXGBE_VMOLR_BAM
bit (Broadcast Accept) on promiscuous removal.
This flag is already set by default in ixgbe_set_vmolr() on VF reset.
Fixes: 8443c1a4b192 ("ixgbe, ixgbevf: Add new mbox API xcast mode")
Cc: stable(a)vger.kernel.org
Cc: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Signed-off-by: Olivier Matz <olivier.matz(a)6wind.com>
Tested-by: Konrad Jankowski <konrad0.jankowski(a)intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com>
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 7f11c0a8e7a9..8d108a78941b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -1184,9 +1184,9 @@ static int ixgbe_update_vf_xcast_mode(struct ixgbe_adapter *adapter,
switch (xcast_mode) {
case IXGBEVF_XCAST_MODE_NONE:
- disable = IXGBE_VMOLR_BAM | IXGBE_VMOLR_ROMPE |
+ disable = IXGBE_VMOLR_ROMPE |
IXGBE_VMOLR_MPE | IXGBE_VMOLR_UPE | IXGBE_VMOLR_VPE;
- enable = 0;
+ enable = IXGBE_VMOLR_BAM;
break;
case IXGBEVF_XCAST_MODE_MULTI:
disable = IXGBE_VMOLR_MPE | IXGBE_VMOLR_UPE | IXGBE_VMOLR_VPE;
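Both ixgbe changes above tweak the disable/enable masks that
ixgbe_update_vf_xcast_mode() applies to the per-VF VMOLR bits. The toy
program below only demonstrates the general clear-then-set mask idea and why
keeping BAM in the enable mask preserves broadcast reception; the bit
positions are placeholders, not the real register layout:

#include <stdio.h>
#include <stdint.h>

/* Placeholder bit positions, for illustration only. */
#define VMOLR_BAM	(1u << 0)	/* broadcast accept */
#define VMOLR_ROMPE	(1u << 1)
#define VMOLR_MPE	(1u << 2)
#define VMOLR_UPE	(1u << 3)
#define VMOLR_VPE	(1u << 4)

int main(void)
{
	/* VF currently in promiscuous mode. */
	uint32_t vmolr = VMOLR_BAM | VMOLR_ROMPE | VMOLR_MPE |
			 VMOLR_UPE | VMOLR_VPE;

	/* XCAST_MODE_NONE after the fix: clear promisc bits, keep broadcast. */
	uint32_t disable = VMOLR_ROMPE | VMOLR_MPE | VMOLR_UPE | VMOLR_VPE;
	uint32_t enable = VMOLR_BAM;

	vmolr &= ~disable;
	vmolr |= enable;

	printf("BAM still set: %u, promisc bits cleared: %u\n",
	       !!(vmolr & VMOLR_BAM), !(vmolr & (VMOLR_UPE | VMOLR_VPE)));
	return 0;
}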
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f2e19b36593caed4c977c2f55aeba7408aeb2132 Mon Sep 17 00:00:00 2001
From: Martin Faltesek <mfaltesek(a)google.com>
Date: Mon, 6 Jun 2022 21:57:29 -0500
Subject: [PATCH] nfc: st21nfca: fix incorrect sizing calculations in
EVT_TRANSACTION
The transaction buffer is allocated using the size of the packet buffer,
minus two, which seems intended to remove the two tags that are not
present in the target structure. This calculation undercounts the memory
needed because of differences between the packet contents and the target
structure. The aid_len field is a u8 in the packet, but a u32 in the
structure, so at least 3 bytes are always undercounted. Further, the aid
data is a variable-length field in the packet, but fixed in the
structure, so if this field is less than the max, the difference adds to
the undercount.
The last validation check for transaction->params_len is also incorrect
since it employs the same accounting error.
To fix, perform validation checks progressively to safely reach the
next field, to determine the size of both buffers and verify both tags.
Once all validation checks pass, allocate the buffer and copy the data.
This eliminates freeing memory on the error path, as those checks are
moved ahead of memory allocation.
Fixes: 26fc6c7f02cb ("NFC: st21nfca: Add HCI transaction event support")
Fixes: 4fbcc1a4cb20 ("nfc: st21nfca: Fix potential buffer overflows in EVT_TRANSACTION")
Cc: stable(a)vger.kernel.org
Signed-off-by: Martin Faltesek <mfaltesek(a)google.com>
Reviewed-by: Guenter Roeck <groeck(a)chromium.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/nfc/st21nfca/se.c b/drivers/nfc/st21nfca/se.c
index 8e1113ce139b..df8d27cf2956 100644
--- a/drivers/nfc/st21nfca/se.c
+++ b/drivers/nfc/st21nfca/se.c
@@ -300,6 +300,8 @@ int st21nfca_connectivity_event_received(struct nfc_hci_dev *hdev, u8 host,
int r = 0;
struct device *dev = &hdev->ndev->dev;
struct nfc_evt_transaction *transaction;
+ u32 aid_len;
+ u8 params_len;
pr_debug("connectivity gate event: %x\n", event);
@@ -308,50 +310,48 @@ int st21nfca_connectivity_event_received(struct nfc_hci_dev *hdev, u8 host,
r = nfc_se_connectivity(hdev->ndev, host);
break;
case ST21NFCA_EVT_TRANSACTION:
- /*
- * According to specification etsi 102 622
+ /* According to specification etsi 102 622
* 11.2.2.4 EVT_TRANSACTION Table 52
* Description Tag Length
* AID 81 5 to 16
* PARAMETERS 82 0 to 255
+ *
+ * The key differences are aid storage length is variably sized
+ * in the packet, but fixed in nfc_evt_transaction, and that the aid_len
+ * is u8 in the packet, but u32 in the structure, and the tags in
+ * the packet are not included in nfc_evt_transaction.
+ *
+ * size in bytes: 1 1 5-16 1 1 0-255
+ * offset: 0 1 2 aid_len + 2 aid_len + 3 aid_len + 4
+ * member name: aid_tag(M) aid_len aid params_tag(M) params_len params
+ * example: 0x81 5-16 X 0x82 0-255 X
*/
- if (skb->len < NFC_MIN_AID_LENGTH + 2 ||
- skb->data[0] != NFC_EVT_TRANSACTION_AID_TAG)
+ if (skb->len < 2 || skb->data[0] != NFC_EVT_TRANSACTION_AID_TAG)
return -EPROTO;
- transaction = devm_kzalloc(dev, skb->len - 2, GFP_KERNEL);
- if (!transaction)
- return -ENOMEM;
-
- transaction->aid_len = skb->data[1];
+ aid_len = skb->data[1];
- /* Checking if the length of the AID is valid */
- if (transaction->aid_len > sizeof(transaction->aid)) {
- devm_kfree(dev, transaction);
- return -EINVAL;
- }
+ if (skb->len < aid_len + 4 || aid_len > sizeof(transaction->aid))
+ return -EPROTO;
- memcpy(transaction->aid, &skb->data[2],
- transaction->aid_len);
+ params_len = skb->data[aid_len + 3];
- /* Check next byte is PARAMETERS tag (82) */
- if (skb->data[transaction->aid_len + 2] !=
- NFC_EVT_TRANSACTION_PARAMS_TAG) {
- devm_kfree(dev, transaction);
+ /* Verify PARAMETERS tag is (82), and final check that there is enough
+ * space in the packet to read everything.
+ */
+ if ((skb->data[aid_len + 2] != NFC_EVT_TRANSACTION_PARAMS_TAG) ||
+ (skb->len < aid_len + 4 + params_len))
return -EPROTO;
- }
- transaction->params_len = skb->data[transaction->aid_len + 3];
+ transaction = devm_kzalloc(dev, sizeof(*transaction) + params_len, GFP_KERNEL);
+ if (!transaction)
+ return -ENOMEM;
- /* Total size is allocated (skb->len - 2) minus fixed array members */
- if (transaction->params_len > ((skb->len - 2) -
- sizeof(struct nfc_evt_transaction))) {
- devm_kfree(dev, transaction);
- return -EINVAL;
- }
+ transaction->aid_len = aid_len;
+ transaction->params_len = params_len;
- memcpy(transaction->params, skb->data +
- transaction->aid_len + 4, transaction->params_len);
+ memcpy(transaction->aid, &skb->data[2], aid_len);
+ memcpy(transaction->params, &skb->data[aid_len + 4], params_len);
r = nfc_se_transaction(hdev->ndev, host, transaction);
break;
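To make the accounting error concrete, here is a small self-contained
program. The struct below is a simplified stand-in for nfc_evt_transaction
(u32 length, a fixed 16-byte aid array, a u8 params_len, then flexible
params), not the exact kernel definition, and the packet sizes are arbitrary
examples:

#include <stdio.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's nfc_evt_transaction. */
struct evt_transaction {
	uint32_t aid_len;
	uint8_t aid[16];
	uint8_t params_len;
	uint8_t params[];
};

int main(void)
{
	unsigned int aid_len = 5, params_len = 10;

	/* Packet: aid_tag(1) aid_len(1) aid params_tag(1) params_len(1) params */
	unsigned int skb_len = 1 + 1 + aid_len + 1 + 1 + params_len;

	unsigned int old_alloc = skb_len - 2;	/* the buggy sizing */
	unsigned int needed = sizeof(struct evt_transaction) + params_len;

	printf("packet=%u bytes, old allocation=%u bytes, actually needed=%u bytes\n",
	       skb_len, old_alloc, needed);
	return 0;
}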
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f2e19b36593caed4c977c2f55aeba7408aeb2132 Mon Sep 17 00:00:00 2001
From: Martin Faltesek <mfaltesek(a)google.com>
Date: Mon, 6 Jun 2022 21:57:29 -0500
Subject: [PATCH] nfc: st21nfca: fix incorrect sizing calculations in
EVT_TRANSACTION
The transaction buffer is allocated using the size of the packet buffer,
minus two, which seems intended to remove the two tags that are not
present in the target structure. This calculation undercounts the memory
needed because of differences between the packet contents and the target
structure. The aid_len field is a u8 in the packet, but a u32 in the
structure, so at least 3 bytes are always undercounted. Further, the aid
data is a variable-length field in the packet, but fixed in the
structure, so if this field is less than the max, the difference adds to
the undercount.
The last validation check for transaction->params_len is also incorrect
since it employs the same accounting error.
To fix, perform validation checks progressively to safely reach the
next field, to determine the size of both buffers and verify both tags.
Once all validation checks pass, allocate the buffer and copy the data.
This eliminates freeing memory on the error path, as those checks are
moved ahead of memory allocation.
Fixes: 26fc6c7f02cb ("NFC: st21nfca: Add HCI transaction event support")
Fixes: 4fbcc1a4cb20 ("nfc: st21nfca: Fix potential buffer overflows in EVT_TRANSACTION")
Cc: stable(a)vger.kernel.org
Signed-off-by: Martin Faltesek <mfaltesek(a)google.com>
Reviewed-by: Guenter Roeck <groeck(a)chromium.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/nfc/st21nfca/se.c b/drivers/nfc/st21nfca/se.c
index 8e1113ce139b..df8d27cf2956 100644
--- a/drivers/nfc/st21nfca/se.c
+++ b/drivers/nfc/st21nfca/se.c
@@ -300,6 +300,8 @@ int st21nfca_connectivity_event_received(struct nfc_hci_dev *hdev, u8 host,
int r = 0;
struct device *dev = &hdev->ndev->dev;
struct nfc_evt_transaction *transaction;
+ u32 aid_len;
+ u8 params_len;
pr_debug("connectivity gate event: %x\n", event);
@@ -308,50 +310,48 @@ int st21nfca_connectivity_event_received(struct nfc_hci_dev *hdev, u8 host,
r = nfc_se_connectivity(hdev->ndev, host);
break;
case ST21NFCA_EVT_TRANSACTION:
- /*
- * According to specification etsi 102 622
+ /* According to specification etsi 102 622
* 11.2.2.4 EVT_TRANSACTION Table 52
* Description Tag Length
* AID 81 5 to 16
* PARAMETERS 82 0 to 255
+ *
+ * The key differences are aid storage length is variably sized
+ * in the packet, but fixed in nfc_evt_transaction, and that the aid_len
+ * is u8 in the packet, but u32 in the structure, and the tags in
+ * the packet are not included in nfc_evt_transaction.
+ *
+ * size in bytes: 1 1 5-16 1 1 0-255
+ * offset: 0 1 2 aid_len + 2 aid_len + 3 aid_len + 4
+ * member name: aid_tag(M) aid_len aid params_tag(M) params_len params
+ * example: 0x81 5-16 X 0x82 0-255 X
*/
- if (skb->len < NFC_MIN_AID_LENGTH + 2 ||
- skb->data[0] != NFC_EVT_TRANSACTION_AID_TAG)
+ if (skb->len < 2 || skb->data[0] != NFC_EVT_TRANSACTION_AID_TAG)
return -EPROTO;
- transaction = devm_kzalloc(dev, skb->len - 2, GFP_KERNEL);
- if (!transaction)
- return -ENOMEM;
-
- transaction->aid_len = skb->data[1];
+ aid_len = skb->data[1];
- /* Checking if the length of the AID is valid */
- if (transaction->aid_len > sizeof(transaction->aid)) {
- devm_kfree(dev, transaction);
- return -EINVAL;
- }
+ if (skb->len < aid_len + 4 || aid_len > sizeof(transaction->aid))
+ return -EPROTO;
- memcpy(transaction->aid, &skb->data[2],
- transaction->aid_len);
+ params_len = skb->data[aid_len + 3];
- /* Check next byte is PARAMETERS tag (82) */
- if (skb->data[transaction->aid_len + 2] !=
- NFC_EVT_TRANSACTION_PARAMS_TAG) {
- devm_kfree(dev, transaction);
+ /* Verify PARAMETERS tag is (82), and final check that there is enough
+ * space in the packet to read everything.
+ */
+ if ((skb->data[aid_len + 2] != NFC_EVT_TRANSACTION_PARAMS_TAG) ||
+ (skb->len < aid_len + 4 + params_len))
return -EPROTO;
- }
- transaction->params_len = skb->data[transaction->aid_len + 3];
+ transaction = devm_kzalloc(dev, sizeof(*transaction) + params_len, GFP_KERNEL);
+ if (!transaction)
+ return -ENOMEM;
- /* Total size is allocated (skb->len - 2) minus fixed array members */
- if (transaction->params_len > ((skb->len - 2) -
- sizeof(struct nfc_evt_transaction))) {
- devm_kfree(dev, transaction);
- return -EINVAL;
- }
+ transaction->aid_len = aid_len;
+ transaction->params_len = params_len;
- memcpy(transaction->params, skb->data +
- transaction->aid_len + 4, transaction->params_len);
+ memcpy(transaction->aid, &skb->data[2], aid_len);
+ memcpy(transaction->params, &skb->data[aid_len + 4], params_len);
r = nfc_se_transaction(hdev->ndev, host, transaction);
break;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 10e14073107dd0b6d97d9516a02845a8e501c2c9 Mon Sep 17 00:00:00 2001
From: Jchao Sun <sunjunchao2870(a)gmail.com>
Date: Tue, 24 May 2022 08:05:40 -0700
Subject: [PATCH] writeback: Fix inode->i_io_list not be protected by
inode->i_lock error
Commit b35250c0816c ("writeback: Protect inode->i_io_list with
inode->i_lock") made inode->i_io_list not only protected by
wb->list_lock but also inode->i_lock, but inode_io_list_move_locked()
was missed. Add lock there and also update comment describing
things protected by inode->i_lock. This also fixes a race where
__mark_inode_dirty() could move inode under flush worker's hands
and thus sync(2) could miss writing some inodes.
Fixes: b35250c0816c ("writeback: Protect inode->i_io_list with inode->i_lock")
Link: https://lore.kernel.org/r/20220524150540.12552-1-sunjunchao2870@gmail.com
CC: stable(a)vger.kernel.org
Signed-off-by: Jchao Sun <sunjunchao2870(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a21d8f1a56d1..05221366a16d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -120,6 +120,7 @@ static bool inode_io_list_move_locked(struct inode *inode,
struct list_head *head)
{
assert_spin_locked(&wb->list_lock);
+ assert_spin_locked(&inode->i_lock);
list_move(&inode->i_io_list, head);
@@ -1365,9 +1366,9 @@ static int move_expired_inodes(struct list_head *delaying_queue,
inode = wb_inode(delaying_queue->prev);
if (inode_dirtied_after(inode, dirtied_before))
break;
+ spin_lock(&inode->i_lock);
list_move(&inode->i_io_list, &tmp);
moved++;
- spin_lock(&inode->i_lock);
inode->i_state |= I_SYNC_QUEUED;
spin_unlock(&inode->i_lock);
if (sb_is_blkdev_sb(inode->i_sb))
@@ -1383,7 +1384,12 @@ static int move_expired_inodes(struct list_head *delaying_queue,
goto out;
}
- /* Move inodes from one superblock together */
+ /*
+ * Although inode's i_io_list is moved from 'tmp' to 'dispatch_queue',
+ * we don't take inode->i_lock here because it is just a pointless overhead.
+ * Inode is already marked as I_SYNC_QUEUED so writeback list handling is
+ * fully under our control.
+ */
while (!list_empty(&tmp)) {
sb = wb_inode(tmp.prev)->i_sb;
list_for_each_prev_safe(pos, node, &tmp) {
@@ -1826,8 +1832,8 @@ static long writeback_sb_inodes(struct super_block *sb,
* We'll have another go at writing back this inode
* when we completed a full scan of b_io.
*/
- spin_unlock(&inode->i_lock);
requeue_io(inode, wb);
+ spin_unlock(&inode->i_lock);
trace_writeback_sb_inodes_requeue(inode);
continue;
}
@@ -2358,6 +2364,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
{
struct super_block *sb = inode->i_sb;
int dirtytime = 0;
+ struct bdi_writeback *wb = NULL;
trace_writeback_mark_inode_dirty(inode, flags);
@@ -2409,6 +2416,17 @@ void __mark_inode_dirty(struct inode *inode, int flags)
inode->i_state &= ~I_DIRTY_TIME;
inode->i_state |= flags;
+ /*
+ * Grab inode's wb early because it requires dropping i_lock and we
+ * need to make sure following checks happen atomically with dirty
+ * list handling so that we don't move inodes under flush worker's
+ * hands.
+ */
+ if (!was_dirty) {
+ wb = locked_inode_to_wb_and_lock_list(inode);
+ spin_lock(&inode->i_lock);
+ }
+
/*
* If the inode is queued for writeback by flush worker, just
* update its dirty state. Once the flush worker is done with
@@ -2416,7 +2434,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
* list, based upon its state.
*/
if (inode->i_state & I_SYNC_QUEUED)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* Only add valid (hashed) inodes to the superblock's
@@ -2424,22 +2442,19 @@ void __mark_inode_dirty(struct inode *inode, int flags)
*/
if (!S_ISBLK(inode->i_mode)) {
if (inode_unhashed(inode))
- goto out_unlock_inode;
+ goto out_unlock;
}
if (inode->i_state & I_FREEING)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* If the inode was already on b_dirty/b_io/b_more_io, don't
* reposition it (that would break b_dirty time-ordering).
*/
if (!was_dirty) {
- struct bdi_writeback *wb;
struct list_head *dirty_list;
bool wakeup_bdi = false;
- wb = locked_inode_to_wb_and_lock_list(inode);
-
inode->dirtied_when = jiffies;
if (dirtytime)
inode->dirtied_time_when = jiffies;
@@ -2453,6 +2468,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
dirty_list);
spin_unlock(&wb->list_lock);
+ spin_unlock(&inode->i_lock);
trace_writeback_dirty_inode_enqueue(inode);
/*
@@ -2467,6 +2483,9 @@ void __mark_inode_dirty(struct inode *inode, int flags)
return;
}
}
+out_unlock:
+ if (wb)
+ spin_unlock(&wb->list_lock);
out_unlock_inode:
spin_unlock(&inode->i_lock);
}
diff --git a/fs/inode.c b/fs/inode.c
index 9d9b422504d1..bd4da9c5207e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -27,7 +27,7 @@
* Inode locking rules:
*
* inode->i_lock protects:
- * inode->i_state, inode->i_hash, __iget()
+ * inode->i_state, inode->i_hash, __iget(), inode->i_io_list
* Inode LRU list locks protect:
* inode->i_sb->s_inode_lru, inode->i_lru
* inode->i_sb->s_inode_list_lock protects:
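In short, the rule the patch enforces is that an inode may only be moved
between writeback lists with both locks held, with inode->i_lock nesting
inside wb->list_lock. A bare sketch of that ordering (illustrative only, not
actual kernel code):

spin_lock(&wb->list_lock);
spin_lock(&inode->i_lock);		/* nests inside wb->list_lock */
list_move(&inode->i_io_list, head);
spin_unlock(&inode->i_lock);
spin_unlock(&wb->list_lock);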
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 10e14073107dd0b6d97d9516a02845a8e501c2c9 Mon Sep 17 00:00:00 2001
From: Jchao Sun <sunjunchao2870(a)gmail.com>
Date: Tue, 24 May 2022 08:05:40 -0700
Subject: [PATCH] writeback: Fix inode->i_io_list not be protected by
inode->i_lock error
Commit b35250c0816c ("writeback: Protect inode->i_io_list with
inode->i_lock") made inode->i_io_list not only protected by
wb->list_lock but also inode->i_lock, but inode_io_list_move_locked()
was missed. Add lock there and also update comment describing
things protected by inode->i_lock. This also fixes a race where
__mark_inode_dirty() could move inode under flush worker's hands
and thus sync(2) could miss writing some inodes.
Fixes: b35250c0816c ("writeback: Protect inode->i_io_list with inode->i_lock")
Link: https://lore.kernel.org/r/20220524150540.12552-1-sunjunchao2870@gmail.com
CC: stable(a)vger.kernel.org
Signed-off-by: Jchao Sun <sunjunchao2870(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a21d8f1a56d1..05221366a16d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -120,6 +120,7 @@ static bool inode_io_list_move_locked(struct inode *inode,
struct list_head *head)
{
assert_spin_locked(&wb->list_lock);
+ assert_spin_locked(&inode->i_lock);
list_move(&inode->i_io_list, head);
@@ -1365,9 +1366,9 @@ static int move_expired_inodes(struct list_head *delaying_queue,
inode = wb_inode(delaying_queue->prev);
if (inode_dirtied_after(inode, dirtied_before))
break;
+ spin_lock(&inode->i_lock);
list_move(&inode->i_io_list, &tmp);
moved++;
- spin_lock(&inode->i_lock);
inode->i_state |= I_SYNC_QUEUED;
spin_unlock(&inode->i_lock);
if (sb_is_blkdev_sb(inode->i_sb))
@@ -1383,7 +1384,12 @@ static int move_expired_inodes(struct list_head *delaying_queue,
goto out;
}
- /* Move inodes from one superblock together */
+ /*
+ * Although inode's i_io_list is moved from 'tmp' to 'dispatch_queue',
+ * we don't take inode->i_lock here because it is just a pointless overhead.
+ * Inode is already marked as I_SYNC_QUEUED so writeback list handling is
+ * fully under our control.
+ */
while (!list_empty(&tmp)) {
sb = wb_inode(tmp.prev)->i_sb;
list_for_each_prev_safe(pos, node, &tmp) {
@@ -1826,8 +1832,8 @@ static long writeback_sb_inodes(struct super_block *sb,
* We'll have another go at writing back this inode
* when we completed a full scan of b_io.
*/
- spin_unlock(&inode->i_lock);
requeue_io(inode, wb);
+ spin_unlock(&inode->i_lock);
trace_writeback_sb_inodes_requeue(inode);
continue;
}
@@ -2358,6 +2364,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
{
struct super_block *sb = inode->i_sb;
int dirtytime = 0;
+ struct bdi_writeback *wb = NULL;
trace_writeback_mark_inode_dirty(inode, flags);
@@ -2409,6 +2416,17 @@ void __mark_inode_dirty(struct inode *inode, int flags)
inode->i_state &= ~I_DIRTY_TIME;
inode->i_state |= flags;
+ /*
+ * Grab inode's wb early because it requires dropping i_lock and we
+ * need to make sure following checks happen atomically with dirty
+ * list handling so that we don't move inodes under flush worker's
+ * hands.
+ */
+ if (!was_dirty) {
+ wb = locked_inode_to_wb_and_lock_list(inode);
+ spin_lock(&inode->i_lock);
+ }
+
/*
* If the inode is queued for writeback by flush worker, just
* update its dirty state. Once the flush worker is done with
@@ -2416,7 +2434,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
* list, based upon its state.
*/
if (inode->i_state & I_SYNC_QUEUED)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* Only add valid (hashed) inodes to the superblock's
@@ -2424,22 +2442,19 @@ void __mark_inode_dirty(struct inode *inode, int flags)
*/
if (!S_ISBLK(inode->i_mode)) {
if (inode_unhashed(inode))
- goto out_unlock_inode;
+ goto out_unlock;
}
if (inode->i_state & I_FREEING)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* If the inode was already on b_dirty/b_io/b_more_io, don't
* reposition it (that would break b_dirty time-ordering).
*/
if (!was_dirty) {
- struct bdi_writeback *wb;
struct list_head *dirty_list;
bool wakeup_bdi = false;
- wb = locked_inode_to_wb_and_lock_list(inode);
-
inode->dirtied_when = jiffies;
if (dirtytime)
inode->dirtied_time_when = jiffies;
@@ -2453,6 +2468,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
dirty_list);
spin_unlock(&wb->list_lock);
+ spin_unlock(&inode->i_lock);
trace_writeback_dirty_inode_enqueue(inode);
/*
@@ -2467,6 +2483,9 @@ void __mark_inode_dirty(struct inode *inode, int flags)
return;
}
}
+out_unlock:
+ if (wb)
+ spin_unlock(&wb->list_lock);
out_unlock_inode:
spin_unlock(&inode->i_lock);
}
diff --git a/fs/inode.c b/fs/inode.c
index 9d9b422504d1..bd4da9c5207e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -27,7 +27,7 @@
* Inode locking rules:
*
* inode->i_lock protects:
- * inode->i_state, inode->i_hash, __iget()
+ * inode->i_state, inode->i_hash, __iget(), inode->i_io_list
* Inode LRU list locks protect:
* inode->i_sb->s_inode_lru, inode->i_lru
* inode->i_sb->s_inode_list_lock protects:
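The writeback fix above boils down to one rule: inode->i_io_list may only be manipulated with both wb->list_lock and inode->i_lock held, and __mark_inode_dirty() must take the wb list lock before re-checking the inode state. What follows is a small stand-alone user-space sketch of that two-lock discipline, not kernel code; the struct and function names (wb_model, inode_model, mark_dirty, ...) are invented for illustration only.

/* User-space model of the locking rule the patch enforces: the "dirty
 * list" slot may only change while both the list lock and the per-inode
 * lock are held, and the dirtying path takes the list lock first. */
#include <pthread.h>
#include <stdio.h>

struct inode_model {
        pthread_mutex_t i_lock;        /* models inode->i_lock */
        int sync_queued;               /* models I_SYNC_QUEUED */
        int dirty;                     /* models I_DIRTY       */
};

struct wb_model {
        pthread_mutex_t list_lock;     /* models wb->list_lock  */
        struct inode_model *b_dirty;   /* one-slot "dirty list" */
};

static struct inode_model ino = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };
static struct wb_model wb = { PTHREAD_MUTEX_INITIALIZER, NULL };

/* Models inode_io_list_move_locked(): both locks must already be held;
 * this is what the added assert_spin_locked(&inode->i_lock) checks. */
static void io_list_move_locked(struct wb_model *w, struct inode_model *i)
{
        w->b_dirty = i;
}

/* Models the fixed __mark_inode_dirty() flow for a newly dirtied inode:
 * take the list lock, then the inode lock, so the state checks and the
 * list move are atomic with respect to the flush worker. */
static void mark_dirty(struct wb_model *w, struct inode_model *i)
{
        pthread_mutex_lock(&i->i_lock);
        int was_dirty = i->dirty;
        i->dirty = 1;
        pthread_mutex_unlock(&i->i_lock);
        if (was_dirty)
                return;
        pthread_mutex_lock(&w->list_lock);
        pthread_mutex_lock(&i->i_lock);
        if (!i->sync_queued)
                io_list_move_locked(w, i);
        pthread_mutex_unlock(&i->i_lock);
        pthread_mutex_unlock(&w->list_lock);
}

int main(void)
{
        mark_dirty(&wb, &ino);
        printf("inode on dirty list: %s\n", wb.b_dirty == &ino ? "yes" : "no");
        return 0;
}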
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 10e14073107dd0b6d97d9516a02845a8e501c2c9 Mon Sep 17 00:00:00 2001
From: Jchao Sun <sunjunchao2870(a)gmail.com>
Date: Tue, 24 May 2022 08:05:40 -0700
Subject: [PATCH] writeback: Fix inode->i_io_list not be protected by
inode->i_lock error
Commit b35250c0816c ("writeback: Protect inode->i_io_list with
inode->i_lock") made inode->i_io_list not only protected by
wb->list_lock but also inode->i_lock, but inode_io_list_move_locked()
was missed. Add lock there and also update comment describing
things protected by inode->i_lock. This also fixes a race where
__mark_inode_dirty() could move inode under flush worker's hands
and thus sync(2) could miss writing some inodes.
Fixes: b35250c0816c ("writeback: Protect inode->i_io_list with inode->i_lock")
Link: https://lore.kernel.org/r/20220524150540.12552-1-sunjunchao2870@gmail.com
CC: stable(a)vger.kernel.org
Signed-off-by: Jchao Sun <sunjunchao2870(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a21d8f1a56d1..05221366a16d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -120,6 +120,7 @@ static bool inode_io_list_move_locked(struct inode *inode,
struct list_head *head)
{
assert_spin_locked(&wb->list_lock);
+ assert_spin_locked(&inode->i_lock);
list_move(&inode->i_io_list, head);
@@ -1365,9 +1366,9 @@ static int move_expired_inodes(struct list_head *delaying_queue,
inode = wb_inode(delaying_queue->prev);
if (inode_dirtied_after(inode, dirtied_before))
break;
+ spin_lock(&inode->i_lock);
list_move(&inode->i_io_list, &tmp);
moved++;
- spin_lock(&inode->i_lock);
inode->i_state |= I_SYNC_QUEUED;
spin_unlock(&inode->i_lock);
if (sb_is_blkdev_sb(inode->i_sb))
@@ -1383,7 +1384,12 @@ static int move_expired_inodes(struct list_head *delaying_queue,
goto out;
}
- /* Move inodes from one superblock together */
+ /*
+ * Although inode's i_io_list is moved from 'tmp' to 'dispatch_queue',
+ * we don't take inode->i_lock here because it is just a pointless overhead.
+ * Inode is already marked as I_SYNC_QUEUED so writeback list handling is
+ * fully under our control.
+ */
while (!list_empty(&tmp)) {
sb = wb_inode(tmp.prev)->i_sb;
list_for_each_prev_safe(pos, node, &tmp) {
@@ -1826,8 +1832,8 @@ static long writeback_sb_inodes(struct super_block *sb,
* We'll have another go at writing back this inode
* when we completed a full scan of b_io.
*/
- spin_unlock(&inode->i_lock);
requeue_io(inode, wb);
+ spin_unlock(&inode->i_lock);
trace_writeback_sb_inodes_requeue(inode);
continue;
}
@@ -2358,6 +2364,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
{
struct super_block *sb = inode->i_sb;
int dirtytime = 0;
+ struct bdi_writeback *wb = NULL;
trace_writeback_mark_inode_dirty(inode, flags);
@@ -2409,6 +2416,17 @@ void __mark_inode_dirty(struct inode *inode, int flags)
inode->i_state &= ~I_DIRTY_TIME;
inode->i_state |= flags;
+ /*
+ * Grab inode's wb early because it requires dropping i_lock and we
+ * need to make sure following checks happen atomically with dirty
+ * list handling so that we don't move inodes under flush worker's
+ * hands.
+ */
+ if (!was_dirty) {
+ wb = locked_inode_to_wb_and_lock_list(inode);
+ spin_lock(&inode->i_lock);
+ }
+
/*
* If the inode is queued for writeback by flush worker, just
* update its dirty state. Once the flush worker is done with
@@ -2416,7 +2434,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
* list, based upon its state.
*/
if (inode->i_state & I_SYNC_QUEUED)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* Only add valid (hashed) inodes to the superblock's
@@ -2424,22 +2442,19 @@ void __mark_inode_dirty(struct inode *inode, int flags)
*/
if (!S_ISBLK(inode->i_mode)) {
if (inode_unhashed(inode))
- goto out_unlock_inode;
+ goto out_unlock;
}
if (inode->i_state & I_FREEING)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* If the inode was already on b_dirty/b_io/b_more_io, don't
* reposition it (that would break b_dirty time-ordering).
*/
if (!was_dirty) {
- struct bdi_writeback *wb;
struct list_head *dirty_list;
bool wakeup_bdi = false;
- wb = locked_inode_to_wb_and_lock_list(inode);
-
inode->dirtied_when = jiffies;
if (dirtytime)
inode->dirtied_time_when = jiffies;
@@ -2453,6 +2468,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
dirty_list);
spin_unlock(&wb->list_lock);
+ spin_unlock(&inode->i_lock);
trace_writeback_dirty_inode_enqueue(inode);
/*
@@ -2467,6 +2483,9 @@ void __mark_inode_dirty(struct inode *inode, int flags)
return;
}
}
+out_unlock:
+ if (wb)
+ spin_unlock(&wb->list_lock);
out_unlock_inode:
spin_unlock(&inode->i_lock);
}
diff --git a/fs/inode.c b/fs/inode.c
index 9d9b422504d1..bd4da9c5207e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -27,7 +27,7 @@
* Inode locking rules:
*
* inode->i_lock protects:
- * inode->i_state, inode->i_hash, __iget()
+ * inode->i_state, inode->i_hash, __iget(), inode->i_io_list
* Inode LRU list locks protect:
* inode->i_sb->s_inode_lru, inode->i_lru
* inode->i_sb->s_inode_list_lock protects:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 10e14073107dd0b6d97d9516a02845a8e501c2c9 Mon Sep 17 00:00:00 2001
From: Jchao Sun <sunjunchao2870(a)gmail.com>
Date: Tue, 24 May 2022 08:05:40 -0700
Subject: [PATCH] writeback: Fix inode->i_io_list not be protected by
inode->i_lock error
Commit b35250c0816c ("writeback: Protect inode->i_io_list with
inode->i_lock") made inode->i_io_list not only protected by
wb->list_lock but also inode->i_lock, but inode_io_list_move_locked()
was missed. Add lock there and also update comment describing
things protected by inode->i_lock. This also fixes a race where
__mark_inode_dirty() could move inode under flush worker's hands
and thus sync(2) could miss writing some inodes.
Fixes: b35250c0816c ("writeback: Protect inode->i_io_list with inode->i_lock")
Link: https://lore.kernel.org/r/20220524150540.12552-1-sunjunchao2870@gmail.com
CC: stable(a)vger.kernel.org
Signed-off-by: Jchao Sun <sunjunchao2870(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a21d8f1a56d1..05221366a16d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -120,6 +120,7 @@ static bool inode_io_list_move_locked(struct inode *inode,
struct list_head *head)
{
assert_spin_locked(&wb->list_lock);
+ assert_spin_locked(&inode->i_lock);
list_move(&inode->i_io_list, head);
@@ -1365,9 +1366,9 @@ static int move_expired_inodes(struct list_head *delaying_queue,
inode = wb_inode(delaying_queue->prev);
if (inode_dirtied_after(inode, dirtied_before))
break;
+ spin_lock(&inode->i_lock);
list_move(&inode->i_io_list, &tmp);
moved++;
- spin_lock(&inode->i_lock);
inode->i_state |= I_SYNC_QUEUED;
spin_unlock(&inode->i_lock);
if (sb_is_blkdev_sb(inode->i_sb))
@@ -1383,7 +1384,12 @@ static int move_expired_inodes(struct list_head *delaying_queue,
goto out;
}
- /* Move inodes from one superblock together */
+ /*
+ * Although inode's i_io_list is moved from 'tmp' to 'dispatch_queue',
+ * we don't take inode->i_lock here because it is just a pointless overhead.
+ * Inode is already marked as I_SYNC_QUEUED so writeback list handling is
+ * fully under our control.
+ */
while (!list_empty(&tmp)) {
sb = wb_inode(tmp.prev)->i_sb;
list_for_each_prev_safe(pos, node, &tmp) {
@@ -1826,8 +1832,8 @@ static long writeback_sb_inodes(struct super_block *sb,
* We'll have another go at writing back this inode
* when we completed a full scan of b_io.
*/
- spin_unlock(&inode->i_lock);
requeue_io(inode, wb);
+ spin_unlock(&inode->i_lock);
trace_writeback_sb_inodes_requeue(inode);
continue;
}
@@ -2358,6 +2364,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
{
struct super_block *sb = inode->i_sb;
int dirtytime = 0;
+ struct bdi_writeback *wb = NULL;
trace_writeback_mark_inode_dirty(inode, flags);
@@ -2409,6 +2416,17 @@ void __mark_inode_dirty(struct inode *inode, int flags)
inode->i_state &= ~I_DIRTY_TIME;
inode->i_state |= flags;
+ /*
+ * Grab inode's wb early because it requires dropping i_lock and we
+ * need to make sure following checks happen atomically with dirty
+ * list handling so that we don't move inodes under flush worker's
+ * hands.
+ */
+ if (!was_dirty) {
+ wb = locked_inode_to_wb_and_lock_list(inode);
+ spin_lock(&inode->i_lock);
+ }
+
/*
* If the inode is queued for writeback by flush worker, just
* update its dirty state. Once the flush worker is done with
@@ -2416,7 +2434,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
* list, based upon its state.
*/
if (inode->i_state & I_SYNC_QUEUED)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* Only add valid (hashed) inodes to the superblock's
@@ -2424,22 +2442,19 @@ void __mark_inode_dirty(struct inode *inode, int flags)
*/
if (!S_ISBLK(inode->i_mode)) {
if (inode_unhashed(inode))
- goto out_unlock_inode;
+ goto out_unlock;
}
if (inode->i_state & I_FREEING)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* If the inode was already on b_dirty/b_io/b_more_io, don't
* reposition it (that would break b_dirty time-ordering).
*/
if (!was_dirty) {
- struct bdi_writeback *wb;
struct list_head *dirty_list;
bool wakeup_bdi = false;
- wb = locked_inode_to_wb_and_lock_list(inode);
-
inode->dirtied_when = jiffies;
if (dirtytime)
inode->dirtied_time_when = jiffies;
@@ -2453,6 +2468,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
dirty_list);
spin_unlock(&wb->list_lock);
+ spin_unlock(&inode->i_lock);
trace_writeback_dirty_inode_enqueue(inode);
/*
@@ -2467,6 +2483,9 @@ void __mark_inode_dirty(struct inode *inode, int flags)
return;
}
}
+out_unlock:
+ if (wb)
+ spin_unlock(&wb->list_lock);
out_unlock_inode:
spin_unlock(&inode->i_lock);
}
diff --git a/fs/inode.c b/fs/inode.c
index 9d9b422504d1..bd4da9c5207e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -27,7 +27,7 @@
* Inode locking rules:
*
* inode->i_lock protects:
- * inode->i_state, inode->i_hash, __iget()
+ * inode->i_state, inode->i_hash, __iget(), inode->i_io_list
* Inode LRU list locks protect:
* inode->i_sb->s_inode_lru, inode->i_lru
* inode->i_sb->s_inode_list_lock protects:
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 10e14073107dd0b6d97d9516a02845a8e501c2c9 Mon Sep 17 00:00:00 2001
From: Jchao Sun <sunjunchao2870(a)gmail.com>
Date: Tue, 24 May 2022 08:05:40 -0700
Subject: [PATCH] writeback: Fix inode->i_io_list not be protected by
inode->i_lock error
Commit b35250c0816c ("writeback: Protect inode->i_io_list with
inode->i_lock") made inode->i_io_list not only protected by
wb->list_lock but also inode->i_lock, but inode_io_list_move_locked()
was missed. Add lock there and also update comment describing
things protected by inode->i_lock. This also fixes a race where
__mark_inode_dirty() could move inode under flush worker's hands
and thus sync(2) could miss writing some inodes.
Fixes: b35250c0816c ("writeback: Protect inode->i_io_list with inode->i_lock")
Link: https://lore.kernel.org/r/20220524150540.12552-1-sunjunchao2870@gmail.com
CC: stable(a)vger.kernel.org
Signed-off-by: Jchao Sun <sunjunchao2870(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a21d8f1a56d1..05221366a16d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -120,6 +120,7 @@ static bool inode_io_list_move_locked(struct inode *inode,
struct list_head *head)
{
assert_spin_locked(&wb->list_lock);
+ assert_spin_locked(&inode->i_lock);
list_move(&inode->i_io_list, head);
@@ -1365,9 +1366,9 @@ static int move_expired_inodes(struct list_head *delaying_queue,
inode = wb_inode(delaying_queue->prev);
if (inode_dirtied_after(inode, dirtied_before))
break;
+ spin_lock(&inode->i_lock);
list_move(&inode->i_io_list, &tmp);
moved++;
- spin_lock(&inode->i_lock);
inode->i_state |= I_SYNC_QUEUED;
spin_unlock(&inode->i_lock);
if (sb_is_blkdev_sb(inode->i_sb))
@@ -1383,7 +1384,12 @@ static int move_expired_inodes(struct list_head *delaying_queue,
goto out;
}
- /* Move inodes from one superblock together */
+ /*
+ * Although inode's i_io_list is moved from 'tmp' to 'dispatch_queue',
+ * we don't take inode->i_lock here because it is just a pointless overhead.
+ * Inode is already marked as I_SYNC_QUEUED so writeback list handling is
+ * fully under our control.
+ */
while (!list_empty(&tmp)) {
sb = wb_inode(tmp.prev)->i_sb;
list_for_each_prev_safe(pos, node, &tmp) {
@@ -1826,8 +1832,8 @@ static long writeback_sb_inodes(struct super_block *sb,
* We'll have another go at writing back this inode
* when we completed a full scan of b_io.
*/
- spin_unlock(&inode->i_lock);
requeue_io(inode, wb);
+ spin_unlock(&inode->i_lock);
trace_writeback_sb_inodes_requeue(inode);
continue;
}
@@ -2358,6 +2364,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
{
struct super_block *sb = inode->i_sb;
int dirtytime = 0;
+ struct bdi_writeback *wb = NULL;
trace_writeback_mark_inode_dirty(inode, flags);
@@ -2409,6 +2416,17 @@ void __mark_inode_dirty(struct inode *inode, int flags)
inode->i_state &= ~I_DIRTY_TIME;
inode->i_state |= flags;
+ /*
+ * Grab inode's wb early because it requires dropping i_lock and we
+ * need to make sure following checks happen atomically with dirty
+ * list handling so that we don't move inodes under flush worker's
+ * hands.
+ */
+ if (!was_dirty) {
+ wb = locked_inode_to_wb_and_lock_list(inode);
+ spin_lock(&inode->i_lock);
+ }
+
/*
* If the inode is queued for writeback by flush worker, just
* update its dirty state. Once the flush worker is done with
@@ -2416,7 +2434,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
* list, based upon its state.
*/
if (inode->i_state & I_SYNC_QUEUED)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* Only add valid (hashed) inodes to the superblock's
@@ -2424,22 +2442,19 @@ void __mark_inode_dirty(struct inode *inode, int flags)
*/
if (!S_ISBLK(inode->i_mode)) {
if (inode_unhashed(inode))
- goto out_unlock_inode;
+ goto out_unlock;
}
if (inode->i_state & I_FREEING)
- goto out_unlock_inode;
+ goto out_unlock;
/*
* If the inode was already on b_dirty/b_io/b_more_io, don't
* reposition it (that would break b_dirty time-ordering).
*/
if (!was_dirty) {
- struct bdi_writeback *wb;
struct list_head *dirty_list;
bool wakeup_bdi = false;
- wb = locked_inode_to_wb_and_lock_list(inode);
-
inode->dirtied_when = jiffies;
if (dirtytime)
inode->dirtied_time_when = jiffies;
@@ -2453,6 +2468,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
dirty_list);
spin_unlock(&wb->list_lock);
+ spin_unlock(&inode->i_lock);
trace_writeback_dirty_inode_enqueue(inode);
/*
@@ -2467,6 +2483,9 @@ void __mark_inode_dirty(struct inode *inode, int flags)
return;
}
}
+out_unlock:
+ if (wb)
+ spin_unlock(&wb->list_lock);
out_unlock_inode:
spin_unlock(&inode->i_lock);
}
diff --git a/fs/inode.c b/fs/inode.c
index 9d9b422504d1..bd4da9c5207e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -27,7 +27,7 @@
* Inode locking rules:
*
* inode->i_lock protects:
- * inode->i_state, inode->i_hash, __iget()
+ * inode->i_state, inode->i_hash, __iget(), inode->i_io_list
* Inode LRU list locks protect:
* inode->i_sb->s_inode_lru, inode->i_lru
* inode->i_sb->s_inode_list_lock protects:
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2061ecfdf2350994e5b61c43e50e98a7a70e95ee Mon Sep 17 00:00:00 2001
From: Ilya Maximets <i.maximets(a)ovn.org>
Date: Tue, 7 Jun 2022 00:11:40 +0200
Subject: [PATCH] net: openvswitch: fix misuse of the cached connection on
tuple changes
If the packet headers changed, the cached nfct is no longer relevant
for the packet, and an attempt to re-use it leads to incorrect packet
classification.
This issue is causing broken connectivity in OpenStack deployments
with OVS/OVN due to hairpin traffic being unexpectedly dropped.
The setup has datapath flows with several conntrack actions and tuple
changes between them:
actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
ct(zone=8),recirc(0x4)
After the first ct() action the packet headers are almost fully
re-written. The next ct() tries to re-use the existing nfct entry
and marks the packet as invalid, so it gets dropped later in the
pipeline.
To avoid the issue, clear the cached conntrack entry whenever the
packet tuple is changed.
The flow key should not be cleared though, because we should still
be able to match on the ct_state if the recirculation happens after
the tuple change but before the next ct() action.
Cc: stable(a)vger.kernel.org
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Frode Nordahl <frode.nordahl(a)canonical.com>
Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
Signed-off-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 1b5d73079dc9..868db4669a29 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -373,6 +373,7 @@ static void set_ip_addr(struct sk_buff *skb, struct iphdr *nh,
update_ip_l4_checksum(skb, nh, *addr, new_addr);
csum_replace4(&nh->check, *addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
*addr = new_addr;
}
@@ -420,6 +421,7 @@ static void set_ipv6_addr(struct sk_buff *skb, u8 l4_proto,
update_ipv6_checksum(skb, l4_proto, addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
memcpy(addr, new_addr, sizeof(__be32[4]));
}
@@ -660,6 +662,7 @@ static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
static void set_tp_port(struct sk_buff *skb, __be16 *port,
__be16 new_port, __sum16 *check)
{
+ ovs_ct_clear(skb, NULL);
inet_proto_csum_replace2(check, skb, *port, new_port, false);
*port = new_port;
}
@@ -699,6 +702,7 @@ static int set_udp(struct sk_buff *skb, struct sw_flow_key *flow_key,
uh->dest = dst;
flow_key->tp.src = src;
flow_key->tp.dst = dst;
+ ovs_ct_clear(skb, NULL);
}
skb_clear_hash(skb);
@@ -761,6 +765,8 @@ static int set_sctp(struct sk_buff *skb, struct sw_flow_key *flow_key,
sh->checksum = old_csum ^ old_correct_csum ^ new_csum;
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
+
flow_key->tp.src = sh->source;
flow_key->tp.dst = sh->dest;
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 4a947c13c813..4e70df91d0f2 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1342,7 +1342,9 @@ int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
- ovs_ct_fill_key(skb, key, false);
+
+ if (key)
+ ovs_ct_fill_key(skb, key, false);
return 0;
}
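The change above hinges on two ideas: every set() helper that rewrites part of the packet tuple also drops the cached connection, and ovs_ct_clear() only refreshes the flow key when a key is actually passed, so the pre-rewrite ct_state is still matchable after recirculation. Below is a hedged user-space sketch of that cache-invalidation pattern; all identifiers are made up, and the one-field structs merely stand in for skb and flow-key state.

/* Illustrative model, not the OVS code: rewriting the tuple drops the
 * cached connection; the clear helper touches the flow key only when
 * the caller passes one. */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

struct pkt {
        uint32_t src_ip, dst_ip;
        const char *cached_ct;   /* models the cached skb->_nfct */
};

struct flow_key {
        const char *ct_state;    /* models key->ct_state */
};

/* Models ovs_ct_clear(): drop the cached entry; key may be NULL. */
static void ct_clear(struct pkt *p, struct flow_key *key)
{
        p->cached_ct = NULL;
        if (key)
                key->ct_state = "untracked";
}

/* Models set_ip_addr(): a tuple change invalidates the cached ct but
 * leaves the flow key alone. */
static void set_ip_src(struct pkt *p, uint32_t new_src)
{
        ct_clear(p, NULL);
        p->src_ip = new_src;
}

int main(void)
{
        struct pkt p = { 0x0a000001, 0x0a000002, "established" };
        struct flow_key key = { "established" };

        set_ip_src(&p, 0x0a000003);
        printf("cached ct: %s, key ct_state: %s\n",
               p.cached_ct ? p.cached_ct : "(none)", key.ct_state);
        return 0;
}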
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2061ecfdf2350994e5b61c43e50e98a7a70e95ee Mon Sep 17 00:00:00 2001
From: Ilya Maximets <i.maximets(a)ovn.org>
Date: Tue, 7 Jun 2022 00:11:40 +0200
Subject: [PATCH] net: openvswitch: fix misuse of the cached connection on
tuple changes
If the packet headers changed, the cached nfct is no longer relevant
for the packet, and an attempt to re-use it leads to incorrect packet
classification.
This issue is causing broken connectivity in OpenStack deployments
with OVS/OVN due to hairpin traffic being unexpectedly dropped.
The setup has datapath flows with several conntrack actions and tuple
changes between them:
actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
ct(zone=8),recirc(0x4)
After the first ct() action the packet headers are almost fully
re-written. The next ct() tries to re-use the existing nfct entry
and marks the packet as invalid, so it gets dropped later in the
pipeline.
To avoid the issue, clear the cached conntrack entry whenever the
packet tuple is changed.
The flow key should not be cleared though, because we should still
be able to match on the ct_state if the recirculation happens after
the tuple change but before the next ct() action.
Cc: stable(a)vger.kernel.org
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Frode Nordahl <frode.nordahl(a)canonical.com>
Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
Signed-off-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 1b5d73079dc9..868db4669a29 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -373,6 +373,7 @@ static void set_ip_addr(struct sk_buff *skb, struct iphdr *nh,
update_ip_l4_checksum(skb, nh, *addr, new_addr);
csum_replace4(&nh->check, *addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
*addr = new_addr;
}
@@ -420,6 +421,7 @@ static void set_ipv6_addr(struct sk_buff *skb, u8 l4_proto,
update_ipv6_checksum(skb, l4_proto, addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
memcpy(addr, new_addr, sizeof(__be32[4]));
}
@@ -660,6 +662,7 @@ static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
static void set_tp_port(struct sk_buff *skb, __be16 *port,
__be16 new_port, __sum16 *check)
{
+ ovs_ct_clear(skb, NULL);
inet_proto_csum_replace2(check, skb, *port, new_port, false);
*port = new_port;
}
@@ -699,6 +702,7 @@ static int set_udp(struct sk_buff *skb, struct sw_flow_key *flow_key,
uh->dest = dst;
flow_key->tp.src = src;
flow_key->tp.dst = dst;
+ ovs_ct_clear(skb, NULL);
}
skb_clear_hash(skb);
@@ -761,6 +765,8 @@ static int set_sctp(struct sk_buff *skb, struct sw_flow_key *flow_key,
sh->checksum = old_csum ^ old_correct_csum ^ new_csum;
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
+
flow_key->tp.src = sh->source;
flow_key->tp.dst = sh->dest;
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 4a947c13c813..4e70df91d0f2 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1342,7 +1342,9 @@ int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
- ovs_ct_fill_key(skb, key, false);
+
+ if (key)
+ ovs_ct_fill_key(skb, key, false);
return 0;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2061ecfdf2350994e5b61c43e50e98a7a70e95ee Mon Sep 17 00:00:00 2001
From: Ilya Maximets <i.maximets(a)ovn.org>
Date: Tue, 7 Jun 2022 00:11:40 +0200
Subject: [PATCH] net: openvswitch: fix misuse of the cached connection on
tuple changes
If the packet headers changed, the cached nfct is no longer relevant
for the packet, and an attempt to re-use it leads to incorrect packet
classification.
This issue is causing broken connectivity in OpenStack deployments
with OVS/OVN due to hairpin traffic being unexpectedly dropped.
The setup has datapath flows with several conntrack actions and tuple
changes between them:
actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
ct(zone=8),recirc(0x4)
After the first ct() action the packet headers are almost fully
re-written. The next ct() tries to re-use the existing nfct entry
and marks the packet as invalid, so it gets dropped later in the
pipeline.
To avoid the issue, clear the cached conntrack entry whenever the
packet tuple is changed.
The flow key should not be cleared though, because we should still
be able to match on the ct_state if the recirculation happens after
the tuple change but before the next ct() action.
Cc: stable(a)vger.kernel.org
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Frode Nordahl <frode.nordahl(a)canonical.com>
Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
Signed-off-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 1b5d73079dc9..868db4669a29 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -373,6 +373,7 @@ static void set_ip_addr(struct sk_buff *skb, struct iphdr *nh,
update_ip_l4_checksum(skb, nh, *addr, new_addr);
csum_replace4(&nh->check, *addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
*addr = new_addr;
}
@@ -420,6 +421,7 @@ static void set_ipv6_addr(struct sk_buff *skb, u8 l4_proto,
update_ipv6_checksum(skb, l4_proto, addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
memcpy(addr, new_addr, sizeof(__be32[4]));
}
@@ -660,6 +662,7 @@ static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
static void set_tp_port(struct sk_buff *skb, __be16 *port,
__be16 new_port, __sum16 *check)
{
+ ovs_ct_clear(skb, NULL);
inet_proto_csum_replace2(check, skb, *port, new_port, false);
*port = new_port;
}
@@ -699,6 +702,7 @@ static int set_udp(struct sk_buff *skb, struct sw_flow_key *flow_key,
uh->dest = dst;
flow_key->tp.src = src;
flow_key->tp.dst = dst;
+ ovs_ct_clear(skb, NULL);
}
skb_clear_hash(skb);
@@ -761,6 +765,8 @@ static int set_sctp(struct sk_buff *skb, struct sw_flow_key *flow_key,
sh->checksum = old_csum ^ old_correct_csum ^ new_csum;
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
+
flow_key->tp.src = sh->source;
flow_key->tp.dst = sh->dest;
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 4a947c13c813..4e70df91d0f2 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1342,7 +1342,9 @@ int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
- ovs_ct_fill_key(skb, key, false);
+
+ if (key)
+ ovs_ct_fill_key(skb, key, false);
return 0;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2061ecfdf2350994e5b61c43e50e98a7a70e95ee Mon Sep 17 00:00:00 2001
From: Ilya Maximets <i.maximets(a)ovn.org>
Date: Tue, 7 Jun 2022 00:11:40 +0200
Subject: [PATCH] net: openvswitch: fix misuse of the cached connection on
tuple changes
If the packet headers changed, the cached nfct is no longer relevant
for the packet, and an attempt to re-use it leads to incorrect packet
classification.
This issue is causing broken connectivity in OpenStack deployments
with OVS/OVN due to hairpin traffic being unexpectedly dropped.
The setup has datapath flows with several conntrack actions and tuple
changes between them:
actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
ct(zone=8),recirc(0x4)
After the first ct() action the packet headers are almost fully
re-written. The next ct() tries to re-use the existing nfct entry
and marks the packet as invalid, so it gets dropped later in the
pipeline.
To avoid the issue, clear the cached conntrack entry whenever the
packet tuple is changed.
The flow key should not be cleared though, because we should still
be able to match on the ct_state if the recirculation happens after
the tuple change but before the next ct() action.
Cc: stable(a)vger.kernel.org
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Frode Nordahl <frode.nordahl(a)canonical.com>
Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
Signed-off-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 1b5d73079dc9..868db4669a29 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -373,6 +373,7 @@ static void set_ip_addr(struct sk_buff *skb, struct iphdr *nh,
update_ip_l4_checksum(skb, nh, *addr, new_addr);
csum_replace4(&nh->check, *addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
*addr = new_addr;
}
@@ -420,6 +421,7 @@ static void set_ipv6_addr(struct sk_buff *skb, u8 l4_proto,
update_ipv6_checksum(skb, l4_proto, addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
memcpy(addr, new_addr, sizeof(__be32[4]));
}
@@ -660,6 +662,7 @@ static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
static void set_tp_port(struct sk_buff *skb, __be16 *port,
__be16 new_port, __sum16 *check)
{
+ ovs_ct_clear(skb, NULL);
inet_proto_csum_replace2(check, skb, *port, new_port, false);
*port = new_port;
}
@@ -699,6 +702,7 @@ static int set_udp(struct sk_buff *skb, struct sw_flow_key *flow_key,
uh->dest = dst;
flow_key->tp.src = src;
flow_key->tp.dst = dst;
+ ovs_ct_clear(skb, NULL);
}
skb_clear_hash(skb);
@@ -761,6 +765,8 @@ static int set_sctp(struct sk_buff *skb, struct sw_flow_key *flow_key,
sh->checksum = old_csum ^ old_correct_csum ^ new_csum;
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
+
flow_key->tp.src = sh->source;
flow_key->tp.dst = sh->dest;
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 4a947c13c813..4e70df91d0f2 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1342,7 +1342,9 @@ int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
- ovs_ct_fill_key(skb, key, false);
+
+ if (key)
+ ovs_ct_fill_key(skb, key, false);
return 0;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2061ecfdf2350994e5b61c43e50e98a7a70e95ee Mon Sep 17 00:00:00 2001
From: Ilya Maximets <i.maximets(a)ovn.org>
Date: Tue, 7 Jun 2022 00:11:40 +0200
Subject: [PATCH] net: openvswitch: fix misuse of the cached connection on
tuple changes
If the packet headers changed, the cached nfct is no longer relevant
for the packet, and an attempt to re-use it leads to incorrect packet
classification.
This issue is causing broken connectivity in OpenStack deployments
with OVS/OVN due to hairpin traffic being unexpectedly dropped.
The setup has datapath flows with several conntrack actions and tuple
changes between them:
actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
ct(zone=8),recirc(0x4)
After the first ct() action the packet headers are almost fully
re-written. The next ct() tries to re-use the existing nfct entry
and marks the packet as invalid, so it gets dropped later in the
pipeline.
To avoid the issue, clear the cached conntrack entry whenever the
packet tuple is changed.
The flow key should not be cleared though, because we should still
be able to match on the ct_state if the recirculation happens after
the tuple change but before the next ct() action.
Cc: stable(a)vger.kernel.org
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Frode Nordahl <frode.nordahl(a)canonical.com>
Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
Signed-off-by: Ilya Maximets <i.maximets(a)ovn.org>
Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 1b5d73079dc9..868db4669a29 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -373,6 +373,7 @@ static void set_ip_addr(struct sk_buff *skb, struct iphdr *nh,
update_ip_l4_checksum(skb, nh, *addr, new_addr);
csum_replace4(&nh->check, *addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
*addr = new_addr;
}
@@ -420,6 +421,7 @@ static void set_ipv6_addr(struct sk_buff *skb, u8 l4_proto,
update_ipv6_checksum(skb, l4_proto, addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
memcpy(addr, new_addr, sizeof(__be32[4]));
}
@@ -660,6 +662,7 @@ static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
static void set_tp_port(struct sk_buff *skb, __be16 *port,
__be16 new_port, __sum16 *check)
{
+ ovs_ct_clear(skb, NULL);
inet_proto_csum_replace2(check, skb, *port, new_port, false);
*port = new_port;
}
@@ -699,6 +702,7 @@ static int set_udp(struct sk_buff *skb, struct sw_flow_key *flow_key,
uh->dest = dst;
flow_key->tp.src = src;
flow_key->tp.dst = dst;
+ ovs_ct_clear(skb, NULL);
}
skb_clear_hash(skb);
@@ -761,6 +765,8 @@ static int set_sctp(struct sk_buff *skb, struct sw_flow_key *flow_key,
sh->checksum = old_csum ^ old_correct_csum ^ new_csum;
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
+
flow_key->tp.src = sh->source;
flow_key->tp.dst = sh->dest;
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 4a947c13c813..4e70df91d0f2 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1342,7 +1342,9 @@ int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
- ovs_ct_fill_key(skb, key, false);
+
+ if (key)
+ ovs_ct_fill_key(skb, key, false);
return 0;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c76acfb7e19dcc3a0964e0563770b1d11b8d4540 Mon Sep 17 00:00:00 2001
From: Tan Tee Min <tee.min.tan(a)linux.intel.com>
Date: Thu, 26 May 2022 17:03:47 +0800
Subject: [PATCH] net: phy: dp83867: retrigger SGMII AN when link change
There is a limitation in TI DP83867 PHY device where SGMII AN is only
triggered once after the device is booted up. Even after the PHY TPI is
down and up again, SGMII AN is not triggered and hence no new in-band
message from PHY to MAC side SGMII.
This could cause an issue during power up, when PHY is up prior to MAC.
At this condition, once MAC side SGMII is up, MAC side SGMII wouldn`t
receive new in-band message from TI PHY with correct link status, speed
and duplex info.
As suggested by TI, implemented a SW solution here to retrigger SGMII
Auto-Neg whenever there is a link change.
v2: Add Fixes tag in commit message.
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: Sit, Michael Wei Hong <michael.wei.hong.sit(a)intel.com>
Reviewed-by: Voon Weifeng <weifeng.voon(a)intel.com>
Signed-off-by: Tan Tee Min <tee.min.tan(a)linux.intel.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/20220526090347.128742-1-tee.min.tan@linux.intel.c…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c
index 8561f2d4443b..13dafe7a29bd 100644
--- a/drivers/net/phy/dp83867.c
+++ b/drivers/net/phy/dp83867.c
@@ -137,6 +137,7 @@
#define DP83867_DOWNSHIFT_2_COUNT 2
#define DP83867_DOWNSHIFT_4_COUNT 4
#define DP83867_DOWNSHIFT_8_COUNT 8
+#define DP83867_SGMII_AUTONEG_EN BIT(7)
/* CFG3 bits */
#define DP83867_CFG3_INT_OE BIT(7)
@@ -855,6 +856,32 @@ static int dp83867_phy_reset(struct phy_device *phydev)
DP83867_PHYCR_FORCE_LINK_GOOD, 0);
}
+static void dp83867_link_change_notify(struct phy_device *phydev)
+{
+ /* There is a limitation in DP83867 PHY device where SGMII AN is
+ * only triggered once after the device is booted up. Even after the
+ * PHY TPI is down and up again, SGMII AN is not triggered and
+ * hence no new in-band message from PHY to MAC side SGMII.
+ * This could cause an issue during power up, when PHY is up prior
+ * to MAC. At this condition, once MAC side SGMII is up, MAC side
+ * SGMII wouldn`t receive new in-band message from TI PHY with
+ * correct link status, speed and duplex info.
+ * Thus, implemented a SW solution here to retrigger SGMII Auto-Neg
+ * whenever there is a link change.
+ */
+ if (phydev->interface == PHY_INTERFACE_MODE_SGMII) {
+ int val = 0;
+
+ val = phy_clear_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ if (val < 0)
+ return;
+
+ phy_set_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ }
+}
+
static struct phy_driver dp83867_driver[] = {
{
.phy_id = DP83867_PHY_ID,
@@ -879,6 +906,8 @@ static struct phy_driver dp83867_driver[] = {
.suspend = genphy_suspend,
.resume = genphy_resume,
+
+ .link_change_notify = dp83867_link_change_notify,
},
};
module_phy_driver(dp83867_driver);
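The workaround above simply clears and then re-sets an SGMII autonegotiation enable bit in CFG2 on every link change, bailing out if the clear fails. Here is a stand-alone sketch of that clear-then-set toggle against a fake register array; FAKE_CFG2, FAKE_SGMII_AUTONEG_EN and the helper functions are stand-ins for illustration, not the phylib API.

/* Sketch of the clear-then-set toggle; register accessors are fakes
 * standing in for phy_clear_bits()/phy_set_bits(). */
#include <stdio.h>
#include <stdint.h>

#define FAKE_CFG2              0x14
#define FAKE_SGMII_AUTONEG_EN  (1u << 7)

static uint16_t regs[32];

static int fake_clear_bits(int reg, uint16_t mask)
{
        regs[reg] &= (uint16_t)~mask;
        return 0;                       /* 0 on success, <0 on error */
}

static int fake_set_bits(int reg, uint16_t mask)
{
        regs[reg] |= mask;
        return 0;
}

/* Models the link_change_notify hook: retrigger SGMII AN on link change
 * by momentarily dropping the autoneg enable bit. */
static void link_change_notify(int is_sgmii)
{
        if (!is_sgmii)
                return;
        if (fake_clear_bits(FAKE_CFG2, FAKE_SGMII_AUTONEG_EN) < 0)
                return;
        fake_set_bits(FAKE_CFG2, FAKE_SGMII_AUTONEG_EN);
}

int main(void)
{
        regs[FAKE_CFG2] = FAKE_SGMII_AUTONEG_EN;
        link_change_notify(1);
        printf("CFG2 autoneg bit: %u\n",
               !!(regs[FAKE_CFG2] & FAKE_SGMII_AUTONEG_EN));
        return 0;
}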
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c76acfb7e19dcc3a0964e0563770b1d11b8d4540 Mon Sep 17 00:00:00 2001
From: Tan Tee Min <tee.min.tan(a)linux.intel.com>
Date: Thu, 26 May 2022 17:03:47 +0800
Subject: [PATCH] net: phy: dp83867: retrigger SGMII AN when link change
There is a limitation in TI DP83867 PHY device where SGMII AN is only
triggered once after the device is booted up. Even after the PHY TPI is
down and up again, SGMII AN is not triggered and hence no new in-band
message from PHY to MAC side SGMII.
This could cause an issue during power up, when PHY is up prior to MAC.
At this condition, once MAC side SGMII is up, MAC side SGMII wouldn`t
receive new in-band message from TI PHY with correct link status, speed
and duplex info.
As suggested by TI, implemented a SW solution here to retrigger SGMII
Auto-Neg whenever there is a link change.
v2: Add Fixes tag in commit message.
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: Sit, Michael Wei Hong <michael.wei.hong.sit(a)intel.com>
Reviewed-by: Voon Weifeng <weifeng.voon(a)intel.com>
Signed-off-by: Tan Tee Min <tee.min.tan(a)linux.intel.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/20220526090347.128742-1-tee.min.tan@linux.intel.c…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c
index 8561f2d4443b..13dafe7a29bd 100644
--- a/drivers/net/phy/dp83867.c
+++ b/drivers/net/phy/dp83867.c
@@ -137,6 +137,7 @@
#define DP83867_DOWNSHIFT_2_COUNT 2
#define DP83867_DOWNSHIFT_4_COUNT 4
#define DP83867_DOWNSHIFT_8_COUNT 8
+#define DP83867_SGMII_AUTONEG_EN BIT(7)
/* CFG3 bits */
#define DP83867_CFG3_INT_OE BIT(7)
@@ -855,6 +856,32 @@ static int dp83867_phy_reset(struct phy_device *phydev)
DP83867_PHYCR_FORCE_LINK_GOOD, 0);
}
+static void dp83867_link_change_notify(struct phy_device *phydev)
+{
+ /* There is a limitation in DP83867 PHY device where SGMII AN is
+ * only triggered once after the device is booted up. Even after the
+ * PHY TPI is down and up again, SGMII AN is not triggered and
+ * hence no new in-band message from PHY to MAC side SGMII.
+ * This could cause an issue during power up, when PHY is up prior
+ * to MAC. At this condition, once MAC side SGMII is up, MAC side
+ * SGMII wouldn`t receive new in-band message from TI PHY with
+ * correct link status, speed and duplex info.
+ * Thus, implemented a SW solution here to retrigger SGMII Auto-Neg
+ * whenever there is a link change.
+ */
+ if (phydev->interface == PHY_INTERFACE_MODE_SGMII) {
+ int val = 0;
+
+ val = phy_clear_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ if (val < 0)
+ return;
+
+ phy_set_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ }
+}
+
static struct phy_driver dp83867_driver[] = {
{
.phy_id = DP83867_PHY_ID,
@@ -879,6 +906,8 @@ static struct phy_driver dp83867_driver[] = {
.suspend = genphy_suspend,
.resume = genphy_resume,
+
+ .link_change_notify = dp83867_link_change_notify,
},
};
module_phy_driver(dp83867_driver);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c76acfb7e19dcc3a0964e0563770b1d11b8d4540 Mon Sep 17 00:00:00 2001
From: Tan Tee Min <tee.min.tan(a)linux.intel.com>
Date: Thu, 26 May 2022 17:03:47 +0800
Subject: [PATCH] net: phy: dp83867: retrigger SGMII AN when link change
There is a limitation in TI DP83867 PHY device where SGMII AN is only
triggered once after the device is booted up. Even after the PHY TPI is
down and up again, SGMII AN is not triggered and hence no new in-band
message from PHY to MAC side SGMII.
This could cause an issue during power up, when PHY is up prior to MAC.
At this condition, once MAC side SGMII is up, MAC side SGMII wouldn`t
receive new in-band message from TI PHY with correct link status, speed
and duplex info.
As suggested by TI, implemented a SW solution here to retrigger SGMII
Auto-Neg whenever there is a link change.
v2: Add Fixes tag in commit message.
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: Sit, Michael Wei Hong <michael.wei.hong.sit(a)intel.com>
Reviewed-by: Voon Weifeng <weifeng.voon(a)intel.com>
Signed-off-by: Tan Tee Min <tee.min.tan(a)linux.intel.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/20220526090347.128742-1-tee.min.tan@linux.intel.c…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c
index 8561f2d4443b..13dafe7a29bd 100644
--- a/drivers/net/phy/dp83867.c
+++ b/drivers/net/phy/dp83867.c
@@ -137,6 +137,7 @@
#define DP83867_DOWNSHIFT_2_COUNT 2
#define DP83867_DOWNSHIFT_4_COUNT 4
#define DP83867_DOWNSHIFT_8_COUNT 8
+#define DP83867_SGMII_AUTONEG_EN BIT(7)
/* CFG3 bits */
#define DP83867_CFG3_INT_OE BIT(7)
@@ -855,6 +856,32 @@ static int dp83867_phy_reset(struct phy_device *phydev)
DP83867_PHYCR_FORCE_LINK_GOOD, 0);
}
+static void dp83867_link_change_notify(struct phy_device *phydev)
+{
+ /* There is a limitation in DP83867 PHY device where SGMII AN is
+ * only triggered once after the device is booted up. Even after the
+ * PHY TPI is down and up again, SGMII AN is not triggered and
+ * hence no new in-band message from PHY to MAC side SGMII.
+ * This could cause an issue during power up, when PHY is up prior
+ * to MAC. At this condition, once MAC side SGMII is up, MAC side
+ * SGMII wouldn`t receive new in-band message from TI PHY with
+ * correct link status, speed and duplex info.
+ * Thus, implemented a SW solution here to retrigger SGMII Auto-Neg
+ * whenever there is a link change.
+ */
+ if (phydev->interface == PHY_INTERFACE_MODE_SGMII) {
+ int val = 0;
+
+ val = phy_clear_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ if (val < 0)
+ return;
+
+ phy_set_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ }
+}
+
static struct phy_driver dp83867_driver[] = {
{
.phy_id = DP83867_PHY_ID,
@@ -879,6 +906,8 @@ static struct phy_driver dp83867_driver[] = {
.suspend = genphy_suspend,
.resume = genphy_resume,
+
+ .link_change_notify = dp83867_link_change_notify,
},
};
module_phy_driver(dp83867_driver);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c76acfb7e19dcc3a0964e0563770b1d11b8d4540 Mon Sep 17 00:00:00 2001
From: Tan Tee Min <tee.min.tan(a)linux.intel.com>
Date: Thu, 26 May 2022 17:03:47 +0800
Subject: [PATCH] net: phy: dp83867: retrigger SGMII AN when link change
There is a limitation in TI DP83867 PHY device where SGMII AN is only
triggered once after the device is booted up. Even after the PHY TPI is
down and up again, SGMII AN is not triggered and hence no new in-band
message from PHY to MAC side SGMII.
This could cause an issue during power up, when PHY is up prior to MAC.
At this condition, once MAC side SGMII is up, MAC side SGMII wouldn`t
receive new in-band message from TI PHY with correct link status, speed
and duplex info.
As suggested by TI, implemented a SW solution here to retrigger SGMII
Auto-Neg whenever there is a link change.
v2: Add Fixes tag in commit message.
Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy")
Cc: <stable(a)vger.kernel.org> # 5.4.x
Signed-off-by: Sit, Michael Wei Hong <michael.wei.hong.sit(a)intel.com>
Reviewed-by: Voon Weifeng <weifeng.voon(a)intel.com>
Signed-off-by: Tan Tee Min <tee.min.tan(a)linux.intel.com>
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Link: https://lore.kernel.org/r/20220526090347.128742-1-tee.min.tan@linux.intel.c…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c
index 8561f2d4443b..13dafe7a29bd 100644
--- a/drivers/net/phy/dp83867.c
+++ b/drivers/net/phy/dp83867.c
@@ -137,6 +137,7 @@
#define DP83867_DOWNSHIFT_2_COUNT 2
#define DP83867_DOWNSHIFT_4_COUNT 4
#define DP83867_DOWNSHIFT_8_COUNT 8
+#define DP83867_SGMII_AUTONEG_EN BIT(7)
/* CFG3 bits */
#define DP83867_CFG3_INT_OE BIT(7)
@@ -855,6 +856,32 @@ static int dp83867_phy_reset(struct phy_device *phydev)
DP83867_PHYCR_FORCE_LINK_GOOD, 0);
}
+static void dp83867_link_change_notify(struct phy_device *phydev)
+{
+ /* There is a limitation in DP83867 PHY device where SGMII AN is
+ * only triggered once after the device is booted up. Even after the
+ * PHY TPI is down and up again, SGMII AN is not triggered and
+ * hence no new in-band message from PHY to MAC side SGMII.
+ * This could cause an issue during power up, when PHY is up prior
+ * to MAC. At this condition, once MAC side SGMII is up, MAC side
+ * SGMII wouldn`t receive new in-band message from TI PHY with
+ * correct link status, speed and duplex info.
+ * Thus, implemented a SW solution here to retrigger SGMII Auto-Neg
+ * whenever there is a link change.
+ */
+ if (phydev->interface == PHY_INTERFACE_MODE_SGMII) {
+ int val = 0;
+
+ val = phy_clear_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ if (val < 0)
+ return;
+
+ phy_set_bits(phydev, DP83867_CFG2,
+ DP83867_SGMII_AUTONEG_EN);
+ }
+}
+
static struct phy_driver dp83867_driver[] = {
{
.phy_id = DP83867_PHY_ID,
@@ -879,6 +906,8 @@ static struct phy_driver dp83867_driver[] = {
.suspend = genphy_suspend,
.resume = genphy_resume,
+
+ .link_change_notify = dp83867_link_change_notify,
},
};
module_phy_driver(dp83867_driver);
The patch below does not apply to the 5.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 11d39e8cc43e1c6737af19ca9372e590061b5ad2 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Mon, 6 Jun 2022 21:11:49 +0300
Subject: [PATCH] KVM: SVM: fix tsc scaling cache logic
SVM uses a per-cpu variable to cache the current value of the
tsc scaling multiplier msr on each cpu.
Commit 1ab9287add5e2
("KVM: X86: Add vendor callbacks for writing the TSC multiplier")
broke this caching logic.
Refactor the code so that all TSC scaling multiplier writes go through
a single function which checks and updates the cache.
This fixes the following scenario:
1. A CPU runs a guest with some tsc scaling ratio.
2. New guest with different tsc scaling ratio starts on this CPU
and terminates almost immediately.
This ensures that the short running guest had set the tsc scaling ratio just
once when it was set via KVM_SET_TSC_KHZ. Due to the bug,
the per-cpu cache is not updated.
3. The original guest continues to run, it doesn't restore the msr
value back to its own value, because the cache matches,
and thus continues to run with a wrong tsc scaling ratio.
Fixes: 1ab9287add5e2 ("KVM: X86: Add vendor callbacks for writing the TSC multiplier")
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Message-Id: <20220606181149.103072-1-mlevitsk(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index bed5e1692cef..3361258640a2 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -982,7 +982,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
if (svm->tsc_ratio_msr != kvm_default_tsc_scaling_ratio) {
WARN_ON(!svm->tsc_scaling_enabled);
vcpu->arch.tsc_scaling_ratio = vcpu->arch.l1_tsc_scaling_ratio;
- svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
+ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
}
svm->nested.ctl.nested_cr3 = 0;
@@ -1387,7 +1387,7 @@ void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu)
vcpu->arch.tsc_scaling_ratio =
kvm_calc_nested_tsc_multiplier(vcpu->arch.l1_tsc_scaling_ratio,
svm->tsc_ratio_msr);
- svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
+ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
}
/* Inverse operation of nested_copy_vmcb_control_to_cache(). asid is copied too. */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 63880b33ce37..478e6ee81d88 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -465,11 +465,24 @@ static int has_svm(void)
return 1;
}
+void __svm_write_tsc_multiplier(u64 multiplier)
+{
+ preempt_disable();
+
+ if (multiplier == __this_cpu_read(current_tsc_ratio))
+ goto out;
+
+ wrmsrl(MSR_AMD64_TSC_RATIO, multiplier);
+ __this_cpu_write(current_tsc_ratio, multiplier);
+out:
+ preempt_enable();
+}
+
static void svm_hardware_disable(void)
{
/* Make sure we clean up behind us */
if (tsc_scaling)
- wrmsrl(MSR_AMD64_TSC_RATIO, SVM_TSC_RATIO_DEFAULT);
+ __svm_write_tsc_multiplier(SVM_TSC_RATIO_DEFAULT);
cpu_svm_disable();
@@ -515,8 +528,7 @@ static int svm_hardware_enable(void)
* Set the default value, even if we don't use TSC scaling
* to avoid having stale value in the msr
*/
- wrmsrl(MSR_AMD64_TSC_RATIO, SVM_TSC_RATIO_DEFAULT);
- __this_cpu_write(current_tsc_ratio, SVM_TSC_RATIO_DEFAULT);
+ __svm_write_tsc_multiplier(SVM_TSC_RATIO_DEFAULT);
}
@@ -999,11 +1011,12 @@ static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
vmcb_mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
}
-void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
+static void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
{
- wrmsrl(MSR_AMD64_TSC_RATIO, multiplier);
+ __svm_write_tsc_multiplier(multiplier);
}
+
/* Evaluate instruction intercepts that depend on guest CPUID features. */
static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
struct vcpu_svm *svm)
@@ -1363,13 +1376,8 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
sev_es_prepare_switch_to_guest(hostsa);
}
- if (tsc_scaling) {
- u64 tsc_ratio = vcpu->arch.tsc_scaling_ratio;
- if (tsc_ratio != __this_cpu_read(current_tsc_ratio)) {
- __this_cpu_write(current_tsc_ratio, tsc_ratio);
- wrmsrl(MSR_AMD64_TSC_RATIO, tsc_ratio);
- }
- }
+ if (tsc_scaling)
+ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
if (likely(tsc_aux_uret_slot >= 0))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 45a87b2a8b3c..29d6fd205a49 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -590,7 +590,7 @@ int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
bool has_error_code, u32 error_code);
int nested_svm_exit_special(struct vcpu_svm *svm);
void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu);
-void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier);
+void __svm_write_tsc_multiplier(u64 multiplier);
void nested_copy_vmcb_control_to_cache(struct vcpu_svm *svm,
struct vmcb_control_area *control);
void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
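The fix above funnels every TSC-multiplier update through a single helper that compares the requested value against a per-cpu cache and skips the MSR write when nothing changed (the real helper also disables preemption around the check, which this sketch omits). A minimal user-space model of that write-elision cache follows; the thread-local variable and fake_wrmsrl() are stand-ins for the per-cpu current_tsc_ratio and the real wrmsrl().

/* User-space model of the restored caching rule: all writes go through
 * one helper, and the expensive write is skipped on a cache hit. */
#include <stdio.h>
#include <stdint.h>

#define DEFAULT_RATIO 0x0100000000ULL   /* stand-in for SVM_TSC_RATIO_DEFAULT */

static _Thread_local uint64_t cached_ratio; /* models per-cpu current_tsc_ratio */
static unsigned int msr_writes;             /* counts "real" MSR writes         */

static void fake_wrmsrl(uint64_t val)
{
        (void)val;
        msr_writes++;
}

/* Models __svm_write_tsc_multiplier(): write only when the cache differs. */
static void write_tsc_multiplier(uint64_t multiplier)
{
        if (multiplier == cached_ratio)
                return;                 /* cache hit: skip the MSR write */
        fake_wrmsrl(multiplier);
        cached_ratio = multiplier;
}

int main(void)
{
        write_tsc_multiplier(DEFAULT_RATIO);    /* miss: first real write  */
        write_tsc_multiplier(DEFAULT_RATIO);    /* hit: no write           */
        write_tsc_multiplier(0x0180000000ULL);  /* miss: second real write */
        printf("MSR writes: %u\n", msr_writes); /* prints 2                */
        return 0;
}

Routing callers through one helper is what keeps the cache coherent: the bug arose because some paths wrote the MSR directly and left the per-cpu cache stale, so a later comparison against the cache wrongly skipped the write.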
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 11d39e8cc43e1c6737af19ca9372e590061b5ad2 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Mon, 6 Jun 2022 21:11:49 +0300
Subject: [PATCH] KVM: SVM: fix tsc scaling cache logic
SVM uses a per-cpu variable to cache the current value of the
tsc scaling multiplier msr on each cpu.
Commit 1ab9287add5e2
("KVM: X86: Add vendor callbacks for writing the TSC multiplier")
broke this caching logic.
Refactor the code so that all TSC scaling multiplier writes go through
a single function which checks and updates the cache.
This fixes the following scenario:
1. A CPU runs a guest with some tsc scaling ratio.
2. New guest with different tsc scaling ratio starts on this CPU
and terminates almost immediately.
This ensures that the short running guest had set the tsc scaling ratio just
once when it was set via KVM_SET_TSC_KHZ. Due to the bug,
the per-cpu cache is not updated.
3. The original guest continues to run, it doesn't restore the msr
value back to its own value, because the cache matches,
and thus continues to run with a wrong tsc scaling ratio.
Fixes: 1ab9287add5e2 ("KVM: X86: Add vendor callbacks for writing the TSC multiplier")
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Message-Id: <20220606181149.103072-1-mlevitsk(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index bed5e1692cef..3361258640a2 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -982,7 +982,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
if (svm->tsc_ratio_msr != kvm_default_tsc_scaling_ratio) {
WARN_ON(!svm->tsc_scaling_enabled);
vcpu->arch.tsc_scaling_ratio = vcpu->arch.l1_tsc_scaling_ratio;
- svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
+ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
}
svm->nested.ctl.nested_cr3 = 0;
@@ -1387,7 +1387,7 @@ void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu)
vcpu->arch.tsc_scaling_ratio =
kvm_calc_nested_tsc_multiplier(vcpu->arch.l1_tsc_scaling_ratio,
svm->tsc_ratio_msr);
- svm_write_tsc_multiplier(vcpu, vcpu->arch.tsc_scaling_ratio);
+ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
}
/* Inverse operation of nested_copy_vmcb_control_to_cache(). asid is copied too. */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 63880b33ce37..478e6ee81d88 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -465,11 +465,24 @@ static int has_svm(void)
return 1;
}
+void __svm_write_tsc_multiplier(u64 multiplier)
+{
+ preempt_disable();
+
+ if (multiplier == __this_cpu_read(current_tsc_ratio))
+ goto out;
+
+ wrmsrl(MSR_AMD64_TSC_RATIO, multiplier);
+ __this_cpu_write(current_tsc_ratio, multiplier);
+out:
+ preempt_enable();
+}
+
static void svm_hardware_disable(void)
{
/* Make sure we clean up behind us */
if (tsc_scaling)
- wrmsrl(MSR_AMD64_TSC_RATIO, SVM_TSC_RATIO_DEFAULT);
+ __svm_write_tsc_multiplier(SVM_TSC_RATIO_DEFAULT);
cpu_svm_disable();
@@ -515,8 +528,7 @@ static int svm_hardware_enable(void)
* Set the default value, even if we don't use TSC scaling
* to avoid having stale value in the msr
*/
- wrmsrl(MSR_AMD64_TSC_RATIO, SVM_TSC_RATIO_DEFAULT);
- __this_cpu_write(current_tsc_ratio, SVM_TSC_RATIO_DEFAULT);
+ __svm_write_tsc_multiplier(SVM_TSC_RATIO_DEFAULT);
}
@@ -999,11 +1011,12 @@ static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
vmcb_mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
}
-void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
+static void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier)
{
- wrmsrl(MSR_AMD64_TSC_RATIO, multiplier);
+ __svm_write_tsc_multiplier(multiplier);
}
+
/* Evaluate instruction intercepts that depend on guest CPUID features. */
static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
struct vcpu_svm *svm)
@@ -1363,13 +1376,8 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
sev_es_prepare_switch_to_guest(hostsa);
}
- if (tsc_scaling) {
- u64 tsc_ratio = vcpu->arch.tsc_scaling_ratio;
- if (tsc_ratio != __this_cpu_read(current_tsc_ratio)) {
- __this_cpu_write(current_tsc_ratio, tsc_ratio);
- wrmsrl(MSR_AMD64_TSC_RATIO, tsc_ratio);
- }
- }
+ if (tsc_scaling)
+ __svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
if (likely(tsc_aux_uret_slot >= 0))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 45a87b2a8b3c..29d6fd205a49 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -590,7 +590,7 @@ int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
bool has_error_code, u32 error_code);
int nested_svm_exit_special(struct vcpu_svm *svm);
void nested_svm_update_tsc_ratio_msr(struct kvm_vcpu *vcpu);
-void svm_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier);
+void __svm_write_tsc_multiplier(u64 multiplier);
void nested_copy_vmcb_control_to_cache(struct vcpu_svm *svm,
struct vmcb_control_area *control);
void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
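For reference, a minimal userspace sketch of the caching pattern the fix quoted
above introduces: every write funnels through one helper that skips the
expensive MSR write when the cached value already matches. The names here
(fake_wrmsrl, write_tsc_multiplier, current_tsc_ratio as a plain global) are
stand-ins invented for the illustration; the real code uses wrmsrl(), a per-cpu
variable and preempt_disable(), none of which is modelled.
/* Minimal userspace model of the caching pattern: all writes go through
 * one helper that skips the "MSR write" when the cached value already
 * matches.  fake_wrmsrl() and the plain global stand in for wrmsrl()
 * and the per-cpu variable; there is no preemption to worry about here.
 */
#include <stdint.h>
#include <stdio.h>

static uint64_t current_tsc_ratio;      /* models the per-cpu cache */
static unsigned int msr_writes;         /* counts "expensive" writes */

static void fake_wrmsrl(uint64_t val)
{
        msr_writes++;
        printf("MSR write: %#llx\n", (unsigned long long)val);
}

static void write_tsc_multiplier(uint64_t multiplier)
{
        if (multiplier == current_tsc_ratio)
                return;                 /* cache hit: nothing to do */
        fake_wrmsrl(multiplier);
        current_tsc_ratio = multiplier; /* keep the cache in sync */
}

int main(void)
{
        write_tsc_multiplier(0x100000000ULL);   /* first ratio: one write  */
        write_tsc_multiplier(0x180000000ULL);   /* different ratio: write  */
        write_tsc_multiplier(0x180000000ULL);   /* same ratio: skipped     */
        printf("total MSR writes: %u\n", msr_writes);
        return 0;
}
The point of the refactor is that the hardware update and the cache
bookkeeping live in one place, so a caller can no longer update one without
the other.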
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77fc95f8c0dc9e1f8e620ec14d2fb65028fb7adc Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 17:04:38 +0200
Subject: [PATCH] random: account for arch randomness in bits
Rather than accounting in bytes and multiplying (shifting), we can just
account in bits and avoid the shift. The main motivation for this is
there are other patches in flux that expand this code a bit, and
avoiding the duplication of "* 8" everywhere makes things a bit clearer.
Cc: stable(a)vger.kernel.org
Fixes: 12e45a2a6308 ("random: credit architectural init the exact amount")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0d6fb3eaf609..60d701a2516b 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -776,7 +776,7 @@ static struct notifier_block pm_notifier = { .notifier_call = random_pm_notifica
int __init random_init(const char *command_line)
{
ktime_t now = ktime_get_real();
- unsigned int i, arch_bytes;
+ unsigned int i, arch_bits;
unsigned long entropy;
#if defined(LATENT_ENTROPY_PLUGIN)
@@ -784,12 +784,12 @@ int __init random_init(const char *command_line)
_mix_pool_bytes(compiletime_seed, sizeof(compiletime_seed));
#endif
- for (i = 0, arch_bytes = BLAKE2S_BLOCK_SIZE;
+ for (i = 0, arch_bits = BLAKE2S_BLOCK_SIZE * 8;
i < BLAKE2S_BLOCK_SIZE; i += sizeof(entropy)) {
if (!arch_get_random_seed_long_early(&entropy) &&
!arch_get_random_long_early(&entropy)) {
entropy = random_get_entropy();
- arch_bytes -= sizeof(entropy);
+ arch_bits -= sizeof(entropy) * 8;
}
_mix_pool_bytes(&entropy, sizeof(entropy));
}
@@ -801,8 +801,8 @@ int __init random_init(const char *command_line)
if (crng_ready())
crng_reseed();
else if (trust_cpu)
- _credit_init_bits(arch_bytes * 8);
- used_arch_random = arch_bytes * 8 >= POOL_READY_BITS;
+ _credit_init_bits(arch_bits);
+ used_arch_random = arch_bits >= POOL_READY_BITS;
WARN_ON(register_pm_notifier(&pm_notifier));
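As a quick sanity check on the conversion in the patch above, here is a small
standalone C program (illustrative only; BLOCK_SIZE and the simulated read
failures are made up) showing that decrementing a bit counter by
sizeof(entropy) * 8 credits the same amount as decrementing a byte counter and
multiplying by 8 at the end:
/* Counting in bits from the start is equivalent to counting in bytes
 * and multiplying by 8 at the end.  The failure pattern is arbitrary.
 */
#include <assert.h>
#include <stdio.h>

#define BLOCK_SIZE 64                   /* BLAKE2S_BLOCK_SIZE is 64 bytes */

int main(void)
{
        unsigned int arch_bytes = BLOCK_SIZE;
        unsigned int arch_bits  = BLOCK_SIZE * 8;
        unsigned int i;

        for (i = 0; i < BLOCK_SIZE; i += sizeof(unsigned long)) {
                int arch_rng_failed = (i % 16 == 0);  /* pretend some reads fail */

                if (arch_rng_failed) {
                        arch_bytes -= sizeof(unsigned long);
                        arch_bits  -= sizeof(unsigned long) * 8;
                }
        }

        assert(arch_bits == arch_bytes * 8);  /* same credit either way */
        printf("credited %u bits\n", arch_bits);
        return 0;
}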
The patch below does not apply to the 5.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77fc95f8c0dc9e1f8e620ec14d2fb65028fb7adc Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 17:04:38 +0200
Subject: [PATCH] random: account for arch randomness in bits
Rather than accounting in bytes and multiplying (shifting), we can just
account in bits and avoid the shift. The main motivation for this is
there are other patches in flux that expand this code a bit, and
avoiding the duplication of "* 8" everywhere makes things a bit clearer.
Cc: stable(a)vger.kernel.org
Fixes: 12e45a2a6308 ("random: credit architectural init the exact amount")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0d6fb3eaf609..60d701a2516b 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -776,7 +776,7 @@ static struct notifier_block pm_notifier = { .notifier_call = random_pm_notifica
int __init random_init(const char *command_line)
{
ktime_t now = ktime_get_real();
- unsigned int i, arch_bytes;
+ unsigned int i, arch_bits;
unsigned long entropy;
#if defined(LATENT_ENTROPY_PLUGIN)
@@ -784,12 +784,12 @@ int __init random_init(const char *command_line)
_mix_pool_bytes(compiletime_seed, sizeof(compiletime_seed));
#endif
- for (i = 0, arch_bytes = BLAKE2S_BLOCK_SIZE;
+ for (i = 0, arch_bits = BLAKE2S_BLOCK_SIZE * 8;
i < BLAKE2S_BLOCK_SIZE; i += sizeof(entropy)) {
if (!arch_get_random_seed_long_early(&entropy) &&
!arch_get_random_long_early(&entropy)) {
entropy = random_get_entropy();
- arch_bytes -= sizeof(entropy);
+ arch_bits -= sizeof(entropy) * 8;
}
_mix_pool_bytes(&entropy, sizeof(entropy));
}
@@ -801,8 +801,8 @@ int __init random_init(const char *command_line)
if (crng_ready())
crng_reseed();
else if (trust_cpu)
- _credit_init_bits(arch_bytes * 8);
- used_arch_random = arch_bytes * 8 >= POOL_READY_BITS;
+ _credit_init_bits(arch_bits);
+ used_arch_random = arch_bits >= POOL_READY_BITS;
WARN_ON(register_pm_notifier(&pm_notifier));
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77fc95f8c0dc9e1f8e620ec14d2fb65028fb7adc Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 17:04:38 +0200
Subject: [PATCH] random: account for arch randomness in bits
Rather than accounting in bytes and multiplying (shifting), we can just
account in bits and avoid the shift. The main motivation for this is
there are other patches in flux that expand this code a bit, and
avoiding the duplication of "* 8" everywhere makes things a bit clearer.
Cc: stable(a)vger.kernel.org
Fixes: 12e45a2a6308 ("random: credit architectural init the exact amount")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0d6fb3eaf609..60d701a2516b 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -776,7 +776,7 @@ static struct notifier_block pm_notifier = { .notifier_call = random_pm_notifica
int __init random_init(const char *command_line)
{
ktime_t now = ktime_get_real();
- unsigned int i, arch_bytes;
+ unsigned int i, arch_bits;
unsigned long entropy;
#if defined(LATENT_ENTROPY_PLUGIN)
@@ -784,12 +784,12 @@ int __init random_init(const char *command_line)
_mix_pool_bytes(compiletime_seed, sizeof(compiletime_seed));
#endif
- for (i = 0, arch_bytes = BLAKE2S_BLOCK_SIZE;
+ for (i = 0, arch_bits = BLAKE2S_BLOCK_SIZE * 8;
i < BLAKE2S_BLOCK_SIZE; i += sizeof(entropy)) {
if (!arch_get_random_seed_long_early(&entropy) &&
!arch_get_random_long_early(&entropy)) {
entropy = random_get_entropy();
- arch_bytes -= sizeof(entropy);
+ arch_bits -= sizeof(entropy) * 8;
}
_mix_pool_bytes(&entropy, sizeof(entropy));
}
@@ -801,8 +801,8 @@ int __init random_init(const char *command_line)
if (crng_ready())
crng_reseed();
else if (trust_cpu)
- _credit_init_bits(arch_bytes * 8);
- used_arch_random = arch_bytes * 8 >= POOL_READY_BITS;
+ _credit_init_bits(arch_bits);
+ used_arch_random = arch_bits >= POOL_READY_BITS;
WARN_ON(register_pm_notifier(&pm_notifier));
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77fc95f8c0dc9e1f8e620ec14d2fb65028fb7adc Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 17:04:38 +0200
Subject: [PATCH] random: account for arch randomness in bits
Rather than accounting in bytes and multiplying (shifting), we can just
account in bits and avoid the shift. The main motivation for this is
there are other patches in flux that expand this code a bit, and
avoiding the duplication of "* 8" everywhere makes things a bit clearer.
Cc: stable(a)vger.kernel.org
Fixes: 12e45a2a6308 ("random: credit architectural init the exact amount")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0d6fb3eaf609..60d701a2516b 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -776,7 +776,7 @@ static struct notifier_block pm_notifier = { .notifier_call = random_pm_notifica
int __init random_init(const char *command_line)
{
ktime_t now = ktime_get_real();
- unsigned int i, arch_bytes;
+ unsigned int i, arch_bits;
unsigned long entropy;
#if defined(LATENT_ENTROPY_PLUGIN)
@@ -784,12 +784,12 @@ int __init random_init(const char *command_line)
_mix_pool_bytes(compiletime_seed, sizeof(compiletime_seed));
#endif
- for (i = 0, arch_bytes = BLAKE2S_BLOCK_SIZE;
+ for (i = 0, arch_bits = BLAKE2S_BLOCK_SIZE * 8;
i < BLAKE2S_BLOCK_SIZE; i += sizeof(entropy)) {
if (!arch_get_random_seed_long_early(&entropy) &&
!arch_get_random_long_early(&entropy)) {
entropy = random_get_entropy();
- arch_bytes -= sizeof(entropy);
+ arch_bits -= sizeof(entropy) * 8;
}
_mix_pool_bytes(&entropy, sizeof(entropy));
}
@@ -801,8 +801,8 @@ int __init random_init(const char *command_line)
if (crng_ready())
crng_reseed();
else if (trust_cpu)
- _credit_init_bits(arch_bytes * 8);
- used_arch_random = arch_bytes * 8 >= POOL_READY_BITS;
+ _credit_init_bits(arch_bits);
+ used_arch_random = arch_bits >= POOL_READY_BITS;
WARN_ON(register_pm_notifier(&pm_notifier));
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 77fc95f8c0dc9e1f8e620ec14d2fb65028fb7adc Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 17:04:38 +0200
Subject: [PATCH] random: account for arch randomness in bits
Rather than accounting in bytes and multiplying (shifting), we can just
account in bits and avoid the shift. The main motivation for this is
there are other patches in flux that expand this code a bit, and
avoiding the duplication of "* 8" everywhere makes things a bit clearer.
Cc: stable(a)vger.kernel.org
Fixes: 12e45a2a6308 ("random: credit architectural init the exact amount")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0d6fb3eaf609..60d701a2516b 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -776,7 +776,7 @@ static struct notifier_block pm_notifier = { .notifier_call = random_pm_notifica
int __init random_init(const char *command_line)
{
ktime_t now = ktime_get_real();
- unsigned int i, arch_bytes;
+ unsigned int i, arch_bits;
unsigned long entropy;
#if defined(LATENT_ENTROPY_PLUGIN)
@@ -784,12 +784,12 @@ int __init random_init(const char *command_line)
_mix_pool_bytes(compiletime_seed, sizeof(compiletime_seed));
#endif
- for (i = 0, arch_bytes = BLAKE2S_BLOCK_SIZE;
+ for (i = 0, arch_bits = BLAKE2S_BLOCK_SIZE * 8;
i < BLAKE2S_BLOCK_SIZE; i += sizeof(entropy)) {
if (!arch_get_random_seed_long_early(&entropy) &&
!arch_get_random_long_early(&entropy)) {
entropy = random_get_entropy();
- arch_bytes -= sizeof(entropy);
+ arch_bits -= sizeof(entropy) * 8;
}
_mix_pool_bytes(&entropy, sizeof(entropy));
}
@@ -801,8 +801,8 @@ int __init random_init(const char *command_line)
if (crng_ready())
crng_reseed();
else if (trust_cpu)
- _credit_init_bits(arch_bytes * 8);
- used_arch_random = arch_bytes * 8 >= POOL_READY_BITS;
+ _credit_init_bits(arch_bits);
+ used_arch_random = arch_bits >= POOL_READY_BITS;
WARN_ON(register_pm_notifier(&pm_notifier));
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b29b6b20376ab64e1b043df6301d8a92378e631 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 09:44:07 +0200
Subject: [PATCH] random: avoid checking crng_ready() twice in random_init()
The current flow expands to:
if (crng_ready())
...
else if (...)
if (!crng_ready())
...
The second crng_ready() call is redundant, but can't so easily be
optimized out by the compiler.
This commit simplifies that to:
if (crng_ready())
...
else if (...)
...
Fixes: 560181c27b58 ("random: move initialization functions out of hot pages")
Cc: stable(a)vger.kernel.org
Cc: Dominik Brodowski <linux(a)dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index b691b9d59503..4862d4d3ec49 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -801,7 +801,7 @@ int __init random_init(const char *command_line)
if (crng_ready())
crng_reseed();
else if (trust_cpu)
- credit_init_bits(arch_bytes * 8);
+ _credit_init_bits(arch_bytes * 8);
used_arch_random = arch_bytes * 8 >= POOL_READY_BITS;
WARN_ON(register_pm_notifier(&pm_notifier));
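A standalone sketch of the two flows described in the patch above, with
crng_ready() and the credit helpers reduced to stubs (their real definitions
live in drivers/char/random.c; everything here is simplified for
illustration):
/* Old flow: the else-branch re-tests crng_ready() inside
 * credit_init_bits() even though the if() already proved it false.
 * New flow: call the underscored helper directly, no second test.
 */
#include <stdbool.h>
#include <stdio.h>

static bool ready;                              /* stub CRNG state */

static bool crng_ready(void) { return ready; }

static void _credit_init_bits(unsigned int bits)   /* unconditional    */
{
        printf("crediting %u bits\n", bits);
}

static void credit_init_bits(unsigned int bits)    /* checks readiness */
{
        if (!crng_ready())
                _credit_init_bits(bits);
}

int main(void)
{
        unsigned int arch_bits = 256;
        bool trust_cpu = true;

        if (crng_ready())                       /* old flow */
                puts("reseed");
        else if (trust_cpu)
                credit_init_bits(arch_bits);

        if (crng_ready())                       /* new flow */
                puts("reseed");
        else if (trust_cpu)
                _credit_init_bits(arch_bits);

        return 0;
}
Both flows credit the same bits; the second simply drops the redundant check.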
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b29b6b20376ab64e1b043df6301d8a92378e631 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 09:44:07 +0200
Subject: [PATCH] random: avoid checking crng_ready() twice in random_init()
The current flow expands to:
if (crng_ready())
...
else if (...)
if (!crng_ready())
...
The second crng_ready() call is redundant, but can't so easily be
optimized out by the compiler.
This commit simplifies that to:
if (crng_ready())
...
else if (...)
...
Fixes: 560181c27b58 ("random: move initialization functions out of hot pages")
Cc: stable(a)vger.kernel.org
Cc: Dominik Brodowski <linux(a)dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index b691b9d59503..4862d4d3ec49 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -801,7 +801,7 @@ int __init random_init(const char *command_line)
if (crng_ready())
crng_reseed();
else if (trust_cpu)
- credit_init_bits(arch_bytes * 8);
+ _credit_init_bits(arch_bytes * 8);
used_arch_random = arch_bytes * 8 >= POOL_READY_BITS;
WARN_ON(register_pm_notifier(&pm_notifier));
The patch below does not apply to the 5.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b29b6b20376ab64e1b043df6301d8a92378e631 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 09:44:07 +0200
Subject: [PATCH] random: avoid checking crng_ready() twice in random_init()
The current flow expands to:
if (crng_ready())
...
else if (...)
if (!crng_ready())
...
The second crng_ready() call is redundant, but can't so easily be
optimized out by the compiler.
This commit simplifies that to:
if (crng_ready())
...
else if (...)
...
Fixes: 560181c27b58 ("random: move initialization functions out of hot pages")
Cc: stable(a)vger.kernel.org
Cc: Dominik Brodowski <linux(a)dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index b691b9d59503..4862d4d3ec49 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -801,7 +801,7 @@ int __init random_init(const char *command_line)
if (crng_ready())
crng_reseed();
else if (trust_cpu)
- credit_init_bits(arch_bytes * 8);
+ _credit_init_bits(arch_bytes * 8);
used_arch_random = arch_bytes * 8 >= POOL_READY_BITS;
WARN_ON(register_pm_notifier(&pm_notifier));
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9b29b6b20376ab64e1b043df6301d8a92378e631 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 09:44:07 +0200
Subject: [PATCH] random: avoid checking crng_ready() twice in random_init()
The current flow expands to:
if (crng_ready())
...
else if (...)
if (!crng_ready())
...
The second crng_ready() call is redundant, but can't so easily be
optimized out by the compiler.
This commit simplifies that to:
if (crng_ready())
...
else if (...)
...
Fixes: 560181c27b58 ("random: move initialization functions out of hot pages")
Cc: stable(a)vger.kernel.org
Cc: Dominik Brodowski <linux(a)dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index b691b9d59503..4862d4d3ec49 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -801,7 +801,7 @@ int __init random_init(const char *command_line)
if (crng_ready())
crng_reseed();
else if (trust_cpu)
- credit_init_bits(arch_bytes * 8);
+ _credit_init_bits(arch_bytes * 8);
used_arch_random = arch_bytes * 8 >= POOL_READY_BITS;
WARN_ON(register_pm_notifier(&pm_notifier));
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 39e0f991a62ed5efabd20711a7b6e7da92603170 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 17:00:16 +0200
Subject: [PATCH] random: mark bootloader randomness code as __init
add_bootloader_randomness() and the variables it touches are only used
during __init and not after, so mark these as __init. At the same time,
unexport this, since it's only called by other __init code that's
built-in.
Cc: stable(a)vger.kernel.org
Fixes: 428826f5358c ("fdt: add support for rng-seed")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4862d4d3ec49..0d6fb3eaf609 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -725,8 +725,8 @@ static void __cold _credit_init_bits(size_t bits)
**********************************************************************/
static bool used_arch_random;
-static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
-static bool trust_bootloader __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
+static bool trust_cpu __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+static bool trust_bootloader __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
static int __init parse_trust_cpu(char *arg)
{
return kstrtobool(arg, &trust_cpu);
@@ -865,13 +865,12 @@ EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
* Handle random seed passed by bootloader, and credit it if
* CONFIG_RANDOM_TRUST_BOOTLOADER is set.
*/
-void __cold add_bootloader_randomness(const void *buf, size_t len)
+void __init add_bootloader_randomness(const void *buf, size_t len)
{
mix_pool_bytes(buf, len);
if (trust_bootloader)
credit_init_bits(len * 8);
}
-EXPORT_SYMBOL_GPL(add_bootloader_randomness);
#if IS_ENABLED(CONFIG_VMGENID)
static BLOCKING_NOTIFIER_HEAD(vmfork_chain);
diff --git a/include/linux/random.h b/include/linux/random.h
index fae0c84027fd..223b4bd584e7 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -13,7 +13,7 @@
struct notifier_block;
void add_device_randomness(const void *buf, size_t len);
-void add_bootloader_randomness(const void *buf, size_t len);
+void __init add_bootloader_randomness(const void *buf, size_t len);
void add_input_randomness(unsigned int type, unsigned int code,
unsigned int value) __latent_entropy;
void add_interrupt_randomness(int irq) __latent_entropy;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 39e0f991a62ed5efabd20711a7b6e7da92603170 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 17:00:16 +0200
Subject: [PATCH] random: mark bootloader randomness code as __init
add_bootloader_randomness() and the variables it touches are only used
during __init and not after, so mark these as __init. At the same time,
unexport this, since it's only called by other __init code that's
built-in.
Cc: stable(a)vger.kernel.org
Fixes: 428826f5358c ("fdt: add support for rng-seed")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4862d4d3ec49..0d6fb3eaf609 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -725,8 +725,8 @@ static void __cold _credit_init_bits(size_t bits)
**********************************************************************/
static bool used_arch_random;
-static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
-static bool trust_bootloader __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
+static bool trust_cpu __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+static bool trust_bootloader __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
static int __init parse_trust_cpu(char *arg)
{
return kstrtobool(arg, &trust_cpu);
@@ -865,13 +865,12 @@ EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
* Handle random seed passed by bootloader, and credit it if
* CONFIG_RANDOM_TRUST_BOOTLOADER is set.
*/
-void __cold add_bootloader_randomness(const void *buf, size_t len)
+void __init add_bootloader_randomness(const void *buf, size_t len)
{
mix_pool_bytes(buf, len);
if (trust_bootloader)
credit_init_bits(len * 8);
}
-EXPORT_SYMBOL_GPL(add_bootloader_randomness);
#if IS_ENABLED(CONFIG_VMGENID)
static BLOCKING_NOTIFIER_HEAD(vmfork_chain);
diff --git a/include/linux/random.h b/include/linux/random.h
index fae0c84027fd..223b4bd584e7 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -13,7 +13,7 @@
struct notifier_block;
void add_device_randomness(const void *buf, size_t len);
-void add_bootloader_randomness(const void *buf, size_t len);
+void __init add_bootloader_randomness(const void *buf, size_t len);
void add_input_randomness(unsigned int type, unsigned int code,
unsigned int value) __latent_entropy;
void add_interrupt_randomness(int irq) __latent_entropy;
The patch below does not apply to the 5.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 39e0f991a62ed5efabd20711a7b6e7da92603170 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 17:00:16 +0200
Subject: [PATCH] random: mark bootloader randomness code as __init
add_bootloader_randomness() and the variables it touches are only used
during __init and not after, so mark these as __init. At the same time,
unexport this, since it's only called by other __init code that's
built-in.
Cc: stable(a)vger.kernel.org
Fixes: 428826f5358c ("fdt: add support for rng-seed")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4862d4d3ec49..0d6fb3eaf609 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -725,8 +725,8 @@ static void __cold _credit_init_bits(size_t bits)
**********************************************************************/
static bool used_arch_random;
-static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
-static bool trust_bootloader __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
+static bool trust_cpu __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+static bool trust_bootloader __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
static int __init parse_trust_cpu(char *arg)
{
return kstrtobool(arg, &trust_cpu);
@@ -865,13 +865,12 @@ EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
* Handle random seed passed by bootloader, and credit it if
* CONFIG_RANDOM_TRUST_BOOTLOADER is set.
*/
-void __cold add_bootloader_randomness(const void *buf, size_t len)
+void __init add_bootloader_randomness(const void *buf, size_t len)
{
mix_pool_bytes(buf, len);
if (trust_bootloader)
credit_init_bits(len * 8);
}
-EXPORT_SYMBOL_GPL(add_bootloader_randomness);
#if IS_ENABLED(CONFIG_VMGENID)
static BLOCKING_NOTIFIER_HEAD(vmfork_chain);
diff --git a/include/linux/random.h b/include/linux/random.h
index fae0c84027fd..223b4bd584e7 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -13,7 +13,7 @@
struct notifier_block;
void add_device_randomness(const void *buf, size_t len);
-void add_bootloader_randomness(const void *buf, size_t len);
+void __init add_bootloader_randomness(const void *buf, size_t len);
void add_input_randomness(unsigned int type, unsigned int code,
unsigned int value) __latent_entropy;
void add_interrupt_randomness(int irq) __latent_entropy;
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 39e0f991a62ed5efabd20711a7b6e7da92603170 Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Date: Tue, 7 Jun 2022 17:00:16 +0200
Subject: [PATCH] random: mark bootloader randomness code as __init
add_bootloader_randomness() and the variables it touches are only used
during __init and not after, so mark these as __init. At the same time,
unexport this, since it's only called by other __init code that's
built-in.
Cc: stable(a)vger.kernel.org
Fixes: 428826f5358c ("fdt: add support for rng-seed")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4862d4d3ec49..0d6fb3eaf609 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -725,8 +725,8 @@ static void __cold _credit_init_bits(size_t bits)
**********************************************************************/
static bool used_arch_random;
-static bool trust_cpu __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
-static bool trust_bootloader __ro_after_init = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
+static bool trust_cpu __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_CPU);
+static bool trust_bootloader __initdata = IS_ENABLED(CONFIG_RANDOM_TRUST_BOOTLOADER);
static int __init parse_trust_cpu(char *arg)
{
return kstrtobool(arg, &trust_cpu);
@@ -865,13 +865,12 @@ EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
* Handle random seed passed by bootloader, and credit it if
* CONFIG_RANDOM_TRUST_BOOTLOADER is set.
*/
-void __cold add_bootloader_randomness(const void *buf, size_t len)
+void __init add_bootloader_randomness(const void *buf, size_t len)
{
mix_pool_bytes(buf, len);
if (trust_bootloader)
credit_init_bits(len * 8);
}
-EXPORT_SYMBOL_GPL(add_bootloader_randomness);
#if IS_ENABLED(CONFIG_VMGENID)
static BLOCKING_NOTIFIER_HEAD(vmfork_chain);
diff --git a/include/linux/random.h b/include/linux/random.h
index fae0c84027fd..223b4bd584e7 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -13,7 +13,7 @@
struct notifier_block;
void add_device_randomness(const void *buf, size_t len);
-void add_bootloader_randomness(const void *buf, size_t len);
+void __init add_bootloader_randomness(const void *buf, size_t len);
void add_input_randomness(unsigned int type, unsigned int code,
unsigned int value) __latent_entropy;
void add_interrupt_randomness(int irq) __latent_entropy;
From: Chuang Wang <nashuiliang(a)gmail.com>
In aggrprobe scenarios, if arm_kprobe() returns an error (e.g. livepatch and
a kprobe are using the same function X), the kprobe's flags, which have
already been modified to clear KPROBE_FLAG_DISABLED, are not rolled back.
Then __disable_kprobe() fails in __unregister_kprobe_top(), the kprobe is
not removed from the aggrprobe list, and memory leaks or illegal pointers
result.
WARN disarm_kprobe:
Failed to disarm kprobe-ftrace at 00000000c729fdbc (-2)
RIP: 0010:disarm_kprobe+0xcc/0x110
Call Trace:
__disable_kprobe+0x78/0x90
__unregister_kprobe_top+0x13/0x1b0
? _cond_resched+0x15/0x30
unregister_kprobes+0x32/0x80
unregister_kprobe+0x1a/0x20
Illegal Pointers:
BUG: unable to handle kernel paging request at 0000000000656369
RIP: 0010:__get_valid_kprobe+0x69/0x90
Call Trace:
register_kprobe+0x30/0x60
__register_trace_kprobe.part.7+0x8b/0xc0
create_local_trace_kprobe+0xd2/0x130
perf_kprobe_init+0x83/0xd0
Fixes: 12310e343755 ("kprobes: Propagate error from arm_kprobe_ftrace()")
Signed-off-by: Chuang Wang <nashuiliang(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Jingren Zhou <zhoujingren(a)didiglobal.com>
---
v1->v2:
- Supplement commit information: fixline, Cc stable
kernel/kprobes.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index f214f8c088ed..c11c79e05a4c 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2422,8 +2422,11 @@ int enable_kprobe(struct kprobe *kp)
if (!kprobes_all_disarmed && kprobe_disabled(p)) {
p->flags &= ~KPROBE_FLAG_DISABLED;
ret = arm_kprobe(p);
- if (ret)
+ if (ret) {
p->flags |= KPROBE_FLAG_DISABLED;
+ if (p != kp)
+ kp->flags |= KPROBE_FLAG_DISABLED;
+ }
}
out:
mutex_unlock(&kprobe_mutex);
--
2.34.1
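To make the rollback added above concrete, here is a toy userspace model of
the flag handling. struct kprobe, arm_kprobe() and enable_kprobe() below are
heavily simplified stand-ins for the real kernel code (the real aggregate
probe is looked up internally, arming can succeed, locking exists); only the
flag bookkeeping is of interest:
/* When arming fails, both the aggregate probe and the individual probe
 * passed by the caller must get KPROBE_FLAG_DISABLED back, otherwise a
 * later unregister sees an "enabled" kprobe it cannot disarm.
 */
#include <stdio.h>

#define KPROBE_FLAG_DISABLED 0x1

struct kprobe {
        unsigned int flags;
};

static int arm_kprobe(struct kprobe *p)
{
        (void)p;
        return -2;              /* pretend arming always fails here */
}

static void enable_kprobe(struct kprobe *aggr, struct kprobe *kp)
{
        struct kprobe *p = aggr ? aggr : kp;    /* aggregate if one exists */

        kp->flags &= ~KPROBE_FLAG_DISABLED;
        p->flags &= ~KPROBE_FLAG_DISABLED;

        if (arm_kprobe(p)) {
                p->flags |= KPROBE_FLAG_DISABLED;
                if (p != kp)
                        kp->flags |= KPROBE_FLAG_DISABLED;  /* the added rollback */
        }
}

int main(void)
{
        struct kprobe aggr = { .flags = KPROBE_FLAG_DISABLED };
        struct kprobe kp   = { .flags = KPROBE_FLAG_DISABLED };

        enable_kprobe(&aggr, &kp);

        printf("aggr disabled: %d, kp disabled: %d\n",
               !!(aggr.flags & KPROBE_FLAG_DISABLED),
               !!(kp.flags & KPROBE_FLAG_DISABLED));
        return 0;
}
Without the "if (p != kp)" branch, kp would be left looking enabled even
though arming failed, which is what later trips __disable_kprobe().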
The following changes since commit f2906aa863381afb0015a9eb7fefad885d4e5a56:
Linux 5.19-rc1 (2022-06-05 17:18:54 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
for you to fetch changes up to eacea844594ff338db06437806707313210d4865:
um: virt-pci: set device ready in probe() (2022-06-10 20:38:06 -0400)
----------------------------------------------------------------
virtio,vdpa: fixes
Fixes all over the place, most notably fixes for latent
bugs in drivers that got exposed by suppressing
interrupts before DRIVER_OK, which in turn has been
done by 8b4ec69d7e09 ("virtio: harden vring IRQ").
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
----------------------------------------------------------------
Bo Liu (1):
virtio: Fix all occurences of the "the the" typo
Dan Carpenter (2):
vdpa/mlx5: fix error code for deleting vlan
vdpa/mlx5: clean up indenting in handle_ctrl_vlan()
Jason Wang (2):
virtio-rng: make device ready before making request
vdpa: make get_vq_group and set_group_asid optional
Vincent Whitchurch (1):
um: virt-pci: set device ready in probe()
Xiang wangx (1):
vdpa/mlx5: Fix syntax errors in comments
Xie Yongji (2):
vringh: Fix loop descriptors check in the indirect cases
vduse: Fix NULL pointer dereference on sysfs access
chengkaitao (1):
virtio-mmio: fix missing put_device() when vm_cmdline_parent registration failed
arch/um/drivers/virt-pci.c | 7 ++++++-
drivers/char/hw_random/virtio-rng.c | 2 ++
drivers/vdpa/mlx5/net/mlx5_vnet.c | 9 +++++----
drivers/vdpa/vdpa_user/vduse_dev.c | 7 +++----
drivers/vhost/vdpa.c | 2 ++
drivers/vhost/vringh.c | 10 ++++++++--
drivers/virtio/virtio_mmio.c | 3 ++-
drivers/virtio/virtio_pci_modern_dev.c | 2 +-
include/linux/vdpa.h | 5 +++--
9 files changed, 32 insertions(+), 15 deletions(-)
The patch titled
Subject: mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-userfaultfd-fix-uffdio_continue-on-fallocated-shmem-pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Axel Rasmussen <axelrasmussen(a)google.com>
Subject: mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages
Date: Fri, 10 Jun 2022 10:38:12 -0700
When fallocate() is used on a shmem file, the pages we allocate can end up
with !PageUptodate.
Since UFFDIO_CONTINUE tries to find the existing page the user wants to
map with SGP_READ, we would fail to find such a page, since
shmem_getpage_gfp returns with a "NULL" pagep for SGP_READ if it discovers
!PageUptodate. As a result, UFFDIO_CONTINUE returns -EFAULT, as it would
do if the page wasn't found in the page cache at all.
This isn't the intended behavior. UFFDIO_CONTINUE is just trying to find
if a page exists, and doesn't care whether it still needs to be cleared or
not. So, instead of SGP_READ, pass in SGP_NOALLOC. This is the same,
except for one critical difference: in the !PageUptodate case, SGP_NOALLOC
will clear the page and then return it. With this change, UFFDIO_CONTINUE
works properly (succeeds) on a shmem file which has been fallocated, but
otherwise not modified.
Link: https://lkml.kernel.org/r/20220610173812.1768919-1-axelrasmussen@google.com
Fixes: 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem")
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/userfaultfd.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/userfaultfd.c~mm-userfaultfd-fix-uffdio_continue-on-fallocated-shmem-pages
+++ a/mm/userfaultfd.c
@@ -246,7 +246,10 @@ static int mcontinue_atomic_pte(struct m
struct page *page;
int ret;
- ret = shmem_getpage(inode, pgoff, &page, SGP_READ);
+ ret = shmem_getpage(inode, pgoff, &page, SGP_NOALLOC);
+ /* Our caller expects us to return -EFAULT if we failed to find page. */
+ if (ret == -ENOENT)
+ ret = -EFAULT;
if (ret)
goto out;
if (!page) {
_
Patches currently in -mm which might be from axelrasmussen(a)google.com are
mm-userfaultfd-fix-uffdio_continue-on-fallocated-shmem-pages.patch
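The error translation described in the patch above can be seen in isolation in
this small sketch. shmem_lookup() is a made-up stand-in for
shmem_getpage(..., SGP_NOALLOC); the real lookup of course involves the page
cache, but the contract it has to satisfy is just this mapping:
/* The lookup helper reports "no page" as -ENOENT, while UFFDIO_CONTINUE's
 * contract is to report that case as -EFAULT to userspace.
 */
#include <errno.h>
#include <stdio.h>

static int shmem_lookup(int page_exists)
{
        return page_exists ? 0 : -ENOENT;
}

static int mcontinue_one(int page_exists)
{
        int ret = shmem_lookup(page_exists);

        /* Caller expects -EFAULT when the page could not be found. */
        if (ret == -ENOENT)
                ret = -EFAULT;
        return ret;
}

int main(void)
{
        printf("page present: %d\n", mcontinue_one(1));   /* 0       */
        printf("page missing: %d\n", mcontinue_one(0));   /* -EFAULT */
        return 0;
}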
The quilt patch titled
Subject: mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages
has been removed from the -mm tree. Its filename was
mm-userfaultfd-fix-uffdio_continue-on-fallocated-shmem-pages.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Axel Rasmussen <axelrasmussen(a)google.com>
Subject: mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages
Date: Fri, 3 Jun 2022 13:57:41 -0700
When fallocate() is used on a shmem file, the pages we allocate can end up
with !PageUptodate.
Since UFFDIO_CONTINUE tries to find the existing page the user wants to
map with SGP_READ, we would fail to find such a page, since
shmem_getpage_gfp returns with a "NULL" pagep for SGP_READ if it discovers
!PageUptodate. As a result, UFFDIO_CONTINUE returns -EFAULT, as it would
do if the page wasn't found in the page cache at all.
This isn't the intended behavior. UFFDIO_CONTINUE is just trying to find
if a page exists, and doesn't care whether it still needs to be cleared or
not. So, instead of SGP_READ, pass in SGP_NOALLOC. This is the same,
except for one critical difference: in the !PageUptodate case, SGP_NOALLOC
will clear the page and then return it. With this change, UFFDIO_CONTINUE
works properly (succeeds) on a shmem file which has been fallocated, but
otherwise not modified.
Link: https://lkml.kernel.org/r/20220603205741.12888-1-axelrasmussen@google.com
Fixes: 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem")
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/userfaultfd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/userfaultfd.c~mm-userfaultfd-fix-uffdio_continue-on-fallocated-shmem-pages
+++ a/mm/userfaultfd.c
@@ -246,7 +246,7 @@ static int mcontinue_atomic_pte(struct m
struct page *page;
int ret;
- ret = shmem_getpage(inode, pgoff, &page, SGP_READ);
+ ret = shmem_getpage(inode, pgoff, &page, SGP_NOALLOC);
if (ret)
goto out;
if (!page) {
_
Patches currently in -mm which might be from axelrasmussen(a)google.com are
The platform's RNG must be available before random_init() in order to be
useful for initial seeding, which in turn means that it needs to be
called from setup_arch(), rather than from an init call. Fortunately,
each platform already has a setup_arch function pointer, which means
it's easy to wire this up. This commit also removes some noisy log
messages that don't add much.
Cc: stable(a)vger.kernel.org
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Fixes: a489043f4626 ("powerpc/pseries: Implement arch_get_random_long() based on H_RANDOM")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
arch/powerpc/platforms/pseries/pseries.h | 2 ++
arch/powerpc/platforms/pseries/rng.c | 11 +++--------
arch/powerpc/platforms/pseries/setup.c | 1 +
3 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index f5c916c839c9..1d75b7742ef0 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -122,4 +122,6 @@ void pseries_lpar_read_hblkrm_characteristics(void);
static inline void pseries_lpar_read_hblkrm_characteristics(void) { }
#endif
+void pseries_rng_init(void);
+
#endif /* _PSERIES_PSERIES_H */
diff --git a/arch/powerpc/platforms/pseries/rng.c b/arch/powerpc/platforms/pseries/rng.c
index 6268545947b8..6ddfdeaace9e 100644
--- a/arch/powerpc/platforms/pseries/rng.c
+++ b/arch/powerpc/platforms/pseries/rng.c
@@ -10,6 +10,7 @@
#include <asm/archrandom.h>
#include <asm/machdep.h>
#include <asm/plpar_wrappers.h>
+#include "pseries.h"
static int pseries_get_random_long(unsigned long *v)
@@ -24,19 +25,13 @@ static int pseries_get_random_long(unsigned long *v)
return 0;
}
-static __init int rng_init(void)
+void __init pseries_rng_init(void)
{
struct device_node *dn;
dn = of_find_compatible_node(NULL, NULL, "ibm,random");
if (!dn)
- return -ENODEV;
-
- pr_info("Registering arch random hook.\n");
-
+ return;
ppc_md.get_random_seed = pseries_get_random_long;
-
of_node_put(dn);
- return 0;
}
-machine_subsys_initcall(pseries, rng_init);
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index afb074269b42..ee4f1db49515 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -839,6 +839,7 @@ static void __init pSeries_setup_arch(void)
}
ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
+ pseries_rng_init();
}
static void pseries_panic(char *str)
--
2.35.1
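A userspace sketch of the ordering problem the patch above addresses:
random_init() consults the arch hook early, so a registration deferred to an
initcall is simply never seen by the initial seeding. The names below
(platform_rng_init, platform_get_seed) are invented for the illustration and
the "entropy" is a constant:
/* A function-pointer hook only helps early seeding if it is installed
 * before the early consumer runs, i.e. from setup_arch() rather than
 * from a later initcall.
 */
#include <stdio.h>

static int (*get_random_seed)(unsigned long *v);   /* models ppc_md.get_random_seed */

static int platform_get_seed(unsigned long *v)
{
        *v = 0x1234abcdUL;              /* pretend hardware entropy */
        return 1;
}

static void platform_rng_init(void)     /* models pseries_rng_init() */
{
        get_random_seed = platform_get_seed;
}

static void random_init(void)           /* models the early seeding call */
{
        unsigned long seed;

        if (get_random_seed && get_random_seed(&seed))
                printf("seeded early with %#lx\n", seed);
        else
                printf("no arch seed available at random_init()\n");
}

int main(void)
{
        random_init();                  /* old order: hook not yet installed */

        platform_rng_init();            /* fixed order: setup_arch() installs it first */
        random_init();
        return 0;
}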
Before commit 9998943f6dfc ("media: rkvdec: Stop overclocking the
decoder"), the rkvdec driver was forcing the VDU clock rate. After that
commit, we rely on the default clock rate. That rate works OK on many
boards, with the default PLL settings (CPLL is 800MHz, VDU dividers
leave it at 400MHz); but some boards change PLL settings.
Assign the expected default clock rate explicitly, so that the rate is
consistent, regardless of PLL configuration.
This was particularly broken on RK3399 Gru Scarlet systems, where the
rk3399-gru-scarlet.dtsi assigns PLL_CPLL to 1.6 GHz, and so the VDU
clock ends up at 800 MHz (twice the expected rate), and causes video
artifacts and other issues.
Note: I assign the clock rate in the clock controller instead of the
vdec node, because there are multiple nodes that use this clock, and per
the clock.yaml specification:
Configuring a clock's parent and rate through the device node that
consumes the clock can be done only for clocks that have a single
user. Specifying conflicting parent or rate configuration in multiple
consumer nodes for a shared clock is forbidden.
Configuration of common clocks, which affect multiple consumer devices
can be similarly specified in the clock provider node.
Fixes: 9998943f6dfc ("media: rkvdec: Stop overclocking the decoder")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
---
This is a candidate for 5.19 IMO, since commit 9998943f6dfc landed in
5.19-rc1 and is being queued up for -stable as we speak.
arch/arm64/boot/dts/rockchip/rk3399-gru-scarlet.dtsi | 4 +++-
arch/arm64/boot/dts/rockchip/rk3399.dtsi | 6 ++++--
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru-scarlet.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-gru-scarlet.dtsi
index 913d845eb51a..1977103a5ef4 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-gru-scarlet.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399-gru-scarlet.dtsi
@@ -376,7 +376,8 @@ &cru {
<&cru ACLK_VIO>,
<&cru ACLK_GIC_PRE>,
<&cru PCLK_DDR>,
- <&cru ACLK_HDCP>;
+ <&cru ACLK_HDCP>,
+ <&cru ACLK_VDU>;
assigned-clock-rates =
<600000000>, <1600000000>,
<1000000000>,
@@ -388,6 +389,7 @@ &cru {
<400000000>,
<200000000>,
<200000000>,
+ <400000000>,
<400000000>;
};
diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index fbd0346624e6..9d5b0e8c9cca 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -1462,7 +1462,8 @@ cru: clock-controller@ff760000 {
<&cru HCLK_PERILP1>, <&cru PCLK_PERILP1>,
<&cru ACLK_VIO>, <&cru ACLK_HDCP>,
<&cru ACLK_GIC_PRE>,
- <&cru PCLK_DDR>;
+ <&cru PCLK_DDR>,
+ <&cru ACLK_VDU>;
assigned-clock-rates =
<594000000>, <800000000>,
<1000000000>,
@@ -1473,7 +1474,8 @@ cru: clock-controller@ff760000 {
<100000000>, <50000000>,
<400000000>, <400000000>,
<200000000>,
- <200000000>;
+ <200000000>,
+ <400000000>;
};
grf: syscon@ff770000 {
--
2.36.1.255.ge46751e96f-goog
The platform's RNG must be available before random_init() in order to be
useful for initial seeding, which in turn means that it needs to be
called from setup_arch(), rather than from an init call. Fortunately,
each platform already has a setup_arch function pointer, which means
it's easy to wire this up for each of the three platforms that have an
RNG. This commit also removes some noisy log messages that don't add
much.
Cc: stable(a)vger.kernel.org
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Fixes: c25769fddaec ("powerpc/microwatt: Add support for hardware random number generator")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
arch/powerpc/platforms/microwatt/rng.c | 9 ++-------
arch/powerpc/platforms/microwatt/setup.c | 8 ++++++++
2 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/platforms/microwatt/rng.c b/arch/powerpc/platforms/microwatt/rng.c
index 7bc4d1cbfaf0..d13f656910ad 100644
--- a/arch/powerpc/platforms/microwatt/rng.c
+++ b/arch/powerpc/platforms/microwatt/rng.c
@@ -29,7 +29,7 @@ static int microwatt_get_random_darn(unsigned long *v)
return 1;
}
-static __init int rng_init(void)
+__init void microwatt_rng_init(void)
{
unsigned long val;
int i;
@@ -37,12 +37,7 @@ static __init int rng_init(void)
for (i = 0; i < 10; i++) {
if (microwatt_get_random_darn(&val)) {
ppc_md.get_random_seed = microwatt_get_random_darn;
- return 0;
+ return;
}
}
-
- pr_warn("Unable to use DARN for get_random_seed()\n");
-
- return -EIO;
}
-machine_subsys_initcall(, rng_init);
diff --git a/arch/powerpc/platforms/microwatt/setup.c b/arch/powerpc/platforms/microwatt/setup.c
index 0b02603bdb74..23c996dcc870 100644
--- a/arch/powerpc/platforms/microwatt/setup.c
+++ b/arch/powerpc/platforms/microwatt/setup.c
@@ -32,10 +32,18 @@ static int __init microwatt_populate(void)
}
machine_arch_initcall(microwatt, microwatt_populate);
+__init void microwatt_rng_init(void);
+
+static void __init microwatt_setup_arch(void)
+{
+ microwatt_rng_init();
+}
+
define_machine(microwatt) {
.name = "microwatt",
.probe = microwatt_probe,
.init_IRQ = microwatt_init_IRQ,
+ .setup_arch = microwatt_setup_arch,
.progress = udbg_progress,
.calibrate_decr = generic_calibrate_decr,
};
--
2.35.1
The platform's RNG must be available before random_init() in order to be
useful for initial seeding, which in turn means that it needs to be
called from setup_arch(), rather than from an init call. Fortunately,
each platform already has a setup_arch function pointer, which means
it's easy to wire this up for each of the three platforms that have an
RNG. This commit also removes some noisy log messages that don't add
much.
Cc: stable(a)vger.kernel.org
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Fixes: a489043f4626 ("powerpc/pseries: Implement arch_get_random_long() based on H_RANDOM")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
arch/powerpc/platforms/pseries/rng.c | 11 ++---------
arch/powerpc/platforms/pseries/setup.c | 3 +++
2 files changed, 5 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/rng.c b/arch/powerpc/platforms/pseries/rng.c
index 6268545947b8..d39bfce39aa1 100644
--- a/arch/powerpc/platforms/pseries/rng.c
+++ b/arch/powerpc/platforms/pseries/rng.c
@@ -24,19 +24,12 @@ static int pseries_get_random_long(unsigned long *v)
return 0;
}
-static __init int rng_init(void)
+__init void pseries_rng_init(void)
{
struct device_node *dn;
-
dn = of_find_compatible_node(NULL, NULL, "ibm,random");
if (!dn)
- return -ENODEV;
-
- pr_info("Registering arch random hook.\n");
-
+ return;
ppc_md.get_random_seed = pseries_get_random_long;
-
of_node_put(dn);
- return 0;
}
-machine_subsys_initcall(pseries, rng_init);
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index afb074269b42..7f3ee2658163 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -779,6 +779,8 @@ static resource_size_t pseries_pci_iov_resource_alignment(struct pci_dev *pdev,
}
#endif
+__init void pseries_rng_init(void);
+
static void __init pSeries_setup_arch(void)
{
set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
@@ -839,6 +841,7 @@ static void __init pSeries_setup_arch(void)
}
ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
+ pseries_rng_init();
}
static void pseries_panic(char *str)
--
2.35.1
The platform's RNG must be available before random_init() in order to be
useful for initial seeding, which in turn means that it needs to be
called from setup_arch(), rather than from an init call. Fortunately,
each platform already has a setup_arch function pointer, which means
it's easy to wire this up for each of the three platforms that have an
RNG. This commit also removes some noisy log messages that don't add
much.
Cc: stable(a)vger.kernel.org
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Fixes: a4da0d50b2a0 ("powerpc: Implement arch_get_random_long/int() for powernv")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
arch/powerpc/platforms/powernv/rng.c | 17 ++++-------------
arch/powerpc/platforms/powernv/setup.c | 4 ++++
2 files changed, 8 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/platforms/powernv/rng.c b/arch/powerpc/platforms/powernv/rng.c
index e3d44b36ae98..ef24e72a1b69 100644
--- a/arch/powerpc/platforms/powernv/rng.c
+++ b/arch/powerpc/platforms/powernv/rng.c
@@ -84,24 +84,20 @@ static int powernv_get_random_darn(unsigned long *v)
return 1;
}
-static int __init initialise_darn(void)
+static void __init initialise_darn(void)
{
unsigned long val;
int i;
if (!cpu_has_feature(CPU_FTR_ARCH_300))
- return -ENODEV;
+ return;
for (i = 0; i < 10; i++) {
if (powernv_get_random_darn(&val)) {
ppc_md.get_random_seed = powernv_get_random_darn;
- return 0;
+ return;
}
}
-
- pr_warn("Unable to use DARN for get_random_seed()\n");
-
- return -EIO;
}
int powernv_get_random_long(unsigned long *v)
@@ -163,14 +159,12 @@ static __init int rng_create(struct device_node *dn)
rng_init_per_cpu(rng, dn);
- pr_info_once("Registering arch random hook.\n");
-
ppc_md.get_random_seed = powernv_get_random_long;
return 0;
}
-static __init int rng_init(void)
+__init void powernv_rng_init(void)
{
struct device_node *dn;
int rc;
@@ -188,7 +182,4 @@ static __init int rng_init(void)
}
initialise_darn();
-
- return 0;
}
-machine_subsys_initcall(powernv, rng_init);
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index 824c3ad7a0fa..a0c5217bc5c0 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -184,6 +184,8 @@ static void __init pnv_check_guarded_cores(void)
}
}
+__init void powernv_rng_init(void);
+
static void __init pnv_setup_arch(void)
{
set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
@@ -203,6 +205,8 @@ static void __init pnv_setup_arch(void)
pnv_check_guarded_cores();
/* XXX PMCS */
+
+ powernv_rng_init();
}
static void __init pnv_init(void)
--
2.35.1
Commit 6c64a664e1cff339ec698d803fa8cbb9af5d95ce “xhci: Set HCD flag to defer
primary roothub registration”, added in Linux 5.18.3, caused a regression on
some Amlogic S912 devices (originally a user forum report with an Android TV
box, confirmed with a Khadas VIM2 board). I do not see issues with older S905
(WeTeK Play2) or newer S922X (Odroid N2+) devices running the same kernel.
There are no kernel splats or error messages, but lsusb shows no output and
nothing works. A simple revert restores the previous working behaviour.
dmesg with commit http://ix.io/3ZTv
dmesg with revert http://ix.io/3ZTw
I’m not a coder, so I will need instructions to assist with debugging the
issue further if you don't have access to an Amlogic S912 device. I can
share devices online for remote poking around if needed.
Christian
--
commit 8e1278444446fc97778a5e5c99bca1ce0bbc5ec9 upstream.
The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.
To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.
The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.
The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.
The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.
Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.
To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.
Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.
On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields that can be overwritten,
including some pointers. As such it may be exploitable.
It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@k…
Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.
Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.
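To make the indexing difference concrete, here is a minimal sketch of the
two addressing schemes (illustrative user-space C, not kernel code; the
struct and helper names are made up for this example):

#include <stdint.h>
#include <string.h>

/* fp_state.fpr holds 32 u64 entries; the VSX pairing cannot apply here
 * because 32-bit && VSX is ruled out by the BUILD_BUG_ON above. */
struct fp_state_sketch {
        uint64_t fpr[32];
};

/* 64-bit kernel: one word-sized USER-area slot per FPR, fpidx is 0..31. */
static void peek_fpr_64(struct fp_state_sketch *fp, int fpidx, unsigned long *out)
{
        memcpy(out, &fp->fpr[fpidx], sizeof(unsigned long));
}

/* 32-bit kernel: each USER-area slot is a 32-bit word, fpidx is 0..63,
 * so the u64 array must be viewed as u32 words to stay in bounds. */
static void peek_fpr_32(struct fp_state_sketch *fp, int fpidx, unsigned long *out)
{
        *out = ((uint32_t *)fp->fpr)[fpidx];
}

Indexing fp->fpr[fpidx] directly with fpidx up to 63, as the old TS_FPR()
path effectively did on 32-bit, reads or writes up to 256 bytes past the
end of the array.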
Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
Cc: stable(a)vger.kernel.org # v3.13+
Reported-by: Ariel Miculas <ariel.miculas(a)belden.com>
Tested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
---
arch/powerpc/kernel/ptrace.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index e08b32ccf1d9..073117ba75fe 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -3007,8 +3007,13 @@ long arch_ptrace(struct task_struct *child, long request,
flush_fp_to_thread(child);
if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&tmp, &child->thread.TS_FPR(fpidx),
- sizeof(long));
+ if (IS_ENABLED(CONFIG_PPC32)) {
+ // On 32-bit the index we are passed refers to 32-bit words
+ tmp = ((u32 *)child->thread.fp_state.fpr)[fpidx];
+ } else {
+ memcpy(&tmp, &child->thread.TS_FPR(fpidx),
+ sizeof(long));
+ }
else
tmp = child->thread.fp_state.fpscr;
}
@@ -3040,8 +3045,13 @@ long arch_ptrace(struct task_struct *child, long request,
flush_fp_to_thread(child);
if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&child->thread.TS_FPR(fpidx), &data,
- sizeof(long));
+ if (IS_ENABLED(CONFIG_PPC32)) {
+ // On 32-bit the index we are passed refers to 32-bit words
+ ((u32 *)child->thread.fp_state.fpr)[fpidx] = data;
+ } else {
+ memcpy(&child->thread.TS_FPR(fpidx), &data,
+ sizeof(long));
+ }
else
child->thread.fp_state.fpscr = data;
ret = 0;
--
2.35.3
commit 8e1278444446fc97778a5e5c99bca1ce0bbc5ec9 upstream.
The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.
To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.
The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.
The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.
The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.
Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.
To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.
Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.
On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields that can be overwritten,
including some pointers. As such it may be exploitable.
It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@k…
Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.
Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.
Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
Cc: stable(a)vger.kernel.org # v3.13+
Reported-by: Ariel Miculas <ariel.miculas(a)belden.com>
Tested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
---
arch/powerpc/kernel/ptrace.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 8c92febf5f44..63bfc5250b67 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -3014,8 +3014,13 @@ long arch_ptrace(struct task_struct *child, long request,
flush_fp_to_thread(child);
if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&tmp, &child->thread.TS_FPR(fpidx),
- sizeof(long));
+ if (IS_ENABLED(CONFIG_PPC32)) {
+ // On 32-bit the index we are passed refers to 32-bit words
+ tmp = ((u32 *)child->thread.fp_state.fpr)[fpidx];
+ } else {
+ memcpy(&tmp, &child->thread.TS_FPR(fpidx),
+ sizeof(long));
+ }
else
tmp = child->thread.fp_state.fpscr;
}
@@ -3047,8 +3052,13 @@ long arch_ptrace(struct task_struct *child, long request,
flush_fp_to_thread(child);
if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&child->thread.TS_FPR(fpidx), &data,
- sizeof(long));
+ if (IS_ENABLED(CONFIG_PPC32)) {
+ // On 32-bit the index we are passed refers to 32-bit words
+ ((u32 *)child->thread.fp_state.fpr)[fpidx] = data;
+ } else {
+ memcpy(&child->thread.TS_FPR(fpidx), &data,
+ sizeof(long));
+ }
else
child->thread.fp_state.fpscr = data;
ret = 0;
@@ -3398,4 +3408,7 @@ void __init pt_regs_check(void)
offsetof(struct user_pt_regs, result));
BUILD_BUG_ON(sizeof(struct user_pt_regs) > sizeof(struct pt_regs));
+
+ // ptrace_get/put_fpr() rely on PPC32 and VSX being incompatible
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_VSX));
}
--
2.35.3
commit 8e1278444446fc97778a5e5c99bca1ce0bbc5ec9 upstream.
The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.
To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.
The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.
The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.
The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.
Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.
To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.
Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.
On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields that can be overwritten,
including some pointers. As such it may be exploitable.
It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@k…
Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.
Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.
Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
Cc: stable(a)vger.kernel.org # v3.13+
Reported-by: Ariel Miculas <ariel.miculas(a)belden.com>
Tested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
---
arch/powerpc/kernel/ptrace/ptrace.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index f6e51be47c6e..9ea9ee513ae1 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -75,8 +75,13 @@ long arch_ptrace(struct task_struct *child, long request,
flush_fp_to_thread(child);
if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&tmp, &child->thread.TS_FPR(fpidx),
- sizeof(long));
+ if (IS_ENABLED(CONFIG_PPC32)) {
+ // On 32-bit the index we are passed refers to 32-bit words
+ tmp = ((u32 *)child->thread.fp_state.fpr)[fpidx];
+ } else {
+ memcpy(&tmp, &child->thread.TS_FPR(fpidx),
+ sizeof(long));
+ }
else
tmp = child->thread.fp_state.fpscr;
}
@@ -108,8 +113,13 @@ long arch_ptrace(struct task_struct *child, long request,
flush_fp_to_thread(child);
if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&child->thread.TS_FPR(fpidx), &data,
- sizeof(long));
+ if (IS_ENABLED(CONFIG_PPC32)) {
+ // On 32-bit the index we are passed refers to 32-bit words
+ ((u32 *)child->thread.fp_state.fpr)[fpidx] = data;
+ } else {
+ memcpy(&child->thread.TS_FPR(fpidx), &data,
+ sizeof(long));
+ }
else
child->thread.fp_state.fpscr = data;
ret = 0;
@@ -478,4 +488,7 @@ void __init pt_regs_check(void)
* real registers.
*/
BUILD_BUG_ON(PT_DSCR < sizeof(struct user_pt_regs) / sizeof(unsigned long));
+
+ // ptrace_get/put_fpr() rely on PPC32 and VSX being incompatible
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_VSX));
}
--
2.35.3
commit 8e1278444446fc97778a5e5c99bca1ce0bbc5ec9 upstream.
The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.
To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.
The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.
The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.
The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.
Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.
To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.
Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.
On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields that can be overwritten,
including some pointers. As such it may be exploitable.
It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@k…
Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.
Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.
Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
Cc: stable(a)vger.kernel.org # v3.13+
Reported-by: Ariel Miculas <ariel.miculas(a)belden.com>
Tested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
---
arch/powerpc/kernel/ptrace/ptrace-fpu.c | 20 ++++++++++++++------
arch/powerpc/kernel/ptrace/ptrace.c | 3 +++
2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/ptrace/ptrace-fpu.c b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
index 5dca19361316..09c49632bfe5 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
@@ -17,9 +17,13 @@ int ptrace_get_fpr(struct task_struct *child, int index, unsigned long *data)
#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
- if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
- else
+ if (fpidx < (PT_FPSCR - PT_FPR0)) {
+ if (IS_ENABLED(CONFIG_PPC32))
+ // On 32-bit the index we are passed refers to 32-bit words
+ *data = ((u32 *)child->thread.fp_state.fpr)[fpidx];
+ else
+ memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
+ } else
*data = child->thread.fp_state.fpscr;
#else
*data = 0;
@@ -39,9 +43,13 @@ int ptrace_put_fpr(struct task_struct *child, int index, unsigned long data)
#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
- if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
- else
+ if (fpidx < (PT_FPSCR - PT_FPR0)) {
+ if (IS_ENABLED(CONFIG_PPC32))
+ // On 32-bit the index we are passed refers to 32-bit words
+ ((u32 *)child->thread.fp_state.fpr)[fpidx] = data;
+ else
+ memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
+ } else
child->thread.fp_state.fpscr = data;
#endif
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index 7c7093c17c45..ff5e46dbf7c5 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -446,4 +446,7 @@ void __init pt_regs_check(void)
* real registers.
*/
BUILD_BUG_ON(PT_DSCR < sizeof(struct user_pt_regs) / sizeof(unsigned long));
+
+ // ptrace_get/put_fpr() rely on PPC32 and VSX being incompatible
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_VSX));
}
--
2.35.3
commit 8e1278444446fc97778a5e5c99bca1ce0bbc5ec9 upstream.
The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.
To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.
The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.
The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.
The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.
Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.
To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.
Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.
On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields that can be overwritten,
including some pointers. As such it may be exploitable.
It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@k…
Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.
Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.
Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
Cc: stable(a)vger.kernel.org # v3.13+
Reported-by: Ariel Miculas <ariel.miculas(a)belden.com>
Tested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
---
arch/powerpc/kernel/ptrace/ptrace-fpu.c | 20 ++++++++++++++------
arch/powerpc/kernel/ptrace/ptrace.c | 3 +++
2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/ptrace/ptrace-fpu.c b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
index 5dca19361316..09c49632bfe5 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
@@ -17,9 +17,13 @@ int ptrace_get_fpr(struct task_struct *child, int index, unsigned long *data)
#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
- if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
- else
+ if (fpidx < (PT_FPSCR - PT_FPR0)) {
+ if (IS_ENABLED(CONFIG_PPC32))
+ // On 32-bit the index we are passed refers to 32-bit words
+ *data = ((u32 *)child->thread.fp_state.fpr)[fpidx];
+ else
+ memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
+ } else
*data = child->thread.fp_state.fpscr;
#else
*data = 0;
@@ -39,9 +43,13 @@ int ptrace_put_fpr(struct task_struct *child, int index, unsigned long data)
#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
- if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
- else
+ if (fpidx < (PT_FPSCR - PT_FPR0)) {
+ if (IS_ENABLED(CONFIG_PPC32))
+ // On 32-bit the index we are passed refers to 32-bit words
+ ((u32 *)child->thread.fp_state.fpr)[fpidx] = data;
+ else
+ memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
+ } else
child->thread.fp_state.fpscr = data;
#endif
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index c43f77e2ac31..6d45fa288015 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -445,4 +445,7 @@ void __init pt_regs_check(void)
* real registers.
*/
BUILD_BUG_ON(PT_DSCR < sizeof(struct user_pt_regs) / sizeof(unsigned long));
+
+ // ptrace_get/put_fpr() rely on PPC32 and VSX being incompatible
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_VSX));
}
--
2.35.3
commit 8e1278444446fc97778a5e5c99bca1ce0bbc5ec9 upstream.
The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.
To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.
The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.
The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.
The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.
Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.
To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.
Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.
On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields that can be overwritten,
including some pointers. As such it may be exploitable.
It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@k…
Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.
Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.
Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
Cc: stable(a)vger.kernel.org # v3.13+
Reported-by: Ariel Miculas <ariel.miculas(a)belden.com>
Tested-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
---
arch/powerpc/kernel/ptrace/ptrace-fpu.c | 20 ++++++++++++++------
arch/powerpc/kernel/ptrace/ptrace.c | 3 +++
2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/ptrace/ptrace-fpu.c b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
index 5dca19361316..09c49632bfe5 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
@@ -17,9 +17,13 @@ int ptrace_get_fpr(struct task_struct *child, int index, unsigned long *data)
#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
- if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
- else
+ if (fpidx < (PT_FPSCR - PT_FPR0)) {
+ if (IS_ENABLED(CONFIG_PPC32))
+ // On 32-bit the index we are passed refers to 32-bit words
+ *data = ((u32 *)child->thread.fp_state.fpr)[fpidx];
+ else
+ memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
+ } else
*data = child->thread.fp_state.fpscr;
#else
*data = 0;
@@ -39,9 +43,13 @@ int ptrace_put_fpr(struct task_struct *child, int index, unsigned long data)
#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
- if (fpidx < (PT_FPSCR - PT_FPR0))
- memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
- else
+ if (fpidx < (PT_FPSCR - PT_FPR0)) {
+ if (IS_ENABLED(CONFIG_PPC32))
+ // On 32-bit the index we are passed refers to 32-bit words
+ ((u32 *)child->thread.fp_state.fpr)[fpidx] = data;
+ else
+ memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
+ } else
child->thread.fp_state.fpscr = data;
#endif
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index 6d5026a9db4f..eab6fa59d9d2 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -450,4 +450,7 @@ void __init pt_regs_check(void)
#else
BUILD_BUG_ON(IS_ENABLED(CONFIG_HAVE_FUNCTION_DESCRIPTORS));
#endif
+
+ // ptrace_get/put_fpr() rely on PPC32 and VSX being incompatible
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_VSX));
}
--
2.35.3
When fallocate() is used on a shmem file, the pages we allocate can end
up with !PageUptodate.
Since UFFDIO_CONTINUE tries to find the existing page the user wants to
map with SGP_READ, we would fail to find such a page, since
shmem_getpage_gfp returns with a "NULL" pagep for SGP_READ if it
discovers !PageUptodate. As a result, UFFDIO_CONTINUE returns -EFAULT,
as it would do if the page wasn't found in the page cache at all.
This isn't the intended behavior. UFFDIO_CONTINUE is just trying to find
if a page exists, and doesn't care whether it still needs to be cleared
or not. So, instead of SGP_READ, pass in SGP_NOALLOC. This is the same,
except for one critical difference: in the !PageUptodate case,
SGP_NOALLOC will clear the page and then return it. With this change,
UFFDIO_CONTINUE works properly (succeeds) on a shmem file which has been
fallocated, but otherwise not modified.
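For reference, a minimal user-space sketch of the ioctl whose behaviour
changes here (illustrative only; uffd, addr and len are assumed to come
from an existing userfaultfd minor-fault registration):

#include <err.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

static void continue_range(int uffd, void *addr, size_t len)
{
        struct uffdio_continue cont = {
                .range = { .start = (unsigned long)addr, .len = len },
                .mode  = 0,
        };

        /* Before this fix, this failed with EFAULT on a freshly
         * fallocated (never written) shmem range. */
        if (ioctl(uffd, UFFDIO_CONTINUE, &cont) == -1)
                err(1, "UFFDIO_CONTINUE");
}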
Fixes: 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem")
Cc: stable(a)vger.kernel.org
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
mm/userfaultfd.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 4f4892a5f767..07d3befc80e4 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -246,7 +246,10 @@ static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
struct page *page;
int ret;
- ret = shmem_getpage(inode, pgoff, &page, SGP_READ);
+ ret = shmem_getpage(inode, pgoff, &page, SGP_NOALLOC);
+ /* Our caller expects us to return -EFAULT if we failed to find page. */
+ if (ret == -ENOENT)
+ ret = -EFAULT;
if (ret)
goto out;
if (!page) {
--
2.36.1.476.g0c4daa206d-goog
When fallocate() is used on a shmem file, the pages we allocate can end
up with !PageUptodate.
Since UFFDIO_CONTINUE tries to find the existing page the user wants to
map with SGP_READ, we would fail to find such a page, since
shmem_getpage_gfp returns with a "NULL" pagep for SGP_READ if it
discovers !PageUptodate. As a result, UFFDIO_CONTINUE returns -EFAULT,
as it would do if the page wasn't found in the page cache at all.
This isn't the intended behavior. UFFDIO_CONTINUE is just trying to find
if a page exists, and doesn't care whether it still needs to be cleared
or not. So, instead of SGP_READ, pass in SGP_NOALLOC. This is the same,
except for one critical difference: in the !PageUptodate case,
SGP_NOALLOC will clear the page and then return it. With this change,
UFFDIO_CONTINUE works properly (succeeds) on a shmem file which has been
fallocated, but otherwise not modified.
Fixes: 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem")
Cc: stable(a)vger.kernel.org
Signed-off-by: Axel Rasmussen <axelrasmussen(a)google.com>
---
mm/userfaultfd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 4f4892a5f767..c156f7f5b854 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -246,7 +246,7 @@ static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
struct page *page;
int ret;
- ret = shmem_getpage(inode, pgoff, &page, SGP_READ);
+ ret = shmem_getpage(inode, pgoff, &page, SGP_NOALLOC);
if (ret)
goto out;
if (!page) {
--
2.36.1.255.ge46751e96f-goog
This is a preparation patch for the S29GL064N buffer writes fix. There
is no functional change.
Link: https://lore.kernel.org/r/b687c259-6413-26c9-d4c9-b3afa69ea124@pengutronix.…
Fixes: dfeae1073583 ("mtd: cfi_cmdset_0002: Change write buffer to check correct value")
Signed-off-by: Tokunori Ikegami <ikegami.t(a)gmail.com>
Cc: stable(a)vger.kernel.org
Acked-by: Vignesh Raghavendra <vigneshr(a)ti.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220323170458.5608-2-ikegami.t@gmail.com
---
drivers/mtd/chips/cfi_cmdset_0002.c | 77 ++++++++++++-----------------
1 file changed, 32 insertions(+), 45 deletions(-)
diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index 870d1f1331b1..a8e1a961e844 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -730,50 +730,34 @@ static struct mtd_info *cfi_amdstd_setup(struct mtd_info *mtd)
}
/*
- * Return true if the chip is ready.
+ * Return true if the chip is ready and has the correct value.
*
* Ready is one of: read mode, query mode, erase-suspend-read mode (in any
* non-suspended sector) and is indicated by no toggle bits toggling.
*
+ * Error are indicated by toggling bits or bits held with the wrong value,
+ * or with bits toggling.
+ *
* Note that anything more complicated than checking if no bits are toggling
* (including checking DQ5 for an error status) is tricky to get working
* correctly and is therefore not done (particularly with interleaved chips
* as each chip must be checked independently of the others).
*/
-static int __xipram chip_ready(struct map_info *map, unsigned long addr)
+static int __xipram chip_ready(struct map_info *map, unsigned long addr,
+ map_word *expected)
{
map_word d, t;
+ int ret;
d = map_read(map, addr);
t = map_read(map, addr);
- return map_word_equal(map, d, t);
-}
+ ret = map_word_equal(map, d, t);
-/*
- * Return true if the chip is ready and has the correct value.
- *
- * Ready is one of: read mode, query mode, erase-suspend-read mode (in any
- * non-suspended sector) and it is indicated by no bits toggling.
- *
- * Error are indicated by toggling bits or bits held with the wrong value,
- * or with bits toggling.
- *
- * Note that anything more complicated than checking if no bits are toggling
- * (including checking DQ5 for an error status) is tricky to get working
- * correctly and is therefore not done (particularly with interleaved chips
- * as each chip must be checked independently of the others).
- *
- */
-static int __xipram chip_good(struct map_info *map, unsigned long addr, map_word expected)
-{
- map_word oldd, curd;
-
- oldd = map_read(map, addr);
- curd = map_read(map, addr);
+ if (!ret || !expected)
+ return ret;
- return map_word_equal(map, oldd, curd) &&
- map_word_equal(map, curd, expected);
+ return map_word_equal(map, t, *expected);
}
static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr, int mode)
@@ -790,7 +774,7 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
case FL_STATUS:
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
@@ -828,7 +812,7 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
chip->state = FL_ERASE_SUSPENDING;
chip->erase_suspended = 1;
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
@@ -1361,7 +1345,7 @@ static int do_otp_lock(struct map_info *map, struct flchip *chip, loff_t adr,
/* wait for chip to become ready */
timeo = jiffies + msecs_to_jiffies(2);
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
@@ -1628,10 +1612,11 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip,
}
/*
- * We check "time_after" and "!chip_good" before checking
- * "chip_good" to avoid the failure due to scheduling.
+ * We check "time_after" and "!chip_ready" before checking
+ * "chip_ready" to avoid the failure due to scheduling.
*/
- if (time_after(jiffies, timeo) && !chip_good(map, adr, datum)) {
+ if (time_after(jiffies, timeo) &&
+ !chip_ready(map, adr, &datum)) {
xip_enable(map, chip, adr);
printk(KERN_WARNING "MTD %s(): software timeout\n", __func__);
xip_disable(map, chip, adr);
@@ -1639,7 +1624,7 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip,
break;
}
- if (chip_good(map, adr, datum))
+ if (chip_ready(map, adr, &datum))
break;
/* Latency issues. Drop the lock, wait a while and retry */
@@ -1883,13 +1868,13 @@ static int __xipram do_write_buffer(struct map_info *map, struct flchip *chip,
}
/*
- * We check "time_after" and "!chip_good" before checking "chip_good" to avoid
- * the failure due to scheduling.
+ * We check "time_after" and "!chip_ready" before checking
+ * "chip_ready" to avoid the failure due to scheduling.
*/
- if (time_after(jiffies, timeo) && !chip_good(map, adr, datum))
+ if (time_after(jiffies, timeo) && !chip_ready(map, adr, &datum))
break;
- if (chip_good(map, adr, datum)) {
+ if (chip_ready(map, adr, &datum)) {
xip_enable(map, chip, adr);
goto op_done;
}
@@ -2023,7 +2008,7 @@ static int cfi_amdstd_panic_wait(struct map_info *map, struct flchip *chip,
* If the driver thinks the chip is idle, and no toggle bits
* are changing, then the chip is actually idle for sure.
*/
- if (chip->state == FL_READY && chip_ready(map, adr))
+ if (chip->state == FL_READY && chip_ready(map, adr, NULL))
return 0;
/*
@@ -2040,7 +2025,7 @@ static int cfi_amdstd_panic_wait(struct map_info *map, struct flchip *chip,
/* wait for the chip to become ready */
for (i = 0; i < jiffies_to_usecs(timeo); i++) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
return 0;
udelay(1);
@@ -2104,13 +2089,13 @@ static int do_panic_write_oneword(struct map_info *map, struct flchip *chip,
map_write(map, datum, adr);
for (i = 0; i < jiffies_to_usecs(uWriteTimeout); i++) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
udelay(1);
}
- if (!chip_good(map, adr, datum)) {
+ if (!chip_ready(map, adr, &datum)) {
/* reset on all failures. */
map_write(map, CMD(0xF0), chip->start);
/* FIXME - should have reset delay before continuing */
@@ -2251,6 +2236,7 @@ static int __xipram do_erase_chip(struct map_info *map, struct flchip *chip)
DECLARE_WAITQUEUE(wait, current);
int ret = 0;
int retry_cnt = 0;
+ map_word datum = map_word_ff(map);
adr = cfi->addr_unlock1;
@@ -2305,7 +2291,7 @@ static int __xipram do_erase_chip(struct map_info *map, struct flchip *chip)
chip->erase_suspended = 0;
}
- if (chip_good(map, adr, map_word_ff(map)))
+ if (chip_ready(map, adr, &datum))
break;
if (time_after(jiffies, timeo)) {
@@ -2347,6 +2333,7 @@ static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip,
DECLARE_WAITQUEUE(wait, current);
int ret = 0;
int retry_cnt = 0;
+ map_word datum = map_word_ff(map);
adr += chip->start;
@@ -2401,7 +2388,7 @@ static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip,
chip->erase_suspended = 0;
}
- if (chip_good(map, adr, map_word_ff(map))) {
+ if (chip_ready(map, adr, &datum)) {
xip_enable(map, chip, adr);
break;
}
@@ -2616,7 +2603,7 @@ static int __maybe_unused do_ppb_xxlock(struct map_info *map,
*/
timeo = jiffies + msecs_to_jiffies(2000); /* 2s max (un)locking */
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
--
2.34.1
Invalidating the buffer memory in arch_sync_dma_for_device() for
FROM_DEVICE transfers
When using the streaming DMA API to map a buffer prior to inbound
non-coherent DMA (i.e. DMA_FROM_DEVICE), we invalidate any dirty CPU
cachelines so that they will not be written back during the transfer and
corrupt the buffer contents written by the DMA. This, however, poses two
potential problems:
(1) If the DMA transfer does not write to every byte in the buffer,
then the unwritten bytes will contain stale data once the transfer
has completed.
(2) If the buffer has a virtual alias in userspace, then stale data
may be visible via this alias during the period between performing
the cache invalidation and the DMA writes landing in memory.
Address both of these issues by cleaning (aka writing-back) the dirty
lines in arch_sync_dma_for_device(DMA_FROM_DEVICE) instead of discarding
them using invalidation.
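For context, a hedged driver-side sketch of the streaming-DMA pattern this
affects (dev, buf and len are assumed to belong to an existing driver;
arch_sync_dma_for_device() runs under the hood of the mapping call on
non-coherent arm64 systems):

/* Map for inbound DMA: with this change the CPU cache lines covering
 * buf are cleaned rather than invalidated at this point. */
dma_addr_t handle = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
if (dma_mapping_error(dev, handle))
        return -ENOMEM;

/* ... the device writes into the buffer via 'handle' ... */

/* Unmap for CPU access: the CPU-side invalidation happens here, after
 * the device's writes have landed in memory. */
dma_unmap_single(dev, handle, len, DMA_FROM_DEVICE);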
Cc: Ard Biesheuvel <ardb(a)kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Robin Murphy <robin.murphy(a)arm.com>
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20220606152150.GA31568@willie-the-truck
Signed-off-by: Will Deacon <will(a)kernel.org>
---
arch/arm64/mm/cache.S | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 0ea6cc25dc66..21c907987080 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -218,8 +218,6 @@ SYM_FUNC_ALIAS(__dma_flush_area, __pi___dma_flush_area)
*/
SYM_FUNC_START(__pi___dma_map_area)
add x1, x0, x1
- cmp w2, #DMA_FROM_DEVICE
- b.eq __pi_dcache_inval_poc
b __pi_dcache_clean_poc
SYM_FUNC_END(__pi___dma_map_area)
SYM_FUNC_ALIAS(__dma_map_area, __pi___dma_map_area)
--
2.36.1.476.g0c4daa206d-goog
gbaudio_dapm_free_controls() iterates over widgets using the
list_for_each_entry*() family of macros from <linux/list.h>, which
leaves the loop cursor pointing to a meaningless structure if it
completes a traversal of the list. The cursor was set to NULL at the end
of the loop body, but would be overwritten by the final loop cursor
update.
Because of this behavior, the widget could be non-null after the loop
even if the widget wasn't found, and the cleanup logic would treat the
pointer as a valid widget to free.
To fix this, introduce a temporary variable to act as the loop cursor
and copy it to a variable that can be accessed after the loop finishes.
Since no list elements are removed, use list_for_each_entry() instead
of list_for_each_entry_safe() in the revised loop.
This was detected with the help of Coccinelle.
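As a generic illustration of the pattern adopted below (the item type and
helper are made up for this example), the loop cursor is copied into a
separate pointer that is only set when a match is found:

struct item {
        int id;
        struct list_head list;
};

static struct item *find_item(struct list_head *head, int id)
{
        struct item *tmp, *found = NULL;

        list_for_each_entry(tmp, head, list) {
                if (tmp->id == id) {
                        found = tmp;
                        break;
                }
        }

        /* If the loop ran to completion, 'tmp' points at the list head
         * container rather than a real item; only 'found' is safe to use. */
        return found;
}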
Fixes: 510e340efe0c ("staging: greybus: audio: Add helper APIs for dynamic audio modules")
Cc: stable(a)vger.kernel.org
Reviewed-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reviewed-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Jared Kangas <kangas.jd(a)gmail.com>
---
Changes since v1:
* Removed safe list iteration as suggested by Johan Hovold <johan(a)kernel.org>
* Updated patch changelog to explain the list iteration change
* Added tags to changelog based on feedback (Cc:, Fixes:, Reviewed-by:)
drivers/staging/greybus/audio_helper.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/staging/greybus/audio_helper.c b/drivers/staging/greybus/audio_helper.c
index 843760675876..05e91e6bc2a0 100644
--- a/drivers/staging/greybus/audio_helper.c
+++ b/drivers/staging/greybus/audio_helper.c
@@ -115,7 +115,7 @@ int gbaudio_dapm_free_controls(struct snd_soc_dapm_context *dapm,
int num)
{
int i;
- struct snd_soc_dapm_widget *w, *next_w;
+ struct snd_soc_dapm_widget *w, *tmp_w;
#ifdef CONFIG_DEBUG_FS
struct dentry *parent = dapm->debugfs_dapm;
struct dentry *debugfs_w = NULL;
@@ -124,13 +124,13 @@ int gbaudio_dapm_free_controls(struct snd_soc_dapm_context *dapm,
mutex_lock(&dapm->card->dapm_mutex);
for (i = 0; i < num; i++) {
/* below logic can be optimized to identify widget pointer */
- list_for_each_entry_safe(w, next_w, &dapm->card->widgets,
- list) {
- if (w->dapm != dapm)
- continue;
- if (!strcmp(w->name, widget->name))
+ w = NULL;
+ list_for_each_entry(tmp_w, &dapm->card->widgets, list) {
+ if (tmp_w->dapm == dapm &&
+ !strcmp(tmp_w->name, widget->name)) {
+ w = tmp_w;
break;
- w = NULL;
+ }
}
if (!w) {
dev_err(dapm->dev, "%s: widget not found\n",
--
2.34.3
This is a preparation patch for the S29GL064N buffer writes fix. There
is no functional change.
Link: https://lore.kernel.org/r/b687c259-6413-26c9-d4c9-b3afa69ea124@pengutronix.…
Fixes: dfeae1073583 ("mtd: cfi_cmdset_0002: Change write buffer to check correct value")
Signed-off-by: Tokunori Ikegami <ikegami.t(a)gmail.com>
Cc: stable(a)vger.kernel.org
Acked-by: Vignesh Raghavendra <vigneshr(a)ti.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220323170458.5608-2-ikegami.t@gmail.com
---
drivers/mtd/chips/cfi_cmdset_0002.c | 77 ++++++++++++-----------------
1 file changed, 32 insertions(+), 45 deletions(-)
diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index 3ab75d3e2ce3..51442a189701 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -731,50 +731,34 @@ static struct mtd_info *cfi_amdstd_setup(struct mtd_info *mtd)
}
/*
- * Return true if the chip is ready.
+ * Return true if the chip is ready and has the correct value.
*
* Ready is one of: read mode, query mode, erase-suspend-read mode (in any
* non-suspended sector) and is indicated by no toggle bits toggling.
*
+ * Error are indicated by toggling bits or bits held with the wrong value,
+ * or with bits toggling.
+ *
* Note that anything more complicated than checking if no bits are toggling
* (including checking DQ5 for an error status) is tricky to get working
* correctly and is therefore not done (particularly with interleaved chips
* as each chip must be checked independently of the others).
*/
-static int __xipram chip_ready(struct map_info *map, unsigned long addr)
+static int __xipram chip_ready(struct map_info *map, unsigned long addr,
+ map_word *expected)
{
map_word d, t;
+ int ret;
d = map_read(map, addr);
t = map_read(map, addr);
- return map_word_equal(map, d, t);
-}
+ ret = map_word_equal(map, d, t);
-/*
- * Return true if the chip is ready and has the correct value.
- *
- * Ready is one of: read mode, query mode, erase-suspend-read mode (in any
- * non-suspended sector) and it is indicated by no bits toggling.
- *
- * Error are indicated by toggling bits or bits held with the wrong value,
- * or with bits toggling.
- *
- * Note that anything more complicated than checking if no bits are toggling
- * (including checking DQ5 for an error status) is tricky to get working
- * correctly and is therefore not done (particularly with interleaved chips
- * as each chip must be checked independently of the others).
- *
- */
-static int __xipram chip_good(struct map_info *map, unsigned long addr, map_word expected)
-{
- map_word oldd, curd;
-
- oldd = map_read(map, addr);
- curd = map_read(map, addr);
+ if (!ret || !expected)
+ return ret;
- return map_word_equal(map, oldd, curd) &&
- map_word_equal(map, curd, expected);
+ return map_word_equal(map, t, *expected);
}
static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr, int mode)
@@ -791,7 +775,7 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
case FL_STATUS:
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
@@ -829,7 +813,7 @@ static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr
chip->state = FL_ERASE_SUSPENDING;
chip->erase_suspended = 1;
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
@@ -1360,7 +1344,7 @@ static int do_otp_lock(struct map_info *map, struct flchip *chip, loff_t adr,
/* wait for chip to become ready */
timeo = jiffies + msecs_to_jiffies(2);
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
@@ -1627,10 +1611,11 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip,
}
/*
- * We check "time_after" and "!chip_good" before checking
- * "chip_good" to avoid the failure due to scheduling.
+ * We check "time_after" and "!chip_ready" before checking
+ * "chip_ready" to avoid the failure due to scheduling.
*/
- if (time_after(jiffies, timeo) && !chip_good(map, adr, datum)) {
+ if (time_after(jiffies, timeo) &&
+ !chip_ready(map, adr, &datum)) {
xip_enable(map, chip, adr);
printk(KERN_WARNING "MTD %s(): software timeout\n", __func__);
xip_disable(map, chip, adr);
@@ -1638,7 +1623,7 @@ static int __xipram do_write_oneword(struct map_info *map, struct flchip *chip,
break;
}
- if (chip_good(map, adr, datum))
+ if (chip_ready(map, adr, &datum))
break;
/* Latency issues. Drop the lock, wait a while and retry */
@@ -1882,13 +1867,13 @@ static int __xipram do_write_buffer(struct map_info *map, struct flchip *chip,
}
/*
- * We check "time_after" and "!chip_good" before checking "chip_good" to avoid
- * the failure due to scheduling.
+ * We check "time_after" and "!chip_ready" before checking
+ * "chip_ready" to avoid the failure due to scheduling.
*/
- if (time_after(jiffies, timeo) && !chip_good(map, adr, datum))
+ if (time_after(jiffies, timeo) && !chip_ready(map, adr, &datum))
break;
- if (chip_good(map, adr, datum)) {
+ if (chip_ready(map, adr, &datum)) {
xip_enable(map, chip, adr);
goto op_done;
}
@@ -2022,7 +2007,7 @@ static int cfi_amdstd_panic_wait(struct map_info *map, struct flchip *chip,
* If the driver thinks the chip is idle, and no toggle bits
* are changing, then the chip is actually idle for sure.
*/
- if (chip->state == FL_READY && chip_ready(map, adr))
+ if (chip->state == FL_READY && chip_ready(map, adr, NULL))
return 0;
/*
@@ -2039,7 +2024,7 @@ static int cfi_amdstd_panic_wait(struct map_info *map, struct flchip *chip,
/* wait for the chip to become ready */
for (i = 0; i < jiffies_to_usecs(timeo); i++) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
return 0;
udelay(1);
@@ -2103,13 +2088,13 @@ static int do_panic_write_oneword(struct map_info *map, struct flchip *chip,
map_write(map, datum, adr);
for (i = 0; i < jiffies_to_usecs(uWriteTimeout); i++) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
udelay(1);
}
- if (!chip_good(map, adr, datum)) {
+ if (!chip_ready(map, adr, &datum)) {
/* reset on all failures. */
map_write(map, CMD(0xF0), chip->start);
/* FIXME - should have reset delay before continuing */
@@ -2250,6 +2235,7 @@ static int __xipram do_erase_chip(struct map_info *map, struct flchip *chip)
DECLARE_WAITQUEUE(wait, current);
int ret = 0;
int retry_cnt = 0;
+ map_word datum = map_word_ff(map);
adr = cfi->addr_unlock1;
@@ -2304,7 +2290,7 @@ static int __xipram do_erase_chip(struct map_info *map, struct flchip *chip)
chip->erase_suspended = 0;
}
- if (chip_good(map, adr, map_word_ff(map)))
+ if (chip_ready(map, adr, &datum))
break;
if (time_after(jiffies, timeo)) {
@@ -2346,6 +2332,7 @@ static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip,
DECLARE_WAITQUEUE(wait, current);
int ret = 0;
int retry_cnt = 0;
+ map_word datum = map_word_ff(map);
adr += chip->start;
@@ -2400,7 +2387,7 @@ static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip,
chip->erase_suspended = 0;
}
- if (chip_good(map, adr, map_word_ff(map)))
+ if (chip_ready(map, adr, &datum))
break;
if (time_after(jiffies, timeo)) {
@@ -2593,7 +2580,7 @@ static int __maybe_unused do_ppb_xxlock(struct map_info *map,
*/
timeo = jiffies + msecs_to_jiffies(2000); /* 2s max (un)locking */
for (;;) {
- if (chip_ready(map, adr))
+ if (chip_ready(map, adr, NULL))
break;
if (time_after(jiffies, timeo)) {
--
2.34.1
This is a note to let you know that I've just added the patch titled
mei: hbm: drop capability response on early shutdown
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 68553650bc9c57c7e530c84e5b2945e9dfe1a560 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Mon, 6 Jun 2022 17:42:24 +0300
Subject: mei: hbm: drop capability response on early shutdown
Drop HBM responses also in the early shutdown phase where
the usual traffic is allowed.
Extend the rule that drops HBM responses received during the shutdown
phase to also cover the MEI_DEV_POWERING_DOWN state.
This resolves a stall when the driver is stopped in the middle
of link initialization or a link reset.
Drop the capabilities response on early shutdown.
Fixes: 6d7163f2c49f ("mei: hbm: drop hbm responses on early shutdown")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20220606144225.282375-2-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/misc/mei/hbm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/mei/hbm.c b/drivers/misc/mei/hbm.c
index cebcca6d6d3e..cf2b8261da14 100644
--- a/drivers/misc/mei/hbm.c
+++ b/drivers/misc/mei/hbm.c
@@ -1351,7 +1351,8 @@ int mei_hbm_dispatch(struct mei_device *dev, struct mei_msg_hdr *hdr)
if (dev->dev_state != MEI_DEV_INIT_CLIENTS ||
dev->hbm_state != MEI_HBM_CAP_SETUP) {
- if (dev->dev_state == MEI_DEV_POWER_DOWN) {
+ if (dev->dev_state == MEI_DEV_POWER_DOWN ||
+ dev->dev_state == MEI_DEV_POWERING_DOWN) {
dev_dbg(dev->dev, "hbm: capabilities response: on shutdown, ignoring\n");
return 0;
}
--
2.36.1
This is a note to let you know that I've just added the patch titled
mei: me: add raptor lake point S DID
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 3ed8c7d39cfef831fe508fc1308f146912fa72e6 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Mon, 6 Jun 2022 17:42:25 +0300
Subject: mei: me: add raptor lake point S DID
Add Raptor (Point) Lake S device id.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20220606144225.282375-3-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/misc/mei/hw-me-regs.h | 2 ++
drivers/misc/mei/pci-me.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/drivers/misc/mei/hw-me-regs.h b/drivers/misc/mei/hw-me-regs.h
index 64ce3f830262..15e8e2b322b1 100644
--- a/drivers/misc/mei/hw-me-regs.h
+++ b/drivers/misc/mei/hw-me-regs.h
@@ -109,6 +109,8 @@
#define MEI_DEV_ID_ADP_P 0x51E0 /* Alder Lake Point P */
#define MEI_DEV_ID_ADP_N 0x54E0 /* Alder Lake Point N */
+#define MEI_DEV_ID_RPL_S 0x7A68 /* Raptor Lake Point S */
+
/*
* MEI HW Section
*/
diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
index 33e58821e478..5435604327a7 100644
--- a/drivers/misc/mei/pci-me.c
+++ b/drivers/misc/mei/pci-me.c
@@ -116,6 +116,8 @@ static const struct pci_device_id mei_me_pci_tbl[] = {
{MEI_PCI_DEVICE(MEI_DEV_ID_ADP_P, MEI_ME_PCH15_CFG)},
{MEI_PCI_DEVICE(MEI_DEV_ID_ADP_N, MEI_ME_PCH15_CFG)},
+ {MEI_PCI_DEVICE(MEI_DEV_ID_RPL_S, MEI_ME_PCH15_CFG)},
+
/* required last entry */
{0, }
};
--
2.36.1
Please pull into 4.9 LTS:
https://gitlab.arm.com/linux-arm/linux-power/-/commit/2c16db6c92b0ee4aa61e8…
"net: fix nla_strcmp to handle more then one trailing null character"
+++ lib/nlattr.c
@@ -379,7 +379,7 @@ int nla_strcmp(const struct nlattr *nla, const char *str)
- if (attrlen > 0 && buf[attrlen - 1] == '\0')
+ while (attrlen > 0 && buf[attrlen - 1] == '\0')
which appears to be present in 4.14.233 (and presumably newer LTS),
but not in 4.9:
$ git log --oneline -n1
remotes/linux-stable/v4.14.233..143722a05028ebb8691d349007f85656a4e90a8e
We've got some code that is confirmed failing due to the lack of this one-liner,
and the fix is obvious enough...
(assuming it applies, it presumably wouldn't hurt in 4.4 LTS either,
but I think that tree isn't even maintained, and I don't care about it
there)
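For reference, the post-fix trimming logic looks roughly like this (a
reconstruction of lib/nlattr.c based on the change quoted above, shown
only as an illustration):

int nla_strcmp(const struct nlattr *nla, const char *str)
{
        int len = strlen(str);
        const char *buf = nla_data(nla);
        int attrlen = nla_len(nla);
        int d;

        /* The 'while' (previously 'if') strips every trailing NUL,
         * not just the last one, before comparing lengths. */
        while (attrlen > 0 && buf[attrlen - 1] == '\0')
                attrlen--;

        d = attrlen - len;
        if (d == 0)
                d = memcmp(nla_data(nla), str, len);

        return d;
}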
Thanks,
Maciej Żenczykowski, Kernel Networking Developer @ Google
This is a note to let you know that I've just added the patch titled
usb: gadget: f_fs: change ep->ep safe in ffs_epfile_io()
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 0698f0209d8032e8869525aeb68f65ee7fde12ad Mon Sep 17 00:00:00 2001
From: Linyu Yuan <quic_linyyuan(a)quicinc.com>
Date: Fri, 10 Jun 2022 20:17:58 +0800
Subject: usb: gadget: f_fs: change ep->ep safe in ffs_epfile_io()
In ffs_epfile_io(), when reading or writing data in blocking mode, the
function waits for the completion in interruptible mode. If the task
receives a signal it terminates the wait; if a function unbind happens at
the same time, ffs_func_unbind() will kfree all eps, yet ffs_epfile_io()
still tries to dequeue the request by dereferencing ep, which may have
become invalid.
Fix this by taking the eps spinlock and not dereferencing ep if it is no
longer valid.
Cc: <stable(a)vger.kernel.org> # 5.15
Reported-by: Michael Wu <michael(a)allwinnertech.com>
Tested-by: Michael Wu <michael(a)allwinnertech.com>
Reviewed-by: John Keeping <john(a)metanate.com>
Signed-off-by: Linyu Yuan <quic_linyyuan(a)quicinc.com>
Link: https://lore.kernel.org/r/1654863478-26228-3-git-send-email-quic_linyyuan@q…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/gadget/function/f_fs.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index e1fcd8bc80a1..e0fa4b186ec6 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -1080,6 +1080,11 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
spin_unlock_irq(&epfile->ffs->eps_lock);
if (wait_for_completion_interruptible(&io_data->done)) {
+ spin_lock_irq(&epfile->ffs->eps_lock);
+ if (epfile->ep != ep) {
+ ret = -ESHUTDOWN;
+ goto error_lock;
+ }
/*
* To avoid race condition with ffs_epfile_io_complete,
* dequeue the request first then check
@@ -1087,6 +1092,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
* condition with req->complete callback.
*/
usb_ep_dequeue(ep->ep, req);
+ spin_unlock_irq(&epfile->ffs->eps_lock);
wait_for_completion(&io_data->done);
interrupted = io_data->status < 0;
}
--
2.36.1
This is a note to let you know that I've just added the patch titled
usb: gadget: f_fs: change ep->status safe in ffs_epfile_io()
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From fb1f16d74e263baa4ad11e31e28b68f144aa55ed Mon Sep 17 00:00:00 2001
From: Linyu Yuan <quic_linyyuan(a)quicinc.com>
Date: Fri, 10 Jun 2022 20:17:57 +0800
Subject: usb: gadget: f_fs: change ep->status safe in ffs_epfile_io()
If a task reads/writes data in blocking mode, it waits for the completion
in ffs_epfile_io(). If a function unbind occurs, ffs_func_unbind() will
kfree the ffs ep; once the task wakes up, it still dereferences the ffs ep
to obtain the request status.
Fix it by moving the request status into io_data, which is stack-safe.
Cc: <stable(a)vger.kernel.org> # 5.15
Reported-by: Michael Wu <michael(a)allwinnertech.com>
Tested-by: Michael Wu <michael(a)allwinnertech.com>
Reviewed-by: John Keeping <john(a)metanate.com>
Signed-off-by: Linyu Yuan <quic_linyyuan(a)quicinc.com>
Link: https://lore.kernel.org/r/1654863478-26228-2-git-send-email-quic_linyyuan@q…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/gadget/function/f_fs.c | 34 +++++++++++++++++-------------
1 file changed, 19 insertions(+), 15 deletions(-)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 4585ee3a444a..e1fcd8bc80a1 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -122,8 +122,6 @@ struct ffs_ep {
struct usb_endpoint_descriptor *descs[3];
u8 num;
-
- int status; /* P: epfile->mutex */
};
struct ffs_epfile {
@@ -227,6 +225,9 @@ struct ffs_io_data {
bool use_sg;
struct ffs_data *ffs;
+
+ int status;
+ struct completion done;
};
struct ffs_desc_helper {
@@ -707,12 +708,15 @@ static const struct file_operations ffs_ep0_operations = {
static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
{
+ struct ffs_io_data *io_data = req->context;
+
ENTER();
- if (req->context) {
- struct ffs_ep *ep = _ep->driver_data;
- ep->status = req->status ? req->status : req->actual;
- complete(req->context);
- }
+ if (req->status)
+ io_data->status = req->status;
+ else
+ io_data->status = req->actual;
+
+ complete(&io_data->done);
}
static ssize_t ffs_copy_to_iter(void *data, int data_len, struct iov_iter *iter)
@@ -1050,7 +1054,6 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
WARN(1, "%s: data_len == -EINVAL\n", __func__);
ret = -EINVAL;
} else if (!io_data->aio) {
- DECLARE_COMPLETION_ONSTACK(done);
bool interrupted = false;
req = ep->req;
@@ -1066,7 +1069,8 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
io_data->buf = data;
- req->context = &done;
+ init_completion(&io_data->done);
+ req->context = io_data;
req->complete = ffs_epfile_io_complete;
ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
@@ -1075,7 +1079,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
spin_unlock_irq(&epfile->ffs->eps_lock);
- if (wait_for_completion_interruptible(&done)) {
+ if (wait_for_completion_interruptible(&io_data->done)) {
/*
* To avoid race condition with ffs_epfile_io_complete,
* dequeue the request first then check
@@ -1083,17 +1087,17 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
* condition with req->complete callback.
*/
usb_ep_dequeue(ep->ep, req);
- wait_for_completion(&done);
- interrupted = ep->status < 0;
+ wait_for_completion(&io_data->done);
+ interrupted = io_data->status < 0;
}
if (interrupted)
ret = -EINTR;
- else if (io_data->read && ep->status > 0)
- ret = __ffs_epfile_read_data(epfile, data, ep->status,
+ else if (io_data->read && io_data->status > 0)
+ ret = __ffs_epfile_read_data(epfile, data, io_data->status,
&io_data->data);
else
- ret = ep->status;
+ ret = io_data->status;
goto error_mutex;
} else if (!(req = usb_ep_alloc_request(ep->ep, GFP_ATOMIC))) {
ret = -ENOMEM;
--
2.36.1
commit fdf6a2f533115ec5d4d9629178f8196331f1ac50 upstream.
Fix a clock imbalance introduced by ed8cc3b1fc84 ("PCI: qcom: Add support
for SDM845 PCIe controller"), which enables the pipe clock both in init()
and in post_init() but only disables in post_deinit().
Note that the pipe clock was also never disabled in the init() error
paths and that enabling the clock before powering up the PHY looks
questionable.
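As a rough sketch (function and field names here are illustrative, not the
driver's exact ones), the end state keeps the pipe clock handling balanced
in the post_init()/post_deinit() pair only:

	/* hypothetical sketch of the balanced pairing after the fix */
	static int qcom_pcie_post_init_sketch(struct qcom_pcie_resources *res)
	{
		return clk_prepare_enable(res->pipe_clk);	/* enabled once */
	}

	static void qcom_pcie_post_deinit_sketch(struct qcom_pcie_resources *res)
	{
		clk_disable_unprepare(res->pipe_clk);		/* disabled once */
	}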
Link: https://lore.kernel.org/r/20220401133351.10113-1-johan+linaro@kernel.org
Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller")
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
Cc: stable(a)vger.kernel.org # 5.6
[ johan: adjust context for 5.17 ]
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
drivers/pci/controller/dwc/pcie-qcom.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index c19cd506ed3f..18d571f08cdc 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -1230,12 +1230,6 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
goto err_disable_clocks;
}
- ret = clk_prepare_enable(res->pipe_clk);
- if (ret) {
- dev_err(dev, "cannot prepare/enable pipe clock\n");
- goto err_disable_clocks;
- }
-
/* configure PCIe to RC mode */
writel(DEVICE_TYPE_RC, pcie->parf + PCIE20_PARF_DEVICE_TYPE);
--
2.35.1
On each vcpu load, we set the KVM_ARM64_HOST_SVE_ENABLED
flag if SVE is enabled for EL0 on the host. This is used to restore
the correct state on vcpu put.
However, it appears that nothing ever clears this flag. Once
set, it will stick until the vcpu is destroyed, which has the
potential to spuriously enable SVE for userspace.
We probably never saw the issue because no VMM uses SVE, but
that's still pretty bad. Unconditionally clearing the flag
on vcpu load addresses the issue.
Fixes: 8383741ab2e7 ("KVM: arm64: Get rid of host SVE tracking/saving")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Reviewed-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220528113829.1043361-2-maz@kernel.org
---
arch/arm64/kvm/fpsimd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 3d251a4d2cf7..8267ff4642d3 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -80,6 +80,7 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
vcpu->arch.flags |= KVM_ARM64_FP_HOST;
+ vcpu->arch.flags &= ~KVM_ARM64_HOST_SVE_ENABLED;
if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)
vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED;
--
2.34.1
This is used by code that doesn't need CONFIG_CRYPTO, so move this into
lib/ with a Kconfig option so that it can be selected by whatever needs
it.
This fixes a linker error Zheng pointed out when
CRYPTO_MANAGER_DISABLE_TESTS!=y and CRYPTO=m:
lib/crypto/curve25519-selftest.o: In function `curve25519_selftest':
curve25519-selftest.c:(.init.text+0x60): undefined reference to `__crypto_memneq'
curve25519-selftest.c:(.init.text+0xec): undefined reference to `__crypto_memneq'
curve25519-selftest.c:(.init.text+0x114): undefined reference to `__crypto_memneq'
curve25519-selftest.c:(.init.text+0x154): undefined reference to `__crypto_memneq'
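For context, a minimal usage sketch of the helper being moved (the
crypto/algapi.h header is an assumption about where the wrapper is declared
on this kernel version; the buffer names are made up):

	#include <crypto/algapi.h>	/* declares crypto_memneq()/__crypto_memneq */

	/* constant-time comparison: non-zero means the buffers differ */
	if (crypto_memneq(computed_mac, expected_mac, sizeof(expected_mac)))
		return -EBADMSG;	/* reject without leaking timing information */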
Reported-by: Zheng Bin <zhengbin13(a)huawei.com>
Cc: Eric Biggers <ebiggers(a)kernel.org>
Cc: stable(a)vger.kernel.org
Fixes: aa127963f1ca ("crypto: lib/curve25519 - re-add selftests")
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
I'm traveling over the next week, and there are a few ways to skin this
cat, so if somebody here sees an issue, feel free to pick this v1 up and
fashion a v2 out of it.
crypto/Kconfig | 1 +
crypto/Makefile | 2 +-
lib/Kconfig | 3 +++
lib/Makefile | 1 +
lib/crypto/Kconfig | 1 +
{crypto => lib}/memneq.c | 0
6 files changed, 7 insertions(+), 1 deletion(-)
rename {crypto => lib}/memneq.c (100%)
diff --git a/crypto/Kconfig b/crypto/Kconfig
index f567271ed10d..38601a072b99 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -15,6 +15,7 @@ source "crypto/async_tx/Kconfig"
#
menuconfig CRYPTO
tristate "Cryptographic API"
+ select LIB_MEMNEQ
help
This option provides the core Cryptographic API.
diff --git a/crypto/Makefile b/crypto/Makefile
index 40d4c2690a49..dbfa53567c92 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -4,7 +4,7 @@
#
obj-$(CONFIG_CRYPTO) += crypto.o
-crypto-y := api.o cipher.o compress.o memneq.o
+crypto-y := api.o cipher.o compress.o
obj-$(CONFIG_CRYPTO_ENGINE) += crypto_engine.o
obj-$(CONFIG_CRYPTO_FIPS) += fips.o
diff --git a/lib/Kconfig b/lib/Kconfig
index 6a843639814f..eaaad4d85bf2 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -120,6 +120,9 @@ config INDIRECT_IOMEM_FALLBACK
source "lib/crypto/Kconfig"
+config LIB_MEMNEQ
+ bool
+
config CRC_CCITT
tristate "CRC-CCITT functions"
help
diff --git a/lib/Makefile b/lib/Makefile
index 89fcae891361..f01023cda508 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -251,6 +251,7 @@ obj-$(CONFIG_DIMLIB) += dim/
obj-$(CONFIG_SIGNATURE) += digsig.o
lib-$(CONFIG_CLZ_TAB) += clz_tab.o
+lib-$(CONFIG_LIB_MEMNEQ) += memneq.o
obj-$(CONFIG_GENERIC_STRNCPY_FROM_USER) += strncpy_from_user.o
obj-$(CONFIG_GENERIC_STRNLEN_USER) += strnlen_user.o
diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig
index 7ee13c08c970..337d6852643a 100644
--- a/lib/crypto/Kconfig
+++ b/lib/crypto/Kconfig
@@ -71,6 +71,7 @@ config CRYPTO_LIB_CURVE25519
tristate "Curve25519 scalar multiplication library"
depends on CRYPTO_ARCH_HAVE_LIB_CURVE25519 || !CRYPTO_ARCH_HAVE_LIB_CURVE25519
select CRYPTO_LIB_CURVE25519_GENERIC if CRYPTO_ARCH_HAVE_LIB_CURVE25519=n
+ select LIB_MEMNEQ
help
Enable the Curve25519 library interface. This interface may be
fulfilled by either the generic implementation or an arch-specific
diff --git a/crypto/memneq.c b/lib/memneq.c
similarity index 100%
rename from crypto/memneq.c
rename to lib/memneq.c
--
2.35.1
Fix two possible issues in ffs_epfile_io() when operating in blocking mode.
v1: https://lore.kernel.org/linux-usb/1653989775-14267-1-git-send-email-quic_li…
v2: correct the interrupted variable according to a comment from John Keeping
v3: (v2: https://lore.kernel.org/linux-usb/1654006119-23869-1-git-send-email-quic_li…)
add Reviewed-by from John Keeping,
after offline discussion, add Reported-by from Michael Wu
v4: (v3: https://lore.kernel.org/linux-usb/1654056916-2062-1-git-send-email-quic_lin…)
add Cc: <stable(a)vger.kernel.org> # 5.15 (it seems other branches can't apply cleanly),
add Tested-by: Michael Wu <michael(a)allwinnertech.com>,
remove one empty line to keep the original code format.
Linyu Yuan (2):
usb: gadget: f_fs: change ep->status safe in ffs_epfile_io()
usb: gadget: f_fs: change ep->ep safe in ffs_epfile_io()
drivers/usb/gadget/function/f_fs.c | 37 ++++++++++++++++++++++---------------
1 file changed, 22 insertions(+), 15 deletions(-)
--
2.7.4
For some sev ioctl interfaces, input may be passed that is less than or
equal to SEV_FW_BLOB_MAX_SIZE, but larger than the data that PSP
firmware returns. In this case, kmalloc will allocate memory that is the
size of the input rather than the size of the data. Since PSP firmware
doesn't fully overwrite the buffer, the sev ioctl interfaces with the
issue may return uninitialized slab memory.
Currently, all of the ioctl interfaces in the ccp driver are safe, but
to prevent future problems, change all ioctl interfaces that allocate
memory with kmalloc to use kzalloc and memset the data buffer to zero
in sev_ioctl_do_platform_status.
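For illustration, a minimal sketch of the pattern (user-controlled 'length';
the uaddr and do_firmware_cmd names are hypothetical) showing why the zeroed
allocation matters when the firmware writes less than the full buffer:

	/* 'length' comes from userspace, <= SEV_FW_BLOB_MAX_SIZE */
	blob = kzalloc(length, GFP_KERNEL);	/* every byte starts at zero */
	if (!blob)
		return -ENOMEM;

	ret = do_firmware_cmd(blob, length);	/* may fill fewer than 'length' bytes */

	/* copying 'length' bytes back can no longer leak stale slab data */
	if (!ret && copy_to_user(uaddr, blob, length))
		ret = -EFAULT;
	kfree(blob);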
Fixes: 38103671aad3 ("crypto: ccp: Use the stack and common buffer for status commands")
Fixes: e799035609e15 ("crypto: ccp: Implement SEV_PEK_CSR ioctl command")
Fixes: 76a2b524a4b1d ("crypto: ccp: Implement SEV_PDH_CERT_EXPORT ioctl command")
Fixes: d6112ea0cb344 ("crypto: ccp - introduce SEV_GET_ID2 command")
Cc: stable(a)vger.kernel.org
Reported-by: Andy Nguyen <theflow(a)google.com>
Suggested-by: David Rientjes <rientjes(a)google.com>
Suggested-by: Peter Gonda <pgonda(a)google.com>
Signed-off-by: John Allen <john.allen(a)amd.com>
---
v2:
- Add fixes tags and CC stable(a)vger.kernel.org
v3:
- memset data buffer to zero in sev_ioctl_do_platform_status
v4:
- Add fixes tag for sev_ioctl_do_platform_status change
---
drivers/crypto/ccp/sev-dev.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 6ab93dfd478a..da143cc3a8f5 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -551,6 +551,8 @@ static int sev_ioctl_do_platform_status(struct sev_issue_cmd *argp)
struct sev_user_data_status data;
int ret;
+ memset(&data, 0, sizeof(data));
+
ret = __sev_do_cmd_locked(SEV_CMD_PLATFORM_STATUS, &data, &argp->error);
if (ret)
return ret;
@@ -604,7 +606,7 @@ static int sev_ioctl_do_pek_csr(struct sev_issue_cmd *argp, bool writable)
if (input.length > SEV_FW_BLOB_MAX_SIZE)
return -EFAULT;
- blob = kmalloc(input.length, GFP_KERNEL);
+ blob = kzalloc(input.length, GFP_KERNEL);
if (!blob)
return -ENOMEM;
@@ -828,7 +830,7 @@ static int sev_ioctl_do_get_id2(struct sev_issue_cmd *argp)
input_address = (void __user *)input.address;
if (input.address && input.length) {
- id_blob = kmalloc(input.length, GFP_KERNEL);
+ id_blob = kzalloc(input.length, GFP_KERNEL);
if (!id_blob)
return -ENOMEM;
@@ -947,14 +949,14 @@ static int sev_ioctl_do_pdh_export(struct sev_issue_cmd *argp, bool writable)
if (input.cert_chain_len > SEV_FW_BLOB_MAX_SIZE)
return -EFAULT;
- pdh_blob = kmalloc(input.pdh_cert_len, GFP_KERNEL);
+ pdh_blob = kzalloc(input.pdh_cert_len, GFP_KERNEL);
if (!pdh_blob)
return -ENOMEM;
data.pdh_cert_address = __psp_pa(pdh_blob);
data.pdh_cert_len = input.pdh_cert_len;
- cert_blob = kmalloc(input.cert_chain_len, GFP_KERNEL);
+ cert_blob = kzalloc(input.cert_chain_len, GFP_KERNEL);
if (!cert_blob) {
ret = -ENOMEM;
goto e_free_pdh;
--
2.34.1
> On Thu, Jun 09, 2022 at 11:33:18AM +0800, Mark-PK Tsai wrote:
> > Always clear the OS lock before enabling debug events.
> >
> > The OS lock is cleared in cpuhp ops in recent kernels,
> > but when the debug exception happens before that, the
> > kernel might crash because enabling debug events doesn't
> > take effect while the OS lock is held.
> >
> > Below is the use case that having this problem:
> >
> > Register kprobe in console_unlock and kernel will
> > panic at secondary_start_kernel on secondary core.
> >
> > CPU: 1 PID: 0 Comm: swapper/1 Tainted: P
> > ...
> > pstate: 004001c5 (nzcv dAIF +PAN -UAO)
> > pc : do_undefinstr+0x5c/0x60
> > lr : do_undefinstr+0x2c/0x60
> > sp : ffffffc01338bc50
> > pmr_save: 000000f0
> > x29: ffffffc01338bc50 x28: ffffff8115e95a00
> > x27: ffffffc01258e000 x26: ffffff8115e95a00
> > x25: 00000000ffffffff x24: 0000000000000000
> > x23: 00000000604001c5 x22: ffffffc014015008
> > x21: 000000002232f000 x20: 00000000000000f0
> > x19: ffffffc01338bc70 x18: ffffffc0132ed040
> > x17: ffffffc01258eb48 x16: 0000000000000403
> > x15: 0000000000016480 x14: ffffffc01258e000
> > x13: 0000000000000006 x12: 0000000000006985
> > x11: 00000000d5300000 x10: 0000000000000000
> > x9 : 9f6c79217a8a0400 x8 : 00000000000000c5
> > x7 : 0000000000000000 x6 : ffffffc01338bc08
> > x5 : ffffffc01338bc08 x4 : 0000000000000002
> > x3 : 0000000000000000 x2 : 0000000000000004
> > x1 : 0000000000000000 x0 : 0000000000000001
> > Call trace:
> > do_undefinstr+0x5c/0x60
> > el1_undef+0x10/0xb4
> > 0xffffffc014015008
> > vprintk_func+0x210/0x290
> > printk+0x64/0x90
> > cpuinfo_detect_icache_policy+0x80/0xe0
> > __cpuinfo_store_cpu+0x150/0x160
> > secondary_start_kernel+0x154/0x440
> >
> > The root cause is that OSLSR_EL1.OSLK is reset
> > to 1 on a cold reset [1] and the firmware doesn't
> > unlock it by default.
> > So the core doesn't go to el1_dbg as expected after
> > kernel_enable_single_step() and eret.
>
> Hmm, I thought we didn't use hardware single-step for kprobes after
> 7ee31a3aa8f4 ("arm64: kprobes: Use BRK instead of single-step when executing
> instructions out-of-line"). What is triggering this exception?
>
> Will
You're right.
Actually this issue happened in 5.4 LTS, and the commit you mentioned
avoids the kernel panic by not using hardware single-step.
I think 5.4 LTS should apply this commit:
7ee31a3aa8f4 ("arm64: kprobes: Use BRK instead of single-step when executing instructions out-of-line")
Cc: stable(a)vger.kernel.org
And I'm not sure whether there are other use cases that may have problems
if the kernel doesn't clear the OS lock in enable_debug_monitors() every time.
So should we do this to prevent someone from hitting a similar issue?
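For reference, a minimal sketch (assuming the usual arm64 sysreg helpers) of
what "clear the OS lock" means on this path:

	/* writing 0 to OSLAR_EL1 clears OSLSR_EL1.OSLK, unlocking debug events */
	static void clear_os_lock_sketch(void)
	{
		write_sysreg(0, oslar_el1);
		isb();
	}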
TEST_GEN_FILES contains files that are generated during compilation and are
required to be included together with the test binaries, e.g. when
performing:
make -C tools/testing/selftests install INSTALL_PATH=/some/other/path [*]
Add test_encl.elf to TEST_GEN_FILES because otherwise the installed test
binary will fail to run.
[*] https://docs.kernel.org/dev-tools/kselftest.html
Cc: stable(a)vger.kernel.org
Fixes: 2adcba79e69d ("selftests/x86: Add a selftest for SGX")
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v2:
Use TEST_GEN_FILES in the "all" target, instead of duplicating the path for
test_encl.elf.
---
tools/testing/selftests/sgx/Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/sgx/Makefile b/tools/testing/selftests/sgx/Makefile
index 75af864e07b6..7f60811b5b20 100644
--- a/tools/testing/selftests/sgx/Makefile
+++ b/tools/testing/selftests/sgx/Makefile
@@ -17,9 +17,10 @@ ENCL_CFLAGS := -Wall -Werror -static -nostdlib -nostartfiles -fPIC \
-fno-stack-protector -mrdrnd $(INCLUDES)
TEST_CUSTOM_PROGS := $(OUTPUT)/test_sgx
+TEST_GEN_FILES := $(OUTPUT)/test_encl.elf
ifeq ($(CAN_BUILD_X86_64), 1)
-all: $(TEST_CUSTOM_PROGS) $(OUTPUT)/test_encl.elf
+all: $(TEST_CUSTOM_PROGS) $(TEST_GEN_FILES)
endif
$(OUTPUT)/test_sgx: $(OUTPUT)/main.o \
--
2.36.1
Syzbot found a corrupted list bug scenario that can be triggered from
cgroup_subtree_control_write(cgrp). The reproducer writes to the
cgroup.subtree_control file, which invokes:
cgroup_apply_control_enable()->css_create()->css_populate_dir(), which
then fails with a fault-injected -ENOMEM.
In such a scenario css_killed_work_fn will be enqueued via
cgroup_apply_control_disable(cgrp)->kill_css(css), and we bail out to
cgroup_kn_unlock(). Then cgroup_kn_unlock() will call:
cgroup_put(cgrp)->css_put(&cgrp->self), which will try to enqueue
css_release_work_fn for the same css instance, causing a list_add
corruption bug, as can be seen in the syzkaller report [1].
Fix this by synchronizing the css ref_kill and css_release jobs.
The css_release() function will check if css_killed_work_fn() has been
scheduled for the css and only enqueue css_release_work_fn() if
css_killed_work_fn() wasn't already enqueued. Otherwise css_release()
will set the CSS_REL_LATER flag for that css. This will cause the
css_release_work_fn() work to be executed after css_killed_work_fn() is
finished.
Two css flags have been introduced to implement this serialization
mechanism:
* CSS_KILL_ENQED, which will be set when css_killed_work_fn() is enqueued, and
* CSS_REL_LATER, which, if set, will cause css_release_work_fn() to be
scheduled after css_killed_work_fn() is finished.
There is also a new lock, which protects the integrity of the css flags.
[1] https://syzkaller.appspot.com/bug?id=e26e54d6eac9d9fb50b221ec3e4627b327465d…
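Condensed, the serialization in css_release() described above looks roughly
like this (a sketch of the hunk in the patch below):

	spin_lock_bh(&css->lock);
	if (!(css->flags & CSS_KILL_ENQED)) {
		/* no kill work pending: destroy_work is free to use */
		INIT_WORK(&css->destroy_work, css_release_work_fn);
		queue_work(cgroup_destroy_wq, &css->destroy_work);
	} else {
		/* kill work owns destroy_work; release runs after it finishes */
		css->flags |= CSS_REL_LATER;
	}
	spin_unlock_bh(&css->lock);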
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Michal Koutny <mkoutny(a)suse.com>
Cc: Zefan Li <lizefan.x(a)bytedance.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: Andrii Nakryiko <andrii(a)kernel.org>
Cc: Martin KaFai Lau <kafai(a)fb.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Yonghong Song <yhs(a)fb.com>
Cc: John Fastabend <john.fastabend(a)gmail.com>
Cc: KP Singh <kpsingh(a)kernel.org>
Cc: <cgroups(a)vger.kernel.org>
Cc: <netdev(a)vger.kernel.org>
Cc: <bpf(a)vger.kernel.org>
Cc: <stable(a)vger.kernel.org>
Cc: <linux-kernel(a)vger.kernel.org>
Reported-and-tested-by: syzbot+e42ae441c3b10acf9e9d(a)syzkaller.appspotmail.com
Fixes: 8f36aaec9c92 ("cgroup: Use rcu_work instead of explicit rcu and work item")
Signed-off-by: Tadeusz Struk <tadeusz.struk(a)linaro.org>
---
include/linux/cgroup-defs.h | 4 ++++
kernel/cgroup/cgroup.c | 35 ++++++++++++++++++++++++++++++++---
2 files changed, 36 insertions(+), 3 deletions(-)
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 1bfcfb1af352..8dc8b4edb242 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -53,6 +53,8 @@ enum {
CSS_RELEASED = (1 << 2), /* refcnt reached zero, released */
CSS_VISIBLE = (1 << 3), /* css is visible to userland */
CSS_DYING = (1 << 4), /* css is dying */
+ CSS_KILL_ENQED = (1 << 5), /* kill work enqueued for the css */
+ CSS_REL_LATER = (1 << 6), /* release needs to be done after kill */
};
/* bits in struct cgroup flags field */
@@ -162,6 +164,8 @@ struct cgroup_subsys_state {
*/
int id;
+ /* lock to protect flags */
+ spinlock_t lock;
unsigned int flags;
/*
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 1779ccddb734..a0ceead4b390 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5210,8 +5210,23 @@ static void css_release(struct percpu_ref *ref)
struct cgroup_subsys_state *css =
container_of(ref, struct cgroup_subsys_state, refcnt);
- INIT_WORK(&css->destroy_work, css_release_work_fn);
- queue_work(cgroup_destroy_wq, &css->destroy_work);
+ spin_lock_bh(&css->lock);
+
+ /*
+ * Check if the css_killed_work_fn work has been scheduled for this
+ * css and enqueue css_release_work_fn only if it wasn't.
+ * Otherwise set the CSS_REL_LATER flag, which will cause
+ * release to be enqueued after css_killed_work_fn is finished.
+ * This is to prevent list corruption by en-queuing two instance
+ * of the same work struct on the same WQ, namely cgroup_destroy_wq.
+ */
+ if (!(css->flags & CSS_KILL_ENQED)) {
+ INIT_WORK(&css->destroy_work, css_release_work_fn);
+ queue_work(cgroup_destroy_wq, &css->destroy_work);
+ } else {
+ css->flags |= CSS_REL_LATER;
+ }
+ spin_unlock_bh(&css->lock);
}
static void init_and_link_css(struct cgroup_subsys_state *css,
@@ -5230,6 +5245,7 @@ static void init_and_link_css(struct cgroup_subsys_state *css,
INIT_LIST_HEAD(&css->rstat_css_node);
css->serial_nr = css_serial_nr_next++;
atomic_set(&css->online_cnt, 0);
+ spin_lock_init(&css->lock);
if (cgroup_parent(cgrp)) {
css->parent = cgroup_css(cgroup_parent(cgrp), ss);
@@ -5545,10 +5561,12 @@ int cgroup_mkdir(struct kernfs_node *parent_kn, const char *name, umode_t mode)
*/
static void css_killed_work_fn(struct work_struct *work)
{
- struct cgroup_subsys_state *css =
+ struct cgroup_subsys_state *css_killed, *css =
container_of(work, struct cgroup_subsys_state, destroy_work);
mutex_lock(&cgroup_mutex);
+ css_killed = css;
+ css_killed->flags &= ~CSS_KILL_ENQED;
do {
offline_css(css);
@@ -5557,6 +5575,14 @@ static void css_killed_work_fn(struct work_struct *work)
css = css->parent;
} while (css && atomic_dec_and_test(&css->online_cnt));
+ spin_lock_bh(&css->lock);
+ if (css_killed->flags & CSS_REL_LATER) {
+ /* If css_release work was delayed for the css enqueue it now. */
+ INIT_WORK(&css_killed->destroy_work, css_release_work_fn);
+ queue_work(cgroup_destroy_wq, &css_killed->destroy_work);
+ css_killed->flags &= ~CSS_REL_LATER;
+ }
+ spin_unlock_bh(&css->lock);
mutex_unlock(&cgroup_mutex);
}
@@ -5566,10 +5592,13 @@ static void css_killed_ref_fn(struct percpu_ref *ref)
struct cgroup_subsys_state *css =
container_of(ref, struct cgroup_subsys_state, refcnt);
+ spin_lock_bh(&css->lock);
if (atomic_dec_and_test(&css->online_cnt)) {
+ css->flags |= CSS_KILL_ENQED;
INIT_WORK(&css->destroy_work, css_killed_work_fn);
queue_work(cgroup_destroy_wq, &css->destroy_work);
}
+ spin_unlock_bh(&css->lock);
}
/**
--
2.36.1
Commit 8e7102273f59 ("bcache: make bch_btree_check() to be
multithreaded") makes bch_btree_check() much faster when checking all
btree nodes during cache device registration. But it isn't in ideal
shape yet and can still be improved.
This patch does the following things to improve the current parallel
btree node check by multiple threads in bch_btree_check():
- Add a read lock to the root node while checking all the btree nodes with
multiple threads. Although it is currently not mandatory, it is good to
have a read lock in the code logic.
- Remove the local variable 'char name[32]', and generate the kernel thread
name string directly when calling kthread_run().
- Allocate the local variable "struct btree_check_state check_state" on the
stack and avoid unnecessary dynamic memory allocation for it.
- Increase check_state->started to count a created kernel thread only after
it is successfully created.
- When waiting for all checking kernel threads to finish, use wait_event()
to replace wait_event_interruptible().
With this change, the code is clearer, and some potential error
conditions are avoided.
Fixes: 8e7102273f59 ("bcache: make bch_btree_check() to be multithreaded")
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: stable(a)vger.kernel.org
---
drivers/md/bcache/btree.c | 58 ++++++++++++++++++---------------------
1 file changed, 26 insertions(+), 32 deletions(-)
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index ad9f16689419..2362bb8ef6d1 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -2006,8 +2006,7 @@ int bch_btree_check(struct cache_set *c)
int i;
struct bkey *k = NULL;
struct btree_iter iter;
- struct btree_check_state *check_state;
- char name[32];
+ struct btree_check_state check_state;
/* check and mark root node keys */
for_each_key_filter(&c->root->keys, k, &iter, bch_ptr_invalid)
@@ -2018,63 +2017,58 @@ int bch_btree_check(struct cache_set *c)
if (c->root->level == 0)
return 0;
- check_state = kzalloc(sizeof(struct btree_check_state), GFP_KERNEL);
- if (!check_state)
- return -ENOMEM;
-
- check_state->c = c;
- check_state->total_threads = bch_btree_chkthread_nr();
- check_state->key_idx = 0;
- spin_lock_init(&check_state->idx_lock);
- atomic_set(&check_state->started, 0);
- atomic_set(&check_state->enough, 0);
- init_waitqueue_head(&check_state->wait);
+ check_state.c = c;
+ check_state.total_threads = bch_btree_chkthread_nr();
+ check_state.key_idx = 0;
+ spin_lock_init(&check_state.idx_lock);
+ atomic_set(&check_state.started, 0);
+ atomic_set(&check_state.enough, 0);
+ init_waitqueue_head(&check_state.wait);
+ rw_lock(0, c->root, c->root->level);
/*
* Run multiple threads to check btree nodes in parallel,
- * if check_state->enough is non-zero, it means current
+ * if check_state.enough is non-zero, it means current
* running check threads are enough, unncessary to create
* more.
*/
- for (i = 0; i < check_state->total_threads; i++) {
- /* fetch latest check_state->enough earlier */
+ for (i = 0; i < check_state.total_threads; i++) {
+ /* fetch latest check_state.enough earlier */
smp_mb__before_atomic();
- if (atomic_read(&check_state->enough))
+ if (atomic_read(&check_state.enough))
break;
- check_state->infos[i].result = 0;
- check_state->infos[i].state = check_state;
- snprintf(name, sizeof(name), "bch_btrchk[%u]", i);
- atomic_inc(&check_state->started);
+ check_state.infos[i].result = 0;
+ check_state.infos[i].state = &check_state;
- check_state->infos[i].thread =
+ check_state.infos[i].thread =
kthread_run(bch_btree_check_thread,
- &check_state->infos[i],
- name);
- if (IS_ERR(check_state->infos[i].thread)) {
+ &check_state.infos[i],
+ "bch_btrchk[%d]", i);
+ if (IS_ERR(check_state.infos[i].thread)) {
pr_err("fails to run thread bch_btrchk[%d]\n", i);
for (--i; i >= 0; i--)
- kthread_stop(check_state->infos[i].thread);
+ kthread_stop(check_state.infos[i].thread);
ret = -ENOMEM;
goto out;
}
+ atomic_inc(&check_state.started);
}
/*
* Must wait for all threads to stop.
*/
- wait_event_interruptible(check_state->wait,
- atomic_read(&check_state->started) == 0);
+ wait_event(check_state.wait, atomic_read(&check_state.started) == 0);
- for (i = 0; i < check_state->total_threads; i++) {
- if (check_state->infos[i].result) {
- ret = check_state->infos[i].result;
+ for (i = 0; i < check_state.total_threads; i++) {
+ if (check_state.infos[i].result) {
+ ret = check_state.infos[i].result;
goto out;
}
}
out:
- kfree(check_state);
+ rw_unlock(0, c->root);
return ret;
}
--
2.35.3
From: Fabio Estevam <festevam(a)denx.de>
Since commit 358ba762d9f1 ("crypto: caam - enable prediction resistance
in HRWNG") the following CAAM errors can be seen on i.MX6SX:
caam_jr 2101000.jr: 20003c5b: CCB: desc idx 60: RNG: Hardware error
hwrng: no data available
This error is due to an incorrect entropy delay for i.MX6SX.
Fix it by increasing the minimum entropy delay for i.MX6SX
as done in U-Boot:
https://patchwork.ozlabs.org/project/uboot/patch/20220415111049.2565744-1-g…
As explained in the U-Boot patch:
"RNG self tests are run to determine the correct entropy delay.
Such tests are executed with different voltages and temperatures to identify
the worst case value for the entropy delay. For i.MX6SX, it was determined
that after adding a margin value of 1000 the minimum entropy delay should be
at least 12000."
Cc: <stable(a)vger.kernel.org>
Fixes: 358ba762d9f1 ("crypto: caam - enable prediction resistance in HRWNG")
Signed-off-by: Fabio Estevam <festevam(a)denx.de>
Reviewed-by: Horia Geantă <horia.geanta(a)nxp.com>
---
Changes since v4:
- Change the function name to needs_entropy_delay_adjustment() - Vabhav
- Improve the commit log by adding the explanation from the U-Boot
patch - Vabhav
drivers/crypto/caam/ctrl.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index ca0361b2dbb0..f87aa2169e5f 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -609,6 +609,13 @@ static bool check_version(struct fsl_mc_version *mc_version, u32 major,
}
#endif
+static bool needs_entropy_delay_adjustment(void)
+{
+ if (of_machine_is_compatible("fsl,imx6sx"))
+ return true;
+ return false;
+}
+
/* Probe routine for CAAM top (controller) level */
static int caam_probe(struct platform_device *pdev)
{
@@ -855,6 +862,8 @@ static int caam_probe(struct platform_device *pdev)
* Also, if a handle was instantiated, do not change
* the TRNG parameters.
*/
+ if (needs_entropy_delay_adjustment())
+ ent_delay = 12000;
if (!(ctrlpriv->rng4_sh_init || inst_handles)) {
dev_info(dev,
"Entropy delay = %u\n",
@@ -871,6 +880,15 @@ static int caam_probe(struct platform_device *pdev)
*/
ret = instantiate_rng(dev, inst_handles,
gen_sk);
+ /*
+ * Entropy delay is determined via TRNG characterization.
+ * TRNG characterization is run across different voltages
+ * and temperatures.
+ * If worst case value for ent_dly is identified,
+ * the loop can be skipped for that platform.
+ */
+ if (needs_entropy_delay_adjustment())
+ break;
if (ret == -EAGAIN)
/*
* if here, the loop will rerun,
--
2.25.1
This reverts commit bc360b0b1611566e1bd47384daf49af6a1c51837.
This matches the hardware identifier of the Samsung 970 EVO Plus, a very
popular internal laptop NVMe drive, which is not the Samsung X5, an
external thunderbolt drive. In particular, this causes:
1) a 2.3 second boot time delay; and
2) disabling of deep power saving states.
So just revert this until whatever funny business can be worked out
regarding Samsung's PCI IDs.
Fixes: bc360b0b1611 ("nvme-pci: add quirks for Samsung X5 SSDs")
Cc: stable(a)vger.kernel.org
Cc: Monish Kumar R <monish.kumar.r(a)intel.com>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/nvme/host/pci.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 48f4f6eb877b..47b9e3e0ea5a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3483,10 +3483,7 @@ static const struct pci_device_id nvme_id_table[] = {
NVME_QUIRK_128_BYTES_SQES |
NVME_QUIRK_SHARED_TAGS |
NVME_QUIRK_SKIP_CID_GEN },
- { PCI_DEVICE(0x144d, 0xa808), /* Samsung X5 */
- .driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY|
- NVME_QUIRK_NO_DEEPEST_PS |
- NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+
{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
{ 0, }
};
--
2.35.1
There are two issues in avic_kick_target_vcpus_fast:
1. It is legal to issue an IPI request with APIC_DEST_NOSHORT
and a physical destination of 0xFF (or 0xFFFFFFFF in case of x2apic),
which must be treated as a broadcast destination.
Fix this by explicitly checking for it.
Also don’t use ‘index’ in this case as it gives no new information.
2. It is legal to issue a logical IPI request to more than one target.
The index field only provides the index in the physical id table of the
first such target and therefore can't be used before we are sure
that only a single target was addressed.
Instead, parse the ICRL/ICRH, double check that a unicast interrupt
was requested, and use that info to figure out the physical id
of the target vCPU.
At that point there is no need to use the index field as well.
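For reference, a small worked sketch (with a made-up destination value) of
the xAPIC cluster-mode decoding added in the patch below:

	/* hypothetical example: xAPIC cluster mode, ICRH destination = 0x34 */
	u32 dest = 0x34;
	u32 bitmap = dest & 0xF;		/* 0x4: one CPU selected in the cluster */
	u32 cluster = (dest >> 4) << 2;		/* cluster 3 -> base index 12 */
	int logid_index = cluster + __ffs(bitmap);	/* 12 + 2 = 14 */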
In addition to fixing the above issues, also skip the call to
kvm_apic_match_dest.
It is possible to do this now because, as long as AVIC is not
inhibited, it is guaranteed that none of the vCPUs have changed their
apic id from its default value.
This fixes boot of windows guest with AVIC enabled because it uses
IPI with 0xFF destination and no destination shorthand.
Fixes: 7223fd2d5338 ("KVM: SVM: Use target APIC ID to complete AVIC IRQs when possible")
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
---
arch/x86/kvm/svm/avic.c | 105 ++++++++++++++++++++++++++--------------
1 file changed, 69 insertions(+), 36 deletions(-)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 072e2c8cc66aa..5d98ac575dedc 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -291,58 +291,91 @@ void avic_ring_doorbell(struct kvm_vcpu *vcpu)
static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source,
u32 icrl, u32 icrh, u32 index)
{
- u32 dest, apic_id;
- struct kvm_vcpu *vcpu;
+ u32 l1_physical_id, dest;
+ struct kvm_vcpu *target_vcpu;
int dest_mode = icrl & APIC_DEST_MASK;
int shorthand = icrl & APIC_SHORT_MASK;
struct kvm_svm *kvm_svm = to_kvm_svm(kvm);
- u32 *avic_logical_id_table = page_address(kvm_svm->avic_logical_id_table_page);
if (shorthand != APIC_DEST_NOSHORT)
return -EINVAL;
- /*
- * The AVIC incomplete IPI #vmexit info provides index into
- * the physical APIC ID table, which can be used to derive
- * guest physical APIC ID.
- */
+ if (apic_x2apic_mode(source))
+ dest = icrh;
+ else
+ dest = GET_APIC_DEST_FIELD(icrh);
+
if (dest_mode == APIC_DEST_PHYSICAL) {
- apic_id = index;
+ /* broadcast destination, use slow path */
+ if (apic_x2apic_mode(source) && dest == X2APIC_BROADCAST)
+ return -EINVAL;
+ if (!apic_x2apic_mode(source) && dest == APIC_BROADCAST)
+ return -EINVAL;
+
+ l1_physical_id = dest;
+
+ if (WARN_ON_ONCE(l1_physical_id != index))
+ return -EINVAL;
+
} else {
- if (!apic_x2apic_mode(source)) {
- /* For xAPIC logical mode, the index is for logical APIC table. */
- apic_id = avic_logical_id_table[index] & 0x1ff;
+ u32 bitmap, cluster;
+ int logid_index;
+
+ if (apic_x2apic_mode(source)) {
+ /* 16 bit dest mask, 16 bit cluster id */
+ bitmap = dest & 0xFFFF0000;
+ cluster = (dest >> 16) << 4;
+ } else if (kvm_lapic_get_reg(source, APIC_DFR) == APIC_DFR_FLAT) {
+ /* 8 bit dest mask*/
+ bitmap = dest;
+ cluster = 0;
} else {
- return -EINVAL;
+ /* 4 bit desk mask, 4 bit cluster id */
+ bitmap = dest & 0xF;
+ cluster = (dest >> 4) << 2;
}
- }
- /*
- * Assuming vcpu ID is the same as physical apic ID,
- * and use it to retrieve the target vCPU.
- */
- vcpu = kvm_get_vcpu_by_id(kvm, apic_id);
- if (!vcpu)
- return -EINVAL;
+ if (unlikely(!bitmap))
+ /* guest bug: nobody to send the logical interrupt to */
+ return 0;
- if (apic_x2apic_mode(vcpu->arch.apic))
- dest = icrh;
- else
- dest = GET_APIC_DEST_FIELD(icrh);
+ if (!is_power_of_2(bitmap))
+ /* multiple logical destinations, use slow path */
+ return -EINVAL;
- /*
- * Try matching the destination APIC ID with the vCPU.
- */
- if (kvm_apic_match_dest(vcpu, source, shorthand, dest, dest_mode)) {
- vcpu->arch.apic->irr_pending = true;
- svm_complete_interrupt_delivery(vcpu,
- icrl & APIC_MODE_MASK,
- icrl & APIC_INT_LEVELTRIG,
- icrl & APIC_VECTOR_MASK);
- return 0;
+ logid_index = cluster + __ffs(bitmap);
+
+ if (apic_x2apic_mode(source)) {
+ l1_physical_id = logid_index;
+ } else {
+ u32 *avic_logical_id_table =
+ page_address(kvm_svm->avic_logical_id_table_page);
+
+ u32 logid_entry = avic_logical_id_table[logid_index];
+
+ if (WARN_ON_ONCE(index != logid_index))
+ return -EINVAL;
+
+ /* guest bug: non existing/reserved logical destination */
+ if (unlikely(!(logid_entry & AVIC_LOGICAL_ID_ENTRY_VALID_MASK)))
+ return 0;
+
+ l1_physical_id = logid_entry &
+ AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK;
+ }
}
- return -EINVAL;
+ target_vcpu = kvm_get_vcpu_by_id(kvm, l1_physical_id);
+ if (unlikely(!target_vcpu))
+ /* guest bug: non existing vCPU is a target of this IPI*/
+ return 0;
+
+ target_vcpu->arch.apic->irr_pending = true;
+ svm_complete_interrupt_delivery(target_vcpu,
+ icrl & APIC_MODE_MASK,
+ icrl & APIC_INT_LEVELTRIG,
+ icrl & APIC_VECTOR_MASK);
+ return 0;
}
static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
--
2.26.3
On 08/06/2022 10:09, Kuo-Hsiang Chou wrote:
> Hi Thomas
>
> Thanks for your suggestions!
>
> I answer each revision inline that followed by [KH]:.
Thanks for reviewing this.
>
> Regards,
>
> Kuo-Hsiang Chou
>
> -----Original Message-----
>
> From: Thomas Zimmermann [mailto:tzimmermann@suse.de]
>
> Sent: Tuesday, June 07, 2022 8:03 PM
>
> To: airlied(a)redhat.com; airlied(a)linux.ie; daniel(a)ffwll.ch;
> jfalempe(a)redhat.com; regressions(a)leemhuis.info; Kuo-Hsiang Chou
> <kuohsiang_chou(a)aspeedtech.com>
>
> Subject: [PATCH] drm/ast: Treat AST2600 like AST2500 in most places
>
> Include AST2600 in most of the branches for AST2500. Thereby revert most
> effects of commit f9bd00e0ea9d ("drm/ast: Create chip AST2600").
>
> The AST2600 used to be treated like an AST2500, which at least gave
> usable display output. After introducing AST2600 in the driver without
> further updates, lots of functions take the wrong branches.
>
> Handling AST2600 in the AST2500 branches reverts back to the original
> settings. The exception are cases where AST2600 meanwhile got its own
> branch.
>
> [KH]: Based on CVE-2019-6260 item 3, P2A is not allowed anymore.
>
> P2A (PCIe to AMBA) is a bridge that is able to modify any BMC register.
>
> Yes, P2A is dangerous from a security standpoint, because the Host opens a
> backdoor and malicious SW/APPs can easily take control of the BMC.
>
> Therefore, P2A is disabled forever.
>
> Now, returning to this patch, there is no need to add an AST2600 condition
> to the P2A flow.
>
[snip]
>
> [KH]: Yes, the patch is "drm/ast: Create threshold values for AST2600";
> it addresses the root cause of the white lines on AST2600:
>
> commit bcc77411e8a65929655cef7b63a36000724cdc4b
> <https://cgit.freedesktop.org/drm/drm/commit/?id=bcc77411e8a65929655cef7b63a…>
> (patch: <https://cgit.freedesktop.org/drm/drm/patch/?id=bcc77411e8a65929655cef7b63a3…>)
>
So basically this commit should be enough to fix the white lines and
flickering with VGA output on AST2600?
I will try to have it tested, and if it's good, we may want to have it
in a stable kernel.
Best regards,
--
Jocelyn
The revision of the imx-sdma IP in the i.MX8M series is the same as
that in the i.MX7 series, but the imx7d MODULE_FIRMWARE directive is
wrapped in a conditional, which means it's not defined when built for
aarch64 SOC_IMX8M platforms, and hence you get the following errors
when the driver loads on imx8m devices:
imx-sdma 302c0000.dma-controller: Direct firmware load for imx/sdma/sdma-imx7d.bin failed with error -2
imx-sdma 302c0000.dma-controller: external firmware not found, using ROM firmware
Add the SOC_IMX8M into the check so the firmware can load on i.MX8.
Fixes: 1474d48bd639 ("arm64: dts: imx8mq: Add SDMA nodes")
Fixes: 941acd566b18 ("dmaengine: imx-sdma: Only check ratio on parts that support 1:1")
Signed-off-by: Peter Robinson <pbrobinson(a)gmail.com>
Cc: stable(a)vger.kernel.org # v5.2+
---
drivers/dma/imx-sdma.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index 8535018ee7a2..900cafdaf359 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -2346,7 +2346,7 @@ MODULE_DESCRIPTION("i.MX SDMA driver");
#if IS_ENABLED(CONFIG_SOC_IMX6Q)
MODULE_FIRMWARE("imx/sdma/sdma-imx6q.bin");
#endif
-#if IS_ENABLED(CONFIG_SOC_IMX7D)
+#if IS_ENABLED(CONFIG_SOC_IMX7D) || IS_ENABLED(CONFIG_SOC_IMX8M)
MODULE_FIRMWARE("imx/sdma/sdma-imx7d.bin");
#endif
MODULE_LICENSE("GPL");
--
2.36.1
commit b9684a71fca793213378dd410cd11675d973eaa1 upstream.
Historically we distinguished between a flag that suppressed partition
scanning, and a combination of the minors variable and another flag if
any partitions were supported. This was generally confusing and doesn't
make much sense, but some corner-case uses of the loop driver actually
do want to support manually added partitions on a device that does not
actively scan for partitions. To make things worse, the loop driver
also wants to dynamically toggle the scanning for partitions on a live
gendisk, which makes the disk->flags updates non-atomic.
Introduce a new GD_SUPPRESS_PART_SCAN bit in disk->state that disables
just scanning for partitions, and toggle that instead of GENHD_FL_NO_PART
in the loop driver.
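As a usage sketch (mirroring what the loop changes below do), suppressing
only the automatic scan becomes a state-bit toggle rather than a
disk->flags update:

	/* manually added partitions keep working; only the auto-scan is off */
	set_bit(GD_SUPPRESS_PART_SCAN, &disk->state);

	/* later, e.g. when LO_FLAGS_PARTSCAN is requested */
	clear_bit(GD_SUPPRESS_PART_SCAN, &disk->state);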
Fixes: 1ebe2e5f9d68 ("block: remove GENHD_FL_EXT_DEVT")
Reported-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Ming Lei <ming.lei(a)redhat.com>
Link: https://lore.kernel.org/r/20220527055806.1972352-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/genhd.c | 2 ++
drivers/block/loop.c | 8 ++++----
include/linux/blkdev.h | 1 +
3 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/block/genhd.c b/block/genhd.c
index b8b6759d670f01..3008ec2136543e 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -385,6 +385,8 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
if (disk->flags & (GENHD_FL_NO_PART | GENHD_FL_HIDDEN))
return -EINVAL;
+ if (test_bit(GD_SUPPRESS_PART_SCAN, &disk->state))
+ return -EINVAL;
if (disk->open_partitions)
return -EBUSY;
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index a58595f5ee2c8f..ba3627f982f241 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1066,7 +1066,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
lo->lo_flags |= LO_FLAGS_PARTSCAN;
partscan = lo->lo_flags & LO_FLAGS_PARTSCAN;
if (partscan)
- lo->lo_disk->flags &= ~GENHD_FL_NO_PART;
+ clear_bit(GD_SUPPRESS_PART_SCAN, &lo->lo_disk->state);
loop_global_unlock(lo, is_loop);
if (partscan)
@@ -1185,7 +1185,7 @@ static void __loop_clr_fd(struct loop_device *lo, bool release)
*/
lo->lo_flags = 0;
if (!part_shift)
- lo->lo_disk->flags |= GENHD_FL_NO_PART;
+ set_bit(GD_SUPPRESS_PART_SCAN, &lo->lo_disk->state);
mutex_lock(&lo->lo_mutex);
lo->lo_state = Lo_unbound;
mutex_unlock(&lo->lo_mutex);
@@ -1295,7 +1295,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
if (!err && (lo->lo_flags & LO_FLAGS_PARTSCAN) &&
!(prev_lo_flags & LO_FLAGS_PARTSCAN)) {
- lo->lo_disk->flags &= ~GENHD_FL_NO_PART;
+ clear_bit(GD_SUPPRESS_PART_SCAN, &lo->lo_disk->state);
partscan = true;
}
out_unlock:
@@ -2045,7 +2045,7 @@ static int loop_add(int i)
* userspace tools. Parameters like this in general should be avoided.
*/
if (!part_shift)
- disk->flags |= GENHD_FL_NO_PART;
+ set_bit(GD_SUPPRESS_PART_SCAN, &disk->state);
atomic_set(&lo->lo_refcnt, 0);
mutex_init(&lo->lo_mutex);
lo->lo_number = i;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 60d01613899711..108e3d114bfc1f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -147,6 +147,7 @@ struct gendisk {
#define GD_DEAD 2
#define GD_NATIVE_CAPACITY 3
#define GD_ADDED 4
+#define GD_SUPPRESS_PART_SCAN 5
struct mutex open_mutex; /* open/close mutex */
unsigned open_partitions; /* number of open partitions */
--
2.30.2
If the packet headers changed, the cached nfct is no longer relevant
for the packet, and an attempt to re-use it leads to incorrect packet
classification.
This issue is causing broken connectivity in OpenStack deployments
with OVS/OVN due to hairpin traffic being unexpectedly dropped.
The setup has datapath flows with several conntrack actions and tuple
changes between them:
actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
ct(zone=8),recirc(0x4)
After the first ct() action the packet headers are almost fully
re-written. The next ct() tries to re-use the existing nfct entry
and marks the packet as invalid, so it gets dropped later in the
pipeline.
Clear the cached conntrack entry whenever the packet tuple is changed
to avoid the issue.
The flow key should not be cleared though, because we should still
be able to match on the ct_state if the recirculation happens after
the tuple change but before the next ct() action.
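For reference, a condensed sketch of the pattern the patch applies to the
header-rewrite helpers (simplified, not the full set_ip_addr() code):

	static void set_ip_addr_sketch(struct sk_buff *skb, struct iphdr *nh,
				       __be32 *addr, __be32 new_addr)
	{
		csum_replace4(&nh->check, *addr, new_addr);
		skb_clear_hash(skb);
		ovs_ct_clear(skb, NULL);	/* drop cached nfct, keep the flow key */
		*addr = new_addr;
	}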
Cc: stable(a)vger.kernel.org
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Frode Nordahl <frode.nordahl(a)canonical.com>
Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
Signed-off-by: Ilya Maximets <i.maximets(a)ovn.org>
---
The function ovs_ct_clear() looks a bit different on older branches,
but the change should be exactly the same, i.e. move the
ovs_ct_fill_key() under the 'if (key)'.
The same behavior for userspace datapath was introduced along with
the conntrack caching support here:
https://github.com/openvswitch/ovs/commit/594570ea1cdecc7ef7880d707cbc7a4a4…
Interestingly, above commit also introduced the system test that can
check the issue for the kernel as well, but the test sends only one
packet and this packet goes via upcall to userspace and back to the
kernel effectively clearing the cached connection along the way and
avoiding the issue. If the test is modified to send more than a few
packets [1], it starts to fail without the kernel fix:
make check-kernel TESTSUITEFLAGS='-k negative'
142: conntrack - negative test for recirculation optimization FAILED
[1] https://pastebin.com/H1YMqaLa
net/openvswitch/actions.c | 6 ++++++
net/openvswitch/conntrack.c | 4 +++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 1b5d73079dc9..868db4669a29 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -373,6 +373,7 @@ static void set_ip_addr(struct sk_buff *skb, struct iphdr *nh,
update_ip_l4_checksum(skb, nh, *addr, new_addr);
csum_replace4(&nh->check, *addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
*addr = new_addr;
}
@@ -420,6 +421,7 @@ static void set_ipv6_addr(struct sk_buff *skb, u8 l4_proto,
update_ipv6_checksum(skb, l4_proto, addr, new_addr);
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
memcpy(addr, new_addr, sizeof(__be32[4]));
}
@@ -660,6 +662,7 @@ static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
static void set_tp_port(struct sk_buff *skb, __be16 *port,
__be16 new_port, __sum16 *check)
{
+ ovs_ct_clear(skb, NULL);
inet_proto_csum_replace2(check, skb, *port, new_port, false);
*port = new_port;
}
@@ -699,6 +702,7 @@ static int set_udp(struct sk_buff *skb, struct sw_flow_key *flow_key,
uh->dest = dst;
flow_key->tp.src = src;
flow_key->tp.dst = dst;
+ ovs_ct_clear(skb, NULL);
}
skb_clear_hash(skb);
@@ -761,6 +765,8 @@ static int set_sctp(struct sk_buff *skb, struct sw_flow_key *flow_key,
sh->checksum = old_csum ^ old_correct_csum ^ new_csum;
skb_clear_hash(skb);
+ ovs_ct_clear(skb, NULL);
+
flow_key->tp.src = sh->source;
flow_key->tp.dst = sh->dest;
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 4a947c13c813..4e70df91d0f2 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1342,7 +1342,9 @@ int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
- ovs_ct_fill_key(skb, key, false);
+
+ if (key)
+ ovs_ct_fill_key(skb, key, false);
return 0;
}
--
2.34.3
commit 22b106e5355d6e7a9c3b5cb5ed4ef22ae585ea94 upstream.
Commit d92c370a16cb ("block: really clone the block cgroup in
bio_clone_blkg_association") changed bio_clone_blkg_association() to
just clone bio->bi_blkg reference from source to destination bio. This
is however wrong if the source and destination bios are against
different block devices because struct blkcg_gq is different for each
bdev-blkcg pair. This will result in IOs being accounted (and throttled
as a result) multiple times against the same device (src bdev) while
throttling of the other device (dst bdev) is ignored. In case of BFQ the
inconsistency can even result in crashes in bfq_bic_update_cgroup().
Fix the problem by looking up the correct blkcg_gq for the cloned bio.
Reported-by: Logan Gunthorpe <logang(a)deltatee.com>
Reported-and-tested-by: Donald Buczek <buczek(a)molgen.mpg.de>
Fixes: d92c370a16cb ("block: really clone the block cgroup in bio_clone_blkg_association")
CC: stable(a)vger.kernel.org
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20220602081242.7731-1-jack@suse.cz
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/blk-cgroup.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
This patch should be a backport for 5.15, 5.17, and 5.18 stable branches.
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 07a2524e6efd..ce5858dadca5 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1886,12 +1886,8 @@ EXPORT_SYMBOL_GPL(bio_associate_blkg);
*/
void bio_clone_blkg_association(struct bio *dst, struct bio *src)
{
- if (src->bi_blkg) {
- if (dst->bi_blkg)
- blkg_put(dst->bi_blkg);
- blkg_get(src->bi_blkg);
- dst->bi_blkg = src->bi_blkg;
- }
+ if (src->bi_blkg)
+ bio_associate_blkg_from_css(dst, &bio_blkcg(src)->css);
}
EXPORT_SYMBOL_GPL(bio_clone_blkg_association);
--
2.35.3
Summary: OMAP patch causes SPI bus transaction failure on TI CPU
Commit: c859e0d479b3b4f6132fc12637c51e01492f31f6
Kernel version: 5.10.87
The detailed description:
I know this is an old commit at this point, but we have observed a regression caused by this commit. It causes an improper toggle during a SPI transaction with a microcontroller. The CPU in use is a Texas Instruments AM3352BZCZA80. The microcontroller in use is a PIC-based micro. I have logic capture images available to show the signal difference that is causing confusion on the SPI bus.
..............................................
Eric Schikschneit
Embedded Linux Engineer
NovaTech, LLC
13555 W. 107th Street
Lenexa, KS 66215
(913) 451-1880 (main)
(913) 742-XXXX (direct)
Eric.Schikschneit(a)novatechweb.com