The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 166d3863231667c4f64dee72b77d1102cdfad11f Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sun, 10 Jul 2022 19:05:22 -0400
Subject: [PATCH] xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE
The error paths of gntdev_mmap() can call unmap_grant_pages() even
though not all of the pages have been successfully mapped. This will
trigger the WARN_ON()s in __unmap_grant_pages_done(). The number of
warnings can be very large; I have observed thousands of lines of
warnings in the systemd journal.
Avoid this problem by only warning on unmapping failure if the handle
being unmapped is not INVALID_GRANT_HANDLE. The handle field of any
page that was not successfully mapped will be INVALID_GRANT_HANDLE, so
this catches all cases where unmapping can legitimately fail.
Fixes: dbe97cff7dd9 ("xen/gntdev: Avoid blocking in unmap_grant_pages()")
Cc: stable(a)vger.kernel.org
Suggested-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko(a)epam.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220710230522.1563-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 4b56c39f766d..84b143eef395 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -396,13 +396,15 @@ static void __unmap_grant_pages_done(int result,
unsigned int offset = data->unmap_ops - map->unmap_ops;
for (i = 0; i < data->count; i++) {
- WARN_ON(map->unmap_ops[offset+i].status);
+ WARN_ON(map->unmap_ops[offset + i].status != GNTST_okay &&
+ map->unmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- WARN_ON(map->kunmap_ops[offset+i].status);
+ WARN_ON(map->kunmap_ops[offset + i].status != GNTST_okay &&
+ map->kunmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 166d3863231667c4f64dee72b77d1102cdfad11f Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sun, 10 Jul 2022 19:05:22 -0400
Subject: [PATCH] xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE
The error paths of gntdev_mmap() can call unmap_grant_pages() even
though not all of the pages have been successfully mapped. This will
trigger the WARN_ON()s in __unmap_grant_pages_done(). The number of
warnings can be very large; I have observed thousands of lines of
warnings in the systemd journal.
Avoid this problem by only warning on unmapping failure if the handle
being unmapped is not INVALID_GRANT_HANDLE. The handle field of any
page that was not successfully mapped will be INVALID_GRANT_HANDLE, so
this catches all cases where unmapping can legitimately fail.
Fixes: dbe97cff7dd9 ("xen/gntdev: Avoid blocking in unmap_grant_pages()")
Cc: stable(a)vger.kernel.org
Suggested-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko(a)epam.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220710230522.1563-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 4b56c39f766d..84b143eef395 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -396,13 +396,15 @@ static void __unmap_grant_pages_done(int result,
unsigned int offset = data->unmap_ops - map->unmap_ops;
for (i = 0; i < data->count; i++) {
- WARN_ON(map->unmap_ops[offset+i].status);
+ WARN_ON(map->unmap_ops[offset + i].status != GNTST_okay &&
+ map->unmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- WARN_ON(map->kunmap_ops[offset+i].status);
+ WARN_ON(map->kunmap_ops[offset + i].status != GNTST_okay &&
+ map->kunmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 166d3863231667c4f64dee72b77d1102cdfad11f Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sun, 10 Jul 2022 19:05:22 -0400
Subject: [PATCH] xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE
The error paths of gntdev_mmap() can call unmap_grant_pages() even
though not all of the pages have been successfully mapped. This will
trigger the WARN_ON()s in __unmap_grant_pages_done(). The number of
warnings can be very large; I have observed thousands of lines of
warnings in the systemd journal.
Avoid this problem by only warning on unmapping failure if the handle
being unmapped is not INVALID_GRANT_HANDLE. The handle field of any
page that was not successfully mapped will be INVALID_GRANT_HANDLE, so
this catches all cases where unmapping can legitimately fail.
Fixes: dbe97cff7dd9 ("xen/gntdev: Avoid blocking in unmap_grant_pages()")
Cc: stable(a)vger.kernel.org
Suggested-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko(a)epam.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220710230522.1563-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 4b56c39f766d..84b143eef395 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -396,13 +396,15 @@ static void __unmap_grant_pages_done(int result,
unsigned int offset = data->unmap_ops - map->unmap_ops;
for (i = 0; i < data->count; i++) {
- WARN_ON(map->unmap_ops[offset+i].status);
+ WARN_ON(map->unmap_ops[offset + i].status != GNTST_okay &&
+ map->unmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- WARN_ON(map->kunmap_ops[offset+i].status);
+ WARN_ON(map->kunmap_ops[offset + i].status != GNTST_okay &&
+ map->kunmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 166d3863231667c4f64dee72b77d1102cdfad11f Mon Sep 17 00:00:00 2001
From: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Date: Sun, 10 Jul 2022 19:05:22 -0400
Subject: [PATCH] xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE
The error paths of gntdev_mmap() can call unmap_grant_pages() even
though not all of the pages have been successfully mapped. This will
trigger the WARN_ON()s in __unmap_grant_pages_done(). The number of
warnings can be very large; I have observed thousands of lines of
warnings in the systemd journal.
Avoid this problem by only warning on unmapping failure if the handle
being unmapped is not INVALID_GRANT_HANDLE. The handle field of any
page that was not successfully mapped will be INVALID_GRANT_HANDLE, so
this catches all cases where unmapping can legitimately fail.
Fixes: dbe97cff7dd9 ("xen/gntdev: Avoid blocking in unmap_grant_pages()")
Cc: stable(a)vger.kernel.org
Suggested-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Demi Marie Obenour <demi(a)invisiblethingslab.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko(a)epam.com>
Reviewed-by: Juergen Gross <jgross(a)suse.com>
Link: https://lore.kernel.org/r/20220710230522.1563-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross(a)suse.com>
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 4b56c39f766d..84b143eef395 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -396,13 +396,15 @@ static void __unmap_grant_pages_done(int result,
unsigned int offset = data->unmap_ops - map->unmap_ops;
for (i = 0; i < data->count; i++) {
- WARN_ON(map->unmap_ops[offset+i].status);
+ WARN_ON(map->unmap_ops[offset + i].status != GNTST_okay &&
+ map->unmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
pr_debug("unmap handle=%d st=%d\n",
map->unmap_ops[offset+i].handle,
map->unmap_ops[offset+i].status);
map->unmap_ops[offset+i].handle = INVALID_GRANT_HANDLE;
if (use_ptemod) {
- WARN_ON(map->kunmap_ops[offset+i].status);
+ WARN_ON(map->kunmap_ops[offset + i].status != GNTST_okay &&
+ map->kunmap_ops[offset + i].handle != INVALID_GRANT_HANDLE);
pr_debug("kunmap handle=%u st=%d\n",
map->kunmap_ops[offset+i].handle,
map->kunmap_ops[offset+i].status);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 39cdb68c64d84e71a4a717000b6e5de208ee60cc Mon Sep 17 00:00:00 2001
From: Yangxi Xiang <xyangxi5(a)gmail.com>
Date: Tue, 28 Jun 2022 17:33:22 +0800
Subject: [PATCH] vt: fix memory overlapping when deleting chars in the buffer
A memory overlapping copy occurs when deleting a long line. This memory
overlapping copy can cause data corruption when scr_memcpyw is optimized
to memcpy because memcpy does not ensure its behavior if the destination
buffer overlaps with the source buffer. The line buffer is not always
broken, because the memcpy utilizes the hardware acceleration, whose
result is not deterministic.
Fix this problem by using replacing the scr_memcpyw with scr_memmovew.
Fixes: 81732c3b2fed ("tty vt: Fix line garbage in virtual console on command line edition")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Yangxi Xiang <xyangxi5(a)gmail.com>
Link: https://lore.kernel.org/r/20220628093322.5688-1-xyangxi5@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index f8c87c4d7399..dfc1f4b445f3 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -855,7 +855,7 @@ static void delete_char(struct vc_data *vc, unsigned int nr)
unsigned short *p = (unsigned short *) vc->vc_pos;
vc_uniscr_delete(vc, nr);
- scr_memcpyw(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
+ scr_memmovew(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
scr_memsetw(p + vc->vc_cols - vc->state.x - nr, vc->vc_video_erase_char,
nr * 2);
vc->vc_need_wrap = 0;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 39cdb68c64d84e71a4a717000b6e5de208ee60cc Mon Sep 17 00:00:00 2001
From: Yangxi Xiang <xyangxi5(a)gmail.com>
Date: Tue, 28 Jun 2022 17:33:22 +0800
Subject: [PATCH] vt: fix memory overlapping when deleting chars in the buffer
A memory overlapping copy occurs when deleting a long line. This memory
overlapping copy can cause data corruption when scr_memcpyw is optimized
to memcpy because memcpy does not ensure its behavior if the destination
buffer overlaps with the source buffer. The line buffer is not always
broken, because the memcpy utilizes the hardware acceleration, whose
result is not deterministic.
Fix this problem by using replacing the scr_memcpyw with scr_memmovew.
Fixes: 81732c3b2fed ("tty vt: Fix line garbage in virtual console on command line edition")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Yangxi Xiang <xyangxi5(a)gmail.com>
Link: https://lore.kernel.org/r/20220628093322.5688-1-xyangxi5@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index f8c87c4d7399..dfc1f4b445f3 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -855,7 +855,7 @@ static void delete_char(struct vc_data *vc, unsigned int nr)
unsigned short *p = (unsigned short *) vc->vc_pos;
vc_uniscr_delete(vc, nr);
- scr_memcpyw(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
+ scr_memmovew(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
scr_memsetw(p + vc->vc_cols - vc->state.x - nr, vc->vc_video_erase_char,
nr * 2);
vc->vc_need_wrap = 0;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 39cdb68c64d84e71a4a717000b6e5de208ee60cc Mon Sep 17 00:00:00 2001
From: Yangxi Xiang <xyangxi5(a)gmail.com>
Date: Tue, 28 Jun 2022 17:33:22 +0800
Subject: [PATCH] vt: fix memory overlapping when deleting chars in the buffer
A memory overlapping copy occurs when deleting a long line. This memory
overlapping copy can cause data corruption when scr_memcpyw is optimized
to memcpy because memcpy does not ensure its behavior if the destination
buffer overlaps with the source buffer. The line buffer is not always
broken, because the memcpy utilizes the hardware acceleration, whose
result is not deterministic.
Fix this problem by using replacing the scr_memcpyw with scr_memmovew.
Fixes: 81732c3b2fed ("tty vt: Fix line garbage in virtual console on command line edition")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Yangxi Xiang <xyangxi5(a)gmail.com>
Link: https://lore.kernel.org/r/20220628093322.5688-1-xyangxi5@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index f8c87c4d7399..dfc1f4b445f3 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -855,7 +855,7 @@ static void delete_char(struct vc_data *vc, unsigned int nr)
unsigned short *p = (unsigned short *) vc->vc_pos;
vc_uniscr_delete(vc, nr);
- scr_memcpyw(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
+ scr_memmovew(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
scr_memsetw(p + vc->vc_cols - vc->state.x - nr, vc->vc_video_erase_char,
nr * 2);
vc->vc_need_wrap = 0;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 39cdb68c64d84e71a4a717000b6e5de208ee60cc Mon Sep 17 00:00:00 2001
From: Yangxi Xiang <xyangxi5(a)gmail.com>
Date: Tue, 28 Jun 2022 17:33:22 +0800
Subject: [PATCH] vt: fix memory overlapping when deleting chars in the buffer
A memory overlapping copy occurs when deleting a long line. This memory
overlapping copy can cause data corruption when scr_memcpyw is optimized
to memcpy because memcpy does not ensure its behavior if the destination
buffer overlaps with the source buffer. The line buffer is not always
broken, because the memcpy utilizes the hardware acceleration, whose
result is not deterministic.
Fix this problem by using replacing the scr_memcpyw with scr_memmovew.
Fixes: 81732c3b2fed ("tty vt: Fix line garbage in virtual console on command line edition")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Yangxi Xiang <xyangxi5(a)gmail.com>
Link: https://lore.kernel.org/r/20220628093322.5688-1-xyangxi5@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index f8c87c4d7399..dfc1f4b445f3 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -855,7 +855,7 @@ static void delete_char(struct vc_data *vc, unsigned int nr)
unsigned short *p = (unsigned short *) vc->vc_pos;
vc_uniscr_delete(vc, nr);
- scr_memcpyw(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
+ scr_memmovew(p, p + nr, (vc->vc_cols - vc->state.x - nr) * 2);
scr_memsetw(p + vc->vc_cols - vc->state.x - nr, vc->vc_video_erase_char,
nr * 2);
vc->vc_need_wrap = 0;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 43b5240ca6b33108998810593248186b1e3ae34a Mon Sep 17 00:00:00 2001
From: Muchun Song <songmuchun(a)bytedance.com>
Date: Thu, 9 Jun 2022 18:40:32 +0800
Subject: [PATCH] mm: sysctl: fix missing numa_stat when !CONFIG_HUGETLB_PAGE
"numa_stat" should not be included in the scope of CONFIG_HUGETLB_PAGE, if
CONFIG_HUGETLB_PAGE is not configured even if CONFIG_NUMA is configured,
"numa_stat" is missed form /proc. Move it out of CONFIG_HUGETLB_PAGE to
fix it.
Fixes: 4518085e127d ("mm, sysctl: make NUMA stats configurable")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Mel Gorman <mgorman(a)techsingularity.net>
Signed-off-by: Luis Chamberlain <mcgrof(a)kernel.org>
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index e52b6e372c60..aaf0b1f1dc57 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2091,6 +2091,17 @@ static struct ctl_table vm_table[] = {
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_TWO_HUNDRED,
},
+#ifdef CONFIG_NUMA
+ {
+ .procname = "numa_stat",
+ .data = &sysctl_vm_numa_stat,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = sysctl_vm_numa_stat_handler,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#endif
#ifdef CONFIG_HUGETLB_PAGE
{
.procname = "nr_hugepages",
@@ -2107,15 +2118,6 @@ static struct ctl_table vm_table[] = {
.mode = 0644,
.proc_handler = &hugetlb_mempolicy_sysctl_handler,
},
- {
- .procname = "numa_stat",
- .data = &sysctl_vm_numa_stat,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = sysctl_vm_numa_stat_handler,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
- },
#endif
{
.procname = "hugetlb_shm_group",
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fac47b43c760ea90e64b895dba60df0327be7775 Mon Sep 17 00:00:00 2001
From: Xiubo Li <xiubli(a)redhat.com>
Date: Mon, 11 Jul 2022 12:11:21 +0800
Subject: [PATCH] netfs: do not unlock and put the folio twice
check_write_begin() will unlock and put the folio when return
non-zero. So we should avoid unlocking and putting it twice in
netfs layer.
Change the way ->check_write_begin() works in the following two ways:
(1) Pass it a pointer to the folio pointer, allowing it to unlock and put
the folio prior to doing the stuff it wants to do, provided it clears
the folio pointer.
(2) Change the return values such that 0 with folio pointer set means
continue, 0 with folio pointer cleared means re-get and all error
codes indicating an error (no special treatment for -EAGAIN).
[ bagasdotme: use Sphinx code text syntax for *foliop pointer ]
Cc: stable(a)vger.kernel.org
Link: https://tracker.ceph.com/issues/56423
Link: https://lore.kernel.org/r/cf169f43-8ee7-8697-25da-0204d1b4343e@redhat.com
Co-developed-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: Xiubo Li <xiubli(a)redhat.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
Signed-off-by: Ilya Dryomov <idryomov(a)gmail.com>
diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
index 4d19b19bcc08..73a4176144b3 100644
--- a/Documentation/filesystems/netfs_library.rst
+++ b/Documentation/filesystems/netfs_library.rst
@@ -301,7 +301,7 @@ through which it can issue requests and negotiate::
void (*issue_read)(struct netfs_io_subrequest *subreq);
bool (*is_still_valid)(struct netfs_io_request *rreq);
int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
- struct folio *folio, void **_fsdata);
+ struct folio **foliop, void **_fsdata);
void (*done)(struct netfs_io_request *rreq);
};
@@ -381,8 +381,10 @@ The operations are as follows:
allocated/grabbed the folio to be modified to allow the filesystem to flush
conflicting state before allowing it to be modified.
- It should return 0 if everything is now fine, -EAGAIN if the folio should be
- regrabbed and any other error code to abort the operation.
+ It may unlock and discard the folio it was given and set the caller's folio
+ pointer to NULL. It should return 0 if everything is now fine (``*foliop``
+ left set) or the op should be retried (``*foliop`` cleared) and any other
+ error code to abort the operation.
* ``done``
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 42118a4f3383..d1cfb235c4b9 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -375,7 +375,7 @@ static int afs_begin_cache_operation(struct netfs_io_request *rreq)
}
static int afs_check_write_begin(struct file *file, loff_t pos, unsigned len,
- struct folio *folio, void **_fsdata)
+ struct folio **foliop, void **_fsdata)
{
struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 6dee88815491..d6e5916138e4 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -63,7 +63,7 @@
(CONGESTION_ON_THRESH(congestion_kb) >> 2))
static int ceph_netfs_check_write_begin(struct file *file, loff_t pos, unsigned int len,
- struct folio *folio, void **_fsdata);
+ struct folio **foliop, void **_fsdata);
static inline struct ceph_snap_context *page_snap_context(struct page *page)
{
@@ -1288,18 +1288,19 @@ ceph_find_incompatible(struct page *page)
}
static int ceph_netfs_check_write_begin(struct file *file, loff_t pos, unsigned int len,
- struct folio *folio, void **_fsdata)
+ struct folio **foliop, void **_fsdata)
{
struct inode *inode = file_inode(file);
struct ceph_inode_info *ci = ceph_inode(inode);
struct ceph_snap_context *snapc;
- snapc = ceph_find_incompatible(folio_page(folio, 0));
+ snapc = ceph_find_incompatible(folio_page(*foliop, 0));
if (snapc) {
int r;
- folio_unlock(folio);
- folio_put(folio);
+ folio_unlock(*foliop);
+ folio_put(*foliop);
+ *foliop = NULL;
if (IS_ERR(snapc))
return PTR_ERR(snapc);
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 42f892c5712e..0ce535852151 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -319,8 +319,9 @@ static bool netfs_skip_folio_read(struct folio *folio, loff_t pos, size_t len,
* conflicting writes once the folio is grabbed and locked. It is passed a
* pointer to the fsdata cookie that gets returned to the VM to be passed to
* write_end. It is permitted to sleep. It should return 0 if the request
- * should go ahead; unlock the folio and return -EAGAIN to cause the folio to
- * be regot; or return an error.
+ * should go ahead or it may return an error. It may also unlock and put the
+ * folio, provided it sets ``*foliop`` to NULL, in which case a return of 0
+ * will cause the folio to be re-got and the process to be retried.
*
* The calling netfs must initialise a netfs context contiguous to the vfs
* inode before calling this.
@@ -348,13 +349,13 @@ int netfs_write_begin(struct netfs_inode *ctx,
if (ctx->ops->check_write_begin) {
/* Allow the netfs (eg. ceph) to flush conflicts. */
- ret = ctx->ops->check_write_begin(file, pos, len, folio, _fsdata);
+ ret = ctx->ops->check_write_begin(file, pos, len, &folio, _fsdata);
if (ret < 0) {
trace_netfs_failure(NULL, NULL, ret, netfs_fail_check_write_begin);
- if (ret == -EAGAIN)
- goto retry;
goto error;
}
+ if (!folio)
+ goto retry;
}
if (folio_test_uptodate(folio))
@@ -416,8 +417,10 @@ int netfs_write_begin(struct netfs_inode *ctx,
error_put:
netfs_put_request(rreq, false, netfs_rreq_trace_put_failed);
error:
- folio_unlock(folio);
- folio_put(folio);
+ if (folio) {
+ folio_unlock(folio);
+ folio_put(folio);
+ }
_leave(" = %d", ret);
return ret;
}
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 1773e5df8e65..1b18dfa52e48 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -214,7 +214,7 @@ struct netfs_request_ops {
void (*issue_read)(struct netfs_io_subrequest *subreq);
bool (*is_still_valid)(struct netfs_io_request *rreq);
int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
- struct folio *folio, void **_fsdata);
+ struct folio **foliop, void **_fsdata);
void (*done)(struct netfs_io_request *rreq);
};
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fac47b43c760ea90e64b895dba60df0327be7775 Mon Sep 17 00:00:00 2001
From: Xiubo Li <xiubli(a)redhat.com>
Date: Mon, 11 Jul 2022 12:11:21 +0800
Subject: [PATCH] netfs: do not unlock and put the folio twice
check_write_begin() will unlock and put the folio when return
non-zero. So we should avoid unlocking and putting it twice in
netfs layer.
Change the way ->check_write_begin() works in the following two ways:
(1) Pass it a pointer to the folio pointer, allowing it to unlock and put
the folio prior to doing the stuff it wants to do, provided it clears
the folio pointer.
(2) Change the return values such that 0 with folio pointer set means
continue, 0 with folio pointer cleared means re-get and all error
codes indicating an error (no special treatment for -EAGAIN).
[ bagasdotme: use Sphinx code text syntax for *foliop pointer ]
Cc: stable(a)vger.kernel.org
Link: https://tracker.ceph.com/issues/56423
Link: https://lore.kernel.org/r/cf169f43-8ee7-8697-25da-0204d1b4343e@redhat.com
Co-developed-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: Xiubo Li <xiubli(a)redhat.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Signed-off-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
Signed-off-by: Ilya Dryomov <idryomov(a)gmail.com>
diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
index 4d19b19bcc08..73a4176144b3 100644
--- a/Documentation/filesystems/netfs_library.rst
+++ b/Documentation/filesystems/netfs_library.rst
@@ -301,7 +301,7 @@ through which it can issue requests and negotiate::
void (*issue_read)(struct netfs_io_subrequest *subreq);
bool (*is_still_valid)(struct netfs_io_request *rreq);
int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
- struct folio *folio, void **_fsdata);
+ struct folio **foliop, void **_fsdata);
void (*done)(struct netfs_io_request *rreq);
};
@@ -381,8 +381,10 @@ The operations are as follows:
allocated/grabbed the folio to be modified to allow the filesystem to flush
conflicting state before allowing it to be modified.
- It should return 0 if everything is now fine, -EAGAIN if the folio should be
- regrabbed and any other error code to abort the operation.
+ It may unlock and discard the folio it was given and set the caller's folio
+ pointer to NULL. It should return 0 if everything is now fine (``*foliop``
+ left set) or the op should be retried (``*foliop`` cleared) and any other
+ error code to abort the operation.
* ``done``
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 42118a4f3383..d1cfb235c4b9 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -375,7 +375,7 @@ static int afs_begin_cache_operation(struct netfs_io_request *rreq)
}
static int afs_check_write_begin(struct file *file, loff_t pos, unsigned len,
- struct folio *folio, void **_fsdata)
+ struct folio **foliop, void **_fsdata)
{
struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 6dee88815491..d6e5916138e4 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -63,7 +63,7 @@
(CONGESTION_ON_THRESH(congestion_kb) >> 2))
static int ceph_netfs_check_write_begin(struct file *file, loff_t pos, unsigned int len,
- struct folio *folio, void **_fsdata);
+ struct folio **foliop, void **_fsdata);
static inline struct ceph_snap_context *page_snap_context(struct page *page)
{
@@ -1288,18 +1288,19 @@ ceph_find_incompatible(struct page *page)
}
static int ceph_netfs_check_write_begin(struct file *file, loff_t pos, unsigned int len,
- struct folio *folio, void **_fsdata)
+ struct folio **foliop, void **_fsdata)
{
struct inode *inode = file_inode(file);
struct ceph_inode_info *ci = ceph_inode(inode);
struct ceph_snap_context *snapc;
- snapc = ceph_find_incompatible(folio_page(folio, 0));
+ snapc = ceph_find_incompatible(folio_page(*foliop, 0));
if (snapc) {
int r;
- folio_unlock(folio);
- folio_put(folio);
+ folio_unlock(*foliop);
+ folio_put(*foliop);
+ *foliop = NULL;
if (IS_ERR(snapc))
return PTR_ERR(snapc);
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 42f892c5712e..0ce535852151 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -319,8 +319,9 @@ static bool netfs_skip_folio_read(struct folio *folio, loff_t pos, size_t len,
* conflicting writes once the folio is grabbed and locked. It is passed a
* pointer to the fsdata cookie that gets returned to the VM to be passed to
* write_end. It is permitted to sleep. It should return 0 if the request
- * should go ahead; unlock the folio and return -EAGAIN to cause the folio to
- * be regot; or return an error.
+ * should go ahead or it may return an error. It may also unlock and put the
+ * folio, provided it sets ``*foliop`` to NULL, in which case a return of 0
+ * will cause the folio to be re-got and the process to be retried.
*
* The calling netfs must initialise a netfs context contiguous to the vfs
* inode before calling this.
@@ -348,13 +349,13 @@ int netfs_write_begin(struct netfs_inode *ctx,
if (ctx->ops->check_write_begin) {
/* Allow the netfs (eg. ceph) to flush conflicts. */
- ret = ctx->ops->check_write_begin(file, pos, len, folio, _fsdata);
+ ret = ctx->ops->check_write_begin(file, pos, len, &folio, _fsdata);
if (ret < 0) {
trace_netfs_failure(NULL, NULL, ret, netfs_fail_check_write_begin);
- if (ret == -EAGAIN)
- goto retry;
goto error;
}
+ if (!folio)
+ goto retry;
}
if (folio_test_uptodate(folio))
@@ -416,8 +417,10 @@ int netfs_write_begin(struct netfs_inode *ctx,
error_put:
netfs_put_request(rreq, false, netfs_rreq_trace_put_failed);
error:
- folio_unlock(folio);
- folio_put(folio);
+ if (folio) {
+ folio_unlock(folio);
+ folio_put(folio);
+ }
_leave(" = %d", ret);
return ret;
}
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 1773e5df8e65..1b18dfa52e48 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -214,7 +214,7 @@ struct netfs_request_ops {
void (*issue_read)(struct netfs_io_subrequest *subreq);
bool (*is_still_valid)(struct netfs_io_request *rreq);
int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,
- struct folio *folio, void **_fsdata);
+ struct folio **foliop, void **_fsdata);
void (*done)(struct netfs_io_request *rreq);
};
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b24dcf1dc507f69ed3b5c66c2b6a0209ae80d4d4 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 12 Jul 2022 16:21:32 +0100
Subject: [PATCH] drm/i915/gt: Serialize GRDOM access between multiple engine
resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Reported-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda(a)intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7…
(cherry picked from commit 336561a914fc0c6f1218228718f633b31b7af1c3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index a5338c3fde7a..c68d36fb5bbd 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -300,9 +300,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
return err;
}
-static int gen6_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
u32 hw_mask;
@@ -321,6 +321,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
return gen6_hw_domain_reset(gt, hw_mask);
}
+static int gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(>->uncore->lock, flags);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
+ return ret;
+}
+
static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
{
int vecs_id;
@@ -487,9 +501,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
}
-static int gen11_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen11_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@@ -583,8 +597,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
struct intel_engine_cs *engine;
const bool reset_non_ready = retry >= 1;
intel_engine_mask_t tmp;
+ unsigned long flags;
int ret;
+ spin_lock_irqsave(>->uncore->lock, flags);
+
for_each_engine_masked(engine, gt, engine_mask, tmp) {
ret = gen8_engine_reset_prepare(engine);
if (ret && !reset_non_ready)
@@ -612,17 +629,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
* This is best effort, so ignore any error from the initial reset.
*/
if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
- gen11_reset_engines(gt, gt->info.engine_mask, 0);
+ __gen11_reset_engines(gt, gt->info.engine_mask, 0);
if (GRAPHICS_VER(gt->i915) >= 11)
- ret = gen11_reset_engines(gt, engine_mask, retry);
+ ret = __gen11_reset_engines(gt, engine_mask, retry);
else
- ret = gen6_reset_engines(gt, engine_mask, retry);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
skip_reset:
for_each_engine_masked(engine, gt, engine_mask, tmp)
gen8_engine_reset_cancel(engine);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
return ret;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b24dcf1dc507f69ed3b5c66c2b6a0209ae80d4d4 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 12 Jul 2022 16:21:32 +0100
Subject: [PATCH] drm/i915/gt: Serialize GRDOM access between multiple engine
resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Reported-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda(a)intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7…
(cherry picked from commit 336561a914fc0c6f1218228718f633b31b7af1c3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index a5338c3fde7a..c68d36fb5bbd 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -300,9 +300,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
return err;
}
-static int gen6_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
u32 hw_mask;
@@ -321,6 +321,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
return gen6_hw_domain_reset(gt, hw_mask);
}
+static int gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(>->uncore->lock, flags);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
+ return ret;
+}
+
static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
{
int vecs_id;
@@ -487,9 +501,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
}
-static int gen11_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen11_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@@ -583,8 +597,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
struct intel_engine_cs *engine;
const bool reset_non_ready = retry >= 1;
intel_engine_mask_t tmp;
+ unsigned long flags;
int ret;
+ spin_lock_irqsave(>->uncore->lock, flags);
+
for_each_engine_masked(engine, gt, engine_mask, tmp) {
ret = gen8_engine_reset_prepare(engine);
if (ret && !reset_non_ready)
@@ -612,17 +629,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
* This is best effort, so ignore any error from the initial reset.
*/
if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
- gen11_reset_engines(gt, gt->info.engine_mask, 0);
+ __gen11_reset_engines(gt, gt->info.engine_mask, 0);
if (GRAPHICS_VER(gt->i915) >= 11)
- ret = gen11_reset_engines(gt, engine_mask, retry);
+ ret = __gen11_reset_engines(gt, engine_mask, retry);
else
- ret = gen6_reset_engines(gt, engine_mask, retry);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
skip_reset:
for_each_engine_masked(engine, gt, engine_mask, tmp)
gen8_engine_reset_cancel(engine);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
return ret;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b24dcf1dc507f69ed3b5c66c2b6a0209ae80d4d4 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 12 Jul 2022 16:21:32 +0100
Subject: [PATCH] drm/i915/gt: Serialize GRDOM access between multiple engine
resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Reported-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda(a)intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7…
(cherry picked from commit 336561a914fc0c6f1218228718f633b31b7af1c3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index a5338c3fde7a..c68d36fb5bbd 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -300,9 +300,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
return err;
}
-static int gen6_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
u32 hw_mask;
@@ -321,6 +321,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
return gen6_hw_domain_reset(gt, hw_mask);
}
+static int gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(>->uncore->lock, flags);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
+ return ret;
+}
+
static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
{
int vecs_id;
@@ -487,9 +501,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
}
-static int gen11_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen11_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@@ -583,8 +597,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
struct intel_engine_cs *engine;
const bool reset_non_ready = retry >= 1;
intel_engine_mask_t tmp;
+ unsigned long flags;
int ret;
+ spin_lock_irqsave(>->uncore->lock, flags);
+
for_each_engine_masked(engine, gt, engine_mask, tmp) {
ret = gen8_engine_reset_prepare(engine);
if (ret && !reset_non_ready)
@@ -612,17 +629,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
* This is best effort, so ignore any error from the initial reset.
*/
if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
- gen11_reset_engines(gt, gt->info.engine_mask, 0);
+ __gen11_reset_engines(gt, gt->info.engine_mask, 0);
if (GRAPHICS_VER(gt->i915) >= 11)
- ret = gen11_reset_engines(gt, engine_mask, retry);
+ ret = __gen11_reset_engines(gt, engine_mask, retry);
else
- ret = gen6_reset_engines(gt, engine_mask, retry);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
skip_reset:
for_each_engine_masked(engine, gt, engine_mask, tmp)
gen8_engine_reset_cancel(engine);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
return ret;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b24dcf1dc507f69ed3b5c66c2b6a0209ae80d4d4 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 12 Jul 2022 16:21:32 +0100
Subject: [PATCH] drm/i915/gt: Serialize GRDOM access between multiple engine
resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Reported-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda(a)intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7…
(cherry picked from commit 336561a914fc0c6f1218228718f633b31b7af1c3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index a5338c3fde7a..c68d36fb5bbd 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -300,9 +300,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
return err;
}
-static int gen6_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
u32 hw_mask;
@@ -321,6 +321,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
return gen6_hw_domain_reset(gt, hw_mask);
}
+static int gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(>->uncore->lock, flags);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
+ return ret;
+}
+
static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
{
int vecs_id;
@@ -487,9 +501,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
}
-static int gen11_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen11_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@@ -583,8 +597,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
struct intel_engine_cs *engine;
const bool reset_non_ready = retry >= 1;
intel_engine_mask_t tmp;
+ unsigned long flags;
int ret;
+ spin_lock_irqsave(>->uncore->lock, flags);
+
for_each_engine_masked(engine, gt, engine_mask, tmp) {
ret = gen8_engine_reset_prepare(engine);
if (ret && !reset_non_ready)
@@ -612,17 +629,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
* This is best effort, so ignore any error from the initial reset.
*/
if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
- gen11_reset_engines(gt, gt->info.engine_mask, 0);
+ __gen11_reset_engines(gt, gt->info.engine_mask, 0);
if (GRAPHICS_VER(gt->i915) >= 11)
- ret = gen11_reset_engines(gt, engine_mask, retry);
+ ret = __gen11_reset_engines(gt, engine_mask, retry);
else
- ret = gen6_reset_engines(gt, engine_mask, retry);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
skip_reset:
for_each_engine_masked(engine, gt, engine_mask, tmp)
gen8_engine_reset_cancel(engine);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
return ret;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b24dcf1dc507f69ed3b5c66c2b6a0209ae80d4d4 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 12 Jul 2022 16:21:32 +0100
Subject: [PATCH] drm/i915/gt: Serialize GRDOM access between multiple engine
resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Reported-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda(a)intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7…
(cherry picked from commit 336561a914fc0c6f1218228718f633b31b7af1c3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index a5338c3fde7a..c68d36fb5bbd 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -300,9 +300,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
return err;
}
-static int gen6_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
u32 hw_mask;
@@ -321,6 +321,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
return gen6_hw_domain_reset(gt, hw_mask);
}
+static int gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(>->uncore->lock, flags);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
+ return ret;
+}
+
static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
{
int vecs_id;
@@ -487,9 +501,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
}
-static int gen11_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen11_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@@ -583,8 +597,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
struct intel_engine_cs *engine;
const bool reset_non_ready = retry >= 1;
intel_engine_mask_t tmp;
+ unsigned long flags;
int ret;
+ spin_lock_irqsave(>->uncore->lock, flags);
+
for_each_engine_masked(engine, gt, engine_mask, tmp) {
ret = gen8_engine_reset_prepare(engine);
if (ret && !reset_non_ready)
@@ -612,17 +629,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
* This is best effort, so ignore any error from the initial reset.
*/
if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
- gen11_reset_engines(gt, gt->info.engine_mask, 0);
+ __gen11_reset_engines(gt, gt->info.engine_mask, 0);
if (GRAPHICS_VER(gt->i915) >= 11)
- ret = gen11_reset_engines(gt, engine_mask, retry);
+ ret = __gen11_reset_engines(gt, engine_mask, retry);
else
- ret = gen6_reset_engines(gt, engine_mask, retry);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
skip_reset:
for_each_engine_masked(engine, gt, engine_mask, tmp)
gen8_engine_reset_cancel(engine);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
return ret;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b24dcf1dc507f69ed3b5c66c2b6a0209ae80d4d4 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 12 Jul 2022 16:21:32 +0100
Subject: [PATCH] drm/i915/gt: Serialize GRDOM access between multiple engine
resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Reported-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda(a)intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7…
(cherry picked from commit 336561a914fc0c6f1218228718f633b31b7af1c3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index a5338c3fde7a..c68d36fb5bbd 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -300,9 +300,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
return err;
}
-static int gen6_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
u32 hw_mask;
@@ -321,6 +321,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
return gen6_hw_domain_reset(gt, hw_mask);
}
+static int gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(>->uncore->lock, flags);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
+ return ret;
+}
+
static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
{
int vecs_id;
@@ -487,9 +501,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
}
-static int gen11_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen11_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@@ -583,8 +597,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
struct intel_engine_cs *engine;
const bool reset_non_ready = retry >= 1;
intel_engine_mask_t tmp;
+ unsigned long flags;
int ret;
+ spin_lock_irqsave(>->uncore->lock, flags);
+
for_each_engine_masked(engine, gt, engine_mask, tmp) {
ret = gen8_engine_reset_prepare(engine);
if (ret && !reset_non_ready)
@@ -612,17 +629,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
* This is best effort, so ignore any error from the initial reset.
*/
if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
- gen11_reset_engines(gt, gt->info.engine_mask, 0);
+ __gen11_reset_engines(gt, gt->info.engine_mask, 0);
if (GRAPHICS_VER(gt->i915) >= 11)
- ret = gen11_reset_engines(gt, engine_mask, retry);
+ ret = __gen11_reset_engines(gt, engine_mask, retry);
else
- ret = gen6_reset_engines(gt, engine_mask, retry);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
skip_reset:
for_each_engine_masked(engine, gt, engine_mask, tmp)
gen8_engine_reset_cancel(engine);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
return ret;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b24dcf1dc507f69ed3b5c66c2b6a0209ae80d4d4 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 12 Jul 2022 16:21:32 +0100
Subject: [PATCH] drm/i915/gt: Serialize GRDOM access between multiple engine
resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Don't allow two engines to be reset in parallel, as they would both
try to select a reset bit (and send requests to common registers)
and wait on that register, at the same time. Serialize control of
the reset requests/acks using the uncore->lock, which will also ensure
that no other GT state changes at the same time as the actual reset.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Reported-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda(a)intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7…
(cherry picked from commit 336561a914fc0c6f1218228718f633b31b7af1c3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index a5338c3fde7a..c68d36fb5bbd 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -300,9 +300,9 @@ static int gen6_hw_domain_reset(struct intel_gt *gt, u32 hw_domain_mask)
return err;
}
-static int gen6_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
u32 hw_mask;
@@ -321,6 +321,20 @@ static int gen6_reset_engines(struct intel_gt *gt,
return gen6_hw_domain_reset(gt, hw_mask);
}
+static int gen6_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
+{
+ unsigned long flags;
+ int ret;
+
+ spin_lock_irqsave(>->uncore->lock, flags);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
+ return ret;
+}
+
static struct intel_engine_cs *find_sfc_paired_vecs_engine(struct intel_engine_cs *engine)
{
int vecs_id;
@@ -487,9 +501,9 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
rmw_clear_fw(uncore, sfc_lock.lock_reg, sfc_lock.lock_bit);
}
-static int gen11_reset_engines(struct intel_gt *gt,
- intel_engine_mask_t engine_mask,
- unsigned int retry)
+static int __gen11_reset_engines(struct intel_gt *gt,
+ intel_engine_mask_t engine_mask,
+ unsigned int retry)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@@ -583,8 +597,11 @@ static int gen8_reset_engines(struct intel_gt *gt,
struct intel_engine_cs *engine;
const bool reset_non_ready = retry >= 1;
intel_engine_mask_t tmp;
+ unsigned long flags;
int ret;
+ spin_lock_irqsave(>->uncore->lock, flags);
+
for_each_engine_masked(engine, gt, engine_mask, tmp) {
ret = gen8_engine_reset_prepare(engine);
if (ret && !reset_non_ready)
@@ -612,17 +629,19 @@ static int gen8_reset_engines(struct intel_gt *gt,
* This is best effort, so ignore any error from the initial reset.
*/
if (IS_DG2(gt->i915) && engine_mask == ALL_ENGINES)
- gen11_reset_engines(gt, gt->info.engine_mask, 0);
+ __gen11_reset_engines(gt, gt->info.engine_mask, 0);
if (GRAPHICS_VER(gt->i915) >= 11)
- ret = gen11_reset_engines(gt, engine_mask, retry);
+ ret = __gen11_reset_engines(gt, engine_mask, retry);
else
- ret = gen6_reset_engines(gt, engine_mask, retry);
+ ret = __gen6_reset_engines(gt, engine_mask, retry);
skip_reset:
for_each_engine_masked(engine, gt, engine_mask, tmp)
gen8_engine_reset_cancel(engine);
+ spin_unlock_irqrestore(>->uncore->lock, flags);
+
return ret;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a1c5a7bf79c1faa5633b918b5c0666545e84c4d1 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris.p.wilson(a)intel.com>
Date: Tue, 12 Jul 2022 16:21:33 +0100
Subject: [PATCH] drm/i915/gt: Serialize TLB invalidates with GT resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Avoid trying to invalidate the TLB in the middle of performing an
engine reset, as this may result in the reset timing out. Currently,
the TLB invalidate is only serialised by its own mutex, forgoing the
uncore lock, but we can take the uncore->lock as well to serialise
the mmio access, thereby serialising with the GDRST.
Tested on a NUC5i7RYB, BIOS RYBDWi35.86A.0380.2019.0517.1530 with
i915 selftest/hangcheck.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Reported-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Tested-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Reviewed-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Chris Wilson <chris.p.wilson(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1e59a7c45dd919a530256b9ac721a…
(cherry picked from commit 33da97894758737895e90c909f16786052680ef4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 51a0fe60c050..531af6ad7007 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -1209,6 +1209,20 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
mutex_lock(>->tlb_invalidate_lock);
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
+ spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */
+
+ for_each_engine(engine, gt, id) {
+ struct reg_and_bit rb;
+
+ rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+ if (!i915_mmio_reg_offset(rb.reg))
+ continue;
+
+ intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+ }
+
+ spin_unlock_irq(&uncore->lock);
+
for_each_engine(engine, gt, id) {
/*
* HW architecture suggest typical invalidation time at 40us,
@@ -1223,7 +1237,6 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
if (!i915_mmio_reg_offset(rb.reg))
continue;
- intel_uncore_write_fw(uncore, rb.reg, rb.bit);
if (__intel_wait_for_register_fw(uncore,
rb.reg, rb.bit, 0,
timeout_us, timeout_ms,
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a1c5a7bf79c1faa5633b918b5c0666545e84c4d1 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris.p.wilson(a)intel.com>
Date: Tue, 12 Jul 2022 16:21:33 +0100
Subject: [PATCH] drm/i915/gt: Serialize TLB invalidates with GT resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Avoid trying to invalidate the TLB in the middle of performing an
engine reset, as this may result in the reset timing out. Currently,
the TLB invalidate is only serialised by its own mutex, forgoing the
uncore lock, but we can take the uncore->lock as well to serialise
the mmio access, thereby serialising with the GDRST.
Tested on a NUC5i7RYB, BIOS RYBDWi35.86A.0380.2019.0517.1530 with
i915 selftest/hangcheck.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Reported-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Tested-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Reviewed-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Chris Wilson <chris.p.wilson(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1e59a7c45dd919a530256b9ac721a…
(cherry picked from commit 33da97894758737895e90c909f16786052680ef4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 51a0fe60c050..531af6ad7007 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -1209,6 +1209,20 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
mutex_lock(>->tlb_invalidate_lock);
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
+ spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */
+
+ for_each_engine(engine, gt, id) {
+ struct reg_and_bit rb;
+
+ rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+ if (!i915_mmio_reg_offset(rb.reg))
+ continue;
+
+ intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+ }
+
+ spin_unlock_irq(&uncore->lock);
+
for_each_engine(engine, gt, id) {
/*
* HW architecture suggest typical invalidation time at 40us,
@@ -1223,7 +1237,6 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
if (!i915_mmio_reg_offset(rb.reg))
continue;
- intel_uncore_write_fw(uncore, rb.reg, rb.bit);
if (__intel_wait_for_register_fw(uncore,
rb.reg, rb.bit, 0,
timeout_us, timeout_ms,
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a1c5a7bf79c1faa5633b918b5c0666545e84c4d1 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris.p.wilson(a)intel.com>
Date: Tue, 12 Jul 2022 16:21:33 +0100
Subject: [PATCH] drm/i915/gt: Serialize TLB invalidates with GT resets
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Avoid trying to invalidate the TLB in the middle of performing an
engine reset, as this may result in the reset timing out. Currently,
the TLB invalidate is only serialised by its own mutex, forgoing the
uncore lock, but we can take the uncore->lock as well to serialise
the mmio access, thereby serialising with the GDRST.
Tested on a NUC5i7RYB, BIOS RYBDWi35.86A.0380.2019.0517.1530 with
i915 selftest/hangcheck.
Cc: stable(a)vger.kernel.org # v4.4 and upper
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Reported-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Tested-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Reviewed-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Chris Wilson <chris.p.wilson(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti(a)linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1e59a7c45dd919a530256b9ac721a…
(cherry picked from commit 33da97894758737895e90c909f16786052680ef4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 51a0fe60c050..531af6ad7007 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -1209,6 +1209,20 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
mutex_lock(>->tlb_invalidate_lock);
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
+ spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */
+
+ for_each_engine(engine, gt, id) {
+ struct reg_and_bit rb;
+
+ rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+ if (!i915_mmio_reg_offset(rb.reg))
+ continue;
+
+ intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+ }
+
+ spin_unlock_irq(&uncore->lock);
+
for_each_engine(engine, gt, id) {
/*
* HW architecture suggest typical invalidation time at 40us,
@@ -1223,7 +1237,6 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
if (!i915_mmio_reg_offset(rb.reg))
continue;
- intel_uncore_write_fw(uncore, rb.reg, rb.bit);
if (__intel_wait_for_register_fw(uncore,
rb.reg, rb.bit, 0,
timeout_us, timeout_ms,
Hyper-V driver has advertised support for multi-MSI, but any attempt at
using the feature would fallback to a single MSI (non-starter for
devices that require multi-MSI). The fallback also covered up other
bugs related to multi-MSI functionality rooted in the driver not being
able to tell MSIs apart.
These patches fix those bugs by enabling hv multi-MSI through IOMMU
remapping, distinguishing multi-MSIs from the initial MSI of the MSI
block, preventing retargeting of MSI subsets from invalidating the IRTE
block, and aiding hypervisor to preserve the block of requests.
Tested on 5.15.54
Jeffrey Hugo (4):
PCI: hv: Fix multi-MSI to allow more than one MSI vector
PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI
PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()
PCI: hv: Fix interrupt mapping for multi-MSI
drivers/pci/controller/pci-hyperv.c | 99 +++++++++++++++++++++--------
1 file changed, 73 insertions(+), 26 deletions(-)
--
2.25.1
Hi,
these two patches are the backport for 3.18.y of the following commits:
commit a96ef8b504efb2ad445dfb6d54f9488c3ddf23d2
bus: mhi: host: pci_generic: add Telit FN980 v1 hardware revisio
commit 77fc41204734042861210b9d05338c9b8360affb
bus: mhi: host: pci_generic: add Telit FN990
The cherry-pick of the original commits don't apply because the commit
89ad19bea649 (bus: mhi: host: pci_generic: Sort mhi_pci_id_table based
on the PID, 2022-04-11) was not cherry-picked. Another solution is to
cherry-pick those three commits all togheter.
In the following days I will send backport for longterm releases.
Thanks!
Daniele Palmas (2):
bus: mhi: host: pci_generic: add Telit FN980 v1 hardware revision
bus: mhi: host: pci_generic: add Telit FN990
drivers/bus/mhi/host/pci_generic.c | 79 ++++++++++++++++++++++++++++++
1 file changed, 79 insertions(+)
--
2.37.1
The rng's random_init() function contributes the real time to the rng at
boot time, so that events can at least start in relation to something
particular in the real world. But this clock might not yet be set that
point in boot, so nothing is contributed. In addition, the relation
between minor clock changes from, say, NTP, and the cycle counter is
potentially useful entropic data.
This commit addresses this by mixing in a time stamp on calls to
settimeofday and adjtimex. No entropy is credited in doing so, so it
doesn't make initialization faster, but it is still useful input to
have.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
kernel/time/timekeeping.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 8e4b3c32fcf9..ad55da792f13 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1346,6 +1346,9 @@ int do_settimeofday64(const struct timespec64 *ts)
if (!ret)
audit_tk_injoffset(ts_delta);
+ ktime_get_real_ts64(&xt);
+ add_device_randomness(&xt, sizeof(xt));
+
return ret;
}
EXPORT_SYMBOL(do_settimeofday64);
@@ -2475,6 +2478,9 @@ int do_adjtimex(struct __kernel_timex *txc)
ntp_notify_cmos_timer();
+ ktime_get_real_ts64(&ts);
+ add_device_randomness(&ts, sizeof(ts));
+
return ret;
}
--
2.35.1
This bug is marked as fixed by commit:
net: core: netlink: add helper refcount dec and lock function
net: sched: add helper function to take reference to Qdisc
net: sched: extend Qdisc with rcu
net: sched: rename qdisc_destroy() to qdisc_put()
net: sched: use Qdisc rcu API instead of relying on rtnl lock
But I can't find it in any tested tree for more than 90 days.
Is it a correct commit? Please update it by replying:
#syz fix: exact-commit-title
Until then the bug is still considered open and
new crashes with the same signature are ignored.
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 5871321fb4558c55bf9567052b618ff0be6b975e ]
We currently report that range controls accept a range of 0..(max-min) but
accept writes in the range 0..(max-min+1). Remove that extra +1.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220604105246.4055214-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index e693070f51fe..d867f449d82d 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -526,7 +526,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
@@ -547,7 +547,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
--
2.35.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 5871321fb4558c55bf9567052b618ff0be6b975e ]
We currently report that range controls accept a range of 0..(max-min) but
accept writes in the range 0..(max-min+1). Remove that extra +1.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220604105246.4055214-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index f32ba64c5dda..e73360e9de8f 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -526,7 +526,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
@@ -547,7 +547,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
--
2.35.1
UML generally does not provide access to special CPU instructions like
RDRAND, and execution tends to be rather deterministic, with no real
hardware interrupts, making good randomness really very hard, if not
all together impossible. Not only is this a security eyebrow raiser, but
it's also quite annoying when trying to do various pieces of UML-based
automation that takes a long time to boot, if ever.
Fix this by trivially calling getrandom() in the host and using that
seed as "bootloader randomness", which initializes the rng immediately
at UML boot.
The old behavior can be restored the same way as on any other arch, by
way of CONFIG_TRUST_BOOTLOADER_RANDOMNESS=n or
random.trust_bootloader=0. So seen from that perspective, this just
makes UML act like other archs, which is positive in its own right.
Cc: stable(a)vger.kernel.org
Cc: Johannes Berg <johannes(a)sipsolutions.net>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
arch/um/include/shared/os.h | 7 +++++++
arch/um/kernel/um_arch.c | 8 ++++++++
arch/um/os-Linux/util.c | 6 ++++++
3 files changed, 21 insertions(+)
diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
index fafde1d5416e..79644dd88d58 100644
--- a/arch/um/include/shared/os.h
+++ b/arch/um/include/shared/os.h
@@ -11,6 +11,12 @@
#include <irq_user.h>
#include <longjmp.h>
#include <mm_id.h>
+/* This is to get size_t */
+#ifndef __UM_HOST__
+#include <linux/types.h>
+#else
+#include <stddef.h>
+#endif
#define CATCH_EINTR(expr) while ((errno = 0, ((expr) < 0)) && (errno == EINTR))
@@ -243,6 +249,7 @@ extern void stack_protections(unsigned long address);
extern int raw(int fd);
extern void setup_machinename(char *machine_out);
extern void setup_hostinfo(char *buf, int len);
+extern ssize_t os_getrandom(void *buf, size_t len, unsigned int flags);
extern void os_dump_core(void) __attribute__ ((noreturn));
extern void um_early_printk(const char *s, unsigned int n);
extern void os_fix_helper_signals(void);
diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c
index 0760e24f2eba..74f3efd96bd4 100644
--- a/arch/um/kernel/um_arch.c
+++ b/arch/um/kernel/um_arch.c
@@ -16,6 +16,7 @@
#include <linux/sched/task.h>
#include <linux/kmsg_dump.h>
#include <linux/suspend.h>
+#include <linux/random.h>
#include <asm/processor.h>
#include <asm/cpufeature.h>
@@ -406,6 +407,8 @@ int __init __weak read_initrd(void)
void __init setup_arch(char **cmdline_p)
{
+ u8 rng_seed[32];
+
stack_protections((unsigned long) &init_thread_info);
setup_physmem(uml_physmem, uml_reserved, physmem_size, highmem);
mem_total_pages(physmem_size, iomem_size, highmem);
@@ -416,6 +419,11 @@ void __init setup_arch(char **cmdline_p)
strlcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
*cmdline_p = command_line;
setup_hostinfo(host_info, sizeof host_info);
+
+ if (os_getrandom(rng_seed, sizeof(rng_seed), 0) == sizeof(rng_seed)) {
+ add_bootloader_randomness(rng_seed, sizeof(rng_seed));
+ memzero_explicit(rng_seed, sizeof(rng_seed));
+ }
}
void __init check_bugs(void)
diff --git a/arch/um/os-Linux/util.c b/arch/um/os-Linux/util.c
index 41297ec404bf..fc0f2a9dee5a 100644
--- a/arch/um/os-Linux/util.c
+++ b/arch/um/os-Linux/util.c
@@ -14,6 +14,7 @@
#include <sys/wait.h>
#include <sys/mman.h>
#include <sys/utsname.h>
+#include <sys/random.h>
#include <init.h>
#include <os.h>
@@ -96,6 +97,11 @@ static inline void __attribute__ ((noreturn)) uml_abort(void)
exit(127);
}
+ssize_t os_getrandom(void *buf, size_t len, unsigned int flags)
+{
+ return getrandom(buf, len, flags);
+}
+
/*
* UML helper threads must not handle SIGWINCH/INT/TERM
*/
--
2.35.1
The quilt patch titled
Subject: mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
has been removed from the -mm tree. Its filename was
mm-hugetlb-separate-path-for-hwpoison-entry-in-copy_hugetlb_page_range.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Subject: mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
Date: Mon, 4 Jul 2022 10:33:05 +0900
Originally copy_hugetlb_page_range() handles migration entries and
hwpoisoned entries in similar manner. But recently the related code path
has more code for migration entries, and when
is_writable_migration_entry() was converted to
!is_readable_migration_entry(), hwpoison entries on source processes got
to be unexpectedly updated (which is legitimate for migration entries, but
not for hwpoison entries). This results in unexpected serious issues like
kernel panic when forking processes with hwpoison entries in pmd.
Separate the if branch into one for hwpoison entries and one for migration
entries.
Link: https://lkml.kernel.org/r/20220704013312.2415700-3-naoya.horiguchi@linux.dev
Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org> [5.18]
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Liu Shixin <liushixin2(a)huawei.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Yang Shi <shy828301(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-separate-path-for-hwpoison-entry-in-copy_hugetlb_page_range
+++ a/mm/hugetlb.c
@@ -4788,8 +4788,13 @@ again:
* sharing with another vma.
*/
;
- } else if (unlikely(is_hugetlb_entry_migration(entry) ||
- is_hugetlb_entry_hwpoisoned(entry))) {
+ } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
+ bool uffd_wp = huge_pte_uffd_wp(entry);
+
+ if (!userfaultfd_wp(dst_vma) && uffd_wp)
+ entry = huge_pte_clear_uffd_wp(entry);
+ set_huge_pte_at(dst, addr, dst_pte, entry);
+ } else if (unlikely(is_hugetlb_entry_migration(entry))) {
swp_entry_t swp_entry = pte_to_swp_entry(entry);
bool uffd_wp = huge_pte_uffd_wp(entry);
_
Patches currently in -mm which might be from naoya.horiguchi(a)nec.com are
mm-hugetlb-check-gigantic_page_runtime_supported-in-return_unused_surplus_pages.patch
mm-hugetlb-make-pud_huge-and-follow_huge_pud-aware-of-non-present-pud-entry.patch
mm-hwpoison-hugetlb-support-saving-mechanism-of-raw-error-pages.patch
mm-hwpoison-make-unpoison-aware-of-raw-error-info-in-hwpoisoned-hugepage.patch
mm-hwpoison-set-pg_hwpoison-for-busy-hugetlb-pages.patch
mm-hwpoison-make-__page_handle_poison-returns-int.patch
mm-hwpoison-skip-raw-hwpoison-page-in-freeing-1gb-hugepage.patch
mm-hwpoison-enable-memory-error-handling-on-1gb-hugepage.patch
The quilt patch titled
Subject: mm: fix missing wake-up event for FSDAX pages
has been removed from the -mm tree. Its filename was
mm-fix-missing-wake-up-event-for-fsdax-pages.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: fix missing wake-up event for FSDAX pages
Date: Tue, 5 Jul 2022 20:35:32 +0800
FSDAX page refcounts are 1-based, rather than 0-based: if refcount is
1, then the page is freed. The FSDAX pages can be pinned through GUP,
then they will be unpinned via unpin_user_page() using a folio variant
to put the page, however, folio variants did not consider this special
case, the result will be to miss a wakeup event (like the user of
__fuse_dax_break_layouts()). This results in a task being permanently
stuck in TASK_INTERRUPTIBLE state.
Since FSDAX pages are only possibly obtained by GUP users, so fix GUP
instead of folio_put() to lower overhead.
Link: https://lkml.kernel.org/r/20220705123532.283-1-songmuchun@bytedance.com
Fixes: d8ddc099c6b3 ("mm/gup: Add gup_put_folio()")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Suggested-by: Matthew Wilcox <willy(a)infradead.org>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: William Kucharski <william.kucharski(a)oracle.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 14 +++++++++-----
mm/gup.c | 6 ++++--
mm/memremap.c | 6 +++---
3 files changed, 16 insertions(+), 10 deletions(-)
--- a/include/linux/mm.h~mm-fix-missing-wake-up-event-for-fsdax-pages
+++ a/include/linux/mm.h
@@ -1130,23 +1130,27 @@ static inline bool is_zone_movable_page(
#if defined(CONFIG_ZONE_DEVICE) && defined(CONFIG_FS_DAX)
DECLARE_STATIC_KEY_FALSE(devmap_managed_key);
-bool __put_devmap_managed_page(struct page *page);
-static inline bool put_devmap_managed_page(struct page *page)
+bool __put_devmap_managed_page_refs(struct page *page, int refs);
+static inline bool put_devmap_managed_page_refs(struct page *page, int refs)
{
if (!static_branch_unlikely(&devmap_managed_key))
return false;
if (!is_zone_device_page(page))
return false;
- return __put_devmap_managed_page(page);
+ return __put_devmap_managed_page_refs(page, refs);
}
-
#else /* CONFIG_ZONE_DEVICE && CONFIG_FS_DAX */
-static inline bool put_devmap_managed_page(struct page *page)
+static inline bool put_devmap_managed_page_refs(struct page *page, int refs)
{
return false;
}
#endif /* CONFIG_ZONE_DEVICE && CONFIG_FS_DAX */
+static inline bool put_devmap_managed_page(struct page *page)
+{
+ return put_devmap_managed_page_refs(page, 1);
+}
+
/* 127: arbitrary random number, small enough to assemble well */
#define folio_ref_zero_or_close_to_overflow(folio) \
((unsigned int) folio_ref_count(folio) + 127u <= 127u)
--- a/mm/gup.c~mm-fix-missing-wake-up-event-for-fsdax-pages
+++ a/mm/gup.c
@@ -87,7 +87,8 @@ retry:
* belongs to this folio.
*/
if (unlikely(page_folio(page) != folio)) {
- folio_put_refs(folio, refs);
+ if (!put_devmap_managed_page_refs(&folio->page, refs))
+ folio_put_refs(folio, refs);
goto retry;
}
@@ -176,7 +177,8 @@ static void gup_put_folio(struct folio *
refs *= GUP_PIN_COUNTING_BIAS;
}
- folio_put_refs(folio, refs);
+ if (!put_devmap_managed_page_refs(&folio->page, refs))
+ folio_put_refs(folio, refs);
}
/**
--- a/mm/memremap.c~mm-fix-missing-wake-up-event-for-fsdax-pages
+++ a/mm/memremap.c
@@ -499,7 +499,7 @@ void free_zone_device_page(struct page *
}
#ifdef CONFIG_FS_DAX
-bool __put_devmap_managed_page(struct page *page)
+bool __put_devmap_managed_page_refs(struct page *page, int refs)
{
if (page->pgmap->type != MEMORY_DEVICE_FS_DAX)
return false;
@@ -509,9 +509,9 @@ bool __put_devmap_managed_page(struct pa
* refcount is 1, then the page is free and the refcount is
* stable because nobody holds a reference on the page.
*/
- if (page_ref_dec_return(page) == 1)
+ if (page_ref_sub_return(page, refs) == 1)
wake_up_var(&page->_refcount);
return true;
}
-EXPORT_SYMBOL(__put_devmap_managed_page);
+EXPORT_SYMBOL(__put_devmap_managed_page_refs);
#endif /* CONFIG_FS_DAX */
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlb_vmemmap-delete-hugetlb_optimize_vmemmap_enabled.patch
mm-hugetlb_vmemmap-optimize-vmemmap_optimize_mode-handling.patch
mm-hugetlb_vmemmap-introduce-the-name-hvo.patch
mm-hugetlb_vmemmap-move-vmemmap-code-related-to-hugetlb-to-hugetlb_vmemmapc.patch
mm-hugetlb_vmemmap-replace-early_param-with-core_param.patch
mm-hugetlb_vmemmap-improve-hugetlb_vmemmap-code-readability.patch
mm-hugetlb_vmemmap-move-code-comments-to-vmemmap_deduprst.patch
mm-hugetlb_vmemmap-use-ptrs_per_pte-instead-of-pmd_size-page_size.patch
The quilt patch titled
Subject: mm: fix page leak with multiple threads mapping the same page
has been removed from the -mm tree. Its filename was
mm-fix-page-leak-with-multiple-threads-mapping-the-same-page.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Josef Bacik <josef(a)toxicpanda.com>
Subject: mm: fix page leak with multiple threads mapping the same page
Date: Tue, 5 Jul 2022 16:00:36 -0400
We have an application with a lot of threads that use a shared mmap backed
by tmpfs mounted with -o huge=within_size. This application started
leaking loads of huge pages when we upgraded to a recent kernel.
Using the page ref tracepoints and a BPF program written by Tejun Heo we
were able to determine that these pages would have multiple refcounts from
the page fault path, but when it came to unmap time we wouldn't drop the
number of refs we had added from the faults.
I wrote a reproducer that mmap'ed a file backed by tmpfs with -o
huge=always, and then spawned 20 threads all looping faulting random
offsets in this map, while using madvise(MADV_DONTNEED) randomly for huge
page aligned ranges. This very quickly reproduced the problem.
The problem here is that we check for the case that we have multiple
threads faulting in a range that was previously unmapped. One thread maps
the PMD, the other thread loses the race and then returns 0. However at
this point we already have the page, and we are no longer putting this
page into the processes address space, and so we leak the page. We
actually did the correct thing prior to f9ce0be71d1f, however it looks
like Kirill copied what we do in the anonymous page case. In the
anonymous page case we don't yet have a page, so we don't have to drop a
reference on anything. Previously we did the correct thing for file based
faults by returning VM_FAULT_NOPAGE so we correctly drop the reference on
the page we faulted in.
Fix this by returning VM_FAULT_NOPAGE in the pmd_devmap_trans_unstable()
case, this makes us drop the ref on the page properly, and now my
reproducer no longer leaks the huge pages.
[josef(a)toxicpanda.com: v2]
Link: https://lkml.kernel.org/r/e90c8f0dbae836632b669c2afc434006a00d4a67.16577214…
Link: https://lkml.kernel.org/r/2b798acfd95c9ab9395fe85e8d5a835e2e10a920.16570511…
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Signed-off-by: Chris Mason <clm(a)fb.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/mm/memory.c~mm-fix-page-leak-with-multiple-threads-mapping-the-same-page
+++ a/mm/memory.c
@@ -4369,9 +4369,12 @@ vm_fault_t finish_fault(struct vm_fault
return VM_FAULT_OOM;
}
- /* See comment in handle_pte_fault() */
+ /*
+ * See comment in handle_pte_fault() for how this scenario happens, we
+ * need to return NOPAGE so that we drop this page.
+ */
if (pmd_devmap_trans_unstable(vmf->pmd))
- return 0;
+ return VM_FAULT_NOPAGE;
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
vmf->address, &vmf->ptl);
_
Patches currently in -mm which might be from josef(a)toxicpanda.com are
The quilt patch titled
Subject: hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte
has been removed from the -mm tree. Its filename was
hugetlb-fix-memoryleak-in-hugetlb_mcopy_atomic_pte.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte
Date: Sat, 9 Jul 2022 17:26:29 +0800
When alloc_huge_page fails, *pagep is set to NULL without put_page first.
So the hugepage indicated by *pagep is leaked.
Link: https://lkml.kernel.org/r/20220709092629.54291-1-linmiaohe@huawei.com
Fixes: 8cc5fcbb5be8 ("mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Acked-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
Reviewed-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/hugetlb.c~hugetlb-fix-memoryleak-in-hugetlb_mcopy_atomic_pte
+++ a/mm/hugetlb.c
@@ -5952,6 +5952,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
page = alloc_huge_page(dst_vma, dst_addr, 0);
if (IS_ERR(page)) {
+ put_page(*pagep);
ret = -ENOMEM;
*pagep = NULL;
goto out;
_
Patches currently in -mm which might be from linmiaohe(a)huawei.com are
mm-hugetlb-avoid-corrupting-page-mapping-in-hugetlb_mcopy_atomic_pte.patch
mm-page_alloc-minor-clean-up-for-memmap_init_compound.patch
mm-mmapc-fix-missing-call-to-vm_unacct_memory-in-mmap_region.patch
filemap-minor-cleanup-for-filemap_write_and_wait_range.patch
mm-huge_memory-use-flush_pmd_tlb_range-in-move_huge_pmd.patch
mm-huge_memory-access-vm_page_prot-with-read_once-in-remove_migration_pmd.patch
mm-huge_memory-fix-comment-of-__pud_trans_huge_lock.patch
mm-huge_memory-use-helper-touch_pud-in-huge_pud_set_accessed.patch
mm-huge_memory-use-helper-touch_pmd-in-huge_pmd_set_accessed.patch
mm-huge_memory-rename-mmun_start-to-haddr-in-remove_migration_pmd.patch
mm-huge_memory-use-helper-function-vma_lookup-in-split_huge_pages_pid.patch
mm-huge_memory-use-helper-macro-__attr_rw.patch
mm-huge_memory-fix-comment-in-zap_huge_pud.patch
mm-huge_memory-check-pmd_present-first-in-is_huge_zero_pmd.patch
mm-huge_memory-try-to-free-subpage-in-swapcache-when-possible.patch
mm-huge_memory-minor-cleanup-for-split_huge_pages_all.patch
mm-huge_memory-fix-comment-of-page_deferred_list.patch
mm-huge_memory-correct-comment-of-prep_transhuge_page.patch
mm-huge_memory-comment-the-subtly-logic-in-__split_huge_pmd.patch
mm-huge_memory-use-helper-macro-is_err_or_null-in-split_huge_pages_pid.patch
mm-page_vma_mappedc-use-helper-function-huge_pte_lock.patch
mm-mmap-fix-obsolete-comment-of-find_extend_vma.patch
mm-remove-obsolete-comment-in-do_fault_around.patch
The quilt patch titled
Subject: fs: sendfile handles O_NONBLOCK of out_fd
has been removed from the -mm tree. Its filename was
fs-sendfile-handles-o_nonblock-of-out_fd.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Andrei Vagin <avagin(a)gmail.com>
Subject: fs: sendfile handles O_NONBLOCK of out_fd
sendfile has to return EAGAIN if out_fd is nonblocking and the write into
it would block.
Here is a small reproducer for the problem:
#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/sendfile.h>
#define FILE_SIZE (1UL << 30)
int main(int argc, char **argv) {
int p[2], fd;
if (pipe2(p, O_NONBLOCK))
return 1;
fd = open(argv[1], O_RDWR | O_TMPFILE, 0666);
if (fd < 0)
return 1;
ftruncate(fd, FILE_SIZE);
if (sendfile(p[1], fd, 0, FILE_SIZE) == -1) {
fprintf(stderr, "FAIL\n");
}
if (sendfile(p[1], fd, 0, FILE_SIZE) != -1 || errno != EAGAIN) {
fprintf(stderr, "FAIL\n");
}
return 0;
}
It worked before b964bf53e540, it is stuck after b964bf53e540, and it
works again with this fix.
This regression occurred because do_splice_direct() calls pipe_write
that handles O_NONBLOCK. Here is a trace log from the reproducer:
1) | __x64_sys_sendfile64() {
1) | do_sendfile() {
1) | __fdget()
1) | rw_verify_area()
1) | __fdget()
1) | rw_verify_area()
1) | do_splice_direct() {
1) | rw_verify_area()
1) | splice_direct_to_actor() {
1) | do_splice_to() {
1) | rw_verify_area()
1) | generic_file_splice_read()
1) + 74.153 us | }
1) | direct_splice_actor() {
1) | iter_file_splice_write() {
1) | __kmalloc()
1) 0.148 us | pipe_lock();
1) 0.153 us | splice_from_pipe_next.part.0();
1) 0.162 us | page_cache_pipe_buf_confirm();
... 16 times
1) 0.159 us | page_cache_pipe_buf_confirm();
1) | vfs_iter_write() {
1) | do_iter_write() {
1) | rw_verify_area()
1) | do_iter_readv_writev() {
1) | pipe_write() {
1) | mutex_lock()
1) 0.153 us | mutex_unlock();
1) 1.368 us | }
1) 1.686 us | }
1) 5.798 us | }
1) 6.084 us | }
1) 0.174 us | kfree();
1) 0.152 us | pipe_unlock();
1) + 14.461 us | }
1) + 14.783 us | }
1) 0.164 us | page_cache_pipe_buf_release();
... 16 times
1) 0.161 us | page_cache_pipe_buf_release();
1) | touch_atime()
1) + 95.854 us | }
1) + 99.784 us | }
1) ! 107.393 us | }
1) ! 107.699 us | }
Link: https://lkml.kernel.org/r/20220415005015.525191-1-avagin@gmail.com
Fixes: b964bf53e540 ("teach sendfile(2) to handle send-to-pipe directly")
Signed-off-by: Andrei Vagin <avagin(a)gmail.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/read_write.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/read_write.c~fs-sendfile-handles-o_nonblock-of-out_fd
+++ a/fs/read_write.c
@@ -1263,6 +1263,9 @@ static ssize_t do_sendfile(int out_fd, i
count, fl);
file_end_write(out.file);
} else {
+ if (out.file->f_flags & O_NONBLOCK)
+ fl |= SPLICE_F_NONBLOCK;
+
retval = splice_file_to_pipe(in.file, opipe, &pos, count, fl);
}
_
Patches currently in -mm which might be from avagin(a)gmail.com are
The patch titled
Subject: mm/damon/reclaim: fix potential memory leak in damon_reclaim_init()
has been added to the -mm mm-unstable branch. Its filename is
mm-damon-reclaim-fix-potential-memory-leak-in-damon_reclaim_init.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jianglei Nie <niejianglei2021(a)163.com>
Subject: mm/damon/reclaim: fix potential memory leak in damon_reclaim_init()
Date: Thu, 14 Jul 2022 14:37:46 +0800
damon_reclaim_init() allocates a memory chunk for ctx with
damon_new_ctx(). When damon_select_ops() fails, ctx is not released,
which will lead to a memory leak.
We should release the ctx with damon_destroy_ctx() when damon_select_ops()
fails to fix the memory leak.
Link: https://lkml.kernel.org/r/20220714063746.2343549-1-niejianglei2021@163.com
Fixes: 4d69c3457821 ("mm/damon/reclaim: use damon_select_ops() instead of damon_{v,p}a_set_operations()")
Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/reclaim.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/damon/reclaim.c~mm-damon-reclaim-fix-potential-memory-leak-in-damon_reclaim_init
+++ a/mm/damon/reclaim.c
@@ -435,8 +435,10 @@ static int __init damon_reclaim_init(voi
if (!ctx)
return -ENOMEM;
- if (damon_select_ops(ctx, DAMON_OPS_PADDR))
+ if (damon_select_ops(ctx, DAMON_OPS_PADDR)) {
+ damon_destroy_ctx(ctx);
return -EINVAL;
+ }
ctx->callback.after_wmarks_check = damon_reclaim_after_wmarks_check;
ctx->callback.after_aggregation = damon_reclaim_after_aggregation;
_
Patches currently in -mm which might be from niejianglei2021(a)163.com are
mm-damon-reclaim-fix-potential-memory-leak-in-damon_reclaim_init.patch
Good Day My Dear Friend,
I sent this mail praying it will get to you in a good condition of
health, since I myself are in a very critical health condition in
which I sleep every night without knowing if I may be alive to see the
next day. I bring peace and love to you. It is by the grace of God, I
had no choice than to do what is lawful and right in the sight of God
for eternal life and in the sight of man, for witness of God’s mercy
and glory upon my life,I am Mrs, Grayson Jackie, a widow I am
suffering from a long time brain tumor, It has defiled all forms of
medical treatment, and right now I have about a few months to leave,
according to medical experts.The situation has gotten complicated
recently with my inability to hear proper, am communicating with you
with the help of the chief nurse herein the hospital, from all
indication my conditions is really deteriorating and it is quite
obvious that, according to my doctors they have advised me that I may
not live too long, Because this illness has gotten to a very bad
stage. I plead that you will not expose or betray this trust and
confidence that I am about to repose on you for the mutual benefit of
the orphans and the less privilege. I have some funds I inherited from
my late husband, the sum of ($11,500,000.00 Dollars).Having known my
condition, I decided to donate this fund to you believing that you
will utilize it the way i am going to instruct herein.
I need you to assist me and reclaim this money and use it for
Charity works, for orphanages and gives justice and help to the poor,
needy and widows says The Lord." Jeremiah 22:15-16." and also build
schools for less privilege that will be named after my late husband if
possible and to promote the word of God and the effort that the house
of God is maintained. I do not want a situation where this money will
be used in an ungodly manner. That's why I'm taking this decision. I'm
not afraid of death, so I know where I'm going. I accept this decision
because I do not have any child who will inherit this money after I
die. Please I want your sincerely and urgent answer to know if you
will be able to execute this project for the glory of God, and I will
give you more information on how the fund will be transferred to your
bank account. May the grace, peace, love and the truth in the Word of
God be with you and all those that you love and care for.
I'm waiting for your immediate reply,
Faithfylly.
Mrs, Grayson Jackie.
Writting From the hospital.
May God Bless you.
The first patch of this patchset fixes 'feature_persistent' parameter
handling in 'blkback' to respect the frontend's persistent grants
support always. The fix makes a behavioral change, so the second patch
makes the counterpart of 'blkfront' to consistently follow the behavior
change.
Changes from v2
(https://lore.kernel.org/xen-devel/20220714224410.51147-1-sj@kernel.org/)
- Keep the behavioral change of v1
- Update blkfront's counterpart to follow the changed behavior
- Update documents for the changed behavior
Changes from v1
(https://lore.kernel.org/xen-devel/20220106091013.126076-1-mheyne@amazon.de/)
- Avoid the behavioral change
(https://lore.kernel.org/xen-devel/20220121102309.27802-1-sj@kernel.org/)
- Rebase on latest xen/tip/linux-next
- Re-work by SeongJae Park <sj(a)kernel.org>
- Cc stable@
Maximilian Heyne (1):
xen, blkback: fix persistent grants negotiation
SeongJae Park (1):
xen-blkfront: Apply 'feature_persistent' parameter when connect
Documentation/ABI/testing/sysfs-driver-xen-blkback | 2 +-
Documentation/ABI/testing/sysfs-driver-xen-blkfront | 2 +-
drivers/block/xen-blkback/xenbus.c | 9 +++------
drivers/block/xen-blkfront.c | 4 +---
4 files changed, 6 insertions(+), 11 deletions(-)
--
2.25.1
Hyper-V driver has advertised support for multi-MSI, but any attempt at
using the feature would fallback to a single MSI (non-starter for
devices that require multi-MSI). The fallback also covered up other
bugs related to multi-MSI functionality rooted in the driver not being
able to tell MSIs apart.
These patches fix those bugs by enabling hv multi-MSI through IOMMU
remapping, distinguishing multi-MSIs from the initial MSI of the MSI
block, preventing retargeting of MSI subsets from invalidating the IRTE
block, and aiding hypervisor to preserve the block of requests.
Tested on 5.10.130
Jeffrey Hugo (4):
PCI: hv: Fix multi-MSI to allow more than one MSI vector
PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI
PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()
PCI: hv: Fix interrupt mapping for multi-MSI
drivers/pci/controller/pci-hyperv.c | 99 +++++++++++++++++++++--------
1 file changed, 73 insertions(+), 26 deletions(-)
--
2.25.1
Hyper-V driver has advertised support for multi-MSI, but any attempt at
using the feature would fallback to a single MSI (non-starter for
devices that require multi-MSI). The fallback also covered up other
bugs related to multi-MSI functionality rooted in the driver not being
able to tell MSIs apart.
These patches fix those bugs by enabling hv multi-MSI through IOMMU
remapping, distinguishing multi-MSIs from the initial MSI of the MSI
block, preventing retargeting of MSI subsets from invalidating the IRTE
block, and aiding hypervisor to preserve the block of requests.
Tested on 5.4.205
Jeffrey Hugo (4):
PCI: hv: Fix multi-MSI to allow more than one MSI vector
PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI
PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()
PCI: hv: Fix interrupt mapping for multi-MSI
drivers/pci/controller/pci-hyperv.c | 99 +++++++++++++++++++++--------
1 file changed, 73 insertions(+), 26 deletions(-)
--
2.25.1
From: Sean Wang <sean.wang(a)mediatek.com>
This reverts commit 663457f421d41e9d2fcb1e84baf43d1433f80c08 that is the
commit 44c4237cf3436bda2b185ff728123651ad133f69 upstream.
Because there was mistake in
'649178c0493e ("mt76: mt7921e: fix possible probe failure after reboot")'
that caused WiFi reset cannot work well as the reported issue
"PROBLEM: [Stable v5.15.42+] [mt7921] Wake after suspend locks up system
when mt7921-driver is used on a Lenovo ThinkPad E15 G3" described in
http://lists.infradead.org/pipermail/linux-mediatek/2022-June/042668.html
So we need to revert the patch first to avoid the conflict of reverting
'649178c0493e ("mt76: mt7921e: fix possible probe failure after reboot")'
and will be applied back later after fixing.
Signed-off-by: Sean Wang <sean.wang(a)mediatek.com>
---
v2: update changelog text
---
drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
index 3d35838ef306..7d9b23a00238 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
@@ -254,10 +254,8 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
dev->bus_ops = dev->mt76.bus;
bus_ops = devm_kmemdup(dev->mt76.dev, dev->bus_ops, sizeof(*bus_ops),
GFP_KERNEL);
- if (!bus_ops) {
- ret = -ENOMEM;
- goto err_free_dev;
- }
+ if (!bus_ops)
+ return -ENOMEM;
bus_ops->rr = mt7921_rr;
bus_ops->wr = mt7921_wr;
@@ -266,7 +264,7 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
ret = __mt7921_mcu_drv_pmctrl(dev);
if (ret)
- goto err_free_dev;
+ return ret;
mdev->rev = (mt7921_l1_rr(dev, MT_HW_CHIPID) << 16) |
(mt7921_l1_rr(dev, MT_HW_REV) & 0xff);
--
2.25.1
Syzkaller reports use-after-free for net_device's in 5.10 stable releases.
The problem has been fixed by the following patch series and
it can be cleanly applied to the 5.10 branch.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Using bin_attributes with a 0 size causes fstat and friends to return that
0 size. This breaks userspace code that retrieves the size before reading
the file. Rather than reverting 75bd50fa841 ("drivers/base/node.c: use
bin_attribute to break the size limitation of cpumap ABI") let's put in a
size value at compile time.
For cpulist the maximum size is on the order of
NR_CPUS * (ceil(log10(NR_CPUS)) + 1)/2
which for 8192 is 20480 (8192 * 5)/2. In order to get near that you'd need
a system with every other CPU on one node. For example: (0,2,4,8, ... ).
To simplify the math and support larger NR_CPUS in the future we are using
(NR_CPUS * 7)/2. We also set it to a min of PAGE_SIZE to retain the older
behavior for smaller NR_CPUS.
The cpumap file the size works out to be NR_CPUS/4 + NR_CPUS/32 - 1
(or NR_CPUS * 9/32 - 1) including the ","s.
Add a set of macros for these values to cpumask.h so they can be used in
multiple places. Apply these to the handful of such files in
drivers/base/topology.c as well as node.c.
As an example, on an 80 cpu 4-node system (NR_CPUS == 8192):
before:
-r--r--r--. 1 root root 0 Jul 12 14:08 system/node/node0/cpulist
-r--r--r--. 1 root root 0 Jul 11 17:25 system/node/node0/cpumap
after:
-r--r--r--. 1 root root 28672 Jul 13 11:32 system/node/node0/cpulist
-r--r--r--. 1 root root 4096 Jul 13 11:31 system/node/node0/cpumap
CONFIG_NR_CPUS = 16384
-r--r--r--. 1 root root 57344 Jul 13 14:03 system/node/node0/cpulist
-r--r--r--. 1 root root 4607 Jul 13 14:02 system/node/node0/cpumap
The actual number of cpus doesn't matter for the reported size since they
are based on NR_CPUS.
Fixes: 75bd50fa841 ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI")
Fixes: bb9ec13d156 ("topology: use bin_attribute to break the size limitation of cpumap ABI")
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Yury Norov <yury.norov(a)gmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Phil Auld <pauld(a)redhat.com>
---
v2: Fix cpumap size calculation. Increase multiplier for cpulist size.
v3: Add comments in code.
v4: Define constants in cpumask.h. Move comments there. Also fix
topology.c.
v5: Fixed math based on Yury's corrections.
drivers/base/node.c | 4 ++--
drivers/base/topology.c | 32 ++++++++++++++++----------------
include/linux/cpumask.h | 18 ++++++++++++++++++
3 files changed, 36 insertions(+), 18 deletions(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 0ac6376ef7a1..eb0f43784c2b 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -45,7 +45,7 @@ static inline ssize_t cpumap_read(struct file *file, struct kobject *kobj,
return n;
}
-static BIN_ATTR_RO(cpumap, 0);
+static BIN_ATTR_RO(cpumap, CPUMAP_FILE_MAX_BYTES);
static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
struct bin_attribute *attr, char *buf,
@@ -66,7 +66,7 @@ static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
return n;
}
-static BIN_ATTR_RO(cpulist, 0);
+static BIN_ATTR_RO(cpulist, CPULIST_FILE_MAX_BYTES);
/**
* struct node_access_nodes - Access class device to hold user visible
diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index ac6ad9ab67f9..89f98be5c5b9 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -62,47 +62,47 @@ define_id_show_func(ppin, "0x%llx");
static DEVICE_ATTR_ADMIN_RO(ppin);
define_siblings_read_func(thread_siblings, sibling_cpumask);
-static BIN_ATTR_RO(thread_siblings, 0);
-static BIN_ATTR_RO(thread_siblings_list, 0);
+static BIN_ATTR_RO(thread_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(thread_siblings_list, CPULIST_FILE_MAX_BYTES);
define_siblings_read_func(core_cpus, sibling_cpumask);
-static BIN_ATTR_RO(core_cpus, 0);
-static BIN_ATTR_RO(core_cpus_list, 0);
+static BIN_ATTR_RO(core_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(core_cpus_list, CPULIST_FILE_MAX_BYTES);
define_siblings_read_func(core_siblings, core_cpumask);
-static BIN_ATTR_RO(core_siblings, 0);
-static BIN_ATTR_RO(core_siblings_list, 0);
+static BIN_ATTR_RO(core_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(core_siblings_list, CPULIST_FILE_MAX_BYTES);
#ifdef TOPOLOGY_CLUSTER_SYSFS
define_siblings_read_func(cluster_cpus, cluster_cpumask);
-static BIN_ATTR_RO(cluster_cpus, 0);
-static BIN_ATTR_RO(cluster_cpus_list, 0);
+static BIN_ATTR_RO(cluster_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(cluster_cpus_list, CPULIST_FILE_MAX_BYTES);
#endif
#ifdef TOPOLOGY_DIE_SYSFS
define_siblings_read_func(die_cpus, die_cpumask);
-static BIN_ATTR_RO(die_cpus, 0);
-static BIN_ATTR_RO(die_cpus_list, 0);
+static BIN_ATTR_RO(die_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(die_cpus_list, CPULIST_FILE_MAX_BYTES);
#endif
define_siblings_read_func(package_cpus, core_cpumask);
-static BIN_ATTR_RO(package_cpus, 0);
-static BIN_ATTR_RO(package_cpus_list, 0);
+static BIN_ATTR_RO(package_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(package_cpus_list, CPULIST_FILE_MAX_BYTES);
#ifdef TOPOLOGY_BOOK_SYSFS
define_id_show_func(book_id, "%d");
static DEVICE_ATTR_RO(book_id);
define_siblings_read_func(book_siblings, book_cpumask);
-static BIN_ATTR_RO(book_siblings, 0);
-static BIN_ATTR_RO(book_siblings_list, 0);
+static BIN_ATTR_RO(book_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(book_siblings_list, CPULIST_FILE_MAX_BYTES);
#endif
#ifdef TOPOLOGY_DRAWER_SYSFS
define_id_show_func(drawer_id, "%d");
static DEVICE_ATTR_RO(drawer_id);
define_siblings_read_func(drawer_siblings, drawer_cpumask);
-static BIN_ATTR_RO(drawer_siblings, 0);
-static BIN_ATTR_RO(drawer_siblings_list, 0);
+static BIN_ATTR_RO(drawer_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(drawer_siblings_list, CPULIST_FILE_MAX_BYTES);
#endif
static struct bin_attribute *bin_attrs[] = {
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index fe29ac7cc469..4592d0845941 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -1071,4 +1071,22 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
[0] = 1UL \
} }
+/*
+ * Provide a valid theoretical max size for cpumap and cpulist sysfs files
+ * to avoid breaking userspace which may allocate a buffer based on the size
+ * reported by e.g. fstat.
+ *
+ * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
+ *
+ * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
+ * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
+ * cover a worst-case of every other cpu being on one of two nodes for a
+ * very large NR_CPUS.
+ *
+ * Use PAGE_SIZE as a minimum for smaller configurations.
+ */
+#define CPUMAP_FILE_MAX_BYTES ((((NR_CPUS * 9)/32 - 1) > PAGE_SIZE) \
+ ? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
+#define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
+
#endif /* __LINUX_CPUMASK_H */
--
2.31.1
Hyper-V driver has advertised support for multi-MSI, but any attempt at
using the feature would fallback to a single MSI (non-starter for
devices that require multi-MSI). The fallback also covered up other
bugs related to multi-MSI functionality rooted in the driver not being
able to tell MSIs apart.
These patches fix those bugs by enabling hv multi-MSI through IOMMU
remapping, distinguishing multi-MSIs from the initial MSI of the MSI
block, preventing retargeting of MSI subsets from invalidating the IRTE
block, and aiding hypervisor to preserve the block of requests.
Tested on 5.18.10-stable. Looking to backport to all long-term stable
branches (post any possible merge-conflict resolution and testing
documented in future patch-sets of course.)
Jeffrey Hugo (4):
PCI: hv: Fix multi-MSI to allow more than one MSI vector
PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI
PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()
PCI: hv: Fix interrupt mapping for multi-MSI
drivers/pci/controller/pci-hyperv.c | 99 +++++++++++++++++++++--------
1 file changed, 73 insertions(+), 26 deletions(-)
--
2.25.1
Hello,
Oleksandr, thank you for Cc-ing Andrii. Andrii, thank you for the comment!
On Fri, 15 Jul 2022 15:00:10 +0300 Andrii Chepurnyi <andrii.chepurnyi82(a)gmail.com> wrote:
> [-- Attachment #1: Type: text/plain, Size: 5237 bytes --]
>
> Hello All,
>
> I faced the mentioned issue recently and just to bring more context here is
> our setup:
> We use pvblock backend for Android guest. It starts using u-boot with
> pvblock support(which frontend doesn't support the persistent grants
> feature), later it loads and starts the Linux kernel(which frontend
> supports the persistent grants feature). So in total, we have sequent two
> different frontends reconnection, the first of which doesn't support
> persistent grants.
> So the original patch [1] perfectly solves the original issue and provides
> the ability to use persistent grants after the reconnection when Linux
> frontend which supports persistent grants comes into play.
> At the same time [2] will disable the persistent grants feature for the
> first and second frontend.
Thank you for this great explanation of your situation.
> Is it possible to keep [1] as is?
Yes, my concerns about Max's original patch[1] are conflicting behavior
description in the document[1] and different behavior on blkfront-side
'feature_persistent' parameter. I will post Max's patch again with patches for
blkfront behavior change and Documents updates.
[1] https://lore.kernel.org/xen-devel/20220121102309.27802-1-sj@kernel.org/
Thanks,
SJ
>
> [1]
> https://lore.kernel.org/xen-devel/20220106091013.126076-1-mheyne@amazon.de/
> [2] https://lore.kernel.org/xen-devel/20220714224410.51147-1-sj@kernel.org/
>
> Best regards,
> Andrii
>
> On Fri, Jul 15, 2022 at 1:15 PM Oleksandr <olekstysh(a)gmail.com> wrote:
>
> >
> > On 15.07.22 01:44, SeongJae Park wrote:
> >
> >
> > Hello all.
> >
> > Adding Andrii Chepurnyi to CC who have played with the use-case which
> > required reconnect recently and faced some issues with
> > feature_persistent handling.
[...]
The bitops compile-time optimization series revealed one more
problem in olpc-xo1-sci.c:send_ebook_state(), resulted in GCC
warnings:
arch/x86/platform/olpc/olpc-xo1-sci.c: In function 'send_ebook_state':
arch/x86/platform/olpc/olpc-xo1-sci.c:83:63: warning: logical not is only applied to the left hand side of comparison [-Wlogical-not-parentheses]
83 | if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
| ^~
arch/x86/platform/olpc/olpc-xo1-sci.c:83:13: note: add parentheses around left hand side expression to silence this warning
Despite this code working as intended, this redundant double
negation of boolean value, together with comparing to `char`
with no explicit conversion to bool, makes compilers think
the author made some unintentional logical mistakes here.
Make it the other way around and negate the char instead
to silence the warnings.
Fixes: d2aa37411b8e ("x86/olpc/xo1/sci: Produce wakeup events for buttons and switches")
Cc: stable(a)vger.kernel.org # 3.5+
Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Reported-by: kernel test robot <lkp(a)intel.com>
Reviewed-and-tested-by: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Alexander Lobakin <alexandr.lobakin(a)intel.com>
---
arch/x86/platform/olpc/olpc-xo1-sci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/platform/olpc/olpc-xo1-sci.c b/arch/x86/platform/olpc/olpc-xo1-sci.c
index f03a6883dcc6..89f25af4b3c3 100644
--- a/arch/x86/platform/olpc/olpc-xo1-sci.c
+++ b/arch/x86/platform/olpc/olpc-xo1-sci.c
@@ -80,7 +80,7 @@ static void send_ebook_state(void)
return;
}
- if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
+ if (test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == !!state)
return; /* Nothing new to report. */
input_report_switch(ebook_switch_idev, SW_TABLET_MODE, state);
--
2.36.1
The DEVICE_BUSY_TIMEOUT value is described in the Reference Manual as:
| Timeout waiting for NAND Ready/Busy or ATA IRQ. Used in WAIT_FOR_READY
| mode. This value is the number of GPMI_CLK cycles multiplied by 4096.
So instead of multiplying the value in cycles with 4096, we have to
divide it by that value. Use DIV_ROUND_UP to make sure we are on the
safe side, especially when the calculated value in cycles is smaller
than 4096 as typically the case.
This bug likely never triggered because any timeout != 0 usually will
do. In my case the busy timeout in cycles was originally calculated as
2408, which multiplied with 4096 is 0x968000. The lower 16 bits were
taken for the 16 bit wide register field, so the register value was
0x8000. With 2970bf5a32f0 ("mtd: rawnand: gpmi: fix controller timings
setting") however the value in cycles became 2384, which multiplied
with 4096 is 0x950000. The lower 16 bit are 0x0 now resulting in an
intermediate timeout when reading from NAND.
Fixes: b1206122069aa ("mtd: rawnand: gpmi: use core timings instead of an empirical derivation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sascha Hauer <s.hauer(a)pengutronix.de>
---
Just a resend with +Cc: stable(a)vger.kernel.org
drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
index 0b68d05846e18..889e403299568 100644
--- a/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
@@ -890,7 +890,7 @@ static int gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
hw->timing0 = BF_GPMI_TIMING0_ADDRESS_SETUP(addr_setup_cycles) |
BF_GPMI_TIMING0_DATA_HOLD(data_hold_cycles) |
BF_GPMI_TIMING0_DATA_SETUP(data_setup_cycles);
- hw->timing1 = BF_GPMI_TIMING1_BUSY_TIMEOUT(busy_timeout_cycles * 4096);
+ hw->timing1 = BF_GPMI_TIMING1_BUSY_TIMEOUT(DIV_ROUND_UP(busy_timeout_cycles, 4096));
/*
* Derive NFC ideal delay from {3}:
--
2.30.2
Hello again,
This is a 5.15.y only set of xfs fixes. They have been tested and Ack'd.
Changes from v1->v2:
- Added Acks
[v1] https://lore.kernel.org/linux-xfs/20220707223828.599185-1-leah.rumancik@gma…
Darrick J. Wong (2):
xfs: only run COW extent recovery when there are no live extents
xfs: don't include bnobt blocks when reserving free block pool
Dave Chinner (2):
xfs: run callbacks before waking waiters in
xlog_state_shutdown_callbacks
xfs: drop async cache flushes from CIL commits.
fs/xfs/xfs_bio_io.c | 35 ------------------------
fs/xfs/xfs_fsops.c | 2 +-
fs/xfs/xfs_linux.h | 2 --
fs/xfs/xfs_log.c | 58 +++++++++++++++++-----------------------
fs/xfs/xfs_log_cil.c | 42 +++++++++--------------------
fs/xfs/xfs_log_priv.h | 3 +--
fs/xfs/xfs_log_recover.c | 24 ++++++++++++++++-
fs/xfs/xfs_mount.c | 12 +--------
fs/xfs/xfs_mount.h | 15 +++++++++++
fs/xfs/xfs_reflink.c | 5 +++-
fs/xfs/xfs_super.c | 9 -------
11 files changed, 82 insertions(+), 125 deletions(-)
--
2.37.0.170.g444d1eabd0-goog
From: Daniel Bristot de Oliveira <bristot(a)redhat.com>
commit 2586af1ac187f6b3a50930a4e33497074e81762d upstream.
The RT_RUNTIME_SHARE sched feature enables the sharing of rt_runtime
between CPUs, allowing a CPU to run a real-time task up to 100% of the
time while leaving more space for non-real-time tasks to run on the CPU
that lend rt_runtime.
The problem is that a CPU can easily borrow enough rt_runtime to allow
a spinning rt-task to run forever, starving per-cpu tasks like kworkers,
which are non-real-time by design.
This patch disables RT_RUNTIME_SHARE by default, avoiding this problem.
The feature will still be present for users that want to enable it,
though.
Signed-off-by: Daniel Bristot de Oliveira <bristot(a)redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Tested-by: Wei Wang <wvw(a)google.com>
Link: https://lkml.kernel.org/r/b776ab46817e3db5d8ef79175fa0d71073c051c7.16006979…
Signed-off-by: Mark-PK Tsai <mark-pk.tsai(a)mediatek.com>
Cc: stable(a)vger.kernel.org
---
kernel/sched/features.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 2410db5e9a35..66c74aa4753e 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -77,7 +77,7 @@ SCHED_FEAT(WARN_DOUBLE_CLOCK, false)
SCHED_FEAT(RT_PUSH_IPI, true)
#endif
-SCHED_FEAT(RT_RUNTIME_SHARE, true)
+SCHED_FEAT(RT_RUNTIME_SHARE, false)
SCHED_FEAT(LB_MIN, false)
SCHED_FEAT(ATTACH_AGE_LOAD, true)
--
2.18.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5750676b64a561f7ec920d7c6ba130fc9c7378f3 Mon Sep 17 00:00:00 2001
From: Dave Chinner <dchinner(a)redhat.com>
Date: Wed, 13 Jul 2022 17:49:15 +1000
Subject: [PATCH] fs/remap: constrain dedupe of EOF blocks
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
If dedupe of an EOF block is not constrainted to match against only
other EOF blocks with the same EOF offset into the block, it can
match against any other block that has the same matching initial
bytes in it, even if the bytes beyond EOF in the source file do
not match.
Fix this by constraining the EOF block matching to only match
against other EOF blocks that have identical EOF offsets and data.
This allows "whole file dedupe" to continue to work without allowing
eof blocks to randomly match against partial full blocks with the
same data.
Reported-by: Ansgar Lößer <ansgar.loesser(a)tu-darmstadt.de>
Fixes: 1383a7ed6749 ("vfs: check file ranges before cloning files")
Link: https://lore.kernel.org/linux-fsdevel/a7c93559-4ba1-df2f-7a85-55a143696405@…
Signed-off-by: Dave Chinner <dchinner(a)redhat.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/fs/remap_range.c b/fs/remap_range.c
index e112b5424cdb..881a306ee247 100644
--- a/fs/remap_range.c
+++ b/fs/remap_range.c
@@ -71,7 +71,8 @@ static int generic_remap_checks(struct file *file_in, loff_t pos_in,
* Otherwise, make sure the count is also block-aligned, having
* already confirmed the starting offsets' block alignment.
*/
- if (pos_in + count == size_in) {
+ if (pos_in + count == size_in &&
+ (!(remap_flags & REMAP_FILE_DEDUP) || pos_out + count == size_out)) {
bcount = ALIGN(size_in, bs) - pos_in;
} else {
if (!IS_ALIGNED(count, bs))
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fb6e0637ab7ebd8e61fe24f4d663c4bae99cfa62 Mon Sep 17 00:00:00 2001
From: Dmitry Osipenko <dmitry.osipenko(a)collabora.com>
Date: Thu, 30 Jun 2022 23:06:00 +0300
Subject: [PATCH] drm/panfrost: Put mapping instead of shmem obj on
panfrost_mmu_map_fault_addr() error
When panfrost_mmu_map_fault_addr() fails, the BO's mapping should be
unreferenced and not the shmem object which backs the mapping.
Cc: stable(a)vger.kernel.org
Fixes: bdefca2d8dc0 ("drm/panfrost: Add the panfrost_gem_mapping concept")
Reviewed-by: Steven Price <steven.price(a)arm.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko(a)collabora.com>
Signed-off-by: Steven Price <steven.price(a)arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220630200601.1884120-2-dmit…
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index d3f82b26a631..b285a8001b1d 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -518,7 +518,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
err_pages:
drm_gem_shmem_put_pages(&bo->base);
err_bo:
- drm_gem_object_put(&bo->base.base);
+ panfrost_gem_mapping_put(bomapping);
return ret;
}
The patch below does not apply to the 5.18-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b3a3b0255797e1d395253366ba24a4cc6c8bdf9c Mon Sep 17 00:00:00 2001
From: Naohiro Aota <naohiro.aota(a)wdc.com>
Date: Wed, 29 Jun 2022 11:00:38 +0900
Subject: [PATCH] btrfs: zoned: drop optimization of zone finish
We have an optimization in do_zone_finish() to send REQ_OP_ZONE_FINISH only
when necessary, i.e. we don't send REQ_OP_ZONE_FINISH when we assume we
wrote fully into the zone.
The assumption is determined by "alloc_offset == capacity". This condition
won't work if the last ordered extent is canceled due to some errors. In
that case, we consider the zone is deactivated without sending the finish
command while it's still active.
This inconstancy results in activating another block group while we cannot
really activate the underlying zone, which causes the active zone exceeds
errors like below.
BTRFS error (device nvme3n2): allocation failed flags 1, wanted 520192 tree-log 0, relocation: 0
nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0
nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0
Fix the issue by removing the optimization for now.
Fixes: 8376d9e1ed8f ("btrfs: zoned: finish superblock zone once no space left for new SB")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota(a)wdc.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 9892f7413705..e33987cefbd4 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1889,7 +1889,6 @@ static int do_zone_finish(struct btrfs_block_group *block_group, bool fully_writ
{
struct btrfs_fs_info *fs_info = block_group->fs_info;
struct map_lookup *map;
- bool need_zone_finish;
int ret = 0;
int i;
@@ -1946,12 +1945,6 @@ static int do_zone_finish(struct btrfs_block_group *block_group, bool fully_writ
}
}
- /*
- * The block group is not fully allocated, so not fully written yet. We
- * need to send ZONE_FINISH command to free up an active zone.
- */
- need_zone_finish = !btrfs_zoned_bg_is_full(block_group);
-
block_group->zone_is_active = 0;
block_group->alloc_offset = block_group->zone_capacity;
block_group->free_space_ctl->free_space = 0;
@@ -1967,15 +1960,13 @@ static int do_zone_finish(struct btrfs_block_group *block_group, bool fully_writ
if (device->zone_info->max_active_zones == 0)
continue;
- if (need_zone_finish) {
- ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH,
- physical >> SECTOR_SHIFT,
- device->zone_info->zone_size >> SECTOR_SHIFT,
- GFP_NOFS);
+ ret = blkdev_zone_mgmt(device->bdev, REQ_OP_ZONE_FINISH,
+ physical >> SECTOR_SHIFT,
+ device->zone_info->zone_size >> SECTOR_SHIFT,
+ GFP_NOFS);
- if (ret)
- return ret;
- }
+ if (ret)
+ return ret;
btrfs_dev_clear_active_zone(device, physical);
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 07fd5b6cdf3cc30bfde8fe0f644771688be04447 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj(a)kernel.org>
Date: Mon, 13 Jun 2022 12:19:50 -1000
Subject: [PATCH] cgroup: Use separate src/dst nodes when preloading css_sets
for migration
Each cset (css_set) is pinned by its tasks. When we're moving tasks around
across csets for a migration, we need to hold the source and destination
csets to ensure that they don't go away while we're moving tasks about. This
is done by linking cset->mg_preload_node on either the
mgctx->preloaded_src_csets or mgctx->preloaded_dst_csets list. Using the
same cset->mg_preload_node for both the src and dst lists was deemed okay as
a cset can't be both the source and destination at the same time.
Unfortunately, this overloading becomes problematic when multiple tasks are
involved in a migration and some of them are identity noop migrations while
others are actually moving across cgroups. For example, this can happen with
the following sequence on cgroup1:
#1> mkdir -p /sys/fs/cgroup/misc/a/b
#2> echo $$ > /sys/fs/cgroup/misc/a/cgroup.procs
#3> RUN_A_COMMAND_WHICH_CREATES_MULTIPLE_THREADS &
#4> PID=$!
#5> echo $PID > /sys/fs/cgroup/misc/a/b/tasks
#6> echo $PID > /sys/fs/cgroup/misc/a/cgroup.procs
the process including the group leader back into a. In this final migration,
non-leader threads would be doing identity migration while the group leader
is doing an actual one.
After #3, let's say the whole process was in cset A, and that after #4, the
leader moves to cset B. Then, during #6, the following happens:
1. cgroup_migrate_add_src() is called on B for the leader.
2. cgroup_migrate_add_src() is called on A for the other threads.
3. cgroup_migrate_prepare_dst() is called. It scans the src list.
4. It notices that B wants to migrate to A, so it tries to A to the dst
list but realizes that its ->mg_preload_node is already busy.
5. and then it notices A wants to migrate to A as it's an identity
migration, it culls it by list_del_init()'ing its ->mg_preload_node and
putting references accordingly.
6. The rest of migration takes place with B on the src list but nothing on
the dst list.
This means that A isn't held while migration is in progress. If all tasks
leave A before the migration finishes and the incoming task pins it, the
cset will be destroyed leading to use-after-free.
This is caused by overloading cset->mg_preload_node for both src and dst
preload lists. We wanted to exclude the cset from the src list but ended up
inadvertently excluding it from the dst list too.
This patch fixes the issue by separating out cset->mg_preload_node into
->mg_src_preload_node and ->mg_dst_preload_node, so that the src and dst
preloadings don't interfere with each other.
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Reported-by: Mukesh Ojha <quic_mojha(a)quicinc.com>
Reported-by: shisiyuan <shisiyuan19870131(a)gmail.com>
Link: http://lkml.kernel.org/r/1654187688-27411-1-git-send-email-shisiyuan@xiaomi…
Link: https://www.spinics.net/lists/cgroups/msg33313.html
Fixes: f817de98513d ("cgroup: prepare migration path for unified hierarchy")
Cc: stable(a)vger.kernel.org # v3.16+
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 1bfcfb1af352..d4427d0a0e18 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -264,7 +264,8 @@ struct css_set {
* List of csets participating in the on-going migration either as
* source or destination. Protected by cgroup_mutex.
*/
- struct list_head mg_preload_node;
+ struct list_head mg_src_preload_node;
+ struct list_head mg_dst_preload_node;
struct list_head mg_node;
/*
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 1779ccddb734..13c8e91d7862 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -765,7 +765,8 @@ struct css_set init_css_set = {
.task_iters = LIST_HEAD_INIT(init_css_set.task_iters),
.threaded_csets = LIST_HEAD_INIT(init_css_set.threaded_csets),
.cgrp_links = LIST_HEAD_INIT(init_css_set.cgrp_links),
- .mg_preload_node = LIST_HEAD_INIT(init_css_set.mg_preload_node),
+ .mg_src_preload_node = LIST_HEAD_INIT(init_css_set.mg_src_preload_node),
+ .mg_dst_preload_node = LIST_HEAD_INIT(init_css_set.mg_dst_preload_node),
.mg_node = LIST_HEAD_INIT(init_css_set.mg_node),
/*
@@ -1240,7 +1241,8 @@ static struct css_set *find_css_set(struct css_set *old_cset,
INIT_LIST_HEAD(&cset->threaded_csets);
INIT_HLIST_NODE(&cset->hlist);
INIT_LIST_HEAD(&cset->cgrp_links);
- INIT_LIST_HEAD(&cset->mg_preload_node);
+ INIT_LIST_HEAD(&cset->mg_src_preload_node);
+ INIT_LIST_HEAD(&cset->mg_dst_preload_node);
INIT_LIST_HEAD(&cset->mg_node);
/* Copy the set of subsystem state objects generated in
@@ -2597,21 +2599,27 @@ int cgroup_migrate_vet_dst(struct cgroup *dst_cgrp)
*/
void cgroup_migrate_finish(struct cgroup_mgctx *mgctx)
{
- LIST_HEAD(preloaded);
struct css_set *cset, *tmp_cset;
lockdep_assert_held(&cgroup_mutex);
spin_lock_irq(&css_set_lock);
- list_splice_tail_init(&mgctx->preloaded_src_csets, &preloaded);
- list_splice_tail_init(&mgctx->preloaded_dst_csets, &preloaded);
+ list_for_each_entry_safe(cset, tmp_cset, &mgctx->preloaded_src_csets,
+ mg_src_preload_node) {
+ cset->mg_src_cgrp = NULL;
+ cset->mg_dst_cgrp = NULL;
+ cset->mg_dst_cset = NULL;
+ list_del_init(&cset->mg_src_preload_node);
+ put_css_set_locked(cset);
+ }
- list_for_each_entry_safe(cset, tmp_cset, &preloaded, mg_preload_node) {
+ list_for_each_entry_safe(cset, tmp_cset, &mgctx->preloaded_dst_csets,
+ mg_dst_preload_node) {
cset->mg_src_cgrp = NULL;
cset->mg_dst_cgrp = NULL;
cset->mg_dst_cset = NULL;
- list_del_init(&cset->mg_preload_node);
+ list_del_init(&cset->mg_dst_preload_node);
put_css_set_locked(cset);
}
@@ -2651,7 +2659,7 @@ void cgroup_migrate_add_src(struct css_set *src_cset,
if (src_cset->dead)
return;
- if (!list_empty(&src_cset->mg_preload_node))
+ if (!list_empty(&src_cset->mg_src_preload_node))
return;
src_cgrp = cset_cgroup_from_root(src_cset, dst_cgrp->root);
@@ -2664,7 +2672,7 @@ void cgroup_migrate_add_src(struct css_set *src_cset,
src_cset->mg_src_cgrp = src_cgrp;
src_cset->mg_dst_cgrp = dst_cgrp;
get_css_set(src_cset);
- list_add_tail(&src_cset->mg_preload_node, &mgctx->preloaded_src_csets);
+ list_add_tail(&src_cset->mg_src_preload_node, &mgctx->preloaded_src_csets);
}
/**
@@ -2689,7 +2697,7 @@ int cgroup_migrate_prepare_dst(struct cgroup_mgctx *mgctx)
/* look up the dst cset for each src cset and link it to src */
list_for_each_entry_safe(src_cset, tmp_cset, &mgctx->preloaded_src_csets,
- mg_preload_node) {
+ mg_src_preload_node) {
struct css_set *dst_cset;
struct cgroup_subsys *ss;
int ssid;
@@ -2708,7 +2716,7 @@ int cgroup_migrate_prepare_dst(struct cgroup_mgctx *mgctx)
if (src_cset == dst_cset) {
src_cset->mg_src_cgrp = NULL;
src_cset->mg_dst_cgrp = NULL;
- list_del_init(&src_cset->mg_preload_node);
+ list_del_init(&src_cset->mg_src_preload_node);
put_css_set(src_cset);
put_css_set(dst_cset);
continue;
@@ -2716,8 +2724,8 @@ int cgroup_migrate_prepare_dst(struct cgroup_mgctx *mgctx)
src_cset->mg_dst_cset = dst_cset;
- if (list_empty(&dst_cset->mg_preload_node))
- list_add_tail(&dst_cset->mg_preload_node,
+ if (list_empty(&dst_cset->mg_dst_preload_node))
+ list_add_tail(&dst_cset->mg_dst_preload_node,
&mgctx->preloaded_dst_csets);
else
put_css_set(dst_cset);
@@ -2963,7 +2971,8 @@ static int cgroup_update_dfl_csses(struct cgroup *cgrp)
goto out_finish;
spin_lock_irq(&css_set_lock);
- list_for_each_entry(src_cset, &mgctx.preloaded_src_csets, mg_preload_node) {
+ list_for_each_entry(src_cset, &mgctx.preloaded_src_csets,
+ mg_src_preload_node) {
struct task_struct *task, *ntask;
/* all tasks in src_csets need to be migrated */
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e5c46fde75e43c15a29b40e5fc5641727f97ae47 Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ardb(a)kernel.org>
Date: Thu, 30 Jun 2022 16:46:54 +0100
Subject: [PATCH] ARM: 9214/1: alignment: advance IT state after emulating
Thumb instruction
After emulating a misaligned load or store issued in Thumb mode, we have
to advance the IT state by hand, or it will get out of sync with the
actual instruction stream, which means we'll end up applying the wrong
condition code to subsequent instructions. This might corrupt the
program state rather catastrophically.
So borrow the it_advance() helper from the probing code, and use it on
CPSR if the emulated instruction is Thumb.
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel(a)armlinux.org.uk>
diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
index 93051e2f402c..1408a6a15d0e 100644
--- a/arch/arm/include/asm/ptrace.h
+++ b/arch/arm/include/asm/ptrace.h
@@ -163,5 +163,31 @@ static inline unsigned long user_stack_pointer(struct pt_regs *regs)
((current_stack_pointer | (THREAD_SIZE - 1)) - 7) - 1; \
})
+
+/*
+ * Update ITSTATE after normal execution of an IT block instruction.
+ *
+ * The 8 IT state bits are split into two parts in CPSR:
+ * ITSTATE<1:0> are in CPSR<26:25>
+ * ITSTATE<7:2> are in CPSR<15:10>
+ */
+static inline unsigned long it_advance(unsigned long cpsr)
+{
+ if ((cpsr & 0x06000400) == 0) {
+ /* ITSTATE<2:0> == 0 means end of IT block, so clear IT state */
+ cpsr &= ~PSR_IT_MASK;
+ } else {
+ /* We need to shift left ITSTATE<4:0> */
+ const unsigned long mask = 0x06001c00; /* Mask ITSTATE<4:0> */
+ unsigned long it = cpsr & mask;
+ it <<= 1;
+ it |= it >> (27 - 10); /* Carry ITSTATE<2> to correct place */
+ it &= mask;
+ cpsr &= ~mask;
+ cpsr |= it;
+ }
+ return cpsr;
+}
+
#endif /* __ASSEMBLY__ */
#endif
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index 6f499559d193..f8dd0b3cc8e0 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -935,6 +935,9 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
if (type == TYPE_LDST)
do_alignment_finish_ldst(addr, instr, regs, offset);
+ if (thumb_mode(regs))
+ regs->ARM_cpsr = it_advance(regs->ARM_cpsr);
+
return 0;
bad_or_fault:
diff --git a/arch/arm/probes/decode.h b/arch/arm/probes/decode.h
index 973173598992..facc889d05ee 100644
--- a/arch/arm/probes/decode.h
+++ b/arch/arm/probes/decode.h
@@ -14,6 +14,7 @@
#include <linux/types.h>
#include <linux/stddef.h>
#include <asm/probes.h>
+#include <asm/ptrace.h>
#include <asm/kprobes.h>
void __init arm_probes_decode_init(void);
@@ -35,31 +36,6 @@ void __init find_str_pc_offset(void);
#endif
-/*
- * Update ITSTATE after normal execution of an IT block instruction.
- *
- * The 8 IT state bits are split into two parts in CPSR:
- * ITSTATE<1:0> are in CPSR<26:25>
- * ITSTATE<7:2> are in CPSR<15:10>
- */
-static inline unsigned long it_advance(unsigned long cpsr)
- {
- if ((cpsr & 0x06000400) == 0) {
- /* ITSTATE<2:0> == 0 means end of IT block, so clear IT state */
- cpsr &= ~PSR_IT_MASK;
- } else {
- /* We need to shift left ITSTATE<4:0> */
- const unsigned long mask = 0x06001c00; /* Mask ITSTATE<4:0> */
- unsigned long it = cpsr & mask;
- it <<= 1;
- it |= it >> (27 - 10); /* Carry ITSTATE<2> to correct place */
- it &= mask;
- cpsr &= ~mask;
- cpsr |= it;
- }
- return cpsr;
-}
-
static inline void __kprobes bx_write_pc(long pcv, struct pt_regs *regs)
{
long cpsr = regs->ARM_cpsr;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d5b36a4dbd06c5e8e36ca8ccc552f679069e2946 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg(a)redhat.com>
Date: Mon, 11 Jul 2022 18:16:25 +0200
Subject: [PATCH] fix race between exit_itimers() and /proc/pid/timers
As Chris explains, the comment above exit_itimers() is not correct,
we can race with proc_timers_seq_ops. Change exit_itimers() to clear
signal->posix_timers with ->siglock held.
Cc: <stable(a)vger.kernel.org>
Reported-by: chris(a)accessvector.net
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/fs/exec.c b/fs/exec.c
index 0989fb8472a1..778123259e42 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1301,7 +1301,7 @@ int begin_new_exec(struct linux_binprm * bprm)
bprm->mm = NULL;
#ifdef CONFIG_POSIX_TIMERS
- exit_itimers(me->signal);
+ exit_itimers(me);
flush_itimer_signals();
#endif
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 505aaf9fe477..81cab4b01edc 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -85,7 +85,7 @@ static inline void exit_thread(struct task_struct *tsk)
extern __noreturn void do_group_exit(int);
extern void exit_files(struct task_struct *);
-extern void exit_itimers(struct signal_struct *);
+extern void exit_itimers(struct task_struct *);
extern pid_t kernel_clone(struct kernel_clone_args *kargs);
struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node);
diff --git a/kernel/exit.c b/kernel/exit.c
index f072959fcab7..64c938ce36fe 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -766,7 +766,7 @@ void __noreturn do_exit(long code)
#ifdef CONFIG_POSIX_TIMERS
hrtimer_cancel(&tsk->signal->real_timer);
- exit_itimers(tsk->signal);
+ exit_itimers(tsk);
#endif
if (tsk->mm)
setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1cd10b102c51..5dead89308b7 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1051,15 +1051,24 @@ static void itimer_delete(struct k_itimer *timer)
}
/*
- * This is called by do_exit or de_thread, only when there are no more
- * references to the shared signal_struct.
+ * This is called by do_exit or de_thread, only when nobody else can
+ * modify the signal->posix_timers list. Yet we need sighand->siglock
+ * to prevent the race with /proc/pid/timers.
*/
-void exit_itimers(struct signal_struct *sig)
+void exit_itimers(struct task_struct *tsk)
{
+ struct list_head timers;
struct k_itimer *tmr;
- while (!list_empty(&sig->posix_timers)) {
- tmr = list_entry(sig->posix_timers.next, struct k_itimer, list);
+ if (list_empty(&tsk->signal->posix_timers))
+ return;
+
+ spin_lock_irq(&tsk->sighand->siglock);
+ list_replace_init(&tsk->signal->posix_timers, &timers);
+ spin_unlock_irq(&tsk->sighand->siglock);
+
+ while (!list_empty(&timers)) {
+ tmr = list_first_entry(&timers, struct k_itimer, list);
itimer_delete(tmr);
}
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d5b36a4dbd06c5e8e36ca8ccc552f679069e2946 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg(a)redhat.com>
Date: Mon, 11 Jul 2022 18:16:25 +0200
Subject: [PATCH] fix race between exit_itimers() and /proc/pid/timers
As Chris explains, the comment above exit_itimers() is not correct,
we can race with proc_timers_seq_ops. Change exit_itimers() to clear
signal->posix_timers with ->siglock held.
Cc: <stable(a)vger.kernel.org>
Reported-by: chris(a)accessvector.net
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/fs/exec.c b/fs/exec.c
index 0989fb8472a1..778123259e42 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1301,7 +1301,7 @@ int begin_new_exec(struct linux_binprm * bprm)
bprm->mm = NULL;
#ifdef CONFIG_POSIX_TIMERS
- exit_itimers(me->signal);
+ exit_itimers(me);
flush_itimer_signals();
#endif
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 505aaf9fe477..81cab4b01edc 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -85,7 +85,7 @@ static inline void exit_thread(struct task_struct *tsk)
extern __noreturn void do_group_exit(int);
extern void exit_files(struct task_struct *);
-extern void exit_itimers(struct signal_struct *);
+extern void exit_itimers(struct task_struct *);
extern pid_t kernel_clone(struct kernel_clone_args *kargs);
struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node);
diff --git a/kernel/exit.c b/kernel/exit.c
index f072959fcab7..64c938ce36fe 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -766,7 +766,7 @@ void __noreturn do_exit(long code)
#ifdef CONFIG_POSIX_TIMERS
hrtimer_cancel(&tsk->signal->real_timer);
- exit_itimers(tsk->signal);
+ exit_itimers(tsk);
#endif
if (tsk->mm)
setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1cd10b102c51..5dead89308b7 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1051,15 +1051,24 @@ static void itimer_delete(struct k_itimer *timer)
}
/*
- * This is called by do_exit or de_thread, only when there are no more
- * references to the shared signal_struct.
+ * This is called by do_exit or de_thread, only when nobody else can
+ * modify the signal->posix_timers list. Yet we need sighand->siglock
+ * to prevent the race with /proc/pid/timers.
*/
-void exit_itimers(struct signal_struct *sig)
+void exit_itimers(struct task_struct *tsk)
{
+ struct list_head timers;
struct k_itimer *tmr;
- while (!list_empty(&sig->posix_timers)) {
- tmr = list_entry(sig->posix_timers.next, struct k_itimer, list);
+ if (list_empty(&tsk->signal->posix_timers))
+ return;
+
+ spin_lock_irq(&tsk->sighand->siglock);
+ list_replace_init(&tsk->signal->posix_timers, &timers);
+ spin_unlock_irq(&tsk->sighand->siglock);
+
+ while (!list_empty(&timers)) {
+ tmr = list_first_entry(&timers, struct k_itimer, list);
itimer_delete(tmr);
}
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d5b36a4dbd06c5e8e36ca8ccc552f679069e2946 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg(a)redhat.com>
Date: Mon, 11 Jul 2022 18:16:25 +0200
Subject: [PATCH] fix race between exit_itimers() and /proc/pid/timers
As Chris explains, the comment above exit_itimers() is not correct,
we can race with proc_timers_seq_ops. Change exit_itimers() to clear
signal->posix_timers with ->siglock held.
Cc: <stable(a)vger.kernel.org>
Reported-by: chris(a)accessvector.net
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/fs/exec.c b/fs/exec.c
index 0989fb8472a1..778123259e42 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1301,7 +1301,7 @@ int begin_new_exec(struct linux_binprm * bprm)
bprm->mm = NULL;
#ifdef CONFIG_POSIX_TIMERS
- exit_itimers(me->signal);
+ exit_itimers(me);
flush_itimer_signals();
#endif
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 505aaf9fe477..81cab4b01edc 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -85,7 +85,7 @@ static inline void exit_thread(struct task_struct *tsk)
extern __noreturn void do_group_exit(int);
extern void exit_files(struct task_struct *);
-extern void exit_itimers(struct signal_struct *);
+extern void exit_itimers(struct task_struct *);
extern pid_t kernel_clone(struct kernel_clone_args *kargs);
struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node);
diff --git a/kernel/exit.c b/kernel/exit.c
index f072959fcab7..64c938ce36fe 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -766,7 +766,7 @@ void __noreturn do_exit(long code)
#ifdef CONFIG_POSIX_TIMERS
hrtimer_cancel(&tsk->signal->real_timer);
- exit_itimers(tsk->signal);
+ exit_itimers(tsk);
#endif
if (tsk->mm)
setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1cd10b102c51..5dead89308b7 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1051,15 +1051,24 @@ static void itimer_delete(struct k_itimer *timer)
}
/*
- * This is called by do_exit or de_thread, only when there are no more
- * references to the shared signal_struct.
+ * This is called by do_exit or de_thread, only when nobody else can
+ * modify the signal->posix_timers list. Yet we need sighand->siglock
+ * to prevent the race with /proc/pid/timers.
*/
-void exit_itimers(struct signal_struct *sig)
+void exit_itimers(struct task_struct *tsk)
{
+ struct list_head timers;
struct k_itimer *tmr;
- while (!list_empty(&sig->posix_timers)) {
- tmr = list_entry(sig->posix_timers.next, struct k_itimer, list);
+ if (list_empty(&tsk->signal->posix_timers))
+ return;
+
+ spin_lock_irq(&tsk->sighand->siglock);
+ list_replace_init(&tsk->signal->posix_timers, &timers);
+ spin_unlock_irq(&tsk->sighand->siglock);
+
+ while (!list_empty(&timers)) {
+ tmr = list_first_entry(&timers, struct k_itimer, list);
itimer_delete(tmr);
}
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d5b36a4dbd06c5e8e36ca8ccc552f679069e2946 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg(a)redhat.com>
Date: Mon, 11 Jul 2022 18:16:25 +0200
Subject: [PATCH] fix race between exit_itimers() and /proc/pid/timers
As Chris explains, the comment above exit_itimers() is not correct,
we can race with proc_timers_seq_ops. Change exit_itimers() to clear
signal->posix_timers with ->siglock held.
Cc: <stable(a)vger.kernel.org>
Reported-by: chris(a)accessvector.net
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/fs/exec.c b/fs/exec.c
index 0989fb8472a1..778123259e42 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1301,7 +1301,7 @@ int begin_new_exec(struct linux_binprm * bprm)
bprm->mm = NULL;
#ifdef CONFIG_POSIX_TIMERS
- exit_itimers(me->signal);
+ exit_itimers(me);
flush_itimer_signals();
#endif
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 505aaf9fe477..81cab4b01edc 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -85,7 +85,7 @@ static inline void exit_thread(struct task_struct *tsk)
extern __noreturn void do_group_exit(int);
extern void exit_files(struct task_struct *);
-extern void exit_itimers(struct signal_struct *);
+extern void exit_itimers(struct task_struct *);
extern pid_t kernel_clone(struct kernel_clone_args *kargs);
struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node);
diff --git a/kernel/exit.c b/kernel/exit.c
index f072959fcab7..64c938ce36fe 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -766,7 +766,7 @@ void __noreturn do_exit(long code)
#ifdef CONFIG_POSIX_TIMERS
hrtimer_cancel(&tsk->signal->real_timer);
- exit_itimers(tsk->signal);
+ exit_itimers(tsk);
#endif
if (tsk->mm)
setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1cd10b102c51..5dead89308b7 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1051,15 +1051,24 @@ static void itimer_delete(struct k_itimer *timer)
}
/*
- * This is called by do_exit or de_thread, only when there are no more
- * references to the shared signal_struct.
+ * This is called by do_exit or de_thread, only when nobody else can
+ * modify the signal->posix_timers list. Yet we need sighand->siglock
+ * to prevent the race with /proc/pid/timers.
*/
-void exit_itimers(struct signal_struct *sig)
+void exit_itimers(struct task_struct *tsk)
{
+ struct list_head timers;
struct k_itimer *tmr;
- while (!list_empty(&sig->posix_timers)) {
- tmr = list_entry(sig->posix_timers.next, struct k_itimer, list);
+ if (list_empty(&tsk->signal->posix_timers))
+ return;
+
+ spin_lock_irq(&tsk->sighand->siglock);
+ list_replace_init(&tsk->signal->posix_timers, &timers);
+ spin_unlock_irq(&tsk->sighand->siglock);
+
+ while (!list_empty(&timers)) {
+ tmr = list_first_entry(&timers, struct k_itimer, list);
itimer_delete(tmr);
}
}
I'm announcing the release of the 5.18.12 kernel.
This, and the 5.15.55, 5.10.131, and 5.4.206 releases are only for those users
of the drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c driver, to prevent known data
loss issues with that codebase. If you don't use this driver, no need to
upgrade.
The updated 5.18.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.18.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
Greg Kroah-Hartman (2):
Revert "mtd: rawnand: gpmi: Fix setting busy timeout setting"
Linux 5.18.12
Persistent grants feature can be used only when both backend and the
frontend supports the feature. The feature was always supported by
'blkback', but commit aac8a70db24b ("xen-blkback: add a parameter for
disabling of persistent grants") has introduced a parameter for
disabling it runtime.
To avoid the parameter be updated while being used by 'blkback', the
commit caches the parameter into 'vbd->feature_gnt_persistent' in
'xen_vbd_create()', and then check if the guest also supports the
feature and finally updates the field in 'connect_ring()'.
However, 'connect_ring()' could be called before 'xen_vbd_create()', so
later execution of 'xen_vbd_create()' can wrongly overwrite 'true' to
'vbd->feature_gnt_persistent'. As a result, 'blkback' could try to use
'persistent grants' feature even if the guest doesn't support the
feature.
This commit fixes the issue by moving the parameter value caching to
'xen_blkif_alloc()', which allocates the 'blkif'. Because the struct
embeds 'vbd' object, which will be used by 'connect_ring()' later, this
should be called before 'connect_ring()' and therefore this should be
the right and safe place to do the caching.
Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants")
Cc: <stable(a)vger.kernel.org> # 5.10.x
Signed-off-by: Maximilian Heyne <mheyne(a)amazon.de>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
Changes from v1[1]
- Avoid the behavioral change[2]
- Rebase on latest xen/tip/linux-next
- Re-work by SeongJae Park <sj(a)kernel.org>
- Cc stable@
[1] https://lore.kernel.org/xen-devel/20220106091013.126076-1-mheyne@amazon.de/
[2] https://lore.kernel.org/xen-devel/20220121102309.27802-1-sj@kernel.org/
drivers/block/xen-blkback/xenbus.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 97de13b14175..16c6785d260c 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -157,6 +157,11 @@ static int xen_blkif_alloc_rings(struct xen_blkif *blkif)
return 0;
}
+/* Enable the persistent grants feature. */
+static bool feature_persistent = true;
+module_param(feature_persistent, bool, 0644);
+MODULE_PARM_DESC(feature_persistent, "Enables the persistent grants feature");
+
static struct xen_blkif *xen_blkif_alloc(domid_t domid)
{
struct xen_blkif *blkif;
@@ -181,6 +186,8 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid)
__module_get(THIS_MODULE);
INIT_WORK(&blkif->free_work, xen_blkif_deferred_free);
+ blkif->vbd.feature_gnt_persistent = feature_persistent;
+
return blkif;
}
@@ -472,12 +479,6 @@ static void xen_vbd_free(struct xen_vbd *vbd)
vbd->bdev = NULL;
}
-/* Enable the persistent grants feature. */
-static bool feature_persistent = true;
-module_param(feature_persistent, bool, 0644);
-MODULE_PARM_DESC(feature_persistent,
- "Enables the persistent grants feature");
-
static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
unsigned major, unsigned minor, int readonly,
int cdrom)
@@ -520,8 +521,6 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
if (bdev_max_secure_erase_sectors(bdev))
vbd->discard_secure = true;
- vbd->feature_gnt_persistent = feature_persistent;
-
pr_debug("Successful creation of handle=%04x (dom=%u)\n",
handle, blkif->domid);
return 0;
--
2.25.1
Hi,
I've heard from many other e-commerce companies that they have difficulty determining if they are getting the most out of the traffic they generate for their site.
Recently a new customer, after our analysis of their website, used our proprietary online business visibility strategy that combines new SEO tools and Google ads into one value chain.
This makes it easier for potential customers to find their way to the site, and the company gets new orders.
Would you like to see how such a solution can affect your sales activities?
Yours sincerely,
Walker Cooney
This is the start of the stable review cycle for the 5.15.39 release.
There are 135 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 12 May 2022 13:07:16 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.39-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.39-rc1
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Update comment about link going down after link-up
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Drop __maybe_unused from advk_pcie_disable_phy()
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Don't mask irq when mapping
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Remove irq_mask_ack() callback for INTx interrupts
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Use separate INTA interrupt for emulated root bridge
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Fix support for PME requester on emulated bridge
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Add support for PME interrupts
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Optimize writing PCI_EXP_RTCTL_PMEIE and PCI_EXP_RTSTA_PME on emulated bridge
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Add support for ERR interrupt on emulated bridge
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Enable MSI-X support
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Fix setting MSI address
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Add support for masking MSI interrupts
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Refactor unmasking summary MSI interrupt
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Use dev_fwnode() instead of of_node_to_fwnode(dev->of_node)
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Make msi_domain_info structure a static driver structure
Marek Behún <kabel(a)kernel.org>
PCI: aardvark: Make MSI irq_chip structures static driver structures
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Check return value of generic_handle_domain_irq() when processing INTx IRQ
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Rewrite IRQ code to chained IRQ handler
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Replace custom PCIE_CORE_INT_* macros with PCI_INTERRUPT_*
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Disable common PHY when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Disable link training when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Assert PERST# when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Fix memory leak in driver unbind
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Mask all interrupts when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Disable bus mastering when unbinding driver
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Comment actions in driver remove method
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Clear all MSIs at setup
Pali Rohár <pali(a)kernel.org>
PCI: aardvark: Add support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2 registers on emulated bridge
Pali Rohár <pali(a)kernel.org>
PCI: pci-bridge-emul: Add definitions for missing capabilities registers
Pali Rohár <pali(a)kernel.org>
PCI: pci-bridge-emul: Add description for class_revision field
Frederic Weisbecker <frederic(a)kernel.org>
rcu: Apply callbacks processing time limit only on softirq
Frederic Weisbecker <frederic(a)kernel.org>
rcu: Fix callbacks processing time limit retaining cond_resched()
Helge Deller <deller(a)gmx.de>
Revert "parisc: Mark sched_clock unstable only if clocks are not syncronized"
Ricky WU <ricky_wu(a)realtek.com>
mmc: rtsx: add 74 Clocks in power on flow
Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
selftest/vm: verify remap destination address in mremap_test
Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
selftest/vm: verify mmap addr in mremap_test
Wanpeng Li <wanpengli(a)tencent.com>
KVM: LAPIC: Enable timer posted-interrupt only when mwait/hlt is advertised
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86/mmu: avoid NULL-pointer dereference on page freeing bugs
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: Do not change ICR on write to APIC_SELF_IPI
Wanpeng Li <wanpengli(a)tencent.com>
x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
Thomas Huth <thuth(a)redhat.com>
KVM: selftests: Silence compiler warning in the kvm_page_table_test
Paolo Bonzini <pbonzini(a)redhat.com>
kvm: selftests: do not use bitfields larger than 32-bits for PTEs
Hector Martin <marcan(a)marcan.st>
iommu/dart: Add missing module owner to ops structure
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5e: Lag, Don't skip fib events on current dst
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5e: Lag, Fix fib_info pointer assignment
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5e: Lag, Fix use-after-free in fib event handler
Aya Levin <ayal(a)nvidia.com>
net/mlx5: Fix slab-out-of-bounds while reading resource dump menu
Javier Martinez Canillas <javierm(a)redhat.com>
fbdev: Make fb_release() return -ENODEV if fbdev was unregistered
Sandipan Das <sandipan.das(a)amd.com>
kvm: x86/cpuid: Only provide CPUID leaf 0xA if host has architectural PMU
Baruch Siach <baruch(a)tkos.co.il>
gpio: mvebu: drop pwm base assignment
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
drm/amdgpu: Ensure HDA function is suspended before ASIC reset
Mario Limonciello <mario.limonciello(a)amd.com>
drm/amdgpu: don't set s3 and s0ix at the same time
Mario Limonciello <mario.limonciello(a)amd.com>
drm/amdgpu: explicitly check for s0ix when evicting resources
Nirmoy Das <nirmoy.das(a)amd.com>
drm/amdgpu: unify BO evicting method in amdgpu_ttm
Filipe Manana <fdmanana(a)suse.com>
btrfs: always log symlinks in full mode
Qu Wenruo <wqu(a)suse.com>
btrfs: force v2 space cache usage for subpage mount
Sergey Shtylyov <s.shtylyov(a)omp.ru>
smsc911x: allow using IRQ0
Vladimir Oltean <vladimir.oltean(a)nxp.com>
selftests: ocelot: tc_flower_chains: specify conform-exceed action for policer
Michael Chan <michael.chan(a)broadcom.com>
bnxt_en: Fix unnecessary dropping of RX packets
Somnath Kotur <somnath.kotur(a)broadcom.com>
bnxt_en: Fix possible bnxt_open() failure caused by wrong RFS flag
Ido Schimmel <idosch(a)nvidia.com>
selftests: mirror_gre_bridge_1q: Avoid changing PVID while interface is operational
David Howells <dhowells(a)redhat.com>
rxrpc: Enable IPv6 checksums on transport socket
Eric Dumazet <edumazet(a)google.com>
mld: respect RCU rules in ip6_mc_source() and ip6_mc_msfilter()
Qiao Ma <mqaio(a)linux.alibaba.com>
hinic: fix bug of wq out of bound access
Filipe Manana <fdmanana(a)suse.com>
btrfs: do not BUG_ON() on failure to update inode when setting xattr
Kuogee Hsieh <quic_khsieh(a)quicinc.com>
drm/msm/dp: remove fail safe mode related code
Marc Kleine-Budde <mkl(a)pengutronix.de>
selftests/net: so_txtime: usage(): fix documentation of default clock
Marc Kleine-Budde <mkl(a)pengutronix.de>
selftests/net: so_txtime: fix parsing of start time stamp on 32 bit systems
Shravya Kumbham <shravya.kumbham(a)xilinx.com>
net: emaclite: Add error handling for of_address_to_resource()
Eric Dumazet <edumazet(a)google.com>
net: igmp: respect RCU rules in ip_mc_source() and ip_mc_msfilter()
Yang Yingliang <yangyingliang(a)huawei.com>
net: cpsw: add missing of_node_put() in cpsw_probe_dt()
Niels Dossche <dossche.niels(a)gmail.com>
net: mdio: Fix ENOMEM return value in BCM6368 mux bus controller
Yang Yingliang <yangyingliang(a)huawei.com>
net: stmmac: dwmac-sun8i: add missing of_node_put() in sun8i_dwmac_register_mdio_mux()
Yang Yingliang <yangyingliang(a)huawei.com>
net: dsa: mt7530: add missing of_node_put() in mt7530_setup()
Yang Yingliang <yangyingliang(a)huawei.com>
net: ethernet: mediatek: add missing of_node_put() in mtk_sgmii_init()
Trond Myklebust <trond.myklebust(a)hammerspace.com>
NFSv4: Don't invalidate inode attributes on delegation return
Mustafa Ismail <mustafa.ismail(a)intel.com>
RDMA/irdma: Fix possible crash due to NULL netdev in notifier
Shiraz Saleem <shiraz.saleem(a)intel.com>
RDMA/irdma: Reduce iWARP QP destroy time
Tatyana Nikolova <tatyana.e.nikolova(a)intel.com>
RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state
Cheng Xu <chengyou(a)linux.alibaba.com>
RDMA/siw: Fix a condition race issue in MPA request processing
Olga Kornievskaia <kolga(a)netapp.com>
SUNRPC release the transport of a relocated task with an assigned transport
Jann Horn <jannh(a)google.com>
selftests/seccomp: Don't call read() on TTY from background pgrp
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Fix deadlock in sync reset flow
Moshe Shemesh <moshe(a)nvidia.com>
net/mlx5: Avoid double clear or set of sync reset requested
Mark Zhang <markzhang(a)nvidia.com>
net/mlx5e: Fix the calling of update_buffer_lossy() API
Paul Blakey <paulb(a)nvidia.com>
net/mlx5e: CT: Fix queued up restore put() executing after relevant ft release
Vlad Buslov <vladbu(a)nvidia.com>
net/mlx5e: Don't match double-vlan packets if cvlan is not set
Moshe Tal <moshet(a)nvidia.com>
net/mlx5e: Fix trust state reset in reload
Yang Yingliang <yangyingliang(a)huawei.com>
iommu/dart: check return value after calling platform_get_resource()
Lu Baolu <baolu.lu(a)linux.intel.com>
iommu/vt-d: Drop stop marker messages
Pierre-Louis Bossart <pierre-louis.bossart(a)linux.intel.com>
ASoC: soc-ops: fix error handling
Codrin Ciubotariu <codrin.ciubotariu(a)microchip.com>
ASoC: dmaengine: Restore NULL prepare_slave_config() callback
Adam Wujek <dev_public(a)wujek.eu>
hwmon: (pmbus) disable PEC if not enabled
Armin Wolf <W_Armin(a)gmx.de>
hwmon: (adt7470) Fix warning on module removal
Puyou Lu <puyou.lu(a)gmail.com>
gpio: pca953x: fix irq_stat not updated when irq is disabled (irq_mask not set)
Nobuhiro Iwamatsu <nobuhiro1.iwamatsu(a)toshiba.co.jp>
gpio: visconti: Fix fwnode of GPIO IRQ
Duoming Zhou <duoming(a)zju.edu.cn>
NFC: netlink: fix sleep in atomic bug when firmware download timeout
Duoming Zhou <duoming(a)zju.edu.cn>
nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs
Duoming Zhou <duoming(a)zju.edu.cn>
nfc: replace improper check device_is_registered() in netlink related functions
Andreas Larsson <andreas(a)gaisler.com>
can: grcan: only use the NAPI poll budget for RX
Andreas Larsson <andreas(a)gaisler.com>
can: grcan: grcan_probe(): fix broken system id check for errata workaround needs
Daniel Hellstrom <daniel(a)gaisler.com>
can: grcan: use ofdev->dev when allocating DMA memory
Oliver Hartkopp <socketcan(a)hartkopp.net>
can: isotp: remove re-binding of bound socket
Duoming Zhou <duoming(a)zju.edu.cn>
can: grcan: grcan_close(): fix deadlock
Jan Höppner <hoeppner(a)linux.ibm.com>
s390/dasd: Fix read inconsistency for ESE DASD devices
Jan Höppner <hoeppner(a)linux.ibm.com>
s390/dasd: Fix read for ESE with blksize < 4k
Stefan Haberland <sth(a)linux.ibm.com>
s390/dasd: prevent double format of tracks for ESE devices
Stefan Haberland <sth(a)linux.ibm.com>
s390/dasd: fix data corruption for ESE devices
Mark Brown <broonie(a)kernel.org>
ASoC: meson: Fix event generation for AUI CODEC mux
Mark Brown <broonie(a)kernel.org>
ASoC: meson: Fix event generation for G12A tohdmi mux
Mark Brown <broonie(a)kernel.org>
ASoC: meson: Fix event generation for AUI ACODEC mux
Mark Brown <broonie(a)kernel.org>
ASoC: wm8958: Fix change notifications for DSP controls
Mark Brown <broonie(a)kernel.org>
ASoC: da7219: Fix change notifications for tone generator frequency
Thomas Pfaff <tpfaff(a)pcs.com>
genirq: Synchronize interrupt thread startup
Tan Tee Min <tee.min.tan(a)linux.intel.com>
net: stmmac: disable Split Header (SPH) for Intel platforms
Niels Dossche <dossche.niels(a)gmail.com>
firewire: core: extend card->lock in fw_core_handle_bus_reset
Jakob Koschel <jakobkoschel(a)gmail.com>
firewire: remove check of list iterator against head past the loop body
Chengfeng Ye <cyeaa(a)connect.ust.hk>
firewire: fix potential uaf in outbound_phy_packet_callback()
Kurt Kanzenbach <kurt(a)linutronix.de>
timekeeping: Mark NMI safe time accessors as notrace
Trond Myklebust <trond.myklebust(a)hammerspace.com>
Revert "SUNRPC: attempt AF_LOCAL connect on setup"
Nick Kossifidis <mick(a)ics.forth.gr>
RISC-V: relocate DTB if it's outside memory region
Marek Marczykowski-Górecki <marmarek(a)invisiblethingslab.com>
drm/amdgpu: do not use passthrough mode in Xen dom0
Harry Wentland <harry.wentland(a)amd.com>
drm/amd/display: Avoid reading audio pattern past AUDIO_CHANNELS_COUNT
Nicolin Chen <nicolinc(a)nvidia.com>
iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()
David Stevens <stevensd(a)chromium.org>
iommu/vt-d: Calculate mask for non-aligned flushes
Kyle Huey <me(a)kylehuey.com>
KVM: x86/svm: Account for family 17h event renumberings in amd_pmc_perf_hw_id
Thomas Gleixner <tglx(a)linutronix.de>
x86/fpu: Prevent FPU state corruption
Andrei Lalaev <andrei.lalaev(a)emlid.com>
gpiolib: of: fix bounds check for 'gpio-reserved-ranges'
Brian Norris <briannorris(a)chromium.org>
mmc: core: Set HS clock speed before sending HS CMD13
Samuel Holland <samuel(a)sholland.org>
mmc: sunxi-mmc: Fix DMA descriptors allocated above 32 bits
Shaik Sajida Bhanu <quic_c_sbhanu(a)quicinc.com>
mmc: sdhci-msm: Reset GCC_SDCC_BCR register for SDHC
Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
ALSA: fireworks: fix wrong return count shorter than expected by 4 bytes
Zihao Wang <wzhd(a)ustc.edu>
ALSA: hda/realtek: Add quirk for Yoga Duet 7 13ITL6 speakers
Helge Deller <deller(a)gmx.de>
parisc: Merge model and model name into one line in /proc/cpuinfo
Maciej W. Rozycki <macro(a)orcam.me.uk>
MIPS: Fix CP0 counter erratum detection for R4k CPUs
-------------
Diffstat:
Makefile | 4 +-
arch/mips/include/asm/timex.h | 8 +-
arch/mips/kernel/time.c | 11 +-
arch/parisc/kernel/processor.c | 3 +-
arch/parisc/kernel/setup.c | 2 +
arch/parisc/kernel/time.c | 6 +-
arch/riscv/mm/init.c | 21 +-
arch/x86/kernel/fpu/core.c | 67 ++--
arch/x86/kernel/kvm.c | 13 +
arch/x86/kvm/cpuid.c | 5 +
arch/x86/kvm/lapic.c | 10 +-
arch/x86/kvm/mmu/mmu.c | 2 +
arch/x86/kvm/svm/pmu.c | 28 +-
drivers/firewire/core-card.c | 3 +
drivers/firewire/core-cdev.c | 4 +-
drivers/firewire/core-topology.c | 9 +-
drivers/firewire/core-transaction.c | 30 +-
drivers/firewire/sbp2.c | 13 +-
drivers/gpio/gpio-mvebu.c | 7 -
drivers/gpio/gpio-pca953x.c | 4 +-
drivers/gpio/gpio-visconti.c | 7 +-
drivers/gpio/gpiolib-of.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 8 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 30 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 24 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 23 --
drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 30 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 4 +-
drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 +-
drivers/gpu/drm/msm/dp/dp_display.c | 6 -
drivers/gpu/drm/msm/dp/dp_panel.c | 11 -
drivers/gpu/drm/msm/dp/dp_panel.h | 1 -
drivers/hwmon/adt7470.c | 4 +-
drivers/hwmon/pmbus/pmbus_core.c | 3 +
drivers/infiniband/hw/irdma/cm.c | 26 +-
drivers/infiniband/hw/irdma/utils.c | 21 +-
drivers/infiniband/hw/irdma/verbs.c | 4 +-
drivers/infiniband/sw/siw/siw_cm.c | 7 +-
drivers/iommu/apple-dart.c | 10 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 9 +-
drivers/iommu/intel/iommu.c | 27 +-
drivers/iommu/intel/svm.c | 4 +
drivers/mmc/core/mmc.c | 23 +-
drivers/mmc/host/rtsx_pci_sdmmc.c | 29 +-
drivers/mmc/host/sdhci-msm.c | 42 ++
drivers/mmc/host/sunxi-mmc.c | 5 +-
drivers/net/can/grcan.c | 46 +--
drivers/net/dsa/mt7530.c | 1 +
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 13 +-
drivers/net/ethernet/huawei/hinic/hinic_hw_wq.c | 7 +-
drivers/net/ethernet/mediatek/mtk_sgmii.c | 1 +
.../ethernet/mellanox/mlx5/core/diag/rsc_dump.c | 31 +-
.../ethernet/mellanox/mlx5/core/en/port_buffer.c | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 4 +
drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c | 10 +
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 11 +
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c | 60 +--
drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c | 38 +-
drivers/net/ethernet/mellanox/mlx5/core/lag_mp.h | 7 +-
drivers/net/ethernet/smsc/smsc911x.c | 2 +-
drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c | 1 +
drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 1 +
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
drivers/net/ethernet/ti/cpsw_new.c | 5 +-
drivers/net/ethernet/xilinx/xilinx_emaclite.c | 15 +-
drivers/net/mdio/mdio-mux-bcm6368.c | 2 +-
drivers/nfc/nfcmrvl/main.c | 2 +-
drivers/pci/controller/pci-aardvark.c | 428 ++++++++++++++++-----
drivers/pci/pci-bridge-emul.c | 49 ++-
drivers/s390/block/dasd.c | 18 +-
drivers/s390/block/dasd_eckd.c | 28 +-
drivers/s390/block/dasd_int.h | 14 +
drivers/video/fbdev/core/fbmem.c | 5 +-
fs/btrfs/disk-io.c | 11 +
fs/btrfs/tree-log.c | 14 +-
fs/btrfs/xattr.c | 6 +-
fs/nfs/nfs4proc.c | 12 +-
include/linux/stmmac.h | 1 +
kernel/irq/internals.h | 2 +
kernel/irq/irqdesc.c | 2 +
kernel/irq/manage.c | 39 +-
kernel/rcu/tree.c | 31 +-
kernel/time/timekeeping.c | 4 +-
net/can/isotp.c | 22 +-
net/ipv4/igmp.c | 9 +-
net/ipv6/mcast.c | 8 +-
net/nfc/core.c | 29 +-
net/nfc/netlink.c | 4 +-
net/rxrpc/local_object.c | 3 +
net/sunrpc/clnt.c | 11 +-
net/sunrpc/xprtsock.c | 3 -
sound/firewire/fireworks/fireworks_hwdep.c | 1 +
sound/pci/hda/patch_realtek.c | 1 +
sound/soc/codecs/da7219.c | 14 +-
sound/soc/codecs/wm8958-dsp2.c | 8 +-
sound/soc/meson/aiu-acodec-ctrl.c | 2 +-
sound/soc/meson/aiu-codec-ctrl.c | 2 +-
sound/soc/meson/g12a-tohdmitx.c | 2 +-
sound/soc/soc-generic-dmaengine-pcm.c | 6 +-
sound/soc/soc-ops.c | 2 +-
.../drivers/net/ocelot/tc_flower_chains.sh | 2 +-
.../selftests/kvm/include/x86_64/processor.h | 15 +
tools/testing/selftests/kvm/kvm_page_table_test.c | 2 +-
tools/testing/selftests/kvm/lib/x86_64/processor.c | 192 ++++-----
.../net/forwarding/mirror_gre_bridge_1q.sh | 3 +
tools/testing/selftests/net/so_txtime.c | 4 +-
tools/testing/selftests/seccomp/seccomp_bpf.c | 10 +-
tools/testing/selftests/vm/mremap_test.c | 53 +++
110 files changed, 1293 insertions(+), 656 deletions(-)
(resending because of HTML formatting)
Hi, I'm on Arch Linux and upgraded from kernel 5.18.9.arch1-1 to
5.18.10.arch1-1. The brightness keys don't work as well as before.
Gnome had 20 degrees of brightness, now it's 10, and Xfce went from 10
to 5. Additionally, on Gnome the brightness keys are a little slow to
respond and there's a bit of a stutter. Don't know why Xfce doesn't
stutter, but halving the degrees of brightness for both makes me
wonder if each press is being counted twice.
Reverting commit 3a0cf7ab8d in acpi_video.c and rebuilding
5.18.10.arch1-1 fixed it.
The laptop is a Dell Inspiron n4010 and I use "acpi_backlight=video"
to make the brightness keys work. Please let me know if there's any
hardware info you need.
#regzbot introduced: 3a0cf7ab8d
Summit reports that the BHB backports for v4.9 prevent vulnerable
platforms from booting when CONFIG_RANDOMIZE_BASE is enabled.
This is because the trampoline code takes a translation fault when
accessing the data page, because the TTBR write hasn't been completed
by an ISB before the access is made.
Upstream has a complex erratum workaround for QCOM_FALKOR_E1003 in
this area, which removes the ISB when the workaround has been applied.
v4.9 lacks this workaround, but should still have the ISB.
Restore the barrier.
Fixes: aee10c2dd013 ("arm64: entry: Add macro for reading symbol addresses from the trampoline")
Reported-by: Sumit Gupta <sumitg(a)nvidia.com>
Tested-by: Sumit Gupta <sumitg(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: James Morse <james.morse(a)arm.com>
---
This only applies to the v4.9 backport, as v4.14 has the QCOM_FALKOR_E1003
workaround.
arch/arm64/kernel/entry.S | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 1f79abb1e5dd..4551c0f35fc4 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -964,6 +964,7 @@ __ni_sys_trace:
b .
2:
tramp_map_kernel x30
+ isb
tramp_data_read_var x30, vectors
prfm plil1strm, [x30, #(1b - \vector_start)]
msr vbar_el1, x30
--
2.30.2
Hello Jianglei,
On Thu, 14 Jul 2022 14:37:46 +0800 Jianglei Nie <niejianglei2021(a)163.com> wrote:
> damon_reclaim_init() allocates a memory chunk for ctx with
> damon_new_ctx(). When damon_select_ops() fails, ctx is not released, which
> will lead to a memory leak.
>
> We should release the ctx with damon_destroy_ctx() when damon_select_ops()
> fails to fix the memory leak.
Thank you for this patch!
I think below tags would be better to be added.
Fixes: 4d69c3457821 ("mm/damon/reclaim: use damon_select_ops() instead of damon_{v,p}a_set_operations()")
Cc: <stable(a)vger.kernel.org> # 5.18.x
>
> Signed-off-by: Jianglei Nie <niejianglei2021(a)163.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Thanks,
SJ
> ---
> mm/damon/reclaim.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> index 4b07c29effe9..0b3c7396cb90 100644
> --- a/mm/damon/reclaim.c
> +++ b/mm/damon/reclaim.c
> @@ -441,8 +441,10 @@ static int __init damon_reclaim_init(void)
> if (!ctx)
> return -ENOMEM;
>
> - if (damon_select_ops(ctx, DAMON_OPS_PADDR))
> + if (damon_select_ops(ctx, DAMON_OPS_PADDR)) {
> + damon_destroy_ctx(ctx);
> return -EINVAL;
> + }
>
> ctx->callback.after_wmarks_check = damon_reclaim_after_wmarks_check;
> ctx->callback.after_aggregation = damon_reclaim_after_aggregation;
> --
> 2.25.1
Greetings,
I am glad to know you, but God knows you better and he knows why he
has directed me to you at this point in time so do not be surprised at
all. My name is Mrs.Claude Jean Alary, a widow, i have been suffering
from ovarian cancer disease. At this moment i am about to end the race
like this because the illness has gotten to a very bad stage, without
any family members and no child. I hope that you will not expose or
betray this trust and confidence that I am about to entrust to you for
the mutual benefit of the orphans and the less privileged ones. I have
some funds I inherited from my late husband,the sum of ($12,000.000,
dollars.) deposited in the Bank. Having known my present health
status, I decided to entrust this fund to you believing that you will
utilize it the way i am going to instruct herein.
Therefore I need you to assist me and reclaim this money and use it
for Charity works, for orphanages and giving justice and help to the
poor, needy and to promote the words of God and the effort that the
house of God will be maintained says The Lord." Jeremiah 22:15-16.“
It will be my great pleasure to compensate you with 35 % percent of
the total money for your personal use, 5 % percent for any expenses
that may occur during the international transfer process while 60% of
the money will go to the charity project.
All I require from you is sincerity and the ability to complete God's
task without any failure. It will be my pleasure to see that the bank
has finally released and transferred the fund into your bank account
therein your country even before I die here in the hospital, because
of my present health status everything needs to be processed rapidly
as soon as possible. Please I am waiting for your immediate reply on
my email for further details of the transaction and execution of this
charitable project.
Best Regards.
Mrs.Claude Jean Alary.
Add function mb_cache_entry_delete_or_get() to delete mbcache entry if
it is unused and also add a function to wait for entry to become unused
- mb_cache_entry_wait_unused(). We do not share code between the two
deleting function as one of them will go away soon.
CC: stable(a)vger.kernel.org
Fixes: 82939d7999df ("ext4: convert to mbcache2")
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/mbcache.c | 66 +++++++++++++++++++++++++++++++++++++++--
include/linux/mbcache.h | 10 ++++++-
2 files changed, 73 insertions(+), 3 deletions(-)
diff --git a/fs/mbcache.c b/fs/mbcache.c
index cfc28129fb6f..2010bc80a3f2 100644
--- a/fs/mbcache.c
+++ b/fs/mbcache.c
@@ -11,7 +11,7 @@
/*
* Mbcache is a simple key-value store. Keys need not be unique, however
* key-value pairs are expected to be unique (we use this fact in
- * mb_cache_entry_delete()).
+ * mb_cache_entry_delete_or_get()).
*
* Ext2 and ext4 use this cache for deduplication of extended attribute blocks.
* Ext4 also uses it for deduplication of xattr values stored in inodes.
@@ -125,6 +125,19 @@ void __mb_cache_entry_free(struct mb_cache_entry *entry)
}
EXPORT_SYMBOL(__mb_cache_entry_free);
+/*
+ * mb_cache_entry_wait_unused - wait to be the last user of the entry
+ *
+ * @entry - entry to work on
+ *
+ * Wait to be the last user of the entry.
+ */
+void mb_cache_entry_wait_unused(struct mb_cache_entry *entry)
+{
+ wait_var_event(&entry->e_refcnt, atomic_read(&entry->e_refcnt) <= 3);
+}
+EXPORT_SYMBOL(mb_cache_entry_wait_unused);
+
static struct mb_cache_entry *__entry_find(struct mb_cache *cache,
struct mb_cache_entry *entry,
u32 key)
@@ -217,7 +230,7 @@ struct mb_cache_entry *mb_cache_entry_get(struct mb_cache *cache, u32 key,
}
EXPORT_SYMBOL(mb_cache_entry_get);
-/* mb_cache_entry_delete - remove a cache entry
+/* mb_cache_entry_delete - try to remove a cache entry
* @cache - cache we work with
* @key - key
* @value - value
@@ -254,6 +267,55 @@ void mb_cache_entry_delete(struct mb_cache *cache, u32 key, u64 value)
}
EXPORT_SYMBOL(mb_cache_entry_delete);
+/* mb_cache_entry_delete_or_get - remove a cache entry if it has no users
+ * @cache - cache we work with
+ * @key - key
+ * @value - value
+ *
+ * Remove entry from cache @cache with key @key and value @value. The removal
+ * happens only if the entry is unused. The function returns NULL in case the
+ * entry was successfully removed or there's no entry in cache. Otherwise the
+ * function grabs reference of the entry that we failed to delete because it
+ * still has users and return it.
+ */
+struct mb_cache_entry *mb_cache_entry_delete_or_get(struct mb_cache *cache,
+ u32 key, u64 value)
+{
+ struct hlist_bl_node *node;
+ struct hlist_bl_head *head;
+ struct mb_cache_entry *entry;
+
+ head = mb_cache_entry_head(cache, key);
+ hlist_bl_lock(head);
+ hlist_bl_for_each_entry(entry, node, head, e_hash_list) {
+ if (entry->e_key == key && entry->e_value == value) {
+ if (atomic_read(&entry->e_refcnt) > 2) {
+ atomic_inc(&entry->e_refcnt);
+ hlist_bl_unlock(head);
+ return entry;
+ }
+ /* We keep hash list reference to keep entry alive */
+ hlist_bl_del_init(&entry->e_hash_list);
+ hlist_bl_unlock(head);
+ spin_lock(&cache->c_list_lock);
+ if (!list_empty(&entry->e_list)) {
+ list_del_init(&entry->e_list);
+ if (!WARN_ONCE(cache->c_entry_count == 0,
+ "mbcache: attempt to decrement c_entry_count past zero"))
+ cache->c_entry_count--;
+ atomic_dec(&entry->e_refcnt);
+ }
+ spin_unlock(&cache->c_list_lock);
+ mb_cache_entry_put(cache, entry);
+ return NULL;
+ }
+ }
+ hlist_bl_unlock(head);
+
+ return NULL;
+}
+EXPORT_SYMBOL(mb_cache_entry_delete_or_get);
+
/* mb_cache_entry_touch - cache entry got used
* @cache - cache the entry belongs to
* @entry - entry that got used
diff --git a/include/linux/mbcache.h b/include/linux/mbcache.h
index 20f1e3ff6013..8eca7f25c432 100644
--- a/include/linux/mbcache.h
+++ b/include/linux/mbcache.h
@@ -30,15 +30,23 @@ void mb_cache_destroy(struct mb_cache *cache);
int mb_cache_entry_create(struct mb_cache *cache, gfp_t mask, u32 key,
u64 value, bool reusable);
void __mb_cache_entry_free(struct mb_cache_entry *entry);
+void mb_cache_entry_wait_unused(struct mb_cache_entry *entry);
static inline int mb_cache_entry_put(struct mb_cache *cache,
struct mb_cache_entry *entry)
{
- if (!atomic_dec_and_test(&entry->e_refcnt))
+ unsigned int cnt = atomic_dec_return(&entry->e_refcnt);
+
+ if (cnt > 0) {
+ if (cnt <= 3)
+ wake_up_var(&entry->e_refcnt);
return 0;
+ }
__mb_cache_entry_free(entry);
return 1;
}
+struct mb_cache_entry *mb_cache_entry_delete_or_get(struct mb_cache *cache,
+ u32 key, u64 value);
void mb_cache_entry_delete(struct mb_cache *cache, u32 key, u64 value);
struct mb_cache_entry *mb_cache_entry_get(struct mb_cache *cache, u32 key,
u64 value);
--
2.35.3
Currently, when loading a kernel image via the kexec_file_load() system
call, arm64 can only use the .builtin_trusted_keys keyring to verify
a signature whereas x86 can use three more keyrings i.e.
.secondary_trusted_keys, .machine and .platform keyrings. For example,
one resulting problem is kexec'ing a kernel image would be rejected
with the error "Lockdown: kexec: kexec of unsigned images is restricted;
see man kernel_lockdown.7".
This patch set enables arm64 to make use of the same keyrings as x86 to
verify the signature kexec'ed kernel image.
Fixes: 732b7b93d849 ("arm64: kexec_file: add kernel signature verification support")
Cc: stable(a)vger.kernel.org # 105e10e2cf1c: kexec_file: drop weak attribute from functions
Cc: stable(a)vger.kernel.org # 34d5960af253: kexec: clean up arch_kexec_kernel_verify_sig
Cc: stable(a)vger.kernel.org # 83b7bb2d49ae: kexec, KEYS: make the code in bzImage64_verify_sig generic
Acked-by: Baoquan He <bhe(a)redhat.com>
Cc: kexec(a)lists.infradead.org
Cc: keyrings(a)vger.kernel.org
Cc: linux-security-module(a)vger.kernel.org
Co-developed-by: Michal Suchanek <msuchanek(a)suse.de>
Signed-off-by: Michal Suchanek <msuchanek(a)suse.de>
Acked-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Coiby Xu <coxu(a)redhat.com>
---
arch/arm64/kernel/kexec_image.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
index 9ec34690e255..5ed6a585f21f 100644
--- a/arch/arm64/kernel/kexec_image.c
+++ b/arch/arm64/kernel/kexec_image.c
@@ -14,7 +14,6 @@
#include <linux/kexec.h>
#include <linux/pe.h>
#include <linux/string.h>
-#include <linux/verification.h>
#include <asm/byteorder.h>
#include <asm/cpufeature.h>
#include <asm/image.h>
@@ -130,18 +129,10 @@ static void *image_load(struct kimage *image,
return NULL;
}
-#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
-static int image_verify_sig(const char *kernel, unsigned long kernel_len)
-{
- return verify_pefile_signature(kernel, kernel_len, NULL,
- VERIFYING_KEXEC_PE_SIGNATURE);
-}
-#endif
-
const struct kexec_file_ops kexec_image_ops = {
.probe = image_probe,
.load = image_load,
#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
- .verify_sig = image_verify_sig,
+ .verify_sig = kexec_kernel_verify_pe_sig,
#endif
};
--
2.35.3
Before commit 105e10e2cf1c ("kexec_file: drop weak attribute from
functions"), there was already no arch-specific implementation
of arch_kexec_kernel_verify_sig. With weak attribute dropped by that
commit, arch_kexec_kernel_verify_sig is completely useless. So clean it
up.
Note this patch is dependent by later patches so it should backported to
the stable tree as well.
Cc: stable(a)vger.kernel.org
Suggested-by: Eric W. Biederman <ebiederm(a)xmission.com>
Reviewed-by: Michal Suchanek <msuchanek(a)suse.de>
Acked-by: Baoquan He <bhe(a)redhat.com>
Signed-off-by: Coiby Xu <coxu(a)redhat.com>
---
include/linux/kexec.h | 5 -----
kernel/kexec_file.c | 33 +++++++++++++--------------------
2 files changed, 13 insertions(+), 25 deletions(-)
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 6958c6b471f4..6e7510f39368 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -212,11 +212,6 @@ static inline void *arch_kexec_kernel_image_load(struct kimage *image)
}
#endif
-#ifdef CONFIG_KEXEC_SIG
-int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
- unsigned long buf_len);
-#endif
-
extern int kexec_add_buffer(struct kexec_buf *kbuf);
int kexec_locate_mem_hole(struct kexec_buf *kbuf);
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 0c27c81351ee..6dc1294c90fc 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -81,24 +81,6 @@ int kexec_image_post_load_cleanup_default(struct kimage *image)
return image->fops->cleanup(image->image_loader_data);
}
-#ifdef CONFIG_KEXEC_SIG
-static int kexec_image_verify_sig_default(struct kimage *image, void *buf,
- unsigned long buf_len)
-{
- if (!image->fops || !image->fops->verify_sig) {
- pr_debug("kernel loader does not support signature verification.\n");
- return -EKEYREJECTED;
- }
-
- return image->fops->verify_sig(buf, buf_len);
-}
-
-int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf, unsigned long buf_len)
-{
- return kexec_image_verify_sig_default(image, buf, buf_len);
-}
-#endif
-
/*
* Free up memory used by kernel, initrd, and command line. This is temporary
* memory allocation which is not needed any more after these buffers have
@@ -141,13 +123,24 @@ void kimage_file_post_load_cleanup(struct kimage *image)
}
#ifdef CONFIG_KEXEC_SIG
+static int kexec_image_verify_sig(struct kimage *image, void *buf,
+ unsigned long buf_len)
+{
+ if (!image->fops || !image->fops->verify_sig) {
+ pr_debug("kernel loader does not support signature verification.\n");
+ return -EKEYREJECTED;
+ }
+
+ return image->fops->verify_sig(buf, buf_len);
+}
+
static int
kimage_validate_signature(struct kimage *image)
{
int ret;
- ret = arch_kexec_kernel_verify_sig(image, image->kernel_buf,
- image->kernel_buf_len);
+ ret = kexec_image_verify_sig(image, image->kernel_buf,
+ image->kernel_buf_len);
if (ret) {
if (sig_enforce) {
--
2.35.3
When a nexthop is added, without a gw address, the default scope was set
to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen
but rejected when the route is used.
When using a route without a nexthop id, the scope can be configured in the
route, thus the problem doesn't exist.
To explain more deeply: when a user creates a nexthop, it cannot specify
the scope. To create it, the function nh_create_ipv4() calls fib_check_nh()
with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() wich was
setting scope to 'host'. Then, nh_create_ipv4() calls
fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is
chosen before the route is inserted.
When a 'standard' route (ie without a reference to a nexthop) is added,
fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by
the user. iproute2 set the scope to 'link' by default.
Here is a way to reproduce the problem:
ip netns add foo
ip -n foo link set lo up
ip netns add bar
ip -n bar link set lo up
sleep 1
ip -n foo link add name eth0 type dummy
ip -n foo link set eth0 up
ip -n foo address add 192.168.0.1/24 dev eth0
ip -n foo link add name veth0 type veth peer name veth1 netns bar
ip -n foo link set veth0 up
ip -n bar link set veth1 up
ip -n bar address add 192.168.1.1/32 dev veth1
ip -n bar route add default dev veth1
ip -n foo nexthop add id 1 dev veth0
ip -n foo route add 192.168.1.1 nhid 1
Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> RTNETLINK answers: Invalid argument
> $ ip netns exec foo ping -c1 192.168.1.1
> ping: connect: Invalid argument
Try without nexthop group (iproute2 sets scope to 'link' by dflt):
ip -n foo route del 192.168.1.1
ip -n foo route add 192.168.1.1 dev veth0
Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> 192.168.1.1 dev veth0 src 192.168.0.1 uid 0
> cache
> $ ip netns exec foo ping -c1 192.168.1.1
> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
>
> --- 192.168.1.1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms
CC: stable(a)vger.kernel.org
Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
Reported-by: Edwin Brossette <edwin.brossette(a)6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
---
v2 -> v3:
- no change
v1 -> v2:
- remove useless arp off / fixed mac settings in the description
net/ipv4/fib_semantics.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index a57ba23571c9..20177ecf5bdd 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1230,7 +1230,7 @@ static int fib_check_nh_nongw(struct net *net, struct fib_nh *nh,
nh->fib_nh_dev = in_dev->dev;
dev_hold_track(nh->fib_nh_dev, &nh->fib_nh_dev_tracker, GFP_ATOMIC);
- nh->fib_nh_scope = RT_SCOPE_HOST;
+ nh->fib_nh_scope = RT_SCOPE_LINK;
if (!netif_carrier_ok(nh->fib_nh_dev))
nh->fib_nh_flags |= RTNH_F_LINKDOWN;
err = 0;
--
2.33.0
On Wed, Jul 13, 2022 at 05:38:47AM +0800, kernel test robot wrote:
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> head: 53b881e19526bcc3e51d9668cab955c80dcf584c
> commit: 7575d3f3bbd1c68d6833b45d1b98ed182832bd44 [7082/7120] x86: Use return-thunk in asm code
> config: x86_64-rhel-8.3-syz (https://download.01.org/0day-ci/archive/20220713/202207130531.SkRjrrn8-lkp@…)
> compiler: gcc-11 (Debian 11.3.0-3) 11.3.0
> reproduce (this is a W=1 build):
> # https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/…
> git remote add linux-stable-rc https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> git fetch --no-tags linux-stable-rc linux-5.10.y
> git checkout 7575d3f3bbd1c68d6833b45d1b98ed182832bd44
> # save the config file
> mkdir build_dir && cp config build_dir/.config
> make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/
>
> If you fix the issue, kindly add following tag where applicable
> Reported-by: kernel test robot <lkp(a)intel.com>
>
> All warnings (new ones prefixed by >>):
>
> >> arch/x86/kernel/head_64.o: warning: objtool: xen_hypercall_mmu_update(): can't find starting instruction
>
> --
> 0-DAY CI Kernel Test Service
> https://01.org/lkp
Please add the following patch to fix this. This would also be
needed for 5.15-stable.
Ben.
From: Ben Hutchings <ben(a)decadent.org.uk>
Date: Thu, 14 Jul 2022 00:39:33 +0200
Subject: [PATCH] x86/xen: Fix initialisation in hypercall_page after rethunk
The hypercall_page is special and the RETs there should not be changed
into rethunk calls (but can have SLS mitigation). Change the initial
instructions to ret + int3 padding, as was done in upstream commit
5b2fc51576ef "x86/ibt,xen: Sprinkle the ENDBR".
Signed-off-by: Ben Hutchings <ben(a)decadent.org.uk>
---
arch/x86/xen/xen-head.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 38b73e7e54ba..2a3ef5fcba34 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -69,9 +69,9 @@ SYM_CODE_END(asm_cpu_bringup_and_idle)
SYM_CODE_START(hypercall_page)
.rept (PAGE_SIZE / 32)
UNWIND_HINT_FUNC
- .skip 31, 0x90
ANNOTATE_UNRET_SAFE
- RET
+ ret
+ .skip 31, 0xcc
.endr
#define HYPERCALL(n) \
--
Ben Hutchings
Always try to do things in chronological order;
it's less confusing that way.
--
Dear,
I had sent you a mail but i don't think you received it that's why am
writing you again.It is important you get back to me as soon as you
can.
Abd-Wabbo Maddah
Sphinx reported inline emphasis warning on netfs:
Documentation/filesystems/netfs_library.rst:384: WARNING: Inline emphasis start-string without end-string.
Documentation/filesystems/netfs_library.rst:384: WARNING: Inline emphasis start-string without end-string.
Documentation/filesystems/netfs_library:609: ./fs/netfs/buffered_read.c:318: WARNING: Inline emphasis start-string without end-string.
These warnings above are due to unsecaped *foliop, which confuses Sphinx as
italics syntax instead.
Use inline code for the pointer.
Link: https://lore.kernel.org/linux-doc/202207140742.GTPk4U8i-lkp@intel.com/
Fixes: 157be6ddd9e438 ("netfs: do not unlock and put the folio twice")
Reported-by: kernel test robot <lkp(a)intel.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Jeff Layton <jlayton(a)kernel.org>
Cc: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Cc: Xiubo Li <xiubli(a)redhat.com>
Cc: Ilya Dryomov <idryomov(a)gmail.com>
Cc: stable(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Bagas Sanjaya <bagasdotme(a)gmail.com>
---
Documentation/filesystems/netfs_library.rst | 6 +++---
fs/netfs/buffered_read.c | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst
index 8d4cf5d5822de4..73a4176144b3b8 100644
--- a/Documentation/filesystems/netfs_library.rst
+++ b/Documentation/filesystems/netfs_library.rst
@@ -382,9 +382,9 @@ The operations are as follows:
conflicting state before allowing it to be modified.
It may unlock and discard the folio it was given and set the caller's folio
- pointer to NULL. It should return 0 if everything is now fine (*foliop
- left set) or the op should be retried (*foliop cleared) and any other error
- code to abort the operation.
+ pointer to NULL. It should return 0 if everything is now fine (``*foliop``
+ left set) or the op should be retried (``*foliop`` cleared) and any other
+ error code to abort the operation.
* ``done``
diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 8fa0725cd64981..0ce53585215106 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -320,8 +320,8 @@ static bool netfs_skip_folio_read(struct folio *folio, loff_t pos, size_t len,
* pointer to the fsdata cookie that gets returned to the VM to be passed to
* write_end. It is permitted to sleep. It should return 0 if the request
* should go ahead or it may return an error. It may also unlock and put the
- * folio, provided it sets *foliop to NULL, in which case a return of 0 will
- * cause the folio to be re-got and the process to be retried.
+ * folio, provided it sets ``*foliop`` to NULL, in which case a return of 0
+ * will cause the folio to be re-got and the process to be retried.
*
* The calling netfs must initialise a netfs context contiguous to the vfs
* inode before calling this.
--
An old man doll... just what I always wanted! - Clara
--
My Dear Friend,
Did you receive the message i sent to you?
Regards,
Mr Jerry Dosso
......................................................
내 소중한 친구,
내가 당신에게 보낸 메시지를 받았습니까?
문안 인사,
미스터 제리 도소
All creation paths except for O_TMPFILE handle umask in the vfs directly
if the filesystem doesn't support or enable POSIX ACLs. If the filesystem
does then umask handling is deferred until posix_acl_create().
Because, O_TMPFILE misses umask handling in the vfs it will not honor
umask settings. Fix this by adding the missing umask handling.
Fixes: 60545d0d4610 ("[O_TMPFILE] it's still short a few helpers, but infrastructure should be OK now...")
Cc: <stable(a)vger.kernel.org> # 4.19+
Reported-by: Christian Brauner (Microsoft) <brauner(a)kernel.org>
Acked-by: Christian Brauner (Microsoft) <brauner(a)kernel.org>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Reviewed-and-Tested-by: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Yang Xu <xuyang2018.jy(a)fujitsu.com>
---
fs/namei.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/namei.c b/fs/namei.c
index 1f28d3f463c3..ac4225ad6ac4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3565,6 +3565,8 @@ struct dentry *vfs_tmpfile(struct user_namespace *mnt_userns,
child = d_alloc(dentry, &slash_name);
if (unlikely(!child))
goto out_err;
+ if (!IS_POSIXACL(dir))
+ mode &= ~current_umask();
error = dir->i_op->tmpfile(mnt_userns, dir, child, mode);
if (error)
goto out_err;
--
2.27.0
From: Sean Wang <sean.wang(a)mediatek.com>
This reverts commit 663457f421d41e9d2fcb1e84baf43d1433f80c08.
Signed-off-by: Sean Wang <sean.wang(a)mediatek.com>
---
There was mistaken in
649178c0493e ("mt76: mt7921e: fix possible probe failure after reboot")
that caused WiFi reset cannot work well as the reported issue
"PROBLEM: [Stable v5.15.42+] [mt7921] Wake after suspend locks up system
when mt7921-driver is used on a Lenovo ThinkPad E15 G3"
in http://lists.infradead.org/pipermail/linux-mediatek/2022-June/042668.html
So we need to revert the patch first. Will be applied again later.
---
drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
index 3d35838ef306..7d9b23a00238 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
@@ -254,10 +254,8 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
dev->bus_ops = dev->mt76.bus;
bus_ops = devm_kmemdup(dev->mt76.dev, dev->bus_ops, sizeof(*bus_ops),
GFP_KERNEL);
- if (!bus_ops) {
- ret = -ENOMEM;
- goto err_free_dev;
- }
+ if (!bus_ops)
+ return -ENOMEM;
bus_ops->rr = mt7921_rr;
bus_ops->wr = mt7921_wr;
@@ -266,7 +264,7 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
ret = __mt7921_mcu_drv_pmctrl(dev);
if (ret)
- goto err_free_dev;
+ return ret;
mdev->rev = (mt7921_l1_rr(dev, MT_HW_CHIPID) << 16) |
(mt7921_l1_rr(dev, MT_HW_REV) & 0xff);
--
2.25.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 5871321fb4558c55bf9567052b618ff0be6b975e ]
We currently report that range controls accept a range of 0..(max-min) but
accept writes in the range 0..(max-min+1). Remove that extra +1.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220604105246.4055214-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 90ba5521c189..4fda8c24be29 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -535,7 +535,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
@@ -556,7 +556,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
--
2.35.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 5871321fb4558c55bf9567052b618ff0be6b975e ]
We currently report that range controls accept a range of 0..(max-min) but
accept writes in the range 0..(max-min+1). Remove that extra +1.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220604105246.4055214-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 0848aec1bd24..81c9ecfa7c7f 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -535,7 +535,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
@@ -556,7 +556,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
--
2.35.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 5871321fb4558c55bf9567052b618ff0be6b975e ]
We currently report that range controls accept a range of 0..(max-min) but
accept writes in the range 0..(max-min+1). Remove that extra +1.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220604105246.4055214-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 7a37312c8e0c..453b61b42dd9 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -530,7 +530,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
@@ -551,7 +551,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
--
2.35.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 5871321fb4558c55bf9567052b618ff0be6b975e ]
We currently report that range controls accept a range of 0..(max-min) but
accept writes in the range 0..(max-min+1). Remove that extra +1.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220604105246.4055214-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 7a37312c8e0c..453b61b42dd9 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -530,7 +530,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
@@ -551,7 +551,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
--
2.35.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 5871321fb4558c55bf9567052b618ff0be6b975e ]
We currently report that range controls accept a range of 0..(max-min) but
accept writes in the range 0..(max-min+1). Remove that extra +1.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Link: https://lore.kernel.org/r/20220604105246.4055214-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
sound/soc/soc-ops.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 15bfcdbdfaa4..0f26d6c31ce5 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -517,7 +517,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
@@ -538,7 +538,7 @@ int snd_soc_put_volsw_range(struct snd_kcontrol *kcontrol,
return -EINVAL;
if (mc->platform_max && tmp > mc->platform_max)
return -EINVAL;
- if (tmp > mc->max - mc->min + 1)
+ if (tmp > mc->max - mc->min)
return -EINVAL;
if (invert)
--
2.35.1
--
Hello Dear,
how are you today?hope you are fine
My name is Dr Ava Smith ,Am an English and French nationalities.
I will give you pictures and more details about me as soon as i hear from you
Thanks
Ava
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e8e79c416aae1de224c0f1860f2e3350fa171f8 Mon Sep 17 00:00:00 2001
From: Marc Kleine-Budde <mkl(a)pengutronix.de>
Date: Thu, 17 Mar 2022 08:57:35 +0100
Subject: [PATCH] can: m_can: m_can_tx_handler(): fix use after free of skb
can_put_echo_skb() will clone skb then free the skb. Move the
can_put_echo_skb() for the m_can version 3.0.x directly before the
start of the xmit in hardware, similar to the 3.1.x branch.
Fixes: 80646733f11c ("can: m_can: update to support CAN FD features")
Link: https://lore.kernel.org/all/20220317081305.739554-1-mkl@pengutronix.de
Cc: stable(a)vger.kernel.org
Reported-by: Hangyu Hua <hbh25y(a)gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 1a4b56f6fa8c..b3b5bc1c803b 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1637,8 +1637,6 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
if (err)
goto out_fail;
- can_put_echo_skb(skb, dev, 0, 0);
-
if (cdev->can.ctrlmode & CAN_CTRLMODE_FD) {
cccr = m_can_read(cdev, M_CAN_CCCR);
cccr &= ~CCCR_CMR_MASK;
@@ -1655,6 +1653,9 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
m_can_write(cdev, M_CAN_CCCR, cccr);
}
m_can_write(cdev, M_CAN_TXBTIE, 0x1);
+
+ can_put_echo_skb(skb, dev, 0, 0);
+
m_can_write(cdev, M_CAN_TXBAR, 0x1);
/* End of xmit function for version 3.0.x */
} else {
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e8e79c416aae1de224c0f1860f2e3350fa171f8 Mon Sep 17 00:00:00 2001
From: Marc Kleine-Budde <mkl(a)pengutronix.de>
Date: Thu, 17 Mar 2022 08:57:35 +0100
Subject: [PATCH] can: m_can: m_can_tx_handler(): fix use after free of skb
can_put_echo_skb() will clone skb then free the skb. Move the
can_put_echo_skb() for the m_can version 3.0.x directly before the
start of the xmit in hardware, similar to the 3.1.x branch.
Fixes: 80646733f11c ("can: m_can: update to support CAN FD features")
Link: https://lore.kernel.org/all/20220317081305.739554-1-mkl@pengutronix.de
Cc: stable(a)vger.kernel.org
Reported-by: Hangyu Hua <hbh25y(a)gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 1a4b56f6fa8c..b3b5bc1c803b 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1637,8 +1637,6 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
if (err)
goto out_fail;
- can_put_echo_skb(skb, dev, 0, 0);
-
if (cdev->can.ctrlmode & CAN_CTRLMODE_FD) {
cccr = m_can_read(cdev, M_CAN_CCCR);
cccr &= ~CCCR_CMR_MASK;
@@ -1655,6 +1653,9 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
m_can_write(cdev, M_CAN_CCCR, cccr);
}
m_can_write(cdev, M_CAN_TXBTIE, 0x1);
+
+ can_put_echo_skb(skb, dev, 0, 0);
+
m_can_write(cdev, M_CAN_TXBAR, 0x1);
/* End of xmit function for version 3.0.x */
} else {
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e53ac7374e64dede04d745ff0e70ff5048378d1f Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel(a)surriel.com>
Date: Tue, 22 Mar 2022 14:44:09 -0700
Subject: [PATCH] mm: invalidate hwpoison page cache page in fault path
Sometimes the page offlining code can leave behind a hwpoisoned clean
page cache page. This can lead to programs being killed over and over
and over again as they fault in the hwpoisoned page, get killed, and
then get re-spawned by whatever wanted to run them.
This is particularly embarrassing when the page was offlined due to
having too many corrected memory errors. Now we are killing tasks due
to them trying to access memory that probably isn't even corrupted.
This problem can be avoided by invalidating the page from the page fault
handler, which already has a branch for dealing with these kinds of
pages. With this patch we simply pretend the page fault was successful
if the page was invalidated, return to userspace, incur another page
fault, read in the file from disk (to a new memory page), and then
everything works again.
Link: https://lkml.kernel.org/r/20220212213740.423efcea@imladris.surriel.com
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memory.c b/mm/memory.c
index c96281458c83..1a55b4c5b5db 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3877,11 +3877,16 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
return ret;
if (unlikely(PageHWPoison(vmf->page))) {
- if (ret & VM_FAULT_LOCKED)
+ vm_fault_t poisonret = VM_FAULT_HWPOISON;
+ if (ret & VM_FAULT_LOCKED) {
+ /* Retry if a clean page was removed from the cache. */
+ if (invalidate_inode_page(vmf->page))
+ poisonret = 0;
unlock_page(vmf->page);
+ }
put_page(vmf->page);
vmf->page = NULL;
- return VM_FAULT_HWPOISON;
+ return poisonret;
}
if (unlikely(!(ret & VM_FAULT_LOCKED)))
--
Congratulations!
The sum of €1,500,000.00 has been donated to you by STEFANO PESSINA.
Kindly get back for more info via pstefanopessina80(a)gmail.com
When commit 72f2ecb7ece7 ("ACPI: bus: Set CPPC _OSC bits for all
and when CPPC_LIB is supported") was introduced, we found collateral
damage that a number of AMD systems that supported CPPC but
didn't advertise support in _OSC stopped having a functional
amd-pstate driver. The _OSC was only enforced on Intel systems at that
time.
This was fixed for the MSR based designs by commit 8b356e536e69f
("ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported")
but some shared memory based designs also support CPPC but haven't
advertised support in the _OSC. Add support for those designs as well by
hardcoding the list of systems.
Fixes: 72f2ecb7ece7 ("ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported")
Fixes: 8b356e536e69f ("ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported")
Link: https://lore.kernel.org/all/3559249.JlDtxWtqDm@natalenko.name/
Cc: stable(a)vger.kernel.org # 5.18
Reported-and-tested-by: Oleksandr Natalenko <oleksandr(a)natalenko.name>
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
arch/x86/kernel/acpi/cppc.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
index 734b96454896..8d8752b44f11 100644
--- a/arch/x86/kernel/acpi/cppc.c
+++ b/arch/x86/kernel/acpi/cppc.c
@@ -16,6 +16,12 @@ bool cpc_supported_by_cpu(void)
switch (boot_cpu_data.x86_vendor) {
case X86_VENDOR_AMD:
case X86_VENDOR_HYGON:
+ if (boot_cpu_data.x86 == 0x19 && ((boot_cpu_data.x86_model <= 0x0f) ||
+ (boot_cpu_data.x86_model >= 0x20 && boot_cpu_data.x86_model <= 0x2f)))
+ return true;
+ else if (boot_cpu_data.x86 == 0x17 &&
+ boot_cpu_data.x86_model >= 0x70 && boot_cpu_data.x86_model <= 0x7f)
+ return true;
return boot_cpu_has(X86_FEATURE_CPPC);
}
return false;
--
2.34.1
The patch titled
Subject: userfaultfd: provide properly masked address for huge-pages
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
userfaultfd-provide-properly-masked-address-for-huge-pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Nadav Amit <namit(a)vmware.com>
Subject: userfaultfd: provide properly masked address for huge-pages
Date: Mon, 11 Jul 2022 09:59:06 -0700
Commit 824ddc601adc ("userfaultfd: provide unmasked address on
page-fault") was introduced to fix an old bug, in which the offset in the
address of a page-fault was masked. Concerns were raised - although were
never backed by actual code - that some userspace code might break because
the bug has been around for quite a while. To address these concerns a
new flag was introduced, and only when this flag is set by the user,
userfaultfd provides the exact address of the page-fault.
The commit however had a bug, and if the flag is unset, the offset was
always masked based on a base-page granularity. Yet, for huge-pages, the
behavior prior to the commit was that the address is masked to the
huge-page granulrity.
While there are no reports on real breakage, fix this issue. If the flag
is unset, use the address with the masking that was done before.
Link: https://lkml.kernel.org/r/20220711165906.2682-1-namit@vmware.com
Fixes: 824ddc601adc ("userfaultfd: provide unmasked address on page-fault")
Signed-off-by: Nadav Amit <namit(a)vmware.com>
Reported-by: James Houghton <jthoughton(a)google.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reviewed-by: Peter Xu <peterx(a)redhat.com>
Reviewed-by: James Houghton <jthoughton(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/userfaultfd.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/fs/userfaultfd.c~userfaultfd-provide-properly-masked-address-for-huge-pages
+++ a/fs/userfaultfd.c
@@ -192,17 +192,19 @@ static inline void msg_init(struct uffd_
}
static inline struct uffd_msg userfault_msg(unsigned long address,
+ unsigned long real_address,
unsigned int flags,
unsigned long reason,
unsigned int features)
{
struct uffd_msg msg;
+
msg_init(&msg);
msg.event = UFFD_EVENT_PAGEFAULT;
- if (!(features & UFFD_FEATURE_EXACT_ADDRESS))
- address &= PAGE_MASK;
- msg.arg.pagefault.address = address;
+ msg.arg.pagefault.address = (features & UFFD_FEATURE_EXACT_ADDRESS) ?
+ real_address : address;
+
/*
* These flags indicate why the userfault occurred:
* - UFFD_PAGEFAULT_FLAG_WP indicates a write protect fault.
@@ -488,8 +490,8 @@ vm_fault_t handle_userfault(struct vm_fa
init_waitqueue_func_entry(&uwq.wq, userfaultfd_wake_function);
uwq.wq.private = current;
- uwq.msg = userfault_msg(vmf->real_address, vmf->flags, reason,
- ctx->features);
+ uwq.msg = userfault_msg(vmf->address, vmf->real_address, vmf->flags,
+ reason, ctx->features);
uwq.ctx = ctx;
uwq.waken = false;
_
Patches currently in -mm which might be from namit(a)vmware.com are
userfaultfd-provide-properly-masked-address-for-huge-pages.patch
KASAN reports:
[ 4.668325][ T0] BUG: KASAN: wild-memory-access in dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
[ 4.676149][ T0] Read of size 8 at addr 1fffffff85115558 by task swapper/0/0
[ 4.683454][ T0]
[ 4.685638][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc3-00004-g0e862838f290 #1
[ 4.694331][ T0] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
[ 4.703196][ T0] Call Trace:
[ 4.706334][ T0] <TASK>
[ 4.709133][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
after converting the type of the first argument (@nr, bit number)
of arch_test_bit() from `long` to `unsigned long`[0].
Under certain conditions (for example, when ACPI NUMA is disabled
via command line), pxm_to_node() can return %NUMA_NO_NODE (-1).
It is valid 'magic' number of NUMA node, but not valid bit number
to use in bitops.
node_online() eventually descends to test_bit() without checking
for the input, assuming it's on caller side (which might be good
for perf-critical tasks). There, -1 becomes %ULONG_MAX which leads
to an insane array index when calculating bit position in memory.
For now, add an explicit check for @node being not %NUMA_NO_NODE
before calling test_bit(). The actual logics didn't change here
at all.
Fixes: ee34b32d8c29 ("dmar: support for parsing Remapping Hardware Static Affinity structure")
Cc: stable(a)vger.kernel.org # 2.6.33+
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin(a)intel.com>
---
drivers/iommu/intel/dmar.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 9699ca101c62..64b14ac4c7b0 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -494,7 +494,7 @@ static int dmar_parse_one_rhsa(struct acpi_dmar_header *header, void *arg)
if (drhd->reg_base_addr == rhsa->base_address) {
int node = pxm_to_node(rhsa->proximity_domain);
- if (!node_online(node))
+ if (node != NUMA_NO_NODE && !node_online(node))
node = NUMA_NO_NODE;
drhd->iommu->node = node;
return 0;
--
2.36.1
We have an application with a lot of threads that use a shared mmap
backed by tmpfs mounted with -o huge=within_size. This application
started leaking loads of huge pages when we upgraded to a recent kernel.
Using the page ref tracepoints and a BPF program written by Tejun Heo we
were able to determine that these pages would have multiple refcounts
from the page fault path, but when it came to unmap time we wouldn't
drop the number of refs we had added from the faults.
I wrote a reproducer that mmap'ed a file backed by tmpfs with -o
huge=always, and then spawned 20 threads all looping faulting random
offsets in this map, while using madvise(MADV_DONTNEED) randomly for
huge page aligned ranges. This very quickly reproduced the problem.
The problem here is that we check for the case that we have multiple
threads faulting in a range that was previously unmapped. One thread
maps the PMD, the other thread loses the race and then returns 0.
However at this point we already have the page, and we are no longer
putting this page into the processes address space, and so we leak the
page. We actually did the correct thing prior to f9ce0be71d1f, however
it looks like Kirill copied what we do in the anonymous page case. In
the anonymous page case we don't yet have a page, so we don't have to
drop a reference on anything. Previously we did the correct thing for
file based faults by returning VM_FAULT_NOPAGE so we correctly drop the
reference on the page we faulted in.
Fix this by returning VM_FAULT_NOPAGE in the pmd_devmap_trans_unstable()
case, this makes us drop the ref on the page properly, and now my
reproducer no longer leaks the huge pages.
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: stable(a)vger.kernel.org
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Rik van Riel <riel(a)surriel.com>
Signed-off-by: Chris Mason <clm(a)fb.com>
---
v1->v2:
- Added Kirill's Acked-by.
- Added cc:stable
- Added a comment about why we need to return NOPAGE.
mm/memory.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 7a089145cad4..207b29b09286 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4369,9 +4369,12 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
return VM_FAULT_OOM;
}
- /* See comment in handle_pte_fault() */
+ /*
+ * See comment in handle_pte_fault() for how this scenario happens, we
+ * need to return NOPAGE so that we drop this page.
+ */
if (pmd_devmap_trans_unstable(vmf->pmd))
- return 0;
+ return VM_FAULT_NOPAGE;
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
vmf->address, &vmf->ptl);
--
2.36.1
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 230ec83d4299b30c51a1c133b4f2a669972cc08a
Gitweb: https://git.kernel.org/tip/230ec83d4299b30c51a1c133b4f2a669972cc08a
Author: Juergen Gross <jgross(a)suse.com>
AuthorDate: Fri, 08 Jul 2022 15:14:56 +02:00
Committer: Borislav Petkov <bp(a)suse.de>
CommitterDate: Wed, 13 Jul 2022 12:44:04 +02:00
x86/pat: Fix x86_has_pat_wp()
x86_has_pat_wp() is using a wrong test, as it relies on the normal
PAT configuration used by the kernel. In case the PAT MSR has been
setup by another entity (e.g. Xen hypervisor) it might return false
even if the PAT configuration is allowing WP mappings. This due to the
fact that when running as Xen PV guest the PAT MSR is setup by the
hypervisor and cannot be changed by the guest. This results in the WP
related entry to be at a different position when running as Xen PV
guest compared to the bare metal or fully virtualized case.
The correct way to test for WP support is:
1. Get the PTE protection bits needed to select WP mode by reading
__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP] (depending on the PAT MSR
setting this might return protection bits for a stronger mode, e.g.
UC-)
2. Translate those bits back into the real cache mode selected by those
PTE bits by reading __pte2cachemode_tbl[__pte2cm_idx(prot)]
3. Test for the cache mode to be _PAGE_CACHE_MODE_WP
Fixes: f88a68facd9a ("x86/mm: Extend early_memremap() support with additional attrs")
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Cc: <stable(a)vger.kernel.org> # 4.14
Link: https://lore.kernel.org/r/20220503132207.17234-1-jgross@suse.com
---
arch/x86/mm/init.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index d8cfce2..57ba550 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -77,10 +77,20 @@ static uint8_t __pte2cachemode_tbl[8] = {
[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
};
-/* Check that the write-protect PAT entry is set for write-protect */
+/*
+ * Check that the write-protect PAT entry is set for write-protect.
+ * To do this without making assumptions how PAT has been set up (Xen has
+ * another layout than the kernel), translate the _PAGE_CACHE_MODE_WP cache
+ * mode via the __cachemode2pte_tbl[] into protection bits (those protection
+ * bits will select a cache mode of WP or better), and then translate the
+ * protection bits back into the cache mode using __pte2cm_idx() and the
+ * __pte2cachemode_tbl[] array. This will return the really used cache mode.
+ */
bool x86_has_pat_wp(void)
{
- return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
+ uint16_t prot = __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP];
+
+ return __pte2cachemode_tbl[__pte2cm_idx(prot)] == _PAGE_CACHE_MODE_WP;
}
enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: f0592491eba2e42cc84d7831750b84cfc0150dde
Gitweb: https://git.kernel.org/tip/f0592491eba2e42cc84d7831750b84cfc0150dde
Author: Juergen Gross <jgross(a)suse.com>
AuthorDate: Fri, 08 Jul 2022 15:14:56 +02:00
Committer: Borislav Petkov <bp(a)suse.de>
CommitterDate: Wed, 13 Jul 2022 12:21:03 +02:00
x86/pat: Fix x86_has_pat_wp()
x86_has_pat_wp() is using a wrong test, as it relies on the normal
PAT configuration used by the kernel. In case the PAT MSR has been
setup by another entity (e.g. Xen hypervisor) it might return false
even if the PAT configuration is allowing WP mappings. This due to the
fact that when running as Xen PV guest the PAT MSR is setup by the
hypervisor and cannot be changed by the guest. This results in the WP
related entry to be at a different position when running as Xen PV
guest compared to the bare metal or fully virtualized case.
The correct way to test for WP support is:
1. Get the PTE protection bits needed to select WP mode by reading
__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP] (depending on the PAT MSR
setting this might return protection bits for a stronger mode, e.g.
UC-)
2. Translate those bits back into the real cache mode selected by those
PTE bits by reading __pte2cachemode_tbl[__pte2cm_idx(prot)]
3. Test for the cache mode to be _PAGE_CACHE_MODE_WP
Fixes: f88a68facd9a ("x86/mm: Extend early_memremap() support with additional attrs")
Signed-off-by: Juergen Gross <jgross(a)suse.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Cc: <stable(a)vger.kernel.org> # 4.14
Link: https://lore.kernel.org/r/20220503132207.17234-1-jgross@suse.com
---
arch/x86/mm/init.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index d8cfce2..57ba550 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -77,10 +77,20 @@ static uint8_t __pte2cachemode_tbl[8] = {
[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
};
-/* Check that the write-protect PAT entry is set for write-protect */
+/*
+ * Check that the write-protect PAT entry is set for write-protect.
+ * To do this without making assumptions how PAT has been set up (Xen has
+ * another layout than the kernel), translate the _PAGE_CACHE_MODE_WP cache
+ * mode via the __cachemode2pte_tbl[] into protection bits (those protection
+ * bits will select a cache mode of WP or better), and then translate the
+ * protection bits back into the cache mode using __pte2cm_idx() and the
+ * __pte2cachemode_tbl[] array. This will return the really used cache mode.
+ */
bool x86_has_pat_wp(void)
{
- return __pte2cachemode_tbl[_PAGE_CACHE_MODE_WP] == _PAGE_CACHE_MODE_WP;
+ uint16_t prot = __cachemode2pte_tbl[_PAGE_CACHE_MODE_WP];
+
+ return __pte2cachemode_tbl[__pte2cm_idx(prot)] == _PAGE_CACHE_MODE_WP;
}
enum page_cache_mode pgprot2cachemode(pgprot_t pgprot)
When I look into implements of create_hist_fields(), I think there can be
following two simplifications:
1. If something wrong happened in parse_var_defs(), free_var_defs() would
have been called in it, so no need goto free again after calling it;
2. After calling create_key_fields(), regardless of the value of 'ret', it
then always runs into 'out: ', so the judge of 'ret' is redundant.
No functional changes.
Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
---
kernel/trace/trace_events_hist.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 2784951e0fc8..832c4ccf41ab 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -4454,7 +4454,7 @@ static int create_hist_fields(struct hist_trigger_data *hist_data,
ret = parse_var_defs(hist_data);
if (ret)
- goto out;
+ return ret;
ret = create_val_fields(hist_data, file);
if (ret)
@@ -4465,8 +4465,7 @@ static int create_hist_fields(struct hist_trigger_data *hist_data,
goto out;
ret = create_key_fields(hist_data, file);
- if (ret)
- goto out;
+
out:
free_var_defs(hist_data);
--
2.32.0
From: Chris Wilson <chris.p.wilson(a)intel.com>
Skip all further TLB invalidations once the device is wedged and
had been reset, as, on such cases, it can no longer process instructions
on the GPU and the user no longer has access to the TLB's in each engine.
That helps to reduce the performance regression introduced by TLB
invalidate logic.
Cc: stable(a)vger.kernel.org
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Signed-off-by: Chris Wilson <chris.p.wilson(a)intel.com>
Cc: Fei Yang <fei.yang(a)intel.com>
Cc: Andi Shyti <andi.shyti(a)linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
---
See [PATCH 00/21] at: https://lore.kernel.org/all/cover.1657703926.git.mchehab@kernel.org/
drivers/gpu/drm/i915/gt/intel_gt.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 1d84418e8676..5c55a90672f4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -934,6 +934,9 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
if (I915_SELFTEST_ONLY(gt->awake == -ENODEV))
return;
+ if (intel_gt_is_wedged(gt))
+ return;
+
if (GRAPHICS_VER(i915) == 12) {
regs = gen12_regs;
num = ARRAY_SIZE(gen12_regs);
--
2.36.1
From: Chris Wilson <chris.p.wilson(a)intel.com>
Don't flush TLBs when the buffer is only used in the GGTT under full
control of the kernel, as there's no risk of concurrent access
and stale access from prefetch.
We only need to invalidate the TLB if they are accessible by the user.
That helps to reduce the performance regression introduced by TLB
invalidate logic.
Cc: stable(a)vger.kernel.org
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Signed-off-by: Chris Wilson <chris.p.wilson(a)intel.com>
Cc: Fei Yang <fei.yang(a)intel.com>
Cc: Andi Shyti <andi.shyti(a)linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab(a)kernel.org>
---
See [PATCH 00/21] at: https://lore.kernel.org/all/cover.1657703926.git.mchehab@kernel.org/
drivers/gpu/drm/i915/i915_vma.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index ef3b04c7e153..646f419b2035 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -538,7 +538,8 @@ int i915_vma_bind(struct i915_vma *vma,
bind_flags);
}
- set_bit(I915_BO_WAS_BOUND_BIT, &vma->obj->flags);
+ if (bind_flags & I915_VMA_LOCAL_BIND)
+ set_bit(I915_BO_WAS_BOUND_BIT, &vma->obj->flags);
atomic_or(bind_flags, &vma->flags);
return 0;
--
2.36.1
Driver registration fails on SOC imx8mn as its supplier, the clock
control module, is probed later than subsys initcall level. This driver
uses platform_driver_probe which is not compatible with deferred probing
and won't be probed again later if probe function fails due to clock not
being available at that time.
This patch replaces the use of platform_driver_probe with
platform_driver_register which will allow probing the driver later again
when the clock control module will be available.
Fixes: a580b8c5429a ("dmaengine: mxs-dma: add dma support for i.MX23/28")
Co-developed-by: Michael Trimarchi <michael(a)amarulasolutions.com>
Signed-off-by: Michael Trimarchi <michael(a)amarulasolutions.com>
Signed-off-by: Dario Binacchi <dario.binacchi(a)amarulasolutions.com>
Cc: stable(a)vger.kernel.org
---
Changes in v5:
- Update the commit message.
- Create a new patch to remove the warning generated by this patch.
Changes in v4:
- Restore __init in front of mxs_dma_probe() definition.
- Rename the mxs_dma_driver variable to mxs_dma_driver_probe.
- Update the commit message.
- Use builtin_platform_driver() instead of module_platform_driver().
Changes in v3:
- Restore __init in front of mxs_dma_init() definition.
Changes in v2:
- Add the tag "Cc: stable(a)vger.kernel.org" in the sign-off area.
drivers/dma/mxs-dma.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/dma/mxs-dma.c b/drivers/dma/mxs-dma.c
index 994fc4d2aca4..18f8154b859b 100644
--- a/drivers/dma/mxs-dma.c
+++ b/drivers/dma/mxs-dma.c
@@ -839,10 +839,6 @@ static struct platform_driver mxs_dma_driver = {
.name = "mxs-dma",
.of_match_table = mxs_dma_dt_ids,
},
+ .probe = mxs_dma_probe,
};
-
-static int __init mxs_dma_module_init(void)
-{
- return platform_driver_probe(&mxs_dma_driver, mxs_dma_probe);
-}
-subsys_initcall(mxs_dma_module_init);
+builtin_platform_driver(mxs_dma_driver);
--
2.32.0
Currently there are two corner cases not handling compat RO flags
correctly:
- Remount
We can still mount the fs RO with compat RO flags, then remount it RW.
We should not allow any write into a fs with unsupported RO flags.
- Still try to search block group items
In fact, behavior/on-disk format change to extent tree should not
need a full incompat flag.
And since we can ensure fs with unsupported RO flags never got any
writes (with above case fixed), then we can even skip block group
items search at mount time.
This patch will enhance the unsupported RO compat flags by:
- Reject RW remount if there is unsupported RO compat flags
- Go dummy block group items directly for unsupported RO compat flags
In fact, only changes to chunk/subvolume/root/csum trees should go
incompat flags.
The latter part should allow future change to extent tree to be compat
RO flags.
Thus this patch also needs to be backported to all stable trees.
Cc: stable(a)vger.kernel.org
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
fs/btrfs/block-group.c | 11 ++++++++++-
fs/btrfs/super.c | 9 +++++++++
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index ba1ce6a3bc93..5b08ac282ace 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2200,7 +2200,16 @@ int btrfs_read_block_groups(struct btrfs_fs_info *info)
int need_clear = 0;
u64 cache_gen;
- if (!root)
+ /*
+ * Either no extent root (with ibadroots rescue option) or we have
+ * unsupporter RO options. The fs can never be mounted RW, so no
+ * need to waste time search block group items.
+ *
+ * This also allows new extent tree related changes to be RO compat,
+ * no need for a full incompat flag.
+ */
+ if (!root || (btrfs_super_compat_ro_flags(info->super_copy) &
+ ~BTRFS_FEATURE_COMPAT_RO_SUPP))
return fill_dummy_bgs(info);
key.objectid = 0;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 4c7089b1681b..7d3213e67fb5 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2110,6 +2110,15 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
ret = -EINVAL;
goto restore;
}
+ if (btrfs_super_compat_ro_flags(fs_info->super_copy) &
+ ~BTRFS_FEATURE_COMPAT_RO_SUPP) {
+ btrfs_err(fs_info,
+ "can not remount read-write due to unsupported optional flags 0x%llx",
+ btrfs_super_compat_ro_flags(fs_info->super_copy) &
+ ~BTRFS_FEATURE_COMPAT_RO_SUPP);
+ ret = -EINVAL;
+ goto restore;
+ }
if (fs_info->fs_devices->rw_devices == 0) {
ret = -EACCES;
goto restore;
--
2.37.0
The commit 99c13b8c8896d7bcb92753bf
("x86/mm/pat: Don't report PAT on CPUs that don't support it")
incorrectly failed to account for the case in init_cache_modes() when
CPUs do support PAT and falsely reported PAT to be disabled when in
fact PAT is enabled. In some environments, notably in Xen PV domains,
MTRR is disabled but PAT is still enabled, and that is the case
that the aforementioned commit failed to account for.
As an unfortunate consequnce, the pat_enabled() function currently does
not correctly report that PAT is enabled in such environments. The fix
is implemented in init_cache_modes() by setting pat_bp_enabled to true
in init_cache_modes() for the case that commit 99c13b8c8896d7bcb92753bf
("x86/mm/pat: Don't report PAT on CPUs that don't support it") failed
to account for.
This patch fixes a regression that some users are experiencing with
Linux as a Xen Dom0 driving particular Intel graphics devices by
correctly reporting to the Intel i915 driver that PAT is enabled where
previously it was falsely reporting that PAT is disabled.
Fixes: 99c13b8c8896d7bcb92753bf ("x86/mm/pat: Don't report PAT on CPUs that don't support it")
Cc: stable(a)vger.kernel.org
Signed-off-by: Chuck Zmudzinski <brchuckz(a)aol.com>
---
Reminder: This patch is a regression fix that is needed on stable
versions 5.17 and later.
arch/x86/mm/pat/memtype.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index d5ef64ddd35e..0f2417bd1b40 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -315,6 +315,18 @@ void init_cache_modes(void)
PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
}
+ else if (!pat_bp_enabled) {
+ /*
+ * In some environments, specifically Xen PV, PAT
+ * initialization is skipped because MTRRs are disabled even
+ * though PAT is available. In such environments, set PAT to
+ * enabled to correctly indicate to callers of pat_enabled()
+ * that CPU support for PAT is available.
+ */
+ pat_bp_enabled = true;
+ pr_info("x86/PAT: PAT enabled by init_cache_modes\n");
+ }
+
__init_cache_modes(pat);
}
--
2.36.1