From: Marcos Paulo de Souza <mpdesouza(a)suse.com>
[BUG]
After commit 9afc66498a0b ("btrfs: block-group: refactor how we read one
block group item"), cache->length is being assigned after calling
btrfs_create_block_group_cache. This causes a problem since
set_free_space_tree_thresholds uses cache->length to calculate the
free-space thresholds that decide if the free-space tree should convert
from extents to bitmaps.
The current code calls set_free_space_tree_thresholds with cache->length
being 0, which then makes cache->bitmap_high_thresh zero. This
implies the system will always use bitmaps instead of extents, which is
not desired if the block group is not fragmented.
This behavior can be seen in a test that expects to repair filesystems
with both FREE_SPACE_EXTENT and FREE_SPACE_BITMAP items, but the current
code only creates FREE_SPACE_BITMAP.
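For illustration, a minimal userspace sketch of why this happens (the
scaling factor is made up, not the exact kernel formula):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* hypothetical scaling: allow one free-space extent per 32 MiB of length */
static uint64_t bitmap_high_thresh(uint64_t bg_length)
{
    return bg_length / (32ULL * 1024 * 1024);
}

static bool convert_to_bitmaps(uint64_t extent_count, uint64_t bg_length)
{
    /* with bg_length == 0 the threshold is 0, so any extent converts */
    return extent_count > bitmap_high_thresh(bg_length);
}

int main(void)
{
    printf("length 0:     convert = %d\n", convert_to_bitmaps(1, 0));
    printf("length 1 GiB: convert = %d\n", convert_to_bitmaps(1, 1ULL << 30));
    return 0;
}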
[FIX]
Call set_free_space_tree_thresholds after setting cache->length.
Link: https://github.com/kdave/btrfs-progs/issues/251
Fixes: 9afc66498a0b ("btrfs: block-group: refactor how we read one block group item")
CC: stable(a)vger.kernel.org # 5.8+
Reviewed-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza(a)suse.com>
---
Changes from v1:
* Add a warning in set_free_space_tree_thresholds when bg->length is zero (Qu)
fs/btrfs/block-group.c | 4 +++-
fs/btrfs/free-space-tree.c | 3 +++
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 44fdfa2eeb2e..01e8ba1da1d3 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1798,7 +1798,6 @@ static struct btrfs_block_group *btrfs_create_block_group_cache(
cache->fs_info = fs_info;
cache->full_stripe_len = btrfs_full_stripe_len(fs_info, start);
- set_free_space_tree_thresholds(cache);
cache->discard_index = BTRFS_DISCARD_INDEX_UNUSED;
@@ -1908,6 +1907,8 @@ static int read_one_block_group(struct btrfs_fs_info *info,
read_block_group_item(cache, path, key);
+ set_free_space_tree_thresholds(cache);
+
if (need_clear) {
/*
* When we mount with old space cache, we need to
@@ -2128,6 +2129,7 @@ int btrfs_make_block_group(struct btrfs_trans_handle *trans, u64 bytes_used,
return -ENOMEM;
cache->length = size;
+ set_free_space_tree_thresholds(cache);
cache->used = bytes_used;
cache->flags = type;
cache->last_byte_to_unpin = (u64)-1;
diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
index 8b1f5c8897b7..1d191fbc754b 100644
--- a/fs/btrfs/free-space-tree.c
+++ b/fs/btrfs/free-space-tree.c
@@ -22,6 +22,9 @@ void set_free_space_tree_thresholds(struct btrfs_block_group *cache)
size_t bitmap_size;
u64 num_bitmaps, total_bitmap_size;
+ if (cache->length == 0)
+ btrfs_warn(cache->fs_info, "block group length is zero");
+
/*
* We convert to bitmaps when the disk space required for using extents
* exceeds that required for using bitmaps.
--
2.28.0
From: Andy Lutomirski <luto(a)kernel.org>
Trying to clear DR7 around a #DB from usermode malfunctions if we
schedule when delivering SIGTRAP. Rather than trying to define a
special no-recursion region, just allow a single level of recursion.
We do the same thing for NMI, and it hasn't caused any problems yet.
Fixes: 9f58fdde95c9 ("x86/db: Split out dr6/7 handling")
Reported-by: Kyle Huey <me(a)kylehuey.com>
Debugged-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/8b9bd05f187231df008d48cf818a6a311cbd5c98.15978823…
---
arch/x86/kernel/traps.c | 65 ++++++++++++++++++++++--------------------------
1 file changed, 31 insertions(+), 34 deletions(-)
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -729,20 +729,9 @@ static bool is_sysenter_singlestep(struc
#endif
}
-static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
+static __always_inline unsigned long debug_read_clear_dr6(void)
{
- /*
- * Disable breakpoints during exception handling; recursive exceptions
- * are exceedingly 'fun'.
- *
- * Since this function is NOKPROBE, and that also applies to
- * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
- * HW_BREAKPOINT_W on our stack)
- *
- * Entry text is excluded for HW_BP_X and cpu_entry_area, which
- * includes the entry stack is excluded for everything.
- */
- *dr7 = local_db_save();
+ unsigned long dr6;
/*
* The Intel SDM says:
@@ -755,15 +744,12 @@ static __always_inline void debug_enter(
*
* Keep it simple: clear DR6 immediately.
*/
- get_debugreg(*dr6, 6);
+ get_debugreg(dr6, 6);
set_debugreg(0, 6);
/* Filter out all the reserved bits which are preset to 1 */
- *dr6 &= ~DR6_RESERVED;
-}
+ dr6 &= ~DR6_RESERVED;
-static __always_inline void debug_exit(unsigned long dr7)
-{
- local_db_restore(dr7);
+ return dr6;
}
/*
@@ -863,6 +849,18 @@ static void handle_debug(struct pt_regs
static __always_inline void exc_debug_kernel(struct pt_regs *regs,
unsigned long dr6)
{
+ /*
+ * Disable breakpoints during exception handling; recursive exceptions
+ * are exceedingly 'fun'.
+ *
+ * Since this function is NOKPROBE, and that also applies to
+ * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
+ * HW_BREAKPOINT_W on our stack)
+ *
+ * Entry text is excluded for HW_BP_X and cpu_entry_area, which
+ * includes the entry stack is excluded for everything.
+ */
+ unsigned long dr7 = local_db_save();
bool irq_state = idtentry_enter_nmi(regs);
instrumentation_begin();
@@ -883,6 +881,8 @@ static __always_inline void exc_debug_ke
instrumentation_end();
idtentry_exit_nmi(regs, irq_state);
+
+ local_db_restore(dr7);
}
static __always_inline void exc_debug_user(struct pt_regs *regs,
@@ -894,6 +894,15 @@ static __always_inline void exc_debug_us
*/
WARN_ON_ONCE(!user_mode(regs));
+ /*
+ * NB: We can't easily clear DR7 here because
+ * idtentry_exit_to_usermode() can invoke ptrace, schedule, access
+ * user memory, etc. This means that a recursive #DB is possible. If
+ * this happens, that #DB will hit exc_debug_kernel() and clear DR7.
+ * Since we're not on the IST stack right now, everything will be
+ * fine.
+ */
+
irqentry_enter_from_user_mode(regs);
instrumentation_begin();
@@ -907,36 +916,24 @@ static __always_inline void exc_debug_us
/* IST stack entry */
DEFINE_IDTENTRY_DEBUG(exc_debug)
{
- unsigned long dr6, dr7;
-
- debug_enter(&dr6, &dr7);
- exc_debug_kernel(regs, dr6);
- debug_exit(dr7);
+ exc_debug_kernel(regs, debug_read_clear_dr6());
}
/* User entry, runs on regular task stack */
DEFINE_IDTENTRY_DEBUG_USER(exc_debug)
{
- unsigned long dr6, dr7;
-
- debug_enter(&dr6, &dr7);
- exc_debug_user(regs, dr6);
- debug_exit(dr7);
+ exc_debug_user(regs, debug_read_clear_dr6());
}
#else
/* 32 bit does not have separate entry points. */
DEFINE_IDTENTRY_RAW(exc_debug)
{
- unsigned long dr6, dr7;
-
- debug_enter(&dr6, &dr7);
+ unsigned long dr6 = debug_read_clear_dr6();
if (user_mode(regs))
exc_debug_user(regs, dr6);
else
exc_debug_kernel(regs, dr6);
-
- debug_exit(dr7);
}
#endif
From: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Sometimes re-plugging a USB device during system sleep renders the device
useless:
[ 173.418345] xhci_hcd 0000:00:14.0: Get port status 2-4 read: 0x14203e2, return 0x10262
...
[ 176.496485] usb 2-4: Waited 2000ms for CONNECT
[ 176.496781] usb usb2-port4: status 0000.0262 after resume, -19
[ 176.497103] usb 2-4: can't resume, status -19
[ 176.497438] usb usb2-port4: logical disconnect
Because PLS equals XDEV_RESUME, the xHCI driver reports U3 to usbcore,
even though the CAS bit is flagged.
So prioritize CAS over XDEV_RESUME to let usbcore handle warm-reset for
the port.
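Roughly, the intended ordering is captured by this small sketch (the bit
layout and constants are hypothetical, not the xhci-hub.c definitions):

#include <stdint.h>
#include <stdio.h>

#define PORT_CAS        (1u << 24)      /* hypothetical bit position */
#define PLS_XDEV_RESUME 15
#define PLS_U3          3

static unsigned int reported_link_state(uint32_t portsc)
{
    unsigned int pls = portsc & 0xf;    /* hypothetical PLS field */

    if (portsc & PORT_CAS)
        return pls;                     /* let usbcore issue a warm reset */

    /* resume is an xHCI-internal state: pretend to be U3 */
    if (pls == PLS_XDEV_RESUME)
        return PLS_U3;

    return pls;
}

int main(void)
{
    /* CAS set while resuming: no longer masked as U3 */
    printf("%u\n", reported_link_state(PORT_CAS | PLS_XDEV_RESUME));
    return 0;
}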
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
---
drivers/usb/host/xhci-hub.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index c3554e37e09f..4e14e164cb68 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -740,15 +740,6 @@ static void xhci_hub_report_usb3_link_state(struct xhci_hcd *xhci,
{
u32 pls = status_reg & PORT_PLS_MASK;
- /* resume state is a xHCI internal state.
- * Do not report it to usb core, instead, pretend to be U3,
- * thus usb core knows it's not ready for transfer
- */
- if (pls == XDEV_RESUME) {
- *status |= USB_SS_PORT_LS_U3;
- return;
- }
-
/* When the CAS bit is set then warm reset
* should be performed on port
*/
@@ -770,6 +761,16 @@ static void xhci_hub_report_usb3_link_state(struct xhci_hcd *xhci,
*/
pls |= USB_PORT_STAT_CONNECTION;
} else {
+ /*
+ * Resume state is an xHCI internal state. Do not report it to
+ * usb core, instead, pretend to be U3, thus usb core knows
+ * it's not ready for transfer.
+ */
+ if (pls == XDEV_RESUME) {
+ *status |= USB_SS_PORT_LS_U3;
+ return;
+ }
+
/*
* If CAS bit isn't set but the Port is already at
* Compliance Mode, fake a connection so the USB core
--
2.17.1
From: Marcos Paulo de Souza <mpdesouza(a)suse.com>
[BUG]
After commit 9afc66498a0b ("btrfs: block-group: refactor how we read one
block group item"), cache->length is being assigned after calling
btrfs_create_block_group_cache. This causes a problem since
set_free_space_tree_thresholds uses cache->length to calculate the
free-space thresholds that decide if the free-space tree should convert
from extents to bitmaps.
The current code calls set_free_space_tree_thresholds with cache->length
being 0, which then makes cache->bitmap_high_thresh zero. This
implies the system will always use bitmaps instead of extents, which is
not desired if the block group is not fragmented.
This behavior can be seen in a test that expects to repair filesystems
with both FREE_SPACE_EXTENT and FREE_SPACE_BITMAP items, but the current
code only creates FREE_SPACE_BITMAP.
[FIX]
Call set_free_space_tree_thresholds after setting cache->length.
Link: https://github.com/kdave/btrfs-progs/issues/251
Fixes: 9afc66498a0b ("btrfs: block-group: refactor how we read one block group item")
CC: stable(a)vger.kernel.org # 5.8+
Signed-off-by: Marcos Paulo de Souza <mpdesouza(a)suse.com>
---
fs/btrfs/block-group.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 44fdfa2eeb2e..01e8ba1da1d3 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1798,7 +1798,6 @@ static struct btrfs_block_group *btrfs_create_block_group_cache(
cache->fs_info = fs_info;
cache->full_stripe_len = btrfs_full_stripe_len(fs_info, start);
- set_free_space_tree_thresholds(cache);
cache->discard_index = BTRFS_DISCARD_INDEX_UNUSED;
@@ -1908,6 +1907,8 @@ static int read_one_block_group(struct btrfs_fs_info *info,
read_block_group_item(cache, path, key);
+ set_free_space_tree_thresholds(cache);
+
if (need_clear) {
/*
* When we mount with old space cache, we need to
@@ -2128,6 +2129,7 @@ int btrfs_make_block_group(struct btrfs_trans_handle *trans, u64 bytes_used,
return -ENOMEM;
cache->length = size;
+ set_free_space_tree_thresholds(cache);
cache->used = bytes_used;
cache->flags = type;
cache->last_byte_to_unpin = (u64)-1;
--
2.28.0
SWP_FS is used to make swap_{read,write}page() go through
the filesystem, and it's only used for swap files over
NFS. So, !SWP_FS means non-NFS for now; it could be either
file backed or device backed. Something similar goes for the
legacy SWP_FILE.
So in order to achieve the goal of the original patch,
SWP_BLKDEV should be used instead.
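As a tiny illustration (the flag bit values below are made up, not the
kernel's enum), the old and the new check differ exactly for a swapfile
sitting on a local filesystem:

#include <stdio.h>

#define SWP_BLKDEV (1 << 6)     /* illustrative bit positions */
#define SWP_FS     (1 << 7)

int main(void)
{
    unsigned int swapfile_on_xfs = 0;          /* neither flag is set */
    unsigned int swap_partition  = SWP_BLKDEV;

    printf("old check, swapfile:  %d\n", !(swapfile_on_xfs & SWP_FS));      /* 1: wrongly allows clusters */
    printf("new check, swapfile:  %d\n", !!(swapfile_on_xfs & SWP_BLKDEV)); /* 0 */
    printf("new check, partition: %d\n", !!(swap_partition & SWP_BLKDEV));  /* 1 */
    return 0;
}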
FS corruption can be observed with SSD device + XFS +
fragmented swapfile due to CONFIG_THP_SWAP=y.
I reproduced the issue with the following details:
Environment:
QEMU + upstream kernel + buildroot + NVMe (2 GB)
Kernel config:
CONFIG_BLK_DEV_NVME=y
CONFIG_THP_SWAP=y
Some reproducible steps:
mkfs.xfs -f /dev/nvme0n1
mkdir /tmp/mnt
mount /dev/nvme0n1 /tmp/mnt
bs="32k"
sz="1024m" # doesn't matter too much, I also tried 16m
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
mkswap /tmp/mnt/sw
swapon /tmp/mnt/sw
stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
Symptoms:
- FS corruption (e.g. checksum failure)
- memory corruption at: 0xd2808010
- segfault
Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: Rafael Aquini <aquini(a)redhat.com>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
---
v1: https://lore.kernel.org/r/20200819195613.24269-1-hsiangkao@redhat.com
changes since v1:
- improve commit message description
Hi Andrew,
Kindly consider this one instead if no other concerns...
Thanks,
Gao Xiang
mm/swapfile.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6c26916e95fd..2937daf3ca02 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1074,7 +1074,7 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_size)
goto nextsi;
}
if (size == SWAPFILE_CLUSTER) {
- if (!(si->flags & SWP_FS))
+ if (si->flags & SWP_BLKDEV)
n_ret = swap_alloc_cluster(si, swp_entries);
} else
n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
--
2.18.1
From: Charan Teja Reddy <charante(a)codeaurora.org>
Subject: mm, page_alloc: fix core hung in free_pcppages_bulk()
The following race is observed with the repeated online, offline and a
delay between two successive online of memory blocks of movable zone.
P1                                       P2

Online the first memory block in
the movable zone. The pcp struct
values are initialized to default
values, i.e., pcp->high = 0 &
pcp->batch = 1.

                                         Allocate the pages from the
                                         movable zone.

Try to online the second memory
block in the movable zone, thus it
entered online_pages() but has yet
to call zone_pcp_update().

                                         This process entered the exit
                                         path, thus it tries to release
                                         the order-0 pages to pcp lists
                                         through free_unref_page_commit().

                                         As pcp->high = 0, pcp->count = 1,
                                         proceed to call the function
                                         free_pcppages_bulk().

Update the pcp values, thus the
new pcp values are like, say,
pcp->high = 378, pcp->batch = 63.

                                         Read the pcp's batch value using
                                         READ_ONCE() and pass the same to
                                         free_pcppages_bulk(); the pcp values
                                         passed here are batch = 63,
                                         count = 1.

                                         Since the number of pages in the
                                         pcp lists is less than ->batch,
                                         it will get stuck in the
                                         while (list_empty(list)) loop
                                         with interrupts disabled, thus
                                         a core hung.
Avoid this by ensuring free_pcppages_bulk() is called with proper count of
pcp list pages.
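A minimal userspace model of the fix (the struct and the numbers are
illustrative, not mm/page_alloc.c internals):

#include <stdio.h>

struct pcp { int count; };

static int pages_to_free(struct pcp *pcp, int batch)
{
    /* without this clamp, batch (63) > pcp->count (1) makes the drain
     * loop keep scanning empty lists with interrupts disabled */
    return batch < pcp->count ? batch : pcp->count;
}

int main(void)
{
    struct pcp pcp = { .count = 1 };

    printf("will free %d page(s)\n", pages_to_free(&pcp, 63));
    return 0;
}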
The mentioned race is somewhat easily reproducible without [1] because
pcp's are not updated for the first memory block online, and thus there
is a wide enough race window for P2 between alloc+free and the pcp
struct values update through onlining of the second memory block.
With [1], the race still exists but it is very narrow as we update the pcp
struct values for the first memory block online itself.
This is not limited to the movable zone, it could also happen in cases
with the normal zone (e.g., hotplug to a node that only has DMA memory, or
no other memory yet).
[1]: https://patchwork.kernel.org/patch/11696389/
Link: http://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaur…
Fixes: 5f8dcc21211a ("page-allocator: split per-cpu list into one-list-per-migrate-type")
Signed-off-by: Charan Teja Reddy <charante(a)codeaurora.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Vinayak Menon <vinmenon(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org> [2.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/mm/page_alloc.c~mm-page_alloc-fix-core-hung-in-free_pcppages_bulk
+++ a/mm/page_alloc.c
@@ -1302,6 +1302,11 @@ static void free_pcppages_bulk(struct zo
struct page *page, *tmp;
LIST_HEAD(head);
+ /*
+ * Ensure proper count is passed which otherwise would stuck in the
+ * below while (list_empty(list)) loop.
+ */
+ count = min(pcp->count, count);
while (count) {
struct list_head *list;
_
From: Doug Berger <opendmb(a)gmail.com>
Subject: mm: include CMA pages in lowmem_reserve at boot
The lowmem_reserve arrays provide a means of applying pressure against
allocations from lower zones that were targeted at higher zones. Its
values are a function of the number of pages managed by higher zones and
are assigned by a call to the setup_per_zone_lowmem_reserve() function.
The function is initially called at boot time by the function
init_per_zone_wmark_min() and may be called later by accesses of the
/proc/sys/vm/lowmem_reserve_ratio sysctl file.
The function init_per_zone_wmark_min() was moved up from a module_init to
a core_initcall to resolve a sequencing issue with khugepaged.
Unfortunately this created a sequencing issue with CMA page accounting.
The CMA pages are added to the managed page count of a zone when
cma_init_reserved_areas() is called at boot also as a core_initcall. This
makes it uncertain whether the CMA pages will be added to the managed page
counts of their zones before or after the call to
init_per_zone_wmark_min() as it becomes dependent on link order. With the
current link order the pages are added to the managed count after the
lowmem_reserve arrays are initialized at boot.
This means the lowmem_reserve values at boot may be lower than the values
used later if /proc/sys/vm/lowmem_reserve_ratio is accessed even if the
ratio values are unchanged.
In many cases the difference is not significant, but for example
an ARM platform with 1GB of memory and the following memory layout
[ 0.000000] cma: Reserved 256 MiB at 0x0000000030000000
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000000000000-0x000000002fffffff]
[ 0.000000] Normal empty
[ 0.000000] HighMem [mem 0x0000000030000000-0x000000003fffffff]
would result in 0 lowmem_reserve for the DMA zone. This would allow
userspace to deplete the DMA zone easily. Funnily enough
$ cat /proc/sys/vm/lowmem_reserve_ratio
would fix up the situation because it forces setup_per_zone_lowmem_reserve
as a side effect.
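To put numbers on it, a rough sketch of the arithmetic (the ratio below is
an assumed default lowmem_reserve_ratio entry, not read from this system):

#include <stdio.h>

int main(void)
{
    unsigned long ratio = 256;                          /* assumed DMA ratio */
    unsigned long highmem_before = 0;                   /* CMA not yet added */
    unsigned long highmem_after = (256UL << 20) / 4096; /* 256 MiB in 4K pages */

    /* lowmem_reserve is roughly higher-zone managed pages / ratio */
    printf("DMA lowmem_reserve at boot:          %lu pages\n",
           highmem_before / ratio);
    printf("DMA lowmem_reserve after CMA update: %lu pages\n",
           highmem_after / ratio);
    return 0;
}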
This commit breaks the link order dependency by invoking
init_per_zone_wmark_min() as a postcore_initcall so that the CMA pages
have the chance to be properly accounted in their zone(s) and allowing the
lowmem_reserve arrays to receive consistent values.
Link: http://lkml.kernel.org/r/1597423766-27849-1-git-send-email-opendmb@gmail.com
Fixes: bc22af74f271 ("mm: update min_free_kbytes from khugepaged after core initialization")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Jason Baron <jbaron(a)akamai.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-include-cma-pages-in-lowmem_reserve-at-boot
+++ a/mm/page_alloc.c
@@ -7888,7 +7888,7 @@ int __meminit init_per_zone_wmark_min(vo
return 0;
}
-core_initcall(init_per_zone_wmark_min)
+postcore_initcall(init_per_zone_wmark_min)
/*
* min_free_kbytes_sysctl_handler - just a wrapper around proc_dointvec() so
_
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: avoid bio_alloc() failure with 1Mbyte blocks
This is a regression introduced by the patch "migrate from ll_rw_block
usage to BIO".
Bio_alloc() is limited to 256 pages (1 Mbyte). This can cause a failure
when reading 1 Mbyte block filesystems. The problem is a datablock can be
fully (or almost fully) uncompressed, requiring 256 pages, but, because
blocks are not aligned to page boundaries, it may require 257 pages to read.
Bio_kmalloc() can handle 1024 pages, and so use this for the edge
condition.
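The arithmetic behind the edge case, as a quick sketch (the offset is
chosen arbitrarily for illustration):

#include <stdio.h>

int main(void)
{
    unsigned int page_size  = 4096;
    unsigned int block_size = 1024 * 1024;
    unsigned int offset     = 512;    /* block starts mid-page */
    unsigned int first_page = offset / page_size;
    unsigned int last_page  = (offset + block_size - 1) / page_size;

    /* prints 257: one more than the 256-page bio_alloc() limit */
    printf("pages needed: %u\n", last_page - first_page + 1);
    return 0;
}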
Link: http://lkml.kernel.org/r/20200815035637.15319-1-phillip@squashfs.org.uk
Fixes: 93e72b3c612a ("squashfs: migrate from ll_rw_block usage to BIO")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: Nicolas Prochazka <nicolas.prochazka(a)gmail.com>
Reported-by: Tomoatsu Shimada <shimada(a)walbrix.com>
Reviewed-by: Guenter Roeck <groeck(a)chromium.org>
Cc: Philippe Liard <pliard(a)google.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Adrien Schildknecht <adrien+dev(a)schischi.me>
Cc: Daniel Rosenberg <drosen(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/block.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/fs/squashfs/block.c~squashfs-avoid-bio_alloc-failure-with-1mbyte-blocks
+++ a/fs/squashfs/block.c
@@ -87,7 +87,11 @@ static int squashfs_bio_read(struct supe
int error, i;
struct bio *bio;
- bio = bio_alloc(GFP_NOIO, page_count);
+ if (page_count <= BIO_MAX_PAGES)
+ bio = bio_alloc(GFP_NOIO, page_count);
+ else
+ bio = bio_kmalloc(GFP_NOIO, page_count);
+
if (!bio)
return -ENOMEM;
_
From: Hugh Dickins <hughd(a)google.com>
Subject: uprobes: __replace_page() avoid BUG in munlock_vma_page()
syzbot crashed on the VM_BUG_ON_PAGE(PageTail) in munlock_vma_page(), when
called from uprobes __replace_page(). Which of many ways to fix it?
Settled on not calling when PageCompound (since Head and Tail are equals
in this context, PageCompound the usual check in uprobes.c, and the prior
use of FOLL_SPLIT_PMD will have cleared PageMlocked already).
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008161338360.20413@eggly.anvils
Fixes: 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Acked-by: Song Liu <songliubraving(a)fb.com>
Acked-by: Oleg Nesterov <oleg(a)redhat.com>
Reviewed-by: Srikar Dronamraju <srikar(a)linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/events/uprobes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/events/uprobes.c~uprobes-__replace_page-avoid-bug-in-munlock_vma_page
+++ a/kernel/events/uprobes.c
@@ -205,7 +205,7 @@ static int __replace_page(struct vm_area
try_to_free_swap(old_page);
page_vma_mapped_walk_done(&pvmw);
- if (vma->vm_flags & VM_LOCKED)
+ if ((vma->vm_flags & VM_LOCKED) && !PageCompound(old_page))
munlock_vma_page(old_page);
put_page(old_page);
_
From: Jann Horn <jannh(a)google.com>
Subject: romfs: fix uninitialized memory leak in romfs_dev_read()
romfs has a superblock field that limits the size of the filesystem; data
beyond that limit is never accessed.
romfs_dev_read() fetches a caller-supplied number of bytes from the
backing device. It returns 0 on success or an error code on failure;
therefore, its API can't represent short reads, it's all-or-nothing.
However, when romfs_dev_read() detects that the requested operation would
cross the filesystem size limit, it currently silently truncates the
requested number of bytes. This e.g. means that when the content of a
file with size 0x1000 starts one byte before the filesystem size limit,
->readpage() will only fill a single byte of the supplied page while
leaving the rest uninitialized, leaking that uninitialized memory to
userspace.
Fix it by returning an error code instead of truncating the read when the
requested read operation would go beyond the end of the filesystem.
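A small sketch of the corrected bounds check (standalone, not the romfs
helper itself); note the test is written so that "limit - pos" cannot
underflow:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

static bool read_ok(unsigned long pos, size_t buflen, unsigned long limit)
{
    if (pos >= limit || buflen > limit - pos)
        return false;   /* would have been silently truncated before */
    return true;
}

int main(void)
{
    unsigned long limit = 0x1000;

    printf("%d\n", read_ok(0x0fff, 0x1000, limit)); /* 0: crosses the limit */
    printf("%d\n", read_ok(0x0800, 0x0100, limit)); /* 1: fully inside */
    return 0;
}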
Link: http://lkml.kernel.org/r/20200818013202.2246365-1-jannh@google.com
Fixes: da4458bda237 ("NOMMU: Make it possible for RomFS to use MTD devices directly")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: David Howells <dhowells(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/romfs/storage.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/romfs/storage.c~romfs-fix-uninitialized-memory-leak-in-romfs_dev_read
+++ a/fs/romfs/storage.c
@@ -217,10 +217,8 @@ int romfs_dev_read(struct super_block *s
size_t limit;
limit = romfs_maxsize(sb);
- if (pos >= limit)
+ if (pos >= limit || buflen > limit - pos)
return -EIO;
- if (buflen > limit - pos)
- buflen = limit - pos;
#ifdef CONFIG_ROMFS_ON_MTD
if (sb->s_mtd)
_
From: Hugh Dickins <hughd(a)google.com>
Subject: khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
syzbot crashes on the VM_BUG_ON_MM(khugepaged_test_exit(mm), mm) in
__khugepaged_enter(): yes, when one thread is about to dump core, has set
core_state, and is waiting for others, another might do something calling
__khugepaged_enter(), which now crashes because I lumped the core_state
test (known as "mmget_still_valid") into khugepaged_test_exit(). I still
think it's best to lump them together, so just in this exceptional case,
check mm->mm_users directly instead of khugepaged_test_exit().
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008141503370.18085@eggly.anvils
Fixes: bbe98f9cadff ("khugepaged: khugepaged_test_exit() check mmget_still_valid()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Acked-by: Yang Shi <shy828301(a)gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/khugepaged.c~khugepaged-adjust-vm_bug_on_mm-in-__khugepaged_enter
+++ a/mm/khugepaged.c
@@ -466,7 +466,7 @@ int __khugepaged_enter(struct mm_struct
return -ENOMEM;
/* __khugepaged_exit() must not run from under us */
- VM_BUG_ON_MM(khugepaged_test_exit(mm), mm);
+ VM_BUG_ON_MM(atomic_read(&mm->mm_users) == 0, mm);
if (unlikely(test_and_set_bit(MMF_VM_HUGEPAGE, &mm->flags))) {
free_mm_slot(mm_slot);
return 0;
_
The patch titled
Subject: mm, THP, swap: fix allocating cluster for swapfile by mistake
has been added to the -mm tree. Its filename is
mm-thp-swap-fix-allocating-cluster-for-swapfile-by-mistake.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-thp-swap-fix-allocating-cluste…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-thp-swap-fix-allocating-cluste…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Gao Xiang <hsiangkao(a)redhat.com>
Subject: mm, THP, swap: fix allocating cluster for swapfile by mistake
SWP_FS is used to make swap_{read,write}page() go through the filesystem,
and it's only used for swap files over NFS. So, !SWP_FS means non-NFS for
now; it could be either file backed or device backed. Something similar
goes for the legacy SWP_FILE.
So in order to achieve the goal of the original patch, SWP_BLKDEV should
be used instead.
FS corruption can be observed with SSD device + XFS + fragmented swapfile
due to CONFIG_THP_SWAP=y.
I reproduced the issue with the following details:
Environment:
QEMU + upstream kernel + buildroot + NVMe (2 GB)
Kernel config:
CONFIG_BLK_DEV_NVME=y
CONFIG_THP_SWAP=y
Some reproducible steps:
mkfs.xfs -f /dev/nvme0n1
mkdir /tmp/mnt
mount /dev/nvme0n1 /tmp/mnt
bs="32k"
sz="1024m" # doesn't matter too much, I also tried 16m
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
mkswap /tmp/mnt/sw
swapon /tmp/mnt/sw
stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
Symptoms:
- FS corruption (e.g. checksum failure)
- memory corruption at: 0xd2808010
- segfault
Link: https://lkml.kernel.org/r/20200820045323.7809-1-hsiangkao@redhat.com
Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Acked-by: Rafael Aquini <aquini(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Carlos Maiolino <cmaiolino(a)redhat.com>
Cc: Eric Sandeen <esandeen(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/swapfile.c~mm-thp-swap-fix-allocating-cluster-for-swapfile-by-mistake
+++ a/mm/swapfile.c
@@ -1078,7 +1078,7 @@ start_over:
goto nextsi;
}
if (size == SWAPFILE_CLUSTER) {
- if (!(si->flags & SWP_FS))
+ if (si->flags & SWP_BLKDEV)
n_ret = swap_alloc_cluster(si, swp_entries);
} else
n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
_
Patches currently in -mm which might be from hsiangkao(a)redhat.com are
mm-thp-swap-fix-allocating-cluster-for-swapfile-by-mistake.patch
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 75802ca66354a39ab8e35822747cd08b3384a99a Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx(a)redhat.com>
Date: Thu, 6 Aug 2020 23:26:11 -0700
Subject: [PATCH] mm/hugetlb: fix calculation of
adjust_range_if_pmd_sharing_possible
This is found by code observation only.
Firstly, the worst case scenario should assume the whole range was covered
by pmd sharing. The old algorithm might not work as expected for ranges
like (1g-2m, 1g+2m), where the old code adjusts the range to (0, 1g+2m)
while the expected range is (0, 2g).
While at it, remove the loop since it should not be required. With that,
the new code should be faster too when the invalidating range is huge.
Mike said:
: With range (1g-2m, 1g+2m) within a vma (0, 2g) the existing code will only
: adjust to (0, 1g+2m) which is incorrect.
:
: We should cc stable. The original reason for adjusting the range was to
: prevent data corruption (getting wrong page). Since the range is not
: always adjusted correctly, the potential for corruption still exists.
:
: However, I am fairly confident that adjust_range_if_pmd_sharing_possible
: is only going to be called in two cases:
:
: 1) for a single page
: 2) for range == entire vma
:
: In those cases, the current code should produce the correct results.
:
: To be safe, let's just cc stable.
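For concreteness, a small userspace sketch of the new adjustment applied to
the example range (PUD_SIZE assumed to be 1 GiB here):

#include <stdio.h>

#define PUD_SIZE         (1UL << 30)
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))
#define ALIGN(x, a)      (((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
    unsigned long start = (1UL << 30) - (2UL << 20); /* 1g - 2m */
    unsigned long end   = (1UL << 30) + (2UL << 20); /* 1g + 2m */

    /* prints (0, 0x80000000), i.e. (0, 2g), before clamping to the vma */
    printf("adjusted range: (%#lx, %#lx)\n",
           ALIGN_DOWN(start, PUD_SIZE), ALIGN(end, PUD_SIZE));
    return 0;
}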
Fixes: 017b1660df89 ("mm: migration: fix migration of huge PMD shared pages")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Link: http://lkml.kernel.org/r/20200730201636.74778-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 27556d4d49fe..e52c878940bb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5314,25 +5314,21 @@ static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
unsigned long *start, unsigned long *end)
{
- unsigned long check_addr;
+ unsigned long a_start, a_end;
if (!(vma->vm_flags & VM_MAYSHARE))
return;
- for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) {
- unsigned long a_start = check_addr & PUD_MASK;
- unsigned long a_end = a_start + PUD_SIZE;
+ /* Extend the range to be PUD aligned for a worst case scenario */
+ a_start = ALIGN_DOWN(*start, PUD_SIZE);
+ a_end = ALIGN(*end, PUD_SIZE);
- /*
- * If sharing is possible, adjust start/end if necessary.
- */
- if (range_in_vma(vma, a_start, a_end)) {
- if (a_start < *start)
- *start = a_start;
- if (a_end > *end)
- *end = a_end;
- }
- }
+ /*
+ * Intersect the range with the vma range, since pmd sharing won't be
+ * across vma after all
+ */
+ *start = max(vma->vm_start, a_start);
+ *end = min(vma->vm_end, a_end);
}
/*
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 75802ca66354a39ab8e35822747cd08b3384a99a Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx(a)redhat.com>
Date: Thu, 6 Aug 2020 23:26:11 -0700
Subject: [PATCH] mm/hugetlb: fix calculation of
adjust_range_if_pmd_sharing_possible
This is found by code observation only.
Firstly, the worst case scenario should assume the whole range was covered
by pmd sharing. The old algorithm might not work as expected for ranges
like (1g-2m, 1g+2m), where the old code adjusts the range to (0, 1g+2m)
while the expected range is (0, 2g).
While at it, remove the loop since it should not be required. With that,
the new code should be faster too when the invalidating range is huge.
Mike said:
: With range (1g-2m, 1g+2m) within a vma (0, 2g) the existing code will only
: adjust to (0, 1g+2m) which is incorrect.
:
: We should cc stable. The original reason for adjusting the range was to
: prevent data corruption (getting wrong page). Since the range is not
: always adjusted correctly, the potential for corruption still exists.
:
: However, I am fairly confident that adjust_range_if_pmd_sharing_possible
: is only going to be called in two cases:
:
: 1) for a single page
: 2) for range == entire vma
:
: In those cases, the current code should produce the correct results.
:
: To be safe, let's just cc stable.
Fixes: 017b1660df89 ("mm: migration: fix migration of huge PMD shared pages")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Link: http://lkml.kernel.org/r/20200730201636.74778-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 27556d4d49fe..e52c878940bb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5314,25 +5314,21 @@ static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
unsigned long *start, unsigned long *end)
{
- unsigned long check_addr;
+ unsigned long a_start, a_end;
if (!(vma->vm_flags & VM_MAYSHARE))
return;
- for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) {
- unsigned long a_start = check_addr & PUD_MASK;
- unsigned long a_end = a_start + PUD_SIZE;
+ /* Extend the range to be PUD aligned for a worst case scenario */
+ a_start = ALIGN_DOWN(*start, PUD_SIZE);
+ a_end = ALIGN(*end, PUD_SIZE);
- /*
- * If sharing is possible, adjust start/end if necessary.
- */
- if (range_in_vma(vma, a_start, a_end)) {
- if (a_start < *start)
- *start = a_start;
- if (a_end > *end)
- *end = a_end;
- }
- }
+ /*
+ * Intersect the range with the vma range, since pmd sharing won't be
+ * across vma after all
+ */
+ *start = max(vma->vm_start, a_start);
+ *end = min(vma->vm_end, a_end);
}
/*
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 75802ca66354a39ab8e35822747cd08b3384a99a Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx(a)redhat.com>
Date: Thu, 6 Aug 2020 23:26:11 -0700
Subject: [PATCH] mm/hugetlb: fix calculation of
adjust_range_if_pmd_sharing_possible
This is found by code observation only.
Firstly, the worst case scenario should assume the whole range was covered
by pmd sharing. The old algorithm might not work as expected for ranges
like (1g-2m, 1g+2m), where the old code adjusts the range to (0, 1g+2m)
while the expected range is (0, 2g).
While at it, remove the loop since it should not be required. With that,
the new code should be faster too when the invalidating range is huge.
Mike said:
: With range (1g-2m, 1g+2m) within a vma (0, 2g) the existing code will only
: adjust to (0, 1g+2m) which is incorrect.
:
: We should cc stable. The original reason for adjusting the range was to
: prevent data corruption (getting wrong page). Since the range is not
: always adjusted correctly, the potential for corruption still exists.
:
: However, I am fairly confident that adjust_range_if_pmd_sharing_possible
: is only going to be called in two cases:
:
: 1) for a single page
: 2) for range == entire vma
:
: In those cases, the current code should produce the correct results.
:
: To be safe, let's just cc stable.
Fixes: 017b1660df89 ("mm: migration: fix migration of huge PMD shared pages")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Link: http://lkml.kernel.org/r/20200730201636.74778-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 27556d4d49fe..e52c878940bb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5314,25 +5314,21 @@ static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
unsigned long *start, unsigned long *end)
{
- unsigned long check_addr;
+ unsigned long a_start, a_end;
if (!(vma->vm_flags & VM_MAYSHARE))
return;
- for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) {
- unsigned long a_start = check_addr & PUD_MASK;
- unsigned long a_end = a_start + PUD_SIZE;
+ /* Extend the range to be PUD aligned for a worst case scenario */
+ a_start = ALIGN_DOWN(*start, PUD_SIZE);
+ a_end = ALIGN(*end, PUD_SIZE);
- /*
- * If sharing is possible, adjust start/end if necessary.
- */
- if (range_in_vma(vma, a_start, a_end)) {
- if (a_start < *start)
- *start = a_start;
- if (a_end > *end)
- *end = a_end;
- }
- }
+ /*
+ * Intersect the range with the vma range, since pmd sharing won't be
+ * across vma after all
+ */
+ *start = max(vma->vm_start, a_start);
+ *end = min(vma->vm_end, a_end);
}
/*
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 75802ca66354a39ab8e35822747cd08b3384a99a Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx(a)redhat.com>
Date: Thu, 6 Aug 2020 23:26:11 -0700
Subject: [PATCH] mm/hugetlb: fix calculation of
adjust_range_if_pmd_sharing_possible
This is found by code observation only.
Firstly, the worst case scenario should assume the whole range was covered
by pmd sharing. The old algorithm might not work as expected for ranges
like (1g-2m, 1g+2m), where the old code adjusts the range to (0, 1g+2m)
while the expected range is (0, 2g).
While at it, remove the loop since it should not be required. With that,
the new code should be faster too when the invalidating range is huge.
Mike said:
: With range (1g-2m, 1g+2m) within a vma (0, 2g) the existing code will only
: adjust to (0, 1g+2m) which is incorrect.
:
: We should cc stable. The original reason for adjusting the range was to
: prevent data corruption (getting wrong page). Since the range is not
: always adjusted correctly, the potential for corruption still exists.
:
: However, I am fairly confident that adjust_range_if_pmd_sharing_possible
: is only going to be called in two cases:
:
: 1) for a single page
: 2) for range == entire vma
:
: In those cases, the current code should produce the correct results.
:
: To be safe, let's just cc stable.
Fixes: 017b1660df89 ("mm: migration: fix migration of huge PMD shared pages")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Link: http://lkml.kernel.org/r/20200730201636.74778-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 27556d4d49fe..e52c878940bb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5314,25 +5314,21 @@ static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
unsigned long *start, unsigned long *end)
{
- unsigned long check_addr;
+ unsigned long a_start, a_end;
if (!(vma->vm_flags & VM_MAYSHARE))
return;
- for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) {
- unsigned long a_start = check_addr & PUD_MASK;
- unsigned long a_end = a_start + PUD_SIZE;
+ /* Extend the range to be PUD aligned for a worst case scenario */
+ a_start = ALIGN_DOWN(*start, PUD_SIZE);
+ a_end = ALIGN(*end, PUD_SIZE);
- /*
- * If sharing is possible, adjust start/end if necessary.
- */
- if (range_in_vma(vma, a_start, a_end)) {
- if (a_start < *start)
- *start = a_start;
- if (a_end > *end)
- *end = a_end;
- }
- }
+ /*
+ * Intersect the range with the vma range, since pmd sharing won't be
+ * across vma after all
+ */
+ *start = max(vma->vm_start, a_start);
+ *end = min(vma->vm_end, a_end);
}
/*
When offlining CPUs, fixup_irqs() migrates all interrupts away from the
outgoing CPU to an online CPU. It's always possible the device sent an
interrupt to the previous CPU destination. The pending interrupt bit in the
IRR in the local APIC identifies such interrupts. apic_soft_disable() will
not capture any new interrupts in the IRR. This causes interrupts from the
device to be lost during CPU offline. The issue was found when explicitly
setting MSI affinity to a CPU and immediately offlining it. It was simple
to recreate with a USB ethernet device and doing I/O to it while the CPU is
offlined. Lost interrupts happen even when Interrupt Remapping is enabled.
Current code does apic_soft_disable() before migrating interrupts.
native_cpu_disable()
{
...
apic_soft_disable();
cpu_disable_common();
--> fixup_irqs(); // Too late to capture anything in IRR.
}
Just flipping the above call sequence seems to hit the IRR checks
and the lost interrupt is fixed for both legacy MSI and when
interrupt remapping is enabled.
Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")
Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@nanos.tec.linutronix.de/
Signed-off-by: Ashok Raj <ashok.raj(a)intel.com>
To: linux-kernel(a)vger.kernel.org
To: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Sukumar Ghorai <sukumar.ghorai(a)intel.com>
Cc: Srikanth Nandamuri <srikanth.nandamuri(a)intel.com>
Cc: Evan Green <evgreen(a)chromium.org>
Cc: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/kernel/smpboot.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ffbd9a3d78d8..278cc9f92f2f 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1603,13 +1603,20 @@ int native_cpu_disable(void)
if (ret)
return ret;
+ cpu_disable_common();
/*
* Disable the local APIC. Otherwise IPI broadcasts will reach
* it. It still responds normally to INIT, NMI, SMI, and SIPI
- * messages.
+ * messages. Its important to do apic_soft_disable() after
+ * fixup_irqs(), because fixup_irqs() called from cpu_disable_common()
+ * depends on IRR being set. After apic_soft_disable() CPU preserves
+ * currently set IRR/ISR but new interrupts will not set IRR.
+ * This causes interrupts sent to outgoing cpu before completion
+ * of irq migration to be lost. Check SDM Vol 3 "10.4.7.2 Local
+ * APIC State after It Has been Software Disabled" section for more
+ * details.
*/
apic_soft_disable();
- cpu_disable_common();
return 0;
}
--
2.13.6
As per PAPR we have to look at both the EPOW sensor value and the event
modifier to identify the type of event and take appropriate action.
Sensor value = 3 (EPOW_SYSTEM_SHUTDOWN) schedules the system to be shut
down after an OS-defined delay (default 10 mins).
EPOW event modifier for sensor value = 3:
We have to initiate immediate shutdown for most of the event modifiers
except value = 2 (system running on UPS).
Checking the firmware documentation, it's clear that we have to wait for a
predefined time before initiating shutdown. If power is restored within
that time we should cancel the shutdown process. I think commit 79872e35
accidentally enabled immediate poweroff for the EPOW_SHUTDOWN_ON_UPS event.
We have a user space tool (rtas_errd) on the LPAR to monitor for
EPOW_SHUTDOWN_ON_UPS. Once it gets the event it initiates shutdown after a
predefined time. It also starts monitoring for any new EPOW events. If it
receives a "Power restored" event before the predefined time it will cancel
the shutdown. Otherwise, after the predefined time it will shut down the
system.
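Condensed into a sketch, the intended behaviour looks like this (the enum
names are abbreviated stand-ins, not the pseries definitions):

#include <stdio.h>

enum { ON_UPS, LOSS_OF_CRITICAL_FUNCTIONS, AMBIENT_TEMPERATURE_TOO_HIGH };

static void handle_system_shutdown(int event_modifier)
{
    switch (event_modifier) {
    case ON_UPS:
        /* defer to the rtas_errd timer; no immediate poweroff */
        printf("running on UPS/battery, waiting for rtas_errd\n");
        break;
    default:
        printf("initiating orderly poweroff\n");
        break;
    }
}

int main(void)
{
    handle_system_shutdown(ON_UPS);
    handle_system_shutdown(LOSS_OF_CRITICAL_FUNCTIONS);
    return 0;
}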
Fixes: 79872e35 (powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown)
Cc: stable(a)vger.kernel.org # v4.0+
Cc: Tyrel Datwyler <tyreld(a)linux.ibm.com>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Vasant Hegde <hegdevasant(a)linux.vnet.ibm.com>
---
Changes in v2:
- Updated patch description based on mpe, Tyrel comment.
-Vasant
arch/powerpc/platforms/pseries/ras.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index f3736fcd98fc..13c86a292c6d 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -184,7 +184,6 @@ static void handle_system_shutdown(char event_modifier)
case EPOW_SHUTDOWN_ON_UPS:
pr_emerg("Loss of system power detected. System is running on"
" UPS/battery. Check RTAS error log for details\n");
- orderly_poweroff(true);
break;
case EPOW_SHUTDOWN_LOSS_OF_CRITICAL_FUNCTIONS:
--
2.26.2
From: WANG Cong <xiyou.wangcong(a)gmail.com>
[ Upstream commit 199ab00f3cdb6f154ea93fa76fd80192861a821d ]
Andrey reported an out-of-bounds access in ip6_tnl_xmit(); this
is because we use an IPv4 dst in ip6_tnl_xmit() and cast an IPv4
neigh key as an IPv6 address:
neigh = dst_neigh_lookup(skb_dst(skb),
&ipv6_hdr(skb)->daddr);
if (!neigh)
goto tx_err_link_failure;
addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE
addr_type = ipv6_addr_type(addr6);
if (addr_type == IPV6_ADDR_ANY)
addr6 = &ipv6_hdr(skb)->daddr;
memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
Also, the network header of the skb at this point should still be IPv4
for 4in6 tunnels; we should not just use it as an IPv6 header.
This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it
is, we are safe to do the nexthop lookup using skb_dst() and
ipv6_hdr(skb)->daddr; if not (aka IPv4), we have no clue about which
dest address we can pick here, so we have to rely on callers to fill it
in from the tunnel config and just fall back to ip6_route_output() to
make the decision.
Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
Reported-by: Andrey Konovalov <andreyknvl(a)google.com>
Reported-by: syzbot+01400f5fc51cf4747bec(a)syzkaller.appspotmail.com
Tested-by: Andrey Konovalov <andreyknvl(a)google.com>
Cc: Steffen Klassert <steffen.klassert(a)secunet.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Alessio Balsini <balsini(a)android.com>
---
net/ipv6/ip6_tunnel.c | 32 +++++++++++++++++---------------
1 file changed, 17 insertions(+), 15 deletions(-)
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index f072a4c4575c..96563990d654 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -972,26 +972,28 @@ static int ip6_tnl_xmit2(struct sk_buff *skb,
/* NBMA tunnel */
if (ipv6_addr_any(&t->parms.raddr)) {
- struct in6_addr *addr6;
- struct neighbour *neigh;
- int addr_type;
+ if (skb->protocol == htons(ETH_P_IPV6)) {
+ struct in6_addr *addr6;
+ struct neighbour *neigh;
+ int addr_type;
- if (!skb_dst(skb))
- goto tx_err_link_failure;
+ if (!skb_dst(skb))
+ goto tx_err_link_failure;
- neigh = dst_neigh_lookup(skb_dst(skb),
- &ipv6_hdr(skb)->daddr);
- if (!neigh)
- goto tx_err_link_failure;
+ neigh = dst_neigh_lookup(skb_dst(skb),
+ &ipv6_hdr(skb)->daddr);
+ if (!neigh)
+ goto tx_err_link_failure;
- addr6 = (struct in6_addr *)&neigh->primary_key;
- addr_type = ipv6_addr_type(addr6);
+ addr6 = (struct in6_addr *)&neigh->primary_key;
+ addr_type = ipv6_addr_type(addr6);
- if (addr_type == IPV6_ADDR_ANY)
- addr6 = &ipv6_hdr(skb)->daddr;
+ if (addr_type == IPV6_ADDR_ANY)
+ addr6 = &ipv6_hdr(skb)->daddr;
- memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
- neigh_release(neigh);
+ memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
+ neigh_release(neigh);
+ }
} else if (!fl6->flowi6_mark)
dst = dst_cache_get(&t->dst_cache);
--
2.28.0.297.g1956fa8f8d-goog
Let's not try and use PAT attributes for I915_MAP_WC if the CPU doesn't
support PAT.
Fixes: 6056e50033d9 ("drm/i915/gem: Support discontiguous lmem object maps")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: <stable(a)vger.kernel.org> # v5.6+
---
drivers/gpu/drm/i915/gem/i915_gem_pages.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 7050519c87a4..5f1725268988 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -254,6 +254,9 @@ static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
if (!i915_gem_object_has_struct_page(obj) && type != I915_MAP_WC)
return NULL;
+ if (type == I915_MAP_WC && !boot_cpu_has(X86_FEATURE_PAT))
+ return NULL;
+
/* A single page can always be kmapped */
if (n_pte == 1 && type == I915_MAP_WB)
return kmap(sg_page(sgt->sgl));
--
2.20.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a34a0a632dd991a371fec56431d73279f9c54029 Mon Sep 17 00:00:00 2001
From: Xin Xiong <xiongx18(a)fudan.edu.cn>
Date: Sun, 19 Jul 2020 23:45:45 +0800
Subject: [PATCH] drm: fix drm_dp_mst_port refcount leaks in
drm_dp_mst_allocate_vcpi
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
drm_dp_mst_allocate_vcpi() invokes
drm_dp_mst_topology_get_port_validated(), which increases the refcount
of the "port".
These reference counting issues take place in two exception handling
paths separately. Either when “slots” is less than 0 or when
drm_dp_init_vcpi() returns a negative value, the function forgets to
reduce the refcount increased by drm_dp_mst_topology_get_port_validated(),
which results in a refcount leak.
Fix these issues by pulling up the error handling when "slots" is less
than 0, and calling drm_dp_mst_topology_put_port() before termination
when drm_dp_init_vcpi() returns a negative value.
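The general pattern of the fix, as a generic refcount sketch (a toy object,
not the DRM helpers): take the reference only after the cheap validity
check, and drop it on every error path once it has been taken:

#include <stdio.h>

struct obj { int refcount; };

static struct obj *obj_get(struct obj *o) { o->refcount++; return o; }
static void obj_put(struct obj *o) { o->refcount--; }

static int allocate_vcpi(struct obj *port, int slots)
{
    if (slots < 0)              /* check before taking the reference */
        return -1;

    port = obj_get(port);

    if (slots > 63) {           /* stand-in for drm_dp_init_vcpi() failing */
        obj_put(port);          /* balance the get on the error path */
        return -1;
    }

    obj_put(port);
    return 0;
}

int main(void)
{
    struct obj port = { .refcount = 1 };

    allocate_vcpi(&port, 100);
    printf("refcount after failed allocation: %d\n", port.refcount); /* 1 */
    return 0;
}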
Fixes: 1e797f556c61 ("drm/dp: Split drm_dp_mst_allocate_vcpi")
Cc: <stable(a)vger.kernel.org> # v4.12+
Signed-off-by: Xiyu Yang <xiyuyang19(a)fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf(a)gmail.com>
Signed-off-by: Xin Xiong <xiongx18(a)fudan.edu.cn>
Reviewed-by: Lyude Paul <lyude(a)redhat.com>
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200719154545.GA41231@xin-vi…
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 09b32289497e..b23cb2fec3f3 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -4308,11 +4308,11 @@ bool drm_dp_mst_allocate_vcpi(struct drm_dp_mst_topology_mgr *mgr,
{
int ret;
- port = drm_dp_mst_topology_get_port_validated(mgr, port);
- if (!port)
+ if (slots < 0)
return false;
- if (slots < 0)
+ port = drm_dp_mst_topology_get_port_validated(mgr, port);
+ if (!port)
return false;
if (port->vcpi.vcpi > 0) {
@@ -4328,6 +4328,7 @@ bool drm_dp_mst_allocate_vcpi(struct drm_dp_mst_topology_mgr *mgr,
if (ret) {
DRM_DEBUG_KMS("failed to init vcpi slots=%d max=63 ret=%d\n",
DIV_ROUND_UP(pbn, mgr->pbn_div), ret);
+ drm_dp_mst_topology_put_port(port);
goto out;
}
DRM_DEBUG_KMS("initing vcpi for pbn=%d slots=%d\n",
The ordering of psp_tmr_terminate() and psp_asd_unload()
got reversed when the patches were applied to stable.
This patch does not exist in Linus' tree because the ordering
is correct there; it is for stable only.
Fixes: 22ff658396b446 ("drm/amdgpu: asd function needs to be unloaded in suspend phase")
Fixes: 2c41c968c6f648 ("drm/amdgpu: add TMR destory function for psp")
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 5.7.x
Cc: Huang Rui <ray.huang(a)amd.com>
---
Make the description more explicit as to why this patch is only for stable.
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 3c6f60c5b1a5..088f43ebdceb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1679,15 +1679,15 @@ static int psp_suspend(void *handle)
}
}
- ret = psp_tmr_terminate(psp);
+ ret = psp_asd_unload(psp);
if (ret) {
- DRM_ERROR("Falied to terminate tmr\n");
+ DRM_ERROR("Failed to unload asd\n");
return ret;
}
- ret = psp_asd_unload(psp);
+ ret = psp_tmr_terminate(psp);
if (ret) {
- DRM_ERROR("Failed to unload asd\n");
+ DRM_ERROR("Falied to terminate tmr\n");
return ret;
}
--
2.25.4
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ec11fe3705a7d434ec3493bdaddb9b6367e7366a Mon Sep 17 00:00:00 2001
From: hersen wu <hersenxs.wu(a)amd.com>
Date: Mon, 22 Jun 2020 13:29:16 -0400
Subject: [PATCH] drm/amd/display: OLED panel backlight adjust not work with
external display connected
[Why]
amdgpu_dm->backlight_caps is for a single eDP only. The caps are updated
for every connector, so the real eDP caps will be overwritten by other
external displays. For an OLED panel, caps->aux_support is set to 1;
after an external display is connected, caps->aux_support is set to 0.
This causes OLED backlight adjustment to not work.
[How]
Within update_connector_ext_caps(), backlight caps will be updated only
for the eDP connector.
Cc: stable(a)vger.kernel.org
Signed-off-by: hersen wu <hersenxs.wu(a)amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas(a)amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6f937e25a62c..9c283531e1a2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2028,6 +2028,7 @@ static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector)
struct amdgpu_display_manager *dm;
struct drm_connector *conn_base;
struct amdgpu_device *adev;
+ struct dc_link *link = NULL;
static const u8 pre_computed_values[] = {
50, 51, 52, 53, 55, 56, 57, 58, 59, 61, 62, 63, 65, 66, 68, 69,
71, 72, 74, 75, 77, 79, 81, 82, 84, 86, 88, 90, 92, 94, 96, 98};
@@ -2035,6 +2036,10 @@ static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector)
if (!aconnector || !aconnector->dc_link)
return;
+ link = aconnector->dc_link;
+ if (link->connector_signal != SIGNAL_TYPE_EDP)
+ return;
+
conn_base = &aconnector->base;
adev = conn_base->dev->dev_private;
dm = &adev->dm;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8425400759749e23aaa831f2482b96211201af33 Mon Sep 17 00:00:00 2001
From: Denis Efremov <efremov(a)linux.com>
Date: Fri, 5 Jun 2020 20:37:43 +0300
Subject: [PATCH] drm/amd/display: Use kvfree() to free coeff in
build_regamma()
Use kvfree() instead of kfree() to free coeff in build_regamma()
because the memory is allocated with kvzalloc().
Fixes: e752058b8671 ("drm/amd/display: Optimize gamma calculations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Denis Efremov <efremov(a)linux.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index 9431b48aecb4..56bb1f9f77ce 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -843,7 +843,7 @@ static bool build_regamma(struct pwl_float_data_ex *rgb_regamma,
pow_buffer_ptr = -1; // reset back to no optimize
ret = true;
release:
- kfree(coeff);
+ kvfree(coeff);
return ret;
}
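Editorial note for backporters: the rule this fix relies on is that memory
obtained from the kvmalloc() family (kvmalloc/kvzalloc/kvcalloc) may be
backed by either kmalloc() or vmalloc(), so it must always be released with
kvfree(); kfree() is only valid for kmalloc-backed memory. A minimal,
illustrative sketch of the pairing follows (not part of the patch; the
helper names and table size are made up for the example):

  #include <linux/mm.h>	/* kvzalloc(), kvfree() */

  /* Hypothetical helper: allocate a zeroed coefficient table that may be
   * too large for kmalloc().  kvzalloc() transparently falls back to
   * vmalloc(), so the buffer must be freed with kvfree(), never kfree().
   */
  static int *alloc_coeff_table(size_t nentries)
  {
  	return kvzalloc(nentries * sizeof(int), GFP_KERNEL);
  }

  static void free_coeff_table(int *coeff)
  {
  	kvfree(coeff);	/* safe for kmalloc- and vmalloc-backed memory */
  }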
The patch below does not apply to the 5.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8425400759749e23aaa831f2482b96211201af33 Mon Sep 17 00:00:00 2001
From: Denis Efremov <efremov(a)linux.com>
Date: Fri, 5 Jun 2020 20:37:43 +0300
Subject: [PATCH] drm/amd/display: Use kvfree() to free coeff in
build_regamma()
Use kvfree() instead of kfree() to free coeff in build_regamma()
because the memory is allocated with kvzalloc().
Fixes: e752058b8671 ("drm/amd/display: Optimize gamma calculations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Denis Efremov <efremov(a)linux.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index 9431b48aecb4..56bb1f9f77ce 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -843,7 +843,7 @@ static bool build_regamma(struct pwl_float_data_ex *rgb_regamma,
pow_buffer_ptr = -1; // reset back to no optimize
ret = true;
release:
- kfree(coeff);
+ kvfree(coeff);
return ret;
}
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8425400759749e23aaa831f2482b96211201af33 Mon Sep 17 00:00:00 2001
From: Denis Efremov <efremov(a)linux.com>
Date: Fri, 5 Jun 2020 20:37:43 +0300
Subject: [PATCH] drm/amd/display: Use kvfree() to free coeff in
build_regamma()
Use kvfree() instead of kfree() to free coeff in build_regamma()
because the memory is allocated with kvzalloc().
Fixes: e752058b8671 ("drm/amd/display: Optimize gamma calculations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Denis Efremov <efremov(a)linux.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index 9431b48aecb4..56bb1f9f77ce 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -843,7 +843,7 @@ static bool build_regamma(struct pwl_float_data_ex *rgb_regamma,
pow_buffer_ptr = -1; // reset back to no optimize
ret = true;
release:
- kfree(coeff);
+ kvfree(coeff);
return ret;
}
The patch below was submitted to be applied to the 5.8-stable tree.
I fail to see how this patch meets the stable kernel rules as found at
Documentation/process/stable-kernel-rules.rst.
I could be totally wrong, and if so, please respond to
<stable(a)vger.kernel.org> and let me know why this patch should be
applied. Otherwise, it is now dropped from my patch queues, never to be
seen again.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1d8d42ba365101fa68d210c0e2ed2bc9582fda6c Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann(a)suse.de>
Date: Fri, 5 Jun 2020 15:57:50 +0200
Subject: [PATCH] drm/mgag200: Remove declaration of mgag200_mmap() from header
file
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit 94668ac796a5 ("drm/mgag200: Convert mgag200 driver to VRAM MM")
removed the implementation of mgag200_mmap(). Also remove the declaration.
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Acked-by: Sam Ravnborg <sam(a)ravnborg.org>
Fixes: 94668ac796a5 ("drm/mgag200: Convert mgag200 driver to VRAM MM")
Cc: Gerd Hoffmann <kraxel(a)redhat.com>
Cc: Dave Airlie <airlied(a)redhat.com>
Cc: Krzysztof Kozlowski <krzk(a)kernel.org>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: "Noralf Trønnes" <noralf(a)tronnes.org>
Cc: Armijn Hemel <armijn(a)tjaldur.nl>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Emil Velikov <emil.velikov(a)collabora.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Link: https://patchwork.freedesktop.org/patch/msgid/20200605135803.19811-2-tzimme…
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h b/drivers/gpu/drm/mgag200/mgag200_drv.h
index 47df62b1ad29..92b6679029fe 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.h
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
@@ -198,6 +198,5 @@ void mgag200_i2c_destroy(struct mga_i2c_chan *i2c);
int mgag200_mm_init(struct mga_device *mdev);
void mgag200_mm_fini(struct mga_device *mdev);
-int mgag200_mmap(struct file *filp, struct vm_area_struct *vma);
#endif /* __MGAG200_DRV_H__ */
The patch below does not apply to the 5.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 119c53d2d4044c59c450c4f5a568d80b9d861856 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Wed, 8 Jul 2020 16:49:11 +0100
Subject: [PATCH] drm/vgem: Replace opencoded version of
drm_gem_dumb_map_offset()
drm_gem_dumb_map_offset() now exists and does everything
vgem_gem_dumb_map() does and *ought* to do.
In particular, vgem_gem_dumb_map() was trying to reject mmapping an
imported dmabuf by checking the existence of obj->filp. Unfortunately,
we always allocated an obj->filp, even if unused for an imported dmabuf.
Instead, the drm_gem_dumb_map_offset(), since commit 90378e589192
("drm/gem: drm_gem_dumb_map_offset(): reject dma-buf"), uses the
obj->import_attach to reject such invalid mmaps.
This prevents vgem from allowing userspace mmapping the dumb handle and
attempting to incorrectly fault in remote pages belonging to another
device, where there may not even be a struct page.
v2: Use the default drm_gem_dumb_map_offset() callback
Fixes: af33a9190d02 ("drm/vgem: Enable dmabuf import interfaces")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: <stable(a)vger.kernel.org> # v4.13+
Link: https://patchwork.freedesktop.org/patch/msgid/20200708154911.21236-1-chris@…
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index e4dc7b267a0b..a775feda1cc7 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -230,32 +230,6 @@ static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
return 0;
}
-static int vgem_gem_dumb_map(struct drm_file *file, struct drm_device *dev,
- uint32_t handle, uint64_t *offset)
-{
- struct drm_gem_object *obj;
- int ret;
-
- obj = drm_gem_object_lookup(file, handle);
- if (!obj)
- return -ENOENT;
-
- if (!obj->filp) {
- ret = -EINVAL;
- goto unref;
- }
-
- ret = drm_gem_create_mmap_offset(obj);
- if (ret)
- goto unref;
-
- *offset = drm_vma_node_offset_addr(&obj->vma_node);
-unref:
- drm_gem_object_put(obj);
-
- return ret;
-}
-
static struct drm_ioctl_desc vgem_ioctls[] = {
DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
@@ -446,7 +420,6 @@ static struct drm_driver vgem_driver = {
.fops = &vgem_driver_fops,
.dumb_create = vgem_gem_dumb_create,
- .dumb_map_offset = vgem_gem_dumb_map,
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 119c53d2d4044c59c450c4f5a568d80b9d861856 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Wed, 8 Jul 2020 16:49:11 +0100
Subject: [PATCH] drm/vgem: Replace opencoded version of
drm_gem_dumb_map_offset()
drm_gem_dumb_map_offset() now exists and does everything
vgem_gem_dumb_map() does and *ought* to do.
In particular, vgem_gem_dumb_map() was trying to reject mmapping an
imported dmabuf by checking the existence of obj->filp. Unfortunately,
we always allocated an obj->filp, even if unused for an imported dmabuf.
Instead, the drm_gem_dumb_map_offset(), since commit 90378e589192
("drm/gem: drm_gem_dumb_map_offset(): reject dma-buf"), uses the
obj->import_attach to reject such invalid mmaps.
This prevents vgem from allowing userspace mmapping the dumb handle and
attempting to incorrectly fault in remote pages belonging to another
device, where there may not even be a struct page.
v2: Use the default drm_gem_dumb_map_offset() callback
Fixes: af33a9190d02 ("drm/vgem: Enable dmabuf import interfaces")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: <stable(a)vger.kernel.org> # v4.13+
Link: https://patchwork.freedesktop.org/patch/msgid/20200708154911.21236-1-chris@…
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index e4dc7b267a0b..a775feda1cc7 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -230,32 +230,6 @@ static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
return 0;
}
-static int vgem_gem_dumb_map(struct drm_file *file, struct drm_device *dev,
- uint32_t handle, uint64_t *offset)
-{
- struct drm_gem_object *obj;
- int ret;
-
- obj = drm_gem_object_lookup(file, handle);
- if (!obj)
- return -ENOENT;
-
- if (!obj->filp) {
- ret = -EINVAL;
- goto unref;
- }
-
- ret = drm_gem_create_mmap_offset(obj);
- if (ret)
- goto unref;
-
- *offset = drm_vma_node_offset_addr(&obj->vma_node);
-unref:
- drm_gem_object_put(obj);
-
- return ret;
-}
-
static struct drm_ioctl_desc vgem_ioctls[] = {
DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
@@ -446,7 +420,6 @@ static struct drm_driver vgem_driver = {
.fops = &vgem_driver_fops,
.dumb_create = vgem_gem_dumb_create,
- .dumb_map_offset = vgem_gem_dumb_map,
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 119c53d2d4044c59c450c4f5a568d80b9d861856 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Wed, 8 Jul 2020 16:49:11 +0100
Subject: [PATCH] drm/vgem: Replace opencoded version of
drm_gem_dumb_map_offset()
drm_gem_dumb_map_offset() now exists and does everything
vgem_gem_dumb_map() does and *ought* to do.
In particular, vgem_gem_dumb_map() was trying to reject mmapping an
imported dmabuf by checking the existence of obj->filp. Unfortunately,
we always allocated an obj->filp, even if unused for an imported dmabuf.
Instead, the drm_gem_dumb_map_offset(), since commit 90378e589192
("drm/gem: drm_gem_dumb_map_offset(): reject dma-buf"), uses the
obj->import_attach to reject such invalid mmaps.
This prevents vgem from allowing userspace mmapping the dumb handle and
attempting to incorrectly fault in remote pages belonging to another
device, where there may not even be a struct page.
v2: Use the default drm_gem_dumb_map_offset() callback
Fixes: af33a9190d02 ("drm/vgem: Enable dmabuf import interfaces")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: <stable(a)vger.kernel.org> # v4.13+
Link: https://patchwork.freedesktop.org/patch/msgid/20200708154911.21236-1-chris@…
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index e4dc7b267a0b..a775feda1cc7 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -230,32 +230,6 @@ static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
return 0;
}
-static int vgem_gem_dumb_map(struct drm_file *file, struct drm_device *dev,
- uint32_t handle, uint64_t *offset)
-{
- struct drm_gem_object *obj;
- int ret;
-
- obj = drm_gem_object_lookup(file, handle);
- if (!obj)
- return -ENOENT;
-
- if (!obj->filp) {
- ret = -EINVAL;
- goto unref;
- }
-
- ret = drm_gem_create_mmap_offset(obj);
- if (ret)
- goto unref;
-
- *offset = drm_vma_node_offset_addr(&obj->vma_node);
-unref:
- drm_gem_object_put(obj);
-
- return ret;
-}
-
static struct drm_ioctl_desc vgem_ioctls[] = {
DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
@@ -446,7 +420,6 @@ static struct drm_driver vgem_driver = {
.fops = &vgem_driver_fops,
.dumb_create = vgem_gem_dumb_create,
- .dumb_map_offset = vgem_gem_dumb_map,
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 119c53d2d4044c59c450c4f5a568d80b9d861856 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Wed, 8 Jul 2020 16:49:11 +0100
Subject: [PATCH] drm/vgem: Replace opencoded version of
drm_gem_dumb_map_offset()
drm_gem_dumb_map_offset() now exists and does everything
vgem_gem_dumb_map() does and *ought* to do.
In particular, vgem_gem_dumb_map() was trying to reject mmapping an
imported dmabuf by checking the existence of obj->filp. Unfortunately,
we always allocated an obj->filp, even if unused for an imported dmabuf.
Instead, the drm_gem_dumb_map_offset(), since commit 90378e589192
("drm/gem: drm_gem_dumb_map_offset(): reject dma-buf"), uses the
obj->import_attach to reject such invalid mmaps.
This prevents vgem from allowing userspace mmapping the dumb handle and
attempting to incorrectly fault in remote pages belonging to another
device, where there may not even be a struct page.
v2: Use the default drm_gem_dumb_map_offset() callback
Fixes: af33a9190d02 ("drm/vgem: Enable dmabuf import interfaces")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: <stable(a)vger.kernel.org> # v4.13+
Link: https://patchwork.freedesktop.org/patch/msgid/20200708154911.21236-1-chris@…
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index e4dc7b267a0b..a775feda1cc7 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -230,32 +230,6 @@ static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
return 0;
}
-static int vgem_gem_dumb_map(struct drm_file *file, struct drm_device *dev,
- uint32_t handle, uint64_t *offset)
-{
- struct drm_gem_object *obj;
- int ret;
-
- obj = drm_gem_object_lookup(file, handle);
- if (!obj)
- return -ENOENT;
-
- if (!obj->filp) {
- ret = -EINVAL;
- goto unref;
- }
-
- ret = drm_gem_create_mmap_offset(obj);
- if (ret)
- goto unref;
-
- *offset = drm_vma_node_offset_addr(&obj->vma_node);
-unref:
- drm_gem_object_put(obj);
-
- return ret;
-}
-
static struct drm_ioctl_desc vgem_ioctls[] = {
DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
@@ -446,7 +420,6 @@ static struct drm_driver vgem_driver = {
.fops = &vgem_driver_fops,
.dumb_create = vgem_gem_dumb_create,
- .dumb_map_offset = vgem_gem_dumb_map,
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 119c53d2d4044c59c450c4f5a568d80b9d861856 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Wed, 8 Jul 2020 16:49:11 +0100
Subject: [PATCH] drm/vgem: Replace opencoded version of
drm_gem_dumb_map_offset()
drm_gem_dumb_map_offset() now exists and does everything
vgem_gem_dumb_map() does and *ought* to do.
In particular, vgem_gem_dumb_map() was trying to reject mmapping an
imported dmabuf by checking the existence of obj->filp. Unfortunately,
we always allocated an obj->filp, even if unused for an imported dmabuf.
Instead, the drm_gem_dumb_map_offset(), since commit 90378e589192
("drm/gem: drm_gem_dumb_map_offset(): reject dma-buf"), uses the
obj->import_attach to reject such invalid mmaps.
This prevents vgem from allowing userspace mmapping the dumb handle and
attempting to incorrectly fault in remote pages belonging to another
device, where there may not even be a struct page.
v2: Use the default drm_gem_dumb_map_offset() callback
Fixes: af33a9190d02 ("drm/vgem: Enable dmabuf import interfaces")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: <stable(a)vger.kernel.org> # v4.13+
Link: https://patchwork.freedesktop.org/patch/msgid/20200708154911.21236-1-chris@…
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index e4dc7b267a0b..a775feda1cc7 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -230,32 +230,6 @@ static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
return 0;
}
-static int vgem_gem_dumb_map(struct drm_file *file, struct drm_device *dev,
- uint32_t handle, uint64_t *offset)
-{
- struct drm_gem_object *obj;
- int ret;
-
- obj = drm_gem_object_lookup(file, handle);
- if (!obj)
- return -ENOENT;
-
- if (!obj->filp) {
- ret = -EINVAL;
- goto unref;
- }
-
- ret = drm_gem_create_mmap_offset(obj);
- if (ret)
- goto unref;
-
- *offset = drm_vma_node_offset_addr(&obj->vma_node);
-unref:
- drm_gem_object_put(obj);
-
- return ret;
-}
-
static struct drm_ioctl_desc vgem_ioctls[] = {
DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
@@ -446,7 +420,6 @@ static struct drm_driver vgem_driver = {
.fops = &vgem_driver_fops,
.dumb_create = vgem_gem_dumb_create,
- .dumb_map_offset = vgem_gem_dumb_map,
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 511b6d9aed417739b6aa49d0b6b4354ad21020f1 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 26 May 2020 10:07:53 +0100
Subject: [PATCH] drm/i915/gt: Do not schedule normal requests immediately
along virtual
When we push a virtual request onto the HW, we update the rq->engine to
point to the physical engine. A request that is then submitted by the
user that waits upon the virtual engine, but along the physical engine
in use, will then see that it is due to be submitted to the same engine
and take a shortcut (and be queued without waiting for the completion
fence). However, the virtual request may be preempted (either by higher
priority users, or by timeslicing) and removed from the physical engine
to be migrated over to one of its siblings. The dependent normal request
however is oblivious to the removal of the virtual request and remains
queued to execute on HW, believing that once it reaches the head of its
queue all of its predecessors will have completed executing!
v2: Beware restriction of signal->execution_mask prior to submission.
Fixes: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine")
Testcase: igt/gem_exec_balancer/sliced
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200526090753.11329-2-chris@…
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index e64d82f7c830..0d810a62ff46 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1242,6 +1242,25 @@ i915_request_await_execution(struct i915_request *rq,
return 0;
}
+static int
+await_request_submit(struct i915_request *to, struct i915_request *from)
+{
+ /*
+ * If we are waiting on a virtual engine, then it may be
+ * constrained to execute on a single engine *prior* to submission.
+ * When it is submitted, it will be first submitted to the virtual
+ * engine and then passed to the physical engine. We cannot allow
+ * the waiter to be submitted immediately to the physical engine
+ * as it may then bypass the virtual request.
+ */
+ if (to->engine == READ_ONCE(from->engine))
+ return i915_sw_fence_await_sw_fence_gfp(&to->submit,
+ &from->submit,
+ I915_FENCE_GFP);
+ else
+ return __i915_request_await_execution(to, from, NULL);
+}
+
static int
i915_request_await_request(struct i915_request *to, struct i915_request *from)
{
@@ -1263,10 +1282,8 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
return ret;
}
- if (to->engine == READ_ONCE(from->engine))
- ret = i915_sw_fence_await_sw_fence_gfp(&to->submit,
- &from->submit,
- I915_FENCE_GFP);
+ if (is_power_of_2(to->execution_mask | READ_ONCE(from->execution_mask)))
+ ret = await_request_submit(to, from);
else
ret = emit_semaphore_wait(to, from, I915_FENCE_GFP);
if (ret < 0)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 511b6d9aed417739b6aa49d0b6b4354ad21020f1 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 26 May 2020 10:07:53 +0100
Subject: [PATCH] drm/i915/gt: Do not schedule normal requests immediately
along virtual
When we push a virtual request onto the HW, we update the rq->engine to
point to the physical engine. A request that is then submitted by the
user that waits upon the virtual engine, but along the physical engine
in use, will then see that it is due to be submitted to the same engine
and take a shortcut (and be queued without waiting for the completion
fence). However, the virtual request may be preempted (either by higher
priority users, or by timeslicing) and removed from the physical engine
to be migrated over to one of its siblings. The dependent normal request
however is oblivious to the removal of the virtual request and remains
queued to execute on HW, believing that once it reaches the head of its
queue all of its predecessors will have completed executing!
v2: Beware restriction of signal->execution_mask prior to submission.
Fixes: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine")
Testcase: igt/gem_exec_balancer/sliced
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200526090753.11329-2-chris@…
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index e64d82f7c830..0d810a62ff46 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1242,6 +1242,25 @@ i915_request_await_execution(struct i915_request *rq,
return 0;
}
+static int
+await_request_submit(struct i915_request *to, struct i915_request *from)
+{
+ /*
+ * If we are waiting on a virtual engine, then it may be
+ * constrained to execute on a single engine *prior* to submission.
+ * When it is submitted, it will be first submitted to the virtual
+ * engine and then passed to the physical engine. We cannot allow
+ * the waiter to be submitted immediately to the physical engine
+ * as it may then bypass the virtual request.
+ */
+ if (to->engine == READ_ONCE(from->engine))
+ return i915_sw_fence_await_sw_fence_gfp(&to->submit,
+ &from->submit,
+ I915_FENCE_GFP);
+ else
+ return __i915_request_await_execution(to, from, NULL);
+}
+
static int
i915_request_await_request(struct i915_request *to, struct i915_request *from)
{
@@ -1263,10 +1282,8 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
return ret;
}
- if (to->engine == READ_ONCE(from->engine))
- ret = i915_sw_fence_await_sw_fence_gfp(&to->submit,
- &from->submit,
- I915_FENCE_GFP);
+ if (is_power_of_2(to->execution_mask | READ_ONCE(from->execution_mask)))
+ ret = await_request_submit(to, from);
else
ret = emit_semaphore_wait(to, from, I915_FENCE_GFP);
if (ret < 0)
The patch below does not apply to the 5.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 511b6d9aed417739b6aa49d0b6b4354ad21020f1 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 26 May 2020 10:07:53 +0100
Subject: [PATCH] drm/i915/gt: Do not schedule normal requests immediately
along virtual
When we push a virtual request onto the HW, we update the rq->engine to
point to the physical engine. A request that is then submitted by the
user that waits upon the virtual engine, but along the physical engine
in use, will then see that it is due to be submitted to the same engine
and take a shortcut (and be queued without waiting for the completion
fence). However, the virtual request may be preempted (either by higher
priority users, or by timeslicing) and removed from the physical engine
to be migrated over to one of its siblings. The dependent normal request
however is oblivious to the removal of the virtual request and remains
queued to execute on HW, believing that once it reaches the head of its
queue all of its predecessors will have completed executing!
v2: Beware restriction of signal->execution_mask prior to submission.
Fixes: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine")
Testcase: igt/gem_exec_balancer/sliced
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200526090753.11329-2-chris@…
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index e64d82f7c830..0d810a62ff46 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1242,6 +1242,25 @@ i915_request_await_execution(struct i915_request *rq,
return 0;
}
+static int
+await_request_submit(struct i915_request *to, struct i915_request *from)
+{
+ /*
+ * If we are waiting on a virtual engine, then it may be
+ * constrained to execute on a single engine *prior* to submission.
+ * When it is submitted, it will be first submitted to the virtual
+ * engine and then passed to the physical engine. We cannot allow
+ * the waiter to be submitted immediately to the physical engine
+ * as it may then bypass the virtual request.
+ */
+ if (to->engine == READ_ONCE(from->engine))
+ return i915_sw_fence_await_sw_fence_gfp(&to->submit,
+ &from->submit,
+ I915_FENCE_GFP);
+ else
+ return __i915_request_await_execution(to, from, NULL);
+}
+
static int
i915_request_await_request(struct i915_request *to, struct i915_request *from)
{
@@ -1263,10 +1282,8 @@ i915_request_await_request(struct i915_request *to, struct i915_request *from)
return ret;
}
- if (to->engine == READ_ONCE(from->engine))
- ret = i915_sw_fence_await_sw_fence_gfp(&to->submit,
- &from->submit,
- I915_FENCE_GFP);
+ if (is_power_of_2(to->execution_mask | READ_ONCE(from->execution_mask)))
+ ret = await_request_submit(to, from);
else
ret = emit_semaphore_wait(to, from, I915_FENCE_GFP);
if (ret < 0)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1d9221e9d395c8c80d99d002bea3c9b6dd192d6a Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 13 Jul 2020 15:16:36 +0100
Subject: [PATCH] drm/i915: Skip signaling a signaled request
Preempt-to-busy introduces various fascinating complications in that the
requests may complete as we are unsubmitting them from HW. As they may
then signal after unsubmission, we may find ourselves having to cleanup
the signaling request from within the signaling callback. This causes us
to recurse onto the same i915_request.lock.
However, if the request is already signaled (as it will be before we
enter the signal callbacks), we know we can skip the signaling of that
request during submission, neatly evading the spinlock recursion.
unsubmit(ve.rq0) # timeslice expiration or other preemption
-> virtual_submit_request(ve.rq0)
dma_fence_signal(ve.rq0) # request completed before preemption ack
-> submit_notify(ve.rq1)
-> virtual_submit_request(ve.rq1) # sees that we have completed ve.rq0
-> __i915_request_submit(ve.rq0)
[ 264.210142] BUG: spinlock recursion on CPU#2, sample_multi_tr/2093
[ 264.210150] lock: 0xffff9efd6ac55080, .magic: dead4ead, .owner: sample_multi_tr/2093, .owner_cpu: 2
[ 264.210155] CPU: 2 PID: 2093 Comm: sample_multi_tr Tainted: G U
[ 264.210158] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X212.B01.1909060036 09/06/2019
[ 264.210160] Call Trace:
[ 264.210167] dump_stack+0x98/0xda
[ 264.210174] spin_dump.cold+0x24/0x3c
[ 264.210178] do_raw_spin_lock+0x9a/0xd0
[ 264.210184] _raw_spin_lock_nested+0x6a/0x70
[ 264.210314] __i915_request_submit+0x10a/0x3c0 [i915]
[ 264.210415] virtual_submit_request+0x9b/0x380 [i915]
[ 264.210516] submit_notify+0xaf/0x14c [i915]
[ 264.210602] __i915_sw_fence_complete+0x8a/0x230 [i915]
[ 264.210692] i915_sw_fence_complete+0x2d/0x40 [i915]
[ 264.210762] __dma_i915_sw_fence_wake+0x19/0x30 [i915]
[ 264.210767] dma_fence_signal_locked+0xb1/0x1c0
[ 264.210772] dma_fence_signal+0x29/0x50
[ 264.210871] i915_request_wait+0x5cb/0x830 [i915]
[ 264.210876] ? dma_resv_get_fences_rcu+0x294/0x5d0
[ 264.210974] i915_gem_object_wait_fence+0x2f/0x40 [i915]
[ 264.211084] i915_gem_object_wait+0xce/0x400 [i915]
[ 264.211178] i915_gem_wait_ioctl+0xff/0x290 [i915]
Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
References: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: "Nayana, Venkata Ramana" <venkata.ramana.nayana(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.4+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200713141636.29326-1-chris@…
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index d907d538176e..91786310c114 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -314,13 +314,18 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq)
{
lockdep_assert_held(&rq->lock);
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags))
+ return true;
+
if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) {
struct intel_breadcrumbs *b = &rq->engine->breadcrumbs;
struct intel_context *ce = rq->context;
struct list_head *pos;
spin_lock(&b->irq_lock);
- GEM_BUG_ON(test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags));
+
+ if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags))
+ goto unlock;
if (!__intel_breadcrumbs_arm_irq(b))
goto unlock;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 3bb7320249ae..0b2fe55e6194 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -560,22 +560,25 @@ bool __i915_request_submit(struct i915_request *request)
engine->serial++;
result = true;
-xfer: /* We may be recursing from the signal callback of another i915 fence */
- spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
-
+xfer:
if (!test_and_set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags)) {
list_move_tail(&request->sched.link, &engine->active.requests);
clear_bit(I915_FENCE_FLAG_PQUEUE, &request->fence.flags);
- __notify_execute_cb(request);
}
- GEM_BUG_ON(!llist_empty(&request->execute_cb));
- if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
- !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &request->fence.flags) &&
- !i915_request_enable_breadcrumb(request))
- intel_engine_signal_breadcrumbs(engine);
+ /* We may be recursing from the signal callback of another i915 fence */
+ if (!i915_request_signaled(request)) {
+ spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
+
+ __notify_execute_cb(request);
+ if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+ &request->fence.flags) &&
+ !i915_request_enable_breadcrumb(request))
+ intel_engine_signal_breadcrumbs(engine);
- spin_unlock(&request->lock);
+ spin_unlock(&request->lock);
+ GEM_BUG_ON(!llist_empty(&request->execute_cb));
+ }
return result;
}
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1d9221e9d395c8c80d99d002bea3c9b6dd192d6a Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 13 Jul 2020 15:16:36 +0100
Subject: [PATCH] drm/i915: Skip signaling a signaled request
Preempt-to-busy introduces various fascinating complications in that the
requests may complete as we are unsubmitting them from HW. As they may
then signal after unsubmission, we may find ourselves having to cleanup
the signaling request from within the signaling callback. This causes us
to recurse onto the same i915_request.lock.
However, if the request is already signaled (as it will be before we
enter the signal callbacks), we know we can skip the signaling of that
request during submission, neatly evading the spinlock recursion.
unsubmit(ve.rq0) # timeslice expiration or other preemption
-> virtual_submit_request(ve.rq0)
dma_fence_signal(ve.rq0) # request completed before preemption ack
-> submit_notify(ve.rq1)
-> virtual_submit_request(ve.rq1) # sees that we have completed ve.rq0
-> __i915_request_submit(ve.rq0)
[ 264.210142] BUG: spinlock recursion on CPU#2, sample_multi_tr/2093
[ 264.210150] lock: 0xffff9efd6ac55080, .magic: dead4ead, .owner: sample_multi_tr/2093, .owner_cpu: 2
[ 264.210155] CPU: 2 PID: 2093 Comm: sample_multi_tr Tainted: G U
[ 264.210158] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X212.B01.1909060036 09/06/2019
[ 264.210160] Call Trace:
[ 264.210167] dump_stack+0x98/0xda
[ 264.210174] spin_dump.cold+0x24/0x3c
[ 264.210178] do_raw_spin_lock+0x9a/0xd0
[ 264.210184] _raw_spin_lock_nested+0x6a/0x70
[ 264.210314] __i915_request_submit+0x10a/0x3c0 [i915]
[ 264.210415] virtual_submit_request+0x9b/0x380 [i915]
[ 264.210516] submit_notify+0xaf/0x14c [i915]
[ 264.210602] __i915_sw_fence_complete+0x8a/0x230 [i915]
[ 264.210692] i915_sw_fence_complete+0x2d/0x40 [i915]
[ 264.210762] __dma_i915_sw_fence_wake+0x19/0x30 [i915]
[ 264.210767] dma_fence_signal_locked+0xb1/0x1c0
[ 264.210772] dma_fence_signal+0x29/0x50
[ 264.210871] i915_request_wait+0x5cb/0x830 [i915]
[ 264.210876] ? dma_resv_get_fences_rcu+0x294/0x5d0
[ 264.210974] i915_gem_object_wait_fence+0x2f/0x40 [i915]
[ 264.211084] i915_gem_object_wait+0xce/0x400 [i915]
[ 264.211178] i915_gem_wait_ioctl+0xff/0x290 [i915]
Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
References: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: "Nayana, Venkata Ramana" <venkata.ramana.nayana(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.4+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200713141636.29326-1-chris@…
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index d907d538176e..91786310c114 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -314,13 +314,18 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq)
{
lockdep_assert_held(&rq->lock);
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags))
+ return true;
+
if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) {
struct intel_breadcrumbs *b = &rq->engine->breadcrumbs;
struct intel_context *ce = rq->context;
struct list_head *pos;
spin_lock(&b->irq_lock);
- GEM_BUG_ON(test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags));
+
+ if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags))
+ goto unlock;
if (!__intel_breadcrumbs_arm_irq(b))
goto unlock;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 3bb7320249ae..0b2fe55e6194 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -560,22 +560,25 @@ bool __i915_request_submit(struct i915_request *request)
engine->serial++;
result = true;
-xfer: /* We may be recursing from the signal callback of another i915 fence */
- spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
-
+xfer:
if (!test_and_set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags)) {
list_move_tail(&request->sched.link, &engine->active.requests);
clear_bit(I915_FENCE_FLAG_PQUEUE, &request->fence.flags);
- __notify_execute_cb(request);
}
- GEM_BUG_ON(!llist_empty(&request->execute_cb));
- if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
- !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &request->fence.flags) &&
- !i915_request_enable_breadcrumb(request))
- intel_engine_signal_breadcrumbs(engine);
+ /* We may be recursing from the signal callback of another i915 fence */
+ if (!i915_request_signaled(request)) {
+ spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
+
+ __notify_execute_cb(request);
+ if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+ &request->fence.flags) &&
+ !i915_request_enable_breadcrumb(request))
+ intel_engine_signal_breadcrumbs(engine);
- spin_unlock(&request->lock);
+ spin_unlock(&request->lock);
+ GEM_BUG_ON(!llist_empty(&request->execute_cb));
+ }
return result;
}
The patch below does not apply to the 5.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1d9221e9d395c8c80d99d002bea3c9b6dd192d6a Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 13 Jul 2020 15:16:36 +0100
Subject: [PATCH] drm/i915: Skip signaling a signaled request
Preempt-to-busy introduces various fascinating complications in that the
requests may complete as we are unsubmitting them from HW. As they may
then signal after unsubmission, we may find ourselves having to cleanup
the signaling request from within the signaling callback. This causes us
to recurse onto the same i915_request.lock.
However, if the request is already signaled (as it will be before we
enter the signal callbacks), we know we can skip the signaling of that
request during submission, neatly evading the spinlock recursion.
unsubmit(ve.rq0) # timeslice expiration or other preemption
-> virtual_submit_request(ve.rq0)
dma_fence_signal(ve.rq0) # request completed before preemption ack
-> submit_notify(ve.rq1)
-> virtual_submit_request(ve.rq1) # sees that we have completed ve.rq0
-> __i915_request_submit(ve.rq0)
[ 264.210142] BUG: spinlock recursion on CPU#2, sample_multi_tr/2093
[ 264.210150] lock: 0xffff9efd6ac55080, .magic: dead4ead, .owner: sample_multi_tr/2093, .owner_cpu: 2
[ 264.210155] CPU: 2 PID: 2093 Comm: sample_multi_tr Tainted: G U
[ 264.210158] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X212.B01.1909060036 09/06/2019
[ 264.210160] Call Trace:
[ 264.210167] dump_stack+0x98/0xda
[ 264.210174] spin_dump.cold+0x24/0x3c
[ 264.210178] do_raw_spin_lock+0x9a/0xd0
[ 264.210184] _raw_spin_lock_nested+0x6a/0x70
[ 264.210314] __i915_request_submit+0x10a/0x3c0 [i915]
[ 264.210415] virtual_submit_request+0x9b/0x380 [i915]
[ 264.210516] submit_notify+0xaf/0x14c [i915]
[ 264.210602] __i915_sw_fence_complete+0x8a/0x230 [i915]
[ 264.210692] i915_sw_fence_complete+0x2d/0x40 [i915]
[ 264.210762] __dma_i915_sw_fence_wake+0x19/0x30 [i915]
[ 264.210767] dma_fence_signal_locked+0xb1/0x1c0
[ 264.210772] dma_fence_signal+0x29/0x50
[ 264.210871] i915_request_wait+0x5cb/0x830 [i915]
[ 264.210876] ? dma_resv_get_fences_rcu+0x294/0x5d0
[ 264.210974] i915_gem_object_wait_fence+0x2f/0x40 [i915]
[ 264.211084] i915_gem_object_wait+0xce/0x400 [i915]
[ 264.211178] i915_gem_wait_ioctl+0xff/0x290 [i915]
Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
References: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: "Nayana, Venkata Ramana" <venkata.ramana.nayana(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.4+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200713141636.29326-1-chris@…
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index d907d538176e..91786310c114 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -314,13 +314,18 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq)
{
lockdep_assert_held(&rq->lock);
+ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags))
+ return true;
+
if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) {
struct intel_breadcrumbs *b = &rq->engine->breadcrumbs;
struct intel_context *ce = rq->context;
struct list_head *pos;
spin_lock(&b->irq_lock);
- GEM_BUG_ON(test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags));
+
+ if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags))
+ goto unlock;
if (!__intel_breadcrumbs_arm_irq(b))
goto unlock;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 3bb7320249ae..0b2fe55e6194 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -560,22 +560,25 @@ bool __i915_request_submit(struct i915_request *request)
engine->serial++;
result = true;
-xfer: /* We may be recursing from the signal callback of another i915 fence */
- spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
-
+xfer:
if (!test_and_set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags)) {
list_move_tail(&request->sched.link, &engine->active.requests);
clear_bit(I915_FENCE_FLAG_PQUEUE, &request->fence.flags);
- __notify_execute_cb(request);
}
- GEM_BUG_ON(!llist_empty(&request->execute_cb));
- if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
- !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &request->fence.flags) &&
- !i915_request_enable_breadcrumb(request))
- intel_engine_signal_breadcrumbs(engine);
+ /* We may be recursing from the signal callback of another i915 fence */
+ if (!i915_request_signaled(request)) {
+ spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
+
+ __notify_execute_cb(request);
+ if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+ &request->fence.flags) &&
+ !i915_request_enable_breadcrumb(request))
+ intel_engine_signal_breadcrumbs(engine);
- spin_unlock(&request->lock);
+ spin_unlock(&request->lock);
+ GEM_BUG_ON(!llist_empty(&request->execute_cb));
+ }
return result;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 24ebec25fb270100e252b19c288e21bd7d8cc7f7 Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Fri, 29 May 2020 14:12:18 +0100
Subject: [PATCH] arm64: hw_breakpoint: Don't invoke overflow handler on
uaccess watchpoints
Unprivileged memory accesses generated by the so-called "translated"
instructions (e.g. STTR) at EL1 can cause EL0 watchpoints to fire
unexpectedly if kernel debugging is enabled. In such cases, the
hw_breakpoint logic will invoke the user overflow handler which will
typically raise a SIGTRAP back to the current task. This is futile when
returning back to the kernel because (a) the signal won't have been
delivered and (b) userspace can't handle the thing anyway.
Avoid invoking the user overflow handler for watchpoints triggered by
kernel uaccess routines, and instead single-step over the faulting
instruction as we would if no overflow handler had been installed.
(Fixes tag identifies the introduction of unprivileged memory accesses,
which exposed this latent bug in the hw_breakpoint code)
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: James Morse <james.morse(a)arm.com>
Fixes: 57f4959bad0a ("arm64: kernel: Add support for User Access Override")
Reported-by: Luis Machado <luis.machado(a)linaro.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index 0b727edf4104..af234a1e08b7 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -730,6 +730,27 @@ static u64 get_distance_from_watchpoint(unsigned long addr, u64 val,
return 0;
}
+static int watchpoint_report(struct perf_event *wp, unsigned long addr,
+ struct pt_regs *regs)
+{
+ int step = is_default_overflow_handler(wp);
+ struct arch_hw_breakpoint *info = counter_arch_bp(wp);
+
+ info->trigger = addr;
+
+ /*
+ * If we triggered a user watchpoint from a uaccess routine, then
+ * handle the stepping ourselves since userspace really can't help
+ * us with this.
+ */
+ if (!user_mode(regs) && info->ctrl.privilege == AARCH64_BREAKPOINT_EL0)
+ step = 1;
+ else
+ perf_bp_event(wp, regs);
+
+ return step;
+}
+
static int watchpoint_handler(unsigned long addr, unsigned int esr,
struct pt_regs *regs)
{
@@ -739,7 +760,6 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
u64 val;
struct perf_event *wp, **slots;
struct debug_info *debug_info;
- struct arch_hw_breakpoint *info;
struct arch_hw_breakpoint_ctrl ctrl;
slots = this_cpu_ptr(wp_on_reg);
@@ -777,25 +797,13 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
if (dist != 0)
continue;
- info = counter_arch_bp(wp);
- info->trigger = addr;
- perf_bp_event(wp, regs);
-
- /* Do we need to handle the stepping? */
- if (is_default_overflow_handler(wp))
- step = 1;
+ step = watchpoint_report(wp, addr, regs);
}
- if (min_dist > 0 && min_dist != -1) {
- /* No exact match found. */
- wp = slots[closest_match];
- info = counter_arch_bp(wp);
- info->trigger = addr;
- perf_bp_event(wp, regs);
- /* Do we need to handle the stepping? */
- if (is_default_overflow_handler(wp))
- step = 1;
- }
+ /* No exact match found? */
+ if (min_dist > 0 && min_dist != -1)
+ step = watchpoint_report(slots[closest_match], addr, regs);
+
rcu_read_unlock();
if (!step)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 24ebec25fb270100e252b19c288e21bd7d8cc7f7 Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Fri, 29 May 2020 14:12:18 +0100
Subject: [PATCH] arm64: hw_breakpoint: Don't invoke overflow handler on
uaccess watchpoints
Unprivileged memory accesses generated by the so-called "translated"
instructions (e.g. STTR) at EL1 can cause EL0 watchpoints to fire
unexpectedly if kernel debugging is enabled. In such cases, the
hw_breakpoint logic will invoke the user overflow handler which will
typically raise a SIGTRAP back to the current task. This is futile when
returning back to the kernel because (a) the signal won't have been
delivered and (b) userspace can't handle the thing anyway.
Avoid invoking the user overflow handler for watchpoints triggered by
kernel uaccess routines, and instead single-step over the faulting
instruction as we would if no overflow handler had been installed.
(Fixes tag identifies the introduction of unprivileged memory accesses,
which exposed this latent bug in the hw_breakpoint code)
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: James Morse <james.morse(a)arm.com>
Fixes: 57f4959bad0a ("arm64: kernel: Add support for User Access Override")
Reported-by: Luis Machado <luis.machado(a)linaro.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index 0b727edf4104..af234a1e08b7 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -730,6 +730,27 @@ static u64 get_distance_from_watchpoint(unsigned long addr, u64 val,
return 0;
}
+static int watchpoint_report(struct perf_event *wp, unsigned long addr,
+ struct pt_regs *regs)
+{
+ int step = is_default_overflow_handler(wp);
+ struct arch_hw_breakpoint *info = counter_arch_bp(wp);
+
+ info->trigger = addr;
+
+ /*
+ * If we triggered a user watchpoint from a uaccess routine, then
+ * handle the stepping ourselves since userspace really can't help
+ * us with this.
+ */
+ if (!user_mode(regs) && info->ctrl.privilege == AARCH64_BREAKPOINT_EL0)
+ step = 1;
+ else
+ perf_bp_event(wp, regs);
+
+ return step;
+}
+
static int watchpoint_handler(unsigned long addr, unsigned int esr,
struct pt_regs *regs)
{
@@ -739,7 +760,6 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
u64 val;
struct perf_event *wp, **slots;
struct debug_info *debug_info;
- struct arch_hw_breakpoint *info;
struct arch_hw_breakpoint_ctrl ctrl;
slots = this_cpu_ptr(wp_on_reg);
@@ -777,25 +797,13 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
if (dist != 0)
continue;
- info = counter_arch_bp(wp);
- info->trigger = addr;
- perf_bp_event(wp, regs);
-
- /* Do we need to handle the stepping? */
- if (is_default_overflow_handler(wp))
- step = 1;
+ step = watchpoint_report(wp, addr, regs);
}
- if (min_dist > 0 && min_dist != -1) {
- /* No exact match found. */
- wp = slots[closest_match];
- info = counter_arch_bp(wp);
- info->trigger = addr;
- perf_bp_event(wp, regs);
- /* Do we need to handle the stepping? */
- if (is_default_overflow_handler(wp))
- step = 1;
- }
+ /* No exact match found? */
+ if (min_dist > 0 && min_dist != -1)
+ step = watchpoint_report(slots[closest_match], addr, regs);
+
rcu_read_unlock();
if (!step)
From: Mike Snitzer <snitzer(a)redhat.com>
Discontinue issuing writethrough write IO in series to the origin and
then cache.
Use bio_clone_fast() to create a new origin clone bio that will be
mapped to the origin device and then bio_chain() it to the bio that gets
remapped to the cache device. The origin clone bio does _not_ have a
copy of the per_bio_data -- as such check_if_tick_bio_needed() will not
be called.
The cache bio (parent bio) will not complete until the origin bio has
completed -- this fulfills bio_clone_fast()'s requirements as well as
the requirement to not complete the original IO until the write IO has
completed to both the origin and cache device.
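(Editorial illustration, not part of the patch: the clone-and-chain pattern
described above, reduced to a minimal sketch against the pre-5.18 bio API
used by this backport. The function and device names are invented, and the
real patch remaps the parent bio through the usual dm path rather than
submitting it directly.)

  #include <linux/bio.h>
  #include <linux/blkdev.h>

  /* Sketch: send a write to two devices in parallel.  The clone goes to
   * the origin device; the parent is redirected to the cache device and,
   * thanks to bio_chain(), does not complete until the clone has too.
   */
  static void submit_writethrough(struct bio *bio, struct bio_set *bs,
  				  struct block_device *origin_bdev,
  				  struct block_device *cache_bdev)
  {
  	/* Shares the parent's data pages; carries no per-bio private data. */
  	struct bio *origin_bio = bio_clone_fast(bio, GFP_NOIO, bs);

  	bio_set_dev(origin_bio, origin_bdev);

  	/* Parent's bi_end_io is deferred until the chained clone finishes. */
  	bio_chain(origin_bio, bio);
  	submit_bio(origin_bio);

  	bio_set_dev(bio, cache_bdev);
  	submit_bio(bio);
  }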
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
(cherry picked from commit 2df3bae9a6543e90042291707b8db0cbfbae9ee9)
Fixes: 4ec34f2196d125ff781170ddc6c3058c08ec5e73 ("dm bio record: save/restore bi_end_io and bi_integrity")
Commit 4ec34f21 introduced a mkfs.ext4 hang on an LVM device that had been
modified with lvconvert --cachemode=writethrough.
CC: stable@vger.kernel.org for 4.14.y
Signed-off-by: John Donnelly <john.p.donnelly(a)oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy(a)oracle.com>
Conflicts:
drivers/md/dm-cache-target.c - corrected the usage of
writethrough_mode(&cache->features) caught by the compiler, and removed
the now-unused static functions writethrough_endio(),
defer_writethrough_bio() and wake_deferred_writethrough_worker(), which
generated warnings.
---
drivers/md/dm-cache-target.c | 92 ++++++++++++++++++--------------------------
1 file changed, 37 insertions(+), 55 deletions(-)
diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c
index 69cdb29ef6be..2732d1df05fa 100644
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -450,6 +450,7 @@ struct cache {
struct work_struct migration_worker;
struct delayed_work waker;
struct dm_bio_prison_v2 *prison;
+ struct bio_set *bs;
mempool_t *migration_pool;
@@ -537,11 +538,6 @@ static void wake_deferred_bio_worker(struct cache *cache)
queue_work(cache->wq, &cache->deferred_bio_worker);
}
-static void wake_deferred_writethrough_worker(struct cache *cache)
-{
- queue_work(cache->wq, &cache->deferred_writethrough_worker);
-}
-
static void wake_migration_worker(struct cache *cache)
{
if (passthrough_mode(&cache->features))
@@ -868,16 +864,23 @@ static void check_if_tick_bio_needed(struct cache *cache, struct bio *bio)
spin_unlock_irqrestore(&cache->lock, flags);
}
-static void remap_to_origin_clear_discard(struct cache *cache, struct bio *bio,
- dm_oblock_t oblock)
+static void __remap_to_origin_clear_discard(struct cache *cache, struct bio *bio,
+ dm_oblock_t oblock, bool bio_has_pbd)
{
- // FIXME: this is called way too much.
- check_if_tick_bio_needed(cache, bio);
+ if (bio_has_pbd)
+ check_if_tick_bio_needed(cache, bio);
remap_to_origin(cache, bio);
if (bio_data_dir(bio) == WRITE)
clear_discard(cache, oblock_to_dblock(cache, oblock));
}
+static void remap_to_origin_clear_discard(struct cache *cache, struct bio *bio,
+ dm_oblock_t oblock)
+{
+ // FIXME: check_if_tick_bio_needed() is called way too much through this interface
+ __remap_to_origin_clear_discard(cache, bio, oblock, true);
+}
+
static void remap_to_cache_dirty(struct cache *cache, struct bio *bio,
dm_oblock_t oblock, dm_cblock_t cblock)
{
@@ -937,57 +940,26 @@ static void issue_op(struct bio *bio, void *context)
accounted_request(cache, bio);
}
-static void defer_writethrough_bio(struct cache *cache, struct bio *bio)
-{
- unsigned long flags;
-
- spin_lock_irqsave(&cache->lock, flags);
- bio_list_add(&cache->deferred_writethrough_bios, bio);
- spin_unlock_irqrestore(&cache->lock, flags);
-
- wake_deferred_writethrough_worker(cache);
-}
-
-static void writethrough_endio(struct bio *bio)
-{
- struct per_bio_data *pb = get_per_bio_data(bio, PB_DATA_SIZE_WT);
-
- dm_unhook_bio(&pb->hook_info, bio);
-
- if (bio->bi_status) {
- bio_endio(bio);
- return;
- }
-
- dm_bio_restore(&pb->bio_details, bio);
- remap_to_cache(pb->cache, bio, pb->cblock);
-
- /*
- * We can't issue this bio directly, since we're in interrupt
- * context. So it gets put on a bio list for processing by the
- * worker thread.
- */
- defer_writethrough_bio(pb->cache, bio);
-}
-
/*
- * FIXME: send in parallel, huge latency as is.
* When running in writethrough mode we need to send writes to clean blocks
- * to both the cache and origin devices. In future we'd like to clone the
- * bio and send them in parallel, but for now we're doing them in
- * series as this is easier.
+ * to both the cache and origin devices. Clone the bio and send them in parallel.
*/
-static void remap_to_origin_then_cache(struct cache *cache, struct bio *bio,
- dm_oblock_t oblock, dm_cblock_t cblock)
+static void remap_to_origin_and_cache(struct cache *cache, struct bio *bio,
+ dm_oblock_t oblock, dm_cblock_t cblock)
{
- struct per_bio_data *pb = get_per_bio_data(bio, PB_DATA_SIZE_WT);
+ struct bio *origin_bio = bio_clone_fast(bio, GFP_NOIO, cache->bs);
+
+ BUG_ON(!origin_bio);
- pb->cache = cache;
- pb->cblock = cblock;
- dm_hook_bio(&pb->hook_info, bio, writethrough_endio, NULL);
- dm_bio_record(&pb->bio_details, bio);
+ bio_chain(origin_bio, bio);
+ /*
+ * Passing false to __remap_to_origin_clear_discard() skips
+ * all code that might use per_bio_data (since clone doesn't have it)
+ */
+ __remap_to_origin_clear_discard(cache, origin_bio, oblock, false);
+ submit_bio(origin_bio);
- remap_to_origin_clear_discard(pb->cache, bio, oblock);
+ remap_to_cache(cache, bio, cblock);
}
/*----------------------------------------------------------------
@@ -1873,7 +1845,7 @@ static int map_bio(struct cache *cache, struct bio *bio, dm_oblock_t block,
} else {
if (bio_data_dir(bio) == WRITE && writethrough_mode(&cache->features) &&
!is_dirty(cache, cblock)) {
- remap_to_origin_then_cache(cache, bio, block, cblock);
+ remap_to_origin_and_cache(cache, bio, block, cblock);
accounted_begin(cache, bio);
} else
remap_to_cache_dirty(cache, bio, block, cblock);
@@ -2132,6 +2104,9 @@ static void destroy(struct cache *cache)
kfree(cache->ctr_args[i]);
kfree(cache->ctr_args);
+ if (cache->bs)
+ bioset_free(cache->bs);
+
kfree(cache);
}
@@ -2589,6 +2564,13 @@ static int cache_create(struct cache_args *ca, struct cache **result)
cache->features = ca->features;
ti->per_io_data_size = get_per_bio_data_size(cache);
+ if (writethrough_mode(&cache->features)) {
+ /* Create bioset for writethrough bios issued to origin */
+ cache->bs = bioset_create(BIO_POOL_SIZE, 0, 0);
+ if (!cache->bs)
+ goto bad;
+ }
+
cache->callbacks.congested_fn = cache_is_congested;
dm_table_add_target_callbacks(ti->table, &cache->callbacks);
--
1.8.3.1
Hello, stable-kernel maintainers!
Could you please cherry-pick these commits into the v5.7.x kernel?
commit 0de6db30ef79b391cedd749801a49c485d2daf4b
Author: Sowjanya Komatineni <skomatineni(a)nvidia.com>
Date: Mon Jan 13 23:24:17 2020 -0800
ASoC: tegra: Use device managed resource APIs to get the clock
commit 1e4e0bf136aa4b4aa59c1e6af19844bd6d807794
Author: Sowjanya Komatineni <skomatineni(a)nvidia.com>
Date: Mon Jan 13 23:24:23 2020 -0800
ASoC: tegra: Add audio mclk parent configuration
commit ff5d18cb04f4ecccbcf05b7f83ab6df2a0d95c16
Author: Sowjanya Komatineni <skomatineni(a)nvidia.com>
Date: Mon Jan 13 23:24:24 2020 -0800
ASoC: tegra: Enable audio mclk during tegra_asoc_utils_init()
They will fix a huge warning splat during kernel boot on NVIDIA Tegra
SoCs. For some reason these patches didn't make it into 5.7 when it was
released, and several people have complained about the warnings. Thanks in
advance!
Recently a customer of ours experienced a crash when booting the
system while enabling memory-hotplug.
The problem is that Normal zones on different nodes don't get their private
zone->pageset allocated, and keep sharing the initial boot_pageset.
The sharing between zones is normally safe, as explained by the comment for
boot_pageset: it is a percpu structure, manipulations are done with
interrupts disabled, and boot_pageset is set up so that any page placed on
its pcplist is immediately flushed to the zone's freelist, because
pcp->high == 1.
However, the hotplug operation updates pcp->high to a higher value as it
expects to be operating on a private pageset.
The problem is in build_all_zonelists(), which is called when the first range
of pages is onlined for the Normal zone of node X or Y:
	if (system_state == SYSTEM_BOOTING) {
		build_all_zonelists_init();
	} else {
#ifdef CONFIG_MEMORY_HOTPLUG
		if (zone)
			setup_zone_pageset(zone);
#endif
		/* we have to stop all cpus to guarantee there is no user
		   of zonelist */
		stop_machine(__build_all_zonelists, pgdat, NULL);
		/* cpuset refresh routine should be here */
	}
When called during hotplug, it should execute the setup_zone_pageset(zone)
which allocates the private pageset.
However, with memhp_default_state=online, this happens early while
system_state == SYSTEM_BOOTING is still true, hence this step is skipped.
(and build_all_zonelists_init() is probably unsafe anyway at this point).
Another hotplug operation on the same zone then leads to zone_pcp_update(zone)
called from online_pages(), which updates the pcp->high for the shared
boot_pageset to a value higher than 1.
At that point, pages freed from the Node X and Node Y Normal zones can end up
on the same pcplist, and from there they can be freed to the wrong zone's
freelist, leading to corruption and crashes.
Please note that upstream has fixed this differently (and unintentionally) by
adding another boot state (SYSTEM_SCHEDULING), which is set before smp_init().
That should happen before memory hotplug events, even with memhp_default_state=online.
Backporting that would be too intrusive.
Signed-off-by: Oscar Salvador <osalvador(a)suse.de>
Debugged-by: Vlastimil Babka <vbabka(a)suse.cz>
---
include/linux/mmzone.h | 3 ++-
init/main.c | 2 +-
mm/memory_hotplug.c | 10 +++++-----
mm/page_alloc.c | 7 ++++---
4 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e3d7754f25f0..5c7645e156a5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -756,7 +756,8 @@ static inline bool is_dev_zone(const struct zone *zone)
#include <linux/memory_hotplug.h>
extern struct mutex zonelists_mutex;
-void build_all_zonelists(pg_data_t *pgdat, struct zone *zone);
+void build_all_zonelists(pg_data_t *pgdat, struct zone *zone,
+ bool hotplug_context);
void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx);
bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
int classzone_idx, unsigned int alloc_flags,
diff --git a/init/main.c b/init/main.c
index d47860dbe896..7ad08957dd18 100644
--- a/init/main.c
+++ b/init/main.c
@@ -512,7 +512,7 @@ asmlinkage __visible void __init start_kernel(void)
smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
boot_cpu_hotplug_init();
- build_all_zonelists(NULL, NULL);
+ build_all_zonelists(NULL, NULL, false);
page_alloc_init();
pr_notice("Kernel command line: %s\n", boot_command_line);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 449999657c0b..a4ffe5996317 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1125,7 +1125,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
mutex_lock(&zonelists_mutex);
if (!populated_zone(zone)) {
need_zonelists_rebuild = 1;
- build_all_zonelists(NULL, zone);
+ build_all_zonelists(NULL, zone, true);
}
ret = walk_system_ram_range(pfn, nr_pages, &onlined_pages,
@@ -1146,7 +1146,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
if (onlined_pages) {
node_states_set_node(nid, &arg);
if (need_zonelists_rebuild)
- build_all_zonelists(NULL, NULL);
+ build_all_zonelists(NULL, NULL, true);
else
zone_pcp_update(zone);
}
@@ -1220,7 +1220,7 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
* to access not-initialized zonelist, build here.
*/
mutex_lock(&zonelists_mutex);
- build_all_zonelists(pgdat, NULL);
+ build_all_zonelists(pgdat, NULL, true);
mutex_unlock(&zonelists_mutex);
/*
@@ -1276,7 +1276,7 @@ int try_online_node(int nid)
if (pgdat->node_zonelists->_zonerefs->zone == NULL) {
mutex_lock(&zonelists_mutex);
- build_all_zonelists(NULL, NULL);
+ build_all_zonelists(NULL, NULL, true);
mutex_unlock(&zonelists_mutex);
}
@@ -2016,7 +2016,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
if (!populated_zone(zone)) {
zone_pcp_reset(zone);
mutex_lock(&zonelists_mutex);
- build_all_zonelists(NULL, NULL);
+ build_all_zonelists(NULL, NULL, true);
mutex_unlock(&zonelists_mutex);
} else
zone_pcp_update(zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index de00e0fec484..f394dd87fa03 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4608,7 +4608,7 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
user_zonelist_order = oldval;
} else if (oldval != user_zonelist_order) {
mutex_lock(&zonelists_mutex);
- build_all_zonelists(NULL, NULL);
+ build_all_zonelists(NULL, NULL, false);
mutex_unlock(&zonelists_mutex);
}
}
@@ -4988,11 +4988,12 @@ build_all_zonelists_init(void)
* (2) call of __init annotated helper build_all_zonelists_init
* [protected by SYSTEM_BOOTING].
*/
-void __ref build_all_zonelists(pg_data_t *pgdat, struct zone *zone)
+void __ref build_all_zonelists(pg_data_t *pgdat, struct zone *zone,
+ bool hotplug_context)
{
set_zonelist_order();
- if (system_state == SYSTEM_BOOTING) {
+ if (system_state == SYSTEM_BOOTING && !hotplug_context) {
build_all_zonelists_init();
} else {
#ifdef CONFIG_MEMORY_HOTPLUG
--
2.26.2
From: Thomas Gleixner <tglx(a)linutronix.de>
commit baedb87d1b53532f81b4bd0387f83b05d4f7eb9a upstream.
Setting interrupt affinity on inactive interrupts is inconsistent when
hierarchical irq domains are enabled. The core code should just store the
affinity and not call into the irq chip driver for inactive interrupts
because the chip drivers may not be in a state to handle such requests.
X86 has a hacky workaround for that, but all other irq chips do not, which
causes problems e.g. on GIC V3 ITS.
Instead of adding more ugly hacks all over the place, solve the problem in
the core code. If the affinity is set on an inactive interrupt then:
- Store it in the irq descriptors affinity mask
- Update the effective affinity to reflect that so user space has
a consistent view
- Don't call into the irq chip driver
This is the core equivalent of the X86 workaround and works correctly
because the affinity setting is established in the irq chip when the
interrupt is activated later on.
Note that this is only effective when hierarchical irq domains are enabled
by the architecture. Doing it unconditionally would break legacy irq chip
implementations.
For hierarchical irq domains this works correctly, as none of the drivers
can have a dependency on affinity settings in the inactive state by design.
Remove the X86 workaround as it is no longer required.
Fixes: 02edee152d6e ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
Reported-by: Ali Saidi <alisaidi(a)amazon.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Ali Saidi <alisaidi(a)amazon.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com
Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
[fllinden(a)amazon.com - 4.14 never had the x86 workaround, so skip x86 changes]
Signed-off-by: Frank van der Linden <fllinden(a)amazon.com>
---
kernel/irq/manage.c | 37 +++++++++++++++++++++++++++++++++++--
1 file changed, 35 insertions(+), 2 deletions(-)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 5277949e82e0..ce6fba446814 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -168,9 +168,9 @@ void irq_set_thread_affinity(struct irq_desc *desc)
set_bit(IRQTF_AFFINITY, &action->thread_flags);
}
+#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK
static void irq_validate_effective_affinity(struct irq_data *data)
{
-#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK
const struct cpumask *m = irq_data_get_effective_affinity_mask(data);
struct irq_chip *chip = irq_data_get_irq_chip(data);
@@ -178,9 +178,19 @@ static void irq_validate_effective_affinity(struct irq_data *data)
return;
pr_warn_once("irq_chip %s did not update eff. affinity mask of irq %u\n",
chip->name, data->irq);
-#endif
}
+static inline void irq_init_effective_affinity(struct irq_data *data,
+ const struct cpumask *mask)
+{
+ cpumask_copy(irq_data_get_effective_affinity_mask(data), mask);
+}
+#else
+static inline void irq_validate_effective_affinity(struct irq_data *data) { }
+static inline void irq_init_effective_affinity(struct irq_data *data,
+ const struct cpumask *mask) { }
+#endif
+
int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
bool force)
{
@@ -205,6 +215,26 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
return ret;
}
+static bool irq_set_affinity_deactivated(struct irq_data *data,
+ const struct cpumask *mask, bool force)
+{
+ struct irq_desc *desc = irq_data_to_desc(data);
+
+ /*
+ * If the interrupt is not yet activated, just store the affinity
+ * mask and do not call the chip driver at all. On activation the
+ * driver has to make sure anyway that the interrupt is in a
+ * useable state so startup works.
+ */
+ if (!IS_ENABLED(CONFIG_IRQ_DOMAIN_HIERARCHY) || irqd_is_activated(data))
+ return false;
+
+ cpumask_copy(desc->irq_common_data.affinity, mask);
+ irq_init_effective_affinity(data, mask);
+ irqd_set(data, IRQD_AFFINITY_SET);
+ return true;
+}
+
int irq_set_affinity_locked(struct irq_data *data, const struct cpumask *mask,
bool force)
{
@@ -215,6 +245,9 @@ int irq_set_affinity_locked(struct irq_data *data, const struct cpumask *mask,
if (!chip || !chip->irq_set_affinity)
return -EINVAL;
+ if (irq_set_affinity_deactivated(data, mask, force))
+ return 0;
+
if (irq_can_move_pcntxt(data)) {
ret = irq_do_set_affinity(data, mask, force);
} else {
--
2.17.2
The patch below does not apply to the 5.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 18e77600f7a1ed69f8ce46c9e11cad0985712dfa Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Thu, 6 Aug 2020 23:26:22 -0700
Subject: [PATCH] khugepaged: retract_page_tables() remember to test exit
Only once have I seen this scenario (and forgot even to notice what forced
the eventual crash): a sequence of "BUG: Bad page map" alerts from
vm_normal_page(), from zap_pte_range() servicing exit_mmap();
pmd:00000000, pte values corresponding to data in physical page 0.
The pte mappings being zapped in this case were supposed to be from a huge
page of ext4 text (but could as well have been shmem): my belief is that
it was racing with collapse_file()'s retract_page_tables(), found *pmd
pointing to a page table, locked it, but *pmd had become 0 by the time
start_pte was decided.
In most cases, that possibility is excluded by holding mmap lock; but
exit_mmap() proceeds without mmap lock. Most of what's run by khugepaged
checks khugepaged_test_exit() after acquiring mmap lock:
khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
for example. But retract_page_tables() did not: fix that.
The fix is for retract_page_tables() to check khugepaged_test_exit(),
after acquiring mmap lock, before doing anything to the page table.
Getting the mmap lock serializes with __mmput(), which briefly takes and
drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
mm_users makes sure we don't touch the page table once exit_mmap() might
reach it, since exit_mmap() will be proceeding without mmap lock, not
expecting anyone to be racing with it.
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021215400.27773@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a9aca9b71d6f..ac04b332a373 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1532,6 +1532,7 @@ static int khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
{
struct vm_area_struct *vma;
+ struct mm_struct *mm;
unsigned long addr;
pmd_t *pmd, _pmd;
@@ -1560,7 +1561,8 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
continue;
if (vma->vm_end < addr + HPAGE_PMD_SIZE)
continue;
- pmd = mm_find_pmd(vma->vm_mm, addr);
+ mm = vma->vm_mm;
+ pmd = mm_find_pmd(mm, addr);
if (!pmd)
continue;
/*
@@ -1570,17 +1572,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* mmap_lock while holding page lock. Fault path does it in
* reverse order. Trylock is a way to avoid deadlock.
*/
- if (mmap_write_trylock(vma->vm_mm)) {
- spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);
- /* assume page table is clear */
- _pmd = pmdp_collapse_flush(vma, addr, pmd);
- spin_unlock(ptl);
- mmap_write_unlock(vma->vm_mm);
- mm_dec_nr_ptes(vma->vm_mm);
- pte_free(vma->vm_mm, pmd_pgtable(_pmd));
+ if (mmap_write_trylock(mm)) {
+ if (!khugepaged_test_exit(mm)) {
+ spinlock_t *ptl = pmd_lock(mm, pmd);
+ /* assume page table is clear */
+ _pmd = pmdp_collapse_flush(vma, addr, pmd);
+ spin_unlock(ptl);
+ mm_dec_nr_ptes(mm);
+ pte_free(mm, pmd_pgtable(_pmd));
+ }
+ mmap_write_unlock(mm);
} else {
/* Try again later */
- khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
+ khugepaged_add_pte_mapped_thp(mm, addr);
}
}
i_mmap_unlock_write(mapping);
To support long running hypercalls, xen_maybe_preempt_hcall() calls
cond_resched() in case a hypercall marked as preemptible has been
interrupted.
Normally this is no problem, as only hypercalls done via some ioctl()s
are marked as preemptible. In rare cases, when an interrupt occurs during
such a preemptible hypercall and softirq processing is started from
irq_exit(), a further hypercall issued by the softirq handler will be
regarded as preemptible, too. This might lead to rescheduling even though
the softirq handler may have called preempt_disable(), leading to splats
like:
BUG: sleeping function called from invalid context at drivers/xen/preempt.c:37
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 20775, name: xl
INFO: lockdep is turned off.
CPU: 1 PID: 20775 Comm: xl Tainted: G D W 5.4.46-1_prgmr_debug.el7.x86_64 #1
Call Trace:
<IRQ>
dump_stack+0x8f/0xd0
___might_sleep.cold.76+0xb2/0x103
xen_maybe_preempt_hcall+0x48/0x70
xen_do_hypervisor_callback+0x37/0x40
RIP: e030:xen_hypercall_xen_version+0xa/0x20
Code: ...
RSP: e02b:ffffc900400dcc30 EFLAGS: 00000246
RAX: 000000000004000d RBX: 0000000000000200 RCX: ffffffff8100122a
RDX: ffff88812e788000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffffff83ee3ad0 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: ffff8881824aa0b0
R13: 0000000865496000 R14: 0000000865496000 R15: ffff88815d040000
? xen_hypercall_xen_version+0xa/0x20
? xen_force_evtchn_callback+0x9/0x10
? check_events+0x12/0x20
? xen_restore_fl_direct+0x1f/0x20
? _raw_spin_unlock_irqrestore+0x53/0x60
? debug_dma_sync_single_for_cpu+0x91/0xc0
? _raw_spin_unlock_irqrestore+0x53/0x60
? xen_swiotlb_sync_single_for_cpu+0x3d/0x140
? mlx4_en_process_rx_cq+0x6b6/0x1110 [mlx4_en]
? mlx4_en_poll_rx_cq+0x64/0x100 [mlx4_en]
? net_rx_action+0x151/0x4a0
? __do_softirq+0xed/0x55b
? irq_exit+0xea/0x100
? xen_evtchn_do_upcall+0x2c/0x40
? xen_do_hypervisor_callback+0x29/0x40
</IRQ>
? xen_hypercall_domctl+0xa/0x20
? xen_hypercall_domctl+0x8/0x20
? privcmd_ioctl+0x221/0x990 [xen_privcmd]
? do_vfs_ioctl+0xa5/0x6f0
? ksys_ioctl+0x60/0x90
? trace_hardirqs_off_thunk+0x1a/0x20
? __x64_sys_ioctl+0x16/0x20
? do_syscall_64+0x62/0x250
? entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fix that by testing preempt_count() before calling cond_resched().
In kernel 5.8 this can't happen any more due to the entry code rework.
Reported-by: Sarah Newman <srn(a)prgmr.com>
Fixes: 0fa2f5cb2b0ecd8 ("sched/preempt, xen: Use need_resched() instead of should_resched()")
Cc: Sarah Newman <srn(a)prgmr.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
drivers/xen/preempt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/xen/preempt.c b/drivers/xen/preempt.c
index 17240c5325a3..6ad87b5c95ed 100644
--- a/drivers/xen/preempt.c
+++ b/drivers/xen/preempt.c
@@ -27,7 +27,7 @@ EXPORT_SYMBOL_GPL(xen_in_preemptible_hcall);
asmlinkage __visible void xen_maybe_preempt_hcall(void)
{
if (unlikely(__this_cpu_read(xen_in_preemptible_hcall)
- && need_resched())) {
+ && need_resched() && !preempt_count())) {
/*
* Clear flag as we may be rescheduled on a different
* cpu.
--
2.26.2
LLVM recently implemented a "libcall optimization" that lowers calls to
`sprintf(dest, "%s", str)`, where the return value is used, to
`stpcpy(dest, str) - dest`. This generally avoids the machinery involved
in parsing format strings. This optimization was introduced into
clang-12. Because the kernel does not provide an implementation of
stpcpy, we observe linkage failures for almost all targets when building
with ToT clang.
The interface is unsafe as it does not perform any bounds checking.
Disable this "libcall optimization" via `-fno-builtin-stpcpy`.
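For context, stpcpy() has the semantics sketched below. This is a hand-written
illustration of the C library interface the compiler expects, not the kernel's
implementation: it copies src, including the terminating NUL, into dest and
returns a pointer to the end of the copied string, with no bounds checking,
which is what makes the lowered expression equivalent to sprintf()'s return
value.
	/* Illustrative sketch only -- not kernel code. */
	static char *stpcpy_sketch(char *dest, const char *src)
	{
		while ((*dest = *src) != '\0') {
			dest++;
			src++;
		}
		return dest;	/* points at the NUL terminator in dest */
	}

	/*
	 * The optimization rewrites
	 *	len = sprintf(dest, "%s", str);
	 * into
	 *	len = stpcpy(dest, str) - dest;
	 * which fails to link because the kernel provides no stpcpy symbol.
	 */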
Cc: stable(a)vger.kernel.org # 4.4
Link: https://bugs.llvm.org/show_bug.cgi?id=47162
Link: https://github.com/ClangBuiltLinux/linux/issues/1126
Link: https://reviews.llvm.org/D85963
Reported-by: Sami Tolvanen <samitolvanen(a)google.com>
Suggested-by: Dávid Bolvanský <david.bolvansky(a)gmail.com>
Suggested-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
---
Makefile | 1 +
1 file changed, 1 insertion(+)
diff --git a/Makefile b/Makefile
index 9cac6fde3479..e523dc8d30e0 100644
--- a/Makefile
+++ b/Makefile
@@ -578,6 +578,7 @@ ifneq ($(LLVM_IAS),1)
CLANG_FLAGS += -no-integrated-as
endif
CLANG_FLAGS += -Werror=unknown-warning-option
+CLANG_FLAGS += -fno-builtin-stpcpy
KBUILD_CFLAGS += $(CLANG_FLAGS)
KBUILD_AFLAGS += $(CLANG_FLAGS)
export CLANG_FLAGS
--
2.28.0.297.g1956fa8f8d-goog
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 18e77600f7a1ed69f8ce46c9e11cad0985712dfa Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Thu, 6 Aug 2020 23:26:22 -0700
Subject: [PATCH] khugepaged: retract_page_tables() remember to test exit
Only once have I seen this scenario (and forgot even to notice what forced
the eventual crash): a sequence of "BUG: Bad page map" alerts from
vm_normal_page(), from zap_pte_range() servicing exit_mmap();
pmd:00000000, pte values corresponding to data in physical page 0.
The pte mappings being zapped in this case were supposed to be from a huge
page of ext4 text (but could as well have been shmem): my belief is that
it was racing with collapse_file()'s retract_page_tables(), found *pmd
pointing to a page table, locked it, but *pmd had become 0 by the time
start_pte was decided.
In most cases, that possibility is excluded by holding mmap lock; but
exit_mmap() proceeds without mmap lock. Most of what's run by khugepaged
checks khugepaged_test_exit() after acquiring mmap lock:
khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
for example. But retract_page_tables() did not: fix that.
The fix is for retract_page_tables() to check khugepaged_test_exit(),
after acquiring mmap lock, before doing anything to the page table.
Getting the mmap lock serializes with __mmput(), which briefly takes and
drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
mm_users makes sure we don't touch the page table once exit_mmap() might
reach it, since exit_mmap() will be proceeding without mmap lock, not
expecting anyone to be racing with it.
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021215400.27773@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a9aca9b71d6f..ac04b332a373 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1532,6 +1532,7 @@ static int khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
{
struct vm_area_struct *vma;
+ struct mm_struct *mm;
unsigned long addr;
pmd_t *pmd, _pmd;
@@ -1560,7 +1561,8 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
continue;
if (vma->vm_end < addr + HPAGE_PMD_SIZE)
continue;
- pmd = mm_find_pmd(vma->vm_mm, addr);
+ mm = vma->vm_mm;
+ pmd = mm_find_pmd(mm, addr);
if (!pmd)
continue;
/*
@@ -1570,17 +1572,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* mmap_lock while holding page lock. Fault path does it in
* reverse order. Trylock is a way to avoid deadlock.
*/
- if (mmap_write_trylock(vma->vm_mm)) {
- spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);
- /* assume page table is clear */
- _pmd = pmdp_collapse_flush(vma, addr, pmd);
- spin_unlock(ptl);
- mmap_write_unlock(vma->vm_mm);
- mm_dec_nr_ptes(vma->vm_mm);
- pte_free(vma->vm_mm, pmd_pgtable(_pmd));
+ if (mmap_write_trylock(mm)) {
+ if (!khugepaged_test_exit(mm)) {
+ spinlock_t *ptl = pmd_lock(mm, pmd);
+ /* assume page table is clear */
+ _pmd = pmdp_collapse_flush(vma, addr, pmd);
+ spin_unlock(ptl);
+ mm_dec_nr_ptes(mm);
+ pte_free(mm, pmd_pgtable(_pmd));
+ }
+ mmap_write_unlock(mm);
} else {
/* Try again later */
- khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
+ khugepaged_add_pte_mapped_thp(mm, addr);
}
}
i_mmap_unlock_write(mapping);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 18e77600f7a1ed69f8ce46c9e11cad0985712dfa Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Thu, 6 Aug 2020 23:26:22 -0700
Subject: [PATCH] khugepaged: retract_page_tables() remember to test exit
Only once have I seen this scenario (and forgot even to notice what forced
the eventual crash): a sequence of "BUG: Bad page map" alerts from
vm_normal_page(), from zap_pte_range() servicing exit_mmap();
pmd:00000000, pte values corresponding to data in physical page 0.
The pte mappings being zapped in this case were supposed to be from a huge
page of ext4 text (but could as well have been shmem): my belief is that
it was racing with collapse_file()'s retract_page_tables(), found *pmd
pointing to a page table, locked it, but *pmd had become 0 by the time
start_pte was decided.
In most cases, that possibility is excluded by holding mmap lock; but
exit_mmap() proceeds without mmap lock. Most of what's run by khugepaged
checks khugepaged_test_exit() after acquiring mmap lock:
khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
for example. But retract_page_tables() did not: fix that.
The fix is for retract_page_tables() to check khugepaged_test_exit(),
after acquiring mmap lock, before doing anything to the page table.
Getting the mmap lock serializes with __mmput(), which briefly takes and
drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
mm_users makes sure we don't touch the page table once exit_mmap() might
reach it, since exit_mmap() will be proceeding without mmap lock, not
expecting anyone to be racing with it.
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021215400.27773@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a9aca9b71d6f..ac04b332a373 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1532,6 +1532,7 @@ static int khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
{
struct vm_area_struct *vma;
+ struct mm_struct *mm;
unsigned long addr;
pmd_t *pmd, _pmd;
@@ -1560,7 +1561,8 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
continue;
if (vma->vm_end < addr + HPAGE_PMD_SIZE)
continue;
- pmd = mm_find_pmd(vma->vm_mm, addr);
+ mm = vma->vm_mm;
+ pmd = mm_find_pmd(mm, addr);
if (!pmd)
continue;
/*
@@ -1570,17 +1572,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* mmap_lock while holding page lock. Fault path does it in
* reverse order. Trylock is a way to avoid deadlock.
*/
- if (mmap_write_trylock(vma->vm_mm)) {
- spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);
- /* assume page table is clear */
- _pmd = pmdp_collapse_flush(vma, addr, pmd);
- spin_unlock(ptl);
- mmap_write_unlock(vma->vm_mm);
- mm_dec_nr_ptes(vma->vm_mm);
- pte_free(vma->vm_mm, pmd_pgtable(_pmd));
+ if (mmap_write_trylock(mm)) {
+ if (!khugepaged_test_exit(mm)) {
+ spinlock_t *ptl = pmd_lock(mm, pmd);
+ /* assume page table is clear */
+ _pmd = pmdp_collapse_flush(vma, addr, pmd);
+ spin_unlock(ptl);
+ mm_dec_nr_ptes(mm);
+ pte_free(mm, pmd_pgtable(_pmd));
+ }
+ mmap_write_unlock(mm);
} else {
/* Try again later */
- khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
+ khugepaged_add_pte_mapped_thp(mm, addr);
}
}
i_mmap_unlock_write(mapping);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 18e77600f7a1ed69f8ce46c9e11cad0985712dfa Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Thu, 6 Aug 2020 23:26:22 -0700
Subject: [PATCH] khugepaged: retract_page_tables() remember to test exit
Only once have I seen this scenario (and forgot even to notice what forced
the eventual crash): a sequence of "BUG: Bad page map" alerts from
vm_normal_page(), from zap_pte_range() servicing exit_mmap();
pmd:00000000, pte values corresponding to data in physical page 0.
The pte mappings being zapped in this case were supposed to be from a huge
page of ext4 text (but could as well have been shmem): my belief is that
it was racing with collapse_file()'s retract_page_tables(), found *pmd
pointing to a page table, locked it, but *pmd had become 0 by the time
start_pte was decided.
In most cases, that possibility is excluded by holding mmap lock; but
exit_mmap() proceeds without mmap lock. Most of what's run by khugepaged
checks khugepaged_test_exit() after acquiring mmap lock:
khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
for example. But retract_page_tables() did not: fix that.
The fix is for retract_page_tables() to check khugepaged_test_exit(),
after acquiring mmap lock, before doing anything to the page table.
Getting the mmap lock serializes with __mmput(), which briefly takes and
drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
mm_users makes sure we don't touch the page table once exit_mmap() might
reach it, since exit_mmap() will be proceeding without mmap lock, not
expecting anyone to be racing with it.
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021215400.27773@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a9aca9b71d6f..ac04b332a373 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1532,6 +1532,7 @@ static int khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
{
struct vm_area_struct *vma;
+ struct mm_struct *mm;
unsigned long addr;
pmd_t *pmd, _pmd;
@@ -1560,7 +1561,8 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
continue;
if (vma->vm_end < addr + HPAGE_PMD_SIZE)
continue;
- pmd = mm_find_pmd(vma->vm_mm, addr);
+ mm = vma->vm_mm;
+ pmd = mm_find_pmd(mm, addr);
if (!pmd)
continue;
/*
@@ -1570,17 +1572,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* mmap_lock while holding page lock. Fault path does it in
* reverse order. Trylock is a way to avoid deadlock.
*/
- if (mmap_write_trylock(vma->vm_mm)) {
- spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);
- /* assume page table is clear */
- _pmd = pmdp_collapse_flush(vma, addr, pmd);
- spin_unlock(ptl);
- mmap_write_unlock(vma->vm_mm);
- mm_dec_nr_ptes(vma->vm_mm);
- pte_free(vma->vm_mm, pmd_pgtable(_pmd));
+ if (mmap_write_trylock(mm)) {
+ if (!khugepaged_test_exit(mm)) {
+ spinlock_t *ptl = pmd_lock(mm, pmd);
+ /* assume page table is clear */
+ _pmd = pmdp_collapse_flush(vma, addr, pmd);
+ spin_unlock(ptl);
+ mm_dec_nr_ptes(mm);
+ pte_free(mm, pmd_pgtable(_pmd));
+ }
+ mmap_write_unlock(mm);
} else {
/* Try again later */
- khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
+ khugepaged_add_pte_mapped_thp(mm, addr);
}
}
i_mmap_unlock_write(mapping);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 18e77600f7a1ed69f8ce46c9e11cad0985712dfa Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Thu, 6 Aug 2020 23:26:22 -0700
Subject: [PATCH] khugepaged: retract_page_tables() remember to test exit
Only once have I seen this scenario (and forgot even to notice what forced
the eventual crash): a sequence of "BUG: Bad page map" alerts from
vm_normal_page(), from zap_pte_range() servicing exit_mmap();
pmd:00000000, pte values corresponding to data in physical page 0.
The pte mappings being zapped in this case were supposed to be from a huge
page of ext4 text (but could as well have been shmem): my belief is that
it was racing with collapse_file()'s retract_page_tables(), found *pmd
pointing to a page table, locked it, but *pmd had become 0 by the time
start_pte was decided.
In most cases, that possibility is excluded by holding mmap lock; but
exit_mmap() proceeds without mmap lock. Most of what's run by khugepaged
checks khugepaged_test_exit() after acquiring mmap lock:
khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
for example. But retract_page_tables() did not: fix that.
The fix is for retract_page_tables() to check khugepaged_test_exit(),
after acquiring mmap lock, before doing anything to the page table.
Getting the mmap lock serializes with __mmput(), which briefly takes and
drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
mm_users makes sure we don't touch the page table once exit_mmap() might
reach it, since exit_mmap() will be proceeding without mmap lock, not
expecting anyone to be racing with it.
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021215400.27773@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a9aca9b71d6f..ac04b332a373 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1532,6 +1532,7 @@ static int khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
{
struct vm_area_struct *vma;
+ struct mm_struct *mm;
unsigned long addr;
pmd_t *pmd, _pmd;
@@ -1560,7 +1561,8 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
continue;
if (vma->vm_end < addr + HPAGE_PMD_SIZE)
continue;
- pmd = mm_find_pmd(vma->vm_mm, addr);
+ mm = vma->vm_mm;
+ pmd = mm_find_pmd(mm, addr);
if (!pmd)
continue;
/*
@@ -1570,17 +1572,19 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* mmap_lock while holding page lock. Fault path does it in
* reverse order. Trylock is a way to avoid deadlock.
*/
- if (mmap_write_trylock(vma->vm_mm)) {
- spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);
- /* assume page table is clear */
- _pmd = pmdp_collapse_flush(vma, addr, pmd);
- spin_unlock(ptl);
- mmap_write_unlock(vma->vm_mm);
- mm_dec_nr_ptes(vma->vm_mm);
- pte_free(vma->vm_mm, pmd_pgtable(_pmd));
+ if (mmap_write_trylock(mm)) {
+ if (!khugepaged_test_exit(mm)) {
+ spinlock_t *ptl = pmd_lock(mm, pmd);
+ /* assume page table is clear */
+ _pmd = pmdp_collapse_flush(vma, addr, pmd);
+ spin_unlock(ptl);
+ mm_dec_nr_ptes(mm);
+ pte_free(mm, pmd_pgtable(_pmd));
+ }
+ mmap_write_unlock(mm);
} else {
/* Try again later */
- khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
+ khugepaged_add_pte_mapped_thp(mm, addr);
}
}
i_mmap_unlock_write(mapping);
If the value of the link register is not correct (tail call from asm
that didn't set it, stack corruption, memory no longer mapped), then
using it for an address calculation may trigger an exception. Without a
fixup handler, this leads to a panic, whose unwind triggers the fault again,
repeatedly, in an infinite loop.
We don't observe such failures currently, but we have in the past. Just to
be safe, add a fixup handler here so that at least we don't end up in an
infinite loop.
Cc: stable(a)vger.kernel.org
Fixes: 6dc5fd93b2f1 ("ARM: 8900/1: UNWINDER_FRAME_POINTER implementation for Clang")
Reported-by: Miles Chen <miles.chen(a)mediatek.com>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
---
arch/arm/lib/backtrace-clang.S | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/arm/lib/backtrace-clang.S b/arch/arm/lib/backtrace-clang.S
index 5388ac664c12..40eb2215eaf4 100644
--- a/arch/arm/lib/backtrace-clang.S
+++ b/arch/arm/lib/backtrace-clang.S
@@ -146,7 +146,7 @@ for_each_frame: tst frame, mask @ Check for address exceptions
tst sv_lr, #0 @ If there's no previous lr,
beq finished_setup @ we're done.
- ldr r0, [sv_lr, #-4] @ get call instruction
+prev_call: ldr r0, [sv_lr, #-4] @ get call instruction
ldr r3, .Lopcode+4
and r2, r3, r0 @ is this a bl call
teq r2, r3
@@ -206,6 +206,13 @@ finished_setup:
mov r2, frame
bl printk
no_frame: ldmfd sp!, {r4 - r9, fp, pc}
+/*
+ * Accessing the address pointed to by the link register triggered an
+ * exception, don't try to unwind through it.
+ */
+bad_lr: mov sv_fp, #0
+ mov sv_lr, #0
+ b finished_setup
ENDPROC(c_backtrace)
.pushsection __ex_table,"a"
.align 3
@@ -214,6 +221,7 @@ ENDPROC(c_backtrace)
.long 1003b, 1006b
.long 1004b, 1006b
.long 1005b, 1006b
+ .long prev_call, bad_lr
.popsection
.Lbad: .asciz "%sBacktrace aborted due to bad frame pointer <%p>\n"
--
2.28.0.163.g6104cc2f0b6-goog
An earlier commit:
b7db41c9e03b ("io_uring: fix regression with always ignoring signals in io_cqring_wait()")
ensured that we didn't get stuck waiting for eventfd reads when the eventfd
is registered with the io_uring ring for event notification, but we still
have a gap where the task can be waiting on other events in the kernel and
needs a bigger nudge to make forward progress.
Ensure that we use signaled notifications for a task that isn't currently
running, to be certain the work is seen and processed immediately.
Cc: stable(a)vger.kernel.org # v5.7+
Reported-by: Josef <josef.grieb(a)gmail.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
fs/io_uring.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index e9b27cdaa735..443eecdfeda9 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1712,21 +1712,27 @@ static int io_req_task_work_add(struct io_kiocb *req, struct callback_head *cb)
struct io_ring_ctx *ctx = req->ctx;
int ret, notify = TWA_RESUME;
+ ret = __task_work_add(tsk, cb);
+ if (unlikely(ret))
+ return ret;
+
/*
* SQPOLL kernel thread doesn't need notification, just a wakeup.
- * If we're not using an eventfd, then TWA_RESUME is always fine,
- * as we won't have dependencies between request completions for
- * other kernel wait conditions.
+ * For any other work, use signaled wakeups if the task isn't
+ * running to avoid dependencies between tasks or threads. If
+ * the issuing task is currently waiting in the kernel on a thread,
+ * and same thread is waiting for a completion event, then we need
+ * to ensure that the issuing task processes task_work. TWA_SIGNAL
+ * is needed for that.
*/
if (ctx->flags & IORING_SETUP_SQPOLL)
notify = 0;
- else if (ctx->cq_ev_fd)
+ else if (READ_ONCE(tsk->state) != TASK_RUNNING)
notify = TWA_SIGNAL;
- ret = task_work_add(tsk, cb, notify);
- if (!ret)
- wake_up_process(tsk);
- return ret;
+ __task_work_notify(tsk, notify);
+ wake_up_process(tsk);
+ return 0;
}
static void __io_req_task_cancel(struct io_kiocb *req, int error)
--
2.28.0
Hi
[This is an automated email]
This commit has been processed because it contains a "Fixes:" tag
fixing commit: b189817cf789 ("crypto: caam/qi - add ablkcipher and authenc algorithms").
The bot has tested the following trees: v5.8.1, v5.7.15, v5.4.58, v4.19.139, v4.14.193.
v5.8.1: Failed to apply! Possible dependencies:
297142490236 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
528f776df67c ("crypto: qat - allow xts requests not multiple of block")
a85211f36f3d ("crypto: qat - fallback for xts with 192 bit keys")
b185a68710e0 ("crypto: qat - validate xts key")
b8aa7dc5c753 ("crypto: drivers - set the flag CRYPTO_ALG_ALLOCATES_MEMORY")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
v5.7.15: Failed to apply! Possible dependencies:
297142490236 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
528f776df67c ("crypto: qat - allow xts requests not multiple of block")
a85211f36f3d ("crypto: qat - fallback for xts with 192 bit keys")
b185a68710e0 ("crypto: qat - validate xts key")
b8aa7dc5c753 ("crypto: drivers - set the flag CRYPTO_ALG_ALLOCATES_MEMORY")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
v5.4.58: Failed to apply! Possible dependencies:
297142490236 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
64db5e7439fb ("crypto: sparc/aes - convert to skcipher API")
66d7fb94e4ff ("crypto: blake2s - generic C library implementation and selftest")
674f368a952c ("crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN")
746b2e024c67 ("crypto: lib - tidy up lib/crypto Kconfig and Makefile")
7988fb2c03c8 ("crypto: s390/aes - convert to skcipher API")
7f725f41f627 ("crypto: powerpc - convert SPE AES algorithms to skcipher API")
7f9b0880925f ("crypto: blake2s - implement generic shash driver")
91d689337fe8 ("crypto: blake2b - add blake2b generic implementation")
b4d0c0aad57a ("crypto: arm - use Kconfig based compiler checks for crypto opcodes")
b95bba5d0114 ("crypto: skcipher - rename the crypto_blkcipher module and kconfig option")
d00c06398154 ("crypto: s390/paes - convert to skcipher API")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
ed0356eda153 ("crypto: blake2s - x86_64 SIMD implementation")
v4.19.139: Failed to apply! Possible dependencies:
0a5dff9882e5 ("crypto: arm/ghash - provide a synchronous version")
1ca1b917940c ("crypto: chacha20-generic - refactor to allow varying number of rounds")
297142490236 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
5ca7badb1f62 ("crypto: caam/jr - ablkcipher -> skcipher conversion")
674f368a952c ("crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN")
8a5a79d5556b ("crypto: x86/chacha20 - Add a 4-block AVX2 variant")
99680c5e9182 ("crypto: arm - convert to use crypto_simd_usable()")
9b17608f15b9 ("crypto: x86/chacha20 - Use larger block functions more aggressively")
9dbe3072c6b1 ("crypto: caam/qi - ablkcipher -> skcipher conversion")
a5dd97f86211 ("crypto: x86/chacha20 - Add a 2-block AVX2 variant")
aec48adce85d ("crypto: caam/qi - remove ablkcipher IV generation")
c3b734dd325d ("crypto: x86/chacha20 - Support partial lengths in 8-block AVX2 variant")
cf5448b5c3d8 ("crypto: caam/jr - remove ablkcipher IV generation")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
db8e15a24957 ("crypto: x86/chacha20 - Support partial lengths in 4-block SSSE3 variant")
e4e72063d3c0 ("crypto: x86/chacha20 - Support partial lengths in 1-block SSSE3 variant")
v4.14.193: Failed to apply! Possible dependencies:
297142490236 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
5ca7badb1f62 ("crypto: caam/jr - ablkcipher -> skcipher conversion")
662f70ede597 ("crypto: caam - remove needless ablkcipher key copy")
7e0880b9fbbe ("crypto: caam - add Derived Key Protocol (DKP) support")
9dbe3072c6b1 ("crypto: caam/qi - ablkcipher -> skcipher conversion")
cf5448b5c3d8 ("crypto: caam/jr - remove ablkcipher IV generation")
NOTE: The patch will not be queued to stable trees until it is upstream.
How should we proceed with this patch?
--
Thanks
Sasha
Hi
[This is an automated email]
This commit has been processed because it contains a "Fixes:" tag
fixing commit: b189817cf789 ("crypto: caam/qi - add ablkcipher and authenc algorithms").
The bot has tested the following trees: v5.8.1, v5.7.15, v5.4.58, v4.19.139, v4.14.193.
v5.8.1: Failed to apply! Possible dependencies:
528f776df67c ("crypto: qat - allow xts requests not multiple of block")
6359e4e1bca8 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
a85211f36f3d ("crypto: qat - fallback for xts with 192 bit keys")
b185a68710e0 ("crypto: qat - validate xts key")
b8aa7dc5c753 ("crypto: drivers - set the flag CRYPTO_ALG_ALLOCATES_MEMORY")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
v5.7.15: Failed to apply! Possible dependencies:
528f776df67c ("crypto: qat - allow xts requests not multiple of block")
6359e4e1bca8 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
a85211f36f3d ("crypto: qat - fallback for xts with 192 bit keys")
b185a68710e0 ("crypto: qat - validate xts key")
b8aa7dc5c753 ("crypto: drivers - set the flag CRYPTO_ALG_ALLOCATES_MEMORY")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
v5.4.58: Failed to apply! Possible dependencies:
6359e4e1bca8 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
64db5e7439fb ("crypto: sparc/aes - convert to skcipher API")
66d7fb94e4ff ("crypto: blake2s - generic C library implementation and selftest")
674f368a952c ("crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN")
746b2e024c67 ("crypto: lib - tidy up lib/crypto Kconfig and Makefile")
7988fb2c03c8 ("crypto: s390/aes - convert to skcipher API")
7f725f41f627 ("crypto: powerpc - convert SPE AES algorithms to skcipher API")
7f9b0880925f ("crypto: blake2s - implement generic shash driver")
91d689337fe8 ("crypto: blake2b - add blake2b generic implementation")
b4d0c0aad57a ("crypto: arm - use Kconfig based compiler checks for crypto opcodes")
b95bba5d0114 ("crypto: skcipher - rename the crypto_blkcipher module and kconfig option")
d00c06398154 ("crypto: s390/paes - convert to skcipher API")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
ed0356eda153 ("crypto: blake2s - x86_64 SIMD implementation")
v4.19.139: Failed to apply! Possible dependencies:
0a5dff9882e5 ("crypto: arm/ghash - provide a synchronous version")
1ca1b917940c ("crypto: chacha20-generic - refactor to allow varying number of rounds")
5ca7badb1f62 ("crypto: caam/jr - ablkcipher -> skcipher conversion")
6359e4e1bca8 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
674f368a952c ("crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN")
8a5a79d5556b ("crypto: x86/chacha20 - Add a 4-block AVX2 variant")
99680c5e9182 ("crypto: arm - convert to use crypto_simd_usable()")
9b17608f15b9 ("crypto: x86/chacha20 - Use larger block functions more aggressively")
9dbe3072c6b1 ("crypto: caam/qi - ablkcipher -> skcipher conversion")
a5dd97f86211 ("crypto: x86/chacha20 - Add a 2-block AVX2 variant")
aec48adce85d ("crypto: caam/qi - remove ablkcipher IV generation")
c3b734dd325d ("crypto: x86/chacha20 - Support partial lengths in 8-block AVX2 variant")
cf5448b5c3d8 ("crypto: caam/jr - remove ablkcipher IV generation")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
db8e15a24957 ("crypto: x86/chacha20 - Support partial lengths in 4-block SSSE3 variant")
e4e72063d3c0 ("crypto: x86/chacha20 - Support partial lengths in 1-block SSSE3 variant")
v4.14.193: Failed to apply! Possible dependencies:
5ca7badb1f62 ("crypto: caam/jr - ablkcipher -> skcipher conversion")
6359e4e1bca8 ("crypto: caam/qi - add fallback for XTS with more than 8B IV")
662f70ede597 ("crypto: caam - remove needless ablkcipher key copy")
7e0880b9fbbe ("crypto: caam - add Derived Key Protocol (DKP) support")
9dbe3072c6b1 ("crypto: caam/qi - ablkcipher -> skcipher conversion")
cf5448b5c3d8 ("crypto: caam/jr - remove ablkcipher IV generation")
NOTE: The patch will not be queued to stable trees until it is upstream.
How should we proceed with this patch?
--
Thanks
Sasha
Hi
[This is an automated email]
This commit has been processed because it contains a "Fixes:" tag
fixing commit: 226853ac3ebe ("crypto: caam/qi2 - add skcipher algorithms").
The bot has tested the following trees: v5.8.1, v5.7.15, v5.4.58.
v5.8.1: Failed to apply! Possible dependencies:
528f776df67c ("crypto: qat - allow xts requests not multiple of block")
a85211f36f3d ("crypto: qat - fallback for xts with 192 bit keys")
b185a68710e0 ("crypto: qat - validate xts key")
b8aa7dc5c753 ("crypto: drivers - set the flag CRYPTO_ALG_ALLOCATES_MEMORY")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
f9665aa69c99 ("crypto: caam/qi2 - add fallback for XTS with more than 8B IV")
v5.7.15: Failed to apply! Possible dependencies:
528f776df67c ("crypto: qat - allow xts requests not multiple of block")
a85211f36f3d ("crypto: qat - fallback for xts with 192 bit keys")
b185a68710e0 ("crypto: qat - validate xts key")
b8aa7dc5c753 ("crypto: drivers - set the flag CRYPTO_ALG_ALLOCATES_MEMORY")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
f9665aa69c99 ("crypto: caam/qi2 - add fallback for XTS with more than 8B IV")
v5.4.58: Failed to apply! Possible dependencies:
64db5e7439fb ("crypto: sparc/aes - convert to skcipher API")
66d7fb94e4ff ("crypto: blake2s - generic C library implementation and selftest")
674f368a952c ("crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN")
746b2e024c67 ("crypto: lib - tidy up lib/crypto Kconfig and Makefile")
7988fb2c03c8 ("crypto: s390/aes - convert to skcipher API")
7f725f41f627 ("crypto: powerpc - convert SPE AES algorithms to skcipher API")
7f9b0880925f ("crypto: blake2s - implement generic shash driver")
91d689337fe8 ("crypto: blake2b - add blake2b generic implementation")
b4d0c0aad57a ("crypto: arm - use Kconfig based compiler checks for crypto opcodes")
b95bba5d0114 ("crypto: skcipher - rename the crypto_blkcipher module and kconfig option")
d00c06398154 ("crypto: s390/paes - convert to skcipher API")
da6a66853a38 ("crypto: caam - silence .setkey in case of bad key length")
ed0356eda153 ("crypto: blake2s - x86_64 SIMD implementation")
f9665aa69c99 ("crypto: caam/qi2 - add fallback for XTS with more than 8B IV")
NOTE: The patch will not be queued to stable trees until it is upstream.
How should we proceed with this patch?
--
Thanks
Sasha