The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f85086f95fa36194eb0db5cd5c12e56801b98523 Mon Sep 17 00:00:00 2001
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Date: Fri, 25 Sep 2020 21:19:31 -0700
Subject: [PATCH] mm: don't rely on system state to detect hot-plug operations
In register_mem_sect_under_node() the system_state's value is checked to
detect whether the call is made during boot time or during an hot-plug
operation. Unfortunately, that check against SYSTEM_BOOTING is wrong
because regular memory is registered at SYSTEM_SCHEDULING state. In
addition, memory hot-plug operation can be triggered at this system
state by the ACPI [1]. So checking against the system state is not
enough.
The consequence is that on system with interleaved node's ranges like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
This can be seen on PowerPC LPAR after multiple memory hot-plug and
hot-unplug operations are done. At the next reboot the node's memory
ranges can be interleaved and since the call to link_mem_sections() is
made in topology_init() while the system is in the SYSTEM_SCHEDULING
state, the node's id is not checked, and the sections registered to
multiple nodes:
$ ls -l /sys/devices/system/memory/memory21/node*
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
In that case, the system is able to boot but if later one of theses
memory blocks is hot-unplugged and then hot-plugged, the sysfs
inconsistency is detected and this is triggering a BUG_ON():
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
Call Trace:
add_memory_resource+0x23c/0x340 (unreliable)
__add_memory+0x5c/0xf0
dlpar_add_lmb+0x1b4/0x500
dlpar_memory+0x1f8/0xb80
handle_dlpar_errorlog+0xc0/0x190
dlpar_store+0x198/0x4a0
kobj_attr_store+0x30/0x50
sysfs_kf_write+0x64/0x90
kernfs_fop_write+0x1b0/0x290
vfs_write+0xe8/0x290
ksys_write+0xdc/0x130
system_call_exception+0x160/0x270
system_call_common+0xf0/0x27c
This patch addresses the root cause by not relying on the system_state
value to detect whether the call is due to a hot-plug operation. An
extra parameter is added to link_mem_sections() detailing whether the
operation is due to a hot-plug operation.
[1] According to Oscar Salvador, using this qemu command line, ACPI
memory hotplug operations are raised at SYSTEM_SCHEDULING state:
$QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \
-m size=$MEM,slots=255,maxmem=4294967296k \
-numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \
-object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \
-object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \
-object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \
-object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \
-object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \
-object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \
-object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \
Fixes: 4fbce633910e ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()")
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 508b80f6329b..50af16e68d98 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -761,14 +761,36 @@ static int __ref get_nid_for_pfn(unsigned long pfn)
return pfn_to_nid(pfn);
}
+static int do_register_memory_block_under_node(int nid,
+ struct memory_block *mem_blk)
+{
+ int ret;
+
+ /*
+ * If this memory block spans multiple nodes, we only indicate
+ * the last processed node.
+ */
+ mem_blk->nid = nid;
+
+ ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
+ &mem_blk->dev.kobj,
+ kobject_name(&mem_blk->dev.kobj));
+ if (ret)
+ return ret;
+
+ return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
+ &node_devices[nid]->dev.kobj,
+ kobject_name(&node_devices[nid]->dev.kobj));
+}
+
/* register memory section under specified node if it spans that node */
-static int register_mem_sect_under_node(struct memory_block *mem_blk,
- void *arg)
+static int register_mem_block_under_node_early(struct memory_block *mem_blk,
+ void *arg)
{
unsigned long memory_block_pfns = memory_block_size_bytes() / PAGE_SIZE;
unsigned long start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
unsigned long end_pfn = start_pfn + memory_block_pfns - 1;
- int ret, nid = *(int *)arg;
+ int nid = *(int *)arg;
unsigned long pfn;
for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
@@ -785,38 +807,33 @@ static int register_mem_sect_under_node(struct memory_block *mem_blk,
}
/*
- * We need to check if page belongs to nid only for the boot
- * case, during hotplug we know that all pages in the memory
- * block belong to the same node.
- */
- if (system_state == SYSTEM_BOOTING) {
- page_nid = get_nid_for_pfn(pfn);
- if (page_nid < 0)
- continue;
- if (page_nid != nid)
- continue;
- }
-
- /*
- * If this memory block spans multiple nodes, we only indicate
- * the last processed node.
+ * We need to check if page belongs to nid only at the boot
+ * case because node's ranges can be interleaved.
*/
- mem_blk->nid = nid;
-
- ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
- &mem_blk->dev.kobj,
- kobject_name(&mem_blk->dev.kobj));
- if (ret)
- return ret;
+ page_nid = get_nid_for_pfn(pfn);
+ if (page_nid < 0)
+ continue;
+ if (page_nid != nid)
+ continue;
- return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
- &node_devices[nid]->dev.kobj,
- kobject_name(&node_devices[nid]->dev.kobj));
+ return do_register_memory_block_under_node(nid, mem_blk);
}
/* mem section does not span the specified node */
return 0;
}
+/*
+ * During hotplug we know that all pages in the memory block belong to the same
+ * node.
+ */
+static int register_mem_block_under_node_hotplug(struct memory_block *mem_blk,
+ void *arg)
+{
+ int nid = *(int *)arg;
+
+ return do_register_memory_block_under_node(nid, mem_blk);
+}
+
/*
* Unregister a memory block device under the node it spans. Memory blocks
* with multiple nodes cannot be offlined and therefore also never be removed.
@@ -832,11 +849,19 @@ void unregister_memory_block_under_nodes(struct memory_block *mem_blk)
kobject_name(&node_devices[mem_blk->nid]->dev.kobj));
}
-int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn)
+int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn,
+ enum meminit_context context)
{
+ walk_memory_blocks_func_t func;
+
+ if (context == MEMINIT_HOTPLUG)
+ func = register_mem_block_under_node_hotplug;
+ else
+ func = register_mem_block_under_node_early;
+
return walk_memory_blocks(PFN_PHYS(start_pfn),
PFN_PHYS(end_pfn - start_pfn), (void *)&nid,
- register_mem_sect_under_node);
+ func);
}
#ifdef CONFIG_HUGETLBFS
diff --git a/include/linux/node.h b/include/linux/node.h
index 4866f32a02d8..014ba3ab2efd 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -99,11 +99,13 @@ extern struct node *node_devices[];
typedef void (*node_registration_func_t)(struct node *);
#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
-extern int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn);
+int link_mem_sections(int nid, unsigned long start_pfn,
+ unsigned long end_pfn,
+ enum meminit_context context);
#else
static inline int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn)
+ unsigned long end_pfn,
+ enum meminit_context context)
{
return 0;
}
@@ -128,7 +130,8 @@ static inline int register_one_node(int nid)
if (error)
return error;
/* link memory sections under this node */
- error = link_mem_sections(nid, start_pfn, end_pfn);
+ error = link_mem_sections(nid, start_pfn, end_pfn,
+ MEMINIT_EARLY);
}
return error;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 345308a8bd15..ce3e73e3a5c1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1080,7 +1080,8 @@ int __ref add_memory_resource(int nid, struct resource *res)
}
/* link memory sections under this node.*/
- ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1));
+ ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1),
+ MEMINIT_HOTPLUG);
BUG_ON(ret);
/* create new memmap entry */
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c1d0da83358a2316d9be7f229f26126dbaa07468 Mon Sep 17 00:00:00 2001
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Date: Fri, 25 Sep 2020 21:19:28 -0700
Subject: [PATCH] mm: replace memmap_context by meminit_context
Patch series "mm: fix memory to node bad links in sysfs", v3.
Sometimes, firmware may expose interleaved memory layout like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
In that case, we can see memory blocks assigned to multiple nodes in
sysfs:
$ ls -l /sys/devices/system/memory/memory21
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
-rw-r--r-- 1 root root 65536 Aug 24 05:27 online
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
-r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
drwxr-xr-x 2 root root 0 Aug 24 05:27 power
-r--r--r-- 1 root root 65536 Aug 24 05:27 removable
-rw-r--r-- 1 root root 65536 Aug 24 05:27 state
lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory
-rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
-r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones
The same applies in the node's directory with a memory21 link in both
the node1 and node2's directory.
This is wrong but doesn't prevent the system to run. However when
later, one of these memory blocks is hot-unplugged and then hot-plugged,
the system is detecting an inconsistency in the sysfs layout and a
BUG_ON() is raised:
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
Call Trace:
add_memory_resource+0x23c/0x340 (unreliable)
__add_memory+0x5c/0xf0
dlpar_add_lmb+0x1b4/0x500
dlpar_memory+0x1f8/0xb80
handle_dlpar_errorlog+0xc0/0x190
dlpar_store+0x198/0x4a0
kobj_attr_store+0x30/0x50
sysfs_kf_write+0x64/0x90
kernfs_fop_write+0x1b0/0x290
vfs_write+0xe8/0x290
ksys_write+0xdc/0x130
system_call_exception+0x160/0x270
system_call_common+0xf0/0x27c
This has been seen on PowerPC LPAR.
The root cause of this issue is that when node's memory is registered,
the range used can overlap another node's range, thus the memory block
is registered to multiple nodes in sysfs.
There are two issues here:
(a) The sysfs memory and node's layouts are broken due to these
multiple links
(b) The link errors in link_mem_sections() should not lead to a system
panic.
To address (a) register_mem_sect_under_node should not rely on the
system state to detect whether the link operation is triggered by a hot
plug operation or not. This is addressed by the patches 1 and 2 of this
series.
Issue (b) will be addressed separately.
This patch (of 2):
The memmap_context enum is used to detect whether a memory operation is
due to a hot-add operation or happening at boot time.
Make it general to the hotplug operation and rename it as
meminit_context.
There is no functional change introduced by this patch
Suggested-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J . Wysocki" <rafael(a)kernel.org>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 0b3fb4c7af29..8e7b8c6c576e 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -538,7 +538,7 @@ virtual_memmap_init(u64 start, u64 end, void *arg)
if (map_start < map_end)
memmap_init_zone((unsigned long)(map_end - map_start),
args->nid, args->zone, page_to_pfn(map_start),
- MEMMAP_EARLY, NULL);
+ MEMINIT_EARLY, NULL);
return 0;
}
@@ -547,8 +547,8 @@ memmap_init (unsigned long size, int nid, unsigned long zone,
unsigned long start_pfn)
{
if (!vmem_map) {
- memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY,
- NULL);
+ memmap_init_zone(size, nid, zone, start_pfn,
+ MEMINIT_EARLY, NULL);
} else {
struct page *start;
struct memmap_init_callback_data args;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b2f370f0b420..48614513eb66 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2416,7 +2416,7 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
extern void set_dma_reserve(unsigned long new_dma_reserve);
extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long,
- enum memmap_context, struct vmem_altmap *);
+ enum meminit_context, struct vmem_altmap *);
extern void setup_per_zone_wmarks(void);
extern int __meminit init_per_zone_wmark_min(void);
extern void mem_init(void);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 8379432f4f2f..0f7a4ff4b059 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -824,10 +824,15 @@ bool zone_watermark_ok(struct zone *z, unsigned int order,
unsigned int alloc_flags);
bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
unsigned long mark, int highest_zoneidx);
-enum memmap_context {
- MEMMAP_EARLY,
- MEMMAP_HOTPLUG,
+/*
+ * Memory initialization context, use to differentiate memory added by
+ * the platform statically or via memory hotplug interface.
+ */
+enum meminit_context {
+ MEMINIT_EARLY,
+ MEMINIT_HOTPLUG,
};
+
extern void init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
unsigned long size);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b11a269e2356..345308a8bd15 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -729,7 +729,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
* are reserved so nobody should be touching them so we should be safe
*/
memmap_init_zone(nr_pages, nid, zone_idx(zone), start_pfn,
- MEMMAP_HOTPLUG, altmap);
+ MEMINIT_HOTPLUG, altmap);
set_zone_contiguous(zone);
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fab5e97dc9ca..5661fa164f13 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5975,7 +5975,7 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
* done. Non-atomic initialization, single-pass.
*/
void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
- unsigned long start_pfn, enum memmap_context context,
+ unsigned long start_pfn, enum meminit_context context,
struct vmem_altmap *altmap)
{
unsigned long pfn, end_pfn = start_pfn + size;
@@ -6007,7 +6007,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
* There can be holes in boot-time mem_map[]s handed to this
* function. They do not exist on hotplugged memory.
*/
- if (context == MEMMAP_EARLY) {
+ if (context == MEMINIT_EARLY) {
if (overlap_memmap_init(zone, &pfn))
continue;
if (defer_init(nid, pfn, end_pfn))
@@ -6016,7 +6016,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
page = pfn_to_page(pfn);
__init_single_page(page, pfn, zone, nid);
- if (context == MEMMAP_HOTPLUG)
+ if (context == MEMINIT_HOTPLUG)
__SetPageReserved(page);
/*
@@ -6099,7 +6099,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
* check here not to call set_pageblock_migratetype() against
* pfn out of zone.
*
- * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
+ * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
* because this is done early in section_activate()
*/
if (!(pfn & (pageblock_nr_pages - 1))) {
@@ -6137,7 +6137,7 @@ void __meminit __weak memmap_init(unsigned long size, int nid,
if (end_pfn > start_pfn) {
size = end_pfn - start_pfn;
memmap_init_zone(size, nid, zone, start_pfn,
- MEMMAP_EARLY, NULL);
+ MEMINIT_EARLY, NULL);
}
}
}
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ee1dfad5325ff1cfb2239e564cd411b3bfe8667a Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer(a)redhat.com>
Date: Mon, 14 Sep 2020 13:04:19 -0400
Subject: [PATCH] dm: fix bio splitting and its bio completion order for
regular IO
dm_queue_split() is removed because __split_and_process_bio() _must_
handle splitting bios to ensure proper bio submission and completion
ordering as a bio is split.
Otherwise, multiple recursive calls to ->submit_bio will cause multiple
split bios to be allocated from the same ->bio_split mempool at the same
time. This would result in deadlock in low memory conditions because no
progress could be made (only one bio is available in ->bio_split
mempool).
This fix has been verified to still fix the loss of performance, due
to excess splitting, that commit 120c9257f5f1 provided.
Fixes: 120c9257f5f1 ("Revert "dm: always call blk_queue_split() in dm_process_bio()"")
Cc: stable(a)vger.kernel.org # 5.0+, requires custom backport due to 5.9 changes
Reported-by: Ming Lei <ming.lei(a)redhat.com>
Signed-off-by: Mike Snitzer <snitzer(a)redhat.com>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 4a40df8af7d3..d948cd522431 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1724,23 +1724,6 @@ static blk_qc_t __process_bio(struct mapped_device *md, struct dm_table *map,
return ret;
}
-static void dm_queue_split(struct mapped_device *md, struct dm_target *ti, struct bio **bio)
-{
- unsigned len, sector_count;
-
- sector_count = bio_sectors(*bio);
- len = min_t(sector_t, max_io_len((*bio)->bi_iter.bi_sector, ti), sector_count);
-
- if (sector_count > len) {
- struct bio *split = bio_split(*bio, len, GFP_NOIO, &md->queue->bio_split);
-
- bio_chain(split, *bio);
- trace_block_split(md->queue, split, (*bio)->bi_iter.bi_sector);
- submit_bio_noacct(*bio);
- *bio = split;
- }
-}
-
static blk_qc_t dm_process_bio(struct mapped_device *md,
struct dm_table *map, struct bio *bio)
{
@@ -1768,14 +1751,12 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
if (current->bio_list) {
if (is_abnormal_io(bio))
blk_queue_split(&bio);
- else
- dm_queue_split(md, ti, &bio);
+ /* regular IO is split by __split_and_process_bio */
}
if (dm_get_md_type(md) == DM_TYPE_NVME_BIO_BASED)
return __process_bio(md, map, bio, ti);
- else
- return __split_and_process_bio(md, map, bio);
+ return __split_and_process_bio(md, map, bio);
}
static blk_qc_t dm_submit_bio(struct bio *bio)
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c4ad98e4b72cb5be30ea282fce935248f2300e62 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Tue, 15 Sep 2020 11:42:17 +0100
Subject: [PATCH] KVM: arm64: Assume write fault on S1PTW permission fault on
instruction fetch
KVM currently assumes that an instruction abort can never be a write.
This is in general true, except when the abort is triggered by
a S1PTW on instruction fetch that tries to update the S1 page tables
(to set AF, for example).
This can happen if the page tables have been paged out and brought
back in without seeing a direct write to them (they are thus marked
read only), and the fault handling code will make the PT executable(!)
instead of writable. The guest gets stuck forever.
In these conditions, the permission fault must be considered as
a write so that the Stage-1 update can take place. This is essentially
the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM:
Take S1 walks into account when determining S2 write faults").
Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce
kvm_vcpu_trap_is_exec_fault() that only return true when no faulting
on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to
kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't
specific to data abort.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Reviewed-by: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 49a55be2b9a2..4f618af660ba 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -298,7 +298,7 @@ static __always_inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
return (kvm_vcpu_get_esr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT;
}
-static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_abt_iss1tw(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_S1PTW);
}
@@ -306,7 +306,7 @@ static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_WNR) ||
- kvm_vcpu_dabt_iss1tw(vcpu); /* AF/DBM update */
+ kvm_vcpu_abt_iss1tw(vcpu); /* AF/DBM update */
}
static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu)
@@ -335,6 +335,11 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
}
+static inline bool kvm_vcpu_trap_is_exec_fault(const struct kvm_vcpu *vcpu)
+{
+ return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);
+}
+
static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC;
@@ -372,6 +377,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
{
+ if (kvm_vcpu_abt_iss1tw(vcpu))
+ return true;
+
if (kvm_vcpu_trap_is_iabt(vcpu))
return false;
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 426ef65601dd..d64c5d56c860 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -445,7 +445,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
kvm_vcpu_dabt_isvalid(vcpu) &&
!kvm_vcpu_abt_issea(vcpu) &&
- !kvm_vcpu_dabt_iss1tw(vcpu);
+ !kvm_vcpu_abt_iss1tw(vcpu);
if (valid) {
int ret = __vgic_v2_perform_cpuif_access(vcpu);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f58d657a898d..9aec1ce491d2 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1843,7 +1843,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu;
write_fault = kvm_is_write_fault(vcpu);
- exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
+ exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
VM_BUG_ON(write_fault && exec_fault);
if (fault_status == FSC_PERM && !write_fault && !exec_fault) {
@@ -2125,7 +2125,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
goto out;
}
- if (kvm_vcpu_dabt_iss1tw(vcpu)) {
+ if (kvm_vcpu_abt_iss1tw(vcpu)) {
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
ret = 1;
goto out_unlock;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c4ad98e4b72cb5be30ea282fce935248f2300e62 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Tue, 15 Sep 2020 11:42:17 +0100
Subject: [PATCH] KVM: arm64: Assume write fault on S1PTW permission fault on
instruction fetch
KVM currently assumes that an instruction abort can never be a write.
This is in general true, except when the abort is triggered by
a S1PTW on instruction fetch that tries to update the S1 page tables
(to set AF, for example).
This can happen if the page tables have been paged out and brought
back in without seeing a direct write to them (they are thus marked
read only), and the fault handling code will make the PT executable(!)
instead of writable. The guest gets stuck forever.
In these conditions, the permission fault must be considered as
a write so that the Stage-1 update can take place. This is essentially
the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM:
Take S1 walks into account when determining S2 write faults").
Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce
kvm_vcpu_trap_is_exec_fault() that only return true when no faulting
on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to
kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't
specific to data abort.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Reviewed-by: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 49a55be2b9a2..4f618af660ba 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -298,7 +298,7 @@ static __always_inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
return (kvm_vcpu_get_esr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT;
}
-static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_abt_iss1tw(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_S1PTW);
}
@@ -306,7 +306,7 @@ static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_WNR) ||
- kvm_vcpu_dabt_iss1tw(vcpu); /* AF/DBM update */
+ kvm_vcpu_abt_iss1tw(vcpu); /* AF/DBM update */
}
static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu)
@@ -335,6 +335,11 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
}
+static inline bool kvm_vcpu_trap_is_exec_fault(const struct kvm_vcpu *vcpu)
+{
+ return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);
+}
+
static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC;
@@ -372,6 +377,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
{
+ if (kvm_vcpu_abt_iss1tw(vcpu))
+ return true;
+
if (kvm_vcpu_trap_is_iabt(vcpu))
return false;
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 426ef65601dd..d64c5d56c860 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -445,7 +445,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
kvm_vcpu_dabt_isvalid(vcpu) &&
!kvm_vcpu_abt_issea(vcpu) &&
- !kvm_vcpu_dabt_iss1tw(vcpu);
+ !kvm_vcpu_abt_iss1tw(vcpu);
if (valid) {
int ret = __vgic_v2_perform_cpuif_access(vcpu);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f58d657a898d..9aec1ce491d2 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1843,7 +1843,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu;
write_fault = kvm_is_write_fault(vcpu);
- exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
+ exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
VM_BUG_ON(write_fault && exec_fault);
if (fault_status == FSC_PERM && !write_fault && !exec_fault) {
@@ -2125,7 +2125,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
goto out;
}
- if (kvm_vcpu_dabt_iss1tw(vcpu)) {
+ if (kvm_vcpu_abt_iss1tw(vcpu)) {
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
ret = 1;
goto out_unlock;
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c4ad98e4b72cb5be30ea282fce935248f2300e62 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Tue, 15 Sep 2020 11:42:17 +0100
Subject: [PATCH] KVM: arm64: Assume write fault on S1PTW permission fault on
instruction fetch
KVM currently assumes that an instruction abort can never be a write.
This is in general true, except when the abort is triggered by
a S1PTW on instruction fetch that tries to update the S1 page tables
(to set AF, for example).
This can happen if the page tables have been paged out and brought
back in without seeing a direct write to them (they are thus marked
read only), and the fault handling code will make the PT executable(!)
instead of writable. The guest gets stuck forever.
In these conditions, the permission fault must be considered as
a write so that the Stage-1 update can take place. This is essentially
the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM:
Take S1 walks into account when determining S2 write faults").
Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce
kvm_vcpu_trap_is_exec_fault() that only return true when no faulting
on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to
kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't
specific to data abort.
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Reviewed-by: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 49a55be2b9a2..4f618af660ba 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -298,7 +298,7 @@ static __always_inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
return (kvm_vcpu_get_esr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT;
}
-static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_abt_iss1tw(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_S1PTW);
}
@@ -306,7 +306,7 @@ static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
{
return !!(kvm_vcpu_get_esr(vcpu) & ESR_ELx_WNR) ||
- kvm_vcpu_dabt_iss1tw(vcpu); /* AF/DBM update */
+ kvm_vcpu_abt_iss1tw(vcpu); /* AF/DBM update */
}
static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu)
@@ -335,6 +335,11 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
}
+static inline bool kvm_vcpu_trap_is_exec_fault(const struct kvm_vcpu *vcpu)
+{
+ return kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu);
+}
+
static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC;
@@ -372,6 +377,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
{
+ if (kvm_vcpu_abt_iss1tw(vcpu))
+ return true;
+
if (kvm_vcpu_trap_is_iabt(vcpu))
return false;
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 426ef65601dd..d64c5d56c860 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -445,7 +445,7 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
kvm_vcpu_dabt_isvalid(vcpu) &&
!kvm_vcpu_abt_issea(vcpu) &&
- !kvm_vcpu_dabt_iss1tw(vcpu);
+ !kvm_vcpu_abt_iss1tw(vcpu);
if (valid) {
int ret = __vgic_v2_perform_cpuif_access(vcpu);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f58d657a898d..9aec1ce491d2 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1843,7 +1843,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu;
write_fault = kvm_is_write_fault(vcpu);
- exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
+ exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
VM_BUG_ON(write_fault && exec_fault);
if (fault_status == FSC_PERM && !write_fault && !exec_fault) {
@@ -2125,7 +2125,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
goto out;
}
- if (kvm_vcpu_dabt_iss1tw(vcpu)) {
+ if (kvm_vcpu_abt_iss1tw(vcpu)) {
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
ret = 1;
goto out_unlock;
As Eric pointed out, this patch should not be applied. There were 2
patches addressing the very same leak and it seems one of them already
landed and this one is being applied on top of it, so in practice, this
patch is producing a double free.
Iago
On Sun, 2020-09-27 at 13:52 -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> drm/v3d: don't leak bin job if v3d_job_init fails.
>
> to the 5.4-stable tree which can be found at:
>
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> drm-v3d-don-t-leak-bin-job-if-v3d_job_init-fails.patch
> and it can be found in the queue-5.4 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable
> tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit cd0324f318e4447471e0bb01d4bea4f5de4aab1e
> Author: Iago Toral Quiroga <itoral(a)igalia.com>
> Date: Mon Sep 16 09:11:25 2019 +0200
>
> drm/v3d: don't leak bin job if v3d_job_init fails.
>
> [ Upstream commit 0d352a3a8a1f26168d09f7073e61bb4b328e3bb9 ]
>
> If the initialization of the job fails we need to kfree() it
> before returning.
>
> Signed-off-by: Iago Toral Quiroga <itoral(a)igalia.com>
> Signed-off-by: Eric Anholt <eric(a)anholt.net>
> Link:
> https://patchwork.freedesktop.org/patch/msgid/20190916071125.5255-1-itoral@…
> Fixes: a783a09ee76d ("drm/v3d: Refactor job management.")
> Reviewed-by: Eric Anholt <eric(a)anholt.net>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c
> b/drivers/gpu/drm/v3d/v3d_gem.c
> index 19c092d75266b..6316bf3646af5 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -565,6 +565,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void
> *data,
> ret = v3d_job_init(v3d, file_priv, &bin->base,
> v3d_job_free, args->in_sync_bcl);
> if (ret) {
> + kfree(bin);
> v3d_job_put(&render->base);
> kfree(bin);
> return ret;
>
This is a note to let you know that I've just added the patch titled
vt_ioctl: make VT_RESIZEX behave like VT_RESIZE
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 988d0763361bb65690d60e2bc53a6b72777040c3 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Date: Sun, 27 Sep 2020 20:46:30 +0900
Subject: vt_ioctl: make VT_RESIZEX behave like VT_RESIZE
syzbot is reporting UAF/OOB read at bit_putcs()/soft_cursor() [1][2], for
vt_resizex() from ioctl(VT_RESIZEX) allows setting font height larger than
actual font height calculated by con_font_set() from ioctl(PIO_FONT).
Since fbcon_set_font() from con_font_set() allocates minimal amount of
memory based on actual font height calculated by con_font_set(),
use of vt_resizex() can cause UAF/OOB read for font data.
VT_RESIZEX was introduced in Linux 1.3.3, but it is unclear that what
comes to the "+ more" part, and I couldn't find a user of VT_RESIZEX.
#define VT_RESIZE 0x5609 /* set kernel's idea of screensize */
#define VT_RESIZEX 0x560A /* set kernel's idea of screensize + more */
So far we are not aware of syzbot reports caused by setting non-zero value
to v_vlin parameter. But given that it is possible that nobody is using
VT_RESIZEX, we can try removing support for v_clin and v_vlin parameters.
Therefore, this patch effectively makes VT_RESIZEX behave like VT_RESIZE,
with emitting a message if somebody is still using v_clin and/or v_vlin
parameters.
[1] https://syzkaller.appspot.com/bug?id=32577e96d88447ded2d3b76d71254fb8552458…
[2] https://syzkaller.appspot.com/bug?id=6b8355d27b2b94fb5cedf4655e3a59162d9e48…
Reported-by: syzbot <syzbot+b308f5fd049fbbc6e74f(a)syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+16469b5e8e5a72e9131e(a)syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/4933b81b-9b1a-355b-df0e-9b31e8280ab9@i-love.sakur…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/vt/vt_ioctl.c | 57 +++++++--------------------------------
1 file changed, 10 insertions(+), 47 deletions(-)
diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
index 2ea76a09e07f..0a33b8ababe3 100644
--- a/drivers/tty/vt/vt_ioctl.c
+++ b/drivers/tty/vt/vt_ioctl.c
@@ -772,58 +772,21 @@ static int vt_resizex(struct vc_data *vc, struct vt_consize __user *cs)
if (copy_from_user(&v, cs, sizeof(struct vt_consize)))
return -EFAULT;
- /* FIXME: Should check the copies properly */
- if (!v.v_vlin)
- v.v_vlin = vc->vc_scan_lines;
-
- if (v.v_clin) {
- int rows = v.v_vlin / v.v_clin;
- if (v.v_rows != rows) {
- if (v.v_rows) /* Parameters don't add up */
- return -EINVAL;
- v.v_rows = rows;
- }
- }
-
- if (v.v_vcol && v.v_ccol) {
- int cols = v.v_vcol / v.v_ccol;
- if (v.v_cols != cols) {
- if (v.v_cols)
- return -EINVAL;
- v.v_cols = cols;
- }
- }
-
- if (v.v_clin > 32)
- return -EINVAL;
+ if (v.v_vlin)
+ pr_info_once("\"struct vt_consize\"->v_vlin is ignored. Please report if you need this.\n");
+ if (v.v_clin)
+ pr_info_once("\"struct vt_consize\"->v_clin is ignored. Please report if you need this.\n");
+ console_lock();
for (i = 0; i < MAX_NR_CONSOLES; i++) {
- struct vc_data *vcp;
+ vc = vc_cons[i].d;
- if (!vc_cons[i].d)
- continue;
- console_lock();
- vcp = vc_cons[i].d;
- if (vcp) {
- int ret;
- int save_scan_lines = vcp->vc_scan_lines;
- int save_font_height = vcp->vc_font.height;
-
- if (v.v_vlin)
- vcp->vc_scan_lines = v.v_vlin;
- if (v.v_clin)
- vcp->vc_font.height = v.v_clin;
- vcp->vc_resize_user = 1;
- ret = vc_resize(vcp, v.v_cols, v.v_rows);
- if (ret) {
- vcp->vc_scan_lines = save_scan_lines;
- vcp->vc_font.height = save_font_height;
- console_unlock();
- return ret;
- }
+ if (vc) {
+ vc->vc_resize_user = 1;
+ vc_resize(vc, v.v_cols, v.v_rows);
}
- console_unlock();
}
+ console_unlock();
return 0;
}
--
2.28.0
musb_queue_resume_work() would call the provided callback if the runtime
PM status was 'active'. Otherwise, it would enqueue the request if the
hardware was still suspended (musb->is_runtime_suspended is true).
This causes a race with the runtime PM handlers, as it is possible to be
in the case where the runtime PM status is not yet 'active', but the
hardware has been awaken (PM resume function has been called).
When hitting the race, the resume work was not enqueued, which probably
triggered other bugs further down the stack. For instance, a telnet
connection on Ingenic SoCs would result in a 50/50 chance of a
segmentation fault somewhere in the musb code.
Rework the code so that either we call the callback directly if
(musb->is_runtime_suspended == 0), or enqueue the query otherwise.
Fixes: ea2f35c01d5e ("usb: musb: Fix sleeping function called from invalid context for hdrc glue")
Cc: stable(a)vger.kernel.org # v4.9
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
---
drivers/usb/musb/musb_core.c | 31 +++++++++++++++++--------------
1 file changed, 17 insertions(+), 14 deletions(-)
diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c
index 384a8039a7fd..462c10d7455a 100644
--- a/drivers/usb/musb/musb_core.c
+++ b/drivers/usb/musb/musb_core.c
@@ -2241,32 +2241,35 @@ int musb_queue_resume_work(struct musb *musb,
{
struct musb_pending_work *w;
unsigned long flags;
+ bool is_suspended;
int error;
if (WARN_ON(!callback))
return -EINVAL;
- if (pm_runtime_active(musb->controller))
- return callback(musb, data);
+ spin_lock_irqsave(&musb->list_lock, flags);
+ is_suspended = musb->is_runtime_suspended;
+
+ if (is_suspended) {
+ w = devm_kzalloc(musb->controller, sizeof(*w), GFP_ATOMIC);
+ if (!w) {
+ error = -ENOMEM;
+ goto out_unlock;
+ }
- w = devm_kzalloc(musb->controller, sizeof(*w), GFP_ATOMIC);
- if (!w)
- return -ENOMEM;
+ w->callback = callback;
+ w->data = data;
- w->callback = callback;
- w->data = data;
- spin_lock_irqsave(&musb->list_lock, flags);
- if (musb->is_runtime_suspended) {
list_add_tail(&w->node, &musb->pending_list);
error = 0;
- } else {
- dev_err(musb->controller, "could not add resume work %p\n",
- callback);
- devm_kfree(musb->controller, w);
- error = -EINPROGRESS;
}
+
+out_unlock:
spin_unlock_irqrestore(&musb->list_lock, flags);
+ if (!is_suspended)
+ error = callback(musb, data);
+
return error;
}
EXPORT_SYMBOL_GPL(musb_queue_resume_work);
--
2.28.0
Hello Dear Friendhow are you doing? , I do hope that you are swimming in the ocean of health.My name is AYAZ IBRAHIM I own a Mining company and have been into Gold mining with 25 years Experience.I have 99.99% 23 carat Gold been deposited in safekeeping company, the total kilogram is 300 Kilogram, I am looking for a potential buy that will buy the Gold at good price .Gold global price for 23 carat 1 kilogram is approximately $43,000.00 USD, I have decided to sale the 300 kilograms of Gold to a potential buyer at the selling rate of USD$29,000.00 for one KG, this is to enable me raise the quick finance that I need to put into my plant project in other to facilitate the project .
I Hope to hear from you ASAP so we can proceed further.With RegardsMR. AYAZ IBRAHIM
The use of PHY_REFCLK_USE_PAD introduced a regression for apq8064
devices. It was tested that while apq doesn't require the padding, ipq
SoC must use it or the kernel hangs on boot.
Fixes: de3c4bf6489 ("PCI: qcom: Add support for tx term offset for rev 2.1.0")
Reported-by: Ilia Mirkin <imirkin(a)alum.mit.edu>
Signed-off-by: Ilia Mirkin <imirkin(a)alum.mit.edu>
Signed-off-by: Ansuel Smith <ansuelsmth(a)gmail.com>
Cc: stable(a)vger.kernel.org # v4.19+
---
drivers/pci/controller/dwc/pcie-qcom.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 3aac77a295ba..dad6e9ce66ba 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -387,7 +387,9 @@ static int qcom_pcie_init_2_1_0(struct qcom_pcie *pcie)
/* enable external reference clock */
val = readl(pcie->parf + PCIE20_PARF_PHY_REFCLK);
- val &= ~PHY_REFCLK_USE_PAD;
+ /* USE_PAD is required only for ipq806x */
+ if (!of_device_is_compatible(node, "qcom,pcie-apq8064"))
+ val &= ~PHY_REFCLK_USE_PAD;
val |= PHY_REFCLK_SSP_EN;
writel(val, pcie->parf + PCIE20_PARF_PHY_REFCLK);
--
2.27.0
If interrupt comes late, during probe error path or device remove (could
be triggered with CONFIG_DEBUG_SHIRQ), the interrupt handler
i2c_imx_isr() will access registers with the clock being disabled. This
leads to external abort on non-linefetch on Toradex Colibri VF50 module
(with Vybrid VF5xx):
Unhandled fault: external abort on non-linefetch (0x1008) at 0x8882d003
Internal error: : 1008 [#1] ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 5.7.0 #607
Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
(i2c_imx_isr) from [<8017009c>] (free_irq+0x25c/0x3b0)
(free_irq) from [<805844ec>] (release_nodes+0x178/0x284)
(release_nodes) from [<80580030>] (really_probe+0x10c/0x348)
(really_probe) from [<80580380>] (driver_probe_device+0x60/0x170)
(driver_probe_device) from [<80580630>] (device_driver_attach+0x58/0x60)
(device_driver_attach) from [<805806bc>] (__driver_attach+0x84/0xc0)
(__driver_attach) from [<8057e228>] (bus_for_each_dev+0x68/0xb4)
(bus_for_each_dev) from [<8057f3ec>] (bus_add_driver+0x144/0x1ec)
(bus_add_driver) from [<80581320>] (driver_register+0x78/0x110)
(driver_register) from [<8010213c>] (do_one_initcall+0xa8/0x2f4)
(do_one_initcall) from [<80c0100c>] (kernel_init_freeable+0x178/0x1dc)
(kernel_init_freeable) from [<80807048>] (kernel_init+0x8/0x110)
(kernel_init) from [<80100114>] (ret_from_fork+0x14/0x20)
Additionally, the i2c_imx_isr() could wake up the wait queue
(imx_i2c_struct->queue) before its initialization happens.
The resource-managed framework should not be used for interrupt handling,
because the resource will be released too late - after disabling clocks.
The interrupt handler is not prepared for such case.
Fixes: 1c4b6c3bcf30 ("i2c: imx: implement bus recovery")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzk(a)kernel.org>
---
Changes since v2:
1. Rebase
Changes since v1:
1. Remove the devm- and use regular methods.
---
drivers/i2c/busses/i2c-imx.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index 63f4367c312b..c98529c76348 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -1169,14 +1169,6 @@ static int i2c_imx_probe(struct platform_device *pdev)
return ret;
}
- /* Request IRQ */
- ret = devm_request_irq(&pdev->dev, irq, i2c_imx_isr, IRQF_SHARED,
- pdev->name, i2c_imx);
- if (ret) {
- dev_err(&pdev->dev, "can't claim irq %d\n", irq);
- goto clk_disable;
- }
-
/* Init queue */
init_waitqueue_head(&i2c_imx->queue);
@@ -1195,6 +1187,14 @@ static int i2c_imx_probe(struct platform_device *pdev)
if (ret < 0)
goto rpm_disable;
+ /* Request IRQ */
+ ret = request_threaded_irq(irq, i2c_imx_isr, NULL, IRQF_SHARED,
+ pdev->name, i2c_imx);
+ if (ret) {
+ dev_err(&pdev->dev, "can't claim irq %d\n", irq);
+ goto rpm_disable;
+ }
+
/* Set up clock divider */
i2c_imx->bitrate = I2C_MAX_STANDARD_MODE_FREQ;
ret = of_property_read_u32(pdev->dev.of_node,
@@ -1237,13 +1237,12 @@ static int i2c_imx_probe(struct platform_device *pdev)
clk_notifier_unregister:
clk_notifier_unregister(i2c_imx->clk, &i2c_imx->clk_change_nb);
+ free_irq(irq, i2c_imx);
rpm_disable:
pm_runtime_put_noidle(&pdev->dev);
pm_runtime_disable(&pdev->dev);
pm_runtime_set_suspended(&pdev->dev);
pm_runtime_dont_use_autosuspend(&pdev->dev);
-
-clk_disable:
clk_disable_unprepare(i2c_imx->clk);
return ret;
}
@@ -1251,7 +1250,7 @@ static int i2c_imx_probe(struct platform_device *pdev)
static int i2c_imx_remove(struct platform_device *pdev)
{
struct imx_i2c_struct *i2c_imx = platform_get_drvdata(pdev);
- int ret;
+ int irq, ret;
ret = pm_runtime_get_sync(&pdev->dev);
if (ret < 0)
@@ -1271,6 +1270,9 @@ static int i2c_imx_remove(struct platform_device *pdev)
imx_i2c_write_reg(0, i2c_imx, IMX_I2C_I2SR);
clk_notifier_unregister(i2c_imx->clk, &i2c_imx->clk_change_nb);
+ irq = platform_get_irq(pdev, 0);
+ if (irq >= 0)
+ free_irq(irq, i2c_imx);
clk_disable_unprepare(i2c_imx->clk);
pm_runtime_put_noidle(&pdev->dev);
--
2.17.1
The original problem was from nvme-over-tcp code, who mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without
__GFP_COMP flag. Such pages don't have refcount (page_count is 0) on
tail pages, sending them by kernel_sendpage() may trigger a kernel panic
from a corrupted kernel heap, because these pages are incorrectly freed
in network stack as page_count 0 pages.
This patch introduces a helper sendpage_ok(), it returns true if the
checking page,
- is not slab page: PageSlab(page) is false.
- has page refcount: page_count(page) is not zero
All drivers who want to send page to remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should try other non sendpage method (e.g.
sock_no_sendpage()) to handle the page.
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
include/linux/net.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/linux/net.h b/include/linux/net.h
index d48ff1180879..05db8690f67e 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -21,6 +21,7 @@
#include <linux/rcupdate.h>
#include <linux/once.h>
#include <linux/fs.h>
+#include <linux/mm.h>
#include <linux/sockptr.h>
#include <uapi/linux/net.h>
@@ -286,6 +287,21 @@ do { \
#define net_get_random_once_wait(buf, nbytes) \
get_random_once_wait((buf), (nbytes))
+/*
+ * E.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+static inline bool sendpage_ok(struct page *page)
+{
+ return !PageSlab(page) && page_count(page) >= 1;
+}
+
int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec,
size_t num, size_t len);
int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
--
2.26.2
This is a note to let you know that I've just added the patch titled
vt_ioctl: make VT_RESIZEX behave like VT_RESIZE
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 988d0763361bb65690d60e2bc53a6b72777040c3 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Date: Sun, 27 Sep 2020 20:46:30 +0900
Subject: vt_ioctl: make VT_RESIZEX behave like VT_RESIZE
syzbot is reporting UAF/OOB read at bit_putcs()/soft_cursor() [1][2], for
vt_resizex() from ioctl(VT_RESIZEX) allows setting font height larger than
actual font height calculated by con_font_set() from ioctl(PIO_FONT).
Since fbcon_set_font() from con_font_set() allocates minimal amount of
memory based on actual font height calculated by con_font_set(),
use of vt_resizex() can cause UAF/OOB read for font data.
VT_RESIZEX was introduced in Linux 1.3.3, but it is unclear that what
comes to the "+ more" part, and I couldn't find a user of VT_RESIZEX.
#define VT_RESIZE 0x5609 /* set kernel's idea of screensize */
#define VT_RESIZEX 0x560A /* set kernel's idea of screensize + more */
So far we are not aware of syzbot reports caused by setting non-zero value
to v_vlin parameter. But given that it is possible that nobody is using
VT_RESIZEX, we can try removing support for v_clin and v_vlin parameters.
Therefore, this patch effectively makes VT_RESIZEX behave like VT_RESIZE,
with emitting a message if somebody is still using v_clin and/or v_vlin
parameters.
[1] https://syzkaller.appspot.com/bug?id=32577e96d88447ded2d3b76d71254fb8552458…
[2] https://syzkaller.appspot.com/bug?id=6b8355d27b2b94fb5cedf4655e3a59162d9e48…
Reported-by: syzbot <syzbot+b308f5fd049fbbc6e74f(a)syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+16469b5e8e5a72e9131e(a)syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/4933b81b-9b1a-355b-df0e-9b31e8280ab9@i-love.sakur…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/vt/vt_ioctl.c | 57 +++++++--------------------------------
1 file changed, 10 insertions(+), 47 deletions(-)
diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
index 2ea76a09e07f..0a33b8ababe3 100644
--- a/drivers/tty/vt/vt_ioctl.c
+++ b/drivers/tty/vt/vt_ioctl.c
@@ -772,58 +772,21 @@ static int vt_resizex(struct vc_data *vc, struct vt_consize __user *cs)
if (copy_from_user(&v, cs, sizeof(struct vt_consize)))
return -EFAULT;
- /* FIXME: Should check the copies properly */
- if (!v.v_vlin)
- v.v_vlin = vc->vc_scan_lines;
-
- if (v.v_clin) {
- int rows = v.v_vlin / v.v_clin;
- if (v.v_rows != rows) {
- if (v.v_rows) /* Parameters don't add up */
- return -EINVAL;
- v.v_rows = rows;
- }
- }
-
- if (v.v_vcol && v.v_ccol) {
- int cols = v.v_vcol / v.v_ccol;
- if (v.v_cols != cols) {
- if (v.v_cols)
- return -EINVAL;
- v.v_cols = cols;
- }
- }
-
- if (v.v_clin > 32)
- return -EINVAL;
+ if (v.v_vlin)
+ pr_info_once("\"struct vt_consize\"->v_vlin is ignored. Please report if you need this.\n");
+ if (v.v_clin)
+ pr_info_once("\"struct vt_consize\"->v_clin is ignored. Please report if you need this.\n");
+ console_lock();
for (i = 0; i < MAX_NR_CONSOLES; i++) {
- struct vc_data *vcp;
+ vc = vc_cons[i].d;
- if (!vc_cons[i].d)
- continue;
- console_lock();
- vcp = vc_cons[i].d;
- if (vcp) {
- int ret;
- int save_scan_lines = vcp->vc_scan_lines;
- int save_font_height = vcp->vc_font.height;
-
- if (v.v_vlin)
- vcp->vc_scan_lines = v.v_vlin;
- if (v.v_clin)
- vcp->vc_font.height = v.v_clin;
- vcp->vc_resize_user = 1;
- ret = vc_resize(vcp, v.v_cols, v.v_rows);
- if (ret) {
- vcp->vc_scan_lines = save_scan_lines;
- vcp->vc_font.height = save_font_height;
- console_unlock();
- return ret;
- }
+ if (vc) {
+ vc->vc_resize_user = 1;
+ vc_resize(vc, v.v_cols, v.v_rows);
}
- console_unlock();
}
+ console_unlock();
return 0;
}
--
2.28.0
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 7ed24f1e8cf1 - net/mlx5e: Fix endianness when calculating pedit mask first bit
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ❌ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 0c52d2a4d9e9 - Linux 5.8.12
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
This is the start of the stable review cycle for the 5.8.12 release.
There are 56 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 27 Sep 2020 12:47:02 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.12-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.8.12-rc1
Maor Dickman <maord(a)nvidia.com>
net/mlx5e: Fix endianness when calculating pedit mask first bit
Maxim Mikityanskiy <maximmi(a)mellanox.com>
net/mlx5e: Use synchronize_rcu to sync with NAPI
Maxim Mikityanskiy <maximmi(a)mellanox.com>
net/mlx5e: Use RCU to protect rq->xdp_prog
Taehee Yoo <ap420073(a)gmail.com>
Revert "netns: don't disable BHs when locking "nsid_lock""
Parshuram Thombare <pthombar(a)cadence.com>
net: macb: fix for pause frame receive enable bit
Matthias Schiffer <matthias.schiffer(a)ew.tq-group.com>
net: dsa: microchip: ksz8795: really set the correct number of ports
Vladimir Oltean <olteanv(a)gmail.com>
net: dsa: link interfaces with the DSA master to get rid of lockdep warnings
Dexuan Cui <decui(a)microsoft.com>
hv_netvsc: Fix hibernation for mlx5 VF driver
Luo bin <luobin9(a)huawei.com>
hinic: fix rewaking txq after netif_tx_disable
Jianbo Liu <jianbol(a)mellanox.com>
net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
Vadym Kochan <vadym.kochan(a)plvision.eu>
net: ipa: fix u32_replace_bits by u32p_xxx version
Jason A. Donenfeld <Jason(a)zx2c4.com>
wireguard: peerlookup: take lock before checking hash in replace operation
Jason A. Donenfeld <Jason(a)zx2c4.com>
wireguard: noise: take lock when removing handshake entry from table
Grygorii Strashko <grygorii.strashko(a)ti.com>
net: ethernet: ti: cpsw_new: fix suspend/resume
Eric Dumazet <edumazet(a)google.com>
net: add __must_check to skb_put_padto()
Eric Dumazet <edumazet(a)google.com>
net: qrtr: check skb_put_padto() return value
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: Do not warn in phy_stop() on PHY_DOWN
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: Avoid NPD upon phy_detach() when driver is unbound
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Disable IRQs only if NAPI gets scheduled
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Use napi_complete_done()
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: use netif_tx_napi_add() for TX NAPI
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Wake TX queue again
Michael Chan <michael.chan(a)broadcom.com>
bnxt_en: Protect bnxt_set_eee() and bnxt_set_pauseparam() with mutex.
Edwin Peer <edwin.peer(a)broadcom.com>
bnxt_en: return proper error codes in bnxt_show_temp
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Use memcpy to copy VPD field info.
Tariq Toukan <tariqt(a)mellanox.com>
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
Maor Dickman <maord(a)mellanox.com>
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
Xin Long <lucien.xin(a)gmail.com>
tipc: use skb_unshare() instead in tipc_buf_append()
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
tipc: fix shutdown() of connection oriented socket
Peilin Ye <yepeilin.cs(a)gmail.com>
tipc: Fix memory leak in tipc_group_create_member()
Vinicius Costa Gomes <vinicius.gomes(a)intel.com>
taprio: Fix allowing too small intervals
Jakub Kicinski <kuba(a)kernel.org>
nfp: use correct define to return NONE fec
Henry Ptasinski <hptasinski(a)google.com>
net: sctp: Fix IPv6 ancestor_size calc in sctp_copy_descendant
Yunsheng Lin <linyunsheng(a)huawei.com>
net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
Xin Long <lucien.xin(a)gmail.com>
net: sched: initialize with 0 before setting erspan md->u
Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
net: phy: call phy_disable_interrupts() in phy_attach_direct() instead
Maor Gottlieb <maorg(a)nvidia.com>
net/mlx5: Fix FTE cleanup
Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC
Ido Schimmel <idosch(a)nvidia.com>
net: Fix bridge enslavement failure
Linus Walleij <linus.walleij(a)linaro.org>
net: dsa: rtl8366: Properly clear member config
Petr Machata <petrm(a)nvidia.com>
net: DCB: Validate DCB_ATTR_DCB_BUFFER argument
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
Eric Dumazet <edumazet(a)google.com>
ipv6: avoid lockdep issue in fib6_del()
David Ahern <dsahern(a)kernel.org>
ipv4: Update exception handling for multipath routes via same device
David Ahern <dsahern(a)gmail.com>
ipv4: Initialize flowi4_multipath_hash in data path
Wei Wang <weiwan(a)google.com>
ip: fix tos reflection in ack and reset packets
Luo bin <luobin9(a)huawei.com>
hinic: bump up the timeout of SET_FUNC_STATE cmd
Dan Carpenter <dan.carpenter(a)oracle.com>
hdlc_ppp: add range checks in ppp_cp_parse_cr()
Mark Gray <mark.d.gray(a)redhat.com>
geneve: add transport ports in route lookup for geneve
Ganji Aravind <ganji.aravind(a)chelsio.com>
cxgb4: Fix offset when clearing filter byte counters
Raju Rangoju <rajur(a)chelsio.com>
cxgb4: fix memory leak during module unload
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Avoid sending firmware messages when AER error is detected.
Cong Wang <xiyou.wangcong(a)gmail.com>
act_ife: load meta modules before tcf_idr_check_alloc()
Jakub Kicinski <kuba(a)kernel.org>
ibmvnic: add missing parenthesis in do_reset()
Mingming Cao <mmc(a)linux.vnet.ibm.com>
ibmvnic fix NULL tx_pools and rx_tools issue at do_reset
-------------
Diffstat:
Makefile | 4 +-
drivers/net/dsa/microchip/ksz8795.c | 2 +-
drivers/net/dsa/rtl8366.c | 20 ++++---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 40 ++++++++-----
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 ++
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 31 +++++++----
drivers/net/ethernet/cadence/macb_main.c | 3 +-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 9 ++-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_mps.c | 2 +-
drivers/net/ethernet/huawei/hinic/hinic_hw_mgmt.c | 16 ++++--
drivers/net/ethernet/huawei/hinic/hinic_main.c | 24 ++++++++
drivers/net/ethernet/huawei/hinic/hinic_tx.c | 18 +-----
drivers/net/ethernet/ibm/ibmvnic.c | 21 ++++++-
drivers/net/ethernet/lantiq_xrx200.c | 21 ++++---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 14 +----
.../net/ethernet/mellanox/mlx5/core/en/xsk/setup.c | 3 +-
.../mellanox/mlx5/core/en_accel/tls_stats.c | 12 ++--
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 65 ++++++++++------------
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 12 +---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 39 +++++++------
drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 17 ++++--
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 52 +++++++++--------
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 8 +--
.../net/ethernet/netronome/nfp/nfp_net_ethtool.c | 4 +-
drivers/net/ethernet/ti/cpsw_new.c | 53 ++++++++++++++++++
drivers/net/geneve.c | 37 ++++++++----
drivers/net/hyperv/netvsc_drv.c | 16 ++++--
drivers/net/ipa/ipa_table.c | 4 +-
drivers/net/phy/phy.c | 2 +-
drivers/net/phy/phy_device.c | 11 ++--
drivers/net/wan/hdlc_ppp.c | 16 ++++--
drivers/net/wireguard/noise.c | 5 +-
drivers/net/wireguard/peerlookup.c | 11 +++-
include/linux/skbuff.h | 7 ++-
include/net/flow.h | 1 +
include/net/sctp/structs.h | 8 ++-
net/bridge/br_vlan.c | 27 +++++----
net/core/dev.c | 2 +-
net/core/filter.c | 1 +
net/core/net_namespace.c | 22 ++++----
net/dcb/dcbnl.c | 8 +++
net/dsa/slave.c | 18 +++++-
net/ipv4/fib_frontend.c | 1 +
net/ipv4/ip_output.c | 3 +-
net/ipv4/route.c | 14 +++--
net/ipv6/Kconfig | 1 +
net/ipv6/ip6_fib.c | 13 +++--
net/qrtr/qrtr.c | 21 +++----
net/sched/act_ife.c | 44 +++++++++++----
net/sched/cls_flower.c | 1 +
net/sched/sch_generic.c | 48 +++++++++++-----
net/sched/sch_taprio.c | 28 ++++++----
net/sctp/socket.c | 9 +--
net/tipc/group.c | 14 +++--
net/tipc/msg.c | 3 +-
net/tipc/socket.c | 5 +-
58 files changed, 573 insertions(+), 326 deletions(-)
This is the start of the stable review cycle for the 5.4.68 release.
There are 43 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 27 Sep 2020 12:47:02 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.68-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.68-rc1
Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
Xunlei Pang <xlpang(a)linux.alibaba.com>
mm: memcg: fix memcg reclaim soft lockup
Eric Dumazet <edumazet(a)google.com>
net: add __must_check to skb_put_padto()
Eric Dumazet <edumazet(a)google.com>
net: qrtr: check skb_put_padto() return value
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: Do not warn in phy_stop() on PHY_DOWN
Florian Fainelli <f.fainelli(a)gmail.com>
net: phy: Avoid NPD upon phy_detach() when driver is unbound
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Disable IRQs only if NAPI gets scheduled
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Use napi_complete_done()
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: use netif_tx_napi_add() for TX NAPI
Hauke Mehrtens <hauke(a)hauke-m.de>
net: lantiq: Wake TX queue again
Michael Chan <michael.chan(a)broadcom.com>
bnxt_en: Protect bnxt_set_eee() and bnxt_set_pauseparam() with mutex.
Edwin Peer <edwin.peer(a)broadcom.com>
bnxt_en: return proper error codes in bnxt_show_temp
Tariq Toukan <tariqt(a)mellanox.com>
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
Maor Dickman <maord(a)mellanox.com>
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
Xin Long <lucien.xin(a)gmail.com>
tipc: use skb_unshare() instead in tipc_buf_append()
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
tipc: fix shutdown() of connection oriented socket
Peilin Ye <yepeilin.cs(a)gmail.com>
tipc: Fix memory leak in tipc_group_create_member()
Vinicius Costa Gomes <vinicius.gomes(a)intel.com>
taprio: Fix allowing too small intervals
Jakub Kicinski <kuba(a)kernel.org>
nfp: use correct define to return NONE fec
Henry Ptasinski <hptasinski(a)google.com>
net: sctp: Fix IPv6 ancestor_size calc in sctp_copy_descendant
Yunsheng Lin <linyunsheng(a)huawei.com>
net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
Maor Gottlieb <maorg(a)nvidia.com>
net/mlx5: Fix FTE cleanup
Necip Fazil Yildiran <fazilyildiran(a)gmail.com>
net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC
Ido Schimmel <idosch(a)nvidia.com>
net: Fix bridge enslavement failure
Linus Walleij <linus.walleij(a)linaro.org>
net: dsa: rtl8366: Properly clear member config
Petr Machata <petrm(a)nvidia.com>
net: DCB: Validate DCB_ATTR_DCB_BUFFER argument
Vladimir Oltean <vladimir.oltean(a)nxp.com>
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
Eric Dumazet <edumazet(a)google.com>
ipv6: avoid lockdep issue in fib6_del()
David Ahern <dsahern(a)kernel.org>
ipv4: Update exception handling for multipath routes via same device
David Ahern <dsahern(a)gmail.com>
ipv4: Initialize flowi4_multipath_hash in data path
Wei Wang <weiwan(a)google.com>
ip: fix tos reflection in ack and reset packets
Dan Carpenter <dan.carpenter(a)oracle.com>
hdlc_ppp: add range checks in ppp_cp_parse_cr()
Mark Gray <mark.d.gray(a)redhat.com>
geneve: add transport ports in route lookup for geneve
Ganji Aravind <ganji.aravind(a)chelsio.com>
cxgb4: Fix offset when clearing filter byte counters
Raju Rangoju <rajur(a)chelsio.com>
cxgb4: fix memory leak during module unload
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()
Vasundhara Volam <vasundhara-v.volam(a)broadcom.com>
bnxt_en: Avoid sending firmware messages when AER error is detected.
Cong Wang <xiyou.wangcong(a)gmail.com>
act_ife: load meta modules before tcf_idr_check_alloc()
Ralph Campbell <rcampbell(a)nvidia.com>
mm/thp: fix __split_huge_pmd_locked() for migration PMD
Muchun Song <songmuchun(a)bytedance.com>
kprobes: fix kill kprobe which has been marked as gone
Jakub Kicinski <kuba(a)kernel.org>
ibmvnic: add missing parenthesis in do_reset()
Mingming Cao <mmc(a)linux.vnet.ibm.com>
ibmvnic fix NULL tx_pools and rx_tools issue at do_reset
Mark Salyzyn <salyzyn(a)android.com>
af_key: pfkey_dump needs parameter validation
-------------
Diffstat:
Makefile | 4 +-
drivers/iommu/Kconfig | 2 +-
drivers/iommu/amd_iommu.c | 17 +++++--
drivers/iommu/amd_iommu_init.c | 21 ++++++++-
drivers/net/dsa/rtl8366.c | 20 ++++++---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 32 ++++++++-----
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 ++
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 31 ++++++++-----
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 9 ++--
drivers/net/ethernet/chelsio/cxgb4/cxgb4_mps.c | 2 +-
drivers/net/ethernet/ibm/ibmvnic.c | 21 +++++++--
drivers/net/ethernet/lantiq_xrx200.c | 21 +++++----
.../mellanox/mlx5/core/en_accel/tls_stats.c | 12 +++--
.../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 52 ++++++++++++----------
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 8 ++--
.../net/ethernet/netronome/nfp/nfp_net_ethtool.c | 4 +-
drivers/net/geneve.c | 37 ++++++++++-----
drivers/net/phy/phy.c | 2 +-
drivers/net/phy/phy_device.c | 3 +-
drivers/net/wan/hdlc_ppp.c | 16 ++++---
include/linux/skbuff.h | 7 +--
include/net/flow.h | 1 +
include/net/sctp/structs.h | 8 ++--
kernel/kprobes.c | 9 +++-
mm/huge_memory.c | 40 ++++++++++-------
mm/vmscan.c | 8 ++++
net/bridge/br_vlan.c | 27 ++++++-----
net/core/dev.c | 2 +-
net/core/filter.c | 1 +
net/dcb/dcbnl.c | 8 ++++
net/ipv4/fib_frontend.c | 1 +
net/ipv4/ip_output.c | 3 +-
net/ipv4/route.c | 14 +++---
net/ipv6/Kconfig | 1 +
net/ipv6/ip6_fib.c | 13 ++++--
net/key/af_key.c | 7 +++
net/qrtr/qrtr.c | 20 +++++----
net/sched/act_ife.c | 44 +++++++++++++-----
net/sched/sch_generic.c | 49 +++++++++++++-------
net/sched/sch_taprio.c | 28 +++++++-----
net/sctp/socket.c | 9 ++--
net/tipc/group.c | 14 ++++--
net/tipc/msg.c | 3 +-
net/tipc/socket.c | 5 +--
44 files changed, 429 insertions(+), 211 deletions(-)
This commit resolves a minor bug in the selection/discovery of more
specific USB device drivers for devices that are currently bound to
generic USB device drivers.
The bug is related to the way a candidate USB device driver is
compared against the generic USB device driver. The code in
is_dev_usb_generic_driver() assumes that the device driver in question
is a USB device driver by calling to_usb_device_driver(dev->driver)
to downcast; however I have observed that this assumption is not always
true, through code instrumentation.
This commit avoids the incorrect downcast altogether by comparing
the USB device's driver (i.e., dev->driver) to the generic USB
device driver directly. This method was suggested by Alan Stern.
This bug was found while investigating Andrey Konovalov's report
indicating usbip device driver misbehaviour with the recently merged
generic USB device driver selection feature. The report is linked
below.
Fixes: d5643d2249 ("USB: Fix device driver race")
Link: https://lore.kernel.org/linux-usb/CAAeHK+zOrHnxjRFs=OE8T=O9208B9HP_oo8RZpyV…
Cc: <stable(a)vger.kernel.org> # 5.8
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Alan Stern <stern(a)rowland.harvard.edu>
Cc: Bastien Nocera <hadess(a)hadess.net>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Valentina Manea <valentina.manea.m(a)gmail.com>
Cc: <syzkaller(a)googlegroups.com>
Signed-off-by: M. Vefa Bicakci <m.v.b(a)runbox.com>
---
v3: Simplify the device driver pointer comparison to avoid incorrect
downcast, as suggested by Alan Stern. Minor edits to the commit
message.
v2: Following Bastien Nocera's code review comment, use is_usb_device()
to verify that a device is bound to a USB device driver (as opposed
to, e.g., a USB interface driver) to avoid incorrect downcast.
---
drivers/usb/core/driver.c | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index 950044a6e77f..7d90cbe063ec 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -905,21 +905,14 @@ static int usb_uevent(struct device *dev, struct kobj_uevent_env *env)
return 0;
}
-static bool is_dev_usb_generic_driver(struct device *dev)
-{
- struct usb_device_driver *udd = dev->driver ?
- to_usb_device_driver(dev->driver) : NULL;
-
- return udd == &usb_generic_driver;
-}
-
static int __usb_bus_reprobe_drivers(struct device *dev, void *data)
{
struct usb_device_driver *new_udriver = data;
struct usb_device *udev;
int ret;
- if (!is_dev_usb_generic_driver(dev))
+ /* Don't reprobe if current driver isn't usb_generic_driver */
+ if (dev->driver != &usb_generic_driver.drvwrap.driver)
return 0;
udev = to_usb_device(dev);
--
2.26.2
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Subject: mm: don't rely on system state to detect hot-plug operations
In register_mem_sect_under_node() the system_state's value is checked to
detect whether the call is made during boot time or during an hot-plug
operation. Unfortunately, that check against SYSTEM_BOOTING is wrong
because regular memory is registered at SYSTEM_SCHEDULING state. In
addition, memory hot-plug operation can be triggered at this system state
by the ACPI [1]. So checking against the system state is not enough.
The consequence is that on system with interleaved node's ranges like this:
Early memory node ranges
node 1: [mem 0x0000000000000000-0x000000011fffffff]
node 2: [mem 0x0000000120000000-0x000000014fffffff]
node 1: [mem 0x0000000150000000-0x00000001ffffffff]
node 0: [mem 0x0000000200000000-0x000000048fffffff]
node 2: [mem 0x0000000490000000-0x00000007ffffffff]
This can be seen on PowerPC LPAR after multiple memory hot-plug and
hot-unplug operations are done. At the next reboot the node's memory ranges
can be interleaved and since the call to link_mem_sections() is made in
topology_init() while the system is in the SYSTEM_SCHEDULING state, the
node's id is not checked, and the sections registered to multiple nodes:
$ ls -l /sys/devices/system/memory/memory21/node*
total 0
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1
lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2
In that case, the system is able to boot but if later one of theses memory
blocks is hot-unplugged and then hot-plugged, the sysfs inconsistency is
detected and this is triggering a BUG_ON():
------------[ cut here ]------------
kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
NIP: c000000000403f34 LR: c000000000403f2c CTR: 0000000000000000
REGS: c0000004876e3660 TRAP: 0700 Not tainted (5.9.0-rc1+)
MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24000448 XER: 20040000
CFAR: c000000000846d20 IRQMASK: 0
GPR00: c000000000403f2c c0000004876e38f0 c0000000012f6f00 ffffffffffffffef
GPR04: 0000000000000227 c0000004805ae680 0000000000000000 00000004886f0000
GPR08: 0000000000000226 0000000000000003 0000000000000002 fffffffffffffffd
GPR12: 0000000088000484 c00000001ec96280 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000004 0000000000000003
GPR20: c00000047814ffe0 c0000007ffff7c08 0000000000000010 c0000000013332c8
GPR24: 0000000000000000 c0000000011f6cc0 0000000000000000 0000000000000000
GPR28: ffffffffffffffef 0000000000000001 0000000150000000 0000000010000000
NIP [c000000000403f34] add_memory_resource+0x244/0x340
LR [c000000000403f2c] add_memory_resource+0x23c/0x340
Call Trace:
[c0000004876e38f0] [c000000000403f2c] add_memory_resource+0x23c/0x340 (unreliable)
[c0000004876e39c0] [c00000000040408c] __add_memory+0x5c/0xf0
[c0000004876e39f0] [c0000000000e2b94] dlpar_add_lmb+0x1b4/0x500
[c0000004876e3ad0] [c0000000000e3888] dlpar_memory+0x1f8/0xb80
[c0000004876e3b60] [c0000000000dc0d0] handle_dlpar_errorlog+0xc0/0x190
[c0000004876e3bd0] [c0000000000dc398] dlpar_store+0x198/0x4a0
[c0000004876e3c90] [c00000000072e630] kobj_attr_store+0x30/0x50
[c0000004876e3cb0] [c00000000051f954] sysfs_kf_write+0x64/0x90
[c0000004876e3cd0] [c00000000051ee40] kernfs_fop_write+0x1b0/0x290
[c0000004876e3d20] [c000000000438dd8] vfs_write+0xe8/0x290
[c0000004876e3d70] [c0000000004391ac] ksys_write+0xdc/0x130
[c0000004876e3dc0] [c000000000034e40] system_call_exception+0x160/0x270
[c0000004876e3e20] [c00000000000d740] system_call_common+0xf0/0x27c
Instruction dump:
48442e35 60000000 0b030000 3cbe0001 7fa3eb78 7bc48402 38a5fffe 7ca5fa14
78a58402 48442db1 60000000 7c7c1b78 <0b030000> 7f23cb78 4bda371d 60000000
---[ end trace 562fd6c109cd0fb2 ]---
This patch addresses the root cause by not relying on the system_state
value to detect whether the call is due to a hot-plug operation. An extra
parameter is added to link_mem_sections() detailing whether the operation
is due to a hot-plug operation.
[1] According to Oscar Salvador, using this qemu command line, ACPI memory
hotplug operations are raised at SYSTEM_SCHEDULING state:
$QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \
-m size=$MEM,slots=255,maxmem=4294967296k \
-numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \
-object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \
-object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \
-object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \
-object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \
-object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \
-object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \
-object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \
Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com
Fixes: 4fbce633910e ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()")
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/base/node.c | 85 ++++++++++++++++++++++++++---------------
include/linux/node.h | 11 +++--
mm/memory_hotplug.c | 3 -
3 files changed, 64 insertions(+), 35 deletions(-)
--- a/drivers/base/node.c~mm-dont-rely-on-system-state-to-detect-hot-plug-operations
+++ a/drivers/base/node.c
@@ -761,14 +761,36 @@ static int __ref get_nid_for_pfn(unsigne
return pfn_to_nid(pfn);
}
+static int do_register_memory_block_under_node(int nid,
+ struct memory_block *mem_blk)
+{
+ int ret;
+
+ /*
+ * If this memory block spans multiple nodes, we only indicate
+ * the last processed node.
+ */
+ mem_blk->nid = nid;
+
+ ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
+ &mem_blk->dev.kobj,
+ kobject_name(&mem_blk->dev.kobj));
+ if (ret)
+ return ret;
+
+ return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
+ &node_devices[nid]->dev.kobj,
+ kobject_name(&node_devices[nid]->dev.kobj));
+}
+
/* register memory section under specified node if it spans that node */
-static int register_mem_sect_under_node(struct memory_block *mem_blk,
- void *arg)
+static int register_mem_block_under_node_early(struct memory_block *mem_blk,
+ void *arg)
{
unsigned long memory_block_pfns = memory_block_size_bytes() / PAGE_SIZE;
unsigned long start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
unsigned long end_pfn = start_pfn + memory_block_pfns - 1;
- int ret, nid = *(int *)arg;
+ int nid = *(int *)arg;
unsigned long pfn;
for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
@@ -785,39 +807,34 @@ static int register_mem_sect_under_node(
}
/*
- * We need to check if page belongs to nid only for the boot
- * case, during hotplug we know that all pages in the memory
- * block belong to the same node.
- */
- if (system_state == SYSTEM_BOOTING) {
- page_nid = get_nid_for_pfn(pfn);
- if (page_nid < 0)
- continue;
- if (page_nid != nid)
- continue;
- }
-
- /*
- * If this memory block spans multiple nodes, we only indicate
- * the last processed node.
+ * We need to check if page belongs to nid only at the boot
+ * case because node's ranges can be interleaved.
*/
- mem_blk->nid = nid;
-
- ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
- &mem_blk->dev.kobj,
- kobject_name(&mem_blk->dev.kobj));
- if (ret)
- return ret;
+ page_nid = get_nid_for_pfn(pfn);
+ if (page_nid < 0)
+ continue;
+ if (page_nid != nid)
+ continue;
- return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
- &node_devices[nid]->dev.kobj,
- kobject_name(&node_devices[nid]->dev.kobj));
+ return do_register_memory_block_under_node(nid, mem_blk);
}
/* mem section does not span the specified node */
return 0;
}
/*
+ * During hotplug we know that all pages in the memory block belong to the same
+ * node.
+ */
+static int register_mem_block_under_node_hotplug(struct memory_block *mem_blk,
+ void *arg)
+{
+ int nid = *(int *)arg;
+
+ return do_register_memory_block_under_node(nid, mem_blk);
+}
+
+/*
* Unregister a memory block device under the node it spans. Memory blocks
* with multiple nodes cannot be offlined and therefore also never be removed.
*/
@@ -832,11 +849,19 @@ void unregister_memory_block_under_nodes
kobject_name(&node_devices[mem_blk->nid]->dev.kobj));
}
-int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn)
+int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn,
+ enum meminit_context context)
{
+ walk_memory_blocks_func_t func;
+
+ if (context == MEMINIT_HOTPLUG)
+ func = register_mem_block_under_node_hotplug;
+ else
+ func = register_mem_block_under_node_early;
+
return walk_memory_blocks(PFN_PHYS(start_pfn),
PFN_PHYS(end_pfn - start_pfn), (void *)&nid,
- register_mem_sect_under_node);
+ func);
}
#ifdef CONFIG_HUGETLBFS
--- a/include/linux/node.h~mm-dont-rely-on-system-state-to-detect-hot-plug-operations
+++ a/include/linux/node.h
@@ -99,11 +99,13 @@ extern struct node *node_devices[];
typedef void (*node_registration_func_t)(struct node *);
#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
-extern int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn);
+int link_mem_sections(int nid, unsigned long start_pfn,
+ unsigned long end_pfn,
+ enum meminit_context context);
#else
static inline int link_mem_sections(int nid, unsigned long start_pfn,
- unsigned long end_pfn)
+ unsigned long end_pfn,
+ enum meminit_context context)
{
return 0;
}
@@ -128,7 +130,8 @@ static inline int register_one_node(int
if (error)
return error;
/* link memory sections under this node */
- error = link_mem_sections(nid, start_pfn, end_pfn);
+ error = link_mem_sections(nid, start_pfn, end_pfn,
+ MEMINIT_EARLY);
}
return error;
--- a/mm/memory_hotplug.c~mm-dont-rely-on-system-state-to-detect-hot-plug-operations
+++ a/mm/memory_hotplug.c
@@ -1080,7 +1080,8 @@ int __ref add_memory_resource(int nid, s
}
/* link memory sections under this node.*/
- ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1));
+ ret = link_mem_sections(nid, PFN_DOWN(start), PFN_UP(start + size - 1),
+ MEMINIT_HOTPLUG);
BUG_ON(ret);
/* create new memmap entry */
_
From: Mikulas Patocka <mpatocka(a)redhat.com>
Subject: arch/x86/lib/usercopy_64.c: fix __copy_user_flushcache() cache writeback
If we copy less than 8 bytes and if the destination crosses a cache line,
__copy_user_flushcache would invalidate only the first cache line. This
patch makes it invalidate the second cache line as well.
Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2009161451140.21915@file01.intran…
Fixes: 0aed55af88345b ("x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations")
Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com>
Reviewed-by: Dan Williams <dan.j.wiilliams(a)intel.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Jeff Moyer <jmoyer(a)redhat.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Toshi Kani <toshi.kani(a)hpe.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Matthew Wilcox <mawilcox(a)microsoft.com>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Ingo Molnar <mingo(a)elte.hu>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/x86/lib/usercopy_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/lib/usercopy_64.c~arch-x86-lib-usercopy_64c-fix-__copy_user_flushcache-cache-writeback
+++ a/arch/x86/lib/usercopy_64.c
@@ -120,7 +120,7 @@ long __copy_user_flushcache(void *dst, c
*/
if (size < 8) {
if (!IS_ALIGNED(dest, 4) || size != 4)
- clean_cache_range(dst, 1);
+ clean_cache_range(dst, size);
} else {
if (!IS_ALIGNED(dest, 8)) {
dest = ALIGN(dest, boot_cpu_data.x86_clflush_size);
_
From: Nick Desaulniers <ndesaulniers(a)google.com>
Subject: lib/string.c: implement stpcpy
LLVM implemented a recent "libcall optimization" that lowers calls to
`sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest,
str) - dest`. This generally avoids the machinery involved in parsing
format strings. `stpcpy` is just like `strcpy` except it returns the
pointer to the new tail of `dest`. This optimization was introduced into
clang-12.
Implement this so that we don't observe linkage failures due to missing
symbol definitions for `stpcpy`.
Similar to last year's fire drill with:
commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
The kernel is somewhere between a "freestanding" environment (no full
libc) and "hosted" environment (many symbols from libc exist with the same
type, function signature, and semantics).
As H. Peter Anvin notes, there's not really a great way to inform the
compiler that you're targeting a freestanding environment but would like
to opt-in to some libcall optimizations (see pr/47280 below), rather than
opt-out.
Arvind notes, -fno-builtin-* behaves slightly differently between GCC
and Clang, and Clang is missing many __builtin_* definitions, which I
consider a bug in Clang and am working on fixing.
Masahiro summarizes the subtle distinction between compilers justly:
To prevent transformation from foo() into bar(), there are two ways in
Clang to do that; -fno-builtin-foo, and -fno-builtin-bar. There is
only one in GCC; -fno-buitin-foo.
(Any difference in that behavior in Clang is likely a bug from a missing
__builtin_* definition.)
Masahiro also notes:
We want to disable optimization from foo() to bar(),
but we may still benefit from the optimization from
foo() into something else. If GCC implements the same transform, we
would run into a problem because it is not -fno-builtin-bar, but
-fno-builtin-foo that disables that optimization.
In this regard, -fno-builtin-foo would be more future-proof than
-fno-built-bar, but -fno-builtin-foo is still potentially overkill. We
may want to prevent calls from foo() being optimized into calls to
bar(), but we still may want other optimization on calls to foo().
It seems that compilers today don't quite provide the fine grain control
over which libcall optimizations pseudo-freestanding environments would
prefer.
Finally, Kees notes that this interface is unsafe, so we should not
encourage its use. As such, I've removed the declaration from any
header, but it still needs to be exported to avoid linkage errors in
modules.
Link: https://lkml.kernel.org/r/20200914161643.938408-1-ndesaulniers@google.com
Link: https://bugs.llvm.org/show_bug.cgi?id=47162
Link: https://bugs.llvm.org/show_bug.cgi?id=47280
Link: https://github.com/ClangBuiltLinux/linux/issues/1126
Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
Link: https://reviews.llvm.org/D85963
Reported-by: Sami Tolvanen <samitolvanen(a)google.com>
Suggested-by: Andy Lavr <andy.lavr(a)gmail.com>
Suggested-by: Arvind Sankar <nivedita(a)alum.mit.edu>
Suggested-by: Joe Perches <joe(a)perches.com>
Suggested-by: Kees Cook <keescook(a)chromium.org>
Suggested-by: Masahiro Yamada <masahiroy(a)kernel.org>
Suggested-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/string.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
--- a/lib/string.c~lib-stringc-implement-stpcpy
+++ a/lib/string.c
@@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const ch
}
EXPORT_SYMBOL(strscpy_pad);
+/**
+ * stpcpy - copy a string from src to dest returning a pointer to the new end
+ * of dest, including src's %NUL-terminator. May overrun dest.
+ * @dest: pointer to end of string being copied into. Must be large enough
+ * to receive copy.
+ * @src: pointer to the beginning of string being copied from. Must not overlap
+ * dest.
+ *
+ * stpcpy differs from strcpy in a key way: the return value is a pointer
+ * to the new %NUL-terminating character in @dest. (For strcpy, the return
+ * value is a pointer to the start of @dest). This interface is considered
+ * unsafe as it doesn't perform bounds checking of the inputs. As such it's
+ * not recommended for usage. Instead, its definition is provided in case
+ * the compiler lowers other libcalls to stpcpy.
+ */
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src);
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)
+{
+ while ((*dest++ = *src++) != '\0')
+ /* nothing */;
+ return --dest;
+}
+EXPORT_SYMBOL(stpcpy);
+
#ifndef __HAVE_ARCH_STRCAT
/**
* strcat - Append one %NUL-terminated string to another
_
From: Vasily Gorbik <gor(a)linux.ibm.com>
Subject: mm/gup: fix gup_fast with dynamic page table folding
Currently to make sure that every page table entry is read just once
gup_fast walks perform READ_ONCE and pass pXd value down to the next
gup_pXd_range function by value e.g.:
static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
...
pudp = pud_offset(&p4d, addr);
This function passes a reference on that local value copy to pXd_offset,
and might get the very same pointer in return. This happens when the
level is folded (on most arches), and that pointer should not be iterated.
On s390 due to the fact that each task might have different 5,4 or 3-level
address translation and hence different levels folded the logic is more
complex and non-iteratable pointer to a local copy leads to severe
problems.
Here is an example of what happens with gup_fast on s390, for a task with
3-levels paging, crossing a 2 GB pud boundary:
// addr = 0x1007ffff000, end = 0x10080001000
static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pud_t *pudp;
// pud_offset returns &p4d itself (a pointer to a value on stack)
pudp = pud_offset(&p4d, addr);
do {
// on second iteratation reading "random" stack value
pud_t pud = READ_ONCE(*pudp);
// next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390
next = pud_addr_end(addr, end);
...
} while (pudp++, addr = next, addr != end); // pudp++ iterating over stack
return 1;
}
This happens since s390 moved to common gup code with commit d1874a0c2805
("s390/mm: make the pxd_offset functions more robust") and commit
1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code").
s390 tried to mimic static level folding by changing pXd_offset
primitives to always calculate top level page table offset in pgd_offset
and just return the value passed when pXd_offset has to act as folded.
What is crucial for gup_fast and what has been overlooked is that
PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly.
And the latter is not possible with dynamic folding.
To fix the issue in addition to pXd values pass original pXdp pointers
down to gup_pXd_range functions. And introduce pXd_offset_lockless
helpers, which take an additional pXd entry value parameter. This has
already been discussed in
https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1
Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856…
Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code")
Signed-off-by: Vasily Gorbik <gor(a)linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Russell King <linux(a)armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: Jeff Dike <jdike(a)addtoit.com>
Cc: Richard Weinberger <richard(a)nod.at>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org> [5.2+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/s390/include/asm/pgtable.h | 42 +++++++++++++++++++++---------
include/linux/pgtable.h | 10 +++++++
mm/gup.c | 18 ++++++------
3 files changed, 49 insertions(+), 21 deletions(-)
--- a/arch/s390/include/asm/pgtable.h~mm-gup-fix-gup_fast-with-dynamic-page-table-folding
+++ a/arch/s390/include/asm/pgtable.h
@@ -1260,26 +1260,44 @@ static inline pgd_t *pgd_offset_raw(pgd_
#define pgd_offset(mm, address) pgd_offset_raw(READ_ONCE((mm)->pgd), address)
-static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
+static inline p4d_t *p4d_offset_lockless(pgd_t *pgdp, pgd_t pgd, unsigned long address)
{
- if ((pgd_val(*pgd) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R1)
- return (p4d_t *) pgd_deref(*pgd) + p4d_index(address);
- return (p4d_t *) pgd;
+ if ((pgd_val(pgd) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R1)
+ return (p4d_t *) pgd_deref(pgd) + p4d_index(address);
+ return (p4d_t *) pgdp;
}
+#define p4d_offset_lockless p4d_offset_lockless
-static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
+static inline p4d_t *p4d_offset(pgd_t *pgdp, unsigned long address)
{
- if ((p4d_val(*p4d) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R2)
- return (pud_t *) p4d_deref(*p4d) + pud_index(address);
- return (pud_t *) p4d;
+ return p4d_offset_lockless(pgdp, *pgdp, address);
+}
+
+static inline pud_t *pud_offset_lockless(p4d_t *p4dp, p4d_t p4d, unsigned long address)
+{
+ if ((p4d_val(p4d) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R2)
+ return (pud_t *) p4d_deref(p4d) + pud_index(address);
+ return (pud_t *) p4dp;
+}
+#define pud_offset_lockless pud_offset_lockless
+
+static inline pud_t *pud_offset(p4d_t *p4dp, unsigned long address)
+{
+ return pud_offset_lockless(p4dp, *p4dp, address);
}
#define pud_offset pud_offset
-static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
+static inline pmd_t *pmd_offset_lockless(pud_t *pudp, pud_t pud, unsigned long address)
+{
+ if ((pud_val(pud) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R3)
+ return (pmd_t *) pud_deref(pud) + pmd_index(address);
+ return (pmd_t *) pudp;
+}
+#define pmd_offset_lockless pmd_offset_lockless
+
+static inline pmd_t *pmd_offset(pud_t *pudp, unsigned long address)
{
- if ((pud_val(*pud) & _REGION_ENTRY_TYPE_MASK) >= _REGION_ENTRY_TYPE_R3)
- return (pmd_t *) pud_deref(*pud) + pmd_index(address);
- return (pmd_t *) pud;
+ return pmd_offset_lockless(pudp, *pudp, address);
}
#define pmd_offset pmd_offset
--- a/include/linux/pgtable.h~mm-gup-fix-gup_fast-with-dynamic-page-table-folding
+++ a/include/linux/pgtable.h
@@ -1427,6 +1427,16 @@ typedef unsigned int pgtbl_mod_mask;
#define mm_pmd_folded(mm) __is_defined(__PAGETABLE_PMD_FOLDED)
#endif
+#ifndef p4d_offset_lockless
+#define p4d_offset_lockless(pgdp, pgd, address) p4d_offset(&(pgd), address)
+#endif
+#ifndef pud_offset_lockless
+#define pud_offset_lockless(p4dp, p4d, address) pud_offset(&(p4d), address)
+#endif
+#ifndef pmd_offset_lockless
+#define pmd_offset_lockless(pudp, pud, address) pmd_offset(&(pud), address)
+#endif
+
/*
* p?d_leaf() - true if this entry is a final mapping to a physical address.
* This differs from p?d_huge() by the fact that they are always available (if
--- a/mm/gup.c~mm-gup-fix-gup_fast-with-dynamic-page-table-folding
+++ a/mm/gup.c
@@ -2485,13 +2485,13 @@ static int gup_huge_pgd(pgd_t orig, pgd_
return 1;
}
-static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
+static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pmd_t *pmdp;
- pmdp = pmd_offset(&pud, addr);
+ pmdp = pmd_offset_lockless(pudp, pud, addr);
do {
pmd_t pmd = READ_ONCE(*pmdp);
@@ -2528,13 +2528,13 @@ static int gup_pmd_range(pud_t pud, unsi
return 1;
}
-static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
+static int gup_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pud_t *pudp;
- pudp = pud_offset(&p4d, addr);
+ pudp = pud_offset_lockless(p4dp, p4d, addr);
do {
pud_t pud = READ_ONCE(*pudp);
@@ -2549,20 +2549,20 @@ static int gup_pud_range(p4d_t p4d, unsi
if (!gup_huge_pd(__hugepd(pud_val(pud)), addr,
PUD_SHIFT, next, flags, pages, nr))
return 0;
- } else if (!gup_pmd_range(pud, addr, next, flags, pages, nr))
+ } else if (!gup_pmd_range(pudp, pud, addr, next, flags, pages, nr))
return 0;
} while (pudp++, addr = next, addr != end);
return 1;
}
-static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end,
+static int gup_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
p4d_t *p4dp;
- p4dp = p4d_offset(&pgd, addr);
+ p4dp = p4d_offset_lockless(pgdp, pgd, addr);
do {
p4d_t p4d = READ_ONCE(*p4dp);
@@ -2574,7 +2574,7 @@ static int gup_p4d_range(pgd_t pgd, unsi
if (!gup_huge_pd(__hugepd(p4d_val(p4d)), addr,
P4D_SHIFT, next, flags, pages, nr))
return 0;
- } else if (!gup_pud_range(p4d, addr, next, flags, pages, nr))
+ } else if (!gup_pud_range(p4dp, p4d, addr, next, flags, pages, nr))
return 0;
} while (p4dp++, addr = next, addr != end);
@@ -2602,7 +2602,7 @@ static void gup_pgd_range(unsigned long
if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr,
PGDIR_SHIFT, next, flags, pages, nr))
return;
- } else if (!gup_p4d_range(pgd, addr, next, flags, pages, nr))
+ } else if (!gup_p4d_range(pgdp, pgd, addr, next, flags, pages, nr))
return;
} while (pgdp++, addr = next, addr != end);
}
_
From: Gao Xiang <hsiangkao(a)redhat.com>
Subject: mm, THP, swap: fix allocating cluster for swapfile by mistake
SWP_FS is used to make swap_{read,write}page() go through the filesystem,
and it's only used for swap files over NFS. So, !SWP_FS means non NFS for
now, it could be either file backed or device backed. Something similar
goes with legacy SWP_FILE.
So in order to achieve the goal of the original patch, SWP_BLKDEV should
be used instead.
FS corruption can be observed with SSD device + XFS + fragmented swapfile
due to CONFIG_THP_SWAP=y.
I reproduced the issue with the following details:
Environment:
QEMU + upstream kernel + buildroot + NVMe (2 GB)
Kernel config:
CONFIG_BLK_DEV_NVME=y
CONFIG_THP_SWAP=y
Some reproducable steps:
mkfs.xfs -f /dev/nvme0n1
mkdir /tmp/mnt
mount /dev/nvme0n1 /tmp/mnt
bs="32k"
sz="1024m" # doesn't matter too much, I also tried 16m
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
mkswap /tmp/mnt/sw
swapon /tmp/mnt/sw
stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
Symptoms:
- FS corruption (e.g. checksum failure)
- memory corruption at: 0xd2808010
- segfault
Link: https://lkml.kernel.org/r/20200820045323.7809-1-hsiangkao@redhat.com
Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Acked-by: Rafael Aquini <aquini(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Carlos Maiolino <cmaiolino(a)redhat.com>
Cc: Eric Sandeen <esandeen(a)redhat.com>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/swapfile.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/swapfile.c~mm-thp-swap-fix-allocating-cluster-for-swapfile-by-mistake
+++ a/mm/swapfile.c
@@ -1078,7 +1078,7 @@ start_over:
goto nextsi;
}
if (size == SWAPFILE_CLUSTER) {
- if (!(si->flags & SWP_FS))
+ if (si->flags & SWP_BLKDEV)
n_ret = swap_alloc_cluster(si, swp_entries);
} else
n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
_
The patch titled
Subject: mm/memory_hotplug: drain per-cpu pages again during memory offline
has been removed from the -mm tree. Its filename was
mm-memory_hotplug-drain-per-cpu-pages-again-during-memory-offline.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Subject: mm/memory_hotplug: drain per-cpu pages again during memory offline
There is a race during page offline that can lead to infinite loop:
a page never ends up on a buddy list and __offline_pages() keeps
retrying infinitely or until a termination signal is received.
Thread#1 - a new process:
load_elf_binary
begin_new_exec
exec_mmap
mmput
exit_mmap
tlb_finish_mmu
tlb_flush_mmu
release_pages
free_unref_page_list
free_unref_page_prepare
set_pcppage_migratetype(page, migratetype);
// Set page->index migration type below MIGRATE_PCPTYPES
Thread#2 - hot-removes memory
__offline_pages
start_isolate_page_range
set_migratetype_isolate
set_pageblock_migratetype(page, MIGRATE_ISOLATE);
Set migration type to MIGRATE_ISOLATE-> set
drain_all_pages(zone);
// drain per-cpu page lists to buddy allocator.
Thread#1 - continue
free_unref_page_commit
migratetype = get_pcppage_migratetype(page);
// get old migration type
list_add(&page->lru, &pcp->lists[migratetype]);
// add new page to already drained pcp list
Thread#2
Never drains pcp again, and therefore gets stuck in the loop.
The fix is to try to drain per-cpu lists again after
check_pages_isolated_cb() fails.
Link: https://lkml.kernel.org/r/20200903140032.380431-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20200904151448.100489-2-pasha.tatashin@soleen.com
Link: http://lkml.kernel.org/r/20200904070235.GA15277@dhcp22.suse.cz
Fixes: c52e75935f8d ("mm: remove extra drain pages on pcp list")
Signed-off-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory_hotplug.c | 14 ++++++++++++++
mm/page_isolation.c | 8 ++++++++
2 files changed, 22 insertions(+)
--- a/mm/memory_hotplug.c~mm-memory_hotplug-drain-per-cpu-pages-again-during-memory-offline
+++ a/mm/memory_hotplug.c
@@ -1575,6 +1575,20 @@ static int __ref __offline_pages(unsigne
/* check again */
ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
NULL, check_pages_isolated_cb);
+ /*
+ * per-cpu pages are drained in start_isolate_page_range, but if
+ * there are still pages that are not free, make sure that we
+ * drain again, because when we isolated range we might
+ * have raced with another thread that was adding pages to pcp
+ * list.
+ *
+ * Forward progress should be still guaranteed because
+ * pages on the pcp list can only belong to MOVABLE_ZONE
+ * because has_unmovable_pages explicitly checks for
+ * PageBuddy on freed pages on other zones.
+ */
+ if (ret)
+ drain_all_pages(zone);
} while (ret);
/* Ok, all of our target is isolated.
--- a/mm/page_isolation.c~mm-memory_hotplug-drain-per-cpu-pages-again-during-memory-offline
+++ a/mm/page_isolation.c
@@ -170,6 +170,14 @@ __first_valid_page(unsigned long pfn, un
* pageblocks we may have modified and return -EBUSY to caller. This
* prevents two threads from simultaneously working on overlapping ranges.
*
+ * Please note that there is no strong synchronization with the page allocator
+ * either. Pages might be freed while their page blocks are marked ISOLATED.
+ * In some cases pages might still end up on pcp lists and that would allow
+ * for their allocation even when they are in fact isolated already. Depending
+ * on how strong of a guarantee the caller needs drain_all_pages might be needed
+ * (e.g. __offline_pages will need to call it after check for isolated range for
+ * a next retry).
+ *
* Return: the number of isolated pageblocks on success and -EBUSY if any part
* of range cannot be isolated.
*/
_
Patches currently in -mm which might be from pasha.tatashin(a)soleen.com are
The patch titled
Subject: selftests/vm: fix display of page size in map_hugetlb
has been removed from the -mm tree. Its filename was
selftests-vm-fix-display-of-page-size-in-map_hugetlb.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Subject: selftests/vm: fix display of page size in map_hugetlb
The displayed size is in bytes while the text says it is in kB.
Shift it by 10 to really display kBytes.
Link: https://lkml.kernel.org/r/e27481224564a93d14106e750de31189deaa8bc8.15988619…
Fixes: fa7b9a805c79 ("tools/selftest/vm: allow choosing mem size and page size in map_hugetlb")
Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/vm/map_hugetlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/vm/map_hugetlb.c~selftests-vm-fix-display-of-page-size-in-map_hugetlb
+++ a/tools/testing/selftests/vm/map_hugetlb.c
@@ -83,7 +83,7 @@ int main(int argc, char **argv)
}
if (shift)
- printf("%u kB hugepages\n", 1 << shift);
+ printf("%u kB hugepages\n", 1 << (shift - 10));
else
printf("Default size hugepages\n");
printf("Mapping %lu Mbytes\n", (unsigned long)length >> 20);
_
Patches currently in -mm which might be from christophe.leroy(a)csgroup.eu are
The patch titled
Subject: mm/thp: fix __split_huge_pmd_locked() for migration PMD
has been removed from the -mm tree. Its filename was
mm-thp-fix-__split_huge_pmd_locked-for-migration-pmd.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Ralph Campbell <rcampbell(a)nvidia.com>
Subject: mm/thp: fix __split_huge_pmd_locked() for migration PMD
A migrating transparent huge page has to already be unmapped. Otherwise,
the page could be modified while it is being copied to a new page and data
could be lost. The function __split_huge_pmd() checks for a PMD migration
entry before calling __split_huge_pmd_locked() leading one to think that
__split_huge_pmd_locked() can handle splitting a migrating PMD.
However, the code always increments the page->_mapcount and adjusts the
memory control group accounting assuming the page is mapped.
Also, if the PMD entry is a migration PMD entry, the call to
is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead
of migration_entry_to_pfn(pmd_to_swp_entry(pmd)). Fix these problems by
checking for a PMD migration entry.
Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com
Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Reviewed-by: Zi Yan <ziy(a)nvidia.com>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: Bharata B Rao <bharata(a)linux.ibm.com>
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 42 +++++++++++++++++++++++-------------------
1 file changed, 23 insertions(+), 19 deletions(-)
--- a/mm/huge_memory.c~mm-thp-fix-__split_huge_pmd_locked-for-migration-pmd
+++ a/mm/huge_memory.c
@@ -2022,7 +2022,7 @@ static void __split_huge_pmd_locked(stru
put_page(page);
add_mm_counter(mm, mm_counter_file(page), -HPAGE_PMD_NR);
return;
- } else if (is_huge_zero_pmd(*pmd)) {
+ } else if (pmd_trans_huge(*pmd) && is_huge_zero_pmd(*pmd)) {
/*
* FIXME: Do we want to invalidate secondary mmu by calling
* mmu_notifier_invalidate_range() see comments below inside
@@ -2116,30 +2116,34 @@ static void __split_huge_pmd_locked(stru
pte = pte_offset_map(&_pmd, addr);
BUG_ON(!pte_none(*pte));
set_pte_at(mm, addr, pte, entry);
- atomic_inc(&page[i]._mapcount);
- pte_unmap(pte);
- }
-
- /*
- * Set PG_double_map before dropping compound_mapcount to avoid
- * false-negative page_mapped().
- */
- if (compound_mapcount(page) > 1 && !TestSetPageDoubleMap(page)) {
- for (i = 0; i < HPAGE_PMD_NR; i++)
+ if (!pmd_migration)
atomic_inc(&page[i]._mapcount);
+ pte_unmap(pte);
}
- lock_page_memcg(page);
- if (atomic_add_negative(-1, compound_mapcount_ptr(page))) {
- /* Last compound_mapcount is gone. */
- __dec_lruvec_page_state(page, NR_ANON_THPS);
- if (TestClearPageDoubleMap(page)) {
- /* No need in mapcount reference anymore */
+ if (!pmd_migration) {
+ /*
+ * Set PG_double_map before dropping compound_mapcount to avoid
+ * false-negative page_mapped().
+ */
+ if (compound_mapcount(page) > 1 &&
+ !TestSetPageDoubleMap(page)) {
for (i = 0; i < HPAGE_PMD_NR; i++)
- atomic_dec(&page[i]._mapcount);
+ atomic_inc(&page[i]._mapcount);
+ }
+
+ lock_page_memcg(page);
+ if (atomic_add_negative(-1, compound_mapcount_ptr(page))) {
+ /* Last compound_mapcount is gone. */
+ __dec_lruvec_page_state(page, NR_ANON_THPS);
+ if (TestClearPageDoubleMap(page)) {
+ /* No need in mapcount reference anymore */
+ for (i = 0; i < HPAGE_PMD_NR; i++)
+ atomic_dec(&page[i]._mapcount);
+ }
}
+ unlock_page_memcg(page);
}
- unlock_page_memcg(page);
smp_wmb(); /* make pte visible before pmd */
pmd_populate(mm, pmd, pgtable);
_
Patches currently in -mm which might be from rcampbell(a)nvidia.com are
mm-test-use-the-new-skip-macro.patch
hmm-test-remove-unused-dmirror_zero_page.patch
mm-move-call-to-compound_head-in-release_pages.patch
mm-migrate-remove-cpages-in-migrate_vma_finalize.patch
mm-migrate-remove-obsolete-comment-about-device-public.patch
The patch titled
Subject: kprobes: fix kill kprobe which has been marked as gone
has been removed from the -mm tree. Its filename was
kprobes-fix-kill-kprobe-which-has-been-marked-as-gone.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: kprobes: fix kill kprobe which has been marked as gone
If a kprobe is marked as gone, we should not kill it again. Otherwise, we
can disarm the kprobe more than once. In that case, the statistics of
kprobe_ftrace_enabled can unbalance which can lead to that kprobe do not
work.
Link: https://lkml.kernel.org/r/20200822030055.32383-1-songmuchun@bytedance.com
Fixes: e8386a0cb22f ("kprobes: support probing module __exit function")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Co-developed-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Signed-off-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: "Naveen N . Rao" <naveen.n.rao(a)linux.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy(a)intel.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/kprobes.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/kernel/kprobes.c~kprobes-fix-kill-kprobe-which-has-been-marked-as-gone
+++ a/kernel/kprobes.c
@@ -2140,6 +2140,9 @@ static void kill_kprobe(struct kprobe *p
lockdep_assert_held(&kprobe_mutex);
+ if (WARN_ON_ONCE(kprobe_gone(p)))
+ return;
+
p->flags |= KPROBE_FLAG_GONE;
if (kprobe_aggrprobe(p)) {
/*
@@ -2419,7 +2422,10 @@ static int kprobes_module_callback(struc
mutex_lock(&kprobe_mutex);
for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
head = &kprobe_table[i];
- hlist_for_each_entry(p, head, hlist)
+ hlist_for_each_entry(p, head, hlist) {
+ if (kprobe_gone(p))
+ continue;
+
if (within_module_init((unsigned long)p->addr, mod) ||
(checkcore &&
within_module_core((unsigned long)p->addr, mod))) {
@@ -2436,6 +2442,7 @@ static int kprobes_module_callback(struc
*/
kill_kprobe(p);
}
+ }
}
if (val == MODULE_STATE_GOING)
remove_module_kprobe_blacklist(mod);
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-memcontrol-fix-missing-suffix-of-workingset_restore.patch
mm-memcontrol-add-the-missing-numa_stat-interface-for-cgroup-v2.patch
The patch titled
Subject: ksm: reinstate memcg charge on copied pages
has been removed from the -mm tree. Its filename was
ksm-reinstate-memcg-charge-on-copied-pages.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: ksm: reinstate memcg charge on copied pages
Patch series "mm: fixes to past from future testing".
Here's a set of independent fixes against 5.9-rc2: prompted by
testing Alex Shi's "warning on !memcg" and lru_lock series, but
I think fit for 5.9 - though maybe only the first for stable.
This patch (of 5):
In 5.8 some instances of memcg charging in do_swap_page() and unuse_pte()
were removed, on the understanding that swap cache is now already charged
at those points; but a case was missed, when ksm_might_need_to_copy() has
decided it must allocate a substitute page: such pages were never charged.
Fix it inside ksm_might_need_to_copy().
This was discovered by Alex Shi's prospective commit "mm/memcg: warning on
!memcg after readahead page charged".
But there is a another surprise: this also fixes some rarer uncharged
PageAnon cases, when KSM is configured in, but has never been activated.
ksm_might_need_to_copy()'s anon_vma->root and linear_page_index() check
sometimes catches a case which would need to have been copied if KSM were
turned on. Or that's my optimistic interpretation (of my own old code),
but it leaves some doubt as to whether everything is working as intended
there - might it hint at rare anon ptes which rmap cannot find? A
question not easily answered: put in the fix for missed memcg charges.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301343270.5954@eggly.anvils
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301358020.5954@eggly.anvils
Fixes: 4c6355b25e8b ("mm: memcontrol: charge swapin pages on instantiation")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Alex Shi <alex.shi(a)linux.alibaba.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc; Matthew Wilcox <willy(a)infradead.org>
Cc: Qian Cai <cai(a)lca.pw>
Cc: <stable(a)vger.kernel.org> [5.8]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/ksm.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/mm/ksm.c~ksm-reinstate-memcg-charge-on-copied-pages
+++ a/mm/ksm.c
@@ -2586,6 +2586,10 @@ struct page *ksm_might_need_to_copy(stru
return page; /* let do_swap_page report the error */
new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
+ if (new_page && mem_cgroup_charge(new_page, vma->vm_mm, GFP_KERNEL)) {
+ put_page(new_page);
+ new_page = NULL;
+ }
if (new_page) {
copy_user_highpage(new_page, page, address, vma);
_
Patches currently in -mm which might be from hughd(a)google.com are
CLK_TO_US macro is used to calculate potential transfer time for various
timeout handling. However it overflows on transfer bigger than 512 bytes
because it first did (len * 8 * 1000000).
This controller typically operates at 45MHz. This patch did 2 things:
1. calculate clock / 1000000 first
2. add a 4M transfer size cap so that the final timeout in DMA reading
doesn't overflow
Fixes: 881d1ee9fe81f ("spi: add support for mediatek spi-nor controller")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Chuanhong Guo <gch981213(a)gmail.com>
---
Change since v1: fix transfer size cap to 4M
drivers/spi/spi-mtk-nor.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-mtk-nor.c b/drivers/spi/spi-mtk-nor.c
index 6e6ca2b8e6c82..62f5ff2779884 100644
--- a/drivers/spi/spi-mtk-nor.c
+++ b/drivers/spi/spi-mtk-nor.c
@@ -89,7 +89,7 @@
// Buffered page program can do one 128-byte transfer
#define MTK_NOR_PP_SIZE 128
-#define CLK_TO_US(sp, clkcnt) ((clkcnt) * 1000000 / sp->spi_freq)
+#define CLK_TO_US(sp, clkcnt) DIV_ROUND_UP(clkcnt, sp->spi_freq / 1000000)
struct mtk_nor {
struct spi_controller *ctlr;
@@ -177,6 +177,10 @@ static int mtk_nor_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op)
if ((op->addr.nbytes == 3) || (op->addr.nbytes == 4)) {
if ((op->data.dir == SPI_MEM_DATA_IN) &&
mtk_nor_match_read(op)) {
+ // limit size to prevent timeout calculation overflow
+ if (op->data.nbytes > 0x400000)
+ op->data.nbytes = 0x400000;
+
if ((op->addr.val & MTK_NOR_DMA_ALIGN_MASK) ||
(op->data.nbytes < MTK_NOR_DMA_ALIGN))
op->data.nbytes = 1;
--
2.26.2
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: cf9938637c5c - Linux 5.8.12-rc1
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 2:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 01a94ace543c - net/mlx5e: Fix endianness when calculating pedit mask first bit
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Podman system integration test - as root
⚡⚡⚡ Podman system integration test - as user
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory: fork_mem
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 4:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 5:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
x86_64:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 2:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 8375d24eb6f6 - net/mlx5e: Fix endianness when calculating pedit mask first bit
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ❌ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
The basic permission bits (protection bits in AmigaOS) have been broken
in Linux' affs - it would only set bits, but never delete them.
Also, contrary to the documentation, the Archived bit was not handled.
Let's fix this for good, and set the bits such that Linux and classic
AmigaOS can coexist in the most peaceful manner.
Also, update the documentation to represent the current state of things.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Max Staudt <max(a)enpas.org>
---
Documentation/filesystems/affs.rst | 16 ++++++++++------
fs/affs/amigaffs.c | 27 +++++++++++++++++++++++++++
fs/affs/file.c | 27 ++++++++++++++++++++++++++-
3 files changed, 63 insertions(+), 7 deletions(-)
diff --git a/Documentation/filesystems/affs.rst b/Documentation/filesystems/affs.rst
index 7f1a40dce6d3..5776cbd5fa53 100644
--- a/Documentation/filesystems/affs.rst
+++ b/Documentation/filesystems/affs.rst
@@ -110,13 +110,15 @@ The Amiga protection flags RWEDRWEDHSPARWED are handled as follows:
- R maps to r for user, group and others. On directories, R implies x.
- - If both W and D are allowed, w will be set.
+ - W maps to w.
- E maps to x.
- - H and P are always retained and ignored under Linux.
+ - D is ignored.
- - A is always reset when a file is written to.
+ - H, S and P are always retained and ignored under Linux.
+
+ - A is cleared when a file is written to.
User id and group id will be used unless set[gu]id are given as mount
options. Since most of the Amiga file systems are single user systems
@@ -128,11 +130,13 @@ Linux -> Amiga:
The Linux rwxrwxrwx file mode is handled as follows:
- - r permission will set R for user, group and others.
+ - r permission will allow R for user, group and others.
+
+ - w permission will allow W for user, group and others.
- - w permission will set W and D for user, group and others.
+ - x permission of the user will allow E for plain files.
- - x permission of the user will set E for plain files.
+ - D will be allowed for user, group and others.
- All other flags (suid, sgid, ...) are ignored and will
not be retained.
diff --git a/fs/affs/amigaffs.c b/fs/affs/amigaffs.c
index f708c45d5f66..7952f885e6c6 100644
--- a/fs/affs/amigaffs.c
+++ b/fs/affs/amigaffs.c
@@ -420,24 +420,51 @@ affs_mode_to_prot(struct inode *inode)
u32 prot = AFFS_I(inode)->i_protect;
umode_t mode = inode->i_mode;
+ /*
+ * First, clear all RWED bits for owner, group, other.
+ * Then, recalculate them afresh.
+ *
+ * We'll always clear the delete-inhibit bit for the owner,
+ * as that is the classic single-user mode AmigaOS protection
+ * bit and we need to stay compatible with all scenarios.
+ *
+ * Since multi-user AmigaOS is an extension, we'll only set
+ * the delete-allow bit if any of the other bits in the same
+ * user class (group/other) are used.
+ */
+ prot &= ~(FIBF_NOEXECUTE | FIBF_NOREAD
+ | FIBF_NOWRITE | FIBF_NODELETE
+ | FIBF_GRP_EXECUTE | FIBF_GRP_READ
+ | FIBF_GRP_WRITE | FIBF_GRP_DELETE
+ | FIBF_OTR_EXECUTE | FIBF_OTR_READ
+ | FIBF_OTR_WRITE | FIBF_OTR_DELETE);
+
+ /* Classic single-user AmigaOS flags. These are inverted. */
if (!(mode & 0100))
prot |= FIBF_NOEXECUTE;
if (!(mode & 0400))
prot |= FIBF_NOREAD;
if (!(mode & 0200))
prot |= FIBF_NOWRITE;
+
+ /* Multi-user extended flags. Not inverted. */
if (mode & 0010)
prot |= FIBF_GRP_EXECUTE;
if (mode & 0040)
prot |= FIBF_GRP_READ;
if (mode & 0020)
prot |= FIBF_GRP_WRITE;
+ if (mode & 0070)
+ prot |= FIBF_GRP_DELETE;
+
if (mode & 0001)
prot |= FIBF_OTR_EXECUTE;
if (mode & 0004)
prot |= FIBF_OTR_READ;
if (mode & 0002)
prot |= FIBF_OTR_WRITE;
+ if (mode & 0007)
+ prot |= FIBF_OTR_DELETE;
AFFS_I(inode)->i_protect = prot;
}
diff --git a/fs/affs/file.c b/fs/affs/file.c
index a26a0f96c119..9a137e2f1782 100644
--- a/fs/affs/file.c
+++ b/fs/affs/file.c
@@ -429,6 +429,25 @@ static int affs_write_begin(struct file *file, struct address_space *mapping,
return ret;
}
+static int affs_write_end(struct file *file, struct address_space *mapping,
+ loff_t pos, unsigned int len, unsigned int copied,
+ struct page *page, void *fsdata)
+{
+ struct inode *inode = mapping->host;
+ int ret;
+
+ ret = generic_write_end(file, mapping, pos, len, copied,
+ page, fsdata);
+
+ /* Clear Archived bit on file writes, as AmigaOS would do */
+ if (AFFS_I(inode)->i_protect & FIBF_ARCHIVED) {
+ AFFS_I(inode)->i_protect &= ~FIBF_ARCHIVED;
+ mark_inode_dirty(inode);
+ }
+
+ return ret;
+}
+
static sector_t _affs_bmap(struct address_space *mapping, sector_t block)
{
return generic_block_bmap(mapping,block,affs_get_block);
@@ -438,7 +457,7 @@ const struct address_space_operations affs_aops = {
.readpage = affs_readpage,
.writepage = affs_writepage,
.write_begin = affs_write_begin,
- .write_end = generic_write_end,
+ .write_end = affs_write_end,
.direct_IO = affs_direct_IO,
.bmap = _affs_bmap
};
@@ -795,6 +814,12 @@ static int affs_write_end_ofs(struct file *file, struct address_space *mapping,
if (tmp > inode->i_size)
inode->i_size = AFFS_I(inode)->mmu_private = tmp;
+ /* Clear Archived bit on file writes, as AmigaOS would do */
+ if (AFFS_I(inode)->i_protect & FIBF_ARCHIVED) {
+ AFFS_I(inode)->i_protect &= ~FIBF_ARCHIVED;
+ mark_inode_dirty(inode);
+ }
+
err_first_bh:
unlock_page(page);
put_page(page);
--
2.20.1
commit a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab
objects") adds the checks for Slab pages, but the pages don't have
page_count are still missing from the check.
Network layer's sendpage method is not designed to send page_count 0
pages neither, therefore both PageSlab() and page_count() should be
both checked for the sending page. This is exactly what sendpage_ok()
does.
This patch uses sendpage_ok() in do_tcp_sendpages() to detect misused
.sendpage, to make the code more robust.
Fixes: a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab objects")
Suggested-by: Eric Dumazet <eric.dumazet(a)gmail.com>
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Vasily Averin <vvs(a)virtuozzo.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: stable(a)vger.kernel.org
---
net/ipv4/tcp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 31f3b858db81..2135ee7c806d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -970,7 +970,8 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
if (IS_ENABLED(CONFIG_DEBUG_VM) &&
- WARN_ONCE(PageSlab(page), "page must not be a Slab one"))
+ WARN_ONCE(!sendpage_ok(page),
+ "page must not be a Slab one and have page_count > 0"))
return -EINVAL;
/* Wait for a connection to finish. One exception is TCP Fast Open
--
2.26.2
Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
send slab pages. But for pages allocated by __get_free_pages() without
__GFP_COMP, which also have refcount as 0, they are still sent by
kernel_sendpage() to remote end, this is problematic.
The new introduced helper sendpage_ok() checks both PageSlab tag and
page_count counter, and returns true if the checking page is OK to be
sent by kernel_sendpage().
This patch fixes the page checking issue of nvme_tcp_try_send_data()
with sendpage_ok(). If sendpage_ok() returns true, send this page by
kernel_sendpage(), otherwise use sock_no_sendpage to handle this page.
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
drivers/nvme/host/tcp.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 8f4f29f18b8c..d6a3e1487354 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -913,12 +913,11 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
else
flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
- /* can't zcopy slab pages */
- if (unlikely(PageSlab(page))) {
- ret = sock_no_sendpage(queue->sock, page, offset, len,
+ if (sendpage_ok(page)) {
+ ret = kernel_sendpage(queue->sock, page, offset, len,
flags);
} else {
- ret = kernel_sendpage(queue->sock, page, offset, len,
+ ret = sock_no_sendpage(queue->sock, page, offset, len,
flags);
}
if (ret <= 0)
--
2.26.2
This is a note to let you know that I've just added the patch titled
usbcore/driver: Accommodate usbip
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 3fce39601a1a34d940cf62858ee01ed9dac5d459 Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <m.v.b(a)runbox.com>
Date: Tue, 22 Sep 2020 14:07:03 +0300
Subject: usbcore/driver: Accommodate usbip
Commit 88b7381a939d ("USB: Select better matching USB drivers when
available") inadvertently broke usbip functionality. The commit in
question allows USB device drivers to be explicitly matched with
USB devices via the use of driver-provided identifier tables and
match functions, which is useful for a specialised device driver
to be chosen for a device that can also be handled by another,
more generic, device driver.
Prior, the USB device section of usb_device_match() had an
unconditional "return 1" statement, which allowed user-space to bind
USB devices to the usbip_host device driver, if desired. However,
the aforementioned commit changed the default/fallback return
value to zero. This breaks device drivers such as usbip_host, so
this commit restores the legacy behaviour, but only if a device
driver does not have an id_table and a match() function.
In addition, if usb_device_match is called for a device driver
and device pair where the device does not match the id_table of the
device driver in question, then the device driver will be disqualified
for the device. This allows avoiding the default case of "return 1",
which prevents undesirable probe() calls to a driver even though
its id_table did not match the device.
Finally, this commit changes the specialised-driver-to-generic-driver
transition code so that when a device driver returns -ENODEV, a more
generic device driver is only considered if the current device driver
does not have an id_table and a match() function. This ensures that
"generic" drivers such as usbip_host will not be considered specialised
device drivers and will not cause the device to be locked in to the
generic device driver, when a more specialised device driver could be
tried.
All of these changes restore usbip functionality without regressions,
ensure that the specialised/generic device driver selection logic works
as expected with the usb and apple-mfi-fastcharge drivers, and do not
negatively affect the use of devices provided by dummy_hcd.
Fixes: 88b7381a939d ("USB: Select better matching USB drivers when available")
Cc: <stable(a)vger.kernel.org> # 5.8
Cc: Bastien Nocera <hadess(a)hadess.net>
Cc: Valentina Manea <valentina.manea.m(a)gmail.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Alan Stern <stern(a)rowland.harvard.edu>
Cc: <syzkaller(a)googlegroups.com>
Tested-by: Andrey Konovalov <andreyknvl(a)google.com>
Acked-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: M. Vefa Bicakci <m.v.b(a)runbox.com>
Link: https://lore.kernel.org/r/20200922110703.720960-5-m.v.b@runbox.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/driver.c | 37 +++++++++++++++++++++++++++++++------
1 file changed, 31 insertions(+), 6 deletions(-)
diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index b00b7fb1aad1..b351962279e4 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -269,8 +269,30 @@ static int usb_probe_device(struct device *dev)
if (error)
return error;
+ /* Probe the USB device with the driver in hand, but only
+ * defer to a generic driver in case the current USB
+ * device driver has an id_table or a match function; i.e.,
+ * when the device driver was explicitly matched against
+ * a device.
+ *
+ * If the device driver does not have either of these,
+ * then we assume that it can bind to any device and is
+ * not truly a more specialized/non-generic driver, so a
+ * return value of -ENODEV should not force the device
+ * to be handled by the generic USB driver, as there
+ * can still be another, more specialized, device driver.
+ *
+ * This accommodates the usbip driver.
+ *
+ * TODO: What if, in the future, there are multiple
+ * specialized USB device drivers for a particular device?
+ * In such cases, there is a need to try all matching
+ * specialised device drivers prior to setting the
+ * use_generic_driver bit.
+ */
error = udriver->probe(udev);
- if (error == -ENODEV && udriver != &usb_generic_driver) {
+ if (error == -ENODEV && udriver != &usb_generic_driver &&
+ (udriver->id_table || udriver->match)) {
udev->use_generic_driver = 1;
return -EPROBE_DEFER;
}
@@ -831,14 +853,17 @@ static int usb_device_match(struct device *dev, struct device_driver *drv)
udev = to_usb_device(dev);
udrv = to_usb_device_driver(drv);
- if (udrv->id_table &&
- usb_device_match_id(udev, udrv->id_table) != NULL) {
- return 1;
- }
+ if (udrv->id_table)
+ return usb_device_match_id(udev, udrv->id_table) != NULL;
if (udrv->match)
return udrv->match(udev);
- return 0;
+
+ /* If the device driver under consideration does not have a
+ * id_table or a match function, then let the driver's probe
+ * function decide.
+ */
+ return 1;
} else if (is_usb_interface(dev)) {
struct usb_interface *intf;
--
2.28.0
This is a note to let you know that I've just added the patch titled
usbcore/driver: Fix incorrect downcast
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 4df30e7603432704380b12fe40a604ee7f66746d Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <m.v.b(a)runbox.com>
Date: Tue, 22 Sep 2020 14:07:02 +0300
Subject: usbcore/driver: Fix incorrect downcast
This commit resolves a minor bug in the selection/discovery of more
specific USB device drivers for devices that are currently bound to
generic USB device drivers.
The bug is related to the way a candidate USB device driver is
compared against the generic USB device driver. The code in
is_dev_usb_generic_driver() assumes that the device driver in question
is a USB device driver by calling to_usb_device_driver(dev->driver)
to downcast; however I have observed that this assumption is not always
true, through code instrumentation.
This commit avoids the incorrect downcast altogether by comparing
the USB device's driver (i.e., dev->driver) to the generic USB
device driver directly. This method was suggested by Alan Stern.
This bug was found while investigating Andrey Konovalov's report
indicating usbip device driver misbehaviour with the recently merged
generic USB device driver selection feature. The report is linked
below.
Fixes: d5643d2249b2 ("USB: Fix device driver race")
Cc: <stable(a)vger.kernel.org> # 5.8
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Alan Stern <stern(a)rowland.harvard.edu>
Cc: Bastien Nocera <hadess(a)hadess.net>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Valentina Manea <valentina.manea.m(a)gmail.com>
Cc: <syzkaller(a)googlegroups.com>
Tested-by: Andrey Konovalov <andreyknvl(a)google.com>
Signed-off-by: M. Vefa Bicakci <m.v.b(a)runbox.com>
Link: https://lore.kernel.org/r/20200922110703.720960-4-m.v.b@runbox.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/driver.c | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index 715782995428..b00b7fb1aad1 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -905,21 +905,14 @@ static int usb_uevent(struct device *dev, struct kobj_uevent_env *env)
return 0;
}
-static bool is_dev_usb_generic_driver(struct device *dev)
-{
- struct usb_device_driver *udd = dev->driver ?
- to_usb_device_driver(dev->driver) : NULL;
-
- return udd == &usb_generic_driver;
-}
-
static int __usb_bus_reprobe_drivers(struct device *dev, void *data)
{
struct usb_device_driver *new_udriver = data;
struct usb_device *udev;
int ret;
- if (!is_dev_usb_generic_driver(dev))
+ /* Don't reprobe if current driver isn't usb_generic_driver */
+ if (dev->driver != &usb_generic_driver.drvwrap.driver)
return 0;
udev = to_usb_device(dev);
--
2.28.0
This is a note to let you know that I've just added the patch titled
usbcore/driver: Fix specific driver selection
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From aea850cd35ae3d266fe6f93fb9edb25e4a555230 Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <m.v.b(a)runbox.com>
Date: Tue, 22 Sep 2020 14:07:01 +0300
Subject: usbcore/driver: Fix specific driver selection
This commit resolves a bug in the selection/discovery of more
specific USB device drivers for devices that are currently bound to
generic USB device drivers.
The bug is in the logic that determines whether a device currently
bound to a generic USB device driver should be re-probed by a
more specific USB device driver or not. The code in
__usb_bus_reprobe_drivers() used to have the following lines:
if (usb_device_match_id(udev, new_udriver->id_table) == NULL &&
(!new_udriver->match || new_udriver->match(udev) != 0))
return 0;
ret = device_reprobe(dev);
As the reader will notice, the code checks whether the USB device in
consideration matches the identifier table (id_table) of a specific
USB device_driver (new_udriver), followed by a similar check, but this
time with the USB device driver's match function. However, the match
function's return value is not checked correctly. When match() returns
zero, it means that the specific USB device driver is *not* applicable
to the USB device in question, but the code then goes on to reprobe the
device with the new USB device driver under consideration. All this to
say, the logic is inverted.
This bug was found by code inspection and instrumentation while
investigating the root cause of the issue reported by Andrey Konovalov,
where usbip took over syzkaller's virtual USB devices in an undesired
manner. The report is linked below.
Fixes: d5643d2249b2 ("USB: Fix device driver race")
Cc: <stable(a)vger.kernel.org> # 5.8
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Alan Stern <stern(a)rowland.harvard.edu>
Cc: Bastien Nocera <hadess(a)hadess.net>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Valentina Manea <valentina.manea.m(a)gmail.com>
Cc: <syzkaller(a)googlegroups.com>
Tested-by: Andrey Konovalov <andreyknvl(a)google.com>
Signed-off-by: M. Vefa Bicakci <m.v.b(a)runbox.com>
Link: https://lore.kernel.org/r/20200922110703.720960-3-m.v.b@runbox.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/driver.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index 7e73e989645b..715782995428 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -924,7 +924,7 @@ static int __usb_bus_reprobe_drivers(struct device *dev, void *data)
udev = to_usb_device(dev);
if (usb_device_match_id(udev, new_udriver->id_table) == NULL &&
- (!new_udriver->match || new_udriver->match(udev) != 0))
+ (!new_udriver->match || new_udriver->match(udev) == 0))
return 0;
ret = device_reprobe(dev);
--
2.28.0
This is a note to let you know that I've just added the patch titled
Revert "usbip: Implement a match function to fix usbip"
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From d6407613c1e2ef90213dee388aa16b6e1bd08cbc Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <m.v.b(a)runbox.com>
Date: Tue, 22 Sep 2020 14:07:00 +0300
Subject: Revert "usbip: Implement a match function to fix usbip"
This commit reverts commit 7a2f2974f265 ("usbip: Implement a match
function to fix usbip").
In summary, commit d5643d2249b2 ("USB: Fix device driver race")
inadvertently broke usbip functionality, which I resolved in an incorrect
manner by introducing a match function to usbip, usbip_match(), that
unconditionally returns true.
However, the usbip_match function, as is, causes usbip to take over
virtual devices used by syzkaller for USB fuzzing, which is a regression
reported by Andrey Konovalov.
Furthermore, in conjunction with the fix of another bug, handled by another
patch titled "usbcore/driver: Fix specific driver selection" in this patch
set, the usbip_match function causes unexpected USB subsystem behaviour
when the usbip_host driver is loaded. The unexpected behaviour can be
qualified as follows:
- If commit 41160802ab8e ("USB: Simplify USB ID table match") is included
in the kernel, then all USB devices are bound to the usbip_host
driver, which appears to the user as if all USB devices were
disconnected.
- If the same commit (41160802ab8e) is not in the kernel (as is the case
with v5.8.10) then all USB devices are re-probed and re-bound to their
original device drivers, which appears to the user as a disconnection
and re-connection of USB devices.
Please note that this commit will make usbip non-operational again,
until yet another patch in this patch set is merged, titled
"usbcore/driver: Accommodate usbip".
Cc: <stable(a)vger.kernel.org> # 5.8: 41160802ab8e: USB: Simplify USB ID table match
Cc: <stable(a)vger.kernel.org> # 5.8
Cc: Bastien Nocera <hadess(a)hadess.net>
Cc: Valentina Manea <valentina.manea.m(a)gmail.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Alan Stern <stern(a)rowland.harvard.edu>
Cc: <syzkaller(a)googlegroups.com>
Reported-by: Andrey Konovalov <andreyknvl(a)google.com>
Tested-by: Andrey Konovalov <andreyknvl(a)google.com>
Acked-by: Shuah Khan <skhan(a)linuxfoundation.org>
Signed-off-by: M. Vefa Bicakci <m.v.b(a)runbox.com>
Link: https://lore.kernel.org/r/20200922110703.720960-2-m.v.b@runbox.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/usbip/stub_dev.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/usb/usbip/stub_dev.c b/drivers/usb/usbip/stub_dev.c
index 9d7d642022d1..2305d425e6c9 100644
--- a/drivers/usb/usbip/stub_dev.c
+++ b/drivers/usb/usbip/stub_dev.c
@@ -461,11 +461,6 @@ static void stub_disconnect(struct usb_device *udev)
return;
}
-static bool usbip_match(struct usb_device *udev)
-{
- return true;
-}
-
#ifdef CONFIG_PM
/* These functions need usb_port_suspend and usb_port_resume,
@@ -491,7 +486,6 @@ struct usb_device_driver stub_driver = {
.name = "usbip-host",
.probe = stub_probe,
.disconnect = stub_disconnect,
- .match = usbip_match,
#ifdef CONFIG_PM
.suspend = stub_suspend,
.resume = stub_resume,
--
2.28.0
Currently, we check we can send a pulse prior to disabling the
heartbeat to verify that we can change the heartbeat, but since we may
re-evaluate execution upon changing the heartbeat interval we need another
pulse afterwards to refresh execution.
Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v5.7+
---
drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 8ffdf676c0a0..d09df370f7cd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -192,10 +192,12 @@ int intel_engine_set_heartbeat(struct intel_engine_cs *engine,
WRITE_ONCE(engine->props.heartbeat_interval_ms, delay);
if (intel_engine_pm_get_if_awake(engine)) {
- if (delay)
+ if (delay) {
intel_engine_unpark_heartbeat(engine);
- else
+ } else {
intel_engine_park_heartbeat(engine);
+ intel_engine_pulse(engine); /* recheck execution */
+ }
intel_engine_pm_put(engine);
}
--
2.20.1
When running as Xen dom0 the kernel isn't responsible for selecting the
error handling mode, this should be handled by the hypervisor.
So disable setting FF mode when running as Xen pv guest. Not doing so
might result in boot splats like:
[ 7.509696] HEST: Enabling Firmware First mode for corrected errors.
[ 7.510382] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 2.
[ 7.510383] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 3.
[ 7.510384] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 4.
[ 7.510384] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 5.
[ 7.510385] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 6.
[ 7.510386] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 7.
[ 7.510386] mce: [Firmware Bug]: Ignoring request to disable invalid MCA bank 8.
Reason is that the HEST ACPI table contains the real number of MCA
banks, while the hypervisor is emulating only 2 banks for guests.
Cc: stable(a)vger.kernel.org
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
arch/x86/xen/enlighten_pv.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 22e741e0b10c..065c049d0f3c 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1296,6 +1296,14 @@ asmlinkage __visible void __init xen_start_kernel(void)
xen_smp_init();
+#ifdef CONFIG_ACPI
+ /*
+ * Disable selecting "Firmware First mode" for correctable memory
+ * errors, as this is the duty of the hypervisor to decide.
+ */
+ acpi_disable_cmcff = 1;
+#endif
+
#ifdef CONFIG_ACPI_NUMA
/*
* The pages we from Xen are not related to machine pages, so
--
2.26.2
When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode),
current driver disables interrupt remapping when it updates the IRTE
so that the upper and lower 64-bit values can be updated safely.
However, this creates a small window, where the interrupt could
arrive and result in IO_PAGE_FAULT (for interrupt) as shown below.
IOMMU Driver Device IRQ
============ ===========
irte.RemapEn=0
...
change IRTE IRQ from device ==> IO_PAGE_FAULT !!
...
irte.RemapEn=1
This scenario has been observed when changing irq affinity on a system
running I/O-intensive workload, in which the destination APIC ID
in the IRTE is updated.
Instead, use cmpxchg_double() to update the 128-bit IRTE at once without
disabling the interrupt remapping. However, this means several features,
which require GA (128-bit IRTE) support will also be affected if cmpxchg16b
is not supported (which is unprecedented for AMD processors w/ IOMMU).
Cc: stable(a)vger.kernel.org
Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure")
Reported-by: Sean Osborne <sean.m.osborne(a)oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
Tested-by: Erik Rockstrom <erik.rockstrom(a)oracle.com>
Reviewed-by: Joao Martins <joao.m.martins(a)oracle.com>
Link: https://lore.kernel.org/r/20200903093822.52012-3-suravee.suthikulpanit@amd.…
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
---
Note: This patch is the back-port on top of the stable branch linux-5.4.y
for the upstream commit e52d58d54a32 ("iommu/amd: Use cmpxchg_double() when
updating 128-bit IRTE") since the original patch does not apply cleanly.
drivers/iommu/Kconfig | 2 +-
drivers/iommu/amd_iommu.c | 17 +++++++++++++----
drivers/iommu/amd_iommu_init.c | 21 +++++++++++++++++++--
3 files changed, 33 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 390568afee9f..fc0160e8ed33 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -138,7 +138,7 @@ config AMD_IOMMU
select PCI_PASID
select IOMMU_API
select IOMMU_IOVA
- depends on X86_64 && PCI && ACPI
+ depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE
---help---
With this option you can enable support for AMD IOMMU hardware in
your system. An IOMMU is a hardware component which provides
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index fa91d856a43e..7b724f7b27a9 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3873,6 +3873,7 @@ static int alloc_irq_index(u16 devid, int count, bool align,
static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
struct amd_ir_data *data)
{
+ bool ret;
struct irq_remap_table *table;
struct amd_iommu *iommu;
unsigned long flags;
@@ -3890,10 +3891,18 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
entry = (struct irte_ga *)table->table;
entry = &entry[index];
- entry->lo.fields_remap.valid = 0;
- entry->hi.val = irte->hi.val;
- entry->lo.val = irte->lo.val;
- entry->lo.fields_remap.valid = 1;
+
+ ret = cmpxchg_double(&entry->lo.val, &entry->hi.val,
+ entry->lo.val, entry->hi.val,
+ irte->lo.val, irte->hi.val);
+ /*
+ * We use cmpxchg16 to atomically update the 128-bit IRTE,
+ * and it cannot be updated by the hardware or other processors
+ * behind us, so the return value of cmpxchg16 should be the
+ * same as the old value.
+ */
+ WARN_ON(!ret);
+
if (data)
data->ref = entry;
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 135ae5222cf3..31d7e2d4f304 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1522,7 +1522,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
iommu->mmio_phys_end = MMIO_REG_END_OFFSET;
else
iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
- if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0))
+
+ /*
+ * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports.
+ * GAM also requires GA mode. Therefore, we need to
+ * check cmpxchg16b support before enabling it.
+ */
+ if (!boot_cpu_has(X86_FEATURE_CX16) ||
+ ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0))
amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
break;
case 0x11:
@@ -1531,8 +1538,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
iommu->mmio_phys_end = MMIO_REG_END_OFFSET;
else
iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
- if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0))
+
+ /*
+ * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports.
+ * XT, GAM also requires GA mode. Therefore, we need to
+ * check cmpxchg16b support before enabling them.
+ */
+ if (!boot_cpu_has(X86_FEATURE_CX16) ||
+ ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) {
amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
+ break;
+ }
+
/*
* Note: Since iommu_update_intcapxt() leverages
* the IOMMU MMIO access to MSI capability block registers
--
2.17.1
Sasha,
This patch and the upstream commit 26e495f34107 ("iommu/amd: Restore IRTE.RemapEn bit after programming IRTE")
are related. Since the commit 26e495f34107 has been back-ported to linux-5.4.y branch of the stable tree,
this patch should also be add to linux-5.4.y as well.
Thanks,
Suravee
On 9/7/20 10:05 AM, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
>
> to the 5.8-stable tree which can be found at:
> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.kernel…
>
> The filename of the patch is:
> iommu-amd-use-cmpxchg_double-when-updating-128-bit-i.patch
> and it can be found in the queue-5.8 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit 31640157d429339f8fffc14a645cb2c9739e0f17
> Author: Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
> Date: Thu Sep 3 09:38:22 2020 +0000
>
> iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
>
> [ Upstream commit e52d58d54a321d4fe9d0ecdabe4f8774449f0d6e ]
>
> When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode),
> current driver disables interrupt remapping when it updates the IRTE
> so that the upper and lower 64-bit values can be updated safely.
>
> However, this creates a small window, where the interrupt could
> arrive and result in IO_PAGE_FAULT (for interrupt) as shown below.
>
> IOMMU Driver Device IRQ
> ============ ===========
> irte.RemapEn=0
> ...
> change IRTE IRQ from device ==> IO_PAGE_FAULT !!
> ...
> irte.RemapEn=1
>
> This scenario has been observed when changing irq affinity on a system
> running I/O-intensive workload, in which the destination APIC ID
> in the IRTE is updated.
>
> Instead, use cmpxchg_double() to update the 128-bit IRTE at once without
> disabling the interrupt remapping. However, this means several features,
> which require GA (128-bit IRTE) support will also be affected if cmpxchg16b
> is not supported (which is unprecedented for AMD processors w/ IOMMU).
>
> Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure")
> Reported-by: Sean Osborne <sean.m.osborne(a)oracle.com>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
> Tested-by: Erik Rockstrom <erik.rockstrom(a)oracle.com>
> Reviewed-by: Joao Martins <joao.m.martins(a)oracle.com>
> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kern…
> Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index b0f308cb7f7c2..201b2718f0755 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -143,7 +143,7 @@ config AMD_IOMMU
> select IOMMU_API
> select IOMMU_IOVA
> select IOMMU_DMA
> - depends on X86_64 && PCI && ACPI
> + depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE
> help
> With this option you can enable support for AMD IOMMU hardware in
> your system. An IOMMU is a hardware component which provides
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 6ebd4825e3206..bf45f8e2c7edd 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -1518,7 +1518,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
> iommu->mmio_phys_end = MMIO_REG_END_OFFSET;
> else
> iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
> - if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0))
> +
> + /*
> + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports.
> + * GAM also requires GA mode. Therefore, we need to
> + * check cmpxchg16b support before enabling it.
> + */
> + if (!boot_cpu_has(X86_FEATURE_CX16) ||
> + ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0))
> amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
> break;
> case 0x11:
> @@ -1527,8 +1534,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
> iommu->mmio_phys_end = MMIO_REG_END_OFFSET;
> else
> iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
> - if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0))
> +
> + /*
> + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports.
> + * XT, GAM also requires GA mode. Therefore, we need to
> + * check cmpxchg16b support before enabling them.
> + */
> + if (!boot_cpu_has(X86_FEATURE_CX16) ||
> + ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) {
> amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
> + break;
> + }
> +
> /*
> * Note: Since iommu_update_intcapxt() leverages
> * the IOMMU MMIO access to MSI capability block registers
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index d7b037891fb7e..200ee948f6ec1 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -3283,6 +3283,7 @@ out:
> static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
> struct amd_ir_data *data)
> {
> + bool ret;
> struct irq_remap_table *table;
> struct amd_iommu *iommu;
> unsigned long flags;
> @@ -3300,10 +3301,18 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte,
>
> entry = (struct irte_ga *)table->table;
> entry = &entry[index];
> - entry->lo.fields_remap.valid = 0;
> - entry->hi.val = irte->hi.val;
> - entry->lo.val = irte->lo.val;
> - entry->lo.fields_remap.valid = 1;
> +
> + ret = cmpxchg_double(&entry->lo.val, &entry->hi.val,
> + entry->lo.val, entry->hi.val,
> + irte->lo.val, irte->hi.val);
> + /*
> + * We use cmpxchg16 to atomically update the 128-bit IRTE,
> + * and it cannot be updated by the hardware or other processors
> + * behind us, so the return value of cmpxchg16 should be the
> + * same as the old value.
> + */
> + WARN_ON(!ret);
> +
> if (data)
> data->ref = entry;
>
>
From: Jason Gerecke <jason.gerecke(a)wacom.com>
It has recently been reported that the "heartbeat" report from devices
like the 2nd-gen Intuos Pro (PTH-460, PTH-660, PTH-860) or the 2nd-gen
Bluetooth-enabled Intuos tablets (CTL-4100WL, CTL-6100WL) can cause the
driver to send a spurious BTN_TOUCH=0 once per second in the middle of
drawing. This can result in broken lines while drawing on Chrome OS.
The source of the issue has been traced back to a change which modified
the driver to only call `wacom_wac_pad_report()` once per report instead
of once per collection. As part of this change, pad-handling code was
removed from `wacom_wac_collection()` under the assumption that the
`WACOM_PEN_FIELD` and `WACOM_TOUCH_FIELD` checks would not be satisfied
when a pad or battery collection was being processed.
To be clear, the macros `WACOM_PAD_FIELD` and `WACOM_PEN_FIELD` do not
currently check exclusive conditions. In fact, most "pad" fields will
also appear to be "pen" fields simply due to their presence inside of
a Digitizer application collection. Because of this, the removal of
the check from `wacom_wac_collection()` just causes pad / battery
collections to instead trigger a call to `wacom_wac_pen_report()`
instead. The pen report function in turn resets the tip switch state
just prior to exiting, resulting in the observed BTN_TOUCH=0 symptom.
To correct this, we restore a version of the `WACOM_PAD_FIELD` check
in `wacom_wac_collection()` and return early. This effectively prevents
pad / battery collections from being reported until the very end of the
report as originally intended.
Fixes: d4b8efeb46d9 ("HID: wacom: generic: Correct pad syncing")
Cc: stable(a)vger.kernel.org # v4.17+
Signed-off-by: Jason Gerecke <jason.gerecke(a)wacom.com>
Reviewed-by: Ping Cheng <ping.cheng(a)wacom.com>
Tested-by: Ping Cheng <ping.cheng(a)wacom.com>
---
drivers/hid/wacom_wac.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c
index 1c96809b51c9..b74acbd5997b 100644
--- a/drivers/hid/wacom_wac.c
+++ b/drivers/hid/wacom_wac.c
@@ -2773,7 +2773,9 @@ static int wacom_wac_collection(struct hid_device *hdev, struct hid_report *repo
if (report->type != HID_INPUT_REPORT)
return -1;
- if (WACOM_PEN_FIELD(field) && wacom->wacom_wac.pen_input)
+ if (WACOM_PAD_FIELD(field))
+ return 0;
+ else if (WACOM_PEN_FIELD(field) && wacom->wacom_wac.pen_input)
wacom_wac_pen_report(hdev, report);
else if (WACOM_FINGER_FIELD(field) && wacom->wacom_wac.touch_input)
wacom_wac_finger_report(hdev, report);
--
2.28.0
We only allow persistent requests to remain on the GPU past the closure
of their containing context (and process) so long as they are continuously
checked for hangs or allow other requests to preempt them, as we need to
ensure forward progress of the system. If we allow persistent contexts
to remain on the system after the the hangcheck mechanism is disabled,
the system may grind to a halt. On disabling the mechanism, we sent a
pulse along the engine to remove all executing contexts from the engine
which would check for hung contexts -- but we did not prevent those
contexts from being resubmitted if they survived the final hangcheck.
Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-stop
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v5.7+
---
drivers/gpu/drm/i915/gt/intel_engine.h | 9 +++++++++
drivers/gpu/drm/i915/i915_request.c | 5 +++++
2 files changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 08e2c000dcc3..7c3a1012e702 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -337,4 +337,13 @@ intel_engine_has_preempt_reset(const struct intel_engine_cs *engine)
return intel_engine_has_preemption(engine);
}
+static inline bool
+intel_engine_has_heartbeat(const struct intel_engine_cs *engine)
+{
+ if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
+ return false;
+
+ return READ_ONCE(engine->props.heartbeat_interval_ms);
+}
+
#endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 436ce368ddaa..0e813819b041 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -542,8 +542,13 @@ bool __i915_request_submit(struct i915_request *request)
if (i915_request_completed(request))
goto xfer;
+ if (unlikely(intel_context_is_closed(request->context) &&
+ !intel_engine_has_heartbeat(engine)))
+ intel_context_set_banned(request->context);
+
if (unlikely(intel_context_is_banned(request->context)))
i915_request_set_error_once(request, -EIO);
+
if (unlikely(fatal_error(request->fence.error)))
__i915_request_skip(request);
--
2.20.1
From: Lukas Wunner <lukas(a)wunner.de>
commit e0a851fe6b9b619527bd928aa93caaddd003f70c upstream.
If the call to uart_add_one_port() in serial8250_register_8250_port()
fails, a half-initialized entry in the serial_8250ports[] array is left
behind.
A subsequent reprobe of the same serial port causes that entry to be
reused. Because uart->port.dev is set, uart_remove_one_port() is called
for the half-initialized entry and bails out with an error message:
bcm2835-aux-uart 3f215040.serial: Removing wrong port: (null) != (ptrval)
The same happens on failure of mctrl_gpio_init() since commit
4a96895f74c9 ("tty/serial/8250: use mctrl_gpio helpers").
Fix by zeroing the uart->port.dev pointer in the probe error path.
The bug was introduced in v2.6.10 by historical commit befff6f5bf5f
("[SERIAL] Add new port registration/unregistration functions."):
https://git.kernel.org/tglx/history/c/befff6f5bf5f
The commit added an unconditional call to uart_remove_one_port() in
serial8250_register_port(). In v3.7, commit 835d844d1a28 ("8250_pnp:
do pnp probe before legacy probe") made that call conditional on
uart->port.dev which allows me to fix the issue by zeroing that pointer
in the error path. Thus, the present commit will fix the problem as far
back as v3.7 whereas still older versions need to also cherry-pick
835d844d1a28.
Fixes: 835d844d1a28 ("8250_pnp: do pnp probe before legacy probe")
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Cc: stable(a)vger.kernel.org # v2.6.10
Cc: stable(a)vger.kernel.org # v2.6.10: 835d844d1a28: 8250_pnp: do pnp probe before legacy
Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Link: https://lore.kernel.org/r/b4a072013ee1a1d13ee06b4325afb19bda57ca1b.15892858…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[iwamatsu: Backported to 4.14, 4.19: adjust context]
Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu(a)toshiba.co.jp>
---
drivers/tty/serial/8250/8250_core.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index d6b790510c94..8d46bd612888 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -1065,8 +1065,10 @@ int serial8250_register_8250_port(struct uart_8250_port *up)
serial8250_apply_quirks(uart);
ret = uart_add_one_port(&serial8250_reg,
&uart->port);
- if (ret == 0)
- ret = uart->port.line;
+ if (ret)
+ goto err;
+
+ ret = uart->port.line;
} else {
dev_info(uart->port.dev,
"skipping CIR port at 0x%lx / 0x%llx, IRQ %d\n",
@@ -1091,6 +1093,11 @@ int serial8250_register_8250_port(struct uart_8250_port *up)
mutex_unlock(&serial_mutex);
return ret;
+
+err:
+ uart->port.dev = NULL;
+ mutex_unlock(&serial_mutex);
+ return ret;
}
EXPORT_SYMBOL(serial8250_register_8250_port);
--
2.27.0
From: Xunlei Pang <xlpang(a)linux.alibaba.com>
commit e3336cab2579012b1e72b5265adf98e2d6e244ad upstream
We've met softlockup with "CONFIG_PREEMPT_NONE=y", when the target memcg
doesn't have any reclaimable memory.
It can be easily reproduced as below:
watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204]
CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12
Call Trace:
shrink_lruvec+0x49f/0x640
shrink_node+0x2a6/0x6f0
do_try_to_free_pages+0xe9/0x3e0
try_to_free_mem_cgroup_pages+0xef/0x1f0
try_charge+0x2c1/0x750
mem_cgroup_charge+0xd7/0x240
__add_to_page_cache_locked+0x2fd/0x370
add_to_page_cache_lru+0x4a/0xc0
pagecache_get_page+0x10b/0x2f0
filemap_fault+0x661/0xad0
ext4_filemap_fault+0x2c/0x40
__do_fault+0x4d/0xf9
handle_mm_fault+0x1080/0x1790
It only happens on our 1-vcpu instances, because there's no chance for
oom reaper to run to reclaim the to-be-killed process.
Add a cond_resched() at the upper shrink_node_memcgs() to solve this
issue, this will mean that we will get a scheduling point for each memcg
in the reclaimed hierarchy without any dependency on the reclaimable
memory in that memcg thus making it more predictable.
[jpitti(a)cisco.com:
- backported to v5.4.y
- Upstream patch applies fix in shrink_node_memcgs(), which
is not present to v5.4.y. Appled to shrink_node()]
Suggested-by: Michal Hocko <mhocko(a)suse.com>
Signed-off-by: Xunlei Pang <xlpang(a)linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Acked-by: Chris Down <chris(a)chrisdown.name>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Link: http://lkml.kernel.org/r/1598495549-67324-1-git-send-email-xlpang@linux.ali…
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Fixes: b0dedc49a2da ("mm/vmscan.c: iterate only over charged shrinkers during memcg shrink_slab()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Julius Hemanth Pitti <jpitti(a)cisco.com>
---
mm/vmscan.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7fde5f904c8d..6db9176d8c63 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2775,6 +2775,14 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
unsigned long reclaimed;
unsigned long scanned;
+ /*
+ * This loop can become CPU-bound when target memcgs
+ * aren't eligible for reclaim - either because they
+ * don't have any reclaimable pages, or because their
+ * memory is explicitly protected. Avoid soft lockups.
+ */
+ cond_resched();
+
switch (mem_cgroup_protected(root, memcg)) {
case MEMCG_PROT_MIN:
/*
--
2.17.1
From: Lukas Wunner <lukas(a)wunner.de>
commit e0a851fe6b9b619527bd928aa93caaddd003f70c upstream.
If the call to uart_add_one_port() in serial8250_register_8250_port()
fails, a half-initialized entry in the serial_8250ports[] array is left
behind.
A subsequent reprobe of the same serial port causes that entry to be
reused. Because uart->port.dev is set, uart_remove_one_port() is called
for the half-initialized entry and bails out with an error message:
bcm2835-aux-uart 3f215040.serial: Removing wrong port: (null) != (ptrval)
The same happens on failure of mctrl_gpio_init() since commit
4a96895f74c9 ("tty/serial/8250: use mctrl_gpio helpers").
Fix by zeroing the uart->port.dev pointer in the probe error path.
The bug was introduced in v2.6.10 by historical commit befff6f5bf5f
("[SERIAL] Add new port registration/unregistration functions."):
https://git.kernel.org/tglx/history/c/befff6f5bf5f
The commit added an unconditional call to uart_remove_one_port() in
serial8250_register_port(). In v3.7, commit 835d844d1a28 ("8250_pnp:
do pnp probe before legacy probe") made that call conditional on
uart->port.dev which allows me to fix the issue by zeroing that pointer
in the error path. Thus, the present commit will fix the problem as far
back as v3.7 whereas still older versions need to also cherry-pick
835d844d1a28.
Fixes: 835d844d1a28 ("8250_pnp: do pnp probe before legacy probe")
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Cc: stable(a)vger.kernel.org # v2.6.10
Cc: stable(a)vger.kernel.org # v2.6.10: 835d844d1a28: 8250_pnp: do pnp probe before legacy
Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Link: https://lore.kernel.org/r/b4a072013ee1a1d13ee06b4325afb19bda57ca1b.15892858…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[iwamatsu: Backported to 4.4, 4.9: adjust context]
Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu(a)toshiba.co.jp>
---
drivers/tty/serial/8250/8250_core.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index e9ea9005a984..f24fa99da69f 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -1037,8 +1037,10 @@ int serial8250_register_8250_port(struct uart_8250_port *up)
ret = uart_add_one_port(&serial8250_reg,
&uart->port);
- if (ret == 0)
- ret = uart->port.line;
+ if (ret)
+ goto err;
+
+ ret = uart->port.line;
} else {
dev_info(uart->port.dev,
"skipping CIR port at 0x%lx / 0x%llx, IRQ %d\n",
@@ -1052,6 +1054,11 @@ int serial8250_register_8250_port(struct uart_8250_port *up)
mutex_unlock(&serial_mutex);
return ret;
+
+err:
+ uart->port.dev = NULL;
+ mutex_unlock(&serial_mutex);
+ return ret;
}
EXPORT_SYMBOL(serial8250_register_8250_port);
--
2.27.0
Dear stable kernel maintainers,
Please consider the attached mbox file, which contains 10 patches which
cherry pick cleanly onto 4.19.y:
1. commit 8708e13c6a06 ("MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info")
2. commit 7bac98707f65 ("kbuild: add OBJSIZE variable for the size tool")
3. commit fcf1b6a35c ("Documentation/llvm: add documentation on
building w/ Clang/LLVM")
4. commit 0f44fbc162b7 ("Documentation/llvm: fix the name of llvm-size")
5. commit 63b903dfebde ("net: wan: wanxl: use allow to pass
CROSS_COMPILE_M68k for rebuilding firmware")
6. commit 734f3719d343 ("net: wan: wanxl: use $(M68KCC) instead of
$(M68KAS) for rebuilding firmware")
7. commit eefb8c124fd9 ("x86/boot: kbuild: allow readelf executable to
be specified")
8. commit aa824e0c962b ("kbuild: remove AS variable")
9. commit 7e20e47c70f8 ("kbuild: replace AS=clang with LLVM_IAS=1")
10. commit a0d1c951ef08 ("kbuild: support LLVM=1 to switch the default
tools to Clang/LLVM")
The series is analogous to the previous accepted series sent for 5.4,
though this series is for 4.19.y:
https://lore.kernel.org/stable/CAKwvOd=Ko_UHWF-bYotqjPVw=chW_KMUFuBp_o8uOg0…
I don't plan to backport the series any further than 4.19.
This series improves/simplifies building kernels with Clang and LLVM
utilities; it will help the various CI systems testing kernels built
with Clang+LLVM utilities, and
we will make immediate use of it in Android (see also:
https://android-review.googlesource.com/c/platform/prebuilts/clang/host/lin…).
We can always carry it out of tree in Android, but I think the series
is fairly tame, and would prefer not to.
Some differences in this series from the one sent for 5.4:
additional backports:
1. commit 8708e13c6a06 ("MAINTAINERS: add CLANG/LLVM BUILD SUPPORT info")
2. commit 7bac98707f65 ("kbuild: add OBJSIZE variable for the size tool")
which landed in v5.2-rc7 and v5.3-rc4 respectively. It also drops the
5.4 backport of
1. commit e9781b52d4e0 ("kbuild: add PYTHON2 and PYTHON3 variables")
which was not necessary, as it was backported to 5.4 to resolve merge
conflicts which occurred anyways in this series due to not backporting
to 4.19 the following commits which caused minor conflicts:
1. commit e83b9f55448a ("kbuild: add ability to generate BTF type info
for vmlinux")
2. commit cd238effefa2 ("docs: kbuild: convert docs to ReST and rename
to *.rst")
Put another way, because I avoided backporting e83b9f55448a and
cd238effefa2, most of the patches needed to be slightly modified in
small ways (noted in each commit message), such that there was no
point in backporting e9781b52d4e0.
--
Thanks,
~Nick Desaulniers
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 52fb84070999 - ibmvnic: add missing parenthesis in do_reset()
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
Host 4:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
🚧 ❌ Storage nvme - tcp
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage nvme - tcp
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ Firmware test suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
The LAUNCH_SECRET command performs encryption of the
launch secret memory contents. Mark pinned pages as
dirty, before unpinning them.
This matches the logic in sev_launch_update_data().
Fixes: 9c5e0afaf157 ("KVM: SVM: Add support for SEV LAUNCH_SECRET command")
Signed-off-by: Cfir Cohen <cfir(a)google.com>
---
Changelog since v2:
- Added 'Fixes' tag, updated comments.
Changelog since v1:
- Updated commit message.
arch/x86/kvm/svm/sev.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 5573a97f1520..55edaf3577a0 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -440,10 +440,8 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
}
/*
- * The LAUNCH_UPDATE command will perform in-place encryption of the
- * memory content (i.e it will write the same memory region with C=1).
- * It's possible that the cache may contain the data with C=0, i.e.,
- * unencrypted so invalidate it first.
+ * Flush (on non-coherent CPUs) before LAUNCH_UPDATE encrypts pages in
+ * place, the cache may contain data that was written unencrypted.
*/
sev_clflush_pages(inpages, npages);
@@ -799,10 +797,9 @@ static int sev_dbg_crypt(struct kvm *kvm, struct kvm_sev_cmd *argp, bool dec)
}
/*
- * The DBG_{DE,EN}CRYPT commands will perform {dec,en}cryption of the
- * memory content (i.e it will write the same memory region with C=1).
- * It's possible that the cache may contain the data with C=0, i.e.,
- * unencrypted so invalidate it first.
+ * Flush (on non-coherent CPUs) before DBG_{DE,EN}CRYPT reads or modifies
+ * the pages, flush the destination too in case the cache contains its
+ * current data.
*/
sev_clflush_pages(src_p, 1);
sev_clflush_pages(dst_p, 1);
@@ -850,7 +847,7 @@ static int sev_launch_secret(struct kvm *kvm, struct kvm_sev_cmd *argp)
struct kvm_sev_launch_secret params;
struct page **pages;
void *blob, *hdr;
- unsigned long n;
+ unsigned long n, i;
int ret, offset;
if (!sev_guest(kvm))
@@ -863,6 +860,12 @@ static int sev_launch_secret(struct kvm *kvm, struct kvm_sev_cmd *argp)
if (!pages)
return -ENOMEM;
+ /*
+ * Flush (on non-coherent CPUs) before LAUNCH_SECRET encrypts pages in
+ * place, the cache may contain data that was written unencrypted.
+ */
+ sev_clflush_pages(pages, n);
+
/*
* The secret must be copied into contiguous memory region, lets verify
* that userspace memory pages are contiguous before we issue command.
@@ -908,6 +911,11 @@ static int sev_launch_secret(struct kvm *kvm, struct kvm_sev_cmd *argp)
e_free:
kfree(data);
e_unpin_memory:
+ /* content of memory is updated, mark pages dirty */
+ for (i = 0; i < n; i++) {
+ set_page_dirty_lock(pages[i]);
+ mark_page_accessed(pages[i]);
+ }
sev_unpin_memory(kvm, pages, n);
return ret;
}
--
2.28.0.681.g6f77f65b4e-goog
Hello there,
I am Laghouili Abdellatif. I am contacting you because I have a
proposal that I think may be interested in. I represent the
interest of my brother in-law who was a minister in the Syrian
Government. As you probably know, there is a lot of crisis going
on currently in Syria and my brother in-law has fallen out with
the ruling Junta and the president because of his foreign
policies and the senseless war and killings that has been going
on for a while. Everybody in Syria is fed up and want a change
but the president is too powerfull and he simply kills anyone
that tries to oppose him. My brother in-law belives that he is at
risk and he is now very scared for the safety of his family
especially his kids. In order to ensure that his family is taken
care of and protected incase anything happens to him, he has
asked me to help him find a foreign investor who can help him
accommodate and invest 100 MUSD privately that he has secured in
Europe. He wants these funds safely invested so that the future
and safety of his family can be secured.
I am contacting you with the hope that you will be interested in
helping us. We need your help to accommodate the funds in the
banking system in your country and also invest it in lucrative
projects that will yeild good profits. We will handle all the
logistics involved in the movement of the funds to you. The funds
is already in Europe so you have nothing to worry about because
this transaction will be executed in a legal way. My brother in-
law has also promised to compensate you for your help. He wants
this to be done discretely so I will be acting as his eyes and
ears during the course of this transaction.
If this proposal interests you, please kindly respond so that I
can give you more details.
Regards,
Laghouili.
The patch titled
Subject: mm: khugepaged: avoid overriding min_free_kbytes set by user
has been removed from the -mm tree. Its filename was
mm-khugepaged-avoid-overriding-min_free_kbytes-set-by-user.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Vijay Balakrishna <vijayb(a)linux.microsoft.com>
Subject: mm: khugepaged: avoid overriding min_free_kbytes set by user
set_recommended_min_free_kbytes need to honor min_free_kbytes set by the
user. Post start-of-day THP enable or memory hotplug operations can lose
user specified min_free_kbytes, in particular when it is higher than
calculated recommended value. user_min_free_kbytes initialized to 0 to
avoid undesired result when comparing with "unsigned long" type.
Link: https://lkml.kernel.org/r/1600305709-2319-3-git-send-email-vijayb@linux.mic…
Signed-off-by: Vijay Balakrishna <vijayb(a)linux.microsoft.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Allen Pais <apais(a)microsoft.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 3 ++-
mm/page_alloc.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
--- a/mm/khugepaged.c~mm-khugepaged-avoid-overriding-min_free_kbytes-set-by-user
+++ a/mm/khugepaged.c
@@ -2283,7 +2283,8 @@ static void set_recommended_min_free_kby
(unsigned long) nr_free_buffer_pages() / 20);
recommended_min <<= (PAGE_SHIFT-10);
- if (recommended_min > min_free_kbytes) {
+ if (recommended_min > min_free_kbytes ||
+ recommended_min > user_min_free_kbytes) {
if (user_min_free_kbytes >= 0)
pr_info("raising min_free_kbytes from %d to %lu to help transparent hugepage allocations\n",
min_free_kbytes, recommended_min);
--- a/mm/page_alloc.c~mm-khugepaged-avoid-overriding-min_free_kbytes-set-by-user
+++ a/mm/page_alloc.c
@@ -315,7 +315,7 @@ compound_page_dtor * const compound_page
};
int min_free_kbytes = 1024;
-int user_min_free_kbytes = -1;
+int user_min_free_kbytes = 0;
#ifdef CONFIG_DISCONTIGMEM
/*
* DiscontigMem defines memory ranges as separate pg_data_t even if the ranges
_
Patches currently in -mm which might be from vijayb(a)linux.microsoft.com are
mm-khugepaged-recalculate-min_free_kbytes-after-memory-hotplug-as-expected-by-khugepaged.patch
The patch titled
Subject: mm: khugepaged: avoid overriding min_free_kbytes set by user
has been added to the -mm tree. Its filename is
mm-khugepaged-avoid-overriding-min_free_kbytes-set-by-user.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-khugepaged-avoid-overriding-mi…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-khugepaged-avoid-overriding-mi…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Vijay Balakrishna <vijayb(a)linux.microsoft.com>
Subject: mm: khugepaged: avoid overriding min_free_kbytes set by user
set_recommended_min_free_kbytes need to honor min_free_kbytes set by the
user. Post start-of-day THP enable or memory hotplug operations can lose
user specified min_free_kbytes, in particular when it is higher than
calculated recommended value. user_min_free_kbytes initialized to 0 to
avoid undesired result when comparing with "unsigned long" type.
Link: https://lkml.kernel.org/r/1600305709-2319-3-git-send-email-vijayb@linux.mic…
Signed-off-by: Vijay Balakrishna <vijayb(a)linux.microsoft.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Allen Pais <apais(a)microsoft.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 3 ++-
mm/page_alloc.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
--- a/mm/khugepaged.c~mm-khugepaged-avoid-overriding-min_free_kbytes-set-by-user
+++ a/mm/khugepaged.c
@@ -2283,7 +2283,8 @@ static void set_recommended_min_free_kby
(unsigned long) nr_free_buffer_pages() / 20);
recommended_min <<= (PAGE_SHIFT-10);
- if (recommended_min > min_free_kbytes) {
+ if (recommended_min > min_free_kbytes ||
+ recommended_min > user_min_free_kbytes) {
if (user_min_free_kbytes >= 0)
pr_info("raising min_free_kbytes from %d to %lu to help transparent hugepage allocations\n",
min_free_kbytes, recommended_min);
--- a/mm/page_alloc.c~mm-khugepaged-avoid-overriding-min_free_kbytes-set-by-user
+++ a/mm/page_alloc.c
@@ -315,7 +315,7 @@ compound_page_dtor * const compound_page
};
int min_free_kbytes = 1024;
-int user_min_free_kbytes = -1;
+int user_min_free_kbytes = 0;
#ifdef CONFIG_DISCONTIGMEM
/*
* DiscontigMem defines memory ranges as separate pg_data_t even if the ranges
_
Patches currently in -mm which might be from vijayb(a)linux.microsoft.com are
mm-khugepaged-recalculate-min_free_kbytes-after-memory-hotplug-as-expected-by-khugepaged.patch
mm-khugepaged-avoid-overriding-min_free_kbytes-set-by-user.patch
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b0399092ccebd9feef68d4ceb8d6219a8c0caa05 Mon Sep 17 00:00:00 2001
From: Muchun Song <songmuchun(a)bytedance.com>
Date: Fri, 18 Sep 2020 21:20:21 -0700
Subject: [PATCH] kprobes: fix kill kprobe which has been marked as gone
If a kprobe is marked as gone, we should not kill it again. Otherwise, we
can disarm the kprobe more than once. In that case, the statistics of
kprobe_ftrace_enabled can unbalance which can lead to that kprobe do not
work.
Fixes: e8386a0cb22f ("kprobes: support probing module __exit function")
Co-developed-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Signed-off-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: "Naveen N . Rao" <naveen.n.rao(a)linux.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy(a)intel.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20200822030055.32383-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 287b263c9cb9..049da84e1952 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2140,6 +2140,9 @@ static void kill_kprobe(struct kprobe *p)
lockdep_assert_held(&kprobe_mutex);
+ if (WARN_ON_ONCE(kprobe_gone(p)))
+ return;
+
p->flags |= KPROBE_FLAG_GONE;
if (kprobe_aggrprobe(p)) {
/*
@@ -2419,7 +2422,10 @@ static int kprobes_module_callback(struct notifier_block *nb,
mutex_lock(&kprobe_mutex);
for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
head = &kprobe_table[i];
- hlist_for_each_entry(p, head, hlist)
+ hlist_for_each_entry(p, head, hlist) {
+ if (kprobe_gone(p))
+ continue;
+
if (within_module_init((unsigned long)p->addr, mod) ||
(checkcore &&
within_module_core((unsigned long)p->addr, mod))) {
@@ -2436,6 +2442,7 @@ static int kprobes_module_callback(struct notifier_block *nb,
*/
kill_kprobe(p);
}
+ }
}
if (val == MODULE_STATE_GOING)
remove_module_kprobe_blacklist(mod);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8a224ffb3f52b0027f6b7279854c71a31c48fc97 Mon Sep 17 00:00:00 2001
From: Chengming Zhou <zhouchengming(a)bytedance.com>
Date: Wed, 29 Jul 2020 02:05:53 +0800
Subject: [PATCH] ftrace: Setup correct FTRACE_FL_REGS flags for module
When module loaded and enabled, we will use __ftrace_replace_code
for module if any ftrace_ops referenced it found. But we will get
wrong ftrace_addr for module rec in ftrace_get_addr_new, because
rec->flags has not been setup correctly. It can cause the callback
function of a ftrace_ops has FTRACE_OPS_FL_SAVE_REGS to be called
with pt_regs set to NULL.
So setup correct FTRACE_FL_REGS flags for rec when we call
referenced_filters to find ftrace_ops references it.
Link: https://lkml.kernel.org/r/20200728180554.65203-1-zhouchengming@bytedance.com
Cc: stable(a)vger.kernel.org
Fixes: 8c4f3c3fa9681 ("ftrace: Check module functions being traced on reload")
Signed-off-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index c141d347f71a..d052f856f1cf 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6198,8 +6198,11 @@ static int referenced_filters(struct dyn_ftrace *rec)
int cnt = 0;
for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
- if (ops_references_rec(ops, rec))
- cnt++;
+ if (ops_references_rec(ops, rec)) {
+ cnt++;
+ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS)
+ rec->flags |= FTRACE_FL_REGS;
+ }
}
return cnt;
@@ -6378,8 +6381,8 @@ void ftrace_module_enable(struct module *mod)
if (ftrace_start_up)
cnt += referenced_filters(rec);
- /* This clears FTRACE_FL_DISABLED */
- rec->flags = cnt;
+ rec->flags &= ~FTRACE_FL_DISABLED;
+ rec->flags += cnt;
if (ftrace_start_up && cnt) {
int failed = __ftrace_replace_code(rec, 1);
From: Ben Hutchings <ben(a)decadent.org.uk>
commit ea739a287f4f16d6250bea779a1026ead79695f2 upstream.
Commit 9e343e87d2c4 ("mtd: cfi: convert inline functions to macros")
changed map_word_andequal() into a macro, but also changed the right
hand side of the comparison from val3 to val2. Change it back to use
val3 on the right hand side.
Thankfully this did not cause a regression because all callers
currently pass the same argument for val2 and val3.
Fixes: 9e343e87d2c4 ("mtd: cfi: convert inline functions to macros")
Signed-off-by: Ben Hutchings <ben(a)decadent.org.uk>
Signed-off-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
Signed-off-by: Nobuhiro Iwamatsu (CIP) <noburhio1.nobuhiro(a)toshiba.co.jp>
---
include/linux/mtd/map.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/mtd/map.h b/include/linux/mtd/map.h
index b5b43f94f31162..01b990e4b228a9 100644
--- a/include/linux/mtd/map.h
+++ b/include/linux/mtd/map.h
@@ -312,7 +312,7 @@ void map_destroy(struct mtd_info *mtd);
({ \
int i, ret = 1; \
for (i = 0; i < map_words(map); i++) { \
- if (((val1).x[i] & (val2).x[i]) != (val2).x[i]) { \
+ if (((val1).x[i] & (val2).x[i]) != (val3).x[i]) { \
ret = 0; \
break; \
} \
--
2.28.0